Automation · LLM Pipeline · Production · Germany

Invoice Automation Pipeline

Invoices from multiple suppliers arrive as PDFs by email in different formats. The manual routine was: find email, download PDF, classify supplier, extract fields, file to Drive, enter into accounting. Now it runs on a schedule — automatically, end-to-end. Built for German business operations: multi-supplier, multi-format, idempotent.

Zero reruns: Two-level deduplication prevents duplicate records from forwarded attachments
2-layer: Baseline parser first, Gemini verification second; works even when the API is down
Cron-ready: GitHub Actions scheduling, deduplication, status tracking, Telegram summaries
Multi-supplier: Different PDF formats per supplier handled by the LLM extraction layer

How it works

  • IMAP inbox scan → PDF attachment detection → supplier classification.
  • Baseline regex parser extracts structured fields first (fast, free, reliable).
  • Gemini 2.5 Flash verifies and fills gaps, handles format variations across suppliers.
  • Validated records upserted to Supabase, filed to Google Drive folder structure.
  • Telegram summary with run results and any flagged anomalies after each run.
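
The two extraction layers described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the field names, the regex patterns, and the `Verifier` callback (a stand-in for the Gemini call) are all assumptions, and the choice to let baseline values win over verifier values is one possible merge policy.

```typescript
// Fields to extract; all optional because the baseline may miss some.
interface InvoiceFields {
  invoiceNumber?: string;
  invoiceDate?: string; // as printed, e.g. "31.12.2024"
  netAmount?: string;   // as printed, e.g. "1.234,56"
  vatId?: string;
}

// Hypothetical patterns; real supplier layouts will need their own.
const PATTERNS: Record<keyof InvoiceFields, RegExp> = {
  invoiceNumber: /Rechnungsnummer[:\s]+([A-Z0-9-]+)/i,
  invoiceDate: /Rechnungsdatum[:\s]+(\d{2}\.\d{2}\.\d{4})/i,
  netAmount: /Nettobetrag[:\s]+([\d.]+,\d{2})/i,
  vatId: /USt-IdNr\.?[:\s]+(DE\d{9})/i,
};

// Layer 1: deterministic regex pass over the PDF's extracted text.
function baselineParse(text: string): InvoiceFields {
  const out: InvoiceFields = {};
  for (const key of Object.keys(PATTERNS) as (keyof InvoiceFields)[]) {
    const match = PATTERNS[key].exec(text);
    if (match) out[key] = match[1];
  }
  return out;
}

// Stand-in for the LLM call (assumed signature, not a real SDK type).
type Verifier = (text: string, baseline: InvoiceFields) => Promise<InvoiceFields>;

// Baseline values win where present; the verifier only fills gaps.
function fillGaps(baseline: InvoiceFields, verified: InvoiceFields): InvoiceFields {
  return { ...verified, ...baseline };
}

// Layer 2 is optional: if the verifier throws, the baseline result is kept.
async function extractInvoice(text: string, verify?: Verifier): Promise<InvoiceFields> {
  const baseline = baselineParse(text);
  if (!verify) return baseline;
  try {
    return fillGaps(baseline, await verify(text, baseline));
  } catch {
    return baseline; // API unavailable: fall back to the deterministic result
  }
}
```

Because the verifier is injected as a callback, the same `extractInvoice` function can run with no LLM at all, which is what keeps the pipeline working when the API is down.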

Why it's built this way

  • Reliability over extraction accuracy — the pipeline never silently fails.
  • Deterministic baseline ensures LLM is a verification layer, not a dependency.
  • Idempotent design: re-running the pipeline never creates duplicates.
  • Production-grade: logged, scheduled, monitored — not a script that runs once.
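
One way to get the idempotent, two-level deduplication described above is to key each attachment twice: once by where it came from and once by what it contains. The sketch below assumes the first level is the email Message-ID plus filename and the second is a SHA-256 hash of the PDF bytes; the source does not specify the actual keys, and `SeenStore` is an in-memory stand-in for unique constraints in the database.

```typescript
import { createHash } from "node:crypto";

// Level 1: the same email attachment seen again (same Message-ID and filename).
function messageKey(messageId: string, filename: string): string {
  return `${messageId}::${filename}`;
}

// Level 2: the same PDF bytes arriving in a different email, e.g. a forward.
function contentKey(pdf: Uint8Array): string {
  return createHash("sha256").update(pdf).digest("hex");
}

// Hypothetical in-memory store; a real pipeline would persist both keys.
class SeenStore {
  private messageKeys = new Set<string>();
  private contentKeys = new Set<string>();

  // Returns true if the attachment is new and was recorded, false if it is a duplicate.
  record(messageId: string, filename: string, pdf: Uint8Array): boolean {
    const mk = messageKey(messageId, filename);
    const ck = contentKey(pdf);
    const duplicate = this.messageKeys.has(mk) || this.contentKeys.has(ck);
    // Remember both keys either way so later reruns short-circuit on level 1.
    this.messageKeys.add(mk);
    this.contentKeys.add(ck);
    return !duplicate;
  }
}
```

With both keys stored, re-running over the same inbox is a no-op, and a forwarded copy of an already processed invoice is rejected by the content hash even though its Message-ID is new.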

What changed

  • Weekly manual invoice processing routine completely eliminated.
  • Multi-supplier formats handled automatically — no per-supplier configuration needed.
  • Records are available in Supabase and Drive after the next scheduled run following the email's arrival, with no manual step in between.
  • Anomaly detection catches edge cases that previously required manual review.
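
The kind of anomaly check mentioned above might look like the following. The specific rules here (net plus VAT must equal gross, VAT must match the stated rate, and the rate should be one of 0, 7, or 19 percent) are illustrative assumptions, not the pipeline's actual rule set; amounts are held in integer cents to avoid floating-point drift.

```typescript
// Amounts in integer cents; names are hypothetical.
interface InvoiceAmounts {
  netCents: number;
  vatCents: number;
  grossCents: number;
  vatRatePercent: number; // e.g. 19 or 7
}

// Returns human-readable issues; an empty array means nothing was flagged.
function findAnomalies(inv: InvoiceAmounts): string[] {
  const issues: string[] = [];

  if (inv.netCents + inv.vatCents !== inv.grossCents) {
    issues.push("net + VAT does not equal gross");
  }

  // Allow one cent of rounding difference between supplier and recomputed VAT.
  const expectedVat = Math.round((inv.netCents * inv.vatRatePercent) / 100);
  if (Math.abs(expectedVat - inv.vatCents) > 1) {
    issues.push("VAT amount does not match the stated rate");
  }

  // Assumed whitelist of rates; anything else is flagged for review, not rejected.
  if (![0, 7, 19].includes(inv.vatRatePercent)) {
    issues.push("unusual VAT rate");
  }

  return issues;
}
```

Flagged invoices can still be stored; the list of issues is what would feed into the Telegram run summary so a person can look at only the edge cases.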

Your use case

  • Any German business receiving 10+ invoices per week from multiple suppliers can automate this entirely.
  • Produces DATEV-compatible output for German accounting software.
  • Handles German invoice fields: Steuernummer, USt-IdNr., Rechnungsdatum, Nettobetrag, MwSt.
  • Compliant with GoBD archiving requirements via Google Drive structured storage.
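
Handling the German fields listed above mostly comes down to normalizing local formats before storage. The helpers below are a sketch under stated assumptions: the function names are hypothetical, amounts are expected with exactly two decimals, and the USt-IdNr. check covers only the format (DE followed by nine digits), not the checksum.

```typescript
// "1.234,56 €" -> 123456 (integer cents); returns null if the format is not recognized.
function parseGermanAmountToCents(raw: string): number | null {
  const cleaned = raw.replace(/[€\s]|EUR/g, "");
  const match = /^(-?)(\d{1,3}(?:\.\d{3})*|\d+),(\d{2})$/.exec(cleaned);
  if (!match) return null;
  const euros = Number(match[2].replace(/\./g, ""));
  const cents = euros * 100 + Number(match[3]);
  return match[1] === "-" ? -cents : cents;
}

// "31.12.2024" -> "2024-12-31"; returns null for malformed or impossible dates.
function parseGermanDate(raw: string): string | null {
  const match = /^(\d{1,2})\.(\d{1,2})\.(\d{4})$/.exec(raw.trim());
  if (!match) return null;
  const day = Number(match[1]);
  const month = Number(match[2]);
  const year = Number(match[3]);
  const date = new Date(Date.UTC(year, month - 1, day));
  // Date rolls over values like 31.02., so compare the parts back.
  if (
    date.getUTCFullYear() !== year ||
    date.getUTCMonth() !== month - 1 ||
    date.getUTCDate() !== day
  ) {
    return null;
  }
  return date.toISOString().slice(0, 10);
}

// Format-only check for a German USt-IdNr.; spaces are ignored.
function looksLikeGermanVatId(raw: string): boolean {
  return /^DE\d{9}$/.test(raw.replace(/\s/g, "").toUpperCase());
}
```

Returning `null` instead of guessing keeps bad values out of the accounting data; a record with a `null` amount or date can be flagged for review rather than silently stored.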

Tools used

TypeScript · Node.js · Gemini 2.5 Flash · Supabase · Google Drive API · Gmail IMAP · GitHub Actions · Telegram Bot API

Process invoices automatically?

I build invoice and document automation pipelines for German businesses. Multi-supplier, multi-format, GoBD-compatible. Let's talk about your workflow.
