Document extraction API

Document data your systems can rely on.

Start with a PDF invoice. exdata extracts the fields that matter, keeps the evidence visible, and returns clean JSON for accounting, archive, ERP, and back-office systems.

/api/v1 - PDF-first document API
mode: test - Sandbox-first rollout
X-Signature - Signed webhooks
OpenAPI - OpenAPI contract

Why teams use it

Less manual keying. Fewer brittle handoffs.

Teams usually do not need another OCR dump. They need invoice numbers, suppliers, totals, tax rows, due dates, and payment details in the system that runs the workflow.

exdata is built for a controlled path from sample PDFs to production automation: test the result, inspect the evidence, then connect the downstream system.

1. Start with the common case

Process PDF invoices first, then add scans, email attachments, XML, and e-invoices through the same API when the workflow grows.

Explore product
2. Return fields software can trust

Extract document numbers, parties, IBAN/BIC, due dates, totals, VAT IDs, tax rows, cost centers, and references as normalized JSON.

See use cases
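
To make "normalized JSON" concrete, here is a minimal Python sketch of how a downstream system might load such a result into typed values and run a basic consistency check. The key names (`document_number`, `supplier`, `tax_rows`, and so on) and the sample values are assumptions made up for illustration; the actual response shape is whatever the OpenAPI contract defines.

```python
import json
from dataclasses import dataclass
from datetime import date
from decimal import Decimal
from typing import List

# Hypothetical payload: key names and values are illustrative only.
SAMPLE = """
{
  "document_number": "INV-0001",
  "supplier": {"name": "Example Supplier GmbH"},
  "due_date": "2025-01-31",
  "currency": "EUR",
  "total": "119.00",
  "tax_rows": [{"rate": "19", "net": "100.00", "tax": "19.00"}]
}
"""


@dataclass
class TaxRow:
    rate: Decimal
    net: Decimal
    tax: Decimal


@dataclass
class Invoice:
    number: str
    supplier: str
    due_date: date
    currency: str
    total: Decimal
    tax_rows: List[TaxRow]


def parse_invoice(payload: str) -> Invoice:
    """Turn the JSON text into typed values (Decimal for money, date for dates)."""
    raw = json.loads(payload)
    rows = [
        TaxRow(Decimal(r["rate"]), Decimal(r["net"]), Decimal(r["tax"]))
        for r in raw["tax_rows"]
    ]
    return Invoice(
        number=raw["document_number"],
        supplier=raw["supplier"]["name"],
        due_date=date.fromisoformat(raw["due_date"]),
        currency=raw["currency"],
        total=Decimal(raw["total"]),
        tax_rows=rows,
    )


def totals_match(invoice: Invoice) -> bool:
    """Check that net plus tax across all tax rows adds up to the invoice total."""
    gross = sum((r.net + r.tax for r in invoice.tax_rows), Decimal("0"))
    return gross == invoice.total


invoice = parse_invoice(SAMPLE)
print(invoice.number, invoice.total, totals_match(invoice))
```

Parsing amounts as `Decimal` rather than `float` avoids rounding surprises when the values reach an accounting or ERP system.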
3. Keep rollout observable

Test with sample PDFs, inspect the result, and move to production only when polling, webhooks, and mapping are ready.

Review credits
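
Polling, mentioned in the rollout step above, can be sketched as a loop that re-fetches a document until it reaches a final state. This is a sketch under assumptions: the status values (`completed`, `failed`, `blocked`) and the `status` key are hypothetical, and `fetch` stands in for whatever call retrieves the document from the API.

```python
import time
from typing import Callable, Dict

# Hypothetical terminal states; the real API may name them differently.
TERMINAL_STATUSES = {"completed", "failed", "blocked"}


def poll_until_done(
    fetch: Callable[[], Dict],
    interval: float = 2.0,
    max_interval: float = 30.0,
    max_attempts: int = 20,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict:
    """Call fetch() until the document reaches a terminal status.

    The wait doubles after each attempt, capped at max_interval, so a slow
    document does not generate a steady stream of requests.
    """
    delay = interval
    for attempt in range(1, max_attempts + 1):
        doc = fetch()
        if doc.get("status") in TERMINAL_STATUSES:
            return doc
        if attempt < max_attempts:
            sleep(delay)
            delay = min(delay * 2, max_interval)
    raise TimeoutError(f"document still not finished after {max_attempts} polls")


# Usage with a stand-in fetch function instead of a real HTTP call.
_fake_responses = iter([
    {"status": "processing"},
    {"status": "processing"},
    {"status": "completed", "id": "doc_example"},
])
result = poll_until_done(lambda: next(_fake_responses), sleep=lambda seconds: None)
print(result["status"])
```

Passing `fetch` and `sleep` in as arguments keeps the loop testable without network access or real waiting. In production, webhooks would normally replace most polling.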

Production workflow

The operating layer around extraction.

Developers get a clean API. Operations teams still get the states, previews, run metadata, blocked reasons, and account controls they need when automation meets real documents.

Formats: Multi-format processing
PDFs first, with scans, images, text, HTML, Word files, XML, EDI-style payloads, and email-based invoice flows available when needed.

Testing: Sandbox mode
Test-mode tokens exercise the same endpoints, validation, idempotency, webhooks, and error envelope without spending live credits.

Review: Extraction visibility
See document status, previews, extracted metadata, run versions, failure reasons, and API-shaped JSON in the workspace.
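
Signed webhooks are worth exercising in sandbox mode before going live. The page names an `X-Signature` header but does not describe the signing scheme, so the sketch below assumes a common one: a hex-encoded HMAC-SHA256 of the raw request body, keyed with a shared secret. Treat the algorithm, the encoding, and the function names as assumptions to check against the API documentation.

```python
import hashlib
import hmac


def sign(secret: bytes, body: bytes) -> str:
    """Compute the hex HMAC-SHA256 of the raw body (assumed scheme)."""
    return hmac.new(secret, body, hashlib.sha256).hexdigest()


def verify_webhook(secret: bytes, body: bytes, signature_header: str) -> bool:
    """Return True if the X-Signature value matches the body.

    compare_digest runs in constant time, which avoids leaking how many
    leading characters of the signature were correct.
    """
    expected = sign(secret, body).encode()
    received = signature_header.strip().lower().encode()
    return hmac.compare_digest(expected, received)


# Usage: verify against the exact bytes received, before parsing the JSON.
secret = b"example-webhook-secret"  # placeholder value
body = b'{"event": "document.completed"}'  # hypothetical event payload
header_value = sign(secret, body)  # what a sender would put in X-Signature
print(verify_webhook(secret, body, header_value))
```

Verification should use the raw bytes as received; re-serializing parsed JSON can change whitespace or key order and break the comparison.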

Controlled rollout

For teams that need document automation to be predictable.

Developer-owned workflows

Developer teams get a stable API while operations can inspect extraction runs, blocked documents, and source previews.

Operational document pipelines

Finance and operations teams can move invoice handling into workflows with status, audit context, and JSON outputs.

Controlled integration rollout

Account tokens, test mode, usage separation, and top-up controls support cautious rollout before larger production volume.

Start with one document

Create a token, upload a real sample, and review the JSON your workflow can use.
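
A first upload might look like the following Python sketch, which only builds the request so it can be inspected before anything is sent. Only the `/api/v1` prefix comes from this page. The host, the `/documents` path, the `file` form field, the bearer-token header, and the `Idempotency-Key` header are all assumptions for illustration; the OpenAPI contract is the source for the real names.

```python
import urllib.request
import uuid
from typing import Optional

# Placeholder host; only the /api/v1 prefix is taken from the page.
BASE_URL = "https://api.example.com/api/v1"


def build_upload_request(
    token: str,
    filename: str,
    pdf_bytes: bytes,
    idempotency_key: Optional[str] = None,
) -> urllib.request.Request:
    """Build a multipart/form-data POST carrying one PDF file."""
    boundary = uuid.uuid4().hex
    body = b"".join([
        f"--{boundary}\r\n".encode(),
        f'Content-Disposition: form-data; name="file"; filename="{filename}"\r\n'.encode(),
        b"Content-Type: application/pdf\r\n\r\n",
        pdf_bytes,
        f"\r\n--{boundary}--\r\n".encode(),
    ])
    headers = {
        "Authorization": f"Bearer {token}",  # assumed auth scheme
        "Content-Type": f"multipart/form-data; boundary={boundary}",
        # Assumed header name: reusing the same key on a retry lets the
        # server recognize a duplicate upload.
        "Idempotency-Key": idempotency_key or uuid.uuid4().hex,
    }
    # Hypothetical endpoint path.
    return urllib.request.Request(
        f"{BASE_URL}/documents", data=body, headers=headers, method="POST"
    )


request = build_upload_request("test_token_placeholder", "sample.pdf", b"%PDF-1.7 ...")
print(request.get_method(), request.full_url)
# To send it: urllib.request.urlopen(request)
```

Using a test-mode token here keeps the first experiments in the sandbox, so no live credits are spent while the mapping is still being worked out.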