Run an Extraction

Extraction pulls structured data from unstructured documents using AI.

Quick reference

Watch: Running an Extraction. See how to extract data from a batch of documents.

  1. Go to Extractions in the sidebar
  2. Click New Extraction
  3. Select an extraction model (or create one)
  4. Select documents from your workspace
  5. Click Run and choose an AI mode: fast or thorough

When to use Extraction

Not for large tables

Do not use Extraction to convert a large table (50+ rows), such as a bank statement or general ledger, to Excel. Use OCR to Excel instead.

Extraction is designed for specific fields and short lists (under 50 items). For larger tables, accuracy drops significantly.

Use Extraction when you want to pull specific data fields from documents without matching them to a transaction list. Common use cases:

  • Contract review — Extract key terms, parties, dates, amounts
  • Invoice processing — Extract invoice data for import
  • Lease analysis — Pull lease terms for IFRS 16
  • Document indexing — Extract metadata from similar documents

Need to match documents to transactions?

If you have an Excel list of transactions to verify against documents, use Test of Details instead.

OCR to Excel

For scanned documents with tables, use OCR to Excel in your Workspace to convert them to spreadsheets first.

See Workspace Tools for details.
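
The guidance in this section can be condensed into a small decision rule. The following is a minimal sketch in Python and is not part of the product: the `Task` fields and the `recommend_tool` function are hypothetical names, and the order in which the checks run is an assumption. Only the 50-row threshold and the tool names come from this page.

```python
from dataclasses import dataclass

# Threshold taken from the guidance on this page: tables of 50+ rows
# are better handled by OCR to Excel.
LARGE_TABLE_ROWS = 50


@dataclass
class Task:
    # Hypothetical description of what you want to do with a document.
    table_rows_to_convert: int = 0      # rows in a table you want as a spreadsheet
    has_transaction_list: bool = False  # True if you have an Excel list to verify


def recommend_tool(task: Task) -> str:
    """Return the tool this page suggests for the given task."""
    # Assumption: matching against a transaction list is checked first.
    if task.has_transaction_list:
        return "Test of Details"
    if task.table_rows_to_convert >= LARGE_TABLE_ROWS:
        return "OCR to Excel"
    # Specific fields and short lists (under 50 items).
    return "Extraction"
```

For example, a lease with a dozen terms to pull maps to Extraction, while a 300-row general ledger maps to OCR to Excel.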

Review and validate results

After extraction, review the results alongside the source document. The review process works the same as in Test of Details: click any value to see its grounding (the exact source location in the document), verify it, correct it if needed, then validate.

For the complete review workflow including grounding, validation, and making corrections, see Reviewing AI Results.

Debugging extraction errors

If extraction gives wrong values, use the OCR overlay in the PDF viewer to check whether the error comes from OCR (text recognition) or AI (interpretation). See "Extraction gives wrong values" for the complete debugging guide.
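
The reasoning behind the OCR-versus-AI check can be sketched in code. This is an illustration only, assuming you have copied the recognized text for the relevant page and know the correct value from the source document. `likely_error_source` is a hypothetical helper, not a product feature.

```python
def _normalize(text: str) -> str:
    # Collapse runs of whitespace and ignore case so that line breaks
    # and capitalization do not affect the comparison.
    return " ".join(text.split()).casefold()


def likely_error_source(ocr_text: str, correct_value: str) -> str:
    """Guess whether a wrong extracted value stems from OCR or from AI.

    If the correct value appears in the recognized text, OCR read it
    properly, so the interpretation step is the more likely culprit.
    Otherwise the text recognition itself probably went wrong.
    """
    if not correct_value.strip():
        raise ValueError("correct_value must not be empty")
    if _normalize(correct_value) in _normalize(ocr_text):
        return "AI"
    return "OCR"
```

A result of "OCR" suggests the scan quality is the issue, while "AI" suggests the recognized text was fine and the value was misread during interpretation.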

Batch extraction

For processing many similar documents:

  1. Create or select an appropriate extraction model
  2. Select multiple documents
  3. Run extraction as a batch
  4. Review results in bulk

Processing limits

Extraction processes only the first 200 pages of each document. Split longer documents before processing.
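
To plan a split, the page ranges can be worked out ahead of time. The following is a minimal sketch in Python, assuming 1-based, inclusive page numbers. `split_page_ranges` is a hypothetical helper that only computes the ranges; the split itself is done with whichever PDF tool you use.

```python
# Limit stated on this page: only the first 200 pages are processed.
PAGE_LIMIT = 200


def split_page_ranges(total_pages: int, limit: int = PAGE_LIMIT) -> list[tuple[int, int]]:
    """Return (first_page, last_page) ranges, each at most `limit` pages long."""
    if total_pages < 0:
        raise ValueError("total_pages must not be negative")
    if limit < 1:
        raise ValueError("limit must be at least 1")
    # Step through the document in blocks of `limit` pages; the last
    # block is shortened so it ends on the final page.
    return [
        (start, min(start + limit - 1, total_pages))
        for start in range(1, total_pages + 1, limit)
    ]


print(split_page_ranges(450))  # [(1, 200), (201, 400), (401, 450)]
```

A 450-page document therefore becomes three files, each of which fits within the limit.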