Create & refine Extraction Models

Extraction models define what data to pull from documents.

Field types

Type         Use for           Example
Text         Free-form text    Vendor name, description
Number       Numeric values    Amounts, quantities
Date         Date values       Invoice date, due date
List/Table   Repeating items   Line items on an invoice

Extracting lists (line items)

The List/Table field type extracts repeating data from a single document — like invoice line items, transaction rows, or inventory lists.

How list extraction works

Define a list field with its columns (sub-fields):

Field          Columns
Line Items     Description, Quantity, Unit Price, Amount
Transactions   Date, Reference, Debit, Credit

Moby extracts each row as a separate entry, preserving the table structure.
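
As a rough illustration of the structure described above, the sketch below models a list field and its extracted rows in Python. The class names (`Column`, `ListField`), the type labels, and the sample row values are hypothetical and chosen for illustration only; they are not Moby's API or storage format.

```python
from dataclasses import dataclass, field


@dataclass
class Column:
    """One sub-field (column) of a list field. Hypothetical structure."""
    name: str
    type: str  # one of the field types above: "Text", "Number", "Date"


@dataclass
class ListField:
    """A List/Table field: a name plus its column definitions."""
    name: str
    columns: list[Column] = field(default_factory=list)

    def empty_row(self) -> dict[str, None]:
        # Each extracted row is one entry keyed by column name.
        return {col.name: None for col in self.columns}


line_items = ListField(
    name="Line Items",
    columns=[
        Column("Description", "Text"),
        Column("Quantity", "Number"),
        Column("Unit Price", "Number"),
        Column("Amount", "Number"),
    ],
)

# Illustrative rows as they might look after extraction (made-up values).
rows = [
    {"Description": "Widget", "Quantity": 2, "Unit Price": 10.0, "Amount": 20.0},
    {"Description": "Gadget", "Quantity": 1, "Unit Price": 5.5, "Amount": 5.5},
]

# Every row carries exactly the columns defined on the field.
assert all(set(row) == set(line_items.empty_row()) for row in rows)
```

Keeping one dictionary per row mirrors the idea that each table row becomes a separate entry with the same set of columns.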

When to use list extraction

Good for:

  • Invoice line items (up to ~50 rows)
  • Short transaction lists
  • Inventory summaries
  • Contract schedules

Limits: Quality drops after ~50 rows

List extraction works best for smaller tables. For tables with more than about 50 rows, extraction accuracy decreases significantly.

For large tables, use OCR to Excel instead. See Workspace Tools → OCR to Excel.

When to choose which

Scenario                               Use
Invoice with 20 line items             List extraction
Bank statement with 500 transactions   OCR to Excel
Contract with a fee schedule           List extraction
Full general ledger export             OCR to Excel
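
The choice can be read as a simple threshold rule on table size. The following is a minimal sketch, assuming the ~50-row limit stated above is used as a hard cutoff; the function name `choose_tool` and the constant are hypothetical, and in practice the limit is approximate rather than exact.

```python
# Approximate row count above which list extraction quality drops
# (taken from the limits described above; treated here as a hard cutoff).
LIST_EXTRACTION_ROW_LIMIT = 50


def choose_tool(estimated_rows: int) -> str:
    """Return the suggested tool for a table with the given number of rows."""
    if estimated_rows < 0:
        raise ValueError("estimated_rows must be non-negative")
    if estimated_rows <= LIST_EXTRACTION_ROW_LIMIT:
        return "List extraction"
    return "OCR to Excel"


print(choose_tool(20))   # invoice with 20 line items
print(choose_tool(500))  # bank statement with 500 transactions
```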

Watch: Creating Extraction Models. Learn how to use AI to generate models instantly and refine them for accuracy.

The fastest way to create a model is to let Moby generate it for you:

  1. Go to Models in the sidebar
  2. Click New Extraction Model
  3. Click Generate with AI
  4. Provide a prompt, upload a workpaper, or select sample documents
  5. Review the suggested fields
  6. Adjust names and descriptions if needed
  7. Save the model

Start with AI generation

AI-assisted generation is the default and recommended approach. It saves time and often catches fields you might miss manually.

Manual field creation

If you prefer to build from scratch:

  1. Click New Extraction Model
  2. Add fields manually one by one
  3. Give each field a clear name and description
  4. Save the model

Testing and iteration

Test your model on a few documents before running a large batch:

  1. Select your model
  2. Run on 3-5 sample documents
  3. Review extraction accuracy
  4. Refine field descriptions if needed
  5. Re-test until satisfied
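
One way to make the review step concrete is to compare extracted values against values you have checked by hand and compute a per-field match rate. The sketch below assumes you have copied the results into plain Python dictionaries yourself; `field_accuracy` is a hypothetical helper, not a Moby feature, and the sample values are invented for illustration.

```python
def field_accuracy(expected: list[dict], extracted: list[dict]) -> dict[str, float]:
    """Share of sample documents where each field matched the expected value."""
    if len(expected) != len(extracted):
        raise ValueError("expected and extracted must cover the same documents")
    if not expected:
        return {}
    fields = sorted({name for doc in expected for name in doc})
    accuracy = {}
    for name in fields:
        # Only count documents where a reference value exists for this field.
        pairs = [
            (exp[name], ext.get(name))
            for exp, ext in zip(expected, extracted)
            if name in exp
        ]
        matches = sum(1 for want, got in pairs if want == got)
        accuracy[name] = matches / len(pairs)
    return accuracy


# Three sample documents: hand-checked values vs. what the model returned.
expected = [
    {"Vendor Name": "Acme", "Invoice Total Amount": "100,00"},
    {"Vendor Name": "Globex", "Invoice Total Amount": "250,50"},
    {"Vendor Name": "Initech", "Invoice Total Amount": "75,00"},
]
extracted = [
    {"Vendor Name": "Acme", "Invoice Total Amount": "100,00"},
    {"Vendor Name": "Globex", "Invoice Total Amount": "205,50"},
    {"Vendor Name": "Initech", "Invoice Total Amount": "75,00"},
]

result = field_accuracy(expected, extracted)
# Fields that missed on at least one document are candidates for a better description.
weak_fields = [name for name, score in result.items() if score < 1.0]
print(weak_fields)  # ['Invoice Total Amount']
```

Fields that score below 1.0 are the ones whose descriptions are worth refining before the next test run.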

Tips for better accuracy

  • Use descriptive field names — "Invoice Total Amount" is better than "Amount"
  • Add field descriptions — Explain where the field typically appears
  • Handle variations — Mention alternate formats in descriptions (e.g., "Total" vs "Grand Total")
  • Test across clients — Models may need adjustment for different document formats

Advanced Extraction Techniques

For complex extraction scenarios, see the detailed guide: Advanced Extraction Models

The advanced guide covers:

  • Variables — Make models reusable with dynamic parameters like closing dates and entity identifiers. Use {{variable_name}} syntax in your instructions and field explanations.

  • List/Table extraction — Extract repeating rows (invoice line items, transaction lists) with proper column definitions.

  • Loan amortization schedules — Handle date-based row identification and balance extraction.

  • Format requirements — Dates (DD-MM-YYYY), numbers (comma decimal), and text formatting rules.

  • Debugging extraction issues — Troubleshooting and iterative refinement techniques.
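
To illustrate two of the topics listed above, the sketch below shows how `{{variable_name}}` placeholders could be filled in and how values in the stated formats (DD-MM-YYYY dates, comma-decimal numbers) could be parsed. This is an illustration of the conventions only, not how Moby processes them internally: the function names, the variable names `closing_date` and `entity_id`, and the sample values are hypothetical, and the number parser assumes no thousands separators.

```python
import re
from datetime import date, datetime

# Matches {{variable_name}} where the name is letters, digits, or underscores.
VARIABLE_PATTERN = re.compile(r"\{\{(\w+)\}\}")


def render_variables(template: str, values: dict[str, str]) -> str:
    """Replace each {{variable_name}} placeholder with its supplied value."""
    def substitute(match: re.Match) -> str:
        name = match.group(1)
        if name not in values:
            raise KeyError(f"no value supplied for variable {name!r}")
        return values[name]

    return VARIABLE_PATTERN.sub(substitute, template)


def parse_date(text: str) -> date:
    """Parse a DD-MM-YYYY date string."""
    return datetime.strptime(text, "%d-%m-%Y").date()


def parse_number(text: str) -> float:
    """Parse a number with a comma decimal separator (no thousands separators)."""
    return float(text.replace(",", "."))


instruction = render_variables(
    "Extract the balance as of {{closing_date}} for {{entity_id}}.",
    {"closing_date": "31-12-2024", "entity_id": "ENT-001"},
)
print(instruction)  # Extract the balance as of 31-12-2024 for ENT-001.
print(parse_date("31-12-2024"))  # 2024-12-31
print(parse_number("1234,56"))  # 1234.56
```

Raising an error for a missing variable, rather than leaving the placeholder in place, makes it obvious when a reusable model is run without all of its parameters.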