Create & refine Extraction Models

Extraction models define what data to pull from documents.

Field types

Type	Use for	Example
Text	Free-form text	Vendor name, description
Number	Numeric values	Amounts, quantities
Date	Date values	Invoice date, due date
List/Table	Repeating items	Line items on an invoice

Extracting lists (line items)

The List/Table field type extracts repeating data from a single document — like invoice line items, transaction rows, or inventory lists.

How list extraction works

Define a list field with its columns (sub-fields):

Field	Columns
Line Items	Description, Quantity, Unit Price, Amount
Transactions	Date, Reference, Debit, Credit

Moby extracts each row as a separate entry, preserving the table structure.
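As a rough mental model, a list field can be pictured as a named set of columns plus one entry per extracted row. The sketch below is illustrative only: `ListField` and `add_row` are hypothetical names, not part of Moby, and the sample values are made up.

```python
from dataclasses import dataclass, field


@dataclass
class ListField:
    """Hypothetical stand-in for a List/Table field definition (not a Moby API)."""
    name: str
    columns: list[str]
    rows: list[dict[str, str]] = field(default_factory=list)

    def add_row(self, values: list[str]) -> None:
        # Each extracted row becomes its own entry, keyed by column name,
        # so the table structure is preserved.
        if len(values) != len(self.columns):
            raise ValueError(
                f"expected {len(self.columns)} values, got {len(values)}"
            )
        self.rows.append(dict(zip(self.columns, values)))


# Column names taken from the "Line Items" example above; row values are invented.
line_items = ListField(
    "Line Items", ["Description", "Quantity", "Unit Price", "Amount"]
)
line_items.add_row(["Widget", "2", "10.00", "20.00"])
line_items.add_row(["Gadget", "1", "5.50", "5.50"])
```

Keying each row by column name, rather than by position, makes it easier to spot a row where a value landed in the wrong column during review.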

When to use list extraction

Good for:

Invoice line items (up to ~50 rows)
Short transaction lists
Inventory summaries
Contract schedules

Limits: Quality drops after ~50 rows

List extraction works best for smaller tables. Once a document exceeds roughly 50 rows, extraction accuracy decreases significantly.

For large tables, use OCR to Excel instead. See Workspace Tools → OCR to Excel.

When to choose which

Scenario	Use
Invoice with 20 line items	List extraction
Bank statement with 500 transactions	OCR to Excel
Contract with a fee schedule	List extraction
Full general ledger export	OCR to Excel
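The rule of thumb behind this table can be written as a one-line check. This is a minimal sketch that assumes the approximate 50-row guideline is treated as a hard cutoff; `choose_tool` and `ROW_LIMIT` are hypothetical names, and in practice the boundary is soft.

```python
ROW_LIMIT = 50  # approximate guideline from this page, treated here as a hard cutoff


def choose_tool(estimated_rows: int) -> str:
    """Suggest a tool based on how many table rows the document contains."""
    if estimated_rows <= ROW_LIMIT:
        return "List extraction"
    return "OCR to Excel"


print(choose_tool(20))   # invoice with 20 line items
print(choose_tool(500))  # bank statement with 500 transactions
```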

Creating a new model (recommended: AI-assisted)

Watch: Creating Extraction Models. Learn how to use AI to generate models instantly and refine them for accuracy.

The fastest way to create a model is to let Moby generate it for you:

Go to Models in the sidebar
Click New Extraction Model
Click Generate with AI
Provide a prompt, upload a workpaper, or select sample documents
Review the suggested fields
Adjust names and descriptions if needed
Save the model

Start with AI generation

AI-assisted generation is the default and recommended approach. It saves time and often catches fields you might miss manually.

Manual field creation

If you prefer to build from scratch:

Click New Extraction Model
Add fields manually one by one
Give each field a clear name and description
Save the model

Testing and iteration

Test your model on a few documents before running a large batch:

Select your model
Run on 3-5 sample documents
Review extraction accuracy
Refine field descriptions if needed
Re-test until satisfied
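If you keep a record of the values you expect from each sample document, the review step can be reduced to a per-field score. The sketch below assumes you have copied the expected and extracted values into plain dictionaries by hand; `field_accuracy` is a hypothetical helper, and all sample values are invented.

```python
def field_accuracy(
    expected: list[dict[str, str]], extracted: list[dict[str, str]]
) -> dict[str, float]:
    """Share of sample documents in which each field matched the expected value."""
    totals: dict[str, int] = {}
    hits: dict[str, int] = {}
    for want, got in zip(expected, extracted):
        for name, value in want.items():
            totals[name] = totals.get(name, 0) + 1
            # Compare after trimming whitespace; a missing field counts as a miss.
            if got.get(name, "").strip() == value.strip():
                hits[name] = hits.get(name, 0) + 1
    return {name: hits.get(name, 0) / totals[name] for name in totals}


# Three sample documents (invented values).
expected = [
    {"Vendor Name": "Acme Ltd", "Invoice Total Amount": "120,50"},
    {"Vendor Name": "Globex", "Invoice Total Amount": "80,00"},
    {"Vendor Name": "Initech", "Invoice Total Amount": "15,75"},
]
extracted = [
    {"Vendor Name": "Acme Ltd", "Invoice Total Amount": "120,50"},
    {"Vendor Name": "Globex", "Invoice Total Amount": "8,00"},
    {"Vendor Name": "Initech", "Invoice Total Amount": "15,75"},
]

scores = field_accuracy(expected, extracted)
# Fields that missed on at least one sample are candidates for a better description.
needs_work = [name for name, score in scores.items() if score < 1.0]
```

Scoring per field rather than per document points directly at the field description that needs refining before the next test run.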

Tips for better accuracy

Use descriptive field names — "Invoice Total Amount" is better than "Amount"
Add field descriptions — Explain where the field typically appears
Handle variations — Mention alternate formats in descriptions (e.g., "Total" vs "Grand Total")
Test across clients — Models may need adjustment for different document formats
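The first two tips can be turned into a quick self-check before saving a model. This is an illustrative sketch only: `review_field` and the `GENERIC_NAMES` list are assumptions made for the example, not a Moby feature.

```python
# Illustrative list of names that are usually too vague on their own (an assumption).
GENERIC_NAMES = {"amount", "date", "total", "name", "number"}


def review_field(name: str, description: str) -> list[str]:
    """Return warnings for a field definition that is likely to extract poorly."""
    warnings = []
    if name.strip().lower() in GENERIC_NAMES:
        warnings.append(f'"{name}" is generic; prefer a specific name')
    if not description.strip():
        warnings.append(f'"{name}" has no description of where the field appears')
    return warnings


vague = review_field("Amount", "")
clear = review_field(
    "Invoice Total Amount",
    'Bottom of the invoice; may be labeled "Total" or "Grand Total"',
)
```

Here `vague` collects two warnings and `clear` collects none, which mirrors the difference between "Amount" and "Invoice Total Amount" in the tips above.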

Advanced Extraction Techniques

For complex extraction scenarios, see the detailed guide: Advanced Extraction Models

The advanced guide covers:

Variables — Make models reusable with dynamic parameters like closing dates and entity identifiers. Use {{variable_name}} syntax in your instructions and field explanations.
List/Table extraction — Extract repeating rows (invoice line items, transaction lists) with proper column definitions.
Loan amortization schedules — Handle date-based row identification and balance extraction.
Format requirements — Dates (DD-MM-YYYY), numbers (comma decimal), and text formatting rules.
Debugging extraction issues — Troubleshooting and iterative refinement techniques.
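To illustrate how `{{variable_name}}` placeholders and the format rules fit together, here is a minimal sketch of the substitution done outside Moby. It is not Moby's implementation. The helper names, the variable names `entity_id` and `closing_date`, the tolerance for spaces inside the braces, and the choice of two decimal places are all assumptions for the example.

```python
import re
from datetime import date

# Matches {{variable_name}}; allowing spaces inside the braces is an assumption.
PLACEHOLDER = re.compile(r"\{\{\s*(\w+)\s*\}\}")


def fill_variables(template: str, values: dict[str, str]) -> str:
    """Replace each {{name}} with its value, failing loudly on a missing one."""
    def replace(match: re.Match) -> str:
        key = match.group(1)
        if key not in values:
            raise KeyError(f"no value supplied for variable {key!r}")
        return values[key]

    return PLACEHOLDER.sub(replace, template)


def format_date(d: date) -> str:
    """Render a date as DD-MM-YYYY."""
    return d.strftime("%d-%m-%Y")


def format_number(x: float) -> str:
    """Render a number with a comma as the decimal separator (two decimals assumed)."""
    return f"{x:.2f}".replace(".", ",")


instruction = fill_variables(
    "Extract the balance of {{entity_id}} as of {{closing_date}}.",
    {"entity_id": "ACME-01", "closing_date": format_date(date(2024, 12, 31))},
)
```

Raising an error on an unfilled variable, rather than leaving the placeholder in the text, keeps a half-configured instruction from being run unnoticed.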
