Create & refine Extraction Models
Extraction models define what data to pull from documents.
Field types
| Type | Use for | Example |
|---|---|---|
| Text | Free-form text | Vendor name, description |
| Number | Numeric values | Amounts, quantities |
| Date | Date values | Invoice date, due date |
| List/Table | Repeating items | Line items on an invoice |
Extracting lists (line items)
The List/Table field type extracts repeating data from a single document — like invoice line items, transaction rows, or inventory lists.
How list extraction works
Define a list field with its columns (sub-fields):
| Field | Columns |
|---|---|
| Line Items | Description, Quantity, Unit Price, Amount |
| Transactions | Date, Reference, Debit, Credit |
Moby extracts each row as a separate entry, preserving the table structure.
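To make the field-and-columns idea concrete, here is a minimal sketch of how a list field and its extracted rows could be modelled. The `ListField` class and its methods are hypothetical illustrations, not part of Moby's API, and the sample row values are invented.

```python
from dataclasses import dataclass, field


@dataclass
class ListField:
    """Hypothetical model of a List/Table field: a name plus its columns."""
    name: str
    columns: list[str]
    rows: list[dict[str, str]] = field(default_factory=list)

    def add_row(self, *values: str) -> None:
        # Each extracted row must supply exactly one value per column.
        if len(values) != len(self.columns):
            raise ValueError(
                f"expected {len(self.columns)} values, got {len(values)}"
            )
        self.rows.append(dict(zip(self.columns, values)))


# Column names taken from the "Line Items" example in the table above.
line_items = ListField(
    "Line Items", ["Description", "Quantity", "Unit Price", "Amount"]
)
line_items.add_row("Consulting", "2", "150,00", "300,00")  # invented sample row
line_items.add_row("Travel", "1", "80,00", "80,00")        # invented sample row
```

Each entry in `line_items.rows` is one table row keyed by column name, which mirrors the "one entry per row" behaviour described above.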
When to use list extraction
Good for:
- Invoice line items (up to ~50 rows)
- Short transaction lists
- Inventory summaries
- Contract schedules
Limits: Quality drops after ~50 rows
List extraction works best for smaller tables. Beyond roughly 50 rows, extraction accuracy decreases significantly.
For large tables, use OCR to Excel instead. See Workspace Tools → OCR to Excel.
| Scenario | Use |
|---|---|
| Invoice with 20 line items | List extraction |
| Bank statement with 500 transactions | OCR to Excel |
| Contract with a fee schedule | List extraction |
| Full general ledger export | OCR to Excel |
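The rule of thumb in the table can be sketched as a small helper. The function name `recommend_tool` and the constant `ROW_LIMIT` are hypothetical, and the threshold is the approximate ~50-row guideline from this page, not a hard limit enforced by Moby.

```python
ROW_LIMIT = 50  # approximate guideline from this page, not a hard product limit


def recommend_tool(expected_rows: int) -> str:
    """Suggest a tool based on how many rows the table is expected to have."""
    if expected_rows < 0:
        raise ValueError("expected_rows must be non-negative")
    if expected_rows <= ROW_LIMIT:
        return "List extraction"
    return "OCR to Excel"


print(recommend_tool(20))   # invoice with 20 line items
print(recommend_tool(500))  # bank statement with 500 transactions
```

For the two scenarios in the table that state a row count, this returns "List extraction" for 20 rows and "OCR to Excel" for 500 rows.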
Creating a new model (recommended: AI-assisted)
Watch: Creating Extraction Models. Learn how to use AI to generate models instantly and refine them for accuracy.
The fastest way to create a model is to let Moby generate it for you:
- Go to Models in the sidebar
- Click New Extraction Model
- Click Generate with AI
- Provide a prompt, upload a workpaper, or select sample documents
- Review the suggested fields
- Adjust names and descriptions if needed
- Save the model
AI-assisted generation is the default and recommended approach. It saves time and often catches fields you might miss manually.
Manual field creation
If you prefer to build from scratch:
- Click New Extraction Model
- Add fields manually one by one
- Give each field a clear name and description
- Save the model
Testing and iteration
Test your model on a few documents before running a large batch:
- Select your model
- Run on 3-5 sample documents
- Review extraction accuracy
- Refine field descriptions if needed
- Re-test until satisfied
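One way to make the "review extraction accuracy" step measurable is to compare extracted values against values you have checked by hand. The sketch below assumes you have copied both sets of values into plain dictionaries yourself; `field_accuracy` is a hypothetical helper, not a Moby feature, and the sample values are invented.

```python
def field_accuracy(
    expected: list[dict[str, str]],
    extracted: list[dict[str, str]],
) -> dict[str, float]:
    """Return, per field, the share of sample documents extracted correctly."""
    if len(expected) != len(extracted):
        raise ValueError("need one extracted result per expected result")
    totals: dict[str, int] = {}
    hits: dict[str, int] = {}
    for want, got in zip(expected, extracted):
        for name, value in want.items():
            totals[name] = totals.get(name, 0) + 1
            # A missing field counts as a miss, same as a wrong value.
            if got.get(name) == value:
                hits[name] = hits.get(name, 0) + 1
    return {name: hits.get(name, 0) / totals[name] for name in totals}


# Three hand-checked sample documents (invented values).
expected = [
    {"Vendor Name": "Acme", "Invoice Total Amount": "300,00"},
    {"Vendor Name": "Globex", "Invoice Total Amount": "80,00"},
    {"Vendor Name": "Initech", "Invoice Total Amount": "1.200,00"},
]
extracted = [
    {"Vendor Name": "Acme", "Invoice Total Amount": "300,00"},
    {"Vendor Name": "Globex", "Invoice Total Amount": "80,00"},
    {"Vendor Name": "Initech", "Invoice Total Amount": "200,00"},
]
scores = field_accuracy(expected, extracted)
```

Fields with a low score are the ones whose descriptions are worth refining before the next test run.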
Tips for better accuracy
- Use descriptive field names — "Invoice Total Amount" is better than "Amount"
- Add field descriptions — Explain where the field typically appears
- Handle variations — Mention alternate formats in descriptions (e.g., "Total" vs "Grand Total")
- Test across clients — Models may need adjustment for different document formats
Advanced Extraction Techniques
For complex extraction scenarios, see the detailed guide: Advanced Extraction Models
The advanced guide covers:
- Variables: Make models reusable with dynamic parameters like closing dates and entity identifiers. Use {{variable_name}} syntax in your instructions and field explanations.
- List/Table extraction: Extract repeating rows (invoice line items, transaction lists) with proper column definitions.
- Loan amortization schedules: Handle date-based row identification and balance extraction.
- Format requirements: Dates (DD-MM-YYYY), numbers (comma decimal), and text formatting rules.
- Debugging extraction issues: Troubleshooting and iterative refinement techniques.
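To illustrate the variable syntax and the format requirements listed above, here is a rough sketch of how {{variable_name}} placeholders could be filled in and how values in DD-MM-YYYY and comma-decimal form could be parsed after export. All function names are hypothetical, and this does not describe how Moby itself processes variables. Treating "." as a thousands separator is an added assumption; the guide only specifies a comma decimal.

```python
import re
from datetime import date, datetime
from decimal import Decimal

# Matches {{name}} with optional spaces inside the braces.
_VARIABLE = re.compile(r"\{\{\s*(\w+)\s*\}\}")


def fill_variables(template: str, values: dict[str, str]) -> str:
    """Replace each {{variable_name}} placeholder with its supplied value."""
    def replace(match: re.Match) -> str:
        name = match.group(1)
        if name not in values:
            raise KeyError(f"no value supplied for variable {name!r}")
        return values[name]
    return _VARIABLE.sub(replace, template)


def parse_date(text: str) -> date:
    """Parse a DD-MM-YYYY date string."""
    return datetime.strptime(text.strip(), "%d-%m-%Y").date()


def parse_number(text: str) -> Decimal:
    """Parse a comma-decimal number such as '1.234,56'.

    Assumption: '.' is only ever a thousands separator and is dropped.
    """
    cleaned = text.strip().replace(".", "").replace(",", ".")
    return Decimal(cleaned)


instruction = fill_variables(
    "Extract the balance as of {{closing_date}} for {{entity_id}}",
    {"closing_date": "31-12-2024", "entity_id": "ACME-01"},  # invented values
)
print(instruction)                 # Extract the balance as of 31-12-2024 for ACME-01
print(parse_date("31-12-2024"))    # 2024-12-31
print(parse_number("1.234,56"))    # 1234.56
```

Raising an error on a missing variable, rather than leaving the placeholder in the text, is a design choice in this sketch so that an unfilled instruction is noticed early.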