Extraction gives wrong values
The AI extracted a value, but it's incorrect. Here's how to debug and fix it.
Debug: Is it OCR or AI?
When extraction gives wrong values, the error comes from one of two places:
| Source | What went wrong | How to fix |
|---|---|---|
| OCR | The document text wasn't read correctly | Improve document quality |
| AI | Text was read correctly, but AI misinterpreted it | Adjust extraction model |
How to check
Use the PDF viewer's OCR overlay to see exactly what text Moby reads from your document:
- Open the document in the PDF viewer
- Click the OCR button in the viewer toolbar (or toggle "Show OCR text")
- The viewer now shows the recognized text overlaid on the document
- Find the value that was extracted incorrectly
- Compare what the OCR shows vs. what the document actually says
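
The same comparison can be run on text copied out of the overlay. The following is a minimal sketch, assuming you have pasted the OCR text into a plain string; `locate_error_source` and its parameters are hypothetical names used for illustration, not part of Moby.

```python
def locate_error_source(ocr_text: str, expected_value: str, extracted_value: str) -> str:
    """Classify a wrong extraction as an OCR problem or an AI problem.

    ocr_text:        text copied from the OCR overlay (assumption: plain string)
    expected_value:  what the document actually says
    extracted_value: what the extraction returned
    """
    def normalize(text: str) -> str:
        # Collapse runs of whitespace so line breaks in the OCR text do not matter.
        return " ".join(text.split())

    if normalize(extracted_value) == normalize(expected_value):
        return "correct"
    # If the true value is present in the OCR text, the text was read correctly
    # and the AI picked or interpreted the wrong thing.
    if normalize(expected_value) in normalize(ocr_text):
        return "ai"
    # Otherwise the true value never made it through OCR.
    return "ocr"


print(locate_error_source("Subtotal: 1,000.00 Invoice total: 1,250.00", "1,250.00", "1,000.00"))
print(locate_error_source("Invoice total: l,25O.OO", "1,250.00", "l,25O.OO"))
```

The check is a plain substring match, so a short expected value can match inside a longer one (for example `250.00` inside `1,250.00`). Treat the result as a hint and confirm it in the overlay.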
If OCR is wrong
The OCR layer misread the document. Common symptoms:
- `0` read as `O` (zero vs. letter O)
- `1` read as `l` or `I`
- Numbers jumbled or in wrong order
- Text garbled or missing characters
Solutions:
- Use a higher quality scan (300 DPI minimum)
- Ensure good contrast (dark text on light background)
- Use native PDFs when possible (instead of scanned images)
- For stubborn documents, try OCR to Excel, which uses a different OCR engine
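
For fields that should contain only numbers, the zero/O and one/l/I confusions can be spotted mechanically. This is a rough sketch under that assumption; `LOOKALIKES`, `suspicious_characters`, and `suggest_correction` are hypothetical helpers, not Moby features, and they should not be applied to fields where letters are legitimate.

```python
# Letters that OCR commonly produces in place of digits (see symptoms above).
LOOKALIKES = {"O": "0", "o": "0", "l": "1", "I": "1"}


def suspicious_characters(value: str) -> list:
    """Return (position, found, likely_intended) for each letter that resembles a digit."""
    return [(i, ch, LOOKALIKES[ch]) for i, ch in enumerate(value) if ch in LOOKALIKES]


def suggest_correction(value: str) -> str:
    """Replace lookalike letters with the digits they most likely stand for."""
    return "".join(LOOKALIKES.get(ch, ch) for ch in value)


raw = "l,25O.OO"
print(suspicious_characters(raw))
print(suggest_correction(raw))
```

A suggested correction is only a guess at what the document says. Rescanning at higher quality remains the more reliable fix.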
If OCR is correct but the AI extracted the wrong value
The text was read correctly, but the AI misinterpreted it. Common causes:
- Ambiguous labels — Multiple similar values in the document
- Unclear field definitions — AI doesn't know which value you want
- Unusual document layout — AI confused by formatting
Solutions:
- Make your extraction model field descriptions more specific
- Add examples to clarify which value you want
- Use field hints like "The invoice total at the bottom of the page, not line item amounts"
Common extraction errors
Wrong amount extracted
Multiple amounts appear in the document and AI picked the wrong one.
Fix: Add context to your field description:
Invoice total (the final amount including tax, usually at the bottom)
Date format issues
The date appears correctly in the document but is extracted in the wrong format.
Fix: Specify the expected format in your field:
Invoice date (format: DD/MM/YYYY)
Missing values
Field exists in document but wasn't extracted.
Causes:
- Value is in an image or watermark (not real text)
- Value is in a header/footer that OCR missed
- Field description doesn't match how the value appears
Fix: Check the OCR overlay to confirm the text is recognized, then adjust the field description.
Extracted from wrong document section
AI pulled value from a different part of the document than intended.
Fix: Be specific about location:
Vendor name (from the "Bill From" section, not the "Bill To" section)
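
To confirm which section a value came from, you can look for the nearest section label that precedes it in the OCR text. This is a sketch only: it assumes the OCR text copied from the overlay follows reading order, which may not hold for multi-column layouts, and `section_of` is a hypothetical helper.

```python
def section_of(ocr_text: str, value: str, labels=("Bill From", "Bill To")):
    """Return the label that most closely precedes value in ocr_text, or None."""
    pos = ocr_text.find(value)
    if pos == -1:
        return None  # value is not in the OCR text at all
    best_label, best_pos = None, -1
    for label in labels:
        # Last occurrence of the label before the value.
        label_pos = ocr_text.rfind(label, 0, pos)
        if label_pos > best_pos:
            best_label, best_pos = label, label_pos
    return best_label


text = "Bill From\nAcme Supplies Ltd\nBill To\nGlobex Corp"
print(section_of(text, "Globex Corp"))
print(section_of(text, "Acme Supplies Ltd"))
```

If the extracted vendor name sits under "Bill To", the field description needs the location hint shown above.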
When to contact support
Contact support if:
- OCR consistently fails on similar document types
- AI errors persist despite clear field descriptions
- You see unexpected behavior not covered here