Receipt OCR Technology
How Eatomate extracts grocery items from receipt photos and matches them to nutrition databases with OCR-aware fuzzy matching.
Why Receipt Scanning Matters
Receipt scanning serves as the ground truth for physics-based reconciliation. When you scan your grocery receipts, Eatomate knows exactly which ingredients you purchased, so it can cross-reference them against your logged meals and achieve 95%+ accuracy.
The Accuracy Lock
Receipt scanning creates a closed system: Ingredients_in (receipts) = Meals_out (scans) + Waste. This mass conservation equation allows the system to backfill historical meals with 100% accuracy once ingredients expire.
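As a rough illustration of the mass-balance idea, the sketch below rearranges the equation to find how much of one ingredient is still unexplained by logged meals and waste. The function name, the gram units, and the sample numbers are assumptions made for this example, not Eatomate's actual implementation.

```python
def unaccounted_grams(purchased_g, logged_meals_g, waste_g=0.0):
    """Return grams of an ingredient not yet explained by meals or waste.

    Rearranges Ingredients_in = Meals_out + Waste into
    residual = Ingredients_in - Meals_out - Waste.
    """
    return purchased_g - sum(logged_meals_g) - waste_g


# Hypothetical example: 1000 g of rice bought, three meals logged, 50 g discarded.
residual = unaccounted_grams(1000.0, [150.0, 200.0, 175.0], waste_g=50.0)
print(residual)  # 425.0
```

A positive residual is the quantity that would need to be attributed to unlogged meals once the ingredient is used up or expires.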
How Receipt OCR Works
1. Photo Capture or Upload
You can either take a photo of your grocery receipt using your phone's camera, or upload an existing image/PDF. The app automatically detects receipt boundaries and adjusts perspective (unwarps skewed photos).
- Camera Scan: Take a photo from any angle with automatic perspective correction
- Upload: Select an existing image or PDF file from your device
- Supported formats: JPG, PNG, PDF
- Tip: Decent lighting helps, but even dim receipts work
2. OCR Extraction
Optical Character Recognition (OCR) powered by Mistral OCR extracts text from the receipt image. This includes item names, quantities, and prices.
- Technology: Mistral OCR API (99%+ accuracy on clear text)
- Output: Structured line items with text + bounding boxes
- Handles: Faded receipts, thermal paper, handwritten annotations
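To make "text + bounding boxes" concrete, here is a minimal sketch of how such output could be grouped into receipt rows, so that an item name and its price end up together. The `OcrLine` shape, the pixel units, and the `tolerance` value are hypothetical; the OCR service's actual response schema is not described here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OcrLine:
    """Hypothetical shape for one recognized text fragment."""
    text: str
    x: float       # left edge, in pixels
    y: float       # top edge, in pixels
    width: float
    height: float


def group_rows(lines, tolerance=5.0):
    """Group fragments whose top edges lie within `tolerance` pixels.

    Rows are returned top to bottom; fragments in a row are left to right.
    """
    rows = []
    for line in sorted(lines, key=lambda item: item.y):
        # Compare against the first (top-most) fragment of the current row.
        if rows and abs(line.y - rows[-1][0].y) <= tolerance:
            rows[-1].append(line)
        else:
            rows.append([line])
    return [sorted(row, key=lambda item: item.x) for row in rows]


fragments = [
    OcrLine("1.20", 200, 12, 40, 10),
    OcrLine("MILK", 10, 10, 60, 10),
    OcrLine("BREAD", 10, 40, 70, 10),
    OcrLine("0.95", 200, 41, 40, 10),
]
rows = group_rows(fragments)
```

Here `rows` holds two rows: the milk fragment with its price, then the bread fragment with its price.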
3. Intelligent Parsing
A custom parser separates receipt header/footer from actual grocery items. It recognizes store-specific formats (Tesco, Sainsbury's, Waitrose, etc.) and extracts quantities.
- Store recognition: Detects 30+ major UK supermarkets
- Quantity extraction: "2x Semi-Skimmed Milk" → quantity: 2
- Non-food filtering: Ignores non-food items (cleaning products, etc.)
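The quantity extraction step can be sketched with a small regular expression. This is a minimal example that assumes quantities appear as a leading "2x"-style prefix; the `ParsedItem` type and `parse_line` function are illustrative names, and real store formats will need more patterns.

```python
import re
from typing import NamedTuple


class ParsedItem(NamedTuple):
    name: str
    quantity: int


# Matches a leading multiplier such as "2x", "2 x", or "2X" followed by a space.
_QTY_PREFIX = re.compile(r"^\s*(\d+)\s*[xX]\s+(.+)$")


def parse_line(line):
    """Split a receipt line into item name and quantity (default 1)."""
    match = _QTY_PREFIX.match(line)
    if match:
        return ParsedItem(name=match.group(2).strip(), quantity=int(match.group(1)))
    return ParsedItem(name=line.strip(), quantity=1)


print(parse_line("2x Semi-Skimmed Milk"))
# ParsedItem(name='Semi-Skimmed Milk', quantity=2)
```

Requiring whitespace after the "x" keeps names such as "6 Xtra Large Eggs" from being misread as a quantity prefix.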
4. Database Matching with OCR-Aware Fuzzy Search
Each extracted item is matched against Eatomate's 100K canonical recipe database using a fuzzy trie with OCR-aware edit distance. This accounts for common OCR errors.
- Algorithm: Damerau-Levenshtein distance with OCR error weights
- Examples: "0rganic" → "Organic" (0→O confusion), "Cherr1es" → "Cherries" (1→i confusion)
- Threshold: 85%+ similarity required for auto-match
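The matching rule above can be sketched as follows. This is not Eatomate's implementation: it uses the optimal-string-alignment variant of Damerau-Levenshtein, omits the trie, and the confusion pairs and the 0.2 substitution weight are assumed values chosen for illustration.

```python
# Character pairs that OCR commonly confuses (assumed list, lowercase).
OCR_CONFUSIONS = {
    frozenset(pair)
    for pair in [("0", "o"), ("1", "i"), ("1", "l"), ("5", "s"), ("8", "b")]
}
CONFUSION_COST = 0.2  # assumed weight for a likely OCR mix-up


def substitution_cost(a, b):
    if a == b:
        return 0.0
    if frozenset((a, b)) in OCR_CONFUSIONS:
        return CONFUSION_COST
    return 1.0


def ocr_distance(source, target):
    """Weighted edit distance with adjacent transpositions (case-insensitive)."""
    s, t = source.lower(), target.lower()
    rows, cols = len(s) + 1, len(t) + 1
    d = [[0.0] * cols for _ in range(rows)]
    for i in range(rows):
        d[i][0] = float(i)
    for j in range(cols):
        d[0][j] = float(j)
    for i in range(1, rows):
        for j in range(1, cols):
            d[i][j] = min(
                d[i - 1][j] + 1.0,  # deletion
                d[i][j - 1] + 1.0,  # insertion
                d[i - 1][j - 1] + substitution_cost(s[i - 1], t[j - 1]),
            )
            # Adjacent transposition, e.g. "na" <-> "an".
            if i > 1 and j > 1 and s[i - 1] == t[j - 2] and s[i - 2] == t[j - 1]:
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1.0)
    return d[-1][-1]


def similarity(source, target):
    longest = max(len(source), len(target))
    if longest == 0:
        return 1.0
    return 1.0 - ocr_distance(source, target) / longest


def is_auto_match(source, target, threshold=0.85):
    return similarity(source, target) >= threshold
```

With these assumed weights, "0rganic" against "Organic" costs 0.2 rather than 1.0, so it clears the 85% threshold comfortably, while an unrelated substitution such as "Milk" against "Silk" scores 75% and is left for manual review.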
5. Barcode Scanning & Classification
After OCR extracts the receipt line items, each item is shown with the following options:
- Scan barcode (if barcoded): Point the camera at the product barcode for an instant nutrition lookup from the 2M+ product database. This auto-fills the product name, nutrition, and serving size.
- Mark food vs non-food: Tap "Food" or "Non-Food" to classify the item. Non-food items (cleaning products, toiletries, etc.) are excluded from pantry tracking.
- Set portion remaining: Enter the percentage remaining (defaults to 100%). Example: If you bought a jar of peanut butter and have already used half, set it to 50%.
Note: For non-barcoded items (fresh produce, bulk items), you can manually search the ingredient database or take a photo of the nutrition label if available.
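The per-item choices described in this step map naturally onto a small record type. The sketch below is one possible shape; the class name, field names, and the placeholder barcode are hypothetical, not Eatomate's data model.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ReceiptLineItem:
    """Hypothetical record for one classified receipt line."""
    name: str
    is_food: bool = True
    barcode: Optional[str] = None          # None for non-barcoded items
    portion_remaining_pct: float = 100.0   # defaults to a full, unopened item

    def __post_init__(self):
        if not 0.0 <= self.portion_remaining_pct <= 100.0:
            raise ValueError("portion_remaining_pct must be between 0 and 100")


def pantry_items(items):
    """Keep only food items; non-food items are excluded from pantry tracking."""
    return [item for item in items if item.is_food]


receipt = [
    ReceiptLineItem("Peanut Butter", barcode="0000000000000", portion_remaining_pct=50.0),
    ReceiptLineItem("Washing-Up Liquid", is_food=False),
    ReceiptLineItem("Loose Bananas"),
]
tracked = pantry_items(receipt)
```

In this example `tracked` contains the peanut butter at 50% and the bananas at the default 100%.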
6. Review & Confirm
After all items are classified and scanned, review the list and tap "Save Receipt". These ingredients are now in your pantry inventory and will be used for physics-based reconciliation.
Time Investment: Week 1: ~2-3 min per receipt (learning curve). Week 4: ~30-60 seconds (barcode database populated). Week 8+: ~10-20 seconds (network effects + auto-matching).
The 2M+ Barcode Database
When you scan product barcodes during receipt processing, Eatomate looks up nutrition data from a database of over 2 million products:
- UK Supermarkets: Tesco, Sainsbury's, Waitrose, Morrisons, Asda, Aldi, Lidl, Co-op, M&S
- International Brands: Coca-Cola, Nestlé, Unilever, PepsiCo, Kellogg's, Mars
- Health Brands: MyProtein, Grenade, Optimum Nutrition, Huel
- Store Brands: Tesco Finest, Sainsbury's Taste the Difference, etc.
Network Effects
Every barcode scan by any Eatomate user adds to the shared database. Rare or new products scanned by others become instantly available to you. This creates a network effect where the database grows smarter with every user.
How to Add Receipts
Eatomate supports adding grocery receipts to your pantry by scanning physical receipts with your phone camera or uploading images/PDFs from your device.
Scan or Upload Receipts
Step 1: Open Receipt Scanner
Tap the "+" button in the app and select "Scan Receipt". You'll see two options: "Scan Receipt" to use your camera, or "Upload Image/PDF" to select an existing file from your device.
Step 2: Position Receipt
Lay the receipt flat on a table or counter. Hold your phone directly above, about 20-30cm away. The entire receipt should be visible in the frame.
Step 3: Capture Photo
Tap the shutter button. The app auto-detects receipt boundaries and crops/unwarps the image. You'll see a preview—tap "Use Photo" to continue.
Step 4: Wait for OCR (3-8 seconds)
The app sends the photo to the Mistral OCR API. Processing takes 3-8 seconds, during which you'll see a loading spinner.
Step 5: Scan Barcodes & Classify Items
For each receipt line item extracted by OCR, you'll be prompted to:
- Scan the barcode (if the item is barcoded): Point the camera at the product for instant nutrition data from the 2M+ product database
- Mark as food or non-food: Filter out cleaning products, toiletries, etc.
- Set portion remaining: Enter the percentage remaining (defaults to 100%)
For non-barcoded items: Use manual search or take a photo of the nutrition label
Step 6: Review & Save
Review all classified items and tap "Save Receipt". These ingredients are now in your pantry inventory and will be used for physics-based reconciliation.
Time breakdown: OCR processing takes 3-8 seconds. Total receipt time in Week 1: ~2-3 minutes (includes barcode scanning, classification, and portion entry for each item). Week 8: ~10-20 seconds total (network effects + intelligent caching automatically match most items).
Privacy Note
Receipt photos are deleted after 30 days as part of Eatomate's GDPR compliance. Only the extracted item list (names, quantities, dates) is retained for reconciliation. See GDPR Compliance for details.
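A retention rule like this comes down to a single timestamp comparison. The sketch below assumes the upload time is stored as a timezone-aware datetime; the function name and the choice to treat exactly 30 days as expired are illustrative assumptions.

```python
from datetime import datetime, timedelta, timezone

RETENTION = timedelta(days=30)


def photo_is_expired(uploaded_at, now):
    """Return True once a receipt photo has reached the 30-day retention limit."""
    return now - uploaded_at >= RETENTION


uploaded = datetime(2024, 1, 1, tzinfo=timezone.utc)
print(photo_is_expired(uploaded, uploaded + timedelta(days=29)))  # False
print(photo_is_expired(uploaded, uploaded + timedelta(days=30)))  # True
```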
Troubleshooting
Problem: "No items detected"
Cause: Receipt photo was too blurry or low-resolution
Fix: Retake the photo with better lighting and hold the phone steady. Avoid shadows across the receipt.
Problem: "Wrong items extracted"
Cause: OCR misread text, or receipt has non-standard format
Fix: Manually correct the items on the review screen. The system learns from your corrections.
Problem: "Processing timeout"
Cause: Poor internet connection (OCR happens server-side)
Fix: Ensure you have stable Wi-Fi or cellular data. The app will retry automatically.
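For readers curious what an automatic retry can look like, here is a minimal sketch using exponential backoff. The attempt count, the delays, and the `with_retries` helper are assumptions; the app's actual retry policy is not documented here.

```python
import time


def with_retries(operation, attempts=3, base_delay=1.0, sleep=time.sleep):
    """Call `operation`, retrying on TimeoutError with doubling delays.

    The final failure is re-raised so the caller can show an error message.
    """
    for attempt in range(attempts):
        try:
            return operation()
        except TimeoutError:
            if attempt == attempts - 1:
                raise
            # Waits base_delay, then 2x, then 4x, and so on between attempts.
            sleep(base_delay * 2 ** attempt)
```

The `sleep` parameter is injectable so the helper can be exercised without real waiting, for example by passing a list's `append` method to record the delays instead.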