AI-Powered Receipt Validation: How Brands Run Reward Programs at Scale

Muralidharan
May 10, 2025 · AI Agents
BlueOshan AI Labs

Consumer reward programs live and die on receipt validation. Get it wrong in either direction and you have a problem: accept fraudulent submissions and program costs spiral out of control; reject legitimate ones and you damage the customer relationship you were trying to build. At modest volumes, a manual review team can manage. At 300,000 entries across a multi-month campaign, manual review is not a realistic option.

We built the validation pipeline for a global FMCG brand running a purchase-based reward program. The technical challenge was not just volume - it was the condition of the receipts themselves. Consumers submit photos taken on phones in poor light, receipts that are folded, crumpled, or faded from heat or age, and screenshots forwarded from retailer apps. Reading a clean, flat receipt with a single OCR engine is one problem. Reading that same receipt after it has been folded into a wallet for three weeks is a different problem entirely.

Why a single OCR engine is not enough

Standard OCR engines are optimized for clean, high-contrast documents. They produce confidence scores per field, but they are not designed to reconcile ambiguous readings, infer context from surrounding text, or apply campaign-specific logic to what they extract. When a field is partially obscured or a date is smudged, a single OCR engine either guesses or fails. Neither outcome is acceptable when the field in question determines whether a consumer qualifies for a reward.
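
To make the "guess or fail" problem concrete, here is a minimal sketch of a confidence gate that decides which fields need a second look instead of being trusted or dropped. It is an illustration, not the production code: the field names, the 0-100 confidence scale, the `MIN_CONFIDENCE` threshold, and the `fields_needing_second_pass` helper are all assumptions made for this example.

```python
from dataclasses import dataclass
from typing import Iterable, List

# Hypothetical values for illustration; real thresholds would be tuned per campaign.
REQUIRED_FIELDS = ("store", "date", "total")
MIN_CONFIDENCE = 90.0  # assumes the OCR engine reports confidence on a 0-100 scale


@dataclass(frozen=True)
class ExtractedField:
    name: str
    value: str
    confidence: float


def fields_needing_second_pass(fields: Iterable[ExtractedField]) -> List[str]:
    """Return required field names that are missing, empty, or low-confidence."""
    by_name = {f.name: f for f in fields}
    flagged = []
    for name in REQUIRED_FIELDS:
        field = by_name.get(name)
        if field is None or not field.value.strip() or field.confidence < MIN_CONFIDENCE:
            flagged.append(name)
    return flagged


fields = [
    ExtractedField("store", "ACME Mart", 98.5),
    ExtractedField("date", "2O25-O3-1", 61.0),  # smudged date, low confidence
]
# "date" is below the threshold and "total" was never extracted.
flagged = fields_needing_second_pass(fields)
```

Anything this gate flags is exactly the kind of field a single engine would otherwise have to guess at, which is why it is worth routing to a second stage.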

The two-model ensemble approach

Our pipeline uses AWS Textract for primary extraction - it reads every line of the receipt with per-field confidence scores and handles layout parsing better than generic models. The extracted text and the original image are then passed to Google Gemini for a reconciliation and validation pass. Gemini applies campaign criteria - qualifying store, date range, product category, minimum spend - and cross-checks the extracted fields against the visual content. Where Textract is uncertain, Gemini can reason about the image directly. The two-model design catches errors that either model alone would miss.
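
The four campaign criteria can also be written down as explicit rules. The sketch below is not how the pipeline described above applies them (there, the model does it during reconciliation); it only shows what the criteria amount to once the fields are known, for example as a deterministic cross-check on the model's verdict. The `Campaign` and `Receipt` types, the `failed_criteria` function, and all sample values are hypothetical.

```python
from dataclasses import dataclass
from datetime import date
from decimal import Decimal
from typing import FrozenSet, List


@dataclass(frozen=True)
class Campaign:
    stores: FrozenSet[str]
    start: date            # first qualifying purchase date (inclusive)
    end: date              # last qualifying purchase date (inclusive)
    categories: FrozenSet[str]
    min_spend: Decimal


@dataclass(frozen=True)
class Receipt:
    store: str
    purchased_on: date
    categories: FrozenSet[str]  # product categories found on the receipt
    total: Decimal


def failed_criteria(receipt: Receipt, campaign: Campaign) -> List[str]:
    """Return the names of the criteria the receipt fails; empty means it qualifies."""
    failures = []
    if receipt.store not in campaign.stores:
        failures.append("store")
    if not (campaign.start <= receipt.purchased_on <= campaign.end):
        failures.append("date")
    if not (receipt.categories & campaign.categories):
        failures.append("category")
    if receipt.total < campaign.min_spend:
        failures.append("min_spend")
    return failures


campaign = Campaign(
    stores=frozenset({"ACME Mart"}),
    start=date(2025, 1, 1),
    end=date(2025, 3, 31),
    categories=frozenset({"snacks"}),
    min_spend=Decimal("5.00"),
)
ok = Receipt("ACME Mart", date(2025, 2, 14), frozenset({"snacks", "dairy"}), Decimal("7.20"))
late_and_small = Receipt("ACME Mart", date(2025, 4, 2), frozenset({"snacks"}), Decimal("3.10"))
```

Returning the list of failed criteria, rather than a single pass/fail flag, makes it easier to explain a rejection to the consumer or to a reviewer.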

Fraud detection built into the pipeline

The same reconciliation pass that validates legitimate receipts also surfaces fraud signals: duplicate receipt hashes, inconsistent store and product combinations, dates that fall outside the campaign window, and images that have been digitally modified. Suspicious entries are flagged for human review rather than automatically rejected, which keeps wrongful rejections of legitimate customers low while ensuring nothing obviously fraudulent passes through.
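
As one example of these signals, duplicate detection by hash can be sketched as follows. This is an assumption about how such a check might look, not the pipeline's actual implementation: `image_digest`, `field_fingerprint`, and `DuplicateIndex` are hypothetical names, and an in-memory dictionary stands in for whatever store a real system would use.

```python
import hashlib
from typing import Dict, Optional


def image_digest(image_bytes: bytes) -> str:
    """Hash the raw upload; catches byte-identical re-submissions only."""
    return hashlib.sha256(image_bytes).hexdigest()


def field_fingerprint(store: str, purchased_on: str, total: str) -> str:
    """Hash normalized receipt fields; catches re-photographed copies of one receipt."""
    parts = (store, purchased_on, total)
    normalized = "|".join(" ".join(p.lower().split()) for p in parts)
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


class DuplicateIndex:
    """Remembers which submission first used each key."""

    def __init__(self) -> None:
        self._first_seen: Dict[str, str] = {}

    def earlier_submission(self, key: str, submission_id: str) -> Optional[str]:
        """Record the key and return the id of an earlier submission with it, if any."""
        first = self._first_seen.setdefault(key, submission_id)
        return None if first == submission_id else first


index = DuplicateIndex()
first_check = index.earlier_submission(image_digest(b"receipt-photo"), "sub-001")
second_check = index.earlier_submission(image_digest(b"receipt-photo"), "sub-002")
```

A byte-level hash misses the same receipt photographed twice, which is why the sketch also includes a fingerprint over extracted fields. In keeping with the approach above, a match would be a reason to flag the entry for review, not to reject it outright.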

The results across 300,000 entries

Across the campaign, the ensemble validated 300,000+ receipt submissions at 99.2% accuracy, measured against a stratified sample reviewed by the client's quality team. Ambiguous cases - those where neither model reached a high-confidence verdict - were surfaced for human review, accounting for under 1% of total volume. Reward redemption ran automatically for validated entries, with no manual processing queue for the client's operations team.
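
For readers unfamiliar with stratified sampling, the usual way to turn a reviewed sample into an overall accuracy figure is to weight each stratum's sample accuracy by its share of the population. The sketch below shows that standard estimator only. How the client's team defined and weighted its strata is not stated here, and the numbers in the example are made up for illustration; they are not campaign data.

```python
from typing import Sequence, Tuple

# Each stratum: (population_size, sample_size, correct_in_sample)
Stratum = Tuple[int, int, int]


def stratified_accuracy(strata: Sequence[Stratum]) -> float:
    """Estimate overall accuracy as the population-weighted mean of per-stratum accuracy."""
    total_population = sum(population for population, _, _ in strata)
    if total_population == 0:
        raise ValueError("strata must cover a non-empty population")
    estimate = 0.0
    for population, sampled, correct in strata:
        if sampled == 0:
            raise ValueError("every stratum needs at least one reviewed item")
        estimate += (population / total_population) * (correct / sampled)
    return estimate


# Made-up example: two strata, e.g. clean photos and damaged receipts.
example = [
    (1000, 100, 99),  # 1,000 entries, 100 reviewed, 99 judged correct
    (3000, 100, 97),  # 3,000 entries, 100 reviewed, 97 judged correct
]
estimate = stratified_accuracy(example)
```

Weighting matters because reviewers often oversample the hard cases; an unweighted average of the sample would then understate accuracy on the full population.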

Key takeaways

  • A two-model ensemble - dedicated OCR plus multimodal LLM reconciliation - handles receipt conditions that single-engine OCR cannot reliably process.
  • Campaign-specific validation logic (qualifying store, date, product, spend) is applied automatically at every submission.
  • Fraud signals are flagged for human review rather than auto-rejected, keeping wrongful rejections of legitimate customers low.
  • 300,000+ entries validated at 99.2% accuracy, with under 1% requiring human review.
  • Instant reward redemption for validated entries eliminates the operations queue entirely.

We put these principles to work building AI content and validation pipelines for clients.