Scaling Document Intelligence for One of the Largest Banks in Asia

When I was working on a project for one of the largest banks in Asia, I had the opportunity to lead the design and development of a large-scale, production-grade document processing system. The scale, complexity, and real-world impact of this system make it one of the most challenging and rewarding projects of my career.

This post outlines the architecture, machine learning components, and infrastructure challenges involved in building an AI-powered system that processes over 2 million document pages daily, translating to roughly 12 TB of data every day.

The Challenge: Millions of Pages, Strict SLAs, and Sensitive Information

Every day, the bank receives more than 2 million scanned document pages, ranging from ID cards and address proofs to complex, multi-page financial documents. These come in various formats, including TIFF, PDF, JPEG, and PNG. On average, each document is approximately 6 MB in size, putting the daily processing volume at roughly 12 TB.

The requirements were ambitious:

    Redact PII (Personally Identifiable Information) reliably across all document types

    Detect and crop various document types (Aadhaar, PAN, passports, etc.)

    Complete processing within 12 hours, every single day

    Ensure zero downtime, given the system's role in mission-critical onboarding and verification workflows

My Role and Contributions

I served as the lead architect and engineer, responsible for designing the end-to-end system, from model training to production deployment. This included both the AI models and the backend infrastructure required to handle this volume efficiently and reliably.

Building the Core AI Capabilities

PII Redaction Engine

We built a PII masking engine that combined OCR with deep learning-based field detection. Here's how it worked (a simplified sketch follows the list):

    OCR was performed using a hybrid of Tesseract and Google Vision API for redundancy and accuracy.

    Custom-trained CNN-based models identified PII elements such as names, Aadhaar numbers, PAN numbers, phone numbers, and addresses.

    The system achieved 99.8% precision when masking PII, even on low-quality scans and non-standard layouts.
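To make the flow concrete, here is a minimal sketch of the word-level masking step. It uses Tesseract word boxes plus a few illustrative regular expressions; the production system relied on the custom CNN field detectors described above rather than pattern matching, and the patterns and file paths below are hypothetical.

```python
# Minimal PII-masking sketch: OCR word boxes, flag PII-looking tokens, paint over them.
# Illustrative only: the production system used CNN field detectors, not regexes.
import re
import pytesseract
from PIL import Image, ImageDraw

PII_PATTERNS = [
    re.compile(r"\d{12}"),               # Aadhaar-style 12-digit token (hypothetical pattern)
    re.compile(r"[A-Z]{5}\d{4}[A-Z]"),   # PAN-style token
    re.compile(r"\+?\d{10,12}"),         # phone-number-style token
]

def redact_page(in_path: str, out_path: str) -> None:
    image = Image.open(in_path).convert("RGB")
    words = pytesseract.image_to_data(image, output_type=pytesseract.Output.DICT)
    draw = ImageDraw.Draw(image)
    for i, token in enumerate(words["text"]):
        token = token.strip()
        if token and any(p.fullmatch(token) for p in PII_PATTERNS):
            x, y = words["left"][i], words["top"][i]
            w, h = words["width"][i], words["height"][i]
            draw.rectangle([x, y, x + w, y + h], fill="black")  # mask the word box
    image.save(out_path)
```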

Document Cropping and Classification

To handle the varying layouts and formats of scanned documents, we developed a document detection and cropping model (sketched in code below):

    Leveraged YOLOv5 for object detection, trained on annotated datasets covering a wide variety of document types and noise conditions.

    Applied post-processing filters to improve edge detection and alignment.

    Enabled accurate cropping, classification, and standardization of incoming documents.

This step ensured all downstream processes received clean, structured inputs—critical for audit and compliance.
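As a rough illustration of this step, the sketch below loads a fine-tuned YOLOv5 model via torch.hub and crops each detected document region. The weights file, class names, and confidence threshold are placeholders, not the production values.

```python
# Detection-and-crop sketch using a YOLOv5 model assumed to be fine-tuned on
# document classes. Weights path and class names below are placeholders.
import torch
from PIL import Image

model = torch.hub.load("ultralytics/yolov5", "custom", path="doc_detector.pt")

def crop_documents(path: str, min_conf: float = 0.5):
    """Return (label, cropped_image) pairs for every confident detection."""
    image = Image.open(path).convert("RGB")
    results = model(image)
    crops = []
    for x1, y1, x2, y2, conf, cls in results.xyxy[0].tolist():
        if conf < min_conf:
            continue
        label = results.names[int(cls)]  # e.g. "aadhaar", "pan", "passport"
        crops.append((label, image.crop((int(x1), int(y1), int(x2), int(y2)))))
    return crops
```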

Designing the Scalable Processing Infrastructure

To meet the 12-hour SLA for processing over 2 million pages per day (~12 TB), which works out to a sustained throughput of roughly 46 pages per second, we built a horizontally scalable, queue-driven architecture.

Key components:

    Microservices architecture using FastAPI for modularity and ease of scaling

    Task queueing system using Celery + RabbitMQ to enable distributed job execution (see the sketch after this list)

    Kubernetes to handle autoscaling and deployment across multiple nodes

    Batch processing orchestration with scheduled workers and watchdogs for error handling

    Storage management optimized for large-scale I/O, with intelligent caching and expiry logic
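As a rough sketch of how the queueing layer fits together, the snippet below shows a Celery task backed by RabbitMQ and a FastAPI endpoint that enqueues one job per page. The module name, broker URL, and helper are hypothetical, and the task body is a placeholder for the real crop/classify/OCR/redact pipeline.

```python
# Queue-driven processing sketch: FastAPI enqueues work, Celery workers (scaled
# out by Kubernetes) drain the RabbitMQ queue. Names and URLs are placeholders.
from celery import Celery
from fastapi import FastAPI

celery_app = Celery("docproc", broker="amqp://guest@rabbitmq//")
api = FastAPI()

def run_pipeline(page_uri: str) -> dict:
    # Stand-in for the real steps: fetch -> crop -> classify -> OCR -> redact -> store.
    return {"page": page_uri, "redacted": True}

@celery_app.task(bind=True, max_retries=3, acks_late=True)
def process_page(self, page_uri: str) -> dict:
    try:
        return run_pipeline(page_uri)
    except Exception as exc:
        # Back off exponentially on transient failures before retrying.
        raise self.retry(exc=exc, countdown=2 ** self.request.retries)

@api.post("/pages")
def enqueue_page(page_uri: str):
    # The API only enqueues; workers do the heavy lifting asynchronously.
    task = process_page.delay(page_uri)
    return {"task_id": task.id}
```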

We also implemented shadow mode deployments and canary rollouts to test model upgrades safely in production.
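Shadow mode, in particular, boils down to a simple pattern: every request is served by the current model while a copy is also scored by the candidate model, purely for comparison. A minimal illustration (with hypothetical model objects) looks like this:

```python
# Shadow-mode sketch: the live model serves the response; the candidate model
# sees the same input but can never affect the result. Model objects are hypothetical.
import logging

logger = logging.getLogger("shadow")

def predict_with_shadow(page, live_model, shadow_model):
    live_result = live_model.predict(page)            # what the caller receives
    try:
        shadow_result = shadow_model.predict(page)    # scored only for comparison
        if shadow_result != live_result:
            logger.info("shadow disagreement on page %s", getattr(page, "id", "unknown"))
    except Exception:
        logger.exception("shadow model failed; live path unaffected")
    return live_result
```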

Reliability and Operational Excellence

Since going live, the system has operated without a single unplanned downtime event. This was achieved by investing heavily in:

    Observability using Prometheus + Grafana (instrumentation sketched after this list)

    Health checks, circuit breakers, and fallback logic

    Retry strategies and alerting on latency/volume anomalies

    Continuous benchmarking and load testing to maintain SLA margins
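For a sense of what the observability layer looks like in code, here is a small sketch using the standard prometheus_client library: a counter split by outcome and a per-page latency histogram, exposed on an endpoint for Prometheus to scrape. Metric names and the port are illustrative, not the production values.

```python
# Instrumentation sketch with prometheus_client: count pages by outcome and
# track per-page latency. Metric names and the port below are illustrative.
import time
from prometheus_client import Counter, Histogram, start_http_server

PAGES_PROCESSED = Counter("pages_processed_total", "Pages processed, by outcome", ["status"])
PAGE_LATENCY = Histogram("page_processing_seconds", "Per-page processing latency in seconds")

def instrumented_process(page_uri: str, process_fn) -> None:
    start = time.monotonic()
    try:
        process_fn(page_uri)
        PAGES_PROCESSED.labels(status="ok").inc()
    except Exception:
        PAGES_PROCESSED.labels(status="error").inc()
        raise
    finally:
        PAGE_LATENCY.observe(time.monotonic() - start)

if __name__ == "__main__":
    start_http_server(9000)  # exposes /metrics for Prometheus to scrape
```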

Takeaways

    Accuracy, scale, and reliability can coexist if baked into the architecture from day one.

    AI in production is more than models — orchestration, data flow, and observability are equally critical.

    Real-world ML deployments demand robustness, especially when operating in regulated environments like banking.

Final Thoughts

This project was a strong reminder that delivering AI at scale is a multidisciplinary effort involving machine learning, system design, DevOps, and data engineering. The solution now processes hundreds of millions of pages per year, enabling secure and efficient onboarding and verification for the bank's customers.

It's one thing to train a model that works in a notebook. It's something else entirely to deploy a system that works every day, under load, with no room for error. And that's what makes this project one of the most fulfilling parts of my professional career.

Abhishek Vishwakarma

July 2025

