Scaling Document Intelligence for One of Asia's Largest Banks
When I was working on a project for one of Asia's largest banks, I had the opportunity to lead the design and development of a large-scale, production-grade document processing system. The scale, complexity, and real-world impact of that system make it a project worth writing about in detail.
This post outlines the architecture, machine learning components, and infrastructure challenges involved in building an AI-powered system that processes over 2 million document pages daily, translating to roughly 1.46 TB of data every day.
Every day, the bank receives more than 2 million scanned document pages, ranging from ID cards and address proofs to complex, multi-page financial documents. These arrive in various formats, including TIFF, PDF, JPEG, and PNG. On average, each document is approximately 6 MB in size, resulting in a daily processing volume of roughly 1.46 TB.
The requirements were ambitious:
Redact PII (Personally Identifiable Information) reliably across all document types
Detect and crop various document formats (Aadhaar, PAN, passports, etc.)
Complete processing within 12 hours, every single day
Ensure zero downtime, given the system's role in mission-critical onboarding and verification workflows
I served as the lead architect and engineer, responsible for designing the end-to-end system, from model training to production deployment. This included both the AI models and the backend infrastructure required to handle this volume efficiently and reliably.
We built a PII masking engine that used OCR in combination with deep learning-based field detection. Here's how it worked:
OCR was performed using a hybrid of Tesseract and Google Vision API for redundancy and accuracy.
Custom-trained CNN-based models identified PII elements such as names, Aadhaar numbers, PAN numbers, phone numbers, and addresses.
The system achieved 99.8% precision when masking PII, even in low-quality documents and non-standard layouts.
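The CNN-based field detector itself is proprietary, but the masking step it feeds can be illustrated with a simplified, pattern-based sketch. The regexes below are my own illustrative approximations (real Aadhaar and PAN validation involves more, such as checksum digits), not the bank's actual rules:

```python
import re

# Illustrative patterns standing in for the CNN field detector's output.
# These are simplified assumptions, not production validation logic.
PII_PATTERNS = {
    "aadhaar": re.compile(r"\b\d{4}\s\d{4}\s\d{4}\b"),  # 12 digits, 4-4-4 groups
    "pan":     re.compile(r"\b[A-Z]{5}\d{4}[A-Z]\b"),   # e.g. ABCDE1234F
    "phone":   re.compile(r"\b[6-9]\d{9}\b"),           # Indian mobile numbers
}

def mask_pii(ocr_text: str) -> str:
    """Replace every detected PII span with 'X' characters of equal length,
    preserving whitespace so the redacted text keeps its original layout."""
    masked = ocr_text
    for pattern in PII_PATTERNS.values():
        masked = pattern.sub(lambda m: re.sub(r"\S", "X", m.group(0)), masked)
    return masked
```

Masking by character length rather than with a fixed token keeps the redacted text aligned with the original page, which matters when the output is overlaid back onto the scanned image.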
To handle the varying layouts and formats of scanned documents, we developed a document detection and cropping model:
Leveraged YOLOv5 for object detection, trained on annotated datasets covering a wide variety of document types and noise conditions.
Applied post-processing filters to improve edge detection and alignment.
Enabled accurate cropping, classification, and standardization of incoming documents.
This step ensured all downstream processes received clean, structured inputs—critical for audit and compliance.
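The post-processing step can be sketched independently of the detector. The helper below, with an assumed padding fraction, expands a YOLO-style bounding box slightly so document edges are not clipped, then clamps it to the image bounds:

```python
def refine_crop_box(box, image_w, image_h, pad_frac=0.02):
    """Expand a detector bounding box by a small margin and clamp it to the
    image bounds before cropping.

    box: (x1, y1, x2, y2) in pixels, as returned by an object detector
    pad_frac: fractional padding per side (assumed value; tune per dataset)
    """
    x1, y1, x2, y2 = box
    pad_x = (x2 - x1) * pad_frac
    pad_y = (y2 - y1) * pad_frac
    return (
        max(0, int(x1 - pad_x)),
        max(0, int(y1 - pad_y)),
        min(image_w, int(x2 + pad_x)),
        min(image_h, int(y2 + pad_y)),
    )
```

A small symmetric margin like this is a common guard against tight detector boxes cutting off the border of an ID card before downstream OCR runs.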
To meet the 12-hour SLA for processing over 2 million pages per day (~1.46 TB), we built a horizontally scalable, distributed architecture.
Key components:
Microservices architecture using FastAPI for modularity and ease of scaling
Task queueing system using Celery + RabbitMQ to enable distributed job execution
Kubernetes to handle autoscaling and deployment across multiple nodes
Batch processing orchestration with scheduled workers and watchdogs for error handling
Storage management optimized for large-scale I/O, with intelligent caching and expiry logic
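In production, retries and error handling lived in Celery and the watchdog workers; the underlying pattern can be sketched as a standalone decorator with exponential backoff (attempt counts and delays below are illustrative assumptions, not production values):

```python
import functools
import time

def with_retries(max_attempts=3, base_delay=1.0):
    """Retry a flaky processing step with exponential backoff.
    In our stack this role was played by Celery's retry machinery;
    this standalone version just illustrates the pattern."""
    def decorator(fn):
        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            for attempt in range(1, max_attempts + 1):
                try:
                    return fn(*args, **kwargs)
                except Exception:
                    if attempt == max_attempts:
                        raise  # exhausted: surface to the watchdog/alerting
                    time.sleep(base_delay * 2 ** (attempt - 1))
        return wrapper
    return decorator
```

The key design choice is that the final failure is re-raised rather than swallowed, so the watchdog and alerting layer always sees jobs that genuinely cannot complete.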
We also implemented shadow mode deployments and canary rollouts to test model upgrades safely in production.
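A minimal sketch of the shadow-mode idea: the candidate model runs on the same traffic, but callers only ever see the incumbent model's output, and disagreements are logged for offline review (the function names here are hypothetical, not our actual service API):

```python
def serve_with_shadow(document, current_model, candidate_model, log_disagreement):
    """Run the candidate model in shadow mode: callers only ever receive the
    current model's output, while mismatches are recorded for analysis."""
    primary = current_model(document)
    try:
        shadow = candidate_model(document)
        if shadow != primary:
            log_disagreement(document, primary, shadow)
    except Exception:
        pass  # a failing shadow model must never affect production traffic
    return primary
```

Wrapping the shadow call in its own try/except is the essential part: an upgrade candidate that crashes should cost nothing but a missing comparison log, never a failed customer request.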
Since going live, the system has operated without a single unplanned downtime event. This was achieved by investing heavily in:
Observability using Prometheus + Grafana
Health checks, circuit breakers, and fallback logic
Retry strategies and alerting on latency/volume anomalies
Continuous benchmarking and load testing to maintain SLA margins
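The circuit breakers and fallback logic above can be sketched in a minimal form: after a run of consecutive failures, calls to a struggling dependency (for example, an OCR backend) fail fast until a cooldown elapses. The thresholds below are illustrative assumptions:

```python
import time

class CircuitBreaker:
    """Minimal circuit breaker: after `max_failures` consecutive errors the
    circuit opens and calls fail fast for `reset_timeout` seconds, giving the
    downstream service time to recover instead of being hammered by retries."""

    def __init__(self, max_failures=5, reset_timeout=30.0):
        self.max_failures = max_failures
        self.reset_timeout = reset_timeout
        self.failures = 0
        self.opened_at = None

    def call(self, fn, *args, **kwargs):
        if self.opened_at is not None:
            if time.monotonic() - self.opened_at < self.reset_timeout:
                raise RuntimeError("circuit open: failing fast")
            self.opened_at = None  # half-open: allow one trial call through
        try:
            result = fn(*args, **kwargs)
        except Exception:
            self.failures += 1
            if self.failures >= self.max_failures:
                self.opened_at = time.monotonic()
            raise
        self.failures = 0
        return result
```

Failing fast while the circuit is open is what lets the caller invoke fallback logic (such as switching OCR providers) instead of queueing work behind a dead dependency.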
Accuracy, scale, and reliability can coexist if baked into the architecture from day one.
AI in production is more than models — orchestration, data flow, and observability are equally critical.
Real-world ML deployments demand robustness, especially when operating in regulated environments like banking.
This project was a strong reminder that delivering AI at scale is a multidisciplinary effort, involving machine learning, system design, DevOps, and data engineering. The solution now processes hundreds of millions of document pages per year, enabling secure and efficient onboarding and verification workflows.
It's one thing to train a model that works in a notebook. It's something else entirely to deploy a system that works every day, under load, with no room for error. And that's what makes this project one of the most fulfilling parts of my professional career.
Abhishek Vishwakarma
July 2025

