Hritwick Manna

Consultant

Mumbai, Maharashtra, India · 2 yrs 9 mos experience

AI Enabled · AI ML Practitioner

Key Highlights

Engineered scalable computer vision systems at Walmart.
Developed APIs for document AI at a fast-paced startup.
Optimized model inference speed by 12.5x.

Stackforce AI infers this person is a Computer Vision and AI specialist with experience in SaaS and data infrastructure.

Skills

Core Skills

Data Infrastructure, Document AI, Computer Vision, Data Analytics

Other Skills

C++, CUDA, Cascading Style Sheets (CSS), Clustering, Data Annotation, Data Science, Deep Learning, Docker, Express.js, Git, GitHub, Google BigQuery, HTML, JavaScript, Kubernetes

About

🎓 I’m Hritwick Manna, a B.Tech. (Hons.) graduate from IIT Kharagpur with a Micro-Specialization in Artificial Intelligence and Applications.

💼 My career began at Walmart Global Tech, where I engineered scalable computer vision and real-time data systems: building inference pipelines, optimizing models with TensorRT, and converting live RTSP streams into actionable business insights used across the organization. After gaining strong experience in large-scale systems, I moved to a fast-paced AI startup (Unsiloed AI), working on synthetic data generation for unstructured document understanding. There, I contributed to building APIs that turn messy enterprise documents into structured, machine-readable formats, an essential layer for AI workflows. This shift from a structured MNC environment to a dynamic startup sharpened my ability to learn fast, prototype faster, and think end-to-end.

⚙️ I have a strong command of Data Structures, Algorithms, OOP, and System Design, and I enjoy solving complex engineering challenges that demand both scalability and intelligence.

🚀 Driven by curiosity and craftsmanship, I strive to build technology that’s both meaningful and scalable.

Experience

Total Experience: 2 yrs 9 mos

Average Tenure: 2 yrs

Current Experience: 9 mos

Stealth startup

AI Consultant

Aug 2025 – Present · 9 mos · San Francisco Bay Area · Remote

Built synthetic multimodal datasets with advanced prompt engineering to train document AI models while ensuring privacy compliance.
Created diverse table variations (gridless, hierarchical headers, merged spans, styling/rotation/language noise) for robust table extraction.
Helped improve unstructured-to-structured data infrastructure, enabling documents to become as computable as databases.

Prompt Engineering, Synthetic Data Generation, Document AI, Data Infrastructure

Walmart Global Tech India

2 roles

Software Engineer

Jul 2023 – Jul 2025 · 2 yrs · Bengaluru, Karnataka, India · On-site

✪ Deployed an end-to-end person-detection surveillance pipeline over RTSP streams, with real-time frame extraction, RT-DETR object detection, and an event-based flow for automated labeling and validation.
✪ Integrated processed detection results into Google BigQuery to enable scalable analytics and reporting.
✪ Optimized inference speed by converting an ONNX model to a TensorRT engine and deploying it on an NVIDIA GPU with CUDA compatibility, reducing frame processing time from 1.0s to 0.08s (12.5x faster).
✪ Used ThreadPoolExecutor for asynchronous event handling, sizing the thread pool to match the system’s 16-core architecture, which reduced task queuing delays and improved scalability under concurrent RTSP streams.
✪ Implemented timeout handling and fault isolation for pipelines by restarting RTSP processes when no feed was captured within 2 minutes, ensuring uninterrupted surveillance and system reliability under load.
✪ Increased test coverage to 95% by developing integration and unit tests; integrated SonarQube into the CI/CD pipeline.
✪ Curated high-quality datasets using VIA Annotator and open-source data for efficient model training.
✪ Second Position, Walmart Hackathon: Developed a YOLOv5 model to detect damaged pallets with 75% accuracy on 5000+ annotated images; used clustering to categorize damage types; deployed via Streamlit.

RTSP Streams, Object Detection, Google BigQuery, TensorRT, CUDA, ThreadPoolExecutor, +4
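
The thread-pool and stream-restart bullets above can be illustrated with a minimal Python sketch. This is not the production code, only an illustration under stated assumptions: `StreamWatchdog`, `handle_events`, the `restart` callback, and the injectable `clock` are hypothetical names introduced here; only the 2-minute timeout, the 16-worker pool size, and the use of `ThreadPoolExecutor` come from the description.

```python
import time
from concurrent.futures import ThreadPoolExecutor

STALL_TIMEOUT_S = 120.0  # "no feed captured within 2 minutes" (from the bullet above)
MAX_WORKERS = 16         # pool sized to the 16-core host (from the bullet above)


class StreamWatchdog:
    """Tracks when a stream last produced a frame and restarts it on a stall.

    `restart` is a hypothetical callback that relaunches the RTSP reader;
    `clock` is injectable so the logic can be exercised without waiting.
    """

    def __init__(self, restart, timeout_s=STALL_TIMEOUT_S, clock=time.monotonic):
        self.restart = restart
        self.timeout_s = timeout_s
        self.clock = clock
        self.last_frame_at = clock()
        self.restarts = 0

    def on_frame(self):
        # Call whenever a frame is successfully read from the stream.
        self.last_frame_at = self.clock()

    def check(self):
        # Return True if the stream was considered stalled and was restarted.
        if self.clock() - self.last_frame_at < self.timeout_s:
            return False
        self.restart()
        self.restarts += 1
        self.last_frame_at = self.clock()  # give the new reader a fresh window
        return True


def handle_events(events, handler, max_workers=MAX_WORKERS):
    """Run `handler` over `events` on a bounded thread pool, preserving order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(handler, events))
```

Keeping one watchdog per stream means a stalled camera only triggers a restart of its own reader, which is one plausible way to get the fault isolation described above.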

Software Engineer Intern

May 2022 – Jul 2022 · 2 mos · Bengaluru, Karnataka, India · Hybrid

✪ Applied UNet and MultiResUNet to shopping-cart region segmentation, improving model generalisation across scenarios, raising accuracy from 64% to 85%, and lowering focal loss.
✪ Evaluated model robustness across diverse deployment scenarios to assess real-world applicability.
✪ Concluded that segmentation models work best for cart-corral areas, while YOLOv5 excels with sparsely placed carts.

UNet, MultiResUNet, Segmentation, Model Evaluation, Computer Vision