Ujjwal Kumar

AI Researcher

Gurugram, Haryana, India · 7 yrs 11 mos experience

AI ML Practitioner · AI Enabled

Key Highlights

Led AI initiatives impacting millions globally.
Achieved 96% accuracy in exam credibility scoring.
Boosted fraud detection coverage to 95%.

Stackforce AI infers this person is a Computer Vision and AI specialist in EdTech and Security sectors.

Skills

Core Skills

Computer Vision, Artificial Intelligence

Other Skills

TensorFlow, YOLO, ArcFace, MediaPipe, FP16 quantization, TensorRT, TensorFlow.js, Qwen2, GPT-4o, Kaldi, PLDA, Natural Language Processing (NLP), Large Language Models (LLM), PyTorch, Deep Learning

About

I architect and deploy production-grade AI systems that solve real-world integrity challenges in high-stakes digital environments. At Mercer Mettl, I lead computer vision and multimodal AI initiatives powering proctoring and identity verification platforms used by millions of candidates globally. My work spans the full ML lifecycle, from defining annotation strategies and curating domain-specific datasets to fine-tuning SOTA models (YOLO, ArcFace/AdaFace, MediaPipe). Beyond core CV, I explore emerging frontiers: I recently augmented our mobile detection pipeline with Vision-Language Models (Qwen2, GPT-4o), lifting recall from 61% → 79%, and contributed to audio-integrity systems via speaker diarization (92% multi-speaker detection accuracy).

Experience

Total Experience: 7 yrs 11 mos

Average Tenure: 7 yrs 11 mos

Current Experience: 7 yrs 11 mos

Mettl

Senior AI Engineer

Jun 2018 – Present · 7 yrs 11 mos · Gurugram, Haryana, India · Remote

AI Proctoring System (15K+ exams/day)
Led a 5-engineer team building a real-time computer vision proctoring engine operating reliably across diverse lighting, devices, and network conditions.
Architected an exam-wide credibility framework aggregating frame-level risk signals into a unified score (96% accuracy), maintained through 6+ years of production use.
Boosted fraud detection coverage from 84% → 95% while maintaining near-zero false positives through a multi-component pipeline: object detection (YOLOv2 fine-tuned to 98.9% F1), pose estimation, facial landmarks, and image quality scoring.
Cut inference latency 35% via FP16 quantization + TensorRT, reducing cloud costs without accuracy tradeoffs.
Identity & Authentication Platform (5K+ verifications/day)
Designed a hybrid client-server verification flow: browser-based ID/face detection (TensorFlow.js + MediaPipe) performs pre-filtering, and backend ArcFace handles final validation.
Delivered a 3 MB SSD-MobileNetV2 ID detector (97% mAP) via INT8 quantization; it runs entirely client-side for privacy and speed.
Achieved 72% first-attempt success rate on challenging ID-to-selfie matching using AIM-CHIYA (ArcFace fine-tuned on CHIYA dataset: 70% TAR @ 0.5% FAR).
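For readers unfamiliar with the "TAR @ FAR" notation above: it reports the true accept rate (share of genuine ID-to-selfie pairs accepted) at the similarity threshold where the false accept rate (share of impostor pairs accepted) stays within the stated limit. The following is a minimal sketch of how such a figure is typically computed; the function name `tar_at_far` and the toy scores are hypothetical and are not taken from the AIM-CHIYA evaluation.

```python
from typing import Sequence, Tuple


def tar_at_far(
    genuine: Sequence[float],
    impostor: Sequence[float],
    target_far: float,
) -> Tuple[float, float]:
    """Return (threshold, TAR) for the lowest threshold whose FAR <= target_far.

    Scores are similarities; a pair is accepted when score >= threshold.
    Hypothetical helper for illustration only.
    """
    if not genuine or not impostor:
        raise ValueError("both score lists must be non-empty")

    # Candidate thresholds: every observed score, plus +inf (accept nothing).
    candidates = sorted(set(genuine) | set(impostor)) + [float("inf")]

    for threshold in candidates:
        # FAR: fraction of impostor pairs wrongly accepted at this threshold.
        far = sum(s >= threshold for s in impostor) / len(impostor)
        if far <= target_far:
            # FAR never increases as the threshold rises, so the first
            # threshold that satisfies the limit gives the highest TAR.
            tar = sum(s >= threshold for s in genuine) / len(genuine)
            return threshold, tar

    # Unreachable: at +inf the FAR is 0, which satisfies any target_far >= 0.
    raise ValueError("target_far must be non-negative")


# Toy example with made-up similarity scores.
genuine_scores = [0.5, 0.6, 0.95, 0.4]
impostor_scores = [0.1, 0.2, 0.3, 0.9]
threshold, tar = tar_at_far(genuine_scores, impostor_scores, target_far=0.25)
```

A stricter FAR limit pushes the threshold up and the TAR down, which is why a TAR figure is only meaningful alongside the FAR at which it was measured.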
Emerging Initiatives
Vision-Language Models: Elevated mobile detection recall from 61% → 79% by integrating Qwen2-32B/GPT-4o mini with optimized prompting; now fine-tuning Qwen-VL on domain data.
Audio Integrity: Developed a speaker diarization pipeline (Kaldi PLDA + transfer learning), reformulated as binary classification, that detects unauthorized secondary speakers with 92% accuracy.