Ishan Gupta — Product Engineer
AI Infrastructure / LLM Platform Engineer | Delivered 22% lower $/token and 40% p99 latency reduction on GPU-Kubernetes systems | CNCF OSS Contributor
Translating research-grade AI systems into scalable, production-ready distributed infrastructure. Open to Mid-level AI Infrastructure & LLM Platform roles (Senior considered). I enjoy collaborating closely with research, product, and platform teams to turn experimental models into reliable, production-grade AI systems.
I’m an AI Infrastructure & Distributed Systems Engineer with 4+ years of experience designing, scaling, and operating production-grade AI and LLM platforms, focused on building high-performance distributed systems that maximize GPU efficiency and reliability.
Core engineering focus & impact:
- Built and scaled multi-tenant GPU platforms for LLM and vision model training and inference
- Designed topology-aware scheduling and GPU bin-packing (MIG, NCCL/RDMA, NUMA) → ~22% lower $/token
- Optimized PyTorch runtime and CUDA/JAX workloads for large-scale training
- Provisioned petabyte-scale MLOps infrastructure on Kubernetes
- Scaled model serving pipelines with KServe/Triton and KEDA/HPA → ~40% lower p99 latency
- Reduced GPU idle time (~27.5%) via rightsizing, quotas, and preemptible pools
- Built horizontally scalable microservices backed by SQL, NoSQL, graph, and vector databases
- Implemented end-to-end observability (Prometheus, OpenTelemetry, DCGM) with SLOs and error budgets
Technical specialties:
- Foundational AI / LLM systems, RAG, and agentic AI for cost-efficient inference
- Large-scale model training & fine-tuning (7B–70B) using LoRA/QLoRA, FSDP, DeepSpeed
- Parallel and distributed computing for AI training and inference pipelines
- High-availability, event-driven backends for AI workloads
- GPU/TPU infrastructure (A100/H100, DGX, TPU v5p/v5e)
Background:
- Experience across SaaS, on-prem, and enterprise AI platforms
- Drove cross-geo engineering initiatives at VMware (Broadcom) on large-scale AI infrastructure
- CNCF open-source contributor
- M.S. in Computer Science (AI focus: RAG & Agentic AI)
- Experience building cloud-native, multi-tenant SaaS LLM platforms and AI infrastructure for AI labs and enterprise environments
Past work: NLP for chatbots, computer vision for gaming/SLAM, on-device ML, GIS data science, and chaos engineering/SRE.
In short: I specialize in AI infrastructure, distributed systems, and GPU optimization, turning research-grade LLMs into cost-efficient production deployments.
Stackforce AI infers this person is a SaaS and Cloud Computing Infrastructure Engineer with a focus on AI and Distributed Systems.
Location: Rochester, New York, United States
Experience: 7 yrs 1 mo
Skills
- Cloud Computing
- Distributed Systems
Career Highlights
- Achieved 22% lower $/token in AI infrastructure.
- Delivered 40% p99 latency reduction on GPU-Kubernetes systems.
- CNCF OSS Contributor with extensive cloud-native experience.
Work Experience
Career Break
Professional development (1 yr 8 mos)
Broadcom
R&D Engineer Software 3 (8 mos)
VMware
Member of Technical Staff 3 (3 mos)
Member of Technical Staff 2 (1 yr 9 mos)
VMware Tanzu
Research & Development Engineer (2 yrs 8 mos)
ChaosNative (Acquired by Harness Inc.)
Software Engineer 1 (5 mos)
Software Engineer Intern (2 mos)
EyeROV (IROV TECHNOLOGIES PRIVATE LIMITED)
Computer Vision and Deep Learning Intern (2 mos)
ANZ
Cyber Security - Virtual Intern (0 mo)
Data@ANZ - Virtual Intern (1 mo)
Cloud Native Computing Foundation (CNCF)
Open Source Software Maintainer (LitmusChaos) (2 yrs 7 mos)
LitmusChaos
Open Source Software Maintainer (2 yrs 7 mos)
MayaData (Acquired by DataCore Software)
Software Engineer Intern (8 mos)
HighRadius
Summer Intern (Software Development) (2 mos)
Deloitte
Technology Consultant - Virtual Intern (1 mo)
JPMorgan Chase & Co.
Software Engineer - Virtual Experience (1 mo)
KPMG
Data Analytics Consultant - Virtual Intern (1 mo)
DSC KIIT
Cloud Applications Developer (3 mos)
Data Science, Machine Learning & AI Researcher (1 yr 1 mo)
Core Team Member (1 yr 1 mo)
Skyline Racing
Software Development Engineer (9 mos)
Neurapses Technologies
AI / NLP Intern (11 mos)
DafnTech
Machine Learning Intern (1 mo)
Project SwaG (Swayamchalit Gaadi)
Software Developer (1 yr)
Samvriddhi Infotech
Data science for GIS Intern (5 mos)
Viden.io
Business Development Intern (2 mos)
Think India
Head Of Promotions (4 mos)
Koderunners
Moderator (2 yrs)
Research And Development Engineer (2 yrs 1 mo)
Tech Lead (2 yrs 3 mos)
Public Relations Officer (2 yrs 4 mos)
Yoken
Promotions Officer (2 mos)
HelpAge India
Fund Raising Officer (2 mos)
Education
Master of Science - MS at Rochester Institute of Technology
Graduate coursework for Master's degree at Vellore Institute of Technology
Bachelor of Technology at KIIT - Kalinga Institute of Industrial Technology
Bachelor of Arts - BA at Rabindra Bharati University, Kolkata
ISC at Don Bosco School Liluah
ICSE at Vidyaniketan
Primary School at Aditya Birla Public School