Advit Bhullar

Software Engineer

San Francisco, California, United States · 3 yrs 2 mos experience
AI Enabled · AI ML Practitioner

Key Highlights

  • Led architecture for $15B/year Ads ML serving.
  • Improved ML model reliability through innovative research.
  • Achieved significant cost savings in ML infrastructure.
Stackforce AI infers this person is a Machine Learning Infrastructure Engineer with expertise in distributed systems and cloud computing.

Skills

Core Skills

Machine Learning Infrastructure · Distributed Systems · Machine Learning · AWS · Web Applications

Other Skills

Software Engineering · Performance Optimization · AWS Step Functions · Natural Language Processing · PyTorch · Bayesian Fine-Tuning · React Native · AWS Lambda · Machine Learning Algorithms · Software Design · Software Infrastructure · Python (Programming Language) · TensorFlow · Keras · OpenCV

About

Software Engineer working on large-scale ML serving infrastructure at Meta, focused on building low-latency, cost-efficient, and reliable systems for production machine learning. I work on Ads ML Serving, where I’ve led serving-side architecture for ranking models driving $15B+/year in revenue, shipped heterogeneous compute support (CPU/GPU/accelerators), and delivered significant wins across inference efficiency, reliability, and infrastructure cost. Previously, I interned at Amazon Web Services building backend and NLP-driven systems for security and compliance workflows, and conducted ML research at Purdue University focused on improving large language model reliability. I’m interested in roles at the intersection of distributed systems and machine learning, particularly ML inference, serving infrastructure, and performance-critical systems.

Experience

Total Experience: 3 yrs 2 mos
Average Tenure: 1 yr 9 mos
Current Experience: 1 yr 4 mos

Meta

Software Engineer

Jan 2025 - Present · 1 yr 5 mos · Sunnyvale, CA · On-site

  • Led serving-side execution for Ads ranking inference, productionizing a centralized, remote-execution ML serving architecture supporting models driving $15B+/year in revenue.
  • Owned migration from disaggregated and legacy architectures to centralized remote execution for CPU- and accelerator-based models, reducing client memory pressure, improving reliability, and driving $150M+/year in incremental revenue.
  • Led mass adoption of centralized remote inference using Meta Training and Inference Accelerators (MTIA), enabling $500M+/year in incremental revenue through large-scale model scaling.
  • Architected heterogeneous hardware support (MTIA and GPU), unblocking new region turn-up in capacity-constrained environments limited to accelerator-specific availability.
  • Optimized inference execution paths and memory utilization, reducing per-request compute cost by 18% and increasing effective system capacity without additional hardware.
Software Engineering · Distributed Systems · Machine Learning Infrastructure

Amazon Web Services (AWS)

Software Development Engineer

May 2024 - Aug 2024 · 3 mos · Seattle, Washington, United States · On-site

  • Applied natural language processing techniques to accelerate customer support and improve AWS Security Assurance Engineering operations, resulting in a 20% reduction in resolution time.
  • Used Amazon Comprehend to extract actionable insights from natural language data, boosting data analysis by 35%.
  • Automated data processing workflows using AWS Step Functions, increasing operational efficiency by 30%.
  • Saved 2,000 hours/year for engineering and audit teams, improving model efficiency by 10x.
Machine Learning · AWS Step Functions · AWS

Purdue Computer Science

Undergraduate Researcher

Jan 2024 - May 2024 · 4 mos · West Lafayette, Indiana, United States · On-site

  • Advised by Dr. Ruqi Zhang.
  • Enhanced LLM calibration with Bayesian fine-tuning and PEFT, boosting prediction reliability by 20%.
  • Implemented a novel MCMC algorithm tailored to LLMs, improving performance and reliability by 15%.
PyTorch · Machine Learning

Amazon Web Services (AWS)

Software Development Engineer

May 2023 - Aug 2023 · 3 mos · Seattle, Washington, United States · On-site

  • Delivered Service Compliance Status project, reducing compliance reporting time by 25%.
  • Improved data accessibility by 40% by developing a centralized datastore and UI.
  • Reduced manual data entry time by 30% by automating database updates with AWS Lambda.
  • Saved 520 hours/year and accelerated compliance-related customer inquiry response time by 1.5x.
Web Applications · React Native

Purdue University

2 roles

Undergraduate Teaching Assistant

Aug 2022 - May 2024 · 1 yr 9 mos · West Lafayette, Indiana, United States

  • Teaching assistant for CS 18200 (Discrete Mathematics in Computer Science)
  • Supervised PSO (Practice/Study/Observation) sessions for approximately 60 students and held office hours for instructional support
  • Grader for assignments and proctor for exams
  • Course content creator for 300 students

Undergraduate Teaching Assistant

Aug 2022 - Dec 2022 · 4 mos · West Lafayette, Indiana, United States

  • Teaching assistant for CS 25100 (Data Structures and Algorithms in Computer Science)
  • Supervised PSO (Practice/Study/Observation) sessions for approximately 50 students and held office hours for instructional support
  • Grader for assignments and proctor for exams
  • Course content creator for 500 students

Education

Purdue University

Bachelor of Science (BS), Computer Science

Aug 2021 - Dec 2024
