Udit Saxena

Lead ML Engineer

San Francisco, California, United States · 10 yrs 4 mos experience

Most Likely To Switch: Highly Stable

Key Highlights

Lead Machine Learning Engineer with expertise in Generative AI.
Architected large-scale AI infrastructure for enterprise applications.
Presented at Apache Airflow Summit 2024 on orchestration strategies.

Stackforce AI infers this person is a SaaS-focused Machine Learning Engineer with expertise in AI infrastructure and orchestration.

Skills

Core Skills

Apache Airflow, Kubernetes, Large Language Models (LLM), AWS, PyTorch, CUDA

Other Skills

Large Language Model Operations (LLMOps), Apache Spark, Batch Inference, Vector DBs, Embedding generation, Amazon Web Services (AWS), Amazon S3, AWS Lambda, C, C++, Photoshop, Python, Arch Linux, CLI, Java

About

I’m a Lead Machine Learning Engineer at ASAPP, where I design large-scale AI infrastructure for Generative AI systems. My work spans Retrieval-Augmented Generation (RAG) platforms, multi-turn conversational agents, and LLM evaluation infrastructure, combining Airflow, Spark, and Kubernetes with distributed training and GPU inference optimization. My research includes Scalable GPU-Accelerated Euler Characteristic Curves, accepted at the NeurIPS 2025 NeurReps workshop, which focuses on differentiable topology for deep learning. I also presented at Apache Airflow Summit 2024, sharing ASAPP’s orchestration strategies for large-scale GenAI workloads.

Apache Airflow Blog: https://medium.com/apache-airflow/airflow-at-asapp-enhancing-ai-powered-contact-centers-0328deb6f03b

NeurIPS 2025 Publication (NeurReps Workshop): Scalable GPU-Accelerated Euler Characteristic Curves

arXiv: https://arxiv.org/abs/2510.20271

Experience

Total Experience: 10 yrs 4 mos

Average Tenure: 1 yr 4 mos

Current Experience: 4 yrs 6 mos

ASAPP

Lead Machine Learning Engineer

Nov 2021 – Present · 4 yrs 6 mos · San Francisco Bay Area

Building large-scale AI infrastructure for enterprise generative AI products serving contact centers. Own critical infrastructure including RAG platforms, Airflow orchestration, and LLM evaluation systems.
Key Achievements:
Airflow & Batch Inference Infrastructure
Architected Airflow-Spark orchestration across Kubernetes clusters, powering 1M daily tasks and 5K DAGs for speech and NLP pipelines
Introduced on-demand Spark clusters and GPU inference pods (PyTorch/vLLM), cutting runtimes by 10×
Speaker at Apache Airflow Summit 2024, featured on Airflow Blog
Retrieval-Augmented Generation (RAG) Platform
Own ASAPP's production RAG infrastructure end-to-end
Built scalable document parsing (text/PDF), embedding generation, and similarity search over vector databases with metadata filtering
Powers enterprise-scale voice and text AI products
Core Conversational AI Development
Developed multi-turn conversational AI systems with retrieval, tool use, and prompt optimization
Collaborated with Platform/SRE on LLM proxy infrastructure (LiteLLM-style routing and observability)
Ran benchmark and simulation-based evaluations of frontier models (GPT-4, Claude, Mistral) across multi-cloud deployments
Deep Learning Training Acceleration
Accelerated model training for speech team using Triton, fused kernels, FlashAttention
Implemented distributed training frameworks (DeepSpeed, FSDP) and PyTorch profiling optimizations
Technologies: PyTorch, Airflow, Kubernetes, Spark, vLLM, CUDA, AWS, Vector DBs (Milvus, OpenSearch), Redis, MongoDB

Apache Airflow, Kubernetes, Large Language Models (LLM), Large Language Model Operations (LLMOps), Apache Spark, Batch Inference

Sumo Logic

2 roles

Senior Software Engineer (Machine Learning)

Promoted

Mar 2021 – Nov 2021 · 8 mos

Machine Learning Engineer

Jun 2018 – Mar 2021 · 2 yrs 9 mos

Streaming Clustering: Fast, distributed, approximate streaming clustering algorithms for log data with structured and unstructured schemas (2 patents pending: clustering log data using key schema and key-value schema)
Textual Clustering: Increased performance of existing text clustering algorithms by >10x using MapReduce
Airflow integration with Kubernetes: Led the cross-geo migration of the production ML Airflow backend from EC2 instances on AWS to the EKS managed Kubernetes service using ArgoCD, Helm, and AWS Elastic Container Registry
Kubernetes module integrations: Led various internal module integrations and deployments on Kubernetes for increased flexibility and microservice management

Lexalytics, Inc.

Machine Learning Research Intern

Jan 2018 – May 2018 · 4 mos · Amherst, MA

Explored transfer learning for document classification and semantic analysis using Graph Convolutional Neural Networks

Microsoft Research

Student Research Assistant

Jan 2018 – May 2018 · 4 mos

Transfer Learning for Reading Comprehension: Used the BiDAF model from AllenAI as the base model for a question answering system, exploring active learning methods for transferring reading comprehension knowledge from the Stanford SQuAD dataset to the Microsoft NewsQA dataset.

Sumo Logic

Software Engineering Intern

May 2017 – Aug 2017 · 3 mos · San Francisco Bay Area

Sprinklr

Product Engineer

Jul 2015 – Aug 2016 · 1 yr 1 mo · Gurgaon, India

As part of the Core Engineering team, I worked on:
API integrations and REST-based API extensions (internal and external).
Third-party integrations, chiefly SAP C4C and SAP Hybris systems, as well as third-party social network integrations.
Single Sign-On implementation with Sprinklr as an Identity Provider.
Internal auditing technologies and logging for existing and new features, to enable decisions backed by data.

Adobe

Product Intern

Jan 2015 – Jun 2015 · 5 mos · Bangalore

I developed and implemented a User Analytics feature that collected non-personally identifiable information about Adobe Captivate users, and set up a pipeline to clean, mine, and analyze the data to provide insights for the team's data-driven decisions.

mlpack

Google Summer of Code, Intern

May 2014 – Sep 2014 · 4 mos

Core Contributor since May 2014.
At mlpack, I extended the library's existing algorithms to include multi-class AdaBoost. I implemented the weak learners, Decision Stumps and Perceptrons (single-layer neural networks), and the AdaBoost.M1, AdaBoost.MH, and AdaBoost.SAMME multi-class boosting algorithms. I also implemented Decision Trees, along with template-based splitting algorithms.