Shwet Prakash — Machine Learning Engineer

I am a Senior Machine Learning Engineer with a Master’s in Computer Science and over 5 years of experience architecting production-grade AI systems. My expertise lies at the intersection of Generative AI, MLOps, and scalable backend infrastructure, moving beyond simple API wrappers to build autonomous, stateful, and intelligent systems.

Currently, I am deeply focused on the Agentic AI landscape. At Quant.ai, I am architecting multi-agent conversational ecosystems using LangGraph and LiveKit, enabling real-time voice interactions for complex domains like airline bookings. My work involves orchestrating specialized agents with async connection pooling, managing session state via PostgreSQL, and optimizing latency for real-time streams.

Previously at Ohai.ai, I built a comprehensive LLM agent stack for household assistance. I designed a modular toolchain of 25+ custom tools and implemented RAG systems using Vector Databases and Knowledge Graphs (Neo4j) to give agents long-term memory and context.

My background is rooted in rigorous optimization and NLP. At Skuad, I specialized in Knowledge Distillation and quantization, compressing Large Language Models (BERT/RoBERTa) to improve inference latency by 4x without sacrificing accuracy. I also engineered robust MLOps pipelines using DVC, CML, and AWS to ensure continuous delivery of models. Early in my career at Kaleidofin, I built financial recommendation engines and ETL pipelines handling millions of data points.

I distinguish myself by not just building models, but owning the full stack, from data engineering and model fine-tuning (LoRA/PEFT) to backend integration (FastAPI/NestJS/Redis) and cloud deployment (AWS ECS/Docker).

Core Competencies:
GenAI Stack: LLMs, LangChain, LangGraph, Multi-Agent Swarms, RAG, Vector DBs (Qdrant/Pinecone).
Deep Learning & NLP: Transformers, Fine-tuning, Knowledge Distillation, Voice AI.
Backend & Cloud: Python, TypeScript, AWS (Full Suite), Docker, Kubernetes, SQL/NoSQL.
MLOps: Automated training pipelines, Experiment tracking (MLflow), Model Versioning.

I am passionate about pushing the boundaries of what AI agents can achieve in real-world applications.

Stackforce AI infers this person is a SaaS expert specializing in Generative AI and MLOps.

Location: Bangalore Urban, Karnataka, India

Experience: 5 yrs 6 mos

Skills

Generative AI
MLOps
Natural Language Processing

Career Highlights

Expert in architecting multi-agent AI systems.
Proven track record in MLOps and deployment automation.
Strong background in Generative AI and NLP.

Work Experience

Hexaware Technologies

Senior Machine Learning Engineer (GenAI) (3 mos)

Quant

Senior Machine Learning Engineer (8 mos)

Ohai.ai

Senior Machine Learning Engineer (1 yr 4 mos)

Skuad: a Payoneer company

Applied Data Scientist (3 yrs 3 mos)

Kaleidofin Private Limited

Data Science Research Scientist (1 yr 2 mos)

FundsIndia

Data Scientist (2 mos)

ThoughtBit Technologies

Data Science Intern (1 mo)

Indian Institute of Information Technology Design & Manufacturing Kancheepuram

Research Intern on Graph Theory (2 mos)

Education

Bachelors and Masters at Indian Institute of Information Technology Design & Manufacturing Kancheepuram

class 12th at Degree students jobs


Contact

shwet.prakash97@gmail.com LinkedIn

Skills

Core Skills

Generative AI, MLOps, Natural Language Processing

Other Skills

Python, FastAPI, LangGraph, LangChain, Multi-Agent Systems, OpenAI (GPT-5.1), LiveKit, Hugging Face Transformers, Hybrid RAG, Qdrant, PostgreSQL, Asyncio, MLflow, Docker, GitHub Actions


Experience

Total Experience: 5 yrs 6 mos
Average Tenure: 1 yr 9 mos
Current Experience: 3 mos

Hexaware Technologies

Senior Machine Learning Engineer (GenAI)

Feb 2026 – Present · 3 mos · Bengaluru, Karnataka, India · Hybrid

Building Agents for ITOps Automation

Quant

Senior Machine Learning Engineer

Jun 2025 – Feb 2026 · 8 mos · Bangalore Urban, Karnataka, India · Hybrid

Technologies: Python, FastAPI, LangGraph, LangChain, Multi-Agent Systems, OpenAI (GPT-5.1), LiveKit, Hugging Face Transformers, Hybrid RAG, Qdrant, PostgreSQL, Asyncio, MLflow, Docker, GitHub Actions, AWS (EC2, ECS, ECR), Pydantic
 Architected a multi-agent conversational AI system for airline booking using LangGraph Swarm, FastAPI streaming, and PostgreSQL persistence. Implemented 4 specialized agents with async connection pooling and session state management to orchestrate stateful flight search, multi-city booking, fare selection, flight change, check-in, and AlFursan loyalty workflows. Deployed via Docker Compose, with a 75-80% latency reduction through parallel asyncio execution.
 Architected a multi-agent voice AI system on LiveKit Cloud with 2 specialized agents (Category Discovery Agent and Product Suggestion Agent), 15 GPT-5.1 function tools, session-based state management, a real-time voice pipeline (VAD→STT→LLM→TTS) with interruption handling, and middleware API integration for dynamic furniture catalog querying (decision tree navigation, 50+ categories, series-based filtering, nested product customization).
 Developed an intelligent product search using hybrid RAG with the Qdrant vector DB, OpenAI embeddings (text-embedding-3-small), LLM-based query decomposition (GPT-4o-mini), metadata filtering (price/category/dimensions), and an async FastAPI deployment with Docker.
 Engineered a production-grade MLOps pipeline to automate the deployment of Hugging Face Transformer models (BERT, RoBERTa) on AWS. Leveraged MLflow for experiment tracking and model versioning, containerized a FastAPI-based API with Docker, and built a full CI/CD workflow with GitHub Actions to enable zero-downtime deployments to a scalable AWS ECS and ECR environment.
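
The parallel asyncio execution mentioned in the first item above can be illustrated with a minimal sketch. This is not the production code: `call_tool`, the tool names, and the delays are hypothetical stand-ins for independent I/O-bound calls, and the sketch only shows the general pattern of replacing sequential awaits with `asyncio.gather`.

```python
import asyncio
import time


async def call_tool(name: str, delay: float) -> str:
    """Hypothetical stand-in for an I/O-bound call such as a flight search."""
    await asyncio.sleep(delay)  # simulates network or database latency
    return f"{name}:done"


async def run_sequential(calls):
    """Await each call one after another; total time is the sum of delays."""
    return [await call_tool(name, delay) for name, delay in calls]


async def run_parallel(calls):
    """Schedule all calls at once; total time is roughly the longest delay."""
    return list(await asyncio.gather(*(call_tool(n, d) for n, d in calls)))


def timed(coro_fn, calls):
    """Run a coroutine function to completion and return (results, seconds)."""
    start = time.perf_counter()
    results = asyncio.run(coro_fn(calls))
    return results, time.perf_counter() - start


# Illustrative workload: four independent calls of equal duration.
CALLS = [
    ("search_flights", 0.05),
    ("fetch_fares", 0.05),
    ("load_profile", 0.05),
    ("check_seats", 0.05),
]
```

Calling `timed(run_sequential, CALLS)` and `timed(run_parallel, CALLS)` returns the same results in the same order. With four independent calls of equal duration, the concurrent version takes roughly the time of a single call, which is the kind of saving this pattern can give when the calls do not depend on each other.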


Ohai.ai

Senior Machine Learning Engineer

Jan 2024 – May 2025 · 1 yr 4 mos · Bangalore Urban, Karnataka, India · Hybrid

Technologies: Python, FastAPI, TypeScript, NestJS, LangChain, OpenAI, GPT-4o, LLM, Agents, RAG, Vector Database, Pinecone, Multimodal AI, AWS (SES, S3, CloudFront, EC2), Prisma ORM, Redis, BullMQ Pro, Docker, PostgreSQL, Knowledge Graph
 Designed and implemented a production-grade LLM agent stack using LangChain, tailored for a household assistant product, with a modular toolchain of 25+ custom tools (calendar, reminders, messaging, email, household management, meal planning, etc.), each built with conditional logic, and a dynamic system prompt enriched with contextual chat history to drive accurate tool selection and intelligent agent behavior.
 Implemented Redis-based low-latency caching to retain conversational context, optimizing AI assistant response times and minimizing database load in a real-time production environment.
 Architected scalable background processing pipelines using BullMQ Pro for async email scanning and multimodal document parsing (email/PDF/image → LLM-based extraction), with job-level concurrency, fault-tolerant retries, smart rate limiting, and downstream task orchestration.
 Built a robust LLM-based document understanding engine using OpenAI GPT-4o, with a dynamic system prompt, multimodal inputs (image, email, text), schema-constrained JSON output, and downstream post-processing to extract structured actions (titles, events, todos, reminders).
 Implemented a Neo4j-based knowledge graph system to model and query calendar events using LLM-generated Cypher queries, enabling semantic search over event metadata.
 Fine-tuned and benchmarked open-source LLMs using LoRA and Hugging Face for the task of classifying tools and their structured arguments.
 Integrated cloud-native infrastructure into a NestJS application, leveraging AWS services (S3, SES, CloudFront, Secrets Manager) to support secure file handling and email workflows, and utilized Prisma ORM across services for efficient database operations.


Skuad: a Payoneer company

Applied Data Scientist

Sep 2020 – Dec 2023 · 3 yrs 3 mos · Bangalore Urban, Karnataka, India · Hybrid

Technologies: Python, Deep Learning, Machine Learning, NLP, MLOps, Generative AI, AWS (S3, ECR, Fargate, EC2), PyTorch, Hugging Face, GPT-3, LLM, BERT, LangChain, CML, DVC, MLflow, scikit-learn, SQL, Boto3, Docker, GitHub Actions, pandas, NumPy, FAISS, Chroma
 Used GPT-4 models to annotate job functions from job titles, fine-tuned BERT/RoBERTa/DistilBERT models on the annotated data, and applied post-training quantization, yielding a 2x improvement in inference latency on CPU instances while retaining 99.8% of the accuracy.
 Developed a SetFit model (a few-shot learning approach) on a limited dataset for an email classification problem, then compressed it with knowledge distillation, improving latency by 4x while maintaining 93% accuracy.
 Fine-tuned LLMs such as GPT-3, combined with a rule-based regex system, to extract timelines from schedule-based emails and convert them into the appropriate date and time format, with an overall accuracy of 97%.
 Leveraged base LLMs, such as ChatGPT, for automated email response generation using dynamic few-shot examples driven by email categorization.
 Designed and implemented a scalable NLP API using FastAPI and Docker, deploying it on AWS Fargate with the aid of AWS ECS and ECR.
 Designed an MLOps pipeline using CML, DVC, AWS EC2 instances, and GitHub Actions to implement robust retraining schedules and strategies, ensuring continued model accuracy and relevance.
 Developed and implemented a custom NER model using spaCy to extract entities from emails.
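
The knowledge distillation step mentioned in the list above is commonly trained with a soft-target loss. The sketch below shows that standard formulation (temperature-scaled KL divergence between teacher and student distributions) in plain Python. It is an assumption about the general technique, not the loss actually used in this work; the function names and the default temperature are illustrative.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax, shifted by the max for numerical stability."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """KL(teacher || student) on softened distributions, scaled by T^2.

    The T^2 factor is the usual convention that keeps gradient magnitudes
    comparable across temperatures.
    """
    teacher = softmax(teacher_logits, temperature)
    student = softmax(student_logits, temperature)
    kl = sum(
        p * math.log(p / q)
        for p, q in zip(teacher, student)
        if p > 0.0
    )
    return temperature ** 2 * kl
```

The loss is zero when the student reproduces the teacher's logits and grows as the two softened distributions diverge. In practice it is usually combined with the ordinary cross-entropy loss on the true labels.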


Kaleidofin Private Limited

Data Science Research Scientist

May 2019 – Jul 2020 · 1 yr 2 mos · Chennai, Tamil Nadu, India

 Developed a mutual fund recommendation model that computes the XIRR of mutual funds over rolling windows of NAV for different horizons, and forecasted future NAV with LSTM time series models.
 Developed an ETL pipeline that periodically updates mutual fund NAV and XIRR returns data and incorporated it into a Dash web application, reducing computation time on 20 million data points by 90%.
 Incorporated HDF5 and Postgres, which significantly reduced storage space and improved querying speed by 30% for the fund database.
 Developed an end-to-end data engineering and machine learning modelling pipeline for credit risk prediction using MongoDB, Python, model ensembling, and boosting methods.
 Implemented an intelligent insurance automation system with 100% accuracy, saving thousands of work hours annually.
 Implemented anomaly detection, customer analytics, and payment prediction models as per the business use case.
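
For readers unfamiliar with XIRR, referenced in the first item above, it is the annualized rate at which the present value of a series of irregularly dated cash flows sums to zero. The sketch below computes it by bisection using only the standard library. It is a minimal illustration under assumptions (a 365-day year convention, a search bracket of -99% to 1000%), not the model described above, and `xnpv` and `xirr` are names chosen here for illustration.

```python
from datetime import date


def xnpv(rate, cashflows):
    """Present value of dated cash flows, discounted to the first date.

    cashflows is a list of (date, amount) pairs; outflows are negative.
    """
    start = cashflows[0][0]
    return sum(
        amount / (1.0 + rate) ** ((day - start).days / 365.0)
        for day, amount in cashflows
    )


def xirr(cashflows, low=-0.99, high=10.0, tol=1e-9):
    """Find the rate where xnpv is zero by bisection on [low, high]."""
    f_low = xnpv(low, cashflows)
    if f_low * xnpv(high, cashflows) > 0:
        raise ValueError("no sign change in the search bracket")
    for _ in range(200):
        mid = (low + high) / 2.0
        f_mid = xnpv(mid, cashflows)
        if abs(f_mid) < tol or (high - low) < tol:
            return mid
        if f_low * f_mid < 0:
            high = mid  # root lies in the lower half
        else:
            low, f_low = mid, f_mid  # root lies in the upper half
    return (low + high) / 2.0


# Illustrative data: invest 1000, redeem 1100 exactly 365 days later.
EXAMPLE = [(date(2021, 1, 1), -1000.0), (date(2022, 1, 1), 1100.0)]
```

For `EXAMPLE`, `xirr(EXAMPLE)` converges to a rate of about 0.10, since 1100 discounted by one year at 10% equals the 1000 invested. A rolling-window version would apply the same function to each window of NAV-derived cash flows.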

FundsIndia

Data Scientist

May 2018 – Jul 2018 · 2 mos · Chennai, Tamil Nadu, India

 Developed a customer churn prediction model on highly imbalanced data using SMOTE techniques and PySpark, based on customers' derived transaction history and behavioral metrics on the FundsIndia platform.
 Developed a text classification model on FundsIndia customer review data.
 Developed a Shiny web application for customer segmentation using clustering and rule-based algorithms.
 Analyzed call log data for the onboarding team to understand how registered users become investors.

ThoughtBit Technologies

Data Science Intern

Dec 2017 – Jan 2018 · 1 mo · Chennai Area, India

 Designed a business strategy for applying data science to GST data, specifically sales and purchase data for small and medium manufacturing sectors.
 Engineered data pipelines to fetch data from warehouses and created interactive data visualizations and dashboards using Python.
 Conducted statistical analysis on the GST data and created cron jobs using Python and SQL to model future outcomes.

Indian Institute of Information Technology Design & Manufacturing Kancheepuram

Research Intern on Graph Theory

May 2017 – Jul 2017 · 2 mos · Chennai Area, India

Research thesis on graph subclasses, covering:
 Structural observations on 3-degree graphs with C3, C4, and maximum chordality 4.
 An algorithm for constructing this graph class, with proofs of correctness and completeness.
 Polynomial-time algorithms, with proofs of correctness, for the Steiner tree, vertex cover, feedback vertex set (FVS), odd cycle transversal (OCT), and dominating set problems on this graph class. All of these problems are NP-hard in general.
All research writing was done in LaTeX.