Sarvesh Khetan

AI Researcher

College Park, Maryland, United States · 2 yrs 2 mos experience
AI ML Practitioner · AI Enabled

Key Highlights

  • Over 2 years of experience as an applied data scientist.
  • Expertise in Generative AI and Large Language Models.
  • Passionate about self-improvement and personal growth.
Stackforce AI infers this person is a Data Scientist specializing in AI and Big Data solutions.

Skills

Core Skills

Generative AI, Large Language Models (LLM), Big Data, Machine Learning, Data Engineering, Data Warehousing, Forecasting, Automation, Web Development, Deep Learning

Other Skills

AWS API Gateway, AWS EMR, AWS Glue, AWS Lambda, AWS SageMaker, AWS Step Functions, AWS AppFlow, Airflow, Amazon Elastic MapReduce (EMR), Amazon S3, Amazon Web Services (AWS), Analytical Skills, Apache Spark, Big Data Analytics

About

Hi there! 👋 I'm Sarvesh, a tech enthusiast documenting my learnings on Medium (https://medium.com/@khetansarvesh). Check out my GitHub repositories for implementations of these articles (https://github.com/khetansarvesh).

I'm currently pursuing my Masters in Machine Learning at the University of Maryland, College Park, and have over 2 years of experience as an applied data scientist. I have worked on GenAI, Big Data, Machine Learning, and Full Stack Data Science projects. Along the way, I have worked with tools and frameworks such as PyTorch, LangChain, Neo4j, Kafka, Kubernetes, and AWS. My interests range from distributed systems and big data to AI, and I'm always eager to explore new technological horizons. My day mostly revolves around exploring and understanding SOTA ML/DL algorithms such as Transformers, attention mechanisms, diffusion models, and graph neural networks, with some flavors of data tools like Snowflake and MongoDB.

On a personal level, I'm passionate about self-improvement and personal growth. I believe that we can always learn and grow, both professionally and personally. I love reading books 📗, listening to podcasts 🎙️, and playing sports 🏏🏓🏸. I actively create content on LinkedIn and Medium to help others build a career in tech. I hope you have a lovely day!

Experience

Total Experience: 2 yrs 2 mos
Average Tenure: 2 yrs 2 mos
Current Experience: --

MicroStrategy (now Strategy)

AI Engineer

Jun 2025 – Aug 2025 · 2 mos · United States · Hybrid

  • Optimized a multi-agent ReAct-style Deep Research system with LangGraph by integrating planners and external tools, enabling coordinated search and achieving a 40% improvement in LLM response quality.
  • Built a long-context LLM evaluator using LLM-as-a-Judge and checklist scoring, achieving 94% agreement with human labels on complex, multi-agent outputs.
LangGraph, LLM response quality, multi-agent systems, Generative AI, Large Language Models (LLM)

Piramal Capital & Housing Finance Limited

2 roles

Manager (DS2)

Promoted

Mar 2023 – Aug 2024 · 1 yr 5 mos · Bengaluru, Karnataka, India · Hybrid

  • ▇ Designed and Developed Batch Feature Store Micro Service
  • ➤ Contributed to the low-level design (LLD) and implemented feature-creation code using distributed processing in PySpark
  • ➤ Reduced redundant work for the modelling team, increasing their efficiency by 2x
  • ➤ Tech Stack : AWS API Gateway, AWS Lambda, AWS Glue/EMR, Spark
  • ▇ QA Chatbot on Snowflake Tabular Data (Generative AI)
  • ➤ Started with SQL code generation but quickly realised it was not a scalable solution because LLMs were not advanced enough to write complex join conditions
  • ➤ Experimented with LangChain and LlamaIndex, but later developed our own internal versions of the same
  • ➤ (Graph LLM) Transformed tabular data into a knowledge graph to generate Cypher queries (con: does not scale as the graph grows)
  • ➤ Finally settled on a RAG-based strategy over the knowledge graph, inspired by groundbreaking papers such as Node2Vec and DeepWalk, in which we created node embeddings of the graph
  • ➤ Handled end-to-end development and deployed on AWS ECS
  • ➤ The successful implementation earned the SLT's trust in this new technology and led to a series of successful AI initiatives that propelled the company's presence in the Generative AI space
  • ➤ Tech Stack : AWS SageMaker, AWS Glue, Databricks, PostgreSQL
  • ▇ MLOPS / LLMOPS
  • ➤ Deployed open-source LLMs on AWS infrastructure
  • ➤ Deployed LLM-based applications such as QA chatbots, a hiring module, and a customer call analysis model
  • ➤ Tech Stack : Docker, AWS ECS
AWS API Gateway, AWS Lambda, AWS Glue, PySpark, SQL, Databricks, +2 more

Management Trainee (DS1)

Jun 2022 – Mar 2023 · 9 mos · Bengaluru, Karnataka, India · Hybrid

  • ▇ Re-architected loan data flow pipeline
  • ➤ Implemented multiple data views, namely: Data Caffe (SOR) Layer, DataMart Layer, Aggregate Layer
  • ➤ Added SCD1 (incremental extraction) and SCD2 (incremental update) to reduce table load time and thus optimize DWH costs
  • ➤ Introduced and implemented the SPC (Data Quality - DQ) concept on all tables of this layer
  • ➤ This redesigned architecture reduced query cost and improved analyst productivity by 2-3x
  • ➤ Introduced the team to the advantages of modular coding
  • ➤ Started the culture of using Confluence for documentation and Jira for progress tracking
  • ➤ Tech Stack : SQL, Spark, AWS Glue, AWS EMR, Airflow
  • ▇ Responsible for redesigning the backend for the executive-level dashboard (referred to by CXOs and the board of directors)
  • ➤ Pivoted the data flow pipeline from individual system handling to the Data Caffe Layer, which shrank the codebase by 10x and cut query runtime cost by 5x
  • ▇ Designed and Developed a Forecasting Model
  • ➤ Used a DL-based Transformer decoder model to forecast daily sanction, disbursement, and login counts and amounts
  • ➤ Achieved roughly 90% accuracy, giving the SLT visibility into the future so it could expand operations accordingly
  • ➤ Tech Stack : PyTorch, Deep Learning, AWS SageMaker, AWS ECS
  • ▇ Contributed to hiring drives, conducted interviews, and designed the onboarding roadmap for new hires
SQL, Spark, AWS Glue, Airflow, Data Engineering, Data Warehousing

CommerceIQ

Data Scientist

Mar 2022 – Jun 2022 · 3 mos · Bengaluru, Karnataka, India

  • ▇ Demand Planning / Forecasting
  • ➤ Used the FBProphet algorithm to model demand
  • ➤ Implemented a 'one-click automation' feature that automatically performs operations on Excel files using openpyxl, saving substantial manual effort
  • ➤ Productionized batch modelling code by modularizing the codebase, treating each module as a microservice with exception handling (OOP) and DQ checks per module
  • ➤ Stitched all modules together via AWS Step Functions
FBProphet, openpyxl, AWS Step Functions, Forecasting, Automation

Piramal Capital & Housing Finance Limited

Data Scientist - Internship

Jul 2021 – Dec 2021 · 5 mos · Bengaluru, Karnataka, India

  • ▇ Insights Lab
  • ➤ Developed an MVP website: a one-stop site for hundreds of Power BI dashboards
  • ➤ Enabled data governance for each dashboard
  • ➤ Tech Stack : HTML, CSS, JavaScript
  • ▇ Self Service Business Intelligence (SSBI)
  • ➤ Several teams were repeatedly querying the same backend data, increasing database server load
  • ➤ Implemented the architecture design and DataMart for the SSBI platform
  • ➤ Tech Stack : Spark, AWS Glue, AWS Managed Airflow
  • ▇ Text2SQL Engine
  • ➤ For the above SSBI platform, implemented a Seq2Seq RNN model to convert user input text to SQL
  • ➤ Achieved a BLEU score of 0.4
  • ➤ Tech Stack : AWS SageMaker, PyTorch
HTML, CSS, JavaScript, Web Development

TVS Motor Company

Deep Learning Engineer

May 2020 – Jun 2020 · 1 mo · Chennai, Tamil Nadu, India · Remote

  • ▇ Real Estate (Household) Price Prediction
  • ➤ Performed data integration using pandas (handled a 9621 x 238 time series dataset)
  • ➤ Implemented feature engineering and selection using regression algorithms (SciPy)
  • ➤ Achieved >85% accuracy using a bidirectional LSTM model for multivariate time series forecasting
  • ➤ Deployed it at the Chennai campus so that the TVS sales team can make data-driven investment decisions
  • ➤ Tech Stack : PyTorch
PyTorch, Deep Learning

Education

University of Maryland

Masters in Applied Machine Learning - Artificial Intelligence

Aug 2024 – Dec 2025

Birla Institute of Technology and Science, Pilani

Minor in Data Science - Data Modeling/Warehousing and Database Administration

Aug 2020 – Aug 2022

International Institute of Information Technology Hyderabad (IIITH)

Summer School - Deep Learning in NLP

Jul 2021 – Jul 2021

Birla Institute of Technology and Science, Pilani

Bachelor of Technology - BTech

Jun 2018 – Jun 2022

Pace Junior Science College, Andheri

Jan 2016 – Jan 2018

N.L. Dalmia High School - India

Jan 2005 – Jan 2016