Aditya Modi

AI Researcher

Sunnyvale, California, United States · 9 yrs 3 mos experience
Highly Stable · AI Enabled

Key Highlights

  • Expert in reinforcement learning and large language models.
  • PhD in AI with a focus on efficient exploration.
  • Proven impact in optimizing ad decision-making systems.
Stackforce AI infers this person is a Machine Learning Researcher specializing in AdTech and AI-driven optimization.

Skills

Core Skills

Reinforcement Learning · Large Language Models (LLM) · Machine Learning · Bandit Algorithms

Other Skills

Algorithms · Artificial Intelligence (AI) · Deep Reinforcement Learning · Machine Learning Algorithms · PyTorch · Python · Recommender Systems · Statistics

About

I'm a machine learning researcher with a strong background in designing, deploying, and scaling ML-driven optimization for foundation models and decision-making systems. I specialize in translating fundamental research in large language models and statistical modeling into production-ready models for advertising and recommender systems. I am excited to innovate at the intersection of reinforcement learning (RL) and foundation models and to find applications in existing and novel domains. I also work on open-ended research problems in these areas and am always open to collaborations.

I completed my PhD in computer science at the University of Michigan, Ann Arbor in November 2021. During my PhD, my core research area was reinforcement learning, with a focus on efficient exploration and the statistical efficiency of learning. Broadly, I worked on developing methods with provable guarantees for sequential decision-making frameworks such as reinforcement learning, (contextual) bandits, active learning, and general online learning.

Experience

Meta

Senior Research Scientist

Oct 2025 - Present · 5 mos · Sunnyvale, California, United States · On-site

Microsoft AI

Data and Applied Scientist

Dec 2021 - Sep 2025 · 3 yrs 9 mos · San Francisco Bay Area · Hybrid

  • Worked on applied research problems in the Microsoft Ads marketplace and serving team (data and applied sciences division of Microsoft). Devised and applied reinforcement learning and bandit algorithms, and their combination with multimodal language models, across components of the Microsoft ads pipeline. Drove product impact of data-driven decision making in whole page optimization, ad creative generation, and multi-objective optimization. Conducted research on fundamental topics in interactive ML and decision making.
  • Keywords: Reinforcement learning, contextual bandits, LLMs, recommender systems, whole page optimization.
Python · PyTorch · Statistics · Reinforcement Learning · Bandit Algorithms · Deep Reinforcement Learning · +4

Microsoft

Research Intern

Jul 2018 - Oct 2018 · 3 mos · Redmond, WA

  • Research Intern in the Adaptive Systems and Interactions (ASI) group in Microsoft Research AI under Debadeepta Dey and Eric Horvitz. Worked on the application of contextual bandit, learning to search and policy search methods to input-adaptive parameter/algorithm selection across components in any modular software pipeline. Work published in AAAI 2020.
Python · PyTorch · Statistics · Reinforcement Learning · Bandit Algorithms · Deep Reinforcement Learning · +2

University of Michigan College of Engineering

3 roles

Graduate Student Research Assistant

Promoted

Apr 2017 - Nov 2021 · 4 yrs 7 mos

  • Advised by Prof. Ambuj Tewari and Prof. Satinder Singh.
Python · PyTorch · Statistics · Reinforcement Learning · Bandit Algorithms · Deep Reinforcement Learning · +3

Graduate Student Instructor

Jan 2017 - Apr 2017 · 3 mos

  • GSI for the course EECS445: Machine Learning
  • Instructor: Prof. Jenna Wiens
  • Responsible for teaching discussion sessions, curating homework and projects for the course among other administrative responsibilities.
Python · PyTorch · Statistics · Reinforcement Learning · Bandit Algorithms · Deep Reinforcement Learning · +3

Graduate Student Research Assistant

Sep 2016 - Dec 2016 · 3 mos

  • Advised by Prof. Ambuj Tewari (STATS/CSE) and Prof. Barzan Mozafari (CSE). Studied the sample complexity of importance-weighted active learning (IWAL) algorithms based on data-dependent complexity measures for bounded loss functions.
Python · PyTorch · Statistics · Reinforcement Learning · Bandit Algorithms · Deep Reinforcement Learning · +3

Indian Institute of Technology, Kanpur

Teaching Assistant

Jul 2015 - Dec 2015 · 5 mos · Greater Lucknow Area

  • Course: Data Structures and Algorithms (ESO207)
  • Assisted the instructor in running the course, designing assignment problems, and grading assignments and exam papers for semester I of 2015-16.
Python · Statistics · Deep Reinforcement Learning · Machine Learning · Machine Learning Algorithms

Microsoft

Research Intern

May 2015 - Jul 2015 · 2 mos · Greater Bengaluru Area

  • Undergraduate research internship advised by Principal Applied Scientist Sundararajan Sellamanickam.
  • The work proposed a method for estimating non-decomposable performance measures (ROC curve, PR curve, F-measure) of black-box classifiers from sparsely labelled datasets.
Python · Statistics · Deep Reinforcement Learning · Machine Learning · Machine Learning Algorithms

Education

University of Michigan College of Engineering

Doctor of Philosophy (PhD) — Artificial Intelligence

Sep 2016 - Nov 2021

Indian Institute of Technology, Kanpur

Bachelor of Technology (BTech) — Computer Science and Engineering

Jan 2012 - Jan 2016

Aklank Public School, Kota (Raj.)

AISSE (CBSE)

Jan 2010 - Jan 2012

Holy Hearts Educational Academy, Raipur (C.G.)

Matriculation

Jan 2010 - Present
