Aditya Modi

AI Researcher

Sunnyvale, California, United States · 9 yrs 4 mos experience

Highly Stable · AI Enabled

Key Highlights

Expert in reinforcement learning and large language models.
PhD in AI with a focus on efficient exploration.
Proven impact in optimizing ad decision-making systems.

Stackforce AI infers this person is a Machine Learning Researcher specializing in AdTech and AI-driven optimization.

Contact

adityamodi94@gmail.com LinkedIn

Skills

Core Skills

Reinforcement Learning, Large Language Models (LLM), Machine Learning, Bandit Algorithms

Other Skills

Algorithms, Artificial Intelligence (AI), Deep Reinforcement Learning, Machine Learning Algorithms, PyTorch, Python, Recommender Systems, Statistics

About

I'm a machine learning researcher with a strong background in designing, deploying, and scaling ML-driven optimization for foundation models and decision-making systems. I specialize in translating fundamental research in large language models and statistical modeling into production-ready models for advertising and recommender systems. I am excited to innovate at the intersection of reinforcement learning (RL) and foundation models and to find applications in existing and novel domains. I also work on open-ended research problems in these areas and am always open to collaborations.

I completed my PhD in computer science at the University of Michigan, Ann Arbor in November 2021. During my PhD, my core research area was reinforcement learning, with a focus on efficient exploration and the statistical efficiency of learning. Broadly, I worked on developing methods with provable guarantees for sequential decision-making frameworks such as reinforcement learning, (contextual) bandits, active learning, and general online learning.

Experience

Total Experience: 9 yrs 4 mos

Average Tenure: 2 yrs 11 mos

Current Experience: 7 mos

Microsoft AI

Data and Applied Scientist

Dec 2021 – Sep 2025 · 3 yrs 9 mos · San Francisco Bay Area · Hybrid

Working on applied research problems in the Microsoft Ads marketplace and serving team (data and applied sciences division of Microsoft). Devising reinforcement learning and bandit algorithms, and combining them with multimodal language models, across components of the Microsoft Ads pipeline. Driving product impact through data-driven decision making in whole-page optimization, ad creative generation, and multi-objective optimization. Conducting research on fundamental topics in interactive ML and decision making.
Keywords: reinforcement learning, contextual bandits, LLMs, recommender systems, whole-page optimization.

Python, PyTorch, Statistics, Reinforcement Learning, Bandit Algorithms, Deep Reinforcement Learning (+4 more)

Microsoft

Research Intern

Jul 2018 – Oct 2018 · 3 mos · Redmond, WA

Research Intern in the Adaptive Systems and Interactions (ASI) group at Microsoft Research AI under Debadeepta Dey and Eric Horvitz. Worked on applying contextual bandit, learning-to-search, and policy search methods to input-adaptive parameter and algorithm selection across the components of a modular software pipeline. The work was published at AAAI 2020.

Python, PyTorch, Statistics, Reinforcement Learning, Bandit Algorithms, Deep Reinforcement Learning (+2 more)

University of Michigan College of Engineering

3 roles

Graduate Student Research Assistant

Promoted

Apr 2017 – Nov 2021 · 4 yrs 7 mos

Advised by Prof. Ambuj Tewari and Prof. Satinder Singh.

Python, PyTorch, Statistics, Reinforcement Learning, Bandit Algorithms, Deep Reinforcement Learning (+3 more)

Graduate Student Instructor

Jan 2017 – Apr 2017 · 3 mos

GSI for the course EECS445: Machine Learning
Instructor: Prof. Jenna Wiens
Responsible for teaching discussion sessions, curating homework and projects for the course, and handling other administrative duties.

Python, PyTorch, Statistics, Reinforcement Learning, Bandit Algorithms, Deep Reinforcement Learning (+3 more)

Graduate Student Research Assistant

Sep 2016 – Dec 2016 · 3 mos

Advised by Prof. Ambuj Tewari (STATS/CSE) and Prof. Barzan Mozafari (CSE). Studied the sample complexity of importance-weighted active learning (IWAL) algorithms based on data-dependent complexity measures for bounded loss functions.

Python, PyTorch, Statistics, Reinforcement Learning, Bandit Algorithms, Deep Reinforcement Learning (+3 more)

Indian Institute of Technology, Kanpur

Teaching Assistant

Jul 2015 – Dec 2015 · 5 mos · Greater Lucknow Area

Course: Data Structures and Algorithms (ESO207)
Assisted the instructor in running the course, designing assignment problems, and grading assignments and exam papers for semester I of 2015-16.

Python, Statistics, Deep Reinforcement Learning, Machine Learning, Machine Learning Algorithms

Microsoft

Research Intern

May 2015 – Jul 2015 · 2 mos · Greater Bengaluru Area

Undergraduate research internship advised by Principal Applied Scientist Sundararajan Sellamanickam.
The work proposed a method for estimating non-decomposable performance measures (ROC curve, PR curve, F-measure) of black-box classifiers from sparsely labelled datasets.

Python, Statistics, Deep Reinforcement Learning, Machine Learning, Machine Learning Algorithms