Karthik Abinav Sankararaman

AI Researcher

San Francisco, California, United States · 9 yrs 6 mos experience
Most Likely To Switch · Highly Stable

Key Highlights

  • Expert in reinforcement learning and large language models.
  • Led impactful AI research at Meta Superintelligence Labs.
  • Published multiple papers on algorithmic foundations.
Stackforce AI infers this person is a leading AI researcher specializing in reinforcement learning and large language models.

Skills

Core Skills

Reinforcement Learning, Large Language Models (LLM), Bandit Algorithms

Other Skills

Active Learning, Agentic Systems, Algorithms, Android, Bayesian Optimization, C, C++, CSS, Computer Science, Contextual Bandits, Data Structures, Django, Eclipse, HTML, Java

About

I do research on frontier model training: mid/post-training, RLHF, reward modeling, tool use, and agentic behavior. My focus is on making large language models more capable, reliable, and aligned with how people actually use them.

At Meta Superintelligence Labs, I lead research on Llama/Meta AI, setting technical direction across data & RL, factuality, model personality & EQ, tool use, and agentic systems. I've developed novel RL algorithms, reward modeling pipelines, and data flywheel systems for continuous model improvement, and worked across teams to translate this research into every Llama release since 2023.

Before moving to frontier models, I developed RL and bandit algorithms deployed across Meta's major product surfaces (ads, recommendations, and content integrity), leading to significant cumulative business impact. This grounded my research in what it means to build systems that work reliably at scale. Along with product impact, I have published several papers covering the algorithmic aspects of this work.

I hold a PhD from the University of Maryland in sequential decision making and bandit theory. To know more about me and my research, visit my personal webpage: karthikabinavs.xyz

Experience

Total Experience: 9 yrs 6 mos
Average Tenure: 2 yrs 4 mos
Current Experience: 6 yrs 6 mos

Meta

2 roles

AI Research Scientist, Frontier Model Research

Promoted

Jan 2023 – Present · 3 yrs 3 mos · On-site

  • Lead research on frontier model training for Llama and Meta AI, setting technical direction across post-training, RLHF, reward modeling, tool use, and agentic systems. Developed novel RL algorithms and data flywheel systems for continuous model improvement. Led post-training efforts across teams for every Llama release since 2023. Launched Meta AI in 2023; with continuous model quality improvement, it now has over 1 billion MAU.
RLHF, Reinforcement Learning, Large Language Models (LLM), Reward Modeling, Tool Use, Agentic Systems

AI Research Scientist, AI for Products

Sep 2019 – Dec 2022 · 3 yrs 3 mos · On-site

  • Developed RL and bandit algorithms deployed across Meta's major product surfaces — ads, recommendations, and content integrity — delivering significant revenue and cost impact. Work spanned contextual bandits, active learning, lookalike modeling, and Bayesian Optimization, with multiple publications covering the algorithmic foundations.
  • June 2022 – Dec 2022: Modern Recommender Systems - Built next-generation industry-leading recommendation systems.
  • Sept 2021 – June 2022: Facebook/Meta AI Integrity - Developed core integrity systems to keep the platform safe at scale.
  • Sept 2019 – Sept 2021: Online Advertising - Designed and shipped RL-based ad delivery systems, including a first-of-its-kind contextual bandit system driving significant revenue impact.
Reinforcement Learning, Bandit Algorithms, Contextual Bandits, Active Learning, Bayesian Optimization

Microsoft Research India

2 roles

Research

Jun 2018 – Sep 2018 · 3 mos · New York City

  • Worked on multi-armed bandits with resource constraints. Published a paper at FOCS '19 with Nicole Immorlica, Rob Schapire, and Alex Slivkins.

Visiting Researcher

May 2017 – Jul 2017 · 2 mos · Bengaluru, Karnataka, India

  • Joint work while visiting IISc.

Indian Institute of Science (IISc)

Research

May 2017 – Jul 2017 · 2 mos · Bengaluru, Karnataka, India

  • Worked on robust algorithms for causal inference. Published a paper at UAI '19 with Navin Goyal and Anand Louis.

IBM Almaden Research Center

Research

May 2016 – Aug 2016 · 3 mos

  • Worked on applied machine learning projects. Designed and implemented neural network architectures for time series forecasting in the finance domain.

Adobe

Algorithms Research

Jun 2015 – Aug 2015 · 2 mos · San Jose

  • Designed and implemented algorithms for bidding in auctions and for entity resolution in databases.

University of Michigan

Research

Jun 2013 – Jul 2013 · 1 mo · Ann Arbor

  • Worked on private algorithms in discrete-event systems. Published a paper at WODES '14 with Yi-Chin Wu and Stéphane Lafortune.

Teritree Technologies

Founding Engineer

May 2012 – Jul 2012 · 2 mos

Early-Stage Startups

Founding Engineer

Jan 2012 – Jul 2014 · 2 yrs 6 mos

Education

University of Maryland

Doctor of Philosophy (Ph.D.) — Computer Science

Indian Institute of Technology, Madras

Bachelor of Technology Honours (BTech Hons.) — Computer Science and Engineering
