Karthik Abinav Sankararaman

AI Researcher

San Francisco, California, United States · 9 yrs 6 mos experience

Most Likely To Switch: Highly Stable

Key Highlights

Expert in reinforcement learning and large language models.
Led impactful AI research at Meta Superintelligence Labs.
Published multiple papers on algorithmic foundations.

Stackforce AI infers this person is a leading AI researcher specializing in reinforcement learning and large language models.

Contact

karthikabinavs@gmail.com LinkedIn

Skills

Core Skills

Reinforcement Learning, Large Language Models (LLM), Bandit Algorithms

Other Skills

Active Learning, Agentic Systems, Algorithms, Android, Bayesian Optimization, C, C++, CSS, Computer Science, Contextual Bandits, Data Structures, Django, Eclipse, HTML, Java

About

I do research on frontier model training: mid/post-training, RLHF, reward modeling, tool use, and agentic behavior. My focus is on making large language models more capable, reliable, and aligned with how people actually use them.

At Meta Superintelligence Labs, I lead research on Llama/MetaAI, setting technical direction across data & RL, factuality, model personality & EQ, tool use, and agentic systems. I've developed novel RL algorithms, reward modeling pipelines, and data flywheel systems for continuous model improvement, and worked across teams to translate this research into every Llama release since 2023.

Before moving to frontier models, I developed RL and bandit algorithms deployed across Meta's major product surfaces (ads, recommendations, and content integrity), leading to significant cumulative business impact. This grounded my research in what it means to build systems that work reliably at scale. Along with product impact, I have published several papers covering the algorithmic aspects of this work.

I hold a PhD from the University of Maryland in sequential decision making and bandit theory. To learn more about me and my research, visit my personal webpage: karthikabinavs.xyz

Experience

9 yrs 6 mos

Total Experience

2 yrs 4 mos

Average Tenure

6 yrs 6 mos

Current Experience

Meta

2 roles

AI Research Scientist, Frontier Model Research

Promoted

Jan 2023 – Present · 3 yrs 3 mos · On-site

Lead research on frontier model training for Llama and Meta AI, setting technical direction across post-training, RLHF, reward modeling, tool use, and agentic systems. Developed novel RL algorithms and data flywheel systems for continuous model improvement. Led post-training efforts across teams for every Llama release since 2023. Meta AI first launched in 2023 and, with continuous model quality improvement, now has over 1 billion MAU.

RLHF, Reinforcement Learning, Large Language Models (LLM), Reward Modeling, Tool Use, Agentic Systems

AI Research Scientist, AI for Products

Sep 2019 – Dec 2022 · 3 yrs 3 mos · On-site

Developed RL and bandit algorithms deployed across Meta's major product surfaces — ads, recommendations, and content integrity — delivering significant revenue and cost impact. Work spanned contextual bandits, active learning, lookalike modeling, and Bayesian Optimization, with multiple publications covering the algorithmic foundations.
June 2022 – Dec 2022: Modern Recommender Systems - Built next-generation industry-leading recommendation systems.
Sept 2021 – June 2022: Facebook/Meta AI Integrity - Developed core integrity systems to keep the platform safe at scale.
Sept 2019 – Sept 2021: Online Advertising - Designed and shipped RL-based ad delivery systems, including a first-of-its-kind contextual bandit system driving significant revenue impact.

Reinforcement Learning, Bandit Algorithms, Contextual Bandits, Active Learning, Bayesian Optimization

Microsoft Research India

2 roles

Research

Jun 2018 – Sep 2018 · 3 mos · New York City

Worked on multi-armed bandits with resource constraints. Published a paper at FOCS '19 with Nicole Immorlica, Rob Schapire, and Alex Slivkins.

Visiting Researcher

May 2017 – Jul 2017 · 2 mos · Bengaluru, Karnataka, India

Joint work carried out while visiting IISc.

Indian Institute of Science (IISc)

Research

May 2017 – Jul 2017 · 2 mos · Bengaluru, Karnataka, India

Worked on robust algorithms for causal inference. Published a paper at UAI '19 with Navin Goyal and Anand Louis.

IBM Almaden Research Center

Research

May 2016 – Aug 2016 · 3 mos

Worked on applied machine learning projects. Designed and implemented neural network architectures for time series forecasting in the finance domain.

Adobe

Algorithms Research

Jun 2015 – Aug 2015 · 2 mos · San Jose

Worked on the design and implementation of algorithms for bidding in auctions and entity resolution in databases.

University of Michigan

Research

Jun 2013 – Jul 2013 · 1 mo · Ann Arbor

Worked on private algorithms in discrete-event systems. Published a paper at WODES '14 with Yi-Chin Wu and Stéphane Lafortune.

Teritree Technologies

Founding Engineer

May 2012 – Jul 2012 · 2 mos

Early-stage startups

Founding Engineer

Jan 2012 – Jul 2014 · 2 yrs 6 mos

Education

University of Maryland

Doctor of Philosophy (Ph.D.) — Computer Science

Indian Institute of Technology, Madras

Bachelor of Technology Honours (BTech Hons.) — Computer Science and Engineering
