Raghav Gupta

AI Researcher

Mountain View, California, United States · 10 yrs experience
Highly Stable

Key Highlights

  • Expert in NLP and LLM research.
  • Led multi-objective reinforcement learning projects.
  • Pioneered scalable dialog systems at Google.
Stackforce AI infers this person is a leading AI researcher specializing in NLP and reinforcement learning.

Skills

Core Skills

Natural Language Processing (NLP) · Reinforcement Learning · Supervised Learning

Other Skills

Algorithms · C · C++ · Computer Vision · Data Analysis · Deep Learning · Java · Large Language Models (LLM) · Linux · MATLAB · Programming · Python · Research · Synthetic Data Generation

About

ML researcher at Biohub (continuing on from EvolutionaryScale) working at the intersection of AI and biology. Previously at Google DeepMind and Google Research working on NLP and LLM research, focusing on LLM post-training and evaluation, reinforcement learning, conversational agents, and synthetic data. Before that, at Stanford and IIT Bombay studying computer science and artificial intelligence.

Experience

Biohub

Research Scientist

Dec 2025 - Present · 3 mos · Redwood City, California, United States · Hybrid

EvolutionaryScale

Member of Technical Staff

Sep 2025 - Dec 2025 · 3 mos · San Francisco, California, United States

Google DeepMind

Research Engineer

May 2024 - Aug 2025 · 1 yr 3 mos · Mountain View, California, United States

  • Research and development in reinforcement learning and LLM-powered conversational agents (published work at AAAI'25, Findings EMNLP'24):
  • Multi-objective reinforcement learning for LLM alignment:
  • Co-led team of 10+ researchers on algorithms for multi-objective reinforcement learning (CLP) and preference optimization (MO-ODPO) for inference-time steerable LLMs with param/prompt-based conditioning. Results on Anthropic-HH and OpenAI summarization. (AAAI'25, Findings EMNLP'24)
  • Deployed CLP in AI Overview in Google Search; replaced manual reward tuning; accelerated development cycles by 30%.
  • Project Astra: Core contributor; TL for team of 4 improving tool use & reasoning capabilities for Astra across verticals.
  • Built cross-vertical framework for synthetic conversational tool use data with LLM self-play (task success 76% → 90%).
  • Developed multi-aspect, multi-turn LLM-as-a-judge framework; devised novel conversation-level and turn-level metrics.
Natural Language Processing (NLP) · Supervised Learning · Synthetic Data Generation · Large Language Models (LLM) · Reinforcement Learning

Google

Software Engineer

Aug 2017 - Apr 2024 · 6 yrs 8 mos · Mountain View, CA

  • Research and development in task-oriented dialog (TOD) systems and efficient BERT models deployed in multiple products. Published extensively at *ACL/EMNLP, AAAI:
  • Schema-guided dialog: Pioneered 'schema-guided' paradigm of TOD systems that scale with little/no training data.
  • Released dialog datasets: Schema-Guided Dialogue, MultiWOZ 2.2 (1.5k GitHub stars). (AAAI'22, NLP4ConvAI'22, AAAI'20)
  • Created SotA scalable dialog models for natural language understanding & state tracking. (EMNLP'23, NAACL'22, ACL'19)
  • On-device BERT modeling: Research lead on mixed-vocabulary BERT distillation.
  • Developed first sub-5 MB (unquantized) BERT models (97% lower latency vs. BERT-Base, negligible downstream accuracy loss). (EACL'21)
  • Launched distilled multilingual BERT to Voice Access (accuracy 63% → 96%, 200K DAU) and Google Recorder (662K MAU).
  • Maintained BERT models at multiple sizes supporting 20+ partner product teams.
  • Conversational retrieval: Developed effective sparse document retrieval methods (98% less training time vs. dense retrievers) for customer support chats; SotA results on conversational recommendation. (NLP4ConvAI'23 - Outstanding Paper)
Natural Language Processing (NLP) · Supervised Learning · Synthetic Data Generation · Large Language Models (LLM) · Reinforcement Learning

Recruiter.ai

Data Science Intern

Jun 2016 - Sep 2016 · 3 mos · Palo Alto, CA

  • Applied deep learning and social network analysis to discover high-quality and relevant candidates from GitHub for the candidate search portal, and effected numerous improvements to the relevance ranking engine.

Stanford University

2 roles

Teaching Assistant

Jan 2016 - Jun 2017 · 1 yr 5 mos · Palo Alto, CA

  • Served as teaching assistant for:
  • CS224S: Spoken Language Processing (Spring 2016-17)
  • CS124: From Languages to Information (Winter 2015-16 and Winter 2016-17)
  • CS154: Automata and Complexity Theory (Autumn 2016-17)

Research Assistant

Sep 2015 - Jun 2016 · 9 mos · Palo Alto, CA

  • Worked in the Stanford NLP Group on tree-structured neural network models with attention mechanisms for natural language inference (paper in ACL). Also worked on self-training strategies for slot filling in knowledge base construction for the TAC-KBP challenge.

Bar-Ilan University

Research Intern

Jun 2015 - Aug 2015 · 2 mos · Ramat Gan, Israel

  • Worked at the intersection of corpus creation, crowdsourcing, and linguistic theory. Explored approaches to enlarging annotated treebanks through MTurk with minimally trained annotators, using cues from linguistic theory, to support the development of advanced parsing algorithms. The project focused on verb-particle constructions and light-verb constructions in English.

Samsung Electronics

Software Engineering Intern

May 2014 - Jul 2014 · 2 mos · Gyeonggi, South Korea

  • Worked on standardization efforts for MPEG-DASH (Dynamic Adaptive Streaming over HTTP). Experimented with various DASH architectures on top of HTTP/2.0 and proposed a new framework better suited to the future of the web.

Institute of Science and Technology Austria

Research Intern

May 2013 - Jul 2013 · 2 mos · Klosterneuburg, Austria

  • Worked on biological auction theory and combinatorial game theory:
  • Generalized classical results for evolutionarily stable strategies in all-pay auctions from the one reward per auction case to the multiple rewards per auction case. Published in Proceedings of the Royal Society: Biological Sciences.
  • Devised and implemented a novel approximate algorithm to minimize the total expected cost of the almost-sure reachability objective in POMDPs. Papers in AAAI 2015 and ICRA 2015.

Education

Stanford University

Master’s Degree — Computer Science

Jan 2015 - Jan 2017

Indian Institute of Technology, Bombay

Bachelor's Degree — Computer Science and Engineering

Jan 2011 - Jan 2015
