
Yash Bhalgat

AI Researcher

Oxford, England, United Kingdom · 7 yrs 9 mos experience
Most Likely To Switch · AI ML Practitioner

Key Highlights

  • PhD researcher at the prestigious Visual Geometry Group.
  • Expertise in video generation and multimodal AI.
  • Proven track record in developing cutting-edge AI solutions.
Stackforce AI infers this person is a Researcher specializing in AI and Computer Vision with a focus on generative models.

Skills

Core Skills

Computer Vision, Machine Learning, Video Generation, Generative AI, Deep Learning, Artificial Intelligence (AI), Natural Language Processing (NLP), Software Development

Other Skills

Computer Graphics, 3D Reconstruction, Augmented Reality (AR), Virtual Reality (VR), World Models, Large Language Models (LLM), Retrieval-Augmented Generation (RAG), Multimodal, Edge Computing, AR/VR, Project Management, Algorithms, Python (Programming Language), C++, Object-Oriented Programming (OOP)

About

PhD Researcher with the Visual Geometry Group at Oxford, working on Video Generation for World Modeling, 3D Computer Vision (Understanding and Generation) and Vision-Language Foundation Models. Previously, I was a Research Scientist at Qualcomm AI Research, where I worked on algorithm and system design to develop efficient deep networks for computer vision use cases. I also worked at a startup, Voxel51 Inc., developing video processing pipelines for the cloud. I have an MS in CS from the University of Michigan, Ann Arbor and a Bachelor's from IIT Bombay. I have interned at Meta Reality Labs, IBM Research - Almaden, IBM India Research Lab, TCS Research and Infurnia (a startup based in Mumbai).

SKILLS: Python, C++, C, SQL, Julia, MATLAB, R, PyTorch, TensorFlow, Keras, OpenAI gym, Theano, CUDA, git

For more information, please visit my webpage: https://yashbhalgat.github.io

Experience

Total Experience: 7 yrs 9 mos
Average Tenure: 1 yr 7 mos
Current Experience: 4 yrs 8 mos

Meta

Research Scientist Intern

Apr 2025 - Sep 2025 · 5 mos

  • Building large-scale generative 3D/4D foundation models for World Modeling and photorealistic Video Generation.
Video Generation, World Models, Machine Learning, Computer Vision, Computer Graphics

Multiple startups

AI Consultant

Feb 2023 - Mar 2025 · 2 yrs 1 mo · Remote

  • 1. AI chip company: Developing real-time low-power Computer Vision algorithms for augmented reality on smart glasses.
  • 2. Content moderation company: Deploying Large Language Model (LLM) solutions to moderate multimodal data online.
  • 3. Togal.AI: Building Computer Vision solutions for detecting, measuring and comparing project features on architectural plans and drawings.
Large Language Models (LLM), Augmented Reality (AR), Retrieval-Augmented Generation (RAG), Generative AI, Computer Vision, Multimodal

University of Oxford

DPhil (PhD) Researcher, Visual Geometry Group

Oct 2021 - Present · 4 yrs 8 mos · Oxford, England, United Kingdom · On-site

  • Research focus: 3D/4D Reconstruction and Generation, Vision-Language (Multimodal) Foundation models, 3D+LLMs
  • Advisors: Andrew Zisserman, Andrea Vedaldi, Joao Henriques, Iro Laina.
  • Publications at CVPR, NeurIPS, ECCV, ACCV, ICLR and 3DV.
Generative AI, Artificial Intelligence (AI), Computer Vision, Machine Learning, Computer Graphics, 3D Reconstruction, +2 more

Qualcomm

2 roles

Senior Machine Learning Researcher - Qualcomm AI Research

Nov 2020 - Jul 2021 · 8 mos

Edge Computing, Deep Learning, Computer Vision, AR/VR, Project Management

Machine Learning Researcher - Qualcomm AI Research

Jun 2019 - Oct 2020 · 1 yr 4 mos

  • Efficient Deep Learning for Computer Vision: algorithm development and system design
  • Spearheaded the ultra-low resource always-on vision project, from model design and quantization through final hardware mapping
  • Filed 12 inventions in 2020-21, of which 6 have been filed for patent protection. Notable work on 3D hand-pose estimation, low-bit quantization, and structured and unstructured pruning.
  • Led Qualcomm’s team in the MicroNet Challenge at NeurIPS 2019 and placed 3rd in the ImageNet track [https://github.com/yashbhalgat/QualcommAI-MicroNet-submission-MixNet]
  • Managed/mentored interns - Jangho Kim and John Yang (PhD @ SNU) with contributions to the AR/VR project
Edge Computing, Computer Vision, Artificial Intelligence (AI), AR/VR, Algorithms

Voxel51

Computer Vision and Machine Learning Engineer

Jan 2019 - May 2019 · 4 mos

  • Researched and developed production pipelines for real-time vehicle tracking for querying on large-scale video databases
Machine Learning, Computer Vision, Software Development

IBM

AI Research Intern

Jun 2018 - Aug 2018 · 2 mos · Almaden, San Jose

  • Worked at IBM Almaden Research Lab with the Watson Languages group on task-agnostic classification in the presence of label noise
  • Built ensemble-based frameworks for combining weakly-labeled (or mislabeled) and high-quality samples for the training of a sentiment model.
  • Work accepted to KONVENS 2019
Natural Language Processing (NLP), Machine Learning, Artificial Intelligence (AI)

IFP Energies nouvelles

Research Intern

May 2017 - Jul 2017 · 2 mos · Paris

  • Used Scattering Wavelet Networks for the classification and segmentation of seismic structural "monads". The work was accepted as a paper at ICASSP 2018. You can read about it here: http://www.laurent-duval.eu/opus-cats-eyes-seismic-data-classification-scattering-networks.html

IBM

Research Intern

May 2016 - Jul 2016 · 2 mos · Bangalore

  • Used CorrNets, an autoencoder-based architecture, to learn a joint representation for images and captions. We obtained state-of-the-art results for search over large fashion catalogues without manual tagging.

Tata Research Development and Design Centre (TRDDC)

Research Intern

Nov 2015 - Dec 2015 · 1 mo · Pune, Maharashtra, India

  • Focusing on stamp detection and segmentation, we proposed a shape-based ranking algorithm to learn the first layer of a CNN, reaching 94% detection accuracy and 74.81% segmentation IoU. Work accepted as a short paper at the DAS 2016 conference.

Infurnia

Software Engineer Intern

May 2015 - Jul 2015 · 2 mos · Mumbai, Maharashtra, India · On-site

  • Software module development using CAD modelling engine
  • Developed a range of ‘constraint-modules’ for automated construction of furniture parts using the FreeCAD engine.

Focus Analytics

Indoor Navigation System - Intern

Nov 2014 - Dec 2014 · 1 mo

  • Designed an Indoor Navigation System using Pedometry and Particle filters.
  • My work (Pedometry) used IMU sensor data to:
  • 1. Determine the heading of motion with algorithms such as PCA and the TRIAD algorithm,
  • 2. Determine the step frequency with FFT, CWT, etc., and model the step length of the user.

Mars Society of India, IIT Bombay

Navigation and Image Processing Engineer

Aug 2014 - May 2015 · 9 mos

  • Development of a Mars rover with Mars Society of India for the University Rover Challenge organized by Mars Society, Utah, United States.
  • Worked on sensor calibration and testing for navigation
  • Also part of the Image Processing subsystem for vision-guided navigation of the rover.
Python (Programming Language), C++, Object-Oriented Programming (OOP), Software Development

Education

University of Oxford

Doctor of Philosophy - PhD — Artificial Intelligence and Computer Vision

Oct 2021 - Nov 2026

University of Michigan - Rackham Graduate School

Masters — Computer Science

Jan 2017 - Jan 2018

Indian Institute of Technology, Bombay

BTech with Honors — Electrical Engineering

Jan 2013 - Jan 2017

Indian Institute of Technology, Bombay

Minor — Computer Science

Jan 2013 - Jan 2017
