Vikram Gupta — CTO

I am a machine learning researcher with a passion for transforming cutting-edge research into AI-powered products. Over a career of 12+ years, I have been involved in building algorithms for ADAS, autonomous driving, occupant monitoring systems, foundational models, recommender systems, multimodal and multilingual semantic understanding, and engagement-focused chatbots.

My research has been published at top-tier computer vision, NLP and speech conferences: CVPR, ICCV, NeurIPS, AAAI, EMNLP, 3DV, Interspeech and ICASSP. I have been fortunate to collaborate with outstanding faculty and researchers from ETH Zurich, Toyota Technological Institute at Chicago (TTIC), Johns Hopkins University, University of Maryland, IBM-IRL and IIT Bombay on various research initiatives. Visit my Google Scholar profile if you would like to know more about my work: https://scholar.google.com/citations?user=jNjvdEgAAAAJ&hl=en

Research Areas: Autonomous Driving Algorithms, ADAS Algorithms, Unsupervised/Semi-supervised Learning, Deep Clustering, Video Understanding, Face Identification, Domain Adaptation, Emotion Understanding, Gesture Recognition, Few Shot/Zero Shot Activity Recognition, Image Classification, Negative Mining, Multilingual Audio Search, Semantic Similarity, Speaker Recognition, Time Series Data Mining.

In my previous life, I worked as a full-stack software engineer and developed microservices, Android games and applications, web interfaces, and search and big data solutions.

Stackforce AI infers this person is a Machine Learning Expert specializing in AI-driven solutions for diverse applications.

Location: Bengaluru, Karnataka, India

Experience: 16 yrs 5 mos

Skills

Autonomous Driving Algorithms
Recommendation Systems
Generative AI
Multimodal and Multilingual Content Understanding
Computer Vision
Data Mining
Natural Language Processing
Software Engineering
Game Development
Machine Learning
Speech Processing

Career Highlights

Expert in developing AI-powered products from cutting-edge research.
Published research in top-tier conferences like CVPR and NeurIPS.
Extensive experience in autonomous driving and computer vision.

Work Experience

Minus Zero

Head of AI (2 yrs 1 mo)

ShareChat

Senior Staff Scientist - Machine Learning (1 yr 6 mos)

Staff Scientist - Machine Learning (1 yr 1 mo)

Mercedes-Benz Research and Development India

Senior Computer Vision Researcher (2 yrs 10 mos)

Netradyne

Senior Research Engineer (6 mos)

Self-Employed

Computer Vision Researcher (5 mos)

Kiwi, Inc.

Senior Machine Learning Engineer - NLP (1 yr 3 mos)

Software Engineer (2 yrs 6 mos)

Morgan Stanley

Software Engineer (1 yr 6 mos)

Indian Institute of Technology, Delhi

Research Associate - Speech Processing (2 yrs 7 mos)

Voxta Communications

Text Independent Speaker Identification (2 mos)

Loughborough University

Optically Switched Reconfigurable Antennas (2 mos)

Pravak Cybernetics

Embedded Programming (5 mos)

Visesh Infotecnics Ltd

Vehicle Tracking (2 mos)

Education

M.Tech at Indian Institute of Technology, Delhi

B.Tech at Indian Institute of Technology, Delhi

Other Skills

AJAX
ASP.NET AJAX
Algorithms
Android
Android Development
Artificial Data
Auto Reply
C#
C++
CSS
Car Occupancy Monitoring
ChatBots
Data Structures
Databases
Deep Learning

Experience

Total Experience: 16 yrs 5 mos
Average Tenure: 1 yr 10 mos
Current Experience: 2 yrs 1 mo

Minus Zero

Head of AI

Mar 2024 – Present · 2 yrs 1 mo · Bengaluru, Karnataka, India · On-site

Building self-driving cars for Indian roads using End-to-End Foundational Models.

End-to-End Foundational Models · Autonomous Driving Algorithms

ShareChat

2 roles

Senior Staff Scientist - Machine Learning

Promoted

Apr 2022 – Oct 2023 · 1 yr 6 mos

Recommendation Systems || Generative AI || ChatBots || Trust and Safety ||
Worked on incorporating auxiliary user and video features into two-tower models to improve the recall of the recommendation system.
Explored model-based techniques for initialising video embeddings from semantic features to improve recommendations in the cold-start phase.
Developed a platform for generating multilingual and multimodal content (videos, slideshows, images) using Generative AI: LLMs, diffusion models, Text2Speech, etc.

Recommendation Systems · Generative AI · ChatBots · Trust and Safety

Staff Scientist - Machine Learning

Feb 2021 – Mar 2022 · 1 yr 1 mo

Multimodal and Multilingual Content Understanding || Trust and Safety || Generative AI || Video Understanding ||
As part of the Trust and Safety team, developed multimodal and multilingual models for analysing videos, audio, images and text to prevent offensive content on ShareChat and Moj.
Developed multimodal video understanding models for studying demand/supply dynamics across categories such as romance, comedy and sports.
Developed multimodal models for evaluating the quality of videos uploaded on Moj using semantic features. The video quality score was used to prune low-quality videos at inception.

Multimodal Models · Trust and Safety · Generative AI · Video Understanding · Multimodal and Multilingual Content Understanding

Mercedes-Benz Research and Development India

Senior Computer Vision Researcher

Mar 2018 – Jan 2021 · 2 yrs 10 mos · Bengaluru, Karnataka, India

Domain Adaptation || Artificial Data || Car Occupancy Monitoring || Hand Gesture Recognition || Zero/Few Shot Learning ||

Domain Adaptation · Artificial Data · Car Occupancy Monitoring · Hand Gesture Recognition · Zero/Few Shot Learning · Computer Vision

Netradyne

Senior Research Engineer

Aug 2017 – Feb 2018 · 6 mos · Bengaluru, Karnataka, India

|| Driving Behaviour Mining || Hard Sample Mining || Time Series Data Analysis ||
Worked on mining actionable insights from the driving data of our clients' fleets. Visual and inertial data are captured by our devices installed in the vehicles. Used unsupervised clustering techniques over this data to group similar driving patterns (hard braking, sudden acceleration) and to extract new patterns. These driving patterns can be used to train drivers and promote safe driving. Explored cases where the inertial data can be used as the golden source when visual algorithms fail.
Developed a deep learning based prototype that predicts which video frames are hard for an existing model to classify. Instead of labelling the complete data, annotators need to label only these hard frames. This makes annotation faster and cheaper without compromising the performance of the computer vision models.

Driving Behaviour Mining · Hard Sample Mining · Time Series Data Analysis · Data Mining

Self-Employed

Computer Vision Researcher

Feb 2017 – Jul 2017 · 5 mos · Bengaluru Area, India

|| Computer Vision ||
As a consultant for an AI startup, developed models for detecting a person's age and gender and for classifying cars into various categories.

Computer Vision

Kiwi, Inc.

2 roles

Senior Machine Learning Engineer - NLP

Promoted

Oct 2015 – Jan 2017 · 1 yr 3 mos

As part of the Artificial Intelligence team, developed NLP solutions to enable the creation of intelligent bots.
Auto Reply
Developed an end-to-end Auto-Reply solution using a deep LSTM encoder-decoder network. Trained and deployed Seq2Seq models on standard and custom conversational datasets.
Semantic Similarity
Developed an algorithm to find the semantic similarity between two text phrases using advanced NLP techniques. Developed a custom algorithm to handle negations in text phrases using the constituency parse of the text.
Structured Question Answering
Developed a prototype of a Question Answering System for Structured Data.
Profanity Filters
The above solutions were exposed as REST APIs for other teams to consume. The objective was to enable the bots to understand the intent of the users and give appropriate responses.
Earlier, I was involved from the initial stages in the design and development of our flow-based bot-making platform, onsequel.com.
Worked closely with the product managers, authors and UX designers to design and develop the core features of the platform, including the front end and back end.
Integrated ElasticSearch into the tool to provide advanced search functionality to our clients.
Also worked on a demo chat application as a POC using QuickBlox on the Android platform.
Technologies: Torch, Lua, Play Framework, ElasticSearch, Java, Python, Hibernate, MySQL, Redis, AngularJS, HTML, CSS

NLP Solutions · Auto Reply · Semantic Similarity · Structured Question Answering · Profanity Filters · Natural Language Processing

Software Engineer

Mar 2013 – Sep 2015 · 2 yrs 6 mos

Worked as a full-stack software engineer specialising in developing high-quality social games on Android using the LibGDX library.
Integral part of the team that developed the famous game title "Westbound". Was involved from the conceptualisation phase through to the release and maintenance of the game. Worked closely with the product managers, game designers and UI artists.
Designed and developed many addictive features and mini-games to enhance the overall game experience.
Solved complex problems and came up with design solutions for both the client and server side.
Technologies: Java, PHP, LibGDX, Kohana MVC framework, Android, Redis

Full Stack Development · Game Development · Software Engineering

Morgan Stanley

Software Engineer

Sep 2011 – Mar 2013 · 1 yr 6 mos · Mumbai Area, India

|| Investment Recommendation Engine || Text Classification ||
Worked as part of the backend team of the Investment Recommendation Engine. Studied the existing code base, the underlying algorithm and the Hadoop framework.
Developed a Text Classification prototype with an accuracy of more than 93% using an SVM classifier with TF-IDF features.
Developed WCF web services for accessing data from the RTA DB2 database. Designed and developed UI modules using jQuery, Ajax, HTML and CSS.
Explored the Microsoft Speech API to develop a speech-to-text conversion prototype. Developed an algorithm to interpret semantics from the text.

Text Classification · Investment Recommendation Engine · Machine Learning

Indian Institute of Technology, Delhi

Research Associate - Speech Processing

Jun 2009 – Jan 2012 · 2 yrs 7 mos · New Delhi Area, India

As part of my M.Tech project, I worked on "Language Independent Audio Search" under the able guidance of Prof. Arun Kumar, IIT Delhi, Jitendra Ajmera, IBM-IRL and Ashish Verma, IBM-IRL.
The objective of the research was to search a large audio database and extract the audio files containing spoken keywords, without using any language-specific information. We trained neural network models on English-language data and then evaluated them on the same task in Hindi.
We observed that neural networks are able to generalize well across languages without any language-specific training.
The results were published at the INTERSPEECH 2011 conference:
http://www.isca-speech.org/archive/interspeech_2011/i11_1125.html

Language Independent Audio Search · Speech Processing

Voxta Communications

Text Independent Speaker Identification

May 2009 – Jul 2009 · 2 mos · Hyderabad Area, India

I worked as an intern with Voxta Communications under the able guidance of Sachin Joshi, Sirish Reddi and Tanmoy Mukherjee.
I developed a "Text Independent Speaker Identification" prototype using Gaussian Mixture Models (GMM) and Hidden Markov Models (HMM), obtaining accuracies of over 95% on more than 200 speakers.
Technologies: CMU Sphinx, MATLAB, Perl

Speaker Identification · Speech Processing

Loughborough University

Optically Switched Reconfigurable Antennas

May 2008 – Jul 2008 · 2 mos

Pravak Cybernetics

Embedded Programming

Oct 2007 – Mar 2008 · 5 mos · New Delhi Area, India

Developed a graphical user interface (GUI) for a six-motor robotic arm.
Integrated a wireless transmitter-receiver with an ATMEL Mega 16 microcontroller.
Developed a line tracer robot using IR sensors and a PIC microcontroller.