Kunal Dhawan

AI Researcher

Santa Clara, California, United States · 5 yrs 2 mos experience

Most Likely To Switch · AI ML Practitioner

Key Highlights

Expert in Conversational AI and Natural Language Processing.
Developed robust ASR systems for multiple Indian languages.
Led innovative projects in speech and audio processing.

Stackforce AI infers this person is a Conversational AI and Speech Processing expert in the Telecommunications industry.

Skills

Core Skills

Machine Learning, Data Science, Conversational AI, Natural Language Processing, Speech Processing

Other Skills

Spark, Distributed Computing, Hashing Theory, Cloud Computing, Large-Scale Optimization, Parallel/Distributed DL, Optimization for DL, Model Compression, Neural Architecture Search, Federated Learning, Natural Language Understanding, ASR, Deep Learning, Data Annotation, Sentiment Analysis

About

Personal Webpage: https://kunal-dhawan.weebly.com
GitHub: https://github.com/KunalDhawan
Google Scholar: https://scholar.google.co.in/citations?user=LLhtidwAAAAJ

Experience

Total Experience: 5 yrs 2 mos

Average Tenure: 1 yr 3 mos

Current Experience: 3 yrs 2 mos

Nvidia

4 roles

Senior Research Scientist

Promoted

Mar 2026 – Present · 3 mos · Santa Clara, California, United States

Research Scientist II

Apr 2024 – Feb 2026 · 1 yr 10 mos · Santa Clara, California, United States

Research Scientist

Feb 2023 – Mar 2024 · 1 yr 1 mo · Santa Clara, California, United States

Conversational AI and LLM applied research @ NVIDIA NeMo

Applied Research Intern

May 2022 – Aug 2022 · 3 mos · Santa Clara, California, United States

Deep Learning Applied Research intern with the NeMo team (https://github.com/NVIDIA/NeMo)

Carnegie Mellon University School of Computer Science

Graduate Teaching Assistant

Aug 2022 – Dec 2022 · 4 mos · Pittsburgh, Pennsylvania, United States

Graduate Teaching Assistant for the course 10-605/10-805 'Machine Learning with Large Datasets' with Prof. Ameet Talwalkar and Prof. Henry Chai
Topics: Spark, Distributed Computing, Hashing Theory, Cloud Computing, Large-Scale Optimization, Parallel/Distributed DL, Optimization for DL, Model Compression, Neural Architecture Search, Federated Learning
Course website: https://10605.github.io/fall2022

Spark, Distributed Computing, Hashing Theory, Cloud Computing, Large-Scale Optimization, Parallel/Distributed DL (+6 more)

Voicezen India

Data Scientist

Oct 2020 – Apr 2021 · 6 mos · Gurugram, Haryana, India

Solved complex Conversational AI and Natural Language Understanding problems in the Indian context.
Led a project on SOP step adherence and customer satisfaction detection from raw telephonic speech. Built a robust system from scratch, from problem understanding with the client (TataSky), data collection and labelling, and KPI definition through to deployment, processing a daily volume of 100+ hours of telephonic conversation. The system performed 1) customer problem identification, 2) tracking of the SOP steps followed (text-based approach on ASR output, using a fine-tuned RoBERTa), 3) estimation of the time taken for each SOP step (to detect anomalies and aid training of customer care executives), and 4) customer sentiment detection (speech + text).
Developed telecom-domain ASR systems from scratch for 5 Indian languages: Hindi, Telugu, Tamil, Marathi, and Bengali. Worked on the entire pipeline: data procurement, creative data annotation schemes that extract maximal information at minimal expenditure, data selection, leveraging the large amount of unlabelled telephonic conversational data at our disposal, and distributed training of a fully convolutional DNN architecture on the cloud. All systems achieved ~10% WER on blind data without the use of complex language models.
Designed a language-agnostic speaker diarization system for mono-channel audio using a novel speaker-embedding-based intra-conversation clustering algorithm. The proposed system achieved 92% accuracy on 150+ hours of real-life conversation data.
Worked on phantom segment detection: VAD segments that are empty or noisy but still trigger the ASR to output logical words due to LM bias, changing the meaning of the conversation. Proposed a fundamental-frequency-based CTC blank ratio metric that achieved 84% accuracy on 300+ hours of multilingual production data and thereby improved the performance of downstream systems such as sentiment analysis.

Conversational AI, Natural Language Understanding, ASR, Deep Learning, Data Annotation, Sentiment Analysis (+1 more)

Jio Platforms Limited (JPL)

Data Scientist

Jul 2019 – Sep 2020 · 1 yr 2 mos · Greater Hyderabad Area

Member of the Speech and Vision team, Reliance Jio AI CoE. Worked on multiple problem statements that had a direct impact on various Jio businesses:
Speech and Audio processing:
Developed text-to-speech (TTS) solutions for Indian languages, responsible for the entire process from system design to deployment. The solution was used in the Jio Machli application, which was awarded the AEGIS Graham Bell Award, 2019.
Built novel speaker and style transfer TTS systems for Hindi, Tamil and Telugu, which are currently used across multiple businesses in the Jio fraternity such as Retail, Fashion and Healthcare.
Developed a robust speech-based keyword recognition system to be integrated with the Jio Retail mobile application. The system enables users to search for products, add them to the cart and communicate with the app using their voice.
Built a virtual call center assistant from the ground up that separates customer and operator segments from a single-channel input and then identifies the emotion of the customer solely from this speech input. This helps the call-center team identify potentially unhappy customers and prioritize their resources.
Covid cough detection: built an online, non-intrusive initial screening method for detecting Covid-19 by analysing the cough samples of patients. The system also provides acoustic visualizations to help doctors make better decisions. It is being extended to cover major respiratory diseases and will be integrated into Jio Health Hub.
Video analytics:
Developed a multiple-person detection and tracking system for the SLP team to detect thefts and identify trespassers. The system helped reduce false alarms by 100x compared with sensor-based systems.

Text-to-Speech, Speech Processing, Keyword Recognition, Video Analytics, Machine Learning

University of Southern California

Undergraduate Research Fellow

May 2018 – Jul 2018 · 2 mos · Signal Analysis and Interpretation Laboratory (SAIL)

Project Title: An i-vector based Non-Negative Matrix Factorization approach towards Noise Robust Automatic Speech Recognition
Advisor: Prof. Shrikanth Narayanan
Motivation: Given the rise of consumer-centric applications like voice interaction with mobile devices and home entertainment systems, it is imperative for Automatic Speech Recognition systems to be robust to the full range of real-world noise and other acoustic distorting conditions.
Project description:
Explored the efficacy of non-negative matrix factorization based time-activations as acoustic features for building an Automatic Speech Recognition System
Proposed a novel noise robust non-negative matrix factorization approach based on the concept of total variability modelling, which comfortably outperforms the state of the art on the challenging Aurora-4 dataset
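
To illustrate what NMF time-activations are, the sketch below factors a non-negative spectrogram-like matrix V (frequency bins x frames) as V ≈ W H and returns H, whose columns can serve as per-frame features. It is a generic textbook example using the standard multiplicative update for the squared-error objective, with the dictionary W held fixed; that setup, the function names, and the toy matrices are assumptions for illustration and do not reproduce the i-vector based method proposed in the project.

```python
def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(m):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*m)]


def nmf_activations(v, w, iterations=50, eps=1e-12):
    """Estimate non-negative activations H with V ~= W H, keeping W fixed."""
    wt = transpose(w)
    wtv = matmul(wt, v)  # W^T V, constant across iterations
    wtw = matmul(wt, w)  # W^T W, constant across iterations
    k, t = len(w[0]), len(v[0])
    h = [[1.0] * t for _ in range(k)]  # strictly positive initialisation
    for _ in range(iterations):
        wtwh = matmul(wtw, h)
        # Multiplicative update: H <- H * (W^T V) / (W^T W H)
        h = [[h[i][j] * wtv[i][j] / (wtwh[i][j] + eps) for j in range(t)]
             for i in range(k)]
    return h


def squared_error(v, w, h):
    """Squared Frobenius norm of V - W H."""
    wh = matmul(w, h)
    return sum((x - y) ** 2 for rv, rw in zip(v, wh) for x, y in zip(rv, rw))


# Toy data: 3 frequency bins, 2 dictionary atoms, 3 frames (all hypothetical).
W = [[1, 0], [0, 1], [1, 1]]
V = [[2, 0, 1], [0, 3, 1], [2, 3, 2]]
H = nmf_activations(V, W)  # each column of H is the feature vector for one frame
```

The multiplicative form keeps every entry of H non-negative without an explicit projection step, which is why it is a common starting point for NMF-based features.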

Indian Institute of Technology, Delhi

Summer Research Intern

May 2017 – Jul 2017 · 2 mos · Centre for Applied Research in Electronics, IIT Delhi

Advisor: Prof. Arun Kumar
Project description:
Worked on improving the accuracy of Inertial Navigation System (INS) equations by implementing a differential-equation-based solution in the discrete-time domain instead of the classical linearized approximation.
Developed a novel Kalman Filter based Multi-Sensor Data Fusion algorithm for utilizing GPS data available at low frequency and merging it with IMU sensor output to improve the accuracy of vehicle tracking
Evaluated the implemented MATLAB code with Monte-Carlo Simulations to demonstrate its performance under varying initial conditions and trajectories
Keywords: Signal Processing, Multisensor Data Fusion (MSDF), Kalman Filter, MATLAB
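
The GPS/IMU fusion idea described above can be sketched with a textbook linear Kalman filter: high-rate IMU acceleration drives the prediction step, and each low-rate GPS fix corrects the accumulated drift. This is a minimal 1-D illustration in Python, not the project's MATLAB implementation or its novel algorithm; the class name, the `fuse` helper, the noise variances and the sample rates are all assumptions.

```python
from dataclasses import dataclass


@dataclass
class PosVelFilter:
    """1-D Kalman filter over [position, velocity]; a hypothetical toy model."""
    pos: float
    vel: float
    p00: float = 1.0  # position variance
    p01: float = 0.0  # position/velocity covariance
    p10: float = 0.0
    p11: float = 1.0  # velocity variance

    def predict(self, accel: float, dt: float, accel_var: float) -> None:
        """Propagate the state with one IMU acceleration sample (high rate)."""
        self.pos += self.vel * dt + 0.5 * accel * dt * dt
        self.vel += accel * dt
        # P <- F P F^T + Q, with F = [[1, dt], [0, 1]] and Q from accel noise.
        a, b, c, d = self.p00, self.p01, self.p10, self.p11
        self.p00 = a + dt * (b + c) + dt * dt * d + accel_var * dt ** 4 / 4
        self.p01 = b + dt * d + accel_var * dt ** 3 / 2
        self.p10 = c + dt * d + accel_var * dt ** 3 / 2
        self.p11 = d + accel_var * dt * dt

    def update_gps(self, gps_pos: float, gps_var: float) -> None:
        """Correct the state with one GPS position fix (low rate)."""
        innovation = gps_pos - self.pos
        s = self.p00 + gps_var               # innovation variance, H = [1, 0]
        k0, k1 = self.p00 / s, self.p10 / s  # Kalman gain
        self.pos += k0 * innovation
        self.vel += k1 * innovation
        # P <- (I - K H) P
        a, b, c, d = self.p00, self.p01, self.p10, self.p11
        self.p00, self.p01 = (1 - k0) * a, (1 - k0) * b
        self.p10, self.p11 = c - k1 * a, d - k1 * b


def fuse(kf, accel_samples, gps_fixes, dt, accel_var, gps_var):
    """Run the filter; gps_fixes maps an IMU step index to a GPS position."""
    for step, accel in enumerate(accel_samples, start=1):
        kf.predict(accel, dt, accel_var)
        if step in gps_fixes:
            kf.update_gps(gps_fixes[step], gps_var)
    return kf


# Assumed rates: IMU at 100 Hz, GPS at 1 Hz, constant 1 m/s^2 acceleration.
kf = fuse(PosVelFilter(pos=0.0, vel=0.0), [1.0] * 300,
          {100: 0.5, 200: 2.0, 300: 4.5}, dt=0.01, accel_var=0.01, gps_var=1.0)
```

Between GPS fixes the position variance `p00` grows with every prediction step, and each fix shrinks it again; that asymmetry is what lets a slow but drift-free sensor bound the error of a fast but drifting one.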

Netaji Subhas Institute of Technology

Winter Research Intern

Dec 2016 – Dec 2016 · 0 mo · TICEPD - Texas Instruments Center For Embedded Product Design

Project title: Microcontroller based embedded system design
Advisor: Prof. Dhananjay Gadre
Learnt about the working and design of microcontrollers and studied related concepts such as memory management, JTAG, clocks, timers, comparators, communication protocols, and ADCs.
Gained practical experience in PCB design and fabrication, using EagleCAD for schematic and board file design and the toner transfer method for fabrication.
Worked on three main projects during the internship: MSP430 Framework, analog color mixer and switching power supply design.

LG India

Summer Internship

Jun 2016 – Jul 2016 · 1 mo · LG Electronics India Limited, Greater Noida, Uttar Pradesh

Project description: To study the role of thermostat, OLP, PTC, PCB in double door (FF) refrigerators and design approaches to reduce their FFR (Field Failure Rate)
Developed conceptual understanding of quality management and assessment techniques by participating in practical demonstrations in the factory.
Learnt about the interplay of, and gained hands-on experience with, the electronic components used in refrigerators produced by LG, such as PCBs, thermostats, OLP, and PTC.
More details on the project can be accessed here: http://kunal-dhawan.weebly.com/lg-india.html