Abhishek Srivastava

AI Researcher

San Francisco, California, United States · 4 yrs 1 mo experience

Most Likely To Switch · AI Enabled

Key Highlights

Expert in AI/ML systems for healthcare automation.
Pioneered reinforcement learning techniques for medical applications.
Strong background in multilingual NLP and speech processing.

Stackforce AI infers this person is a Healthcare-focused AI/ML Engineer with expertise in NLP and automation.

Contact

abhesrivastava@gmail.com LinkedIn

Skills

Core Skills

Machine Learning · AI

Other Skills

AI Agents · Automation · C (Programming Language) · Data Science · Deep Learning · Explainable AI · Java · Natural Language Processing (NLP) · NumPy · PyTorch · Python · Reinforcement Learning · SciPy · Speech Recognition · TensorFlow

About

नमस्ते | Hello 👋🏽 I’m an AI/ML Engineer with experience building intelligent systems that advance automation, reasoning, and explainability in healthcare and beyond. My work bridges applied ML research with system-level innovation, with a focus on making AI systems scalable, aligned, and trustworthy in real-world settings.

Beyond healthcare, my background spans noise-robust speech processing (Apple, '22), multilingual open-domain question answering (CMU, '22), information retrieval for specialized domains (UKP Lab, '20), HCI research on multilingual interfaces (Microsoft Research, '20), and tech for low-resource languages (Inria, '19). At the core, I care about people, the languages they use, and innovations in technology that make AI more inclusive and impactful in everyday life.

Always open to connecting with researchers, practitioners, and builders working on applied ML, NLP, and human-centered AI. Feel free to reach out! If you're a student in the field, send me a message with your specific questions and I will find time to reply!

Experience

4 yrs 1 mo

Total Experience

1 yr 4 mos

Average Tenure

3 yrs 3 mos

Current Experience

Optum

Sr AI/ML Engineer

Feb 2023 – Present · 3 yrs 3 mos · San Francisco, California, United States

I lead the design of next-generation AI/ML systems that push the boundaries of automation, reasoning, and explainability in healthcare. My work combines applied ML research with system innovation to tackle some of the hardest problems in revenue cycle management:
➤ AI Agents & Workflows: Creating novel multi-agent architectures with deterministic controls and adaptive routing to reimagine how medical coding and claims automation are done
➤ Long-Tail Learning: Pioneering reinforcement learning, retrieval-augmented modeling, and few-shot techniques to improve accuracy on rare and complex medical cases
➤ Explainable AI: Engineering innovative interpretability modules that expose evidence for decisions and make large models more trustworthy in regulated clinical settings
➤ Scalable Automation: Designing adaptive frameworks that boost automation rates while maintaining precision, including new architectures currently under patent review

AI Agents · Reinforcement Learning · Explainable AI · Automation · Machine Learning · AI

Carnegie Mellon University - School of Computer Science - Language Technologies Institute

Graduate Teaching Assistant

Sep 2022 – Dec 2022 · 3 mos · Pittsburgh, Pennsylvania, United States

Advanced NLP by Dr. Graham Neubig (Fall '22)
Natural Language Processing by Dr. David Mortensen (Spring '22)

Apple

Machine Learning Intern – Siri Understanding

Jun 2022 – Aug 2022 · 2 mos · Cambridge, Massachusetts, United States

– As part of Siri Understanding (SaLT), designed, implemented, and trained models at scale to improve the robustness of automatic speech recognition (ASR) in Siri.

Technische Universität Darmstadt

Research Intern

Sep 2020 – Feb 2021 · 5 mos

• Worked on improving dense retrieval performance in cross-domain QA systems through synthetic query generation using transformer-based seq2seq models (such as T5 and BART). The approach replaces expensive gold annotation in specialised domains with a synthetically created silver training set.

Microsoft

Research Intern

Jan 2020 – Jun 2020 · 5 mos · Bangalore, India

Presented a detailed case study of language and script usage on Indian Twitter. Showed the extent of script-mixing online and catalogued the functions of script-alternation in different contexts. Showed how script-mixing can be employed intentionally to emphasize certain entities and, in some cases, to make sarcasm more pronounced.
Designed a two-stage user study to explore when and where code-mixing should incorporate script-mixing. The first stage measures the likelihood of mixed-script text occurring in different domains, while the second measures direct preference among the different versions. The results can motivate the design of multilingual interfaces in personal digital assistants and elsewhere.

Inria

Research Intern

May 2019 – Jul 2019 · 2 mos · Paris 12, Île-de-France, France

Built a pipeline for extending a small Arabizi (code-mixed French-Arabic dialect) dataset by filtering sentences from a web-crawled corpus. Using language identification scores from a bag of n-grams model (fastText) as features, we trained a linear SVM classifier to extract actual Arabizi sentences.
Built a normalisation module for code-mixed word variants leveraging distributional (word2vec), lexical (Levenshtein distance), and phonetic (Soundex) similarities. Fine-tuned mBERT for POS tagging, and showed that it can adapt to code-mixed text by continuing MLM pre-training, given sufficient unlabelled data.
In a different project, generated representations of French Twitter users from their tweets by modifying existing sentence embedding approaches. Showed through plots that the embeddings captured regional information about the user, suggesting the presence of region-specific linguistic features in tweets.
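
As an illustration of the normalisation idea mentioned above, here is a minimal sketch that combines only the lexical (Levenshtein) and phonetic (Soundex) signals to group spelling variants. It is not the module built at Inria: the word2vec signal is left out, and the names `variant_similarity` and `group_variants`, the equal weighting, the 0.7 threshold, and the sample words are all assumptions made for this example.

```python
from typing import Dict, List

# Standard American Soundex digit classes.
_SOUNDEX_CODES: Dict[str, str] = {}
for _letters, _digit in [("bfpv", "1"), ("cgjkqsxz", "2"), ("dt", "3"),
                         ("l", "4"), ("mn", "5"), ("r", "6")]:
    for _ch in _letters:
        _SOUNDEX_CODES[_ch] = _digit


def soundex(word: str) -> str:
    """Return the 4-character Soundex code of a word ('' if it has no a-z letters)."""
    letters = [c for c in word.lower() if "a" <= c <= "z"]
    if not letters:
        return ""
    code = letters[0].upper()
    prev = _SOUNDEX_CODES.get(letters[0], "")
    for ch in letters[1:]:
        digit = _SOUNDEX_CODES.get(ch, "")
        if digit and digit != prev:
            code += digit
        if ch not in "hw":  # h and w do not separate letters with equal codes
            prev = digit
    return (code + "000")[:4]


def levenshtein(a: str, b: str) -> int:
    """Edit distance with unit-cost insert, delete, and substitute."""
    prev_row = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        row = [i]
        for j, cb in enumerate(b, start=1):
            row.append(min(prev_row[j] + 1,                # delete from a
                           row[j - 1] + 1,                 # insert into a
                           prev_row[j - 1] + (ca != cb)))  # substitute or match
        prev_row = row
    return prev_row[-1]


def variant_similarity(a: str, b: str, phonetic_weight: float = 0.5) -> float:
    """Hypothetical score in [0, 1]: weighted mix of lexical and phonetic similarity."""
    longest = max(len(a), len(b)) or 1
    lexical = 1.0 - levenshtein(a.lower(), b.lower()) / longest
    code_a = soundex(a)
    phonetic = 1.0 if code_a and code_a == soundex(b) else 0.0
    return (1.0 - phonetic_weight) * lexical + phonetic_weight * phonetic


def group_variants(words: List[str], threshold: float = 0.7) -> List[List[str]]:
    """Greedy grouping: a word joins the first group whose first member is similar enough."""
    groups: List[List[str]] = []
    for word in words:
        for group in groups:
            if variant_similarity(word, group[0]) >= threshold:
                group.append(word)
                break
        else:
            groups.append([word])
    return groups


if __name__ == "__main__":
    # Made-up spelling variants, for illustration only.
    print(group_variants(["salam", "salaam", "slam", "merci", "mersi"]))
```

Soundex is defined for Latin letters only, so digits that Arabizi writers use as letters are ignored by this sketch; a real normaliser would need to map them first.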

MathWorks

Smart India Hackathon '19 (Winner)

Feb 2019 – Mar 2019 · 1 mo · NIT Calicut, Kerala, India

Using the R-CNN algorithm, we built a vehicle detection module to count the number of cars in the provided aerial footage. We then used the resulting counts to train a linear regression model to predict the regular traffic patterns throughout the day in the city.
Treating the junctions as nodes in a graph, we modified the max-flow min-cut algorithm to find the least busy (i.e. quickest) route for a vehicle.
Finally, we proposed an adaptive traffic light design that can guide emergency vehicles towards an optimized green corridor whenever requested.
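
The routing step above treats junctions as graph nodes. The details of the modified max-flow min-cut algorithm are not given here, so the sketch below swaps in a different, standard technique: Dijkstra's shortest-path algorithm over edge weights that stand for predicted congestion. The function name `quickest_route`, the graph layout, and the weights are hypothetical.

```python
import heapq
from typing import Dict, List, Tuple

# Adjacency map: junction -> {neighbouring junction: predicted travel cost}.
Graph = Dict[str, Dict[str, float]]


def quickest_route(graph: Graph, source: str, target: str) -> Tuple[float, List[str]]:
    """Dijkstra's algorithm; returns (total cost, path) or (inf, []) if unreachable."""
    best: Dict[str, float] = {source: 0.0}
    parent: Dict[str, str] = {}
    heap: List[Tuple[float, str]] = [(0.0, source)]
    while heap:
        cost, node = heapq.heappop(heap)
        if node == target:
            break  # the first time the target is popped, its cost is final
        if cost > best.get(node, float("inf")):
            continue  # stale heap entry
        for neighbour, weight in graph.get(node, {}).items():
            new_cost = cost + weight
            if new_cost < best.get(neighbour, float("inf")):
                best[neighbour] = new_cost
                parent[neighbour] = node
                heapq.heappush(heap, (new_cost, neighbour))
    if target not in best:
        return float("inf"), []
    # Walk parent links back from the target to rebuild the path.
    path = [target]
    while path[-1] != source:
        path.append(parent[path[-1]])
    path.reverse()
    return best[target], path


if __name__ == "__main__":
    # Toy junction graph with made-up congestion costs.
    city: Graph = {
        "A": {"B": 4.0, "C": 1.0},
        "B": {"D": 5.0},
        "C": {"B": 2.0, "D": 8.0},
    }
    print(quickest_route(city, "A", "D"))  # (8.0, ['A', 'C', 'B', 'D'])
```

Dijkstra's algorithm requires non-negative weights, which holds if each weight is a predicted travel time or vehicle count.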