Siddharth Sharma

Lead ML Engineer

Sunnyvale, California, United States · 8 yrs 9 mos experience
Most Likely To Switch · AI ML Practitioner

Key Highlights

  • Expert in Natural Language Processing and Machine Learning.
  • Proven track record in developing advanced ML models.
  • Strong academic background with a focus on Machine Learning.
Stackforce AI infers this person is a Machine Learning Engineer with expertise in NLP and Computer Vision.

Skills

Core Skills

Natural Language Processing (NLP), Deep Learning, Natural Language Understanding, Machine Learning, Data Analysis, Computer Vision, Data Integration

Other Skills

Algorithms, Apache Flume, Apache Spark, Artificial Intelligence (AI), C, C++, Data Mining, Data Structures, GPT-4, Google Bard, HTML, Hadoop, Information Retrieval, Java, LaTeX

About

I work as a Machine Learning Engineer on Natural Language Understanding. I have 18 months of internship experience in Machine Learning and Data Science. My MSE in Computer Science at JHU concentrated on Machine Learning.

Experience

Google

Software Engineer Machine Learning

Dec 2021 - Present · 4 yrs 3 mos · Mountain View, California, United States

  • YouTube Ads ML
Mathematics, Deep Learning, GPT-4, Natural Language Processing (NLP), Recommender Systems, Google Bard, +5 more

Ushur, Inc.

Machine Learning Engineer

Sep 2018 - Nov 2021 · 3 yrs 2 mos · San Francisco Bay Area

  • Worked on Natural Language Understanding.
Mathematics, Deep Learning, GPT-4, Natural Language Processing (NLP), Scikit-Learn, Information Retrieval, +4 more

The Johns Hopkins University

Graduate Research Assistant

Feb 2018 - Aug 2018 · 6 mos · Baltimore, Maryland Area

  • Simulated High-Performance Computing (HPC) systems with natural and artificially injected faults using The Structural Simulation Toolkit (SST).
  • Created a framework in Python to perform node-based and task-based reliability analysis on logs generated by simulated HPC systems. The analysis is independent of network structure.
  • Built a Support Vector Machine-based classifier to identify artificial fault injection. Weibull and log-normal lifetime models were used to parameterize the reliability curves.
Mathematics, Deep Learning, Scikit-Learn, Information Retrieval, PyTorch, Machine Learning, +1 more

Amazon Lab126

Applied Scientist Intern

Sep 2017 - Jan 2018 · 4 mos · Sunnyvale

  • Simulated human annotators using Bayesian modeling to create synthetic annotated data for a Speaker Identification (SID) system.
  • Used Unsupervised Label Refinement (ULR) methods (such as Dawid-Skene) and showed that these methods work better than Majority Voting for SID annotation.
  • Evaluated human annotators' False Acceptance Rate (FAR) and False Recognition Rate (FRR) for speaker identification by creating ground-truth data of varying difficulty.
  • Showed that the current annotation process was unacceptable even when ULR was applied to labels from multiple annotators to reduce errors.
  • Showed that the metadata of the test and enrolled utterances did not carry enough signal to judge annotation difficulty.
  • Created training and testing data to evaluate a domain classifier.
Mathematics, Large Language Models (LLM), Machine Learning

Center for Bioengineering Innovation and Design at Johns Hopkins University

Computer Vision Intern

Jun 2017 - Jul 2017 · 1 mo · Baltimore, Maryland

  • Created training data for localization and detection of Zika virus-carrying mosquito species for a custom-built trap.
  • Used Faster R-CNN (Region-based Convolutional Neural Network) to localize the mosquito species and a Residual Neural Network (ResNet) to detect mosquito species with 81% accuracy.
  • The results of this work helped secure funding for the project.
Mathematics, Scikit-Learn, PyTorch, Computer Vision

Inria

Research Intern

Nov 2015 - May 2016 · 6 mos · Lille Area, France

  • Integrated clinical and -omics data on Acute Myeloid Leukemia (AML) from The Cancer Genome Atlas (TCGA) using multimodal representation learning.
  • Used Regularized Generalized Canonical Correlation Analysis (RGCCA) and Sparse Generalized Canonical Correlation Analysis (SGCCA) to create multimodal representations and select important genes.
  • Suggested design matrices for RGCCA using graphical methods from the mixOmics package and the biological results already published by the TCGA consortium.
  • Reproduced results of Lasso-based models of data integration and a paper on Survival Analysis.
Mathematics, Scikit-Learn, Data Integration

ArcelorMittal

Summer Internship

Jun 2014 - Jul 2014 · 1 mo · Kazakhstan

  • Windows Server 2012
  • Project Mentor: Alexandr Chsherbov
  • Installed Windows Server 2012 and maintained 50 clients.
  • Enabled services such as authentication, mail, Active Directory, internet access, remote client management, and Group Policy management.

Education

The Johns Hopkins University

Master's degree — Computer Science

Jan 2016 - Jan 2018

The LNM Institute of Information Technology

Bachelor of Technology (B.Tech.) — Computer Science Engineering

Jan 2011 - Jan 2015
