Harsh Singhal

CEO

San Francisco, California, United States · 18 yrs 7 mos experience
AI Enabled · AI ML Practitioner

Key Highlights

  • 16+ years in Machine Learning and Data Science.
  • Led ML teams at top tech companies.
  • Innovative solutions with patents and publications.
Stackforce AI infers this person is a Data Science and Machine Learning expert with extensive experience in B2B and B2C applications.

Skills

Core Skills

Machine Learning, Data Science, Artificial Intelligence

Other Skills

Analytics, Apache Spark, Big Data, Business Analytics, Cluster Analysis, Data Analysis, Data Mining, Data Visualization, Decision Trees, Factor Analysis, Hadoop, Hadoop Streaming, Hive, Information Extraction, Information Retrieval

About

With 16+ years of experience, I'm an industry-recognized leader in Machine Learning, Data Science, and Artificial Intelligence. My career spans various verticals across global markets and top tech companies like LinkedIn, Netflix and Palo Alto Networks, with a focus on delivering data-driven solutions and driving business outcomes.

I have a proven record of building and scaling high-performance teams; a testament to this is my success at Koo, where I spearheaded the expansion of the ML/Data Science teams. I am also passionate about fostering the next generation of data science leaders, with a focus on mentoring and career development.

As a results-driven professional, I have applied ML/AI techniques to a wide array of business problems, from bot detection and spam detection to sales product recommendation and account takeover prevention during my time at LinkedIn and Netflix. This broad spectrum of experience shows my flexibility and adaptability in handling complex business challenges. My innovative approach is underpinned by patents in key areas like bot detection and threat detection. I also have publications in renowned forums like the IEEE Systems, Man, and Cybernetics Society.

I maintain an online publication, datascience.fm, that attracts close to 5k users/month and has seen contributions from student leaders and industry professionals. Read my ebook "How to build a data product for people who own a keyboard" - https://buildadataproduct.netlify.app/

GenAI Products

I created Dafinchi.ai to glean insights from earnings reports without needing an army of analysts. I created https://moleculesearch.ai to provide molecule-based patent search to medicinal chemistry enthusiasts. This product applies vector similarity search using the RDKit Postgres extension. The product also includes ChatGPT-based patent summaries and applicant patent landscape analysis.

I also post videos on www.youtube.com/@hiharshsinghal

Experience

Total Experience: 18 yrs 7 mos
Average Tenure: 1 yr 3 mos
Current Experience: 1 yr 3 mos

Glean

AI Governance

Mar 2025 - Present · 1 yr 3 mos · San Francisco Bay Area

  • Contributing to AI & Data Governance efforts at Glean.

Adobe

ML Engineering Leader

Dec 2023 - Mar 2025 · 1 yr 3 mos · Greater Bengaluru Area · On-site

  • Engineering Leader (Machine Learning) responsible for the ML charter on the Genuine Engineering team.
  • I lead a team of MLEs to develop ML models to detect account sharing, identity inflation and non-compliant usage of Adobe products.
  • The ML charter encompasses the entire stakeholder chain at Adobe and directly impacts revenue.
Machine Learning, Data Science

Ramaiah Institute of Technology

Visiting Professor

Oct 2023 - Mar 2025 · 1 yr 5 mos · Bengaluru, Karnataka, India · On-site

  • Visiting Professor focused on providing industry exposure to students and faculty on AI/ML.
  • I work with faculty and heads of departments to develop syllabi and conduct training programs for GenAI adoption in pedagogy.
  • I also work closely with students to develop their exposure to applied AI and industry best practices by conducting seminars, hackathons and bootcamps.
  • Many notable projects my students accomplished have been published by them on datascience.fm. Some notable ones:
  • Using LLMs as recommender systems: https://datascience.fm/leveraging-llms-in-recommendation-systems/
  • Big Banyan Tree - Setting up Spark for large-scale analysis. The work here contributed to a dataset of websites and their JavaScript libraries extracted from a sample of 5+ years of the Common Crawl dataset: https://huggingface.co/big-banyan-tree?ref=datascience.fm
  • Related articles:
  • https://datascience.fm/bigbanyantree-enriching-warc-data-with-ip-information-from-maxmind/
  • https://datascience.fm/zero-to-spark-apache-spark-cluster-setup/
Artificial Intelligence, Machine Learning

Koo India

Head of Machine Learning & AI

Sep 2021 - Nov 2023 · 2 yrs 2 mos · Greater Bengaluru Area

  • Responsible for all ML/AI-driven features on Koo.
  • Grew the ML team from 3 to 20 engineers, comprising folks in Data Science, Machine Learning, and ML Ops.
  • My team successfully delivered ML-powered product features such as ChatGPT-assisted writing tools for creators, Semantic Search, Multilingual Topics, People You May Know, Content Recommendation, Feed Ranking, and Trending tags, along with all aspects of Content Moderation and Spam detection and all personalization across the app.
  • Press Coverage
  • Coverage of developing Topics in all Indian languages. https://www.indianext.co.in/koo-launches-topics-in-10-languages/
  • Panel discussion on Indian language tech covered by GoI think tank on AI. https://indiaai.gov.in/article/making-india-digitally-inclusive-with-ai
  • Koo had a successful run and showed the world what is possible when technologists and entrepreneurs decide to take on large incumbents. We concluded our journey in mid-July 2024 after more than 4 years of building a world-class product. I moved on from Koo with a heavy heart but with immense pride in what we accomplished.
  • https://timesofindia.indiatimes.com/technology/social/indian-social-media-app-koo-shuts-down-read-ceos-emotional-note/articleshow/111458299.cms
Machine Learning, Artificial Intelligence

Career break

Relocation

Feb 2021 - Sep 2021 · 7 mos · Bengaluru, Karnataka, India

  • I relocated to Bangalore from the Bay Area, California in 2021. I spent a decade in the Bay Area from 2011 to 2021.
  • After arriving in Bangalore I consulted on a variety of ML problems with startups and worked closely with many to build their teams.
  • With an agile team of MLEs, I cracked a few challenges in extracting molecule information from Medical Patent documents.
  • Check out moleculesearch.ai, a novel patent search product that takes a molecule as input to find patents containing the same and similar molecules. You can also see AI-based summaries of patents and of companies' research focus. These analyses are useful for FTO, Patent Landscaping and Competitive Analysis research.

Netflix

Data Science & ML @ Netflix

Oct 2019 - Feb 2021 · 1 yr 4 mos · Los Gatos, California

  • End-to-end development of Analytics, Data Pipelines, and Machine Learning models to reduce malicious Account Takeover activity.
  • Implemented DS/ML solutions to score logins to classify account takeover and developed extensive tooling and analytics necessary to deploy and track model outcomes.
Data Science, Machine Learning

LinkedIn

Data Science

Jul 2018 - Sep 2019 · 1 yr 2 mos · Sunnyvale, California

  • Development of quantitative methods to identify the impact of anti-abuse models and to reduce the risk exposure of members’ data.
  • I worked at the intersection of Data Science, Product, Legal, and Artificial Intelligence to drive projects cross-functionally.
Data Science, Machine Learning

Palo Alto Networks

Principal Data Scientist

Jan 2018 - Jun 2018 · 5 mos · Santa Clara, CA

  • Architected and developed predictive models for enterprise sales optimization by predicting the likelihood to purchase security products and services. Extracted data from CRM systems and analyzed structured and textual data assets to assess pipeline conversion likelihood for the sales team.
  • Implemented a product recommendation model as part of an internal award-winning mobile app used by the entire sales team to manage quota and sales pipeline. The framework I developed not only recommended products but also provided the "why" to help sales folks better socialize the recommendations.
Data Science, Machine Learning

C1X Inc.

Head of Data Science & Data Engineering

Aug 2016 - Dec 2017 · 1 yr 4 mos · San Francisco Bay Area

  • I led data science and data engineering @ C1X.
  • Architected the entire Analytics and Data Science platform on AWS to extract insights from 2TB/day of RTB logs. Utilizing AWS services such as EMR (Spark), Athena, and RDS, I leveraged probabilistic data structures like HyperLogLog (HLL) and sketches to quickly analyze billions of log events. Implemented machine learning models to improve relevance and click-through rates of digital inventory of 2B impressions/day.
  • Improved marketplace engagement for one of the largest e-commerce retailers in Asia and increased revenue by identifying users with a high propensity to buy and by recommending products to sellers that should be advertised for increased sales.
  • I managed a team of 8 data scientists and engineers spread across San Jose and Bangalore.
  • Extensive knowledge and experience with AWS Analytics and AI stack.
Data Science, Machine Learning

MZ

Data Scientist

Apr 2016 - Aug 2016 · 4 mos · San Francisco Bay Area

  • Developed bot detection models to identify players using automation to gain an unfair advantage over real users. Identified game anomalies related to in-game purchases, resulting in revenue-impacting bug fixes.
  • My work here led to a patent.
  • Developed a SQL query analyzer that classified anomalous queries being executed to identify unauthorized database access.
Data Science, Machine Learning

LinkedIn

2 roles

Senior Data Scientist

Jul 2014 - Apr 2016 · 1 yr 9 mos

  • I developed solutions to fight a variety of fraud use-cases. I primarily focused on bot detection, credit card fraud and identifying fake jobs. Instrumental in developing geo-IP reputation systems which blocked automated extraction of LinkedIn pages by content scrapers and bots. My work here led to a patent.
  • The credit card fraud model I developed saved more than $300,000 in chargebacks every quarter. Developed a text mining model to detect fake job postings protecting members from becoming victims of spam and phishing attempts.
  • Full-stack machine learning workflows developed using Azkaban, Hadoop, Hive, Pig, Voldemort, Kafka, Python and R. Extensive experience communicating model insights and business impact with C-level executives and product managers.
Data Science, Machine Learning

Data Scientist

Dec 2011 - Jul 2014 · 2 yrs 7 mos

  • I focused on reducing malicious use of API endpoints by identifying abuse and anomalous usage patterns.
Data Science, Machine Learning

[24]7.ai

Data Science

Feb 2010 - Dec 2011 · 1 yr 10 mos · Bangalore

  • I developed multiple solutions to assess chat interaction quality based on text transcripts. The solutions were implemented in Python and used NLTK for NLP.
  • My solution scored every chat transcript and provided business teams with a robust quality metric that covered all interactions and agents. This was orders of magnitude more useful than the existing manual sampling and scoring process.
  • My work here led to a patent.
  • I played the role of technical lead in developing multiple text mining and NLP solutions that leveraged manual annotations and covered greeting effectiveness, grammatical accuracy of conversations, issue resolution scoring, and NPS prediction. We used GLMs to score interactions on rich feature sets that included both structured and unstructured data sources.
  • Our work was featured at the KDD 2011 Industry Expo: "Applications of data mining and machine learning in online customer care," in Proceedings of the 17th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD '11).
  • I developed a GUI tool in R using RGtk to visualize score distributions and identify specific agents and conversations that were outliers and needed manual review. This GUI tool was used by account managers to identify areas of improvement while a web-based reporting solution was being developed.
  • Recipient of the company-wide Innovation Award for 2010-2011 for advancing the practice of Text Mining and Quantitative modeling across 24/7 Innovation Labs.
  • Contributing inventor of two patent-pending solutions in the domain of customer care analytics using text mining and ensemble feature selection methods.
Data Science, Machine Learning

Mu Sigma Inc.

Data Science

Jul 2008 - Oct 2009 · 1 yr 3 mos · Bangalore

  • I taught the R programming language at Mu Sigma and ran the R course at Mu Sigma University in its first semester.
  • I developed modeling techniques to predict residual and base-line advertising effect on consumer buying patterns that helped assess ROI on multi-channel advertising campaigns. Successfully expanded project pipelines for multiple client accounts by recommending advanced data mining strategies and implementations using R.
  • Developed the R model server (using Rserve) of a rapid modeling framework, which was adopted internally by data scientists to reduce the time taken to develop predictive models. The R-powered model server was also licensed to key clients as a budget optimization predictive modeling tool delivered as a SaaS offering.
Data Science, Machine Learning

Cosmo Lab, Rutgers University

2 roles

Research Assistant

Sep 2007 - May 2008 · 8 mos · New Jersey

  • Developed a proprietary text mining algorithm for information retrieval using linear programming optimization models implemented with the GAMS solver.
  • To showcase results, I developed a web application in Perl/CGI which was deployed for our industry collaborators who could then use a simple web interface to play with the algorithm and query and retrieve matched documents.
  • Related work led to a technical paper: http://dl.acm.org/citation.cfm?id=1821282

Research Assistant

Jan 2007 - Aug 2007 · 7 mos · New Jersey

  • I collaborated with academicians from the Public Policy department to use the iterative proportional fitting algorithm to estimate non-disclosed industry data from NAICS. Interpolation on steroids. I wrote crawlers in Perl to extract data from the NAICS website to create Input-Output reports.
  • I received my department's accolades and appreciation for cross-disciplinary research work.

Siemens

Process Improvement Intern

May 2007 - Aug 2007 · 3 mos · New Jersey

  • Implemented the use of Arena, a process simulation software package, in the Manufacturing and Quality Control department to drive process efficiency improvements. Simulation results informed critical changes in process restructuring.
  • Leveraged Design of Experiments to isolate critical environmental factors impacting the quality of hearing instruments.

Education

Rutgers University

MS — Industrial & Systems Engineering

Ramaiah Institute Of Technology

Bachelor of Engineering — Industrial Engineering & Management

Bishop Cotton Boys' School - India
