Saurabh Kataria

CTO

Mountain View, California, United States · 19 yrs 2 mos experience
AI Enabled · AI ML Practitioner

Key Highlights

  • Expert in machine learning and recommendation systems.
  • Led significant projects at LinkedIn and Twitter.
  • Strong background in natural language processing.
Stackforce AI infers this person is a Machine Learning Engineer specializing in SaaS and Social Media technologies.

Skills

Other Skills

Data Mining, Algorithms, Java, C++, Artificial Intelligence, Computer Science, C, Matlab, Python, Perl, Big Data, Hadoop, Apache Pig, Software Development, Distributed Systems

About

I am currently a tech lead for Tweet Search at Twitter, mainly responsible for ranking and relevance pipelines and for modeling infrastructure in the search tech stack. Previously, I was a tech lead at LinkedIn on the Homepage Feed team. I have experience in machine learning, recommender systems, natural language processing, and search systems. Prior to newsfeed relevance modeling, I was part of the job search and recommendation team at LinkedIn, improving search relevance for job-search queries.

Experience

Total Experience: 19 yrs 2 mos
Average Tenure: 2 yrs
Current Experience: 1 yr

LinkedIn

Tech lead at LinkedIn

Apr 2025 - Present · 1 yr

  • Generative Recommendation Systems for LinkedIn Feed

Snap Inc.

Machine Learning Engineer

Apr 2023 - May 2025 · 2 yrs 1 mo

Twitter

Staff Machine Learning Engineer

Feb 2022 - Apr 2023 · 1 yr 2 mos · San Francisco Bay Area

  • Staff Software Engineer, Tech Lead for Tweet Search
  • Responsible for ranking and relevance for the tweet search stack
  • Delivered improvements in social engagement metrics on the search results page (SRP) through ranking improvements such as an improved data pipeline, feature normalization, and searcher-author affinity signals.
  • Worked on explore-exploit methods for integrating candidate sources into the Tweet Search ranking candidate pool

LinkedIn

3 roles

Staff Software Engineer

Promoted

Jun 2020 - Feb 2022 · 1 yr 8 mos

  • Working on various deep learning efforts for homepage relevance feed
  • Transforming feed models to be fully served by a deep learning stack
  • Image understanding for feed ranking and candidate selection

Senior Software Engineer / Applied Machine Learning Researcher

Promoted

Sep 2017 - May 2020 · 2 yrs 8 mos

  • Worked on building an end-to-end content understanding model pipeline for feed updates with a deep learning stack. The pipeline serves content relevance signals to various parts of the feed relevance models for homepage ranking.

Software Engineer / Applied Machine Learning Engineer

Sep 2016 - Aug 2017 · 11 mos

  • Worked on implementing DSSM (a deep-learning-based query/document representation mechanism) for job search.
  • Improved relevance for long-tail queries by improving their candidate selection.

PARC

Senior Research Scientist

Jan 2015 - Aug 2016 · 1 yr 7 mos · Webster, New York

  • User Modeling from Social Media Content
      • Designed and implemented supervised topic models for tweets and content organization [Java, Mallet]
      • Implemented methods for entity disambiguation and linking to a semantic knowledge base (e.g., Freebase) for detecting users’ interests [Java, Apache Stanbol, MongoDB]
      • Designed and implemented multi-modal deep-learning-based methods for multi-label classification over images, text, and graph data in users’ social networks [Python, Caffe, Theano, Keras]
      • Designed and implemented natural-language-based feature extraction methods for life event detection in social media [Python, Scikit-learn, Stanford NLP]
      • Led the development of a prototype real-time interest profiling engine combining the above modules.
  • Leveraging Natural Language Processing for Patient Profiling and Predictive Health Modeling
      • Implemented deep-learning-based neural reasoning models to extract embedding features from textual and semantic knowledge sources (e.g., PubMed/MEDLINE, UMLS) [Python, Theano, cTAKES]
      • Worked on methods for patient profiling and disease progression modeling from medical forums (such as HealingWell and WebMD) for various chronic diseases [Python, cTAKES]

Xerox

Research Scientist

May 2012 - Dec 2014 · 2 yrs 7 mos · Webster, New York

  • Data Mining for Customer Care Center Optimization
      • Analyzing Agent and Customer Conversations: Researched and developed multi-task learning techniques for analyzing conversations and logs. Implemented various machine learning techniques for optimizing customer satisfaction, such as dialogue act classification, issue resolution status detection, and call center agent efficiency analysis. [Mallet, Scikit-learn, Java, Python]
      • Linking Users across CRM Databases and Social Media: Designed and implemented machine-learning-based techniques for linking users across their CRM database transactions and social media accounts, as well as across different social media accounts (e.g., Twitter, Flickr, and Instagram). Implemented various NLP (e.g., entity recognition, personality identification, interest detection) and ML (e.g., kernel density estimation, logistic regression) techniques for scoring similar users. [Python, Scikit-learn]
      • Community Detection for Peer-to-Peer Support: Designed and implemented topic-modeling-based community detection techniques for social media. The proposed statistical method can distinguish between users’ conversations directed at an implicit community in the social network and their personal statuses. Co-led the development of a user-community recommendation engine based on the community detection algorithm. [Mallet, Java]

Yahoo! Labs

Research Intern

May 2010 - Aug 2010 · 3 mos · Bengaluru Area, India

  • Hierarchical Topic Models for Entity Disambiguation: Researched and implemented weakly semi-supervised topic models for detecting, disambiguating, and linking entity mentions in textual documents to their Wikipedia profiles. The developed methods outperformed state-of-the-art entity linking methods and overcame issues such as context window selection and collective linking in documents. [Mallet, Java]

Xerox Research Centre Europe

Research Intern

Jun 2009 - Sep 2009 · 3 mos · Grenoble Area, France

  • Font Image Retrieval: Researched and implemented a retrieval system for font description files that are visually similar to a user’s “hand-drawn” font. Implemented various combinations of ML-based classifiers and feature extraction techniques. [Matlab, C++]

Intelligent Information Systems Lab at Penn State University

Research Assistant

Aug 2007 - Mar 2012 · 4 yrs 7 mos · State College, Pennsylvania Area

  • Citation Recommendation Engine (RefSeer): Researched and implemented a content-based document recommendation system that ranks scientific literature by its likelihood of being cited, given a user’s scientific text. The developed system interfaces with CiteSeerX and uses an extension of topic models for linked corpora as its underlying ranking algorithm. [Mallet, Java, Spring Framework]
  • Information Extraction from Scientific Literature: Researched and implemented information extraction modules that extract structured output from certain classes of 2-D plots (e.g., bar charts and line charts) and from textual descriptions of figures. The modules are integrated with the CiteSeerX digital library system. [Matlab, C++]

GlobalLogic

Software Engineer

Jan 2005 - Jan 2006 · 1 yr · Noida Area, India

Education

Penn State University

Doctor of Philosophy (Ph.D.) — Information Sciences and Technology

Jan 2006 - Jun 2012

Indian Institute of Technology (Banaras Hindu University), Varanasi

B. Tech. — Computer Science

Jan 2001 - Jan 2005
