Soumyajit De

AI Researcher

London, England, United Kingdom · 11 yrs 1 mo experience
AI ML Practitioner · AI Enabled

Key Highlights

  • Expert in Machine Learning and Natural Language Processing.
  • Proven track record in improving ad performance and revenue.
  • Strong background in algorithm development and research.
Stackforce AI infers this person is a Data Scientist with expertise in Machine Learning and Cloud Infrastructure.

Skills

Core Skills

Machine Learning, Natural Language Processing (NLP), Data Science, Software Engineering, Data Engineering, Cloud Infrastructure, Mentoring, Software Development, Research, Algorithm Development

Other Skills

AI Productivity, Algorithm Design, Algorithms, Anaconda, Apache Kafka, Azure Cosmos DB, Azure Data Factory, Azure Data Lake, Azure Data Lake Analytics, Azure Data Lake Storage, BERT (Language Model), BLAS, Big Data, C#, C++

Experience

Total Experience: 11 yrs 1 mo
Average Tenure: 5 yrs 1 mo
Current Experience: 11 mos

Meta

Machine Learning Engineer

Jul 2025 - Present · 11 mos · London, United Kingdom · On-site

  • Applying machine learning to extraction and classification tasks on webpage data as part of the Website Intelligence team.
Large Language Models (LLM), Machine Learning Algorithms, Transformer Models, Python (Programming Language), PyTorch, Presto (+17 more)

Microsoft

3 roles

Senior Data & Applied Scientist

Promoted

Aug 2023 - May 2025 · 1 yr 9 mos

  • Led the RichAds modeling team's efforts to improve the clickability and quality of Search Ad-Extensions and Dynamic Search Ads (DSA) across EMEA, APAC and LATAM markets.
  • Introduced click-prediction (CP) models for extensions and DSA headlines using historical signals, contributing a 3-6% revenue increase across tiers.
  • Developed a global feature store that improved on the region-specific design, extending the ranking service from 5 to 100+ markets.
  • Incorporated semantic query-context signals into the CP model, resulting in a +3% Delta AUC on impressed ads.
  • Curated an offline selection approach for autogenerated extensions using historical query-context features. Exploited marginalised scores from a semantic CP model while allowing for random exploration. Scaled and globalised this pipeline, enabling daily ranking of ~10B items.
  • Addressed a combinatorial variant-ranking problem by formulating a theoretical approach, conducting large-scale hypothesis testing, defining features, and using a DCNv2 model. This resulted in an offline +4% Delta AUC on impressed ads.
  • Designed an E2E personalisation paradigm leveraging long-term and real-time user-interest signals to provide a personalised ranking scheme, making the items more relevant and diverse.
  • Technology & Tools: MapReduce (Scope), PyTorch, Keras, Hugging Face, ONNX, Pandas, Matplotlib, SciPy, NumPy, scikit-learn, LangChain, PySpark, Jupyter, Docker, Kubernetes, TensorBoard, WandB, Azure Data Factory, Azure Data Lake Storage, Azure Data Lake Analytics, Distributed FS (Cosmos), Kafka, BLAS, LAPACK, Intel MKL, GDB, JDB, Valgrind, Perf, RESTful APIs, OAuth, Git, Conda, Pip
MapReduce (Scope), PyTorch, Keras, Hugging Face, ONNX, Pandas (+30 more)

Data & Applied Scientist 2

May 2021 - Aug 2023 · 2 yrs 3 mos

Data Structures, BERT (Language Model), Transformer Models, Statistics, Python, Anaconda (+6 more)

Software Engineer II

Dec 2018 - May 2021 · 2 yrs 5 mos

  • Bengaluru Area, India
  • Area: Ad-decorations for bing.com
  • My work broadly involved candidate generation (extraction), in particular:
  • Algorithm ideation for candidate extraction from various data sources
  • Design, development, deployment and monitoring of extraction pipelines
  • A/B testing, performance analysis and mainstreaming of the newly generated candidates
  • On occasion, developing additional back-end plugins for real-time serving
  • Achievements: Quarterly Excellence Award Q4 2019-2020
  • Technologies: SCOPE, C#, Python, Azure Data Factory
Data Structures, Azure Data Lake Storage, Azure Data Lake, C#, Apache Kafka, Statistics (+7 more)

Oracle

2 roles

Senior Member Of Technical Staff

Promoted

Dec 2016 - Dec 2018 · 2 yrs

  • Area: Oracle Cloud Infrastructure
  • Designed and implemented the majority of the Marketplace REST API.
  • Employed batch processing and application-layer caching to reduce the response times of multi-page GET calls from ~2 mins to ~10 secs.
  • Implemented a seamless workflow for onboarding existing SaaS customers to PaaS service offerings within a tenant automation framework.
  • Technologies: Java/JEE, Jersey/Jackson, ADF, JPub, PL/SQL, OAuth2.0, WebLogic
Docker, Data Structures, Algorithms, Software Engineering, Cloud Infrastructure

Member of Technical Staff

Sep 2016 - Dec 2016 · 3 mos

Data Structures, Algorithms

Google Summer of Code

3 roles

Peer Mentor

May 2016 - Aug 2016 · 3 mos · London, United Kingdom · Remote

Mentoring, Low-Level Design, Object Oriented Design, C++, Python, Software Development

Student Software Developer

May 2014 - Aug 2014 · 3 mos · Mumbai, Maharashtra, India · Remote

Python (Programming Language), C++, Machine Learning, Software Development

Student Software Developer

May 2013 - Aug 2013 · 3 mos · Mumbai, Maharashtra, India · Remote

C++, Data Structures, Algorithms, Software Development, Research

University College London

Research Assistant - Gatsby Computational Neuroscience Unit

May 2016 - Jul 2016 · 2 mos · London, United Kingdom · On-site

  • Worked with Prof. Arthur Gretton and Dr. Heiko Strathmann as a research assistant.
  • Devised a cache-friendly algorithm for non-parametric two-sample tests involving the MMD estimator that showed a ~300x speed-up over a naive implementation.
  • Proposed and implemented a multi-threaded variant that outperformed competing algorithms, built with state-of-the-art solvers, by reducing the runtime tenfold.
  • Co-authored a paper in which this algorithm is used in a discriminator for GANs.
  • Technologies: C/C++, Cachegrind, Perf, OpenMP, Python
C++, Data Structures, Valgrind, Statistics, Python, Anaconda (+5 more)

Oracle

Member of Technical Staff

Jul 2014 - Apr 2016 · 1 yr 9 mos · Kalyani Magnum, Bangalore

Data Structures, Algorithms

Indian Institute of Technology, Bombay

Research Assistant - Center for Formal Design and Verification of Software Lab

Jul 2011 - Nov 2013 · 2 yrs 4 mos · Mumbai, Maharashtra, India · On-site

  • I worked as a system administrator in the Center for Formal Design and Verification of Software Lab.
  • I implemented bounded model checking using symbolic simulation of RTL programs as part of a C++ library developed by our lab for hardware verification via symbolic simulation.
C++, Python (Programming Language), Machine Learning, Software Development

Education

Indian Institute of Technology, Bombay

M.Tech — Computer Science and Engineering

Jan 2014 - Present

Kalyani Government Engineering College

B.Tech — Computer Science and Engineering

Jan 2011 - Present
