Aakash Goel — AI Researcher

Senior Data Scientist with 9.5+ years of experience leading the design and delivery of large-scale Machine Learning and Generative AI systems across security, cloud, and enterprise productivity. Currently at Microsoft (UK), where I work on LLM-powered platforms for threat intelligence, shaping both technical direction and execution. I operate at the Staff/Principal level, owning ambiguous problem spaces end-to-end: translating business and security needs into scalable AI architectures, aligning stakeholders, and guiding teams from experimentation to production.

My recent work includes leading the development of an internal ChatGPT-style platform for threat analysts, built on RBAC-enabled vector databases, agentic RAG, and deep research agents. The system onboarded 200+ analysts, handled 10k+ queries in two months, and materially reduced manual investigation effort while improving depth and speed of threat analysis.

Previously at Microsoft India (R&D), I led multiple high-impact initiatives:
* Ransomware detection systems operating at 12+ million signals per hour, driving end-to-end model strategy, evaluation, and production deployment for both consumer and enterprise security
* Private-repo code assistant for OneDrive and SharePoint, where I defined the GenAI architecture (RAG, embeddings, fine-tuned Code LLMs, multi-agent orchestration), built evaluation frameworks, and drove adoption, achieving 90% human-eval satisfaction
* GenAI accelerators and platforms for document QA, summarization, and rapid prototyping across teams

Earlier in my career at Fractal Analytics, I delivered ML and NLP solutions for large enterprise clients, spanning customer analytics, recommendation systems, text similarity, and risk modeling, building a strong foundation in applied ML at scale.

Leadership & technical focus areas:
* Technical strategy for GenAI, LLM platforms, and agentic systems
* ML system design, evaluation rigor, and production readiness
* NLP, Information Retrieval, Entity Extraction, and RAG
* Mentorship, technical reviews, and cross-org collaboration

I’m motivated by high-leverage problems, building durable AI platforms, and helping organizations move from GenAI experimentation to reliable, responsible, and scalable impact.

Stackforce AI infers this person is a Cybersecurity and AI specialist with a focus on scalable ML solutions.

Location: Cheltenham, England, United Kingdom

Experience: 9 yrs 11 mos

Skills

Machine Learning
Generative AI
Data Science
Cybersecurity
Teaching
Mentoring

Career Highlights

Led development of large-scale ML systems at Microsoft.
Expert in Generative AI and NLP with extensive mentoring experience.
Successfully onboarded over 200 analysts to AI platforms.

Work Experience

Microsoft

Senior Data Scientist (1 yr 8 mos)

Senior Data Scientist (1 yr 7 mos)

Data & Applied Scientist II (1 yr 3 mos)

Great Learning

AI Mentor (1 yr 11 mos)

upGrad International

Data Science Mentor (1 yr 5 mos)

Fractal Analytics

Senior Data Scientist (4 yrs 7 mos)

Monster India

Research Engineer (7 mos)

Indian Institute of Technology, Delhi

Research Staff (3 mos)

Research Intern (1 mo)

DRDO

Project Intern (1 mo)

Education

Master of Technology - MTech at Birla Institute of Technology and Science, Pilani

BTech - Bachelor of Technology at Kurukshetra University

Research Intern at Indian Institute of Technology, Delhi

Non Medical Science at D.A.V Centenary public School, Rohtak

Other Skills

RBAC-enabled vector databases
Agentic RAG
Deep research agents
Entity extraction
Data structuring
Semi Supervised Learning
Boosting
LLM
Azure ML endpoint
Prompt Engineering
RAG system
NLP
Deep Learning
Entity Mapping
Machine Learning Algorithms

Experience

Total Experience: 9 yrs 11 mos
Average Tenure: 1 yr 11 mos
Current Experience: 3 yrs 3 mos

Microsoft

3 roles

Senior Data Scientist

Oct 2024 – Present · 1 yr 8 mos

Project: Internal ChatGPT for Threat Analyst
Orchestrated the setup of an RBAC-enabled vector database, onboarding 200+ analysts and handling 10k+ queries in 2 months, which saved significant man-days of effort.
Leveraged various versions of the Agentic RAG system and deployed deep research agents to tackle complex 5-6 sentence queries, producing detailed multi-page reports for comprehensive threat analysis, including chat-based interaction, resulting in faster threat hunting and enhanced response capabilities.
Project: Structured Data Model
Implemented 20+ entity extractions at scale to structure threat actor data, improving data organization, queryability, and analysis efficiency within the threat analysis domain.

RBAC-enabled vector databases, Agentic RAG, Deep research agents, Machine Learning, Generative AI

Senior Data Scientist

Promoted

Mar 2023 – Oct 2024 · 1 yr 7 mos

Project: Ransomware Detection for ODC and ODB (managed data science end to end)
Consumer: reached General Availability (GA) in Oct 2023.
Enterprise: getting ready for private preview. Scale: 12+ million signals per hour for a single region.
Tools/Techniques: Semi-Supervised Learning, Boosting, TLC (model in C#), LLM, Substrate, Euclid
Project: Code Assistant for Private Repositories (OneDrive and SharePoint), VS Code extension
Designed and implemented, from the ground up, a VS Code extension for code generation and chat-with-your-code, backed by an Azure ML endpoint and a ranking-based RAG system. It is used internally to improve developer productivity.
Designed and implemented an evaluation tool to accelerate and monitor the performance of code generation. The human-evaluation result on a Likert scale is 90%.
Tools/Techniques: Azure ML endpoint, FAISS, Embeddings, Prompt Engineering, RAG system, Fine-Tuning, Code-LLAMA-13B, Multi-Agent approach
Project: E+D Hackathons
Quiz Generator over documents (T5, Embeddings, Prompt Engineering, Automated Evaluation)
Teams Meeting Summarization (BART, GPT model)
Developed a generic Python package, “accelerator”, for domain-specific question answering systems (data: documents, code, URLs)

Semi Supervised Learning, Boosting, LLM, Azure ML endpoint, Prompt Engineering, Machine Learning

Data & Applied Scientist II

Dec 2021 – Mar 2023 · 1 yr 3 mos

Great Learning

AI Mentor

Nov 2022 – Oct 2024 · 1 yr 11 mos · India

Teaching Generative AI, Data Science, Machine Learning, NLP, and Deep Learning to empower the next generation of AI professionals.

Generative AI, Data Science, Machine Learning, NLP, Teaching, Mentoring

upGrad International

Data Science Mentor

Jan 2022 – Jun 2023 · 1 yr 5 mos · India · Remote

Delivered training in Machine Learning, NLP, Computer Vision, Deep Learning, and Generative AI to working professionals and fresh graduates. Mentored professionals and guided them through successful career transitions into Data Science roles.

Machine Learning, NLP, Generative AI, Mentoring, Teaching

Fractal Analytics

Senior Data Scientist

May 2017 – Dec 2021 · 4 yrs 7 mos · Gurgaon, India

Worked as a Senior Data Scientist in the AIML (Artificial Intelligence and Machine Learning) team.
Fine-grained news classification system.
Entity mapping to find network opportunities.
Designed a deep learning architecture for the Semantic Textual Similarity problem.
Predicting coupon redemption: helped a retailer’s marketing team design coupon constructs effectively and select the target population precisely. Developed a model based on customer-, campaign-, and item-level features.
Entity extraction: given seed entities and a corpus, expanded entities of a specific type in an unsupervised manner using Bootstrap Pattern Learning.
Aspect/opinion mining: given a review containing multiple opinions or multiple features of a product or service, determined the user's opinion of each feature.
Interactive information retrieval system: the user describes a disease or symptoms in free text, and the system recommends vitamins based on the query. Used Open Information Extraction (OpenIE) and a knowledge graph to understand entity relations and capture attributes of diseases, symptoms, and vitamins.
Identification of records associated with catastrophe events (classification model).
Explored embedding common sense (knowledge-based systems) in systems.
Conversion of natural language queries into business queries and generation of results.

Deep Learning, NLP, Entity Mapping, Machine Learning, Data Science

Monster India

Research Engineer

Oct 2016 – May 2017 · 7 mos · Noida

Wrote NLP code for a chatterbot and an auto spell corrector for city names.
Problem statement: extract location, job title, skill, and experience from misspelled free text, and auto-correct all tag values.
Designed and coded machine learning algorithms for resume parsing (scoring model, part-of-speech tagging, chunking, understanding pre- and post-context of text).
Parsed search logs to identify users' search behaviour and build users' search trajectories.
Worked on Natural Language Processing (NLP), decision analytical model design, and text mining.
Worked on job recommendation systems based on users' skill sets and history patterns.
Worked on job search predictive analytics and machine learning algorithms.
Worked on string and pattern matching.
Used machine learning algorithms (Association Rule Mining, Regression, Scoring Models, KNN, K-Means, Clustering, Naive Bayes, Decision Tree, PCA, Sampling).
Automatic document categorization.
Coding language: Python (NumPy, Pandas, NLTK, SciPy, scikit-learn, parallel programming).

NLP, Machine Learning Algorithms, Data Science

Indian Institute of Technology, Delhi

2 roles

Research Staff

Jul 2016 – Oct 2016 · 3 mos · Delhi

Requirements gathering.
Worked with highly skilled professors on production systems.
Coding and writing scripts.
Database management in MySQL.
Wrote algorithms for pattern matching and string matching.
Wrote scripts to automate repetitive work and improve efficiency.

Research Intern

Jun 2015 – Jul 2015 · 1 mo · Delhi

I worked on "Development of APP for Tourist Recommendation using Crowd-sourcing Concept". The objective was to recommend locations close to the tourist's current location.
I read the relevant literature on the use of crowd-sourcing for building applications. I worked on the overall system design, including the database and the user interface. I also worked on a fuzzy inference system that ranks tourist places based on responses from the crowd present at each place. I had the opportunity to present my research paper on this work at ICCTD, Singapore.

DRDO

Project Intern

Jul 2014 – Aug 2014 · 1 mo · Delhi

My responsibility was to analyse possible security threats to the system and devise solutions for them. In response, I designed a Software Authentication Utility for defense applications. More specifically, I designed the user interface for this utility and coded it in C++ using Qt.