Amba Kumari

Data Scientist

Hyderabad, Telangana, India · 6 yrs experience

AI Enabled · AI ML Practitioner

Key Highlights

Designed advanced NLP solutions with high accuracy.
Developed models that significantly reduced attrition rates.
Created scalable, data-driven solutions for complex problems.

Stackforce AI infers this person is a Data Scientist specializing in NLP and Machine Learning solutions for SaaS applications.

Contact

ambasmk03@gmail.com LinkedIn

Skills

Core Skills

Natural Language Processing (NLP) · Machine Learning · Statistical Modeling

Other Skills

AI Solutions · Classification · Data Modeling · Ensemble Models · Exploratory Data Analysis · FAISS · Feature Engineering · Fine-tuning · Flask · GPT API · Large Language Models (LLM) · Microsoft Excel · Predictive Analytics · Python (Programming Language) · Retrieval-Augmented Generation (RAG)

About

At State of Mind.ai, I’ve designed advanced NLP solutions using LLMs like GPT-4 and LLaMA2 — from Retrieval-Augmented Generation (RAG) chatbots for multi-org policy Q&A to zero-shot classification and fine-tuned models with 98% accuracy. With a strong foundation in statistics (IIT Bombay M.Sc., ISS qualified), I blend theory with practical execution — developing disengagement prediction models that directly impacted attrition rates and employee engagement. My work spans Flask-based backend APIs, FAISS vector stores, and collaboration with product/dev teams to turn ML pipelines into real-world tools. Passionate about turning complex problems into scalable, data-driven solutions.

Experience

Total Experience: 6 yrs

Average Tenure: 2 yrs 9 mos

Current Experience: 6 mos

Centific

2 roles

Senior Data Scientist

Promoted

Nov 2025 – Present · 6 mos · Hyderabad, Telangana, India · Hybrid

Data Scientist

Aug 2025 – Oct 2025 · 2 mos · Hyderabad, Telangana, India · Hybrid

State of mind.ai

Data Scientist

Aug 2022 – Jul 2025 · 2 yrs 11 mos · India · Remote

Project 1
Developed and implemented a Theme Model leveraging NLP to analyze and extract themes and sentiments from employees' texts.
1. Applied Ensemble Models for Sentiment Analysis.
2. Topic Modelling using LDA.
3. Implemented a Zero-Shot Classification model from Hugging Face to classify text data.
4. Employed GPT-4o-mini with prompt engineering techniques to confine classifications to a bounded set of classes tailored to specific categories, enhancing Zero-Shot Classification.
5. Applied Fine-tuning to further refine model performance, achieving a 98% accuracy rate.
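The bounded-class idea in step 4 can be sketched as follows. This is a minimal illustration, not the production prompt: the label names, the `fallback` value, and both function names are hypothetical, and the call to the model itself is left out.

```python
from typing import Sequence


def build_bounded_prompt(text: str, labels: Sequence[str]) -> str:
    """Build a prompt that restricts the model to a fixed set of labels."""
    label_list = "\n".join(f"- {label}" for label in labels)
    return (
        "Classify the employee comment into exactly one of these categories:\n"
        f"{label_list}\n"
        "Reply with the category name only.\n\n"
        f"Comment: {text}"
    )


def parse_bounded_label(reply: str, labels: Sequence[str], fallback: str = "Other") -> str:
    """Map a raw model reply onto the allowed labels; anything else becomes the fallback."""
    cleaned = reply.strip().strip(".\"'").casefold()
    for label in labels:
        if cleaned == label.casefold():
            return label
    return fallback


# Hypothetical label set, for illustration only.
LABELS = ["Compensation", "Workload", "Management"]
prompt = build_bounded_prompt("Too many late-night releases.", LABELS)
label = parse_bounded_label(" workload. ", LABELS)
```

Validating the reply against the label list is what keeps the output bounded even when the model answers with extra punctuation or an unlisted category.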
Project 2
Built a Disengagement Model for multiple companies that categorizes employees into groups, helping each organization understand employee disengagement and reduce its churn rate.
1. Created and optimized employee segmentation using RFMBC models, aimed at improving user engagement and reducing churn rate.
2. Implemented Feature Engineering, Variable Combination and Classification.
3. Used the statistical techniques Weight of Evidence (WOE) and Information Value (IV) to evaluate the predictive power of the variables and to create a Variable Combination.
4. Collaborated with HR teams to integrate the Model into existing processes, allowing for early identification of at-risk employees.
5. Reduced attrition rates based on insights generated by the Model: improved user engagement by 65% and reduced churn rate by 25% within six months.
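The WOE and IV screening in step 3 can be sketched as follows. This is a minimal illustration under assumptions: it uses the common convention WOE = ln(% non-events / % events), adds a small `smoothing` term to avoid division by zero, and the function name and toy data are hypothetical rather than taken from the project.

```python
import math
from collections import Counter


def woe_iv(categories, targets, smoothing=0.5):
    """Return ({category: WOE}, IV) for one categorical variable.

    targets: 1 = event (for example, disengaged), 0 = non-event.
    """
    events, non_events = Counter(), Counter()
    for cat, y in zip(categories, targets):
        if y == 1:
            events[cat] += 1
        else:
            non_events[cat] += 1

    cats = sorted(set(events) | set(non_events))
    # Smoothed totals so that each share below sums to 1 across categories.
    total_events = sum(events.values()) + smoothing * len(cats)
    total_non_events = sum(non_events.values()) + smoothing * len(cats)

    woe, iv = {}, 0.0
    for cat in cats:
        pct_event = (events[cat] + smoothing) / total_events
        pct_non_event = (non_events[cat] + smoothing) / total_non_events
        woe[cat] = math.log(pct_non_event / pct_event)
        iv += (pct_non_event - pct_event) * woe[cat]
    return woe, iv


# Toy data: low-activity employees are mostly events, high-activity mostly not.
activity = ["low"] * 4 + ["high"] * 4
disengaged = [1, 1, 1, 0, 0, 0, 0, 1]
woe, iv = woe_iv(activity, disengaged)
```

A variable whose categories have the same event and non-event shares gets an IV of zero, which is why IV works as a ranking of predictive power before variables are combined.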
Project 3
RAG-based Chatbot API for Policy Document Q&A
1. Built a Flask-based API to serve a Retrieval-Augmented Generation (RAG) pipeline for querying policy documents from multiple organizations.
2. Preprocessed documents, generated vector embeddings using OpenAI’s model, and stored FAISS indexes locally for fast semantic retrieval.
3. Designed dynamic index loading and similarity search based on organization-specific context.
4. Integrated the GPT API to generate accurate, context-aware answers from the top-k retrieved content.
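The retrieval flow in steps 2 to 4 can be sketched as follows. This is a simplified illustration under stated assumptions: a brute-force cosine search stands in for the FAISS index, hand-written toy vectors stand in for OpenAI embeddings, the final GPT call is omitted, and `OrgIndexStore`, `build_prompt`, and the organization names are hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


class OrgIndexStore:
    """Keeps one small in-memory index per organization (stand-in for per-org FAISS files)."""

    def __init__(self):
        self._indexes = {}  # org_id -> list of (vector, chunk_text)

    def add(self, org_id, vector, chunk):
        self._indexes.setdefault(org_id, []).append((vector, chunk))

    def search(self, org_id, query_vector, k=2):
        """Return the k most similar chunks from the given organization's index only."""
        entries = self._indexes.get(org_id, [])
        ranked = sorted(entries, key=lambda e: cosine(query_vector, e[0]), reverse=True)
        return [chunk for _, chunk in ranked[:k]]


def build_prompt(question, chunks):
    """Assemble the retrieved chunks and the question into a single prompt string."""
    context = "\n".join(f"[{i + 1}] {c}" for i, c in enumerate(chunks))
    return f"Answer using only the context below.\n\nContext:\n{context}\n\nQuestion: {question}"


store = OrgIndexStore()
store.add("acme", [1.0, 0.0], "Leave policy: 20 days per year.")
store.add("acme", [0.0, 1.0], "Travel policy: economy class only.")
store.add("globex", [1.0, 0.0], "Leave policy: 15 days per year.")

top_chunks = store.search("acme", [0.9, 0.1], k=1)
prompt = build_prompt("How many leave days do I get?", top_chunks)
```

Selecting the index by organization before searching is what keeps one organization's policy text out of another's answers.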

Natural Language Processing (NLP) · Large Language Models (LLM) · Retrieval-Augmented Generation (RAG) · Machine Learning · Python (Programming Language) · Statistical Modeling

Infochord technologies pvt. ltd.

Data Scientist

Dec 2019 – Jul 2022 · 2 yrs 7 mos · Hyderabad, Telangana, India

Machine Learning Projects:
1. Fuel consumption rate analysis in Python
Performed feature engineering using PCA to reduce the number of predictor variables.
Pre-processed the data with missing value imputation and outlier detection for each of the variables.
Applied multiple regression involving multiple parameters to predict C-rate.
Improved the model using Random Forest and boosting techniques, reaching around 92% accuracy.
EMS/Non-EMS fuel savings:
Analyzed each of the routes independently to pre-process the data.
Used Random Forest to choose the important features for each of the routes and to get the fuel prediction.
Computed fuel savings for EMS trains using the model trained on Non-EMS data.
2. Prediction of term deposit subscription for bank clients in Python using Machine Learning
Pre-processed the data with missing value imputation and outlier detection for each of the variables.
Performed feature engineering using standardization and handled categorical features using One-Hot Encoding.
Applied SMOTE (Synthetic Minority Oversampling Technique) to handle the imbalanced dataset.
Applied Logistic Regression.
Applied Recursive Feature Elimination, which repeatedly constructs the model and sets aside the best or worst performing feature at each iteration.
Evaluated the accuracy of the model using a Confusion Matrix and ROC.
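
The final evaluation step of the term deposit model can be sketched as follows. This is a minimal illustration with made-up labels and scores, not the project's data; the function names are hypothetical, and the ROC curve is summarized here by its area (AUC), computed as the share of positive/negative pairs that the scores rank correctly.

```python
def confusion_matrix(y_true, y_pred):
    """Return (tp, fp, tn, fn) for binary labels where 1 is the positive class."""
    tp = fp = tn = fn = 0
    for t, p in zip(y_true, y_pred):
        if t == 1 and p == 1:
            tp += 1
        elif t == 0 and p == 1:
            fp += 1
        elif t == 0 and p == 0:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn


def accuracy(y_true, y_pred):
    """Share of predictions that match the true label."""
    tp, fp, tn, fn = confusion_matrix(y_true, y_pred)
    return (tp + tn) / (tp + fp + tn + fn)


def roc_auc(y_true, scores):
    """Area under the ROC curve via pairwise ranking; tied scores count as half."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("roc_auc needs at least one positive and one negative label")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Made-up example: predicted subscription probabilities, thresholded at 0.5.
y_true = [1, 0, 1, 0, 1]
scores = [0.9, 0.4, 0.6, 0.7, 0.8]
y_pred = [1 if s >= 0.5 else 0 for s in scores]

acc = accuracy(y_true, y_pred)
auc = roc_auc(y_true, scores)
```

Accuracy depends on the chosen threshold, while AUC does not, which is one reason to report both on a dataset that was imbalanced before resampling.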