Rahul Jha — AI Researcher

🚀 Senior Data Scientist with ~5 years of experience building end-to-end AI solutions from scratch. Built multiple AI solutions that have generated a combined business value of more than 💰$50M for customers and automated ⏱2 million hours of manual work to date. Well versed in building AI applications, with knowledge of Machine Learning, Gen AI, Deep Learning, Natural Language Processing (NLP), API building and Model Deployment.

👨‍💻 Career Achievements:
1) Currently solving real-world problems as a Senior Data Scientist at NICE Actimize, protecting banks from fraud and money laundering.
2) Machine Learning, Deep Learning, Natural Language Processing (NLP) and Gen AI expert across multiple domains, including Consulting, Healthcare, and Financial Crime & Compliance Analytics.
3) 🏆 Won the Impact Award 2024 and a Spot Award for building the Xceed Copilot Post Model, which saved 💰$43.5M for banks and automated ⏱815,100 hours of manual work for fraud analysts.
4) Led DS teams end to end: understanding the business problem, finding the ideal solution, presenting the idea to Managers and VPs, building the AI solution and deploying it to production as an API.

💻 Tech stack I use as a Senior Data Scientist to solve real-world business problems:
- Packages and Programming Languages: Python, R, Pandas, NumPy, Matplotlib, Seaborn, NLTK, Hugging Face, spaCy, Scikit-learn, SHAP, etc.
- Databases: MySQL, Microsoft SQL Server, Snowflake, MongoDB.
- Machine Learning Algorithms: Linear Regression, Logistic Regression, Naive Bayes, Support Vector Machines, KNN, Decision Tree, Random Forest, XGBoost, CatBoost, K-Means, etc.
- Gen AI: LLMs, GPT, ChatGPT, Open Source LLMs, Llama, Mistral, LangChain, RAG, NL2SQL and Prompt Engineering.
- Time Series: EWMA, ARMA, ARIMA, SARIMA, SARIMAX, etc.
- Healthcare Analytics: Named Entity Recognition (NER), SHAP, LLMs, Hugging Face, BERT, BioBERT, ClinicalBERT and spaCy for healthcare data, Machine Learning for medical coding, etc.
- Marketing Analytics: Market Basket Analysis (Apriori Algorithm), Customer Segmentation (K-Means Clustering), Recency, Frequency and Monetary Value (RFM) Analysis, etc.
- Financial Crime and Compliance Analytics: CatBoost, SHAP, Behavioral Analytics, Customer Segmentation, AML, FRAML, etc.
- Data Visualization: Tableau and Power BI.

🙋‍♂️ I get excited about opportunities where I can use my data science skills to solve real-world business problems that help companies either make more money or save more money!

📧 Feel free to reach me at rahuljha1381998@gmail.com or with a LinkedIn InMail!

Stackforce AI infers this person is a Senior Data Scientist specializing in AI solutions for Fintech and Healthcare industries.

Location: Mumbai, Maharashtra, India

Experience: 5 yrs 10 mos

Skills

Machine Learning
Deep Learning
Natural Language Processing
Gen AI

Career Highlights

Generated over $50M in business value through AI solutions.
Automated 2 million hours of manual work for clients.
Awarded Impact Award 2024 for innovative ML solutions.

Work Experience

NICE Actimize

Senior Data Scientist (2 yrs 6 mos)

Episource

Data Scientist - NLP & Data Science (7 mos)

Associate Data Scientist - NLP & Data Science (1 yr 6 mos)

Think360.ai

Associate Data Scientist (1 yr 3 mos)

Education

Bachelor of Engineering at Dwarkadas J. Sanghvi College of Engineering

Science at Mithibai College of Arts Chauhan Institute of Science and A.J. College of Commerce and Economics

at Our Lady Of Health High School - India

Contact

rahul.jha2@nice.com LinkedIn

Skills

Core Skills

Machine Learning
Deep Learning
Natural Language Processing
Gen AI

Other Skills

API building
Amazon Web Services (AWS)
Applied AI
Applied Machine Learning
Apriori
Artificial Intelligence (AI)
Artificial Neural Networks
Bitbucket
Business Analysis
Business Intelligence (BI)
C (Programming Language)
C++
Cascading Style Sheets (CSS)
ChatGPT
DVC

Experience

Total Experience: 5 yrs 10 mos

Average Tenure: 1 yr 11 mos

Current Experience: 2 yrs 6 mos

NICE Actimize

Senior Data Scientist

Dec 2023 – Present · 2 yrs 6 mos · Pune, Maharashtra, India · Hybrid

Created the Xceed Copilot Post Model from scratch, working with Managers, Principal Software Architects, Product Managers and VPs, to save $43.5M for banks. Won the Impact Award 2024, selected from 90+ teams across the company, along with the Spot Award for Q4 2024.
🔷 Xceed Copilot Post Model – ML solution saving $43.5M for banks and providing an 80% productivity boost
Built an ML model from scratch that assigns each alert a probability score (0-1); the score decides whether the alert is sent to an analyst for further investigation or eliminated as a false positive.
Reduced false positives (FPs) by 40% on average, which saved 32,606 hours of manual investigation work and $1.7M for 4 banks and gave fraud analysts an 80% productivity boost.
Projected savings are 815,100 hours of manual work and $43.5M in cost, even when considering just 25% of the NICE Actimize Premier customer base.
Presented the project to VPs, SVPs, the CEO of NICE Actimize and the CEO of NICE, and won the Impact Award 2024 in the “Accelerating AI Leadership” category, selected from 90+ teams across the company.
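The score-based triage described above can be illustrated with a minimal sketch. The `Alert` type, the `route_alert` function and the 0.5 cut-off are hypothetical and are not taken from the actual Xceed Copilot Post Model; the sketch only shows how a probability score in the range 0-1 could route an alert either to an analyst or to closure as a likely false positive.

```python
from dataclasses import dataclass

# Hypothetical cut-off; the threshold used in production is not stated in this profile.
THRESHOLD = 0.5


@dataclass
class Alert:
    alert_id: str
    score: float  # model probability that the alert needs investigation, in [0, 1]


def route_alert(alert: Alert, threshold: float = THRESHOLD) -> str:
    """Return "investigate" for alerts at or above the threshold, else "close"."""
    if not 0.0 <= alert.score <= 1.0:
        raise ValueError(f"score out of range: {alert.score}")
    return "investigate" if alert.score >= threshold else "close"


# Made-up alerts, for illustration only.
alerts = [Alert("A1", 0.92), Alert("A2", 0.03), Alert("A3", 0.50)]
decisions = {a.alert_id: route_alert(a) for a in alerts}
```

In practice the threshold would be tuned on labelled alerts so that the share of genuine cases closed automatically stays acceptably low.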

Machine Learning, Deep Learning, Natural Language Processing, API building, Model Deployment, Python, +10 more

Episource

2 roles

Data Scientist - NLP & Data Science

Promoted

May 2023 – Dec 2023 · 7 mos · Hybrid

Built multiple end-to-end AI projects from scratch using ML, NLP and Gen AI technologies, and saved more than 1 million hours of manual work for medical coders in ~2 years at Episource.
🔷 UnifiedNER – NLP Solution to find Face2Face, Telehealth and Provider metadata in medical charts
Implemented an end-to-end NER solution that extracts entities from medical charts/PDFs indicating whether the patient and the provider met face to face (F2F entities) or had an online interaction (Telehealth entities), along with entities for provider metadata.
Led the end-to-end development: pulling the raw data from Snowflake, working with the annotators to create the dataset, building multiple NER models and deploying the final model to production.
UnifiedNER replaced 4 production models because it had a 20%-30% better weighted F1 score along with a low inference time, which saved 520,000 hours of the manual reading medical coders needed to determine whether an encounter is an F2F or a Telehealth encounter.
🔷 M.E.A.T Validation using Gen AI – Using LLMs like ChatGPT to clean medicine-disease mapping
Implemented an end-to-end solution using LLMs like ChatGPT and Llama 2 to determine which medicine-disease mappings are correct and can be used to validate whether a patient is suffering from a disease, and which mappings can be removed from the knowledge base to reduce false positives.
Used multiple prompting techniques, including Zero-Shot Prompting, Few-Shot Prompting, Role-Based Prompting and Chain-of-Thought (CoT) Prompting, to validate the medicine-disease mappings.
Reduced false positives by 6.97% by removing incorrect medicine-disease mappings, using the prompt with the highest accuracy with the ChatGPT API (GPT-3.5 Turbo model) and LangChain.
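The four prompting styles named above can be sketched as plain prompt builders. The function names, the wording of the prompts and the example medicine-disease pairs are all hypothetical; the prompts actually used in the project are not given in this profile, and no LLM API is called here.

```python
def zero_shot(medicine: str, disease: str) -> str:
    """Zero-shot: ask the question directly, with no examples."""
    return f"Is {medicine} commonly prescribed to treat {disease}? Answer Yes or No."


def few_shot(medicine: str, disease: str, examples: list[tuple[str, str, str]]) -> str:
    """Few-shot: prepend a handful of labelled examples before the query."""
    shots = "\n".join(
        f"Medicine: {m}\nDisease: {d}\nValid: {label}" for m, d, label in examples
    )
    return f"{shots}\nMedicine: {medicine}\nDisease: {disease}\nValid:"


def role_based(medicine: str, disease: str) -> str:
    """Role-based: give the model a persona before asking."""
    return "You are a clinical pharmacist.\n" + zero_shot(medicine, disease)


def chain_of_thought(medicine: str, disease: str) -> str:
    """Chain-of-thought: request step-by-step reasoning before the answer."""
    return zero_shot(medicine, disease) + "\nThink step by step, then give the final answer."


# Illustrative labelled pairs (assumed for the sketch, not project data).
examples = [("metformin", "type 2 diabetes", "Yes"), ("metformin", "asthma", "No")]
prompt = few_shot("insulin", "type 1 diabetes", examples)
```

Each builder returns a string that could be sent to any chat-completion endpoint; comparing the accuracy of the variants on a labelled sample is one way to pick the best prompt.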

Natural Language Processing, Gen AI, Machine Learning, Python, Snowflake

Associate Data Scientist - NLP & Data Science

Nov 2021 – May 2023 · 1 yr 6 mos · Hybrid

🔷 Autocoding Engine – Machine Learning Solution to autocode medical charts and reduce false positives
Built an Autocoding Engine that assigns a probability score (0-1) to each disease term found in medical charts; the score decides whether that disease is passed to a human medical coder for capture.
With this machine learning solution, we reduced false positives by 48% at a 0.05 threshold with no negative impact on the overall NLP Engine.
Saved 499,200 hours of manual medical coding work by eliminating 48% of FPs with the Autocoding Engine!
🔷 Clinical Suspecting using ML – Machine Learning Solution to suspect HCC Conditions for patients
Implemented a machine learning solution to predict HCC Conditions using data from 1 million patients for the client.
Performed feature engineering on patients' historical data, such as procedures, labs, medicines and diagnoses, to generate ~32,000 features for predicting each patient's HCC conditions.
The Clinical Suspecting Engine replaced a rule-based system for predicting HCC Conditions because it outperformed the rule-based system by 26% in F1 score and also provided prediction explanations.
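The false-positive reduction reported for the Autocoding Engine at a 0.05 threshold can be measured along the following lines. This is a minimal sketch on made-up scores and labels; the `fp_reduction` helper is hypothetical and does not reproduce the project's evaluation code or data.

```python
def fp_reduction(scores, labels, threshold=0.05):
    """Return (share of false positives removed, share of true positives kept).

    labels: 1 if the disease term is a genuine finding, 0 if it is a false positive.
    A term is dropped when its score falls below the threshold.
    """
    fp_scores = [s for s, y in zip(scores, labels) if y == 0]
    tp_scores = [s for s, y in zip(scores, labels) if y == 1]
    if not fp_scores or not tp_scores:
        raise ValueError("need at least one example of each label")
    fp_removed = sum(s < threshold for s in fp_scores) / len(fp_scores)
    tp_kept = sum(s >= threshold for s in tp_scores) / len(tp_scores)
    return fp_removed, tp_kept


# Made-up scores and labels, for illustration only.
scores = [0.01, 0.02, 0.30, 0.04, 0.90, 0.60, 0.07, 0.03]
labels = [0, 0, 0, 0, 1, 1, 1, 0]
fp_removed, tp_kept = fp_reduction(scores, labels)
```

Reporting both numbers together shows the trade-off: a threshold is only useful if it removes many false positives while keeping nearly all genuine findings.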

Machine Learning, Natural Language Processing, Python

Think360.ai

Associate Data Scientist

Aug 2020 – Nov 2021 · 1 yr 3 mos · Mumbai, Maharashtra, India · Hybrid

Worked with multiple teams to build end-to-end DS solutions across different domains for clients.
🔷 Customer Reactivation Campaign – Multiple Machine Learning solutions to lower customer churn rate
Worked on a customer reactivation problem for a leading global specialty retailer of health and nutrition products, with the goal of engaging one-time buyers and converting them into regular customers.
Implemented product association with Market Basket Analysis, using Apriori and association rules to find the combinations of products that customers frequently buy together.
Analyzed customer behavior across channels, product categories, discounts, loyalty programs, etc. and performed customer segmentation with K-Means Clustering to create customer segments based on product affinity and channel preference, among other factors.
Increased customer reactivation by 17% and repeat customer basket value by 9% for the customized messaging group compared to the general group.
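The product-association step can be illustrated with a simplified pass that computes support, confidence and lift for product pairs. This is a pair-only sketch in plain Python, not the full Apriori algorithm, and the `pair_rules` helper, the basket contents and the 0.5 minimum support are assumptions made for the example.

```python
from collections import Counter
from itertools import combinations


def pair_rules(baskets, min_support=0.5):
    """Return (lhs, rhs, support, confidence, lift) for frequent product pairs."""
    n = len(baskets)
    # Count each item and each unordered pair once per basket.
    item_counts = Counter(item for b in baskets for item in set(b))
    pair_counts = Counter(
        pair for b in baskets for pair in combinations(sorted(set(b)), 2)
    )
    rules = []
    for (a, b), count in pair_counts.items():
        support = count / n
        if support < min_support:
            continue  # prune infrequent pairs
        for lhs, rhs in ((a, b), (b, a)):
            confidence = count / item_counts[lhs]
            lift = confidence / (item_counts[rhs] / n)
            rules.append((lhs, rhs, support, confidence, lift))
    return rules


# Made-up baskets, for illustration only.
baskets = [
    ["protein", "shaker"],
    ["protein", "shaker", "vitamins"],
    ["protein", "vitamins"],
    ["vitamins"],
]
rules = pair_rules(baskets, min_support=0.5)
```

A lift above 1 suggests the two products are bought together more often than chance would predict, which is the kind of signal a customized messaging campaign can act on.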

Machine Learning, Market Basket Analysis, K-Means Clustering, Python