Karun Thankachan

Data Scientist

New York City, New York, United States · 12 yrs 8 mos experience
AI Enabled · AI ML Practitioner

Key Highlights

  • Created $100M YoY impact through ML products.
  • Authored multiple research papers and secured two patents.
  • Speaker at top universities sharing insights on ML.
Stackforce AI infers this person is a Data Science expert specializing in AI/ML solutions for eCommerce and enterprise applications.

Skills

Core Skills

Machine Learning, Data Science, AI Research, Generative AI, Deep Learning, Recommender Systems, MLOps

Other Skills

Agentic AI Development, Large Language Models (LLM), Natural Language Processing (NLP), Personalization, Amazon Web Services (AWS), Keras, Predictive Modeling, Python (Programming Language), TensorFlow, Forecasting, Apache Spark, Scala, Cloudera, Statistical Modeling, Quantitative Analytics

About

For speaking, judging, or other collaborations, email me at thankachankarun639@gmail.com.

I work on building the next generation of AI and ML systems that define personalized shopping experiences and help make sense of the fast-moving world of AI and machine learning. Over the past decade, I have worked at S&P 50 ecommerce giants (Dell, Amazon, and Walmart), where I have built, scaled, and matured big data systems, applied ML solutions, and agentic AI workflows. My work has resulted in two patents, multiple published research papers, and end-to-end ML products that have created $100M YoY impact.

I have also spoken at universities such as Carnegie Mellon University, Rutgers, and USC, where I share insights on the future of ML and how to build a meaningful career in data science. Follow me to understand the latest research advances in AI/ML, get practical career advice, and build a successful career in tech.

Experience

Total Experience: 12 yrs 8 mos
Average Tenure: 1 yr 8 mos
Current Experience: 9 mos

Gps ai conferences & podcasts

Guest Speaker

Dec 2025 - Present · 5 mos

Generative AI, Deep Learning

Analytics Vidhya

Contributing Writer

Aug 2025 - Present · 9 mos

  • Authored articles and tutorials on Data Science and Machine Learning
  • Speaker on DataHour
Data Science, Machine Learning

IEEE

Senior IEEE Member

Jan 2025 - Present · 1 yr 4 mos

  • Member of IEEE Standards P3579 - Standard for Artificial Intelligence Applied to Time Series
  • Reviewer at IEEE TNNLS (Top 10 AI Journal), IEEE TFS (Top 10 AI Journal), IEEE IOT
  • Senior IEEE Panel Member, 2025
  • Contributor to IEEE Insight
Data Science, Machine Learning, AI Research

Walmart

Senior Data Scientist, eCommerce

Jun 2023 - Present · 2 yrs 11 mos · New Jersey, United States · Hybrid

  • Developing agents and predictive models to improve product selection and availability on Walmart.
Machine Learning, Agentic AI Development, Data Science, Large Language Models (LLM)

Amazon

3 roles

Applied Scientist II

Nov 2022 - Jun 2023 · 7 mos · Arlington, Virginia, United States

  • Led efforts on a recommender system for new-hire content recommendation, information extraction from free text to automate thematic analysis, and the creation of text embeddings to help predict employee career outcomes.
Data Science, Natural Language Processing (NLP), Personalization, Machine Learning, Recommender Systems, Amazon Web Services (AWS), +2

Data Scientist II

Promoted

Jan 2021 - Oct 2022 · 1 yr 9 mos · Arlington, Virginia, United States

  • Led development of large-scale parallelized feature selection, data pre-processing, XAI, and monitoring systems to improve the efficiency and interpretability of downstream career outcome prediction models.
  • Developed a production-grade AWS-based platform (S3, Glue, ECR, ECS, SageMaker Pipelines) for serving online and batch machine learning solutions built for FinTech/Accounting teams within Amazon.
Data Science, Python (Programming Language), Machine Learning, TensorFlow, MLOps, Forecasting, +2

Data Science Intern

May 2020 - Aug 2020 · 3 mos · Seattle, Washington, United States

  • Experimented with ETS, ARIMA, DeepAR, and Prophet to forecast transaction volumes of Ring products, allowing the business to work proactively with payment vendors. Achieved a MAPE of 9% with the SARIMA model.
  • Developed a Python ETL pipeline hosted on AWS Lambda, S3, and DynamoDB to automate the month-end close process for the Accounting team, saving 213 manual hours annually.
  • Secured 1st place in the Machine Learning Academy Natural Language Processing competition, 2020, for "Product Safety Classification".
  • Conducted a three-day session on Machine Learning for Natural Language Processing for 40+ Amazonians via the Machine Learning Academy.
Data Science, Python (Programming Language), Machine Learning, Apache Spark, Amazon Web Services (AWS), Scala

Towards Data Science

Contributing Writer

Jun 2022 - Dec 2022 · 6 mos

  • Authored articles and tutorials on Data Science and Machine Learning
Data Science, Python (Programming Language), Machine Learning, MLOps, Forecasting, Predictive Modeling

Carnegie Mellon University

3 roles

ML for Large Datasets Teaching Assistant

Aug 2020 - Dec 2020 · 4 mos · Pittsburgh, Pennsylvania, United States

Data Science, Python (Programming Language), Natural Language Processing (NLP), Machine Learning, Predictive Modeling

Data Science for PMs, Teaching Assistant

Mar 2020 - May 2020 · 2 mos

Data Science, Python (Programming Language), Machine Learning, MLOps, Forecasting, Predictive Modeling

Research Assistant

Dec 2019 - May 2020 · 5 mos

  • Worked in the Technology for Effective and Efficient Learning (TEEL) Lab at CMU, which focuses on research in learning methods; contributed to curriculum development and automated testing for data exploration and machine learning basics.
Data Science, Python (Programming Language), Natural Language Processing (NLP), Machine Learning, Predictive Modeling

Dell Technologies

3 roles

Software Development Engineer II (Data Science)

Jul 2017 - Jul 2019 · 2 yrs

  • Promoted to architect and lead the development of minimum viable data science solutions across Dell Order Experience teams on the Dell Data Reservoir (140-node Hadoop cluster).
  • Mapped application logs to a finite state machine and used Bayesian networks to identify the root cause of application errors (accuracy ~85%) and recommend solutions, reducing issue resolution time by 50% (Patent Approved).
  • Collaborated with delivery, sales, and analytics divisions to develop a multivariate linear regression model that predicted the lead time for stock items, improving the accuracy of the Estimated Delivery Date (EDD) by ~37%.
  • Implemented an ensemble model (SVM, XGBoost, and Logistic Regression) to predict (accuracy ~92%) and proactively act on issues in order processing, improving customer experience (Patent Approved).
Data Science, Python (Programming Language), Natural Language Processing (NLP), Machine Learning, Predictive Modeling

Software Development Engineer I (Data Engineering & Analytics)

Promoted

Jul 2015 - Jun 2017 · 1 yr 11 mos

  • Built a 6-node Hadoop cluster to analyze near real-time data and visualize it in Tableau, yielding actionable insights on bottlenecks that reduced turnaround time on issues by 33%.
  • Led the development of an alerting framework based on statistical process control rules (accuracy ~85%) to proactively identify processing anomalies and prioritize them by business impact.
  • Implemented a multivariate linear regression model (accuracy ~90%) to predict the time an order should spend in a processing stage, which helped prioritize orders for resolution based on the perceived business impact of delay.
Data Science, Python (Programming Language), Natural Language Processing (NLP), Machine Learning, MLOps, Scala, +2

Software Development Intern

May 2014 - Jul 2014 · 2 mos · Bangalore

  • Built a parser and sentiment analyzer (accuracy ~78%) to extract consumer comments from Dell Twitter Feed and identify the sentiment towards Dell products and the delivery experience
  • Developed an interactive dashboard to analyze consumer opinions, enabling senior executives to gauge real-time sentiment during the release of bitcoin as a payment option on Dell.com (2014).
Python (Programming Language)

National Institute of Technology Calicut

Senior Executive @ Computer Science and Engineering Association

Jul 2012 - Apr 2015 · 2 yrs 9 mos · Calicut

  • Developed the website for FOSSMeet 2012, the fourth-largest open-source get-together in India.
  • Developed and maintained websites for the Computer Science and Engineering Department (CSE Alumni Website, CSEA Website).
  • Helped select and mentor junior executives of CSEA (fields include web design and development, and competitive coding).
Data Science, Python (Programming Language), Machine Learning, Predictive Modeling, Cloudera

Education

Carnegie Mellon University

Masters in Computational Data Science — Data Science

National Institute of Technology Calicut

Bachelor of Technology (B.Tech.) — Computer Science and Engineering
