Joe Karuthedath

AI Researcher

Atlanta, Georgia, United States · 5 yrs 2 mos experience
Most Likely To Switch · AI ML Practitioner

Key Highlights

  • 7x Databricks Certified professional
  • Led enterprise AI adoption at NTT DATA
  • Expertise in scalable AI/ML systems design
Stackforce AI infers this person is a Data Scientist specializing in AI/ML solutions for enterprise applications.

Skills

Core Skills

Machine Learning · Databricks · Data Engineering · MLOps · Data Science · Data Analysis

Other Skills

Artificial Intelligence (AI) · Statistics · PySpark · Python · Large Language Models (LLM) · A/B Testing · Hypothesis Testing · Linear Regression · MLflow · SQL · Continuous Integration and Continuous Delivery (CI/CD) · Causal Inference · Terraform · Machine Learning Algorithms · Large Language Model Operations (LLMOps)

About

At NTT DATA, I help organizations design and deliver enterprise AI, machine learning, and modern data architecture solutions. My work sits at the intersection of ML engineering, AI engineering, MLOps, data engineering, and solution architecture. I enjoy taking ideas from concept to production: building scalable AI/ML systems, architecting robust data platforms, deploying production-grade models, and creating governed, reliable foundations for analytics and enterprise AI.
I specialize in Databricks, AWS, and cloud-native ecosystems, with hands-on experience across Databricks Lakehouse, MLflow, Unity Catalog, PySpark, Spark SQL, dbt, CI/CD, infrastructure automation, and model monitoring. Across client engagements, I’ve worked on agentic AI, predictive modeling, simulation, experimentation and hypothesis testing, large-scale data migration, and AI-enabled accelerators.
I lead the AI/ML focus area within the Databricks CoE at NTT DATA and serve as a resident solutions architect, helping define architecture patterns, guide delivery, and accelerate enterprise AI adoption across the organization. I’m a 7x Databricks Certified professional, having completed all seven Databricks certifications, and I’ve also been recognized as a Databricks Champion.
I’m especially interested in the intersection of agentic AI, RAG, ML, AIOps, MLOps, big data engineering, and generative AI, and in helping enterprises move from experimentation to production with systems that are scalable, secure, and aligned to business value. What matters most to me is delivering real value with data, AI, and ML: not just building technically strong solutions, but creating systems that improve decision-making, scale effectively, and drive meaningful business outcomes.
My background includes an MS in Business Analytics from Emory University’s Goizueta Business School, along with an undergraduate degree in Mechanical Engineering. That combination has shaped the way I approach problems, with a mix of analytical thinking, engineering discipline, and business context. If you leverage data and AI to deliver business value, I would love to connect!

Experience

Total Experience: 5 yrs 2 mos
Average Tenure: 1 yr 8 mos
Current Experience: 3 yrs

NTT DATA North America

2 roles

Data Scientist Senior Consultant

Jan 2024 - Present · 2 yrs 4 mos · Remote

  • Chick-fil-A (client):
  • Engineered a risk model for Chick-fil-A products.
  • Deployed the model using MLflow on Databricks.
  • Designed and implemented a scalable data pipeline using dbt and Databricks to process microbial measurement data from lab tables, ensuring data quality through cleaning, validation, and outlier removal.
  • Simulated scenarios using Monte Carlo simulation to assess risks in transmission.
  • Developed an interactive visualization using ActiveGraf.
  • Primrose (client):
  • Developed and implemented a scalable data ingestion pipeline in Databricks to extract and integrate customer feedback from an API for Net Promoter Score (NPS) calculation, franchisee management, and Procare data.
  • Leveraged Spark SQL and PySpark for data transformation, cleansing, and processing, ensuring efficient handling and analysis of large datasets in a distributed environment.
  • Orchestrated data workflows using Databricks Jobs and Delta Lake, delivering actionable insights to stakeholders.
Machine Learning · Databricks · Data Engineering · Artificial Intelligence (AI) · Statistics · PySpark · +19

Data Scientist Consultant

May 2023 - Jan 2024 · 8 mos · Remote

  • Cencora (client)
  • Led the development of a machine learning solution to automate data mapping for a Fortune 10 pharmaceutical client, significantly enhancing efficiency. My work focused on creating a model ensemble to predict database mappings, leveraging advanced feature engineering and metadata insights.
  • Bridged technical solutions with business goals, engaging with VP-level stakeholders through effective communication and data storytelling. The solution was deployed via Azure Databricks DBFS, adhering to MLOps principles for scalability and reusability. The project used Azure Databricks, PySpark, Spark ML, Azure Data Factory, and Hive.
Machine Learning · MLflow · Statistical Modeling · Data Build Tool (DBT) · PySpark · Data Modeling · +12

CoStar Group

Data Scientist

Jun 2022 - Mar 2023 · 9 mos · Atlanta, Georgia, United States

  • As a Data Scientist at CoStar Group, my responsibilities and representative projects were as follows:
  • Traffic prediction for Apartments.com, 2023. Used fb-prophet, which gave better results than ARIMA and SARIMA time series forecasting models.
  • Built propensity models to predict churn, used by customer service to identify landlords at high risk of churn. Deployed the model on an on-premises server.
  • Time series forecasts for rental tools (the property management software of Apartments.com): revenue, number of applications received, leases completed, and premium packages purchased. The model was deployed on an on-premises Linux server and scheduled to run every month using a cron job; results were sent back to update a table in an on-premises SQL Server database and visualized in a Power BI dashboard.
  • Analytics lead for the Rental-Tool product. Defined, measured, and visualized KPIs on Power BI dashboards for monitoring product performance, e.g., customer churn rate.
  • A/B testing experimentation to improve Search Engine Optimization (SEO) for Apartments.com.
  • Landlord journey analysis: used BigQuery to identify the most common paths taken by landlords publishing listings. Identified funnels and drop-off points and communicated them to product teams.
  • Built a dynamic price filter for Apartments.com. Instead of using the same price filters for all 210 designated market areas in the US, dynamic price filters were developed based on current rent in each specific market. Python and SQL were used to automate the process, which can now be repeated with little effort when rents change in the future.
  • Built a statistics-based algorithm to detect anomalies in data, along with an alert system that informed stakeholders. Deployed the service on Linux, scheduled to run daily using cron. The service is general purpose and can be used with multiple data structures.
  • Implemented Google Analytics tags on product pages to track user behavior via Google Universal Analytics and Google Analytics 4.
Data Modeling · Project Planning · Machine Learning · Microsoft SQL Server · Python · A/B Testing · +4

FedEx

Data Scientist, Student Consultant

Jan 2022 - May 2022 · 4 mos · Atlanta, Georgia, United States

  • Extracted business insights through descriptive analytics on package data and translated the business problem into a data problem. Investigated the background and set up milestones for the project. Accounted for dynamic variables such as network traffic, congestion points, and routes taken.
  • Working with big data: performed data cleaning and exploratory data analysis on terabytes of data using cloud computing, PySpark, Spark SQL, and DuckDB.
  • Data storytelling: conducted weekly meetings with key stakeholders to communicate data-driven findings organized in Tableau, and continuously optimized the model in an agile fashion to develop insights on improving on-time service.
Pattern Recognition

Lanware Solutions

Data Scientist Intern

Mar 2021 - Jun 2021 · 3 mos · Kochi, Kerala, India

  • Conducted data cleaning, visualization, and model building on sample data sets under the guidance of a project manager, using Python, scikit-learn, TensorFlow, and Matplotlib.

Byju's

Business Development Trainee

Dec 2020 - Feb 2021 · 2 mos · Bangalore Urban, Karnataka, India

  • Managed and updated data acquired through multiple online and offline campaigns to drive targeted customer acquisition.
SQL · Python (Programming Language) · Microsoft Power BI · Data Analysis

Zeekoi Technologies Private Limited

Data Analyst

Jun 2019 - Nov 2020 · 1 yr 5 mos · Kollam, Kerala, India · On-site

  • Analyzed large datasets to identify trends and insights using SQL, Python
  • Created data visualizations and dashboards in production using Power BI to present insights to stakeholders
  • Designed and developed data pipelines and ETL processes to ensure data accuracy and completeness
  • Conducted ad-hoc analyses and data mining to support business decisions

Education

Emory University - Goizueta Business School

Master of Science - MS — Business Analytics

Rajagiri School of Engineering & Technology

Bachelor of Technology - BTech — Mechanical Engineering

Jyothi Nivas Public School - India
