Soham Desai

Software Engineer

San Francisco, California, United States · 3 yrs experience
AI Enabled · AI ML Practitioner

Key Highlights

  • Expert in AI-powered data solutions and analytics.
  • Proven track record in predictive modeling and data visualization.
  • Strong cross-functional collaboration and leadership skills.
Stackforce AI infers this person is a Data Scientist with expertise in AI, Machine Learning, and Data Engineering across SaaS and EdTech industries.

Skills

Core Skills

Data Science · Machine Learning · Data Engineering · Data Analysis · Business Intelligence · Data Visualization · Predictive Modeling · AI Development · Database Management · Data Analytics · Engineering · Computer Vision · Business Analytics

Other Skills

Statistical Data Analysis · Python (Programming Language) · Microsoft Azure Machine Learning · Cloud Computing · Time Series Forecasting · Power BI · Apache Kafka · Anomaly Detection · Apache Spark · PySpark · Databricks · Kafka · Statistical Analysis · Communication · Hypothesis Testing

About

Building AI-powered data solutions that turn messy, high-volume information streams into clear, profitable business wins. I blend machine learning, real-time data engineering, and product analytics to help B2B and B2C teams spot opportunities faster, personalize experiences deeper, and deploy resources smarter.

Experienced in generative AI (LLMs, RAG), predictive modeling, SQL/Python, and analytics, with agile, cross-functional execution: turning data in Databricks, Kafka, and Snowflake into actionable insights and solutions across Cloud, SaaS, CPG, and EdTech.

If you’re building data-intensive products, I’m always up for a chat. Explore more of my work here: https://soham-desai-portfolio-vision.lovable.app/

Skills:
  • Tech Stack: Python, SQL, R, PySpark, Databricks, Kafka, Snowflake, Microsoft Azure, AWS, GCP, LLMs, RAG.
  • Analytics & ML: Predictive modeling, anomaly detection, clustering, A/B & hypothesis testing, statistical inference.
  • Visualization & BI: Tableau, Power BI, Excel (Pivot Tables, Power Query).
  • Product & Process: Agile delivery, stakeholder management, roadmap prioritization, data storytelling, cross-functional leadership.

Experience

Total Experience: 3 yrs
Average Tenure: 1 yr
Current Experience: 6 mos

Meta

Software Engineer

Dec 2025 – Present · 6 mos · San Francisco, California, United States

Microsoft

Data Scientist (Capstone)

Oct 2024 – Jun 2025 · 8 mos · Seattle, Washington, United States · Remote

  • Data Scientist | Azure Cost & BI Platform (https://azure-insight-collective.lovable.app/)
  • Engineered a real-time anomaly detection pipeline using PySpark, Databricks, and Kafka to monitor cloud resource KPIs (e.g., cost/hour, CPU usage), flagging underutilized or misconfigured resources early, preventing sudden cost spikes.
  • Conducted statistical analysis and thematic clustering on a 67-participant user survey to quantify key pain points (e.g., tool fragmentation), shaping prioritization for a platform redesign, and improving user experience.
  • Built a unified Business Intelligence dashboard aggregating cost, savings, and sustainability KPIs, which improved insight-to-action time by 40% and CSAT by 35% among pilot users.
Statistical Data Analysis · Python (Programming Language) · Microsoft Azure Machine Learning · Cloud Computing · Time Series Forecasting · Data Visualization (+7 more)

Beats by Dre

Data Analyst Extern

Jul 2024 – Aug 2024 · 1 mo · Seattle, Washington, United States · Remote

  • Data Scientist | Consumer Insights, CPG
  • Applied K-Means clustering to psychographic and usage data from 4,000+ Gen Z survey responses to segment customers into personas, enabling personalized speaker campaigns via social media, projected to boost customer engagement by 20%.
  • Developed a logistic regression model (AUC = 0.81) and ran hypothesis tests on survey responses to identify loyalty drivers; found that higher sound and battery ratings increased repurchase odds by 20-25%, informing Beats’ customer retention strategy.
  • Presented insights to marketing leadership using interactive Tableau dashboards that highlighted high lifetime-value (LTV) customers and higher loyalty among premium buyers (22%), supporting a plan to scale revenue by focusing on these user segments.
Data Analysis · Data Visualization · Communication · Hypothesis Testing · k-means clustering · Audience Segmentation (+6 more)

Radical AI

Machine Learning Intern

Jun 2024 – Aug 2024 · 2 mos · Seattle, Washington, United States · Hybrid

  • Developed and deployed an AI-driven quiz tool using Gemini LLMs with Google Cloud, Streamlit, Vertex AI, and LangChain to generate dynamic quizzes with real-time feedback, improving user learning outcomes by 30%.
  • Devised a data ingestion pipeline with ChromaDB and PyPDFLoader to automate PDF processing and text chunking, boosting data transformation accuracy by 15%.
  • Built a RAG-based chatbot by integrating LLMs with Streamlit and engineering custom NLP pipelines, increasing user satisfaction by 15% through personalized, contextual responses (Prompt Engineering).
Statistical Data Analysis · Generative AI · Large Language Models (LLM) · Artificial Intelligence (AI) · Retrieval-Augmented Generation (RAG) · Machine Learning (+1 more)

Six Ladders

Data Analyst

Mar 2023 – Sep 2023 · 6 mos · Mumbai Metropolitan Region · On-site

  • Data Scientist | EdTech
  • Analyzed interview data to identify skill gaps in undergraduate students and collaborated cross-functionally with Technical and Marketing teams to launch new courses (Soft Skills, Resume Building), driving a 20% increase in revenue in 6 months.
  • Built a predictive-modeling recommender using XGBoost on post-launch student data to predict course completion likelihood, which led to personalized course paths for each student, boosting employability by 17%.
  • Designed and executed an A/B test to validate the hypothesis that adding social proof (e.g., star ratings) to course cards would boost engagement; confirmed a statistically significant 12% lift in 'Enroll' click-through rate, leading to a site-wide rollout.
  • Reduced a critical student-facing page load time by 60% (12s to 4s) by optimizing SQL queries and normalizing the database schema covering 300k+ student records; decreased related support tickets by 70% during peak periods.
Cross-functional Collaborations · Predictive Modeling · Python (Programming Language) · A/B Testing · Statistical Data Analysis · XGBoost (+12 more)

Getparking

Machine Learning Intern

Jan 2023 – Mar 2023 · 2 mos · Mumbai, Maharashtra, India · Hybrid

  • Constructed an object detection model using the YOLOv5 architecture for a parking lot management system, enhancing the portal’s overall capabilities by adding 3 new functionalities.
  • Scraped 37 GB of training data for the project, consisting of pictures of parking lots across all weather conditions, to better train the model.
  • Orchestrated the model training process, tuning hyperparameters and incorporating data augmentation techniques such as blur, contrast and brightness modification, and rotation, to attain an accuracy of 92%.
Image Processing · Datasets · TensorFlow · Deep Learning · PyTorch · Object Detection (+10 more)

Cavis

Data Analyst Intern

Jun 2022 – Nov 2022 · 5 mos · Mumbai, Maharashtra, India

  • Performed feature engineering on data from a portfolio of 15+ cloud kitchens (order volume, delivery times, food costs) to train a random forest model (R² = 0.67) that predicted a composite performance score and identified 3 “happy-but-underperforming” kitchens; prompted a strategic pivot to optimize their operational workflows, increasing order capacity by 25%.
  • Led a 4-intern team to build an NLP pipeline that classifies 20K+ kitchen reviews; the fine-tuned BERTweet model flagged high-risk feedback with 81% accuracy (F1: 0.79), cutting manual review effort by 40%.
  • Developed a SQL-based data validation and anomaly detection pipeline to clean and normalize 50k+ kitchen inventory logs; flagged input errors and abnormal stock activity, reducing stock discrepancies by 19% across 3 kitchens.
Data Validation · TensorFlow · PyTorch · SQL · Data Warehousing · Data Pipelines (+11 more)

Dj init.ai

Social Media and Analytics Head

May 2022 – May 2023 · 1 yr · Mumbai, Maharashtra, India · On-site

  • Administered InIT Hackathon 2023, coordinating with a team of 50 members.
  • Mentored 25 juniors on machine learning projects by conducting lectures on trending topics.
Image Processing · Machine Learning · Datasets · Data Analysis · TensorFlow · Keras (+10 more)

Ambani Organics Private Limited - India

Data Analyst Intern

Feb 2022 – May 2022 · 3 mos · Mumbai, Maharashtra, India · Hybrid

  • Centralized fragmented departmental data into a Microsoft SQL Server backend via Power Query ETL pipelines and Excel (XLOOKUP, Pivot Tables), streamlining monthly reporting and reducing manual processing by 3+ hours.
  • Drove a 15% increase in sales by identifying high-margin product bundles through market basket analysis on 2 years of transactional data and aligning their promotion with peak seasonal demand using time series forecasting (ARIMA).
Market Basket Analysis · SQL Server Integration Services (SSIS) · Microsoft Office · Business Analytics · Data Visualization · Extract, Transform, Load (ETL) (+5 more)

DJS Karting India

ESI Head

Mar 2021 – May 2023 · 2 yrs 2 mos · Mumbai, Maharashtra, India

  • Designed, procured, and managed the kart’s battery pack, battery management system, and motor controller.
  • Collected data to tune the gear ratio for optimal track speed.
  • Displayed kart velocity, battery percentage, and laps remaining using an Arduino Uno and magnetic reed switches.
  • Drafted comprehensive cost and design reports to support innovative enhancements.
Data Analytics · Batteries · Microsoft Office · Strategic Thinking · Battery Management Systems · Problem Solving (+2 more)

Education

University of Washington

Master of Science - MS

Sep 2023 – Jun 2025

DJSCE

Bachelor of Technology - BTech — Information Technology

Jan 2019 – Jan 2023

Pace Junior Science College

Apr 2017 – Mar 2019

CNM School - India

ICSE

Apr 2017 – Present
