Sudipta Singh

Data Engineer

San Jose, California, United States · 6 yrs experience

Most Likely To Switch: Highly Stable

Key Highlights

Built scalable ETL pipelines reducing SLA failures by 40%.
Developed ML models achieving 93% accuracy in churn prediction.
Engineered real-time data pipelines optimizing data workflows.

Stackforce AI infers this person is a Data Engineering and Data Science professional with expertise in SaaS environments.

Contact

sudipta15103056@gmail.com LinkedIn

Skills

Core Skills

Data Engineering, Cloud Computing, Data Analytics, Machine Learning, Data Science

Other Skills

A/B Testing, AWS, Airflow, Algorithms, Analytical Skills, Badminton, Big Data, Blockchain, Business Analysis, C, C++, Creative Problem Solving, Data Analysis, Data Mining

About

Strategic and solutions-focused Data Engineer with 5+ years of experience architecting secure, scalable, and resilient data platforms in enterprise environments. Proven ability to design and optimize end-to-end data workflows that enable governed self-service analytics, real-time processing, and ML integration. Skilled at translating complex business needs into robust data architectures that drive actionable insights and ensure compliance. Passionate about empowering cross-functional teams with clean, trusted data and delivering measurable outcomes through data storytelling, automation, and process efficiency. I bridge the gap between raw data and business intelligence with a deep understanding of data architecture and a strong analytical mindset, and I am committed to streamlining processes that drive business growth through analytics.

Experience

Total Experience: 6 yrs

Average Tenure: 3 yrs

Current Experience: 3 yrs 11 mos

Grainger

Data Engineer II, Grainger Data & Analytics (GDA)

Jul 2022 – Present · 3 yrs 11 mos · Chicago, Illinois, United States · Remote

Built secure, scalable ETL pipelines using Airflow, Snowflake, and AWS, reducing SLA failures by 40% and enabling self-service analytics for high-impact data domains
Developed Streamlit in Snowflake (SiS) apps powered by Cortex Analyst, enabling 100+ business users to interact with ML forecasts via natural language
Developed reusable Terraform modules to provision and manage infrastructure across Grainger’s data lake and analytics platform, including S3 buckets, IAM roles, Snowflake objects, secrets, and CI/CD pipelines, reducing environment setup time by 80% and enforcing consistency across dev, test, and prod
Implemented observability tooling by combining Snowflake query history, Vault audit logs, and GitHub telemetry, increasing platform reliability and reducing root-cause diagnosis time
Collaborated with cross-functional teams to integrate ML models into production pipelines and led onboarding and mentorship for new hires and interns, fostering consistent engineering practices and knowledge sharing

ETL pipelines, Airflow, Snowflake, AWS, Terraform, observability tooling

Meijer

Analytics Student Consultant

Jan 2022 – May 2022 · 4 mos · United States

Analyzed purchase history to establish customer segments using K-means clustering and created a Tableau dashboard with results and intervention strategies to increase engagement and user retention
Created a churn prediction ML model that flags at-risk customers with 93% accuracy, helping focus marketing on valuable customers
Selected to present a poster at the 2022 INFORMS Business Analytics Conference (April 2022, Houston, TX)

K-means clustering, Tableau, ML model development, Data Analytics, Machine Learning

Octro Inc.

Data Science Engineer

Jun 2019 – Jul 2021 · 2 yrs 1 mo · Noida

Engineered scalable algorithms to analyze customer behavior patterns and orchestrated a real-time data pipeline using Kafka, Hadoop, Spark, and HBase; reduced job execution time by 80% by optimizing data ingestion and feature learning workflows
Designed dashboards to derive insights, and facilitated and assessed communications to increase user engagement
Led A/B testing for campaign optimization across apps with 50M+ users, evaluating impact on ROI, revenue, retention, and game time; achieved up to 5% increase in user retention for select campaigns
Built a fraud detection algorithm to monitor daily promotional balance transfers across 2.5M users, reducing fraudulent activities by 40%
Applied K-means clustering on 1M+ non-depositors daily to segment users and enable hyper-targeted promotional strategies based on behavioral profiling

Kafka, Hadoop, Spark, HBase, A/B testing, fraud detection