Sarthak Madan

Data Engineer

Delhi, India · 3 yrs 8 mos experience
Most Likely To Switch · Highly Stable

Key Highlights

  • Reduced ETL runtimes by 90% using PySpark.
  • Implemented cost-saving strategies, lowering cloud spend by 25%.
  • Enhanced data quality, preventing 40% of downstream issues.
Stackforce AI infers this person is a Data Engineer specializing in cloud-native data pipeline development.

Skills

Core Skills

Apache Airflow · AWS · Data Quality Engineering · PySpark · ETL · Data Analysis · Data Engineering

Other Skills

AWS Step Functions · Amazon EC2 · Amazon Elastic MapReduce (EMR) · Amazon Redshift · Amazon S3 · Amazon Web Services (AWS) · Analytical Skills · ArcGIS · AWS S3 · Cloud Applications · Databases · ELT · Extract, Transform, Load (ETL) · Geographic Information Systems (GIS) · Git

About

Data Engineer with 3+ years of experience building scalable, cloud-native data pipelines using PySpark, Airflow, and AWS. I specialize in transforming legacy ETL workflows into automated, high-performance architectures that reliably process datasets of 100M+ records with minimal latency.

I’ve delivered major efficiency wins: cutting ETL runtimes by 90%, reducing cloud spend through EMR/S3 optimization, and improving data quality with validation frameworks that prevent 35–40% of downstream issues.

My strengths include workflow orchestration, cost-efficient cloud design, performance tuning, data quality engineering, and end-to-end pipeline ownership. I enjoy solving complex data problems, modernizing systems, and building reliable pipelines that scale.

Experience

Precisely

4 roles

Data Engineer 2

Promoted

Jul 2025 – Present · 8 mos

  • Automated deployment of Airflow DAGs using CI/CD, reducing release cycles from weeks to days.
  • Implemented S3 lifecycle policies and optimized Parquet storage, lowering monthly cloud costs by 25%.
  • Built and maintained a 120M+ record data product using Python, SQL, Airflow, S3, and Redshift.
  • Added data validation and monitoring workflows, reducing data issues by 40%.
  • Migrated orchestration from AWS Step Functions → Airflow, improving workflow speed and reliability by 20%.
  • Performed AWS cost analysis across EMR and S3, identifying idle compute and reducing cloud spend by 18%.
Amazon Redshift · Extract, Transform, Load (ETL) · Git · AWS S3 · Apache Airflow · Data Analysis · +9 more

Data Engineer 1

Promoted

Jul 2024 – Jul 2025 · 1 yr

  • Built & optimized PySpark ETL pipelines on AWS EMR, reducing runtime by 90%.
  • Modernized legacy ETL into modular Python pipelines and Airflow DAGs, cutting manual effort by 30%.
  • Migrated workflows from AWS Data Pipeline → Step Functions, improving orchestration reliability.
  • Designed Tableau dashboards on Redshift for real-time data quality insights.
  • Improved query performance via partitioning, clustering, and SQL tuning across multiple datasets.
Microsoft Office · Microsoft Excel · ArcGIS · Amazon Web Services (AWS) · Linux · SQL · +25 more

Associate Software Engineer

Jul 2022 – Jul 2024 · 2 yrs

  • Built Spark-based auditing tools, reducing audit time by 75%.
  • Optimized ETL for 150M+ records, improving transformation performance by 80%.
  • Reengineered Redshift SQL with Python integration, reducing processing time by 30%.
  • Implemented automated data quality rules (null checks, duplicates, schema validation), reducing issues by 35%.
Amazon Web Services (AWS) · Linux · SQL · Python (Programming Language) · ELT · Databases · +11 more

Associate Software Engineer Intern

Feb 2022 – Jun 2022 · 4 mos

National Centre for Medium Range Weather Forecasting (NCMRWF)

Summer Internship

Jul 2021 – Aug 2021 · 1 mo · Noida, Uttar Pradesh, India

Indian School of Business

Intern

May 2021 – Jun 2021 · 1 mo

Decathlon Sports India

Summer Intern

Jun 2018 – Jul 2018 · 1 mo · New Delhi Area, India

Education

TERI School of Advanced Studies

MSc Geoinformatics

Shivaji College, Delhi University

Bachelor of Arts - BA — Geography

Jan 2017 – Jan 2020
