Sudhanshu Srivastava

Data Engineer

Gurugram, Haryana, India · 5 yrs 4 mos experience

Key Highlights

Proven expertise in designing scalable data architectures.
Strong background in data modeling and ETL processes.
Experience with cutting-edge technologies like AWS and Azure.

Stackforce AI infers this person is a Data Engineering expert in SaaS environments, specializing in scalable data architectures and machine learning.

Contact

sudhanshusrivastavashanu@gmail.com LinkedIn

Skills

Core Skills

Data Engineering, Machine Learning, Data Architecture, Data Warehousing, ETL

Other Skills

AWS, AWS SageMaker, Amazon Athena, Amazon S3, Amazon Web Services (AWS), Azure Data Factory, Big Data, C++, Core Java, Data Analysis, Data Analytics, Data Ingestion, Data Maintenance, Data Mart, Data Mining

About

I’m a highly skilled and experienced data engineer with a passion for data and a strong background in computer science. I have demonstrated proficiency in designing, implementing, and maintaining robust data pipelines and infrastructure, and I have a deep understanding of data modeling, ETL (Extract, Transform, Load) processes, and data warehousing techniques. I have expertise in technologies and tools such as Spark, Hadoop, SQL, and Python, and in cloud platforms such as AWS and Azure. I have a proven track record of designing scalable and efficient data architectures, ensuring data quality and integrity, and optimizing data processing workflows.

Experience

Publicis Sapient

Senior Associate Data Engineering L1

Aug 2025 – Present · 7 mos · Noida, Uttar Pradesh, India · Hybrid

Gartner

2 roles

Software Engineer - ML Engineer

Jan 2025 – Aug 2025 · 7 mos · Gurugram, Haryana, India

Designed scalable data pipelines, reducing model retraining time.
Conducted feature engineering and data cleansing on datasets with millions of records.
Integrated MLflow for model versioning and experiment tracking, improving reproducibility and team collaboration.
Built XGBoost predictive models, improving customer churn prediction by 22%.
Automated model deployment using Docker and AWS SageMaker.
Collaborated with cross-functional teams (data scientists, engineers, product managers) in an Agile environment.

Python, SQL, AWS SageMaker, NumPy, PySpark, XGBoost, +4

Software Engineer - Data Engineer

Nov 2022 – Dec 2024 · 2 yrs 1 mo · Gurugram, Haryana, India

Implemented and maintained a data architecture built around automated ingestion, data security, compliance, and governance.
Designed the infrastructure required to optimise extraction and transformation of data using AWS, improving loads by 62%.
Designed and implemented a job monitoring package that lets users track the ETL job statuses (run, success, fail) of EMR and batch pipelines.
Optimised and migrated the infrastructure from Spark v2 to v3, reducing runtime by 50-60% and monthly cost by ~80%.
Designed and engineered multiple data models and ETL pipelines to extract data from APIs and transform it to match the specified data model.

Python, PySpark, Big Data, Data Modeling, Spark Optimization, Amazon Web Services (AWS), +3

Brillio

4 roles

Senior Data Engineer

Promoted

Apr 2022 – Oct 2022 · 6 mos

Designed and implemented a generic, optimized ingestion pipeline to load 100 million+ sales and customer records.
Worked closely with the Architecture team to transition the old database model into a Data Warehouse, Data Mart, and Reporting structure.

Microsoft Power BI, Microsoft SQL Server, Synapse, Data Mart, PySpark, Data Engineering, +1

Data Engineer

Sep 2020 – Apr 2022 · 1 yr 7 mos

Automated the manual data reconciliation process between source, staging, and target, increasing data accuracy and efficiency by 95%.
Worked on KPI analysis of sales data and presented the results to senior management and stakeholders.
Created an ETL error log that stores errors, so that if a mapping fails, it sends email notifications with the errors attached.

Amazon S3, Amazon Athena, Snowflake, Informatica Cloud, Data Engineering, ETL

Data Engineer Trainee

Jul 2020 – Aug 2020 · 1 mo

Developed a real-time data pipeline to process data using Azure Data Factory (ADF) and store the processed data in SQL Server.
Created Power BI reports and published them to the application.

Microsoft Power BI, Azure Data Factory, Microsoft SQL Server