Shivam Pandey

Data Engineer

India · 4 yrs 4 mos experience

Key Highlights

  • Expert in building scalable ETL workflows.
  • Proficient in resolving complex data quality issues.
  • Strong foundation in cloud technologies and data processing.
Stackforce AI infers this person is a Data Engineer specializing in scalable ETL workflows and cloud-based data solutions.

Skills

Core Skills

Data Engineering, PySpark

Other Skills

Python (Programming Language), SQL, Data Architects, Data Science, Cloud azure, Amazon Web Services (AWS), PostgreSQL, Data Maintenance, AWS, Dataframes and Spark SQL, Azure Databricks, Azure Data Lake, Data Build Tool (DBT), Snowflake, Azure Data Factory

About

Passionate about turning raw data into actionable insights and always ready to solve complex data challenges that create real business impact.

Data Engineer with hands-on experience designing and optimizing data pipelines and architectures that turn complex datasets into meaningful insights. I work across the full data lifecycle, from ingestion and cleaning to modeling, warehousing, and delivery, ensuring that data is well-structured, accurate, and ready for analysis.

With a strong foundation in data processing, transformation, and orchestration, I specialize in building scalable ETL workflows and solving data quality issues such as inconsistencies, missing values, and schema mismatches. My work helps ensure that analytics teams receive clean, analysis-ready data at the right time and scale.

Key Skills & Expertise:
  • Data Engineering: Experienced in end-to-end pipeline development, from business requirement analysis to implementation, with a focus on efficiency, reliability, and reusability.
  • Apache Spark: Skilled in building and optimizing Spark jobs for large-scale distributed data processing, reducing cost and improving runtime performance.
  • PySpark & Spark SQL: Strong command over large-scale data manipulation and transformation using PySpark and Spark SQL.
  • Cloud Technologies: Azure (Databricks, Data Factory, ADLS Gen2, Synapse Analytics); AWS (S3, RDS, Redshift, EMR, EC2, Lambda); GCP (BigQuery, Cloud Storage (GCS), Dataflow (Apache Beam), Pub/Sub, Composer (Airflow)).
  • Automation & Orchestration: Proficient in designing automated workflows and managing data pipelines using tools like Azkaban, Tidal, Argo, and ADF.

Interpersonal Skills: Effective in time management, team collaboration, mentoring, communication, and handling end-to-end project ownership across cross-functional teams.

Career Highlights:
  • Industry experience in the Supply Chain domain through my work at Körber Supply Chain.
  • Skilled at resolving complex data quality issues and ensuring data consistency across systems.
  • Built and maintained high-performing pipelines for structured and semi-structured data at scale.
  • Contributed to team-wide initiatives focused on data governance, cost optimization, and reliability.

Outside of work, I actively build hobby-level data engineering projects to experiment with new technologies and stay aligned with evolving industry trends. I’m always exploring better ways to handle data using tools like Python, SQL, Apache Spark, and modern cloud platforms.

Experience

Total Experience: 4 yrs 4 mos
Average Tenure: 1 yr
Current Experience: 3 mos

Exl

Assistant Manager - Data Engineering

Mar 2026 - Present · 3 mos · Gurugram, Haryana, India · Hybrid

Data Engineering, Python (Programming Language), SQL, Data Architects, Data Science

Fractal

Data Engineer

Sep 2025 - Mar 2026 · 6 mos · Gurugram, Haryana, India · On-site

  • Owned scalable data pipelines from ingestion to analytics layers, implementing medallion architecture, incremental processing, and schema-drift handling to ensure reliable, analytics-ready data.
  • Optimized PySpark workloads using partitioning and join strategies to handle large-volume datasets efficiently while reducing compute costs and processing time.
  • Partnered with business and analytics teams to design dimensional models and KPIs, improving reporting accuracy and decision-making speed.
  • Established monitoring, alerting, and RCA practices for data pipelines, improving system reliability and reducing production incidents.
PySpark, Cloud azure, Amazon Web Services (AWS), SQL, PostgreSQL, Python (Programming Language), +2 more

BNP Paribas

Software Engineer

Aug 2024 - Aug 2025 · 1 yr · Bengaluru, Karnataka, India · Hybrid

  • Built and optimized ETL pipelines using Azure Data Factory (ADF) to ingest structured and unstructured data from SQL Server, Blob Storage, and REST APIs, reducing manual effort by 70%.
  • Scheduled and maintained ADF pipelines with incremental loads and Data Flow transformations, improving refresh time by 40%.
  • Engineered star schema data models within Azure Synapse Analytics, reducing query response times by 60% and increasing report generation speed by 45% for stakeholders.
  • Implemented an Azure Functions-based alerting system to notify data engineers of pipeline failures, cutting issue identification time by 60% and improving overall data pipeline reliability.
  • Integrated webhook notifications and real-time alerts via Azure Functions within ADF pipelines, improving issue resolution time by 50%.
  • Applied partitioning and distribution techniques in Synapse to optimize query execution on 500M+ record datasets.
  • Collaborated with analysts and business teams to define 20+ KPIs and design dimensional data models, resulting in a 30% improvement in reporting accuracy and 40% faster decision-making cycles.
  • Mentored junior data engineers and analysts on Spark optimization, SQL performance tuning, and best practices in cloud-based data architecture.
PySpark, Data Engineering, SQL, Dataframes and Spark SQL, Python (Programming Language), Azure Databricks, +3 more

Namastedev.com

Namaste React Bootcamp

Nov 2022 - Feb 2023 · 3 mos

  • I was part of the Namaste React Bootcamp organised by Akshay Saini.
  • I was named Top Performer of the bootcamp, which consisted of 100 software engineers.
JavaScript, JavaScript Libraries, ReactJs, TypeScript

Körber Supply Chain

Software Engineer

Jan 2022 - Aug 2024 · 2 yrs 7 mos · Remote

  • Engineered scalable DBT projects using modular structure, Jinja macros, and layered architecture, reducing pipeline complexity by 30%
  • Developed reusable macros and materializations in DBT to standardize transformation logic, cutting development time by 20%
  • Implemented 50+ standard and custom DBT tests to monitor data integrity and ensure continuous data quality across ELT stages
  • Documented YAML-based DBT models, improving model transparency and reducing analyst dependencies
  • Designed dimensional data models and built analytics-ready data marts using DBT and Snowflake, enabling fast and reliable BI reporting
  • Optimized model performance through strategic materializations (incremental, table), reducing query costs and runtimes
  • Automated ELT workflows using dbt Core, enabling CI/CD and improving deployment efficiency
  • Led code reviews and maintained DBT Docs, ensuring consistency, data discoverability, and faster team onboarding
Azure Databricks, Azure Data Lake, Cloud azure, Data Modeling, Airflow, Snowflake, +1 more

Codequotient

Supercoder

Jul 2021 - Dec 2021 · 5 mos · Remote

  • Designed and implemented the user and admin dashboard panels, ensuring they are intuitive and user-friendly
  • Created APIs that allow different services to communicate with each other, ensuring smooth data transfer and functionality
  • Identified and fixed major bugs in the system, including UI and functionality issues; troubleshot and debugged code to find and resolve root causes
Cascading Style Sheets (CSS), Problem Solving, Object-Oriented Programming (OOP), Software Development, Programming, JavaScript, +3 more

United Group of Institutions

Frontend Developer

Jan 2021 - Aug 2021 · 7 mos · Prayagraj, Uttar Pradesh, India · On-site

  • Streamlined collaboration efforts with cross-functional teams to transform design concepts into highly responsive and user-friendly interfaces, enhancing the overall user experience and driving a 20% increase in customer satisfaction and a 10% boost in conversion rates

Education

Dr. A.P.J. Abdul Kalam Technical University

Bachelor of Technology - BTech — Computer Science and Engineering

Aug 2018 - Jul 2022

Trendytech

Big Data Masters — Data Engineering
