Mukul Chauhan

Data Engineer

Gurugram, Haryana, India · 4 yrs 10 mos experience

Highly Stable

Key Highlights

Expert in optimizing data pipelines using PySpark.
Led a team to enhance data access and reporting.
Passionate about emerging technologies in data engineering.

Stackforce AI infers this person is a Data Engineer specializing in Fintech data solutions with strong cloud computing expertise.

Contact

mukul.chauhan@hoonartek.com LinkedIn

Skills

Core Skills

Data Engineering, Cloud Computing

Other Skills

Airflow, Amazon Web Services (AWS), Amazon WorkSpace, Android, Android SDK, Android Studio, Apache Spark, Apache Spark Streaming, Azure Data Factory, Azure Data Lake, Azure Databricks, Batch Processing, Big Data, Blob Storage, C (Programming Language)

About

I am a Data Engineer with a strong background in using technology to improve data processing and analysis. I have honed my skills in Python, PySpark, SQL, Azure Data Factory, Azure Synapse, and Azure Cloud. My expertise includes designing and implementing data pipelines, optimizing data processing workflows, and collaborating with cross-functional teams to deliver data-driven insights. I am passionate about staying up to date with emerging technologies and best practices in data engineering, which lets me continuously improve and innovate in my role.

Experience

Total Experience: 4 yrs 10 mos

Average Tenure: 3 yrs

Current Experience: 1 yr 10 mos

Hoonartek

Consultant (Data Engineer)

Aug 2024 – Present · 1 yr 10 mos · Gurugram, Haryana, India · On-site

Currently working on a major banking project focused on migrating SQL-based stored procedures to optimized PySpark scripts that run on Databricks, with Unity Catalog for secure data governance.
Led a team of 8 engineers in designing and managing the Mart layer for optimized data access and reporting, ensuring high performance and scalability.
Developed and orchestrated complex ETL pipelines using Azure Data Factory (ADF) to seamlessly ingest data from various sources, including databases, files, and APIs, into Databricks for processing.
Performed extensive data transformation tasks, including data cleansing, filtering, and aggregations, leveraging PySpark's DataFrame and SQL APIs for efficient data processing.
Implemented advanced transformations, such as joins, groupings, and window functions, to derive actionable insights from large datasets.
Applied strong knowledge of OLTP, OLAP, and data warehousing principles to design and optimize data models that support analytical and transactional processing.
Used Unity Catalog for centralized data governance and compliance, ensuring secure data access across multiple environments.
Handled large datasets (10 TB+) and applied PySpark optimization best practices to reduce processing times and improve ETL performance.

PySpark, SQL, Azure Data Factory, Databricks, Data Governance, ETL, +3

RWS Group

Data Engineer

Aug 2021 – Aug 2024 · 3 yrs · Indore, Madhya Pradesh, India · Hybrid

I worked as a Data Engineer, implementing data transformation and ETL processes using PySpark within Azure Data Factory (ADF).
Key Responsibilities:
1. Utilized PySpark's DataFrame and Spark SQL APIs for data transformation and ETL operations.
2. Utilized Azure Data Factory (ADF) to extract data from various sources such as databases, files,
3. Applied data cleansing, filtering, and aggregation techniques using PySpark DataFrame and SQL API.
4. Implemented complex data transformations, including joining, grouping, and window functions, to derive meaningful insights from raw data.

PySpark, Azure Data Factory, Data Transformation, ETL, Data Engineering, Cloud Computing