Deepanshu Aggarwal

Data Engineer

Noida, Uttar Pradesh, India · 7 yrs 10 mos experience

Key Highlights

  • Over 5 years of experience in data engineering.
  • Awarded the Shining Star Award at Samsung for performance.
  • Expert in building scalable data pipelines and ETL processes.
Stackforce AI infers this person is a Data Engineer with expertise in cloud-based data solutions and big data technologies.

Skills

Core Skills

ETL, Apache Spark, Data Engineering, Azure Data Engineering, Web Development

Other Skills

ADLS, Analytical Skills, Apache Airflow, Apache Kafka, Apache Spark Streaming, Attention to Detail, Azure, Azure Cosmos DB, Azure Data Factory, Azure Data Lake, Azure Data Lake Storage, Azure Databricks, Azure Functions, Azure SQL, Azure SQL DB

About

I have more than 5 years of experience working with multinational product-based companies. I consider myself a hardworking, proactive person who focuses on providing solutions, and I am passionate about coding and problem solving. Received the Shining Star Award at Samsung for good performance.

Skills: Java, Python, Spark, Azure, Azure Data Factory, Databricks, Kafka, HDFS, Airflow

Experience

Total Experience: 7 yrs 10 mos
Average Tenure: 1 yr 8 mos
Current Experience: 1 yr 1 mo

Publicis Sapient

Senior Associate Data Engineer L2

May 2025 - Present · 1 yr 1 mo

Expedia Group

Big Data Engineer 2

Jan 2024 - May 2025 · 1 yr 4 mos · Gurugram, Haryana, India · Hybrid

  • 1. Designed and implemented ETL pipelines to extract data from diverse sources including data lake tables and parquet files. Processed the data and stored it into Unified Feature Store (UFS) integrated with MongoDB, ensuring optimised data accessibility.
  • 2. Collaborated with Machine Learning Scientists (MLS) to facilitate seamless deployment of their models into production environments using Model Registry Service (MRS) and Model Deployment Service (MDS).
  • 3. Used Apache Airflow to schedule and monitor job workflows by developing DAGs in Python.
  • 4. Optimized a Spark Streaming application, reducing average latency from 45 seconds to 10 seconds by improving state management and refining the stream processing logic.
ETL, MongoDB, Apache Airflow, Spark Streaming, Python, Apache Spark

Times Internet

Big Data Engineer

Apr 2022 - Jan 2024 · 1 yr 9 mos · Noida, Uttar Pradesh, India

  • 1. Designed and developed multiple batch data pipelines using the Spark DataFrame API to process 10 million Ad-Serving records from Kafka on an hourly basis and store them in HDFS.
  • 2. Generated multiple performance metrics by aggregating the data on various ad attributes and stored the results in Cassandra tables, which the Ad-Engine uses to update its learning and performance caches.
  • 3. Integrated the data lake's extensive dataset with Trino, enabling seamless ad hoc querying, in-depth analysis, and effective debugging of Ad-Serving data.
  • 4. Used Apache Airflow to schedule and monitor job workflows by developing DAGs in Python.
Spark DataFrame API, Kafka, HDFS, Cassandra, Apache Airflow, Apache Spark, +1

Riverbed Technology

Azure Software Engineer

Jul 2021 - Apr 2022 · 9 mos · Bengaluru, Karnataka, India

  • 1. Developed an end-to-end Azure data engineering pipeline to ingest, process, and transform data for advanced analytics.
  • 2. Utilized Azure Data Factory (ADF) for orchestrating data ingestion from Azure SQL DB and CSV files into Azure Data Lake Storage (ADLS) Gen2.
  • 3. Applied Medallion Architecture (Bronze, Silver, Gold layers) and used Azure Databricks for data cleansing, deduplication, and standardisation.
Azure Data Factory, Azure SQL DB, Azure Data Lake Storage, Azure Databricks, Azure Data Engineering

Samsung Economics Research Institute

Software Development Engineer

Aug 2018 - Jul 2021 · 2 yrs 11 mos · Noida

  • 1. Worked on scalable PySpark data ingestion pipelines to ingest and process Samsung’s device telemetry and app usage analytics into the data lake, enabling efficient downstream analysis and reporting.
  • 2. Developed a scalable internal task management web application using PHP, Bootstrap, and MySQL, improving operational efficiency and cross-team collaboration across Samsung departments.
PySpark, PHP, MySQL, Data Engineering, Web Development

Education

IMS Engineering College

Bachelor of Technology - BTech — Computer Science

Jan 2014 - Jan 2018
