Pankaj Ahuja

Data Engineer

Mumbai, Maharashtra, India · 6 yrs 11 mos experience
Most Likely To Switch · AI Enabled

Key Highlights

  • Won 3 national-level hackathons.
  • Created a near real-time data lakehouse.
  • Reduced Redshift disk space by 30%.
Stackforce AI infers this person is a Data Engineering expert in SaaS environments.

Skills

Core Skills

Data Engineering · Python · Google Cloud Platform (GCP) · Terraform · ETL · SQL · Software Development

Other Skills

Python (Programming Language) · Amazon ECS · Extract, Transform, Load (ETL) · Delta Lake · DynamoDB · Glue · S3 · Redshift · Analytical Skills · PySpark · Oracle · Hadoop · Communication · Java · MSSQL Server

About

Experienced Data Engineer with in-depth knowledge of and hands-on experience with AWS, Redshift, BigQuery, GCP, Airflow, S3, SQL, Spark, Python, and Kafka. Won 3 national-level hackathons by building Android apps, and has knowledge of Java, native Android, and hybrid apps.

Experience

6 yrs 11 mos
Total Experience
2 yrs 3 mos
Average Tenure
2 yrs 10 mos
Current Experience

Confluent

2 roles

Senior Data Engineer

Mar 2026 - Present · 3 mos · Remote

Data Engineering · Python (Programming Language)

Data Engineer II

Aug 2023 - Mar 2026 · 2 yrs 7 mos · Remote

Google Cloud Platform (GCP) · Terraform

Amazon

Data Engineer

Jun 2021 - Aug 2023 · 2 yrs 2 mos · Pune, Maharashtra, India · Hybrid

  • Created a near real-time data lakehouse on Delta Lake, with pipelines to track traffic and orders, reducing the data-delay SLA from 24 hours to 5 minutes.
  • Developed pipelines for Weekly Business Reports that leadership uses to analyze business areas, and created an admin user interface to monitor, re-trigger, and debug pipelines.
  • Created an architecture that automates tracking and analysis of active A/B tests on the website, making the process end-to-end self-service for product managers.
  • Designed and created pipelines and datamarts encapsulating the logic that powers monetization for influencers based on traffic and conversions, leveraging clickstream data, DynamoDB, Glue, and S3.
  • Collaborated on and conducted training sessions for PMs and TPMs on SQL and datamart usability to improve adoption.
  • Created datamarts for data scientists and analysts to analyze areas of the e-commerce website such as Payments, Checkout, and Product Pages.
  • Reduced Redshift disk usage by 30% by automatically identifying and offloading unused datasets, and reduced query runtime using WLM and admin views.
  • Developed a generalized alarm and monitoring framework for adding alarms to any data pipeline.
  • Planned the migration of a 1.5 PB production Redshift cluster from ds2 to RA3 with 1 hour of downtime.
Amazon ECS · Extract, Transform, Load (ETL) · Data Engineering

Tata Consultancy Services

Data Engineer (Big Data)

Jul 2019 - Jun 2021 · 1 yr 11 mos · Mumbai Area, India

  • Designed, developed, and launched a PySpark application that computes aggregated values on time-series data from user-provided formulas, improving response time from days to hours; delivered using agile methodology.
  • Created a well-abstracted Spark job used by various teams to perform SCD Type 2 updates on 500 dimension and master Hive tables stored in Parquet format.
  • Automated the extraction of metadata and lineage from tools using Python scripts, saving 70+ hours of manual effort.
  • Migrated terabytes of data from the client's Oracle warehouse to Hadoop with audit and reconciliation.
  • Was part of the project from the start; influenced and converted proofs of concept into production deliverables.
Extract, Transform, Load (ETL) · Analytical Skills · Data Engineering

Applab Technologies

Intern

Jan 2017 - Jan 2017 · 0 mo · Thane, Maharashtra, India

  • Built an Employee Management System with a Java frontend using JDBC and MSSQL Server.
Software Development · Communication

Education

Vivekanand Education Society's Institute of Technology, Sindhi Society, Chembur, Mumbai 400 071

Bachelor of Engineering — Computer Engineering

Jan 2015 - Jan 2019
