Sahil Kakkar

Data Engineer

Gurugram, Haryana, India · 5 yrs 2 mos experience

Key Highlights

  • Expert in building data pipelines on AWS.
  • Proficient in real-time data ingestion using Kafka.
  • Strong background in ETL processes and data validation.

Skills

Core Skills

Data Engineering · AWS · ETL · Mobile Development

Other Skills

Apache Airflow · SQL · Kubernetes · S3 · Hive · Spark · DBT · PySpark · Python · Jenkins · Flutter · HTTP · JSON · Amazon Redshift

About

Data Engineer with experience in designing, building, and operating reliable, cost-efficient data pipelines on AWS, covering batch and CDC ingestion and big data processing of high-volume Fintech and HCM data into curated datasets ready for analytics.

  • Current CTC: 11 LPA
  • Expected CTC: 17 LPA (non-negotiable)
  • Notice Period: Immediately available
  • Location: Gurugram (ready to relocate)
  • PF/Form 16 available from current & previous organizations: Yes
  • Payslips, offer letters & relieving letters available from current & previous organizations: Yes
  • 15 years of full-time education: Yes

Experience

Wipro

2 roles

Data Engineer

Promoted

Feb 2024 – Jan 2026 · 1 yr 11 mos · Hybrid

  • Owned the migration of analytics workloads from an on-prem Cloudera/HDFS stack to a Kubernetes-based, S3-backed data platform, with end-to-end responsibility for pipeline migration, the automation framework, and platform tuning.
  • Migrated existing Hive, Spark and Control-M jobs to Airflow DAGs and DBT models.
  • Enabled streaming ingestion for a near-real-time analytics product using Kafka + PySpark Structured Streaming, integrating Kafka with S3/MinIO storage.
  • Troubleshot and tuned distributed Spark jobs running on Kubernetes; created container images, Helm charts and deployment patterns.
  • Collaborated with architects, product owners and infra teams to set priorities, define SLAs and onboard new analytical use cases.
  • Developed validation pipelines with checks for row-level diffs and schema drifts.
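The validation pipelines above checked row-level diffs and schema drift between the legacy and migrated datasets. A minimal sketch of that logic in plain Python (the actual pipelines used PySpark; function and field names here are illustrative assumptions):

```python
# Sketch of migration-validation checks: schema drift and row-level diffs
# between a source (legacy Hive) extract and a migrated (S3/DBT) extract.
# Plain Python stands in for PySpark; all names are hypothetical.

def schema_drift(source_schema: dict, target_schema: dict) -> dict:
    """Compare column->type maps and report added, dropped, and retyped columns."""
    src, tgt = set(source_schema), set(target_schema)
    return {
        "added": sorted(tgt - src),
        "dropped": sorted(src - tgt),
        "retyped": sorted(c for c in src & tgt
                          if source_schema[c] != target_schema[c]),
    }

def row_diffs(source_rows, target_rows, key_fields):
    """Row-level diff keyed on key_fields: report rows missing on either side."""
    def keyed(rows):
        return {tuple(r[k] for k in key_fields): r for r in rows}
    src, tgt = keyed(source_rows), keyed(target_rows)
    return {
        "missing_in_target": sorted(set(src) - set(tgt)),
        "missing_in_source": sorted(set(tgt) - set(src)),
    }
```

In a PySpark setting the same comparisons would typically be expressed with `DataFrame.schema` inspection and anti-joins on the key columns.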
Apache Airflow · SQL · Data Engineering · AWS

Spark Developer

Mar 2022 – Jan 2024 · 1 yr 10 mos · Hybrid

  • Developed scalable ETL pipelines using PySpark, Hive and Python.
  • Administered Hive and Impala environments, ensuring high availability, performance tuning and security compliance.
  • Implemented and enforced data retention and lifecycle policies (automated archival to S3) across large datasets.
  • Built and maintained CI/CD pipelines with Jenkins and uDeploy for multi-environment promotion.
  • Introduced automated alerting and ServiceNow integration for incident lifecycle.
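The retention and lifecycle bullet above amounts to selecting partitions past a retention window and archiving them. A minimal sketch, assuming date-stamped partitions (`YYYY-MM-DD`) and a hypothetical 90-day window:

```python
from datetime import date, timedelta

# Sketch of a retention policy: given date-stamped partitions, select those
# older than the retention window for archival to S3. The partition naming
# convention and window length are illustrative assumptions.

def partitions_to_archive(partitions, retention_days, today=None):
    today = today or date.today()
    cutoff = today - timedelta(days=retention_days)
    return sorted(p for p in partitions if date.fromisoformat(p) < cutoff)
```

In production the selected partitions would then drive the actual archival steps, e.g. an S3 copy followed by dropping the warehouse partition.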
SQL · Hive · Data Engineering · ETL

Librarypro

Developer L1

Sep 2020 – Feb 2022 · 1 yr 5 mos

  • Applied domain-driven design and state management (Riverpod) to separate views from state.
  • Integrated the Razorpay payment gateway via HTTP POST requests authenticated with API keys.
  • Cached the session token locally so users skip manual login on app launch, and used JSON encoding/decoding to persist data as JSON strings in the database.
  • I used back4app for backend service.
  • The codebase is in a private Git repository and cannot be shared publicly.
Flutter · Mobile Development
