Navnoor Singh

Software Engineer

India · 4 yrs 9 mos experience
Highly Stable

Key Highlights

  • Processed over 300 TB/day of data.
  • Reduced cloud costs by 40% through optimization.
  • Improved data pipeline performance significantly.
Stackforce AI infers this person is a Data Engineer with expertise in Big Data and cloud-native architectures.

Skills

Core Skills

Apache Spark · AWS

Other Skills

Apache Iceberg · Kafka · Kubernetes · Python · EMR · Airflow · Real-time analytics · ETL · REST · PySpark · Distributed Computing · Big Data · Apache Airflow · Extract, Transform, Load (ETL) · Hadoop

About

Performance-driven Data Engineer (4+ years of experience) specializing in building large-scale data pipelines, real-time streaming systems, and cloud-native data platforms. Currently working as an SDE 3 (Big Data) at Baazi Games, where I design and optimize high-volume data systems powering gaming analytics, financial transactions, and user behavior insights. I have hands-on experience processing 300+ TB/day of data, reducing cloud costs by 40%, and enabling faster BI insights through scalable Spark-based architectures on AWS.

🔹 Core Expertise:
  • Apache Spark (batch & streaming)
  • Apache Iceberg (Lakehouse architecture, partitioning, compaction)
  • Kafka & Debezium (CDC pipelines)
  • AWS (EMR, S3, Glue, Athena)
  • Kubernetes & distributed systems

🔹 What I work on:
  • Designing end-to-end CDC pipelines (MySQL/Postgres → Kafka → Iceberg)
  • Building and optimizing real-time and batch data pipelines at scale
  • Implementing high-performance Lakehouse architectures
  • Tuning Spark jobs (AQE, joins, memory optimization)
  • Ensuring data reliability, scalability, and fault tolerance

🔹 Impact Highlights:
  • Processed 300+ TB/day of data across distributed systems
  • Reduced cloud costs by 40% through optimization and efficient architecture
  • Improved data pipeline performance and query latency for faster analytics

🔹 Interests:
I am deeply interested in distributed systems, real-time analytics, and next-generation data architectures inspired by companies like Netflix and Uber. Always open to connecting and discussing data engineering, system design, and scalable data platforms.

Experience

4 yrs 9 mos
Total Experience
2 yrs 2 mos
Average Tenure
4 mos
Current Experience

Baazi Games

SDE3 - Big Data

Jan 2026 – Present · 4 mos · Delhi, India · Hybrid

  • Big Data | Designing Lakehouse & CDC Pipelines | Spark | Iceberg | Kafka | AWS | Airflow | Real-time Analytics
Apache Spark · Apache Iceberg · Kafka · AWS · Kubernetes

Nielsen

Member of Technical Staff - 2

Feb 2025 – Jan 2026 · 11 mos · Gurugram, Haryana, India · Hybrid

  • Spearheaded migration of a legacy Informatica system to a modern Spark + Python/Polars stack, cutting licensing costs and boosting throughput.
  • Saved $800k in Informatica license costs and reduced developer hours by building a framework and an automatic DAG creator to accelerate job migration.
  • Architected a metadata-driven Spark framework + AI agent, reducing migration timelines by 60%.
  • Adopted EMR Serverless, realizing 40% cost savings on intermittent workloads.
  • Built Python automation scripts to monitor EMR clusters and terminate idle resources, saving thousands in cloud spend.
Apache Spark · Python · EMR · Airflow · AWS

Airtel Digital

3 roles

Senior Software Engineer

Promoted

Dec 2023 – Feb 2025 · 1 yr 2 mos

  • Reduced Spark job runtime by 37.5% (4h → 2.5h) on trillions of records, improving data availability.
  • Scaled real-time data apps processing 300+ TB/day, powering personalization and BI analytics.
  • Developed a Spark-on-Kubernetes operator to abstract infra complexity and increase pipeline reliability.
  • Built a generic metadata-driven codebase, cutting new aggregation task development time by 50%.
Apache Spark · Kubernetes · Real-time analytics · AWS

Software Engineer

Aug 2021 – Jan 2024 · 2 yrs 5 mos

  • Engineered clustering-based pipelines to infer work/home locations from telecom data for geospatial analytics.
  • Designed a pipeline to calculate user transportation modes from CDR data, powering mobility insights.
  • Built observability tools (Spark Listener + Airflow integration), reducing troubleshooting time.
  • Standardized deployments with a custom Airflow operator, reducing manual errors by 80%.
ETL · REST

Software Developer Intern

Feb 2021 – Aug 2021 · 6 mos

PySpark

Auribises Technologies Pvt Ltd

Intern

Jun 2019 – Aug 2019 · 2 mos · Ludhiana, Punjab, India

Education

Guru Nanak Dev Engineering College, Ludhiana

Bachelor of Technology - BTech — Computer Science

Jan 2017 – Jan 2021

Chandigarh University

Master of Business Administration - MBA — Marketing

Jun 2021 – Jul 2023

DAV Public School, Ludhiana

High School — Engineering Science

Jan 2001 – Jan 2017
