Nitin Pandey

Engineering Manager

Bengaluru, Karnataka, India · 9 yrs 9 mos experience
Most Likely To Switch · Highly Stable

Key Highlights

  • Over 10 years of experience in data engineering.
  • Expert in big data technologies and cloud platforms.
  • Proven track record of successful data migrations.
Stackforce AI infers this person is a Data Engineering expert with extensive experience in SaaS and AdTech industries.

Skills

Core Skills

Big Data · Data Engineering

Other Skills

AWS · Agile Methodologies · Airflow · Algorithms · Analytical Skills · Apache Beam · Apache Spark · BigQuery · Business Analysis · C · C++ · CSS · Coding · Data Analysis

About

Engineer with 10+ years working in the data domain.

Tech stack / frameworks used:

  • Cloud platforms – AWS, GCP, Databricks
  • Big data (OSS) – Apache Spark, Apache Kafka, Apache Hudi, Delta, Apache Airflow
  • Big data (CDH) – Hadoop, YARN, Hive, HDFS, MapReduce, Impala
  • Big data (GCP) – Apache Beam (Cloud Dataflow), BigQuery, Stackdriver
  • Big data (AWS) – Redshift, Athena, DynamoDB
  • Databricks – Delta Lake, Databricks Delta
  • Languages – Python, Scala, Java, shell scripting
  • Backend – Redis, Celery, Flask, MongoDB

Experience

Uber

3 roles

Engineering Manager II

Promoted

Jun 2024 – Present · 1 yr 9 mos

Staff Software Engineer

Jan 2024 – Jun 2024 · 5 mos

Senior Software Engineer

Apr 2022 – Dec 2023 · 1 yr 8 mos

MakeMyTrip

2 roles

Lead Data Engineer

Jun 2020 – Apr 2022 · 1 yr 10 mos

  • Data Migration
  • Created an M × N connector utility that reads from multiple sources (MySQL, SQL Server, Redshift, MongoDB, Kafka) and writes to multiple sinks (Delta Lake, Redshift), with support for run-time data transformations. Wrote an optimized Spark JDBC reader that enables parallel, distributed reads, with built-in offset management, metric logging, and monitoring for slow or failed jobs.
  • Redshift to Delta Migration
  • Migrated the data warehouse from AWS Redshift to Databricks Delta Lake. Added Delta support to all ingestion tools and ran parallel workloads in Delta, ensuring a smooth transition with no business impact or downtime.
Data Migration · Spark JDBC Reader · Delta Lake · Redshift · Data Transformation · Big Data +1
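A parallel, distributed JDBC read like the one described above typically works by splitting a numeric column's range into per-partition WHERE predicates, as Spark's JDBC source does with its `partitionColumn`/`lowerBound`/`upperBound`/`numPartitions` options. A minimal illustrative sketch of that boundary computation (not the actual utility, whose internals are not documented here):

```python
def jdbc_partition_predicates(column, lower, upper, num_partitions):
    """Split the range [lower, upper) into one WHERE predicate per
    partition, mirroring how Spark's JDBC reader parallelizes a read.
    The first and last predicates are left open-ended (and NULLs are
    swept into the first partition) so no rows are missed."""
    stride = (upper - lower) // num_partitions or 1
    predicates = []
    bound = lower
    for i in range(num_partitions):
        prev = bound
        bound += stride
        if i == 0:
            predicates.append(f"{column} < {bound} OR {column} IS NULL")
        elif i == num_partitions - 1:
            predicates.append(f"{column} >= {prev}")
        else:
            predicates.append(f"{column} >= {prev} AND {column} < {bound}")
    return predicates
```

Each predicate then drives one independent `SELECT ... WHERE <predicate>` query, so partitions can be fetched concurrently by separate executors.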

Senior Data Engineer

Mar 2019 – Jun 2020 · 1 yr 3 mos

  • Airflow Platform
  • Provided Airflow as a service for the entire organization. The setup was powered by a GitHub repo and Docker, making it easy for anyone to add or modify a DAG without managing Airflow itself. Currently runs over 300 DAGs across all lines of business.
  • Go-Memory
  • Created a platform that serves user interactions from the website/mobile app in real time with very low latency (<1 s). The platform lets consumers answer queries such as the last n hotel/flight searches by a user, gross revenue from a user over a time window, the last n bookings, and so on. It powers use cases like hotel rankings, notifications/alerts for a user's upcoming bookings, and per-user review analysis.
  • Goibibo’s Data Platform
  • Created the initial version of the data platform on Amazon Redshift. Loaded all data into Redshift and provided hourly refreshes for batch jobs. Built many ETL/ELT pipelines, set up a schema repository, contracts for data logging, and pipelines to ingest this data into Redshift. Wrote 75+ DAGs in 3 months.
Airflow · GitHub · Docker · ETL/ELT Pipelines · Data Engineering · Big Data
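The "last n searches/bookings per user" queries described above map naturally onto per-user bounded buffers. A toy in-memory sketch of that access pattern (the real platform's design is not documented here; names are illustrative):

```python
from collections import defaultdict, deque


class InteractionStore:
    """Toy per-user event store: keeps only the most recent
    `max_events` interactions per user, so 'last n' queries
    stay cheap and memory stays bounded."""

    def __init__(self, max_events=100):
        # deque(maxlen=...) silently evicts the oldest event on append
        self.events = defaultdict(lambda: deque(maxlen=max_events))

    def record(self, user_id, event):
        self.events[user_id].append(event)

    def last_n(self, user_id, n):
        """Return up to n most recent events, oldest first."""
        return list(self.events[user_id])[-n:]


store = InteractionStore(max_events=3)
for q in ["DEL-BOM", "BLR-GOI", "BOM-DXB", "BLR-DEL"]:
    store.record("u1", {"type": "flight_search", "query": q})
# only the 3 most recent searches survive; the oldest was evicted
```

A production system serving many consumers at <1 s latency would back this with a shared store (e.g. Redis lists with LPUSH/LTRIM) rather than process-local memory, but the bounded-buffer idea is the same.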

Equifax

Data Engineer

Oct 2018 – Mar 2019 · 5 mos · Bengaluru, Karnataka, India

  • Created the backend and data pipeline for OptimaHub, a platform that helps advertisers optimize spend and track ROI on each channel: social, search, SEO, SEM, display, etc. Deployed the solution on GCP using a big data technology stack.
  • Technologies/frameworks/languages used: Apache Beam (Cloud Dataflow), Dataproc, BigQuery, Cloud Storage, IAM
Apache Beam · GCP · BigQuery · Data Pipeline · Data Engineering · Big Data

ZS

2 roles

Senior Data Engineer

Dec 2017 – Feb 2018 · 2 mos

  • EDL – Common Components
  • Developed a framework providing reusable utilities to carry out the specific needs of a project. For example, an AWS S3 to HDFS copy can be done with one such utility without writing any additional code. Deployed many such utilities as part of the EDL backbone, which served as a platform hosting multiple projects. Technologies/frameworks/languages used:
  • Hadoop ecosystem – Hive, Impala, Apache Kafka, Spark; Python
  • Logging – Log4j, Logstash, Kibana, ELK
  • Configuration files – JSON
  • Team size: 6
Hadoop · AWS · Python · Kafka · Data Engineering · Big Data

Data Engineer

Jun 2015 – Nov 2017 · 2 yrs 5 mos

  • Received large data files containing patient data from different vendors and loaded them into a data lake. Multiple transformations and business rules were applied as part of the automated process developed. All parameters/properties were highly configurable. Provided functionality to create copies of data, modify them as required, and share them among colleagues. Technologies/frameworks/languages used:
  • Hadoop ecosystem – Hive, Impala, MapReduce, distcp, s3cmd, Oozie, CDH, HDFS, Hue, Kerberos; Python, Java, AWS, Redshift
  • Logging – Log4j, Logstash, Kibana, ELK
  • Team size: 6
Hadoop · AWS · Java · Python · Data Engineering · Big Data

Education

Army Institute of Technology (Pune University)

Bachelor of Engineering (B.E.) — Information Technology

Jan 2011 – Jan 2015

