Shubham Gupta

Software Engineer

New Delhi, Delhi, India · 5 yrs 5 mos experience
Most Likely To Switch · Highly Stable

Key Highlights

  • Expert in building robust data solutions.
  • Significant cost reductions in data infrastructure.
  • Strong background in Data Engineering and Big Data.
Stackforce AI infers this person is a Data Engineering expert with a strong focus on Big Data solutions.

Skills

Core Skills

Data Engineering, Big Data

Other Skills

Airflow, Algorithm Design, Algorithms, Amazon Redshift, Amazon S3, Amazon Web Services (AWS), Apache Kafka, Aurora, Bootstrap, C, CDC, Cascading Style Sheets (CSS), Cloud Computing, Data Structures, Datahub

About

Experienced Data Platform Engineer who builds robust data solutions to create an impact. Believes in experiencing everything new in life and in making an impact through hard work and consistency. Loves applying core CS concepts to practical software solutions that can scale. Keen interest in Data Engineering, Algorithms, Data Structures, and Problem Solving.

  • Data Engineering (Presto/Trino, Kafka, Python, Golang, Pinot, Flink, Doris, Apache Spark, Airflow, Hadoop, Debezium, Hudi, Sqoop, Hive, Elasticsearch, Amazon Redshift, MySQL, PostgreSQL, RDS, Aurora, Lambda, DynamoDB)
  • DevOps (Docker, Kubernetes, Helm, Skaffold)
  • Competitive Coding (C++/Python)
  • Android/iOS App Development using React Native
  • MERN stack for Web Development

Always willing to work with new technology and people.

Experience

Blinkit

4 roles

Software Development Engineer 3

Jan 2025 - Present · 1 yr 2 mos

Software Development Engineer 2

Promoted

Oct 2021 - Jan 2025 · 3 yrs 3 mos

  • Event Gateway at Zomato: Built an extensive Golang-based event gateway that routes event streams in real time to pluggable destinations with custom transformations
  • > Built in-house capabilities to deprecate Rudder as a customer segmentation tool
  • Datalakehouse at Blinkit: Built a robust CDC-based data lakehouse with over 1,000 source tables (10 TB of data) replicated with a 15-minute lag
  • > Built support for powering data marts on the data lake using the Trino Iceberg connector, with upsert capabilities in near real time
  • > Built a robust FSM-based web UI to make onboarding tables into the data lake self-serve
  • > Reduced data lake querying cost by 90% ($12,000 monthly) by identifying issues in the Presto query engine and inefficient query patterns
  • > Leading the deprecation of the Redshift cluster by building the required capabilities on the in-house Trino cluster, reducing cost by $25,000 monthly
  • > Reduced the refresh interval of lake tables from 3 hours to 15 minutes, enabling near-real-time querying
  • > Migrated data lake source DBs from RDS to Amazon Aurora
  • > Upgraded Debezium from v1.3 to v1.9.4 and migrated Kafka Connect from EC2 to Kubernetes
  • > Patched data in the lake to fix discrepancies in over 25 tables
  • > Performed a zero-downtime upgrade of the Airflow cluster, with over 1,500 DAGs, from v1.10 to v2.13
  • > Reduced Redshift storage by 1 TB by identifying unused tables and automating table pruning
  • > Optimized the existing Redshift cluster to scale to 100,000 queries/day, saving around $15,000/month in upgrade costs
Golang, Trino, CDC, Presto, Airflow, Debezium, +5

Software Development Engineer

Sep 2020 - Sep 2021 · 1 yr

  • Data Governance at Blinkit: Introduced and drove the adoption of a robust data cataloging tool (DataHub by LinkedIn)
  • > Set up one-click deployment for the tool using Docker and Kubernetes Helm charts
  • > Integrated the data catalog with the ETL pipeline (Airflow) and the BI tool (Redash) to automate catalog generation
  • > Migrated DataHub, with over 4,000 datasets, from v0.6 to v0.8.18
  • Data Tools at Blinkit: Managed and built features on top of existing in-house/OSS tools
  • > Built data export functionality in Redash used by over 2,000 active users daily
  • > Primary owner of the in-house ETL frameworks
  • > Built multiple Slack bots to automate reporting and monitoring
  • > Reduced Data Infrastructure [ Aurora — Xplenty — Redshift ] cost by 10-15%
Datahub, Docker, Airflow, Redash, Data Engineering

Software Engineer Intern

Jan 2020 - Sep 2020 · 8 mos

  • > Worked in the Data Engineering team
  • > Debugged issues in the ETL pipeline that were causing data loss
  • > Added features to existing internal BI tools
  • > Built multiple Slack bots to automate reporting and monitoring
  • > Reduced Aurora IOPS cost by 10% by optimizing queries
  • > Reduced consumption of Xplenty (ETL pipeline) node hours by 50 hours per day
  • > Reduced Redshift disk space usage by 15%
  • > Migrated the ETL pipeline from Xplenty to an in-house Source Replication Pipeline

Policybazaar.com

Software Engineer Intern

May 2019 - Aug 2019 · 3 mos · Gurugram, Haryana, India

  • > Worked as a Big Data Developer Intern
  • > Built a scalable and configurable big data analytics platform
  • > Wrote scripts for data cleansing and for preparing the data warehouse
  • > Structured the data warehouse to reduce query time
  • > Worked with the front-end and back-end teams to engineer design patterns
  • > Worked with Apache Hadoop, Hive, PySpark, Elasticsearch, and Kafka
Hadoop, Hive, Pyspark, Elasticsearch, Kafka, Big Data

Cogitans - techspace

Lead

Aug 2018 - Jul 2019 · 11 mos · USICT

  • > Cogitans is the Web Development Club of USICT
  • > Worked on and led many live projects
  • > Conducted classes on web development

Mojito Labs (mojitolabs.com)

Full Stack Engineer

Feb 2018 - Jun 2018 · 4 mos · New Delhi, Delhi, India

  • > Worked as a Full Stack Developer
  • > Worked with AngularJS and responsive web design
  • > Built backend APIs in PHP and MySQL

Gotech services

Full Stack Engineer

Sep 2017 - Dec 2017 · 3 mos · New Delhi, Delhi, India

  • > Worked as a freelance full-stack (MEAN) developer
  • > Built a CMS with an extensive back end using Node.js
  • > Worked on multiple websites, building reactive and responsive web designs

Education

Guru Gobind Singh Indraprastha University

Bachelor of Technology (B.Tech.) — Information Technology

Jan 2016 - Jan 2020

N. C. Jindal Public School, Punjabi Bagh

12th (CBSE BOARDS)

Jan 2015 - Jan 2016

N. C. Jindal Public School, Punjabi Bagh

10th (CBSE BOARDS)

Jan 2013 - Jan 2014
