Vaibbhav B Vasa

Engineering Manager

Bengaluru, Karnataka, India · 9 yrs 8 mos experience

Key Highlights

  • Led migration to Lakehouse architecture using Databricks.
  • Optimized data processing costs by 20% through innovative strategies.
  • Built financial data pipelines ensuring privacy and compliance.


Skills

Core Skills

Data Engineering · Databricks · Cost Optimization · AWS · GCP · ETL

Other Skills

Advanced Java · Airflow · Amazon Web Services (AWS) · Apache Airflow · Apache Flink · Apache Kafka · Apache Oozie · Apache Spark · Apache Superset · Caching · Databases · Docker · Google Cloud Platform (GCP) · Graviton Machines · Hadoop

About

Computer Science graduate with a strong background in CS fundamentals and experience developing applications across a variety of languages and technology stacks. Currently managing the Data Platform team at Zepto Technologies, handling data at huge scale. Software Engineer | Data Engineer | Java | Python | Spark | Hadoop | Kafka | Flink | Presto | Docker | Kubernetes | Helm | Databricks

Experience

Zepto

2 roles

Engineering Manager

Promoted

Apr 2025 – Present · 11 mos · Bengaluru, Karnataka, India · On-site

Lead Data Engineer

Nov 2023 – May 2025 · 1 yr 6 mos · Bengaluru, Karnataka, India · On-site

  • Led the entire migration from a warehouse (Redshift & Hevo) to a lakehouse (Databricks).
  • Deployed open-source Airflow, Superset & Kafka Connect on K8s.
  • Built a CDC layer using Kafka & Debezium to load data into Databricks Delta tables.
  • Created multiple frameworks using Airflow & Databricks Job Compute for the following use cases:
    a. Bronze-to-Silver framework to load backend Postgres & Mongo tables into Databricks via CDC.
    b. Flats framework for analysts to create Gold tables.
    c. Notebook-scheduling framework for scheduling jobs in Databricks.
    d. Exporter framework for exporting data from Databricks to Google Sheets, S3, etc.
  • Carried out the following cost-optimization activities to improve utilization, reducing overall cost by ~20%:
    a. Used Graviton machines for all workloads.
    b. Clubbed multiple table runs (20) into a single job to save driver cost.
    c. Liquid-clustered tables < 500 GB; partitioned & Z-ordered those > 500 GB.
    d. Used pools to save on compute spin-up time/cost and improve reusability.
    e. Leveraged caching and optimized PySpark code in notebooks for long-running jobs.
    f. Used Spot machines with a low frequency of interruption.
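The size-based layout rule above (liquid clustering below ~500 GB, partitioning plus Z-ordering above it) could be sketched as a small helper that emits the corresponding Databricks SQL. This is an illustrative assumption, not Zepto's actual code: the table names, key columns, and the exact threshold are hypothetical, and in practice partitioning for large tables is declared at CREATE TABLE time, so only the recurring Z-order compaction is emitted here.

```python
SIZE_THRESHOLD_GB = 500  # the ~500 GB cut-off mentioned above (assumed)

def layout_statements(table: str, size_gb: float, keys: list[str]) -> list[str]:
    """Return the Databricks SQL maintenance statements for one Delta table."""
    cols = ", ".join(keys)
    if size_gb < SIZE_THRESHOLD_GB:
        # Smaller tables: liquid clustering avoids rigid partition boundaries.
        return [f"ALTER TABLE {table} CLUSTER BY ({cols})"]
    # Larger tables: partitioning is fixed at table creation, so the
    # recurring job only runs the Z-order compaction on the sort keys.
    return [f"OPTIMIZE {table} ZORDER BY ({cols})"]

print(layout_statements("silver.orders", 120, ["order_date", "city_id"]))
# → ['ALTER TABLE silver.orders CLUSTER BY (order_date, city_id)']
```

A rule like this lets one maintenance job cover every table regardless of size, which matches the clubbing of many table runs into a single job described above.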
Apache Superset · Airflow · PySpark · Apache Airflow · Databricks · Data Engineering

Navi

Data Engineer III

Dec 2022 – Oct 2023 · 10 mos · Bengaluru, Karnataka, India · Hybrid

  • As part of Navi's Data Platform team, served the Data Analytics & Data Science teams' data needs. Worked mostly on the components below in the past 6 months:
  • Primarily worked on financial data, especially credit-score related, and built pipelines to target customers based on their credit score while ensuring privacy and security norms.
  • Deployed daily jobs to Presto on Spark via a self-serve analytics approach; modified the open-source Presto codebase to make it compatible with AWS Graviton instances.
  • Automated several Airflow workflows for external data sources; automated table partitioning with Hudi using Airflow to avoid manual errors.
  • Built Flink pipelines in Scala for real-time use cases.
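The partition automation mentioned above amounts to generating each day's partition spec programmatically instead of adding it by hand. A minimal sketch of the kind of helper an Airflow task might call, assuming a `dt=YYYY-MM-DD` partition layout (the function name and path convention are hypothetical, not Navi's actual code):

```python
from datetime import date, timedelta

def daily_partitions(start: date, end: date) -> list[str]:
    """Partition specs for every day in [start, end], e.g. 'dt=2023-06-01'."""
    days = (end - start).days
    return [f"dt={(start + timedelta(d)).isoformat()}" for d in range(days + 1)]

print(daily_partitions(date(2023, 6, 1), date(2023, 6, 3)))
# → ['dt=2023-06-01', 'dt=2023-06-02', 'dt=2023-06-03']
```

Deriving the specs from the schedule date removes the manual step that caused errors: the same code produces the same partition names on every run.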
Airflow · Apache Spark · Amazon Web Services (AWS) · Scala · Presto · Pinot +4

Glance

Data Engineer II

Nov 2020 – Dec 2022 · 2 yrs 1 mo · Bangalore Urban, Karnataka, India

  • As part of the Glance Data Platform team, worked on planning, deploying, and upgrading the entire software stack on GCP and Azure using microservices and distributed systems, and managed the underlying ETL jobs to process and store data efficiently. As an individual contributor and as part of the team, achieved the milestones below:
  • Consumed, stored & managed petabyte-scale data while ensuring quality & consistency.
  • Improved SLA by 4 hours by migrating various Hadoop jobs to Spark using Scala.
  • Onboarded the entire stack of Roposo & Shop101 to Trino, reducing infra cost by almost 45%.
  • Deployed self-managed Airflow for scheduling ETL jobs that process data in Spark & Trino.
  • Wrote SQL queries & custom UDFs to aggregate raw data for business needs.
  • As the SME for Trino/Presto across Glance, researched, experimented with, and deployed various Trino configuration and session properties to utilize resources efficiently.
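Trino session-property tuning like that described above is typically applied per workload rather than cluster-wide. A hedged sketch of rendering a workload profile as `SET SESSION` statements: the property names shown (`query_max_memory`, `spill_enabled`) do exist in Trino, but the values, the profile, and the helper itself are illustrative assumptions, not Glance's actual tooling.

```python
def session_statements(props: dict) -> list[str]:
    """Render a dict of session properties as Trino SET SESSION statements."""
    out = []
    for key, value in props.items():
        # Strings are quoted; booleans/numbers are rendered bare, lowercased.
        rendered = f"'{value}'" if isinstance(value, str) else str(value).lower()
        out.append(f"SET SESSION {key} = {rendered}")
    return out

# Hypothetical profile for a memory-heavy ETL query.
heavy_etl = {
    "query_max_memory": "50GB",  # cap total distributed memory for the query
    "spill_enabled": True,       # spill to disk instead of failing on memory
}
print(session_statements(heavy_etl))
```

Keeping profiles as data makes it easy to experiment with one property at a time and roll a change back, which fits the research-and-deploy loop described above.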
Google Cloud Platform (GCP) · Kafka Streams · Apache Superset · Airflow · Trino · Hive +6

Adobe

2 roles

Data Engineer

Promoted

Oct 2019 – Nov 2020 · 1 yr 1 mo

  • Reprocessed billions of records to cleanse the data in Adobe Analytics.
  • Migrated legacy Apache Pig ETL processes to WFE (Apache Airflow) to scale and process higher data volumes.
Hadoop · MapReduce · Airflow · Apache Oozie · PySpark · Amazon Web Services (AWS) +2

Associate Data Engineer

Nov 2016 – Sep 2019 · 2 yrs 10 mos

  • Created custom solutions for features not available OOTB in Adobe products.
  • Created big data pipelines in Spark on AWS.
  • Designed and developed a pipeline that reads data from Adobe Live Streams.
Hadoop · MapReduce · Apache Oozie

Deloitte Digital

2 roles

Business Technology Analyst

Aug 2016 – Oct 2016 · 2 mos · Greater Hyderabad Area

Business Technology Analyst-Intern

Jan 2016 – Jul 2016 · 6 mos · Greater Hyderabad Area

Chalkstreet

Product Development Architect-Intern

May 2015 – Jun 2015 · 1 mo · Greater Coimbatore Area

Education

Birla Institute of Technology and Science, Pilani

Post Graduate Program in Big Data Engineering

Jan 2020 – Jan 2021

Great Lakes Institute of Management

Post Graduate Program in Data Science Engineering

Jan 2018 – Jan 2018

KIIT - Kalinga Institute of Industrial Technology

Bachelor of Technology (B.Tech.) — Computer Science and Engineering

Jan 2012 – Jan 2016
