Vaibhav Puram

Senior Software Engineer

Bengaluru, Karnataka, India · 10 yrs 8 mos experience

Most Likely To Switch

Key Highlights

Over 10 years of experience in big data lifecycle management.
Contributed to major open source projects like Apache Spark and Ranger.
Led critical data migration projects handling over 5PB of data.

Stackforce AI infers this person is a Big Data Engineer with expertise in Data Governance and Analytics.

Skills

Core Skills

Data Platform · Apache Spark · Data Governance · Apache Ranger · Technical Architecture · Data Modeling

Other Skills

Distributed Systems · Microsoft Azure · Java · Customer Advocacy · Algorithms · Project Delivery · Kubernetes · Product Development · Shell Scripting · Key Performance Indicators · Customer Communication · Business Requirements · Scala · Customer Data · Data Streaming

About

As a Senior Software Engineer, Data Platform at Microsoft, with over 10 years of experience in the information technology and services industry, I design, develop, and integrate generic components that address organisational needs across the big data lifecycle: data generation, migration, processing, discovery, access control, and governance. I also enhance and contribute to open source tools such as Apache Spark, Apache Atlas, and Apache Ranger to improve their performance and functionality at scale. Before this, I worked on several data engineering and analytics projects, including Flipkart Pay Later, a buy now, pay later product that predicts customer behaviour and affordability from data insights. I have also built and deployed ETL and ML pipelines, REST services, and Hadoop clusters using technologies like Hive, Java, Apache Airflow, and Kubernetes. I hold a B.Tech. degree in Computer Science from Rajiv Gandhi University of Knowledge Technologies, Basar, and certifications in Kubernetes for Developers and Certified Kubernetes Application Developer. I am passionate about solving complex data problems and creating value for the business and the community.

Experience

10 yrs 8 mos

Total Experience

1 yr 9 mos

Average Tenure

2 yrs 5 mos

Current Experience

Microsoft

Senior Software Engineer

Dec 2023 – Present · 2 yrs 5 mos · Bengaluru, Karnataka, India · Hybrid

Distributed Systems · Data Platform · Apache Spark · Microsoft Azure · Java

Rakuten Symphony

Senior Software Engineer, Data Platform

Apr 2022 – Nov 2023 · 1 yr 7 mos

As part of the Rakuten Symphony product suite, I was an integral member of the Data Platform product development team. I designed, developed, and integrated generic components that address organisational needs across the big data lifecycle, such as data generation, migration, processing, discovery, access control, and governance, with performance at scale in mind. In the process, I made open source tools perform better by enhancing them for our needs and contributing back to the community.
Implemented synchronous listeners in Apache Spark to fix Spark Atlas Connector issues with missing data objects and lineages.
Extended Apache Ranger to non-Hadoop systems such as S3/MinIO and Cassandra/Yugabyte.
Enhanced automatic partition sync for Trino S3 (Hive) tables so that new partitions are registered.
Designed and implemented monitoring frameworks and pipelines to capture and calculate KPIs for different Data Platform components.
Led a cross-functional team of 5 (UI/Backend/DE/QA), including myself as an individual contributor, building a drag-and-drop, UI-based data transformation engine.
Led several critical data migration projects moving more than 5PB of data to newer Data Platform systems, improving system and application performance in the process.

Customer Advocacy · Data Governance · Algorithms · Technical Architecture · Project Delivery · Apache Ranger · +20

Rakuten India

Senior Software Engineer, Data Platform

Aug 2021 – Apr 2022 · 8 mos · India

I was a member of the Rakuten Data Platform product development team. My responsibilities included designing, developing, and integrating generic components that address organisational needs across the big data lifecycle, such as data generation, migration, processing, discovery, and access control, with performance at scale in mind.

Customer Advocacy · Data Governance · Algorithms · Technical Architecture · Project Delivery · Apache Ranger · +17

Flipkart

2 roles

Software Engineer 2, Customer Insights

Promoted

Apr 2021 – Aug 2021 · 4 mos

The Insights team, part of FPG (the fintech arm of Flipkart), aims to predict customer behaviour in aspects such as risk and affordability. Based on these predictions, customers are extended credit either within Flipkart (for example, by recommending products according to affordability) or by third-party financial firms such as banks and insurance companies.
Designed the analytics system (setting up the Hadoop cluster and its ecosystem, choosing the right tech stack for the use case, and modelling data for optimised reads and writes) for various insight-generation use cases, with the Flipkart Pay Later (BNPL) product as the primary use case.
Worked closely with data scientists to build the ETL and ML pipelines they needed.
Developed a wrapper REST service with a persistence layer for Apache Livy to support remote Spark job submissions.

Customer Advocacy · Hive · E-Commerce · Technical Architecture · Project Delivery · Key Performance Indicators · +12

Software Engineer, Customer Insights

Sep 2019 – Apr 2021 · 1 yr 7 mos

E-Commerce · Technical Architecture · Project Delivery · Key Performance Indicators · Business Requirements · Customer Data · +5

Enquero

Software Engineer

Apr 2019 – Sep 2019 · 5 mos · Bengaluru Area, India

ITDSD is VMware’s data engineering and analytics solutions platform. It provides an efficient data warehouse that serves analytics-ready data for business insights.
As part of this project, I worked on the License Fraud program, which helps detect fraudulent license usage. I was also involved in developing ETL pipelines to meet business needs.

Project Delivery · Key Performance Indicators · Business Requirements · Troubleshooting · Teamwork · Data Modeling · +2

Tata Consultancy Services

3 roles

IT Analyst

Promoted

May 2018 – Mar 2019 · 10 mos

TCS ACTIVE ARCHIVE
TCS Active Archive is a high-performance, scalable, secure, and cost-effective Big Data archive that keeps data ‘warm’ for easy access and analytics. It is a ready-made solution for projects and enterprises that face the costly and complex task of managing huge volumes of data and storing it to meet regulatory requirements. It provides ETL-based solutions that support complex data analytics.
Developed Java web services that drive data ingestion. This involved modules that:
Crawl any website (using Apache Nutch) and store the data in MongoDB or Elasticsearch.
Create and trigger jobs for RDBMS-to-Hive ingestion, and stream application logs using Apache Flume, saving them in HDFS for further processing.
HDFC LIFE TRANSFORMATION
As part of the middleware layer (IBM SOA), developed Java web services that serve critical functionality of the transformation process. Owner of RMS (Requirement Management Service), a critical module within the SOA layer that automates fulfilment of required documents without user intervention in the insurance journey.

Business Requirements · Troubleshooting · Teamwork