Parv Rastogi — Data Engineer

Hello! I'm Parv Rastogi, a results-driven Data Engineer with 3.5 years of experience in data acquisition, management, and pipeline optimization, seeking full-time data engineering roles. Through Spark job and ETL optimizations, I have achieved a 40% reduction in processing time, a 20% increase in reliability, and a 40% reduction in individual job cost.

Key Achievements:
- Reduced data processing time by 60% through automation scripts, enhancing operational efficiency by 25%.
- Increased data pipeline reliability by 20% by contributing to infrastructure management during FaaS execution.
- Identified and resolved 15% of infrastructure-related incidents, minimizing system downtime.
- Improved data quality and downstream application performance by 40% through proactive data pipeline maintenance.

Technical Skills:
- Cloud Services (ADLS, Azure Databricks)
- Programming Languages (Python, SQL)
- Data Engineering (Hadoop, Spark, Databricks)
- Databases (MySQL, PostgreSQL, Presto)
- Data Warehouses (Hive, Snowflake)

Soft Skills:
- Communication
- Teamwork
- Cross-Team Collaboration

Professional Journey:
In my previous role at Innovaccer Analytics Pvt Ltd., a US-based healthcare company, I engineered and maintained data pipelines, ensuring the ingestion of clean and validated data onto the platform. I proactively identified and resolved issues across the data processing pipeline, demonstrating expertise in SQL commands and scripting for platform specifications. Notably, I developed Python automation scripts with integrated real-time Slack alerts to optimize data load processes, showcasing my problem-solving abilities.

Why Me?
As a quick learner with a passion for the data engineering domain, I'm confident in my potential to be a valuable addition to any Data Engineering team.
My experience crafting custom PowerBI reports and interactive dashboards underscores my ability to conceptualize, design, and develop solutions tailored to specific business requirements.

Stackforce AI infers this person is a Data Engineer specializing in SaaS and Healthcare data solutions.

Location: Bengaluru, Karnataka, India

Experience: 3 yrs 8 mos

Skills

Data Engineering
SQL
Data Quality

Career Highlights

Achieved 60% reduction in data processing time.
Increased data pipeline reliability by 20%.
Crafted custom PowerBI reports for enhanced data visualization.

Work Experience

Indium

Data Engineer (1 yr 4 mos)

Uber

External Consultant (1 yr 4 mos)

Career Break

Professional development (2 mos)

Health and well-being (2 mos)

Innovaccer

Data Analyst - Data Engineering (2 yrs 4 mos)

Associate Software Engineer Intern (6 mos)

Education

Bachelor of Technology - BTech at KIET Group of Institutions

12th at Seth Anandram Jaipuria School

10th at Seth Anandram Jaipuria School

AI Enabled, AI ML Practitioner

Skills

Core Skills

Data Engineering, SQL, Data Quality

Other Skills

Apache Spark, Apache Hive, Data Migration, AI, PySpark, PowerBI, Data Visualization, PostgreSQL, Linux, Java, Scala, Piper, Azure, Snowflake, Extract

Experience

Total Experience: 3 yrs 8 mos
Average Tenure: 2 yrs 4 mos
Current Experience: 1 yr 4 mos

Indium

Data Engineer

Feb 2025 – Present · 1 yr 4 mos · Bengaluru, Karnataka, India · Hybrid

Data Migration & Optimization: Migrated critical Uber trip datasets from GDW to DI, optimizing SQL queries and boosting query performance by 25%.
Framework Contribution: Enhanced Sparkle (Spark) framework by adding bucketing support, enabling consistent data distribution and faster joins for large ETL pipelines across DI teams.
AI-driven Migration: Developed a POC using Cursor and Claude AI for GDW migrations, cutting manual effort by 40% and accelerating migration timelines.
Feature Development: Engineered and maintained Python and Java Spark jobs, resolving data inconsistencies and delivering business-driven features.
Legacy Datasets Deprecation: Streamlined data ecosystem by assisting deprecation of legacy datasets, migrating user ETLs and queries to modern SOTs.
Customer Support and Migration: Assisted data science and business teams through dataset adoption, ensuring smooth SQL query migration with minimal disruptions.

Apache Spark, Apache Hive, Data Engineering, SQL

Uber

External Consultant

Feb 2025 – Present · 1 yr 4 mos · Bengaluru, Karnataka, India · Hybrid

Apache Spark, Apache Hive

Career break

2 roles

Professional development

Dec 2024 – Feb 2025 · 2 mos

Health and well-being

Dec 2024 – Feb 2025 · 2 mos

Innovaccer

2 roles

Data Analyst - Data Engineering

Jul 2022 – Nov 2024 · 2 yrs 4 mos · Noida, Uttar Pradesh, India

Engineered and maintained data pipelines, ensuring the ingestion of clean and validated data onto the Datashop Platform, demonstrating expertise in SQL commands and scripting for platform specifications.
Identified bottlenecks in data ingestion from diverse sources, such as SFTP and ADLS. Developed Python scripts to streamline data flow, achieving a 75% reduction in processing time. Integrated real-time Slack alerts within scripts to identify and resolve data load issues, leading to a further 25% improvement in operational efficiency.
Contributed to infrastructure management during Function as a Service (FaaS) execution, guaranteeing end-to-end ownership, preventing failures or interruptions in analytics job execution and increasing the overall reliability of the system by 20%.
Leveraged monitoring and observability tools to identify and resolve 15% of infrastructure-related incidents during the initial investigation phase. This minimized system downtime and ensured stability, demonstrating a resourceful approach to incident management and system reliability.
Orchestrated seamless data acquisition and management processes by implementing robust data cleaning and preprocessing strategies, utilizing PostgreSQL and Snowflake for efficient extraction from diverse sources.
Proactively identified and mitigated data pipeline issues, leading to improved data quality and a 40% increase in downstream application performance.
Played a key role in crafting a custom PowerBI report and an interactive dashboard for a client, showcasing directed efforts to conceptualize, design and develop solutions to meet specific business requirements. The dashboard streamlined data exploration and analysis, providing the client with an accessible platform for in-depth study of the chronic disease population.

PySpark, SQL, Data Engineering, Data Quality