Parin Shah

Data Engineer

Arlington, Virginia, United States · 3 yrs 8 mos experience
AI ML Practitioner · AI Enabled

Key Highlights

  • Expert in architecting reliable data pipelines.
  • Proven track record in improving data accuracy.
  • Strong focus on performance and reproducibility.
Stackforce AI infers this person is a Data Engineer specializing in SaaS and data pipeline architecture.

Skills

Core Skills

Mentoring · Teaching · Data Engineering · AWS · Batch Processing · Data Warehousing

Other Skills

Amazon Web Services (AWS) · Apache Airflow · Prompt Engineering · Data Mapping · Data Ingestion · Custom Parsing Logic · Pandas · SQL · PySpark · Python (Programming Language) · Team Leadership · Tableau · Data Lakehouse · Confluent · Data Analysis

About

I build data pipelines that don’t wake people up at 2 a.m. 😆 My strength is taking messy, high-volume data and turning it into reliable, observable systems with clear ownership, SLAs, and failure recovery built in from day one. My focus stays on performance, correctness, and reproducibility. I’ve worked across batch and streaming architectures, security and analytics use cases, and environments where data quality failures actually cost money. Currently pursuing an MS in Computer Science (’26) and looking for data engineering roles where scale, accountability, and production rigor are non-negotiable.

Experience

Total Experience: 3 yrs 8 mos
Average Tenure: 3 yrs
Current Experience: 8 mos

I360

Data Engineer

May 2026 - Present · 0 mo · Arlington, VA · On-site

Khoury College of Computer Sciences

Teaching Assistant - Natural Language Processing

Jan 2026 - Apr 2026 · 3 mos · Boston, MA · Hybrid

Mentoring · Teaching

I360

Data Engineer Co-op

May 2025 - Dec 2025 · 7 mos · Arlington, VA · On-site

  • Architected a GenAI-driven entity resolution pipeline using prompt engineering to map invoiced customers to parent entities and industry sectors, reducing manual mapping overhead and accelerating sales and financial analytics workflows.
  • Designed and implemented a real-time deduplication framework for large-scale data ingestion pipelines, identifying the root causes of duplicate records and resolving them via custom parsing logic, significantly improving data accuracy and downstream reliability.
  • Analyzed historical pipeline failures and implemented data validation, cleaning, and quality checks across live data pipelines, improving overall data accuracy by ~20% and reducing downstream reporting issues.
  • Automated end-to-end sales analytics by replacing manual SQL-based workflows with a Pandas-driven pipeline, generating aggregated metrics for recurring executive reporting and reducing manual intervention.
Amazon Web Services (AWS) · Apache Airflow · Data Engineering · AWS

Quantiphi

2 roles

Senior Data Engineer

Apr 2023 - Jul 2024 · 1 yr 3 mos · Mumbai, Maharashtra, India · Hybrid

  • Maintained batch data pipelines processing terabytes of data using PySpark, SQL, and Amazon EMR, improving pipeline throughput by ~30% through parallel processing and optimized Spark configurations.
  • Deployed and operated production data pipelines with automated testing, monitoring, and CI/CD workflows to support live-streaming and batch workloads, ensuring on-time processing of critical datasets.
  • Mentored and trained newly onboarded interns in AWS cloud tools, project management services, and the software development lifecycle, enabling them to contribute effectively to live projects.
PySpark · Python (Programming Language) · Data Engineering · Batch Processing

Data Engineer

Jul 2021 - Apr 2023 · 1 yr 9 mos · Mumbai, Maharashtra, India · Hybrid

  • Developed 30+ automated, time-based KPIs by integrating demographic and sales datasets, improving inventory planning efficiency by ~15% for a global e-commerce client.
  • Architected a near real-time data warehouse ingesting 18+ semi-structured cybersecurity sources, halving the data refresh latency.
  • Identified and resolved data quality issues in 20+ manually uploaded Point of Sale (POS) data sources and implemented robust error handling, improving data quality by 30%.
PySpark · Python (Programming Language) · Data Engineering · Data Warehousing

Education

Northeastern University

Master's degree — Computer Science

Sep 2024 - May 2026

SVKM's NMIMS Mukesh Patel School of Technology Management & Engineering

B. Tech (Honors) — Computer Science

Jul 2017 - May 2021
