Damini Gupta

Data Engineer

Bengaluru, Karnataka, India · 9 yrs 7 mos experience

Most Likely To Switch: Highly Stable

Key Highlights

Reduced data pipeline runtimes by up to 90%.
Designed Bronze-Silver-Gold Delta Lake layers for improved data quality.
Certified in Databricks and Snowflake.

Stackforce AI infers this person is a Data Engineer specializing in Fintech and Data Engineering solutions.

Skills

Core Skills

Data Engineering · Azure · Snowflake · ETL

Other Skills

AWS S3 · Ab Initio · Amazon S3 · Amazon Web Services (AWS) · Analytics · Apache Airflow · Apache NiFi · Apache Spark SQL · Apache Spark Streaming · Azure Data Factory · Azure Data Lake · Azure Databricks · Azure SQL · Coding Experience · Cross-team Collaboration

About

As a Senior Data Engineer, I specialize in turning complex data challenges into scalable, business-ready solutions. With expertise in Azure, Databricks, and Snowflake, I’ve built and optimized batch and streaming pipelines that reduced runtimes by up to 90% and enabled real-time insights across Supply Chain, Fintech (Banking, Insurance), Telecom, and Pharma domains. Skilled in PySpark, Python, PL/SQL, Azure, and Linux, I design robust ETL/ELT frameworks, data models, and governance practices while mentoring teams and driving performance improvements.

Highlights:
⚡ Built real-time and batch pipelines on Databricks, Snowflake, and Azure, integrating diverse sources and ensuring reliability.
⚡ Designed and optimized fact/dimension models and KPIs, accelerating business insights and reporting.
⚡ Designed Bronze-Silver-Gold Delta Lake layers with Delta Live Tables (DLT), improving data quality and observability.
⚡ Implemented best practices in Python, Spark, SQL, and Delta Lake, reducing compute costs and improving data reliability.
⚡ Automated workflows with Airflow and Databricks Workflows, ensuring scalability and fault tolerance.

Certified in Databricks and Snowflake, with a B.E. in Computer Science (BIT Mesra) and experience at EY, Talentica, and Vodafone. Let’s connect if you’re passionate about modern data engineering and cloud platforms!

Experience

Total Experience: 9 yrs 7 mos

Average Tenure: 2 yrs 4 mos

Current Experience: 4 yrs 4 mos

EY

2 roles

Senior Data Engineer II

Jun 2024 – Present · 2 yrs · On-site

Built scalable Azure Databricks pipelines using PySpark to process data across Sales and Demand Planning.
◦ Ingested multi-format data (Excel, CSV, Parquet, Delta) with schema validation, incremental loading, and downstream compatibility.
◦ Designed and deployed a real-time ingestion pipeline with PySpark, processing over 5M records per KPI with zero data loss and high reliability.
◦ Developed scalable PySpark logic to orchestrate forecast data preparation, enabling seamless comparison of planner-generated forecasts with machine-learning-based predictions, and operationalized the resulting forecast metrics (accuracy, bias).
◦ Consolidated weekly snapshots to compute forecast accuracy KPIs (2 years) for Power BI reporting, orchestrated via Databricks Workflows.
◦ Troubleshot PySpark jobs (caching, partitioning, parallel execution), reducing pipeline latency by 30%.
◦ Designed Bronze, Silver, and Gold layers in Delta Lake using Delta Live Tables (DLT), with quality rules and dimensional modeling.
SDR: Legacy ETL Migration to Snowflake
◦ Modernized legacy PL/SQL ETL into Snowflake SQL, improving performance and maintainability.
◦ Streamlined procedures into modular workflows, improving runtime by 40% and simplifying integration with downstream systems.

Azure Databricks · PySpark · Data Ingestion · Data Processing · Delta Lake · Data Engineering · +1

Senior Data Engineer I

Jan 2022 – May 2024 · 2 yrs 4 mos · On-site

WBO: Data Mart Development for Wholesale Banking
◦ Developed transformations using Apache Spark SQL and DataFrame APIs for large-scale banking datasets.
◦ Automated pipelines for 8 products, leveraging reusable Spark components to improve efficiency and reduce manual effort and cluster-management overhead.
◦ Orchestrated workflows with Apache Airflow (DAG scheduling, retries, parameterization) ensuring fault-tolerant ETL.
◦ Implemented SCD Type 2 for customer/transaction history to support analytics and compliance reporting.
◦ Resolved cross-system inconsistencies via rule-based validations, schema reconciliation, and null handling, improving data quality.
◦ Optimized data lake storage and query performance by 90% using partitioning, Z-order clustering, and dimensional modeling.
IRDAI: Regulatory Reporting Platform
◦ Built pipelines in Azure Data Factory to load Excel data from Blob Storage into Azure SQL Database.
◦ Led a team of 4 and coordinated with 12 cross-functional teams to unify flows into relational databases, improving data accessibility and database design.
◦ Tuned SQL queries, reducing report generation time by 70% for 93 regulatory reports while ensuring compliance and security in a regulated industry.
◦ Designed data models and ER diagrams for domains (Policies, Claims), supporting unified and governed data management as per business requirements.

Apache Spark SQL · DataFrame APIs · Apache Airflow · SQL · Data Quality · Data Engineering · +1

Talentica Software

Software Engineer

Jul 2021 – Jan 2022 · 6 mos · Pune, Maharashtra, India · Remote

Designed and deployed ingestion pipelines from AWS S3 to Snowflake using streams, tasks, and pipes for near real-time data processing.
Built ETL workflows with Snowflake SQL and Python, reducing daily pipeline runtime and improving maintainability.
Developed and delivered 20+ Looker dashboards, enabling real-time operational and customer-level insights with Snowflake as the data warehouse.
Improved data availability and reporting speed, supporting critical decision-making for a US banking client and enabling access and outbound data sharing.

AWS S3 · Snowflake SQL · Python · Looker · ETL · Data Engineering

Vodafone Shared Services India

Software Engineer

Jul 2018 – Jul 2021 · 3 yrs · Pune/Pimpri-Chinchwad Area · On-site

Developed and optimized ETL pipelines in Ab Initio and Teradata SQL, ensuring GDPR compliance and secure handling of telecom data
Enhanced BTEQ/SQL and Bash scripts across 60+ source systems, automating schema updates and improving efficiency
Streamlined ETL logic to maintain data integrity across customer, billing, and usage datasets
Supported issue resolution and delivered data warehouse solutions for high-volume telecom transactions, improving reliability and quality

Ab Initio · Teradata SQL · ETL · Data Integrity · Data Engineering