Sunil Badugu

Operations Associate

Hyderabad, Telangana, India · 8 yrs 1 mo experience

Most Likely To Switch

Key Highlights

Reduced data pipeline runtime from 3+ hours to 14 minutes.
Cut compute costs by nearly 90% through optimization.
Mentored junior data engineers and shared insights.

Stackforce AI infers this person is a Big Data Engineer with expertise in cloud data technologies and pipeline optimization.

Skills

Core Skills

Data Engineering · Big Data · Data Pipeline Development · Data Management · ETL Development

Other Skills

Azure Databricks · Data Governance and Security · Data Quality · Continuous Integration and Continuous Delivery (CI/CD) · Azure DevOps · PySpark · Delta Lake · GitHub · Python (Programming Language) · Master Data Management · Hadoop · Microsoft SQL Server · Spark Streaming · Azure Data Factory · Data Lakes

About

Reduced a 2.5 billion row pipeline from 3+ hours to 14 minutes

Recently, I worked on optimizing a large-scale data pipeline and honestly, it was one of those problems every data engineer enjoys. What started as a 3+ hour runtime ended up being reduced to just 14 minutes, cutting compute costs by nearly 90%.

But this wasn’t about throwing more resources at the problem. It came down to going back to fundamentals:

Better partitioning strategy
Fixing inefficient PySpark transformations
Reducing unnecessary shuffles
Leveraging Delta Lake optimizations
Building a more config-driven, reusable pipeline design

Over the past 9+ years, I’ve grown from writing ETL scripts to working on enterprise-scale data platforms. I’ve been fortunate to work across domains like gaming, finance, and MDM, which really helped me understand how different businesses use data at scale.

What I focus on:

✔️ Scalable, reusable data pipelines
✔️ Medallion architecture design
✔️ Data quality & governance
✔️ Batch + streaming with Delta Lake

Outside of work, I enjoy mentoring data engineers and sharing practical insights from real-world projects.

Experience

Total Experience: 8 yrs 1 mo

Average Tenure: 1 yr 1 mo

Current Experience: 1 yr 5 mos

Tredence Inc.

Associate Manager - Data Engineering

Dec 2024 – Present · 1 yr 5 mos · Bengaluru · On-site

Designed config-driven pipelines that dynamically load source, target, and schema rules from configuration tables, improving reusability, scalability, and maintainability.
Optimized large-scale ingestion pipelines processing 2.5B+ rows, reducing workflow runtime from 3+ hours to ~14 minutes and lowering compute costs by ~90%.
Integrated Stonebranch with Azure Databricks to automate and orchestrate end-to-end pipelines with centralized scheduling, dependency management, and real-time monitoring.
Developed REST API–based ingestion workflows using ADF ForEach loops to load data from SAP HANA, CSV, and Event Hubs into landing zones.
Implemented Data Quality (DQ) frameworks to validate incoming data and enforce governance standards.
Designed and enforced Row-Level Security (RLS) to ensure secure and compliant data access.
Applied watermarking for incremental data loads to improve pipeline efficiency and performance.
Implemented checkpointing and Delta Lake time travel for exactly-once processing, recoverability, and auditability across streaming and batch pipelines.
Utilized Medallion Architecture (Bronze → Silver → Gold) to build structured, clean, and analytics-ready data layers.
Optimized data pipelines using Databricks, PySpark, and Delta Lake for high-performance processing and analytics.
Defined a framework to validate source configurations, orchestrate Parquet data loading from Landing to Silver using ForEach loops, and resolve ingestion issues proactively.
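
The config-driven design and watermark-based incremental loading described in this role can be illustrated with a minimal sketch. It is plain Python rather than PySpark so it runs anywhere, and it is not the actual framework: the `CONFIG` entry, the `updated_at` column, the `incremental_load` helper, and the sample rows are all hypothetical names chosen for illustration.

```python
from datetime import datetime

# Hypothetical config entry. The profile describes loading such rules
# from configuration tables; a dict stands in for one row here.
CONFIG = {"source": "orders", "watermark_column": "updated_at"}


def incremental_load(rows, config, last_watermark):
    """Return rows newer than last_watermark, plus the advanced watermark."""
    col = config["watermark_column"]
    new_rows = [r for r in rows if r[col] > last_watermark]
    # Advance the watermark only when new rows arrived.
    new_watermark = max((r[col] for r in new_rows), default=last_watermark)
    return new_rows, new_watermark


# Illustrative sample data, not taken from any real pipeline.
source_rows = [
    {"id": 1, "updated_at": datetime(2024, 1, 1)},
    {"id": 2, "updated_at": datetime(2024, 1, 3)},
    {"id": 3, "updated_at": datetime(2024, 1, 5)},
]

# First run picks up only rows after the stored watermark.
batch, watermark = incremental_load(source_rows, CONFIG, datetime(2024, 1, 2))

# A second run with the advanced watermark finds nothing new to process.
batch_again, watermark_again = incremental_load(source_rows, CONFIG, watermark)
```

Because the watermark column comes from configuration rather than code, the same function could in principle serve any source that exposes a monotonically increasing timestamp or key.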

Azure Databricks · Data Governance and Security · Big Data · Data Quality · Continuous Integration and Continuous Delivery (CI/CD) · Azure DevOps · +1 more

Manuh Technologies

Senior Data Engineer

Feb 2024 – Sep 2024 · 7 mos · Hyderabad · Hybrid

Designed and deployed end-to-end Azure data pipelines ingesting multi-source data into ADLS Gen2 using Azure Data Factory and PySpark.
Built and maintained Databricks notebooks for data transformation, applying Delta Lake best practices for reliability and performance.
Collaborated with cross-functional stakeholders to define data requirements and deliver analytics-ready datasets on time.
Conducted data quality checks and resolved ingestion issues to maintain pipeline stability in a fast-paced project environment.

Azure Databricks · PySpark · GitHub · Python (Programming Language) · Data Engineering · Data Pipeline Development

Self-employed / independent

Self-employed / Independent

Sep 2023 – Feb 2024 · 5 mos

Worked as an independent data engineering consultant, advising on Azure Databricks pipeline design and data governance. Completed Databricks Certified Data Engineer Professional certification and Profisee PaaS AKS Deployment certification during this period. Mentored junior data engineers and contributed to pipeline architecture reviews for small teams.

Wipro

Senior Software Engineer

Jan 2022 – Sep 2023 · 1 yr 8 mos · Hyderabad, Telangana, India · Hybrid

Designed and implemented data models and optimized SQL tables for efficient data processing and storage.
Executed project plans, managed tasks, and tracked progress to ensure timely delivery.
Managed customer master data using Profisee (MDM).
Monitored and optimized post-deployment data processes to maintain quality and performance.
Worked with gaming finance and customer data for analytics and reporting.
Developed SSRS reports using DBMail for reconciliation, campaign, and data load reporting.
Provided production support and resolved client-raised issues.
Collaborated with clients through Monday.com for file management and data processing.

Master Data Management · Hadoop · Azure Databricks · Microsoft SQL Server · PySpark · Spark Streaming · +7 more

Nityo Infotech

Senior System Engineer

Jul 2021 – Dec 2021 · 5 mos · Pune Division

Developed and optimised SQL-based data workflows for reporting and operational data processing.
Supported data migration and integration tasks, ensuring data integrity across source and target systems.
Contributed to requirement analysis and technical documentation for data pipeline enhancements.

Azure Databricks · Spark Streaming · Apache Spark · Azure Data Factory · Data Engineering · Data Pipeline Development

Career break

Career transition

Nov 2020 – Jul 2021 · 8 mos

Took a deliberate career break to deepen expertise in cloud data technologies. While already experienced with Azure data services, Spark, and SQL, I focused on strengthening advanced concepts and optimization techniques.

Tech Mahindra

Senior Associate Engineer

Jul 2019 – Nov 2020 · 1 yr 4 mos · Hyderabad

Built and maintained ETL pipelines using SSIS to extract, transform, and load data into SQL Server data warehouses.
Designed Star Schema data marts to support business reporting and analytics requirements.
Developed SSRS reports providing stakeholders with operational and performance insights.
Troubleshot and resolved data pipeline failures, improving overall reliability and reducing manual intervention.
Collaborated with business analysts to translate reporting needs into efficient SQL queries and data models.

Azure Databricks · GitHub · Spark Streaming · Apache Spark · Data Engineering · ETL Development

Infosys

2 roles

Senior Process Executive

Promoted

May 2018 – Jun 2019 · 1 yr 1 mo

Used ETL packages to pull data from local driver files into SQL tables.
Contributed to database design and built data marts extensively using Star Schema.
Created detailed SSRS reports using SQL data for various business insights.
Generated daily reports to monitor SSIS packages' data load status.
Streamlined the process of pulling data from local driver files to SQL tables using ETL packages.

SQL Server Integration Services (SSIS) · Microsoft SQL Server · SQL Server Reporting Services (SSRS) · Data Engineering · ETL Development

Process Executive

Mar 2017 – May 2018 · 1 yr 2 mos

Developed enhancements to add fields to the extractors per user requirements.
Responsible for incorporating all review comments into design documents.
Developed and reviewed SSIS jobs and obtained confirmation from clients.
Extensively used SSIS to develop parallel jobs for ETL operations such as extracting, cleansing, transforming, integrating, and loading data into SQL Server data warehouse tables.

SQL Server Integration Services (SSIS) · Microsoft SQL Server · Data Engineering · ETL Development