Druva Bobbilla

Product Manager

Irving, Texas, United States · 8 yrs 2 mos experience

Key Highlights

  • Led multi-terabyte cloud migrations with zero downtime.
  • Engineered resilient ETL pipelines and data lakes.
  • Developed automated solutions enhancing data quality and performance.

Skills

Core Skills

Cloud Data Migration · Data Engineering · Infrastructure as Code · Cloud Data Solutions · Big Data Solutions

Other Skills

Airflow · Amazon Athena · Analytical Skills · Apache Spark Streaming · Big Data Analytics · BigQuery · Cloudera · Collibra Platform · Cyber-Physical Systems · Data Analytics · Data Management · Data Quality · Data Science · Data Visualization · Databases

About

Senior Data Engineer | GCP & AWS | Cloud Data Migration & Advanced Data Engineering

I build and scale cloud-native data platforms that turn massive, complex datasets into fast, actionable insights. Over seven years across Fortune 500 environments, I've helped migrate large legacy data estates from Teradata and other RDBMS platforms to Google Cloud BigQuery, consistently improving performance and lowering cost.

What I bring to the table:

  • Led multi-terabyte cloud migrations using Cloud Composer / Airflow DAGs with dynamic parameters for datasets and project IDs.
  • Designed and executed end-to-end data validations: cross-system comparisons, automated QA with Gen-AI, and reconciliation scripts that repeatedly hit 100% accuracy.
  • Wrote extensive Python and Unix scripts, applying complex regular expressions to automate checks, parse multi-record layouts, and orchestrate ingestion pipelines.
  • Used PowerShell and shell scripting to inspect and filter files in GCS buckets, speeding diagnostics and audits.
  • Implemented RBAC policies, policy tags, and conformance-field strategies to meet strict governance and compliance needs.
  • Built resilient ETL pipelines, data lakes, and data-warehouse solutions; designed catch-up strategies for downtime recovery; and supported seamless production operations.
  • Partnered with leads and managers to present validation metrics and cost/performance analyses that influenced key business decisions.
  • Delivered observability and monitoring enhancements, automated pipeline alerts, and performance tuning that reduced processing times and improved system reliability.

Tech Snapshot: GCP (BigQuery, Cloud Composer/Airflow, Vertex AI, GCS) • AWS (S3, EMR, Redshift) • Hadoop • Spark • Hive • Advanced SQL • Python • Unix shell • PowerShell • Git • CI/CD • Data Warehousing • Data Lakes • Data Pipeline Automation • Orchestration • Monitoring & Performance Tuning

I'm passionate about turning legacy data estates into high-performing, cloud-native ecosystems and using analytics to unlock strategic value. Whether designing data pipelines that scale to billions of rows or mentoring teams on modern data-engineering practices, I thrive on solving complex problems and delivering measurable results. Beyond the technical side, I value collaboration and knowledge sharing, from mentoring junior engineers to helping leadership shape data strategy.

► Open to senior data engineering, cloud architecture, or technical leadership roles where I can scale next-generation platforms and drive data-driven innovation.
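The multi-record file parsing mentioned above can be sketched with Python's standard `re` module. The record layout here (header/detail/trailer lines keyed by a one-character prefix) is a hypothetical example, not the actual formats handled in these projects:

```python
import re

# Hypothetical pipe-delimited layout; the real record formats are not
# given in the profile, so these patterns are illustrative only.
RECORD_PATTERNS = {
    "H": re.compile(r"^H\|(?P<file_date>\d{8})$"),             # header
    "D": re.compile(r"^D\|(?P<id>\d+)\|(?P<amount>[\d.]+)$"),  # detail
    "T": re.compile(r"^T\|(?P<count>\d+)$"),                   # trailer
}

def parse_multi_record(lines):
    """Classify each line by its record-type prefix and extract named fields."""
    records = []
    for line in lines:
        pattern = RECORD_PATTERNS.get(line[:1])
        if pattern is None:
            raise ValueError(f"unknown record type: {line!r}")
        match = pattern.match(line)
        if match is None:
            raise ValueError(f"malformed record: {line!r}")
        records.append((line[:1], match.groupdict()))
    return records
```

Named groups keep the extraction self-documenting: each record type carries its own field names rather than positional indices.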

Experience

8 yrs 2 mos
Total Experience
1 yr 7 mos
Average Tenure
2 yrs 5 mos
Current Experience

Infosys

Technology Lead

Dec 2023 – Present · 2 yrs 5 mos · Irving, Texas, United States · On-site

  • Led a large-scale enterprise data-warehouse migration from multiple RDBMS platforms to Google Cloud BigQuery, modernizing analytics and reducing cost while maintaining strict data-quality standards.
  • Executed the end-to-end cloud migration strategy, delivering a seamless cut-over with zero unplanned downtime.
  • Engineered data-transformation and migration pipelines using SQL, BQSQL, PySpark on Dataproc, and automated complex workflows through Airflow (Cloud Composer) and Tidal to increase reliability and efficiency.
  • Redesigned legacy Hadoop jobs to enable direct ingestion into BigQuery, significantly lowering latency and processing time.
  • Applied Hive and advanced SQL/BQSQL for large-scale data queries and built reconciliation frameworks to ensure 100% data integrity throughout migration.
  • Created reusable Python and Unix scripts, leveraging advanced regular expressions to automate ETL tasks, data validations, and multi-record file parsing.
  • Used PowerShell and shell scripting to inspect and filter files in GCS buckets, accelerating diagnostics and environment-wide audits.
  • Implemented rigorous data-validation and reconciliation strategies, providing metrics and dashboards that assured business stakeholders of post-migration accuracy.
  • Proactively monitored production pipelines, rapidly resolving issues and safeguarding data accuracy and reliability.
  • Partnered with project leads and managers, presenting validation results and performance analyses that directly supported key business decisions.
Google Cloud BigQuery · SQL · PySpark · Airflow · Tidal · Cloud Data Migration +1
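The reconciliation frameworks described above can be sketched as a minimal per-table row-count comparison. The table names and the way counts are gathered are assumptions for illustration; in practice the counts would come from queries against the source RDBMS and BigQuery:

```python
def reconcile(source_counts, target_counts):
    """Compare per-table row counts between source and target systems.

    A migration is reported 'clean' only when every table matches and no
    table is missing on either side (a missing table shows up as None).
    """
    all_tables = set(source_counts) | set(target_counts)
    mismatches = {}
    for table in sorted(all_tables):
        src = source_counts.get(table)
        tgt = target_counts.get(table)
        if src != tgt:
            mismatches[table] = {"source": src, "target": tgt}
    return {
        "tables_checked": len(all_tables),
        "mismatches": mismatches,
        "clean": not mismatches,
    }
```

Row counts are only the first gate; a real framework would add column-level checksums or sampled value comparisons before certifying 100% integrity.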

Logging-in.com Inc.

Data Engineer

Aug 2022 – Nov 2023 · 1 yr 3 mos · Michigan, United States

  • Translated complex business requirements into detailed low-level design specifications, setting the foundation for high-performing data solutions.
  • Engineered serverless Python Cloud Functions to automate diverse, high-impact use cases, improving scalability and reducing manual overhead.
  • Built production-grade infrastructure-as-code with Google Cloud SDK and Terraform, enabling repeatable, fully automated environment creation across projects.
  • Designed and deployed configuration frameworks that powered robust ETL pipelines, delivering faster, more reliable data ingestion and transformation.
  • Led source-to-target data analysis and architected the target data model, ensuring accuracy and optimal performance for downstream analytics.
  • Executed rigorous unit and integration testing, guaranteeing enterprise-grade reliability and zero-defect releases.
  • Developed interactive analytics dashboards in Google Data Studio and Tableau, giving leadership real-time insight and driving data-driven decision-making.
  • Authored and maintained comprehensive technical documentation in Confluence, and consistently kept Jira stories current to support transparent, agile delivery.
Python · Google Cloud SDK · Terraform · Google Data Studio · Tableau · Data Engineering +1

Tata Consultancy Services

Data Engineer

Oct 2021 – Aug 2022 · 10 mos · Vernon Hills, Illinois, United States · On-site

  • Translated business requirements into low-level design, optimizing data modules for efficiency and scalability.
  • Utilized Python to craft cloud functions for diverse use cases, enabling dynamic DAG creation, GCS file processing, and seamless data loading into BigQuery.
  • Implemented efficient stored procedures and complex SQL queries for CDC and SCD, empowering curated data consumption on BigQuery.
  • Developed custom functions and authorized views for secure data sharing across projects within BigQuery.
  • Leveraged gcloud SDK and Terraform to script infrastructure creation, streamlining the deployment process.
  • Spearheaded the creation of dynamic DAGs using cloud functions, DAG templates, and configuration files, improving data pipeline flexibility.
  • Orchestrated data pipelines using Airflow (Composer) with Bash, Python, BigQuery, and Dataproc operators, ensuring smooth data flow.
  • Designed configuration files for building ETL pipelines, ensuring data transformation and integration accuracy.
  • Wrote Spark code to ingest data from various sources like Oracle, Postgres, and Pub/Sub, supporting Full Refresh, Incremental use cases, and data curation.
Python · BigQuery · Airflow · SQL · Data Engineering · Cloud Data Solutions
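The dynamic-DAG pattern described above (cloud functions plus DAG templates plus configuration files) can be sketched without Airflow installed: each config entry yields one pipeline spec that a template would render into a DAG file. The config entries, task names, and project ID below are illustrative assumptions:

```python
# Illustrative configs; in the real setup these would live in versioned
# configuration files, one per source-to-target pipeline.
PIPELINE_CONFIGS = [
    {"source": "oracle.orders", "target": "curated.orders", "mode": "incremental"},
    {"source": "postgres.users", "target": "curated.users", "mode": "full_refresh"},
]

def build_pipeline_specs(configs, project_id):
    """Derive one pipeline spec per config entry, keyed by a generated DAG id."""
    specs = {}
    for cfg in configs:
        dag_id = f"load_{cfg['target'].replace('.', '_')}"
        specs[dag_id] = {
            "tasks": ["extract", "transform", "load", "validate"],
            "source": cfg["source"],
            "target": f"{project_id}.{cfg['target']}",
            "mode": cfg["mode"],
        }
    return specs
```

The payoff of this approach is that adding a new table to the pipeline fleet means adding one config entry, not writing another DAG by hand.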

Teknxpert

Software Engineer

Jul 2021 – Oct 2021 · 3 mos · Charlotte, North Carolina, United States

  • Designed and maintained scalable data ingestion pipelines from diverse sources using tools like Kafka, Sqoop, and Flume.
  • Optimized Hadoop cluster performance through hardware and software fine-tuning, ensuring high availability.
  • Collaborated on HBase schema and Hive table design for optimal representation and query performance.
  • Ensured data integrity and accuracy with quality checks, working closely with data governance teams.
  • Oversaw Hadoop cluster health and resolved issues using tools like Ambari and Cloudera Manager.
  • Worked with data scientists and BI developers to translate data needs into scalable Hadoop solutions.
  • Implemented robust security measures, including Kerberos authentication and data encryption.
Kafka · Hadoop · HBase · Hive · Data Engineering · Big Data Solutions

IBM

Technology Analyst

Feb 2016 – Jul 2019 · 3 yrs 5 mos · Hyderabad, Telangana, India · On-site

  • Wrote Sqoop jobs to ingest data from RDBMS systems such as Oracle.
  • Created Hive tables of different types, including managed and external tables.
  • Built HQL files for aggregated tables, data enrichment, and UDF functions.
  • Wrote PySpark applications to read HDFS files and perform data clean-up.
  • Wrote Spark code to perform complex transformations using Spark SQL and UDFs.
  • Submitted Spark applications in lower environments and analyzed the logs.
  • Monitored and analyzed issues in production.
  • Loaded data into HBase using Spark.
  • Wrote Oozie workflows for job orchestration and scheduling.
  • Created client-specific data marts to share with stakeholders.
Sqoop · Hive · PySpark · Spark · Data Engineering · Big Data Solutions
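The data clean-up step mentioned above can be sketched as a plain Python function of the kind registered as a Spark UDF; the placeholder values it maps to None are illustrative, not the project's actual rules:

```python
import re

def clean_value(raw):
    """Normalise a raw string field the way a clean-up UDF might:
    trim, collapse internal whitespace, and map empty or placeholder
    values (an assumed set) to None."""
    if raw is None:
        return None
    value = re.sub(r"\s+", " ", raw).strip()
    if value.upper() in {"", "N/A", "NULL", "-"}:
        return None
    return value
```

In PySpark this would be wrapped with `pyspark.sql.functions.udf` and applied column-wise; keeping the logic in a plain function makes it unit-testable outside a Spark session.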

Education

University of Colorado Denver

Master's degree — Computer Science

Aug 2019 – May 2021

Jawaharlal Nehru Technological University

Bachelor's degree — Computer Science

Jun 2009 – May 2013

International School of Engineering (INSOFE)

PGP In Big Data Analytics and Optimization — Data Analytics

Mar 2017 – Jan 2018
