Shruthi Madumbu

Data Engineer

Plano, Texas, United States · 13 yrs 4 mos experience

Key Highlights

  • Expert in building scalable data pipelines.
  • Proficient in multiple cloud platforms including GCP, Azure, and AWS.
  • Strong background in data quality and ETL processes.

Skills

Core Skills

Data Engineering · Cloud Data Warehousing · Data Quality · Big Data Processing · Risk Management · Cloud Migration · Application Development · ETL Development

Other Skills

Apache Spark · Google Cloud Platform (GCP) · Dataproc · Google BigQuery · Snowflake · Delta Lake · Scala · Python (Programming Language) · Apache Airflow · Great Expectations · Hive · Hadoop · Big Data · Extract, Transform, Load (ETL) · Technical Design

About

Data Engineering - Data Analytics - Data Pipeline Design and Implementation - GCP - Azure - AWS - Snowflake

Experience

EmpiRx Health

Senior Data Engineer

May 2025 – Present · 10 mos · United States · Remote

Digimarc

Staff Data Engineer

Mar 2023 – Mar 2025 · 2 yrs · United States · Remote

  • Part of Digimarc’s Data Engineering team, responsible for creating data infrastructure and building an enterprise data lake in GCP and a warehouse in Snowflake to execute data strategies, with complete ownership of both platforms.
  • Created reusable, config-driven, highly scalable data pipelines that process CDC logs to build an ODS, using a Spark/Scala framework on Dataproc and creating Delta Lake tables in GCS.
  • As Snowflake admin, set up SSO, users, and all necessary RBAC and integrations, along with objects such as warehouses, databases, and schemas.
  • Automated data ingestion into Snowflake from external stages in GCS using Airflow (see the sketch after the skills list below).
  • Created data marts in Snowflake, including facts and dimensions, optimized for reporting applications.
  • Implemented data quality at scale using the Great Expectations framework, integrated with Airflow to run daily and generate DQ reports.
  • Worked on several data migration projects from on-prem legacy systems to GCP, reducing legacy infrastructure cost by 50%.
Apache Spark · Google Cloud Platform (GCP) · Dataproc · Google BigQuery · Snowflake · Delta Lake +5
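
The Airflow-driven ingestion above could look something like the minimal sketch below. The DAG name, connection id, stage, and table are hypothetical placeholders, and the operator import assumes the Airflow Snowflake provider; this is an illustration of the pattern, not the actual pipeline.

```python
# Minimal sketch: daily COPY INTO Snowflake from a GCS external stage.
# The DAG id, connection id, stage, and table below are hypothetical.
from datetime import datetime

from airflow import DAG
from airflow.providers.snowflake.operators.snowflake import SnowflakeOperator

with DAG(
    dag_id="gcs_to_snowflake_ingest",  # hypothetical name
    start_date=datetime(2024, 1, 1),
    schedule="@daily",  # Airflow 2.4+; older versions use schedule_interval
    catchup=False,
) as dag:
    # Load newly staged Parquet files into an ODS table.
    copy_orders = SnowflakeOperator(
        task_id="copy_orders_from_gcs_stage",
        snowflake_conn_id="snowflake_default",  # assumed Airflow connection
        sql="""
            COPY INTO ods.orders
            FROM @gcs_ext_stage/orders/
            FILE_FORMAT = (TYPE = PARQUET)
            MATCH_BY_COLUMN_NAME = CASE_INSENSITIVE;
        """,
    )
```

One convenient property of this pattern: Snowflake's COPY INTO tracks per-file load history, so retrying the task does not reload files that already landed.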

Visa

Staff Data Engineer

Nov 2021 – Jan 2023 · 1 yr 2 mos

  • Part of Visa’s Payment and Risk team; led a team of 5 to build a Risk Alert Monitoring Framework enabling users to create reports with self-service capabilities.
  • Processed large volumes (petabytes) of Visa’s transactional data using distributed computing with PySpark on Visa’s on-prem systems.
  • Integrated a YAML-based data quality framework into data pipelines to capture data quality issues and alert users (see the sketch after the skills list below).
  • Worked with cross-functional teams to set up infrastructure for robust CI/CD frameworks for pipeline deployments in production.
  • As a tech lead, collaborated with product owners to create and drive initiatives; involved in sprint planning, defining architecture, and establishing best practices.
Data Engineering · Hive · Apache Spark · Apache Airflow · Hadoop · Scala +4
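
A YAML-based quality framework like the one described above typically maps declarative rules onto DataFrame checks. The sketch below is a guess at the shape of such a framework, with an assumed rule schema, table, and rule names; it is not Visa’s actual implementation.

```python
# Sketch of a YAML-driven data quality check in PySpark.
# The rule schema, table, and column names below are illustrative assumptions.
import yaml
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

RULES_YAML = """
table: transactions
checks:
  - column: txn_id
    rule: not_null
  - column: amount
    rule: non_negative
"""

spark = SparkSession.builder.appName("dq-checks").getOrCreate()
rules = yaml.safe_load(RULES_YAML)
df = spark.table(rules["table"])  # assumes the table exists in the catalog

failures = []
for check in rules["checks"]:
    col, rule = check["column"], check["rule"]
    if rule == "not_null":
        bad = df.filter(F.col(col).isNull()).count()
    elif rule == "non_negative":
        bad = df.filter(F.col(col) < 0).count()
    else:
        raise ValueError(f"unknown rule: {rule}")
    if bad:
        failures.append(f"{col}: {rule} failed for {bad} rows")

# In a real pipeline the failure list would feed an alerting hook, not stdout.
print("\n".join(failures) or "all checks passed")
```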

Honeywell

Senior Data Engineer

Oct 2019 – Nov 2021 · 2 yrs 1 mo

  • Part of Honeywell’s centralized analytics team (HIA), responsible for building enterprise-scale applications such as Data Legos for Sales, Procurement, and CRM.
  • Built data pipelines using Apache Spark, integrating ERP systems from sources such as SAP, HANA, Every Angle, Snowflake, raw files, and Salesforce APIs (see the sketch after the skills list below).
  • Worked on migration from on-prem to Azure cloud; involved in end-to-end implementation, including migrating Ranger access policies, databases, tables, and code, and integrating the new platform with Databricks.
  • Involved in several POCs, including a new data quality framework using Amazon Deequ libraries and Scala APIs to fetch data from third-party services.
Data Engineering · Big Data · Extract, Transform, Load (ETL) · Technical Design · Azure Databricks · Azure Data Factory +7
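
Integrating several ERP-style sources in Spark usually reduces to reading each source through its connector and conforming the results. Here is a minimal sketch under assumed connection details; the HANA-style JDBC URL, tables, paths, and join key are placeholders.

```python
# Sketch: combining a JDBC (HANA-style) source with raw files in Spark.
# Connection details, table names, and paths are hypothetical placeholders.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("erp-integration").getOrCreate()

# Relational source pulled over JDBC (driver jar must be on the classpath).
orders = (
    spark.read.format("jdbc")
    .option("url", "jdbc:sap://erp-host:30015")  # hypothetical HANA endpoint
    .option("dbtable", "SALES.ORDERS")
    .option("user", "etl_user")
    .option("password", "***")
    .load()
)

# Flat-file source landed by an upstream extract.
customers = spark.read.option("header", True).csv("/landing/crm/customers/")

# Conform and join into a single analytics-ready dataset.
enriched = orders.join(customers, on="customer_id", how="left")
enriched.write.mode("overwrite").parquet("/curated/sales_orders/")
```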

IBM

Big Data Developer

Sep 2018 – Sep 2019 · 1 yr

  • Worked for ANZ Bank; developed ETL pipelines in Spark using AWS EMR for regulatory reporting applications.
  • Created jobs that import data from Teradata to S3 daily using Sqoop and build reporting datasets in Redshift (see the sketch after the skills list below).
  • Optimized data processing in Teradata, which holds the business logic required for regulatory reporting of Risk and Finance returns.
  • Wrote several automation scripts in Unix and Python to perform Teradata database migrations without impacting existing users and workloads.
Data Engineering · Big Data · Extract, Transform, Load (ETL) · ETL Development
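
A daily Sqoop import like the one described might be wrapped for scheduling roughly as below. The Teradata host, credentials path, table, and S3 bucket are hypothetical, and the target path assumes EMR’s s3a filesystem.

```python
# Sketch: daily Sqoop import from Teradata to S3, wrapped in Python for scheduling.
# Host, credentials, table, and bucket are hypothetical placeholders.
import subprocess
from datetime import date

# Partition the landing path by run date (hypothetical bucket).
TARGET = f"s3a://regulatory-landing/orders/dt={date.today():%Y-%m-%d}"

cmd = [
    "sqoop", "import",
    "--connect", "jdbc:teradata://td-host/DATABASE=FINANCE",  # hypothetical source
    "--username", "etl_user",
    "--password-file", "hdfs:///user/etl/.td_password",
    "--table", "ORDERS",
    "--target-dir", TARGET,
    "--num-mappers", "8",        # parallel split over the source table
    "--as-parquetfile",          # columnar output for downstream Spark/Redshift loads
]
subprocess.run(cmd, check=True)  # fail the job if the import fails
```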

Cognizant

ETL Developer

Sep 2015 – Sep 2018 · 3 yrs

  • Worked for a retail client; migrated ETL workflows from Informatica to Storm, producing flat files for further processing.
  • Worked intensively on Teradata utilities such as TPT (stream, load, export) and BTEQ.
  • Replicated data from Teradata to Cassandra and Hive databases using Sqoop for data analytics (see the sketch after the skills list below).
  • Involved in Teradata DBA activities: database migration, user and database role creation, and tagging roles for batch and consumer use.
Data Engineering · Extract, Transform, Load (ETL) · Data Modeling · ETL Development
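
For the Teradata-to-Hive replication, the analogous Sqoop invocation uses --hive-import. Again, the connection details and table names below are placeholders, not the client’s actual configuration.

```python
# Sketch: Sqoop replication of a Teradata table into Hive, wrapped in Python.
# Connection details and table names are hypothetical placeholders.
import subprocess

cmd = [
    "sqoop", "import",
    "--connect", "jdbc:teradata://td-host/DATABASE=RETAIL",  # hypothetical source
    "--username", "repl_user",
    "--password-file", "hdfs:///user/repl/.td_password",
    "--table", "DAILY_SALES",
    "--hive-import",                      # create/load the target Hive table
    "--hive-table", "analytics.daily_sales",
    "--num-mappers", "4",
]
subprocess.run(cmd, check=True)
```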

Tata Consultancy Services

Teradata Developer

Jun 2012 – Aug 2015 · 3 yrs 2 mos

  • Worked for a retail client, using the Teradata utilities FastLoad, MultiLoad, and FastExport to transfer data between DB2, flat files, and Teradata.
  • Worked with databases such as Oracle and MySQL, writing PL/SQL scripts.
  • Received an On the Spot Award for suggesting successful fine-tuning techniques.
Extract, Transform, Load (ETL) · ETL Development

Education

G. Pulla Reddy Engineering College

Bachelor of Technology, Electronics and Communication Engineering

Jan 2008 – Jan 2012

Rotary high school

Secondary Education

Jan 1995 – Jan 2006

Sri Chaitanya Junior College
