Bhaushi Aiyappa C

Data Engineer

India · 12 yrs 8 mos experience
Highly Stable

Key Highlights

  • Over 9 years of experience in Big Data and ETL.
  • Expert in developing scalable ETL pipelines using PySpark.
  • Proven track record in data quality improvement and performance tuning.
Stackforce AI infers this person is a Data Engineering expert with extensive experience in Big Data solutions and ETL processes.

Skills

Core Skills

PySpark, Azure Databricks, Apache Flink, Talend

Other Skills

Python (Programming Language), Apache Airflow, Google Cloud Platform (GCP), Databricks, Databricks Alerts, Performance Monitoring, Delta Lake, RDBMS, SFTP, Apache Kafka, Big Data, Amazon S3, AWS S3, Talend Cloud, Hadoop

About

Results-oriented, highly motivated, solutions-driven professional with over 9 years of Big Data and ETL (Extract, Transform, Load) experience in design and development. Involved in the complete software development life cycle (SDLC) of various projects, including requirements gathering, ETL design, data modeling, development, production enhancements, and hypercare. Excellent interpersonal and communication skills, with the ability to remain focused and self-assured in fast-paced, high-pressure environments.

Experience

Total Experience: 12 yrs 8 mos
Average Tenure: 2 yrs 11 mos
Current Experience: 11 mos

Walmart

Senior Data Engineer

Jul 2025 - Present · 11 mos

  • Developed reusable PySpark ETL functions with parameterized notebooks in Databricks for multiple ingestion pipelines, reducing code duplication.
  • Configured job failure alerts and performance monitoring using Databricks Alerts and cluster logs to ensure 24/7 data pipeline reliability.
  • Parsed and transformed nested JSON data using PySpark into flattened tables stored in Delta Lake for consumption by analytics teams.
  • Implemented partitioning strategy in Delta tables based on date and region fields to optimize query performance and reduce shuffle.
  • Worked on ETL development using Talend Data Integration.
Azure Databricks, Python (Programming Language), PySpark, Apache Airflow, Google Cloud Platform (GCP)

Vendavo

Senior Data Engineer

Aug 2024 - Oct 2024 · 2 mos

  • Used Apache Flink to fetch source data from RDBMS, SFTP servers, and Apache Kafka and load it into target tables.
Apache Airflow, Apache Flink

Accenture

Application Development Team Lead

Dec 2019 - Aug 2024 · 4 yrs 8 mos

  • Developed scalable ETL pipelines using PySpark in Databricks to ingest and transform batch data from AWS S3 to Delta Lake, enabling daily reporting for business stakeholders.
  • Implemented data validation and quality checks in PySpark to detect duplicates, nulls, and schema mismatches in input datasets, improving data reliability by 30%.
  • Utilized Delta Lake in Databricks for version-controlled data storage, allowing time-travel queries and rollback for critical financial datasets.
  • Tuned Talend Jobs for performance, including parallel execution and memory management.
  • Designed robust error handling and logging frameworks within Talend.
  • Implemented Scala/PySpark code in IntelliJ IDEA to load Hive data into the client network using Multiproduct (MP) code.
  • Developed Talend integrations loading data from REST API/RDBMS/Hadoop Hive sources to REST API/RDBMS/Hadoop Hive targets, which increased productivity in the region in under two years.
  • Worked with Apache Airflow using Python and PySpark as a replacement for Talend.
  • Worked with Multiproduct code.
  • Wrote SQL views and stored procedures combining multiple tables in Talend.
  • Fetched source data from Hive using the Talend Data Integration perspective and loaded the transformed data back into Hive.
  • Used SOAP, Postman, and Oracle concurrent programs to invoke, extract, and load data in Talend.
  • Loaded data from files into Hadoop Hive tables using Python.
  • Processed and analyzed terabytes to petabytes of data cost-effectively in Databricks.
Azure Databricks, PySpark, Talend, Big Data, Amazon S3

Capgemini

Application Development Analyst

Aug 2018 - Dec 2019 · 1 yr 4 mos · India

  • Created Standard (ETL and ELT), Spark Batch, and ECL (End Customer Logic) jobs.
  • Developed SQL against various databases such as Hive, Microsoft SQL Server, Vertica, and TERA.
  • Used Spark SQL to load data, create SchemaRDDs, and load them into Hive tables; handled structured data using Spark SQL.

Unisys

Junior Analyst

Jul 2016 - Aug 2018 · 2 yrs 1 mo · India

  • Imported and exported data between HDFS and relational database systems using Sqoop.
  • Worked with real-time logs and analyzed them using Flume.
  • Developed SQL queries against various databases such as Hive, Microsoft SQL Server, Redshift, Snowflake, Oracle MySQL.
  • Wrote SQL views and stored procedures combining multiple tables.

Coorg Institute of Technology, Ponnampet

Assistant Professor

Sep 2012 - May 2016 · 3 yrs 8 mos

Education

Visvesvaraya Technological University

Bachelor of Engineering - BE

Visvesvaraya Technological University

Master of Technology - MTech — Microelectronics & Control Systems
