Pavan Kumar Yerravelly

Software Engineer

Hyderabad, Telangana, India · 10 yrs 9 mos experience

Most Likely To Switch · AI ML Practitioner

Key Highlights

Expert in building data pipelines using Databricks and Talend.
Strong background in data engineering with extensive ETL experience.
Proven ability to enhance team productivity through reusable solutions.

Stackforce AI infers this person is a Data Engineering expert in the SaaS industry.

Skills

Core Skills

Data Engineering · Apache Spark · ETL Development

Other Skills

Amazon S3 · Apache Airflow · Apache Kafka · Apache Spark Streaming · Apache Sqoop · Artificial Intelligence (AI) · Artificial Intelligence for Business · Azure Data Lake · C · C# · CSS · Data Architects · Data Maintenance · Databricks

About

Experienced Data Engineer with a demonstrated history of working in the information technology and services industry. Skilled in PySpark, Databricks, Snowflake, Talend Open Studio, Apache Hive, Kafka, and HBase. Strong information technology professional with a master's degree in Computer Science and Engineering. My freelancing projects can be found at https://www.projectpro.io/user/instructor?faculty_id=60 and my blog at https://www.projectpro.io/article/real-world-data-engineering-projects-/472

Experience

10 yrs 9 mos

Total Experience

2 yrs 8 mos

Average Tenure

4 yrs 4 mos

Current Experience

Microsoft

2 roles

Senior Data Engineer

Promoted

Sep 2025 – Present · 9 mos · Hyderabad, Telangana, India

Data Engineer

Feb 2022 – Sep 2025 · 3 yrs 7 mos · Hyderabad, Telangana, India

Thales

Lead Data Engineer

Nov 2021 – Jan 2022 · 2 mos · India

Part of the development team for the InflytAnalytics suite of products
Developed optimized, automated extraction, transformation, and loading pipelines for data from a wide variety of sources using Databricks and Scala
Developed the data pipelines following trunk-based and test-driven development approaches

Databricks · Scala · Data Engineering · Apache Spark

Altimetrik

Senior Data Engineer

Nov 2020 – Nov 2021 · 1 yr · Hyderabad, Telangana, India

Developed configurable, reusable Databricks notebooks that increased the team's productivity
Developed an ad hoc scheduling framework
Worked directly with clients and modified code to meet new requirements
Performed code reviews and maintained GitLab
Led and mentored engineers on the team
Participated in drafting RFP responses

Infosys

Data Engineer

Feb 2019 – Nov 2020 · 1 yr 9 mos · Hyderabad Area, India

Imported data from various sources, transformed it with Talend, and exported it to targets such as HDFS, Hive tables, and Kafka streams
Created Hive external and managed tables, loaded data into them, and queried the data using HQL
Implemented the ORC data format for Apache Hive computations to handle custom business requirements
Extracted data to various file formats such as JSON and CSV
Performed near-real-time data extraction using Spark DataFrames and Kafka
Wrote data to and read data from HBase using Talend as well as Spark
Developed Kafka producers using Talend and validated data on the consumer side
Developed Talend jobs to copy files from one server to another using FTP components
Developed a Hive UDF using the Java HBase API to generate natural keys for given data and maintain integrity across the cluster and across batches
Automated processes using shell scripts, with HBase tracking for delta loads to the fact table
Handled job scheduling and log maintenance using the Zena scheduler
Used distributed version control with Git through Talend
Set up CI/CD using OSJ and UCD
Followed an Agile software development approach using JIRA

Talend · HDFS · Hive · Kafka · Data Engineering · ETL Development

ADP

ETL Developer

May 2015 – Jan 2019 · 3 yrs 8 mos · Hyderabad

Implemented the pass-thru interface that loads data from other HR systems into ADP EHRMS
Created new pages and components for various requirements
Customized pages and E-code validations to satisfy the requirements
Created Application Engine programs by adding new sections, steps, and actions
Defined run controls to run Application Engine programs and SQRs through Process Scheduler in the portal as required
Developed a customized SQR process to read data from flat files and process employee transfers, terminations, promotions, and new hires
Created new permission lists, assigned them to user profiles through roles, and selected the required actions based on business requirements
POC: Migrated data from EHRMS and an Oracle database into Hive using Sqoop
Analyzed the Hive data to produce reports and identify the business logic and relationships between various types of data