PRAVEEN KUMAR BARATAM

Data Engineer

Hyderabad, Telangana, India · 6 yrs 9 mos experience

Key Highlights

  • Expert in building scalable data pipelines.
  • Proficient in optimizing Spark job performance.
  • Certified Azure Data Engineer with hands-on experience.

Skills

Core Skills

Data Pipeline Development · ETL Pipelines · Data Engineering · Spark Applications

Other Skills

AWS Glue · AWS S3 · Agile Methodologies · Airflow · Amazon EC2 · Amazon S3 · Amazon Web Services (AWS) · Anaconda · Apache Airflow · Apache Kafka · Apache Spark · Apache Sqoop · Azure Data Lake · Azure Databricks · Azure DevOps

About

I am a Data Engineer with extensive experience building Big Data systems that deliver a unified analytics platform. With expertise in conceptualizing and implementing data pipelines, I turn raw data into informational insights that help the organization make data-driven decisions. My industry experience spans sporting goods manufacturing, food and beverage services, and banking.

☑️ Key Competencies:
➡️ Designing Big Data ETL pipelines
➡️ Building a unified analytics platform
➡️ Design thinking
➡️ Optimizing job execution time
➡️ Strategy planning and implementation
➡️ Communication

☑️ Technologies:
➡️ AWS and Azure cloud services
➡️ Apache Spark and PySpark programming
➡️ SQL for data analysis
➡️ Simplifying data analysis with Python
➡️ Managing Databricks clusters
➡️ Data extraction, transformation, and loading (Databricks & Hadoop)
➡️ Linux commands
➡️ Job scheduling with Apache Airflow (see the sketch below)
➡️ Data ingestion with Apache Sqoop
➡️ Apache Hive for reading, writing, and managing large datasets in the data warehouse
➡️ Version control with GitHub and Bitbucket
➡️ Jenkins for automating build, test, and deployment, facilitating CI/CD

☑️ Key Achievements:
➡️ Microsoft Certified: Azure Data Fundamentals (DP-900)
➡️ Microsoft Certified: Azure Data Engineer Associate (DP-203)
➡️ Partner Training - Developer Foundations (Databricks Community)
➡️ Partner Training - Developer Essentials (Databricks Community)
➡️ Optimized a Spark-based data integration framework using Spark best practices, improving job performance.
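To illustrate the Airflow scheduling listed above, here is a minimal sketch of a daily DAG that submits a PySpark batch job; the DAG id, schedule, and script path are hypothetical placeholders rather than any actual production job.

    # Minimal Airflow DAG sketch: schedule a daily PySpark batch job.
    # The dag_id, schedule, and script path are hypothetical examples.
    from datetime import datetime, timedelta

    from airflow import DAG
    from airflow.operators.bash import BashOperator

    default_args = {"retries": 2, "retry_delay": timedelta(minutes=10)}

    with DAG(
        dag_id="daily_etl_pipeline",
        start_date=datetime(2024, 1, 1),
        schedule_interval="0 2 * * *",  # run every day at 02:00
        catchup=False,
        default_args=default_args,
    ) as dag:
        # Hand the heavy lifting to the Spark cluster via spark-submit.
        run_etl = BashOperator(
            task_id="run_spark_etl",
            bash_command="spark-submit --master yarn /opt/jobs/etl_job.py",
        )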

Experience

Total Experience: 6 yrs 9 mos
Average Tenure: 1 yr 10 mos
Current Experience: 3 yrs 4 mos

HCLTech

Senior Technical Lead

Feb 2025 – Present · 1 yr 3 mos · Bangalore · Remote

  • As a Data Engineer in the pharmaceutical domain, I design, develop, and maintain scalable data pipelines that enable data-driven decision-making.
  • My key responsibilities include:
  • Data Pipeline Development & Optimization:
  • Design and implement ETL pipelines using PySpark and Microsoft Fabric to process large-scale pharmaceutical data (e.g., clinical trials, drug efficacy, patient records); a minimal sketch follows below.
  • Optimize Spark jobs through performance tuning, partitioning, and efficient resource utilization in Microsoft Fabric.
  • Develop and maintain SQL-based data models for analytics, reporting, and regulatory compliance.
Microsoft Fabric · Apache Spark · PySpark · SQL · Azure DevOps · Data Pipeline Development +1
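As a brief illustration of the ETL work described above, here is a minimal PySpark sketch of an extract-transform-load flow over clinical-trial-style records; the paths, column names, and dataset are hypothetical placeholders, not the actual pipeline.

    # Minimal PySpark ETL sketch; all paths and columns are hypothetical.
    from pyspark.sql import SparkSession, functions as F

    spark = SparkSession.builder.appName("clinical_trials_etl").getOrCreate()

    # Extract: read raw records from the lakehouse landing zone.
    raw = spark.read.parquet("/lakehouse/raw/clinical_trials")

    # Transform: standardize types, drop invalid rows, derive reporting columns.
    clean = (
        raw.withColumn("trial_date", F.to_date("trial_date", "yyyy-MM-dd"))
           .filter(F.col("patient_id").isNotNull())
           .withColumn("report_year", F.year("trial_date"))
    )

    # Load: write partitioned Parquet so downstream queries can prune by year.
    (clean.write
          .mode("overwrite")
          .partitionBy("report_year")
          .parquet("/lakehouse/curated/clinical_trials"))

Partitioning the output by year is one instance of the partitioning-based tuning mentioned in the bullets above.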

Dun & Bradstreet Technology and Corporate Services India LLP

Software Engineer - II

Apr 2024 – Jan 2025 · 9 mos · Hyderabad, Telangana, India · Hybrid

  • As a Software Engineer - II at Dun & Bradstreet, I design and implement scalable data solutions that power business intelligence, analytics, and data-driven decision-making for global information services.
  • My key responsibilities include:
  • Data Engineering & Pipeline Development:
  • Develop and optimize ETL pipelines using PySpark on Google Cloud Platform (GCP) to process large-scale datasets (e.g., business entity data, financial records, credit analytics).
  • Design efficient SQL queries and data models for structured and semi-structured data to support reporting and APIs.
  • Implement data transformations and aggregations that enhance Dun & Bradstreet’s data products and services (see the sketch below).
Google Cloud Platform (GCP) · Scala · SQL · Apache Airflow · GitHub · Data Engineering +1
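The following minimal PySpark sketch shows the kind of transformation and aggregation over semi-structured data described above; the GCS bucket, columns, and nested schema are hypothetical placeholders.

    # Sketch: flatten semi-structured entity records and aggregate per country.
    # The bucket, columns, and nested address struct are hypothetical.
    from pyspark.sql import SparkSession, functions as F

    spark = SparkSession.builder.appName("entity_aggregations").getOrCreate()

    # Semi-structured input: one JSON record per business entity.
    entities = spark.read.json("gs://example-bucket/raw/entities/")

    # Flatten the nested struct and roll up distinct entities per country,
    # the kind of aggregate a reporting layer or API might serve.
    per_country = (
        entities.select("entity_id", F.col("address.country").alias("country"))
                .groupBy("country")
                .agg(F.countDistinct("entity_id").alias("entity_count"))
    )

    per_country.write.mode("overwrite").parquet("gs://example-bucket/curated/entity_counts/")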

Intervue.io

On-Demand Interviewer

Nov 2023 – Present · 2 yrs 6 mos · Hyderabad, Telangana, India · Remote

  • Intervue helps companies accelerate hiring by taking candidate interviews on their behalf.
  • I evaluate Data Engineer interviews for both fresher and experienced candidates.

Topmate.io

Super Mentor

Jan 2023 – Present · 3 yrs 4 mos · Remote

  • Talks about #bigdata, #dataengineering, #programmerslife, #bigdataanalytics, and #bigdatadeveloper

IBM

Data Engineer

Jul 2022 – Mar 2024 · 1 yr 8 mos · Hyderabad, Telangana, India · Hybrid

  • Developed Spark applications using PySpark for data extraction, transformation, and aggregation across multiple file formats, analyzing and transforming the data to uncover insights into customer usage patterns.
  • Estimated cluster size and handled monitoring and troubleshooting of the Databricks Spark cluster.
  • Tuned Spark application performance by setting the right batch interval, the correct level of parallelism, and appropriate memory settings.
  • Performed unit testing on 20+ new scripts weekly to identify bugs and coordinated with the team to fix them (a pytest sketch follows below).
  • Performed code conversion from SAS to PySpark according to the defined strategy.
  • Modified and refactored the codebase to meet the target framework's standards.
  • Tested the converted code for functionality and compatibility.
  • Collaborated with the technical lead to resolve technical challenges.
  • Ensured version control and documentation of the converted code.
  • Developed and executed test plans for the converted code.
  • Verified and validated the functionality and performance of the converted code.
  • Identified, reported, and resolved issues and bugs in the converted code.
  • Ensured compliance with coding standards and best practices.
  • Documented the code conversion process and guidelines.
  • Created user manuals and technical documentation for the converted application.
  • Managed the code in GitLab to track changes, collaborate with team members, and maintain the codebase history.
Azure DevOps · GitLab · Unit Testing · pytest · Docker · Apache Spark +7
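Given the weekly unit-testing work mentioned above, here is a minimal pytest sketch for testing a PySpark transformation on a local Spark session; the transformation under test (add_usage_flag) and its columns are hypothetical examples, not the actual scripts.

    # Sketch: unit-testing a PySpark transformation with pytest.
    # add_usage_flag and its columns are hypothetical examples.
    import pytest
    from pyspark.sql import SparkSession, functions as F

    def add_usage_flag(df, threshold=100):
        """Flag customers whose usage exceeds the given threshold."""
        return df.withColumn("heavy_user", F.col("usage") > threshold)

    @pytest.fixture(scope="module")
    def spark():
        session = SparkSession.builder.master("local[1]").appName("tests").getOrCreate()
        yield session
        session.stop()

    def test_add_usage_flag(spark):
        df = spark.createDataFrame([("a", 50), ("b", 150)], ["customer", "usage"])
        result = {r["customer"]: r["heavy_user"] for r in add_usage_flag(df).collect()}
        assert result == {"a": False, "b": True}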

Mindtree

Senior Software Engineer

Sep 2021 – Jun 2022 · 9 mos · Hyderabad, Telangana, India · Remote

  • Designed, developed, and implemented performant ETL pipelines using Apache Spark's Python API (PySpark) on Databricks.
  • Integrated data storage solutions in Spark, especially AWS S3 object storage.
  • Tuned the performance of PySpark scripts.
  • Built the infrastructure required for optimal extraction, transformation, and loading (ETL) of data from a wide variety of sources such as SAP HANA and MySQL.
  • Built analytics tools on top of the data pipeline to provide actionable insights into customer acquisition, operational efficiency, and other key business performance metrics.
  • Built reusable data ingestion and data transformation frameworks using PySpark (see the sketch below).
  • Coordinated with business customers to gather business requirements, worked with technical peers to derive technical requirements, and delivered the BRD and TDD documents.
Databricks · Python (Programming Language) · SQL · AWS Glue · Terraform · Amazon S3 +7
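A reusable ingestion framework like the one mentioned above typically wraps Spark's reader behind a small configuration-driven helper; here is a minimal sketch, with hypothetical bucket names, formats, and options.

    # Sketch of a reusable, configuration-driven ingestion helper for Spark on S3.
    # Bucket names, formats, and options are hypothetical examples.
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("ingestion_framework").getOrCreate()

    def ingest(source_path, fmt="csv", options=None):
        """Read a dataset from object storage with per-source options,
        so new sources are onboarded through configuration, not new code."""
        reader = spark.read.format(fmt)
        for key, value in (options or {}).items():
            reader = reader.option(key, value)
        return reader.load(source_path)

    # Example usage: two different sources, one helper.
    orders = ingest("s3a://example-bucket/raw/orders/", "csv",
                    {"header": "true", "inferSchema": "true"})
    events = ingest("s3a://example-bucket/raw/events/", "json")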

HCL Technologies

Graduate Engineering Trainee

Mar 2019 – May 2021 · 2 yrs 2 mos · Noida, Uttar Pradesh, India · On-site

Intern Theory Career Solutions Pvt. Ltd.

Brand Manager

Mar 2018 – Apr 2018 · 1 mo · India · Remote

Cyient

Summer Internship

May 2017 – Jun 2017 · 1 mo · Madhapur, Hyderabad · On-site

Education

Jawaharlal Nehru Technological University, Kakinada

Bachelor of Technology - BTech — Computer Science and Engineering

Jul 2015 – May 2018
