Pavitra Rai

Data Engineer

Pune, Maharashtra, India · 4 yrs 3 mos experience
Highly Stable

Key Highlights

  • Designed scalable data pipelines on AWS.
  • Automated ETL workflows improving data reliability.
  • Delivered analytics-ready datasets for decision-making.
Stackforce AI infers this person is a Data Engineer specializing in scalable cloud-based data solutions.

Skills

Core Skills

Data Engineering · Big Data · Data Analysis

Other Skills

AWS Glue · AWS Lambda · Amazon EC2 · Amazon Elastic MapReduce (EMR) · Amazon Kinesis · Amazon Redshift · Amazon S3 · Amazon Web Services (AWS) · Apache Kafka · Apache Spark · Data Analytics · Data Modeling · Data Visualization · Data Warehousing · Docker

About

I am a Data Engineer with hands-on experience designing and operating scalable, cloud-based data pipelines for large datasets. I have built and automated end-to-end ETL workflows, improved data reliability, and delivered analytics-ready datasets for reporting and decision-making. My work includes developing distributed data processing pipelines using PySpark on AWS, managing Amazon S3-based data lakes, and migrating legacy data into scalable cloud architectures. I have implemented real-time ingestion and streaming systems using Apache Kafka and Amazon Kinesis, ensuring low-latency processing and operational stability. I have contributed to improving data quality through robust data modeling, automated validation, and monitoring, resulting in more accurate reporting and fewer operational issues. I hold a Bachelor of Technology degree in Biomedical Engineering and have strong proficiency in Python, SQL, and Power BI. I am comfortable owning the full data lifecycle, from ingestion and transformation to delivery, and I focus on building reliable, production-grade data systems.

Experience

Consultadd inc.

2 roles

Big Data Engineer

Promoted

Jul 2024 – Oct 2025 · 1 yr 3 mos

  • Designed and optimized large-scale batch data pipelines using PySpark on AWS Glue and EMR, reducing EMR job runtimes by 10–12% through improved joins, partitioning, and execution strategies.
  • Automated data ingestion and pipeline orchestration using AWS Lambda and Step Functions, while migrating legacy datasets into Amazon S3 to improve pipeline reliability and performance by 10–15%.
  • Optimized analytical workloads in Amazon Redshift by refactoring complex SQL queries and implementing Python-based data validation checks, reducing heavy query runtimes by ~20%.
  • Integrated high-volume event data from Amazon MSK and Kinesis into downstream processing workflows to support near real-time analytics and reduce dashboard latency.
  • Containerized data workloads using Docker and deployed them on Kubernetes (EKS), improving deployment reliability and operational consistency.
PySpark · AWS Glue · EMR · AWS Lambda · Step Functions · Amazon S3 · +7

Analyst | Management Engineer

Jan 2024 – Jul 2024 · 6 mos

  • Delivered data-driven insights by analyzing customer and operational datasets using Python and SQL, and built Power BI dashboards that improved workflow visibility and reduced process issue detection time by 15%.
Python · SQL · Power BI · Data Analysis

Club kshitij

3 roles

Joint Secretary

Aug 2023 – Jul 2024 · 11 mos

  • Led planning and execution of large-scale cultural events and competitions, coordinating a team of 80+ members across logistics, promotions, and operations.
  • Drove event marketing and outreach through social media and campus campaigns, increasing inter-college participation and securing external sponsorships.

Information Manager

Jul 2022 – Sep 2023 · 1 yr 2 mos

Event Coordinator

Jul 2021 – Jul 2022 · 1 yr

Shalby limited

Data Analytics Intern

May 2022 – Jun 2022 · 1 mo · Indore, Madhya Pradesh, India

Education

Shri G S Institute of Technology & Science

Bachelor of Technology - BTech

Jan 2020 – Jan 2024

St. Arnolds Higher Secondary school

Schooling

Sep 2006 – May 2019
