SURENDRA REDDY

Data Engineer

Bengaluru, Karnataka, India · 1 yr 8 mos experience

AI Enabled · AI ML Practitioner

Key Highlights

Designed scalable data pipelines processing 50M+ records daily.
Improved AI model accuracy by 15% through data annotation.
Established CI/CD workflows enhancing operational efficiency.

Stackforce AI infers this person is a Data Engineer specializing in scalable data solutions and real-time data processing.

Skills

Core Skills

Data Engineering · ETL · Data Annotation · Machine Learning · Full-stack Development

Other Skills

Python (Programming Language) · Flask · PySpark · Databricks · Apache Kafka · AWS · Docker · CI/CD · Git · Bitbucket Pipelines · Data Quality · Microservices · Boto3 · Artificial Intelligence (AI)

About

Experienced in building scalable ETL pipelines, APIs, and automation workflows. Skilled in Python, PySpark, Databricks, and Flask. Passionate about developing data-driven solutions that power efficient, real-time insights.

Experience

Total Experience: 1 yr 8 mos
Average Tenure: 1 yr 8 mos
Current Experience: 1 yr 8 mos

Petrabytes corp

Data Engineer

Oct 2024 – Present · 1 yr 8 mos · Bengaluru, Karnataka, India · On-site

Designed scalable batch and real-time data pipelines on AWS processing 50M+ records daily, reducing execution time by 40% through PySpark and distributed processing optimizations.
Built ETL/ELT pipelines using Databricks Lakehouse with medallion architecture (Bronze/Silver/Gold), enforcing data quality, completeness, and reliability across domains.
Integrated Apache Kafka for real-time event-driven data streaming, enabling low-latency ingestion and reliable processing across distributed microservices.
Implemented RESTful APIs in Python/Flask for pipeline monitoring, health checks, and AWS infrastructure automation via Boto3, improving observability and operational efficiency.
Established CI/CD DevOps workflows with Git and Bitbucket Pipelines to containerize and deploy microservices via Docker across dev and production environments using Infrastructure-as-Code practices.
Contributed to data architecture and dimensional modeling (Star Schema) design discussions; collaborated with Product Managers and Application Engineers following Agile/Scrum methodology.
Monitored live site reliability and resolved performance bottlenecks proactively to maintain service SLAs and ensure zero-downtime deployments.

Python (Programming Language) · Flask · PySpark · Databricks · Apache Kafka · AWS · +6

Samsung India

Data Trainee

May 2024 – Oct 2024 · 5 mos · Bengaluru, Karnataka, India · On-site

Processed and labelled large-scale video datasets for the Bixby AI project, converting unstructured content into structured training data to improve speech and visual recognition accuracy by 15%.
Collaborated with data scientists in Agile sprints to refine annotation guidelines and iteratively improve labelling consistency, reducing rework rate across the ML training pipeline.
Maintained strict data quality standards through rigorous input validation prior to submission, supporting reliable and reproducible ML model development pipelines.

Python (Programming Language) · Artificial Intelligence (AI) · Data Annotation · Machine Learning