SURENDRA REDDY

Data Engineer

Bengaluru, Karnataka, India · 1 yr 8 mos experience
AI Enabled · AI ML Practitioner

Key Highlights

  • Designed scalable data pipelines processing 50M+ records daily.
  • Improved AI model accuracy by 15% through data annotation.
  • Established CI/CD workflows enhancing operational efficiency.
Stackforce AI infers this person is a Data Engineer specializing in scalable data solutions and real-time data processing.

Skills

Core Skills

Data Engineering · ETL · Data Annotation · Machine Learning · Full-stack Development

Other Skills

Python (Programming Language) · Flask · PySpark · Databricks · Apache Kafka · AWS · Docker · CI/CD · Git · Bitbucket Pipelines · Data Quality · Microservices · Boto3 · Artificial Intelligence (AI)

About

Experienced in building scalable ETL pipelines, APIs, and automation workflows. Skilled in Python, PySpark, Databricks, and Flask. Passionate about developing data-driven solutions that power efficient, real-time insights.

Experience

Total Experience: 1 yr 8 mos
Average Tenure: 1 yr 8 mos
Current Experience: 1 yr 8 mos

Petrabytes Corp

Data Engineer

Oct 2024 - Present · 1 yr 8 mos · Bengaluru, Karnataka, India · On-site

  • Designed scalable batch and real-time data pipelines on AWS processing 50M+ records daily, reducing execution time by 40% through PySpark and distributed processing optimizations.
  • Built ETL/ELT pipelines using Databricks Lakehouse with medallion architecture (Bronze/Silver/Gold), enforcing data quality, completeness, and reliability across domains.
  • Integrated Apache Kafka for real-time event-driven data streaming, enabling low-latency ingestion and reliable processing across distributed microservices.
  • Implemented RESTful APIs in Python/Flask for pipeline monitoring, health checks, and AWS infrastructure automation via Boto3, improving observability and operational efficiency.
  • Established CI/CD DevOps workflows with Git and Bitbucket Pipelines to containerize and deploy microservices via Docker across dev and production environments using Infrastructure-as-Code practices.
  • Contributed to data architecture and dimensional modeling (Star Schema) design discussions; collaborated with Product Managers and Application Engineers following Agile/Scrum methodology.
  • Monitored live site reliability and resolved performance bottlenecks proactively to maintain service SLAs and ensure zero-downtime deployments.
Python (Programming Language) · Flask · PySpark · Databricks · Apache Kafka · AWS · +6 more

Samsung India

Data Trainee

May 2024 - Oct 2024 · 5 mos · Bengaluru, Karnataka, India · On-site

  • Processed and labelled large-scale video datasets for the Bixby AI project, converting unstructured content into structured training data to improve speech and visual recognition accuracy by 15%.
  • Collaborated with data scientists in Agile sprints to refine annotation guidelines and iteratively improve labelling consistency, reducing rework rate across the ML training pipeline.
  • Maintained strict data quality standards through rigorous input validation prior to submission, supporting reliable and reproducible ML model development pipelines.
Python (Programming Language) · Artificial Intelligence (AI) · Data Annotation · Machine Learning

Webstack Academy - WSA

Full Stack Web Developer (MERN)

Aug 2023 - Oct 2023 · 2 mos · Bengaluru, Karnataka, India · Remote

React.js · MongoDB · Full-Stack Development

Education

Cambridge Institute of Technology

Bachelor of Engineering (BE), Electronics and Communications Engineering

Nov 2020 - Jun 2024
