Yogesh Tiwari

Data Engineer

Bengaluru, Karnataka, India · 2 yrs 10 mos experience

Key Highlights

  • Optimized data pipelines reducing costs by $11K/month.
  • Enhanced CDC framework improving data ingestion reliability.
  • Designed database systems for improved data accuracy.

Skills

Core Skills

Big Data & Processing · Cloud Platforms · Data Engineering Practices · Databases

Other Skills

AWS · AWS CodeBuild · AWS Identity and Access Management (AWS IAM) · AWS SQS · Algorithms · Amazon Relational Database Service (RDS) · Amazon Simple Notification Service (SNS) · Big Data Analytics · C++ · CDC · Data Modeling · Data Structures · Databricks · Delta Tables · ETL

About

I am a Data Engineer with 3+ years of experience building scalable, reliable, and high-performance data pipelines and platforms. My expertise lies in designing and optimizing batch and real-time data processing solutions using Spark (Structured Streaming), Databricks, and Airflow, along with strong hands-on experience in AWS cloud services (S3, Glue, Lambda). I have worked extensively on streaming pipelines using Kafka and Debezium (CDC), data lake architectures with Delta Lake, and databases such as PostgreSQL and MongoDB. My experience also covers performance tuning, pipeline optimization, and migration projects, ensuring both scalability and cost-efficiency.

Core Skills:

  • Programming: Python, SQL
  • Big Data & Processing: Spark (Structured Streaming, PySpark), Databricks, Apache Airflow, Apache Kafka, Debezium (CDC), Delta Lake
  • Cloud Platforms: AWS
  • Databases: PostgreSQL, MongoDB, MySQL
  • Data Engineering Practices: Performance tuning, pipeline optimization, migration & data modelling

Experience

Junglee Games

Data Engineer

May 2024 – Oct 2025 · 1 yr 5 mos · Bengaluru, Karnataka, India · Hybrid

  • Optimized 30+ Databricks Spark jobs by auditing cluster usage and tuning autoscaling, boosting CPU/memory utilization from 35% to 70%+ and cutting compute costs by $11K/month.
  • Reduced AWS storage costs by 70% by analyzing Spark UI shuffle/spill patterns and eliminating unnecessary NVMe/EBS usage.
  • Refactored complex streaming pipelines with optimized join strategies, enabling Adaptive Query Execution (AQE) and reducing batch time by 90%.
  • Redesigned ETL logic for the flagship game Rummy, fixing 90%+ of rake and game-count discrepancies and unblocking 10+ downstream BI/analytics teams.
  • Enhanced the in-house CDC framework by onboarding critical production tables, debugging ingestion failures, and improving schema handling, boosting stability and reliability.
Databricks · Spark · AWS · ETL · CDC · Big Data & Processing

SLTL Group – Sahajanand Laser Technology Ltd

Software Engineer - Data

Dec 2022 – May 2024 · 1 yr 5 mos · Gandhinagar, Gujarat, India

  • Designed and implemented a database system for diamond processing with optimized table structures for data accuracy.
  • Built tailored SQL queries for tracking software, improving data presentation and decision-making.
  • Developed ETL flows for laser machine logs (CSV extraction, schema analysis, cleaning, MySQL loading), enabling better insights into machine performance and product quality.
  • Integrated external API-driven JSON data into the database, ensuring clean, structured ingestion for seamless invoice generation.
SQL · ETL · MySQL · Databases

Techzone Solutions

Freelance Developer

Sep 2022 – Nov 2022 · 2 mos

Education

Govind Ballabh Pant Engineering College - India

Bachelor of Technology — Electrical Engineering

Aug 2018 – Jan 2022

Campus School Pantnagar

XII (Senior Secondary) – CBSE

Jan 2018 – Present

Campus School Pantnagar

X (Secondary) – CBSE

Jan 2016 – Present
