Rahul Sharma

Software Engineer

Madrid, Community of Madrid, Spain · 6 yrs 4 mos experience

Key Highlights

  • Reduced data processing time by 20%
  • Minimized manual intervention in data pipelines by 80%
  • Developed a Unified Data Platform with <15 min SLA

Skills

Core Skills

Data Engineering · Cloud Computing · Data Governance

Other Skills

Data Architecture · ETL Process Optimization · Real-time Data Processing · Data Integration · Data Quality Assurance · CI/CD Automation · Apache Kafka · Apache Spark · AWS · Kafka · Debezium · Spark · Airflow · SQL · Apache Airflow

About

Hello! I am a dedicated Data Engineer with over 6 years of experience in designing, developing, and implementing comprehensive data engineering solutions across various industries, including Gaming, Financial Services, and Technology. Currently, I serve as a Senior Data Engineer at Junglee Games, where I lead projects aimed at enhancing data accessibility and operational efficiency. My expertise lies in building scalable data architectures, ETL pipelines, and cloud-native platforms using technologies such as Apache Spark, Kafka, and AWS. Throughout my career, I have achieved significant milestones, including reducing data processing time by 20% and minimizing manual intervention in data pipeline orchestration by 80%. I have also spearheaded the development of key platforms like the Unified Data Platform for real-time Change Data Capture and an internal data integration tool called "Avengers." My passion for data engineering drives me to foster collaboration and continuous improvement within my teams, ensuring that we deliver high-quality solutions that meet organizational goals.

Key Achievements:

  • Developed a Unified Data Platform that reduced data availability time to under 15 minutes.
  • Created an internal tool that decreased business report generation time by 70%.
  • Recognized with multiple awards for outstanding contributions, including the Top Rookie Award in May 2024.

Technical Skills:

  • Data Architecture, Business Intelligence, ETL Process Optimization, Cloud Computing, Data Quality Assurance, Real-time Data Processing, Data Governance, CI/CD Automation, Data Integration, Data Analytics, Data Security, BI & Reporting Automation
  • Programming Languages: PySpark, Python, SQL, Scala, Spark, Data Structures, Algorithms
  • Cloud Services and Technologies: Databricks, Amazon S3, AWS Glue, AWS EMR, AWS Lambda, AWS EKS, Spark, Spark SQL, Delta Lake, YARN, Kubernetes, Hive, CI/CD Pipelines
  • Streaming Technology: Kafka, Spark Structured Streaming
  • Data Engineering Tools: Data Modeling and Warehousing, ETL/ELT Data Pipelines, Medallion Architecture
  • Orchestration: Airflow, Kubernetes
  • Familiar with: Flask, FastAPI, Elasticsearch, Docker, Git, CI/CD Pipelines
  • Databases: MySQL, MSSQL, PostgreSQL, MongoDB

GitHub Profile: https://github.com/panditrahulsharma

Experience

Total Experience: 6 yrs 4 mos
Average Tenure: 1 yr 8 mos
Current Experience: 4 mos

Thoughtworks

Senior Data Engineer

Jan 2026 – Present · 4 mos · Madrid, Community of Madrid, Spain · On-site

Junglee Games

2 roles

Data Engineer-2

Promoted

Apr 2022 – Present · 4 yrs 1 mo

  • Key Responsibilities:
  • Led a team of 4 Data Engineers to design and optimize scalable data pipelines and ETL processes, ensuring high data quality and reliability.
  • Built an open-source Kafka SMT (Single Message Transform) plugin in Java to encrypt/decrypt sensitive data in Debezium connectors, enhancing real-time data security.
  • Implemented an A/B testing framework for balance recommendations (95% adoption), leading to a 6% increase in deposits and 8% boost in user retention.
  • Highlighted Projects:
  • 🔹 Unified Data Platform (UDP)
  • Built a real-time CDC platform using Kafka, Debezium, Spark, Airflow, and AWS to stream data into Amazon S3 with a <15 min SLA (a simplified streaming sketch appears below).
  • Created a YAML-based data contract system to automate Debezium connector creation and Spark job scheduling via a one-click CI/CD pipeline, reducing manual effort by 80% (a simplified contract-to-connector sketch appears below).
  • Developed a Flask API to orchestrate and monitor workflows and platform health.
  • 🔹 Avengers – Internal Data Integration & Governance Tool
  • Designed Avengers, a web-based tool (Airbyte-inspired) for automated reporting, data quality validation, PII governance, and lineage tracking.
  • Enabled non-technical users to build ETL pipelines via a drag-and-drop UI into the gold data layer, reducing business report generation time by 70%.
  • Integrated RBAC to enforce data access controls and compliance standards.
  • 🏆 Awards & Recognition
  • Top Rookie Award – May 2024
  • Troubleshooter Award – November 2023
  • Exception Award – January 2023
  • Technologies used: PySpark, Spark, Kafka, Airflow, Databricks, AWS (S3, Glue, EMR, Lambda, EKS), Kubernetes, Debezium, Flask, SQL, Delta Lake, CI/CD
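
A minimal sketch of what the contract-driven connector setup described above could look like, assuming a hypothetical contract layout and Kafka Connect endpoint; it only covers connector registration and omits credentials, the schema history topic, and other settings a real Debezium connector requires:

```python
# Sketch only: turn a YAML data contract into a Debezium connector definition
# and register it through the Kafka Connect REST API. Contract fields and the
# Connect URL are illustrative placeholders, not the original tooling.
import json

import requests  # assumed available in the CI/CD job
import yaml      # PyYAML

CONNECT_URL = "http://kafka-connect:8083/connectors"  # hypothetical endpoint

CONTRACT_YAML = """
source:
  database: payments
  host: mysql.internal
  port: 3306
  tables: [transactions, wallets]
topic_prefix: cdc.payments
"""

def build_connector_config(contract: dict) -> dict:
    """Translate a data contract into a Debezium MySQL connector definition."""
    src = contract["source"]
    return {
        "name": f"debezium-{src['database']}",
        "config": {
            "connector.class": "io.debezium.connector.mysql.MySqlConnector",
            "database.hostname": src["host"],
            "database.port": str(src["port"]),
            "database.include.list": src["database"],
            "table.include.list": ",".join(
                f"{src['database']}.{t}" for t in src["tables"]
            ),
            "topic.prefix": contract["topic_prefix"],
            # credentials, schema history topic, etc. omitted for brevity
        },
    }

if __name__ == "__main__":
    contract = yaml.safe_load(CONTRACT_YAML)
    payload = build_connector_config(contract)
    # POST /connectors registers the connector (fails if the name exists)
    resp = requests.post(CONNECT_URL, json=payload)
    resp.raise_for_status()
    print(json.dumps(resp.json(), indent=2))
```
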
Data Architecture · ETL Process Optimization · Real-time Data Processing · Data Integration · Data Quality Assurance · Cloud Computing (+5 more)
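
For the streaming half of the UDP, a condensed PySpark Structured Streaming sketch of the Kafka-to-S3 path; the topic, bucket, and trigger interval are placeholders, and it assumes the Kafka and Delta Lake Spark packages are available on the cluster:

```python
# Sketch only: read change events from Kafka and land them as Delta files on
# S3 on a fixed trigger, keeping end-to-end latency within a ~15 min target.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("cdc-to-s3-sketch").getOrCreate()

events = (
    spark.readStream
    .format("kafka")
    .option("kafka.bootstrap.servers", "broker:9092")   # placeholder broker
    .option("subscribe", "cdc.payments.transactions")   # placeholder topic
    .option("startingOffsets", "earliest")
    .load()
    .selectExpr("CAST(key AS STRING) AS key", "CAST(value AS STRING) AS value")
)

query = (
    events.writeStream
    .format("delta")
    .option("checkpointLocation", "s3://example-bucket/checkpoints/transactions")
    .outputMode("append")
    .trigger(processingTime="10 minutes")  # micro-batches inside the SLA window
    .start("s3://example-bucket/bronze/transactions")
)

query.awaitTermination()
```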

Data Engineer-3

Apr 2022 – Mar 2026 · 3 yrs 11 mos

SQL

Infoobjects inc.

Data Engineer

Apr 2021 – Jun 2022 · 1 yr 2 mos · Jaipur, Rajasthan, India

  • Implemented Spark optimization techniques, including caching, multithreading, and broadcast joins, reducing processing time by 20% for daily loads of ~2 million records (a simplified sketch follows this list).
  • Designed and implemented advanced scheduling with Apache Airflow for data pipeline orchestration, reducing manual intervention by 80% and improving workflow efficiency.
  • Built a data mesh platform from scratch using AWS, Databricks, Starburst, and Kubernetes, supporting a scalable, cloud-native data architecture.
  • Developed configurable self-service data ingestion pipelines to ensure seamless, frictionless data onboarding.
  • Designed a multi-tenant data storage system leveraging AWS S3, Apache Spark, and Delta Lake for efficient and secure data management (see the sketch after the skills line below).
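
A rough illustration, with hypothetical table names and paths, of the caching and broadcast-join tuning mentioned in the first bullet above:

```python
# Sketch only: cache a DataFrame that is reused downstream and broadcast the
# small dimension side of a join so Spark skips the shuffle of a sort-merge join.
from pyspark.sql import SparkSession
from pyspark.sql.functions import broadcast

spark = SparkSession.builder.appName("daily-load-sketch").getOrCreate()

# ~2M fact rows per day (illustrative path)
facts = spark.read.parquet("s3://example-bucket/raw/daily_transactions/")
facts.cache()  # reused by both aggregations below

# small lookup table: broadcasting avoids shuffling the large fact table
dims = spark.read.parquet("s3://example-bucket/ref/merchants/")
enriched = facts.join(broadcast(dims), on="merchant_id", how="left")

daily_totals = enriched.groupBy("merchant_id").sum("amount")
daily_counts = facts.groupBy("merchant_id").count()  # second use of the cache

daily_totals.write.mode("overwrite").parquet("s3://example-bucket/out/totals/")
daily_counts.write.mode("overwrite").parquet("s3://example-bucket/out/counts/")
```
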
Spark · Apache Airflow · AWS · Data Mesh · Data Ingestion · Data Engineering (+1 more)
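
And on the multi-tenant storage point, a toy sketch assuming a hypothetical tenant_id column and bucket layout: a single Delta table partitioned by tenant, so access controls and retention can be applied per prefix:

```python
# Sketch only: write one Delta table partitioned by tenant_id on S3, then read
# back a single tenant's slice with partition pruning.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("multi-tenant-sketch").getOrCreate()

events = spark.read.json("s3://example-bucket/landing/events/")

(
    events.write
    .format("delta")
    .mode("append")
    .partitionBy("tenant_id")  # each tenant lands under its own partition path
    .save("s3://example-bucket/silver/events")
)

# Reads can then be restricted to a single tenant's partition:
tenant_view = (
    spark.read.format("delta")
    .load("s3://example-bucket/silver/events")
    .where("tenant_id = 'acme'")
)
tenant_view.show(5)
```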

Metaorigin labs

Software Engineer (Big Data)

Jan 2020 – Apr 2021 · 1 yr 3 mos · India

  • Led data migration projects from on-premises to the cloud (Spark), reducing processing time by 20% and orchestrating AWS S3 Data Lakehouse ingestion to improve data accessibility by 30%.
  • Developed high-performance data pipelines using Spark and Kafka for diverse structured and unstructured data, including a microservice deployed on Kubernetes with FastAPI for real-time processing status tracking.
  • Engineered specialized tools such as CloutCube, a web-based transcription and subtitling tool, and built custom Airflow operators and dynamic DAGs to enhance pipeline flexibility and automation (a simplified DAG-generation sketch follows this list).
  • Executed robust ETL processes to ensure data integrity and pipeline stability, and designed a comprehensive BI solution framework for end-to-end business intelligence workflows.
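
A condensed sketch of the dynamic-DAG idea from the bullets above, not the original code: one ingestion DAG is generated per source from a small config dict; the source names, schedules, and extract callable are hypothetical:

```python
# Sketch only: generate one Airflow DAG per configured source at parse time.
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator

SOURCES = {"orders": "@hourly", "payments": "@daily"}  # illustrative config

def extract(source: str, **_context) -> None:
    # placeholder for the real extraction / load logic
    print(f"extracting {source}")

for source, schedule in SOURCES.items():
    with DAG(
        dag_id=f"ingest_{source}",
        start_date=datetime(2021, 1, 1),
        schedule_interval=schedule,
        catchup=False,
    ) as dag:
        PythonOperator(
            task_id=f"extract_{source}",
            python_callable=extract,
            op_kwargs={"source": source},
        )
    # registering the DAG object in globals() lets the scheduler discover it
    globals()[f"ingest_{source}"] = dag
```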

Education

Rajasthan Technical University, Kota

Bachelor's degree — Computer Engineering

Jun 2016 – May 2020
