Rahul Sharma

Software Engineer

Madrid, Community of Madrid, Spain · 6 yrs 4 mos experience

Key Highlights

  • Reduced data processing time by 20%
  • Minimized manual intervention in data pipelines by 80%
  • Developed a Unified Data Platform with <15 min SLA

Skills

Core Skills

Data Engineering · Cloud Computing · Data Governance

Other Skills

Data Architecture · ETL Process Optimization · Real-time Data Processing · Data Integration · Data Quality Assurance · CI/CD Automation · Apache Kafka · Apache Spark · AWS · Kafka · Debezium · Spark · Airflow · SQL · Apache Airflow

About

Hello! I am a dedicated Data Engineer with over 6 years of experience in designing, developing, and implementing comprehensive data engineering solutions across various industries, including Gaming, Financial Services, and Technology. Currently, I serve as a Senior Data Engineer at Junglee Games, where I lead projects aimed at enhancing data accessibility and operational efficiency. My expertise lies in building scalable data architectures, ETL pipelines, and cloud-native platforms using technologies such as Apache Spark, Kafka, and AWS. Throughout my career, I have achieved significant milestones, including reducing data processing time by 20% and minimizing manual intervention in data pipeline orchestration by 80%. I have also spearheaded the development of key platforms like the Unified Data Platform for real-time Change Data Capture and an internal data integration tool called "Avengers." My passion for data engineering drives me to foster collaboration and continuous improvement within my teams, ensuring that we deliver high-quality solutions that meet organizational goals.

Key Achievements:

  • Developed a Unified Data Platform that reduced data availability time to under 15 minutes.
  • Created an internal tool that decreased business report generation time by 70%.
  • Recognized with multiple awards for outstanding contributions, including the Top Rookie Award in May 2024.

Technical Skills:

  • Data Architecture, Business Intelligence, ETL Process Optimization, Cloud Computing, Data Quality Assurance, Real-time Data Processing, Data Governance, CI/CD Automation, Data Integration, Data Analytics, Data Security, BI & Reporting Automation
  • Programming Languages: PySpark, Python, SQL, Scala, Spark, Data Structures, Algorithms
  • Cloud Services and Technologies: Databricks, Amazon S3, AWS Glue, AWS EMR, AWS Lambda, AWS EKS, Spark, Spark SQL, Delta Lake, YARN, Kubernetes, Hive, CI/CD Pipelines
  • Streaming Technology: Kafka, Spark Structured Streaming
  • Data Engineering Tools: Data Modeling and Warehousing, ETL/ELT Data Pipelines, Medallion Architecture
  • Orchestration: Airflow, Kubernetes
  • Familiar with: Flask, FastAPI, Elasticsearch, Docker, Git, CI/CD Pipelines
  • Databases: MySQL, MSSQL, PostgreSQL, MongoDB

GitHub Profile: https://github.com/panditrahulsharma

Experience

Total Experience: 6 yrs 4 mos
Average Tenure: 1 yr 8 mos
Current Experience: 4 mos

Thoughtworks

Senior Data Engineer

Jan 2026 – Present · 4 mos · Madrid, Community of Madrid, Spain · On-site

Junglee Games

2 roles

Data Engineer-2

Promoted

Apr 2022 – Present · 4 yrs 1 mo

  • Key Responsibilities:
  • Led a team of 4 Data Engineers to design and optimize scalable data pipelines and ETL processes, ensuring high data quality and reliability.
  • Built an open-source Kafka SMT (Single Message Transform) plugin in Java to encrypt/decrypt sensitive data in Debezium connectors, enhancing real-time data security.
  • Implemented an A/B testing framework for balance recommendations (95% adoption), leading to a 6% increase in deposits and 8% boost in user retention.
  • Highlighted Projects:
  • 🔹 Unified Data Platform (UDP)
  • Built a real-time CDC platform using Kafka, Debezium, Spark, Airflow, and AWS to stream data into Amazon S3 with a <15 min SLA (a simplified streaming sketch appears below).
  • Created a YAML-based data contract system to automate Debezium connector creation and Spark job scheduling via a one-click CI/CD pipeline, reducing manual effort by 80% (a simplified contract-to-connector sketch appears below).
  • Developed a Flask API to orchestrate and monitor workflows and platform health.
  • 🔹 Avengers – Internal Data Integration & Governance Tool
  • Designed Avengers, a web-based tool (Airbyte-inspired) for automated reporting, data quality validation, PII governance, and lineage tracking.
  • Enabled non-technical users to build ETL pipelines via a drag-and-drop UI into the gold data layer, reducing business report generation time by 70%.
  • Integrated RBAC to enforce data access controls and compliance standards.
  • 🏆 Awards & Recognition
  • Top Rookie Award – May 2024
  • Troubleshooter Award – November 2023
  • Exception Award – January 2023
  • Technologies used: PySpark, Spark, Kafka, Airflow, Databricks, AWS (S3, Glue, EMR, Lambda, EKS), Kubernetes, Debezium, Flask, SQL, Delta Lake, CI/CD
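
A minimal sketch of what the contract-driven connector setup described above could look like, assuming a hypothetical contract layout and Kafka Connect endpoint; it only covers connector registration and omits credentials, the schema history topic, and other settings a real Debezium connector requires:

```python
# Sketch only: turn a YAML data contract into a Debezium connector definition
# and register it through the Kafka Connect REST API. Contract fields and the
# Connect URL are illustrative placeholders, not the original tooling.
import json

import requests  # assumed available in the CI/CD job
import yaml      # PyYAML

CONNECT_URL = "http://kafka-connect:8083/connectors"  # hypothetical endpoint

CONTRACT_YAML = """
source:
  database: payments
  host: mysql.internal
  port: 3306
  tables: [transactions, wallets]
topic_prefix: cdc.payments
"""

def build_connector_config(contract: dict) -> dict:
    """Translate a data contract into a Debezium MySQL connector definition."""
    src = contract["source"]
    return {
        "name": f"debezium-{src['database']}",
        "config": {
            "connector.class": "io.debezium.connector.mysql.MySqlConnector",
            "database.hostname": src["host"],
            "database.port": str(src["port"]),
            "database.include.list": src["database"],
            "table.include.list": ",".join(
                f"{src['database']}.{t}" for t in src["tables"]
            ),
            "topic.prefix": contract["topic_prefix"],
            # credentials, schema history topic, etc. omitted for brevity
        },
    }

if __name__ == "__main__":
    contract = yaml.safe_load(CONTRACT_YAML)
    payload = build_connector_config(contract)
    # POST /connectors registers the connector (fails if the name exists)
    resp = requests.post(CONNECT_URL, json=payload)
    resp.raise_for_status()
    print(json.dumps(resp.json(), indent=2))
```
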
Data Architecture · ETL Process Optimization · Real-time Data Processing · Data Integration · Data Quality Assurance · Cloud Computing (+5 more)
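
For the streaming half of the UDP, a condensed PySpark Structured Streaming sketch of the Kafka-to-S3 path; the topic, bucket, and trigger interval are placeholders, and it assumes the Kafka and Delta Lake Spark packages are available on the cluster:

```python
# Sketch only: read change events from Kafka and land them as Delta files on
# S3 on a fixed trigger, keeping end-to-end latency within a ~15 min target.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("cdc-to-s3-sketch").getOrCreate()

events = (
    spark.readStream
    .format("kafka")
    .option("kafka.bootstrap.servers", "broker:9092")   # placeholder broker
    .option("subscribe", "cdc.payments.transactions")   # placeholder topic
    .option("startingOffsets", "earliest")
    .load()
    .selectExpr("CAST(key AS STRING) AS key", "CAST(value AS STRING) AS value")
)

query = (
    events.writeStream
    .format("delta")
    .option("checkpointLocation", "s3://example-bucket/checkpoints/transactions")
    .outputMode("append")
    .trigger(processingTime="10 minutes")  # micro-batches inside the SLA window
    .start("s3://example-bucket/bronze/transactions")
)

query.awaitTermination()
```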

Data Engineer-3

Apr 2022 – Mar 2026 · 3 yrs 11 mos

SQL

Infoobjects inc.

Data Engineer

Apr 2021 – Jun 2022 · 1 yr 2 mos · Jaipur, Rajasthan, India

  • Implemented Spark optimization techniques, including caching, multithreading, and broadcast joins, reducing processing time by 20% for daily loads of ~2 million records (a simplified sketch follows this list).
  • Designed and implemented advanced scheduling with Apache Airflow for data pipeline orchestration, reducing manual intervention by 80% and improving workflow efficiency.
  • Built a data mesh platform from scratch using AWS, Databricks, Starburst, and Kubernetes, supporting a scalable, cloud-native data architecture.
  • Developed configurable self-service data ingestion pipelines to ensure seamless, frictionless data onboarding.
  • Designed a multi-tenant data storage system leveraging AWS S3, Apache Spark, and Delta Lake for efficient and secure data management (see the sketch after the skills line below).
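
A rough illustration, with hypothetical table names and paths, of the caching and broadcast-join tuning mentioned in the first bullet above:

```python
# Sketch only: cache a DataFrame that is reused downstream and broadcast the
# small dimension side of a join so Spark skips the shuffle of a sort-merge join.
from pyspark.sql import SparkSession
from pyspark.sql.functions import broadcast

spark = SparkSession.builder.appName("daily-load-sketch").getOrCreate()

# ~2M fact rows per day (illustrative path)
facts = spark.read.parquet("s3://example-bucket/raw/daily_transactions/")
facts.cache()  # reused by both aggregations below

# small lookup table: broadcasting avoids shuffling the large fact table
dims = spark.read.parquet("s3://example-bucket/ref/merchants/")
enriched = facts.join(broadcast(dims), on="merchant_id", how="left")

daily_totals = enriched.groupBy("merchant_id").sum("amount")
daily_counts = facts.groupBy("merchant_id").count()  # second use of the cache

daily_totals.write.mode("overwrite").parquet("s3://example-bucket/out/totals/")
daily_counts.write.mode("overwrite").parquet("s3://example-bucket/out/counts/")
```
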
Spark · Apache Airflow · AWS · Data Mesh · Data Ingestion · Data Engineering (+1 more)
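
And on the multi-tenant storage point, a toy sketch assuming a hypothetical tenant_id column and bucket layout: a single Delta table partitioned by tenant, so access controls and retention can be applied per prefix:

```python
# Sketch only: write one Delta table partitioned by tenant_id on S3, then read
# back a single tenant's slice with partition pruning.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("multi-tenant-sketch").getOrCreate()

events = spark.read.json("s3://example-bucket/landing/events/")

(
    events.write
    .format("delta")
    .mode("append")
    .partitionBy("tenant_id")  # each tenant lands under its own partition path
    .save("s3://example-bucket/silver/events")
)

# Reads can then be restricted to a single tenant's partition:
tenant_view = (
    spark.read.format("delta")
    .load("s3://example-bucket/silver/events")
    .where("tenant_id = 'acme'")
)
tenant_view.show(5)
```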

Metaorigin labs

Software Engineer (Big Data)

Jan 2020 – Apr 2021 · 1 yr 3 mos · India

  • Led data migration projects from on-premises to the cloud (Spark), reducing processing time by 20% and orchestrating AWS S3 Data Lakehouse ingestion to improve data accessibility by 30%.
  • Developed high-performance data pipelines using Spark and Kafka for diverse structured and unstructured data, including a microservice deployed on Kubernetes with FastAPI for real-time processing status tracking.
  • Engineered specialized tools such as CloutCube, a web-based transcription and subtitling tool, and built custom Airflow operators and dynamic DAGs to enhance pipeline flexibility and automation (a simplified DAG-generation sketch follows this list).
  • Executed robust ETL processes to ensure data integrity and pipeline stability, and designed a comprehensive BI solution framework for end-to-end business intelligence workflows.
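
A condensed sketch of the dynamic-DAG idea from the bullets above, not the original code: one ingestion DAG is generated per source from a small config dict; the source names, schedules, and extract callable are hypothetical:

```python
# Sketch only: generate one Airflow DAG per configured source at parse time.
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator

SOURCES = {"orders": "@hourly", "payments": "@daily"}  # illustrative config

def extract(source: str, **_context) -> None:
    # placeholder for the real extraction / load logic
    print(f"extracting {source}")

for source, schedule in SOURCES.items():
    with DAG(
        dag_id=f"ingest_{source}",
        start_date=datetime(2021, 1, 1),
        schedule_interval=schedule,
        catchup=False,
    ) as dag:
        PythonOperator(
            task_id=f"extract_{source}",
            python_callable=extract,
            op_kwargs={"source": source},
        )
    # registering the DAG object in globals() lets the scheduler discover it
    globals()[f"ingest_{source}"] = dag
```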

Education

Rajasthan Technical University, Kota

Bachelor's degree — Computer Engineering

Jun 2016 – May 2020
