Rajeev Kumar

Data Engineer

Noida, Uttar Pradesh, India · 2 yrs 7 mos experience

Key Highlights

  • Expert in designing scalable ETL/ELT pipelines.
  • Proficient in Azure cloud technologies and data governance.
  • Strong background in bridging data engineering and business intelligence.

Skills

Core Skills

Data Engineering · ETL/ELT Pipelines

Other Skills

Agile · Apache Spark · Azure Data Factory · Azure Data Lake Storage Gen2 · Azure Databricks · Azure DevOps · Azure Synapse Analytics · Change Data Capture · Data Analyst · Data Ingestion · Data Processing · Data Quality · Data Visualization with Python · Delta Lake · Delta Live Tables

About

I’m a passionate Data Engineer skilled in Python, SQL, Azure, and Azure Databricks, with hands-on experience designing and building scalable ETL/ELT pipelines and data architectures in Azure cloud environments. I’ve worked extensively on Medallion Architecture using Bronze, Silver, and Gold layers in Azure Databricks, implementing Autoloader to ensure each file is ingested only once for efficient and reliable data processing. My focus areas include data quality, performance optimization, and data governance, ensuring that business teams get clean and trusted insights. With additional experience in Power BI and Python for analytics, I bridge the gap between data engineering and business intelligence—transforming raw data into meaningful, actionable insights. Always eager to learn and explore new technologies, I enjoy solving complex data challenges and optimizing workflows to deliver impactful, data-driven solutions. I’m currently expanding my skills by learning the AWS Cloud Platform to strengthen my multi-cloud expertise. 
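The Auto Loader pattern mentioned above, where each file is ingested only once into the Bronze layer, can be sketched roughly as follows. This is a hedged illustration, not a copy of the author's pipeline: the paths and table names are hypothetical, and the snippet assumes a Databricks runtime where `spark` is predefined, so it is not runnable as a standalone script.

```python
# Databricks Auto Loader: land raw files in a Bronze Delta table.
# The checkpoint records which files have been seen, so each file
# is ingested exactly once, even across pipeline restarts.
(spark.readStream
      .format("cloudFiles")
      .option("cloudFiles.format", "json")
      .option("cloudFiles.schemaLocation", "/mnt/bronze/orders/_schema")  # hypothetical path
      .load("/mnt/raw/orders/")                                           # hypothetical landing zone
      .writeStream
      .option("checkpointLocation", "/mnt/bronze/orders/_checkpoint")
      .trigger(availableNow=True)
      .toTable("bronze.orders"))
```

The checkpoint location is what gives Auto Loader its exactly-once file semantics; deleting it would cause previously seen files to be reprocessed.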
Skills

  • Programming & Scripting: Python, PySpark
  • Databases & SQL: MySQL, Azure SQL, RDBMS, Data Modeling (Star/Snowflake schema), Query Optimization, Stored Procedures
  • ETL & Data Engineering: End-to-End ETL/ELT Pipelines, Data Ingestion, Transformation & Loading, Data Quality & Validation, Metadata Management, Data Lineage
  • Big Data & Cloud: Azure Databricks, Apache Spark, Delta Lake, Azure Data Lake Storage (ADLS), Cloud Storage, Synapse Analytics, Azure Data Factory
  • Data Orchestration & Workflow Management: Apache Airflow, Azure Data Factory Pipelines, Scheduling & Automation
  • Streaming & Real-Time Processing: Apache Kafka (basics), Spark Structured Streaming, Event Hubs
  • Analytics & Visualization: Power BI, Tableau, Python (Matplotlib, Pandas, Seaborn)
  • Data Governance & Security: Data Quality, Data Lineage, Role-Based Access Control (RBAC), GDPR/Data Privacy Awareness
  • Version Control & CI/CD: Git, GitHub/GitLab, CI/CD Pipelines
  • Deployment & Containerization: Docker (basics), Kubernetes (exposure)

📧 Contact: Rajeevreck@gmail.com

Experience

Wipro

Data Engineer

May 2023 – Dec 2025 · 2 yrs 7 mos · Noida, Uttar Pradesh, India

  • Architected and implemented enterprise-grade data pipelines on Azure using Databricks, Azure Data Factory, and Synapse Analytics, aligned with the Medallion Architecture (Bronze–Silver–Gold).
  • Designed incremental data ingestion frameworks using Delta Lake to efficiently process high-volume datasets while reducing compute costs and load times.
  • Implemented Change Data Capture mechanisms to capture and process source-system updates with minimal latency and data duplication.
  • Built and maintained Slowly Changing Dimensions (SCD Type 1 & Type 2) using PySpark and Delta tables to preserve historical accuracy and enable reliable analytics.
  • Developed Delta Live Tables pipelines for automated data transformations, data quality enforcement, and dependency management in production environments.
  • Modelled and optimized Fact–Dimension schemas in Synapse Analytics to support large-scale analytical and reporting workloads.
  • Processed and transformed 10M+ records daily from ADLS Gen2 to Synapse, improving data availability and reducing analytics latency by 35%.
  • Implemented data reconciliation, audit checks, and exception handling to ensure end-to-end pipeline reliability and compliance.
  • Integrated pipelines with Azure DevOps CI/CD for version control, automated testing, and multi-environment deployments.
  • Technical Environment: Azure Databricks, PySpark, Delta Lake, Delta Live Tables, Azure Data Factory, Azure Data Lake Storage Gen2, Azure Synapse Analytics, MySQL, Python, Azure DevOps, Power BI, Apache Spark, Hadoop, Agile, JIRA.
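The SCD Type 2 work described above can be illustrated with a minimal, plain-Python sketch of the merge logic. In production this would typically be a Delta Lake MERGE running in PySpark; the column names (`valid_from`, `valid_to`, `is_current`) are a hypothetical convention, not taken from the original pipelines.

```python
from datetime import date

def scd2_upsert(dim_rows, source_rows, key, attrs, today=None):
    """SCD Type 2: expire changed current rows, append new versions.

    dim_rows: existing dimension rows, each a dict with the key, the tracked
              attributes, and 'valid_from' / 'valid_to' / 'is_current'.
    source_rows: latest snapshot rows, each a dict with the key and attributes.
    """
    today = today or date.today().isoformat()
    current = {r[key]: r for r in dim_rows if r["is_current"]}
    out = list(dim_rows)
    for src in source_rows:
        cur = current.get(src[key])
        if cur and all(cur[a] == src[a] for a in attrs):
            continue  # unchanged: keep the current version as-is
        if cur:
            # Attribute changed: close out the current version (preserves history)
            cur["valid_to"] = today
            cur["is_current"] = False
        # Insert the new current version (also covers brand-new keys)
        out.append({key: src[key], **{a: src[a] for a in attrs},
                    "valid_from": today, "valid_to": None, "is_current": True})
    return out

# Example: customer 1 moves from Noida to Delhi; customer 2 is new.
dim = [{"customer_id": 1, "city": "Noida",
        "valid_from": "2024-01-01", "valid_to": None, "is_current": True}]
src = [{"customer_id": 1, "city": "Delhi"}, {"customer_id": 2, "city": "Pune"}]
result = scd2_upsert(dim, src, "customer_id", ["city"], today="2025-01-01")
```

After the upsert, the old Noida row is retained with `is_current=False` and a closed `valid_to`, which is what preserves historical accuracy for point-in-time analytics.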

Education

Rajkiya Engineering College (Kannauj)

B.Tech — Computer Science and Engineering

Jan 2018 – Jan 2022

Nav Janodaya Inter College

Intermediate — PCM

Jul 2014 – May 2016

Nav Janodaya Inter College

High school — Science

Jun 2012 – May 2014
