Vijay Kumar

Data Engineer

Bengaluru, Karnataka, India · 11 yrs 4 mos experience

Key Highlights

  • Expert in building scalable ETL/ELT frameworks.
  • Proven track record in optimizing Spark pipelines.
  • Strong background in data quality and automation.

Skills

Core Skills

Designing scalable ETL/ELT frameworks · Optimizing Spark pipelines · Building reusable PySpark ETL libraries

Other Skills

Apache Spark · Azure · Data Vault 2.0 · ETL · Spark · Performance tuning · Hadoop · Big Data · PySpark · Data Transformation · Data Systems · Microsoft SQL Server · Microsoft Azure · Azure Databricks · Azure Data Lake

About

I’m a Data Engineer experienced in Big Data and modern cloud data engineering. I specialize in building scalable, high-performance data platforms using Azure Databricks, PySpark, SQL, Delta Lake, and Kafka.

At NTT DATA, I’ve delivered impact across platform architecture, performance tuning, automation, and data reliability: I built an ETL framework from scratch using Data Vault 2.0, improved Spark processing for 10M+ daily records, automated validation workflows, and helped strengthen data quality, cost visibility, and disaster recovery.

My core strengths include:

  • Designing scalable ETL/ELT frameworks.
  • Optimizing Spark pipelines for speed and reliability.
  • Implementing Data Vault 2.0, PIT logic, SCD Type 2, and star schema models.
  • Automating validation and reducing manual effort.
  • Building resilient, production-ready data solutions.

I’m known for turning complex data challenges into clean, efficient, business-ready platforms, and I’m open to opportunities where I can lead engineering initiatives, improve data architecture, and drive measurable outcomes.

Experience

11 yrs 4 mos
Total Experience
4 yrs 1 mo
Average Tenure
3 yrs 1 mo
Current Experience

NTT DATA

Data Engineer

Apr 2023 – Present · 3 yrs 1 mo · Bengaluru

  • Designing scalable ETL/ELT frameworks for terabytes of highly complex data.
  • Optimizing Spark pipelines for speed and reliability.
  • Implementing Data Vault 2.0, PIT logic, SCD Type 2, and star schema models.
  • Automating validation and reducing manual effort.
  • Building resilient, production-ready data solutions.
  • Converted business requirements into scalable data warehousing solutions.
  • Implemented AI-based automation to streamline workflows and reduce manual effort.
Apache Spark · Azure · Designing scalable ETL/ELT frameworks · Optimizing Spark pipelines

Rieter

Data Engineer

Jun 2018 – Jan 2023 · 4 yrs 7 mos · Chandigarh, India

  • Built reusable PySpark ETL libraries to accelerate development and improve consistency.
  • Designed a metadata-driven low-code/no-code ETL framework for 200+ tables, reducing development time by 30%.
  • Processed APF, EPF, SCF, JSON, XML, and binary files using mapPartitions and multithreading, improving performance by 30%.
  • Enhanced Azure Data Factory incremental and SCD Type 2 pipelines for better efficiency and reliability.
Hadoop · Big Data · Building reusable PySpark ETL libraries

Voltas Limited – A Tata Enterprise

Service Engineer

Sep 2014 – May 2018 · 3 yrs 8 mos

  • Analyzed system logs and service issues for troubleshooting and root cause analysis.
  • Prepared operational reports and maintained service-related documentation.
  • Supported textile operations through technical service, reporting, and issue tracking.

Education

The Technological Institute of Textiles & Sciences

Bachelor of Technology