Kalash Kalwani

Software Engineer

Bengaluru, Karnataka, India · 3 yrs 9 mos experience

AI Enabled · AI ML Practitioner

Key Highlights

Reduced data staleness by 80% with a Debezium-based Change Data Capture (CDC) framework.
Optimized Delta Lake storage, improving query performance by 60%.
Achieved 90% reduction in processing time for ETL pipelines.

Stackforce AI infers this person is a Data Engineering specialist with a strong focus on scalable data systems and cloud technologies.

Skills

Core Skills

Data Engineering, AWS

Other Skills

API, AWS Elastic Beanstalk, AWS SSM, Agile Methodologies, Amazon EC2, C (Programming Language), Cascading Style Sheets (CSS), Computer Networking, Data Ingestion, Database Management System (DBMS), Databricks, Debezium, Deep Learning, Delta Lake, Django

About

I'm a Data Engineer with over 3.5 years of hands-on experience building and optimizing scalable real-time and batch data pipelines using Python, Spark, SQL, and AWS. I specialize in designing robust data architectures that support data-driven decision-making, improve observability, and ensure data integrity across distributed systems.

In my current role, I have led initiatives to implement CDC frameworks, optimize Delta Lake storage, and improve analytics query performance by over 60%. This work helped reduce data staleness by 80% and latency by 33%, contributing to faster, more reliable insights for multiple teams. I also have a strong track record from my time at Impetus, where I built high-throughput ETL pipelines, reduced processing time by 90%, and implemented monitoring solutions that improved uptime and incident response.

I hold Databricks certifications in Data Engineering and Machine Learning and have built deep technical experience across cloud platforms (AWS), data lakes (Delta Lake), orchestration (Airflow), and observability (Prometheus, Grafana). I am always eager to solve complex problems, collaborate cross-functionally, and keep learning, and I'm open to connecting with fellow data professionals, product teams, and anyone passionate about scalable data systems.

Experience

Total Experience: 3 yrs 9 mos

Average Tenure: 1 yr 10 mos

Current Experience: 1 yr 11 mos

Tekion Corp

Software Engineer

Jul 2024 – Present · 1 yr 11 mos · Bengaluru, Karnataka, India · Hybrid

Designed and implemented a scalable Change Data Capture (CDC) framework using Debezium to stream real-time data from MongoDB and MySQL into a central platform, reducing data staleness and quality issues by 80%.
Designed a layered data architecture (CDM & ADM) to deliver business-ready datasets for 10+ teams, improving analytics query performance by 60% and enabling near real-time decision-making.
Optimized Delta Lake storage using Z-ordering and liquid clustering, reducing Trino query latency by 33% on large-scale datasets and improving compute efficiency for cross-team analytics workloads.
Designed and implemented a CI pre-merge pipeline for data warehouse Python repositories using GitLab and Pytest, improving code quality and deployment efficiency.
Developed observability tooling for EMR clusters using Prometheus and Grafana, automating OS-level monitoring across 50+ EC2 nodes via AWS SSM and EMR tags and improving fleet-level diagnostics and uptime.

Debezium, MongoDB, MySQL, Delta Lake, Z-ordering, Liquid Clustering, +10 more

Impetus

3 roles

Software Engineer

Promoted

Jul 2023 – Jun 2024 · 11 mos

Designed and implemented a high-throughput processing system that handled over 4 million payroll events to meet a critical client audit requirement covering 6 years of data.
Engineered a high-throughput Python-based pipeline with multithreaded API execution and UDFs to ingest historical payroll data (7M+ records across 11 APIs), reducing processing time by 90% and ensuring audit compliance.
Developed complex SQL queries to track system performance metrics, including data ingestion health and job completion rates, improving incident response time by 10%.

Python, SQL, ETL, API, UDF, Data Ingestion, +1 more

Associate Software Engineer

Jul 2022 – Jun 2023 · 11 mos

Developed high-performance ETL pipelines using Databricks on AWS, integrating diverse client data into Delta Lake with 30% faster loading times while ensuring data accuracy and reliability.
Automated pipeline monitoring with Opsgenie, reducing monitoring time by 70% and downtime by 80%, improving SLA adherence.
Built a scalable data pipeline with PySpark and Spark SQL, leveraging Delta Lake for data reliability, ACID transactions, schema evolution, and efficient historical access.

Databricks, Delta Lake, PySpark, Spark SQL, Opsgenie, Data Engineering

Project Trainee

May 2022 – Jun 2022 · 1 mo

Collaborated with architects and colleagues on POCs such as Delta Ingestion Pipelines to streamline systems, and automated ETL workflows, reducing manual tasks by 50% and improving team productivity.