Jatin Kumar

CTO

Gurugram, Haryana, India · 13 yrs 3 mos experience

Most Likely To Switch · Highly Stable

Key Highlights

Expert in designing high-performance data platforms.
Proven track record in MLOps and machine learning.
Led successful real-time analytics initiatives.

Stackforce AI infers this person is a Data Engineering expert with a focus on real-time analytics and machine learning in the SaaS industry.

Skills

Core Skills

Data Architecture · Machine Learning · MLOps · Real-time Analytics · Data Engineering · ETL

Other Skills

AWS · Airflow · Amazon Web Services (AWS) · Apache Airflow · Apache Kafka · Apache Pinot · Apache Spark · Apache Superset · BI Publisher · Business Intelligence · Cassandra · Data Warehousing · Extract, Transform, Load (ETL) · Flink · Flume

About

I am a Data and AI engineering leader passionate about designing and scaling high-performance, resilient, and real-time data platforms. With deep expertise in big data, machine learning, MLOps, and cloud infrastructure, I thrive on solving complex data challenges and driving AI innovation. My work has enabled real-time insights, automation, and cost optimizations at scale, benefiting business operations and decision-making.

Experience

Total Experience: 13 yrs 3 mos

Average Tenure: 2 yrs 7 mos

Current Experience: 8 yrs 1 mo

Careem

3 roles

Engineering Leadership | Data & AI | Scaling ML Platforms, Real-time Data Products & GenAI

Promoted

Jul 2022 – Present · 3 yrs 10 mos · Dubai, United Arab Emirates

Data Experience Highlights:
1. As part of the Architecture Review Board, devised a Careem-wide disaster recovery plan for services. Enhanced disaster recovery by eliminating single points of failure and re-architecting multi-tenant deployments.
2. Reduced MTTA/MTTR as On-Call Incident Commander for Careem across the Middle East. Mentored senior engineers to refine their strengths while addressing growth areas.
3. Created Real-Time Metrics Platform: Enabled live demand and supply monitoring to optimize Careem Transport’s operations.
4. Built & Maintained Kafka Platform: Managed Kafka for four years with zero downtime, then successfully transitioned ownership to the infrastructure team, leveraging deep Kafka internals expertise.
Machine Learning Experience:
1. Cross-Functional Collaboration: Partnered with product, engineering, and analytics teams to define ML project goals, roadmaps, and success metrics, ensuring smooth integration into production environments.
2. MLOps Best Practices: Designed and implemented robust pipelines for model deployment, monitoring, and automation. Collaborated with AWS service teams to optimize SageMaker for governance, cost efficiency, and scalability.
3. Feature Store Development: Created a centralized feature store to accelerate model training and streamline data access for data scientists.
Generative AI:
1. Text-to-SQL Framework: Built a Generative AI-powered solution that converts natural language queries into SQL, boosting analysts’ productivity by 20%.

Data Architecture · Machine Learning · MLOps · Kafka · AWS · Real-Time Data Products

Staff Software Engineer - Data Platform

Jun 2020 – Jul 2022 · 2 yrs 1 mo · Dubai, United Arab Emirates

1. Built a C360 Platform: Consolidated customer data into a single unified view across multiple products and services.
2. Introduced Apache Pinot: Spearheaded organization-wide adoption of Pinot (Real-Time OLAP store) to power real-time analytics use cases.
3. Customized an open-source visualization platform (Superset) and integrated it with Apache Pinot to deliver data to dashboards with millisecond latency, significantly boosting performance.
4. Launched Flink Operator & Jobs: Set up the Apache Flink operator and implemented real-time streaming jobs for immediate data insights.
5. Enhanced Trino Query Engine: Integrated Apache Ranger for policy control, introduced caching to reduce S3 API calls by 30%, saving $400K annually.

Apache Pinot · Apache Superset · Real-Time Analytics · Flink · Trino · Data Architecture

Senior Software Engineer - Data Platform

Apr 2018 – Jun 2020 · 2 yrs 2 mos · Dubai, United Arab Emirates

1. Developed an in-house ETL/ELT processing framework with both batch and real-time streaming capabilities, integrated with open-source Delta for ACID compliance.
2. Implemented Binlog Streaming Platform: Streamed binlogs from RDBMS for real-time and batch processes, enabling low-latency data updates.
3. Multi-Source Data Ingestion: Ingested data from various sources (RDBMS, APIs, CSVs, DynamoDB, etc.) to support diverse business needs.
4. Led the migration and integration of Tableau's infrastructure, ensuring scalability and sustainability of the unified platform.
5. Played a key role in scaling a data ingestion framework using Spark, boosting data capacity and analytics output.
6. Managed in-house Airflow DAGs for efficient data ingestion into the data lake (S3), resulting in a 15% reduction in data processing time.

ETL · Apache Spark · Tableau · Airflow · Data Engineering

Expedia, Inc.

Senior Software Engineer - Data Platform

Apr 2017 – Apr 2018 · 1 yr · Gurugram, Haryana, India · On-site

1. Built and led a team of three data engineers, delivering an automated ETL pipeline using Aquila, a self-built tool that uses a SQL parser to convert stored procedures into Spark SQL. It also supports live testing and job scheduling through a UI.
2. Built and optimized Spark applications to reduce run time and cost.
3. Developed a scalable centralized data warehouse, implementing automated ETL processes that significantly improved data accuracy and reduced manual efforts.
4. Created a self-service analytics platform using Tableau on AWS EC2 with 50+ data sources and dashboards, streamlining access for 300+ employees and increasing operational productivity.

ETL · Tableau · AWS · Data Engineering

CenturyLink India

Senior Software Engineer

May 2016 – Apr 2017 · 11 mos · Gurugram, Haryana, India · On-site

1. Created a data ingestion pipeline with Kafka and Spark Streaming.
2. Handled data in Cassandra using Spark Streaming.
3. Explored Spark's GraphX library for solving graph-related problems efficiently.

Kafka · Spark Streaming · Cassandra · Data Engineering

Genpact

Data Engineer

Mar 2015 – Apr 2016 · 1 yr 1 mo · Gurugram, Haryana, India · On-site

1. Developed frameworks for replicating data from Hive tables to flat files for delivery across different interfaces.
2. Optimized storage on Hadoop through effective compression and performance tuning of query languages such as Hive.

Hadoop · Hive · Data Engineering

3i Infotech Ltd.

ETL Developer

Jan 2013 – Mar 2015 · 2 yrs 2 mos · Noida · On-site

1. Designed ETL mappings to synchronize the various stages of the data flow.
2. Tuned mappings and sessions for both data loading and metadata management.