Vijay Shekhawat

Co-Founder

United Kingdom · 8 yrs 10 mos experience

Most Likely To Switch: Highly Stable

Key Highlights

Led design of multi-cloud data lakehouse.
Achieved 10x cost reduction in analytics.
Developed automation tool for ETL processes.

Stackforce AI infers this person is a Data Engineering expert in SaaS and Big Data analytics.

Skills

Core Skills

Data Engineering, Distributed Systems, Streaming Data Pipelines, Data Migration, Big Data Analytics, ETL

Other Skills

Apache Iceberg, Apache Spark, Apache Kafka, StarRocks, Apache Beam, Apache Flink, Google Cloud Dataflow, Apache Pinot, AWS Redshift, Apache Airflow, MapReduce, Apache Pig, StreamSets, Snowflake, Google BigQuery

About

👋 Hi, I’m Vijay Shekhawat, a Staff Software Engineer at TRM Labs, where I lead the design and evolution of our Next-Generation Data Platform, a multi-cloud data lakehouse that powers sub-second analytics at petabyte scale. I specialise in data systems architecture, performance optimisation, and real-time analytics. With nearly a decade of experience building and scaling platforms that blend reliability with speed, my work spans from architecting Apache Iceberg-based lakehouses and StarRocks query engines to orchestrating streaming pipelines with Apache Beam and Flink capable of handling millions of events per second. At TRM Labs, I helped launch the company’s first operational data lakehouse, enabling 100 TB+ analytics with 10× lower cost and millisecond-level query latency. Before that, I led distributed data systems at LinkedIn, Zynga, and Dell EMC, focusing on high-throughput ingestion, advanced ETL frameworks, and large-scale migration initiatives. Beyond engineering, I enjoy mentoring teams, leading workshops on Iceberg and StarRocks, and speaking at conferences like the Apache Beam Summit, sharing lessons from architecting scalable, real-time data systems.

Experience

Total Experience: 8 yrs 10 mos

Average Tenure: 1 yr 10 mos

Current Experience: 3 yrs 11 mos

TRM Labs

2 roles

Staff Software Engineer

Jun 2025 – Present · 11 mos

Building a multi-cloud data lake with Apache Iceberg at TRM Labs

Apache Iceberg, Data Engineering, Distributed Systems

Senior Software Engineer - Data

Jun 2022 – Jun 2025 · 3 yrs

First engineer in India.
As the engineering lead on TRM's Next-Generation Data Platform, I led software and framework selection, architecture design, and platform rollout, enabling TB-scale customer-facing analytics on a lakehouse architecture of Apache Iceberg and StarRocks at millisecond latency and one-tenth the cost.
As the technical lead on a real-time streaming data platform, I was responsible for software selection, architecture design, and development. The platform processes millions of events per second and uses Apache Beam, Apache Flink, and Google Cloud Dataflow across cloud environments, with 99.99% service availability and sub-second latency.

Apache Spark, Apache Kafka, Data Engineering, Streaming Data Pipelines, Apache Iceberg, StarRocks, +1 more

LinkedIn

Software Engineer - Data

Feb 2021 – Aug 2022 · 1 yr 6 mos · Bengaluru, Karnataka, India

I am most proud of leading the data engineering revamp for the “Who Viewed Your Profile” feature to improve its performance and data accuracy. It is one of LinkedIn’s most visited and engagement-driving surfaces, serving insights to hundreds of millions of members globally every week.
Additionally, I designed and developed Data Sampling as a Service, a scalable internal platform enabling ML and analytics teams to generate reproducible samples from petabyte-scale datasets.
Built Spark + Kafka-based pipelines and integrated Apache Pinot for real-time insights.
Partnered with multiple data science teams to improve experimentation velocity and analytical reproducibility.

Apache Spark, Apache Kafka, Apache Pinot, Data Engineering, Streaming Data Pipelines

Zynga

Data Engineer 2

Mar 2020 – Feb 2021 · 11 mos · Bengaluru, Karnataka

Responsibilities:
Migrated the data warehouse platform from Vertica to AWS Redshift.
Designed and implemented Airflow jobs for data migration.
Implemented Redshift Spectrum best practices such as partitioning, compression, predicate pushdown, and columnar storage formats like Parquet.
Implemented columnar encoding standards for efficient space utilisation, and sort and distribution keys for performance optimisation.
Provisioned Redshift clusters using Terraform, along with Workload Management configurations.
Created Airflow DAGs for ETL jobs.
Loaded nested JSON data into Redshift using JSONPaths files.
Achievements:
Instrumental in setting up the AWS Redshift cluster and helped team members (data scientists and analysts) adopt best practices.
Reduced Redshift space utilisation by 50% (400 TB to 200 TB) by implementing new column-level encoding.

Apache Spark, Apache Kafka, Data Engineering, Data Migration

Hormone therapy

Founder

Sep 2019 – Feb 2022 · 2 yrs 5 mos · Bengaluru, Karnataka, India

We are India's first online service for hormonal imbalances. The fastest and easiest way to treat your hormonal imbalances. Growing old is not a choice, but living young is :)
Women. Men. Everyone.

Expedia Group

Software Engineer

Sep 2019 – Mar 2020 · 6 mos · Bengaluru, Karnataka, India · On-site

Dell EMC

Senior Data Engineer

Jul 2017 – Sep 2019 · 2 yrs 2 mos · Hyderabad Area, India

Part of the Enterprise Business Intelligence (Big Data Analytics) team, responsible for ingestion and analytics of semi-structured and unstructured data.
Responsibilities:
Wrote MapReduce, Pig, and Spark jobs.
Imported and exported data using Sqoop into Hive and HBase from existing SQL Server.
Processed large sets of structured, semi-structured, and unstructured data and supported systems application architecture.
Assessed business rules, collaborated with stakeholders, and performed source-to-target data mapping, design, and review.
Designed and developed complex ETL structures for transforming data sources (Salesforce, Eloqua, PMC, etc.) into data warehouses (Teradata, Greenplum, etc.).
Implemented procedures for validation, reconciliation of metadata, and error handling in ETL processes.
Analysed database designs and system integration processes to suggest further enhancements.
Familiar with data architecture, including data ingestion pipeline design, Hadoop information architecture, data modelling, data mining, and advanced data processing.
Implemented Unix shell scripts for various process improvements.
Achievements:
Developed an ETL (StreamSets) development automation tool (awaiting patent).
Dell Quarterly Award (Silver).
Runner-up AI Hackathon for implementing machine-learning driven Agile