Rajarshi Sarkar

Senior Software Engineer

Bengaluru, Karnataka, India · 9 yrs 6 mos experience

Highly Stable

Key Highlights

9+ years of experience at top tech companies.
Expertise in Big Data and Distributed Systems.
Proven track record in open-source contributions.

Stackforce AI infers this person is a Big Data and Cloud Computing expert with extensive experience in E-commerce and Retail.

Skills

Core Skills

Big Data, Amazon Web Services (AWS)

Other Skills

Algorithms, Apache Airflow, Apache Atlas, Apache Beam, Apache Griffin, Apache Kafka, Apache NiFi, Apache Ranger, Apache Spark, Apache Storm, Apache Superset, Azure Data Lake Store (ADLS), Azure Databricks, CI/CD, Cassandra

About

Software Engineer with 9+ years of experience across Google, Amazon, and Walmart, specializing in Java, Big Data, Distributed Systems, and Microservices. Proven expertise in building scalable, high-performance, and secure distributed systems across domains like Big Data, Cloud Computing, E-commerce, and Retail. At Amazon Web Services, I was part of the EMR team powering petabyte-scale analytics using Apache Spark, Hive, and Trino. My work includes Hive partition pruning optimizations, integrating Iceberg into EMR with open-source contributions, making S3A the default file system, implementing Fine-Grained Access Control (FGAC), and optimizing EMR releases. Previously at Walmart, I designed and developed the cumulative data repository integrating real-time transactional data across the supply chain, enabling near real-time analytical insights. I also led the development of Data Lake products, including Data Pipeline, Data Quality, Metadata Manager, and Data Acceleration tools, with a strong focus on data integrity, governance, and end-to-end lineage. Open-source contributor to Apache Iceberg, Trino, and Gimel.

Experience

Total Experience: 9 yrs 6 mos

Average Tenure: 4 yrs 5 mos

Current Experience: 8 mos

Google

Sr. Software Engineer

Aug 2025 – Present · 8 mos · Bengaluru, Karnataka, India

Amazon

Software Development Engineer

Apr 2021 – Aug 2025 · 4 yrs 4 mos · Bengaluru, Karnataka, India

I was part of the Amazon Web Services EMR team, which powers petabyte-scale data processing and analytics using Apache Spark, Hive, and Trino. My work included Hive partition pruning optimizations, integrating Iceberg into EMR with open-source contributions, making S3A the default file system, implementing Fine-Grained Access Control (FGAC), and optimizing EMR releases.

Apache Spark, Hive, Trino, Fine-Grained Access Control, S3A, Open-source contributions, +2 more

Walmart

3 roles

Sr. Software Engineer

Jan 2020 – Apr 2021 · 1 yr 3 mos · Bengaluru, Karnataka, India

Designed and developed Walmart's cumulative data repository that integrated real-time transactional data from all supply chain products to build analytical views in near real-time.

Software Engineer III

Promoted

Jan 2018 – Dec 2019 · 1 yr 11 mos · Bengaluru, Karnataka, India

Designed and developed Walmart's Data Lake products including Data Pipeline, Data Quality, Data Profiler, Data Validator, Metadata Manager & Data Acceleration.

Software Engineer II

Aug 2016 – Dec 2017 · 1 yr 4 mos · Bengaluru, Karnataka, India

Discovered, captured, and integrated data with proper categorization. Ensured data integrity and governance throughout the data lifecycle.

Indian Institute of Technology, Kharagpur

Intern

May 2015 – Jun 2015 · 1 mo · Kharagpur

Derived a representative sample of a large citation graph. The sample was 23% of the size of the full citation graph and was similar to it in clustering coefficient and degree distribution.