Sanjay Kumar Sahu

Backend Engineer

Bengaluru, Karnataka, India · 6 yrs 6 mos experience

Key Highlights

  • Built a self-serve platform for ML model training.
  • Engineered a real-time leaderboard handling 80M entries.
  • Led a team to optimize Spark workloads, reducing costs by 35%.
Stackforce AI infers this person is a SaaS and Gaming-focused Software Engineer with expertise in Machine Learning and Cloud Architecture.

Skills

Core Skills

Machine Learning · Kubernetes · Hadoop · Cloud Architecture

Other Skills

Aerospike · Algorithm Design · Algorithms · C (Programming Language) · C# · Cloudera CDP Private Cloud · Containerization · Core Java · Database Management System (DBMS) · Databases · Distributed Systems · Docker · HDFS · Java · Kubeflow

About

Software Engineer at Dream11.

  • Darwin: Built a self-serve Ray-based platform for training and serving ML models, including TensorFlow, PyTorch, and Spark models.
  • Led a team of 3 to build a self-serve distributed computing platform on Kubernetes for Spark and ML frameworks.
  • Spark Workload Optimization: Implemented a remote shuffle service to optimize Spark workloads, reducing ML workload costs by approximately 35%.
  • Pelican: Built a Spark-as-a-Service orchestration gateway (multi-tenant, runtime-driven, unified REST APIs) for DE/BE pipelines, delivering a 40% cost reduction and 2x faster delivery of ML and Spark jobs.
  • Competition / RMG: Engineered a real-time leaderboard and PCS system handling 80M entries in 7 seconds, sustaining 60M+ RPM throughput for live gaming events.
  • Applied advanced salting to improve Spark efficiency, reducing expenses by 60% and speeding up leaderboard generation by 5x.

Email: sanjaysahu3426@gmail.com

Experience

Total Experience: 6 yrs 6 mos
Average Tenure: 1 yr 3 mos
Current Experience: 1 yr 5 mos

Dream11

SDE2

Dec 2024 - Present · 1 yr 5 mos · Mumbai, Maharashtra, India · On-site

  • Leaderboard (Real Money Gaming) - Machine Learning Platform team.
  • Darwin: Built a self-serve Ray-based platform for training and serving ML models, including TensorFlow, PyTorch, and Spark models.
  • Led a team of 3 to build a self-serve distributed computing platform on Kubernetes for Spark and ML frameworks.
  • Spark Workload Optimization: Implemented a remote shuffle service to optimize Spark workloads, reducing ML workload costs by approximately 35%.
  • Pelican: Built a Spark-as-a-Service orchestration gateway (multi-tenant, runtime-driven, unified REST APIs) for DE/BE pipelines, delivering a 40% cost reduction and 2x faster delivery of ML and Spark jobs.
  • Competition / RMG: Engineered a real-time leaderboard and PCS system handling 80M entries in 7 seconds, sustaining 60M+ RPM throughput for live gaming events.
  • Applied advanced salting to improve Spark efficiency, reducing expenses by 60% and speeding up leaderboard generation by 5x.
Ray · TensorFlow · PyTorch · Spark · Kubernetes · Machine Learning

Microsoft

Software Engineer 2

Apr 2022 - Dec 2024 · 2 yrs 8 mos · Bengaluru, Karnataka, India

  • Worked as a SWE.
  • Designed and built a platform offering Hadoop clusters as a service; worked on the internals of YARN and HDFS.
  • Worked on providing Hadoop as a cloud service and was a key designer of several cloud offerings for the Hadoop ecosystem. I have extensive experience in architecture, design, and agile development, and have developed expertise in cloud application development using Hadoop and its ecosystem.
  • I also have expertise in using Docker containers to create a virtualized platform for underlying services.
  • Key areas handled in this role:
  • − Building APIs/components to manage machines and automate the provisioning and preparation of bare-metal servers so that they are ready for consumption
  • − Building APIs/components for resource management and orchestration of other service activities
  • − Building shell scripts and Python modules to handle orchestration over distributed systems
  • − Building a framework to prepare physical host machines for cluster builds
  • − Cloud security
Hadoop · YARN · HDFS · Cloud architecture · Docker

Cloudera

Software Engineer

Jul 2020 - Apr 2022 · 1 yr 9 mos · Bengaluru, Karnataka, India

  • I was part of the compute team at Cloudera.
  • Project: Hadoop YARN Queue Manager. It manages compute queue operations through a simple set of functional APIs that hide the complexity of updating capacity-scheduler.xml (YARN). YARN is Apache Hadoop's resource management and job scheduling layer, introduced with MapReduce 2 (MRv2).
  • My contributions shipped in various CDP public and private cloud releases.
  • Cluster Management | Hadoop | YARN | Java | Scalable system | Resource management | Distributed Systems
Hadoop · YARN · Java · Resource management · Distributed Systems

Infineon Technologies

Software Engineer Internship

Jan 2020 - Jun 2020 · 5 mos · Bengaluru, Karnataka, India

Coding Ninjas India

Teaching Assistant

Mar 2018 - Jun 2018 · 3 mos · Delhi, India

Education

Birla Institute of Technology, Mesra

Bachelor of Engineering - BE — Computer Science And Engineering

Jan 2016 - Jan 2020

DAV Public School, GandhiNagar, Ranchi

Intermediate (+2) — Science (PCM)

Jan 2014 - Jan 2016

DAV Public School, Kathara, Bokaro Steel City

Matriculation — General Studies

Jan 2004 - Jan 2014
