
Sushil Kumar Shivashankar

Engineering Manager

Bengaluru, Karnataka, India · 12 yrs 1 mo experience

Key Highlights

  • Over a decade of experience in Big Data and Distributed Systems.
  • Led multi-million dollar cloud optimization strategy at Uber.
  • Strong advocate for Open Source with contributions to major projects.

Skills

Core Skills

Engineering Management · Distributed Systems · Data Engineering · Cloud Computing

Other Skills

Algorithm Design · Amazon S3 · Amazon Web Services (AWS) · Apache Flink · Apache Kafka · Apache Mesos · Apache Spark · Apache ZooKeeper · Big Data · C · C++ · Cost Optimization · Data Governance · Data Infrastructure · Data Ingestion

About

An Engineer by choice, Architect by nature, and Leader at heart. I am an Engineering Manager and technology leader with over a decade of experience designing, scaling, and modernizing Big Data and Distributed Systems. My passion lies in building high-impact Data Lake, Data Warehouse, and OLAP platforms from scratch, handling petabytes of data for mission-critical use cases like funnel analysis, segmentation, and web analytics. I am a strong advocate for Open Source and have deep expertise in performance engineering, system design, and the fundamentals of distributed systems.

Currently, I lead two teams (Kafka and Flink) at Uber, where I am driving a multi-million dollar cloud optimization strategy using disaggregated storage, with a focus on platform modernization toward a unified Lakehouse architecture.

Previously, I served as a Software Development Manager at AWS EMR & Athena, where I was responsible for the development and growth of engines like Hadoop, Spark, Flink, and Presto/Trino, and for architecting next-gen S3A performance optimizations to enhance Lakehouse performance. Prior to that, I led platform architecture for Microsoft's Azure Data Platform (HDInsight), including driving its migration onto Kubernetes, and architected data platforms at scale for Ola Cabs and Flipkart.

Specialties: Distributed Systems, Data Lakehouse (Iceberg, Delta Lake, Hudi), Data Engineering, Stream Analytics, System Design, Scalability, Kubernetes, Algorithm Design, Cloud Computing (AWS, Azure), Hadoop, YARN, Spark, Flink, Kafka, Presto/Trino, Data Governance, Real-Time Analytics, Multi-Threading, Cost Optimization, Engineering Leadership.

Experience

Uber

Engineering Manager - 2

Aug 2025 – Present · 7 mos · Bengaluru, Karnataka, India

  • Leading two teams (Flink and Kafka) from Bengaluru.
Engineering Management · Apache Flink · Apache Kafka · Google Cloud Platform (GCP) · Cloud Computing · Data Streaming · +2 more

Amazon Web Services (AWS)

Software Development Manager

Apr 2022 – Aug 2025 · 3 yrs 4 mos · Bengaluru, Karnataka, India · On-site

  • Led the development and growth of the Flink, Hadoop, and Trino offerings for AWS EMR on EC2/EKS and for Athena.
  • [0] https://aws.amazon.com/blogs/big-data/optimize-amazon-emr-runtime-for-apache-spark-with-emr-s3a/
  • [1] https://docs.aws.amazon.com/emr/latest/ReleaseGuide/trino-ft.html
  • [2] https://docs.aws.amazon.com/emr/latest/ReleaseGuide/presto-spot-loss.html
  • [3] https://docs.aws.amazon.com/emr/latest/ReleaseGuide/presto-strict-mode.html
  • [4] https://aws.amazon.com/about-aws/whats-new/2023/11/data-lake-queries-amazon-athena-s3-express-one-zone/
  • [5] https://aws.amazon.com/blogs/big-data/run-trino-queries-2-7-times-faster-with-amazon-emr-6-15-0/
  • [6] https://aws.amazon.com/about-aws/whats-new/2024/08/amazon-s3-express-one-zone-storage-class-emr/
Data Infrastructure · Apache Spark · Data Pipelines · Mentoring · Team Development · Engineering Management · +2 more

Microsoft

3 roles

Senior Software Engineering Manager

Apr 2021 – Apr 2022 · 1 yr

  • Azure HDInsight
Data Infrastructure · Apache Spark · Data Pipelines · Mentoring · Team Development · Engineering Management · +2 more

Senior Software Engineer (Azure Data Platform)

Aug 2020 – Mar 2021 · 7 mos

  • Azure HDInsight
Data Infrastructure · Apache Spark · Data Pipelines · Data Engineering

Software Engineer - 2 (Azure Data Platform)

Apr 2018 – Aug 2020 · 2 yrs 4 mos

  • Initially part of the Cosmos Resource Management team, where the work focused mainly on Apache YARN running on 100k nodes.
  • Developed a framework to support Azure Cosmos DB, or any document-store vendor, as a backend for Apache YARN ATSv2, and contributed this work to OSS as YARN-9016.
  • https://issues.apache.org/jira/browse/YARN-9016
  • Built the infrastructure to support autoscale, based on load or schedule, for Azure HDInsight clusters, and later led the project to GA.
  • https://azure.microsoft.com/en-gb/updates/autoscale-for-azure-hdinsight-is-now-general-available/
  • Actively contributed patches to OSS Hadoop.
Data Infrastructure · Apache Spark · Data Engineering
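The load-or-schedule autoscale decision described above can be sketched roughly as follows. This is a minimal illustration, not HDInsight's actual API: the schedule format, the `MIN_NODES` floor, and the per-node capacity figure are all hypothetical.

```python
from datetime import datetime

MIN_NODES = 3  # illustrative cluster floor, not a real HDInsight default

def desired_nodes(pending_containers, capacity_per_node, schedule, now):
    """Pick a target cluster size: schedule windows win, else scale on load.

    schedule: list of (start_hour, end_hour, node_count) windows.
    """
    # Schedule-based: an explicit time window overrides load metrics.
    for start_hour, end_hour, count in schedule:
        if start_hour <= now.hour < end_hour:
            return max(count, MIN_NODES)
    # Load-based: enough nodes to place every pending YARN container.
    needed = -(-pending_containers // capacity_per_node)  # ceiling division
    return max(needed, MIN_NODES)

schedule = [(9, 18, 20)]  # business hours: pin the cluster at 20 nodes
print(desired_nodes(50, 8, schedule, datetime(2020, 1, 6, 10)))  # 20 (in window)
print(desired_nodes(50, 8, schedule, datetime(2020, 1, 6, 2)))   # 7  (ceil(50/8))
print(desired_nodes(0, 8, schedule, datetime(2020, 1, 6, 2)))    # 3  (floor)
```

In practice the load signal would come from YARN's pending-container metrics rather than a parameter, but the precedence (schedule first, then load, never below a floor) is the essence of the feature.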

The Apache Software Foundation

Open Source Contributor (Hadoop)

Apr 2019 – Apr 2020 · 1 yr · Bengaluru, Karnataka, India

Ola (ANI Technologies Pvt. Ltd.)

2 roles

Software Development Engineer 3 (Big Data Platform)

Apr 2017 – Apr 2018 · 1 yr

  • Built a real-time stream upserts framework end to end. Key features:
  • A continuously running pipeline listens to a topic on Apache Kafka and performs UPSERTs in real time on a distributed filesystem, i.e. an object store (S3) or block store (HDFS).
  • Users can query the data from the filesystem in parallel, as Hive external tables, via Presto / Tez / SparkSQL.
  • All UPSERTs are eventually consistent for readers and no locks are applied, i.e. READs and UPSERTs run in parallel.
  • Technologies/components used:
  • Apache YARN with node labels for compute.
  • Apache Beam for building the continuously running pipeline framework.
  • Apache Spark / Flink on YARN for pipeline orchestration and disaster recovery.
  • In-house Data Governance Warehouse metadata service to sanitize the messages read from Kafka.
  • Part of building KaaS (Kafka as a Service) with AuthN, AuthZ, audits, and multitenancy for all of Ola Engineering; developed a ConfigSVC end to end for orchestrating and managing Kafka clusters and clients.
Data Pipelines · Large Scale Events · Data Ingestion
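The lock-free read/upsert behaviour described above can be sketched with a minimal manifest-swap pattern. All names here are hypothetical and an in-memory dict stands in for S3/HDFS: writers publish a new immutable file set and atomically repoint a manifest, so readers always see a consistent snapshot and never block.

```python
import threading

class UpsertStore:
    """Toy model of lock-free upserts on an immutable object store.

    Writers never mutate published files; they write a new file set and
    atomically swap the manifest pointer. Readers resolve the manifest
    once, so reads and upserts proceed in parallel (eventual consistency).
    """

    def __init__(self):
        self._files = {}      # object store: path -> rows (immutable once written)
        self._manifest = []   # current list of live file paths
        self._version = 0
        self._write_lock = threading.Lock()  # serialises writers only, never readers

    def upsert(self, records):
        """Apply key -> value upserts by publishing a new merged file set."""
        with self._write_lock:
            merged = {}
            for path in self._manifest:       # fold in the current snapshot
                merged.update(self._files[path])
            merged.update(records)            # apply the upsert batch
            self._version += 1
            path = f"data/v{self._version}.parquet"
            self._files[path] = dict(merged)  # write a new immutable file
            self._manifest = [path]           # atomic pointer swap (one assignment)

    def read(self):
        """Lock-free read: resolve the manifest once, then read those files."""
        snapshot = list(self._manifest)
        out = {}
        for path in snapshot:
            out.update(self._files[path])
        return out

store = UpsertStore()
store.upsert({"ride:1": "created", "ride:2": "created"})
store.upsert({"ride:1": "completed"})  # later upsert wins for the same key
print(store.read())                    # {'ride:1': 'completed', 'ride:2': 'created'}
```

A production version would merge at file granularity rather than rewriting everything, but the core idea is the same: readers bind to an immutable snapshot, so no locks are needed between READ and UPSERT.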

Software Development Engineer 2 (Big Data Platform)

Apr 2016 – Mar 2017 · 11 mos

  • Built the Data Lake for Big Data from scratch for democratisation and discovery of data for analytics, covering Preparation -> Ingestion -> Cleaning -> Transformation/Processing -> Consumption.
  • Preparation - All transactional (entities) and non-transactional (events) systems' data in Ola are prepared by registering the metadata in the warehouse, with additional PIE (Processing, Ingestion and Execution times) semantics in each payload.
  • Ingestion:
  • Push based - All data payloads are pushed directly to a messaging system via an ingestion library.
  • Pull based - For MySQL, by reading binlogs and handling schema management automatically. For NoSQL DBs, I wrote a custom Sqoop from scratch for bulk-pulling data; it syncs the schema automatically with the warehouse metadata and stores the data in the filesystem, skipping the messaging queue. This was also used for backup and disaster recovery.
  • Cleaning - Also known as journalling: all entities/events are deduplicated and partitioned by date and hour on a particular datetime field, answering the "as of then" question for entities.
  • Transformation - Reconciling (snapshotting) the data periodically by fetching the changed data from the journal store, answering the "as of now" question for entities.
  • Consumption - Journal and snapshot data are exposed as external Hive tables and queried via the Hive/Tez/Spark/Presto engines.
  • Technologies/components used:
  • HDP-distro Hadoop 2.7 cluster managed by Ambari
  • Compute via YARN and storage via S3A
  • Hive, Tez, Presto and Spark for querying
  • Dropwizard for creating microservices
  • Customised version of the Maxwell binlog parser project from GitHub
  • Customised version of Secor, a Kafka-to-S3 project from GitHub
  • In-house config service for resolving application config for microservices on Marathon
  • Mesos, Docker, Marathon for microservice orchestration
  • Oozie/Azkaban for scheduling
Data Pipelines · Large Scale Events · Data Ingestion
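The journalling/snapshotting split above can be illustrated with a small sketch (field names and the hour-granularity partition key are hypothetical): the journal keeps every deduplicated change partitioned by time, answering "as of then", while the snapshot folds the journal into the latest state per entity, answering "as of now".

```python
from collections import defaultdict

def journal(events):
    """Deduplicate raw events and partition them by date/hour ('as of then')."""
    seen = set()
    parts = defaultdict(list)
    for ev in events:
        key = (ev["entity_id"], ev["ts"])  # dedup on entity + event time
        if key in seen:
            continue
        seen.add(key)
        parts[ev["ts"][:13]].append(ev)    # partition key: 'YYYY-MM-DDTHH'
    return dict(parts)

def snapshot(parts):
    """Reconcile the journal into the latest state per entity ('as of now')."""
    latest = {}
    for part in sorted(parts):             # replay partitions in time order
        for ev in parts[part]:
            cur = latest.get(ev["entity_id"])
            if cur is None or ev["ts"] > cur["ts"]:
                latest[ev["entity_id"]] = ev
    return latest

events = [
    {"entity_id": "booking:7", "ts": "2016-05-01T09:15", "status": "created"},
    {"entity_id": "booking:7", "ts": "2016-05-01T09:15", "status": "created"},  # dup
    {"entity_id": "booking:7", "ts": "2016-05-01T10:02", "status": "completed"},
]
parts = journal(events)
print(sorted(parts))                           # ['2016-05-01T09', '2016-05-01T10']
print(snapshot(parts)["booking:7"]["status"])  # completed
```

At scale these two steps would run as Hive/Spark jobs over date-hour partitions rather than in memory, but the dedup-then-reconcile flow is the same.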

Flipkart.com

Software Development Engineer 1 (Big Data Platform)

Jul 2014 – Mar 2016 · 1 yr 8 mos · Greater Bengaluru Area

  • Was part of developing a batch system to process system-generated logs (around 6-7 TB ingested on normal days and 50+ TB on sale days) as part of the Data Governance practice, enabling data analysts to run processing pipelines on them.
  • Generated canned reports by writing processing pipelines for funnel analysis, mining the logs for insights not available from the Omniture (SiteCatalyst) software.
  • Later served on the Infrastructure and Systems Engineering team of Flipkart's Data Platform, where we developed in-house cloud platform products using open-source Big Data technologies. Key features included porting compute systems (e.g. Spark, Storm) onto YARN, plus metering, billing, auditing, and security, similar to a private cloud. I developed an org-wide monitoring tool end to end for all components, contributed to the DC migration transferring PBs of data across data centers, and benchmarked a 1000+ node Hadoop cluster with my own custom benchmarks to handle Big Billion Day sales load. Open-sourced project: https://github.com/flipkart-incubator/BlueShift
Data Pipelines · Large Scale Events · Data Ingestion

RNS Institute of Technology (RNSIT)

Research Assistant

Jul 2013 – Mar 2014 · 8 mos · Bangalore

  • Worked as a research assistant under Prof. T Satish Kumar in my final year of undergrad. The research applied compiler optimisation to programs running on distributed/multicore systems, and we published two journal papers on it:
  • 1 - Optimizing Code by Selecting Compiler Flags using Parallel Genetic Algorithm on Multicore CPUs
  • 2 - Compiler Phase Ordering and Optimizing MPI Runtime Parameters using Heuristic Algorithms on SMPs

Sushira swara sangama

Freelance Musician

Jan 2006 – Apr 2008 · 2 yrs 3 mos · Bangalore Urban, Karnataka, India · On-site

  • Freelance Tabla Player for various artists and events in Bangalore.

Education

Visvesvaraya Technological University

Bachelor's Degree — Computer Science

Jan 2011 – Jan 2014

MS Ramaiah Polytechnic

Diploma — Computer Science

Jan 2008 – Jan 2011
