Suman Roy

Software Engineer

Seattle, Washington, United States · 13 yrs 8 mos experience
Most Likely To Switch · Highly Stable

Key Highlights

  • Over 13 years of experience in distributed systems.
  • Expert in AWS infrastructure and Kubernetes.
  • Proven track record of optimizing large-scale systems.
Stackforce AI infers this person is a Cloud Infrastructure Engineer with expertise in distributed systems and performance optimization.

Skills

Core Skills

Amazon Web Services (AWS) · Distributed Systems · Performance Engineering · Data Structures

Other Skills

Algorithms · Application Performance Management · C · C++ · Core Java · Enterprise Software · HTML · JSP · Java · JavaScript · Kubernetes · Linux · MySQL · Operating Systems · Oracle

About

With over 13 years of experience, I specialize in building scalable, fault-tolerant distributed systems and large-scale infrastructure solutions. Currently, I contribute to AWS Bedrock, leveraging Kubernetes and AWS infrastructure expertise to support mission-critical services and enable innovation in cloud computing. My work focuses on delivering reliable and cost-effective solutions while maintaining high performance at exabyte scale. As a technical leader, I take pride in fostering collaboration across teams, driving architectural decisions, and mentoring engineers to achieve operational excellence. My efforts have supported the development of cutting-edge systems for data migration, storage durability, and infrastructure optimization, aligning with AWS's commitment to technical excellence and customer success.

Experience

Total Experience: 13 yrs 8 mos
Average Tenure: 4 yrs 6 mos
Current Experience: 7 yrs 1 mo

Amazon Web Services (AWS)

2 roles

Senior SDE, Amazon Web Services

Promoted

Apr 2021 - Present · 5 yrs

  • Provided technical leadership for low-level storage teams managing I/O subsystems and data durability at AWS S3 Glacier.
  • Drove the technical vision for HDD technology implementation, reshaping AWS's cold storage architecture.
  • Achieved a 99.95% retrieval SLO success rate while reducing operational load by 80% through enhanced monitoring strategies.
Kubernetes · Amazon Web Services (AWS) · Distributed Systems · Storage

Software Development Engineer - II

Mar 2019 - Apr 2021 · 2 yrs 1 mo

  • Developed sophisticated control and data plane systems for AWS S3 Glacier, managing exabyte-scale operations.
  • Led a team of 7 engineers in architecting high-performance systems for real-time telemetry processing.
  • Implemented a state-of-the-art hardware management platform, enhancing predictive maintenance and reducing operational costs by 60%.
  • Fostered a culture of innovation and mentorship in distributed systems design and implementation.
Kubernetes · Amazon Web Services (AWS) · Distributed Systems · Data Structures · Algorithms

Amazon

Software Development Engineer - II

Apr 2016 - Mar 2019 · 2 yrs 11 mos · Bengaluru Area, India

  • As a key member of Amazon's Affiliate Marketing team, I contributed to critical systems for processing affiliate earnings at scale. The team built a scalable, fault-tolerant distributed event processing platform that calculated affiliate earnings by processing millions of daily events from Amazon's retail platform, correlating clicks, orders, and shipments driven through affiliate marketing channels. Within this ecosystem, I designed and implemented a real-time event ingestion system that subscribed to high-throughput Amazon digital retail events, tracked digital returns through event correlation, filtered for affiliate-specific returns, and aggregated affiliate-level return notifications for integration into the earnings calculation workflow.
  • The system used AWS EMR (Elastic MapReduce) for large-scale data processing, running complex MapReduce jobs over 60 days of historical data, including click-through tracking, order completions, shipment confirmations, and return notifications, to ensure precise earnings calculations. Through deep analysis and optimization of EMR workflows, I demonstrated multi-million-dollar annual savings in EMR infrastructure costs while maintaining processing SLAs for the earnings calculation pipeline.
  • During the team's migration from a monolithic payment processing system to a scalable microservice architecture, we encountered critical earnings calculation discrepancies between projected and actual affiliate payouts. Demonstrating customer obsession, I designed and implemented a large-scale reconciliation system that enabled automated detection and correction of earnings mismatches. This solution significantly reduced customer escalations and resolved payment discrepancies, ensuring accurate affiliate compensation.
Amazon Web Services (AWS) · Distributed Systems · Data Structures · Algorithms · Java · SQL

Cognizant Technology Solutions

Performance Engineer

Aug 2012 - Apr 2016 · 3 yrs 8 mos · Bangalore Area, India

  • As a Performance Engineer at Cognizant, I specialized in optimizing high-frequency trading systems and developing performance testing solutions.
  • Working with critical trading systems for an Indian stock exchange, I designed and implemented novel data structures and algorithms that reduced latency to 125 μs (90th percentile), and a journal data segregation methodology that significantly improved trading engine throughput.
  • I developed a Linux user-space BandwidthLimiter tool that enabled precise network traffic control at process and port level through TCP packet manipulation. This tool became instrumental in conducting resilience testing and performance benchmarking by allowing controlled simulation of various network conditions, bandwidth constraints, and latency scenarios.
  • For a major banking client's trading application, I architected a load generation tool that simulated concurrent user behavior and trading scenarios. The tool integrated with the client's proprietary trading protocol, featuring dynamic request generation, configurable load patterns, and real-time performance monitoring capabilities.
  • By adopting Google Protocol Buffers for efficient data serialization and adding strategic caching layers, we significantly improved the performance of the trading cycle under high load conditions.
  • In a notable innovation project, I prototyped a database virtualization tool that intercepted and analyzed MySQL network packets at the protocol level. The system featured TCP packet capture capabilities, query pattern analysis, customizable response injection, and acted as a standalone database server. The idea was to enable testing environments to operate without actual database servers and their costly licenses, as teams could simulate database interactions by modifying query-response patterns on the fly.
Performance Testing · Data Structures · Algorithms · Linux · Performance Engineering