Shubham Katiyar

Director of Engineering

Seattle, Washington, United States · 17 yrs 6 mos experience

Key Highlights

  • 16+ years of executive leadership experience at Amazon.
  • Pioneered innovations in AI and multimodal processing.
  • Led large teams to balance research and production.

Skills

Core Skills

Artificial Intelligence (AI) · Distributed Systems · Cloud Computing · Large Language Models (LLM) · Change Management · Serverless Architecture

Other Skills

5D Parallelism · AWS Annapurna Accelerators · AWS Bedrock · Agile Development · Agile Methodologies · Amazon S3 · Amazon Web Services (AWS) · C · C++ · Data Analysis · Data Processing · Data Visualization · GWT · Generative AI · Hadoop

About

I bring 16+ years of executive leadership experience at Amazon, working across AWS, Alexa, and AGI. Currently, I lead Runtime Infrastructure and Inference for Amazon's Nova foundation models, architecting both the systems and algorithms that power next-generation AI experiences across all modalities. My research spans breakthrough inference techniques such as multi-stage disaggregation, 5D parallelism, and speculative decoding; hardware-software co-design for distributed architectures; and novel approaches to multimodal processing. Through close collaboration with academic researchers and publication of our innovations, we advance the field while delivering practical solutions for millions of users through Amazon Bedrock and specialized applications like Alexa+, Prime Video, and Ads.

I manage an organization of engineers, applied scientists, and technical program managers to balance fundamental research with rapid productionization of breakthrough techniques. My approach combines deep expertise in distributed systems optimization with strategic research investments in emerging areas like inference-time scaling and reasoning algorithms. Whether we're researching AI systems or developing real-time streaming algorithms that handle internet scale of millions of requests per second and terabits of data throughput per second, I focus on research that creates both scientific advancement and measurable business impact. I'm passionate about building the foundational research and technologies that will define how AI systems scale and perform in the next decade.

Experience

Amazon

2 roles

Director of Runtime Infrastructure & Inference | Amazon Nova (AGI) | AWS Bedrock

Mar 2023 – Present · 3 yrs · Greater Seattle Area

  • Architected and operated Nova's inference platform on AWS Bedrock, spanning Nvidia GPUs and AWS Annapurna Accelerators (TRN), across multiple regions, achieving 99.99% uptime and sub-100 ms median latency for global workloads.
  • Developed novel multi-stage disaggregated inference, 5D parallelism, architectural sparsity, speculative decoding, and mixed-precision techniques that achieve industry-leading price-performance across Nova model tiers, from Micro to Premier.
  • Built real-time multimodal processing at AWS scale, enabling efficient understanding and generation of text, image, video, audio, and speech with real-time streaming capabilities.
  • Directed a 600+ person organization of top engineers and applied science researchers, balancing research excellence with production reliability at AWS scale.
  • Scaled Nova's serving stack to trillions of tokens per day, serving thousands of Amazon teams (Alexa+, Prime Video, AWS Connect, AWS Q, etc.) and AWS customers.
  • https://aws.amazon.com/ai/generative-ai/nova/
Inference · Multi-Modal Models · Real-Time Inference · Distributed Systems · Artificial Intelligence (AI) · Cloud Computing

Director, Alexa Artificial Intelligence

Oct 2018 – Mar 2023 · 4 yrs 5 mos · Greater Seattle Area

  • Re-architected Alexa's core AI infrastructure for the GenAI era to serve 100% of device traffic and use cases, integrating large language models to seamlessly connect billions of customer voice interactions with optimal experiences across millions of devices, transforming Alexa into a GenAI-driven intelligent agentic assistant.
  • Pioneered self-learning systems and runtime that enable AI to continuously identify poor customer experiences, explore alternative response strategies, and automatically improve decision-making without human intervention.
  • Delivered breakthrough LLM integration innovations that enabled Amazon to license Alexa's GenAI voice technology as 'Alexa Custom Agents' (ACA), creating a new enterprise business line serving Disney, BMW, Lamborghini, and others building AI-driven custom assistants powered by large language models and advanced natural language understanding.
Large Language Models (LLM) · Artificial Intelligence (AI) · Change Management · Agile Development

Amazon Web Services

Senior Manager, Amazon CloudFront | AWS Lambda

Jan 2009 – Jan 2018 · 9 yrs · Greater Seattle Area

  • Built and scaled Amazon CloudFront's globally distributed caching infrastructure to handle internet scale of 15 million+ requests per second and 10+ terabits per second of throughput across 120+ edge locations.
  • Invented and launched AWS Lambda@Edge, enabling serverless compute across global regions — a foundational system design that now influences the architecture of large-scale AI services.
  • Defined architectural strategy for 10× capacity growth with 50% cost reduction, directly shaping how AWS services achieve global elasticity and resilience.
Cloud Computing · Distributed Systems · Software Development

Knolskape Solutions

Research Engineer

Jan 2008 – Jan 2009 · 1 yr · Singapore

  • Designed and developed the first experiential curriculum for INSEAD business school.
  • Among the first four employees at the gaming startup, which disrupted learning through simulations and experiential learning.

Education

Birla Institute of Technology and Science, Pilani

Bachelor of Engineering (Hons.) — Computer Science
