Akshay Verma

SRE (Site Reliability Engineer)

Bengaluru, Karnataka, India · 8 yrs 3 mos experience

Highly Stable

Key Highlights

Led cloud migration with zero downtime.
Achieved significant cost savings through infrastructure optimization.
Mentored junior engineers, enhancing team capabilities.

Stackforce AI infers this person is a Cloud Infrastructure Engineer specializing in Site Reliability Engineering and DevOps.

Skills

Core Skills

Site Reliability Engineering, Amazon Web Services (AWS), Kubernetes, DevOps

Other Skills

Amazon ECS, Cloud Cost, Graviton, Cloudflare, Kustomize, Terraform, AWS, CI/CD, Nginx, Python (Programming Language), Elastic Stack (ELK), Disaster Recovery, Docker, Jenkins, SQL

About

Site Reliability Engineer at Booking Holdings focused on building reliable and scalable cloud platforms on AWS. Experienced in cloud migrations, automation, and infrastructure modernization, I work towards improving platform resilience, optimizing costs, and enabling engineering teams to operate efficiently at scale.

Experience

8 yrs 3 mos

Total Experience

2 yrs 8 mos

Average Tenure

2 mos

Current Experience

Booking Holdings (NASDAQ: BKNG)

Site Reliability Engineer II

Mar 2026 – Present · 2 mos · Bengaluru, Karnataka, India · Hybrid

Amazon Web Services (AWS), Kubernetes, Amazon ECS, Cloud Cost, Graviton, Site Reliability Engineering

CRED

Site Reliability Engineer

Oct 2021 – Feb 2026 · 4 yrs 4 mos · Bangalore Urban, Karnataka, India · On-site

Owned end-to-end cloud and platform reliability for 800+ microservices running on AWS ECS and EKS, enforcing IAM best practices, account isolation, and security compliance at scale.
Led 70% adoption of AWS Graviton and increased Spot Instance usage to 45%, achieving 40–60% cost savings while improving performance; validated Spot resilience using AWS FIS and improved shutdown handling with SIGTERM-aware services.
Migrated 50+ microservices and databases from Azure VMs to AWS EKS in 7 days, using Kustomize and Cloudflare-based traffic switching, cutting infrastructure costs by ~50% with zero downtime.
Spearheaded full-stack infra modernisation and cloud cost optimisation for a CRED subsidiary, driving ~60% cloud savings: moved legacy monoliths to a resilient containerised stack with zero downtime, unified observability, streamlined CI/CD, and strengthened the org’s security posture through improved governance and security controls.
Architected and implemented production-grade AWS EKS clusters with Karpenter, KEDA, Spot, and Graviton, enabling cost-efficient autoscaling and resilient node lifecycle management.
Led infrastructure planning for IPL-scale events, handling millions of requests per minute through ELB pre-warming, time-based autoscaling, and custom step-scaling strategies.
Built secure, compliance-ready infrastructure for India’s CBDC launch using Pulumi, establishing critical NPCI connectivity and enabling a successful public rollout.
Owned 24×7 SRE on-call operations via PagerDuty; drove incident triage, RCAs, and postmortems that reduced MTTR and improved overall system reliability.
Mentored interns and junior engineers, conducted interviews, and led hands-on training sessions to raise team reliability maturity and operational ownership.
Led SRE ownership for the Wealth charter, collaborating with product and engineering teams while managing and mentoring 2 SREs to meet reliability, security, and compliance requirements.

Terraform, Nginx, Python (Programming Language), Elastic Stack (ELK), Amazon Web Services (AWS), Disaster Recovery, +1

TO THE NEW

2 roles

DevOps Engineer

Feb 2019 – Oct 2021 · 2 yrs 8 mos

Built and managed 3-tier app architecture on AWS ECS with auto-scaling, logging, and monitoring via Terraform and Chef, improving efficiency and cutting costs by 30%.
Developed CI/CD pipelines using Jenkins and AWS CodeStar to enable zero-downtime deployments across multiple environments.
Took part in the full cloud infrastructure migration from the AWS Singapore region to the AWS Mumbai region using Terraform, with zero downtime.
Applied security best practices with Dome9, including CIS benchmarks and server hardening, leading to consistent vulnerability remediation.
Trained and onboarded 25+ DevOps engineers via structured bootcamps, accelerating ramp-up on core tools and processes.

Intern

May 2018 – Jul 2018 · 2 mos · Noida Area, India

AWS: S3, EC2, VPC, Route53, ECS, Lambda, RDS, Cognito, DMS, etc.
DevOps tools: Docker, Jenkins, Rundeck, Nagios, Spotinst, TIG (Telegraf, InfluxDB, Grafana).
1. Set up monitoring via Nagios with Flock integration
2. Enabled ALB + Google Auth on servers using AWS Cognito
3. Set up a staging environment using Terraform
4. Set up Docker Compose for various services