Shubham Pandey

DevOps Engineer

Bengaluru, Karnataka, India · 3 yrs 2 mos experience
Highly Stable

Key Highlights

  • Delivered zero-downtime releases for AWS EKS workloads.
  • Reduced MTTR by 30% through advanced observability.
  • Automated workflows, cutting manual effort by 35%.
Stackforce AI infers this person is a Cloud Infrastructure and Site Reliability Engineering expert in the SaaS industry.

Skills

Core Skills

Site Reliability Engineering, Cloud Infrastructure, Reliability Engineering, Automation, Performance Engineering, Full-stack Development, Software Development

Other Skills

AWS, GCP, Terraform, CloudFormation, Incident management, RCA, SLA compliance, capacity planning, Jenkins, GitHub Actions, Ansible, Python, Shell scripting, Splunk, Prometheus

About

Currently working as a Software Engineer (SRE) at Xoriant, I bring 3 years of experience in cloud infrastructure, reliability engineering, and DevOps automation. My focus is on building scalable, resilient, and high-performing systems across both AWS and GCP environments.

  • Cloud & Infrastructure: AWS, GCP, Terraform, CloudFormation
  • Reliability & Operations: Incident management, RCA, SLA compliance, capacity planning
  • Automation & CI/CD: Jenkins, GitHub Actions, Ansible, Python, Shell scripting
  • Monitoring & Observability: Splunk, Prometheus, Grafana, CloudWatch

Key highlights:

  • Delivered zero-downtime releases for containerized workloads on AWS EKS.
  • Reduced MTTR by 30% through advanced monitoring and observability.
  • Automated workflows to cut manual effort by 35%.
  • Ensured platform stability during high-traffic product launches.

Certified as an AWS Cloud Practitioner, AWS AI Practitioner, and Google Associate Cloud Engineer, I thrive at the intersection of DevOps and SRE, driving operational excellence and scaling mission-critical systems across multi-cloud platforms.

Experience

Total Experience: 3 yrs 2 mos
Average Tenure: 3 yrs 1 mo
Current Experience: 1 mo

Xoriant

Software Engineer

May 2026 - Present · 1 mo · Bengaluru · Hybrid

AWS, GCP, Terraform, CloudFormation, Incident management, RCA (+13 more)

Wipro

Project Engineer - Apple COE

Mar 2023 - Apr 2026 · 3 yrs 1 mo · Bengaluru · Hybrid

Site Reliability Engineering & Cloud

  • Managed reliability of high-traffic Apple Online Store systems handling millions of users.
  • Defined SLIs/SLOs and used error budgets to balance reliability and release velocity.
  • Built scalable AWS infrastructure (EC2, EKS, VPC, S3, IAM) using Terraform and CloudFormation.
  • Ran containerized workloads on Kubernetes (EKS) with auto-scaling and zero-downtime deployments.
  • Implemented HPA to handle peak traffic and maintain system stability.
  • Designed event-driven systems using SQS/SNS.

Observability & Reliability

  • Improved availability to 99.95% using Prometheus, Grafana, CloudWatch, and Splunk.
  • Reduced MTTR by 30% via better alerting and faster detection.
  • Cut alert noise by 40%, improving on-call efficiency.
  • Built dashboards for latency, errors, throughput, and infrastructure health.

Incident & Production Support

  • Led outage debugging using logs, metrics, and microservices tracing.
  • Performed RCA, reducing recurring incidents by 20%.
  • Handled on-call rotations and ensured quick recovery.
  • Managed production readiness for high-traffic releases.

Performance Engineering

  • Led performance testing for 500+ backend services.
  • Designed load, stress, spike, and soak tests using JMeter.
  • Simulated high traffic to find bottlenecks across the application, database, and infrastructure layers.
  • Performed capacity planning for peak readiness.

CI/CD & Automation

  • Optimized CI/CD pipelines (Jenkins, GitHub Actions), reducing failures by 25%.
  • Reduced deploy time from 15 minutes to 6 minutes.
  • Implemented deployment validation, rollback, and blue-green deployments.
  • Automated tasks using Python, Shell, and Ansible.

Kubernetes & Debugging

  • Debugged CrashLoopBackOff and OOMKilled issues.
  • Managed services and ingress configurations.
  • Troubleshot TCP/IP, DNS, and load-balancing issues.
  • Analyzed request flow to find latency bottlenecks.

AWS, Kubernetes, Terraform, CloudFormation, Incident management, RCA (+12 more)

Pw (physicswallah)

Operations Engineer

Feb 2023 - Mar 2023 · 1 mo · Remote

Tata strive

AWS ReStart - Cloud Practitioner

Sep 2022 - Jan 2023 · 4 mos · Gurugram · Remote

  • Covered AWS Cloud concepts, AWS services, security, architecture, pricing, and support to build foundational AWS Cloud knowledge.
AWS, Cloud concepts, AWS services, Security, Architecture, Pricing (+2 more)

Stackroute learning

Java Full Stack Developer

May 2022 - Aug 2022 · 3 mos · Uttar Pradesh, India · Remote

HTML, JavaScript, Bootstrap, Hibernate, REST APIs, CSS (+4 more)

Infosys

System Engineer

Feb 2022 - May 2022 · 3 mos · Mysore, Karnataka, India

  • Worked at Infosys as an intern, where I enhanced my skills and knowledge.
JavaScript, Angular, Software Development, CSS, C#, HTML5 (+2 more)

Nucleus software

Software Engineer

Jan 2022 - Feb 2022 · 1 mo · Remote

Education

United College Of Engineering and Research

Bachelor of Technology - B.Tech — Information Technology

Jul 2018 - Jun 2022

St. Francis School

Intermediate — PCM

Apr 2017 - May 2018

St. Francis School

High School — Science

Apr 2015 - May 2016
