Kartikeya Mittal

SRE (Site Reliability Engineer)

Bengaluru, Karnataka, India · 4 yrs 9 mos experience

Key Highlights

Expert in scaling cloud-native infrastructure across multiple platforms.
Proven track record in disaster recovery and system reliability.
Strong background in DevOps and automation solutions.

Stackforce AI infers this person is a SaaS Infrastructure Engineer with expertise in Site Reliability Engineering and DevOps.

Skills

Core Skills

Site Reliability Engineering · Kubernetes · Software Observability · DevOps

Other Skills

Python (Programming Language) · Terraform · Amazon Web Services (AWS) · GCP · Azure · Disaster Recovery · Prometheus · Grafana · Loki · Docker · OpenShift · Go (Programming Language)

About

Site Reliability Engineer (SRE) / DevOps Engineer with 4+ years of experience designing and operating large-scale, cloud-native infrastructure across AWS, GCP, and Azure.

Experience

Total Experience: 4 yrs 9 mos

Average Tenure: 2 yrs 2 mos

Current Experience: 3 mos

Airbnb

System Engineer

Feb 2026 – Present · 3 mos · Bengaluru · Remote

SingleStore

3 roles

Senior Site Reliability Engineer

Promoted

Apr 2025 – Feb 2026 · 10 mos · Remote

Scaled infrastructure globally across 14+ regions by deploying 100+ Kubernetes clusters on AWS, GCP, and Azure with Terraform, and automated deployments through GitLab CI/CD pipelines, reducing provisioning time by 90% and enabling rapid environment rollouts supporting 2,000+ customer databases.
Designed and implemented a full Disaster Recovery (DR) solution for backend infrastructure, achieving seamless failover with zero downtime (RTO < 15 min, RPO ~0) during regional outages.
Developed a custom Kubernetes secrets-sync controller in Go to replicate secrets across multiple clusters in real time.
Managed and optimized Kubernetes clusters across multi-cloud environments, troubleshooting networking and workload issues, reducing incident resolution time, and improving cluster reliability.
Set up and managed an end-to-end observability stack (Prometheus, Alertmanager, Grafana, and Loki) to enable proactive monitoring, custom alerting, and log-based visualization, significantly improving system reliability and debugging efficiency.
Contributed to the on-call rotation, improving SLIs/SLOs and reducing MTTR through deep debugging of distributed systems.

Python (Programming Language) · Kubernetes · Site Reliability Engineering

Site Reliability Engineer 2

Apr 2024 – Apr 2025 · 1 yr · Remote

Kubernetes · Amazon Web Services (AWS) · Site Reliability Engineering

Site Reliability Engineer

Mar 2023 – Mar 2024 · 1 yr · Remote

Python (Programming Language) · Kubernetes · Site Reliability Engineering

Amadeus Labs

DevOps Engineer

Aug 2021 – Mar 2023 · 1 yr 7 mos · Bengaluru, Karnataka, India

Built, delivered, and maintained highly available platforms for airline customers.
Performed root cause analysis, provided support during major incidents, fixed and documented problems, and implemented preventive measures.
Developed automation solutions in Python to reduce manual effort and increase team efficiency.
Worked on container-based deployments using Docker, including Docker images, Docker registries, and OpenShift.

DevOps · Python (Programming Language) · Site Reliability Engineering