Aamir Ansari

SRE (Site Reliability Engineer)

Mumbai, Maharashtra, India · 4 yrs 4 mos experience
Most Likely To Switch: Highly Stable

Key Highlights

  • Expert in Kubernetes and AWS EKS management.
  • Strong focus on observability and incident response.
  • Proficient in deploying microservices with Helm.
Stackforce AI infers this person is a SaaS Infrastructure Engineer with strong expertise in cloud-native deployments.

Skills

Core Skills

Kubernetes, AWS, Datadog, Monitoring, Deployment

Other Skills

AWS EKS, Amazon Web Services (AWS), Ansible, Bash, CI, Cassandra, Communication, Computer Science, Containerization, Continuous Delivery (CD), Continuous Integration (CI), Databases, Defining Requirements, Deployment Planning, Design

About

Site Reliability Engineer with 4 years of experience in Kubernetes (AWS EKS), Helm, Terraform, CI/CD, Linux, and cloud-native production systems. Skilled in rolling updates, blue-green deployments, and fixing production-level incidents in distributed environments. Strong in Datadog monitoring, SLO/SLI design, error/frustration dashboards, networking, and automation workflows. Hands-on with Python, Bash, IaC, Consul, microservices debugging, and optimizing reliability, scalability, and performance. Focused on high availability, observability, root-cause analysis, and delivering stable, efficient, and resilient production infrastructure across AWS and on-prem systems.

Experience

Total Experience: 4 yrs 4 mos
Average Tenure: 2 yrs 10 mos
Current Experience: 4 yrs 3 mos

CloudBees

Site Reliability Engineer

Jan 2024 - Present · 2 yrs 5 mos · Remote

  • I’m part of the Platform Engineering team, where we manage infrastructure using Terraform and Pulumi. I’m responsible for production deployments on our AWS EKS clusters, ensuring reliability and scalability across environments.
  • Our platform leverages Kubernetes (EKS) for orchestration, automated through GitHub Actions and CloudBees CI. We use Datadog for observability — defining and tracking SLOs/SLIs to maintain platform stability — and PagerDuty for alerting and incident response.
  • As part of observability initiatives, I collaborate closely with UI teams to identify user frustrations through RUM (Real User Monitoring) sessions and troubleshoot frontend issues. This includes inspecting network activity and APIs via browser developer tools to validate performance and service reliability.
  • We deploy microservices via Helm charts and Helmfiles, where I’ve contributed to creating and maintaining charts to streamline and standardize our deployment workflows.
  • We manage Cassandra and PostgreSQL databases, use NATS for messaging, and HashiCorp Vault for secrets management.
  • I’ve also contributed to service and infrastructure migrations, improving automation, monitoring, and deployment pipelines.
Terraform, Pulumi, AWS EKS, Kubernetes, Datadog, SLOs/SLIs (+6 more)

Freelancer

Fresher

Mar 2022 - Present · 4 yrs 3 mos

Media.net

System Operations Specialist

Feb 2022 - Jan 2024 · 1 yr 11 mos · Mumbai, Maharashtra, India

Education

St Johns Research Institute

Bachelor of Engineering - BE — Computer Science

May 2014 - Jun 2018
