Komal Singh

SRE (Site Reliability Engineer)

Maharashtra, India · 7 yrs 6 mos experience
Highly Stable

Key Highlights

  • 8+ years in DevOps and SRE roles.
  • Expert in cloud infrastructure and automation.
  • Strong advocate for SRE best practices.
Stackforce AI infers this person is a Senior DevOps and SRE professional specializing in scalable cloud infrastructure and automation.

Skills

Core Skills

Site Reliability Engineering, Cloud Computing, DevOps Automation, Cloud Cost Optimization

Other Skills

Cloud solutions, Terraform, Ansible, Kubernetes, Prometheus, Grafana, Elastic APM, Sentry, Python, Bash, CI/CD Pipelines, AWS CloudFormation, Puppet, AWS, Git

About

I am a results-oriented Senior DevOps and SRE professional with 8+ years of experience designing and maintaining scalable, resilient, and observable infrastructure in fast-paced, high-growth environments. I've led critical platform initiatives, including high-availability infrastructure setup, cost-optimized cloud infrastructure, optimized CI/CD frameworks, and an incident response system that directly supported large-scale consumer applications. My core focus is to reduce operational toil, enforce SRE best practices such as SLAs/SLOs, and empower teams through automation and monitoring. I'm also an advocate of continuous improvement and blameless culture, with a strong grasp of distributed systems, performance tuning, and secure deployment practices. Beyond my technical expertise, I'm committed to continuous self-development. I enjoy good music, thought-provoking books, and the occasional jog in the rain, a simple way to recharge and find inspiration.

Experience

NielsenIQ

Senior Site Reliability Engineer - Cloud Platform

Sep 2025 - Present · 6 mos · Pune, Maharashtra, India · Remote

  • Build and manage highly available, scalable, and reliable cloud solutions.
Cloud solutions, Site Reliability Engineering, DevOps Automation, Cloud Computing

Haptik

2 roles

Sr. DevOps Engineer

Promoted

Apr 2022 - Sep 2025 · 3 yrs 5 mos

  • As a Senior DevOps Engineer, I collaborate across multiple teams to provide scalable, cost-effective solutions. My day-to-day responsibilities include:
  • Implement SRE principles: SLAs/SLOs, observability dashboards, and proactive alerting with Prometheus and Grafana.
  • Automate cloud infrastructure with Terraform and Ansible, eliminating manual toil and improving reliability.
  • Build and maintain scalable Kubernetes clusters (AKS/GKE) with auto-scaling, progressive rollouts, and high availability.
  • Reduce incident MTTR by 40% through root cause analysis with Elastic APM, Sentry, and custom logging solutions, and by fostering a blameless postmortem culture.
  • Enable AI/ML delivery through custom CI/CD pipelines, self-service deployment frameworks, and secure DevOps practices.
  • Optimize cloud costs by 60% via resource rightsizing, intelligent storage tiering, and decommissioning unused assets.
  • Support zero-downtime deployments through canary releases and rollout strategies in coordination with product teams.
  • Automate backup and restore routines and configure high availability across cloud components to ensure RTO and RPO are maintained.
  • Create comprehensive documentation, runbooks, and SOPs, and mentor DevOps engineers to raise SRE and engineering maturity.
Site Reliability Engineering, DevOps Automation, Terraform, Ansible, Kubernetes, Prometheus, +3

DevOps Engineer

Jul 2019 - Mar 2022 · 2 yrs 8 mos

  • As a DevOps Engineer, I was responsible for automation tasks and optimizing the CI/CD pipelines. My day-to-day responsibilities included:
  • Automate routine operational tasks and environment provisioning using Python and Bash, reducing manual interventions.
  • Build and optimize CI/CD pipelines and internal tooling to streamline deployments, significantly improving release velocity and developer productivity.
  • Monitor infrastructure and application health across multiple data centers, enabling faster detection and resolution of issues through proactive alerting.
  • Manage load balancers, DNS configurations, and CDN integration to improve system availability.
  • Assist in designing and automating cloud infrastructure using Terraform and AWS CloudFormation, gaining hands-on experience in managing hybrid cloud architectures.
Python, Bash, CI/CD Pipelines, Terraform, AWS CloudFormation, DevOps Automation

Directi

Operations Engineer DevOps

Jun 2017 - May 2018 · 11 mos · Mumbai Area, India

  • At Directi, I worked on the Content Monetisation team of Media.net. The role involved maintaining and renewing the existing infrastructure and writing automation scripts to improve performance.
  • Responsibilities:
  • Manage and update services across different data centers.
  • Work with Continuous Integration and Continuous Deployment pipelines and tools.
  • Provision servers and deploy features using Puppet.
  • Provision and manage AWS EC2 servers.
  • Analyze AWS S3 data to extract meaningful metrics and create dashboards using Kibana.
  • Automate infrastructure by writing Python scripts.
  • Issue and renew SSL certificates used for authenticating domain names.
  • Identify and fix issues that occur across multiple environments and data centers.
  • Ensure high availability, monitoring, alerting & backups are in place.
  • Work with servers, load balancers, heterogeneous DB technologies, virtualization & containers.
  • Tools and Technologies:
  • AWS, Puppet, Git, Docker, Jenkins
Puppet, AWS, Git, Docker, Jenkins, DevOps Automation

BMC Software

Java Developer Intern

Aug 2016 - Jun 2017 · 10 mos · Pune Area, India

  • During the internship, I focused on researching various ways to implement data migration and identifying the most effective technique for the company.
  • Responsibilities:
  • Designed and developed an Enhanced Data Migration Utility using BMC Remedy which successfully reduced the number of data migration cycles.
  • Coded in Java, using the JDBC API to interact with Oracle Database 12c and the Swing GUI widget toolkit to build the UI.
  • Wrote and published white papers based on the project at the National Conference.
Java, JDBC API, Oracle Database, Swing GUI

Mahagenco-MSPGCL

Summer Intern

Apr 2016 - May 2016 · 1 mo · Mumbai Area, India

  • I worked on a desktop-based project coded entirely in C#, with SQL Server as the database. I was solely responsible for its development, management, and deployment.

Education

The University of Texas at Austin

Postgraduate Degree — Artificial Intelligence and Machine Learning

Aug 2024 - Sep 2025

MIT Academy of Engineering, Alandi, Pune

Bachelor’s Degree — Computer Engineering

Jan 2013 - Jan 2017

Bal Bharati Public School

High School

Jan 2011 - Jan 2013

Vishwajyot Schools - India

Matriculation

Jan 2007 - Jan 2011
