SAJEESH KRISHNAN

SRE (Site Reliability Engineer)

Bengaluru, Karnataka, India · 14 yrs 9 mos experience

Most Likely To Switch · Highly Stable

Key Highlights

Expert in Site Reliability Engineering and automation tools.
Proven track record in incident management and performance optimization.
Hands-on experience with diverse cloud and monitoring technologies.

Stackforce AI infers this person is a Site Reliability Engineer with expertise in E-commerce infrastructure management.

Skills

Core Skills

Site Reliability Engineering, Monitoring

Other Skills

mcp, mcp tool, Azure, OpenStack, OneOps, tomcat, F5, NetScalars, Akamai, Torbit, Grafana, Graphite, Medusa, Seyran, ELK

About

Site Reliability Engineer with hands-on experience in Linux system, network, and application operations. Primarily responsible for keeping sites up and reliable, and has worn multiple hats across network operations, system operations, and application operations. Good command of site reliability, tooling, and automation. Expert in a wide set of monitoring and metrics systems.

Experience

Total Experience: 14 yrs 9 mos

Average Tenure: 3 yrs 8 mos

Current Experience: 8 yrs 5 mos

Walmart

2 roles

Staff Site Reliability Engineer

Jul 2025 – Present · 10 mos

mcp, mcp tool, Site Reliability Engineering

Senior Site Reliability Engineer

Dec 2017 – Jul 2025 · 7 yrs 7 mos

Part of the Site Reliability team, responsible for the availability, latency, performance, efficiency, and monitoring of all apps on Walmart sites.
First contact for P1/P2 incidents: investigation, triage, mitigation, and recovery for any potential or actual site- or service-impacting changes, events, or incidents.
Ensured continuous site and service availability, functionality, performance, and reliability for eCommerce services and for service support delivery systems and components.
Ensured metrics such as MTTD, MTTE, MTTM, and MTTR stayed within SLA by improving processes and building tools.
Identified site reliability issues and process gaps, and suggested solutions.
Facilitated root cause analysis and problem management processes to ensure RCA is performed rapidly and verified, and that root causes are eliminated.
Built automation and tools [Golang, Python, bots, ELK, Graphite, TSDB].
Stack: Azure, OpenStack, OneOps, tomcat, F5, NetScalars, Akamai, Torbit
Tools: Grafana, Graphite, Medusa, Seyran, ELK, Splunk, NewRelic

Yahoo! Inc.

2 roles

Production Engineer, DevOps

Promoted

Jan 2017 – Dec 2017 · 11 mos

IC:
# Responsible for availability, performance, efficiency, production deployments and change management, monitoring, emergency response, and capacity planning for the Yahoo! Infra tools and Ad platform.
# Diagnosing and fixing service-related issues using various monitoring tools/platforms.
# Focusing on both infrastructure (system/clusters/network) and application-level debugging.
# Writing remediation scripts to fix ad hoc/repetitive issues.
# Implementing changes/mass execution/deployment on Yahoo production servers (yinst, pogo, SD, etc.)