Tahir Mehraj — SRE (Site Reliability Engineer)
I've spent the last 10 years in the trenches of site reliability engineering, from managing telecom networks in Kashmir to scaling cloud infrastructure at Atlassian serving 45,000+ instances monthly.

Here's what nobody tells you about SRE: The real challenge isn't just keeping systems up. It's doing it cost-effectively, at scale, without burning out your team.

THE NUMBERS
At Atlassian, I cut AWS infrastructure costs by 35% while improving reliability. Saved 906 engineering hours monthly by eliminating pipeline flakiness. Migrated 2,000+ customer instances to the cloud without a major incident.

These weren't accidents. They came from obsessive focus on building self-healing systems, making observability actually useful, automating repetitive work, and treating cost optimization as an engineering discipline.

WHAT I SHARE HERE
I write about the real, unglamorous work of SRE:
- War stories from 3 AM incidents and what they taught me
- Practical Kubernetes cost optimization patterns
- Building observability that prevents outages
- Lessons from 10 years of on-call rotations
- The business case for reliability engineering
- Honest takes on cloud architecture decisions

No fluff. No theory without practice. Just what actually works when you're responsible for production systems.

MY BACKGROUND
Senior SRE at CrowdStrike, previously at Atlassian. Working with AWS, Kubernetes, Datadog, Terraform, Python, and the cloud-native ecosystem.

Started in telecom network operations, moved through NOC and infrastructure support, eventually landing in SRE/DevOps. I've been the person getting woken up at 3 AM and the person designing systems that don't wake anyone up.

Conducted 80+ technical interviews. Mentored engineering teams. Built automation saving 1,500+ FTE days. Achieved 99.9% uptime while reducing costs.

WHY FOLLOW ME
If you're dealing with cloud costs spiraling out of control, pipelines that fail randomly, or alerts that wake you up for nothing, I've been there. I share what I learned the hard way so you don't have to.

Hit Follow for practical SRE insights without the buzzwords.

Want to connect? Send me a note about what you're working on. Always interested in interesting infrastructure challenges.

#SRE #DevOps #CloudArchitecture #AWS #Kubernetes #Observability
Stackforce AI infers this person is a Site Reliability Engineer with extensive experience in cloud infrastructure and automation in the SaaS industry.
Location: Bengaluru, Karnataka, India
Experience: 10 yrs 8 mos
Skills
- Site Reliability Engineering
- Cloud Architecture
- Monitoring & Observability
- Continuous Integration And Continuous Delivery
- Data Center Architecture
Career Highlights
- Reduced AWS costs by 35% at Atlassian.
- Saved 906 engineering hours monthly through automation.
- Achieved 99.9% uptime for critical services.
Work Experience
CrowdStrike
Senior Site Reliability Engineer (6 mos)
Atlassian
Software Engineer (1 yr 2 mos)
Senior DevTools Engineer (1 yr 11 mos)
DevTools Engineer (2 yrs 7 mos)
Khoros
MCS Engineer - II (1 yr)
MCS Engineer - I (4 mos)
Infogain
Critical Support Engineer (1 yr 6 mos)
Ericsson India
Network Operations Engineer (1 yr 8 mos)
Education
Bachelor of Engineering - BE at University of Kashmir