Ravikanth Garimella

SRE (Site Reliability Engineer)

Bengaluru, Karnataka, India · 15 yrs 10 mos experience
Most Likely To Switch · Highly Stable

Key Highlights

  • Achieved 99.99% uptime across 50K+ servers.
  • Saved millions through architectural simplification.
  • Mentored engineers and improved team productivity.
Stackforce AI infers this person is a Senior Site Reliability Engineer with expertise in SaaS infrastructure and reliability.

Skills

Core Skills

Site Reliability Engineering, System Design, Data Engineering, Software Development, Database Administration

Other Skills

Application Development, Automation, Capacity Planning, Communication, Critical Thinking, Data Ingestion, Databases, Decision-Making, Distributed Systems, Docker, Git, Go (Programming Language), Hadoop, Java, Kafka

About

I’m a Senior SRE with 13+ years of experience designing and scaling distributed infrastructure at massive scale. At LinkedIn, I’ve helped drive 99.99% uptime across 50K+ servers, led search infra that serves 10K+ QPS, and saved millions in infra spend through architectural simplification and deep observability engineering. My core strengths lie in system design, Golang development, capacity planning, and building automation that turns toil into leverage. I believe in designing systems that scale reliably, evolve safely, and run efficiently — all without burning out the humans behind them. I’ve mentored engineers, partnered cross-functionally across SRE, Dev, and PM orgs, and love turning ambiguity into clear, scalable reliability strategies. Currently exploring Staff/Principal SRE opportunities where I can bring high-impact systems thinking, reliability culture, and cost-aware infra design to the table.

Experience

LinkedIn

2 roles

Senior Site Reliability Engineer

Promoted

Sep 2019 - Present · 6 yrs 6 mos

  • Led LinkedIn's search infrastructure, supporting 100+ verticals with traffic of 10,000 QPS, maintaining 99.99% uptime.
  • Automated OS upgrades for 50,000+ servers, later scaling the framework to 200,000 servers, reducing downtime and operational workload.
  • Collaborated with Dell to resolve DIMM issues, resulting in $1M cost savings across 5,000 hosts.
  • Optimized search performance through Java profiling, JVM tuning, and load testing, significantly reducing hardware costs.
  • Built CLI tools for compliance tracking, delivering real-time insights for faster issue resolution.
  • Designed a Go-based metric ingestion system to achieve low-latency performance and enhance observability.
  • Managed large-scale deployments and CI/CD pipelines, improving release efficiency and SLA compliance.
  • Spearheaded incident response and postmortem analyses, driving process improvements and reducing recurring issues.
  • Key Achievement: Reduced operational cost and improved search performance by 50% through system optimization.
Large-Scale System Design, Distributed Systems, Observability Engineering, Capacity Planning, Go (Programming Language), Site Reliability Engineering, +4

Site Reliability Engineer

Nov 2015 - Sep 2019 · 3 yrs 10 mos

  • Managed large-scale data ingestion pipelines, integrating diverse sources like Kafka, Oracle, MySQL, and third-party vendors.
  • Designed and implemented the China Data Analytics pipeline for seamless multi-colo Hadoop operations.
  • Built tools for efficient debugging and implemented proactive alerting and monitoring systems.
  • Ensured data reliability for executive-level reporting, meeting strict SLAs.
  • Key Achievement: Developed tools that improved triage efficiency by 50% for dataset availability, supporting analytics for the China Data Science and Machine Learning team.
  • Key Achievement: Reduced the SLA for onboarding new China datasets from 2 weeks to 2 days.
Data Ingestion, Hadoop, Oracle Database Administration, MySQL, Kafka, Monitoring Systems, +2

Bank of America

Senior Software Engineer

Jan 2012 - Nov 2015 · 3 yrs 10 mos · Hyderabad Area, India

  • Collaborated with stakeholders to design and develop Credit Risk Applications using PL/SQL workflows.
  • Mentored junior engineers to enhance team productivity and coding skills.
  • Deployed applications seamlessly to UAT and production environments.
PL/SQL, Application Development, Mentoring, Software Development

Tech Mahindra

Oracle DBA

Jun 2010 - Sep 2012 · 2 yrs 3 mos · Bhubaneshwar Area, India · On-site

  • Managed 40+ Oracle databases, performing upgrades, patching, and performance tuning.
  • Automated critical tasks like space reporting and data purging, improving operational efficiency.
Oracle Database Administration, Automation, Performance Tuning, Database Administration

Education

GMR Institute of Technology (GMRIT), GMR Nagar, Rajam, Srikakulam Dt. - 532127 (CC-34)

Bachelor of Technology (B.Tech.) — Computer Science

Jan 2005 - Jan 2009
