Rahul Varghese

SRE (Site Reliability Engineer)

Bengaluru, Karnataka, India · 9 yrs 2 mos experience
Most Likely To Switch · Highly Stable

Key Highlights

  • Expert in building AI-assisted development workflows
  • Proven track record in multi-tenant SaaS platform development
  • Strong background in Site Reliability Engineering and cloud-native tools
Stackforce AI infers this person is a Site Reliability Engineer with expertise in SaaS and cloud-native infrastructure.

Skills

Core Skills

Site Reliability Engineering · Cloud-native Tools · System Engineering

Other Skills

AI-assisted workflows · Amazon Web Services (AWS) · Bash · Bash scripting · C · CI/CD pipelines · Data Analysis · Datadog · Docker · EKS infrastructure · GitHub · GitLab · GitOps workflows · Go (Programming Language) · Kubernetes

About

Experienced Site Reliability Engineer with excellent skills in Cloud-Native tools. Has a Bachelor of Technology in Electronics and Communication Engineering from FISAT.

Experience

Total Experience: 9 yrs 2 mos
Average Tenure: 1 yr 10 mos
Current Experience: 3 yrs 9 mos

Precisely

2 roles

Principal Site Reliability Engineer

Promoted

Jul 2024 - Present · 1 yr 9 mos

  • Built AI-assisted pull-request review workflows and intelligent auto-approval bots to accelerate development cycles.
  • Developed an intelligent SRE Slack bot integrated with Datadog MCP and internal knowledge bases for SLO suggestions, alert investigations and log analysis.
  • Designed intelligent alerting: correlating alerts with low-traffic patterns and identifying SLO misalignment from past data.
  • Improved notifications for visibility into a fast-paced release environment, covering infrastructure changes, GitLab events, releases, etc.
AI-assisted workflows · Slack bot development · alerting design · GitLab · Site Reliability Engineering · Cloud-Native Tools

Senior Site Reliability Engineer

Jul 2022 - Jul 2024 · 2 yrs

  • Helped build a multi-tenant SaaS platform from the ground up, enforcing microservice best practices for reliability and scalability.
  • Set up EKS infrastructure and GitOps workflows to enable scalable, consistent and fast development environments.
  • Implemented observability best practices using Datadog—SLIs/SLOs, traces, logs, service catalog and reliability scorecards.
  • Led incident management and on-call operations, and drove post-incident improvements.
multi-tenant SaaS platform · EKS infrastructure · GitOps workflows · Datadog · SLIs/SLOs · incident management · +2

Qlik

Site Reliability Engineer

Nov 2018 - Nov 2021 · 3 yrs · Bangalore

  • Designed and built highly scalable Kubernetes infrastructure.
  • Implemented scalable metrics systems with Prometheus, Grafana and Cortex.
  • Added observability to the infrastructure with OpenTelemetry implementations.
  • Managed CI/CD pipelines and software-as-code with GitHub.
  • Contributed to cloud-native open source projects (Go).
  • Designed the migration of a monolithic Python project to a microservice architecture and carried it out.
  • Handled SRE on-call and incident management, encouraged SRE best practices in development, and did other cool SRE stuff.
Kubernetes · Prometheus · Grafana · OpenTelemetry · CI/CD pipelines · GitHub · +2

Endurance International Group

System Engineer

Aug 2017 - Nov 2018 · 1 yr 3 mos · Bangalore

  • Hosting Specialist.
  • Built scripts and other automation for maintaining Linux web/email servers.
  • Performed deep troubleshooting and implemented solutions for Linux server issues.
  • Maintained Windows servers.
Linux · scripting · troubleshooting · Windows servers · System Engineering

Metclouds Technologies

System Engineer

Sep 2016 - Aug 2017 · 11 mos · Kakkanad

  • Installing and maintaining web/email servers.
  • Troubleshooting website issues.
  • Bash scripting for various automations across teams.
  • Customer handling with ticketing.
  • Managing web server administration tools.
web/email servers · Bash scripting · customer handling · System Engineering

Clado Solutions

Engineering Intern

Jun 2016 - Sep 2016 · 3 mos

  • Explored various Linux distributions.
Linux distributions

Education

Federal Institute of Science and Technology: FISAT

Bachelor of Technology (B.Tech.) — Electronics and Communication Engineering

Jan 2012 - Jan 2016
