Rohit Sahu — DevOps Engineer

DevOps / Site Reliability Engineer with 5 years of hands-on experience automating and optimizing critical deployments across large-scale infrastructure. I apply DevOps principles and Infrastructure as Code (IaC) to design scalable solutions that streamline operations, and I enjoy building one-click workflows that improve scalability and productivity. I have a proven track record of migrating services to cloud platforms, and I continuously work to improve system reliability and efficiency.

I work at the intersection of SRE, DevOps, and distributed systems, focusing on reliability, observability, and deep system behavior. My interest goes beyond dashboards and alerts: I care about the signals systems emit before they fail, and how we can surface what usually stays invisible. Over the years, I’ve worked on production-scale infrastructure where debugging isn’t about logs alone; it’s about understanding system reflexes, performance boundaries, and behavior under stress.

That curiosity led me to build K-Reflex, a project born out of real-world frustration: invisible metrics, delayed signals, and limited insight into what systems are actually doing. It’s an ongoing attempt to rethink observability, drawing inspiration from eBPF, low-level telemetry, and distributed tracing, and to turn raw signals into meaningful insight.

Beyond tools and tech, I strongly believe in:
• Discipline over motivation
• Systems thinking over quick fixes
• Building before talking
• Sharing knowledge openly

I’m actively interested in:
• Observability & eBPF
• Distributed systems & tracing
• Platform reliability at scale
• Automation, infra design, and system internals

Check out: https://kreflex.rohitsahu.me

Keywords: Multi Cloud | Linux | Terraform | Kubernetes | Python | Golang | Senior Site Reliability Engineer | SRE | Senior DevOps

Stackforce AI infers this person is a Cloud Infrastructure Engineer with a strong focus on Site Reliability Engineering in SaaS environments.

Location: Bengaluru, Karnataka, India

Experience: 5 yrs 5 mos

Skills

Cloud Engineering
Site Reliability Engineering
Infrastructure As Code
Automation

Career Highlights

Expert in automating large-scale cloud deployments.
Passionate about enhancing system reliability and observability.
Innovator behind K-Reflex for improved system insights.

Work Experience

Sprinklr

Senior Cloud Engineer (1 yr 3 mos)

Meesho

SDE II - Infrastructure [ SRE + DevOps + Dev Productivity ] (1 yr 7 mos)

Chegg Inc.

Subject Matter Expert (9 mos)

Innovaccer

Site Reliability Engineer (2 yrs 9 mos)

CETPA Infotech Pvt. Ltd.

AWS Intern (5 mos)

Education

B.Tech at Shri Ram Murti Smarak (SRMS) Institutions

Other Skills

Amazon Web Services (AWS)
Google Cloud Platform (GCP)
Cloud Migration
Monitoring & Observability
Software Deployment
Python
Shell
Terraform
Kubernetes
Time Management
Data Structures
Go (Programming Language)
Linux

Experience

Total Experience: 5 yrs 5 mos
Average Tenure: 2 yrs 2 mos
Current Experience: 1 yr 3 mos

Sprinklr

Senior Cloud Engineer

Mar 2025 – Present · 1 yr 3 mos · Hybrid

AWS | GCP | Cloud Engineering

Meesho

SDE II - Infrastructure [ SRE + DevOps + Dev Productivity ]

Oct 2023 – May 2025 · 1 yr 7 mos · Bengaluru, India · Hybrid

Reliability & Incident Management
Troubleshot and resolved production infrastructure incidents, led in-depth root-cause analyses, and developed automated remediation playbooks, improving system uptime and response times.
Infrastructure as Code & GitOps
Designed and maintained Terraform-based, multi-cluster deployments on cloud platforms, integrating GitOps practices to streamline rollouts and enforce configuration consistency.
CI/CD & Automation
Developed and enhanced CI/CD pipelines using industry-standard tools, automating workflows and reducing manual effort across teams.
Cloud Migration & Cost Efficiency
Contributed to a large-scale migration of core services (including 6+ internal services under my ownership) from AWS to GCP, collaborating on automation strategies that improved performance and reduced operational overhead.
Monitoring, Logging & Alerting
Built and refined centralized observability solutions (dashboards, alerting rules, and logging patterns) that enable proactive issue detection and faster troubleshooting.
Internal Tooling & Developer Enablement
Created internal SRE/DevOps tooling (APIs, dashboards, secret management, RCA workflows, alerting) to surface reliability metrics, automate routine diagnostics, and give engineering teams self-service capabilities.
Cross-Functional Collaboration & Mentorship
Mentored interns and SDEs while collaborating with architecture, security, and product teams to establish robust infrastructure standards, strengthen system resilience against critical failures, and drive post-incident reviews and knowledge-sharing guilds.

Google Cloud Platform (GCP) | AWS | Site Reliability Engineering | Infrastructure as Code

Chegg Inc.

Subject Matter Expert

Feb 2021 – Nov 2021 · 9 mos

Provided comprehensive solutions to diverse computer science questions posed by students, with a specialization in coding, mathematics, and related areas.

Time Management | Data Structures

Innovaccer

Site Reliability Engineer

Jan 2021 – Oct 2023 · 2 yrs 9 mos · Noida, Uttar Pradesh, India

Researched and implemented new DevOps practices and automation approaches.
Resolved service, access, and incident Jira issues as an SRE, minimizing downtime.
Troubleshot day-to-day production issues to keep operations uninterrupted.
Developed automation scripts (Python, Shell) to improve efficiency and productivity.
Contributed to SRE reliability OKRs aimed at improving system stability.
DBRE Team
Assisted with a Lambda script that sends SNS alerts on upsize/downsize events for RDS instances.
Collaborated on a Python boto3 script to upgrade 100+ RDS instances to Graviton, reducing costs.
ESCROW
Reconfigured a one-click pipeline with Terraform, Helm, Kubernetes, and ArgoCD.
Automated secret bootstrapping and Vault credential management.
Developed a script for exporting/importing Vault credentials via AWS S3.

Software Deployment | Amazon Web Services (AWS) | Site Reliability Engineering

CETPA Infotech Pvt. Ltd.

AWS Intern

Jun 2020 – Nov 2020 · 5 mos · Noida, Uttar Pradesh, India

Worked with a range of AWS services, designed scalable architectures with VPC and Auto Scaling, and deployed applications using Elastic Beanstalk and Docker.
Gained hands-on experience in monitoring and optimizing AWS resources with CloudWatch and CloudTrail.

Amazon Web Services (AWS) | Automation | Cloud Engineering