Muthuraj Thangavel

DevOps Engineer

Bengaluru, Karnataka, India · 8 yrs 9 mos experience
Most Likely To Switch · Highly Stable

Key Highlights

  • Led DevOps team for Swiggy's cloud infrastructure.
  • Designed private cloud infrastructure with significant budget management.
  • Authored educational content for aspiring SRE professionals.
Stackforce AI infers this person is a Cloud Infrastructure and SRE expert in the SaaS industry.

Skills

Core Skills

DevOps · SRE · Distributed Tracing · Cloud Infrastructure · Configuration Management

Other Skills

AWS · Bash · GCP · Git · Golang · Kafka · Leadership · Linux System Administration · Monitoring · OpenStack · Public Speaking · Puppet · Python · VPN · Virtualization

About

I lead the DevOps team that runs the infrastructure powering all of Swiggy's business. We design, build and operate Swiggy’s cloud infrastructure and supporting platforms to provide a seamless experience to our internal and external consumers. We are responsible for the key operational pillars (Reliability, Observability, Elasticity, Security and Governance) of the cloud infrastructure at Swiggy, and we strive to excel at and continuously improve on them. I am an individual contributor turned engineering manager with deep-rooted fundamentals in Linux, networking, private and public cloud infrastructure, and SRE principles. I seek to lead from the front and stay hands-on. I believe in enabling the team to be the best engineers they can be and letting them work their magic.

Experience

Swiggy

3 roles

Senior Engineering Manager

Promoted

Nov 2024 - Present · 1 yr 4 mos · Bengaluru, Karnataka, India

  • Leading the DevOps team that takes care of Swiggy's infrastructure and reliability.
Leadership · DevOps · SRE

Engineering Manager

Aug 2023 - Nov 2024 · 1 yr 3 mos · Bengaluru, Karnataka, India

SDE-4, DevOps and Reliability Engineering

Apr 2022 - Aug 2023 · 1 yr 4 mos · Bengaluru, Karnataka, India

LinkedIn

Site Reliability Engineer

Feb 2021 - Apr 2022 · 1 yr 2 mos · Bangalore Urban, Karnataka, India

  • Product SRE @ LinkedIn
  • Supporting LinkedIn Events and Groups products.
  • Part of the team that supports LinkedIn Events, one of LinkedIn's fastest-growing products. Responsible for day-to-day operations, monitoring and scaling.
  • Part of the team that supports LinkedIn Groups, currently working on improving SRE tenets for Groups.
  • Distributed tracing inside LinkedIn with Jaeger - working on a POC that integrates Jaeger with the in-house tracing system to enable truly real-time distributed tracing, with Jaeger as the front end and our custom home-brewed solution as the middleware. The work is done in Go.
  • School of SRE
  • Wrote the School of SRE (SoS) System Design module. SoS is an attempt by LinkedIn to give back to the community and address the lack of resources for someone looking to venture into the SRE career path. SoS also aims to serve as a useful resource for ramping up college grads on SRE work, and is expected to act as a curriculum.
  • My module is at https://linkedin.github.io/school-of-sre/level102/system_design/intro/ and teaches system design to readers who already have a reasonable familiarity with the basic concepts.
SRE · Distributed Tracing · Golang

Directi

2 roles

Site Reliability Engineer - II

Promoted

Jul 2019 - Feb 2021 · 1 yr 7 mos

  • Designed and implemented Media.net's private cloud infrastructure based on OpenStack, with capex of around 750k USD, and played a key role in procuring servers and network devices. Hardware selection and negotiation were key functions.
  • Led the charge on setting up the cloud, on-boarding new teams, troubleshooting performance issues and making it production-ready.
  • Scaled up the private cloud to have ~500 instances in production in one location.
  • Worked with multiple teams to ensure we had efficient utilisation of hardware, ensuring infra costs were a fraction of the cloud costs for specific applications.
  • Led and completed the migration of close to 300 instances off the existing legacy CloudStack setup within 3 months, involving multiple teams.
  • Worked on inter-region connectivity to AWS without AWS Direct Connect, using IPsec with strongSwan and ECMP on commodity hardware. This setup continues to save tens of thousands of dollars on a monthly basis.
  • Worked on the AWS-to-GCP migration and bootstrapped the GCP setup. Worked on project and network planning, monitoring and DNS infrastructure on GCP.
  • Set up automated TLS serving for our domain parking product, which handles millions of domains every year.
  • Worked on the migration to a new set of hardware load balancers, ensuring near-zero downtime for the transition and engaging vendor and internal teams for a seamless upgrade.
  • Worked on automation with Python, aiming to eliminate repetitive manual tasks.
  • Troubleshot and solved multiple complex problems involving various layers of the stack.
OpenStack · AWS · Python · Cloud Infrastructure · SRE

Site Reliability Engineer

Jun 2017 - Jun 2019 · 2 yrs

  • Worked as a core member of the Central Infrastructure team.
  • Handled the configuration management system with Puppet and scaled it up to support thousands of nodes by adding multiple compile masters.
  • Worked with the central Kafka team to build management tooling around our Kafka infrastructure.
  • Worked on inter-DC and inter-region connectivity for on-prem and AWS infrastructure, using pfSense on commodity hardware.
  • Set up and maintained an internal apt repository for internal packages, plus a mirror for upstream packages, integrated with internal CI pipelines.
  • Played key roles in our AWS presence: set up NAT instances to save costs, built inter-region tunneling before cross-region VPC peering was GA, and managed multiple site-to-site VPN and inter-cloud connections.
  • Managed the internal DNS infrastructure spanning >10 locations.
  • Deprecated multiple legacy setups (e.g., older gitosis), ensuring a smooth transition to their replacements internally.
Puppet · Kafka · AWS · Configuration Management · Cloud Infrastructure

Education

College of Engineering, Guindy

Bachelor of Engineering - BE — Computer Science

Jan 2013 - Jan 2017
