Manivannan G

DevOps Engineer

Bengaluru, Karnataka, India · 14 yrs 5 mos experience
Most Likely To Switch · Highly Stable

Key Highlights

  • Built and led high-performing DevOps teams.
  • Expert in cloud infrastructure and automation.
  • Achieved 99.99% uptime in production environments.
Stackforce AI infers this person is a DevOps leader in Fintech and SaaS industries, specializing in cloud infrastructure and automation.

Skills

Core Skills

DevOps, Cloud Operations, Site Reliability Engineering

Other Skills

AWS, AWS EKS, Alerting, Amazon EKS, Ansible, Automation, Bash, CI/CD, Chef, Cloud Computing, CollectD, Communication, Computer Networking, Continuous Integration (CI), Core Java

About

Hands-on DevOps leader with 14 yrs of SRE/DevOps experience, including 5 yrs as an Engineering Manager. At Qubole and Moveworks, built SRE/DevOps teams from the ground up, coaching and mentoring engineers ranging from Intern to Staff level. Plans and builds cost-optimized infrastructure for distributed systems without compromising performance at large scale (both monolith and microservice environments). Automates with shell/Python scripts. Designs and implements monitoring and alerting frameworks with Prometheus, Grafana, Chef/Ansible and Kubernetes. Proficient in AWS, Kubernetes, Linux, Python and Terraform; intermediate in GCP. Strives for automation, scalability and reliability of the production platform with 99.99% uptime.

Experience

Total Experience: 14 yrs 5 mos
Average Tenure: 2 yrs 4 mos
Current Experience: 4 yrs 8 mos

Moveworks

2 roles

Senior Engineering Manager - DevOps

Promoted

Oct 2024 - Present · 1 yr 6 mos

Engineering Manager - DevOps

Aug 2021 - Nov 2024 · 3 yrs 3 mos

  • First DevOps hire in India; built a balanced team of 10.
  • Leading various verticals: multi-region, CI/CD optimisation, developer productivity/experience initiatives, cost optimisation, FedRAMP, reliability and fault tolerance.

Recko | a Stripe company

Engineering Manager - DevOps

Jul 2020 - Aug 2021 · 1 yr 1 mo · Bengaluru, Karnataka, India

  • Led the DevOps and SRE team at Recko, a fintech startup handling large volumes of financial transaction data. The team built and managed a production platform that is performant, reliable, secure and compliant (PCI, SOC2). Migrated the monolith infrastructure to Kubernetes on AWS EKS. Contributed technically alongside managerial responsibilities.
  • Led the infra migration from monolith to microservices (Kubernetes on AWS EKS)
  • Planned, executed and optimised platform reliability, scaling, monitoring, logging, tracing, deployment and infrastructure as code
  • Consistently identified repeatable/error-prone patterns and automated them
  • Enabled seamless deployment and config management with CI/CD and GitOps (ArgoCD)
  • Helped the dev team containerise apps, reviewing reliability, scalability and security aspects
  • Drove initiatives on cost reduction, infra security and developer productivity
  • Standardised and streamlined the operational processes/tools for change management
  • Defined SLOs/SLIs; measured, monitored and reviewed them with stakeholders
  • Reviewed on-call duties and alerts/incidents; reduced toil and carried out RCA for incidents
  • Hired and coached the DevOps engineers on the team and provided technical guidance
Kubernetes, AWS EKS, CI/CD, GitOps, Monitoring, Logging, +4

Qubole

2 roles

Staff Site Reliability Engineer

Jun 2019 - Jul 2020 · 1 yr 1 mo

  • Joined as one of the founding SREs; built and led the team.
  • Contributions:
  • Tech Lead for the SRE team; joined as one of the first SREs, then built and grew a team of 5.
  • Building, automating and maintaining infrastructure for highly distributed environments in AWS across multiple regions.
  • Designed monitoring and alerting solutions (CollectD, SignalFx, Prometheus, ELK, New Relic)
  • Deployments with Kubernetes, Chef, Jenkins
  • Cloud cost optimization (AWS, GCP)
  • Tooling automation for various purposes (Python, shell)
  • Cross-team collaboration with security, QA and dev teams
  • Troubleshooting Linux system and application issues
  • Disaster recovery plans and tests
  • Incident management
  • Also part of the 16x7 on-call rotation
AWS, Kubernetes, Monitoring, Alerting, Automation, Incident Management, +2

Senior Site Reliability Engineer

Aug 2017 - Jun 2019 · 1 yr 10 mos

Flipkart

Operations Engineer-III (DevOps)

Jul 2015 - Aug 2017 · 2 yrs 1 mo

  • DevOps for the Flipkart Ads team
  • Built a scalable production environment on KVMs
  • Supported tooling that lets developers build and deploy seamlessly with Ansible
  • Designed efficient monitoring and alerting solutions for critical JVM app and system metrics
  • Installed, configured and managed Storm, Aerospike (NoSQL), MySQL and OpenTSDB clusters with Ansible
  • Automated mundane tasks with shell and Python scripts to reduce manual intervention
  • Performance-tuned HAProxy load balancers to handle high QPS
  • Monitored system and application uptime and performance (using Nagios, CollectD, a Graphite cluster, OpenTSDB)
  • As part of the 16x7 on-call team, supported Linux multi-tier infrastructure in production to ensure high availability, along with DR plans to ensure business continuity
  • Managed a small-scale AWS cluster (EC2, S3) and Route 53

CenturyLink

Software Engineer

Nov 2013 - Jul 2015 · 1 yr 8 mos · Bangalore

Ansible, Monitoring, Automation, Shell Scripting, Python, DevOps, +1

Nokia

Operations Engineer

Nov 2011 - Nov 2013 · 2 yrs · Bangalore

Education

Anna University Chennai

B.Tech — Information Technology

Jan 2007 - Jan 2011
