Ayush Rathore

DevOps Engineer

Bengaluru, Karnataka, India · 4 yrs 4 mos experience
Most Likely To Switch

Key Highlights

  • Led large-scale observability migrations saving significant costs.
  • Architected CI/CD platforms enhancing developer experience.
  • Pioneered GitOps practices improving deployment efficiency.
Stackforce AI infers this person is a SaaS-focused DevOps and Platform Engineer with expertise in observability and CI/CD.

Skills

Core Skills

DevOps, Platform Engineering, Observability, CI/CD, Infrastructure Engineering, Site Reliability Engineering, Security Engineering, Cost Optimization

Other Skills

GitOps, Kubernetes, Terraform, New Relic, LGTM, Telemetry, Cost Reduction, Backstage, ArgoCD, Crossplane, Karpenter, Grafana Mimir, Thanos, Custom Development, OpenTelemetry

About

DevOps/Platform Engineer focused on building infrastructure that scales with the product, not against it. Over the past 4 years, I've had the privilege of being among the founding members of the platform/SRE team at two high-growth product companies. That early-stage ownership pushed me beyond just operating infrastructure: I was deeply involved in designing the foundations, making the tradeoffs, and then living with those decisions long enough to understand what scales and what doesn't.

At Zepto, I work on observability infrastructure at a scale most teams don't encounter: a unified LGTM-based platform ingesting 100M+ telemetry datapoints/minute and 100+ TB/day across 800+ services, built after leading a full migration away from a fragmented multi-vendor APM stack. Alongside this, I've driven GitOps-first practices at scale, migrating 10k+ alerts and dashboards from ClickOps to code and building CI/CD platforms that teams actually want to use.

What I care most about is the depth behind the tooling. Kubernetes isn't just a deployment target; I've worked with Operators, CRDs, Karpenter, and Mutating Webhooks, and spent real time troubleshooting production clusters. Observability isn't just dashboards; it's instrumentation strategy, cardinality, query performance, and making sure on-call engineers can actually find the signal. Security isn't a compliance checkbox; it's a design constraint I apply from the start.

If you're building something ambitious in the infrastructure or platform space, I'm always happy to connect.

Experience

Total Experience: 4 yrs 4 mos
Average Tenure: 1 yr 5 mos
Current Experience: 1 yr 6 mos

Zepto

DevOps Engineer 2

Nov 2024 - Present · 1 yr 6 mos · Bengaluru · On-site

  • Spearheaded an end-to-end in-house APM migration away from New Relic, unifying a fragmented multi-vendor observability stack into a single LGTM-based platform that ingests 100M+ telemetry datapoints/minute and 100+ TB/day, saving ≥ 250k USD annually while keeping all developer workflows and habits intact.
  • Architected a comprehensive CI/CD platform using Backstage, ArgoCD, and Crossplane for service deployments, boilerplating, cataloging, and tiering; migrated 18k+ alerts/dashboards out of ClickOps while enforcing mandatory alert coverage across all services and sustaining seamless DevEx.
  • Orchestrated production infrastructure modernization leveraging Karpenter, Kong, Argo Rollouts, and Terraform to accelerate development velocity through intelligent node provisioning, progressive canary deployments, and seamless migrations, optimizing infrastructure costs while maintaining zero-downtime releases.
  • Aided the enterprise-scale metric observability stack migration from Thanos to Grafana Mimir by developing a custom Tenant ID injector to solve query fanout challenges, supporting infrastructure handling multi-million samples per second at 40M+ active series with sub-100ms latency.
GitOps, Observability, CI/CD, Kubernetes, Terraform, DevOps, +1

Quartic.ai

Site Reliability Engineer

May 2024 - Nov 2024 · 6 mos · Bengaluru, Karnataka, India · Remote

  • Kickstarted Observability/APM on all K3s/EKS environments using OpenTelemetry/SigNoz, decreasing MTTD from several hours to a few minutes, and created custom PromQL-based analytics dashboards for internal frameworks/apps.
  • Helm Refactoring: Added reusable template helper functions, macros, and initContainers/sidecars to the product Helm charts, reducing complexity and upgrade time.
  • CI/CD: Replaced the package manager and optimized Docker images for size using multi-stage builds, significantly reducing build time and image sizes.
  • Security: Created Security Group and IAM audit automations and alarms using AWS Config and Azure Policies, and added HashiCorp Vault for secret management, expediting compliance requirements.
OpenTelemetry, Prometheus, Helm, CI/CD, AWS, Site Reliability Engineering

Robomq

Platform Engineer

Jan 2022 - May 2024 · 2 yrs 4 mos · On-site

  • Founding member of the Platform Team; migrated manual deployment/build processes to a fully GitOps workflow using Jenkins/Maven/Helm, increasing developer productivity and shortening release cycles by ≥ 20%.
  • Primary Kubernetes/RabbitMQ cluster administrator; managed provisioning and maintenance of all Dev/Prod cloud-agnostic EC2-based clusters using IaC with Terraform.
  • Designed and shipped a production-grade, product-critical, highly available SFTP server solution (covered on Medium), which reduced SFTP-server-induced product downtime by ≥ 80%.
  • Cut cloud costs by ≥ 40% through right-sizing, Reserved/Spot Instances, and custom anomaly detection scripts.
  • Orchestrated sophisticated network architectures involving REST/OAuth, AMQP-based microservices, HAProxy, NGINX, Apache, WebSockets, Kubernetes Ingress, and Kong/Istio API Gateways. Shipped API-based rate limiting, multi-tenancy, and header-based caching to reduce exploitability and load times by ≥ 68% in the product.
  • Launched observability with Prometheus, Elasticsearch, and Fluentd, cutting MTTD by 30%.
  • Deployed vulnerability detection and code scanning tools such as ZAP, POM, Snort, AWS Inspector, and SonarQube, and enforced them through a custom GitHub PR integration, helping achieve SOC 2 compliance.
  • Primary POC for firefighting/RCAs of all production issues; bootstrapped a Prod Support rotation/escalator bot combining Prometheus, Power Automate, and Jira, decreasing MTTR by ≥ 45%.
GitOps, Kubernetes, Terraform, Prometheus, AWS, Platform Engineering

Education

Birla Institute of Technology, Mesra

Bachelor of Technology - BTech — Computer Science

Jan 2018 - Jan 2022

Maheshwari Public School

Preliminary — Secondary and Senior Secondary Education

Jan 2006 - Jan 2018
