Akshay Agarwal

DevOps Engineer

Hyderabad, Telangana, India · 9 yrs 8 mos experience
AI Enabled · AI ML Practitioner

Key Highlights

  • 9+ years of experience in platform and infrastructure engineering.
  • Expert in Kubernetes, Terraform, and cloud architecture.
  • Proven track record of building scalable and resilient systems.
Stackforce AI infers this person is a SaaS Infrastructure Engineer with expertise in cloud platforms and DevOps practices.

Skills

Core Skills

Google Cloud Platform (GCP), Kubernetes, Node.js, Google BigQuery

Other Skills

Airflow, Terraform, Python, Amazon Web Services (AWS), Apache Flink, SQL, API Development, Artificial Intelligence (AI), Agents, Machine Learning, Software Development, Leadership, Big Data, Goal Setting, Software Design

About

I’m a Platform & Infrastructure Engineer with 9+ years of experience building resilient, scalable systems at high-growth tech companies like Razorpay, Gojek, and Slice. I specialize in designing internal platforms that empower developers, from self-service Kubernetes deployment pipelines and Terraform-based IaC to monitoring, alerting, and cost-optimized cloud architecture on AWS and GCP.

What I bring to the table:

  • Deep hands-on experience in Kubernetes, Terraform, Helm, AWS, and GCP
  • Architect self-service platforms for app delivery, observability, and security
  • Automate infrastructure provisioning with Terraform, Helm, GitOps, and ArgoCD
  • Build robust Kubernetes platforms (multi-tenant, secure, cost-optimized)
  • Implement developer-focused CI/CD pipelines and secrets management at scale
  • Integrate SLOs, auto-scaling, and fine-grained alerting for system health

What I care about:

  • Building infra that’s invisible until it’s needed
  • Reducing cognitive load for product developers
  • Metrics-driven scaling and SLO-driven reliability

Experience

Total Experience: 9 yrs 8 mos
Average Tenure: 2 yrs 2 mos
Current Experience: 11 mos

Salesforce

Senior Member of Technical Staff

Jul 2025 - Present · 11 mos · Hyderabad, Telangana, India · Hybrid

Slice

Lead Infrastructure Engineer

Apr 2024 - Jun 2025 · 1 yr 2 mos · Hybrid

Gojek

Senior Data Engineer (Infrastructure)

Feb 2022 - Mar 2024 · 2 yrs 1 mo · Hybrid

  • Developed an API in Node.js, with unit tests, for the Data Console Workflows tool, which simplifies self-serve access to data by combining individual building blocks into workflows.
  • Designed and implemented, with test cases, support for consuming from SSL/TLS-enabled Apache Kafka clusters in the ELT pipeline.
  • Implemented features in a BigQuery slot scheduler, written in Python, that manages the creation of reservations and the assignment of slots to projects, with logging and monitoring for the tool. Deployed the tool on Google App Engine and used Cloud Scheduler, Cloud Tasks, Cloud Pub/Sub, Cloud Logging, and Cloud Functions for different aspects of it.
  • Planned infrastructure and automated data infrastructure end to end on GCP, covering networks, VPC peering, GKE, and BigQuery, with scalability, security, reliability, and cost kept in consideration.
  • Deployed and automated the management of Apache Flink, Apache Airflow, Apache Kafka MirrorMaker, Apache Spark, and TimescaleDB clusters on Kubernetes (GKE).
  • Improved the security posture of the Kubernetes cluster by implementing Workload Identity, RBAC, and private access.
  • Improved observability by centralizing logs and metrics for all services running on Kubernetes and VMs, using tools such as Telegraf, Cortex, Promtail, Grafana Loki, and Grafana. Created Grafana dashboards for frequently used logging queries to simplify use of the logging platform, which helped onboard several users to it.
  • Enabled self-serve alerting through a GitOps approach to setting alert rules and thresholds.
  • Developed Helm charts and Ansible roles for deploying tools on Kubernetes and configuring compute instances.
  • Implemented a BigQuery usage dashboard that shows usage in near real time for analysts. This enabled regular usage estimates and the purchase of committed slots, saving 30%+ in costs.
  • Adaptable and proficient in learning new concepts quickly and efficiently.
Google Cloud Platform (GCP), Google BigQuery, Airflow, Kubernetes, Terraform, Python, +4

Razorpay

3 roles

Lead DevOps Engineer

Apr 2021 - Feb 2022 · 10 mos

  • Contributed to OKR planning and worked along with the team to drive initiatives.
  • Improved the issue-tracking process in Jira by building custom dashboards and boards.

Senior Infrastructure Engineer

Apr 2020 - Mar 2021 · 11 mos

  • Built data pipelines that deliver SES events to the data lake using Amazon Kinesis and S3, analyzed AWS cost and billing reports, and analyzed historical logs for compliance using Hive.
  • Implemented Ververica Platform to run Flink workloads on Kubernetes using Helm charts.
  • Worked on Apache Druid to support warehouse capabilities.
  • Hosted an internal Docker registry using the open-source Harbor project, comprising proxy projects and internal projects, with hundreds of repositories, 350+ engineers onboarded and actively using it, and millions of image pulls every month.
  • Worked with the team to implement a big data architecture comprising Hadoop, Hive, Spark, Trino, Airflow, and JupyterHub, running on Kubernetes in Docker containers as a highly scalable, available, elastic, and cloud-agnostic platform. It is used by all analysts and for serving external clients.
  • Participated in all phases of the system development life cycle, from requirements analysis through system implementation, with the SaaS platforms Qubole and Databricks for running data engineering workloads involving Spark, SQL analytics, etc. on AWS.
  • Built a custom Prometheus target discovery tool in Python, using the boto library, for scraping metrics from EC2 service endpoints, with EC2 discovery based on the tags assigned to instances.

Infrastructure Engineer

Aug 2018 - Mar 2020 · 1 yr 7 mos

  • Automated end-to-end infrastructure setup in a bare AWS account using Terraform, Packer, Ansible, Kubernetes, etc.
  • Worked on highly virtualized infrastructure in which Kubernetes orchestrates several microservices running in Docker containers.
  • Ran Kubernetes on bare EC2 instances and on AWS EKS.
  • Worked with AWS services including networking, EC2 virtual machines, RDS, S3, DynamoDB, IAM, Kinesis, SES, and SNS.
  • Worked on Apache Ambari for running Hadoop, YARN, ZooKeeper, and Kafka, with all components configured in HA mode.
  • Ran Apache Kafka on Kubernetes as a key component of data engineering pipelines ingesting billions of events per day.

Directi

Developer Operations

Jun 2016 - Aug 2018 · 2 yrs 2 mos · Mumbai Area, India

  • Managed OVH Cloud running bare metal servers.
  • Used LXC containers for running workload in virtualized environment.
  • Contributed to building several Ansible modules from scratch for configuring web servers, Elasticsearch, email servers, etc.
  • Automated DNS management for hundreds of domains.
  • Identified issues, analyzed information and provided solutions to problems.
  • Carried out day-to-day duties accurately and efficiently.
  • Monitored automated build and continuous software integration process to drive build/release failure resolution.

Qualcomm

Engineering Intern

May 2015 - Jul 2015 · 2 mos · Hyderabad Area, India

  • Worked in the performance team of Qualcomm, Hyderabad.
  • Responsibilities: implemented a system to analyse and compare various chipsets.
  • Technologies used:
  • 1. C and C++ for the development of the system.
  • 2. ADB for linking the mobile device with the computer.
  • 3. Unix tools such as top, htop, sed, and awk, and the /proc filesystem.

Education

National Institute of Technology Warangal

Bachelor’s Degree — Computer Science

Jan 2012 - Jan 2016

Shivam Junior College

High School — SSC

Jan 2010 - Jan 2012

St. Joseph's Public School

ICSE — Primary Education

Jan 2002 - Jan 2010
