Surabhi Pandey

Director of Engineering

India · 8 yrs 9 mos experience
Most Likely To Switch · Highly Stable

Key Highlights

  • Achieved 70% cloud cost savings through FinOps strategies.
  • Designed scalable architectures for 10 million users.
  • Certified Kubernetes Administrator with expertise in DevSecOps.
Stackforce AI infers this person is a Cloud Infrastructure Engineer with strong DevOps and FinOps expertise.

Skills

Core Skills

DevOps, Site Reliability Engineering, Continuous Integration and Continuous Delivery (CI/CD), Big Data Engineering, Infrastructure Management

Other Skills

Kubernetes, Internet of Things (IoT), Data Structures, Python (Programming Language), Ansible, Python, Docker, Jenkins, Apache Spark, Shell Scripting, Algorithms, Microsoft Azure, Backup & Recovery Systems, InfluxDB, Kusto Query Language (KQL)

About

Hi there! I’m glad my tech posts caught your eye. I enjoy sharing insights on Platform Engineering fundamentals, the stuff that keeps you sane while technology shifts, and I specialise in breaking down complex IT jargon into simple, actionable lessons. My expertise covers DevOps, Cloud, and FinOps: basically, anything that saves a business time and money.

If you prefer the technical details, here is the "un-simplified" version of my 9-year journey:

I have 9 years of experience automating, securing, and optimizing cloud infrastructure for large-scale environments like JPMC, Telstra, and Jio. I treat infrastructure as a product, ensuring that as systems scale, they remain cost-effective, secure, and resilient.

Key Impact & Deliverables:
  • 70% Cloud Cost Savings: Delivered massive reductions through targeted FinOps strategies across multi-cloud environments.
  • 10 Million User Scale: Designed and deployed distributed architectures using HiveMQ + Kafka to support high-concurrency traffic for 10M+ users.
  • PySpark Performance: Optimized Big Data ETL jobs, cutting execution time to one-third and significantly lowering compute spend.
  • Kubernetes & Zero Trust (NIST): Certified Kubernetes Administrator (CKA). I build secure, self-healing clusters using GitOps, RBAC, and automated security gates.
  • Zero Downtime Operations: 24/7 upkeep of mission-critical systems using Blue-Green and Canary deployment strategies with fully automated CI/CD pipelines.
  • Re-architected and migrated legacy infrastructure to cloud and Kubernetes.
  • Led teams and mentored budding DevOps engineers.

Core Tech Stack:
  • Cloud & Containers: Azure, AWS, Kubernetes (AKS), Docker, ArgoCD.
  • IaC & Automation: Terraform, Ansible, Python, Shell.
  • Data & Messaging: PySpark, Kafka, HiveMQ, Java, React.
  • Security & Monitoring: Veracode, AquaSec, Prometheus, Grafana, ELK, Azure Monitor.

I have a strong instinct for automation and a commitment to engineering best practices. Whether I'm designing a Disaster Recovery strategy or integrating security compliance into a pipeline, my goal is always to build resilient, long-term infrastructure.

Experience

Total Experience: 8 yrs 9 mos
Average Tenure: 2 yrs 2 mos
Current Experience: 4 yrs 5 mos

Telstra

2 roles

Platform Architect

Promoted

Dec 2025 - Present · 5 mos · Remote

  • Architecting the next‑generation Telecom Capacity Planning Platform by transforming legacy workflows into a modern, cloud‑native ecosystem built on DevSecOps and FinOps principles.

Senior DevOps Engineer

Dec 2021 - Dec 2025 · 4 yrs · Remote

Kubernetes, Internet of Things (IoT), DevOps

JPMorgan Chase & Co.

Site Reliability Engineer

May 2021 - Nov 2021 · 6 mos · Mumbai, Maharashtra, India

Data Structures, Python (Programming Language), Site Reliability Engineering

Jio Platforms Limited (JPL)

Deputy Manager

Mar 2019 - May 2021 · 2 yrs 2 mos · Navi Mumbai, Maharashtra

  • My day-to-day responsibilities included:
  • 1. Designing and developing end-to-end one-click and condition-triggered CI/CD pipelines for Python, Go, Java, React, Angular, and Android codebases across multiple environments.
  • 2. Designing a dynamic, scalable structure for the big data platform (Apache Spark) using Docker.
  • 3. Working with pipeline tools: Docker, Ansible, Jenkins, SonarQube, Python, and ELK.
  • 4. Keeping pipelines updated and running 24/7.
  • 5. Automating all test pipelines and deployment scripts to ship well-tested, stable releases.
  • 6. Developing automated health checks to maintain system uptime.
  • 7. Debugging and troubleshooting, helped by staying informed of the end-to-end project flow.
  • 8. Developing Python microservices for project-specific use cases.
  • 9. Hands-on work with build tools such as Gradle, Maven, and npm, and DevOps tools such as Azure DevOps, Jenkins, Ansible, Grafana, Prometheus, SonarQube, pCloudy, and Selenium.
  • 10. Identifying and automating trivial tasks with Python and shell scripts.
  • 11. Running regular POCs to update and improve the technologies used in the project.
  • 12. Acting as SPOC between developers, DB, and other involved teams.
  • 13. Designing SQL and NoSQL databases as per the application use case.
  • 14. Release manager: responsible for major backend releases, including rollback and documentation.
  • 15. Designing disaster recovery strategies.
  • 16. Applying system monitoring and alerting best practices.
  • 17. Working extensively with Agile methodology.
  • Miscellaneous work and soft skills:
  • 1. Managed the DevOps team: mentored and guided new joiners so they could contribute to deliverables quickly.
  • 2. Maintained a positive work atmosphere and worked with minimal or no supervision, taking full ownership of deliverables.
  • 3. Regularly volunteered to organize team-bonding activities.
  • 4. Kept my work well documented.
  • 5. Took part in the technical hiring process.
Ansible, Python (Programming Language), DevOps

Wipro Technologies

2 roles

Associate Consultant

Aug 2016 - Apr 2018 · 1 yr 8 mos · Noida Area, India

  • Set up and troubleshot complete environments (Prod/Pre-prod/Dev), ensuring readiness of all layers in a three-tier architecture.
  • Managed 120+ VMs (Linux-based, on-premises).
  • Enabled separate environments (Prod/Pre-prod/Dev) with continuous integration via Jenkins for build automation and Ansible for deployment.
  • Set up and managed Git repositories.
  • Wrote shell scripts to automate trivial tasks.
  • Set up MySQL and PostgreSQL servers with appropriate high-availability and failover strategies.
  • Set up the performance monitoring tool Dynatrace and the middleware/OS monitoring tool JON.
  • Hands-on experience with JBoss middleware and OS fine-tuning.
  • Set up HAProxy on different VMs as per the strategy requirements of each application.
  • Regular spokesperson with the client and a key member in the final handover of applications. As part of the infra team, modified the deployment architecture after analysis (pitched solutions to handle failover scenarios and load balancing) and configured it accordingly.
  • Designed and developed RESTful APIs using Spring Boot and Gradle for a chatbot and web UI.
  • Designed DML after understanding the business requirements and set up MariaDB with replication.
  • Maintained API documentation via Swagger.
Shell Scripting, Algorithms, Infrastructure Management

Intern

Jan 2016 - Jun 2016 · 5 mos · Greater Noida, Uttar Pradesh, India

  • Worked on an MVP for Predictive Service Assurance for telcos.
  • Designed an algorithm based on CDRs.

Education

Jaypee Institute Of Information Technology

Bachelor of Engineering - BE

Jan 2012 - Jan 2016

Sumitra Modern School

Jan 2008 - Jan 2012

Sacred Heart Inter College, Sitapur

Jan 1998 - Jan 2008
