Saurabh Kumar

Software Engineer

Bangalore Urban, Karnataka, India · 7 yrs 9 mos experience
Highly Stable

Key Highlights

  • Achieved 99.95%+ availability across cloud-native systems.
  • Implemented CI/CD pipelines enabling multiple daily deployments.
  • Reduced critical vulnerability exposure by ~40% through DevSecOps.
Stackforce AI infers this person is a Cloud Infrastructure and DevSecOps expert in the SaaS industry.

Skills

Core Skills

Cloud Architecture, DevSecOps, Technical Leadership, DevOps, Cloud Infrastructure, Cloud Migration, Web Application Security, Frontend Development

Other Skills

Python, Go, Docker, Kubernetes, AWS, GCP, Azure, CI/CD, Terraform, Statistical time-series models, API security, Architectural modernization, Reliability standards, Mentorship, Slack SDK

About

I’m currently working as a Staff Engineer at Okta, with around 8 years of hands-on experience designing, building, and running cloud-native systems across AWS, GCP, and Azure. My core strength is taking systems end to end, from architecture and implementation to production reliability and cost control. As a Staff Engineer and Cloud Architect, I design, build, review, and productionize services myself.

I work primarily in Python and Go, shipping backend services and automation that are simple to operate and easy to scale. Using Docker and Kubernetes, I’ve rolled out microservices with zero-downtime deployments, hardened clusters, and reliable rollback strategies. These efforts resulted in 99.95%+ availability, ~40% improvement in p95 latency through profiling, caching, and async IO, and 20–30% lower cloud spend via rightsizing and autoscaling across clouds.

Security is embedded into the delivery pipeline, not bolted on later. My DevSecOps work includes threat modeling, policy-as-code, SBOM generation, and continuous SAST/DAST. I’ve integrated image and dependency scanning into CI/CD, enforced artifact signing, and centralized secrets using Vault. This reduced critical vulnerability exposure windows by ~40% and cut remediation time from weeks to days.

On the delivery side, I build CI/CD pipelines engineers actually trust. I’ve implemented GitHub Actions, GitLab, and Jenkins pipelines with Helm and Argo CD for progressive delivery, helping teams move from weekly releases to multiple deployments per day while keeping change-failure rates low.

With an SRE mindset, I define SLOs, error budgets, and practical runbooks, backed by strong observability using Prometheus, Grafana, OpenTelemetry, and ELK/OpenSearch. MTTR improved by ~35% after instrumenting golden signals and tuning alerts.

Infrastructure as Code is my default mode of working. Using Terraform and Kubernetes manifests, I ensure environments are reproducible, reviewable, and auditable. I’ve led both greenfield builds and large-scale migrations across AWS, GCP, and Azure, keeping costs predictable and security controls consistent.

APIs and data platforms are a major part of my toolkit. I design and optimize REST APIs, background workers, and internal tools using Python (FastAPI, Django) and Go, with strong test coverage and clear SLIs. On the data side, I’ve built streaming and analytics pipelines using Kafka, Spark, and BigQuery, enabling product insights and automated checks that translate directly into measurable business impact.

Experience

Total Experience: 7 yrs 9 mos
Average Tenure: 3 yrs 9 mos
Current Experience: 2 mos

Okta

Staff Engineer

Apr 2026 – Present · 2 mos · India

Python, Go, Docker, Kubernetes, AWS, GCP (+5 more)

Rakuten

Technical Lead

Sep 2024 – Apr 2026 · 1 yr 7 mos · India · On-site

  • Set technical direction for multi-cloud client-side architecture by leading hyperscaler integrations across AWS, GCP, and Azure, enabling enterprise scale and long-term platform extensibility.
  • Designed and introduced a multi-stage anomaly detection framework using statistical time-series models, materially strengthening API security and advancing the platform’s threat intelligence capabilities.
  • Drove architectural modernization of ingestion and metrics pipelines by eliminating systemic tech debt and establishing durable patterns that improved maintainability and future evolution.
  • Established reliability and security standards for client-side systems, directly contributing to sustained 95% availability while scaling to meet growing customer and traffic demands.
  • Acted as escalation owner for strategic customers, leading complex onboardings and high-impact incident resolution while feeding learnings back into platform and product improvements.
  • Provided org-level technical leadership by aligning teams on execution strategy, unblocking cross-team dependencies, and ensuring complex initiatives landed with high quality.
  • Multiplied impact through mentorship and systemization by developing engineers on advanced systems and documenting critical cloud and traffic workflows to reduce operational risk and improve execution speed.
AWS, GCP, Azure, Statistical time-series models, API security, Architectural modernization (+4 more)

Synopsys Inc

4 roles

Staff Engineer

Feb 2024 – Sep 2024 · 7 mos

  • Integrated the Slack SDK and its models into core services for alerts, approvals, and chat-ops; cut context switching ~25% and reached ~70% team adoption in 3 months.
  • Implemented unified forms across in-house apps with schema validation and RBAC; reduced manual entry by ~60%, form errors by ~45%, and request SLAs from days to hours.
  • Optimized GCP infrastructure (GKE, Compute, BigQuery) with rightsizing and autoscaling; lowered monthly spend ~22% and improved CPU utilization ~35% without capacity loss.
  • Orchestrated large-scale releases across eight stages (plan, code, build, test, release, deploy, operate, monitor) using GitHub Actions, Helm, and Argo CD; achieved 5× deploy frequency with <5% change-failure rate.
  • Lifted availability to ~99.95% by tightening SLOs and adding health checks, circuit breakers, and autoscaling policies.
  • Built observability with Cloud Monitoring (Stackdriver), Cloud Logging, and OpenTelemetry; cut MTTR ~35% via better alerts, traces, and runbooks.
  • Implemented preventative maintenance (patch windows, dependency/image scanning, chaos drills); high-severity incidents/quarter down ~30%, MTBI up ~2.1×.
  • Ran performance assessments (profiling, caching, DB indexing); improved p95 API latency ~40% and doubled throughput on critical paths.
  • Guided project teams with design reviews, RFCs, and pair programming; review turnaround dropped from ~2 days to <8 hours, and test coverage rose to 85%+.
  • Collaborated with product, security, and data teams to remove bottlenecks; idea-to-prod lead time shrank from ~3 weeks to ~5 days.
  • Standardized Infrastructure as Code with Terraform modules and policy checks; drift incidents down ~50% and environment setup time cut from hours to minutes.
  • Evaluated and integrated fit-for-purpose tech (Pub/Sub, workflows, managed secrets); POC-to-production cycle time reduced ~50% while keeping costs predictable.
Slack SDK, GCP, GitHub Actions, Helm, Argo CD, Cloud Monitoring (+3 more)

Consultant

Promoted

Jun 2022 – Feb 2024 · 1 yr 8 mos

  • Deployed on-premises Jira Data Center on Kubernetes with Terraform (IaC), adding HPA/VPA, node pools, and repeatable cluster builds; improved resource utilization from ~55% to ~80% and sustained 99.95% uptime.
  • Migrated the existing Jira Data Center stack to Google Cloud Platform on GKE with private clusters, Filestore for shared home, and Cloud SQL for PostgreSQL; achieved zero-downtime cutover, cut p95 page load latency by ~35%, and set RTO 30 min / RPO 5 min.
  • Built end-to-end DevOps pipelines (build, test, security scans, backups, blue-green/canary upgrades); increased deployment frequency 6× (weekly → multiple/day) and reduced lead time for changes from days to hours.
  • Designed in-house dashboards joining relational and NoSQL data; lowered time-to-insight from hours to minutes, standardized KPIs, and enabled self-service reporting for support and engineering (adopted by 6+ teams).
  • Managed multiple codebases across the SDLC with trunk-based development, code owners, and automated checks; raised unit/integration coverage to 80%+ and reduced PR cycle time by ~40% while keeping defect escape low.
  • Delivered Salesforce integrations (REST and Bulk APIs) for users, cases, and asset data; achieved near-real-time sync (<5-minute lag), eliminated double entry for operations, and reduced data mismatch incidents by ~70%.
  • Institutionalized best practices: Git workflows, CI/CD quality gates, contract and load testing, and automated rollbacks; kept change-failure rate under 5% and improved MTTR by ~35% through runbooks and targeted alerts.
  • Enforced security with RBAC, SAML/OIDC SSO, network policies, signed images, and encryption in transit/at rest; integrated Trivy/Grype/DAST into pipelines, cutting critical CVEs by ~60% and passing internal audit checks.
Kubernetes, Terraform, Jira, Cloud SQL, REST APIs, CI/CD (+2 more)

Associate Consultant

Promoted

Dec 2020 – Jun 2022 · 1 yr 6 mos

  • Integrated Burp Suite Enterprise for continuous web app testing across 10+ services; tuned scan policies and issue workflows to cut false positives by ~30% and reduce triage time by ~50%.
  • Leveraged open-source scanners (OWASP ZAP, Nuclei, Trivy, Grype) in CI to gate builds on severity; lowered critical/high CVE backlog by ~65% and moved MTTR for vulns from ~14 days to ~5 days.
  • Earned GCP Professional Cloud Architect; designed cloud-native reference architectures that improved service reliability to 99.95%+ and trimmed cloud spend by ~20% via rightsizing and autoscaling.
  • Executed end-to-end integrations using Django/Flask backends with Tableau; built secure REST data services with row-level security, 5-min refresh SLAs, and caching; cut reporting time ~60% and tripled dashboard adoption.
  • Built and maintained DevOps pipelines (GitHub Actions/GitLab/Jenkins) with canary/blue-green strategies; increased deploy frequency from weekly to daily while keeping change failure rate under 5% and improving MTTR by ~35%.
  • Wrote Python/Go automation to remove manual release, backup, and compliance checks; saved ~40+ engineer-hours per month and dropped handoff errors by ~90% with idempotent scripts and audit logs.
  • Provided hands-on consultancy for security best practices, threat modeling, and policy-as-code; implemented Vault-backed secrets, least-privilege IAM, and WAF baselines.
  • Standardized Infrastructure as Code with Terraform and Kubernetes manifests; reduced environment drift to near-zero and cut provisioning time from days to hours across dev/stage/prod.
  • Optimized application integration patterns (OAuth2/SAML SSO, API gateway, rate limiting, and caching); improved p95 latency by ~35% and stabilized throughput under peak traffic.
  • Stood up observability with Prometheus, Grafana, OpenTelemetry, and structured logging; defined SLOs/error budgets, halved alert noise, and enabled data-driven on-call runbooks.
Burp Suite, OWASP ZAP, Django, Flask, Tableau, DevOps (+1 more)

Associate

Sep 2018 – Dec 2020 · 2 yrs 3 mos

  • Developed user interfaces with React and Angular, employing best practices for responsive and interactive designs. [Metrics: +28% task completion, -35% bounce, LCP p95 1.9s, Lighthouse 95+]
  • Collaborated with design teams via Figma and InVision to translate UX/UI concepts into functional prototypes, ensuring seamless user experiences. [Metrics: design-to-dev cycle -30%, NPS +12, SUS 86]
  • Contributed to Burp plugin development, enhancing security testing capabilities and bolstering application defenses against vulnerabilities. [Metrics: 120+ staging vulns flagged, P1s in prod -60%, scan time -40%]
  • Automated tasks using Python and JavaScript, streamlining workflows and boosting productivity through script-based solutions. [Metrics: ~10 hrs/week saved, CI steps -45%, failure rate -25%, ROI 4.2x]
  • Managed multiple small projects, overseeing development and deployment stages to meet objectives efficiently. [Metrics: 96% on-time delivery, budget variance -12%, stakeholder CSAT 4.7/5]
  • Engaged in code reviews, debugging sessions, and optimization initiatives to elevate project quality and performance standards. [Metrics: defect density -38%, p95 latency -42%, MTTR -33%, coverage 85%+]
React, Angular, Python, JavaScript, Figma, Frontend Development

Mayadata

MEAN Stack Web Developer

Aug 2018 – Sep 2018 · 1 mo · Bengaluru, Karnataka, India

  • Interfaced with a cross-functional team of business analysts, developers, and technical support professionals to determine a comprehensive list of requirement specifications for new applications.
  • Investigated new and emerging software applications within the industry to design, select, implement, and use administrative information systems effectively.
  • Worked primarily with five technologies: Angular, Git, Adobe, Bootstrap 5, and CSS.
Angular, Git, Adobe, Bootstrap, CSS

Education

MITx Courses

MicroMasters® Program — Statistics and Data Science (Social Sciences Track)

Jan 2025 – Present

University of Allahabad

B.Tech — Computer Science

Jan 2014 – Jan 2018

SKD ACADEMY INTER COLLEGE

CLASS 12 — SCIENCE

Jan 2011 – Jan 2013