Surajit Mondal

SRE (Site Reliability Engineer)

Mumbai, India · 4 yrs 10 mos experience

Key Highlights

  • Reduced provisioning time by 85% using Terraform.
  • Standardized 250+ DevSecOps pipelines for faster deployments.
  • Built observability stacks improving MTTR by 40%.
Stackforce AI infers this person is a Cloud Infrastructure and DevOps Engineer with expertise in scalable systems.

Skills

Core Skills

Site Reliability Engineering, Cloud Infrastructure, DevOps, Platform Engineering, Monitoring & Observability, Software Development, Leadership, Web Development

Other Skills

Kubernetes, Go, Amazon Web Services (AWS), Terraform, Backstage, Docker, Prometheus, Grafana, ELK, Web Scraping, Beautiful Soup, DSA, Team Management, C++

About

I'm a Cloud and DevOps Engineer with a strong focus on building secure, scalable, and automated cloud infrastructure. At ICICI Lombard, I’ve led transformative cloud initiatives, from architecting production-grade environments on AWS to rolling out standard CI/CD and DevSecOps practices across 250+ projects.

🔧 I automate everything I can: provisioning with Terraform, deploying via Jenkins pipelines, and running containerized apps in EKS, Lambda, and ECS. I’ve reduced provisioning time by 85%, improved pipeline consistency by 70%, and helped teams ship faster, safer, and smarter.

☁️ Whether it's Kubernetes administration, multi-cloud networking, or building observability stacks (Prometheus, Grafana, ELK, Jaeger), I thrive in complex, large-scale environments and love creating solutions that just work (and scale beautifully).

☁️ Beyond Cloud & DevOps, I bring a Platform Engineering mindset: enabling developers with self-service infrastructure, reusable Terraform modules, shared CI/CD libraries, and automated guardrails. My focus is to make infrastructure invisible for developers while ensuring it remains secure, compliant, and cost-efficient.

🏆 Recognized with awards like “One IL One Team” and “Transformation Ambassador,” I bring not just skill, but ownership, collaboration, and an obsession with clean, reliable infrastructure.

⚡ Skills

☁️ Cloud Platforms
  • AWS: IAM, EC2, S3, Lambda, EKS, ECS, API Gateway, CloudFront, EBS, EFS, ELB, RDS, ElastiCache, MSK (Kafka), OpenSearch, AWS DMS, Route 53, VPC, Transit Gateway, NAT Gateway, VPC Peering, VPC Endpoints, Resolver Rules, Hosted Zones, ECR, Secrets Manager, CloudWatch, CloudTrail, SNS, SQS, KMS, ACM, CloudFormation, CodeBuild, CodeDeploy, CodePipeline
  • GCP: Compute Engine (VMs), VPC, Cloud Armor, Cloud Load Balancing, Instance Groups

⚙️ DevOps, IaC & Automation
  • Terraform, Kubernetes, Docker, Jenkins, Git, GitHub, ArgoCD, Helm

📊 Monitoring & Observability
  • Dynatrace, Prometheus, Grafana, Elasticsearch, Jaeger

💻 Programming Languages
  • Python, Golang

🖥️ Operating Systems
  • Linux (proficient in administration, troubleshooting, and shell/bash scripting)

📜 Certifications
  • AWS Certified Solutions Architect – Associate
  • AWS Certified Developer – Associate
  • Certified Kubernetes Administrator (CKA)
  • Certified Kubernetes Application Developer (CKAD)
  • HashiCorp Certified: Terraform Associate

Explore my project section for a detailed overview of my skills.

📬 Let’s connect if you’re into cloud, DevOps, or just like geeking out over scalable systems.

Email: monsurajit640@gmail.com

Experience

Total Experience: 4 yrs 10 mos
Average Tenure: 2 yrs 1 mo
Current Experience: 7 mos

Pythian

Software Engineer - Site Reliability Engineering

Oct 2025 – Present · 7 mos · Hyderabad, Telangana, India · Hybrid

  • Pythian | Client: Google
  • Site Reliability Engineer working on Google Distributed Cloud (GDC) – Air-Gapped platforms, focused on building and operating highly reliable, secure, and scalable cloud infrastructure for isolated and regulated environments.
  • Driving operational excellence through SRE best practices, including automation, observability, incident management, and reliability engineering, while supporting mission-critical workloads at scale.
Kubernetes, Go, Site Reliability Engineering, Cloud Infrastructure

ICICI Lombard

2 roles

Software Engineer - Cloud & DevOps

May 2025 – Oct 2025 · 5 mos · Mumbai, Maharashtra, India · On-site

  • Implemented multi-region disaster recovery for Kubernetes clusters, automating replication and failover along with RDS, OpenSearch, and Kafka, delivering seamless application continuity, minimal downtime, and resilient infrastructure for business-critical workloads.
  • Built a GCP Landing Zone with multi-project setup, VPC peering, and Cloud Armor, ensuring centralized governance and secure traffic flow.
  • Implemented Gateway Load Balancer (GLB) for DR region, achieving zero downtime and automatic failover for critical workloads.
  • Built Grafana dashboards with a Flask API wrapper to monitor Ansible automation jobs, enhancing observability and reducing troubleshooting time.
  • Mentored and trained team members on AWS, Terraform, Kubernetes, and DevOps practices, accelerating skill development and strengthening the team’s cloud and platform engineering capabilities.
Amazon Web Services (AWS), Kubernetes, Cloud Infrastructure, DevOps

Associate Software Engineer - Cloud & DevOps

Jun 2023 – May 2025 · 1 yr 11 mos · Mumbai, Maharashtra, India · On-site

  • Architected and automated production-grade AWS infrastructure with Terraform, reducing provisioning time by 85% and standardizing 5+ environments.
  • Pioneered a self-service platform by deploying Backstage, enabling developers to generate production-ready .NET applications with standardized templates (GitHub repo, Docker, Jenkins pipeline, Terraform infra, and Kubernetes resources).
  • Transformed application onboarding from a 21-day manual process to an automated <10-minute workflow, empowering teams to launch apps with built-in compliance, scalability, and CI/CD from day one - dramatically boosting developer velocity and platform adoption.
  • Engineered and managed large-scale Kubernetes (EKS) clusters, leveraging Helm to standardize deployments and enforce best practices — achieved 99.99% uptime, accelerated application rollouts by 70%, and cut incident resolution time in half.
  • Built GPU-enabled EKS clusters for GenAI workloads, optimizing model training and inference, and accelerating AI readiness by 40%.
  • Standardized 250+ DevSecOps pipelines, boosting deployment velocity and reducing maintenance overhead by 80%.
  • Engineered AWS network architecture with VPC, Transit Gateway, NAT Gateway, and peering, improving cross-account connectivity and reducing latency by 40%.
  • Integrated a full observability stack (Prometheus, Grafana, ELK, Jaeger) for real-time monitoring, improving detection and reducing MTTR by 40%.
  • Delivered secure, scalable video hosting solution on S3 + CloudFront, reducing latency by 30% while ensuring content protection.
  • Led multiple cloud POCs, testing and validating architectures to drive adoption of innovative, scalable, and secure cloud solutions.
Amazon Web Services (AWS), Kubernetes, Cloud Infrastructure, DevOps

Dissent Times

Member of Technical Staff

Dec 2022 – Feb 2023 · 2 mos · Mumbai, Maharashtra, India · Remote

  • Developed robust Python-based web scraping scripts to extract social media content from WordPress sites, automating content aggregation and reducing manual workload by 50%.
  • Enhanced data accuracy and processing speed, enabling editorial teams to access real-time social insights and improving publishing efficiency by 30%.
Web Scraping, Beautiful Soup, Software Development

CodeChef

2 roles

President at TCET CodeChef College Chapter

Jun 2021 – May 2022 · 11 mos

  • Led the student developer community, organizing coding competitions and knowledge-sharing sessions to foster a strong culture of problem solving, open-source contribution, and career readiness.
  • Key Highlights:
  • Organized 20+ HackerEarth coding contests for 3rd-year students under the Training & Placement Cell, engaging over 400 participants.
  • Collaborated with faculty and platform partners to scale community participation and improve student placement outcomes.
DSA, Team Management, Leadership

Problem Setter at TCET CodeChef College Chapter

May 2020 – May 2021 · 1 yr

C++, DSA

LPV Weltweit Solutions Private Limited

Software Developer

Jun 2021 – Jul 2021 · 1 mo · Mumbai, Maharashtra, India

  • Developed, tested, and deployed an event-hosting website on the Hostinger platform.
  • Features: admin panel, admin authentication, Razorpay payment gateway, and a time-based system for displaying events and accepting registrations and submissions.
Bootstrap, HTML, Web Development

Education

Thakur College of Engineering & Technology, Shyamnarayan Thakur Marg, Thakur Village, Samata Nagar, Kandivli (E), Mumbai 400 101

Bachelor of Technology (BTech) · Information Technology

Jan 2019 – Jan 2023

Annasaheb Vartak College of Arts, Kedarnath Malhotra College of Commerce and E. S. Andrades College of Science, Vasai Road, Dist. Thane 401 202

HSC · Science

Jan 2017 – Jan 2019

The Saraswati Vidyalaya

SSC
