Shivam Kumar

SRE (Site Reliability Engineer)

Delhi, India · 1 yr 9 mos experience
Most Likely To Switch · AI ML Practitioner

Key Highlights

  • Expert in Kubernetes and Terraform for cloud infrastructure.
  • Automated CI/CD pipelines for 10+ microservices.
  • Strong background in monitoring and alerting systems.
Stackforce AI infers this person is a Cloud Infrastructure Engineer with expertise in DevOps and automation.

Skills

Core Skills

Cloud, Monitoring & Alerts, Infrastructure as Code, CI/CD

Other Skills

AWS CloudFormation, AWS Lambda, AWS SNS, AWS services, Amazon Web Services (AWS), Ansible, Artificial Intelligence (AI), Azure, Bash, Build Automation, Burp Suite, C (Programming Language), C++, Cassandra, CodePipeline

About

I'm a Site Reliability Engineer at Kimbal Technologies with 1.5+ years of hands-on experience in building secure, scalable, and automated infrastructure. I specialize in Kubernetes, Terraform, Docker, and ArgoCD, driving GitOps practices and cloud-native deployments.

Key Skills & Interests:

  • Infrastructure as Code (Terraform, Ansible)
  • Cloud (AWS: EKS, EC2, S3, VPC, WAF, etc.)
  • CI/CD (GitHub Actions)
  • Containerization & Orchestration (Docker, Kubernetes, Helm)
  • Secrets Management (HashiCorp Vault)
  • Monitoring & Alerts (Zabbix, Wazuh, SonarQube)
  • Databases (PostgreSQL, MSSQL, Cassandra, Neo4j)
  • Artifact Management (Nexus Repository)
  • POCs & Production Deployments (EMQX, OpenVPN)
  • Automation (100+ tasks via scripts & tools)
  • Scripting (Bash, PowerShell)
  • DNS & Network Management across hybrid infra

I'm passionate about open source, automation, and solving infrastructure challenges at scale. Always learning, always building.

Experience

Kimbal

3 roles

Site Reliability Engineer II

Jun 2025 – Present · 9 mos · Delhi, India

  • Implemented AWS SNS and integrated it with internal services, enabling real-time alerts with full logging and CloudWatch metrics tracking.
  • Integrated multi-level alerting in Zabbix based on alert severity, routing notifications via email, Microsoft Teams, and PagerDuty for efficient incident management and response.
  • Automated PostgreSQL backups using pgBackRest with Ansible, storing them securely in S3.
  • Wrote AWS Lambda functions triggered by EventBridge to automate siloed tasks and cut down manual work.
AWS SNS, Zabbix, PostgreSQL, Ansible, AWS Lambda, Cloud +1

Site Reliability Engineer

Jun 2024 – Jun 2025 · 1 yr · Delhi, India

  • Proficient in Kubernetes, Docker, Helm, and ArgoCD, enabling scalable deployments and GitOps-driven workflows across environments.
  • Developed and improved Terraform modules to provision and manage cloud infrastructure as code, enhancing reusability and consistency.
  • Built and maintained CI/CD pipelines for 10+ microservices using GitHub Actions, enabling faster, automated deployments.
  • Experienced with AWS services including EKS, EC2, S3, VPC, WAF, and Inspector, as well as firewall configurations.
  • Automated monitoring and alerting setups using Zabbix, integrated with internal systems for real-time visibility.
  • Administered Vault for secrets management and secure credentials handling across environments.
  • Managed Nexus Repository for storing and distributing build artifacts and Docker images.
  • Developed Ansible playbooks for configuration management of EMQX, OpenVPN, PostgreSQL, and other services.
  • Successfully completed POCs and deployed Wazuh for security monitoring across 10+ environments.
  • Conducted a POC and installed SonarQube for automated code quality analysis.
  • Set up EMQX in a production-ready 3-node cluster with auto-discovery and MQTT-based communication.
  • Automated 50–100+ tasks, ranging from system maintenance to cloud provisioning, improving team productivity.
  • Managed databases including MSSQL, PostgreSQL, Cassandra, and Neo4j, with a focus on performance and backup strategies.
  • Configured Windows systems extensively, using PowerShell and Bash scripting for automation and maintenance.
  • Responsible for DNS configuration and management across hybrid environments.
  • Led efforts in Disaster Recovery (DR) strategy and execution; conducted 30+ successful DC-DR and DR-DC drills for utility and discom clients.
Kubernetes, Docker, Terraform, GitHub Actions, AWS services, Zabbix +10

Site Reliability Engineer Intern

Jan 2024 – Jun 2024 · 5 mos · Delhi, India

Gemini solutions pvt ltd

Gemini Ambassador Program

Jan 2022 – Mar 2023 · 1 yr 2 mos

Education

Birla Institute of Technology and Science, Pilani

Master's degree, Artificial Intelligence

Oct 2025 – Oct 2027

Panjab University

Bachelor of Engineering - BE, Computer Science and Engineering

Oct 2020 – Jul 2024

Guru Gobind Singh Public School, Bokaro Steel City

High School, Physics Chemistry Mathematics with Informatics Practices

Jan 2018 – Jan 2020

Chandigarh College of Engineering & Technology (Degree Wing), Panjab University

Bachelor of Engineering - BE, Computer Science

Jan 2020 – Jan 2024
