Domeshwar Singh

SRE (Site Reliability Engineer)

Gurgaon, Haryana, India · 12 yrs 8 mos experience
Highly Stable

Key Highlights

  • Reduced server provisioning time from weeks to hours.
  • Developed self-service capabilities for non-technical teams.
  • Implemented AI solutions for enhanced system reliability.
Stackforce AI infers this person is a DevOps and Site Reliability Engineer with expertise in automation and infrastructure management.

Skills

Core Skills

Site Reliability Engineering · DevOps · Infrastructure Management · Test Automation

Other Skills

ARGOCD · Amazon Web Services (AWS) · Ansible · Automation · C · C (Programming Language) · C++ · CI/CD · Continuous monitoring · Docker · Functional Testing · Git · Grafana · HP QTP · Helm

About

Professional Summary: I have worked with Orange for 7.8 years, Bank data for 5 months, and Zscaler for almost 2.7 years. Over a professional journey of 11.2 years, I have moved through automation testing, infrastructure automation, and continuous integration and continuous deployment into DevOps and site reliability engineering roles, in the mobile financial services, banking, and network security domains. I have experience implementing monitoring solutions and designing an auto-triage system that continuously monitors and triages thousands of servers, and designing self-service tools that serve on-demand requests from non-technical team members. Along the way I have used the following skill sets in different roles.

Automation skills: Python, Selenium, Robot Framework, UFT.
DevOps & SRE skills: Jira, Linux, basic networking, shell scripting, Python 3.0, MySQL, Git, Jenkins, Ansible, Docker, Kubernetes, OpenShift.
Infrastructure management skills: AWS, Prometheus, Grafana, VictoriaMetrics, Helm, kOps, Terraform, Istio, Argo CD.
Application server & proxy skills: Apache Tomcat, Squid, NGINX, HAProxy, firewalls, load balancers, DNS, DHCP, NAT.
Database skills: Oracle database management.

Experience

12 yrs 8 mos
Total Experience
4 yrs 4 mos
Average Tenure
4 yrs 4 mos
Current Experience

Zscaler

Staff Site Reliability Engineer

Jan 2022 - Present · 4 yrs 4 mos · Sahibzada Ajit Singh Nagar, Punjab, India

  • At Zscaler, my team is responsible for deploying ZIA services across thousands of bare-metal servers. Over the years, we have significantly optimized our processes and taken on a broad range of responsibilities, including:
  • 1. Streamlined Server Provisioning: Three to four years ago, it took us around three weeks to manually provision these servers at customer on-premises sites. Through automation, we've reduced that time to just 7-8 hours.
  • 2. Automated Monitoring and Triage: We’ve developed a continuous monitoring system and designed a Python framework that automates the triage of Zscaler services and identifies faulty servers.
  • 3. Self-Service Capabilities: We’ve introduced self-service options for system triage and server provisioning, integrating robust RBAC processes to ensure security and access control.
  • 4. Scripting for Automation: We write Ansible playbooks, shell scripts, and Python scripts to automate individual tasks efficiently.
  • 5. End-to-End Automation: Our team has fully automated the provisioning process, from server deployment to generating automated reports.
  • 6. Capacity Planning Automation: We’ve automated capacity planning and provide regular reports to customers, keeping them informed of their server and service capacity.
  • 7. AI Integration for SRE Challenges: We're now applying AI to real-world SRE challenges such as monitoring, alerting, and incident management to further improve system reliability and responsiveness.
Python · Ansible · Shell scripting · Continuous monitoring · Automation · Site Reliability Engineering · +1

7n

Senior Software Engineer

Sep 2021 - Jan 2022 · 4 mos · Gurugram, Haryana, India

  • Working with the Platform Service Team, which makes the development team's life easier.
  • Developers use our custom Operators, which keep the deployment cycle easier and faster.
  • Roles & Responsibilities:
  • 1. Monitoring infrastructure using Prometheus, Grafana, and VictoriaMetrics.
  • 2. Provisioning data sources and dashboards in Grafana using configuration as code.
  • 3. Development of self-services for OpenShift.
  • 4. Deployment of self-services using Helm charts.
  • 5. Argo CD implementation for continuous deployment.
  • 6. Supporting developers across the organization with their build, deployment, or performance-related issues on the platform.
  • 7. Code refactoring for the Quarkus framework, which helps in developing self-services.
  • 8. Regular upgrades of the monitoring system.
Prometheus · Grafana · OpenShift · Helm · ARGOCD · DevOps

Orange Business Services

2 roles

DevOps Engineer

Promoted

Jul 2016 - Jan 2022 · 5 yrs 6 mos

  • Deployment of monolithic and microservice-based applications.
  • Writing CI/CD pipelines.
  • Writing Ansible scripts to automate regular Linux and Unix server configuration tasks.
  • Integrating regular activities into Jenkins and writing Shared Libraries that can be used across the team.
  • Creation of optimized Docker images for microservice and monolithic application deployment.
  • Regular creation of Docker images for single-unit task deployment.
  • K8s cluster setup and administration.
  • Deployment of monolithic and microservice applications into K8s clusters.
  • Infrastructure provisioning in AWS and Microsoft Azure.
  • Regular activities on caching servers, reverse proxies, load balancers, forward proxies, DNS, DHCP, and NAT.
  • Regular automation using shell and Python scripting.
  • Writing Terraform scripts for resource provisioning in the cloud.
  • Source code management using Git.
  • Java application deployment using Apache Tomcat.
  • Linux server administration along with server network management.
  • Application monitoring using Prometheus.
  • Microservice application packaging and deployment using Helm.
  • AWS-level infrastructure management using kOps.
  • Service discovery using Istio.
  • Application deployment into OpenShift.
  • Automation testing using Selenium, UFT, and Robot Framework.
  • Oracle database management activities such as database backup and recovery.
CI/CD · Ansible · Docker · Kubernetes · Terraform · DevOps

System Software Engineer

Sep 2013 - Jan 2022 · 8 yrs 4 mos

  • Designed and developed an automation framework using QTP and Selenium.
  • Prepared test scripts, and maintained and debugged existing automation scripts using QTP and Selenium.
  • Reduced manual testing effort by automating 50 percent of the application.
  • Skilled in functional testing, writing test cases, test case execution, bug reporting, and regression testing.
  • Skilled in preparing documentation and reports at the end of each test cycle.
  • Able to coordinate across multiple projects and meet deadlines.
  • Hardworking and organized, able to work within tight deadlines; a self-motivated team player who keeps energy levels high.
  • Strong desire to learn and willingness to take initiative.
QTP · Selenium · Functional Testing · Test Automation

Education

Chitkara University

Bachelor's Degree — Computer Science Engineering

Jan 2009 - Jan 2013

Him Academy Public School, Hamirpur

Senior Secondary School Education

Jan 2007 - Jan 2009

Chitkara University

High School — High School Education

Jan 2006 - Jan 2007
