Anmol S.

SRE (Site Reliability Engineer)

Bengaluru, Karnataka, India · 4 yrs 3 mos experience

Key Highlights

  • Expert in Site Reliability Engineering with AWS.
  • Proficient in Python and CI/CD automation.
  • Strong background in monitoring and observability tools.
Stackforce AI infers this person is a Site Reliability Engineer with expertise in cloud infrastructure and automation in SaaS.

Skills

Core Skills

Site Reliability Engineering · Amazon Web Services (AWS) · Docker · Python · DevOps

Other Skills

Android SDK · Bootstrap · C++ · CSS · Chef · Containerization · Continuous Integration and Continuous Delivery (CI/CD) · Data Structures · Express.js · Flutter · Git · GitHub · HTML · JIRA · JMeter

About

Hi, I am working as a Site Reliability Engineer at Zscaler. If you're hiring and want to discuss relevant job opportunities, DM me or mail the JD to anmolsahu99@gmail.com.

Skill set:

  • Programming languages: Python and Shell
  • Cloud technologies: AWS (EC2, S3, Route53, VPC, ECR, CloudFormation, RDS, DynamoDB, SQS, Kinesis)
  • CI/CD: Jenkins
  • Core competencies: CN (computer networks), OS (Linux), troubleshooting
  • Containers and orchestration: Docker, Kubernetes
  • Monitoring and observability (metrics, logs, and traces): Datadog, New Relic, Checkmk, Nagios, Pingdom
  • IaC: Terraform
  • SRE practice: SLIs, SLOs, SLAs, and 99.999% uptime targets
  • Problem solving: coding in Python and knowledge of system design
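
To make the five-nines (99.999%) uptime figure in the skill set above concrete, here is a minimal sketch of the usual downtime-budget arithmetic. The function name and the 365-day window are illustrative assumptions, not something taken from this profile.

```python
def allowed_downtime_seconds(slo_percent: float, window_days: int = 30) -> float:
    """Return the downtime budget, in seconds, for an availability SLO over a window."""
    if not 0 < slo_percent <= 100:
        raise ValueError("slo_percent must be in (0, 100]")
    window_seconds = window_days * 24 * 60 * 60
    # The budget is the fraction of the window that is allowed to be unavailable.
    return window_seconds * (1 - slo_percent / 100)


if __name__ == "__main__":
    for slo in (99.9, 99.99, 99.999):
        budget = allowed_downtime_seconds(slo, window_days=365)
        print(f"{slo}% over 365 days allows about {budget / 60:.2f} minutes of downtime")
```

Under these assumptions, a 99.999% target leaves roughly 5.26 minutes of downtime per year, which is why such targets usually depend on automated remediation rather than manual response.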

Experience

Total Experience: 4 yrs 3 mos
Average Tenure: 1 yr 6 mos
Current Experience: 1 yr 2 mos

Zscaler

Senior Site Reliability Engineer

Mar 2025 - Present · 1 yr 2 mos · India · On-site

Zynga

Site Reliability Engineer

Aug 2022 - Mar 2025 · 2 yrs 7 mos · India · On-site

  • Participated in a 24/7 rotational on-call schedule, following a daylight model.
  • Troubleshot production and stage server errors during on-call, mainly in Linux environments such as CentOS.
  • Maintained AWS cloud infrastructure for partner game teams to keep instances and node pools reliable and robust.
  • Automated the handling of recurring alerts with StackStorm using Python scripts, so that most repetitive alerts are handled by StackStorm itself.
  • Supported partner game teams with AWS resource uptime during releases, including modifying and upgrading resources such as EC2, S3, EBS, Route53, ASG, ELB, and ECR.
  • Used Jenkins pipelines to replace bad nodes with new ones, including Membase and Couchbase nodes, to maintain uptime.
  • Used Datadog, CloudWatch, Grafana, and an internal monitoring application to monitor internal services based on various metrics.
  • Maintained documentation, playbooks, and records of changes, troubleshooting steps, and steps to accomplish certain tasks.
  • Dedicated projects:
  • Created a dashboard for the SRE team to coordinate with the USA, UK, and BLR teams and other partner teams, and to help them achieve their respective goals.
  • Built an interactive UI with HTML, CSS, Bootstrap, and JS to display the information. Used the Node.js runtime and Express.js as the backend framework to host the application, and containerized the application with Docker.
  • Used a Python script to automate the CI/CD process, from pushing code to the GitHub repo, to building with Docker, to deploying on an AWS EC2 instance.
  • Integrated the PagerDuty API to fetch details of SRE members, including their on-call availability, alerts, and contact details, to help partner teams connect and coordinate with the SRE team.
  • The SRE dashboard helps coordinate the internal SRE team across global locations and external partner teams during events such as on-call shift changes, escalations, and unexpected incidents.
Python · Containerization · PagerDuty · JIRA · Docker · StackStorm · +4 more
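
The PagerDuty integration described above can be sketched roughly as follows. This is an illustration in Python, not the dashboard's actual code (the profile says the backend used Node.js and Express.js). It assumes the PagerDuty REST API v2 `GET /oncalls` endpoint with token authentication; `fetch_oncalls` and `summarize_oncalls` are hypothetical helper names, and the response fields read here should be checked against the current API reference.

```python
import json
import urllib.request

# Assumption: PagerDuty REST API v2 on-calls endpoint.
PAGERDUTY_ONCALLS_URL = "https://api.pagerduty.com/oncalls"


def fetch_oncalls(api_token: str) -> dict:
    """Call the on-calls endpoint and return the parsed JSON body."""
    request = urllib.request.Request(
        PAGERDUTY_ONCALLS_URL,
        headers={
            "Authorization": f"Token token={api_token}",
            "Accept": "application/vnd.pagerduty+json;version=2",
        },
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.load(response)


def summarize_oncalls(payload: dict) -> list:
    """Flatten the 'oncalls' list into rows that a dashboard could render."""
    rows = []
    for entry in payload.get("oncalls", []):
        rows.append({
            "engineer": (entry.get("user") or {}).get("summary", "unknown"),
            "policy": (entry.get("escalation_policy") or {}).get("summary", "unknown"),
            "level": entry.get("escalation_level"),
            "start": entry.get("start"),
            "end": entry.get("end"),
        })
    # Lowest escalation level first; entries without a level go last.
    return sorted(rows, key=lambda row: (row["level"] is None, row["level"] or 0))
```

Keeping the network call separate from the flattening step means the dashboard logic can be exercised with a saved sample payload, without a live API token.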

Leucine

DevOps Engineer

Jan 2022 - Jul 2022 · 6 mos · Remote

  • Deployed the application to different environments using Python scripts across the QA and staging servers.
  • Created CI/CD pipelines in Jenkins for various tasks, such as deployment, deployment removal, backups, and redeployment.
  • Created CI/CD for the QA team's automated test cases and maintained the Allure reports on AWS S3 with the help of GitHub Actions.
  • Gained exposure to AWS services such as EC2, S3, and Route53.
  • Dockerized the application and its components using Docker Compose and Dockerfiles.
  • Automated daily tasks such as backups, API monitoring, and cron jobs using shell scripting.
  • Automated downloading and manipulating files and logs using Python scripting.
  • Created monitoring and alerts for the QA, Stage, and UAT deployments using New Relic.
Shell Scripting · Scripting · Python (Programming Language) · PostgreSQL · Docker · DevOps · +4 more
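
As a rough illustration of the log-manipulation scripting mentioned above, the sketch below counts log lines per severity level. The log format, the `LOG_LINE` pattern, and the `count_levels` helper are all hypothetical; the profile does not describe the actual scripts or log layout.

```python
import re
from collections import Counter

# Hypothetical log format: "2022-03-01 10:15:00 ERROR message text"
LOG_LINE = re.compile(r"^\S+ \S+ (?P<level>[A-Z]+) (?P<message>.*)$")


def count_levels(lines):
    """Count log lines per severity level, skipping lines that do not match."""
    counts = Counter()
    for line in lines:
        match = LOG_LINE.match(line.strip())
        if match:
            counts[match.group("level")] += 1
    return counts


if __name__ == "__main__":
    sample = [
        "2022-03-01 10:15:00 ERROR database connection refused",
        "2022-03-01 10:15:02 INFO retrying in 5 seconds",
        "not a structured line",
    ]
    for level, count in count_levels(sample).most_common():
        print(f"{level}: {count}")
```

A summary like this could feed a daily report or an alert threshold, which is one plausible use for scripts of the kind listed in this role.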

Cogno ai

Intern

Jul 2021 - Jan 2022 · 6 mos

  • Received a pre-placement offer (PPO).

Codeholic

Intern

Oct 2020 - Jul 2021 · 9 mos

  • Flutter developer on several projects. Integrated video calling APIs, and modified and customized app UIs to client requirements. Integrated services from various API vendors, such as the Google Maps API, Agora Video Calling, and a live streaming API.

Kotumb

Intern

May 2019 - Jun 2019 · 1 mo

Pajasa apartments

Intern

Apr 2019 - Oct 2019 · 6 mos

  • Maintained the WordPress website and its content on a regular basis.
  • Updated data manually and through third-party APIs.

Sportsthat

Intern

Apr 2019 - May 2019 · 1 mo

Education

Guru Ghasidas University

B.Tech — Information Technology

KENDRIYA VIDYALAYA NO.1 G.C.F JABALPUR
