Pooja P

SRE (Site Reliability Engineer)

Bengaluru, Karnataka, India · 10 yrs 2 mos experience
Most Likely To Switch · Highly Stable

Key Highlights

  • Experienced in Site Reliability Engineering for major tech companies.
  • Proficient in cloud technologies and automation tools.
  • Strong background in monitoring and incident management.
Stackforce AI infers this person is a Site Reliability Engineer with expertise in cloud infrastructure and automation in SaaS environments.

Skills

Core Skills

Site Reliability Engineering · Cloud Infrastructure · Data Reliability Engineering · Incident Management · Monitoring Solutions · Mail Production Engineering · CI/CD Implementation

Other Skills

AWS · Amazon Web Services (AWS) · Ansible · Apache Kafka · C · C++ · Cascading Style Sheets (CSS) · Docker · Git · Grafana · HTML · HTML5 · JIRA · Java · JavaScript

About

Strong media and communication professional with a Bachelor's degree in Computer Science from B. M. S. College of Engineering.

Experience

Nvidia

Senior Site Reliability Engineer (GeForce Now)

Feb 2021 - Present · 5 yrs 1 mo

Linux System Administration · Kubernetes · Docker · AWS · Git · JIRA · +2

Everbridge

Senior Data Reliability Engineer

Dec 2020 - Feb 2021 · 2 mos · Bengaluru, Karnataka, India

Linux System Administration · MySQL · Ansible · Splunk · ServiceNow · Data Reliability Engineering · +1

Walmart Global Tech India

Site Reliability Engineer

May 2018 - Dec 2020 · 2 yrs 7 mos · India

  • Engender reliability and availability, starting with metrics and measurements.
  • Ensure that proper telemetry, logging and metrics are implemented to measure service health and conform to error budgeting; build architecture and operations tools to run, validate and analyse multiple enterprise-level SaaS services.
  • Provisioning and maintaining Linux Servers in production and development environments.
  • Identify and solve problems quickly, sometimes under pressure when issues are directly affecting users.
  • Help prevent incident recurrence by developing solutions to correct or mitigate issues at their root.
  • Took part in stress and performance testing for e-commerce websites.
  • Monitored the automated build and continuous integration process to drive resolution of build/release failures.
  • Working with and developing enterprise monitoring/tooling solutions such as Grafana, Kibana, Splunk, Graphite, Nagios, New Relic, Graylog and HPOM.
  • Managing databases such as MySQL and MongoDB.
  • Automated configuration management and deployments using Ansible playbooks, with YAML for resource declaration; created roles and updated playbooks to provision servers.
  • Orchestrated Docker container clusters using Kubernetes.
  • Used cloud technologies such as Azure for building, testing, deploying, and managing applications and services.
  • Experience provisioning and managing applications in the cloud using OneOps.
  • Log aggregation setup using Logstash, Elasticsearch and Kibana.
  • Experience working with xMatters, PagerDuty, ServiceNow, Slack and Jira.
  • Built a monitoring tool that captures alerts from data sources such as Grafana, Splunk and Dynatrace in one place, which helped produce less noisy and more meaningful alerts.
Linux System Administration · MySQL · Ansible · Docker · Kubernetes · AWS · +5

Yahoo! Inc.

Mail Production Engineer, Assoc

Nov 2015 - May 2018 · 2 yrs 6 mos · India

  • As part of Yahoo Mail, I worked on the mail-storage component, managing millions of user mails at the backend.
  • Respond to incidents and requests for assistance as part of an on-call rotation.
  • Management and implementation of Linux (CentOS, Red Hat), middleware (Apache, Tomcat), and Java and Python applications.
  • Designed and implemented a continuous build-test-deployment (CI/CD) system with multiple component pipelines using Jenkins to support weekly releases and out-of-cycle releases based on business needs.
  • Managed GitHub repositories and permissions, including branching and tagging.
  • Maintained Jenkins continuous integration infrastructure and automated releases to DEV/TEST/STG/PROD environments.
  • Built and managed a large deployment of Red Hat Linux instances with Chef automation.
  • Implemented automated local user provisioning on VMs created in OpenStack and AWS through Chef recipes.
  • Experience with container-based deployments using Docker and Kubernetes, working with Docker images, Docker Hub and Docker registries.
  • Worked on implementing AWS using EC2, RDS, ECS, Elastic Load Balancer, Auto Scaling groups, CloudWatch (monitoring), AWS Elastic Beanstalk (app deployments), Amazon S3 (storage) and Amazon EBS (persistent disk storage), and on managing network security using VPC, load balancers, security groups and Route 53.
  • Developed automation and deployment utilities using Python.
  • Set up centralized logging and log aggregation/searching using Elasticsearch.
  • Hands-on experience standing up and administering an on-premises Kafka platform, including creating backups for all instances in the Kafka environment.
  • Established infrastructure and service monitoring using Prometheus, Grafana and Splunk.
  • Experience with Slack, Opsgenie, ServiceNow and the Jira ticketing system.
Linux System Administration · Jenkins · Git · Docker · AWS · Kafka · +2

Education

B. M. S. College of Engineering

Bachelor's degree — Computer Science

Jan 2011 - Jan 2015
