Ankur Gupta

SRE (Site Reliability Engineer)

Bengaluru, Karnataka, India · 12 yrs 6 mos experience
Most Likely To Switch · Highly Stable

Key Highlights

  • Over four years of experience in Site Reliability Engineering.
  • Expertise in automation and infrastructure management.
  • Proven track record in enhancing operational efficiency.
Stackforce AI infers this person is a DevOps and Site Reliability Engineering expert in the SaaS industry.

Skills

Core Skills

Site Reliability Engineering, Infrastructure Automation, DevOps, Infrastructure Management, Linux System Administration

Other Skills

AWS, Agile Environment, Amazon Web Services (AWS), Ansible, AppDynamics, Automation, Bash, Chef, Cluster Management, Communication, Computer Science, Consulting, Continuous Integration and Continuous Delivery (CI/CD), Data Science, Docker

About

With over four years at IBM as a Senior SRE Professional, I focus on leveraging automation to enhance operational efficiency and reliability. My expertise includes developing and maintaining Ansible playbooks, creating custom modules, and optimizing clusters to support data science initiatives. These efforts have contributed to scalable and resilient environments for analytics and modeling. Across my career, including previous roles at Goldman Sachs and OpsTree Solutions, I have honed skills in infrastructure automation, cloud technologies, and container orchestration. My mission is to drive innovation in Site Reliability Engineering by implementing streamlined solutions that empower teams and support organizational goals.

Experience

IBM

Senior SRE Professional

Jul 2021 - Present · 4 yrs 8 mos · Bengaluru, Karnataka, India · On-site

  • Developed and maintained Ansible playbooks to automate manual runbooks, significantly reducing human error and improving operational efficiency.
  • Created custom Ansible modules tailored to specific system requirements, allowing for seamless integration of new services and configurations into the automation framework. Enhanced existing Ansible modules to optimize performance and increase reusability across different environments.
  • Led the setup and configuration of multiple clusters for the data science team, providing a reliable and scalable environment for their analytics and modeling work.
  • Implemented best practices for cluster management, including resource allocation, monitoring, and backup strategies to support the team's data-intensive workflows.
  • Collaborated with teams to establish IaC pipelines, enabling automated provisioning and configuration of infrastructure resources.
Ansible, Automation, Cluster Management, Infrastructure as Code (IaC), Site Reliability Engineering, Infrastructure Automation

Goldman Sachs

Senior DevOps

Apr 2019 - Jul 2021 · 2 yrs 3 mos · Bengaluru, Karnataka, India

  • Used Terraform to set up automated AWS infrastructure.
  • Ran various test cases on AWS infrastructure using services like EC2, S3, Auto Scaling, EBS, ELB, Load Balancer, VPC, IAM, security groups, AMI, Systems Manager, and Secrets Manager.
  • Set up Docker images for all applications and web services.
  • Used Jenkins for continuous integration of Java and Python applications.
  • Worked with Kubernetes (k8s) to set up an application running in the on-premises environment.
  • Managed infrastructure of ~200 servers in the on-premises ecosystem across all environments.
Terraform, AWS, Docker, Jenkins, Kubernetes, DevOps, +1

OpsTree Solutions

Sr. DevOps Engineer

Jan 2017 - Apr 2019 · 2 yrs 3 mos · Noida Area, India

  • Project History
  • Client: Sapient
    • Track 1
      • Managed infrastructure of 350+ servers across all environments.
      • Used Chef with Jenkins for software configuration and deployment.
      • Implemented SSL certificates along with security and server hardening on Linux instances.
      • Used AEM for hosting in all environments.
      • Round-the-clock monitoring using Site24x7 and Splunk.
      • Used Okta as the single sign-on service.
    • Track 2
      • Built infrastructure on Google Cloud for all environments from scratch.
      • Used Kafka as a queue service, built on a multi-broker Kubernetes cluster.
      • Implemented SSL certificates and security on different APIs.
      • Used Terraform to set up different Google Cloud services.
  • Client: Industrybuying
      • Used Terraform to set up an automated AWS cloud environment.
      • Migrated the production environment between regions.
      • Implemented complete continuous integration and delivery using Jenkins and JFrog Artifactory.
      • Created various Bash scripts for Jenkins jobs.
      • Round-the-clock monitoring using New Relic and CloudWatch.
      • Implemented SSL certificates along with security and server hardening on Linux instances.
      • Configured database replication (master-master and master-slave).
      • Handled several projects in Python, Java, and Node.js.
      • Configured web servers using nginx and Apache, with HAProxy, for site reliability.
      • Created a Python script for generating various business data reports.
Chef, Terraform, Monitoring, SSL Certificates, DevOps, Infrastructure Management

Genx Info Technologies Pvt Ltd

Linux Administrator

Apr 2015 - Jan 2017 · 1 yr 9 mos · Gurgaon, India · On-site

  • Responsibilities
  • Responsible for problem management and root cause analysis.
  • Performed health checks of all production servers.
  • Prepared and updated various support documents.
  • Responsible for investigating high-priority incidents.
  • Responsible for performing impact assessment and root cause analysis, and for recommending solutions.
  • Responsible for performing regular activities on production and development servers.
  • Quickly accepted responsibility and ownership of issues and problems; drove solution development and problem resolution.
Linux Administration, Problem Management, Root Cause Analysis, Linux System Administration

Shahi Exports Pvt Ltd

Linux Administrator

Sep 2013 - Mar 2015 · 1 yr 6 mos · Delhi-NCR

Linux Administration

Education

Motivational Pathway

Bachelor of Technology (B.Tech.) — Electronics and Communications Engineering

Jan 2009 - Jan 2013
