Pankaj Pal

SRE (Site Reliability Engineer)

Mumbai, Maharashtra, India · 16 yrs 9 mos experience
Most Likely To Switch · Highly Stable

Key Highlights

  • Expert in cloud infrastructure management.
  • Proficient in site reliability engineering practices.
  • Skilled in automating deployments with CI/CD pipelines.
Stackforce AI infers this person is a Cloud Infrastructure and Site Reliability Engineering expert with extensive experience in SaaS environments.

Skills

Core Skills

Cloud Infrastructure, Site Reliability Engineering, Web Server Administration, System Administration

Other Skills

AWS, Apache, Bamboo, DNS, Deployment, Docker, GIT, GitHub Actions, Google Cloud Platform (GCP), Helm, Kubernetes, Microservices, Monitoring, MySQL, Nagios

About

I have 9 years of experience in the IT industry, specializing in systems, with specific knowledge of all Linux distributions and of production applications based mainly on Apache Tomcat, MySQL, httpd, nginx, Nagios, New Relic, Puppet, CFEngine, Varnish, SVN, Git, ActiveMQ, NetScaler load balancer, DNS, NFS, LVM, Bamboo, and Python.

Experience

Gannett | USA Today Network

Lead Site Reliability Engineering Team

Mar 2016 - Present · 10 yrs · Mumbai Area, India

  • Define, drive, and implement monitoring and deployment solutions in a multi-datacenter environment with service discovery for 24x7 operations.
  • Expertise in managing AWS and Google Cloud and in creating manifest files for application deployment into Kubernetes clusters, such as Deployments, role-based access control, Services (LoadBalancer, ClusterIP, NodePort), and the Ingress load balancer.
  • Built and deployed Docker containers to break up a monolithic app into microservices, improving developer workflow, increasing scalability, and optimizing speed.
  • Developed and customized Helm charts for deploying microservices-based applications on Kubernetes clusters.
  • Experienced in using Terraform to automate the provisioning and management of Google Cloud Platform (GCP) infrastructure, ensuring consistency and scalability across environments.
  • Optimized Helm templates for better resource utilization and scalability in production environments.
  • Developed and maintained CI/CD pipelines using GitHub Actions to automate testing, building, and deployment processes for various projects.
  • Integrated GitHub Actions with GKE for seamless deployment of applications to cloud environments.
  • Experience supporting, automating, and optimizing mission-critical deployments in OpenStack, leveraging configuration management, CI/CD, and DevOps processes. Managed OpenStack infrastructure and troubleshot challenging issues.
  • Configuration deployment using Puppet & Hiera.
  • Experience with CI/CD tools like GitHub, Jenkins, Bamboo, Ansible, and Puppet.
AWS, Google Cloud Platform (GCP), Kubernetes, Docker, Terraform, GitHub Actions, +4 more

ReachLocal

Site Reliability Engineer

Aug 2012 - Feb 2016 · 3 yrs 6 mos · Mumbai Area, India

  • Provide 24x7 support for the production environment and act as Incident Commander to meet SLAs.
  • Quickly and professionally respond to all critical production incidents.
  • Web server (Apache, Nginx) administration, maintenance, and configuration on the Linux platform.
  • Configuration and troubleshooting of application servers, e.g. Tomcat and Apache ActiveMQ.
  • Ensure maximum uptime (24x7) and performance for the web servers and system build applications.
  • Nagios installation and configuration to monitor production environment.
  • Expertise supporting and designing web applications using LAMP (Linux, Apache, MySQL, and PHP) infrastructure
  • Code deployment using the Git version control tool.
  • Creation of CSRs, renewal of SSL certificates, and their installation on the NetScaler load balancer.
  • NFS mounting between multiple clusters, Rsync.
  • Working experience on LVM.
  • Working experience on DNS.
  • Installation and configuration of New Relic to monitor the applications.
  • Configuration and monitoring of caching technologies like Redis & Memcached.
  • Diagnosis and predictive support for client activities via call, chat, and ticket system.
  • Working experience with ticketing systems like JIRA and ServiceNow.
  • Maintain comprehensive documentation on incident procedures.
  • Configuration deployment using Puppet & Hiera.
  • Continuous integration and deployment tasks using Bamboo & Maven.
  • Working with the NetScaler load balancer: adding new nodes, defining policies, and creating vservers for load balancing.
  • Experience with the Nagios monitoring tool, adding new checks as required.
  • Working on Tomcat & Nginx. Configuring the load balancer with Tomcat and Apache.
  • Creating Python scripts to automate jobs.
  • Experience working with databases such as MySQL & Oracle. Configuration of master & slave MySQL databases for replication.
Apache, Nginx, Nagios, Tomcat, MySQL, GIT, +4 more

Xalted Information Systems Pvt. Ltd.

System Engineer

Jun 2009 - Aug 2012 · 3 yrs 2 mos · Mumbai Area, India

  • Set up Apache Server 2.2.8 with required modules and tuned Apache performance.
  • Configuration of Tomcat server: deploying new projects, taking daily backups of various Tomcat projects, performance tuning, and monitoring of different projects on Tomcat.
  • Working on Subversion repositories for patch release management: compiling source code on the JBoss Seam server and releasing all generated binary files to SVN repositories for further release.
  • Installation and management of JBoss 5.1 server.
  • JBoss load balancer server with Apache mod_jk and a Heartbeat cluster.
  • Administering and managing MySQL databases: creating databases, taking backups, performance tuning, granting privileges to different users, and modifying tables.
  • Set up BIND DNS servers (master + slave) and troubleshot DNS problems such as zone transfers and forwarding.
  • Configuration and maintenance of Sendmail and Dovecot servers.
  • Deployed and monitored a Squid server and applied access control with the SquidGuard utility.
  • Assigned user accounts and granted permissions to shared resources. Assured senior management of data protection by demonstrating permission settings.
  • Monitoring the status of all servers with Nagios.
Apache, Tomcat, MySQL, DNS, Nagios, System Administration

Education

Mahatma Gandhi University

BCA — Computers
