Akash Mishra

SRE (Site Reliability Engineer)

Bengaluru, Karnataka, India · 14 yrs 3 mos experience

Most Likely To Switch: Highly Stable

Key Highlights

Over 10 years of experience in DevOps and cloud infrastructure.
Expert in Infrastructure as Code and CI/CD management.
Proven track record in automation and operational improvements.

Stackforce AI infers this person is a DevOps Engineer specializing in Infrastructure and Automation within the SaaS and Telecommunications sectors.

Skills

Core Skills

Infrastructure as Code, CI/CD Management, Automation Solutions, Build and Deployment Management, Application Support, IT Operations

Other Skills

AIOps, AWS, Ansible, Apache, Application Monitoring, Automation, Automation Scripts, Bash, Build Automation, C, Cacti, Configuration Management, Core Java, DevOps, Endeca

About

DevOps Engineer with 10+ years of experience in managing cloud infrastructure and system administration, integrating AWS/HWS cloud-based infrastructure components, and developing automation solutions. Optimizing mission-critical deployments in AWS/HWS, leveraging configuration management, CI/CD, IT Operations, Middleware/Application Support and Build using DevOps processes.

Key Skills
========
Cloud Technology - AWS & HWS
Cloud Provisioning - Terraform & Packer (IaC)
Containerization - Docker & k8s
Source Control - Git
BigIP - F5 Load Balancer
Scripting - Shell & Python
Monitoring - Zabbix, Sensu, CatchPoint, Nagios, SysDig, Grafana, NewRelic, DataDog, Keynote and Omniture
Log Management - Loggly, Sumologic, DataDog, Kibana and Splunk
Config Management - Chef & Ansible
Platforms Supported - RHEL 6/7/8, Ubuntu 14.x/16.x, CentOS 7/9
Middleware/Weblayer Technology - Oracle Weblogic Servers, Reverse proxy, Apache WebServer, JBoss, Tomcat, SolarWind, nuwa and JVM
Troubleshooting and Incident Management of Production Environment in Cloud and DataCenter

My contact details: docker run devopssreengineers/contact

Experience

Booking.com

Senior Site Reliability Engineer

Oct 2022 – Present · 3 yrs 5 mos

Designing and maintaining Infrastructure as Code (IaC) solutions using Terraform to efficiently manage AWS resources, leveraging Ansible for provisioning and configuration automation across cloud environments.
Maintaining and extending Ansible playbooks while using Puppet for configuration management and system provisioning across diverse environments, ensuring consistency, automation, and compliance.
Building automation scripts using Python, Bash, Ansible, and AWS Lambda to automate manual and recurring operational tasks such as automated backups, system provisioning, and health-check monitoring, improving efficiency and reducing human error.
Managing and enhancing CI/CD pipelines using GitLab CI/CD to streamline infrastructure and application builds and deployments across multiple environments.
Monitoring system performance using Prometheus, Grafana, Zabbix, and other observability tools such as OpenTelemetry (Otel), proactively identifying and resolving performance issues through metrics, logs and traces.
Participating in incident response and conducting root cause analysis to prevent recurrence and improve system resilience.
Collaborating with application, network, and security teams to implement best practices for system performance, resource optimization, and compliance.
Researching and experimenting with new technologies to improve system reliability, cost-efficiency and operational workflows.
Advocating and implementing engineering best practices, including version-controlled infrastructure, modular automation code, and comprehensive documentation.
Designing and implementing HA solutions, including redundancy, failover mechanisms, cross-region deployments, load balancing, and disaster recovery strategies to ensure uptime.
Handling SAP infrastructure, ensuring system resilience, implementing HA architectures and integrating observability practices to monitor SAP system health, performance and reliability across critical business systems.

Terraform, Ansible, Python, Bash, GitLab CI/CD, Prometheus, +4

Huawei Technologies India

Senior Site Reliability Engineer/DevOps Engineer

Apr 2020 – Oct 2022 · 2 yrs 6 mos · Bengaluru, Karnataka, India

Asia Pacific, Africa, Latin America & Russia Region
Automatic Deployment through IaC - Automatic Ops Capability (SD Process)
Automatic Monitoring - Automatic Alarm & Monitoring Capability (ITR Process)
Proactive Discovery Capability (BCM)
Unified Configuration Management Capability (CMDB)
Data-Based and Intelligent Ops Capabilities (AIOps)
Proved successful working within tight deadlines and a fast-paced atmosphere. Exceeded goals through effective task prioritization and great work ethic.
Drove operational improvements which resulted in savings and improved profit margins.
Improved operations through consistent hard work and dedication.
Leading planning and implementation of automation solutions in the team.
Implemented Auto Chaos Engineering solutions for business continuity tests.
Upgrade, installation and commissioning of micro-services on cloud.
Deployment of micro-services using IaC (Infrastructure as Code).
Performed Migration from a cloud VM to Container.
Real-time Site monitoring, Real-time Alarm monitoring, Analyzing the errors and reporting.
Analyze customer existing technology functions and define business requirements based on new platform/technologies.
Managing service as a Service owner for AAL & Russia Region.
Developed Service monitoring to enhance the user experience.
Used coordination and planning skills to achieve results according to schedule.
Increased customer satisfaction by addressing issues through automation.
Automated manual and repetitive tasks using integrated StackStorm.
Define scope, solution offerings, creation of customer journeys and drive end-to-end technical solutions.
Saved cost by implementing cost-saving initiatives that addressed long-standing problems.
Awarded with Spot Award 2020.
Awarded with Digital SRE Best Practices 2021.
Awarded with SRE 2021 Award.
Awarded with APRI Starlight Award Outstanding 2021.

IaC, AIOps, Micro-services, Automation, Monitoring, Infrastructure as Code, +1

Rakuten

Senior DevOps Engineer

Oct 2017 – Mar 2020 · 2 yrs 5 mos · Bangalore (WTC)

Design and implement build, deployment, and configuration management.
Build and test automation tools for infrastructure provisioning.
Handle code deployments in all environments.
Monitor metrics and develop ways to improve.
Execute complex Proof of Concepts (PoC) as a tool to finalize the technical evaluation and middleware components.
Provide technical guidance and educate team members and coworkers on development and operations.
Maintain day-to-day management and administration of projects.
Document and design various processes; update existing processes.
Improve infrastructure development and application development.
Follow all best practices and procedures as established by the company.
Take care of end-to-end infrastructure, including all setup, deployments and monitoring.
Work in PCI and non-PCI environments.
Participate in on-call schedule in the local timezone.
Managed GitHub repositories and permissions, including branching and tagging.
Improve the deployment process within AWS.
Developing CI/CD roadmap and implementing the project.
Good command of configuration management tools such as Chef.
Provision AWS infrastructure using Terraform.
Scan infrastructure and remediate vulnerabilities (risk management).
Excellence Award Oct-2018
Excellence Award Sept-2019
Best Team Rakuten Rewards (Ebates) 2019

Build Automation, Configuration Management, AWS, GitHub, Build and Deployment Management

IPsoft

Senior Application Engineer

Mar 2017 – Oct 2017 · 7 mos · Bangalore

Engage and assist with changes on the applications.
React to alerts and issues seen within the application.
Assist with the setup and resolve issues for new partners using the applications.
Check and monitor applications to ensure they are running as normal.
Report on the activity on the application.
Work with product and business teams to ensure that the application has a clear and correct use and future.
Work with development teams to outline issues and help quickly resolve them as needed.
Incident handling related to environment issues.
Working on JBoss, Apache and F5 load balancer.
Working on Splunk and creating dashboards as per requirements.

Application Monitoring, Incident Management, JBOSS, Apache, Application Support

Hewlett Packard Enterprise

Appl Mgmt Svc Del Cons/IT Ops Engineer

Nov 2011 – Mar 2017 · 5 yrs 4 mos · Bangalore

Installation and configuration of WebLogic 12c, Coherence, Ultra-ESB, HornetQ and ATG.
Knowledge and configuration of F5 load balancer.
Server Administration - Disk Cleanups, Log Rotation, etc.
Environment troubleshooting.
Incident handling related to environmental issues.
Working as an operational lead.
Took responsibility for critical incidents in production.
Configure and maintain the Nagios services.
Create SSL Certificates and configure them on Weblogic servers.
Cluster Management, Create Admin and Managed Servers.
Performing backup and recovery when new technologies are being installed or upgraded.
Tomcat, WebLogic, Splunk and Cacti configuration and monitoring.
Application deployment and bug fixes in various environments.
Web server (Apache), DNS (BIND) and mail server (Postfix).
Partition, disk space and file system management.
Help the support team with high-priority incidents.
Implemented a number of automation scripts for vodafone.co.uk production support stability.
Created multiple shell scripts to avoid manual work, improve performance, and track errors.
Patch deployment and installation.
EAR, JAR, WAR, MDS deployment.
Build and release in a production environment.
Change management for any enhancement in production under L2 support.
Knowledge of OAM/OVD servers.
Working with various tools such as Keynote, Omniture, Splunk, Remedy and Cacti.
Content file system import/export-related tasks.
Attending war rooms to investigate the root cause of site failures along with several teams.