Mohammed Junaid

SRE (Site Reliability Engineer)

Jeddah, Makkah, Saudi Arabia · 10 yrs 9 mos experience
Highly Stable

Key Highlights

  • Expert in AWS and Kubernetes with extensive SRE experience.
  • Proven track record in automating cloud infrastructure management.
  • Strong background in security and compliance for cloud environments.
Stackforce AI infers this person is a Cloud Infrastructure and Fintech expert specializing in Site Reliability Engineering.

Skills

Core Skills

Kubernetes · AWS · Logging · DevOps · Monitoring · Cloud Security · Automation · Infrastructure Automation · Site Reliability Engineering · Incident Management · Cloud Solutions Architecture

Other Skills

Terraform · IAM · Kubernetes RBAC · GitHub · Loki · Promtail · ArgoCD · Grafana · MinIO · GitOps · Security · Compliance · ARM · Bash · Ansible

About

Site Reliability Engineer | Linux | Python | AWS | Tech Enthusiast | Gamer

Experience

Total Experience: 10 yrs 9 mos
Average Tenure: 2 yrs 9 mos
Current Experience: 2 yrs 3 mos

Salla e-commerce platform

Senior Site Reliability Engineer

Mar 2024 – Present · 2 yrs 3 mos · Jeddah, Makkah, Saudi Arabia · Remote

  • Designed and implemented end-to-end automation for managing EKS cluster access using Terraform, IAM, and Kubernetes RBAC. Centralized access management via GitHub ensures easy governance, auditing, and team-based access control.
  • Enabled log collection with Loki and Promtail for production workloads, handling terabyte-scale daily log volumes. Balanced cost efficiency against performance by configuring log levels in Promtail scrape configurations to drop unnecessary logs and by setting retention policies based on application priority.
  • Collaborated with the team to design and migrate deployments to ArgoCD. Developed a custom script that dynamically generates ArgoCD ApplicationSets from configuration, facilitating complex deployment setups across multiple clusters.
  • Managed upgrades of critical infrastructure components such as EKS, Karpenter, Loki, and Teleport with zero downtime.
  • Developed Grafana dashboards that provide a bird's-eye view of services, enabling quick issue identification and streamlined debugging during critical production incidents. The dashboards are designed to support proactive monitoring and improve operational efficiency.
  • Collaborated with the data engineering team to design and implement complex infrastructure for critical tools like MinIO with DirectPV and Timeplus. Leveraged GitOps and ArgoCD for source-controlled, scalable, and automated deployments, ensuring reliability and streamlined operations. Focused on optimizing performance and maintaining seamless integrations across the infrastructure.
Terraform · IAM · Kubernetes RBAC · GitHub · Loki · Promtail · +6

Microsoft

Site Reliability Engineer 2

May 2020 – Jan 2024 · 3 yrs 8 mos · Hyderabad, Telangana, India · Hybrid

  • Part of Azure Core organization.
  • Managed security and compliance for the entire infrastructure, comprising VMs, bare-metal machines, and Kubernetes clusters at large scale. This included regularly developing automated solutions to keep all Kubernetes clusters and infrastructure free of vulnerabilities.
  • Single-handedly managed upgrades on all Kubernetes clusters, updating API versions wherever upgrades required it.
  • Managed other infrastructure tools such as Linkerd and Helm, keeping them up to date with the latest versions.
  • Created a maintenance tracker using a low-code solution. The tool handled the necessary approvals, scheduled maintenance tasks, sent out meeting invites, and promptly notified relevant stakeholders.
  • Developed automation to create the infrastructure required for every new region. This reduced manual effort, saving approximately three weeks of work; the complete setup now requires only a few clicks and takes approximately 4-5 hours. Built with ARM (IaC, similar to Terraform), Bash, and Ansible.
Kubernetes · Security · Compliance · Automation · ARM · Bash · +2

Arcesium

3 roles

Site Reliability Engineer Lead

Promoted

Jan 2020 – Apr 2020 · 3 mos

  • Led the implementation and ongoing management of a highly available, secure, and scalable infrastructure on AWS, resulting in improved system performance and reduced downtime. These systems hosted highly critical FinTech applications.
  • Developed and maintained robust automation scripts and configuration management tools, significantly reducing manual effort and improving operational efficiency while following security best practices. Examples: (1) a generic self-service tool to run commands on production during emergencies without granting production access; (2) Slack bots to report incidents and take actions based on the affected services; (3) weekly automated reports to leadership on alert noise, so that it can be fixed and only actionable, genuine alerts remain in place.
  • Participated in incident response and performed root cause analysis, implementing preventive measures to minimize the recurrence of critical issues. Collaborated closely with dev teams to run blameless RCAs, fix performance bottlenecks, and reduce incidents.
  • Implemented DR strategies and conducted regular drills to find gaps in our backup plans.
  • Played an important role in mentoring and training junior team members, regularly delivering knowledge-sharing sessions to foster their professional growth. Additionally, conceptualized a simulation program that trains new hires in effective incident management during critical situations, using mock scenarios built from past learnings.
AWS · Automation · Configuration Management · Incident Response · Disaster Recovery · Site Reliability Engineering

Site Reliability Engineer

Promoted

Jul 2017 – Dec 2019 · 2 yrs 5 mos

Member Technical

Jan 2016 – Jul 2017 · 1 yr 6 mos

CloudThat Technologies Pvt Ltd

Cloud Solutions Architect

Jan 2015 – Dec 2015 · 11 mos · Bengaluru Area, India

  • CloudThat is the first company in India to provide Cloud Training & Consulting services for mid-market & enterprise clients around the world. With expertise in major Cloud platforms including Amazon Web Services and Microsoft Azure, CloudThat is uniquely positioned to be the single technology source for organizations looking to utilize the flexibility and power Cloud Computing provides.
  • Worked on developing an automation tool to manage infrastructure on Amazon Web Services.
  • Worked with most AWS services, including EC2, VPC, IAM, SQS, SNS, Lambda, and RDS.
  • Wrote many scripts for AWS infrastructure automation. A few examples:
    1. Launching multiple instances with multiple configurations from a predefined config file, terminating them automatically after a user-defined duration, and logging all instance details to a separate CSV file.
    2. A custom auto scaling script that scales up and down on the client's custom metrics rather than AWS's predefined metrics. For example, an instance is not terminated even when load is under the threshold unless the active session count on the host reaches 0.
    3. Rotating IAM credentials: deleting the IAM access and secret keys, creating new keys for all IAM users, and mailing the new keys to them every week.
    4. Creating a VPC with public and private subnets and launching a NAT instance, with an Elastic IP attached, in a public subnet with routes configured for the private subnet.
  • Involved in architectural decisions for AWS, and in their implementation and management, for clients moving from on-premises to the cloud.
  • Briefly worked on Docker containers.
AWS · Automation · Docker · Cloud Solutions Architecture

Education

RV College Of Engineering

Master of Computer Applications (M.C.A.) — Computer Science

Jan 2012 – Jan 2015

Elite Institute Of Technology, Gulbarga University

Bachelor of Computer Applications. — Computers

Jan 2009 – Jan 2012

Al-Sharay PU college

PU — PCMB

Jan 2006 – Jan 2009

Faraan high school

Schooling — schooling

Jan 1993 – Jan 2006
