Wahba M.

DevOps Engineer

Egypt · 12 yrs 3 mos experience
Highly Stable · AI Enabled

Key Highlights

  • Achieved 99.99% CI/CD uptime over 18 months.
  • Saved $200K/year through DevOps optimization.
  • Led cloud-native modernization across mission-critical platforms.
Stackforce AI infers this person is a Cloud Infrastructure and DevOps expert specializing in high-availability and disaster recovery solutions.

Skills

Core Skills

DevOps, Site Reliability Engineering (SRE), Cloud Architecture, AI Engineering, Cloud Engineering

Other Skills

GitHub Actions, OKE, Terraform, Azure AKS, Azure DevOps, Prometheus, Grafana, GitOps, Python, Bash, Service Mesh, ChatOps, HashiCorp Vault, Kubernetes, MLOps

About

Principal Cloud Reliability & Resilience Consultant (via Concentrix) supporting a Fortune 500 global cloud initiative. I help enterprise teams design, test, and sustain high-availability, disaster-resilient multi-cloud environments across Azure, AWS, and GCP. I translate Business Impact Analyses into HA/DR strategies with precise targets (RTO ≤ 15 min, RPO ≤ 5 min), drive 99.99%+ uptime, and embed SRE metrics (MTTR, CFR, deployment frequency, lead time) into delivery pipelines for measurable reliability.

Previously at KAUST, I engineered automation frameworks serving 300K+ users, achieving 40% faster recovery, 70% fewer vulnerabilities, and $200K in annual cost optimization through Terraform + GitOps, RBAC, and policy-as-code governance. I lead resiliency and BCP workshops, align stakeholders on risk-to-resilience mapping, and validate readiness through failover drills and post-incident retrospectives.

Specialties: Cloud Reliability Architecture | SRE | Multi-Cloud | IaC (Terraform / Helm) | CI/CD | Kubernetes Platform Engineering | Observability | Compliance & Governance | Cost & Performance Optimization.

Value: Principal-level consulting that blends Fortune-grade engineering rigor with enterprise resilience expertise, helping global organizations operate faster, safer, and stronger in the cloud.

Experience

Total Experience: 12 yrs 3 mos
Average Tenure: 11 yrs 9 mos
Current Experience: 6 mos

Microsoft

Cloud Solution Architecture Consultant

Nov 2025 - Present · 6 mos

Concentrix

Principal Cloud Reliability & Resilience Consultant

Nov 2025 - Present · 6 mos · Remote

KAUST (King Abdullah University of Science and Technology)

4 roles

DevOps & SRE Lead

Nov 2022 - Nov 2025 · 3 yrs · Remote

  • Core Value: Driving org-wide DevOps transformation, SRE reliability, and AI-driven operational efficiency
  • Key Impact & Achievements:
  • Saved $200K/year by scaling DevOps pipelines, cutting agent licensing, and optimizing AKS workloads.
  • Achieved 99.99% CI/CD uptime over 18 months — even during exam peaks with >5,000 concurrent users.
  • Strengthened platform security, achieving 70% fewer vulnerabilities through RBAC, secret rotation (Azure Key Vault), and Terraform guardrails.
  • Reduced manual incident response by 80% using AI-driven ChatOps, auto-remediation scripts, and proactive alerting.
  • Improved system response times by 40% by adopting service mesh, Redis queues, and scalable Kubernetes (AKS/EKS).
  • Core Platforms & Advanced Projects:
  • Generative AI CoE: Built and operated scalable GenAI/LLM platforms on Kubernetes with MLOps pipelines, GPU orchestration, and RAG-based data integration, optimizing performance and cost.
  • SANDS: Designed and managed cloud-native distributed systems with microservices, service mesh, and high-availability architecture, ensuring reliability and scalability in production environments.
  • SCML: Orchestrated HPC-driven machine learning workloads integrating distributed training, simulation pipelines, and hybrid cloud execution with efficient resource utilization.
  • Tech Stack: GitHub Actions, OKE, Terraform, Azure AKS, Azure DevOps, Prometheus, Grafana, GitOps, Python, Bash, Service Mesh, ChatOps, HashiCorp Vault

AI & Platform Engineer

Promoted

Nov 2019 - Oct 2022 · 2 yrs 11 mos · Remote

  • Core Value: Building scalable, secure, and observable platforms enabling DevOps maturity across multi-cloud environments.
  • Key Impact & Achievements:
  • Designed and deployed multi-cloud Kubernetes clusters (AKS, EKS, OKE) with Terraform, Helm, and GitOps — standardized infra provisioning across 3 business units.
  • Automated platform provisioning pipelines, reducing deployment time by 65% and configuration drift by 30%.
  • Consolidated CI/CD frameworks and YAML templates across 50+ services, improving delivery speed and consistency.
  • Implemented SRE metrics and dashboards (MTTR, CFR, uptime) with Prometheus, Grafana, and Azure Monitor — improved issue detection time by 40%.
  • Integrated RBAC (Role-Based Access Control), secret rotation, and compliance guardrails — achieving 70% fewer security incidents.
  • Led platform modernization from monolithic .NET to containerized microservices — reduced infrastructure footprint by 25% and improved scalability.
  • Mentored 2 cross-functional teams on IaC best practices and cost optimization, lowering platform costs by $120K/year.
  • Tech Stack: Kubernetes (AKS/EKS/OKE), Terraform, Helm, GitHub Actions, Azure DevOps, Prometheus, Grafana, Ansible, Redis, PostgreSQL, Key Vault, Vault

Senior DevOps & SRE Engineer

Promoted

Nov 2016 - Oct 2019 · 2 yrs 11 mos · Remote

  • Core Value: Leading cloud-native modernization across mission-critical academic platforms
  • Key Impact & Achievements:
  • Migrated legacy apps to containerized .NET/Golang microservices with blue/green & canary deployments via Helm + ArgoCD.
  • Automated provisioning of AKS clusters and serverless Azure Functions using IaC (Terraform + Helm) — built infra in 3 weeks instead of 6.
  • Reduced configuration drift by 30% via Ansible automation and standardization.
  • Delivered 70% faster deployment of archival workflows using modularized Terraform.
  • Separated testing & production pipelines, reducing production rollback by 60%.
  • Built CI/CD YAML templates for reusability across 80+ pipelines; cut maintenance overhead by 50%.
  • Integrated GitHub Advanced Security, secrets scanning, and shift-left tooling into pipelines.
  • Reduced merge conflicts by 40% and CI flakiness by 70% through VCS policies and retry logic.
  • Tech Stack: Azure DevOps, Terraform, Helm, GitHub Actions, Ansible, AKS, SonarQube, Redis, PostgreSQL, Key Vault, OpenTelemetry

DevOps Engineer

Nov 2013 - Oct 2016 · 2 yrs 11 mos · Remote

  • Core Value: Automation foundation, DevOps enablement, and early IaC/cloud excellence
  • Key Impact & Achievements:
  • Reduced Azure DevOps license costs by 30% (~$200K/year) — gained promotion to Senior Engineer.
  • Built multi-stage YAML pipelines and migrated manual builds to full CI/CD automation with PowerShell and Terraform.
  • Cut manual release efforts by 80%, accelerating SDLC cycles with IaC-based provisioning.
  • Automated deployments of C# apps via Ansible across Linux/Windows, reducing config drift by 30%.
  • Containerized applications and deployed on EKS and later AKS; maintained production K8s reliability for 5,000+ users.
  • Improved code coverage by 20% using SonarQube integration in CI pipelines.
  • Resolved CI merge/rebase issues across teams, reducing failed builds by 70%.
  • Eliminated leaked credentials by rewriting Git history with git-filter-repo for compliance.
  • Tech Stack: Azure DevOps, Terraform, Ansible, Docker, Kubernetes, PowerShell, Python, Helm, Git, EKS, SonarQube, KQL

Education

Tanta University

Bachelor of Engineering

Jan 2010 - Jan 2015
