Umar Habib

CTO

Ireland · 6 yrs 2 mos experience

Key Highlights

  • Expert in cloud automation and infrastructure as code.
  • Proven track record in site reliability engineering.
  • Strong leadership in cross-functional team environments.
Stackforce AI infers this person is a Cloud Infrastructure and DevOps Engineering expert in the Fintech and SaaS industries.

Skills

Core Skills

Infrastructure Automation, Site Reliability Engineering, Cloud Infrastructure, DevSecOps, Cloud Optimization, CI/CD Implementation, DevOps Engineering

Other Skills

AWS, AWS Cloud Migration, AWS CloudFormation, AWS CodeBuild, AWS CodeCommit, AWS CodeDeploy, AWS CodePipeline, AWS Command Line Interface (CLI), AWS Identity and Access Management (AWS IAM), AWS Lambda, Amazon CloudFront, Amazon CloudWatch, Amazon EC2, Amazon ECS, Amazon EKS

About

Dynamic Cloud, DevOps, SRE, and DevSecOps Engineer with a proven track record in designing, automating, and securing large-scale cloud infrastructures across AWS, Azure, and GCP. Passionate about building reliable, observable, and self-healing systems that balance innovation with operational excellence. I specialize in cloud automation, infrastructure as code (Terraform, Ansible, CloudFormation), and CI/CD pipeline orchestration that accelerates deployments while maintaining strict reliability and compliance standards.

As a Site Reliability Engineer (SRE), I implement and monitor Service Level Indicators (SLIs), Service Level Objectives (SLOs), and Service Level Agreements (SLAs) to measure and improve service health. I manage error budgets to maintain the right balance between velocity and stability, and lead incident management and disaster recovery (DR) initiatives to ensure business continuity.

Skilled in container orchestration (Kubernetes, Docker, AWS EKS, Azure AKS) and observability (Prometheus, Grafana, CloudWatch), I leverage data-driven insights to enhance system resilience and performance. Proficient in DevSecOps practices, integrating tools like SonarQube, Snyk, and OWASP ZAP to enforce security and compliance (SOC 1, SOC 2, GDPR). Experienced in MLOps and AI model deployment, automating the training, validation, and monitoring of machine learning models across cloud environments (AWS, Azure, GCP) using tools like MLflow, SageMaker, and Azure Machine Learning. Adept at Microsoft 365 administration and data visualization with Power BI and Tableau to deliver actionable intelligence to stakeholders.

With strong leadership, communication, and collaboration skills, I thrive in cross-functional environments, driving innovation, reliability, and scalability in every project I lead.

Experience

Total Experience: 6 yrs 2 mos
Average Tenure: 1 yr 6 mos
Current Experience: --

SMBC Group

Vice President - Infrastructure Automation Engineer

Aug 2025 - Dec 2025 · 4 mos · Tralee, County Kerry, Ireland · Hybrid

  • Designed, implemented, and maintained automation frameworks and tools to manage infrastructure provisioning, scaling, monitoring, and configuration using Ansible, Terraform, and Python.
  • Developed and managed automated deployment pipelines using Terraform and Ansible, enabling consistent, repeatable, and compliant infrastructure deployments across hybrid environments.
  • Implemented monitoring, observability, and alerting solutions using Prometheus, Grafana, ELK Stack, and PagerDuty to ensure real-time visibility into performance, health, and reliability metrics.
  • Defined and maintained Service Level Objectives (SLOs), Service Level Indicators (SLIs), and error budgets, using them to guide deployment decisions and balance innovation with system reliability.
  • Designed and executed incident management processes, including alerting, escalation, and post-mortem reviews, ensuring effective root-cause analysis and preventive action.
  • Wrote and maintained automation scripts in YAML, Python, Bash, and PowerShell to eliminate repetitive tasks, improve efficiency, and standardize configurations.
  • Managed GitHub Enterprise platform for the organization, including repository creation, structure, and access governance.
  • Integrated GitHub Copilot into development workflows to enhance code quality, accelerate reviews, and improve automation script efficiency.
  • Integrated ServiceNow with automation pipelines for incident management and response, enabling faster issue resolution and improved operational visibility.
  • Designed and implemented scalable and resilient observability solutions across logs, metrics, and traces, using existing market solutions or building them from scratch.
  • Promoted a reliability-driven culture within the engineering teams, mentoring peers on automation, observability, and continuous improvement aligned with SRE principles.
Ansible, Terraform, Python, Prometheus, Grafana, ELK Stack, +5 more

Nile Integrity Solutions

Cloud DevOpSec Engineer

Jun 2024 - Apr 2025 · 10 mos · Dublin, County Dublin, Ireland

  • Designed and deployed scalable cloud infrastructures on AWS and Azure using Terraform, automated provisioning with Ansible, and set up and optimized Google Cloud.
  • Migrated on-premises databases to AWS RDS, improving disaster recovery and reducing costs.
  • Administered Kubernetes clusters (AWS EKS, Azure AKS), optimized deployments with Helm charts, and automated scaling.
  • Deployed Redis clusters for caching, reducing latency and enhancing application performance.
  • Automated CI/CD pipelines with Jenkins, GitHub Actions, and Azure DevOps, integrating security tools for vulnerability management.
  • Enhanced security by implementing IAM policies, MFA, SAST/DAST scans, and compliance measures (SOC2, GDPR).
  • Configured monitoring solutions (Prometheus, Grafana, ELK), reducing MTTR by 35%.
  • Streamlined network provisioning, VPN setups, switch, router, wireless and firewall configurations, ensuring secure connectivity.
  • Managed Linux and Windows servers, automating tasks with PowerShell and improving system reliability.
  • Optimized Microsoft 365 environments, enhancing collaboration and securing data with DLP and MFA.
  • Administered Linux-based environments, including user management, disk partitioning, and system monitoring.
  • Configured networking services, including DNS, DHCP, and VPNs, ensuring secure and reliable connectivity.
  • Managed and configured DNS, SSL/TLS certificates, and domain registration; integrated SSO and MFA for enhanced security.
  • Managed and optimized Linux servers (Ubuntu, CentOS, RHEL) for application deployment and web hosting.
  • Automated CI/CD pipelines using Jenkins, GitHub Actions, Azure DevOps, & AWS CodePipeline for .NET Core, Java, and Python applications.
  • Enhanced security posture by configuring IAM policies, Multi-Factor Authentication (MFA), & encryption mechanisms.
  • Conducted SOC1/SOC2 audits, implemented controls to address compliance gaps, & ensured GDPR compliance.
  • Deployed & managed Keycloak on Kubernetes.
Terraform, Ansible, AWS, Azure, Kubernetes, Prometheus, +5 more

Techvibes

Cloud SRE Engineer

Sep 2023 - May 2024 · 8 mos · Remote

  • As a versatile DevOps Engineer, I specialize in optimizing cloud infrastructure and streamlining CI/CD processes to drive operational excellence and innovation.
  Key Responsibilities & Achievements:
  • Kubernetes Expertise: Troubleshot and resolved Kubernetes issues to ensure seamless application operation and high availability.
  • Database Management: Designed and implemented robust database backup plans to guarantee data integrity & availability.
  • Cloud Optimization: Advised teams on AWS resource utilization, optimizing cloud infrastructure for performance & cost-efficiency.
  • Infrastructure as Code: Created and managed AWS resources for startups using Terraform, facilitating smooth cloud transitions.
  • CI/CD Pipeline Implementation: Developed and maintained CI/CD pipelines with Jenkins, GitLab CI, & GitHub Actions, automating build, test, and deployment processes to enhance software delivery.
  • Monitoring & Logging: Deployed & managed ELK stack (Elasticsearch, Logstash, Kibana) and configured alert systems for centralized logging, real-time monitoring, & issue resolution.
  • Cloud Cluster Management: Set up and managed Kubernetes clusters on AWS EKS and Google Kubernetes Engine (GKE), ensuring scalability, high availability, & resilience for microservices.
  • Automated Provisioning: Automated infrastructure provisioning and management with Terraform, creating reusable modules and managing state with Terraform Cloud & AWS S3.
  • Continuous Monitoring: Configured monitoring solutions using Prometheus, Grafana, & the ELK stack to provide clients with actionable insights & real-time alerts.
  • Implemented SLI-based observability metrics for CPU utilization, request latency, and error rates using Prometheus and Grafana dashboards.
  • Defined SLOs and SLAs in collaboration with development and operations teams to ensure measurable service reliability goals.
  • Monitored error budgets to inform change management policies, reducing post-release incidents by 25%.
Kubernetes, AWS, Terraform, Jenkins, GitLab CI, Prometheus, +4 more

Techvibes International Limited

DevOps/SRE Engineer

Jan 2021 - Jan 2023 · 2 yrs

  • As a DevOps Engineer, I enhance system performance, reliability, and automation. My expertise in cloud infrastructure, Kubernetes, and Linux administration delivers high availability and operational efficiency.
  • I set up automated monitoring and failover systems using Prometheus and Grafana, minimizing downtime. Using Terraform for Infrastructure as Code (IaC), I streamline AWS resource management.
  • In AWS, I configure server clusters, load balancers, and Auto Scaling policies to ensure optimal performance. My Kubernetes skills ensure application scalability and reliability, while Docker enables seamless container deployments.
  • I optimize Linux servers, automate tasks, and troubleshoot network issues for smooth operations. I enhance CI/CD pipelines with Jenkins, GitLab CI, and Kubernetes, driving continuous improvement.
  • Implementing Istio service mesh improves microservices communication and security. I manage AWS IAM, ensuring secure access and automating SSL/TLS management. Python scripts on AWS Lambda and Ansible automation enhance performance and consistency.
  • Ensuring optimal network performance involves monitoring CPU and memory usage and resolving connectivity issues.
  • Collaboration on GitHub streamlines workflows and version control. I stay updated with CI/CD technologies, introducing tools like Terraform and Kubernetes to drive improvements.
  • Regular Linux backups ensure data integrity and quick recovery. Security testing and compliance monitoring are integrated into the CI/CD pipeline.
  • Implemented SLI-based observability metrics for CPU utilization, request latency, and error rates using Prometheus and Grafana dashboards.
  • Defined SLOs and SLAs in collaboration with development and operations teams to ensure measurable service reliability goals.
  • Monitored error budgets to inform change management policies, reducing post-release incidents by 25%.
Terraform, Kubernetes, Jenkins, GitLab CI, Prometheus, Grafana, +4 more

Malashe Consultant Ltd

AWS DevOps Engineer

Jan 2017 - Jan 2020 · 3 yrs

  • Designed and implemented AWS-based infrastructure and environments, ensuring high availability, fault tolerance, and scalability.
  • Utilized AWS services such as EC2 and CloudFormation to build and manage infrastructure as code.
  • Implemented Jenkins pipelines for automated CI/CD deployment of applications to servers and containers.
  • Configured Jenkins jobs to trigger Maven builds for artifact creation and version management.
  • Automated application deployment and scaling processes using AWS CodePipeline and AWS CodeDeploy.
  • Integrated Jenkins with version control systems (e.g., Git) to automatically trigger builds on code commits.
  • Utilized Jenkins plugins to deploy artifacts to different servers using SSH connections.
  • Worked closely with development teams to optimize application performance and monitor application health using AWS CloudWatch.
  • Developed custom Jenkins pipeline scripts for specific deployment requirements and integrations.
  • Utilized Ansible for infrastructure configuration management and application deployment to different servers.
  • Created Ansible playbooks for automating the deployment of applications, ensuring consistency across servers.
  • Managed server configurations and deployments using Ansible roles and templates.
  • Automated the provisioning of AWS resources using Ansible and AWS CloudFormation.
  • Collaborated with cross-functional teams to ensure smooth deployment using Ansible.
AWS, Jenkins, Ansible, Terraform, DevOps Engineering

Education

Dublin Business School

Master's degree — Artificial Intelligence

ATB University

Bachelor of Technology - BTech — Electrical Engineering

Dublin Business School

Postgraduate Degree — Data Analytics
