Usman M.

CEO

Washington, DC, United States · 14 yrs 8 mos experience
Most Likely To Switch

Key Highlights

  • Increased HPC cluster capacity from 98% to 400%
  • Reduced deployment time from hours to minutes
  • Delivered 99.9%+ uptime across multi-thousand node environments
Stackforce AI infers this person is a Senior Infrastructure Engineer specializing in AI and HPC workloads for enterprise and public sector.

Skills

Core Skills

Infrastructure Engineering, Cloud Architecture, HPC Engineering, Infrastructure Management, DevOps Engineering, Cloud Infrastructure, Systems Engineering, Infrastructure Modernization

Other Skills

Linux, AWS, Kubernetes, Docker, Ansible, Terraform, CI/CD, PostgreSQL, Elasticsearch, HPC schedulers, observability platforms, Automation, HPC, Storage Solutions, IBM LSF

About

Senior Infrastructure and Platform Engineer with 12+ years designing, building, and operating large-scale Linux and cloud environments for enterprise, research, telecom, and federal systems. Specialize in compute-intensive platforms, AI and HPC workloads, distributed systems, and automation-driven infrastructure across hybrid and cloud environments. Strong background delivering reliable, high-throughput platforms supporting mission-critical services and large data processing. Key strengths include Kubernetes platforms, GPU compute environments, batch scheduling, parallel storage, cloud architecture, and infrastructure as code.

Selected impact:

  • Increased HPC cluster capacity from 98% to 400% through automated resource optimization
  • Reduced deployment time from hours to minutes via infrastructure automation
  • Delivered 99.9%+ uptime across multi-thousand node environments
  • Migrated large data systems with zero downtime and high data integrity
  • Built secure, hardened platforms for enterprise and federal workloads

Extensive experience across AWS, Linux at scale, container orchestration, observability platforms, and distributed services. Open to remote senior roles and contract engagements focused on AI infrastructure, platform engineering, SRE, HPC, or large-scale cloud systems.

Experience

Total Experience: 14 yrs 8 mos
Average Tenure: 5 yrs 3 mos
Current Experience: 14 yrs 8 mos

Cadence Design Systems

Senior HPC / Linux Systems Engineer

Jan 2020 - Jul 2022 · 2 yrs 6 mos · California, United States

  • Operated large-scale HPC infrastructure supporting semiconductor design and engineering simulations.
  • Managed IBM LSF clusters across ~1700 servers with 99.8% uptime
  • Increased effective compute capacity to 400% via automated resource optimization
  • Implemented containerized workloads to reduce resource conflicts
  • Automated provisioning using Kickstart and configuration management tools
  • Deployed and optimized GPU compute environments
  • Maintained core enterprise Linux services across heterogeneous systems
IBM LSF, Linux, GPU compute environments, Containerization, Automation, HPC Engineering (+1 more)

Etisalat

Senior DevOps Engineer

Mar 2018 - Dec 2019 · 1 yr 9 mos · Downtown, Dubai

  • Built cloud platforms and automation systems for large telecom services supporting hundreds of thousands of users.
  • Designed CI/CD pipelines for enterprise application platforms
  • Architected multi-AZ AWS environments with automated failover
  • Managed containerized services and autoscaling infrastructure
  • Implemented monitoring, alerting, and operational automation
  • Automated configuration across large Linux server fleets
CI/CD, AWS, Containerization, Monitoring, Automation, DevOps Engineering (+1 more)

Al Khaleej International Pvt. School

Systems Engineer / DevOps

Jan 2016 - Feb 2018 · 2 yrs 1 mo · Sharjah, United Arab Emirates

  • Led infrastructure modernization including virtualization, cloud adoption, network redesign, and centralized services for large campus environments.
Virtualization, Cloud adoption, Network redesign, Systems Engineering, Infrastructure Modernization

Kryptohive

Independent Infrastructure & Platform Consultant

Sep 2011 - Present · 14 yrs 8 mos · United States · Remote

  • Architect and operate large-scale Linux, cloud, and HPC environments for enterprise, telecom, research, and public sector clients.
  • Selected engagements:
  • Georgia Institute of Technology:
      • Automated infrastructure provisioning, reducing deployment effort by ~70%
      • Led large Elasticsearch migration with zero downtime
      • Built PostgreSQL high-availability cluster achieving 99.95% uptime
      • Deployed enterprise monitoring platform across distributed environments
  • General Electric:
      • Designed AWS ParallelCluster HPC platform supporting engineering simulations
      • Implemented high-throughput storage using FSx for Lustre
      • Built automated infrastructure using CloudFormation and CI/CD pipelines
      • Delivered GPU-enabled remote visualization environment
  • Public Sector Systems:
      • Delivered hardened infrastructure automation for sensitive environments
      • Implemented monitoring, patching, and configuration management at scale
  • Technologies: Linux, AWS, Kubernetes, Docker, Ansible, Terraform, CI/CD, PostgreSQL, Elasticsearch, HPC schedulers, observability platforms
Linux, AWS, Kubernetes, Docker, Ansible, Terraform (+7 more)

Education

Eastern Mediterranean University

BSEE — Electrical and Electronics Engineering

Jan 2009 - Jan 2014
