Mohd Wasim A.

DevOps Engineer

Singapore, Singapore · 9 yrs 8 mos experience
Most Likely To Switch · Highly Stable

Key Highlights

  • Expert in building and operating data platforms in regulated environments.
  • Strong focus on reliability, scalability, and automation.
  • Hands-on experience with Kafka, Spark, and Hadoop ecosystems.
Stackforce AI infers this person is a Fintech Infrastructure Engineer specializing in data platforms and SRE practices.

Skills

Core Skills

Platform Engineering · Site Reliability Engineering (SRE)

Other Skills

Monitoring & Alerting · Terraform · Kubernetes · OpenShift · Kafka · Spark · Hadoop · Ansible · Prometheus · Grafana · YARN · HDFS · Amazon Web Services (AWS) · Hadoop Ecosystem · Docker Products

About

Senior Platform Engineer / SRE with strong experience building and operating data and streaming platforms in banking and regulated environments. I specialize in reliability, scalability, and automation across Kubernetes-based platforms running on both on-prem and cloud infrastructure. I have hands-on experience operating Kafka, Spark, and Hadoop ecosystems on Kubernetes / OpenShift, using Terraform and Ansible for infrastructure automation and Prometheus / Grafana for observability. I work closely with development, security, and operations teams to improve platform stability, reduce operational toil, and support business-critical data workloads.

What I bring:

  • Platform engineering mindset with SRE best practices (SLOs, incident response, postmortems)
  • Strong ownership of production systems in high-availability environments
  • Experience across AWS, bare-metal, and hybrid infrastructure

Currently open to Senior Platform Engineer / SRE (Data Platform) roles in Europe, the US and Canada, the Middle East, and Australia and New Zealand, as well as remote roles.

Experience

Total Experience: 9 yrs 8 mos
Average Tenure: 3 yrs 2 mos
Current Experience: 4 yrs

Crédit Agricole CIB

Senior Platform Engineer / SRE (Data Platforms)

May 2022 - Present · 4 yrs · Singapore · Hybrid

  • Designed, built, and operated enterprise-scale data and streaming platforms in a regulated banking environment, supporting business-critical analytics and real-time workloads.
  • Owned Kubernetes and OpenShift platforms running on hybrid infrastructure (AWS + bare metal), including cluster lifecycle management, upgrades, capacity planning, and resilience improvements.
  • Engineered and operated Kafka, Spark, and Hadoop ecosystems as shared platforms, ensuring high availability, performance, and operational stability.
  • Implemented Infrastructure as Code using Terraform and configuration automation via Ansible, significantly reducing manual changes and improving consistency across environments.
  • Established observability standards using Prometheus and Grafana, improving alert quality, reducing noise, and enabling faster incident detection and recovery.
  • Applied SRE best practices including incident response, root-cause analysis, and postmortems to continuously improve platform reliability and reduce operational toil.
  • Worked closely with application, security, and compliance teams to deliver secure, scalable platforms aligned with banking risk and regulatory requirements.
  • Automated recurring operational tasks using Python and shell scripting, improving efficiency and freeing engineering time for higher-value platform work.
  • Participated in on-call rotations for platform reliability, handling production incidents and driving long-term fixes.
Monitoring & Alerting · Terraform · Kubernetes · OpenShift · Kafka · Spark · +3

Cloudera

Senior Customer Operations Engineer (Data Platforms)

Jan 2022 - May 2022 · 4 mos · Bengaluru, Karnataka, India

  • Worked as part of Cloudera’s production engineering and operations team, supporting enterprise-scale data platforms used by large customers in regulated and mission-critical environments.
  • Diagnosed and resolved complex production issues across Kafka, Spark, YARN, and HDFS, focusing on platform stability, performance, and data pipeline reliability.
  • Performed deep root-cause analysis for distributed system failures, including dependency issues, resource contention, and cluster-level misconfigurations.
  • Supported the deployment, upgrade, and secure operation of Cloudera platforms (CDH / CDP) across on-prem and cloud environments.
  • Collaborated closely with engineering, security, and customer platform teams to drive permanent fixes rather than temporary workarounds.
  • Gained strong exposure to enterprise production architectures, security integrations (Kerberos, Ranger, LDAP), and large-scale Hadoop ecosystem operations.
Kafka · Spark · YARN · HDFS · Site Reliability Engineering (SRE)

Confidential

3 roles

Senior Platform Engineer (Data Platform)

Promoted

Feb 2021 - Dec 2021 · 10 mos

  • Led the design, deployment, and operation of enterprise data platforms supporting batch and streaming analytics workloads.
  • Owned production Hadoop-based platforms, including capacity planning, high availability, and reliability improvements across NameNode and ResourceManager components.
  • Designed data platform architectures aligned with business and digital transformation requirements.
  • Implemented automation scripts and workflows to reduce manual operational effort and improve platform stability.
  • Worked closely with clients and internal teams to plan platform rollouts, upgrades, and operational readiness.
  • Applied incident and problem management practices to identify root causes and drive long-term platform improvements rather than short-term fixes.
  • Ensured security, availability, and performance of data platforms in collaboration with infrastructure and governance stakeholders.
Amazon Web Services (AWS) · Hadoop Ecosystem · Platform Engineering

Platform Engineer (Data Platform)

Promoted

Mar 2018 - Feb 2021 · 2 yrs 11 mos

  • Built and operated large-scale distributed data platforms using Hadoop, Spark, Kafka, and related ecosystem components.
  • Played a key role in production cluster design, deployment, and lifecycle management, including upgrades and performance tuning.
  • Implemented high availability and disaster recovery strategies to improve platform resilience and reduce single points of failure.
  • Automated operational workflows for data movement, backup, and recovery, improving reliability of production pipelines.
  • Supported streaming and batch processing platforms, ensuring stable execution of business-critical data workloads.
  • Collaborated with solution architects and senior engineers on platform architecture decisions and scalability planning.
  • Gained deep hands-on experience with distributed systems, Linux-based infrastructure, and production troubleshooting.
Amazon Web Services (AWS) · Hadoop Ecosystem · Platform Engineering

Associate

Aug 2016 - Mar 2018 · 1 yr 7 mos

  • Supported monitoring and operational stability of Hadoop-based data platforms under the guidance of senior platform engineers.
  • Assisted in deploying proof-of-concept data clusters, gaining early exposure to production-grade distributed systems.
  • Worked with cloud infrastructure components (EC2, VPC, S3, IAM) to support data platform environments.
  • Learned incident analysis, prioritisation, and escalation workflows for production systems.
  • Built strong fundamentals in Linux, distributed systems, and data platform operations, forming the base for future platform engineering roles.
Amazon Web Services (AWS) · Hadoop Ecosystem

Education

RKDF University

Bachelor of Engineering - BE — Computer Science
