Mohd Wasim A.

DevOps Engineer

Singapore, Singapore · 9 yrs 8 mos experience

Most Likely To Switch · Highly Stable

Key Highlights

Expert in building and operating data platforms in regulated environments.
Strong focus on reliability, scalability, and automation.
Hands-on experience with Kafka, Spark, and Hadoop ecosystems.

Stackforce AI infers this person is a Fintech Infrastructure Engineer specializing in data platforms and SRE practices.

Skills

Core Skills

Platform Engineering · Site Reliability Engineering (SRE)

Other Skills

Monitoring & Alerting · Terraform · Kubernetes · OpenShift · Kafka · Spark · Hadoop · Ansible · Prometheus · Grafana · YARN · HDFS · Amazon Web Services (AWS) · Hadoop Ecosystem · Docker Products

About

Senior Platform Engineer / SRE with strong experience building and operating data and streaming platforms in banking and regulated environments. I specialize in reliability, scalability, and automation across Kubernetes-based platforms running on both on-prem and cloud infrastructure.

I have hands-on experience operating Kafka, Spark, and Hadoop ecosystems on Kubernetes / OpenShift, using Terraform and Ansible for infrastructure automation and Prometheus / Grafana for observability. I work closely with development, security, and operations teams to improve platform stability, reduce operational toil, and support business-critical data workloads.

What I bring:
• Platform engineering mindset with SRE best practices (SLOs, incident response, postmortems)
• Strong ownership of production systems in high-availability environments
• Experience across AWS, bare-metal, and hybrid infrastructure

Currently open to Senior Platform Engineer / SRE (Data Platform) roles in Europe, the US and Canada, the Middle East, Australia and New Zealand, as well as remote roles.

Experience

Total Experience: 9 yrs 8 mos

Average Tenure: 3 yrs 2 mos

Current Experience: 4 yrs

Crédit Agricole CIB

Senior Platform Engineer / SRE (Data Platforms)

May 2022 – Present · 4 yrs · Singapore · Hybrid

Designed, built, and operated enterprise-scale data and streaming platforms in a regulated banking environment, supporting business-critical analytics and real-time workloads.
Owned Kubernetes and OpenShift platforms running on hybrid infrastructure (AWS + bare metal), including cluster lifecycle management, upgrades, capacity planning, and resilience improvements.
Engineered and operated Kafka, Spark, and Hadoop ecosystems as shared platforms, ensuring high availability, performance, and operational stability.
Implemented Infrastructure as Code using Terraform and configuration automation via Ansible, significantly reducing manual changes and improving consistency across environments.
Established observability standards using Prometheus and Grafana, improving alert quality, reducing noise, and enabling faster incident detection and recovery.
Applied SRE best practices including incident response, root-cause analysis, and postmortems to continuously improve platform reliability and reduce operational toil.
Worked closely with application, security, and compliance teams to deliver secure, scalable platforms aligned with banking risk and regulatory requirements.
Automated recurring operational tasks using Python and shell scripting, improving efficiency and freeing engineering time for higher-value platform work.
Participated in on-call rotations for platform reliability, handling production incidents and driving long-term fixes.

Monitoring & Alerting · Terraform · Kubernetes · OpenShift · Kafka · Spark · +3

Cloudera

Senior Customer Operations Engineer (Data Platforms)

Jan 2022 – May 2022 · 4 mos · Bengaluru, Karnataka, India

Worked as part of Cloudera’s production engineering and operations team, supporting enterprise-scale data platforms used by large customers in regulated and mission-critical environments.
Diagnosed and resolved complex production issues across Kafka, Spark, YARN, and HDFS, focusing on platform stability, performance, and data pipeline reliability.
Performed deep root-cause analysis for distributed system failures, including dependency issues, resource contention, and cluster-level misconfigurations.
Supported the deployment, upgrade, and secure operation of Cloudera platforms (CDH / CDP) across on-prem and cloud environments.
Collaborated closely with engineering, security, and customer platform teams to drive permanent fixes rather than temporary workarounds.
Gained strong exposure to enterprise production architectures, security integrations (Kerberos, Ranger, LDAP), and large-scale Hadoop ecosystem operations.

Kafka · Spark · YARN · HDFS · Site Reliability Engineering (SRE)

Confidential

3 roles

Senior Platform Engineer (Data Platform)

Promoted

Feb 2021 – Dec 2021 · 10 mos

Led the design, deployment, and operation of enterprise data platforms supporting batch and streaming analytics workloads.
Owned production Hadoop-based platforms, including capacity planning, high availability, and reliability improvements across NameNode and ResourceManager components.
Designed data platform architectures aligned with business and digital transformation requirements.
Implemented automation scripts and workflows to reduce manual operational effort and improve platform stability.
Worked closely with clients and internal teams to plan platform rollouts, upgrades, and operational readiness.
Applied incident and problem management practices to identify root causes and drive long-term platform improvements rather than short-term fixes.
Ensured security, availability, and performance of data platforms in collaboration with infrastructure and governance stakeholders.

Amazon Web Services (AWS) · Hadoop Ecosystem · Platform Engineering

Platform Engineer (Data Platform)

Promoted

Mar 2018 – Feb 2021 · 2 yrs 11 mos

Built and operated large-scale distributed data platforms using Hadoop, Spark, Kafka, and related ecosystem components.
Played a key role in production cluster design, deployment, and lifecycle management, including upgrades and performance tuning.
Implemented high availability and disaster recovery strategies to improve platform resilience and reduce single points of failure.
Automated operational workflows for data movement, backup, and recovery, improving reliability of production pipelines.
Supported streaming and batch processing platforms, ensuring stable execution of business-critical data workloads.
Collaborated with solution architects and senior engineers on platform architecture decisions and scalability planning.
Gained deep hands-on experience with distributed systems, Linux-based infrastructure, and production troubleshooting.

Amazon Web Services (AWS) · Hadoop Ecosystem · Platform Engineering

Associate

Aug 2016 – Mar 2018 · 1 yr 7 mos

Supported monitoring and operational stability of Hadoop-based data platforms under the guidance of senior platform engineers.
Assisted in deploying proof-of-concept data clusters, gaining early exposure to production-grade distributed systems.
Worked with cloud infrastructure components (EC2, VPC, S3, IAM) to support data platform environments.
Learned incident analysis, prioritisation, and escalation workflows for production systems.
Built strong fundamentals in Linux, distributed systems, and data platform operations, forming the base for future platform engineering roles.

Amazon Web Services (AWS) · Hadoop Ecosystem