AMITABH PANDEY

DevOps Engineer

Singapore, Singapore · 21 yrs 10 mos experience
AI ML Practitioner · AI Enabled

Key Highlights

  • 22 years of experience in HPC and AI.
  • Expert in managing large-scale AI supercomputing environments.
  • Certified in multiple advanced cloud and AI technologies.
Stackforce AI infers this person is a highly skilled HPC and AI infrastructure engineer with extensive experience in supercomputing.

Skills

Core Skills

AI Supercomputing, HPC, Supercomputing, Linux Administration, SysOps, Unix Administration

Other Skills

AI Infrastructure and Operations Fundamentals, Amazon Web Services (AWS), Ansible, Apache Mesos, Cloud Computing IaaS, Direct-To-Chip Liquid Cooling, Disaster Recovery, Docker Containers Administration, Docker Swarm, FreeBSD, GPFS, GPU Computing, HPC Grid Computing, HPE Cray EX Programming and Optimization, HPE Cray EX System Administration

About

Passionate about everything Linux, low-latency networking, automation, HPC, AI/GPU supercomputing, and grid computing.

⏩ Seasoned AI/HPC Linux Systems Engineer with 22 years' experience (11 in banking/finance), covering systems engineering from initial planning through build and operations across a large-scale production AI factory, private and public cloud, and HPC Linux infrastructure.

⏩ AI Factory, AI Supercomputing, HPC Grid: 11 years of experience designing, building, and operating enterprise-scale HPC clusters and AI factories, including NVIDIA DGX and HGX (H100, H200, A100, L40S) and HPE ExaScale supercomputers.

✅ Experience in HPC Spectrum Symphony & LSF Grid (finance, J.P. Morgan, Singapore) and 3 supercomputers:
  • ASPIRE1 (Fujitsu): 30k+ cores; 1,288 nodes (1,160 CPU, 128 K40 GPU; 2 sockets, 12 cores); 13 PB storage; EDR fat tree
  • ASPIRE2A (HPE ExaScale): 100k+ cores (CPU/GPU nodes); 35 PB storage; HPE Slingshot, Dragonfly; 352 A100 GPUs
  • ASPIRE2A+ (NVIDIA SuperPOD): 40 DGX × 8 H100 GPUs per DGX (320 GPUs); 112 cores (56 cores/socket); 2.5 PB NVMe; 27 PB HDD

⏩ Accomplishments/Certifications
  • ✅ Rafay Certified GPU Cloud Professional
  • ✅ NVIDIA DGX H100 SuperPOD Administration
  • ✅ AI Infrastructure and Operations Fundamentals
  • ✅ NVIDIA AI Enterprise Administration (v3.0) | Base Command Manager Administration
  • ✅ CKA | CKAD: Certified Kubernetes Administrator | Application Developer
  • ✅ HashiCorp Certified: Terraform Associate
  • ✅ AWS Certified: Developer - Associate | DevOps Engineer - Professional | SysOps Administrator - Associate | Solutions Architect - Associate | Solutions Architect - Professional
  • ✅ NVIDIA - Certified in “Introduction to AI in the Data Center”
  • ✅ NVIDIA - “HPC with Containers - Singularity & Docker”
  • ✅ University of Edinburgh - Certificate in “Supercomputing”
  • ✅ Dublin City University - Certificate in “HPC in the cloud”
  • ✅ Certified in “Bright 8.0 Basic Administration for Customers”

⏩ Appreciations/Awards
  • ✔️ Appreciation from the Program Manager, Genome Institute, Singapore.
  • ✔️ Appreciation from the Country Head, Fujitsu Singapore, for managing supercomputing operations.
  • ✔️ Appreciation letter from J.P. Morgan Co-CEO Bill Winters (current CEO of Standard Chartered Bank).
  • ✔️ “Excel Leadership award” at J.P. Morgan.

Experience

Firmus Technologies

HPC Engineer - AI

Sep 2024 - Present · 1 yr 6 mos · Singapore · On-site

  • ✔️ Sustainable AI Factory
  • ✔️ Immersion Cooled AI Computing
  • ✔️ NVIDIA HGX AI Supercomputing
  • ✔️ GPU / AI Supercomputing
  • ✔️ Edge Computing
  • ✔️ NVIDIA H100, NVIDIA H200, NVIDIA L40S Accelerators
  • ✔️ Weka storage administration (Weka NeuralMesh)
  • ✔️ WekaFS S3 Protocol
  • ✔️ Kubernetes administration
  • ✔️ Slurm Administration
  • ✔️ Bright Cluster Manager Administration
  • ✔️ Warewulf Administration
  • ✔️ Rafay Certified GPU Cloud Professional
Immersion Cooled AI Computing, GPU Computing, Kubernetes, NVIDIA H100, NVIDIA A100, AI Supercomputing

Fujitsu Asia Pacific

3 roles

⏩ ASPIRE2A Plus AI Supercomputer - Senior HPC Lead @ NSCC

May 2024 - Sep 2024 · 4 mos · Singapore · On-site

  • 👉 National Supercomputing Centre, Singapore
  • ASPIRE2A Plus - AI Supercomputer Powered by NVIDIA DGX SuperPOD With DGX H100
  • NVIDIA AI DGX SuperPOD Linux Infrastructure
  • ✔️ NVIDIA DGX H100 SuperPOD Administration
  • ✔️ AI Infrastructure and Operations Fundamentals
  • ✔️ NVIDIA AI Enterprise Administration (v3.0)
  • ✔️ NVIDIA Base Command Manager
  • ✔️ DDN ExaScaler (Lustre)
  • ✔️ NVIDIA InfiniBand
  • ✔️ NVIDIA Spectrum switches (NVIDIA Cumulus)
  • ✔️ Zabbix Monitoring
NVIDIA DGX H100 SuperPOD Administration, AI Infrastructure and Operations Fundamentals, NVIDIA Base Command Manager, AI Supercomputing, HPC

⏩ ASPIRE2A Supercomputer - Senior HPC Engineer @ National Supercomputing Centre, SG

Sep 2022 - May 2024 · 1 yr 8 mos · Singapore · On-site

  • 👉 National Supercomputing Centre, Singapore
  • ✔️ Managing Singapore's 1st & 2nd supercomputers:
  • ✔️ ASPIRE-1 & ASPIRE2A Supercomputer (Powered by HPE Cray ExaScale supercomputer)
  • ✔️ HPE Cray ClusterStor E1000 System Administration (Lustre)
  • ✔️ HPE Parallel File System Storage (GPFS)
  • ✔️ HPE Cray EX System Administration with HPE PCM
  • ✔️ HPE Cray EX Programming and Optimization
  • ✔️ Checkmk
HPE Cray EX Programming and Optimization, HPE Cray EX System Administration, HPE Slingshot Fabric, Supercomputing, HPC

⏩ [ASPIRE1 Supercomputer] Senior HPC Engineer @ National Supercomputing Centre, SG

Jun 2017 - Jul 2023 · 6 yrs 1 mo · Singapore · On-site

  • 👉 National Supercomputing Centre, Singapore
  • ✔️ Managing Singapore's 1st Supercomputer - ASPIRE-1
  • (Fujitsu Asia Pte. Ltd - FAPL)
  • CKAD: Certified Kubernetes Application Developer
  • CKA: Certified Kubernetes Administrator
  • HashiCorp Certified: Terraform Associate
  • AWS Certified Developer - Associate
  • AWS Certified DevOps Engineer - Professional
  • AWS Certified Solutions Architect - Professional
  • AWS Certified SysOps Administrator - Associate
  • AWS Certified Solutions Architect - Associate
  • Certified in “Introduction to AI in the Data Center” by NVIDIA
  • Completed “High-Performance Computing with Containers - Singularity & Docker” by NVIDIA
  • Certified “Microsoft Technology Associate (MTA): Introduction to Programming using Python”
  • NVIDIA DGX-1 Deep Learning AI Supercomputer: deployment, administration & integration with PBS Professional.
  • [Certificate in Supercomputing] - The University of Edinburgh & Partnership for Advanced Computing in Europe (PRACE)
  • [Certificate in High Performance Computing in the Cloud] - Dublin City University and Cloud Lightning.
High Performance Computing (HPC), Supercomputing, IBM Spectrum Symphony Grid, HPC

JPMorganChase

Senior Associate [HPC (Grid) Linux Administrator]/Application Support

Jan 2015 - Jun 2017 · 2 yrs 5 mos · Singapore · On-site

  • Part of the CBB (Compute Backbone/Big Data Hadoop) Support & Operations team; supported Symphony Grid infrastructure globally.
  • Provided 24/7 on-call and out-of-hours L2/L3 support, weekdays and weekends, for mission-critical Symphony Grid computing infrastructure.
  • Supported the internal on-premises cloud platform as a service (PaaS) for (HPC) IBM Symphony/LSF Grid infrastructure.
  • Responsible for end-to-end infrastructure support, implementation & administration of IBM Symphony/LSF Grid infrastructure.
  • Work comprised Linux, GPFS storage server, Platform Symphony/LSF & Hadoop administration.
  • Followed a plan, build & operate lifecycle support model; provisioned GPFS/GSS and Symphony grid clusters.
  • Supported a grid of clustered Linux servers (IBM/Lenovo/RHEL) running GPFS/GSS/xCAT/Platform Symphony.
  • BAU: HPC cluster health checks; fixed problematic nodes and GPFS/payload/GPU/CPU driver issues.
  • Cloudera Hadoop administration: installed/configured Cloudera/Hortonworks Hadoop cluster builds; commissioned/decommissioned nodes.
  • Hadoop ecosystem: MRv2, YARN, HDFS, Hue, Spark, Flume, Sqoop, HBase, ZooKeeper, Hive, Pig, Cloudera Impala, Ambari, JVM.
  • HDFS/ZooKeeper/YARN, Kerberos, adding/removing Hadoop services, role assignments, cluster rebalancing, directory snapshots (HDFS).
  • OS provisioning: xCAT/Cobbler/Kickstart/Ansible/Puppet/CFEngine. Linux/GPFS troubleshooting.
  • IBM Symphony Grid concepts: SD/SSM/SIM, SOA, EGO, scheduling, reporting, consumer, applications, resource group, plan.
  • Applied RHEL OS and GPFS/GSS updates/patches across the Symphony grid.
  • Platform Symphony: installed/configured master/failover/compute hosts; application deployment.
  • egosh: close/reclaim/stop/start/freeze/migrate EGO services/hosts, user/host services, resource management; fixed blocked hosts.
  • Symphony grid troubleshooting, reporting & resilience; rejoined compute nodes to the grid after maintenance.
  • SOA workload management: soamcontrol/soamview/soamapp/soamdeploy/soamregister; view/control workload; deploy/roll back packages.
High Performance Computing (HPC), Supercomputing, IBM Spectrum LSF Grid, HPC

Markit

Assistant Vice President - Linux / APAC lead (VMware Infrastructure)

Jun 2012 - Dec 2014 · 2 yrs 6 mos · Singapore

  • Markit is a leading global diversified provider of financial information services. We provide products that enhance transparency, reduce risk and improve operational efficiency.
  • Part of the Global SysOps team; provided production/project/app infrastructure support to Markit’s critical trading environment.
  • Infrastructure consisted of highly available, (F5) load-balanced, replicated, clustered servers (HP ProLiant BL/DL, C-class) running Oracle/Apache/MQ/Splunk/JBoss on RHEL/Solaris/vSphere/cloud/NAS/SAN spread across NA/EMEA.
  • Worked in a SysOps/DevOps environment; coordinated with dev teams for releases of in-house/third-party products.
  • Supported RHEL/Unix servers running Apache/JBoss/Red Hat Cluster in Prod/UAT/QA/Test/demo environments.
  • Coordinated multiple teams to deploy builds to the various environments, including Prod/UAT/QA/Test/demo/Dev.
  • Provided up to 3rd-level technical support (“follow the sun” model), including after-hours, on-call & weekend (24/7) support.
  • Supported and administered VMware/NetApp/RHEL (4/5/6)/Red Hat Cluster/Satellite, Puppet, VCS, VMware View.
  • Handled server/application escalations & resolved Jira incidents/changes escalated by lower levels of support.
  • Configured and troubleshot RHCS, Memcached, Postfix, NFS, FTP, iptables, JBoss, and Puppet issues.
  • Supported weekend Markit product releases, applied product patches, data center migration/power down, DR/BCP.
  • Participated in US/UK disaster recovery (DR), failover, and failback activities in a NetApp SnapMirror/VVR environment.
  • Linux patching/updates: upgraded kernels and applied OS patches/updates on RHEL 4/5/6 servers using PSSH (Parallel SSH).
  • Implemented Linux security, hardening, and monitoring; resolved performance/kernel/server-related issues on RHEL/vSphere.
High Performance Computing (HPC), Supercomputing, IBM Spectrum Symphony Grid, HPC

Deutsche Bank AG

Senior Unix/Solaris/Linux Administrator

Jun 2011 - Jun 2012 · 1 yr · Singapore

  • Worked in the Global Unix Team, responsible for day-to-day server administration: incidents, changes, service requests, etc.
  • Rebuilt/reconfigured Linux/Unix servers using PXE boot/Kickstart/JumpStart on Oracle/HP hardware.
  • Supported 15,000+ servers (Solaris, SUSE Linux, AIX), including physical and virtual infrastructure.
  • Heterogeneous server environment (HP, Oracle, Fujitsu & Sun) across multiple models, with virtualization technologies ranging from VMware to Solaris Zones.
  • Day-to-day server administration including Veritas products such as VCS/VxVM and Oracle/EMC SAN technologies.
  • Applied certified OS patches on SUSE Linux/Solaris servers & installed/upgraded the SAN stack (VCS/VxVM/SYMCLI/HBA firmware/lpfc/Symmetrix/PowerPath).
  • Participated in US/UK disaster recovery (DR), failover, and failback activities in a SAN/SRDF/VCS environment.
  • Backed out applied OS updates on SUSE/RHEL Linux using LVM snapshots/ReaR in case of application issues.
Linux, VMware, Oracle, Linux Administration, SysOps

JPMorgan Chase & Co.

Unix/Linux Administrator (Team Lead)

Nov 2006 - Jun 2011 · 4 yrs 7 mos · Bengaluru Area, India

  • Worked for large data centers spanning NA/APAC/EMEA.
  • Supported all Solaris (8/9/10)/RHEL-2/3/4/5 servers running Keon/MQ/Oracle on Sun/IBM/HP hardware.
  • Supported virtual infrastructure running on Solaris Zones/Containers, LDoms/VMware ESX/Linux KVM.
  • Worked effectively as part of geographically dispersed, culturally diverse virtual teams in NA/APAC/EMEA.
  • Background in 24x7 mission-critical & on-call environments, full change control process, and systems monitoring.
  • Installed, upgraded, patched and configured Sun Solaris 8/9/10 on Sun servers & RHEL on HP/IBM hardware.
  • Completed Solaris/Linux server reinstall/rebuild/reconfigure using JumpStart/Kickstart/Cobbler.
  • Extensive experience in setup, configuration, upgrade, maintenance & troubleshooting on different UNIX flavors.
  • Experienced in volume/disk management and software RAID solutions using VxVM/LVM/SVM/ZFS.
  • Coordinated with the SAN team for storage allocation & extended file systems using VxVM/LVM/ZFS.
  • Supported high availability/SRDF, business persistence, and fail-over/fail-back using VCS in a SAN environment.
  • Able to handle a significant workload managing multiple projects in a matrix process environment.
Unix, Solaris, Linux, Unix Administration, Linux Administration

Mercantila Software Services Pvt. Ltd.

Unix/Linux Systems Administrator (Linux/FreeBSD/OpenBSD)

Jan 2006 - Oct 2006 · 9 mos · Bengaluru Area, India

  • Remotely managed Red Hat Linux/Fedora Core/FreeBSD/OpenBSD production servers.
  • Performed Linux backup and recovery and provided remote technical support to the US office.
  • Communicated with US vendors & onsite engineers via conference calls & email.
  • Installed UNIX software, maintained appropriate maintenance levels and tuning.
  • Led technical projects from start to finish while working within project management guidelines.
  • Worked on the ImageStream Linux router/firewall appliance and took care of VPN connectivity.
Linux, Solaris, Linux Administration, Unix Administration

Microland Ltd

Unix/Linux - Technical Support Engineer

Aug 2004 - Dec 2005 · 1 yr 4 mos · Bengaluru Area, India

  • Configured & managed RHEL production boxes for GE worldwide network devices using OpenNMS.
  • Customized scripts for performance tuning and process automation in core Java & shell script.
  • Coordinated with US/UK vendors & clients via conference calls & email, and performed backups.
  • Remotely monitored & managed RHEL servers and firewalls using Snort, Big Brother/Nagios.
  • Provided technical support for 10 major customers (India/US/UK) and worked on RHEL servers.
Linux, FreeBSD, Linux Administration

Excel Internet Pvt. Ltd.

Unix/Linux - Technical Support Engineer

Mar 2004 - Aug 2004 · 5 mos · New Delhi Area, India

  • Provided online technical support on Linux servers to international clients via email.
  • Performed web hosting and troubleshooting on Linux servers using cPanel/Ensim/Plesk.
  • Performed R&D on new projects per client requirements and handled Linux server backups.
  • Developed scripts for systems monitoring and user log management.
  • Performed virtual hosting on servers running Linux/FreeBSD.
  • Solved day-to-day Linux issues with mail servers (Exim/Sendmail) and web servers (Apache/Tomcat).
Linux, Linux Administration

Acumen Software

Trainee Software Developer

Dec 2003 - Feb 2004 · 2 mos · New Delhi Area, India

  • Performed website integration, testing, management and Linux administration.
  • Performed web hosting using the Plesk control panel.

Education

The University of Edinburgh

Certificate in Supercomputing

Jan 2018 - Jan 2018

Dublin City University

Certificate in High Performance Computing in the Cloud

Jan 2018 - Jan 2018

Guru Ghasidas University

Master of Computer Applications - MCA

Jan 2000 - Jan 2003

Dr. Harisingh Gour University (Sagar University)

Bachelor of Science - BS, Science [Physics | Chemistry | Mathematics]

Jan 1996 - Jan 1999
