Darshan Redij

Product Engineer

Osaka, Osaka, Japan · 17 yrs 9 mos experience
Most Likely To Switch · Highly Stable

Key Highlights

  • Managed HPC clusters exceeding 25000 cores.
  • Supported the world's 4th fastest supercomputer.
  • Led a team for HPC IT infrastructure management.
Stackforce AI infers this person is a High Performance Computing specialist with extensive experience in cloud infrastructure management.

Skills

Core Skills

High Performance Computing · Cloud Computing · System Administration · Storage Management

Other Skills

Backup Management · Backup Solutions · Big Data · Brocade · CentOS · Cloud Computing IaaS · Cloud Storage Administration · Cluster · Data Center · Fibre Channel · GlusterFS · HP · HP Data Protector · HP Ibrix · HPC Cloud Storage

About

I have experience in High Performance Computing and currently support researchers in the biomedical domain. Here I am also responsible for the complete management of a 3000+ core HPC cluster and 1.2+ PB of storage space. At Hewlett Packard Globalsoft Pvt Ltd., I was responsible for documenting the HPC Grid Infrastructure project, as the project was in its initial phase. At TATA Computational Research Laboratories I had the opportunity to support the EKA supercomputing cluster, which was ranked the world's 4th fastest supercomputer in 2007. There I supported various clients from the aviation domain as well as the internal HPC user community. The role also included support for 15000+ CPU cores and around 300 TB of storage space. At Tata Consultancy Services Ltd. I worked as an HPC Specialist supporting the client GE Global Research. My role was to support an HPC IT cloud infrastructure of 25000+ cores, 1.3+ PB of storage space and 2500+ researchers from the Aviation, Oil and Gas, Power and Water, Energy and Transportation domains at GE. This included providing day-to-day HPC operational support and deploying new systems, storage, network, etc. I started my career at CDAC, where I supported an HPC cluster of 5000+ cores and 500+ TB of storage space. The other part of my role was to support the HPC user community. I was part of the National PARAM Supercomputing Facility team. The PARAM Yuva supercomputing system I supported was ranked the 69th fastest in the world at its inception.

Experience

Amazon

Data Center Engineer

Dec 2019 - Present · 6 yrs 3 mos · Japan

University of Southern Denmark

Sr HPC Architect

Nov 2018 - Dec 2019 · 1 yr 1 mo · Denmark

Nanyang Technological University

Sr. HPC Specialist

Nov 2015 - Oct 2018 · 2 yrs 11 mos · Singapore

Hewlett Packard Enterprise

Sr. HPC Grid Administrator

Jul 2015 - Oct 2015 · 3 mos · Bengaluru, Karnataka, India

Tata Consultancy Services Ltd. (transfer of service)

HPC IT Infrastructure Team Lead

Sep 2012 - Jul 2015 · 2 yrs 10 mos · Bengaluru, Karnataka, India

  • Currently working with a US-based research client, managing a Linux HPC cluster cloud capacity of more than 2000 blade servers and its integration with Panasas NAS storage of 1500+ TB capacity, and providing day-to-day operational support to 1500+ researchers in the Aviation, Energy, Oil and Gas, Power and Water, and Transportation domains.
  • Leading an infrastructure team of 10 members.
  • Manage transition from HPC IT Infrastructure Solution implementation to HPC Operational Support.
  • Troubleshooting service problems, escalating issues within the organization as needed.
  • First level of escalation for production Infrastructure issues.
  • Manage production Infrastructure - optimizing availability and high operational “up time”.
  • Configuration, enhancement, testing, integration, implementation and documentation of hardware, operating systems and software using defined processes and procedures.
  • Set technical direction for the team with regard to infrastructure upgrades and enhancements.
  • Reporting project progress and operational performance status to the HPC Global Operations Leader and HPC Global Project Leader during the weekly/pillar meeting.
  • Performing hardware refresh in conjunction with OS, software upgrades, etc.
  • Providing support for compute and storage hardware, operating systems and utilities.
  • Administration of InfiniBand (IB) interconnects (DDR, QDR, FDR).
  • Managing projects to deploy infrastructure to support HPC Customers.
  • Monitoring cloud HPC capacity using Nagios.
  • Coordination with vendors on hardware break/fix for HPC infrastructure.
  • Maintain the compute environment with vendor patches, firmware and OS updates.
  • Storage management – coordinating with the respective vendors to resolve Panasas storage-related issues, and managing mount points and access on the cluster.
  • Panasas storage performance tuning, benchmarking and filesystem monitoring.
  • Adhere to Standard Operating Procedures and defined services delivery processes as provided by client.
HPC IT Infrastructure · Operational Support · Storage Management · Cloud Computing · Linux · High Performance Computing

Tata Computational Research Laboratories

Sr. HPC Administrator at EKA Supercomputing Facility

Aug 2011 - Aug 2012 · 1 yr · Pune/Pimpri-Chinchwad Area

  • 1. EKA 172.6 TF HPC cloud storage/system administration (the 4th fastest supercomputer in the world according to the TOP500 ratings released in November 2007)
      • Administration and management of EKA (1800+ node Linux cluster)
      • Customer lifecycle management – incident, change management
      • User requirement gathering
      • Proof of Concept setup
      • Daily ticket resolution for HPC users
      • Storage and backup management
      • IB fabric management
      • Scheduler management and optimization – LSF, SLURM, Torque, Maui
      • User administration and cluster provisioning
      • Parallel file system management – HP Ibrix, Panasas, Lustre, Gluster
      • Application commissioning and management
      • Preventive maintenance for better uptime
      • Adherence to ISO 27001 and ITIL processes
  • 2. Upgrade of the Lustre filesystem from 1.6.7 to 1.8.3 on production storage in CRL
      • Setup details:
          • Lustre PFS consisting of 10 OSS and MDS/Admin in HA mode
          • 38 OSTs with 2 TB LUN size
          • This PFS serves 1800+ clients simultaneously over the InfiniBand fabric
      • Replicated the original configuration and tested the upgrade
      • Analysis and resolution of problems
      • Data integrity checks and monitoring in early stages
      • Client disconnection issue was eliminated post upgrade
HPC Cloud Storage · System Administration · User Management · Backup Management · High Performance Computing

Centre for Development of Advanced Computing

HPC Administrator at National PARAM Supercomputing Facility

Mar 2008 - Aug 2011 · 3 yrs 5 mos · Pune/Pimpri-Chinchwad Area

  • 1. PARAM Yuva 54 TF HPC cloud storage administration (the 69th fastest supercomputer in the world according to the TOP500 ratings released in November 2008)
      • I had the opportunity to be associated with this project from the commissioning of the system through to its sustenance mode.
  • 2. SAN cloud storage configuration and administration
      • Implementation of an HP Storage Area Network, including configuring and managing SAN switches
      • Management and administration of HP MSA RAID arrays
      • Zoning, LUN creation, etc.
      • Performing firmware upgrades
      • Troubleshooting storage-related performance issues
  • 3. Parallel filesystem and backup solution integration for PARAM Yuva
      • Installation, configuration and administration of the HP Network Storage Gateway server, HP X9000 series
      • Set up a 200+ TB IBRIX appliance and provisioned the home area for users
      • Set up 200+ TB of Lustre for the scratch area
      • Implemented the HP Data Protector backup solution and its integration with the HP ESL 712 (670 slots / 16 drives) tape library
  • Also responsible for the storage administration of individual storage systems of 100+ TB capacity in the organization.
  • Training from HP
      • Received configuration and administration training on HP Storage Array, Ibrix filesystem, HP Data Protector backup software, Brocade SAN switches, Lustre, HP ESL Tape Library, InfiniBand, etc.
Cloud Storage Administration · SAN Configuration · Backup Solutions · High Performance Computing · Storage Management

Education

Finolex Academy of Management and Technology

Bachelor of Engineering (BE) — Electronics

Jan 2003 - Jan 2007

Sunbeam Pune

Post Graduate Diploma

Jan 2007 - Jan 2008

Sacred Heart Convent

Matriculation
