Lead High-Performance Computing Engineer
Application Support, High-Performance Computing (HPC), IBM Platform LSF, Linux, Slurm, Bright Cluster Manager, InfiniBand
Hyderabad, Bangalore, Pune, Chennai, Gurgaon
We are seeking a Lead High-Performance Computing Engineer experienced in managing and enhancing HPC environments.
The ideal candidate has a strong engineering background and proven experience in deploying and optimizing HPC infrastructure, and will thrive in our HPC infrastructure engineering team, which supports scientific research teams.
Responsibilities
- Participate in incident resolution and in software and hardware upgrades
- Support and maintain HPC infrastructure
- Implement Infrastructure as Code (IaC) automation
- Develop and review system operational procedures
- Lead troubleshooting efforts in complex systems
Requirements
- 8 to 12 years of experience in HPC environments
- Proficiency in configuring and supporting HPC infrastructure
- Proficiency in Linux, including compiling kernel modules and using debugging tools such as strace, core dump analysis, and tcpdump
- Background in job schedulers including IBM LSF and Slurm
- Expertise in Bright Cluster Manager, including installation and configuration
- Knowledge of GPFS and Lustre file systems
- Understanding of InfiniBand and Omni-Path network interconnect technologies
Nice to have
- Familiarity with cloud-based HPC solutions
- Experience with system security and data protection best practices