Are you an experienced High-Performance Computing (HPC) Systems Administrator looking for your next challenge? Our client is seeking a Level 2 Systems Administrator to manage and support HPC clusters in a research-driven environment. In this role, you will maintain and optimize HPC hardware, troubleshoot technical issues, and support scientists by ensuring seamless execution of research applications. You’ll work with cutting-edge technologies like SLURM, CUDA, Ansible Playbooks, MPI implementations, and parallel storage systems to enhance computing performance. This is a full-time, remote-friendly opportunity, with potential on-site requirements in Ottawa, ON based on operational needs. Candidates must have 5+ years of HPC administration experience and hold a valid Secret clearance. If you have a passion for HPC, research computing, and collaborating with top-tier scientists, we want to hear from you! Apply today to make an impact in a dynamic and innovative environment.
*** 12 months plus 4 additional 12-month options to renew ***
*** Government of Canada Secret Security Clearance required ***
*** Work will be remote; however, some in-person meetings may occur in Ottawa ***
Advantages
*** Multi-year (up to 5 years) contracting opportunity with primarily remote work ***
Responsibilities
•Maintain an HPC cluster (hardware, image management, local networking, scheduler, backups).
•Troubleshoot the environment when an incident occurs to ensure a quick return to normal operations.
•Meet with scientists and evaluate their requirements for HPC support.
•Develop a task plan to meet scientists' needs and consult the technical authority for approval.
•Application builds and installs, runtime troubleshooting (GNU, Intel, Fortran, Nvidia).
•Support for open-source and commercial off-the-shelf (COTS) software, including:
•Python and Anaconda installs.
•Bash scripts, build/make tools, EasyBuild, and Spack.
•MPI implementations (MPICH, OpenMPI, IntelMPI, HPMPI).
•Assist with in-house developed applications (compilation and runtime).
•Management of:
•Operating system (patching schedule, reliability for Linux distributions).
•Accounts (creation, deletion).
•Configuration via Git, MS DevOps, Ansible Playbooks.
•RPM/DEB Packages.
•Environment modules.
•ThinLinc troubleshooting.
•Troubleshooting jobs on schedulers (PBS Pro/Torque, SLURM, SGE).
•Ensure reliable CUDA installs, troubleshoot GPU failures and other CUDA software/driver issues.
•Hardware support (memory upgrades, storage arrays, power and network cabling, iLO).
•Document each process for every task to ensure enterprise knowledge continuity.
Qualifications
•Experience: Minimum 5 years of HPC system administration in the last 10 years
•Security Clearance: Secret (Level II) clearance required at contract award
•HPC Experience in Research Environments (up to 10 points)
•HPC Experience in Federal Government (Shared Services Canada)
•Hands-on Experience with HPC Technologies & Tasks (e.g., SLURM, Ansible, Lustre, CUDA)
Summary
If you're qualified and interested, please submit your resume and one of our experienced recruiters will be happy to give you a call. Thank you.
Randstad Canada is committed to fostering a workforce reflective of all peoples of Canada. As a result, we are committed to developing and implementing strategies to increase equity, diversity and inclusion within the workplace by examining our internal policies, practices, and systems throughout the entire lifecycle of our workforce, including recruitment, retention and advancement for all employees. In addition to our deep commitment to respecting human rights, we are dedicated to positive actions to effect change so that everyone can participate fully in the workforce free from any barriers, systemic or otherwise, especially equity-seeking groups who are usually underrepresented in Canada's workforce, including those who identify as women or non-binary/gender non-conforming; Indigenous or Aboriginal Peoples; persons with disabilities (visible or invisible); and members of visible minorities, racialized groups and the LGBTQ2+ community.
Randstad Canada is committed to creating and maintaining an inclusive and accessible workplace for all its candidates and employees by supporting their accessibility and accommodation needs throughout the employment lifecycle. We ask that all job applicants identify any accommodation requirements by sending an email to accessibility@randstad.ca so that they can fully participate in the interview process.