Resource Management Engineer
The Ohio Supercomputer Center (OSC) seeks a High Performance Computing (HPC) resource management engineer. OSC accelerates discovery for Ohio's academic and commercial researchers by providing accessible, reliable, and secure high performance computing power at a fraction of the cost of alternatives, along with expert consultation and secure storage.
Responsibilities of this position include:
- Maintain, improve, and help chart the future direction of resource management for OSC's large-scale cluster systems. These systems include the Slurm resource manager for the HPC clusters and Kubernetes clusters for container orchestration.
- Manage system utilization accounting interfaces between the HPC computational and storage systems and the OSC User Services (OSCUS) system that is responsible for billing the users of OSC systems.
- Interface with the operations team to develop and improve procedures for the operation of the resource management and accounting systems.
The HPC Systems Group is responsible for the design, procurement, deployment, and operation of OSC's HPC clusters, storage, and networking. The group collaborates with the other groups at OSC to support the diverse research community using OSC's systems.
The HPC Systems Group comprises an engineering team and an operations team. This position will be part of the engineering team, report to OSC's Associate Director, and work closely with the technical lead for the engineering group. The engineering team is responsible for overall design, procurement, installation, and integration of the HPC systems. The engineering team also works closely with the operations team on the day-to-day management of the systems. The operations team is responsible for the day-to-day running and standard preventative maintenance of OSC systems.
The Ohio Supercomputer Center, along with OARnet and OhioLINK, is part of the Ohio Technology Consortium (OH-TECH), the technology and information division of the Ohio Department of Higher Education (ODHE). OH-TECH provides high-tech solutions for Ohio's higher education institutions to catalyze innovation in the modern knowledge economy. OH-TECH is a welcoming and supportive work environment.
Because The Ohio State University serves as OH-TECH's fiscal and legal agent, OH-TECH staff enjoy the same benefits as other Ohio State employees, including participation in the Ohio Public Employees Retirement System (OPERS), the Ohio State Health Plan, employee wellness and work-life balance programs, educational benefits, discount programs, and more.
Required Experience/Education:
- Bachelor's degree in a computing, science, mathematics, or engineering discipline, or an equivalent combination of education and experience.
- 4 years of experience in administering HPC systems.
- Extensive experience in HPC resource management systems such as Slurm or container orchestration such as Kubernetes.
- Proficiency in shell scripting.
- Experience with revision control systems such as Git.
- Proficiency in one or more programming languages such as Python or Ruby.
- Demonstrated ability to communicate clearly and to work in a collaborative environment.
- Eligible to work in an export-controlled environment.
Desired Experience/Education:
- Experience installing, configuring, and administering Kubernetes clusters.
- Experience with DevOps workflows.
- Experience developing or utilizing Helm charts to deploy services on a Kubernetes cluster.
- Experience with configuration management systems such as Puppet.
Function: Information Technology
Subfunction: System Administration
Career Band: Individual Contributor - Specialized
Career Level: S3
OSU and OH-TECH will not sponsor applicants for work visas for this position.