HPC Systems Engineer
Posted Job Title
HPC Systems Engineer
Job Profile Title
Systems Administrator Senior
Job Description Summary
The Penn Advanced Research Computing Center (PARCC) core facility is seeking a highly qualified and motivated High Performance Computing (HPC) Systems Engineer to join the team. PARCC's main cluster (Betty) delivers HPC, data-intensive science, and Artificial Intelligence (AI) resources to researchers at the University of Pennsylvania. The HPC Systems Engineer contributes to the strategic planning, design, testing, organization, and implementation of cutting-edge technology projects for the facility, and leads the systems team.
Job Responsibilities
- Collaborate with senior staff to design, plan, test, and implement advanced hardware solutions for HPC-AI environments.
- Deploy and configure physical hardware using HPC deployment tools and orchestration frameworks (e.g., Ansible).
- Ensure high availability and minimal downtime of HPC resources to meet the needs of the research community.
- Optimize, monitor, and troubleshoot HPC file systems for performance and reliability.
- Conduct system benchmarking and develop automated testing to ensure a robust and efficient HPC infrastructure.
- Maintain job scheduling systems and enforce storage allocation policies to ensure equitable use of shared resources.
- Administer and configure the Slurm scheduler in alignment with institutional research policies.
- Participate in planning sessions related to network and security operations; collaborate with the university's central networking group (ISC).
- Apply HPC networking configurations and security protocols to optimize resource utilization and protection.
- Maintain a secure, stable, and evolving system/software environment to support dynamic research requirements.
- Implement and manage data security controls, including user- and group-based access.
- Operate comprehensive monitoring systems for rapid issue detection and long-term performance analysis.
- Automate user account lifecycle processes, including creation, maintenance, and removal.
- Install and maintain HPC tools that facilitate processes (e.g., ColdFront).
- Manage hardware and software inventory in coordination with vendors.
- Collaborate with other groups to provide and maintain up-to-date information about all assets.
- Provide technical guidance on new projects involving HPC-AI computing within the institution.
- Develop custom tools as needed, and contribute relevant innovations to open-source communities when appropriate.
- Evaluate, implement, and test emerging technologies with potential benefits for the HPC-AI research community.
- Continuously assess emerging tools and technologies for integration into current and future HPC cluster environments.
- Actively mentor and support the training of new and existing staff under the incumbent's supervision.
- Participate in departmental and university-sponsored training programs to enhance knowledge and skills; supervisor-approved commercial training may be substituted where appropriate.
Qualifications
- Bachelor's degree and 3-5 years of experience as a systems engineer at an academic institution or equivalent combination of education and experience
- Expertise in InfiniBand networking
- Experience configuring job and resource management applications (Slurm)
- Experience deploying HPC portals (Open OnDemand, Cryo-EM, ColdFront)
- Familiarity with scientific software deployment (Spack, EasyBuild)
- Expertise with cluster management software (xCAT, BCM)
- Experience deploying, troubleshooting, and maintaining file systems
Application Requirement
A Cover Letter and Resume/CV are required to be considered for this position. Please upload your Cover Letter where it asks you to upload your Resume/CV; multiple documents are allowed.
Job Location - City, State
Philadelphia, Pennsylvania
Department / School
Provost's Center
Pay Range
$83,500.00 - $110,000.00 Annual Rate
Salary offers are made based on the candidate's qualifications, experience, skills, and education as they directly relate to the requirements of the position, and in alignment with salary ranges based on external market data for the job's level. Internal organization and peer data at Penn are also considered.
To apply, visit https://wd1.myworkdaysite.com/en-US/recruiting/upenn/careers-at-penn/job/3600-Civic-Center-Boulevard---9th-Floor/HPC-Systems-Engineer_JR00113237