HPC Applications Engineer

• Develop, deliver and operate research computing services and applications.
• Take a Site Reliability Engineering approach to HPC services, managing the development, deployment, monitoring and incident response end-to-end.
• Solve complex technical problems, both with SCP applications and services and with users’ use of them.
• Provide deep research software engineering expertise to assist users in debugging and optimising their workflows and applications.

Essential Knowledge, Skills, and Experience

• Scientific application installation, optimisation and configuration
• Effective use of HPC job schedulers such as SLURM
• Experience working in a Linux environment
• Competence in multiple programming and scripting languages from the following list: Python, R, shell scripts, C/C++, Golang, with deep expertise in at least one of them
• Deep understanding of the factors influencing HPC application performance
• Highly customer focused; able to explain technical IT concepts in a manner which non-IT experts can understand

Required Skills and Knowledge

• Scientific degree, and/or experience in computationally intensive analysis of scientific data
• Previous experience in high performance computing (HPC) environments, especially at large scales (>10,000 cores)
• Experience with high performance parallel filesystems at petabyte scale, e.g. GPFS, Lustre
• Hands-on knowledge of a range of scientific and HPC applications such as simulation software, bioinformatics tools or 3D data visualisation packages
• Experience with software build frameworks such as EasyBuild or Spack
• Expertise in GPU and AI/ML tools and frameworks (CUDA, TensorFlow, PyTorch)
• Strong understanding of parallel programming techniques (e.g. MPI, pthreads, OpenMP) and code profiling/optimisation
• Experience with workflow engines (e.g. Apache Airflow, Nextflow, Cromwell, AWS Step Functions)
• Familiarity with container runtimes such as Docker, Singularity or enroot
• Expertise in specific scientific domains relevant to early drug development, such as deep learning, medical imaging, molecular dynamics or 'omics
• Experience with frameworks for regression tests and benchmarks for HPC applications, like ReFrame
• Experience working in GxP-validated environments

Plus some of the following areas of experience:

• Experience administering and optimising an HPC job scheduler (e.g. SLURM)
• Experience with configuration automation and infrastructure as code (e.g. Ansible, HashiCorp Terraform, AWS CloudFormation, AWS Cloud Development Kit)
• Experience deploying infrastructure and code to public cloud, especially AWS
• Hands-on experience working in a DevOps team and using agile methodologies