HPC/Research Computing Engineer

In the Crick's Scientific Computing STP team.

Part of Crick Operations.

The Crick requires state-of-the-art Scientific Computing systems and services to enable world-leading scientific research. Research at the Crick is data-intensive in all experimental, theoretical and computational dimensions, so efficient and effective management and high-performance processing of its data are critical to its success. The HPC & Research Data Systems Engineer role was created to provide user support and to assist in the design, implementation, development and service delivery of the institute’s HPC, Cloud, and Research Data Storage and Management hardware and software, through a mix of on-premise systems and cloud services. The role is part of the Research Computing Platforms/HPC team within the Scientific Computing Science Technology Platform (STP), which delivers services to the Crick’s research community and works closely with Research Labs and other STPs across the institute. The Crick has a large (9 PB) on-premise IBM Spectrum Scale high-performance storage system and is currently exploring research computing in the Cloud. In future, we expect to have an evolving hybrid of on-site and Cloud systems to support research. This is a fantastic opportunity for anyone interested in computing systems engineering and keen to develop their skills in this area.
Deadline for applications has passed.

Key information

Job reference: R396
Salary: Competitive with benefits, subject to skills & experience
Applications closed: 29 September 2021, 23:59 BST
Hours per week: 36 (full time)
Posted: 16 September 2021

This is a full-time, permanent position on Crick Terms and Conditions of Employment.

         

SUMMARY

The Cancer Research UK City of London Cancer Centre (CRUK CoLCC) is a world-class hub for cancer biotherapeutics. It brings together researchers from four of the central London Cancer Research UK centres: University College London, King’s College London, Barts / Queen Mary University of London and The Francis Crick Institute. This unique interdisciplinary and collaborative network will generate novel biological therapies, diagnostics and stratification strategies, in addition to providing a clinical and translational pipeline for cancer discovery science.

This post will be a core member of a team providing research and computational support to the new CRUK Centre. You will be taking a lead role in the support of the Computational Modelling core of the Centre, working in close co-ordination with both the existing HPC team in UCL Computer Science and the Scientific Computing team at the Francis Crick Institute. The post is full time, and will be jointly based at the UCL Computer Science Bloomsbury Campus and the Francis Crick Institute.

Centre personnel include academics, researchers, postgraduate research and PhD students, technicians and professional services staff.

Crick Scientific Computing

The role is part of the Research Computing Platforms/HPC team within the Scientific Computing Science Technology Platform (STP), which delivers specialist services to the Crick’s research community and works closely with Research Labs and other STPs across the institute.

UCL Computer Science

The CoLCC computational provision has been formed with the help of the UCL Computer Science technical support team. It currently provides in excess of 10 PB of storage and nearly 3000 cores dedicated to the Centre, as part of a much larger HPC and research storage resource.

Further information can be found at:

CRUK City of London Cancer Centre: http://www.colcc.ac.uk

Francis Crick Institute: http://www.crick.ac.uk

UCL Department of Computer Science: http://www.cs.ucl.ac.uk

KEY RESPONSIBILITIES

The post holder will be responsible for delivering day-to-day support, training and the wider computational and storage infrastructure that supports several hundred computational researchers and many cross-institution projects for the CoLCC. The successful candidate will be a key member of both the UCL Computer Science and Crick Scientific Computing teams and will be required to establish excellent working relationships with academic and professional services staff within the City of London Centre and all its collaborating institutions, with whom they will work closely. The post holder will need to provide a consistent Cancer Centre support experience to colleagues across all of the Centre Partners (UCL/Francis Crick/King’s/Barts/Queen Mary) and will be a key enabler of the scientific collaborations of the partners.

Responsibilities include but are not limited to:

Research Computing Support

  • Maintaining the HPC helpdesk for the centre, ensuring timely responses to researchers’ queries.
  • Ensuring effective interworking with all the partner institute research computing teams to provide researchers with a smooth transition between local institute, national and CoLCC-provided facilities.
  • Assisting researchers with software installation, coding, debugging, problem solving, data management and data movement between sites.
  • Providing support and expertise in the design of computational and data processing pipelines for researchers in the centre.
  • Helping to define, document and maintain Standard Operating Procedures for the facilities.
  • Proactively improving the codes run on the system to enhance scientific output and efficiency.
  • Attending operational meetings and providing statistics and metrics for the running of the centre and presenting and reporting on developments.
  • Assisting the centre managers with annual reports on the centre progress and the scientific collaborations you have enabled on the infrastructure.
  • Ensuring that all data is managed in accordance with the policies and procedures set out by the data providers and that all legal and regulatory requirements are met.

HPC Infrastructure

  • Helping to ensure the smooth operation of Centre HPC and storage provision.
  • Monitoring systems to ensure service reliability and enable rapid resolution of any problems, within a continuous improvement programme of works (including out-of-hours work where required).
  • Maintaining the resilience, integrity and backup of the centre storage facility, currently in excess of 10 PB and expected to grow significantly.
  • Managing high-speed cluster networking and site-to-site connectivity.
  • Assisting in the operation of Centre data transfer nodes at each partner site.
  • Helping to maintain and install supported research and system software packages.
  • Making sure that regular system maintenance and security patching take place on all IT infrastructure.
  • Ensuring the centre computational capabilities remain at the cutting edge, undertaking continuing research system development activities.

Centre Training

  • Providing weekly training sessions to the centre researchers at all partner sites.
  • Providing and maintaining up-to-date training materials, including website and video seminars.
  • Creating and maintaining a CoLCC user induction pack and assisting with inducting researchers on the systems.
  • Keeping up to date with changes in policies and procedures of the City of London Centre.
  • Advising users on current and emerging best practice in the efficient use of the computational and storage resources.
  • Proactively arranging hackathons and code/pipelining workshops.

Other duties

  • Maintaining the CoLCC website and Intranet.
  • Helping to maintain the “Research Suite” of collaboration tools (Rocket.Chat, Jitsi, Intranet, file sharing).
  • Setting up and maintaining project stores and filing systems as required.
  • Any other duties as required within the scope, spirit and purpose of the job as requested.
  • Responding to all requests for information from the partner institutions and external bodies.

KEY EXPERIENCE AND COMPETENCIES

The post holder should embody and demonstrate our core Crick values: Bold, Imaginative, Open, Dynamic and Collegial, in addition to the following:

Essential:

  • Proven track record in either Linux system administration OR scientific computing/software development, and an interest in developing both in a cancer research setting.
  • Proven experience in providing an efficient and effective level of service to colleagues.
  • Undergraduate degree or equivalent, or substantial relevant experience.
  • Ability to maintain and support complex research computing systems.
  • Highly developed verbal and written communication skills including the ability to liaise with staff at a range of levels internal and external to the organisation.
  • A methodical and accurate approach to work with attention to detail and a willingness to adapt and innovate.
  • Ability to work both independently and collaboratively.
  • Excellent technical IT skills and the ability to stay at the cutting edge of both computational systems and methods.
  • A proactive approach to raising potential issues as they arise and to working ahead to solve both systems and research problems.
  • Ability to distil highly technical processes and complex information and communicate them in a simple, effective manner to a wide range of audiences.
  • Ability to quickly pick up complex new technologies, techniques and computational methods and explain them in detail to researchers.

Desirable:

  • Experience of scientific computational/engineering or bioinformatics computing support.
  • Experience of specifying, designing and running large-scale research computing systems, including knowledge of key architectures and the latest technological trends in scientific computing.
  • Knowledge and experience of facilitating or participating in complex scientific research projects.
  • Experience in one or more of the following: Lustre, GPFS, SGE, Slurm, Grafana, ELK, Nagios, CentOS, VMware, Xen, oVirt, KVM, HNAS, GPU computing, databases.
  • Experience of administering HPC/HTC or large distributed computing systems.
  • Knowledge of sequencing, genomics and imaging pipelines and associated bioinformatics tools and software.
  • Experience of creating and maintaining web pages and complex documentation.
  • Experience of software programming and debugging, and familiarity with at least one of the following: C++, Java, Fortran, Python, Perl, Julia, R, CUDA, bash, csh.
  • Experience of high-speed networking (40/100G Ethernet, InfiniBand, OPA).