Characterising complex rearrangement patterns in fusion-driven sarcomas from whole genome sequencing data

Key information

Application close date: 07 February 2023, 11:59 GMT
Hours per week: 36 (full time)
Posted: 22 December 2022

Research topics

Genetics & Genomics

This is a summer student position supervised by Sara Waise from Peter Van Loo's lab. 

Introduction to the Science

Structural variants (genomic rearrangements) are a key source of somatic mutations in human cancer. They may occur either as simple, isolated events (such as translocations) or as part of more complex phenomena (e.g. chromothripsis, chromoplexy). A large proportion of complex patterns remain unexplained, raising the possibility of undiscovered rearrangement mechanisms [1].

Gene fusions are a well-described class of genomic rearrangement and are both diagnostic markers and key drivers in numerous bone and soft tissue tumours (sarcomas). To date, there has been only limited characterisation of the mutational processes generating such fusion genes. Recent data have indicated that gene fusions in sarcoma may arise through complex rearrangements. Moreover, work from the Cancer Genomics Group has shown that complex rearrangements are particularly frequent in sarcomas [2], making them an ideal setting in which to study these events.

About the Project

During this project, the student will be involved in characterising complex rearrangement patterns from a bulk whole genome sequencing dataset of ~1500 sarcomas. In particular, we are focussing on selected types of fusion-driven sarcoma. The student will assist in the analysis of a subset of these samples, first applying existing algorithms to identify structural variants. Identified variants will then be categorised into different rearrangement patterns using both bioinformatic tools and manual curation.

About You

The project would suit any student who has a strong interest in computational biology and coding experience (R, Unix). The student will learn how to identify structural variants from whole genome sequencing data, how to characterise and classify these into complex rearrangement patterns, and how to run jobs on a high performance computing cluster.


1. Li, Y., Roberts, N.D., Wala, J.A., Shapira, O., Schumacher, S.E., Kumar, K., . . . PCAWG Consortium (2020). Patterns of somatic structural variation in human cancer genomes. Nature 578: 112-121.

2. The ICGC/TCGA Pan-Cancer Analysis of Whole Genomes Consortium (2020). Pan-cancer analysis of whole genomes. Nature 578: 82-93.