Many genes expressed in the brain are very long and can be transcribed into diverse RNAs.

Analysis of multivalent RNA motifs that are bound by neuronal Nova proteins defined the RNA splicing map, explaining the various positional patterns that guide the splicing functions of Nova proteins in the brain (adapted from Ule et al, Nature 2006).

RNAs can be regulated in many ways, including alternative splicing, 3' end processing, RNA methylation, RNA transport, translation and stability. Recent studies indicate that most RNA-binding proteins (RBPs) regulate hundreds of RNAs, and most mRNAs interact with hundreds of RBPs. To understand how these interactions guide RNAs through the many regulatory steps, we study the regulatory principles at the network level.

Positional regulatory principles, or 'RNA maps'

How RBPs guide the choice of an alternative exon. The image is based on the findings from the RNA splicing map, and shows how splicing choices are guided by the position where regulatory RBPs bind to the pre-mRNA. The red and blue blocks are RBPs that promote either the inclusion (red) or the skipping (blue) of an alternative exon. The yellow blocks are exons that become part of the mRNA, and the grey blocks are introns, which are removed from the pre-mRNA and degraded. The two paths show the alternative splicing choices: the blue path leads to skipping of the alternative exon, and the red path leads to its inclusion. (made by Josh Witten)

We showed that most RNA-binding proteins (RBPs) regulate RNA processing according to positional principles. In other words, we can't classify RBPs into those that act as 'enhancers' and those that act as 'repressors' of alternative splicing or 3' end processing of pre-mRNAs. Instead, the same RBP can either enhance or repress an alternative exon or a polyA site, and the position of its binding site on the pre-mRNA defines its function. These positional principles can be shown in the form of RNA maps, which summarise the positions of an RBP's binding sites around the exons or polyA sites that it regulates.

To find out how the position of an RNA binding site instructs the function of RNPs, we integrate multiple types of transcriptomic data, such as CLIP and RNA-seq. CLIP tells us where an RBP binds its target transcripts, while RNA-seq tells us how such binding controls pre-mRNA processing. We visualise the binding and regulation profiles across many exons or polyA sites in the form of RNA maps, which can uncover the global positional principles underlying the activity of an RBP.
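The idea of an RNA map can be illustrated with a minimal Python sketch. It is not the pipeline used in our studies. It assumes that exons have already been classified from RNA-seq as 'enhanced' or 'silenced', that CLIP crosslink sites are available as single positions, and that all exons lie on the plus strand. The function name `rna_map`, the input layout and the `flank` parameter are hypothetical choices made for illustration.

```python
from collections import defaultdict


def rna_map(exons, crosslinks, flank=5):
    """Summarise CLIP crosslinks around regulated exons.

    exons:      list of (chrom, exon_start, regulation_class) tuples
                (plus strand only, an assumption of this sketch)
    crosslinks: set of (chrom, position) tuples from CLIP
    flank:      number of nucleotides examined on each side of exon_start

    Returns {regulation_class: [fraction of exons with a crosslink at each
    offset]}, with offsets running from -flank to +flank relative to the
    exon start (the 3' splice site).
    """
    counts = defaultdict(lambda: [0] * (2 * flank + 1))
    totals = defaultdict(int)
    for chrom, exon_start, reg_class in exons:
        totals[reg_class] += 1
        for i, offset in enumerate(range(-flank, flank + 1)):
            if (chrom, exon_start + offset) in crosslinks:
                counts[reg_class][i] += 1
    # Normalise by the number of exons in each regulation class.
    return {c: [n / totals[c] for n in counts[c]] for c in totals}


# Toy data: binding upstream of enhanced exons, downstream of the silenced one.
exons = [("chr1", 100, "enhanced"), ("chr1", 200, "enhanced"),
         ("chr1", 300, "silenced")]
crosslinks = {("chr1", 98), ("chr1", 198), ("chr1", 303)}
profiles = rna_map(exons, crosslinks, flank=5)
```

Plotting each profile against the offset gives one line per regulation class, so a position-dependent difference between enhanced and silenced exons becomes visible at a glance. A real analysis would also handle the minus strand, both splice sites of each exon, and a set of unregulated control exons.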

In addition to CLIP, we found that analysis of multivalent RNA motifs can also identify the functional RNA binding sites with high specificity.

Multivalent motifs are short motifs that occur repeatedly in clusters on the RNA, and they are recognised cooperatively by multiple RNA-binding domains, often as part of a larger RNP. We developed software (RNA motifs) that can derive RNA splicing maps by analysing the multivalent RNA motifs that are often bound by RBPs. Analysis of multivalent RNA motifs helped us to understand and cross-validate the findings obtained by CLIP.
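A minimal Python sketch can make the notion of a clustered (multivalent) motif concrete. It is not the algorithm implemented in our software. It uses YCAY, the tetramer reported for Nova proteins, as an example motif, and the function name `motif_clusters`, the 30-nt `window` and the `min_count` of three are arbitrary assumptions.

```python
import re


def motif_clusters(seq, motif_regex="[CT]CA[CT]", window=30, min_count=3):
    """Return merged (start, end) spans that contain motif clusters.

    A cluster is called when at least `min_count` motif occurrences start
    within fewer than `window` nucleotides of each other. Coordinates are
    0-based and end-exclusive.
    """
    # The lookahead lets overlapping occurrences (e.g. TCATCAT) be counted.
    hits = [(m.start(1), m.end(1))
            for m in re.finditer(f"(?=({motif_regex}))", seq.upper())]
    clusters = []
    for i in range(len(hits) - min_count + 1):
        first, last = hits[i], hits[i + min_count - 1]
        if last[0] - first[0] < window:
            if clusters and first[0] <= clusters[-1][1]:
                # Extend the previous cluster when the spans overlap.
                clusters[-1] = (clusters[-1][0], max(clusters[-1][1], last[1]))
            else:
                clusters.append((first[0], last[1]))
    return clusters


# Three overlapping YCAY motifs form a cluster; the distant single TCAT does not.
example = "GGGG" + "TCATCATTCAC" + "G" * 50 + "TCAT"
print(motif_clusters(example))
```

Requiring several occurrences within a short window is what separates a multivalent site from an isolated tetramer, which occurs by chance far too often to be informative on its own.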

Recursive splicing in human brain

A diagram explaining the analysis and sequence consensus of recursive splice sites (adapted from Sibley et al, 2016).

We have discovered a regulatory mechanism in some of the longest human genes, which are expressed primarily in the brain. This mechanism explains how long introns can be removed in two steps by recursive splicing. We showed that this requires the definition of a cryptic exon, which is located next to the recursive splice site. Recursive splicing leads to skipping of this exon in the second step of intron removal. Thus, cells initially select a cryptic exon that is present deep within a long intron, but later discard it via the recursive splicing mechanism.
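To make the sequence logic concrete, the Python sketch below scans an intron for candidate recursive splice sites. It relies on a deliberately simplified assumption: a recursive site juxtaposes a 3' splice site (a pyrimidine-rich tract ending in AG) with a 5' splice site (starting with GT), so the core pattern is AG directly followed by GT. This is not the consensus derived in the paper, and the function name `candidate_rs_sites`, the tract length and the pyrimidine threshold are hypothetical.

```python
def candidate_rs_sites(intron, tract_len=12, min_pyrimidine_fraction=0.75):
    """Return 0-based positions of the G in 'GT' for each AG|GT junction that
    is preceded by a pyrimidine-rich tract of `tract_len` nucleotides."""
    intron = intron.upper()
    sites = []
    for i in range(tract_len + 2, len(intron) - 1):
        # A 3' splice site 'AG' immediately followed by a 5' splice site 'GT'.
        if intron[i - 2:i] == "AG" and intron[i:i + 2] == "GT":
            tract = intron[i - 2 - tract_len:i - 2]
            fraction = sum(base in "CT" for base in tract) / tract_len
            if fraction >= min_pyrimidine_fraction:
                sites.append(i)
    return sites


# Toy intron: 5' splice site, spacer, pyrimidine tract, then AG|GTAAGT.
toy_intron = "GTAAGT" + "A" * 10 + "TTCTCTTTCCTT" + "AGGTAAGT" + "A" * 10
print(candidate_rs_sites(toy_intron))
```

A sequence scan of this kind only nominates candidates. Confirming that a site is used requires experimental evidence, such as RNA-seq reads that span the junction between the upstream exon and the candidate site.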

We showed how the coupled exon definition and recursive splicing mechanism can be used to coordinate the splicing of multiple alternative exons, which can couple alternative promoters to alternative exons. If the recursive site is preceded by other cryptic splicing events, then the cryptic exon can end up included. In this way, recursive splicing creates a 'binary switch' that can distinguish correct splicing events from newly emerging cryptic promoters or other cryptic exons. The cryptic exon that is located next to the recursive splice site contains stop codons in all frames, so by coupling other cryptic events to this cryptic exon, cells can ensure that newly emerging aberrant mRNAs are not translated into the full protein. We speculate that recursive splicing may enable a kind of evolutionary tinkering, by allowing the creation of new mRNA isoforms from long introns without translating them into potentially toxic proteins. Interestingly, this process happens in some of the longest genes that are expressed in the brain, which are often implicated in autism or other neurodevelopmental disorders.
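The property that a cryptic exon contains stop codons in all frames is simple to check computationally. The short Python sketch below does this for a DNA or RNA sequence. The function names are hypothetical, and the example sequences are invented for illustration rather than taken from a real exon.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def frames_with_stop(seq):
    """Return three booleans, one per forward reading frame, that say
    whether the frame contains at least one stop codon."""
    seq = seq.upper().replace("U", "T")  # accept RNA as well as DNA
    return [
        any(seq[i:i + 3] in STOP_CODONS for i in range(frame, len(seq) - 2, 3))
        for frame in range(3)
    ]


def stops_in_all_frames(seq):
    """True if the sequence would terminate translation in every frame."""
    return all(frames_with_stop(seq))


print(stops_in_all_frames("ATGTAAATAGCTGAC"))  # stop codon in each frame
print(stops_in_all_frames("ATGGCCGCC"))        # no stop codon in any frame
```

An exon that passes this check truncates the reading frame whichever frame the upstream exons deliver, which is why its inclusion prevents translation of the full protein.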

Regulation of recursive splicing by the exon-junction complex

An illustration of a long gene (in the shape of a brain), with exons shown as red boxes, an RS-exon as a green box, and introns as a white line. The EJC represses recursive splicing to promote inclusion of the RS-exon (as shown on the left), but if the activity of the EJC is reduced, the RS-exon can be skipped via recursive splicing (as shown on the right). (related to Blazquez et al, 2018)

Our earlier study of recursive splicing discovered cryptic RS-exons in the middle of long introns, and we later found that hundreds of annotated exons can also be skipped through recursive splicing by using a 5' splice site that is reconstituted at their exon-exon junction (see the paper here). Most of these annotated RS-exons are normally constitutively included because of the exon junction complex (EJC), which represses their recursive splicing to preserve transcriptome integrity. Assembly of all core EJC factors, as well as the peripheral factors PNN and RNPS1, is required for efficient repression of recursive splicing. These factors are required mainly to ensure stable EJC binding on the partly spliced pre-mRNA, since tethering eIF4A3 alone is sufficient for the repression. This demonstrates functionally that the peripheral EJC factors can play an early role in stable EJC assembly. Interestingly, however, the RS-exons varied in their sensitivity to the loss of different EJC factors, particularly the peripheral ones, indicating that the importance of these factors varies from exon to exon.

We also showed that varying the strength of the recursive 5' splice site (RS-5ss) with point mutations can convert an RS-exon from cryptic to constitutive and the other way around, indicating that mutations could cause disease by modulating the splicing of RS-exons. The EJC also represses cryptic 5' splice sites located within exons, which prevents inclusion of a new class of microexons that require a longer host exon for their initial definition. In this respect, the mechanism of these microexons is similar to recursive splicing.

Finally, we showed that the inclusion and regulation of RS-exons is particularly relevant in deuterostomes and in the brain, where recursive splicing (RS) contributes to physiological alternative splicing. Haploinsufficiency of EJC components in mice leads to decreased inclusion of RS-exons in the brain, and these splicing changes affect several genes that are important for brain development, which could partly explain how mutations in EJC components lead to neurodevelopmental diseases in humans.

Leading actors

Chris Sibley, Lorea Blazquez, Rupert Faraway, Warren Emmett, Andrea Elser, Zhen Wang, Matteo Cereda, Melis Kayikci, Gregor Rot, Julian König, Kathi Zarnack, Nejc Haberman, Anob Chakrabarti

Selected publications