To understand how RNPs assemble, we need to identify protein-RNA and RNA-RNA interactions that take place in various compartments inside cells.
Studies of protein-RNA and protein-protein interactions
We develop techniques that produce comprehensive maps of protein-RNA and RNA-RNA interactions in cells and tissues. To identify protein-RNA contacts across the transcriptome quantitatively and at high resolution, we developed individual-nucleotide resolution UV crosslinking and immunoprecipitation of protein-RNA complexes (iCLIP). We used computational approaches to demonstrate that the great majority of cDNAs in iCLIP truncate at crosslink sites, and therefore the start of reads obtained by high-throughput sequencing can be used to identify protein-RNA crosslink sites. We primarily focus on methods that employ high-throughput sequencing, but we increasingly also use proteomics, especially to understand the protein-protein interactions mediated by the intrinsically disordered regions of proteins. For instance, in collaboration with the Luscombe team, SPACE and SPACEmap were developed to study chromatin-associated RBPs.
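The idea that read starts mark crosslink sites can be illustrated with a minimal Python sketch. It is not the group's actual pipeline. It assumes that reads are already aligned, that coordinates are 0-based and half-open (as in BED files), that reads are in the same orientation as the RNA, and that the crosslinked nucleotide is the one immediately upstream of the read start, a convention commonly used in iCLIP analysis. The names `Read`, `crosslink_site` and `count_crosslinks` are hypothetical.

```python
from collections import Counter
from typing import NamedTuple


class Read(NamedTuple):
    """One aligned read; coordinates are 0-based, half-open (assumed)."""
    chrom: str
    start: int   # leftmost aligned base (inclusive)
    end: int     # one past the rightmost aligned base (exclusive)
    strand: str  # "+" or "-"


def crosslink_site(read):
    """Return (chrom, position, strand) of the inferred crosslink site.

    Assumption: the cDNA truncates at the crosslink, so the crosslinked
    nucleotide lies one position upstream of the read's 5' end.
    """
    if read.strand == "+":
        # 5' end is read.start; upstream is one base to the left.
        return (read.chrom, read.start - 1, "+")
    if read.strand == "-":
        # 5' end is read.end - 1; upstream is one base to the right.
        return (read.chrom, read.end, "-")
    raise ValueError(f"unknown strand: {read.strand!r}")


def count_crosslinks(reads):
    """Count how many reads support each inferred crosslink position."""
    return Counter(crosslink_site(r) for r in reads)


# Toy data (made up for illustration): two plus-strand reads share a
# start, and one minus-strand read ends at position 100.
reads = [
    Read("chr1", 100, 140, "+"),
    Read("chr1", 100, 135, "+"),
    Read("chr1", 60, 100, "-"),
]
counts = count_crosslinks(reads)
```

In this toy example, reads of different lengths that share a start collapse onto one crosslink position, which is why the read start rather than the read body carries the positional information. A real analysis would also remove PCR duplicates before counting, which this sketch omits.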
Studies of RNA-RNA interactions
To understand the assembly of RNPs, it is also crucial to identify the RNA-RNA contacts that form between and within RNAs, because RNA structure contributes substantially to RNP formation. For this purpose, we developed a technique called hiCLIP (or hybrid iCLIP), which identifies the connections that hook sections of RNAs together, referred to as RNA duplexes. This identified RNA duplexes between different regions of the same RNA, as well as interactions between different RNAs, such as long noncoding RNAs and mRNAs. We were amazed to find that these duplexes often hook together very distant parts of mRNA molecules, and that they interact with the double-stranded RNA binding protein Staufen 1. These long-range RNA-RNA contacts are likely important determinants of the higher-order conformation of RNPs, and we continue to use hiCLIP methods to study how the sequence and structure of RNAs define the composition and function of RNPs on long RNA molecules.
Software and web platform for analysis of transcriptomic data
We have worked jointly with the bioinformatics startup Goodwright to establish a public data analysis web platform, which provides streamlined analysis of high-throughput sequencing data produced by CLIP variants (iCLIP, eCLIP, etc.) and increasingly also by other sequencing methods (3' end seq, RNA-seq). The platform is freely available to academics and enables users to make data public upon publication, and thus to build a data resource in a uniformly processed format. It maps the data, defines the peaks of high-occupancy RNA binding sites, annotates the binding sites by transcriptomic features, identifies enriched sequence motifs, and analyses RNA maps, which show the position-dependent binding patterns around important landmarks on transcripts. The backend software for the platform runs on Nextflow, and a Slack workspace and the Q&A CLIP Forum are available for questions and feedback on the new platform and future developments.
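To make the notion of an RNA map concrete, here is a minimal Python sketch of how a position-dependent binding profile around landmarks could be computed. It is an illustration under stated assumptions, not the platform's implementation: crosslink sites and landmarks (for example, splice sites) are given as `(chrom, position, strand)` tuples with 0-based positions, and the function name `rna_map` and the `window` parameter are hypothetical.

```python
from collections import Counter, defaultdict


def rna_map(crosslinks, landmarks, window=50):
    """Sum crosslink counts at each offset from a set of landmarks.

    Returns a list of length 2 * window + 1, where index i holds the
    count at offset i - window. Negative offsets are upstream of the
    landmark in the direction of transcription.
    """
    # Group crosslink positions by chromosome and strand for lookup.
    by_key = defaultdict(list)
    for chrom, pos, strand in crosslinks:
        by_key[(chrom, strand)].append(pos)

    profile = Counter()
    for chrom, landmark_pos, strand in landmarks:
        for pos in by_key.get((chrom, strand), []):
            # Flip the sign on the minus strand so that "upstream"
            # always means 5' of the landmark on the transcript.
            if strand == "+":
                offset = pos - landmark_pos
            else:
                offset = landmark_pos - pos
            if -window <= offset <= window:
                profile[offset] += 1

    return [profile[o] for o in range(-window, window + 1)]


# Toy data (made up for illustration).
crosslinks = [
    ("chr1", 95, "+"), ("chr1", 95, "+"), ("chr1", 110, "+"),
    ("chr1", 205, "-"), ("chr1", 500, "+"),
]
landmarks = [("chr1", 100, "+"), ("chr1", 200, "-")]
profile = rna_map(crosslinks, landmarks, window=10)
```

The nested loop keeps the sketch short but scales poorly. For genome-scale data, sorted positions with binary search or an interval library would be a more practical choice.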