We are using machine learning to predict the functions and three-dimensional structures of previously unknown genes and proteins.

Despite the huge progress that has been made in the field of genetics in recent decades, there is still a lot we don’t know about genes and the proteins they encode.

We are working on ways to improve our understanding of genes and proteins using machine learning (‘artificial intelligence’) techniques, to help us predict the function of any given gene, or the three-dimensional structure of a protein, from its underlying genetic code.

Over the last 5 years we have collected vast amounts of experimental data for thousands of human genes – for example, information about which genes are active at the same time, and which genes share similarities in their code – which we can use to predict the currently unknown functions of many genes.

We are also using machine learning techniques to better understand and predict three-dimensional protein structures and functions, and we are applying this knowledge to help design entirely new proteins that could be useful in medicine or other applications.

By applying machine learning techniques to the gene and protein data we collect, we want to discover new connections between genes, proteins and human diseases. This work could pave the way for the development of new treatments for many different diseases.