Artificial intelligence (AI) is a broad term referring to machines that can simulate a near-human level of intelligence in order to solve problems and perform tasks. One branch of AI is machine learning, which refers to the process of machines, usually computers, developing this intelligence ‘independently’. These machines learn from the sets of data they are fed rather than being explicitly programmed to perform the desired task.
As the limit of computing power continues to be pushed and computer programs become more refined, the potential applications for machine learning in biology are constantly growing.
The beginning of AI
Although the idea of machines thinking like humans has been around for hundreds of years, it wasn’t until the 1950s that people started to build working AI programs. In those days, biology actually had more of an influence on machine learning than the other way around.
Early on in the development of AI, researchers began to model their programs after the neural networks found in our own brains. These artificial neural networks still underpin most AI programs today, allowing for an even more complex process known as deep learning, a type of machine learning that involves three or more layers of processing within the artificial neural network. This depth of layers greatly raises the ceiling on how well an AI program can perform, so that it can generally keep learning and improving at the task it was made for as it is given more data.
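To make the idea of ‘layers of processing’ concrete, here is a minimal sketch of a three-layer network in plain Python. It is an illustration only: the layer sizes, the random weights, and the names `dense`, `make_layer` and `forward` are assumptions made for this example, and no learning takes place. Real programs would adjust the weights using training data.

```python
import math
import random

def dense(inputs, weights, biases):
    """One layer: each output unit is a weighted sum of the inputs plus a bias."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]

def relu(values):
    """Simple non-linearity: negative values become zero."""
    return [max(0.0, v) for v in values]

def sigmoid(v):
    """Squash a number into the range 0 to 1."""
    return 1.0 / (1.0 + math.exp(-v))

def make_layer(n_in, n_out, rng):
    """Hypothetical helper: random weights stand in for learned ones."""
    weights = [[rng.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases

def forward(inputs, layers):
    """Pass the inputs through every layer in turn and return one score."""
    activations = inputs
    for weights, biases in layers[:-1]:
        activations = relu(dense(activations, weights, biases))
    weights, biases = layers[-1]
    return sigmoid(dense(activations, weights, biases)[0])

rng = random.Random(0)
# Three layers of processing: 4 inputs -> 8 units -> 8 units -> 1 output.
layers = [make_layer(4, 8, rng), make_layer(8, 8, rng), make_layer(8, 1, rng)]
score = forward([0.1, 0.9, 0.4, 0.7], layers)
print(f"{len(layers)} layers, output score {score:.3f}")
```

Each layer takes the previous layer's output as its input, which is what lets deeper networks build up more complex patterns from simpler ones.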
AI, and more specifically machine learning, has already become particularly well established in the field of biological imaging. When researchers want to analyse data from a collection of images, those images need to be segmented, which involves outlining and labelling all the different pieces of the image.
Image segmentation can be used in a variety of ways. Self-driving cars, for example, use image segmentation to identify everything in their surroundings. Knowing whether something is a tree or a person helps the car make decisions about how to drive as safely as possible. “We use similar techniques to identify different regions inside cells,” says Martin Jones, a physicist in the Electron Microscopy team at the Crick, “but instead of a tree or a person, we want the machine to look for a nucleus or a mitochondrion.”
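As a rough illustration of what a segmented image looks like as data, the sketch below labels each bright, connected patch of a tiny made-up image with its own number. This is a simple threshold-and-flood-fill method, not the machine learning approach described in this article, and the function name `segment`, the pixel values and the threshold are all assumptions chosen for the example.

```python
from collections import deque

def segment(image, threshold):
    """Label connected bright regions: 0 is background, 1, 2, ... are regions."""
    rows, cols = len(image), len(image[0])
    labels = [[0] * cols for _ in range(rows)]
    next_label = 0
    for r in range(rows):
        for c in range(cols):
            # Skip dark pixels and pixels that already belong to a region.
            if image[r][c] < threshold or labels[r][c]:
                continue
            next_label += 1
            labels[r][c] = next_label
            queue = deque([(r, c)])
            # Spread the label to every bright neighbour (up, down, left, right).
            while queue:
                y, x = queue.popleft()
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and image[ny][nx] >= threshold
                            and not labels[ny][nx]):
                        labels[ny][nx] = next_label
                        queue.append((ny, nx))
    return labels, next_label

# A made-up 4 x 6 greyscale image containing two bright objects.
image = [
    [0, 0, 0, 0, 0, 0],
    [0, 9, 9, 0, 0, 0],
    [0, 9, 9, 0, 7, 0],
    [0, 0, 0, 0, 7, 0],
]
labels, count = segment(image, threshold=5)
for row in labels:
    print(row)
print("regions found:", count)
```

The output is a grid the same size as the image in which every pixel carries the number of the object it belongs to. A trained AI program produces the same kind of labelled grid, but decides what each pixel is from learned patterns rather than a fixed brightness cut-off.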
Before AI, researchers used to segment images entirely by hand, a task that’s becoming increasingly difficult. For example, some researchers use a powerful instrument called an electron microscope to obtain images of cells. Thanks to recent advances in technology, these microscopes can now automatically collect thousands of images in a single night. That’s far too much data for researchers to be able to analyse on their own, which is why having an AI program that can segment the images for them is often essential.
The challenge with this solution is that these programs first need to be taught how to accurately identify parts of the cell, which means thousands of images need to be segmented by hand so the machine has something to learn from. This might seem to bring researchers back to their original problem, but fortunately there is a way to speed things up.
One way researchers tackle the incredibly time-consuming process of ‘teaching’ machines to solve problems is citizen science, where they ask members of the public to pitch in and be a part of the project.
Researchers at the Crick have set up a citizen science project online called Etch-A-Cell, which is open to anyone who wants to participate. Etch-A-Cell is focused on teaching AI programs to get really good at analysing images of cells by identifying features like the mitochondria or tiny fat droplets found inside the cell.
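One common way to turn many volunteers' tracings of the same image into a single training example is to take a vote at each pixel. The sketch below shows that idea. It is an assumption made for illustration, not a description of how Etch-A-Cell itself combines contributions, and the function name `consensus_mask` and the tiny masks are invented for the example.

```python
def consensus_mask(masks):
    """Per-pixel majority vote over equally sized masks of 0s and 1s."""
    n = len(masks)
    rows, cols = len(masks[0]), len(masks[0][0])
    return [[1 if 2 * sum(m[r][c] for m in masks) > n else 0
             for c in range(cols)]
            for r in range(rows)]

# Three hypothetical volunteers mark which pixels belong to the same object.
volunteer_masks = [
    [[0, 1, 1], [0, 1, 0]],
    [[0, 1, 0], [0, 1, 0]],
    [[1, 1, 1], [0, 0, 0]],
]
consensus = consensus_mask(volunteer_masks)
print(consensus)  # [[0, 1, 1], [0, 1, 0]]
```

A pixel is kept only when more than half of the volunteers marked it, so one person's slip of the hand is outvoted by the others.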
Etch-A-Cell isn’t the only citizen science project the Crick has been involved with. In 2020, a mini-game called Project Discovery was added into the video game EVE Online. Players were tasked with spotting patterns in groups of cells collected from blood samples.
The data produced by players was fed back into a number of studies, including the COVID-IP study, a collaboration that included researchers at the Crick. The study aimed to determine how likely an individual with a COVID-19 infection was to need intensive hospital care, based on the composition of their immune cells.
Project Discovery has had over 150 million data submissions from players so far. In addition to being used for research directly, the data is also being fed into a machine learning algorithm so that the process can be automated in the future.
Changes across biology
Machine learning is being used all over the Crick, and the number of teams turning to AI is constantly growing. Researchers have used AI to tackle complex problems in healthcare, discover new cancer genetics, and analyse infections. “In some cases we’re applying existing technologies, while in others we’re trying to go past the existing methods,” explains Luke Nightingale, a senior software engineer at the Crick.
We also host an annual Data Challenge, a hackathon-style event that teams up Crick scientists with data and computing experts to work together on a scientific problem. It gives scientists with a challenging problem or set of data the chance to team up with computational scientists who may be able to help them come up with a solution.
Martin notes that in 2019, when the first Data Challenge was held, AI wasn’t incorporated in many of the projects. Just a few years later, however, most teams at least consider AI when thinking about how to approach their problem, and often end up incorporating it.
Luke has been involved in a few different areas of research where machine learning is having a big impact at the Crick. One example is predicting the functional properties of proteins, such as how they interact with various molecules. Among a multitude of other uses, this can help researchers find potential drug candidates by narrowing the field to the ones most likely to work.
DeepMind, which recently established a research lab at the Crick, has been working on this problem for some time, and its solutions are becoming steadily more accurate. With over 200 million known proteins and more being discovered all the time, figuring out the exact 3D shape of each one would be virtually impossible without the help of AI.
According to Martin, in the last few years there has been an increased focus on improving the user interface for AI programs to make them more accessible to people without a computer science background. He hopes this shift will encourage researchers from different disciplines to link up more over the coming years. “Often you find AI researchers have great solutions looking for problems, and biological researchers have great problems looking for solutions.” Although AI has changed a lot over the years, the importance of networks – neuronal or otherwise – remains as strong as ever.