There is a lot of momentum behind the open science movement at the moment. It aims to make the results of scientific research accessible to anyone, not just privileged institutions with subscriptions to expensive journals.
Access to the end product is, of course, a good thing, especially for research that has been funded by taxpayer money or even by charity fundraisers. But I believe that it’s also possible for the general public to get involved in research in a meaningful way. And the more that science happens in the open, the easier it is for the general public to engage with and feel invested in it.
One of the problems we work on in the electron microscopy team at the Crick involves a structure in the cell called the nuclear envelope. This is the double membrane that surrounds the nucleus, where the cell’s DNA is stored. We need to pick out exactly where the nuclear envelope is in our images of cells, a process known as ‘segmentation’.
Attempts to automate this type of process, particularly in electron microscopy, usually have modest results. The widely used gold standard method is for an expert to sit at a powerful computer workstation and use a touchscreen to manually trace lines around the objects. Lucy Collinson (head of our team) sometimes refers to it as “advanced colouring in”! This method produces the best results, but is extremely slow.
The age of big data
Analysing single images is hard enough, but recent advances in electron microscopy mean that we can acquire huge series of images automatically. We take an image of the surface of the sample and then use a tool similar to a tiny specialised bacon slicer to remove a very thin layer from the surface.
By repeatedly removing a slice and imaging the newly exposed surface, we can build a stack of images showing the 3D structure of the sample.
We can take slices that are a few nanometres thick, so we might need thousands of slices to image just one cell. This generates huge amounts of data – it is now routine to capture a terabyte or more of images per day on a single microscope.
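To get a feel for those numbers, here is a back-of-envelope calculation. The figures below (cell height, slice thickness, image size, bit depth) are illustrative assumptions, not the actual settings used on the Crick’s microscopes.

```python
# Rough arithmetic for imaging one cell slice by slice.
# All numbers are illustrative assumptions.

cell_height_nm = 10_000    # assume a cell roughly 10 micrometres tall
slice_thickness_nm = 5     # assume each slice removes 5 nm

n_slices = cell_height_nm // slice_thickness_nm
print(n_slices)            # 2000 slices for a single cell

# Assume each slice is imaged at 8192 x 8192 pixels, 16 bits (2 bytes) per pixel
bytes_per_slice = 8192 * 8192 * 2
total_gb = n_slices * bytes_per_slice / 1e9
print(round(total_gb))     # roughly 268 GB for just one cell
```

Even under these modest assumptions, a single cell yields thousands of images and hundreds of gigabytes, which is how a microscope ends up producing a terabyte a day.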
To keep science moving, it’s critical to analyse the images in a reasonable amount of time to interpret the data and make new discoveries. A single expert manually ‘colouring in’ could never keep up.
A big hope for the future of biology is that we will be able to train computers to perform some of this analysis for us. But to do this, current machine learning methods require huge amounts of ‘training data’ that has already been manually analysed by an expert.
The big problem with this is that even relatively simple tasks require enormous amounts of this pre-analysed training data. And that’s where the public come in.
The power of the crowd
Many image comprehension tasks fall into a class of ‘human-easy, computer-hard’ problems, which makes them ideal for citizen science. One early example of crowd-sourced image analysis is Galaxy Zoo.
Astronomers face similar problems to people working in microscopy, only the images that they automatically collect and need to analyse are at the huge scale of galaxies rather than the tiny scale of cells.
By sharing the images with citizen scientists via the Zooniverse platform, astronomers enable the community to answer simple questions about the galaxies in the images and process huge amounts of data.
Etch a Cell
In collaboration with the Zooniverse team, we set up the Etch a Cell project to harness the power of the crowd for our biomedical research. We ask the citizen scientists to segment the nuclear envelope by drawing a line over the image in their web browser.
“But a non-expert can’t possibly be as good as an expert,” is a comment we frequently get, and this is true! However, there is growing evidence that one expert doesn’t necessarily agree with the next – or even with themselves – if they perform the same task at a later date.
In tasks like this that involve subjective decisions such as where to draw a line, it is more scientifically robust to take many people’s lines and combine them. This is exactly what we do with our images, getting several citizen scientists to segment each slice. It gives us statistics about the confidence in each part of the line, which you don’t get with a single expert.
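A minimal sketch of the idea of combining annotations: if each volunteer’s drawing is turned into a binary mask over the image, averaging the masks gives a per-pixel agreement score, and thresholding that score gives a consensus segmentation. The function name, threshold, and toy masks below are illustrative assumptions, not the project’s actual pipeline.

```python
import numpy as np

def consensus(masks, threshold=0.5):
    """Combine several volunteers' binary segmentation masks.

    masks: list of 2D arrays of 0/1, one per volunteer.
    Returns (confidence, consensus_mask), where confidence[y, x]
    is the fraction of volunteers who marked that pixel.
    """
    stack = np.stack(masks).astype(float)
    confidence = stack.mean(axis=0)          # per-pixel agreement, 0..1
    return confidence, confidence >= threshold

# Three toy 4x4 "annotations" of the same slice
m1 = np.array([[0,1,1,0],[0,1,1,0],[0,0,0,0],[0,0,0,0]])
m2 = np.array([[0,1,1,0],[0,1,0,0],[0,0,0,0],[0,0,0,0]])
m3 = np.array([[0,1,0,0],[0,1,1,0],[0,0,0,0],[0,0,0,0]])

conf, mask = consensus([m1, m2, m3])
print(conf[0, 1])   # 1.0 -> all three volunteers marked this pixel
print(conf[0, 2])   # two out of three -> lower confidence
```

The averaging step is what a lone expert cannot provide: it tells you not just where the line is, but how sure the crowd is about each part of it.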
In the future, we hope to use a combination of experts, citizen scientists and machine learning to automatically mine our huge imaging datasets. For example, we could segment every organelle in every cell so the effects of different diseases and conditions can be understood.
The story so far
Etch a Cell has just celebrated its second birthday and we’re hitting milestones. Around 5,500 volunteers have now taken part and completed nearly 120,000 classifications. Beyond Zooniverse itself, our first paper based on the technique and findings is in progress and we’ve even heard about teachers using Etch a Cell in biology lessons. Who knows, maybe it will inspire the next generation of scientists!