Hacking the hackathon: how we organise the Crick Data Challenge

For the past four years, a group of scientists at the Crick have organised the Crick Data Challenge – a three-day, hackathon-style event that connects wet-lab researchers and experts in data analysis to tackle tough scientific problems. Two members of the team share some top tips on how to organise your own scientific hackathon.

1. Start small, but with ambition

How does the Crick Data Challenge work?

Any Crick researcher can submit a ‘data challenge’ to the organising team before the event. After a review, several proposals are selected for the final event.

At the beginning of the Data Challenge week, team leaders pitch their ideas to a group of data scientists from the Crick and external experts, who then form teams to try to find solutions to the problem over the rest of the event.

We had very modest goals to start with. Initially, we just wanted to find out more about who was in the building and get people with diverse skill sets talking to each other.

The Data Challenge turned out to be a great networking opportunity, but it’s become more than that – people are given the opportunity to work with new colleagues and learn from each other. We now also have participants from other universities and start-ups who provide additional computational expertise and perspectives. 

Even though it wasn’t our original goal, several data challenge projects have been published as preprints or peer-reviewed research articles. These are important scientific contributions that probably wouldn’t have come from day-to-day lab work.

2. Assemble a balanced team

We have a mixture of permanent staff scientists, postdocs and students on the organising team. The staff scientists are present every year and have run several data challenges, whereas the postdocs and students change more frequently. The permanent staff tend to provide guidance and high-level direction, while the students drive the day-to-day work. 

Our team structure benefits from being a bit organic. It can be counterproductive to be too rigid when it comes to job roles. Having said that, you need at least one person (ideally a few!) who accepts the ‘headache’ of making sure everything that needs to get done gets done. 

3. Make it semi-structured to help ensure success 

We use a ‘bookend’ timetable structure. During the first day, everything has a strict time allocation. The project leads pitch to the participants, there is a Q&A and then the teams assemble.

On the last day, there are presentations from every team in the afternoon, followed by a keynote lecture. But in between the kick-off and the final presentations, the teams are free to divide their time however they choose.

To make sure everyone has the physical space they need, we give each team an individual room in which to work. We are also lucky to have a budget for food and drinks during the event, generously provided by the Crick Translation team, which makes things more comfortable for the participants.

Importantly, we organise one social activity at the end of each day. This makes sure that the event is fun, but it also encourages people to stop working!

4. Dedicate time to encouraging project submission and project selection

It can be difficult to get people to submit projects. People often seem interested, but aren’t able to get organised in time. Once someone submits a project, we meet up with them, alongside a few relevant experts, to discuss the idea. This lets us work out whether a solution to the problem already exists, and gives us a chance to check that the data are usable for the problem they want to solve.

It’s important to have a clear cut-off date for project submission, otherwise it can get a bit chaotic trying to ensure a project is ready for the starting date! You don’t want to spend time during the event carrying out any steps that could have been done in advance.

5. Choose the projects carefully and the teams will take care of themselves

We try to select an experienced mentor for each project, either from within the institute or externally. Other than this, the teams assemble themselves organically.

Each team leader pitches their project, and the participants have an hour to ask questions to help them make a selection. We check that every project has enough team members and a good mix of expertise, but otherwise the process is entirely self-organised.

Participants have asked us to release full project details ahead of time, but we believe in keeping them secret until the first day, so that participants come along with an open mind. We’ve found that, this way, people are more willing to join a project outside of their comfort zone! 

6. Use (just enough) tech to streamline each process

  • Data storage 

    It's a hackathon! The most important thing to think about is how the data will be stored, accessed and processed. We have a separate storage space within the Crick’s high-performance computing (HPC) cluster for all the files that will be used in the challenge.

    This makes it easy for us to give people access to folders that contain the specific data that they need for their project, and make sure that we’re complying with our data protection policies.
     
  • Access to computing power

    Most data are analysed on the HPC cluster. Fortunately, our HPC team has developed an easy-to-use graphical interface for running various software environments (such as R, Jupyter, Matlab and Fiji). This gives people huge computing power without needing specific training.
     
  • Good coding practices

    Everyone has to have a GitHub handle, and we run training sessions ahead of the event for anyone not accustomed to using GitHub. All the teams store and edit their work inside private GitHub projects.
     
  • Communication 

    We use a Slack workspace for communication, both within the organising team and among the participating teams. We’ve also linked it to GitHub and Zoom so that everything’s available in one place. Each team has their own private channel, which creates an inclusive space that works for hybrid events.
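
For organisers who want to script the data storage step described above, here is a minimal sketch of creating one folder per team with access closed to everyone outside the owning group. It is an illustration under stated assumptions, not a description of the Crick's actual setup: the function name, the team names, the `data`/`results` subfolders and the `0o750` permission mode are all hypothetical, and on a real cluster the group ownership of each folder would be set by the HPC administrators.

```python
import os
import stat
import tempfile
from pathlib import Path


def create_team_folders(root, team_names, mode=0o750):
    """Create one folder per team under `root`, with no access for 'other' users.

    The layout (data/ and results/ subfolders) and the default mode are
    illustrative assumptions, not a prescribed standard.
    """
    created = {}
    for name in team_names:
        team_dir = Path(root) / name
        for sub in ("data", "results"):
            folder = team_dir / sub
            folder.mkdir(parents=True, exist_ok=True)
            # mkdir's own mode argument is filtered by the umask,
            # so set the permissions explicitly afterwards.
            os.chmod(folder, mode)
        os.chmod(team_dir, mode)
        created[name] = team_dir
    return created


if __name__ == "__main__":
    # Demonstrate in a throwaway directory; the team names are made up.
    with tempfile.TemporaryDirectory() as tmp:
        folders = create_team_folders(tmp, ["team_imaging", "team_genomics"])
        for name, path in folders.items():
            print(name, oct(stat.S_IMODE(path.stat().st_mode)))
```

Keeping each project's files in its own folder means access can be granted per team by group membership, which is what makes it straightforward to show participants only the data they need.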

7. Befriend the IT team

It may sound obvious, but we feel like we have to highlight this. It would be impossible to run the event without the Crick’s IT team, who are amazing at what they do. Each year they contribute experts in data analysis, software engineering and machine learning.

A specialist infrastructure team, within our Scientific Computing group, creates and maintains the HPC cluster. During the event, one or two members of the team are ‘on call’ for any issues related to the data challenge. Without them, everything would fall apart.

8. Establish efficient systems for all the boring stuff

We get excited about the scientific problems that are brought to us. Sometimes the administrative things are a bit less exciting, but these need to be done before any of the science can start.

  • Make sure everyone can access the systems. We ask people to check they can log in and operate everything a week in advance, so that we can iron out any issues.
     
  • Make sure that everyone with access to data has signed the correct paperwork, including NDAs. Everyone with a stake in the data needs to be comfortable with the work that’s happening during the hackathon.
     
  • You might want to have a set of organisational documents, like a code of conduct and how-to guides. We have a pack that we send to all participants before the data challenge starts.

9. Respond to feedback 

A survey is sent out at the end of each data challenge so we can see what went well and what we can improve for the next event. Making a few key improvements each year helps us ensure each event is better than the last.

Participants give us overwhelmingly positive feedback about the experience of meeting new scientists and working towards a collective goal, which is backed up by many people participating more than once. Often the data challenge provides a fresh impetus to drive the project to the next stage.

10. Decide how much you’ll keep in touch

We don’t spend time tracking what happens to each project after the event is over. If a good solution was found during the challenge, the team leads usually continue to work with the participants and they let us know when they publish it.

We do have a shared Slack workspace, to which we add each year’s participants, with a private channel for each new team. This makes it easier for people to stay in touch and feel part of the data challenge community – and hopefully come back!