1. Start small, but with ambition
We had very modest goals to start with. Initially, we just wanted to find out more about who was in the building and get people with diverse skill sets talking to each other.
The Data Challenge turned out to be a great networking opportunity, but it’s become more than that – people are given the opportunity to work with new colleagues and learn from each other. We now also have participants from other universities and start-ups who provide additional computational expertise and perspectives.
Even though it wasn’t our original goal, several data challenge projects have been published as preprints or peer-reviewed research articles. These are important scientific contributions that probably wouldn’t have come from day-to-day lab work.
2. Assemble a balanced team
We have a mixture of permanent staff scientists, postdocs and students on the organising team. The staff scientists are present every year and have run several data challenges, whereas the postdocs and students change more frequently. The permanent staff tend to provide guidance and high-level direction, while the students drive the day-to-day work.
Our team structure benefits from being a bit organic. It can be counterproductive to be too rigid when it comes to job roles. Having said that, you need at least one person (ideally a few!) who accepts the ‘headache’ of making sure everything that needs to get done gets done.
3. Make it semi-structured to help ensure success
We use a ‘bookend’ timetable structure. During the first day, everything has a strict time allocation. The project leads pitch to the participants, there is a Q&A and then the teams assemble.
On the last day, there are presentations from every team in the afternoon, followed by a keynote lecture. But in between the kick-off and the final presentations, the teams are free to divide their time however they choose.
To make sure everyone has the physical space they need, we give each team its own room to work in. We are also lucky to have a budget for food and drinks during the event, generously provided by the Crick Translation team, which makes things more comfortable for the participants.
Importantly, we organise one social activity at the end of each day. This makes sure that the event is fun, but it also encourages people to stop working!
4. Dedicate time to encouraging project submission and project selection
It can be difficult to get people to submit projects. People often seem interested, but aren’t able to get organised in time. Once someone submits a project, we meet up with them, alongside a few relevant experts, to discuss the idea. We work out whether a solution to the problem already exists, and the meeting gives us a chance to check that the data are usable for the problem they want to solve.
It’s important to have a clear cut-off date for project submission, otherwise it can get a bit chaotic trying to ensure a project is ready for the starting date! You don’t want to spend time during the event carrying out any steps that could have been done in advance.
5. Choose the projects carefully and the teams will take care of themselves
We try to select an experienced mentor for each project, either from within the institute or externally. Other than this, the teams assemble themselves organically.
Each team leader pitches their project, and the participants have an hour to ask questions to help them make a selection. We check that every project has enough team members and a good mix of expertise, but otherwise the process is entirely self-organised.
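As an illustration only, the check on team size and mix of expertise could be sketched in a few lines of Python. The minimum team size, the skill labels and the shape of the sign-up records are all assumptions made up for this example, not a description of how the organisers actually do it.

```python
from collections import defaultdict


def review_teams(projects, signups, min_size=3,
                 required_skills=frozenset({"biology", "computation"})):
    """Return {project: [issues]} for projects that need the organisers' attention.

    `projects` is a list of project names; `signups` is a list of dicts with
    hypothetical keys "name", "project" and "skills". The thresholds are
    illustrative defaults, not real rules.
    """
    members = defaultdict(list)
    for person in signups:
        members[person["project"]].append(person)

    flagged = {}
    for project in projects:
        people = members[project]
        issues = []
        # Too few people signed up for this project.
        if len(people) < min_size:
            issues.append(f"only {len(people)} member(s)")
        # Skills that nobody on the team has listed.
        covered = {skill for p in people for skill in p["skills"]}
        missing = sorted(set(required_skills) - covered)
        if missing:
            issues.append("missing skills: " + ", ".join(missing))
        if issues:
            flagged[project] = issues
    return flagged
```

Projects that come back flagged are the ones where an organiser might gently steer a few participants; everything else can stay self-organised.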
Participants have asked us to release full project details ahead of time, but we believe in keeping them secret until the first day, so that participants come along with an open mind. We’ve found that, this way, people are more willing to join a project outside of their comfort zone!
6. Use (just enough) tech to streamline each process
- Data storage
It’s a hackathon! The most important thing to think about is how the data will be stored, accessed and processed. We have a separate storage space within the Crick’s high-performance computing (HPC) cluster for all the files that will be used in the challenge.
This makes it easy for us to give people access to folders that contain the specific data that they need for their project, and make sure that we’re complying with our data protection policies.
- Access to computing power
Most data are analysed on the HPC cluster. Fortunately, our HPC team has developed an easy-to-use graphical interface for running various software environments (such as R, Jupyter, MATLAB and Fiji). This gives people huge computing power without needing specific training.
- Good coding practices
Everyone has to have a GitHub handle, and we run training sessions ahead of the event for anyone not accustomed to using GitHub. All the teams store and edit their work inside private GitHub projects.
- Communication
We use a Slack workspace for communication within the organising team and among the participating teams. We’ve also linked it to GitHub and Zoom so that everything’s available in one place. Each team has its own private channel, which creates an inclusive space that works for hybrid events.
7. Befriend the IT team
It may sound obvious, but we feel like we have to highlight this. It would be impossible to run the event without the Crick’s IT team, who are amazing at what they do. Each year they contribute experts in data analysis, software engineering and machine learning.
A specialist infrastructure team, within our Scientific Computing group, creates and maintains the HPC cluster. During the event, one or two members of the team are on call for any issues related to the data challenge. Without them, everything would fall apart.
8. Establish efficient systems for all the boring stuff
We get excited about the scientific problems that are brought to us. Sometimes the administrative things are a bit less exciting, but these need to be done before any of the science can start.
- Make sure everyone can access the systems. We ask people to check they can log in and operate everything a week in advance, so that we can iron out any issues.
- Make sure that everyone with access to data has signed the correct paperwork, including NDAs. Everyone with a stake in the data needs to be comfortable with the work that’s happening during the hackathon.
- You might want to have a set of organisational documents, like a code of conduct and how-to guides. We have a pack that we send to all participants before the data challenge starts.
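If you prefer to track this checklist in a script rather than a spreadsheet, a minimal sketch might look like the following. The field names and the three checklist items are assumptions drawn from the bullets above; the real paperwork and systems will differ between institutes.

```python
from dataclasses import dataclass


@dataclass
class Participant:
    """One row of a hypothetical pre-event checklist."""
    name: str
    login_confirmed: bool = False   # has logged in and tried the systems
    paperwork_signed: bool = False  # NDAs and any other data agreements
    pack_sent: bool = False         # code of conduct and how-to guides


def outstanding_items(participants):
    """Return {name: [items still to do]} for anyone who is not ready yet."""
    todo = {}
    for p in participants:
        checks = (
            ("login check", p.login_confirmed),
            ("paperwork", p.paperwork_signed),
            ("info pack", p.pack_sent),
        )
        missing = [label for label, done in checks if not done]
        if missing:
            todo[p.name] = missing
    return todo
```

Running something like this a week before the event gives a short list of people to chase, and an empty result means everyone is ready to start.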
9. Respond to feedback
A survey is sent out at the end of each data challenge so we can see what went well and what we can improve for the next event. Making a few key improvements each year helps us ensure each event is better than the last.
Participants give us overwhelmingly positive feedback about the experience of meeting new scientists and working towards a collective goal, which is backed up by many people participating more than once. Often the data challenge provides a fresh impetus to drive the project to the next stage.
10. Decide how much you’ll keep in touch
We don’t spend time tracking what happens to each project after the event is over. If a good solution was found during the challenge, the team leads usually continue to work with the participants and they let us know when they publish it.
We do have a shared Slack workspace, to which we add each year’s participants and a private channel for each new team. This makes it easier for people to stay in touch and feel part of the data challenge community – and hopefully come back!