Accelerating single molecule localization microscopy through parallel processing on a high-performance computing cluster

Abstract

Super-resolved microscopy techniques have revolutionized the ability to study biological structures below the diffraction limit. Single molecule localization microscopy (SMLM) techniques are widely used because they are relatively straightforward to implement and can be realized at relatively low cost, e.g. compared to laser scanning microscopy techniques. However, while the data analysis can be readily undertaken using open source or other software tools, large SMLM data volumes and the complexity of the algorithms used often lead to long image data processing times that can hinder the iterative optimization of experiments. There is increasing interest in high throughput SMLM, but its further development and application are inhibited by the data processing challenges. We present here a widely applicable approach to accelerating SMLM data processing via a parallelized implementation of ThunderSTORM on a high-performance computing (HPC) cluster and quantify the speed advantage for a four-node cluster (with 24 cores and 128 GB RAM per node) compared to a high-specification (28 cores, 128 GB RAM, SSD-enabled) desktop workstation. This data processing speed can be readily scaled by accessing more HPC resources. Our approach is not specific to ThunderSTORM and can be adapted for a wide range of SMLM software.

Lay description

Optical microscopy is now able to provide images with a resolution far beyond the diffraction limit thanks to relatively new super-resolved microscopy (SRM) techniques, which have revolutionized the ability to study biological structures. One approach to SRM is to randomly switch on and off the emission of fluorescent molecules in an otherwise conventional fluorescence microscope. If only a sparse subset of the fluorescent molecules labelling a sample is switched on at a time, then the emitters will be, on average, spaced further apart than the diffraction-limited resolution of the conventional microscope, and the separate bright spots in the image corresponding to each emitter can be localised to high precision by finding the centre of each feature using a computer program. Thus, a precise map of the emitter positions can be recorded by sequentially mapping the localisations of different subsets of emitters as some are switched on and others are switched off. Typically, this approach, described as single molecule localisation microscopy (SMLM), results in large image data sets that can take many minutes to hours to process, depending on the size of the field of view and whether the SMLM analysis employs a computationally intensive iterative algorithm. Such a slow workflow makes it difficult to optimise experiments and to analyse large numbers of samples. Faster SMLM experiments would be generally useful, and automated high throughput SMLM studies of arrays of samples, such as cells, could be applied to drug discovery and other applications. However, the time required to process the resulting data would be prohibitive on a normal computer. To address this, we have developed a method to run standard SMLM data analysis software tools in parallel on a high-performance computing (HPC) cluster. This can be used to accelerate the analysis of individual SMLM experiments, or it can be scaled to analyse high throughput SMLM data by extending it to run on an arbitrary number of HPC processors in parallel. In this paper we outline the design of our parallelised SMLM software for HPC and quantify the speed advantage when implementing it on four HPC nodes compared to a powerful desktop computer.
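The general idea of processing an SMLM acquisition in parallel can be illustrated with a minimal Python sketch. It is not the authors' implementation and does not call ThunderSTORM: the function names (`split_frames`, `localise_chunk`, `localise_parallel`) are hypothetical, and the "localiser" is a placeholder that simply reports the brightest pixel in each frame. It rests on the assumption that frames can be analysed independently, so the frame stack can be cut into contiguous chunks, each chunk handed to a separate worker, and the results merged using global frame numbers.

```python
from concurrent.futures import ProcessPoolExecutor


def split_frames(n_frames, n_chunks):
    """Return (start, stop) pairs that cover range(n_frames) in contiguous chunks."""
    base, extra = divmod(n_frames, n_chunks)
    bounds, start = [], 0
    for i in range(n_chunks):
        stop = start + base + (1 if i < extra else 0)
        if stop > start:  # skip empty chunks when n_chunks > n_frames
            bounds.append((start, stop))
        start = stop
    return bounds


def localise_chunk(task):
    """Placeholder localiser: report the brightest pixel of each frame.

    Assumption: a real pipeline would run the SMLM software on the chunk here.
    """
    first_frame, frames = task
    results = []
    for offset, frame in enumerate(frames):
        value, row, col = max(
            (pixel, r, c)
            for r, line in enumerate(frame)
            for c, pixel in enumerate(line)
        )
        # Store the global frame index so merged results stay ordered.
        results.append((first_frame + offset, row, col, value))
    return results


def localise_parallel(frames, n_workers, executor_cls=ProcessPoolExecutor):
    """Split the frame stack, process chunks concurrently, and merge the results."""
    tasks = [
        (start, frames[start:stop])
        for start, stop in split_frames(len(frames), n_workers)
    ]
    with executor_cls(max_workers=n_workers) as pool:
        chunks = pool.map(localise_chunk, tasks)  # results come back in task order
        return [loc for chunk in chunks for loc in chunk]
```

When a process pool is used in a script, the call to `localise_parallel` should sit under an `if __name__ == "__main__":` guard on platforms that start workers by spawning. On a cluster, each chunk would more plausibly be submitted as a separate scheduler job than run as a local process; the split and merge logic would stay the same under that assumption.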

Journal details

Volume 273
Issue number 2
Pages 148-160