Processing The Next Generation of Angstrom-Scale Microscopy
Dr Lance Wilson, Senior HPC Consultant @ MASSIVE
Source https://www.monash.edu/research/infrastructure/delivering-impact/research-outcomes/cryo-em/half-a-million-dollar-tick
Figure 5. 3D structure of the Ebola spike.
Beniac DR, Melito PL, deVarennes SL, Hiebert SL, Rabb MJ, et al. (2012) The Organisation of Ebola Virus Reveals a Capacity for Extensive, Modular Polyploidy. PLOS ONE 7(1): e29608. https://doi.org/10.1371/journal.pone.0029608
“Seeing Molecular Interactions of Large Complexes by Cryo Electron Microscopy at Atomic Resolution”, Laurie J. Wang & Z.H. Zhou, 2014
http://www.cmu.edu/me/xctf/xrayct/index.html https://www.maximintegrated.com/en/app-notes/index.mvp/id/4682
https://skfb.ly/TI89
3D reconstruction of the electron density of aE11 Fab’ polyC9 complex
From “Here is your CD of data…” to “Your data is moving up to a data management system in the cloud where you have access to a range of tools and services to start your data analysis”
Data pipeline (diagram):
§ Titan Krios Cryo-EM + Cryo-EM PC: collect movies/images (raw frames)
§ MyData App: uploads raw frames to MyTardis (data storage, sharing)
§ Strudel Desktop & Web: processing with Ctffind, Relion, Frealign (corrected stills, particle picking, model)
§ Publication: DOI, reuse
Compute and Storage Requirements
§ ~1-4 TB raw data per sample (~2,000-5,000 files)
§ Pipeline analysis with internal & external tools
§ Require large-memory GPUs (> 8 GB)
§ Require large system memory (> 64 GB)
§ Require 200-400 CPU cores
§ Parallel file reads and writes
How long does each step take?
Example dataset: 2,500 images, 150,000 particles, 260 pixels
Task                 Submitted?  GPU?  Nodes  Time
Import               No          -     -      < 1 min
Motion Correction    Yes         Yes   3      20 min
CTF estimation       Yes         No    1      20 min
Manual Picking       No          -     -      ?
Autopicking          Yes         Yes   2      40 min
Particle Extraction  Yes         No    1      10 min
2D Classification    Yes         Yes   2      10 min/iteration
3D Classification    Yes         Yes   1      10 min/iteration
3D Refine            Yes         Yes   2      5-10 min/iteration
Movie Refine         Yes         No    1      1 hour
Particle Polishing   Yes         No    1      1-2 hours
Mask Creation        No          -     -      5-30 min
Postprocessing       No          -     -      < 1 min
Total: ~3 days
Options for Processing
Cloud. Pro: scales easily. Con: cost, complexity, data movement.
HPC. Pro: huge resources. Con: tightly controlled, shared.
Workstation. Pro: full user control. Con: limited to a single machine.
https://sites.google.com/site/emcloudprocessing/home/relion2#TOC-Benchmark-Tests
How to achieve maximum processing speed?
§ Match the number of GPUs to the number of raw movies
§ Storage I/O must exceed processing I/O
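As a rough back-of-envelope check of the storage-I/O rule (the helper function and numbers are illustrative assumptions, not figures from the talk): the file system's aggregate read bandwidth must exceed what the GPUs consume, roughly number of GPUs × movie size ÷ per-movie processing time.

```python
# Illustrative sketch only: estimate the storage read bandwidth needed so
# the GPUs never sit idle waiting for data. Assumes one movie per GPU at a time.
def required_bandwidth_gbs(n_gpus, movie_size_gb, proc_time_s):
    """GB/s the file system must sustain: each GPU reads its next movie
    while it processes the current one."""
    return n_gpus * movie_size_gb / proc_time_s

# Numbers loosely derived from the motion-correction slide:
# 3.4 GB / 15 movies ~= 0.23 GB per movie; 109 s / 15 ~= 7.3 s per movie.
print(required_bandwidth_gbs(8, 0.23, 7.3))  # GB/s needed for an 8-GPU node
```

Even this toy estimate makes the point: scaling up the GPU count without also scaling storage bandwidth just moves the bottleneck.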
Motion Correction
§ The electron beam drifts during collection, and the frames must be shifted to compensate.
§ http://www.ncbi.nlm.nih.gov/pubmed/23644547
§ Software: motioncorr 2.1
§ 3.4 GB of raw movie data (15 movies)
§ 218 MB of corrected micrographs (15 images)
§ Processing time: 109 s on a single Nvidia K80
Case study: Motion Correction, a 10x speedup!
Local desktop: ~3 hrs
Remote desktop on MASSIVE: ~45 mins
In-house parallel scripted version: ~4.5 mins
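The in-house script itself is not shown in the talk; below is a minimal sketch of the same idea, assuming a motioncorr-style command line. The `-gpuid` flag and the file names are illustrative assumptions; check your motioncorr build's usage output before relying on them.

```python
# Hedged sketch (not the MASSIVE in-house script): assign each raw movie to a
# GPU round-robin, then run the corrections concurrently, one worker per GPU.
import subprocess
from concurrent.futures import ThreadPoolExecutor

N_GPUS = 4  # e.g. one Dell HPC node with 4 x K80

def plan_jobs(movies, n_gpus=N_GPUS):
    """Build (gpu_id, command) pairs, spreading movies over the GPUs."""
    return [(i % n_gpus, ["motioncorr", m, "-gpuid", str(i % n_gpus)])
            for i, m in enumerate(movies)]

def run_all(jobs, n_gpus=N_GPUS):
    # one worker per GPU so each card corrects one movie at a time
    with ThreadPoolExecutor(max_workers=n_gpus) as pool:
        return list(pool.map(lambda job: subprocess.run(job[1]).returncode, jobs))

jobs = plan_jobs([f"movie_{i:03d}.mrc" for i in range(15)])
# run_all(jobs)  # uncomment on a node where motioncorr is on the PATH
```

With 15 movies spread over the node's GPUs instead of processed serially, wall time drops roughly in proportion to the GPU count, which is the effect the ~45 min → ~4.5 min case study shows.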
What approaches/software are used? Relion, SIMPLE, CryoSparc
Typical workflow for processing using Relion2
Hardware for Relion2 Comparisons
Dell HPC nodes: 24 cores (2 CPUs), 256 GB RAM, 4 x K80 GPUs
NVIDIA DGX-1: 32 cores (2 CPUs), 512 GB RAM, 8 x P100 GPUs
Hardware for CryoSparc Comparisons
Dell HPC nodes: 24 cores (2 CPUs), 256 GB RAM, 4 x K80 GPUs
NVIDIA DGX-1: 32 cores (2 CPUs), 512 GB RAM, 8 x P100 GPUs
[Chart: run time (mins) vs run number (1-7), K80 vs P100. 24-core, 128 GB RAM node with one K80 or one P100 GPU]
[Chart: run time (mins) vs run number (1-5), K80 vs P100. 24-core, 128 GB RAM node with one K80 or one P100 GPU]
How many CPUs do you need? What effect does the number of CPUs have on analysis time?
[Chart: processing time (mins) vs number of cores/threads (4, 8, 16), K80 vs DGX-1. K80 node: 24 CPUs, 256 GB RAM, 4 x K80 GPUs; DGX-1: 32 CPUs, 512 GB RAM, 8 x P100 GPUs]
[Chart: DGX-1 processing-time speedup relative to K80 (1.0-2.0x) vs number of cores/threads (4, 8, 16). K80 node: 24 CPUs, 256 GB RAM, 4 x K80 GPUs; DGX-1: 32 CPUs, 512 GB RAM, 8 x P100 GPUs]
[Chart: processing time (mins) vs processing threads (3-24), K80 vs DGX-1. K80 node: 24 CPUs, 256 GB RAM, 4 x K80 GPUs; DGX-1: 32 CPUs, 512 GB RAM, 8 x P100 GPUs]
[Chart: processing time (mins) vs MPI tasks (3, 5, 9, 13, 17), K80 vs DGX-1. K80 node: 24 CPUs, 256 GB RAM, 4 x K80 GPUs; DGX-1: 32 CPUs, 512 GB RAM, 8 x P100 GPUs]
How many GPUs do you need? Relion2 - Class 3D Step
[Chart: Relion Class 3D, effect of number of GPUs (2, 4, 8) on processing time (mins), K80 vs DGX-1. K80 node: 24 CPUs, 256 GB RAM, 4 x K80 GPUs; DGX-1: 32 CPUs, 512 GB RAM, 8 x P100 GPUs]
[Chart: processing time (mins) vs number of MPI tasks (~cores) (17, 21), K80 vs DGX-1. K80 node: 24 CPUs, 256 GB RAM, 4 x K80 GPUs; DGX-1: 32 CPUs, 512 GB RAM, 8 x P100 GPUs]
[Chart: DGX-1 processing-time speedup relative to K80 (1-5x) vs number of cores/threads (17, 21). K80 node: 24 CPUs, 256 GB RAM, 4 x K80 GPUs; DGX-1: 32 CPUs, 512 GB RAM, 8 x P100 GPUs]
How many GPUs do you need? Relion2 - Class 2D Step
[Chart: Relion Class 2D, effect of number of K80 GPUs (4, 8, 12, 16) on processing time (mins)]
How does this hardware compare against workstations?
[Chart: processing time (hours) vs hardware configuration for the 3D classification step]
So … what is the solution?
§ GPU cluster
§ Large high-performance file system
§ Remote desktop with access to the cluster (for GPU and storage)
§ HPC and domain experts optimising the pipeline for the systems
Outcome:
§ Data processing faster than collection!
§ Shared resource for optimal use
Acknowledgments
§ Jon Mansour for help gathering benchmarking results
§ Jafar Lie for help gathering benchmarking results
§ Mike Wang (NVIDIA) for access to the NVIDIA DGX-1
Questions?