Processing The Next Generation of Angstrom-Scale Microscopy - Dr Lance Wilson - PowerPoint PPT Presentation



SLIDE 1

Processing The Next Generation of Angstrom-Scale Microscopy

Dr Lance Wilson Senior HPC Consultant @ MASSIVE

SLIDE 2

SLIDE 3

Source: https://www.monash.edu/research/infrastructure/delivering-impact/research-outcomes/cryo-em/half-a-million-dollar-tick

SLIDE 4

Figure 5. 3D structure of the Ebola spike.

Beniac DR, Melito PL, deVarennes SL, Hiebert SL, Rabb MJ, et al. (2012) The Organisation of Ebola Virus Reveals a Capacity for Extensive, Modular Polyploidy. PLOS ONE 7(1): e29608. https://doi.org/10.1371/journal.pone.0029608 (http://journals.plos.org/plosone/article?id=10.1371/journal.pone.0029608)

SLIDE 5


What is Cryo-Electron Microscopy?

SLIDE 6

“Seeing Molecular Interactions of Large Complexes by Cryo Electron Microscopy at Atomic Resolution” - Laurie J. Wang, Z.H. Zhou (2014)

SLIDE 7

CryoEM = Computed Tomography + Photogrammetry

SLIDE 8

http://www.cmu.edu/me/xctf/xrayct/index.html https://www.maximintegrated.com/en/app-notes/index.mvp/id/4682

SLIDE 9

https://skfb.ly/TI89

SLIDE 10

SLIDE 11
  • openstack.org
SLIDE 12
  • openstack.org
SLIDE 13
  • openstack.org

3D reconstruction of the electron density of aE11 Fab’ polyC9 complex

SLIDE 14


Institution Strategy

SLIDE 15

From “Here is your CD of data…” to “Your data is moving up to a data management system in the cloud, where you have access to a range of tools and services to start your data analysis.”

SLIDE 16

Data pipeline:
  • Titan Krios Cryo-EM → collection PC: collect movies and images (raw frames)
  • MyData App → MyTardis: data storage and sharing
  • Strudel Desktop & Web: remote access for processing
  • Processing: corrected stills, particle picking (Ctffind, Relion, Frealign) → model
  • Publication: DOI, reuse

SLIDE 17

SLIDE 18


What is the scope? (Or: how big is the computer?)

SLIDE 19

Compute and Storage Requirements
  • ~1-4 TB raw data per sample (~2,000-5,000 files)
  • Pipeline analysis with internal & external tools
  • Large-memory GPUs required (> 8 GB)
  • Large system memory required (> 64 GB)
  • 200-400 CPU cores required
  • Parallel file reads and writes

SLIDE 20

How long does each of the steps take?

2,500 images, 150,000 particles, 260 pixels

Task | Submitted? | GPU? | Nodes | Time
Import | No | - | - | < 1 min
Motion Correction | Yes | Yes | 3 | 20 min
CTF estimation | Yes | No | 1 | 20 min
Manual Picking | No | - | - | ?
Autopicking | Yes | Yes | 2 | 40 min
Particle Extraction | Yes | No | 1 | 10 min
2D Classification | Yes | Yes | 2 | 10 min/iteration
3D Classification | Yes | Yes | 1 | 10 min/iteration
3D Refine | Yes | Yes | 2 | 5-10 min/iteration
Movie Refine | Yes | No | 1 | 1 hour
Particle Polishing | Yes | No | 1 | 1-2 hours
Mask Creation | No | - | - | 5-30 min
Postprocessing | No | - | - | < 1 min
Total: ~3 days
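The per-step times can be rolled up into a rough compute-time estimate. A minimal Python sketch, where the iteration counts for the classification and refinement steps are illustrative assumptions (the slide gives only per-iteration times, and the quoted ~3 days also covers queueing and manual steps):

```python
# Rough compute-time estimate for the pipeline steps listed above.
# Iteration counts are ASSUMED for illustration only; where a range is
# given (e.g. 1-2 hours), the upper end is used.

fixed_steps_min = {
    "Import": 1,
    "Motion Correction": 20,
    "CTF estimation": 20,
    "Autopicking": 40,
    "Particle Extraction": 10,
    "Movie Refine": 60,
    "Particle Polishing": 120,  # upper end of 1-2 hours
    "Mask Creation": 30,        # upper end of 5-30 min
    "Postprocessing": 1,
}

iterative_steps = {
    # step: (minutes per iteration, ASSUMED iteration count)
    "2D Classification": (10, 25),
    "3D Classification": (10, 25),
    "3D Refine": (10, 20),      # upper end of 5-10 min/iteration
}

total_min = sum(fixed_steps_min.values()) + sum(
    per_iter * iters for per_iter, iters in iterative_steps.values()
)
print(f"Estimated compute time: {total_min} min (~{total_min / 60:.1f} h)")
```

Even with generous iteration counts the pure compute time is well under the quoted ~3 days, which is consistent with the end-to-end figure including queue waits and the manual picking and inspection steps.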

SLIDE 21


Options for Processing

  • Cloud. Pro: scales easily. Con: cost, complexity, data movement.
  • HPC. Pro: huge resources. Con: tightly controlled, shared.
  • Workstation. Pro: full user control. Con: limited by a single machine.

SLIDE 22


Why use OpenStack for this workflow?

SLIDE 23

https://sites.google.com/site/emcloudprocessing/home/relion2#TOC-Benchmark-Tests

SLIDE 24

How to achieve maximum processing speed?
  • # GPUs = # raw movies
  • Storage I/O > Processing I/O

Motion Correction
  • The electron beam drifts during collection, and the resulting frames need to be shifted to account for it.
  • http://www.ncbi.nlm.nih.gov/pubmed/23644547
  • Software: motioncorr 2.1
  • 3.4 GB of raw movie data (15 movies)
  • 218 MB of corrected micrographs (15 images)
  • Processing time: 109 s on a single NVIDIA K80
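The drift-compensation idea can be sketched in a few lines: estimate each frame's translation against a reference by FFT cross-correlation, shift it back, and sum. This is a toy numpy illustration of the principle, not the actual motioncorr implementation:

```python
import numpy as np

def estimate_shift(ref, frame):
    """Estimate the (dy, dx) shift that aligns `frame` to `ref`,
    via the circular cross-correlation computed with FFTs."""
    xc = np.fft.ifft2(np.fft.fft2(ref) * np.conj(np.fft.fft2(frame)))
    peak = np.unravel_index(np.argmax(np.abs(xc)), xc.shape)
    # Wrap peaks past the halfway point into negative offsets
    return tuple(p if p <= s // 2 else p - s for p, s in zip(peak, xc.shape))

def align_and_sum(frames):
    """Shift every frame onto the first one and sum them, as motion
    correction does before downstream CTF estimation and picking."""
    ref = frames[0]
    out = ref.astype(float).copy()
    for frame in frames[1:]:
        dy, dx = estimate_shift(ref, frame)
        out += np.roll(frame, (dy, dx), axis=(0, 1))
    return out
```

For identical, circularly shifted frames the correlation peak sits exactly at the offset; real micrograph frames are noisy, which is why production tools add filtering and subpixel refinement on top of this basic idea.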

SLIDE 25

Case study - Motion Correction: 10x speedup!
  • Local desktop: ~3 hrs (limited by local storage, network access, and a single GPU)
  • Remote desktop on MASSIVE: ~45 mins (limited by GPUs, 2 per desktop)
  • In-house parallel scripted version: ~4.5 mins (limited by file system bandwidth, 1-2 GB/s)
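The in-house parallel version follows the "# GPUs = # raw movies" rule from the previous slide: dispatch one motion-correction job per movie, with one worker per GPU. A rough sketch, in which the motioncorr flags, file names, and helper functions are hypothetical, not the actual script:

```python
# Sketch of farming one motion-correction job per movie across GPUs.
# The motioncorr command-line flags and paths below are HYPOTHETICAL,
# shown only to illustrate the dispatch pattern.
import itertools
import subprocess
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path

def build_commands(movies, n_gpus):
    """Round-robin movies over available GPUs; ideally #GPUs == #movies."""
    gpu_ids = itertools.cycle(range(n_gpus))
    return [
        ["motioncorr", str(m), "-gpu", str(gpu), "-o", str(m.with_suffix(".mrc"))]
        for m, gpu in zip(movies, gpu_ids)
    ]

def run_all(commands, n_gpus, dry_run=True):
    """One worker thread per GPU; each thread just waits on its subprocess."""
    if dry_run:
        return [" ".join(c) for c in commands]
    with ThreadPoolExecutor(max_workers=n_gpus) as pool:
        return list(pool.map(lambda c: subprocess.run(c, check=True), commands))

movies = [Path(f"movie_{i:03d}.tif") for i in range(4)]
cmds = build_commands(movies, n_gpus=2)
print(run_all(cmds, n_gpus=2))
```

With enough GPUs this makes the wall-clock time roughly one movie's processing time, at which point the shared file system's read/write bandwidth becomes the bottleneck, exactly as the case study reports.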

SLIDE 26

What approaches/software are used? Relion, Simple, Cryosparc

SLIDE 27

SLIDE 28

SLIDE 29

SLIDE 30


Scientific Problem Definition (What does a scientist do?)

SLIDE 31

Typical workflow for processing using Relion2

SLIDE 32

Results

Hardware for Relion2 and CryoSparc comparisons:
  • Dell HPC nodes: 24 cores (2 CPUs), 256 GB RAM, 4 x K80 GPUs
  • NVIDIA DGX-1: 32 cores (2 CPUs), 512 GB RAM, 8 x P100 GPUs

SLIDE 33

[Chart] Cryosparc - Ab-initio Step: run time (mins) vs run number, K80 vs P100.
* 24 cores, 128 GB RAM, with one of either K80 or P100 GPU

SLIDE 34

[Chart] Cryosparc - Refinement Step: run time (mins) vs run number, K80 vs P100.

SLIDE 35

How many CPUs do you need? What effect does the number of CPUs have on analysis time?

SLIDE 36

[Chart] Relion - 3D Classification Step: processing time (mins) vs number of cores (threads), K80 vs DGX.
K80 node = 24 CPUs, 256 GB RAM, 4 x K80 GPUs; DGX-1 = 32 CPUs, 512 GB RAM, 8 x P100 GPUs

SLIDE 37

[Chart] Relion - 3D Classification Step: processing time speedup relative to K80 vs number of cores (threads).
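The speedup metric charted here is simply the ratio of run times at matching thread counts. A small sketch with placeholder timings (not the measured values):

```python
# Speedup of the DGX-1 over the K80 node at matching thread counts.
# The timing numbers below are PLACEHOLDERS for illustration; the
# measured values are in the chart above.
def speedup_vs_k80(k80_minutes, dgx_minutes):
    """speedup = t_K80 / t_DGX for each thread count present in both runs."""
    return {
        threads: k80_minutes[threads] / dgx_minutes[threads]
        for threads in sorted(k80_minutes.keys() & dgx_minutes.keys())
    }

# Hypothetical example at 4, 8, and 16 threads:
k80 = {4: 80.0, 8: 48.0, 16: 30.0}
dgx = {4: 50.0, 8: 30.0, 16: 20.0}
print(speedup_vs_k80(k80, dgx))
```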

SLIDE 38

[Chart] Relion - 3D Classification Step: processing time (mins) vs processing threads, K80 vs DGX.

SLIDE 39

[Chart] Relion Class 3D - MPI Task Effects: processing time (mins) vs MPI tasks, K80 vs DGX.

SLIDE 40

How many GPUs do you need? Relion2 - Class 3D Step

SLIDE 41

[Chart] Relion Class 3D - Effect of No. of GPUs: processing time (mins) vs number of GPUs, K80 vs DGX.

SLIDE 42

[Chart] Relion - 2D Classification Step: processing time (mins) vs number of MPI tasks (~cores), K80 vs DGX.

SLIDE 43

[Chart] Relion - 2D Classification Step: processing time speedup relative to K80 vs number of cores (threads).

SLIDE 44

How many GPUs do you need? Relion2 - Class 2D Step

SLIDE 45

[Chart] Relion Class 2D - Effect of No. of K80 GPUs: processing time (mins) vs number of GPUs.

SLIDE 46

How does this hardware compare against workstations?

SLIDE 47

[Chart] Processing Time vs Hardware Configuration for 3D Classification Step: processing time (hours) by hardware configuration.

SLIDE 48

[Image placeholder: picture of tweet]

SLIDE 49


SLIDE 50

So … what is the solution?

SLIDE 51

Solution
  • GPU cluster
  • Large high-performance file system
  • Remote desktop with access to the cluster (for GPU and storage)
  • HPC and domain experts optimising the pipeline for these systems

Outcome:
  • Data processing faster than collection!
  • Shared resource for optimal use

SLIDE 52

Acknowledgments
  • Jon Mansour, for help gathering benchmarking results
  • Jafar Lie, for help gathering benchmarking results
  • Mike Wang (NVIDIA), for access to the NVIDIA DGX-1

SLIDE 53

Questions?