
Processing The Next Generation of Angstrom-Scale Microscopy

Dr Lance Wilson, Senior HPC Consultant @ MASSIVE


  1. Processing The Next Generation of Angstrom-Scale Microscopy Dr Lance Wilson Senior HPC Consultant @ MASSIVE

  2. Source https://www.monash.edu/research/infrastructure/delivering-impact/research-outcomes/cryo-em/half-a-million-dollar-tick

  3. Figure 5. 3D structure of the Ebola spike. Beniac DR, Melito PL, deVarennes SL, Hiebert SL, Rabb MJ, et al. (2012) The Organisation of Ebola Virus Reveals a Capacity for Extensive, Modular Polyploidy. PLOS ONE 7(1): e29608. https://doi.org/10.1371/journal.pone.0029608 http://journals.plos.org/plosone/article?id=10.1371/journal.pone.0029608

  4. What is Cryo-Electron Microscopy?

  5. “Seeing Molecular Interactions of Large Complexes by Cryo Electron Microscopy at Atomic Resolution”, Laurie J. Wang, Z.H. Zhou, 2014

  6. Computed Tomography CryoEM + Photogrammetry

  7. http://www.cmu.edu/me/xctf/xrayct/index.html https://www.maximintegrated.com/en/app-notes/index.mvp/id/4682

  8. https://skfb.ly/TI89

  9. openstack.org

  10. openstack.org

  11. 3D reconstruction of the electron density of aE11 Fab’ polyC9 complex

  12. Institution Strategy

  13. “Here is your CD of data…” to “Your data is moving up to a data management system in the cloud where you have access to a range of tools and services to start your data analysis”

  14. [Diagram: cryo-EM data workflow. Labels: Collect Data (Titan Krios: movies, raw frames, corrected stills, images); MyData App; Cryo-EM PC; MyTardis (storage, sharing, DOI, reuse, publication); Strudel Desktop & Web; tools: Ctffind, Relion, Frealign, picking; Model]

  15. What is the scope? Or How big is the computer?

  16. Compute and Storage Requirements
      § ~1-4 TB raw data set per sample, ~2000-5000 files
      § Pipeline analysis with internal & external tools
      § Requires large GPU memory (> 8 GB)
      § Requires large system memory (> 64 GB)
      § Requires 200-400 CPU cores
      § Parallel file reads and writes

  17. How long does each of the steps take?

      Task                 Submitted?  GPU?  Nodes  Time
      Import               No          -     -      < 1 min
      Motion Correction    Yes         Yes   3      20 min
      CTF estimation       Yes         No    1      20 min
      Manual Picking       No          -     -      ?
      Autopicking          Yes         Yes   2      40 min
      Particle Extraction  Yes         No    1      10 min
      2D Classification    Yes         Yes   2      10 min/iteration
      3D Classification    Yes         Yes   1      10 min/iteration
      3D Refine            Yes         Yes   2      5-10 min/iteration
      Movie Refine         Yes         No    1      1 hour
      Particle Polishing   Yes         No    1      1-2 hours
      Mask Creation        No          -     -      5-30 min
      Postprocessing       No          -     -      < 1 min

      Data set: 2,500 images, 150,000 particles, 260 pixels
      Total: ~3 days

  18. Options for Processing
      Cloud: Pro: scales easily. Con: cost, complexity, data movement.
      Workstation: Pro: full user control. Con: limited by single machine.
      HPC: Pro: huge resources. Con: tightly controlled, shared.

  19. Why use OpenStack for this workflow?

  20. https://sites.google.com/site/emcloudprocessing/home/relion2#TOC-Benchmark-Tests

  21. Motion Correction
      § The electron beam drifts during collection and the results need to be shifted to account for it.
      § http://www.ncbi.nlm.nih.gov/pubmed/23644547
      § Software: motioncorr 2.1
      § 3.4 GB of raw movie data (15 movies)
      § 218 MB of corrected micrographs (15 images)
      § Processing time: 109 s on a single Nvidia K80
      How to achieve maximum processing speed?
      § # GPUs = # raw movies
      § Storage I/O > Processing I/O

  22. Case study - Motion Correction: 10x speedup!
      Using local desktop: ~3 hrs
        ● Limited by local storage and network access, single GPU
      Using remote desktop on MASSIVE: ~45 mins
        ● Limited by GPU (2 per desktop)
      Using an in-house parallel scripted version: ~4.5 mins
        ● Limited by file system bandwidth (1-2 GB/s)

  23. What approaches/software are used? Relion, Simple, Cryosparc

  24. Scientific Problem Definition (What does a scientist do?)

  25. Typical workflow for processing using Relion2

  26. Hardware for Relion2 Comparisons

                 Dell HPC Nodes       NVIDIA DGX-1
      CPU        24 cores (2 CPUs)    32 cores (2 CPUs)
      RAM        256 GB               512 GB
      GPU        4 x K80              8 x P100

      Results

      Hardware for CryoSparc Comparisons

                 Dell HPC Nodes       NVIDIA DGX-1
      CPU        24 cores (2 CPUs)    32 cores (2 CPUs)
      RAM        256 GB               512 GB
      GPU        4 x K80              8 x P100

  27. Cryosparc - Ab-initio Step. [Chart: run time (mins) against run number (1-7), K80 vs P100. * 24 cores, 128 GB RAM with 1 of either K80 or P100 GPU]

  28. Cryosparc - Refinement Step. [Chart: run time (mins) against run number (1-5), K80 vs P100. * 24 cores, 128 GB RAM with 1 of either K80 or P100 GPU]

  29. How many CPUs do you need? What effect does the number of CPUs have on analysis time?

  30. Relion - 3D Classification Step. [Chart: processing time (mins) against number of cores (threads): 4, 8, 16; K80 vs DGX. K80 = 24 x CPU, 256 GB RAM, 4 x K80 GPU; DGX-1 = 32 x CPU, 512 GB RAM, 8 x P100]

  31. Relion - 3D Classification Step. [Chart: processing time speedup relative to K80 (axis 1 to 2) against number of cores (threads): 4, 8, 16. K80 = 24 x CPU, 256 GB RAM, 4 x K80 GPU; DGX-1 = 32 x CPU, 512 GB RAM, 8 x P100]

  32. Relion - 3D Classification Step. [Chart: K80 and DGX processing time (mins) against processing threads: 3 to 24. K80 = 24 x CPU, 256 GB RAM, 4 x K80 GPU; DGX-1 = 32 x CPU, 512 GB RAM, 8 x P100]

  33. Relion Class 3D - MPI Task Effects. [Chart: K80 and DGX processing time (mins) against MPI tasks: 3, 5, 9, 13, 17. K80 = 24 x CPU, 256 GB RAM, 4 x K80 GPU; DGX-1 = 32 x CPU, 512 GB RAM, 8 x P100]

  34. How many GPUs do you need? Relion2 - Class 3D Step

  35. Relion Class 3D - Effect of No. GPUs. [Chart: processing time (mins) against number of GPUs: 2, 4, 8; K80 vs DGX. K80 = 24 x CPU, 256 GB RAM, 4 x K80 GPU; DGX-1 = 32 x CPU, 512 GB RAM, 8 x P100]

  36. Relion - 2D Classification Step. [Chart: processing time (mins) against number of MPI tasks (~cores): 17, 21; K80 vs DGX. K80 = 24 x CPU, 256 GB RAM, 4 x K80 GPU; DGX-1 = 32 x CPU, 512 GB RAM, 8 x P100]

  37. Relion - 2D Classification Step. [Chart: processing time speedup relative to K80 (axis 1 to 5) against number of cores (threads): 17, 21. K80 = 24 x CPU, 256 GB RAM, 4 x K80 GPU; DGX-1 = 32 x CPU, 512 GB RAM, 8 x P100]

  38. How many GPUs do you need? Relion2 - Class 2D Step

  39. Relion Class 2D - Effect of No. of K80 GPUs. [Chart: processing time (mins) against number of GPUs: 4, 8, 12, 16]

  40. How does this hardware compare against workstations?

  41. Processing Time vs Hardware Configuration for 3D Classification Step. [Chart: processing time (hours) against hardware configuration]

  42. Insert Picture of tweet

  43. 49

  44. So … what is the solution?

  45. Solution
      § GPU cluster
      § Large high performance file system
      § Remote desktop with access to cluster (for GPU and storage)
      § HPC and domain experts optimising pipeline for systems
      Outcome
      § Data processing faster than collection!
      § Shared resource for optimal use

  46. Acknowledgments
      § Jon Mansour for help gathering benchmarking results
      § Jafar Lie for help gathering benchmarking results
      § Mike Wang (NVIDIA) for access to the NVIDIA DGX-1

  47. Questions?
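
Worked numbers for the motion correction slides (21 and 22). This is a minimal back-of-envelope sketch, not part of the original talk. It uses only the figures quoted on those slides and assumes 1 GB = 1000 MB, that the 109 s is dominated by reading the raw movies, and that throughput scales linearly with the number of GPUs. The function name `gpus_fed_by` is hypothetical.

```python
# Figures quoted on slide 21 (single NVIDIA K80, motioncorr 2.1).
RAW_GB = 3.4     # raw movie data for 15 movies
MOVIES = 15
SECONDS = 109    # processing time for the whole set

# Average rate at which one GPU consumes raw data (assumes 1 GB = 1000 MB).
read_mb_per_s = RAW_GB * 1000 / SECONDS
seconds_per_movie = SECONDS / MOVIES


def gpus_fed_by(bandwidth_gb_per_s: float) -> int:
    """Number of GPUs a file system could keep supplied at the measured
    per-GPU rate. Assumes linear scaling; illustrative only."""
    return int(bandwidth_gb_per_s * 1000 // read_mb_per_s)


# Wall-clock times quoted on slide 22, in minutes (all approximate).
timings_min = {
    "local desktop": 180,
    "remote desktop on MASSIVE": 45,
    "in-house parallel script": 4.5,
}
speedup_vs_local = {name: timings_min["local desktop"] / t
                    for name, t in timings_min.items()}
speedup_script_vs_remote = (timings_min["remote desktop on MASSIVE"]
                            / timings_min["in-house parallel script"])

if __name__ == "__main__":
    print(f"per-GPU read rate: {read_mb_per_s:.1f} MB/s")
    print(f"time per movie:    {seconds_per_movie:.1f} s")
    for bw in (1.0, 2.0):
        print(f"{bw} GB/s file system feeds about {gpus_fed_by(bw)} GPUs")
    for name, s in speedup_vs_local.items():
        print(f"{name}: {s:.0f}x vs local desktop")
    print(f"script vs remote desktop: {speedup_script_vs_remote:.0f}x")
```

Under these assumptions the "10x speedup" on slide 22 corresponds to the parallel script against the remote desktop (45 min to 4.5 min). The per-GPU read rate also suggests why the scripted version ends up limited by file system bandwidth once enough GPUs run at the same time.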
