ScipionCloud: Large scale cryo electron microscopy image processing



slide-1
SLIDE 1

ScipionCloud

Large scale cryo electron microscopy image processing on commercial and academic clouds

slide-2
SLIDE 2

Who are we?

The Instruct cryoEM Image Processing Center Instruct: The European Research Infrastructure for Structural Biology

Providing access to state of the art structural biology infrastructure for researchers

slide-3
SLIDE 3

What is Cryo Electron Microscopy

Among the structural biology (SB) techniques at the core of the Instruct ESFRI project, electron microscopy under cryogenic conditions (“cryo-EM”) is currently the fastest growing area, having been named “Method of the Year (2015)” by Nature Methods.

slide-4
SLIDE 4

Why do we hear so much about Electron Microscopy?

Because, thanks to:

  1) The very good performance of current microscopes
  2) The very good image acquisition characteristics of direct electron detectors
  3) The very good new software for 3D reconstruction and classification

it is possible to solve the structure of large and flexible macromolecular complexes without 3D crystals, from small amounts of not very concentrated sample.

slide-5
SLIDE 5

CryoEM for drug discovery

Cryo-EM resolving the structure of the Ebola virus key glycoprotein in complex with therapeutic antibodies

slide-6
SLIDE 6

Typical EM Workflow

16 cores, 2GB/core

slide-7
SLIDE 7

Hardware revolution on CryoEM processing

Traditionally: HPC clusters or fat nodes

Now two lines of improvement emerge:

  • Graphical Processing Units (GPUs)

Algorithms being ported to use GPUs and new ones developed

  • Cloud platforms
slide-8
SLIDE 8

Plethora of EM software packages: Our answer “Scipion” Workflow Integrator

Bringing software integration to EM in workflows

slide-9
SLIDE 9

Scipion Framework

slide-10
SLIDE 10

Scipion Framework

Scipion encapsulates:

  • Parallelization: By each EM program or by Scipion -> OpenMPI
  • Environment setup, libraries
  • Batch system submission: Scipion templates
  • Use of GPUs: implemented in the EM packages, each with its own requirements

– Relion 2.0: NVIDIA cards with compute capability 3.5 or higher; for particles larger than 200 px, 2 GPUs with at least 4 GB of RAM

– MotionCor2: CUDA 8
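The Relion 2.0 requirement above can be read as a simple eligibility check. The sketch below is one possible reading, not code from Scipion or Relion: `Gpu` and `meets_relion2_requirements` are hypothetical names, and it assumes that "2 GPU with minimum 4GB RAM" means at least two cards with 4 GB each.

```python
from dataclasses import dataclass


@dataclass
class Gpu:
    """Hypothetical description of one GPU card."""
    compute_capability: float  # NVIDIA CUDA compute capability, e.g. 3.5
    ram_gb: float              # on-board memory in GB


def meets_relion2_requirements(gpus, particle_size_px):
    """Return True if the cards satisfy the slide's Relion 2.0 rule.

    Assumed reading: only cards with compute capability >= 3.5 count;
    at least one is needed, and for particles larger than 200 px at
    least two of them must have >= 4 GB of RAM each.
    """
    capable = [g for g in gpus if g.compute_capability >= 3.5]
    if not capable:
        return False
    if particle_size_px > 200:
        big_enough = [g for g in capable if g.ram_gb >= 4]
        return len(big_enough) >= 2
    return True


# Illustrative values only: two cards at capability 3.5 with 4 GB each.
print(meets_relion2_requirements([Gpu(3.5, 4), Gpu(3.5, 4)], 512))
```

Under this reading a single 4 GB card would be accepted for 200 px particles but rejected for 512 px particles.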

slide-11
SLIDE 11

Scipion distributions

  • Binaries
  • Source code + EM packages autoinstall
  • ScipionCloud:
  • Public AMI on AWS EC2 (EU Ireland and US North Virginia and Oregon regions)

  • Virtual Appliance on EGI AppDB
  • Vagrant file and CVMFS (Westlife project)
  • Puppet + Cloudify (Westlife project)
slide-12
SLIDE 12

ScipionCloud

  • Ubuntu 14.04 LTS
  • Scipion release 1.1 (source git)
  • Most important EM packages compiled with CUDA (GPU support)
  • NVIDIA driver + CUDA toolkit (7.5 & 8.0)
  • Guacamole (remote desktop)
  • Starcluster (only AWS)
slide-13
SLIDE 13

ScipionCloud profiling

slide-14
SLIDE 14

Profiling workflow

Stages: Acquisition -> Network transfer -> Preprocessing -> Processing -> Postprocessing

Steps:

  1. Import movies
  2. Beam-induced motion (BIM) correction
  3. CTF estimation
  4. Particle picking
  5. Particle extraction
  6. 2D classification
  7. Initial model
  8. 3D classification
  9. 3D refinement
  10. 3D postprocessing

slide-15
SLIDE 15

Profiling data transfer

  • 966 movies, 8Kx8K -> 6.6 TB raw data
  • Used Aspera connect (from EMPIAR DB)
  • Tested bbcp and rsync
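
For a sense of scale, the figures above imply an average movie size and a minimum sustained bandwidth for any given transfer window. This is a small sketch assuming decimal units (1 TB = 10^12 bytes); the function names are illustrative and the 24-hour window is a made-up example, not a measured value.

```python
TOTAL_BYTES = 6.6e12   # 6.6 TB of raw data (decimal TB assumed)
N_MOVIES = 966


def average_movie_size_gb(total_bytes=TOTAL_BYTES, n_movies=N_MOVIES):
    """Average size of one movie in GB."""
    return total_bytes / n_movies / 1e9


def required_rate_mb_per_s(hours, total_bytes=TOTAL_BYTES):
    """Sustained rate in MB/s needed to move the data set in `hours`."""
    return total_bytes / (hours * 3600) / 1e6


print(f"{average_movie_size_gb():.2f} GB per movie")
print(f"{required_rate_mb_per_s(24):.1f} MB/s to finish in 24 h")
```
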
slide-16
SLIDE 16

Profiling machine types

Environment      Instance     vCPUs  RAM (GB)  GPU model  GPU RAM (GB)  Cost ($/hour)
AWS EC2 Ireland  g2.2xlarge       8        15  GRID K520             4  0.702
AWS EC2 Ireland  p2.8xlarge      32       488  Tesla K80            12  7.776
AWS EC2 Ireland  r3.8xlarge      32       244  -                     -  0.888
AWS EC2 Ireland  x1.32xlarge    128      1952  -                     -  2.96
FedCloud CESNET  universe        40       232  -                     -
FedCloud IISAS   gpu1cpu6         6        24  Tesla K20             4  -
FedCloud IISAS   gpu2cpu12       12        48  Tesla K20             4  -
Local            asimov          32       512  -                     -  1.85 (est)
Local            titanxp         32       128  Titan XP             12

slide-17
SLIDE 17

Profiling results

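EM workflow                          AWS EC2 Ireland                           FedCloud                    Local server CNB
Step                 Program         Machine type  Time (hours)  Cost ($)      Machine type  Time (hours)  Machine type  Time (hours)
Transfer movies      Aspera          g2.2xlarge    36 (a)        26 (a)        1gpu6cpu      23 (a)        Local server  -
Align movies         motioncor2 GPU  g2.2xlarge    (a)           (a)           1gpu6cpu      (a)           Local server  41 (b)
CTF estimation       ctffind4        g2.2xlarge    (a)           (a)           1gpu6cpu      (a)           Local server  (b)
Particle picking     Xmipp3          g2.2xlarge    Interactive   -             1gpu6cpu      Interactive   Local server  Interactive
Particle extraction  Relion 2.0      g2.2xlarge    0.6           0.5           1gpu6cpu      0.4           Local server  0.4
2D classification    Relion 2.0 GPU  p2.8xlarge    6             42            2gpu12cpu     25            Local server  8
Initial volume       Eman 2.12       p2.8xlarge    0.08          0.7           2gpu12cpu     0.16          Local server  0.22
3D classification    Relion 2.0 GPU  p2.8xlarge    0.6           4.7           2gpu12cpu     2.1           Local server  1.3
3D refinement        Relion 2.0 GPU  p2.8xlarge    0.7           5.6           2gpu12cpu     2.6           Local server  1.8
Postprocessing       Relion 2.0      p2.8xlarge    0.003         0.03          2gpu12cpu     0.004         Local server  0.003

(a) One figure covering movie transfer, alignment and CTF estimation.
(b) One figure covering alignment and CTF estimation.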
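
As a rough cross-check of the table above, the per-step times from 2D classification through postprocessing can be summed per platform. This sketch assumes those five steps all ran on the machine type listed for 2D classification (p2.8xlarge on AWS, 2gpu12cpu on FedCloud); the variable names are illustrative.

```python
# Hours per step, in table order: 2D classification, initial volume,
# 3D classification, 3D refinement, postprocessing.
hours = {
    "AWS p2.8xlarge": [6, 0.08, 0.6, 0.7, 0.003],
    "FedCloud 2gpu12cpu": [25, 0.16, 2.1, 2.6, 0.004],
    "Local server": [8, 0.22, 1.3, 1.8, 0.003],
}
# Reported AWS cost in dollars for the same five steps.
aws_cost_usd = [42, 0.7, 4.7, 5.6, 0.03]

totals = {platform: sum(steps) for platform, steps in hours.items()}
total_aws_cost = sum(aws_cost_usd)

for platform, total in totals.items():
    print(f"{platform}: {total:.1f} h")
print(f"AWS cost for these steps: ${total_aws_cost:.2f}")
```

In every column the total is dominated by 2D classification.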

The following results are not comparable with those above, since the particle size was 512 px instead of 200 px.

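Step           Program         Environment       Machine type      Time (hours)  Cost ($)
3D refinement  Relion 1.4 CPU  AWS EC2 Ireland   x1.32xlarge       28            448
3D refinement  Relion 1.4 CPU  AWS EC2 Ireland   r3.8xlarge        88            261
3D refinement  Relion 1.4 CPU  AWS EC2 Ireland   4 r3.8xlarge      27            325
3D refinement  Relion 1.4 CPU  FedCloud          universe          166
3D refinement  Relion 1.4 CPU  Local server CNB  Local server CPU  74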
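
The reported times and costs for the CPU-only 3D refinement imply an hourly rate per instance, which makes the single-node and multi-node runs easier to compare. This is a minimal sketch: `implied_hourly_rate` is an illustrative helper, and it assumes "4 r3.8xlarge" means four instances running for the whole 27 hours.

```python
def implied_hourly_rate(cost_usd, hours, n_instances=1):
    """Cost per instance-hour implied by a reported total cost and run time."""
    return cost_usd / (hours * n_instances)


# (label, reported cost in $, reported time in hours, number of instances)
runs = [
    ("x1.32xlarge", 448, 28, 1),
    ("r3.8xlarge", 261, 88, 1),
    ("4 x r3.8xlarge", 325, 27, 4),
]

for name, cost, hours, n in runs:
    rate = implied_hourly_rate(cost, hours, n)
    print(f"{name}: {rate:.2f} $/instance-hour")

# Trade-off between one and four r3.8xlarge instances.
speedup = 88 / 27
cost_ratio = 325 / 261
print(f"speed-up {speedup:.2f}x for {cost_ratio:.2f}x the cost")
```

On these numbers, going from one to four r3.8xlarge instances is roughly 3.3 times faster for about 1.25 times the cost.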

slide-18
SLIDE 18

Conclusions

  • GPUs have changed the EM processing paradigm

– Time / Cost

  • Cloud platforms can be a good solution for small labs that do not want to invest in hardware, or for occasional needs (training)
  • ScipionCloud allows scientists to try and use the Scipion framework without dealing with installation and configuration

slide-19
SLIDE 19

Plans for the future

  • Improve remote desktop visualization

– Update Guacamole installation

– Integrate VirtualGL + TurboVNC with Guacamole

  • Upgrade to Ubuntu 16.04
  • Dynamic cluster support on Federated Cloud
  • Improve image contextualization

-> INDIGO solutions

slide-20
SLIDE 20

Acknowledgments

Projects:

  • EGI Engage Competence Center
  • Instruct Pilot EM cloud computing

People:

  • Enol Fernandez (EGI.eu)
  • Boris Parak (CESNET)
  • Viet Tran and the other support staff at IISAS GPUCloud
slide-21
SLIDE 21

References

  • Scipion project: http://scipion.cnb.csic.es
  • MoBrain project: https://mobrain.egi.eu
  • INSTRUCT: http://www.structuralbiology.eu
  • Westlife project: http://about.west-life.eu
  • StarCluster: http://star.mit.edu/cluster/index.html
  • Guacamole: http://guacamole.incubator.apache.org