Computational Biophysics in the Petascale Computing Era



slide-1
SLIDE 1

Rommie E. Amaro . UC San Diego . Blue Waters Symposium . June 2018

Computational Biophysics in the Petascale Computing Era

slide-2
SLIDE 2

Convergence of HPC, data science, & data enabling transformative advances at the intersection of observational and simulation sciences

Figure: growth of compute power and simulated system size over time

1993: HP 735, 12 CPUs; protein, 10k atoms, 100s of ps
1997: SGI Origin, 128 CPUs; ion channel, 100k atoms, 1 ns
2002: LeMieux, 3k CPUs; ATPase, 500k atoms, 10s of ns
2007: Ranger, 60k CPUs; ribosome, 2 million atoms, 100s of ns
2013: 360,000 cores + GPU acceleration; enveloped virus, 160 million+ atoms, 1-100 μs

Other labels: Anton, Exascale

BW is a key component of the cyberinfrastructure ecosystem

slide-3
SLIDE 3

Influenza Trypanosomiasis Cancer Chlamydia

NAMD, AMBER, GROMACS, MARTINI…

slide-4
SLIDE 4

Biophysics on Blue Waters

  • Biophysics 15%
  • Elementary Particle Physics 13%
  • Stellar Astronomy and Astrophysics 13%
  • Physics 8%
  • Earth Sciences 7%
  • Chemistry 6%
  • Astronomical Sciences 5%
  • Atmospheric Sciences 4%
  • Molecular Biosciences 4%
  • Engineering 4%
  • Materials Research 3%
  • Fluid, Particulate, and Hydraulic Systems 3%
  • Extragalactic Astronomy and Cosmology 2%
  • Magnetospheric Physics 2%
  • Biological Sciences 2%
  • Galactic Astronomy 1%
  • Nuclear Physics 1%
  • Climate Dynamics 1%
  • Planetary Astronomy 1%
  • Computer and Computation Research 1%
  • Biochemistry and Molecular Structure and Function 1%
  • Geophysics 0%
  • Social, Behavioral, and Economic Sciences 0%
  • Computer and Information Science and Engineering 0%
  • Neuroscience Biology 0%
  • Chemical, Thermal Systems 0%
  • Other 5%

Actual Usage by Discipline, 4/2013-5/2018

slide-5
SLIDE 5

Biophysics on Blue Waters

DI (Data Intensive): uses large numbers of files, e.g. large disk space/bandwidth, or automated workflows/off-site transfers.
GA (GPU-Accelerated): written to run faster on XK nodes than on XE nodes.
TN (Thousand Node): scales to at least 1,000 nodes for production science.
MI (Memory Intensive): uses at least 50 percent of available memory on 1,000-node runs.
BW (Blue Waters): research only possible on Blue Waters.
MP (Multi-Physics/multi-scale): job spans multiple length/timescales or physical/chemical processes.
ML (Machine Learning): employs deep learning or other techniques, includes “big data”.
CI (Communication-Intensive): requires high-bandwidth/low-latency interconnect for frequent, tightly coupled messaging.
IA (Industry Applicable): researcher has private sector collaborators or results directly applicable to industry.

slide-6
SLIDE 6

Computational biophysics bridges gaps across scales

e.g., Can we understand the drug target in its real environment? Can we understand the molecular and chemical mechanisms underlying disease?

Blue Waters took us into and across these key “capability gaps”, engaging all-atom & coarse-grained MD to give unseen views into the inner workings of cells at the molecular level

slide-7
SLIDE 7

/ OL15 is 0.44 Å from the NMR average of the Dickerson DNA dodecamer

Tom Cheatham, University of Utah

Reproducibility and convergence (ensembles, replica exchange): we can overcome the sampling problem for modest systems (tetraloops and other RNA motifs)

Force field assessment, validation, and optimization

slide-8
SLIDE 8

Inner gate opening shifts the conformational preference of the selectivity filter (SF) from the conductive to the constricted conformation

Figure labels: Pinched, Conductive

Allosteric Dynamics of C-type Inactivation in the KcsA Potassium Channel

Benoît Roux, Eduardo Perozo, Jing Li

slide-9
SLIDE 9

These waters are now visible in a new high-resolution structure of the open-inactivated KcsA (Perozo & Cuello, private communication)

slide-10
SLIDE 10

The hepatitis B viral capsid is a semi-permeable container with charge selectivity.

Sodium (+) translocates five times faster than chloride (-) in HBV capsids.
Analysis of 6 M solvent particles in parallel on Blue Waters.
230 Blue Waters XK nodes for 6 months.

slide-11
SLIDE 11

HBV flexibility reveals complex dynamics

Hepatitis B virus capsid

Hadden, JA., Perilla, JR. et al. eLife (2018)

slide-12
SLIDE 12

CONFORMATIONAL DYNAMICS

MOLECULAR DYNAMICS

ELECTRON DYNAMICS

BROWNIAN DYNAMICS

Bottom-up biology of an entire photosynthetic cell organelle!

JACS 138, 12077 (2016); eLife 5, e09541 (2016); JACS 139, 293 (2017)

slide-13
SLIDE 13

LARGE-SCALE COARSE-GRAINED MOLECULAR SIMULATIONS OF THE VIRAL LIFECYCLE OF HIV-1

Gregory A. Voth University of Chicago

The immature HIV-1 assembly process is catalyzed by scaffolds

RNA co-localizes protein & promotes assembly

Pak, Grime, … and Voth. PNAS 114:E10056 (2017)

Membrane deformation co-localizes & promotes assembly

slide-14
SLIDE 14

HIV-1 capsid, HIV-1 virion

186 hexamers, 12 pentamers

slide-15
SLIDE 15

HIV capsid: 4.2 million atoms, 1300+ proteins

HIV capsid contains 186 hexamers, 12 pentamers

slide-16
SLIDE 16

A204 E213 K203 I201

Perilla and Schulten, Nat. Commun. (2017), 15959

  • Acoustic analysis reveals allostery between distant sites
  • Contact points control curvature
  • Ions permeate through capsid

slide-17
SLIDE 17

How membrane organization controls influenza infection: simulation & experiment

Simulations yield a new molecular organizing principle for cholesterol that controls influenza binding and infection. Zawada…Kasson, 2016; Goronzy…Kasson, 2018.

slide-18
SLIDE 18

Routine dataset is 1.2 trillion pixels

  • 100,000s of structures in a single dataset

3D structural data to build visible virtual cells

Serial Section EM

Resin-embedded samples

Serial Block EM

Resin-embedded tissues

slide-19
SLIDE 19

Extending Molecular Structure to Cellular Environments

slide-20
SLIDE 20
slide-21
SLIDE 21

Cell-centered, data-centric modeling framework

slide-22
SLIDE 22

Alasdair Steven, NIH

PyMolecule LipidWrapper CellPACK Fully Atomic Reconstructions

Moving from single protein to whole virus

Johnson et al., Nature Methods (2014); Durrant & Amaro, PLOS Comp Bio (2014).

  • Improved sense of the physical arrangement of biological entities in complex biological milieu
  • Enables simultaneous study of multiple components
  • Mesoscale molecular models as a platform for other simulation approaches (e.g., Brownian dynamics, MCell, lattice Boltzmann MD)

… leads us to new avenues of investigation, not possible on the single protein scale

slide-23
SLIDE 23

114,688 processors (16,384 Blue Waters nodes)

25.6 steps/s or ~4.5 ns/day
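The two throughput figures on this slide are linked by the integration timestep, which the slide does not state. The sketch below shows the conversion under an assumed 2 fs timestep (a common choice for all-atom MD, but an assumption here); the function name `ns_per_day` is illustrative only.

```python
SECONDS_PER_DAY = 86_400


def ns_per_day(steps_per_second: float, timestep_fs: float) -> float:
    """Convert MD throughput in steps/s to simulated ns per wall-clock day."""
    steps_per_day = steps_per_second * SECONDS_PER_DAY
    femtoseconds_per_day = steps_per_day * timestep_fs
    return femtoseconds_per_day * 1e-6  # 1 ns = 1e6 fs


# 25.6 steps/s is the rate quoted on the slide; 2.0 fs is an ASSUMED timestep.
rate = ns_per_day(25.6, timestep_fs=2.0)
print(f"{rate:.2f} ns/day")
```

Under that assumption the result is roughly 4.4 ns/day, close to the ~4.5 ns/day quoted.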

Timeline: 2013, 2014, 2015, 2016

120 ns total, 12 TB

Durrant and Amaro, unpublished (2017)

Timeline phases: Equilibration, membrane; System & tool building, waiting; Prod; Prop. Rev.

160 million atoms, with explicit solvent

slide-24
SLIDE 24

Petascale MD of Fully Enveloped Flu Virus

  • Largest biological system ever simulated (~165 million atoms)
  • 4.5 ns/day using 114,688 CPUs
  • 158 ns total simulation
  • Saving every 20 ps → ~25 TB of data
  • Collaboration with TCBG P41
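The data volume in the list above follows from the number of saved frames times the size of one frame. The sketch below is a back-of-the-envelope estimate that assumes each frame stores only single-precision x, y, z coordinates (12 bytes per atom); the on-disk format is not given on the slide, and `trajectory_size_tb` is a hypothetical helper.

```python
def trajectory_size_tb(n_atoms: int, total_ns: float, save_interval_ps: float,
                       bytes_per_coordinate: int = 4) -> float:
    """Estimate trajectory size in decimal terabytes (coordinates only)."""
    n_frames = round(total_ns * 1000 / save_interval_ps)  # ns -> ps
    bytes_per_frame = n_atoms * 3 * bytes_per_coordinate   # x, y, z per atom
    return n_frames * bytes_per_frame / 1e12


# Figures from the slide: ~165 million atoms, 158 ns, one frame every 20 ps.
# The 4-byte coordinate size is an ASSUMPTION, not a value from the talk.
size = trajectory_size_tb(165_000_000, total_ns=158, save_interval_ps=20)
print(f"{size:.1f} TB")
```

With these assumptions the coordinates alone come to about 15.6 TB across 7,900 frames, so this is best read as a lower bound on the ~25 TB quoted; the remainder would have to come from output not modeled here.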
slide-25
SLIDE 25

Active Inactive

Markov state models define metastable states and transitions between states. They allow one to extract long-timescale dynamics from many short-timescale simulations.
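As a rough illustration of how many short trajectories combine into one kinetic model, the sketch below counts transitions between discrete states at a fixed lag, row-normalizes the counts into a transition matrix, and iterates that matrix to obtain long-time (stationary) populations. The toy trajectories and both function names are made up for illustration; a production workflow would use a dedicated MSM package with proper state discretization and validation.

```python
def estimate_transition_matrix(trajectories, n_states, lag=1):
    """Row-normalized transition counts pooled over many short trajectories."""
    counts = [[0] * n_states for _ in range(n_states)]
    for traj in trajectories:
        # Pair each frame with the frame `lag` steps later.
        for start, end in zip(traj[:-lag], traj[lag:]):
            counts[start][end] += 1
    matrix = []
    for row in counts:
        total = sum(row)
        if total == 0:
            raise ValueError("a state was never visited as a starting point")
        matrix.append([c / total for c in row])
    return matrix


def stationary_distribution(matrix, n_iter=1000):
    """Propagate a uniform distribution until it stops changing."""
    n = len(matrix)
    p = [1.0 / n] * n
    for _ in range(n_iter):
        p = [sum(p[i] * matrix[i][j] for i in range(n)) for j in range(n)]
    return p


# Toy data (hypothetical): two short trajectories over states 0 and 1.
short_runs = [[0, 0, 0, 1, 1, 0, 0], [1, 1, 1, 0, 0, 0, 1]]
T = estimate_transition_matrix(short_runs, n_states=2)
pi = stationary_distribution(T)
print(T, pi)
```

Neither toy trajectory is long enough to equilibrate on its own, yet the pooled transition matrix still yields equilibrium populations, which is the sense in which short simulations give access to long-timescale behavior.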

Cell-scale Markov state models of protein dynamics

Swope, Pande, Schutte, Noe…

slide-26
SLIDE 26

MSMs characterize loop dynamics & druggable pockets

Virion has 30 NAs (neuraminidases) and 236 HAs (hemagglutinins)

Enough sampling to make a Markov state model (MSM) of NA loop dynamics

2-state macrostate model

  • Open/closed

MFPT (mean first-passage time) for the 150-loop:

  • Open to closed 52.9 ns
  • closed to open 198.4 ns
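The two mean first-passage times above can be turned into approximate rates and equilibrium populations. The sketch below assumes simple two-state, first-order kinetics in which each rate is the reciprocal of the corresponding MFPT; that is a simplification of the MSM described on the slide, and `two_state_kinetics` is a hypothetical helper.

```python
def two_state_kinetics(mfpt_open_to_closed_ns: float,
                       mfpt_closed_to_open_ns: float) -> dict:
    """Rates, populations and relaxation time for open <-> closed exchange."""
    k_open_to_closed = 1.0 / mfpt_open_to_closed_ns  # ns^-1
    k_closed_to_open = 1.0 / mfpt_closed_to_open_ns  # ns^-1
    k_total = k_open_to_closed + k_closed_to_open
    return {
        # Detailed balance: p_open * k_open_to_closed = p_closed * k_closed_to_open
        "p_open": k_closed_to_open / k_total,
        "p_closed": k_open_to_closed / k_total,
        "relaxation_time_ns": 1.0 / k_total,
    }


# MFPT values quoted on the slide for the 150-loop.
result = two_state_kinetics(52.9, 198.4)
print({name: round(value, 3) for name, value in result.items()})
```

Under this assumption the loop would spend about 21% of its time open and 79% closed, with a relaxation time of about 42 ns.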
slide-27
SLIDE 27

Molecular simulation at the mesoscale

Scale labels: 10 nm, 100 nm, 1000 nm (1 μm)

Biophysics is ready for exascale!

slide-28
SLIDE 28

Acknowledgements

http://nbcr.ucsd.edu http://amarolab.ucsd.edu

slide-29
SLIDE 29

Dedicated to Klaus Schulten, 1947-2016

“Why? … because we can.”