Rommie E. Amaro . UC San Diego . Blue Waters Symposium . June 2018
Computational Biophysics in the Petascale Computing Era
Convergence of HPC, data science, & data enabling transformative advances at the intersection of observational and simulation sciences
Compute power over time (figure):
1993: HP 735, 12 CPUs; protein, 10k atoms, 100s of ps
1997: SGI Origin, 128 CPUs; ion channel, 100k atoms, 1 ns
2002: LeMieux, 3k CPUs; ATPase, 500k atoms, 10s of ns
2007: Ranger, 60k CPUs; ribosome, 2 million atoms, 100s of ns
2013: 360,000 cores + GPU acceleration; enveloped virus, 160 million+ atoms, 1-100 μs
Also labeled: Anton, Exascale
BW is a key component of the cyberinfrastructure ecosystem
Influenza Trypanosomiasis Cancer Chlamydia
NAMD, AMBER, GROMACS, MARTINI…
Biophysics on Blue Waters
Actual Usage by Discipline, 4/2013-5/2018
Biophysics 15%
Elementary Particle Physics 13%
Stellar Astronomy and Astrophysics 13%
Physics 8%
Earth Sciences 7%
Chemistry 6%
Astronomical Sciences 5%
Atmospheric Sciences 4%
Molecular Biosciences 4%
Engineering 4%
Materials Research 3%
Fluid, Particulate, and Hydraulic Systems 3%
Extragalactic Astronomy and Cosmology 2%
Magnetospheric Physics 2%
Biological Sciences 2%
Galactic Astronomy 1%
Nuclear Physics 1%
Climate Dynamics 1%
Planetary Astronomy 1%
Computer and Computation Research 1%
Biochemistry and Molecular Structure and Function 1%
Geophysics 0%
Social, Behavioral, and Economic Sciences 0%
Computer and Information Science and Engineering 0%
Neuroscience Biology 0%
Chemical, Thermal Systems 0%
Other 5%
Biophysics on Blue Waters
DI Data Intensive: uses large numbers of files, e.g. large disk space/bandwidth, or automated workflows/off-site transfers.
GA GPU-Accelerated: written to run faster on XK nodes than on XE nodes
TN Thousand Node: scales to at least 1,000 nodes for production science
MI Memory Intensive: uses at least 50 percent of available memory on 1,000-node runs
BW Blue Waters: research only possible on Blue Waters
MP Multi-Physics/multi-scale: job spans multiple length/timescales or physical/chemical processes
ML Machine Learning: employs deep learning or other techniques, includes “big data”
CI Communication-Intensive: requires high-bandwidth/low-latency interconnect for frequent, tightly coupled messaging
IA Industry Applicable: researcher has private sector collaborators or results directly applicable to industry
Computational biophysics bridges gaps across scales
e.g., Can we understand the drug target in its real environment? Can we understand the molecular and chemical mechanisms underlying disease?
Blue Waters took us into and across these key “capability gaps”, engaging all-atom and coarse-grained MD to give unseen views into the inner workings of cells at the molecular level
/ OL15 is 0.44 Å from the NMR average of the Dickerson DNA dodecamer
Tom Cheatham, University of Utah
Reproducibility and convergence (ensembles, replica exchange): we can overcome the sampling problem for modest systems (tetraloops and other RNA motifs)
Force field assessment, validation, and optimization
Inner gate opening shifts the selectivity filter (SF) conformational preference from the conductive to the constricted conformation
(Figure labels: Pinched, Conductive)
Allosteric Dynamics of C-type Inactivation in the KcsA Potassium Channel
Benoît Roux Eduardo Perozo Jing Li
These waters are now visible in a new high-resolution structure of the open-inactivated KcsA (Perozo & Cuello, private communication)
The hepatitis B viral capsid is a semi-permeable container with charge selectivity.
Sodium (+) translocates five times faster than chloride (-) in HBV capsids.
Analysis of 6 million solvent particles in parallel on Blue Waters.
230 Blue Waters XK nodes for 6 months
HBV flexibility reveals complex dynamics
Hepatitis B virus capsid Hadden, JA., Perilla, JR. et al. eLife (2018)
CONFORMATIONAL DYNAMICS
MOLECULAR DYNAMICS
ELECTRON DYNAMICS
BROWNIAN DYNAMICS
Bottom-up biology of an entire photosynthetic cell organelle!
JACS 138, 12077 (2016); eLife 5, e09541 (2016); JACS 139, 293 (2017)
LARGE-SCALE COARSE-GRAINED MOLECULAR SIMULATIONS OF THE VIRAL LIFECYCLE OF HIV-1
Gregory A. Voth University of Chicago
The immature HIV-1 assembly process is catalyzed by scaffolds
RNA co-localizes protein & promotes assembly
Pak, Grime, … and Voth. PNAS 114:E10056 (2017)
Membrane deformation co-localizes & promotes assembly
HIV-1 capsid HIV-1 virion
HIV capsid: 4.2 million atoms, 1300+ proteins
HIV capsid contains 186 hexamers and 12 pentamers
(Figure labels, residues: A204, E213, K203, I201)
Perilla and Schulten. Nat. Commun. (2017), 15959
Acoustic analysis reveals allostery between distant sites
Contact points control curvature
Ions permeate through capsid
How membrane organization controls influenza infection: simulation & experiment
Simulations yield a new molecular organizing principle for cholesterol that controls influenza binding and infection. Zawada…Kasson, 2016; Goronzy…Kasson, 2018.
Routine dataset is 1.2 trillion pixels
- 100,000s of structures in a single dataset
3D structural data to build visible virtual cells
Serial Section EM
Resin-embedded samples
Serial Block EM
Resin-embedded tissues
Extending Molecular Structure to Cellular Environments
Cell-centered, data-centric modeling framework
Alasdair Steven, NIH
PyMolecule LipidWrapper CellPACK Fully Atomic Reconstructions
Moving from single protein to whole virus
Johnson et al., Nature Methods (2014); Durrant & Amaro, PLOS Comp Bio (2014).
- Improved sense of the physical arrangement of biological entities in a complex biological milieu
- Enables simultaneous study of multiple components
- Mesoscale molecular models as a platform for other simulation approaches (e.g., Brownian dynamics, MCell, lattice Boltzmann MD)
… leads us to new avenues of investigation, not possible on the single protein scale
114,688 processors (16,384 Blue Waters nodes)
25.6 steps/s or ~4.5 ns/day
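The two throughput figures on this slide are linked by the integration timestep, which the slide does not state. As a rough sanity check, the sketch below converts steps per second to ns/day and backs out the timestep implied by the quoted numbers. The 2 fs value is an assumption (a common choice for all-atom MD), not something given in the talk, and the function names are illustrative.

```python
SECONDS_PER_DAY = 86_400
FS_PER_NS = 1_000_000


def ns_per_day(steps_per_second, timestep_fs):
    """Simulated nanoseconds per wall-clock day at a given step rate."""
    return steps_per_second * SECONDS_PER_DAY * timestep_fs / FS_PER_NS


def implied_timestep_fs(steps_per_second, quoted_ns_per_day):
    """Timestep (fs) that would make the two quoted throughput figures agree."""
    return quoted_ns_per_day * FS_PER_NS / (steps_per_second * SECONDS_PER_DAY)


steps_per_second = 25.6        # from the slide
quoted_ns_per_day = 4.5        # from the slide (approximate)
assumed_timestep_fs = 2.0      # ASSUMPTION: not stated in the talk

rate = ns_per_day(steps_per_second, assumed_timestep_fs)
implied = implied_timestep_fs(steps_per_second, quoted_ns_per_day)

# Processor count divided by node count, both taken from the slide.
processors_per_node = 114_688 // 16_384

print(f"ns/day at {assumed_timestep_fs} fs: {rate:.2f}")
print(f"timestep implied by ~4.5 ns/day: {implied:.2f} fs")
print(f"processors per node: {processors_per_node}")
```

Under the 2 fs assumption the step rate gives about 4.4 ns/day, consistent with the rounded ~4.5 ns/day on the slide.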
Project timeline (figure): 2013, 2014, 2015, 2016
120 ns total, 12 TB
Durrant and Amaro, unpublished (2017)
(Timeline labels: Equilibration, membrane; System & tool building, waiting; Prod; Prop.; Rev.)
160 million atoms, with explicit solvent
Petascale MD of Fully Enveloped Flu Virus
- Largest biological system ever simulated (~165 million atoms)
- 4.5 ns/day using 114,688 CPUs
- 158 ns total simulation
- Saving every 20 ps → ~25 TB of data
- Collaboration with TCBG P41
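The figures in the bullets above fix a few derived quantities, worked through in the short sketch below: the number of saved frames, the wall-clock time at the quoted throughput, and the average storage per frame. Reading TB as 10^12 bytes is an assumption, and the per-atom number is only an implied average, since the slide does not say what each saved frame contains.

```python
total_ns = 158.0            # total simulated time, from the slide
save_interval_ps = 20.0     # trajectory output interval, from the slide
rate_ns_per_day = 4.5       # throughput, from the slide
total_bytes = 25e12         # ~25 TB, ASSUMING decimal terabytes
n_atoms = 165e6             # ~165 million atoms, from the slide

# Number of saved frames: total time divided by the output interval.
n_frames = round(total_ns * 1000.0 / save_interval_ps)

# Wall-clock days of production at the quoted throughput (ignores queue time).
wall_clock_days = total_ns / rate_ns_per_day

# Average storage per frame and per atom implied by the quoted totals.
bytes_per_frame = total_bytes / n_frames
bytes_per_atom_per_frame = bytes_per_frame / n_atoms

print(f"frames saved: {n_frames}")
print(f"wall-clock days at 4.5 ns/day: {wall_clock_days:.1f}")
print(f"GB per frame: {bytes_per_frame / 1e9:.2f}")
print(f"bytes per atom per frame: {bytes_per_atom_per_frame:.1f}")
```

The result is 7,900 frames of roughly 3 GB each, which gives a sense of why I/O and storage are part of the cost at this scale.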
(Figure labels: Active, Inactive)
Markov state models define metastable states and the transitions between them
They allow one to extract long-timescale dynamics from many short-timescale simulations
Cell-scale Markov state models of protein dynamics
Swope, Pande, Schutte, Noe…
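As a minimal illustration of the idea, the sketch below estimates a Markov state model from several short discrete-state trajectories by counting transitions at a fixed lag and row-normalizing the counts. The toy trajectories, the two-state labeling, and the helper names (`count_matrix`, `transition_matrix`, `stationary_distribution`) are made up for illustration. This is a simple count-based estimator, not the method used in the talk; production analyses typically rely on a dedicated MSM package and a reversible estimator.

```python
import numpy as np


def count_matrix(trajs, n_states, lag=1):
    """Count transitions i -> j observed `lag` frames apart (lag >= 1)."""
    counts = np.zeros((n_states, n_states))
    for traj in trajs:
        traj = np.asarray(traj)
        for i, j in zip(traj[:-lag], traj[lag:]):
            counts[i, j] += 1
    return counts


def transition_matrix(counts):
    """Row-normalize counts into transition probabilities."""
    row_sums = counts.sum(axis=1, keepdims=True)
    if np.any(row_sums == 0):
        raise ValueError("every state needs at least one outgoing transition")
    return counts / row_sums


def stationary_distribution(T):
    """Left eigenvector of T with eigenvalue 1, normalized to sum to 1."""
    evals, evecs = np.linalg.eig(T.T)
    idx = np.argmin(np.abs(evals - 1.0))
    pi = np.real(evecs[:, idx])
    return pi / pi.sum()


# Hypothetical short trajectories; 0 and 1 are arbitrary state labels.
trajs = [
    [0, 0, 0, 1, 1, 1, 1, 0],
    [1, 1, 1, 1, 0, 0, 1, 1],
    [0, 0, 1, 1, 1, 1, 1, 1],
]

C = count_matrix(trajs, n_states=2, lag=1)
T = transition_matrix(C)
pi = stationary_distribution(T)

print("counts:\n", C)
print("transition matrix:\n", T)
print("stationary distribution:", pi)
```

No single trajectory here needs to sample the equilibrium distribution; the long-time behavior comes from the pooled transition statistics.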
MSMs characterize loop dynamics & druggable pockets
Virion has 30 NAs, 236 HAs
Enough sampling to make a Markov state model (MSM) of NA loop dynamics
2-state macrostate model
- open/closed
MFPT (mean first passage time) for the 150-loop:
- open to closed: 52.9 ns
- closed to open: 198.4 ns
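For a two-state model, the two mean first passage times also determine the equilibrium populations and the relaxation time, if one assumes simple first-order kinetics with rates equal to the inverse MFPTs. That assumption is made here for illustration only; the slide does not report populations, and the function name is hypothetical.

```python
def two_state_summary(mfpt_open_to_closed_ns, mfpt_closed_to_open_ns):
    """Equilibrium populations and relaxation time for a two-state model.

    ASSUMPTION: first-order kinetics with rate = 1 / MFPT in each direction.
    """
    k_open_to_closed = 1.0 / mfpt_open_to_closed_ns
    k_closed_to_open = 1.0 / mfpt_closed_to_open_ns
    k_total = k_open_to_closed + k_closed_to_open
    # Detailed balance: p_open * k_open_to_closed = p_closed * k_closed_to_open
    p_open = k_closed_to_open / k_total
    p_closed = k_open_to_closed / k_total
    relaxation_time_ns = 1.0 / k_total
    return p_open, p_closed, relaxation_time_ns


# MFPT values for the 150-loop, taken from the slide.
p_open, p_closed, tau_ns = two_state_summary(52.9, 198.4)

print(f"open population:   {p_open:.2f}")
print(f"closed population: {p_closed:.2f}")
print(f"relaxation time:   {tau_ns:.1f} ns")
```

Under this assumption the loop would spend roughly one fifth of the time open and four fifths closed.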
Molecular simulation at the mesoscale
(Scale labels: 10 nm, 100 nm, 1000 nm (1 μm))
Biophysics is ready for exascale!
Acknowledgements
http://nbcr.ucsd.edu http://amarolab.ucsd.edu