molecular dynamics simulations of nucleic acids enabled by Blue - - PowerPoint PPT Presentation

Convergence and reproducibility in molecular dynamics simulations of nucleic acids enabled by Blue Waters Thomas E. Cheatham III Professor, Dept. of Medicinal Chemistry, College of Pharmacy Director, Center for High Performance Computing


slide-1
SLIDE 1

Thomas E. Cheatham III

Professor, Dept. of Medicinal Chemistry, College of Pharmacy

Director, Center for High Performance Computing

University of Utah

Convergence and reproducibility in molecular dynamics simulations of nucleic acids enabled by Blue Waters

slide-2
SLIDE 2

People: Niel Henriksen, Hamed Hayatshahi, Dan Roe, Julien Thibault, Kiu Shahrokh, Rodrigo Galindo, Christina Bergonzo, Sean Cornillie, Zahra Heidari

$$$:

  • NIH R01-GM098102 “RNA-ligand interactions: simulation & experiment”
  • NSF CHE-1266307 “CDS&E: Tools to facilitate deeper data analysis, …”
  • NSF ACI-1521728 “RAPID: Optimizing … Ebola membrane fusion inhibitor … design”
  • NSF ACI-1443054 “CIF21 DIBBS: Middleware and high performance analytics…”
  • NSF ACI-1341034 “CC-NIE Integration: Science slices…” network DMZ
  • NSF “Blue Waters” PetaScale Resource Allocation for AMBER RNA

Computer time:

  • XRAC MCA01S027: ~12M core hours
  • ~3M hours on “Anton” (3 past awards), Pittsburgh Supercomputing Center
  • ~7-14M GPU hours !!!

slide-3
SLIDE 3

Accurate modeling of RNA and other biomolecules requires:

  • accurate and fast simulation methods
  • validated RNA, protein, water, ion, and ligand “force fields”
  • “good” experiments to assess results
  • dynamics and complete sampling (convergence, reproducibility)

Question: Is the movement real or artifact?

conformational selection vs. induced fit

slide-4
SLIDE 4

Light at the end of the tunnel? (the good vs. the bad)

RNA vs. DNA

[Photo: “peek-a-boo” slot canyon in Escalante, Utah]

slide-5
SLIDE 5

We’re seeing some progress!!! (vsrSL5)

[Figure: Mg2+ free and Mg2+ bound structures; plots of RMSd to Mg2+ bound and RMSd to Mg2+ free]

Add Mg2+ and convert to correct structure!!!

slide-6
SLIDE 6

amber

~1978 - present

code vs. force field

  • late 60’s: CFF (consistent force field) + early code {Warshel, Levitt, Lifson}
  • 1978: Bruce Gelin thesis @ Harvard {Karplus}
  • Amber 1.1, 1981 (minimization only, f’’)
  • GROMOS, CHARMM, ENCAD, Discover
  • Amber 2, 1984 (+ dynamics)
  • NAMD, GROMACS

Assisted Model Building with Energy Refinement

  • first protein simulation ~1975
  • first nucleic acid simulation in H2O ~1985

slide-7
SLIDE 7

amber

~1978 - present

code vs. force field

Assisted Model Building with Energy Refinement

Amber 14 released April, 2014; AmberTools 15, May 2015

  • 1.23x increase in GPU performance

[fully deterministic, mixed SP/fixed precision, ||-ized]

  • support for M-REMD simulation and analysis
  • constant pH
  • new TI methods
  • more methods ported to GPU
  • protein ff14SB, RNA ff12, DNA ff12+χOL4+ε/ζ
slide-8
SLIDE 8

are the force fields reliable?

(free energetics, sampling, dynamics)

energy “reaction coordinate”

Computer power? experimental

vs.

Short simulations stay near experimental structure; longer simulations invariably move away and often to unrealistic lower energy structures…

slide-9
SLIDE 9

How to fully sample conformational ensemble?

Timescales: fs, ps, ns, μs, ms, s

Brute force (long, contiguous-in-time MD) requires special purpose / unique hardware: D.E. Shaw’s Anton machine, 16 μs/day!

slide-10
SLIDE 10

ff99 + bsc0 + OL χ fix

UUCG-1, sequence 3 (~1.5 µs): ~2.9 Å

Simulated w/out restraints, modern force field, explicit solvent

[Plot: time axis with ticks at 500 ns, 1 µs, 1.5 µs]

slide-11
SLIDE 11

Initial tests: RNA tetraloop

RNA UUCG tetraloop (ff99bsc0 + OL χ) on Anton @ PSC: 2 expt + 1-1.1 µs

[Plot: RMSd (ticks at 2.5 Å and 5.0 Å) vs. simulation time (0.5 and 1 µs); real time: 50 min and 100 min]

slide-12
SLIDE 12

RNA UUCG tetraloop (ff99bsc0 + OL χ):

2 expt + 1-1.1 µs, 2-2.1 µs, 3.2-3.3 µs, 3.5-3.6 µs, 4.5-4.6 µs, 5.6-5.7 µs

[Plot: time axis in µs (1 to 5); vertical axis ticks at 5 Å and 10 Å]

slide-13
SLIDE 13

How to fully sample conformational ensemble?

Timescales: fs, ps, ns, μs, ms, s

Brute force (long, contiguous-in-time MD) requires special purpose / unique hardware: D.E. Shaw’s Anton machine, 16 μs/day!

Timescales: fs, ps, ns

Ensembles of independent simulations: AMBER on GPUs, ~197 ns/day!
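The two throughput figures on this slide can be compared with a little arithmetic. The sketch below is only illustrative: the 10 µs target and the 100-run ensemble size are hypothetical values chosen for the example, and the function name is not from the talk. Aggregate time across an ensemble is not the same thing as one contiguous trajectory of that length.

```python
def days_to_reach(target_us, rate_us_per_day, n_independent=1):
    """Wall-clock days until n_independent runs, each advancing at
    rate_us_per_day, have accumulated target_us of sampling in total."""
    if rate_us_per_day <= 0 or n_independent < 1:
        raise ValueError("rate must be positive and n_independent >= 1")
    return target_us / (rate_us_per_day * n_independent)


ANTON_US_PER_DAY = 16.0   # from the slide: Anton, 16 us/day
GPU_US_PER_DAY = 0.197    # from the slide: AMBER on GPUs, ~197 ns/day

TARGET_US = 10.0          # hypothetical target, for illustration only
N_GPU_RUNS = 100          # hypothetical ensemble size, for illustration only

# One contiguous trajectory on each platform.
anton_days = days_to_reach(TARGET_US, ANTON_US_PER_DAY)
gpu_days_single = days_to_reach(TARGET_US, GPU_US_PER_DAY)

# The same aggregate sampling spread over many independent GPU runs.
gpu_days_ensemble = days_to_reach(TARGET_US, GPU_US_PER_DAY, N_GPU_RUNS)

print(f"Anton, contiguous:      {anton_days:.3f} days")
print(f"One GPU, contiguous:    {gpu_days_single:.1f} days")
print(f"{N_GPU_RUNS} GPUs, aggregate:  {gpu_days_ensemble:.3f} days")
```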

slide-14
SLIDE 14
slide-15
SLIDE 15

Independent simulations of 2KOC “UUCG” tetraloop

…longer runs…

Limited sampling & too complex: Is there a simpler set of systems?

slide-16
SLIDE 16

Today: two “long-time-to-develop” short stories…

  • can we converge DNA duplex structure/dynamics?
  • sampling RNA structure accurately is difficult

slide-17
SLIDE 17

2 ns intervals, 10 ns running average, every 5th frame (~10 µs).

Anton run:
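A running average of the kind described here can be sketched as follows. This assumes one reading of the slide: values sampled every 2 ns and smoothed over a 10 ns window, i.e. 5 samples per window. The function names and the toy data are hypothetical.

```python
def samples_per_window(window_ns, interval_ns):
    """Number of samples in one averaging window (must divide evenly)."""
    n = window_ns / interval_ns
    if n < 1 or n != int(n):
        raise ValueError("window must be a whole multiple of the interval")
    return int(n)


def running_average(values, window):
    """Sliding-window mean; returns len(values) - window + 1 points."""
    if window < 1 or window > len(values):
        raise ValueError("window must be between 1 and len(values)")
    acc = sum(values[:window])
    out = [acc / window]
    for i in range(window, len(values)):
        # Slide the window one sample: add the new value, drop the oldest.
        acc += values[i] - values[i - window]
        out.append(acc / window)
    return out


# Toy series standing in for a property sampled every 2 ns (made-up numbers).
series = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
window = samples_per_window(window_ns=10, interval_ns=2)
smoothed = running_average(series, window)
print(window, smoothed)
```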

slide-18
SLIDE 18

5 “average” structures overlaid @ 1.0-4.0 µs, 1.5-4.5 µs, 2.0-5.0 µs, 2.5-5.5 µs, 3.0-6.0 µs … RMSd (0.028 Å) (0.049 Å) (0.076 Å) (0.160 Å)
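The comparison of window-averaged structures can be illustrated with a minimal sketch. It assumes the frames have already been superimposed on a common reference, so no fitting is done here (a real analysis would include a best-fit step). The function names and the two-atom toy trajectory are hypothetical.

```python
import math


def average_structure(frames):
    """Per-atom mean coordinates over a list of frames.
    Each frame is a list of (x, y, z) tuples, one per atom."""
    n_frames = len(frames)
    n_atoms = len(frames[0])
    return [
        tuple(sum(f[a][k] for f in frames) / n_frames for k in range(3))
        for a in range(n_atoms)
    ]


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two coordinate sets,
    WITHOUT superposition (assumes they are already aligned)."""
    if len(coords_a) != len(coords_b):
        raise ValueError("coordinate sets must have the same number of atoms")
    sq = sum(
        (p[k] - q[k]) ** 2 for p, q in zip(coords_a, coords_b) for k in range(3)
    )
    return math.sqrt(sq / len(coords_a))


# Toy trajectory: 4 frames of a 2-atom "molecule" drifting along y.
frames = [
    [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
    [(0.0, 2.0, 0.0), (1.0, 2.0, 0.0)],
    [(0.0, 4.0, 0.0), (1.0, 4.0, 0.0)],
    [(0.0, 6.0, 0.0), (1.0, 6.0, 0.0)],
]

# Two overlapping windows, in the spirit of the overlapping time windows above.
avg_early = average_structure(frames[0:3])
avg_late = average_structure(frames[1:4])
window_rmsd = rmsd(avg_early, avg_late)
print(window_rmsd)
```

Because the windows overlap heavily, a small RMSd between their averages is partly built in, which is one reason a tiny value alone does not prove convergence.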

…this cannot be right, can it?

(breathing, bending, twisting, …)

~2010-2011

slide-19
SLIDE 19

Test for convergence within and between simulations: Dynamics

  • Principal components (or major modes of motion)
  • Visualization of the first two (dominant) modes of motion
  • Overlap of modes from independent simulations (internal helix)
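One common way to quantify the overlap of mode sets from two independent simulations is the root-mean-square inner product (RMSIP). The slides do not say which overlap measure was used, so this is a sketch under that assumption. It takes the principal-component vectors as given and assumes the modes within each set are mutually orthogonal.

```python
import math


def dot(u, v):
    return sum(x * y for x, y in zip(u, v))


def normalize(v):
    norm = math.sqrt(dot(v, v))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in v]


def rmsip(modes_a, modes_b):
    """Root-mean-square inner product between two equally sized sets of
    modes. 1.0 means the sets span the same subspace; 0.0 means they are
    orthogonal. Assumes modes within each set are mutually orthogonal."""
    if len(modes_a) != len(modes_b) or not modes_a:
        raise ValueError("need two non-empty mode sets of equal size")
    a = [normalize(v) for v in modes_a]
    b = [normalize(v) for v in modes_b]
    total = sum(dot(u, w) ** 2 for u in a for w in b)
    return math.sqrt(total / len(a))


# Toy 3-dimensional "modes" (made-up vectors, not simulation data).
run1 = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
run2 = [[0.0, 2.0, 0.0], [3.0, 0.0, 0.0]]   # same plane, different order/scale
run3 = [[0.0, 0.0, 1.0], [1.0, 0.0, 0.0]]   # shares only one direction

same_subspace = rmsip(run1, run2)
partial = rmsip(run1, run3)
print(same_subspace, partial)
```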

slide-20
SLIDE 20

Test for convergence within and between simulations: How long does it take to converge the PCs?

slide-21
SLIDE 21

are the force fields reliable?

(free energetics, sampling, dynamics)

energy “reaction coordinate”

Computer power? experimental

vs.

  • all tetraloops
  • NMR structures of DNA & RNA
  • crystal simulations
  • RNA motifs
  • RNA-drug interactions
  • quadruplexes

What we typically find if we run long enough…

slide-22
SLIDE 22

r(GACC) tetranucleotide

[Turner / Yildirim]

…a system where we can get complete sampling

NMR suggests two dominant conformations…

…compare to MD simulations in explicit solvent

slide-23
SLIDE 23

r(GACC) tetranucleotide: AMBER ff12

< explicit solvent time-contiguous MD >

slide-24
SLIDE 24
slide-25
SLIDE 25

…still need more sampling!

(enablers)

  • strong GPU performance of AMBER/PMEMD
  • good replica exchange functionality
  • access to Keeneland, Stampede, Blue Waters, …
slide-26
SLIDE 26

Blue Waters PRAC: The main goals are to hierarchically and tightly couple a series of optimized molecular dynamics engines to fully map out the conformational, energetic and chemical landscape of RNA.

independent || MD engines exchanging information (e.g. T, force field, pH, …)

Current players: Cheatham, Roitberg, Simmerling, York, Case

slide-27
SLIDE 27

Standard MD

Replica-exchange MD

r(GACC) tetranucleotide
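For readers new to replica exchange, the standard Metropolis criterion for swapping two temperature replicas can be sketched as below. This is the textbook temperature-REMD acceptance rule, not code from AMBER. The function names, energies, and the 300 K neighbor temperature are hypothetical (277 K appears elsewhere in this talk).

```python
import math
import random

K_B = 0.0019872041  # Boltzmann constant in kcal/(mol K)


def exchange_probability(e_i, e_j, t_i, t_j):
    """Metropolis acceptance probability for swapping the configurations
    of replica i (temperature t_i, potential energy e_i) and replica j
    (temperature t_j, potential energy e_j):
        P = min(1, exp[(beta_i - beta_j) * (e_i - e_j)])"""
    beta_i = 1.0 / (K_B * t_i)
    beta_j = 1.0 / (K_B * t_j)
    delta = (beta_i - beta_j) * (e_i - e_j)
    return 1.0 if delta >= 0.0 else math.exp(delta)


def attempt_exchange(e_i, e_j, t_i, t_j, rng):
    """Return True if the swap is accepted for this random draw."""
    return rng.random() < exchange_probability(e_i, e_j, t_i, t_j)


# Hypothetical energies in kcal/mol for two neighboring replicas.
# The colder replica holds the higher energy: the swap is always accepted.
p_favorable = exchange_probability(e_i=-100.0, e_j=-101.0, t_i=277.0, t_j=300.0)

# The colder replica holds the lower energy: accepted only some of the time.
p_unfavorable = exchange_probability(e_i=-101.0, e_j=-100.0, t_i=277.0, t_j=300.0)

rng = random.Random(0)
accepted = attempt_exchange(-100.0, -101.0, 277.0, 300.0, rng)
print(p_favorable, p_unfavorable, accepted)
```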

slide-28
SLIDE 28

RMSD distribution profiles: Distance from A-form reference

(i.e., each peak shows the population at a certain distance from the reference)
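An RMSD distribution profile of this kind is essentially a normalized histogram of per-frame RMSD values. The sketch below assumes the RMSD to the reference has already been computed for each frame. The bin width, range, and data are made-up values for illustration.

```python
import math


def rmsd_histogram(rmsd_values, bin_width, max_rmsd):
    """Fraction of frames falling in each RMSD bin over [0, max_rmsd).
    Values outside that range are ignored."""
    n_bins = int(math.ceil(max_rmsd / bin_width))
    counts = [0] * n_bins
    for r in rmsd_values:
        if 0.0 <= r < max_rmsd:
            # Guard against floating-point round-up at the upper edge.
            idx = min(int(r / bin_width), n_bins - 1)
            counts[idx] += 1
    total = sum(counts)
    if total == 0:
        raise ValueError("no RMSD values fall inside the histogram range")
    return [c / total for c in counts]


# Made-up per-frame RMSD values (in Angstrom) to a reference structure.
values = [0.4, 0.6, 0.7, 2.1, 2.2, 2.4, 2.6, 3.9]
profile = rmsd_histogram(values, bin_width=1.0, max_rmsd=4.0)

# Two populated regions appear: one near the reference, one 2-3 A away.
for i, frac in enumerate(profile):
    print(f"{i:.0f}-{i + 1:.0f} A: {frac:.3f}")
```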

slide-29
SLIDE 29

Change in “energy representation”

  • pH
  • restraints, umbrella potentials, …
  • force field / parameter sets
  • biasing potentials (aMD)

multi-D REMD – Bergonzo / Roe, Roitberg / Swai

Fukunishi, H., Watanabe, O., and Takada, S., J. Chem. Phys. 2002.

Sugita, Y., Kitao, A., and Okamoto, Y., J. Chem. Phys. 2000.

slide-30
SLIDE 30
slide-31
SLIDE 31
slide-32
SLIDE 32

Can use longer (4 fs) time step!

slide-33
SLIDE 33
slide-34
SLIDE 34

…more complete sampling alters results

Free Energy (kcal/mol)

Kührová et al. 2013 JCTC 500 ns T-REMD

Native 277 K – last 1 μs of 2 μs/replica M-REMD

Ladder-like stem

slide-35
SLIDE 35

ff99 Chen-Garcia shifts the population

  • Folded UUCG tetraloop structure is sampled
  • Iso-energetic structures
slide-36
SLIDE 36

r(GACC): We now get correct 3:1 population of experimental structures with anomalous structures < 5%
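A 3:1 population ratio corresponds to a small free energy difference, via ΔG = -RT ln(p_A / p_B). The sketch below works that out. The temperature of 298.15 K is an assumption made for the example, since the slide does not state one for r(GACC), and the function name is hypothetical.

```python
import math

R_KCAL = 0.0019872041  # gas constant in kcal/(mol K)


def free_energy_difference(pop_a, pop_b, temperature_k):
    """Free energy of state A relative to state B, in kcal/mol, from
    their equilibrium populations: dG = -R T ln(pop_a / pop_b)."""
    if pop_a <= 0 or pop_b <= 0 or temperature_k <= 0:
        raise ValueError("populations and temperature must be positive")
    return -R_KCAL * temperature_k * math.log(pop_a / pop_b)


ASSUMED_T = 298.15  # assumed temperature in K; not given on the slide

# Major conformation relative to the minor one at a 3:1 ratio.
dg_major_vs_minor = free_energy_difference(3.0, 1.0, ASSUMED_T)
print(f"{dg_major_vs_minor:.2f} kcal/mol")
```

At this assumed temperature the difference is well under 1 kcal/mol, which gives a sense of how accurate a force field has to be to reproduce such a ratio.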

slide-37
SLIDE 37

[Figure: panels titled “Avg Structure (150 ns sim)” and “Avg Structure (220 ns sim)”; labels “IZ + N21 + SLLSA5” and “N21 + SLLSA5”]

FUTURE: Ebola membrane fusion inhibitor peptide design

slide-38
SLIDE 38
slide-39
SLIDE 39

2013

slide-40
SLIDE 40

questions?