Gaussian Accelerated Molecular Dynamics (GaMD) Yinglong Miao Center - - PowerPoint PPT Presentation

gaussian accelerated molecular dynamics gamd
SMART_READER_LITE
LIVE PREVIEW

Gaussian Accelerated Molecular Dynamics (GaMD) Yinglong Miao Center - - PowerPoint PPT Presentation

Gaussian Accelerated Molecular Dynamics (GaMD) Yinglong Miao Center for Computational Biology & Department of Molecular Biosciences University of Kansas Mar 19, 2018 Frontiers in Computational Drug Discovery Academia Sinica, Taiwan 1


slide-1
SLIDE 1

Gaussian Accelerated Molecular Dynamics (GaMD)

Yinglong Miao

Center for Computational Biology & Department of Molecular Biosciences University of Kansas Mar 19, 2018 Frontiers in Computational Drug Discovery Academia Sinica, Taiwan

1

slide-2
SLIDE 2

 Method: Gaussian Accelerated Molecular Dynamics (GaMD)  Applications 1.

Protein folding and conformational sampling

2.

Protein-ligand binding and unbinding

3.

Protein-protein binding

 Practical Usage of GaMD

Outline

2

slide-3
SLIDE 3

Computer Simulation is Limited

Large gap remains between simulation and biological timescales: This often leads to insufficient sampling and inaccurate free energy calculations of biomolecules:

3

Protein functional free energy landscape

slide-4
SLIDE 4

Enhanced Sampling is often Constrained

  • Adaptive Biasing Force
  • Metadynamics
  • Umbrella Sampling
  • Conformational Flooding

Require predefined reaction coordinates, e.g., atom distances, torsion angles, etc.

Abrams and Bussi, Entropy, 2014.

4

slide-5
SLIDE 5

V(r): Original potential energy ΔV(r): Boost potential E: Reference energy α: Acceleration factor

Hamelberg, Mongan and McCammon, J. Chem. Phys., 2004.

Accelerated Molecular Dynamics (aMD)

5

Unconstrained enhanced sampling by adding a boost potential ΔV to reduce system energy barriers:

slide-6
SLIDE 6

ΔV ~ tens to hundreds of kcal/mol

A long-standing problem in calculating free energy profiles from aMD simulations, notably for proteins!

But large energetic noise in aMD

Shen and Hamelberg, J. Chem. Phys., 2008.

6

slide-7
SLIDE 7

How to achieve fast & accurate accelerated computer simulations, especially for large biomolecules like proteins?

  • Unconstrained enhanced sampling
  • Free energy calculation

7

slide-8
SLIDE 8

Miao, Feher and McCammon, J. Chem. Theory Comput., 2015.

GaMD enhances sampling by reducing energy barriers with a harmonic boost potential, which follows Gaussian distribution:

V(r): Original potential energy ΔV(r): Boost potential E: Reference energy k: Harmonic force constant.

8

Gaussian Accelerated Molecular Dynamics (GaMD)

slide-9
SLIDE 9

Optimal boost potential is added in GaMD

  • 1. Keep overall shape of original potential energy surface
  • 2. Reduce energy barriers for enhanced sampling
  • 3. Set narrow distribution for the boost potential

9

Miao, Feher and McCammon, J. Chem. Theory Comput., 2015.

slide-10
SLIDE 10

How to reweight GaMD simulations?

  • Cumulant expansion to 2nd order (“Gaussian Approximation”)*:

ΔV is dimensionless as divided by kBT

  • Anharmonicity of ΔV Distribution is defined as:

* Miao, Sinko, Pierce, Bucher, Walker and McCammon, JCTC, 2014.

10

ΔV ~ 0 – 50 kcal/mol

slide-11
SLIDE 11

GaMD Enables Accurate Free Energy Calculation

Cumulant expansion to the 2nd order is exact:

11

F(A): Original free energy; F*(A): GaMD free energy; Fc: Constant; β = 1/kBT; C1, C2: Boost potential cumulants.

Miao, Feher and McCammon, J. Chem. Theory Comput., 2015.

ΔV ~ 0 – 50 kcal/mol

slide-12
SLIDE 12

 Method: Gaussian Accelerated Molecular Dynamics (GaMD)  Applications 1.

Protein folding and conformational sampling

2.

Protein-ligand binding and unbinding

3.

Protein-protein binding

 Practical Usage of GaMD

Outline

12

slide-13
SLIDE 13

Protein Folding

Chignolin:

PDB GaMD

slide-14
SLIDE 14

Protein Folding: Chignolin

ΔVavg (kcal/mol) 9.8 σΔV 2.4

ΔV follows Gaussian distribution:

slide-15
SLIDE 15

Protein Folding: Free Energy Profile

Folded (F), Intermediate (I) and Unfolded (U) states

slide-16
SLIDE 16

Protein-Ligand Binding

T4-lysozyme (protein) + Benzene (ligand):

PDB: 181L

C terminal domain N terminal domain

16

slide-17
SLIDE 17

Protein-Ligand Binding

Benzene binding to T4-lysozyme:

Binding time: ~50 μs (kon = 8x105 − 106 M-1S-1)* ~100 ns (GaMD); ~ 500x speedup

* Feher, Baldwin and Dahlquist. Nat. Struct. Biol. 1996.

slide-18
SLIDE 18

Protein-Ligand Binding

Δvavg (kcal/mol) 36.3 σΔV 4.7

ΔV follows Gaussian distribution:

slide-19
SLIDE 19

Protein-Ligand Binding: Free Energy Profile

Ligand RMSD: Relative to crystal structure; Ncontact: # of protein atoms in contact with ligand.

19

GaMD quantitatively characterizes the Unbound (U), Intermediate (I) and Bound (B) states.

slide-20
SLIDE 20

Protein-Ligand Binding

Semi-open Closed

HIV Protease:

slide-21
SLIDE 21

GaMD Captured Ligand Binding to HIV Protease

Miao, Huang, Walker, McCammon and Chang, Biochemistry, 2018.

slide-22
SLIDE 22

Comparison with Anton MD Simulation

Miao, Huang, Walker, McCammon and Chang, Biochemistry, 2018.

slide-23
SLIDE 23

Ligand Binding: Free Energy Landscape

Miao, Huang, Walker, McCammon and Chang, Biochemistry, 2018.

slide-24
SLIDE 24

Protein-Ligand Binding & Unbinding

M2 Muscarinic GPCR + Arecoline (~5 μM binding affinity):

Miao and McCammon, Proc. Natl. Acad. Sci. U. S. A., 2016.

24

Drug pathway & binding sites

Valuable information for drug design

slide-25
SLIDE 25

Protein-Protein Binding

25

Miao and McCammon, Proc. Natl. Acad. Sci. U. S. A., 2018.

M2 Muscarinic GPCR + G-protein mimetic nanobody:

slide-26
SLIDE 26

Available in widely-used AMBER, NAMD: http://miao.compbio.ku.edu/GaMD/

GaMD Advantages

 ~103 times faster than standard computer simulations  Efficient: Unconstrained Enhanced Sampling

Critical for simulating protein folding, ligand binding & unbinding, etc.

 Accurate: Free Energy Calculation

Quantitative description of biomolecular dynamics.

26

slide-27
SLIDE 27

 Method: Gaussian Accelerated Molecular Dynamics (GaMD)  Applications 1.

Protein folding and conformational sampling

2.

Protein-ligand binding and unbinding

3.

Protein-protein binding

 Practical Usage of GaMD

Outline

27

slide-28
SLIDE 28

GaMD Implementation in AMBER

 Originally in Amber12  A patch was available for Amber 14

(“amber14_GaMD_patch_confidential.txt”):

  • CPU: pmemd and pmemd.MPI
  • GPU: pmemd.cuda and pmemd.cuda.MPI

 Officially released in AMBER16

slide-29
SLIDE 29

GaMD Parameters

 Minimal set of input parameters:

igamd Flag to apply boost potential = 0 (default) no boost is applied = 1 boost on the total potential energy only = 2 boost on the dihedral energy only = 3 dual boost on both dihedral and total potential energy irest_gamd Flag to restart GaMD simulation = 0 (default) new simulation = 1 restart simulation iE Flag to set the threshold energy E = 1 (default) set threshold energy to lower bound E=Vmax = 2 set threshold energy to upper bound E=Vmin+(Vmax-Vmin)/k0 ntcmd Number of cMD steps to calculate Vmax, Vmin, Vavg, σV (default 1,000,000) nteb Number of biasing equilibration steps (default 1,000,000) sigma0P Upper limit of the standard deviation of total potential boost (default 6.0 kcal/mol). sigma0D Upper limit of the standard deviation of dihedral potential boost (default 6.0 kcal/mol).

slide-30
SLIDE 30

Example: Alanine Dipeptide

 How to run GaMD simulations with different

dihedral-, total- and dual-boost potentials?

 How to analyze simulation trajectories?  How to reweight GaMD simulations for free

energy calculations?

slide-31
SLIDE 31

GaMD: Alanine Dipeptide

 http://miao.compbio.ku.edu/GaMD/lib/GaMD_

Amber-Tutorial.pdf

 http://miao.compbio.ku.edu/GaMD/lib/amber-

gamd-tutorial.tgz

 cp amber14-gamd-tutorial.tgz /data/userid/

cd /data/userid/; tar -xvzf amber14-gamd- tutorial.tgz

 export GaMDHOME=/data/userid/amber14-

gamd-tutorial

slide-32
SLIDE 32

Total-boost GaMD

 cd $GaMDHOME/test-dia/gamd-tot  md.in

&cntrl imin = 0, irest = 0, ntx = 1, nstlim = 1000, dt = 0.002, ntc = 2, ntf = 2, tol = 0.000001, iwrap = 1, ntb = 1, cut = 8.0, ntt = 3, temp0 = 300.0, tempi = 300.0, ntpr = 50, ntwx = 50, ntwr = 500, ntxo = 1, ioutfm = 1, ig = -1, ntwprt = 22, igamd = 1, iE = 1, irest_gamd = 0, ntcmd = 200, nteb = 200, ntave = 100, sigma0P = 6.0, &end

 qsub run-gamd.pbs pmemd.cuda -O -i md.in -o md-1.out -p dip.top -c dip.crd - r md-1.rst -x md-1.nc

slide-33
SLIDE 33

Dihedral-boost GaMD

 cd $GaMDHOME/test-dia/gamd-dih  md.in

&cntrl imin = 0, irest = 0, ntx = 1, nstlim = 1000, dt = 0.002, ntc = 2, ntf = 2, tol = 0.000001, iwrap = 1, ntb = 1, cut = 8.0, ntt = 3, temp0 = 300.0, tempi = 300.0, ntpr = 50, ntwx = 50, ntwr = 500, ntxo = 1, ioutfm = 1, ig = -1, ntwprt = 22, igamd = 2, iE = 1, irest_gamd = 0, ntcmd = 200, nteb = 200, ntave = 100, sigma0D = 6.0, &end

 qsub run-gamd.pbs pmemd.cuda -O -i md.in -o md-1.out -p dip.top -c dip.crd - r md-1.rst -x md-1.nc

slide-34
SLIDE 34

Dual-boost GaMD

 cd $GaMDHOME/test-dia/gamd-dual  md.in

&cntrl imin = 0, irest = 0, ntx = 1, nstlim = 1000, dt = 0.002, ntc = 2, ntf = 2, tol = 0.000001, iwrap = 1, ntb = 1, cut = 8.0, ntt = 3, temp0 = 300.0, tempi = 300.0, ntpr = 50, ntwx = 50, ntwr = 500, ntxo = 1, ioutfm = 1, ig = -1, ntwprt = 22, igamd = 3, iE = 1, irest_gamd = 0, ntcmd = 200, nteb = 200, ntave = 100, sigma0P = 6.0, sigma0D = 6.0, &end

 qsub run-gamd.pbs

pmemd.cuda -O -i md.in -o md-1.out -p dip.top -c dip.crd -r md-1.rst

  • x md-1.nc
slide-35
SLIDE 35

Dual-boost GaMD: Restart simulation

 cd $GaMDHOME/test-dia/gamd-dual  md-restart.in

&cntrl imin = 0, irest = 1, ntx = 5, nstlim = 1000, dt = 0.002, ntc = 2, ntf = 2, tol = 0.000001, iwrap = 1, ntb = 1, cut = 8.0, ntt = 3, temp0 = 300.0, tempi = 300.0, ntpr = 50, ntwx = 50, ntwr = 500, ntxo = 1, ioutfm = 1, ig = -1, ntwprt = 22, igamd = 3, iE = 1, irest_gamd = 1, ntcmd = 0, nteb = 0, ntave = 100, sigma0P = 6.0, sigma0D = 6.0, &end

 qsub run-gamd-restart.pbs pmemd.cuda -O -i md-restart.in -o md-2.out -p dip.top -c md-1.rst -r md-2.rst -x md-2.nc

slide-36
SLIDE 36

Dual-boost GaMD: 30ns simulation

 cd $GaMDHOME/test-dia/gamd-dual-30ns  md.in

&cntrl imin = 0, irest = 0, ntx = 1, nstlim = 17000000, dt = 0.002, ntc = 2, ntf = 2, tol = 0.000001, iwrap = 1, ntb = 1, cut = 8.0, ntt = 3, temp0 = 300.0, tempi = 300.0, ntpr = 50, ntwx = 50, ntwr = 500, ntxo = 1, ioutfm = 1, ig = -1, ntwprt = 22, igamd = 3, iE = 1, irest_gamd = 0, ntcmd = 1000000, nteb = 1000000, ntave = 50000, sigma0P = 6.0, sigma0D = 6.0, &end

 qsub run-gamd.pbs pmemd.cuda -O -i md.in -o md-1.out -p dip.top -c dip.crd - r md-1.rst -x md-1.nc

slide-37
SLIDE 37

Analysis: Trajectory

 Calculate dihedral angles

cd $GaMDHOME/test-dia/gamd-dual/results # ./calc_dih.sh

echo "trajin md-1.nc dihedral Phi :1@C :2@N :2@CA :2@C out Phi dihedral Psi :2@N :2@CA :2@C :3@N out Psi" > dih- dia.ptraj cpptraj dip_vac.top < dih-dia.ptraj awk '{print $2}' Phi | tail -n 20 > Phi.dat awk '{print $2}' Psi | tail -n 20 > Psi.dat

slide-38
SLIDE 38

Analysis: Reweighting

Calculate free energy profiles of dihedral angles cd $GaMDHOME/test-dia/gamd-dual-30ns/results # ./reweight.sh

# Prepare input file "weights.dat" in the following format: # Column 1: dV in units of kbT; column 2: timestep; column 3: dV in units of kcal/mol # For AMBER12: # awk 'NR%1==0' gamd.log | awk '{print ($8+$7)" " $3 " " ($8+$7)*(0.001987*300)}' > weights.dat # For AMBER14+: # Ignore the ntcmd and nteb steps nlines=300000 # number of data points used for reweighting tail -n $nlines gamd.log | awk 'NR%1==0' | awk '{print ($8+$7)/(0.001987*300)" " $2 " " ($8+$7)}' > weights.dat

slide-39
SLIDE 39

Analysis: ./reweight.sh Continued…

cd $GaMDHOME/test-dia/gamd-dual-30ns/results # ./reweight.sh

# 1D data # Reweighting using cumulant expansion python PyReweighting-1D.py -input Psi.dat -cutoff 10 - Xdim -180 180 -disc 6 -Emax 20 -job amdweight_CE - weight weights.dat | tee -a reweight_variable.log … # Analyze boost potential distribution and anharmonicity python PyReweighting-1D.py -input Psi.dat -cutoff 10 - Xdim -180 180 -disc 6 -Emax 20 -job amd_dV -weight weights.dat | tee -a reweight_variable.log

slide-40
SLIDE 40

Analysis: ./reweight.sh Continued…

cd $GaMDHOME/test-dia/gamd-dual-30ns/results # ./reweight.sh

# 2D data # Reweighting using cumulant expansion python PyReweighting-2D.py -cutoff 10 -input Phi_Psi - Xdim -180 180 -discX 6 -Ydim -180 180 -discY 6 -Emax 20 - job amdweight_CE -weight weights.dat | tee -a reweight_variable.log … # Analyze boost potential distribution and anharmonicity python PyReweighting-2D.py -cutoff 10 -input Phi_Psi - Xdim -180 180 -discX 6 -Ydim -180 180 -discY 6 -Emax 20 - job amd_dV -weight weights.dat | tee -a reweight_variable.log

slide-41
SLIDE 41

Results of more converged simulations

  • Enhanced sampling of distinct conformations:

αL, αR, β sheet , polyproline II (PII)

  • Low anharmonicity in sampled space, although with

increased values in the energy barrier regions

 Combine 30ns x 3 GaMD simulations

slide-42
SLIDE 42

Questions?

GaMD mailing list: https://sourceforge.net/projects/gamd/lists/gamd-discuss

References:

  • Miao, Y.; Feher, V. A.; McCammon, J. A., Gaussian Accelerated Molecular Dynamics:

Unconstrained Enhanced Sampling and Free Energy Calculation. J. Chem. Theory

  • Comput. 2015, 11, 3584-3595.
  • Pang YT, Miao Y, Wang Y,&McCammon JA (2017) Gaussian Accelerated Molecular

Dynamics in NAMD. J Chem Theory Comput13(1):9-19.

  • Miao Y & McCammon JA (2017) Gaussian Accelerated Molecular Dynamics: Theory,

Implementation and Applications. Annual Reports in Computational Chemistry13:231- 278.

  • Miao, Y.; Sinko, W.; Pierce, L.; Bucher, D.; McCammon, J. A., Improved reweighting
  • f accelerated molecular dynamics simulations for free energy calculation. J. Chem.

Theory Comput. 2014, 10, 2677–2689. http://miao.compbio.ku.edu/GaMD http://miao.compbio.ku.edu/PyReweighting