Simulating Biomolecules with Variable Protonation State: Constant-pH - PowerPoint PPT Presentation

Simulating Biomolecules with Variable Protonation State: Constant-pH Molecular Dynamics Simulations with NAMD Brian Radak Leadership Computing Facility, Argonne National Laboratory Computational Biophysics Workshop – Enhanced Sampling and Free Energy Calculations University of Illinois, Urbana-Champaign September 26, 2017

Acknowledgements (other people to blame) Univ of Chicago ◮ Benoˆ ıt Roux ◮ Donghyuk Suh ◮ Huan Rui ◮ Yunjie Chen Theta Early Science Program ALCF ◮ Sunhwan Jo (SilcsBio) ◮ Wei Jiang UIUC ◮ Jim Phillips ◮ Chris Chipot ◮ Abhi Singharoy

pH Effects in Biochemistry Casey, et al Nat Rev Mol Cell Biol , 2010 6.0 enzyme rate vs. pH golgi endosome 6.7 5.5-6.5 general general mitochondria acid base fractional population lysosome 8.0 k opt rate constant 4.7 endoplasmic nucleus reticulum cytosol 7.2 pK a,acid pK a,opt pK a,base pH variability of pH by region cancerous normal pHe + 7.5 6.9 pH gradients at cell surfaces + pHi 7.2 7.5 Webb, et al Nat Rev Cancer , 2011

Constant pH and the semi-grand canonical ensemble ◮ Conventional MD samples a canonical ensemble: � d x e − β U ( x ) Q = ◮ Constant-pH MD samples a semi-grand canonical ensemble: � Q λ 10 − n λ pH Ξ(pH) = λ ∈S The added interaction is between the number of protons, n λ , and a pH bath. λ is a new variable designating the protonation state.

Networks of protonation states − H + − H + E-R 1 H:R 2 H E-R − 1 :R 2 H E-R 1 H:R 2 H E-R − 1 :R 2 H − H + − H + − H + − H + − H + − H + E-R 1 H:R − E-R − 1 :R − E-R 1 H:R − E-R − 1 :R − 2 2 2 2 constant pH MD conventional MD

Networks of protonation states − H + − H + E-R 1 H:R 2 H E-R − 1 :R 2 H E-R 1 H:R 2 H E-R − 1 :R 2 H − H + − H + − H + − H + − H + − H + E-R 1 H:R − E-R − 1 :R − E-R 1 H:R − E-R − 1 :R − 2 2 2 2 constant pH MD conventional MD 2 N N = 1 3 4 2

pH as a thermodynamic force ◮ Classical MD utilizes mechanical forces F = −∇ U [ x ( t )] = m ∂ v v = ∂ x ∂ t ; ∂ t ◮ pH may be regarded as a thermodynamic force ln 10pH = − ∂ ln Ξ ∂ n λ Mechanical forces – deterministic/stochastic dynamics Thermodynamic forces – probabilistic “dynamics” P λ (pH) ∝ Q λ 10 − n λ pH

How do we define nodes in the network? Consider a system with m sites:

Protonation state probabilities/populations d x A ( x , λ ) e − β U ( x ; λ ) 10 − n λ pH � � λ ∈S � A ( x , λ ) � pH = Ξ(pH) P λ s = � λ s � pH – the probability that site s is occupied There are two kinds of terms in the summation, λ s = 0 / 1 Ξ(pH) = Ξ 0 (pH) + Ξ 1 (pH)10 − pH thus, Ξ 1 (pH)10 − pH 1 � λ s � pH = Ξ 0 (pH) + Ξ 1 (pH)10 − pH = 1 + Ξ 0 (pH) Ξ 1 ( pH ) 10 pH

Connection to thermodynamics 1 � λ s � pH = 1 + Ξ 0 (pH) Ξ 1 ( pH ) 10 pH compares to the Henderson-Hasselbalch equation such that p K a (pH) = − log Ξ 0 (pH) Ξ 1 (pH) , except that now p K a (pH) is pH dependent . One often uses the approximation: � � p K a ( pH ) ≈ p K (a) pH − p K (a) + (1 − n ) , a a where n is the Hill coefficient and p K (a) is the “apparent” p K a . a

Networks of protonation states − H + − H + E-R − E-R − E-R 1 H:R 2 H 1 :R 2 H E-R 1 H:R 2 H 1 :R 2 H − H + − H + − H + − H + − H + − H + E-R 1 H:R − E-R − 1 :R − E-R 1 H:R − E-R − 1 :R − 2 2 2 2 constant pH MD conventional MD We can now see that the fraction of simulation time spent in a given protonation state is directly impacted by the difference of the p K a of a residue/site and the pH.

That’s great – how do we sample the states? 1. Sample the configuration space of a given state ( i.e., sample x for a given Q λ ) 2. Change between protonation states according to the number of protons and the given pH ( i.e., sample λ and choose a new Q λ ) This may be regarded as a Gibbs sampling , whereby the configuration and state are sampled in an alternating fashion.

A problem! Environmental response ◮ (De)Protonation is a significant electrostatic event. ◮ Non-trivial reorganization of solvent, possibly solute. ◮ Naive sudden changes in protonation are likely to cause high energy configurations and/or steric clashes.

Possible solutions to the solvent clash problem auxillary implicit solvent continuous fractional proton discrete copy fractional proton

“Fast” alchemical growth ◮ Swap the protonation state by using time-dependent interactions. ◮ Gradually stronger interactions will induce solvent response. ◮ Clashes are avoided by using the natural dynamics of the model.

The neMD/MC constant pH paradigm ◮ Drive alchemical growth with nonequilibrium work ◮ Accept/reject with a generalized Metropolis criterion Stern J Chem Phys , 2007 ; Chen & Roux J Chem Theory Comput , 2015

The neMD/MC constant pH paradigm ! =2/3 neMD alchemical growth removal of auxiliary MC sample of auxiliary coordinates coordinates ! =0 ! =1 ! =1/3 ◮ Drive alchemical growth with nonequilibrium work ◮ Accept/reject with a generalized Metropolis criterion Stern J Chem Phys , 2007 ; Chen & Roux J Chem Theory Comput , 2015

Beyond Gibbs sampling: Hybrid MD and neMD/MC We now alternate conventional sampling with MD ( x ) and Metropolis Monte Carlo sampling ( x and λ ): ρ ( x , λ ) T ( x , λ → x ′ , λ ′ ) = ρ ( x ′ , λ ′ ) T ( x ′ , λ ′ → x , λ ) such that the neMD/MC transition probability is: 1 , ρ ( x ′ , λ ′ ) � � � 1 , e − β W 10 − ∆ n pH � T ( x , λ → x ′ , λ ′ ) = min = min ρ ( x , λ ) (If you’d like, MD uses the probability T ( x → x ′ ) = 1.)

Important considerations ◮ How long should I sample the equilibrium stage? ◮ How long should I sample the nonequilibrium stage (the “switch time,” τ switch ) ◮ Rejecting a nonequilibrum trajectory is expensive, how can we avoid doing that so much?

The two-step “inherent” p K a algorithm T ( x , λ → x ′ , λ ′ ) = T ( i ) ( λ → λ ′ ) T ( s ) ( x → x ′ | λ → λ ′ ) 1 , 10 p K (i) � a ( λ , λ ′ ) − ∆ n pH � T ( i ) ( λ → λ ′ ) = min ◮ neMD/MC can be split into two parts 1. T ( i ) – only depends on λ and the pH – CHEAP 2. T ( s ) – depends on the switch ( W ) – COSTLY ◮ Effort is shifted by estimating a parameter, p K (i) a ◮ Optimal efficiency achieved for exact p K a ◮ Dramatically improved performance on wide pH ranges! ◮ Can do even better for systems with more than two states. Chen & Roux J Chem Theory Comput , 2015 ; Radak, et al submitted

Let’s look at this graphically 1.0 0.8 be occupied remove add probability 0.6 if occupied if unoccupied 0.4 0.2 0.0 pK a - 2 pK a - 1 pK a pK a + 1 pK a + 2 pH ◮ It’s silly to try to add/remove protons to/from acidic/basic residues at high/low pH ◮ Transitions are proposed in proportion to the estimated population.

What about after we’ve proposed a switch? ◮ A short switch will not change much and likely be rejected. ◮ A long switch is expensive (limit of a single switch – BAD). ◮ Since the switch success depends on the work, let’s analyze that.

Work and force fluctuations – a typical neMD/MC cycle τ mol τ switch 2 σ 0 2 force σ 0 force correlation work ∆ F 0 0 5 10 15 20 0 50 100 150 200 time (/ps) Radak & Roux J Chem Phys , 2016

Theoretical and Empirical Performance Analysis 80 1.0 exact 20 (noramlized) optimal 70 approx. near maximal 60 mean acceptance probability acceptance probability 0.8 switch time transition rate (/ns -1 ) 50 15 40 0.6 error ~10% 30 10 0.4 20 10 5 0.2 0 τ opt ≈ 11 ps near minimal 0 40 80 120 160 200 transition rate 0 0.0 1.0 0 10 20 30 40 50 60 1400 1600 1800 2000 acceptance probability switch time (/ps) 0.8 optimal mean 0.6 ◮ Well-defined criteria for 0.4 optimization. 0.2 23.3% ◮ Cost is quite tractable. 0.0 0 10 20 30 40 50 60 Radak & Roux J Chem Phys , 2017 equilibrium work variance [/(k B T) 2 ]

NAMD Constant pH features ◮ Flexible Tcl interface source ...lib/namdcph/namdcph.tcl ◮ PSF build procedure is unchanged (automated psfgen ) ◮ Implemented with PME and full electrostatics ◮ Normal CPU scaling (no GPU yet) - depends on alchemy ◮ Companion analysis script cphanalyze pH 7.0 cphNumstepsPerSwitch 7500 ;# run 7500 steps per switch cphRun 5000 10 ;# run 10 cycles of 5000 MD steps

Output and Analysis ◮ Normal usage requires multiple pH values 1.0 (“titration curves”) 0.8 ◮ New output cphlog and GLU10/43/52 protonated fraction ASP19/21/83 cphrst LYS24 0.6 HIS8 ◮ New checkpoint files 0.4 psf/pdb ◮ Can boost performance with 0.2 WHAM ( cphanalyze ) 0.0 2 3 4 5 6 7 ◮ Can also analyze residue pH correlations

Reference amino acids are well-reproduced ◮ Adjustments to force field 1.0 LYS: 10.4 +/- 0.3 enforce empirical reference CYS: 9.5 +/- 0.1 protonated fraction/state population 0.8 values ASP: 4.1 +/- 0.1 GLU: 4.4 +/- 0.1 ◮ Implicitly model solvated 0.6 proton and bond energy 0.4 HIS- δ : 6.6 +/- 0.2 effects HIS- ε : 7.1 +/- 0.2 ◮ Bonus - accurate 0.2 reproduction of tautomeric 0.0 4 5 6 7 8 9 10 11 ratios! pH

Simulating Biomolecules with Variable Protonation State: Constant-pH - PowerPoint PPT Presentation

Simulating Biomolecules with Variable Protonation State: Constant-pH Molecular Dynamics Simulations with NAMD Brian Radak Leadership Computing Facility, Argonne National Laboratory Computational Biophysics Workshop Enhanced Sampling and

Simulating Biomolecules with Variable Protonation State: Constant-pH Molecular Dynamics

Numberjack User Guide May 27, 2013 1 Variables Constructor for the class Variable : Constructor

Simulating Syst Simulating Systems in Gr ems in Ground V ound Vehicle hicle Design Design

Variable selection bias Bias in Ensemble Bias in Ensemble Methods Methods Variable selection

An Efficient Algorithm for An Efficient Algorithm for Simulating Coalescence with Simulating

Simulating Search Strategies Simulating Search Strategies for Gnutella for Gnutella Chun Wai

Simulating the effects of anticoagulant drugs Simulating the effects of anticoagulant drugs on

Non-Photorealistic Computer Graphics Chapter 6 Simulating Natural Media and Artistic Techniques

Simulating Chromosome Segregation Qi Zheng Simulating Chromosome Segregation Qi Zheng

Syscall Proxying Simulating Remote Execution Maximiliano Cceres maximiliano.caceres@corest.com

Interaction of zinc(II) and copper(II) terpyridine complexes with biomolecules Tanja Soldatovi

Variable Benefit Plans in Depth Kelly Coffing, FSA, EA, MAAA September 21, 2019 Agenda The

Measuring variable importance in random forests Variable Variable importance in RF importance

Variables in C++ The variable C++ Variables Kinds of Variables Memory storage

Variable & Value Ordering Heuristics Heuristics for backtracking algorithms Variable

Electrostatic Interactions in Mixtures of Cationic and Anionic Biomolecules: Bulk Structures and

Drug Metabolizing Enzyme Induction Survey of Current Practices A PhRMA Drug Metabolism Technical

Science applications requiring exascale compute and data capabilities Jack Wells Director of

Aims and objectives Why is haematology so difficult? Classification of anaemias

Emergence of virus from nature depends on: Virological factors, such as permissive cells &

Sicurezza del Registro di mala2a di Gaucher: una vera storia

Modelling protein trafficking: progress and challenges Vashti Galpin Laboratory for Foundations

Optical-Kermit: Optical map guided genome assembly Miika Leinonen, Leena Salmela University of

Outline What is EMBOSS? Major programs Running EMBOSS Programs from the Unix

Sambuz

Useful Links

Newsletter

Mail Us

Simulating Biomolecules with Variable Protonation State: Constant-pH - PowerPoint PPT Presentation

Simulating Biomolecules with Variable Protonation State: Constant-pH Molecular Dynamics Simulations with NAMD Brian Radak Leadership Computing Facility, Argonne National Laboratory Computational Biophysics Workshop Enhanced Sampling and

Simulating Biomolecules with Variable Protonation State: Constant-pH Molecular Dynamics

Numberjack User Guide May 27, 2013 1 Variables Constructor for the class Variable : Constructor

Simulating Syst Simulating Systems in Gr ems in Ground V ound Vehicle hicle Design Design

Variable selection bias Bias in Ensemble Bias in Ensemble Methods Methods Variable selection

An Efficient Algorithm for An Efficient Algorithm for Simulating Coalescence with Simulating

Simulating Search Strategies Simulating Search Strategies for Gnutella for Gnutella Chun Wai

Simulating the effects of anticoagulant drugs Simulating the effects of anticoagulant drugs on

Non-Photorealistic Computer Graphics Chapter 6 Simulating Natural Media and Artistic Techniques

Simulating Chromosome Segregation Qi Zheng Simulating Chromosome Segregation Qi Zheng

Syscall Proxying Simulating Remote Execution Maximiliano Cceres maximiliano.caceres@corest.com

Interaction of zinc(II) and copper(II) terpyridine complexes with biomolecules Tanja Soldatovi

Variable Benefit Plans in Depth Kelly Coffing, FSA, EA, MAAA September 21, 2019 Agenda The

Measuring variable importance in random forests Variable Variable importance in RF importance

Variables in C++ The variable C++ Variables Kinds of Variables Memory storage

Variable &amp; Value Ordering Heuristics Heuristics for backtracking algorithms Variable

Electrostatic Interactions in Mixtures of Cationic and Anionic Biomolecules: Bulk Structures and

Drug Metabolizing Enzyme Induction Survey of Current Practices A PhRMA Drug Metabolism Technical

Science applications requiring exascale compute and data capabilities Jack Wells Director of

Aims and objectives Why is haematology so difficult? Classification of anaemias

Emergence of virus from nature depends on: Virological factors, such as permissive cells &amp;

Sicurezza del Registro di mala2a di Gaucher: una vera storia

Modelling protein trafficking: progress and challenges Vashti Galpin Laboratory for Foundations

Optical-Kermit: Optical map guided genome assembly Miika Leinonen, Leena Salmela University of

Outline What is EMBOSS? Major programs Running EMBOSS Programs from the Unix

Sambuz

Useful Links

Newsletter

Mail Us

Variable & Value Ordering Heuristics Heuristics for backtracking algorithms Variable

Emergence of virus from nature depends on: Virological factors, such as permissive cells &