
(VAMP) Identification of molecular order parameters and states from - PowerPoint PPT Presentation

Variational Approach to Markov Processes (VAMP): Identification of molecular order parameters and states from nonreversible MD simulations. Fabian Paul, Computer Tutorial in Markov Modeling, 18-FEB-2020. Recap: the spectral theory of MSMs. A Markov


  1. Variational Approach to Markov Processes (VAMP): Identification of molecular order parameters and states from nonreversible MD simulations. Fabian Paul, Computer Tutorial in Markov Modeling, 18-FEB-2020

  2. Recap: the spectral theory of MSMs
     • A Markov state model consists of:
       1. a set of states $\{s_i\}_{i=1,\dots,N}$
       2. (conditional) transition probabilities between these states, $T_{ij} = \mathbb{P}(s(t+\tau) = j \mid s(t) = i)$ …

  3. Markov state models: estimation
     Markov model estimation starts with:
     • grouping of geometrically [1] or kinetically [2] related conformations into clusters or microstates
     [Figure: conformations grouped into microstates 1, 2, 3]
     [1] Prinz et al., J. Chem. Phys. 134, 174105 (2011)
     [2] Pérez-Hernández, Paul, et al., J. Chem. Phys. 139, 015102 (2013)

  4. Markov state models: estimation
     • We then assign every conformation in an MD trajectory to a microstate.
       [Figure: trajectory table. Time $t$: t, 2t, 3t, 4t, 5t, 6t, 7t; microstate $s$: 1, 2, 3, 2, 3, 1, 3]
     • We count transitions between microstates and tabulate them in a count matrix $\mathbf{C}$, e.g. $C_{11} = 1$, $C_{12} = 1$, $C_{23} = 2$, …
     • We estimate the transition probabilities $T_{ij}$ from $\mathbf{C}$.
       • Naïve estimator: $\hat{T}_{ij} = C_{ij} / \sum_k C_{ik}$
       • Maximum-likelihood estimator [1]
     [1] Prinz et al., J. Chem. Phys. 134, 174105 (2011)
     [2] Pérez-Hernández, Paul, et al., J. Chem. Phys. 139, 015102 (2013)
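
The counting step and the naïve estimator can be sketched in a few lines of NumPy. This is only an illustration: the function name `count_matrix` and the eight-frame trajectory are assumptions, with the trajectory chosen so that the counts reproduce the example values $C_{11} = 1$, $C_{12} = 1$, $C_{23} = 2$ quoted on the slide.

```python
import numpy as np

def count_matrix(dtraj, n_states, lag=1):
    """Count transitions i -> j between frames that are `lag` steps apart."""
    C = np.zeros((n_states, n_states))
    for i, j in zip(dtraj[:-lag], dtraj[lag:]):
        C[i, j] += 1
    return C

# Hypothetical microstate trajectory with 1-based labels as on the slide,
# chosen so that C_11 = 1, C_12 = 1 and C_23 = 2.
labels = np.array([1, 1, 2, 3, 2, 3, 1, 3])
dtraj = labels - 1  # 0-based indices for NumPy

C = count_matrix(dtraj, n_states=3, lag=1)

# Naive estimator: normalize every row of the count matrix.
T_naive = C / C.sum(axis=1, keepdims=True)

print(C)
print(T_naive)
```

Row normalization needs every state to be left at least once; real data sets usually need a check for empty rows first.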

  5. The spectrum of a reversible T matrix
     [Figure: energy $U(i)$ of the model system]
     • The large eigenvalues of the transition matrix and their corresponding eigenvectors encode the information about the slow molecular processes.
     • Flat regions of the eigenvectors make it possible to identify the metastable states.
     Prinz et al., J. Chem. Phys. 134, 174105 (2011)
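
A small numerical sketch of this statement, assuming a hypothetical four-state transition matrix with two metastable pairs of states (the matrix, the lag time `tau`, and the sign-based state assignment are illustrative choices, not taken from the slides). The relaxation timescale uses the standard relation $t_i = -\tau / \ln \lambda_i$.

```python
import numpy as np

tau = 1.0  # lag time in arbitrary units (assumption)

# Hypothetical symmetric (hence reversible) transition matrix:
# states {0, 1} and {2, 3} exchange quickly, the two pairs exchange rarely.
T = np.array([
    [0.90, 0.10, 0.00, 0.00],
    [0.10, 0.89, 0.01, 0.00],
    [0.00, 0.01, 0.89, 0.10],
    [0.00, 0.00, 0.10, 0.90],
])

evals, evecs = np.linalg.eig(T)
order = np.argsort(-evals.real)      # sort eigenvalues in descending order
evals = evals.real[order]
evecs = evecs.real[:, order]

# Implied timescales of the non-stationary processes.
timescales = -tau / np.log(evals[1:])

# The second eigenvector is nearly flat within each metastable set,
# so its sign already separates the two sets.
v2 = evecs[:, 1]
metastable = (v2 > 0).astype(int)

print(evals)
print(timescales)
print(v2, metastable)
```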

  6. Both MSMs and TICA make use of the same spectral method
     The spectral method (working with eigenvalues and eigenvectors) is not limited to Markov state models.
     • Estimation of MSMs: $T_{ij}(\tau) = C_{ij}(\tau) / C_i$
     • In matrix notation: $\mathbf{T}(\tau) = \mathbf{C}(0)^{-1} \mathbf{C}(\tau)$
     • Eigenvalue problem: $\mathbf{T}(\tau)\mathbf{v} = \lambda\mathbf{v} \Leftrightarrow \mathbf{C}(0)^{-1}\mathbf{C}(\tau)\mathbf{v} = \lambda\mathbf{v} \Leftrightarrow \mathbf{C}(\tau)\mathbf{v} = \lambda\mathbf{C}(0)\mathbf{v}$
     • The last equation is known as the TICA problem. All equations generalize to the case where $\mathbf{C}(0)$ and $\mathbf{C}(\tau)$ are not count matrices but correlation matrices.
     • The indices $i, j$ then no longer refer to states but to features.
     Schwantes, Pande, J. Chem. Theory Comput. 9, 2000 (2013)
     Pérez-Hernández et al., J. Chem. Phys. 139, 015102 (2013)
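
The link between the two estimators can be checked numerically. In the sketch below (the short trajectory is a made-up example), the features are one-hot state indicators, so the instantaneous correlation matrix is the diagonal matrix of state counts, the time-lagged correlation matrix is the count matrix, and solving the generalized problem reproduces the row-normalized MSM.

```python
import numpy as np

# Hypothetical discrete trajectory (0-based state indices).
dtraj = np.array([0, 0, 1, 2, 1, 2, 0, 2])
n_states, lag = 3, 1

# One-hot "features": f_i(x) = 1 if x is in state i, else 0.
X = np.eye(n_states)[dtraj]
X0, Xt = X[:-lag], X[lag:]

C0 = X0.T @ X0   # C(0): diagonal matrix of state counts
Ct = X0.T @ Xt   # C(tau): transition count matrix

# T(tau) = C(0)^{-1} C(tau)
T = np.linalg.solve(C0, Ct)

# Eigenpairs of T also solve the generalized problem C(tau) v = lambda C(0) v.
evals, evecs = np.linalg.eig(T)
residual = Ct @ evecs - C0 @ evecs @ np.diag(evals)

print(T)
print(np.abs(residual).max())
```

Replacing the one-hot columns of `X` with arbitrary mean-free features turns the same few lines into a basic TICA estimate.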

  7. VAC and VAMP

  8. Variational approach to conformational dynamics
     VAC (Rayleigh-Ritz for classical dynamics)
     Any autocorrelation is bounded by the system-specific number $\hat{\lambda}$, which is related to the system-specific autocorrelation time $\hat{t}$ by $\hat{\lambda} = e^{-\tau/\hat{t}}$.
     $\mathrm{acf}(\psi; \tau) := \frac{\sum_t \psi(x(t))\,\psi(x(t+\tau))}{\sum_t \psi(x(t))\,\psi(x(t))} = \frac{\langle \psi, T\psi \rangle_\pi}{\langle \psi, \psi \rangle_\pi} \le \hat{\lambda}$
     • The maximum is achieved if $\psi$ is an eigenfunction of $T$.
     Proof: Expand $\psi$ in an (orthonormal) eigenbasis of $T$:
     $\psi(x) = \sum_i c_i \phi_i(x)$, $\quad \langle \psi, \psi \rangle_\pi = \sum_i c_i^2 > 0$
     $\langle \psi, T\psi \rangle_\pi - \hat{\lambda} \langle \psi, \psi \rangle_\pi = \sum_i c_i^2 \lambda_i - \hat{\lambda} \sum_i c_i^2 = \sum_i c_i^2 (\lambda_i - \hat{\lambda}) \le 0$
     • If $\hat{\lambda} = \max_i \lambda_i$ is the largest of $T$'s eigenvalues, the inequality holds.
     • The result can only be zero if $c_i = 0$ for $i \ne j$ and $\lambda_j = \max_i \lambda_i$ $\Rightarrow$ $\psi(x) \propto \phi_{\max}(x)$
     • Remark: the variational approach generalizes to the optimization of multiple eigenfunctions. $\hat{\lambda}$ is replaced by the sum of the eigenvalues $R_k = \sum_{i=1}^{k} \lambda_i$.
     Noé, Nüske, SIAM Multiscale Model. Simul. 11, 635 (2013)
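
The bound can be verified numerically on a small example. The sketch below assumes a hypothetical symmetric four-state transition matrix (so the stationary distribution is uniform) and evaluates the Rayleigh quotient $\langle \psi, T\psi \rangle_\pi / \langle \psi, \psi \rangle_\pi$ for random test vectors and for the top eigenvector.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical symmetric transition matrix; its stationary distribution is uniform.
T = np.array([
    [0.90, 0.10, 0.00, 0.00],
    [0.10, 0.89, 0.01, 0.00],
    [0.00, 0.01, 0.89, 0.10],
    [0.00, 0.00, 0.10, 0.90],
])
pi = np.full(4, 0.25)

def rayleigh(psi):
    """<psi, T psi>_pi / <psi, psi>_pi with <x, y>_pi = sum_i x_i y_i pi_i."""
    return (psi * pi) @ (T @ psi) / ((psi * pi) @ psi)

# T is symmetric here, so eigh applies; eigenvalues come in ascending order.
evals, evecs = np.linalg.eigh(T)
lam_max = evals[-1]

# Random test functions never beat the largest eigenvalue ...
quotients = np.array([rayleigh(rng.normal(size=4)) for _ in range(1000)])

# ... and the corresponding eigenvector attains it.
best = rayleigh(evecs[:, -1])

print(lam_max, quotients.max(), best)
```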

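  9. Interpretation of variational principle
     [Figure panels: good test function, bad test function]
     1. Pick some test function $\boldsymbol{\chi}_{\mathrm{test}}(\mathbf{x})$ and pick some test conformations $\mathbf{x}_{i,\mathrm{initial}}$ distributed according to the equilibrium distribution $\pi$.
     2. Propagate $\mathbf{x}_{i,\mathrm{initial}}$ with the MD integrator. Call the result $\mathbf{x}_{i,\mathrm{final}}$.
     3. Correlate $\boldsymbol{\chi}_{\mathrm{test}}(\mathbf{x}_{\mathrm{initial}})$ with $\boldsymbol{\chi}_{\mathrm{test}}(\mathbf{x}_{\mathrm{final}})$:
     $\mathrm{score} = \frac{\sum_{i=1}^{N} \left(\boldsymbol{\chi}(\mathbf{x}_{i,\mathrm{initial}}) - \bar{\boldsymbol{\chi}}\right) \cdot \left(\boldsymbol{\chi}(\mathbf{x}_{i,\mathrm{final}}) - \bar{\boldsymbol{\chi}}\right)}{\sum_{i=1}^{N} \left(\boldsymbol{\chi}(\mathbf{x}_{i,\mathrm{initial}}) - \bar{\boldsymbol{\chi}}\right) \cdot \left(\boldsymbol{\chi}(\mathbf{x}_{i,\mathrm{initial}}) - \bar{\boldsymbol{\chi}}\right)}$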
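
The three steps can be mimicked with a toy model. In this sketch a hypothetical four-state Markov chain stands in for the MD integrator, consecutive frames of one long trajectory serve as initial/final pairs, and two hand-picked test functions (both assumptions) are compared. The score is the mean-free autocorrelation at lag 1.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical four-state chain with two metastable pairs {0, 1} and {2, 3}.
T = np.array([
    [0.90, 0.10, 0.00, 0.00],
    [0.10, 0.89, 0.01, 0.00],
    [0.00, 0.01, 0.89, 0.10],
    [0.00, 0.00, 0.10, 0.90],
])

# Step 2: "propagate" by sampling the chain.
n_steps, lag = 20000, 1
traj = np.empty(n_steps, dtype=int)
traj[0] = 0
for t in range(1, n_steps):
    traj[t] = rng.choice(4, p=T[traj[t - 1]])

def score(chi_values, traj, lag):
    """Step 3: correlate the test function at initial and final frames."""
    y = chi_values[traj]
    y = y - y.mean()                      # remove the mean of the test function
    initial, final = y[:-lag], y[lag:]
    return (initial @ final) / (initial @ initial)

# Step 1: two candidate test functions, given by their value on each state.
good = np.array([1.0, 1.0, -1.0, -1.0])   # follows the slow inter-pair exchange
bad = np.array([1.0, -1.0, 1.0, -1.0])    # follows the fast intra-pair exchange

good_score = score(good, traj, lag)
bad_score = score(bad, traj, lag)
print(good_score, bad_score)
```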

  10. Gradient-based optimization of function parameters
      Parameters $\mathbf{p}$ of $\boldsymbol{\chi}_{\mathrm{test}}(\mathbf{x}; \mathbf{p})$ can be optimized with gradient-based techniques. Make use of the gradient of the VAC or VAMP score, the gradient of the test function, and off-the-shelf optimizers such as ADAM or BFGS.
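
A minimal sketch of such an optimization, under several assumptions: the data are a synthetic noisy two-state coordinate, the test function is a sigmoid `tanh(steepness * (x - position))`, the gradient is taken by finite differences, and plain gradient ascent with step halving stands in for ADAM or BFGS.

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical data: a hidden two-state process observed through a noisy coordinate x.
n = 5000
state = np.cumsum(rng.random(n) < 0.02) % 2        # flips with probability 0.02 per step
x = (2.0 * state - 1.0) + 0.5 * rng.normal(size=n)

def vac_score(p, lag=1):
    """Mean-free autocorrelation of chi(x; p) = tanh(p[1] * (x - p[0]))."""
    chi = np.tanh(p[1] * (x - p[0]))
    chi = chi - chi.mean()
    return (chi[:-lag] @ chi[lag:]) / (chi[:-lag] @ chi[:-lag])

def num_grad(f, p, h=1e-5):
    """Central finite-difference gradient (stand-in for an analytic gradient)."""
    g = np.zeros_like(p)
    for k in range(len(p)):
        dp = np.zeros_like(p)
        dp[k] = h
        g[k] = (f(p + dp) - f(p - dp)) / (2 * h)
    return g

p = np.array([1.5, 1.0])       # initial guess: position and steepness
initial_score = vac_score(p)

step = 1.0
for _ in range(200):
    trial = p + step * num_grad(vac_score, p)
    if vac_score(trial) > vac_score(p):
        p = trial              # accept only steps that improve the score
    else:
        step *= 0.5            # otherwise shrink the step size

final_score = vac_score(p)
print(initial_score, final_score, p)
```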

  11. Reversible dynamics
      • In equilibrium, every trajectory is as probable as its time-reversed copy:
        $\mathbb{P}(s(t+\tau) = j \text{ and } s(t) = i) = \mathbb{P}(s(t+\tau) = i \text{ and } s(t) = j)$
        $\mathbb{P}(s(t+\tau) = j \mid s(t) = i)\, \mathbb{P}_{\mathrm{eq}}(s(t) = i) = \mathbb{P}(s(t+\tau) = i \mid s(t) = j)\, \mathbb{P}_{\mathrm{eq}}(s(t) = j)$
        $\pi_i T_{ij} = \pi_j T_{ji}$
      • In mathematician's notation: $\langle \mathbf{e}_i, \mathbf{T}\mathbf{e}_j \rangle_\pi = \langle \mathbf{e}_j, \mathbf{T}\mathbf{e}_i \rangle_\pi$, where $\langle \mathbf{x}, \mathbf{y} \rangle_\pi = \sum_i x_i y_i \pi_i$
      • $\mathbf{T}$ is a symmetric matrix with respect to a non-standard scalar product.
      • $\mathbf{T}$ has real eigenvalues and eigenvectors (linear algebra I).
      Prinz et al., J. Chem. Phys. 134, 174105 (2011)
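
Detailed balance is easy to test numerically. The sketch below builds a hypothetical reversible transition matrix from a symmetric count matrix (an illustrative choice), recovers the stationary distribution as the left eigenvector with eigenvalue 1, and checks that $\pi_i T_{ij}$ is symmetric and that the eigenvalues are real.

```python
import numpy as np

# Hypothetical symmetric count matrix; row-normalizing it gives a reversible T.
C_sym = np.array([
    [8.0, 2.0, 0.0],
    [2.0, 4.0, 1.0],
    [0.0, 1.0, 6.0],
])
T = C_sym / C_sym.sum(axis=1, keepdims=True)

# Stationary distribution: left eigenvector of T with eigenvalue 1.
evals_left, evecs_left = np.linalg.eig(T.T)
pi = evecs_left[:, np.argmax(evals_left.real)].real
pi = pi / pi.sum()

# Detailed balance: the flux matrix pi_i * T_ij must be symmetric.
flux = pi[:, None] * T

# Although T itself is not symmetric, its eigenvalues are real.
evals = np.linalg.eigvals(T)

print(pi)
print(np.abs(flux - flux.T).max())
print(evals)
```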

  12. The problem with nonreversible systems
      • $R_k = \sum_{i=1}^{k} \lambda_i$, where $\lambda_i$ are the true eigenvalues.
      • For nonreversible dynamics, $\langle \mathbf{e}_i, \mathbf{T}\mathbf{e}_j \rangle_\pi \ne \langle \mathbf{e}_j, \mathbf{T}\mathbf{e}_i \rangle_\pi$.
      • There might not even be a well-defined $\boldsymbol{\pi}$.
      • Eigenvalues and eigenvectors will be complex.
      • The variational principle doesn't work: $\mathrm{acf}(\psi) \le \hat{\lambda} \in \mathbb{C}$ makes no sense. One can't order complex numbers on a line.
        • Optimization of models is not possible.
        • Feature selection is not possible.
      • Is there any way to fix this? Can we maybe find some other operator that is related to the dynamics and that is symmetric?

  13. A possible solution: VAMP
      Variational approach to Markov processes
      • Introduce the "backward" transition matrix
        $\mathbf{T}_b := \mathbf{C}(N)^{-1} \mathbf{C}(-\tau) = \mathbf{C}(N)^{-1} \mathbf{C}^{\top}(\tau)$,
        i.e. estimate the MSM/TICA model from time-reversed data, where
        $C_{ij}(-\tau) := \sum_{t=\tau}^{N} f_i(x(t-\tau))\, f_j(x(t))$
        $C_{ij}(N) := \sum_{t=\tau}^{N} f_i(x(t))\, f_j(x(t))$
      • Introduce the forward-backward transition matrices $\mathbf{T}_{fb} := \mathbf{T}\mathbf{T}_b$ and $\mathbf{T}_{bf} := \mathbf{T}_b \mathbf{T}$.
      • One can show that $\mathbf{T}_{fb}$ and $\mathbf{T}_{bf}$ are symmetric without any reference to a stationary vector (symmetry is built into the matrices).
      • Eigenvalues and eigenvectors of $\mathbf{T}_{fb}$ and $\mathbf{T}_{bf}$ are real.
      • They fulfill a variational principle: $\lVert \mathbf{C}(0)^{-1/2}\, \mathbf{C}(\tau)\, \mathbf{C}(N)^{-1/2} \rVert \le R$
      Wu, Noé, J. Nonlinear Sci. 30, 23 (2020)
      Klus, S. et al., J. Nonlinear Sci. 28, 1 (2018)
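
A small numerical illustration, assuming a hypothetical cyclic three-state chain and exact (infinite-data) correlation matrices for one-hot state features. The eigenvalues of the forward matrix are complex, while the forward-backward matrix and the singular values of the whitened matrix $\mathbf{C}(0)^{-1/2}\mathbf{C}(\tau)\mathbf{C}(N)^{-1/2}$ are real. The sum of squared singular values used here is the VAMP-2 form of the score, which is one common choice of norm.

```python
import numpy as np

def inv_sqrt(C):
    """Inverse square root of a symmetric positive definite matrix."""
    w, V = np.linalg.eigh(C)
    return V @ np.diag(w ** -0.5) @ V.T

# Hypothetical nonreversible chain: probability flows around the cycle 0 -> 1 -> 2 -> 0.
T = np.array([
    [0.1, 0.8, 0.1],
    [0.1, 0.1, 0.8],
    [0.8, 0.1, 0.1],
])
pi = np.full(3, 1.0 / 3.0)   # stationary, because T is doubly stochastic

# Exact correlation matrices for one-hot state features.
C00 = np.diag(pi)            # C(0): instantaneous, initial frames
C0t = np.diag(pi) @ T        # C(tau): time-lagged
Ctt = np.diag(pi @ T)        # C(N): instantaneous, final frames

T_f = np.linalg.solve(C00, C0t)     # forward matrix, equals T
T_b = np.linalg.solve(Ctt, C0t.T)   # backward matrix
T_fb = T_f @ T_b                    # forward-backward matrix

K = inv_sqrt(C00) @ C0t @ inv_sqrt(Ctt)
sing = np.linalg.svd(K, compute_uv=False)
vamp2 = np.sum(sing ** 2)

print(np.linalg.eigvals(T))   # contains a complex-conjugate pair
print(sing, vamp2)            # real singular values and VAMP-2 score
```

In this example the weights are uniform, so `T_fb` is symmetric in the ordinary sense and its eigenvalues equal the squared singular values.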

  14. Cross-validation
      • The model parameters (in this example the parameters of the line and the steepness of the transition) were optimized for a particular realization of the dynamics.
      • Didn't we say that the eigenfunctions and eigenvalues are an intrinsic property of the molecular system?
      • So the eigenfunctions should be the same if we repeat the analysis with a second simulation of the same system.

  16. Cross-validation
      • Ideally, we want to tell whether the solution is robust at a single glance, by measuring the robustness with one number.
      • The VAMP score or VAC score (also called GMRQ [1]) lends itself to this task.
      • Keep all the trained model parameters fixed (here the line parameters and the steepness of the transition), plug in new data, and recompute the test autocorrelation.
      • The test autocorrelation will be lower in general, which means that the original model was fit to noise (overfit).
      [1] McGibbon, Pande, J. Chem. Phys. 142, 124105 (2015)
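
The train/test procedure can be sketched as follows. Everything here is an assumption made for illustration: the synthetic data (one slow coordinate plus ten pure-noise features), the linear test function, and the symmetrized correlation estimates. The coefficients `v` are fitted on the training trajectory only and then kept fixed when the score is recomputed on an independent trajectory.

```python
import numpy as np

rng = np.random.default_rng(3)

def simulate(n_frames, n_noise=10):
    """Hypothetical data: one slow two-state coordinate plus pure-noise features."""
    state = np.cumsum(rng.random(n_frames) < 0.05) % 2
    slow = (2.0 * state - 1.0) + 0.5 * rng.normal(size=n_frames)
    noise = rng.normal(size=(n_frames, n_noise))
    return np.column_stack([slow, noise])

def correlations(X, lag, mean):
    """Symmetrized instantaneous and time-lagged correlation matrices."""
    X = X - mean
    X0, Xt = X[:-lag], X[lag:]
    C0 = X0.T @ X0 + Xt.T @ Xt
    Ct = X0.T @ Xt + Xt.T @ X0
    return C0, Ct

def vac_score(v, C0, Ct):
    return (v @ Ct @ v) / (v @ C0 @ v)

lag = 1
train, test = simulate(300), simulate(300)
mean = train.mean(axis=0)          # the mean is also a trained parameter

# Training: solve C(tau) v = lambda C(0) v by whitening with C(0)^{-1/2}.
C0, Ct = correlations(train, lag, mean)
w, V = np.linalg.eigh(C0)
L = V @ np.diag(w ** -0.5) @ V.T
evals, U = np.linalg.eigh(L @ Ct @ L)
v = L @ U[:, -1]                   # coefficients of the slowest linear combination

train_score = vac_score(v, C0, Ct)
test_score = vac_score(v, *correlations(test, lag, mean))   # v and mean stay fixed
print(train_score, test_score)
```

The training score is the maximum over all linear combinations on the training data, so a noticeably lower test score hints at overfitting to the noise features.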

  17. Cross-validation
      • Reporting a test score that was computed from independent realizations is the gold standard.
      • Independent realizations can be expensive to sample.
      • Instead, use approximate k-fold (hold-out) cross-validation:
        • Split all data into a training set and test sets.
        • Optimize the model parameters with the training set and test the parameters with the test sets.
        • Repeat for k different divisions of the data.
      • k-fold cross-validation can be tricky with highly autocorrelated time series data!
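
One way to set up the splits for time series is to cut the trajectory into contiguous blocks and to use only time-lagged pairs that lie entirely inside one block. The helper below, `blocked_kfold_pairs`, is a hypothetical sketch of this idea and not part of any library.

```python
import numpy as np

def blocked_kfold_pairs(n_frames, lag, k):
    """Yield (train_starts, test_starts): start indices t of pairs (t, t + lag).

    The trajectory is cut into k contiguous blocks. A pair is used only if
    both of its frames lie in the same block, so no pair straddles the
    boundary between training and test data.
    """
    edges = np.linspace(0, n_frames, k + 1).astype(int)
    blocks = [np.arange(a, b - lag) for a, b in zip(edges[:-1], edges[1:])]
    for i in range(k):
        test = blocks[i]
        train = np.concatenate([b for j, b in enumerate(blocks) if j != i])
        yield train, test

folds = list(blocked_kfold_pairs(n_frames=1000, lag=10, k=5))
for train_starts, test_starts in folds:
    print(len(train_starts), len(test_starts))
```

Contiguous blocks reduce, but do not remove, the leakage between training and test data: frames on either side of a block boundary are still correlated when the blocks are short compared with the slowest relaxation time.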

  18. Applications

  19. Application: feature selection
      • Variational principle: the higher the score, the better.
      • Compare test scores for different selections of molecular features. Which selection gives the best score?
        distances? dihedrals? contacts? side chain flips? rigid body approximation? chemical intuition?

  20. Application: feature selection
      Scherer et al., J. Chem. Phys. 150, 194108 (2019)

  21. Application: ion channel non-equilibrium MD
      Analysis of MD simulation data of the "controversial" direct knock-on conduction mechanism in the KcsA potassium channel. Ions are constantly inserted at one side of the membrane and deleted at the other side.
      Paul et al., J. Chem. Phys. 150, 164120 (2019). Fig. 1 and data: Köpfer et al., Science 346, 352 (2014).

  22. Application: ion channel non-equilibrium MD
      By clustering in the VAMP space, we identified 15 different states that differ structurally near the selectivity filter and differ in their conductivity.
