(VAMP) Identification of molecular order parameters and states from - PowerPoint PPT Presentation

Variational Approach to Markov Processes (VAMP): Identification of molecular order parameters and states from nonreversible MD simulations
Fabian Paul, Computer Tutorial in Markov Modeling, 18-FEB-2020

Recap: the spectral theory of MSMs
• A Markov state model consists of:
1. a set of states $\{s_i\}_{i=1,\dots,N}$
2. (conditional) transition probabilities between these states, $T_{ij} = \mathbb{P}(s(t+\tau) = j \mid s(t) = i)$ …

Markov state models: estimation
Markov model estimation starts with:
• grouping of geometrically [1] or kinetically [2] related conformations into clusters or microstates
[Figure: conformations grouped into microstates 1, 2, 3]
[1] Prinz et al., J. Chem. Phys. 134, 174105 (2011)
[2] Pérez-Hernández, Paul, et al., J. Chem. Phys. 139, 015102 (2013)

Markov state models: estimation
• We then assign every conformation in an MD trajectory to a microstate.
[Figure: MD trajectory over time $t$ with the microstate $s$ assigned to each frame: 1, 2, 3, 2, 3, 1, 3]
• We count transitions between microstates and tabulate them in a count matrix $\mathbf{C}$, e.g. $C_{11} = 1$, $C_{12} = 1$, $C_{23} = 2$, …
• We estimate the transition probabilities $T_{ij}$ from $\mathbf{C}$.
• Naïve estimator: $\hat{T}_{ij} = C_{ij} / \sum_k C_{ik}$
• Maximum-likelihood estimator [1]
[1] Prinz et al., J. Chem. Phys. 134, 174105 (2011)
[2] Pérez-Hernández, Paul, et al., J. Chem. Phys. 139, 015102 (2013)
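
The counting and the naïve estimator can be sketched in a few lines of NumPy. This is only an illustration: the function names and the toy trajectory are assumptions, not taken from the slides. The trajectory is chosen so that, in the slide's 1-based numbering, it gives $C_{11} = 1$, $C_{12} = 1$ and $C_{23} = 2$; the code itself numbers states from 0.

```python
import numpy as np

def count_matrix(dtraj, n_states, lag=1):
    """Count transitions i -> j observed at the given lag (sliding window)."""
    C = np.zeros((n_states, n_states))
    for i, j in zip(dtraj[:-lag], dtraj[lag:]):
        C[i, j] += 1
    return C

def naive_transition_matrix(C):
    """Row-normalize the counts: T_ij = C_ij / sum_k C_ik."""
    row_sums = C.sum(axis=1, keepdims=True)
    if np.any(row_sums == 0):
        raise ValueError("every state needs at least one outgoing transition")
    return C / row_sums

# Hypothetical discrete trajectory; index 0 corresponds to microstate 1.
dtraj = [0, 0, 1, 2, 1, 2, 0, 2]
C = count_matrix(dtraj, n_states=3)
T = naive_transition_matrix(C)
print(C)
print(T)
```

The naïve estimator does not enforce detailed balance; the maximum-likelihood estimator of Ref. [1] is the usual choice when a reversible model is wanted.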

The spectrum of a reversible T matrix
[Figure: energy U(i) and the eigenvectors of the transition matrix]
• The large eigenvalues of the transition matrix and their corresponding eigenvectors encode the information about the slow molecular processes.
• Flat regions of the eigenvectors allow us to identify the metastable states.
Prinz et al., J. Chem. Phys. 134, 174105 (2011)

Both MSMs and TICA make use of the same spectral method
The spectral method (working with eigenvalues and eigenvectors) is not limited to Markov state models.
• Estimation of MSMs: $T_{ij}(\tau) = C_{ij}(\tau) / C_i$
• In matrix notation, with $\mathbf{C}(0) = \mathrm{diag}(C_i)$: $\mathbf{T}(\tau) = \mathbf{C}(0)^{-1}\mathbf{C}(\tau)$
• Eigenvalue problem: $\mathbf{T}(\tau)\mathbf{v} = \lambda\mathbf{v} \Leftrightarrow \mathbf{C}(0)^{-1}\mathbf{C}(\tau)\mathbf{v} = \lambda\mathbf{v} \Leftrightarrow \mathbf{C}(\tau)\mathbf{v} = \lambda\mathbf{C}(0)\mathbf{v}$
• The last equation is known as the TICA problem. All equations generalize to the case where $\mathbf{C}(0)$ and $\mathbf{C}(\tau)$ are not count matrices but correlation matrices.
• The indices $i, j$ then no longer refer to states but to features.
Schwantes, Pande, J. Chem. Theory Comput. 9, 2000 (2013)
Pérez-Hernández et al., J. Chem. Phys. 139, 015102 (2013)
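
A minimal sketch of the generalized eigenvalue problem $\mathbf{C}(\tau)\mathbf{v} = \lambda\mathbf{C}(0)\mathbf{v}$ for feature data, using only NumPy. Several choices here are assumptions made for illustration: the symmetrized (reversible) correlation estimates, the whitening route to the solution, the function names, and the toy data with one slow and one fast coordinate.

```python
import numpy as np

def inv_sqrt(C):
    """Inverse matrix square root of a symmetric positive-definite matrix."""
    s, U = np.linalg.eigh(C)
    return U @ np.diag(s ** -0.5) @ U.T

def tica(X, lag):
    """Solve C(tau) v = lambda C(0) v for a feature trajectory X (frames x features)."""
    X = X - X.mean(axis=0)                    # mean-free features
    X0, Xt = X[:-lag], X[lag:]
    n = len(X0)
    C0 = (X0.T @ X0 + Xt.T @ Xt) / (2 * n)    # symmetrized instantaneous correlations
    Ct = (X0.T @ Xt + Xt.T @ X0) / (2 * n)    # symmetrized time-lagged correlations
    L = inv_sqrt(C0)                          # whitening gives an ordinary symmetric problem
    lam, W = np.linalg.eigh(L @ Ct @ L)
    order = np.argsort(lam)[::-1]             # slowest process first
    return lam[order], (L @ W)[:, order]

# Toy data: a slow AR(1) coordinate and fast white noise, linearly mixed.
rng = np.random.default_rng(0)
n_frames, decay = 20000, 0.99
slow = np.zeros(n_frames)
for t in range(1, n_frames):
    slow[t] = decay * slow[t - 1] + np.sqrt(1 - decay**2) * rng.standard_normal()
fast = rng.standard_normal(n_frames)
X = np.column_stack([slow + fast, slow - 0.5 * fast])

eigenvalues, eigenvectors = tica(X, lag=1)
print(eigenvalues)
```

The leading eigenvalue approximates the lag-1 autocorrelation of the slow coordinate, and the leading eigenvector undoes the linear mixing.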

VAC and VAMP

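Variational approach to conformational dynamics
VAC (Rayleigh-Ritz for classical dynamics)
Any autocorrelation is bounded by the system-specific number $\hat{\lambda}$, which is related to the system-specific autocorrelation time $\hat{t}$ by $\hat{\lambda} = e^{-\tau/\hat{t}}$:
$\mathrm{acf}(\psi; \tau) := \frac{\sum_{t}^{N-\tau} \psi(x(t))\,\psi(x(t+\tau))}{\sum_{t}^{N-\tau} \psi(x(t))\,\psi(x(t))} = \frac{\langle \psi, \mathrm{T}\psi \rangle_\pi}{\langle \psi, \psi \rangle_\pi} \le \hat{\lambda}$
• The maximum is achieved if $\psi$ is the eigenfunction of T with the largest eigenvalue.
Proof: Expand $\psi$ in an (orthonormal) eigenbasis of T:
$\psi(x) = \sum_i c_i \phi_i(x)$, $\quad \langle \psi, \psi \rangle_\pi = \sum_i c_i^2 > 0$
$\langle \psi, \mathrm{T}\psi \rangle_\pi - \hat{\lambda} \langle \psi, \psi \rangle_\pi = \sum_i c_i^2 \lambda_i - \hat{\lambda} \sum_i c_i^2 = \sum_i c_i^2 (\lambda_i - \hat{\lambda}) \le 0$
• If $\hat{\lambda}$ is $\max_i \lambda_i$, the largest of T’s eigenvalues, the inequality holds.
• The result can only be zero if $c_i = 0$ for $i \ne j$ and $\lambda_j = \max_i \lambda_i$ $\Rightarrow \psi(x) \propto \phi_{\max}(x)$
• Remark: the variational approach generalizes to the optimization of multiple eigenfunctions. $\hat{\lambda}$ is replaced by the sum of the eigenvalues $R_k = \sum_{i=1}^{k} \lambda_i$
Noé, Nüske, SIAM Multiscale Model. Simul. 11, 635 (2013)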
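
The bound can be checked numerically on a small reversible Markov chain. The sketch below is an illustration under stated assumptions: the chain is built from a random symmetric count matrix, and the names `acf`, `phi` and `trial_values` are hypothetical. It evaluates the quotient $\langle \psi, \mathrm{T}\psi \rangle_\pi / \langle \psi, \psi \rangle_\pi$ for random test vectors and for the eigenvectors of the transition matrix.

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.random((5, 5))
C = A + A.T                                   # symmetric counts give a reversible chain
T = C / C.sum(axis=1, keepdims=True)          # row-stochastic transition matrix
pi = C.sum(axis=1) / C.sum()                  # its stationary distribution

def acf(psi):
    """Rayleigh quotient <psi, T psi>_pi / <psi, psi>_pi."""
    return (psi * pi) @ (T @ psi) / ((psi * pi) @ psi)

# Eigen-decomposition via the symmetric matrix S = Pi^{1/2} T Pi^{-1/2}.
sqrt_pi = np.sqrt(pi)
S = sqrt_pi[:, None] * T / sqrt_pi[None, :]
lam, U = np.linalg.eigh(S)                    # eigenvalues in ascending order
phi = U / sqrt_pi[:, None]                    # right eigenvectors of T

lam_max = lam[-1]
trial_values = [acf(rng.standard_normal(5)) for _ in range(1000)]
print(max(trial_values), lam_max)             # no random trial exceeds lam_max
print(acf(phi[:, -1]), acf(phi[:, -2]))       # eigenvectors return their own eigenvalues
```

Random test vectors stay below the largest eigenvalue, while each eigenvector reproduces its eigenvalue exactly, which is the equality case of the proof.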

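Interpretation of variational principle
[Figure: a good test function and a bad test function]
1. Pick some test function $\boldsymbol{\chi}_{\mathrm{test}}(\mathbf{x})$ and pick some test conformations $\mathbf{x}_{i,\mathrm{initial}}$ distributed according to the equilibrium distribution $\pi$.
2. Propagate $\mathbf{x}_{i,\mathrm{initial}}$ with the MD integrator. Call the result $\mathbf{x}_{i,\mathrm{final}}$.
3. Correlate $\boldsymbol{\chi}_{\mathrm{test}}(\mathbf{x}_{\mathrm{initial}})$ with $\boldsymbol{\chi}_{\mathrm{test}}(\mathbf{x}_{\mathrm{final}})$:
$\mathrm{score} = \frac{\sum_{i=1}^{N} \left(\boldsymbol{\chi}(\mathbf{x}_{i,\mathrm{initial}}) - \langle\boldsymbol{\chi}\rangle\right) \cdot \left(\boldsymbol{\chi}(\mathbf{x}_{i,\mathrm{final}}) - \langle\boldsymbol{\chi}\rangle\right)}{\sum_{i=1}^{N} \left(\boldsymbol{\chi}(\mathbf{x}_{i,\mathrm{initial}}) - \langle\boldsymbol{\chi}\rangle\right) \cdot \left(\boldsymbol{\chi}(\mathbf{x}_{i,\mathrm{initial}}) - \langle\boldsymbol{\chi}\rangle\right)}$
where $\langle\boldsymbol{\chi}\rangle$ is the mean of the test function.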
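
The three steps can be tried on a one-dimensional toy model. In this sketch, everything beyond the score formula is an assumption: a discretized Ornstein-Uhlenbeck step stands in for the MD integrator, and the "good" and "bad" test functions are picked by hand for illustration.

```python
import numpy as np

def score(chi, x_initial, x_final):
    """Mean-free correlation of chi(initial) with chi(final), normalized by the variance."""
    a = chi(x_initial)
    b = chi(x_final)
    mean = a.mean()
    return np.sum((a - mean) * (b - mean)) / np.sum((a - mean) ** 2)

rng = np.random.default_rng(2)
n_samples, decay = 50000, 0.9

# Step 1: test conformations drawn from the equilibrium distribution of the toy model.
x_initial = rng.standard_normal(n_samples)
# Step 2: one propagation step (stand-in for the MD integrator).
x_final = decay * x_initial + np.sqrt(1 - decay**2) * rng.standard_normal(n_samples)
# Step 3: correlate the test function before and after propagation.
good = score(lambda x: x, x_initial, x_final)               # slowest eigenfunction of this model
bad = score(lambda x: np.cos(4 * x), x_initial, x_final)    # rapidly oscillating function
print(good, bad)
```

For this model the linear function is the slowest eigenfunction, so its score approaches the decay factor, while the oscillating function decorrelates much faster and scores lower.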

Gradient-based optimization of function parameters
Parameters $\mathbf{p}$ of $\boldsymbol{\chi}_{\mathrm{test}}(\mathbf{x}; \mathbf{p})$ can be optimized with gradient-based techniques. Make use of the gradient of the VAC or VAMP score, the gradient of the test function, and off-the-shelf optimizers such as ADAM or BFGS.
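
As a rough illustration of score optimization, the sketch below tunes a single parameter, the angle of a projection direction in two-dimensional toy data. It deliberately swaps in simpler tools than the ones named on the slide: a finite-difference gradient instead of an analytic one, and plain gradient ascent instead of ADAM or BFGS. The data, the parametrization and all names are assumptions.

```python
import numpy as np

def vac_score(theta, X0, X1):
    """Autocorrelation of the projection chi(x; theta) = x . (cos theta, sin theta)."""
    v = np.array([np.cos(theta), np.sin(theta)])
    a, b = X0 @ v, X1 @ v
    return (a @ b) / (a @ a)

def gradient_ascent(X0, X1, theta=0.0, rate=0.5, steps=100, h=1e-5):
    """Maximize the score with a central finite-difference gradient."""
    for _ in range(steps):
        grad = (vac_score(theta + h, X0, X1) - vac_score(theta - h, X0, X1)) / (2 * h)
        theta += rate * grad
    return theta

# Toy data: a slow coordinate along the direction theta_true, fast noise orthogonal to it.
rng = np.random.default_rng(3)
n_frames, decay, theta_true = 20000, 0.95, 0.6
slow = np.zeros(n_frames)
for t in range(1, n_frames):
    slow[t] = decay * slow[t - 1] + np.sqrt(1 - decay**2) * rng.standard_normal()
fast = rng.standard_normal(n_frames)
u = np.array([np.cos(theta_true), np.sin(theta_true)])
w = np.array([-np.sin(theta_true), np.cos(theta_true)])
X = np.outer(slow, u) + np.outer(fast, w)
X = X - X.mean(axis=0)
X0, X1 = X[:-1], X[1:]                       # time-lagged pairs at lag 1

theta_opt = gradient_ascent(X0, X1)
print(theta_opt, vac_score(theta_opt, X0, X1))
```

In a realistic setting the test function has many parameters and the gradient would come from automatic differentiation, but the loop structure stays the same.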

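Reversible dynamics
• In equilibrium, every trajectory is as probable as its time-reversed copy:
$\mathbb{P}(s(t+\tau) = j \text{ and } s(t) = i) = \mathbb{P}(s(t+\tau) = i \text{ and } s(t) = j)$
$\mathbb{P}(s(t+\tau) = j \mid s(t) = i)\,\mathbb{P}_{\mathrm{eq}}(s(t) = i) = \mathbb{P}(s(t+\tau) = i \mid s(t) = j)\,\mathbb{P}_{\mathrm{eq}}(s(t) = j)$
$\pi_i T_{ij} = \pi_j T_{ji}$
• In mathematician’s notation: $\langle \mathbf{e}_i, \mathbf{T}\mathbf{e}_j \rangle_\pi = \langle \mathbf{e}_j, \mathbf{T}\mathbf{e}_i \rangle_\pi$ where $\langle \mathbf{x}, \mathbf{y} \rangle_\pi = \sum_i x_i y_i \pi_i$
• $\mathbf{T}$ is a symmetric matrix w.r.t. a non-standard scalar product.
• $\mathbf{T}$ has real eigenvalues and eigenvectors (linear algebra I).
Prinz et al., J. Chem. Phys. 134, 174105 (2011)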
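
Detailed balance is easy to test for a given transition matrix. The following sketch uses a hypothetical three-state nearest-neighbour chain and hypothetical helper names; it computes the stationary distribution as a left eigenvector and checks that the flux matrix $\pi_i T_{ij}$ is symmetric.

```python
import numpy as np

def stationary_distribution(T):
    """Left eigenvector of T for the eigenvalue closest to 1, normalized to sum to 1."""
    w, V = np.linalg.eig(T.T)
    pi = np.real(V[:, np.argmin(np.abs(w - 1.0))])
    return pi / pi.sum()

def satisfies_detailed_balance(T, tol=1e-10):
    """Check pi_i T_ij == pi_j T_ji for all pairs of states."""
    pi = stationary_distribution(T)
    flux = pi[:, None] * T                    # flux_ij = pi_i T_ij
    return np.allclose(flux, flux.T, atol=tol)

# Hypothetical 3-state chain with nearest-neighbour jumps only.
T = np.array([[0.9, 0.1, 0.0],
              [0.1, 0.8, 0.1],
              [0.0, 0.2, 0.8]])

print(stationary_distribution(T))
print(satisfies_detailed_balance(T))
print(np.linalg.eigvals(T))
```

Chains with only nearest-neighbour jumps always satisfy detailed balance, so the eigenvalues printed at the end are real.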

The problem with nonreversible systems
• $R_k = \sum_{i=1}^{k} \lambda_i$ where $\lambda_i$ are the true eigenvalues.
• For nonreversible dynamics $\langle \mathbf{e}_i, \mathbf{T}\mathbf{e}_j \rangle_\pi \ne \langle \mathbf{e}_j, \mathbf{T}\mathbf{e}_i \rangle_\pi$
• There might not even be a well-defined $\boldsymbol{\pi}$.
• Eigenvalues and eigenvectors will in general be complex.
• The variational principle doesn’t work. $\mathrm{acf}(\psi) \le \hat{\lambda} \in \mathbb{C}$ makes no sense. One can’t order complex numbers on a line.
• Optimization of models is not possible.
• Feature selection is not possible.
• Is there any way to fix this? Can we maybe find some other operator that is related to the dynamics and that is symmetric?
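
A three-state cycle makes the problem concrete. The matrix below is a hypothetical example of driven, nonreversible dynamics: probability flows mostly in one direction around the ring, detailed balance fails, and two of the eigenvalues form a complex-conjugate pair.

```python
import numpy as np

# Hypothetical driven cycle 0 -> 1 -> 2 -> 0 with a small probability of staying put.
T = np.array([[0.1, 0.9, 0.0],
              [0.0, 0.1, 0.9],
              [0.9, 0.0, 0.1]])

pi = np.full(3, 1 / 3)                        # uniform, because T is doubly stochastic
flux = pi[:, None] * T                        # flux_ij = pi_i T_ij
eigenvalues = np.linalg.eigvals(T)

print(np.allclose(pi @ T, pi))                # pi is stationary
print(np.allclose(flux, flux.T))              # detailed balance is violated
print(eigenvalues)                            # one real eigenvalue, one complex pair
```

Here a stationary distribution still exists, but the complex eigenvalues cannot be ranked, so there is no "largest" eigenvalue to bound an autocorrelation with.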

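A possible solution: VAMP
Variational approach to Markov processes
• Introduce the “backward” transition matrix $\mathbf{T}_b := \mathbf{C}(N)^{-1}\mathbf{C}(-\tau) = \mathbf{C}(N)^{-1}\mathbf{C}^\top(\tau)$, i.e. estimate MSM/TICA from time-reversed data, where
$C_{ij}(-\tau) := \sum_{t=\tau}^{N} f_i(x(t))\, f_j(x(t-\tau))$
$C_{ij}(N) := \sum_{t=\tau}^{N} f_i(x(t))\, f_j(x(t))$
• Introduce the forward-backward transition matrices $\mathbf{T}_{fb} := \mathbf{T}\mathbf{T}_b$ and $\mathbf{T}_{bf} := \mathbf{T}_b\mathbf{T}$
• One can show that $\mathbf{T}_{fb}$ and $\mathbf{T}_{bf}$ are symmetric (with respect to the scalar products weighted by $\mathbf{C}(0)$ and $\mathbf{C}(N)$, respectively) without any reference to a stationary vector (symmetry is built into the matrices).
• Eigenvalues and eigenvectors of $\mathbf{T}_{fb}$ and $\mathbf{T}_{bf}$ are real.
• They fulfill a variational principle: $\left\| \mathbf{C}(0)^{-1/2}\, \mathbf{C}(\tau)\, \mathbf{C}(N)^{-1/2} \right\| \le R$
Wu, Noé, J. Nonlinear Sci. 30, 23 (2020)
Klus, S. et al., J. Nonlinear Sci. 28, 1 (2018)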
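
The construction can be sketched with NumPy on nonreversible toy data. The assumptions are: a noisy damped rotation in the plane as the dynamics, the coordinates themselves as features, removal of the global mean, and the sum of squared singular values (often called the VAMP-2 score) as the norm. None of these choices is prescribed by the slide.

```python
import numpy as np

def inv_sqrt(C):
    """Inverse matrix square root of a symmetric positive-definite matrix."""
    s, U = np.linalg.eigh(C)
    return U @ np.diag(s ** -0.5) @ U.T

def vamp_matrices(X, lag):
    """Correlation matrices C(0), C(tau), C(N) of a feature trajectory X."""
    X = X - X.mean(axis=0)
    X0, X1 = X[:-lag], X[lag:]
    n = len(X0)
    return X0.T @ X0 / n, X0.T @ X1 / n, X1.T @ X1 / n

# Nonreversible toy dynamics: damped rotation in the plane plus noise.
rng = np.random.default_rng(4)
n_frames, decay, angle = 20000, 0.9, 0.5
R = np.array([[np.cos(angle), -np.sin(angle)],
              [np.sin(angle), np.cos(angle)]])
X = np.zeros((n_frames, 2))
for t in range(1, n_frames):
    X[t] = decay * R @ X[t - 1] + np.sqrt(1 - decay**2) * rng.standard_normal(2)

C0, Ct, CN = vamp_matrices(X, lag=1)
T = np.linalg.solve(C0, Ct)                   # forward matrix  C(0)^-1 C(tau)
T_b = np.linalg.solve(CN, Ct.T)               # backward matrix C(N)^-1 C(tau)^T
T_fb = T @ T_b                                # forward-backward matrix

K = inv_sqrt(C0) @ Ct @ inv_sqrt(CN)          # whitened matrix from the variational principle
sigma = np.linalg.svd(K, compute_uv=False)    # real singular values
vamp2 = np.sum(sigma ** 2)

print(np.linalg.eigvals(T))                   # complex pair: the plain spectrum is not orderable
print(np.linalg.eigvals(T_fb), sigma ** 2)    # real, and equal to the squared singular values
print(vamp2)
```

The forward matrix has complex eigenvalues because of the rotation, whereas the forward-backward matrix has real eigenvalues that coincide with the squared singular values of the whitened matrix.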

Cross-validation
• The model parameters (in this example parameters of the line and steepness of the transition) were optimized for a particular realization of the dynamics.
• Didn’t we say that the eigenfunctions and eigenvalues were an intrinsic property of the molecular system?
• So the eigenfunctions should be the same if we repeat the analysis with a second simulation of the same system.


Cross-validation
• Ideally, we want to tell if the solution is robust at a single glance by measuring the robustness with one number.
• The VAMP score or VAC score (also called GMRQ [1]) lends itself to this task.
• Keep all the trained model parameters fixed (here the line parameters and the steepness of the transition), plug in new data and recompute the test autocorrelation.
• The test autocorrelation will in general be lower than the training autocorrelation; a large drop means that the original model was fit to noise (overfit).
[1] McGibbon, Pande, J. Chem. Phys. 142, 124105 (2015)

Cross-validation
• Reporting a test score that was computed from independent realizations is the gold standard.
• Independent realizations can be expensive to sample.
• As an approximation, do k-fold (hold-out) cross-validation:
• Split all data into a training set and a test set.
• Optimize the model parameters with the training set and test the parameters with the test set.
• Repeat for k different divisions of the data.
• k-fold cross-validation can be tricky with highly autocorrelated time series data!
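
One common way to soften the autocorrelation problem is to hold out contiguous blocks of the trajectory instead of randomly chosen frames. The sketch below illustrates that idea under several assumptions: a linear model fitted by the symmetrized generalized eigenvalue problem, toy data with one slow feature and four noise features, removal of the global mean for brevity, and hypothetical function names.

```python
import numpy as np

def blocked_folds(n_pairs, k):
    """Yield (train, test) index arrays; every test fold is one contiguous block."""
    blocks = np.array_split(np.arange(n_pairs), k)
    for i, test in enumerate(blocks):
        train = np.concatenate([b for j, b in enumerate(blocks) if j != i])
        yield train, test

def fit_direction(X0, X1):
    """Leading solution v of C(tau) v = lambda C(0) v (symmetrized estimate)."""
    C0 = X0.T @ X0 + X1.T @ X1
    Ct = X0.T @ X1 + X1.T @ X0
    s, U = np.linalg.eigh(C0)
    L = U @ np.diag(s ** -0.5) @ U.T
    lam, W = np.linalg.eigh(L @ Ct @ L)
    return L @ W[:, -1]

def autocorrelation(v, X0, X1):
    """Score of the fixed direction v on (possibly new) time-lagged pairs."""
    a, b = X0 @ v, X1 @ v
    return (a @ b) / (a @ a)

# Toy data: one slow feature and four pure-noise features.
rng = np.random.default_rng(5)
n_frames, decay, lag, k = 5000, 0.95, 1, 5
slow = np.zeros(n_frames)
for t in range(1, n_frames):
    slow[t] = decay * slow[t - 1] + np.sqrt(1 - decay**2) * rng.standard_normal()
X = np.column_stack([slow] + [rng.standard_normal(n_frames) for _ in range(4)])
X = X - X.mean(axis=0)
X0, X1 = X[:-lag], X[lag:]

train_scores, test_scores = [], []
for train, test in blocked_folds(len(X0), k):
    v = fit_direction(X0[train], X1[train])   # parameters are fixed after this line
    train_scores.append(autocorrelation(v, X0[train], X1[train]))
    test_scores.append(autocorrelation(v, X0[test], X1[test]))

print(np.mean(train_scores), np.mean(test_scores))
```

Pairs that straddle a block boundary still share frames between training and test data, so some leakage remains; leaving a gap of a few correlation times between blocks would reduce it further.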

Applications

Application: feature selection
• Variational principle: the higher the score, the better.
• Compare test scores for different selections of molecular features. Which selection gives the best score?
distances? dihedrals? contacts? side chain flips? rigid body approximation? chemical intuition?

Application: feature selection
Scherer et al., J. Chem. Phys. 150, 194108 (2019)

Application: ion channel non-equilibrium MD
Analysis of MD simulation data of the “controversial” direct-knock-on conduction mechanism in the KcsA potassium channel. Ions are constantly inserted at one side of the membrane and deleted at the other side.
Paul et al., J. Chem. Phys. 150, 164120 (2019). Fig. 1 and data: Köpfer et al., Science 346, 352 (2014).

Application: ion channel non-equilibrium MD
By clustering in the VAMP space, we identified 15 different states that differ structurally near the selectivity filter and differ in their conductivity.
