
Protein Folding Simulation in Concurrent Constraint Programming
Luca Bortolussi, Alessandro Dal Palù, Agostino Dovier (DIMI, Univ. of Udine, IT)
Federico Fogolari (DST, Univ. of Verona, IT)

Outline of the talk
• Introduction
• Concurrent framework
• Testing model
• Results
• Future Work
L. Bortolussi, A. Dal Palù, A. Dovier, and F. Fogolari, BIOCONCUR 2004

Proteins
• Proteins are abundant in nature and fundamental to life.
• The diversity of 3D protein structure underlies the very large range of their functions (enzymes, storage, transport, messengers, antibodies, regulation, mechanical support).
• A protein is a polymer chain made of monomers (amino acids) of 20 different kinds.
• Amino acids have a common part (6 atoms) and a distinguishing part (from 1 to 18 atoms).
• They are typically identified by one letter in {A, ..., Z} \ {B, J, O, U, X, Z}.

Proteins
• The Primary Structure is the sequence of amino acids constituting a protein.
• The Secondary Structures of a protein are local structures (α-helices, β-sheets) whose formation is caused by local forces.
• The Tertiary Structure, which determines macroscopic properties and biological functions, is the 3D conformation of the protein.

Example: Protein 1ENH
• Primary Structure: R,P,R,T,A,F,S,S,E,Q, L,A,R,L,K,R,E,F,N,E, N,R,Y,L,T,E,R,R,R,Q, Q,L,S,S,E,L,G,L,N,E, A,Q,I,K,I,W,F,Q,N,K, R,A,K,I
• Tertiary Structure: [Figure: all-atom model and simplified model]

The Protein Structure Prediction Problem
• Proteins fold in a determined environment (e.g. water) to form a very specific geometric pattern (native state/conformation).
• The native conformation is relatively stable and unique, and corresponds to a state which minimizes the global free energy.
• The Protein Structure Prediction problem (PSP) consists in predicting the Tertiary Structure of a protein, given its Primary Structure.

Approaches to PSP
Several different approaches have been used to tackle PSP:
• Homology modelling and fold recognition;
• All-atom simulation using molecular dynamics;
• Constraint-based approaches in lattices;
• Ab-initio simulations using simplified models.

CCP simulation framework
• We encoded the PSP problem in the Concurrent Constraint (Logic) Programming paradigm.
• Each amino acid is associated with an independent process.
• Each process communicates with the others and reacts to changes in their spatial positions.
• The framework is independent of the spatial model of the protein and of the energy model.

Communication Strategy
• Each process, before performing a move, waits for the communication of a movement of some other amino acid;
• The position changes of process P_i are stored in a list L_i of logic terms (leaving the tail variable uninstantiated), which thus keeps track of the entire folding history known to that process;
• Each process, while moving, uses the most recent information available to it, i.e. the last ground terms of the lists L_i;
• Each process, once it has moved, communicates its new position to all other processes by updating its list.

Abstract CCP program
    simulation(S):-
        Init=[[I1|_],
              [I2|_],
              ...,
              [In|_]],
        run(1,S,Init) ||
        run(2,S,Init) ||
        ... ||
        run(n,S,Init).

    run(ID, S, [P1, P2, ..., Pn]):-
        getTails([P1, ..., Pn],[T1, ..., Tn]),
        ask(T1=[_|_]) -> skip +
        ask(T2=[_|_]) -> skip +
        ... +
        ask(Tn=[_|_]) -> skip,
        getLast([P1, ..., Pn],[L1, ..., Ln]),
        updatePosition(ID,S,[L1,..,Ln],NP),
        tell(TID=[NP|_]),
        run(ID,S,[P1, ..., Pn]).
• The main clause is simulation.
• Init is a variable containing n lists, each holding the initial position Ii of one amino acid.
• n concurrent calls to run are made, one for each process (amino acid).

Abstract CCP program (continued)
• ID is the identification code of the amino acid.
• getTails gets the tails of the lists P1,...,Pn and assigns them to the variables T1,...,Tn.
• Then the process waits for one of these variables to be instantiated with ask(Ti=[_|_]) -> skip.

Abstract CCP program (continued)
• Once this happens, it retrieves the latest information with getLast.
• Then it updates its position with updatePosition and communicates its move to all other processes by means of tell.
• Finally, the run procedure is called recursively.
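The ask/tell loop of run can be imitated in Python as a rough sketch. Everything here is an assumption made for illustration and not part of the talk: threads stand in for CCP agents, a `threading.Condition` plays the role of the constraint store, and the names `SharedStore`, `budget` and `order` are hypothetical. The slides do not say how the first move is triggered or when the simulation stops, so the sketch treats the initial positions as news and stops after a fixed number of moves. It needs at least two processes, since a process only moves after some other process has moved.

```python
import threading


class SharedStore:
    """Stand-in for the constraint store: one append-only history per process."""

    def __init__(self, initial_positions, budget):
        self.logs = [[p] for p in initial_positions]  # Init = [[I1|_], ..., [In|_]]
        self.order = []       # ids of the processes, in the order they moved
        self.budget = budget  # hypothetical stop rule: total number of moves
        self.cond = threading.Condition()

    def total(self):
        return sum(len(log) for log in self.logs)


def run(store, ident, update_position):
    seen = 0  # 0 means the initial positions count as unseen news
    with store.cond:  # moves are serialized here; a simplification
        while True:
            # ask(T1=[_|_]) + ... + ask(Tn=[_|_]): wait until some history grew
            store.cond.wait_for(
                lambda: store.total() > seen or len(store.order) >= store.budget)
            if len(store.order) >= store.budget:
                return
            latest = [log[-1] for log in store.logs]   # getLast
            new_pos = update_position(ident, latest)    # updatePosition
            store.logs[ident].append(new_pos)           # tell
            store.order.append(ident)
            seen = store.total()  # own move is not news for this process
            store.cond.notify_all()


def simulation(initial_positions, budget, update_position):
    store = SharedStore(initial_positions, budget)
    threads = [threading.Thread(target=run, args=(store, i, update_position))
               for i in range(len(initial_positions))]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return store
```

Calling `simulation(positions, 12, some_update)` returns the store after twelve moves; `store.order` never contains the same process twice in a row, which mirrors the rule that a process waits for someone else's move before moving again.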

Movement Strategy
The procedure updatePosition works in the following way:
• The amino acid randomly chooses a new position, close to the current one, within a given step;
• Using the most recent information available about the spatial positions of the other processes, it computes the energy of this choice;
• It accepts the position using a Monte Carlo criterion:
  - If the new energy is lower than the current one, it accepts the move;
  - If the new energy is greater than the current one, it accepts the move with probability $e^{-(E_{new} - E_{current})/T}$.
• This procedure depends on the spatial model adopted.
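The acceptance rule can be written down in a few lines. This is a minimal sketch, not the authors' code: the function name `accept_move` and the `rng` parameter are hypothetical, and T is treated as a positive temperature-like parameter, which the slide does not spell out. Equal energies are accepted, since the probability formula gives 1 in that case.

```python
import math
import random


def accept_move(e_current, e_new, temperature, rng=random):
    """Monte Carlo (Metropolis-style) acceptance test for a proposed move."""
    if e_new <= e_current:
        return True  # moves that do not raise the energy are always accepted
    # Uphill move: accept with probability exp(-(E_new - E_current) / T).
    return rng.random() < math.exp(-(e_new - e_current) / temperature)
```

A larger T makes uphill moves more likely to be accepted; as T approaches zero the rule accepts only moves that do not raise the energy.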

Movement Strategy
The new position is randomly selected in the following way:
• We calculate the set of points that preserve the distances to the adjacent neighbours of the amino acid (a circle or a sphere);
• We randomly select a point in this set, close to the current position;
• We randomly select a small offset from this point.
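For an amino acid with two neighbours, the three steps above can be sketched as a rotation about the axis through the neighbours, followed by a small random offset. This is an illustrative sketch only: the function names and the bounds `max_angle` and `max_offset` are assumptions, and the case of a chain end (one neighbour, hence a sphere) is not covered.

```python
import math
import random


def _add(u, v):
    return (u[0] + v[0], u[1] + v[1], u[2] + v[2])


def _sub(u, v):
    return (u[0] - v[0], u[1] - v[1], u[2] - v[2])


def _scale(u, c):
    return (u[0] * c, u[1] * c, u[2] * c)


def _dot(u, v):
    return u[0] * v[0] + u[1] * v[1] + u[2] * v[2]


def _cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])


def rotate_about_axis(p, a, b, theta):
    """Rotate p by theta around the line through a and b (Rodrigues' formula).

    The result keeps the same distance to a and to b as p has, so it lies
    on the circle of admissible positions.
    """
    axis = _sub(b, a)
    k = _scale(axis, 1.0 / math.sqrt(_dot(axis, axis)))  # unit axis vector
    v = _sub(p, a)
    c, s = math.cos(theta), math.sin(theta)
    rotated = _add(_scale(v, c),
                   _add(_scale(_cross(k, v), s),
                        _scale(k, _dot(k, v) * (1.0 - c))))
    return _add(a, rotated)


def propose_position(p, a, b, max_angle=0.3, max_offset=0.05, rng=random):
    """Pick a nearby point on the circle, then perturb it slightly."""
    theta = rng.uniform(-max_angle, max_angle)  # small angle: stay close to p
    on_circle = rotate_about_axis(p, a, b, theta)
    offset = tuple(rng.uniform(-max_offset, max_offset) for _ in range(3))
    return _add(on_circle, offset)
```

Because of the final offset, the proposed point only approximately preserves the neighbour distances; in the energy model described later, deviations of this kind are what the bond distance term penalizes.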

Testing Model
[Figure: schematic of an amino acid showing the side chain and the atoms N, H, Cα, H, C′, O]
• Each amino acid is represented as a single center of interaction, which corresponds to the Cα atom.
• The energy function consists of four terms, which take into account local and global interactions.
• This model is very simple, but served as a test for the framework.

Energy Function
The Energy Function we use is
$$E(\vec{s}) = \eta_b E_b(\vec{s}) + \eta_a E_a(\vec{s}) + \eta_t E_t(\vec{s}) + \eta_c E_c(\vec{s})$$
$E_b(\vec{s})$ is the Bond Distance term
$$E_b(\vec{s}) = \sum_{1 \le i \le n-1} \left( r(s_i, s_{i+1}) - r_0 \right)^2$$
$E_a(\vec{s})$ is the Bond Angle Bend term
$$E_a(\vec{s}) = \sum_{i=1}^{n-2} -\log\left( a_1 e^{-\left(\frac{\beta_i - \beta_1}{\sigma_1}\right)^2} + a_2 e^{-\left(\frac{\beta_i - \beta_2}{\sigma_2}\right)^2} \right)$$
[Plot: horizontal axis in radians (0 to 3), vertical axis from 0 to 1]
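The two local terms can be evaluated directly from a list of Cα coordinates. The sketch below is an illustration under stated assumptions: the function names are hypothetical, the bend angle β_i is taken as the angle at the middle atom of three consecutive positions (the slides do not fix the convention), and all parameter values are left to the caller because the talk does not list them.

```python
import math


def bond_distance_energy(coords, r0):
    """E_b: sum of squared deviations of consecutive distances from r0."""
    return sum((math.dist(coords[i], coords[i + 1]) - r0) ** 2
               for i in range(len(coords) - 1))


def bend_angle(p, q, r):
    """Angle at q, in radians, between the directions q->p and q->r."""
    u = [p[k] - q[k] for k in range(3)]
    v = [r[k] - q[k] for k in range(3)]
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    # Clamp to [-1, 1] to guard acos against rounding error.
    return math.acos(max(-1.0, min(1.0, dot / norm)))


def bond_angle_energy(coords, a1, beta1, sigma1, a2, beta2, sigma2):
    """E_a: negative log of a two-Gaussian mixture over the bend angles."""
    total = 0.0
    for i in range(len(coords) - 2):
        beta = bend_angle(coords[i], coords[i + 1], coords[i + 2])
        mixture = (a1 * math.exp(-((beta - beta1) / sigma1) ** 2)
                   + a2 * math.exp(-((beta - beta2) / sigma2) ** 2))
        total += -math.log(mixture)
    return total
```

Taking the negative logarithm of the mixture turns the two preferred angles β_1 and β_2 into energy minima, so conformations whose bend angles sit near either value are favoured.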

Energy Function (continued)
$E_t(\vec{s})$ is the Torsional Angle term
$$E_t(\vec{s}) = \sum_{i=1}^{n-3} -\log\left( a_1 e^{-\frac{(\Phi_i - \phi_1)^2}{(\sigma_1 + \sigma_0)^2}} + a_2 e^{-\frac{(\Phi_i - \phi_2)^2}{(\sigma_2 + \sigma_0)^2}} \right)$$
$E_c(\vec{s})$ is the Contact Interaction term
$$E_c(\vec{s}) = \sum_{i=1}^{n-3} \sum_{j=i+3}^{n} \left[ \left| Pot(s_i, s_j) \right| \left( \frac{r_0(s_i, s_j)}{r(s_i, s_j)} \right)^{12} + Pot(s_i, s_j) \left( \frac{r_0(s_i, s_j)}{r(s_i, s_j)} \right)^{6} \right]$$
[Plots: one with horizontal axis in radians (0 to 6); one labelled Potential, with horizontal axis r]
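The contact term is a double sum over pairs at least three positions apart along the chain. A minimal sketch follows, assuming that `pot` and `r0` are caller-supplied functions of the two (0-based) indices; in the talk they depend on the amino acids s_i and s_j, and their actual values are not given, so these callables are hypothetical.

```python
import math


def contact_energy(coords, pot, r0):
    """E_c: Lennard-Jones-like term over pairs (i, j) with j >= i + 3."""
    n = len(coords)
    total = 0.0
    for i in range(n - 3):             # i = 1 .. n-3 in the slide's numbering
        for j in range(i + 3, n):      # j = i+3 .. n
            ratio = r0(i, j) / math.dist(coords[i], coords[j])
            p = pot(i, j)
            # |Pot| scales the repulsive part, Pot itself the second part,
            # which is attractive only when Pot is negative.
            total += abs(p) * ratio ** 12 + p * ratio ** 6
    return total
```

With a negative Pot the pair contribution is zero when r equals r_0 and reaches its minimum at a slightly larger distance; with a positive Pot both parts are repulsive.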

Implementation
The code is implemented in Mozart. There are two classes:
• Protein, which implements the simulation predicate and coordinates the processes associated with the individual amino acids;
• Amino, which describes a single amino acid and implements all the methods related to its actions, such as updatePosition, ask and tell.
