Protein Hypernetworks - Johannes Köster, Eli Zamir, Sven Rahmann


SLIDE 1

1 / 14 Genome Informatics

Protein Hypernetworks

Johannes Köster, Eli Zamir, Sven Rahmann

August 20, 2012

SLIDE 5

Protein Network Modelling

Interaction maps (undirected graphs)

[Figure: example interaction map on proteins A to I]

Protein Hypernetworks

?

Differential equations (Law of Mass Action), Bayesian Networks, ...

$$\frac{d[C]}{dt} = k_{\mathrm{on}}[A][B] - k_{\mathrm{off}}[C]$$

[Figure: the modelling approaches placed between the labels "accuracy" and "scalability"]
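
To make the mass action equation concrete, here is a minimal sketch that integrates it with explicit Euler steps. It assumes the reaction A + B ⇌ C, so that each complex formed consumes one A and one B; the function name, rate constants and step size are illustrative assumptions and not taken from the slides.

```python
def simulate_binding(a0, b0, c0, k_on, k_off, dt=1e-3, steps=20000):
    """Integrate d[C]/dt = k_on*[A]*[B] - k_off*[C] with explicit Euler steps.

    Assumes A + B <-> C, so A and B shrink by exactly what C gains.
    """
    a, b, c = a0, b0, c0
    for _ in range(steps):
        rate = k_on * a * b - k_off * c  # net formation rate of the complex C
        a -= rate * dt
        b -= rate * dt
        c += rate * dt
    return a, b, c


# Hypothetical starting concentrations and rate constants.
a, b, c = simulate_binding(a0=1.0, b0=1.0, c0=0.0, k_on=1.0, k_off=1.0)
print(round(c, 4))
```

Even this two-protein system needs two rate constants, which hints at why such models are hard to scale to whole interaction maps.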

SLIDE 6

Structure

1 Protein Hypernetworks
2 Mining Protein Hypernetworks
3 Data Acquisition

SLIDE 9

Idea

Protein Network (P, I)

Protein Hypernetwork (P, I, C)

[Figure: example network on proteins A to I]

Boolean Logic Constraints C

{G, H} ⇒ {I, H}
{A, B} ⇒ ¬{G, B}
{G, B} ⇒ ¬{A, B}
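
One possible way to hold a hypernetwork (P, I, C) in code is sketched below. The triple encoding of a constraint as (premise, conclusion, positive) and the helper names `ia` and `violated` are assumptions made for illustration; only the three constraints themselves come from the slide.

```python
def ia(x, y):
    """Hypothetical helper: an undirected interaction as a frozenset of two proteins."""
    return frozenset((x, y))


P = {"A", "B", "G", "H", "I"}                              # proteins used below
I = {ia("G", "H"), ia("I", "H"), ia("A", "B"), ia("G", "B")}  # interactions used below

# Each constraint is (premise, conclusion, positive):
# positive=True means premise => conclusion, False means premise => not conclusion.
C = [
    (ia("G", "H"), ia("I", "H"), True),   # {G, H} => {I, H}
    (ia("A", "B"), ia("G", "B"), False),  # {A, B} => not {G, B}
    (ia("G", "B"), ia("A", "B"), False),  # {G, B} => not {A, B}
]


def violated(active, constraints):
    """Return the constraints broken by a set of simultaneously active interactions."""
    broken = []
    for premise, conclusion, positive in constraints:
        if premise in active and ((conclusion in active) != positive):
            broken.append((premise, conclusion, positive))
    return broken


print(len(violated({ia("A", "B"), ia("G", "B")}, C)))  # both competition constraints break
```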

SLIDE 11

Mining Protein Hypernetworks

Protein Hypernetwork (P, I, C)

[Figure: example hypernetwork on proteins A to I]

Minimal network states (Nec, Imp) for q ∈ P ∪ I

$$q \land \bigwedge_{c \in C} c$$

[Figure: the minimal network states of the example]

Satisfying model α : P ∪ I → {0, 1} by tableau algorithm
Nec := {q′ ∈ P ∪ I | α(q′) = 1}
Imp := {q′ ∈ P ∪ I | α(q′) = 0 due to active c ∈ C}
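
The following sketch derives (Nec, Imp) for one query q. It deliberately swaps the tableau algorithm for plain forward chaining, which is only adequate when every constraint has a single proposition on each side, as in the three example constraints. Propositions are plain strings, and the function name and constraint encoding are assumptions; the implicit requirement that an interaction needs its two proteins is left out.

```python
def minimal_network_state(q, constraints):
    """Forward-chain from q over constraints given as (premise, conclusion, positive).

    Returns (nec, imp): what must be present and what is made impossible.
    This is a simplification, not the tableau algorithm named on the slide.
    """
    nec, imp = {q}, set()
    changed = True
    while changed:
        changed = False
        for premise, conclusion, positive in constraints:
            if premise not in nec:
                continue  # the constraint is not active
            target = nec if positive else imp
            if conclusion not in target:
                target.add(conclusion)
                changed = True
    if nec & imp:
        raise ValueError(f"no consistent state for {q}")
    return nec, imp


constraints = [
    ("GH", "HI", True),   # {G, H} => {I, H}
    ("AB", "BG", False),  # {A, B} => not {G, B}
    ("BG", "AB", False),  # {G, B} => not {A, B}
]
print(minimal_network_state("AB", constraints))
```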

SLIDE 13

Tableau Algorithm

Propositional Logic Tableau Algorithm

for a given formula f, explore depth-first the tree of deductions from root f
each root-leaf path without contradiction is a satisfying model

Modifications

expand disjunctions from left to right
allow pre-blocking of subformulas to guide the algorithm to the right model

Example formula: AB ∧ (AB ⇒ BC) ∧ (CD ⇒ ¬DE)

[Figure: tableau tree for the example with nodes AB, (AB ⇒ BC), (CD ⇒ ¬DE), ¬AB, BC, ¬CD]
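
A compact sketch of a depth-first propositional tableau that tries the left disjunct first is given below. The nested-tuple formula encoding and the function name are assumptions for illustration, and the pre-blocking of subformulas mentioned above is not implemented.

```python
def tableau(todo, literals=None):
    """Depth-first tableau: return the literals of the first open branch, or None.

    Formulas are nested tuples: ("atom", name), ("not", f), ("and", f, g),
    ("or", f, g), ("imp", f, g).
    """
    literals = dict(literals or {})
    todo = list(todo)
    while todo:
        f = todo.pop()
        op = f[0]
        if op == "atom" or (op == "not" and f[1][0] == "atom"):
            name, value = (f[1], True) if op == "atom" else (f[1][1], False)
            if literals.get(name, value) != value:
                return None  # contradiction closes this branch
            literals[name] = value
        elif op == "and":
            todo += [f[2], f[1]]
        elif op == "imp":  # a => b is expanded as (not a) or b
            todo.append(("or", ("not", f[1]), f[2]))
        elif op == "or":
            for branch in (f[1], f[2]):  # left disjunct is tried first
                model = tableau(todo + [branch], literals)
                if model is not None:
                    return model
            return None
        else:  # "not" applied to a compound formula
            g = f[1]
            if g[0] == "not":
                todo.append(g[1])
            elif g[0] == "and":
                todo.append(("or", ("not", g[1]), ("not", g[2])))
            elif g[0] == "or":
                todo += [("not", g[2]), ("not", g[1])]
            elif g[0] == "imp":
                todo += [("not", g[2]), g[1]]
    return literals


AB, BC, CD, DE = (("atom", n) for n in ("AB", "BC", "CD", "DE"))
f = ("and", AB, ("and", ("imp", AB, BC), ("imp", CD, ("not", DE))))
model = tableau([f])
print(model)
```

On the example formula the branch ¬AB closes against AB, so the search continues with BC and then takes ¬CD, the left disjunct of the second implication. Atoms that never appear in the returned dictionary (here DE) are left undecided.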

SLIDE 21

Minimal Network States

Clashes

Two minimal network states (Nec, Imp) and (Nec′, Imp′) are clashing iff Nec ∩ Imp′ ≠ ∅ or Nec′ ∩ Imp ≠ ∅.
not clashing pair → interactions simultaneously possible

[Figure: two example minimal network states, on proteins A, B, G and on G, H, I]
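
A sketch of the clash test follows, taking two states to clash when something necessary in one is impossible in the other, that is, when one of the two intersections is non-empty. States are (Nec, Imp) pairs of string sets; the example states and the function name are assumptions for illustration.

```python
def clashing(state, other):
    """True if an entity necessary in one state is impossible in the other."""
    nec, imp = state
    nec2, imp2 = other
    return bool(nec & imp2) or bool(nec2 & imp)


# Hypothetical minimal network states for the interactions AB, BG and GH.
s_ab = ({"AB"}, {"BG"})
s_bg = ({"BG"}, {"AB"})
s_gh = ({"GH", "HI"}, set())

print(clashing(s_ab, s_bg))  # True: AB and BG exclude each other
print(clashing(s_ab, s_gh))  # False: both can hold simultaneously
```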

SLIDE 26

Prediction of Protein Complexes

Network based complex prediction

◮ e.g. dense regions

[Figure: example network on proteins A to I]

Maximal combinations of minimal network states

[Figure: candidate protein sets C D E F I and A H G B I]

Refined complexes

◮ no violated constraints

[Figure: refined protein sets C D E F I and A H G I]
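
The slides do not say how the maximal combinations are enumerated. As a rough sketch under that caveat, the brute-force search below lists every maximal set of pairwise non-clashing states; it is exponential and only meant for toy inputs. The state dictionary and function names are assumptions.

```python
from itertools import combinations


def clashing(state, other):
    """True if an entity necessary in one state is impossible in the other."""
    return bool(state[0] & other[1]) or bool(other[0] & state[1])


def maximal_combinations(states):
    """Brute force: all maximal sets of pairwise non-clashing states."""
    names = sorted(states)
    compatible = [
        frozenset(group)
        for r in range(1, len(names) + 1)
        for group in combinations(names, r)
        if not any(clashing(states[x], states[y]) for x, y in combinations(group, 2))
    ]
    # Keep only the sets that are not strictly contained in another compatible set.
    return [g for g in compatible if not any(g < h for h in compatible)]


# Hypothetical states keyed by the interaction they were derived for.
states = {
    "AB": ({"AB"}, {"BG"}),
    "BG": ({"BG"}, {"AB"}),
    "GH": ({"GH", "HI"}, set()),
}
result = maximal_combinations(states)
print(result)
```

In this toy input AB and BG compete, so the result contains two alternatives, each combined with GH, rather than one set holding all three.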

SLIDE 29

Prediction of Functional Importance

Minimal network state graph

[Figure: graph over the proteins A to I and the interactions AB, AG, AH, GH, HI, EI, FI, FG, BG, BC, CD, CF, DF, ED, EF]

Breadth first search from each node

[Figure: the same graph with a breadth-first search from one node]

Perturbation Impact Score

$$\mathrm{PIS}_{(P,I,C)}(Q_{\downarrow}) := |\mathrm{BFS}(Q_{\downarrow})|$$
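
A minimal sketch of the score as a breadth-first search count is shown below. It assumes a directed graph stored as an adjacency dictionary, with an edge from an entity to each entity affected by its loss, and it counts the perturbed nodes themselves; both choices, the toy graph and the function names are assumptions rather than details given on the slide.

```python
from collections import deque


def bfs_reach(graph, sources):
    """Return every node reachable from the sources, the sources included."""
    seen = set(sources)
    queue = deque(sources)
    while queue:
        node = queue.popleft()
        for nxt in graph.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


def perturbation_impact_score(graph, perturbed):
    """PIS as the number of nodes a BFS from the perturbed set visits."""
    return len(bfs_reach(graph, perturbed))


# Hypothetical fragment of a minimal network state graph.
graph = {"G": ["GH", "BG"], "GH": ["HI"], "HI": [], "BG": []}
print(perturbation_impact_score(graph, ["G"]))   # 4
print(perturbation_impact_score(graph, ["HI"]))  # 1
```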

SLIDE 30

Harvesting Constraints

Text-Mining

Observation: Interaction dependencies are reported as single-sentence natural language statements in the literature.
Tokenize full-text papers into relevant words and search for simple regular expression patterns.
71 new interaction dependencies from 59 000 human adhesome related papers.
Köster, Zamir, Rahmann. 2012

Example sentence: "... binding of Abl induces a conformational change in Cbl that allows binding of Src ..."
Token sequence: p . i p a p i p d i p p p .
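
The tokenize-then-match idea can be sketched as follows. The word lists, the one-letter token classes (p for protein, i for interaction word, a for an enabling verb) and the pattern are all hypothetical; the slide does not define its token alphabet, so this is not the authors' actual pattern set.

```python
import re

# Hypothetical lexicons; a real system would use curated dictionaries.
PROTEINS = {"abl", "cbl", "src"}
INTERACTION_WORDS = {"binding", "binds"}
ENABLING_WORDS = {"induces", "allows"}

# interaction + protein, an enabling verb, ..., an enabling verb, interaction + protein
PATTERN = re.compile(r"(ip)a.*a(ip)")


def tokenize(sentence):
    """Keep only relevant words, each tagged with a one-letter token class."""
    tokens = []
    for word in re.findall(r"[A-Za-z0-9]+", sentence):
        w = word.lower()
        if w in PROTEINS:
            tokens.append(("p", word))
        elif w in INTERACTION_WORDS:
            tokens.append(("i", word))
        elif w in ENABLING_WORDS:
            tokens.append(("a", word))
    return tokens


def find_dependency(sentence):
    """Return (enabling protein, enabled protein) if the pattern matches, else None."""
    tokens = tokenize(sentence)
    code = "".join(kind for kind, _ in tokens)
    m = PATTERN.search(code)
    if m is None:
        return None
    return tokens[m.start(1) + 1][1], tokens[m.start(2) + 1][1]


sentence = ("binding of Abl induces a conformational change in Cbl "
            "that allows binding of Src")
print(find_dependency(sentence))
```

Matching on the compact token string rather than on raw text keeps the regular expressions short, which fits the "simple regular expression patterns" described above.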

SLIDE 31

Results for Complex Prediction

Network: CYGD (4579 proteins, 12576 interactions)
Constraints: Competition on binding sites (Jung et al. 2010)
Complexes: CYGD (55 connected complexes)
Network based complex prediction: LCMA (Li et al. 2005)

[Figure: two bar charts, precision (axis 0.00 to 0.25) and recall (axis 0.0 to 0.9), with bars labelled "458" and "458 random" constraints]

SLIDE 32

Results for Perturbation Impact Score

Network: CYGD (4579 proteins, 12576 interactions)
Constraints: Competition on binding sites (Jung et al. 2010)
Perturbations classified as lethal/sick and viable: SGD

[Figure: prediction quality; x axis: % above threshold (20 to 100); y axis: % of true positives (49 to 57); series: random constraints ± SD, no constraints, constraints]

TP: lethal/sick and PIS ≥ t, viable and PIS < t
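
The true-positive definition above can be written as a short function. The scores and labels below are made-up toy values, not data from the talk, and the function name is an assumption.

```python
def percent_true_positives(scores, lethal, t):
    """Percentage of perturbations where (PIS >= t) agrees with the lethal/sick label."""
    hits = sum((s >= t) == is_lethal for s, is_lethal in zip(scores, lethal))
    return 100.0 * hits / len(scores)


# Hypothetical PIS values and phenotype labels (True = lethal/sick, False = viable).
scores = [5, 1, 3, 0]
lethal = [True, False, False, True]
print(percent_true_positives(scores, lethal, t=3))  # 50.0
```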

SLIDE 35

Conclusion

text-mining
logic inference from real measurements

Protein Hypernetworks

[Figure: example protein hypernetwork on proteins A to I]

more precise protein complexes
perturbation effects

SLIDE 37

Harvesting Constraints

Using the Quine-McCluskey Algorithm

Given a truth table with interactions in columns and simultaneous observations in rows.
Infer logic relationships using the Quine-McCluskey algorithm.

derive rows from

simultaneous interaction measurements (e.g. future variants of FCS)
combination of protein complex measurements (e.g. MS) with binary protein interactions

AB  BC  observed
0   0   1
0   1   1
1   0   1
1   1   0

Inferred constraints: AB ⇒ ¬BC
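
As a much simpler stand-in for Quine-McCluskey minimisation, the sketch below only checks pairs of columns against the observed rows. It assumes that the observed combinations of (AB, BC) are (0, 0), (0, 1) and (1, 0), with (1, 1) never seen; the function name is hypothetical.

```python
from itertools import permutations


def infer_constraints(columns, observed_rows):
    """Pairwise check over observed rows; a simplification, not Quine-McCluskey."""
    found = []
    for x, y in permutations(range(len(columns)), 2):
        with_x = [row for row in observed_rows if row[x] == 1]
        if not with_x:
            continue  # x was never observed, so nothing can be said about it
        if all(row[y] == 1 for row in with_x):
            found.append(f"{columns[x]} ⇒ {columns[y]}")
        elif all(row[y] == 0 for row in with_x):
            found.append(f"{columns[x]} ⇒ ¬{columns[y]}")
    return found


columns = ["AB", "BC"]
observed_rows = [(0, 0), (0, 1), (1, 0)]  # assumed observations
inferred = infer_constraints(columns, observed_rows)
print(inferred)
```

The pairwise check reports both AB ⇒ ¬BC and the logically equivalent BC ⇒ ¬AB; a minimisation step would keep a single form.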