Constraint Programming in Community-based Gene Regulatory Network Inference



SLIDE 1

Background | Constraint Programming in Community Networks | Experiments and Results | Conclusions

Constraint Programming in Community-based Gene Regulatory Network Inference

Ferdinando Fioretto, Enrico Pontelli

  • Dept. Computer Science, New Mexico State University
  • Sept. 24, 2013
SLIDE 2

Talk Outline

1. Background
2. Constraint Programming in Community Networks
3. Experiments and Results
4. Conclusions

SLIDE 5

Gene Regulatory Networks

A cell contains different entities (including proteins and RNA) that interact and perform specific functions.

DNA → (transcription) → mRNA → (translation) → protein

SLIDE 6

Gene Regulatory Networks

  • Some proteins, the transcription factors (TFs), can regulate the production of other proteins.
  • They do so by enhancing or inhibiting DNA transcription or mRNA translation.
  • The units that encapsulate these interactions are the coding regions of the DNA: the genes.
  • A Gene Regulatory Network (GRN) is the set of interactions among genes.

SLIDE 7

Gene Regulatory Networks

Modeling

A GRN is described by a weighted directed graph G = (V, E).
  • V is the set of genes of the network.
  • E ⊆ V × V × [0, 1] is the set of regulatory interactions.
  • Each regulatory interaction s → t is associated with a confidence value $\omega_{s \to t} \in [0, 1]$.

Example

G1 regulates G2. G2 regulates G5. G3 is regulated by G4. G4 regulates G2 and is regulated by G5.
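
As an illustration, the example network can be stored as a dictionary that maps each directed edge to its confidence. This is only a sketch: the slide gives the topology, so the confidence values and the helper names `regulators_of` and `targets_of` are hypothetical.

```python
# Weighted directed graph: (source, target) -> confidence in [0, 1].
# The topology follows the example; the confidence values are made up.
edges = {
    ("G1", "G2"): 0.9,
    ("G2", "G5"): 0.8,
    ("G4", "G3"): 0.7,
    ("G4", "G2"): 0.6,
    ("G5", "G4"): 0.5,
}

# V is recovered as the set of genes that appear in some edge.
genes = sorted({gene for edge in edges for gene in edge})


def regulators_of(target, edges):
    """Genes s such that the interaction s -> target is in the graph."""
    return sorted(s for (s, t) in edges if t == target)


def targets_of(source, edges):
    """Genes t such that the interaction source -> t is in the graph."""
    return sorted(t for (s, t) in edges if s == source)
```

For instance, `regulators_of("G2", edges)` lists G1 and G4, matching the example.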

SLIDE 8

Gene Regulatory Network Inference

GRN inference from high-throughput data

Motivation:
  • Key to understanding important genetic diseases, such as cancer.
  • Crucial for devising effective medical interventions.

SLIDE 10

Gene Regulatory Network Inference

Current Methods and Challenges

Methods proposed:
  • Correlation-based.
  • Information-theoretic.
  • Boolean networks.
  • Bayesian networks.
  • Regression-based.
  • Stochastic.

Each method is based on different assumptions and exhibits its own limitations.

Solutions proposed:
  • Integrating heterogeneous data into the inference model.
  • Meta-approaches that combine multiple inference models: Community Networks (CN).

SLIDE 11

Gene Regulatory Network Inference

Community Networks

[Figure: the predictions G1, G2, ..., GJ are combined by edge ranking into a community network.]

Borda voting score:

$$\omega_{s \overset{\#}{\to} t} = \frac{1}{|\mathcal{G}|} \sum_{j=1}^{|\mathcal{G}|} \omega^{j}_{s \overset{\#}{\to} t}$$

where $\omega^{j}_{s \overset{\#}{\to} t}$ is the rank given to the interaction s → t by the j-th method in $\mathcal{G}$.

D. Marbach et al. “Wisdom of crowds for robust gene network inference”. Nature Methods, 9(8):796–804, Aug. 2012.
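
The Borda averaging can be sketched as follows. The slide does not say how a rank is turned into a number, so the normalization in `rank_scores` (best edge 1.0, worst edge 0.0) and the score 0.0 for edges a method does not report are assumptions of this example, as are the function names.

```python
def rank_scores(confidences):
    """Turn one method's confidences {edge: value} into rank-based scores.

    Assumed normalization: the top-ranked edge scores 1.0, the last one 0.0.
    Ties are broken by edge name so that the result is deterministic.
    """
    ordered = sorted(confidences, key=lambda e: (-confidences[e], e))
    n = len(ordered)
    if n == 1:
        return {ordered[0]: 1.0}
    return {edge: 1.0 - pos / (n - 1) for pos, edge in enumerate(ordered)}


def borda_community(predictions):
    """Average the rank-based score of every edge over all methods."""
    ranked = [rank_scores(p) for p in predictions]
    all_edges = set().union(*predictions)
    # Assumption: an edge that a method does not report scores 0.0 for it.
    return {
        edge: sum(r.get(edge, 0.0) for r in ranked) / len(ranked)
        for edge in all_edges
    }
```

With two methods that disagree on the order of three edges, the community score of each edge is the mean of its two rank-based scores.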

SLIDE 13

Gene Regulatory Network Inference

Our Approach

  • Use the CN approach for an “initial analysis” of the GRN.
  • Exploit the collective agreements of the community predictions.
  • Integrate additional biological knowledge (when available).
  • Leverage specific GRN properties.

Why CP?

SLIDE 22

Constraint Programming

Constraint Satisfaction Problem (CSP)

Example: n-queens.
  • Variables X: $x_i$ = position of the queen in the i-th column.
  • Domains D: $D_{x_i} = \{1, \dots, n\}$.
  • Constraints C: for all i, j with i < j: $x_i \neq x_j$, $x_i + i \neq x_j + j$, $x_i - i \neq x_j - j$.

Search = Labeling + Constraint Propagation

Solution = assignment for X satisfying all c ∈ C
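
A small sketch of "labeling plus propagation" on the queens example: each labeling step assigns one column, and a forward-checking step removes the attacked rows from the domains of the later columns. Forward checking is just one simple form of propagation chosen for this illustration; the function names are not taken from the talk.

```python
def solve_queens(n):
    """Return all n-queens solutions as tuples (row of column 0, row of column 1, ...)."""
    solutions = []

    def propagate(domains, col, row):
        """Prune the rows attacked by a queen at (col, row) from later columns."""
        reduced = dict(domains)
        for other in range(col + 1, n):
            dist = other - col
            # Enforces x_i != x_j, x_i + i != x_j + j and x_i - i != x_j - j.
            kept = {r for r in domains[other]
                    if r != row and r != row - dist and r != row + dist}
            if not kept:
                return None  # empty domain: this branch cannot be completed
            reduced[other] = kept
        return reduced

    def label(col, domains, assignment):
        if col == n:
            solutions.append(tuple(assignment))
            return
        for row in sorted(domains[col]):
            reduced = propagate(domains, col, row)
            if reduced is not None:
                label(col + 1, reduced, assignment + [row])

    label(0, {c: set(range(1, n + 1)) for c in range(n)}, [])
    return solutions
```

Every tuple returned is a solution in the sense of the slide: an assignment for X that satisfies all constraints in C.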

SLIDE 23

Gene Regulatory Network Inference

Our Approach

  • Use the CN approach for an “initial analysis” of the GRN.
  • Exploit the collective agreements of the community predictions.
  • Integrate additional biological knowledge (when available).
  • Leverage specific GRN properties.

Why CP?

  • Separation between prediction methods and model.
  • Declarative modeling.
  • Constraint expressions allow incremental model refinement.

SLIDE 24

Constrained Community Networks

CSP Modeling

GRN inference (GRNi) problem: given a set of n genes, a GRNi is a CSP ⟨X, D, C⟩ where
  • $X = \langle x_1, \dots, x_{n^2-n} \rangle$ (regulatory relations, excluding self-regulations).
  • $D = \langle D_1, \dots, D_{n^2-n} \rangle$, with each $D_k = \{0, \dots, 100\}$ (possible confidence values).
  • C is a list of constraints expressing properties of the GRNs.

Notation:
  • $x_{s \to t}$: “s regulates t”, and $D_{s \to t}$ is its domain.
  • $d(x_{s \to t})$: the value assigned to $x_{s \to t}$.

SLIDE 25

Constrained Community Networks

CSP Modeling

A solution to the GRNi defines a GRN prediction G = (V, E) with $V = \{1, \dots, n\}$ and $E = \{(s, t, w) \mid d(x_{s \to t}) > 0\}$, where $w = d(x_{s \to t})/100$.
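
The model can be sketched in a few lines: one variable per ordered pair of distinct genes, an integer domain 0..100 for each, and a mapping from a complete assignment back to a weighted edge set. The function names are illustrative only.

```python
from itertools import permutations


def make_variables(n):
    """One variable per ordered pair (s, t) with s != t, i.e. n^2 - n variables."""
    return list(permutations(range(1, n + 1), 2))


def make_domains(variables):
    """Every variable starts with the full domain {0, ..., 100}."""
    return {var: set(range(0, 101)) for var in variables}


def solution_to_grn(n, assignment):
    """Map an assignment {(s, t): value} to the predicted GRN (V, E)."""
    vertices = set(range(1, n + 1))
    # Only edges with a strictly positive value belong to E; w = value / 100.
    edges = {(s, t, value / 100)
             for (s, t), value in assignment.items() if value > 0}
    return vertices, edges
```

For n = 4 this yields 12 variables, and an assignment in which only two variables are non-zero maps to a GRN with two weighted edges.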

SLIDE 26

Constrained Community Networks

[Figure: the E.coli2 network of size 10 (from DREAM3), with genes G1 to G10.]

SLIDE 27

Constrained Community Networks

[Figure: CN prediction for the E.coli2 network of size 10.]

SLIDE 28

Analysis and Domains Reduction

The pre-resolution phase

Leverage the collection of GRN predictions $\mathcal{G}$ by:
  (i) reducing the size of the solution search space;
  (ii) integrating the $G_j \in \mathcal{G}$ while taking into account their discrepancies.

Set up the domain of each variable $x_{s \to t} \in X$ such that $D_{s \to t} = D_{s \to t} \cap B_{s \to t}$, where:

$$B_{s \to t} = \begin{cases} \left\{ \omega_{s \overset{\#}{\to} t} \right\} & \text{if } \sigma_{s \to t} < \theta_d \\[4pt] \left\{ \omega_{s \overset{\#}{\to} t} - \dfrac{\sigma_{s \to t}}{2},\ \omega_{s \overset{\#}{\to} t},\ \omega_{s \overset{\#}{\to} t} + \dfrac{\sigma_{s \to t}}{2} \right\} & \text{if } \sigma_{s \to t} \geq \theta_d \wedge 0.1 < \omega_{s \overset{\#}{\to} t} < 0.9 \end{cases}$$

$$\sigma_{s \to t} = \frac{1}{\binom{|\mathcal{G}|}{2}} \sum_{j=1}^{|\mathcal{G}|} \sum_{i=j+1}^{|\mathcal{G}|} \left| \omega^{j}_{s \overset{\#}{\to} t} - \omega^{i}_{s \overset{\#}{\to} t} \right|$$

  • $\theta_d \in [0, 1]$ is a “disagreement threshold”.
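
A sketch of the domain set-up, under stated assumptions: scores in [0, 1] are converted to the 0..100 domain scale by rounding (`to_percent`), and when neither of the two cases on the slides applies the domain is left unchanged. Both choices, and all function names, are assumptions of this example rather than details given in the talk.

```python
from itertools import combinations


def disagreement(scores):
    """Average absolute difference over all pairs of per-method scores.

    Needs the scores of at least two methods for the same edge.
    """
    pairs = list(combinations(scores, 2))
    return sum(abs(a - b) for a, b in pairs) / len(pairs)


def to_percent(value):
    """Assumed conversion from a score in [0, 1] to the integer scale 0..100."""
    return min(100, max(0, round(100 * value)))


def candidate_values(omega, scores, theta_d):
    """Candidate set B for one edge; None when no case on the slides applies."""
    sigma = disagreement(scores)
    if sigma < theta_d:
        return {to_percent(omega)}
    if 0.1 < omega < 0.9:
        return {to_percent(omega - sigma / 2),
                to_percent(omega),
                to_percent(omega + sigma / 2)}
    return None


def reduce_domain(domain, omega, scores, theta_d):
    """Intersect the current domain with B (assumption: unchanged if B is None)."""
    values = candidate_values(omega, scores, theta_d)
    return set(domain) if values is None else set(domain) & values
```

When the methods agree, the domain collapses to the single community value; when they disagree, up to three candidate values remain.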
SLIDE 30

Constraints

Sparseness

  • Elements of a GRN are considered to be controlled by a small number of genes: GRNs are sparse.
  • Combining predictions in a CN does not guarantee sparseness.

Enforce a sparseness constraint by:

atleast_k_ge($k_l$, X, $\theta_l$): $|\{x_i \in X \mid d(x_i) > \theta_l\}| \geq k_l$

and

atmost_k_ge($k_m$, X, $\theta_m$): $|\{x_i \in X \mid d(x_i) > \theta_m\}| \leq k_m$

with $k_l, k_m > 0$ and $0 \leq \theta_l, \theta_m \leq 100$, where $d(x_i)$ indicates the value assigned to $x_i$.
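
The two sparseness constraints (written here with underscores, `atleast_k_ge` and `atmost_k_ge`) can be read as checks on a complete assignment. The sketch below only tests whether an assignment satisfies them; it is not a propagator, and the extra `assignment` argument is an addition made for this example.

```python
def count_above(variables, theta, assignment):
    """Number of variables whose assigned value is strictly above theta."""
    return sum(1 for x in variables if assignment[x] > theta)


def atleast_k_ge(k, variables, theta, assignment):
    """At least k variables are assigned a value above theta."""
    return count_above(variables, theta, assignment) >= k


def atmost_k_ge(k, variables, theta, assignment):
    """At most k variables are assigned a value above theta."""
    return count_above(variables, theta, assignment) <= k
```

Used together with the same threshold, the two checks bound the number of high-confidence edges from below and from above.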

SLIDE 32

Constraints

Sparseness

Example: atleast_k_ge(10, X, 65) ∧ atmost_k_ge(25, X, 65)

SLIDE 33

Constraints

Redundant edge

Several state-of-the-art inference methods rely on techniques that cannot discriminate causality (e.g., mutual information, correlation).

Given a collection of predictions $\mathcal{G} = \{G_1, \dots, G_J\}$ for a GRN G = (V, E) and a non-empty set of non-causal methods $H \subseteq \mathcal{G}$, an edge t → s is redundant if:

$$\forall G_i \in \mathcal{G} \setminus H :\ \omega^{i}_{s \to t} > \omega^{i}_{t \to s} + \beta$$

If an edge t → s is redundant, we call the edge s → t required. Let $X_R$ be the set of all the required and redundant variables:

red_edge($x_{s \to t}$, $x_{t \to s}$, $\theta_R$, $\theta_r$): $x_{s \to t} > \theta_R \wedge x_{t \to s} < \theta_r$

with $\theta_R, \theta_r \in \mathbb{N}$ and $0 \leq \theta_R \leq 100$.
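
The redundancy test can be sketched as below. The method names in the usage are placeholders, the score 0.0 for an edge that a method does not report is an assumption, and `required_edges` is a name introduced for this example.

```python
def required_edges(predictions, causal_methods, beta):
    """Edges s -> t whose reverse edge t -> s is redundant.

    predictions: {method name: {(s, t): confidence}}.
    causal_methods: the methods that are not in H (the non-causal set).
    """
    candidates = set()
    for name in causal_methods:
        candidates |= set(predictions[name])
    required = []
    for (s, t) in sorted(candidates):
        # Every causal method must prefer s -> t over t -> s by more than beta.
        if all(predictions[m].get((s, t), 0.0)
               > predictions[m].get((t, s), 0.0) + beta
               for m in causal_methods):
            required.append((s, t))
    return required


def red_edge(x_st, x_ts, theta_R, theta_r):
    """Constraint on the values of a required edge and its redundant reverse."""
    return x_st > theta_R and x_ts < theta_r
```

Each pair returned by `required_edges` gives one required variable and one redundant variable, to which `red_edge` is then applied.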

SLIDE 35

Constraints

Redundant edge

Example: $\forall x_{s \to t}, x_{t \to s} \in X_R$: red_edge($x_{s \to t}$, $x_{t \to s}$, 75, 50)

SLIDE 36

Constraints

Sparseness + Redundant edge

SLIDE 37

Constraints

Transcription Factor

  • Information about DNA-binding motifs is often available from public sources (e.g., BDB, Gene Ontology).
  • Existing methods often do not allow the integration of such information (it is handled in post-processing).

A gene s ∈ V is a transcription factor (TF) if it regulates the production of other genes. Express this property on the out-degree of s:

tf(s): atleast_k_ge($k_s$, $X_s$, $\theta_s$)

where $X_s = \{x_{s \to t} \in X \mid t \in V\}$ and $k_s$ is the co-expressing degree (the number of genes targeted by the TF).
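
As a sketch, the TF constraint is a count over the outgoing variables of s only. The function below checks it on a complete assignment; its signature is an illustrative choice, not the one used in the authors' solver.

```python
def tf(s, k_s, theta_s, assignment):
    """True if gene s has at least k_s outgoing edges with value above theta_s.

    assignment: {(source, target): value in 0..100}.
    """
    # X_s: the values of the variables x_{s -> t} for every target t.
    outgoing = [value for (source, _target), value in assignment.items()
                if source == s]
    return sum(1 for value in outgoing if value > theta_s) >= k_s
```

In the usage below, gene 1 has two strong outgoing edges and therefore satisfies the constraint with $k_s = 2$, while gene 2 does not.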

SLIDE 39

Constraints

Transcription Factor

Example: atleast_k_ge(2, $N_i$, 85) with $N_i = \{x_{i \to s} \mid (\forall G_j \in \mathcal{G})\ \omega^{j}_{i \to s} > 0.10\}$, for i = 1, 5, 9.

SLIDE 40

Constraints

Co-transcription Factors

Multiple TFs cooperate to regulate a specific gene (co-regulators). Let s′, s′′ ∈ V be two TFs that are co-regulators.

coregulator(k, X, θ): for $x_{s' \to t'}, x_{s'' \to t''} \in X$,

$$\left| \{ (s', s'', t') \mid s' \neq s'' \wedge t' = t'' \wedge d(x_{s' \to t'}) > \theta \wedge d(x_{s'' \to t''}) > \theta \} \right| \geq k$$

with $k \in \mathbb{N}$ and $0 < \theta < 100$.
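
One way to read the co-regulator constraint is as a count of the targets shared by the two TFs, where both incoming edges exceed the threshold. In this sketch the two TFs are passed explicitly as `s1` and `s2`, and θ is taken on the same 0..100 scale as the variable values; both are assumptions made for the example.

```python
def coregulator(k, assignment, theta, s1, s2):
    """True if s1 and s2 jointly regulate at least k common targets.

    assignment: {(source, target): value in 0..100}; s1 and s2 are distinct TFs.
    """
    if s1 == s2:
        raise ValueError("co-regulators must be two distinct genes")
    targets = {t for (s, t) in assignment if s in (s1, s2)}
    shared = [t for t in targets
              if assignment.get((s1, t), 0) > theta
              and assignment.get((s2, t), 0) > theta]
    return len(shared) >= k
```

A target counts only when both edges into it are above θ, so one strong edge and one weak edge do not make the pair co-regulators of that gene.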

SLIDE 42

Constraints

Co-transcription Factors

Example: coregulator(1, V, 75), with s′ = 1, s′′ = 5

SLIDE 43

GRN Consensus

  • We implement two solution strategies: prop-labeling with depth-first search (DFS), and a Monte Carlo (MC) based exploration of the prop-labeling tree.
  • There is no consensus on the objective function to drive the solution search.
  • We propose three metrics to generate a consensus GRN, the Constrained Community Network (CCN).

Given a set S of m solutions, the consensus value $a^*_k$ associated with the variable $x_k$ is computed by:

  • Max frequency: $a^*_k = \arg\max_{a \in S|_{x_k}} \mathrm{freq}(a, k)$
  • Average: $a^*_k = \frac{1}{m} \sum_{i=1}^{m} a^i_k$
  • Weighted average: $a^*_k = \dfrac{1}{\sum_{a \in S|_{x_k}} \mathrm{freq}(a, k)^2} \sum_{a \in S|_{x_k}} \mathrm{freq}(a, k)^2 \, a$
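
The three consensus metrics can be sketched for a single variable $x_k$, given the list of values it takes in the m solutions. This reading assumes that freq(a, k) counts how many solutions assign the value a to $x_k$; the tie-break in `max_frequency` (smallest value wins) is also an assumption.

```python
from collections import Counter


def max_frequency(values):
    """Most frequent value; ties are broken towards the smallest value."""
    counts = Counter(values)
    best = max(counts.values())
    return min(a for a, c in counts.items() if c == best)


def average(values):
    """Plain mean of the values taken by the variable over all solutions."""
    return sum(values) / len(values)


def weighted_average(values):
    """Mean of the distinct values, each weighted by its squared frequency."""
    counts = Counter(values)
    total = sum(c ** 2 for c in counts.values())
    return sum(c ** 2 * a for a, c in counts.items()) / total
```

Squaring the frequencies pulls the weighted average towards the values on which many solutions agree, compared with the plain average.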

SLIDE 44

Experiments

Community Networks

The CN was built from 4 top-ranking methods of the last DREAM competitions:

1. TIGRESS (regression model)
2. GENIE3 (random forest approach)
3. Inferelator (MCZ + tlCLR + linear ODE)
4. CLR (mutual information model)

SLIDE 45

Experiments

Datasets and validation

Benchmarks:
  • DREAM{3,4} (110 GRNs of various sizes).
  • Subnetworks from GRNs of E. coli and S. cerevisiae.

Datasets:
  • Steady-state expressions for wild types.
  • Steady-state expressions measured after gene knockouts.
  • Time-series data.

Validation:
  • AUROC score.
  • CCNs generated via MC search with 1,000 samplings.
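
For reference, the AUROC of an edge ranking can be computed directly from its pairwise definition: the probability that a true edge receives a higher score than a non-edge, with ties counted as one half. This is a generic sketch of the metric, not the evaluation script used for the DREAM benchmarks.

```python
def auroc(scores, labels):
    """Area under the ROC curve from predicted scores and 0/1 gold labels."""
    positives = [s for s, y in zip(scores, labels) if y]
    negatives = [s for s, y in zip(scores, labels) if not y]
    if not positives or not negatives:
        raise ValueError("need at least one true edge and one non-edge")
    # Count the (true edge, non-edge) pairs that are ranked correctly.
    wins = sum((p > q) + 0.5 * (p == q) for p in positives for q in negatives)
    return wins / (len(positives) * len(negatives))
```

A perfect ranking scores 1.0, and a ranking that assigns the same score to every edge scores 0.5.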

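SLIDE 46

Experiments

Settings

Domains setup.

$$\theta_d = \frac{1}{|E_{CN}|} \sum_{(s,t,w) \in E_{CN}} \sigma_{s \to t}$$

Sparseness constraint. atleast_k_ge($k_l$, X, $\theta_l$) ∧ atmost_k_ge($k_m$, X, $\theta_m$), with

$$k_l \leq |\{x_i \mid x_i \in X \wedge \max(D_{x_i}) > \theta_l\}|, \qquad k_m \geq |\{x_i \mid x_i \in X \wedge \min(D_{x_i}) > \theta_m\}|$$

Ordered $E_{CN}$:

Rank        Edge       w
1           g1 → g3    0.998
2           g1 → g8    0.981
...
n           g4 → g6    0.856
...
n log(n)    g7 → g3    0.633
...

Redundant edge constraint. $\forall G_i \in \mathcal{G} \setminus H :\ \omega^{i}_{s \to t} > \omega^{i}_{t \to s} + \beta$, shown with the average

$$\frac{1}{|\mathcal{G}||E_{RR}|} \sum_{G_i \in \mathcal{G} \setminus H} \left( \omega^{i}_{s \to t} - \omega^{i}_{t \to s} \right)$$

red_edge($x_{s \to t}$, $x_{t \to s}$, $\theta_R$, $\theta_r$), shown with the averages

$$\frac{1}{|\mathcal{G} \setminus H||E_{REQ}|} \sum_{G_i \in \mathcal{G} \setminus H} \omega^{i}_{s \to t} \qquad \text{and} \qquad \frac{1}{|\mathcal{G} \setminus H||E_{RED}|} \sum_{G_i \in \mathcal{G} \setminus H} \omega^{i}_{t \to s}$$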

SLIDE 47

Results

CCN with sparsity and redundant edge constraints

[Figure: bar chart of the AUROC % improvement (axis ticks 5, 10, 15) on DREAM3 10, DREAM4 10, DREAM3 50, DREAM3 100 and DREAM4 100; bars labelled b, f, a, w for each benchmark; legend: s,r and s,r,t.]

Average AUC score improvements (in percentage) w.r.t. CN rank

SLIDE 48

Experiments

Integrating GRN knowledge: TFs

Domains setup, sparseness constraint and redundant edge constraint: same settings as on the Settings slide (SLIDE 46).

Transcription factor constraint. atleast_k_ge($\lfloor \log(n) \rfloor$, X, θ)

Ordered $E_{CN}$:

Rank    Edge       w
1       g1 → g3    0.998
2       g1 → g8    0.981
...
n       g4 → g6    0.856
...

SLIDE 50

Results

CCN with additional GRN knowledge integration

[Figure: bar chart of the AUROC % improvement (axis ticks 5, 10, 15) on DREAM3 10, DREAM4 10, DREAM3 50, DREAM3 100 and DREAM4 100; bars labelled b, f, a, w for each benchmark; legend: s,r and s,r,t.]

Average AUC score improvements (in percentage) w.r.t. CN rank

SLIDE 52

Conclusions

  • CP-based approach to infer GRNs by integrating several methods in a CN.
  • Introduces a set of constraints able to:
    1. enforce the satisfaction of GRN-specific properties;
    2. take into account the agreements among the community predictions and the limitations of the methods.
  • No assumptions on the datasets or on the type of inference methods.

Take-home message:
  • GRN knowledge integration offers improvements in prediction accuracy.
  • Constraints are a powerful tool to model and integrate GRN properties.

Thank you!