ISC Operator for reconstructing Bayesian Network in gene networks


slide-1
SLIDE 1

ISC Operator for reconstructing Bayesian Network in gene networks context.

Jimmy Vandel & Simon de Givry

slide-2
SLIDE 2

Outline:

➢ Biological motivation
➢ Bayesian Networks framework
➢ Learning algorithms
➢ Local operators
➢ Comet language
➢ Experimentation

Vandel Jimmy 1 /17 Plan

slide-6
SLIDE 6

Vandel Jimmy 2 /17

  • 1. Biological motivation

Biological motivation

[Figure: Gene 1, Gene 2 and Gene 3 on the DNA → gene regulations → gene expressions (mRNA concentrations)]

slide-7
SLIDE 7

Escherichia coli (423 genes, 578 regulations) (S.S. Shen-Orr et al., 2002)

Goal: reconstruction of the gene regulatory network.

Vandel Jimmy 3 /17

  • 1. Biological motivation
slide-13
SLIDE 13

Polymorphism

Vandel Jimmy 4 /17

  • 1. Biological motivation

[Figure: genes G1, G2, G3 and their genetic markers M1, M2, M3]

DNA mutations in genes:
➢ in the promoter region → impact on the gene's own activity
➢ in the coding region → impact on the activities of other genes

Genetic data: one genetic marker (SNP) for each gene

slide-15
SLIDE 15

Discrete Bayesian network

Directed acyclic graph $G$ composed of $n$ variables $X_i = \{G_i, M_i\}$

Graphic representation of a joint probability distribution:

$$P_G(X) = \prod_{i=1}^{n} P_G(X_i \mid Pa_i)$$

Conditional distribution, for example $P_G(G_3 \mid G_2, M_2)$

[Figure: Bayesian network over genetic data (M1, M2, M3) and gene expressions (G1, G2, G3)]

[Table: conditional probabilities of G3 / !G3 for each configuration of G2 / !G2 and M2 / !M2; values as extracted: 0.72 0.59 0.63 0.10 0.90 0.37 0.41 0.28]

Vandel Jimmy 5/17

  • 2. Bayesian Networks framework
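The factorization of the joint distribution can be illustrated with a minimal Python sketch. The structure (M2 → G3 ← G2), the binary coding and every probability below are hypothetical values chosen for illustration; they are not the numbers from the slide.

```python
from itertools import product

# Hypothetical structure: G3 has parents G2 and M2; M2 and G2 have no parents.
parents = {"M2": (), "G2": (), "G3": ("G2", "M2")}

# Hypothetical tables: P(X = True | parent values). Illustrative numbers only.
p_true = {
    "M2": {(): 0.5},
    "G2": {(): 0.4},
    "G3": {(True, True): 0.7, (True, False): 0.6,
           (False, True): 0.3, (False, False): 0.1},
}

def joint_probability(assignment):
    """P_G(X) = prod_i P_G(X_i | Pa_i) for a full assignment {name: bool}."""
    prob = 1.0
    for var, pa in parents.items():
        key = tuple(assignment[p] for p in pa)
        p = p_true[var][key]
        prob *= p if assignment[var] else 1.0 - p
    return prob

# Summing the product over all assignments gives 1, as for any joint distribution.
total = sum(joint_probability(dict(zip(parents, values)))
            for values in product([True, False], repeat=len(parents)))
```

Each factor only needs the value of a variable and of its parents, which is what makes local changes to the graph cheap to evaluate.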

slide-16
SLIDE 16

Learning strategy

We look for the graph $G_{score} = \arg\max_{G_i} P(G_i \mid D)$ with dataset $D$.

$$P(G_i \mid D) = \frac{P(D \mid G_i)\, P(G_i)}{P(D)} \propto P(D \mid G_i)\, P(G_i)$$

$P(G_i)$: prior probability of the graph $G_i$ → assumed to be uniform
$P(D \mid G_i)$: marginal likelihood of $G_i$

➢ BDe score (D. Heckerman, Machine Learning, 1995)
➢ BIC score (G. Schwarz, Annals of Statistics, 1978)
➢ decomposable and penalized scores

Objective function easy to evaluate and avoids over-fitting

Vandel Jimmy 6/17

  • 3. Learning algorithms
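To make "decomposable and penalized" concrete, here is a minimal Python sketch of a BIC score for discrete data, written as a sum of one term per family (a variable and its parents). The function names, the data layout (a list of dicts) and the `arity` argument are assumptions made for this sketch; the experiments in this talk use the BDeu score, not this code.

```python
import math
from collections import Counter

def local_bic(data, child, parents, arity):
    """BIC term of one family (child, parents).

    data:  list of dicts {variable: discrete value}
    arity: dict {variable: number of classes}
    """
    n = len(data)
    joint = Counter((tuple(row[p] for p in parents), row[child]) for row in data)
    marg = Counter(tuple(row[p] for p in parents) for row in data)
    # Maximum-likelihood log-likelihood of the child given its parents.
    loglik = sum(c * math.log(c / marg[pa]) for (pa, _), c in joint.items())
    q = math.prod(arity[p] for p in parents)      # number of parent configurations
    penalty = 0.5 * math.log(n) * q * (arity[child] - 1)
    return loglik - penalty

def bic(data, graph, arity):
    """graph: dict {child: tuple of parents}. The score is a sum of local terms."""
    return sum(local_bic(data, x, pa, arity) for x, pa in graph.items())

# Toy dataset in which G2 always copies G1.
data = [{"G1": i % 2, "G2": i % 2} for i in range(40)]
arity = {"G1": 2, "G2": 2}
empty = bic(data, {"G1": (), "G2": ()}, arity)
with_edge = bic(data, {"G1": (), "G2": ("G1",)}, arity)
```

Because the score is a sum of local terms, changing the parents of one variable only requires recomputing that variable's term; the penalty grows with the number of parent configurations, which discourages over-fitting.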

slide-20
SLIDE 20

Local search components

  • 1. Search space

➢ Directed Acyclic Graph (DAG)
➢ Partial DAG (PDAG)
➢ variable orders

  • 2. Initial structure

➢ empty structure
➢ random structure
➢ informed structure (MWST, expert...)

  • 3. Neighborhood operators

➢ addition of an edge
➢ deletion of an edge
➢ reversal of an edge
➢ k look-ahead
➢ optimal reinsertion

  • 4. Meta-heuristics

➢ hill climbing (with restarts)
➢ tabu search
➢ simulated annealing
➢ MCMC
➢ genetic algorithms
➢ ...

Vandel Jimmy 7/17

  • 3. Learning algorithms
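The four components can be combined in a few lines. The sketch below is a plain hill climbing over DAGs, started from the empty structure, with the addition, deletion and reversal operators. The `toy_score` function and the `TARGET` edge set are hypothetical stand-ins for a real score such as BDeu or BIC, and all names are assumptions of this sketch rather than the Comet implementation.

```python
from itertools import permutations

def has_cycle(parents):
    """parents: dict {node: set of parents}. Depth-first cycle detection."""
    state = {}  # 1 = being explored, 2 = finished
    def visit(v):
        if state.get(v) == 1:
            return True          # reached a node still being explored: cycle
        if state.get(v) == 2:
            return False
        state[v] = 1
        if any(visit(p) for p in parents[v]):
            return True
        state[v] = 2
        return False
    return any(visit(v) for v in parents)

def copy_graph(parents):
    return {v: set(ps) for v, ps in parents.items()}

def neighbors(parents):
    """Yield every DAG reachable by one addition, deletion or reversal."""
    for x, y in permutations(parents, 2):
        if x in parents[y]:                   # edge x -> y exists
            deleted = copy_graph(parents)
            deleted[y].discard(x)
            yield deleted                     # deletion
            flipped = copy_graph(deleted)
            flipped[x].add(y)
            if not has_cycle(flipped):
                yield flipped                 # reversal
        else:
            added = copy_graph(parents)
            added[y].add(x)
            if not has_cycle(added):
                yield added                   # addition

def hill_climb(nodes, score):
    current = {v: set() for v in nodes}       # empty initial structure
    while True:
        best = max(neighbors(current), key=score, default=None)
        if best is None or score(best) <= score(current):
            return current                    # local maximum reached
        current = best

# Hypothetical score: +1 per edge of TARGET present, -1 per other edge.
TARGET = {("G1", "G2"), ("G2", "G3")}
def toy_score(parents):
    edges = {(p, v) for v, ps in parents.items() for p in ps}
    return len(edges & TARGET) - len(edges - TARGET)

result = hill_climb(["G1", "G2", "G3"], toy_score)
```

With a real decomposable score, only the families touched by a move would be rescored instead of the whole graph.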

slide-25
SLIDE 25

Local Operators

➢ addition
➢ deletion
➢ reversal (deletion + addition on the same pair)
➢ swap (deletion + addition including an extra node)

Example:

[Figure: current situation, intermediate step and target situation over G1, G2, G3]

score Add(G2, G3) > score Add(G1, G3) > 0

Swap(G1, G3, G2): Deletion(G1, G3) + Add(G2, G3)

score Add(G2, G3) − score Add(G1, G3) > 0

→ escape from some local maxima

Vandel Jimmy 8/17

  • 4. Local operators
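A small worked example shows why the swap helps. It assumes a decomposable score, so each move on the parents of G3 is measured by the change in the local score of G3. The local scores below are hypothetical numbers chosen so that G1 → G3 is already present and neither a deletion nor an addition improves the score.

```python
# Hypothetical local scores of G3 for each candidate parent set (illustrative numbers).
local_score_g3 = {
    frozenset(): -100.0,
    frozenset({"G1"}): -90.0,
    frozenset({"G2"}): -80.0,
    frozenset({"G1", "G2"}): -92.0,
}

def delta(old_parents, new_parents):
    """Score change when the parent set of G3 goes from old_parents to new_parents."""
    return local_score_g3[frozenset(new_parents)] - local_score_g3[frozenset(old_parents)]

score_add_g1 = delta(set(), {"G1"})     # adding G1 -> G3 to the empty parent set
score_add_g2 = delta(set(), {"G2"})     # adding G2 -> G3 to the empty parent set

current = {"G1"}                        # current situation: G1 -> G3
moves = {
    "Deletion(G1,G3)": delta(current, set()),
    "Add(G2,G3)": delta(current, {"G1", "G2"}),
    "Swap(G1,G3,G2)": delta(current, {"G2"}),
}
improving = [name for name, d in moves.items() if d > 0]
```

Here the swap delta equals score Add(G2, G3) − score Add(G1, G3), and it is the only improving move, so a search limited to addition and deletion would stay stuck in this local maximum.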

slide-34
SLIDE 34

ISC Operator (Iterative Swap Operator)

Current situation: Swap(G2, G3, G7)?

[Figure: network over G1 to G7; the swap creates the cycle {G3, G4, G6, G7}]

score Add(G7, G3 ∣ G1) > score Add(G2, G3 ∣ G1) > 0

While there exists a cycle and !STOP
    Select the edge of the cycle minimizing score Add
    Try to delete it
    If score Swap(G2, G3, G7) + score Deletion(G4, G6) ≤ 0
        Try to swap this edge
        If score Swap(G2, G3, G7) + score Swap(G4, G6, G5) ≤ 0
            STOP
        Else
            Record Swap(G4, G6, G5)
    Else
        Record Deletion(G4, G6)
If !STOP
    Validate all recorded moves

[Label on the slide: nISC operator]

Vandel Jimmy 9/17

  • 4. Local operators
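The loop can be sketched in Python under strong simplifying assumptions, all of them mine rather than the authors': every edge has a fixed, hypothetical `add_score` (a real decomposable score depends on the whole parent set), the score of a deletion is the negated add score, the running total of all recorded moves is used in both tests, edges added by recorded moves are never selected for removal, and the replacement parent tried in a swap is the one with the best add score.

```python
def find_cycle(edges):
    """Return the edges of one directed cycle, or None if the graph is acyclic."""
    children = {}
    for u, v in edges:
        children.setdefault(u, []).append(v)
    state, path = {}, []

    def visit(u):
        state[u] = "open"
        path.append(u)
        for v in children.get(u, []):
            if state.get(v) == "open":        # back edge closes a cycle
                nodes = path[path.index(v):] + [v]
                return list(zip(nodes, nodes[1:]))
            if v not in state:
                found = visit(v)
                if found:
                    return found
        state[u] = "done"
        path.pop()
        return None

    for u in list(children):
        if u not in state:
            found = visit(u)
            if found:
                return found
    return None

def isc(edges, swap, add_score, nodes):
    """Try swap = (old_parent, child, new_parent); break the cycles it creates.

    Returns (new_edges, recorded_moves), or (original edges, []) on STOP.
    """
    old, child, new = swap
    work = (set(edges) - {(old, child)}) | {(new, child)}
    total = add_score[(new, child)] - add_score[(old, child)]
    protected = {(new, child)}                # edges added by recorded moves
    moves = [("swap", old, child, new)]
    while (cycle := find_cycle(work)) is not None:
        candidates = [e for e in cycle if e not in protected]
        if not candidates:
            return set(edges), []             # STOP: cycle cannot be broken
        u, v = min(candidates, key=add_score.__getitem__)
        if total - add_score[(u, v)] > 0:     # deletion keeps the total positive
            total -= add_score[(u, v)]
            work.remove((u, v))
            moves.append(("delete", u, v))
            continue
        # Otherwise try to swap the edge: replace parent u of v by another node w.
        options = [w for w in nodes
                   if w not in (u, v) and (w, v) not in work and (w, v) in add_score]
        if not options:
            return set(edges), []             # STOP
        w = max(options, key=lambda n: add_score[(n, v)])
        gain = add_score[(w, v)] - add_score[(u, v)]
        if total + gain <= 0:
            return set(edges), []             # STOP
        total += gain
        work.remove((u, v))
        work.add((w, v))
        protected.add((w, v))
        moves.append(("swap", u, v, w))
    return work, moves                        # validate all recorded moves

# Hypothetical instance mirroring the slide: Swap(G2, G3, G7) closes G3 -> G4 -> G6 -> G7 -> G3.
nodes = ["G1", "G2", "G3", "G4", "G5", "G6", "G7"]
edges = {("G1", "G3"), ("G2", "G3"), ("G3", "G4"), ("G4", "G6"), ("G6", "G7")}
add_score = {("G1", "G3"): 4, ("G2", "G3"): 5, ("G7", "G3"): 12, ("G3", "G4"): 8,
             ("G4", "G6"): 3, ("G6", "G7"): 6, ("G5", "G6"): 2}
new_edges, moves = isc(edges, ("G2", "G3", "G7"), add_score, nodes)
```

Each iteration removes one unprotected edge, so the loop terminates; nothing is applied to the original graph unless the whole sequence of recorded moves is improving.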

slide-38
SLIDE 38

Comet Language

Is a high-level programming language (L. Michel and P. Van Hentenryck, 2002)
http://www.comet-online.org/

To:
➢ Model optimization problems
➢ Implement search procedures

In domains of:
➢ Constraint programming
➢ Constraint-Based Local Search

Offering easy implementation for:
➢ Invariants
➢ Objective functions
➢ Constraints definition
➢ Parallel programming
➢ ...

Vandel Jimmy 10/17

  • 5. Comet language

slide-39
SLIDE 39

Hill-climbing implementation in Comet

Vandel Jimmy 11/17

  • 5. Comet language

Incremental variables: Graph (parents, children)
Invariants: BDeu score, topological order, neighborhoods

Incremental variable → modify → the invariants are updated when the variable is modified

slide-41
SLIDE 41

Experimentation

DREAM5 systems genetics challenge (November 2010, New York)

Objective: recover the gene regulatory network from
➢ Gene expressions
➢ Genetic data

Simulated population of 300 individuals

Our gold network (gold standard network):
➢ 2000 nodes (1000 genes / 1000 genetic markers)
➢ 1983 edges

➢ Discretization of data (max. 4 classes)
➢ Pre-filtering candidate parents under condition score Add(Parent, Target) > 0
➢ Limit number of parents: 6

Vandel Jimmy 12/17

  • 6. Experimentation

slide-42
SLIDE 42

Results (1/4)

➢ 1000 runs of hill climbing algorithm
➢ Initialized with random networks (2 parents max)
➢ 5 operator configurations:

✗ Addition + Deletion (A+D)
✗ Addition + Deletion + Reversal (A+D+R)
✗ Addition + Deletion + Swap (A+D+S)
✗ Addition + Deletion + Reversal + Swap (A+D+R+S)
✗ Addition² + Deletion + Reversal² + Swap² (A²+D+R²+S²)

(²: nISC)

                          A+D        A+D+R      A+D+S      A+D+R+S    A²+D+R²+S²
BDeu score, mean          −359 580   −359 430   −357 990   −357 850   −357 460
BDeu score, deviation     169.3      168.5      92.9       91.0       55.2
Mean time (in seconds)    17.9       27.0       27.6       32.3       149.2

Vandel Jimmy 13/17

  • 6. Experimentation
slide-43
SLIDE 43

Results (2/4)

➢ 1 run of hill climbing algorithm
➢ Initialized with a random network (2 parents max)
➢ 1 operator configuration:

✗ Addition² + Deletion + Reversal² + Swap² (²: nISC)

Number of applied operators by type during the search

Vandel Jimmy 14/17

  • 6. Experimentation
slide-44
SLIDE 44

Results (3/4)

➢ 1000 runs of hill climbing algorithm
➢ 2 starting configurations:

✗ empty network
✗ random networks (2 parents max)

➢ 2 operator configurations:

✗ Addition + Deletion + Reversal
✗ Addition² + Deletion + Reversal² + Swap² (²: nISC)

Vandel Jimmy 15/17

  • 6. Experimentation

slide-45
SLIDE 45

Results (4/4)

➢ 1000 runs of hill climbing algorithm
➢ Initialized with random networks (2 parents max)
➢ 5 configurations:

✗ Addition + Deletion + Reversal (A+D+R)
✗ Addition + Deletion + Swap (A+D+S)
✗ Addition² + Deletion + Reversal² + Swap² (A²+D+R²+S²) (²: nISC)
✗ Addition* + Deletion + Reversal* + Swap* (A*+D+R*+S*) (*: ISC)
✗ Tabu search with Addition + Deletion + Reversal (10 000 operations, tabu list size: 100)

                          A+D+R      A+D+S      A²+D+R²+S²   A*+D+R*+S*   Tabu
BDeu score, mean          −359 430   −357 990   −357 460     −357 450     −359 150
BDeu score, deviation     168.5      92.9       55.2         54.5         160.4
Mean time (in seconds)    27.0       27.6       149.2        373.1        291.5

Vandel Jimmy 16/17

  • 6. Experimentation
slide-46
SLIDE 46

Conclusion & Perspectives

We:

➢ Propose a new Iterative Swap Operator breaking cycles
➢ Improve BDeu scores of learned networks with this operator
➢ Compare initial structure effect

TODO list:

➢ try other meta-heuristics
➢ tune Tabu parameters
➢ improve time efficiency of ISC operator

Vandel Jimmy 17/17

  • 7. Conclusion
slide-47
SLIDE 47

Question time!

Vandel Jimmy END