Population Markov Chain Monte Carlo and Genetic Networks (Fujun Ye)



slide-1
SLIDE 1

Population Markov Chain Monte Carlo and Genetic Networks

Fujun Ye

MSc in Artificial Intelligence Supervised by Dirk Husmeier

slide-2
SLIDE 2

Outline

  • Introduction
  • MCMCMC
  • MCMCMC for missing values
  • Result Evaluation (complete data)
  • Result Evaluation (missing values)
  • Summary

slide-3
SLIDE 3

Introduction

  • Genetic Network
  • Clustering and Differential Equation
  • Bayesian Network
  • MCMC

slide-4
SLIDE 4

Genetic Network

[Figure: example genetic network diagram]

slide-5
SLIDE 5

Clustering

slide-6
SLIDE 6

Differential Equation

Advantage

  • Provides a detailed understanding of the biological system

Shortcomings

  • Too little data
  • Noisy data

slide-7
SLIDE 7

Inferring Bayesian Network From Expression Data

$$P(X_1, X_2, \ldots, X_n) = \prod_{i=1}^{n} P\big(X_i \mid \mathrm{Pa}^{G}(X_i)\big)$$

Bayesian Network

[Figure: example network over the nodes A, B, C, D, E]

$$P(a, b, c, d, e) = P(a)\,P(b \mid a)\,P(c \mid a)\,P(d \mid b, c)\,P(e \mid d)$$
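A minimal sketch of the five-node factorization, assuming binary variables and made-up conditional probability tables (all numbers below are hypothetical, not taken from the slides). It only illustrates that the product of local conditionals defines a proper joint distribution.

```python
import itertools

# Hypothetical tables: each entry is the probability that the child equals 1.
P_A1 = 0.3                                   # P(a = 1)
P_B1 = {0: 0.2, 1: 0.8}                      # P(b = 1 | a)
P_C1 = {0: 0.5, 1: 0.6}                      # P(c = 1 | a)
P_D1 = {(0, 0): 0.1, (0, 1): 0.4,
        (1, 0): 0.5, (1, 1): 0.9}            # P(d = 1 | b, c)
P_E1 = {0: 0.25, 1: 0.7}                     # P(e = 1 | d)


def bern(p_one, value):
    """Probability of a binary value given P(value = 1)."""
    return p_one if value == 1 else 1.0 - p_one


def joint(a, b, c, d, e):
    """P(a,b,c,d,e) = P(a) P(b|a) P(c|a) P(d|b,c) P(e|d)."""
    return (bern(P_A1, a) * bern(P_B1[a], b) * bern(P_C1[a], c)
            * bern(P_D1[(b, c)], d) * bern(P_E1[d], e))


# Summing over all 32 configurations should give 1.
total = sum(joint(*values) for values in itertools.product([0, 1], repeat=5))
```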

slide-8
SLIDE 8

Problems

The number of different network structures grows super-exponentially with the number of nodes.
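To give a feel for this growth, the number of directed acyclic graphs on n labelled nodes can be computed with Robinson's recurrence. The recurrence is a standard result and is not part of the slides; the function name is illustrative.

```python
from functools import lru_cache
from math import comb


@lru_cache(maxsize=None)
def count_dags(n):
    """Number of labelled DAGs on n nodes (Robinson's recurrence)."""
    if n == 0:
        return 1
    return sum((-1) ** (k + 1) * comb(n, k) * 2 ** (k * (n - k)) * count_dags(n - k)
               for k in range(1, n + 1))


# Counts for 1 to 10 nodes; exhaustive search becomes hopeless very quickly.
dag_counts = [count_dags(n) for n in range(1, 11)]
```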

slide-9
SLIDE 9

[Figure: two sketches of the posterior P(M|D) over network structures M, with the optimal structure M’ marked]

  • When the data set is large, the optimal structure M’ is well defined.
  • When the data set is small, many networks explain the data fairly well.

slide-10
SLIDE 10

MCMC

MCMC samples networks from their posterior distribution.

Calculate the posterior probability of a feature:

$$P(f \mid D) = \sum_i P(f \mid M_i)\,P(M_i \mid D)$$

$$P(M_i \mid D) = \frac{P(D \mid M_i)\,P(M_i)}{\sum_k P(D \mid M_k)\,P(M_k)}$$
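A small sketch of the feature posterior, assuming the feature is the presence of a directed edge, so that P(f | M_i) is 1 or 0. The three candidate models and their scores are hypothetical. The first function normalises exactly over an enumerated model set; the last shows the MCMC version, where the sum is replaced by a sample average.

```python
import math


def posterior_over_models(log_scores):
    """log_scores[i] = log P(D|M_i) + log P(M_i); returns the list of P(M_i|D)."""
    top = max(log_scores)
    weights = [math.exp(s - top) for s in log_scores]   # subtract max for stability
    norm = sum(weights)
    return [w / norm for w in weights]


def feature_posterior(models, log_scores, edge):
    """P(f|D) = sum_i P(f|M_i) P(M_i|D), with P(f|M_i) an edge indicator."""
    post = posterior_over_models(log_scores)
    return sum(p for model, p in zip(models, post) if edge in model)


def feature_posterior_from_samples(samples, edge):
    """MCMC estimate: fraction of sampled networks that contain the edge."""
    return sum(edge in model for model in samples) / len(samples)


# Hypothetical example: three candidate networks given as sets of edges.
models = [{("A", "B")}, {("A", "B"), ("B", "C")}, set()]
log_scores = [math.log(2.0), 0.0, 0.0]
p_edge = feature_posterior(models, log_scores, ("A", "B"))
```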

slide-11
SLIDE 11

[Figure: three example networks over the nodes 1 to 6]

Coincidence dependence

slide-12
SLIDE 12

Escape from local optima using traditional MCMC

[Figure: posterior P(M|D) plotted over network structures M]

slide-13
SLIDE 13

Small step size versus big step size

[Figure: posterior P(M|D) plotted over network structures M]

slide-14
SLIDE 14

Problems

  • Huge search space and coincidence dependence: prescreening is important!
  • Local optima: the traversal operator is important!
  • Fixed step size: a varied step size is more reasonable.

slide-15
SLIDE 15

MCMCMC

Metropolis-coupled Markov Chain Monte Carlo (MCMCMC)

  • Pre-processing method
  • Traversal operators
  • Algorithm
  • MCMCMC for missing values

slide-16
SLIDE 16

MCMCMC

[Figure: coupled chains running at different temperatures]

$$T_1 = 1, \qquad T_2 > T_1, \qquad T_3 > T_2$$

slide-17
SLIDE 17

For each chain, move a step based on

$$A(M', M) = \min\left(1,\ \left(\frac{P(D \mid M')\,P(M')}{P(D \mid M)\,P(M)}\right)^{1/T} \frac{Q(M \mid M')}{Q(M' \mid M)}\right)$$

Chain swap

$$S_a = \begin{pmatrix} T_1 & T_2 & \cdots & T_i & \cdots & T_k & \cdots & T_M \\ M_1 & M_2 & \cdots & M_i & \cdots & M_k & \cdots & M_M \end{pmatrix}$$

$$S_b = \begin{pmatrix} T_1 & T_2 & \cdots & T_i & \cdots & T_k & \cdots & T_M \\ M_1 & M_2 & \cdots & M_k & \cdots & M_i & \cdots & M_M \end{pmatrix}$$

Acceptance Probability

$$P_a = \min\left\{1,\ \frac{\big[P(D \mid M_k)\,P(M_k)\big]^{1/T_i}\,\big[P(D \mid M_i)\,P(M_i)\big]^{1/T_k}}{\big[P(D \mid M_i)\,P(M_i)\big]^{1/T_i}\,\big[P(D \mid M_k)\,P(M_k)\big]^{1/T_k}}\right\}$$
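Both acceptance probabilities are easier to evaluate in log space. The sketch below assumes each model is summarised by its unnormalised log posterior, log P(D|M) + log P(M); the function and parameter names are illustrative. For the swap, the log ratio simplifies to (log p_k - log p_i)(1/T_i - 1/T_k).

```python
import math


def move_accept_prob(log_post_new, log_post_old, temperature, log_hastings=0.0):
    """Within-chain acceptance: posterior ratio raised to 1/T, times Q(M|M')/Q(M'|M)."""
    log_ratio = (log_post_new - log_post_old) / temperature + log_hastings
    return 1.0 if log_ratio >= 0 else math.exp(log_ratio)


def swap_accept_prob(log_post_i, log_post_k, temp_i, temp_k):
    """Acceptance probability for exchanging the states of chains i and k."""
    log_ratio = (log_post_k - log_post_i) * (1.0 / temp_i - 1.0 / temp_k)
    return 1.0 if log_ratio >= 0 else math.exp(log_ratio)


# A hot chain (T = 3) holding a better model than the cold chain (T = 1)
# always hands it over; the reverse exchange is accepted only sometimes.
always = swap_accept_prob(-12.0, -10.0, 1.0, 3.0)
sometimes = swap_accept_prob(-10.0, -12.0, 1.0, 3.0)
```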

slide-18
SLIDE 18

Pre-processing method

$$P(D \mid M) = \int P(D \mid \theta, M)\,P(\theta \mid M)\,d\theta$$

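Penalize complex model

$$\alpha_{n v \pi_n} = \frac{\alpha}{|\pi_n| \cdot |v_n|}, \qquad \alpha_{n \pi_n} = \sum_{v} \alpha_{n v \pi_n}$$

$$p(D \mid M) = \prod_{n} \prod_{\pi_n} \frac{\Gamma(\alpha_{n \pi_n})}{\Gamma(\alpha_{n \pi_n} + N_{n \pi_n})} \prod_{v} \frac{\Gamma(\alpha_{n v \pi_n} + N_{n v \pi_n})}{\Gamma(\alpha_{n v \pi_n})}$$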

slide-19
SLIDE 19

The log likelihood is

$$\log p(D \mid M) = \sum_{n} \mathrm{score}(n, \pi_n, D)$$

where

$$\mathrm{score}(n, \pi_n, D) = \sum_{\pi_n} \big[\log \Gamma(\alpha_{n \pi_n}) - \log \Gamma(\alpha_{n \pi_n} + N_{n \pi_n})\big] + \sum_{\pi_n} \sum_{v} \big[\log \Gamma(\alpha_{n v \pi_n} + N_{n v \pi_n}) - \log \Gamma(\alpha_{n v \pi_n})\big]$$
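A sketch of the per-node score computed from a table of counts, assuming the pseudo-counts are a total prior strength `alpha_total` spread evenly over all value and parent-configuration cells. That prior choice and all names are assumptions of this example.

```python
from math import lgamma, log


def local_score(counts, alpha_total=1.0):
    """Log marginal likelihood contribution of one node.

    counts[p][v] is the number of records in which the node takes value v
    while its parents are in configuration p.
    """
    n_parent_configs = len(counts)
    n_values = len(counts[0])
    alpha_cell = alpha_total / (n_parent_configs * n_values)   # alpha_{n v pi}
    alpha_row = alpha_cell * n_values                          # alpha_{n pi}
    score = 0.0
    for row in counts:
        score += lgamma(alpha_row) - lgamma(alpha_row + sum(row))
        for n_v in row:
            score += lgamma(alpha_cell + n_v) - lgamma(alpha_cell)
    return score


# Binary node without parents, observed once in each state, uniform prior:
# the marginal likelihood is the integral of theta * (1 - theta), i.e. 1/6.
example = local_score([[1, 1]], alpha_total=2.0)
```

Because the network score is a sum of such terms, a move that changes the parents of one node only requires that node's term to be recomputed.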

slide-20
SLIDE 20

  • Use some maximum fan-in.
  • Find all possible parents-configurations for each node and delete the low-scoring ones.
  • Keep C parents-configurations for each node and cardinality.
  • The threshold is set as:

$$\theta = \lambda \cdot (\mathit{scoresh} - \mathit{scoresl}) / m + \mathit{scoresl} / m$$
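A rough sketch of the pre-screening step, reading "cardinality" as the number of parents (an assumption). It enumerates parent sets up to a maximum fan-in and keeps the best-scoring ones for each node and parent-set size. It does not implement the threshold rule, and `score_fn` is a stand-in for the local score.

```python
import heapq
from itertools import combinations


def prescreen(nodes, score_fn, max_fan_in, keep):
    """Return kept[node][size] = the `keep` best parent sets of that size."""
    kept = {}
    for node in nodes:
        others = [x for x in nodes if x != node]
        kept[node] = {}
        for size in range(max_fan_in + 1):
            candidates = combinations(others, size)
            kept[node][size] = heapq.nlargest(
                keep, candidates, key=lambda parents: score_fn(node, parents))
    return kept


# Toy score, purely illustrative: parent sets whose labels sum to the node label win.
def toy_score(node, parents):
    return -abs(node - sum(parents))


kept = prescreen([1, 2, 3, 4], toy_score, max_fan_in=2, keep=2)
```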

slide-21
SLIDE 21

[Figure: score(n, π, D) plotted over parents-configurations π, with π’ marked. Two panels: "When data is quite sparse and noisy" and "Using pre-screening method"]

slide-22
SLIDE 22

Traversal operators

Importance sampling: sample a parents-configuration for a node

$$p(\pi_j \mid \mathrm{node}_i) = \frac{\mathrm{score}(i, \pi_j, D) + C}{\sum_{k=1,\, k \neq k_{old}}^{n} \mathrm{score}(i, \pi_k, D) + C\,(n - 1)}$$

$$\frac{Q(M_{old} \mid M_{new})}{Q(M_{new} \mid M_{old})} = \frac{\mathrm{score}(i, \pi_{k_{old}}, D) + C}{\sum_{k=1,\, k \neq k_{new}}^{n} \mathrm{score}(i, \pi_k, D) + C\,(n - 1)} \cdot \frac{\sum_{k=1,\, k \neq k_{old}}^{n} \mathrm{score}(i, \pi_k, D) + C\,(n - 1)}{\mathrm{score}(i, \pi_{k_{new}}, D) + C}$$

$$\frac{P(D \mid M_{new})}{P(D \mid M_{old})} = \exp\big(\mathrm{score}(i, \pi_{k_{new}}, D) - \mathrm{score}(i, \pi_{k_{old}}, D)\big)$$

$$P(\mathrm{node}_i) = \frac{K_i}{\sum_{i=1}^{n} K_i}$$
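A sketch of this proposal for a single node, assuming the constant C is large enough that every shifted score is positive. The score values and function names are illustrative. The old configuration is excluded from the candidates, so the forward and backward proposals normalise over different sets and the Hastings ratio is not 1.

```python
import random


def proposal_probs(scores, old_index, offset):
    """p(pi_j | node) proportional to score_j + C over all j except the current one."""
    weights = [0.0 if k == old_index else s + offset for k, s in enumerate(scores)]
    total = sum(weights)
    return [w / total for w in weights]


def propose(scores, old_index, offset, rng=random):
    """Sample a new parents-configuration; return it with Q(old|new) / Q(new|old)."""
    forward = proposal_probs(scores, old_index, offset)
    new_index = rng.choices(range(len(scores)), weights=forward)[0]
    backward = proposal_probs(scores, new_index, offset)
    return new_index, backward[old_index] / forward[new_index]


# Three kept configurations with log scores -3, -1, -2 and C = 4.
probs = proposal_probs([-3.0, -1.0, -2.0], old_index=0, offset=4.0)
new_index, hastings = propose([-3.0, -1.0, -2.0], 0, 4.0, random.Random(0))
```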

slide-23
SLIDE 23

DIN sampling: used if the new network is loopy

[Figure: three-node example (nodes 1, 2, 3) showing the old model, Step 1, Step 2 and the new model]
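The slides do not say how a loopy proposal is detected. One standard option, taking "loopy" to mean that the graph contains a directed cycle (an assumption), is a topological-sort check. The representation as a mapping from node to parent set is also an assumption.

```python
def is_acyclic(parents):
    """Return True if the graph given as {node: set of parents} has no directed cycle."""
    remaining = {node: set(ps) for node, ps in parents.items()}
    ready = [node for node, ps in remaining.items() if not ps]
    removed = 0
    while ready:
        node = ready.pop()
        removed += 1
        # Delete the node's outgoing edges; children left without parents become ready.
        for child, ps in remaining.items():
            if node in ps:
                ps.remove(node)
                if not ps:
                    ready.append(child)
    # Every node can be removed exactly when there is no cycle.
    return removed == len(remaining)


legal = is_acyclic({1: set(), 2: {1}, 3: {2}})      # chain 1 -> 2 -> 3
illegal = is_acyclic({1: {3}, 2: {1}, 3: {2}})      # cycle 1 -> 2 -> 3 -> 1
```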

slide-24
SLIDE 24

$$A(M_{new} \mid M_{old}) = \min\left(1,\ \frac{P(D \mid M_{new})\,P(M_{new})\,Q(M_{old} \mid M_{new})}{P(D \mid M_{old})\,P(M_{old})\,Q(M_{new} \mid M_{old})}\right)$$

$$\frac{P(D \mid M_{new})}{P(D \mid M_{old})} = \frac{\exp\Big(\sum_{j=1}^{n} \mathrm{score}(n_j, \pi_{new}(n_j), D) + \mathrm{score}(n_i, \pi_{new}(n_i), D)\Big)}{\exp\Big(\sum_{j=1}^{n} \mathrm{score}(n_j, \pi_{old}(n_j), D) + \mathrm{score}(n_i, \pi_{old}(n_i), D)\Big)}$$

For the proposal ratio I simply use an approximation, since calculating the proposal probability exactly is quite time-consuming:

$$\frac{Q(M_{old} \mid M_{new})}{Q(M_{new} \mid M_{old})} = \frac{\mathrm{score}(i, \pi_{k_{old}}, D) + C}{\sum_{k=1,\, k \neq k_{new}}^{n} \mathrm{score}(i, \pi_k, D) + C\,(n - 1)} \cdot \frac{\sum_{k=1,\, k \neq k_{old}}^{n} \mathrm{score}(i, \pi_k, D) + C\,(n - 1)}{\mathrm{score}(i, \pi_{k_{new}}, D) + C}$$

slide-25
SLIDE 25

[Figure: DIN proposal compared with traditional MCMC]

slide-26
SLIDE 26

Algorithm

Initialization

Each iteration

  • Move a step for every chain
  • Chain swap

Keep the first chain
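The loop can be sketched on a toy problem. Here the "networks" are just the integers 0 to 19 with a made-up two-mode posterior, and the within-chain move is a symmetric step of plus or minus 1, so no Hastings correction is needed. The target, the proposal, the choice of swapping neighbouring chains and all names are assumptions of this example; only the overall structure follows the slide.

```python
import math
import random


def log_post(m):
    """Toy unnormalised log posterior with modes at 3 and 16 (illustrative only)."""
    return math.log(math.exp(-0.5 * (m - 3) ** 2) + math.exp(-0.5 * (m - 16) ** 2))


def accept(log_ratio, rng):
    """Accept with probability min(1, exp(log_ratio))."""
    return rng.random() < math.exp(min(0.0, log_ratio))


def mc3(n_iter, temperatures, n_states=20, seed=0):
    """temperatures[0] must be 1 and at least two chains are needed."""
    rng = random.Random(seed)
    # Initialization: one random state per chain.
    states = [rng.randrange(n_states) for _ in temperatures]
    kept = []
    for _ in range(n_iter):
        # Move a step for every chain, tempering the posterior ratio by 1/T.
        for c, temp in enumerate(temperatures):
            proposal = (states[c] + rng.choice((-1, 1))) % n_states
            if accept((log_post(proposal) - log_post(states[c])) / temp, rng):
                states[c] = proposal
        # Chain swap between a randomly chosen pair of neighbouring chains.
        i = rng.randrange(len(temperatures) - 1)
        k = i + 1
        log_ratio = ((log_post(states[k]) - log_post(states[i]))
                     * (1.0 / temperatures[i] - 1.0 / temperatures[k]))
        if accept(log_ratio, rng):
            states[i], states[k] = states[k], states[i]
        # Keep the first chain (T = 1).
        kept.append(states[0])
    return kept


samples = mc3(500, [1.0, 3.0, 9.0])
```

The hot chains cross the low-probability valley between the two modes more easily, and the swaps pass those states down to the T = 1 chain.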

slide-27
SLIDE 27

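[Flowchart: each chain M1, M2, …, Mm (T>=1) proposes Mi’ by importance sampling. A legal Mi’ is accepted with probability A(Mi’, Mi); an illegal one goes to DIN sampling. The m chains S then undergo a chain swap to S’, accepted with probability Pa(S’, S). The T=1 chain is kept.]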

slide-28
SLIDE 28
slide-29
SLIDE 29

MCMCMC for missing values

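Data with missing values ("?"):

        X1  X2  X3  X4  X5
    1    1   ?   2   1   1
    2    2   2   2   1   2
    3    ?   1   1   ?   1
    4    1   2   ?   2   ?
    5    ?   1   1   1   2
    6    1   1   2   ?   1
    7    1   ?   2   1   1
    8    2   1   1   2   ?
    9    2   2   ?   1   2
   10    1   2   2   ?   ?

Index of the missing values:

                I1  I2  I3  I4  I5  I6  …
    variable     1   1   2   2   3   3  …
    instance     3   5   1   7   4   9  …

Values assigned to the missing entries:

    I1  I2  I3  I4  I5  I6  I7  I8  I9  I10  I11  I12
     1   2   2   2   1   1   2   1   2    2    1    2

Selected entries:

    I4  I3  I10  I6
     2   2    1   1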
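The index table appears to number the missing entries variable by variable (I1 is X1 in instance 3, I3 is X2 in instance 1, and so on). A short sketch that rebuilds this index from the example data, with `None` standing for "?"; the function name and the 1-based (variable, instance) pairs are choices made for this example.

```python
data = [
    [1, None, 2, 1, 1],
    [2, 2, 2, 1, 2],
    [None, 1, 1, None, 1],
    [1, 2, None, 2, None],
    [None, 1, 1, 1, 2],
    [1, 1, 2, None, 1],
    [1, None, 2, 1, 1],
    [2, 1, 1, 2, None],
    [2, 2, None, 1, 2],
    [1, 2, 2, None, None],
]


def index_missing(rows):
    """List the missing entries as (variable, instance) pairs, variable by variable."""
    index = []
    for col in range(len(rows[0])):
        for instance, row in enumerate(rows, start=1):
            if row[col] is None:
                index.append((col + 1, instance))
    return index


missing_index = index_missing(data)   # entry k - 1 corresponds to I_k
```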

slide-30
SLIDE 30

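[Flowchart: as for complete data, but each chain carries a model and a data set: (M1, D1), (M2, D2), …, (Mm, Dm), with T>=1. Mi’ is proposed by importance sampling, with DIN sampling for illegal proposals, and accepted with probability A(Mi’, Mi | Di). The missing values Dmi are resampled to Dmi’ and combined with the observed data into Di’, accepted with probability A(Di’, Di | Gi). The m chains S undergo a chain swap to S’ with probability Pa(S’, S). The T=1 chain is kept.]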

slide-31
SLIDE 31

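Proposal method before burn-in

$$Q(v_n \mid M, n, m) = \frac{N_{v_n, v_{\pi}^{m}} + 1}{\sum_{v_n} \big(N_{v_n, v_{\pi}^{m}} + 1\big)}$$

$$Q(v_n, v_{\pi_{mis}} \mid M, n, m) = \frac{N_{v_n, v_{\pi_{nm}}, v_{\pi_{mis}}} + 1}{\sum_{v_n, v_{\pi_{mis}}} \big(N_{v_n, v_{\pi_{nm}}, v_{\pi_{mis}}} + 1\big)}$$

$$Q(v_n \mid M, n) = \frac{N_{v_n} + 1}{\sum_{v_n} \big(N_{v_n} + 1\big)}$$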
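All three proposals share the same add-one form: each candidate value is weighted by its count plus 1. A minimal sketch for one missing entry, assuming the counts for the relevant parent configuration have already been collected; values are coded 1, 2, … as in the example data, and the names are illustrative.

```python
import random


def smoothed_proposal(counts):
    """Q(v) = (N_v + 1) / sum over v' of (N_v' + 1)."""
    total = sum(c + 1 for c in counts)
    return [(c + 1) / total for c in counts]


def sample_missing_value(counts, rng=random):
    """Draw a value 1..len(counts) from the add-one smoothed proposal."""
    probs = smoothed_proposal(counts)
    return rng.choices(range(1, len(counts) + 1), weights=probs)[0]


# Value 1 seen 3 times and value 2 seen once gives Q = (4/6, 2/6).
q = smoothed_proposal([3, 1])
draw = sample_missing_value([3, 1], random.Random(1))
```

The added 1 keeps every value reachable even when it has not yet been observed with that parent configuration.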

slide-32
SLIDE 32

        X1  X2  X3  X4  X5
    1    1   ?   2   1   1
    2    2   2   2   1   2
    3    ?   1   1   2   1
    4    1   2   ?   ?   ?
    5    ?   1   1   1   2
    6    1   1   2   ?   1
    7    1   ?   2   1   1
    8    2   1   1   2   ?
    9    2   2   ?   1   2
   10    1   2   2   ?   ?

[Figure: network over the nodes 1 to 5]

slide-33
SLIDE 33

Acceptance probability

$$\mathrm{Accept}(\mathit{MissVal}', \mathit{MissVal}) = \min\left(1,\ \frac{Q(\mathit{MissVal} \mid \mathit{MissVal}')\,P(D' \mid M)}{Q(\mathit{MissVal}' \mid \mathit{MissVal})\,P(D \mid M)}\right)$$

slide-34
SLIDE 34

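After burn-in

Acceptance probability

$$\mathrm{Accept}(\mathit{MissVal}', \mathit{MissVal}) = \min\left(1,\ \frac{Q(\mathit{MissVal} \mid \mathit{MissVal}')\,P(D' \mid M)}{Q(\mathit{MissVal}' \mid \mathit{MissVal})\,P(D \mid M)}\right)$$

$$Q(\mathit{MissVal}' \mid \mathit{MissVal}) = \prod_{i \in \Omega(cmis)} \frac{N_{i\_new} + 1}{\sum_{j} \big(N_{ij} + 1\big)}$$

$$Q(\mathit{MissVal} \mid \mathit{MissVal}') = \prod_{i \in \Omega(cmis)} \frac{N'_{i\_old} + 1}{\sum_{j} \big(N'_{ij} + 1\big)}$$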

slide-35
SLIDE 35

Result Evaluation (complete data)

$$\mathrm{sensitivity} = \frac{tp}{tp + fn}$$

$$\mathrm{specificity} = \frac{tn}{tn + fp}$$

$$\mathrm{complementary\ specificity} = 1 - \frac{tn}{tn + fp} = \frac{fp}{tn + fp}$$

[Figure: two four-node example networks with their edges marked tp, fn, fp and tn]

ROC curve

  • tp is the number of true positive edges.
  • fn is the number of false negative edges.
  • fp is the number of false positive edges.
  • tn is the number of true negative edges.
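A sketch of how ROC points could be obtained from posterior edge probabilities: threshold the probabilities, count tp, fp, fn and tn against the true network, and record (complementary specificity, sensitivity). The edge probabilities below are invented, and the code assumes the true network has at least one edge and at least one non-edge among the candidates.

```python
def confusion_counts(true_edges, predicted_edges, all_pairs):
    """Count true/false positive and negative edges among the candidate pairs."""
    tp = len(true_edges & predicted_edges)
    fp = len(predicted_edges - true_edges)
    fn = len(true_edges - predicted_edges)
    tn = len(all_pairs) - tp - fp - fn
    return tp, fp, fn, tn


def roc_points(true_edges, edge_posteriors, thresholds):
    """Return (complementary specificity, sensitivity) for each threshold."""
    all_pairs = set(edge_posteriors)
    points = []
    for threshold in thresholds:
        predicted = {e for e, p in edge_posteriors.items() if p >= threshold}
        tp, fp, fn, tn = confusion_counts(true_edges, predicted, all_pairs)
        points.append((fp / (fp + tn), tp / (tp + fn)))
    return points


# Hypothetical posterior edge probabilities for four candidate edges.
true_edges = {("A", "B"), ("B", "C")}
edge_posteriors = {("A", "B"): 0.9, ("B", "C"): 0.4, ("A", "C"): 0.6, ("C", "A"): 0.1}
points = roc_points(true_edges, edge_posteriors, [1.1, 0.5, 0.0])
```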

slide-36
SLIDE 36

Model Genetic Network

slide-37
SLIDE 37

  • MCMCMC against order MCMC
  • MCMCMC against structure MCMC
  • MCMCMC against Population MCMC

slide-38
SLIDE 38
  • Temperatures = [1, 1, 3, 9, 30]
  • Keep at most 10 parents-configurations for each node and cardinality.
  • 60000 iterations: the first 30000 are burn-in and the last 30000 samples are kept.
slide-39
SLIDE 39

Alarm Network

slide-40
SLIDE 40
slide-41
SLIDE 41

Arabidopsis data

slide-42
SLIDE 42

Result Evaluation (missing values)

Model Genetic Network

  • Before-burn-in algorithm: 30000 burn-in, 30000 iterations
  • After-burn-in algorithm: 40000 iterations
  • Temp = [1, 1, 3, 9, 12]

slide-43
SLIDE 43
slide-44
SLIDE 44
slide-45
SLIDE 45
  • ROC curves for noise = 0.2 and data = 200 with different missing rates
  • Temp = [1, 1, 3, 9, 12]
  • Use 30000 burn-in and 30000 iterations.
  • Keep one sample every 10 steps (before-burn-in algorithm).
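Discarding the burn-in and thinning is a single slice once the chain is stored as a list. This sketch assumes the 30000 iterations follow the 30000 burn-in steps, giving 60000 steps in total; the function name is illustrative.

```python
def keep_samples(chain, burn_in, thin):
    """Drop the first `burn_in` states, then keep every `thin`-th one."""
    return chain[burn_in::thin]


# Stand-in chain: the step number plays the role of the sampled network.
chain = list(range(60000))
kept = keep_samples(chain, burn_in=30000, thin=10)
```

With these settings 3000 samples remain.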

slide-46
SLIDE 46

B cell Lymphoma data

slide-47
SLIDE 47

Summary

  • MCMCMC
  • Order MCMC
  • Structure MCMC
  • Population MCMC

slide-48
SLIDE 48

Problems with MCMCMC

[Figure: logP(D|M) + logP(M) against iterations for the T=1 and T=50 chains]

slide-49
SLIDE 49