The Elimination Algorithm

Probabilistic Graphical Models (10-708)
Lecture 4, Sep 26, 2007
Eric Xing
School of Computer Science



[Figure: a signaling-pathway graphical model over the nodes Receptor A, Receptor B, Kinase C, Kinase D, Kinase E, TF F, Gene G, Gene H, labeled with variables X1 through X8]

Reading: J-Chap 3, KF-Chap. 8, 9


Questions?


Probabilistic Inference

  • We now have compact representations of probability distributions: Graphical Models
  • A GM G describes a unique probability distribution P
  • How do we answer queries about P?
  • We use inference as a name for the process of computing answers to such queries

Query 1: Likelihood

Most of the queries one may ask involve evidence

  • Evidence e is an assignment of values to a set E of variables in the domain
  • Without loss of generality E = { Xk+1, …, Xn }

Simplest query: compute probability of evidence

$$P(e) = \sum_{x_1} \cdots \sum_{x_k} P(x_1, \ldots, x_k, e)$$

  • this is often referred to as computing the likelihood of e

Query 2: Conditional Probability

Often we are interested in the conditional probability distribution of a variable given the evidence

$$P(X \mid e) = \frac{P(X, e)}{P(e)} = \frac{P(X, e)}{\sum_{x} P(X = x, e)}$$

  • this is the a posteriori belief in X, given evidence e

We usually query a subset Y of all domain variables X={Y,Z} and "don't care" about the remaining, Z:

$$P(Y \mid e) = \sum_{z} P(Y, Z = z \mid e)$$

  • the process of summing out the "don't care" variables z is called marginalization, and the resulting P(y|e) is called a marginal prob.
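The likelihood and conditional-probability queries can be illustrated on a tiny explicit joint table. The sketch below is only an illustration: the three binary variables (X, Z, E) and all the numbers are made up, and the function names `likelihood` and `posterior_x` are hypothetical.

```python
# Hypothetical joint P(X, Z, E) over three binary variables.
# The values are invented for illustration and sum to 1.
joint = {
    (0, 0, 0): 0.10, (0, 0, 1): 0.05,
    (0, 1, 0): 0.20, (0, 1, 1): 0.15,
    (1, 0, 0): 0.05, (1, 0, 1): 0.20,
    (1, 1, 0): 0.10, (1, 1, 1): 0.15,
}

def likelihood(e):
    """Query 1: P(e), obtained by summing the joint over all non-evidence variables."""
    return sum(p for (x, z, ev), p in joint.items() if ev == e)

def posterior_x(e):
    """Query 2: P(X | e). Sum out the "don't care" variable Z, then divide by P(e)."""
    unnormalized = {x: sum(joint[(x, z, e)] for z in (0, 1)) for x in (0, 1)}
    p_e = sum(unnormalized.values())
    return {x: v / p_e for x, v in unnormalized.items()}

print(likelihood(1))
print(posterior_x(1))
```

Enumerating the full joint like this is exactly the naive approach that becomes exponential in the number of variables, which is the motivation for elimination.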

Applications of a posteriori Belief

[Figure: networks over nodes A, B, C with query nodes marked "?"]

  • Prediction: what is the probability of an outcome given the starting condition
      • the query node is a descendant of the evidence
  • Diagnosis: what is the probability of disease/fault given symptoms
      • the query node is an ancestor of the evidence
  • Learning under partial observation
      • fill in the unobserved values under an "EM" setting (more later)
  • The directionality of information flow between variables is not restricted by the directionality of the edges in a GM
      • probabilistic inference can combine evidence from all parts of the network

Query 3: Most Probable Assignment

In this query we want to find the most probable joint assignment (MPA) for some variables of interest

Such reasoning is usually performed under some given evidence e, and ignoring (the values of) other variables z:

$$\mathrm{MPA}(Y \mid e) = \arg\max_{y \in \mathcal{Y}} P(y \mid e) = \arg\max_{y \in \mathcal{Y}} \sum_{z} P(y, z \mid e)$$

  • this is the maximum a posteriori configuration of y.

Applications of MPA

Classification

  • find most likely label, given the evidence

Explanation

  • what is the most likely scenario, given the evidence

Cautionary note:

The MPA of a variable depends on its "context", the set of variables being jointly queried

Example:

    x   y   P(x,y)
    0   0   0.35
    0   1   0.05
    1   0   0.3
    1   1   0.3

  • MPA of X ?
  • MPA of (X, Y) ?
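The cautionary note can be checked numerically. The sketch below assumes the joint table reads P(0,0)=0.35, P(0,1)=0.05, P(1,0)=0.3, P(1,1)=0.3 (the zero entries were lost in extraction, so this reading is an assumption, though it sums to 1). The variable names are hypothetical.

```python
# Assumed joint table P(x, y) from the example.
joint = {(0, 0): 0.35, (0, 1): 0.05, (1, 0): 0.30, (1, 1): 0.30}

# Marginalize out Y to get P(x).
marginal_x = {}
for (x, y), p in joint.items():
    marginal_x[x] = marginal_x.get(x, 0.0) + p

# MPA of X alone: argmax over the marginal.
mpa_x = max(marginal_x, key=marginal_x.get)

# MPA of (X, Y) jointly: argmax over the joint.
mpa_xy = max(joint, key=joint.get)

print(mpa_x)   # 1, since P(X=1) = 0.6 > P(X=0) = 0.4
print(mpa_xy)  # (0, 0), since 0.35 is the largest joint entry
```

Under this reading, the most probable value of X alone is 1, yet the most probable joint assignment sets X to 0, so the MPA of a variable cannot be read off from the MPA of a larger set.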

Complexity of Inference

Thm: Computing P(X = x | e) in a GM is NP-hard

Hardness does not mean we cannot solve inference

  • It implies that we cannot find a general procedure that works efficiently for arbitrary GMs
  • For particular families of GMs, we can have provably efficient procedures


Approaches to inference

Exact inference algorithms

  • The elimination algorithm
  • Message-passing algorithm (sum-product, belief propagation)
  • The junction tree algorithms

Approximate inference techniques

  • Stochastic simulation / sampling methods
  • Markov chain Monte Carlo methods
  • Variational algorithms

Marginalization and Elimination

A signal transduction pathway:

[Figure: chain A → B → C → D → E]

What is the likelihood that protein E is active?

  • Query: P(e)
  • By chain decomposition, we get

$$P(e) = \sum_{d}\sum_{c}\sum_{b}\sum_{a} P(a, b, c, d, e) = \sum_{d}\sum_{c}\sum_{b}\sum_{a} P(a)\,P(b \mid a)\,P(c \mid b)\,P(d \mid c)\,P(e \mid d)$$

  • a naïve summation needs to enumerate over an exponential number of terms

Elimination on Chains

[Figure: chain A → B → C → D → E]

Rearranging terms ...

$$P(e) = \sum_{d}\sum_{c}\sum_{b}\sum_{a} P(a)\,P(b \mid a)\,P(c \mid b)\,P(d \mid c)\,P(e \mid d) = \sum_{d}\sum_{c}\sum_{b} P(c \mid b)\,P(d \mid c)\,P(e \mid d) \sum_{a} P(a)\,P(b \mid a)$$


Elimination on Chains

[Figure: chain A → B → C → D → E with the eliminated node crossed out]

  • Now we can perform innermost summation

$$P(e) = \sum_{d}\sum_{c}\sum_{b} P(c \mid b)\,P(d \mid c)\,P(e \mid d) \sum_{a} P(a)\,P(b \mid a) = \sum_{d}\sum_{c}\sum_{b} P(c \mid b)\,P(d \mid c)\,P(e \mid d)\,p(b)$$

  • This summation "eliminates" one variable from our summation argument at a "local cost".

Elimination in Chains

[Figure: chain A → B → C → D → E with the eliminated nodes crossed out]

Rearranging and then summing again, we get

$$P(e) = \sum_{d}\sum_{c}\sum_{b} P(c \mid b)\,P(d \mid c)\,P(e \mid d)\,p(b) = \sum_{d}\sum_{c} P(d \mid c)\,P(e \mid d) \sum_{b} P(c \mid b)\,p(b) = \sum_{d}\sum_{c} P(d \mid c)\,P(e \mid d)\,p(c)$$


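Elimination in Chains

[Figure: chain A → B → C → D → E with the eliminated nodes crossed out]

Eliminating nodes one by one all the way to the end, we get

$$P(e) = \sum_{d} P(e \mid d)\,p(d)$$

Complexity (for a chain of n variables, each taking at most k values):

  • Each step costs O(|Val(Xi)|*|Val(Xi+1)|) operations, so the whole chain costs O(nk^2)
  • Compare to naïve evaluation that sums over joint values of n-1 variables: O(k^n)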
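The chain derivation can be checked with a short sketch that computes the marginal of the last variable twice: once by enumerating every joint assignment, and once by pushing sums inward one variable at a time. The conditional tables are random placeholders rather than values from the lecture, and the helper names (`random_cpt`, `naive_marginal`, `chain_elimination`) are hypothetical.

```python
from itertools import product
import random

def random_cpt(k, rng):
    """Return a k x k table T[parent][child] = P(child | parent); each row sums to 1."""
    rows = []
    for _ in range(k):
        weights = [rng.random() + 0.1 for _ in range(k)]
        total = sum(weights)
        rows.append([w / total for w in weights])
    return rows

def naive_marginal(prior, cpts):
    """P(last variable) by enumerating all joint assignments: O(k^n) terms."""
    k, n = len(prior), len(cpts) + 1
    out = [0.0] * k
    for assign in product(range(k), repeat=n):
        p = prior[assign[0]]
        for i, cpt in enumerate(cpts):
            p *= cpt[assign[i]][assign[i + 1]]
        out[assign[-1]] += p
    return out

def chain_elimination(prior, cpts):
    """P(last variable) by eliminating one variable per step: O(n k^2) operations."""
    msg = prior  # starts as p(a); after each step it is the marginal of the next node
    for cpt in cpts:
        k = len(msg)
        msg = [sum(msg[u] * cpt[u][v] for u in range(k)) for v in range(k)]
    return msg

rng = random.Random(0)
k = 3
prior = random_cpt(k, rng)[0]                  # P(a)
cpts = [random_cpt(k, rng) for _ in range(4)]  # P(b|a), P(c|b), P(d|c), P(e|d)
p_naive = naive_marginal(prior, cpts)
p_elim = chain_elimination(prior, cpts)
print(p_naive)
print(p_elim)
```

Both routes should agree up to floating-point error; the difference is that the enumeration visits k^5 joint assignments here, while elimination performs four k-by-k steps.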

Undirected Chains

[Figure: undirected chain A, B, C, D, E]

Rearranging terms ...

$$P(e) = \frac{1}{Z}\sum_{d}\sum_{c}\sum_{b}\sum_{a} \phi(b, a)\,\phi(c, b)\,\phi(d, c)\,\phi(e, d) = \frac{1}{Z}\sum_{d}\sum_{c}\sum_{b} \phi(c, b)\,\phi(d, c)\,\phi(e, d) \sum_{a} \phi(b, a) = \cdots$$


The Sum-Product Operation

In general, we can view the task at hand as that of computing the value of an expression of the form:

$$\sum_{\mathbf{z}} \prod_{\phi \in \mathcal{F}} \phi$$

where F is a set of factors

We call this task the sum-product inference task.

Outcome of elimination

  • Let X be some set of variables, let F be a set of factors such that for each φ ∈ F, Scope[φ] ⊆ X, let Y ⊂ X be a set of query variables, and let Z = X−Y be the variables to be eliminated
  • The result of eliminating the variables Z is a factor

$$\tau(\mathbf{Y}) = \sum_{\mathbf{z}} \prod_{\phi \in \mathcal{F}} \phi$$

  • This factor does not necessarily correspond to any probability or conditional probability in this network. (example forthcoming)


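Dealing with evidence

Conditioning as a Sum-Product Operation

  • The evidence potential:

$$\delta(E_i, \bar{e}_i) = \begin{cases} 1 & \text{if } E_i \equiv \bar{e}_i \\ 0 & \text{if } E_i \neq \bar{e}_i \end{cases}$$

  • Total evidence potential:

$$\delta(\mathbf{E}, \bar{\mathbf{e}}) = \prod_{i \in I_{\mathbf{E}}} \delta(E_i, \bar{e}_i)$$

  • Introducing evidence (restricted factors):

$$\tau(\mathbf{Y}, \bar{\mathbf{e}}) = \sum_{\mathbf{z}, \mathbf{e}} \prod_{\phi \in \mathcal{F}} \phi \times \delta(\mathbf{E}, \bar{\mathbf{e}})$$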
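The effect of an evidence potential can be seen on a single factor: multiplying by the indicator and summing out the evidence variable gives the same table as slicing the factor at the observed value. The sketch below uses a made-up factor phi(y, e) over two binary variables, and the function names are hypothetical.

```python
def delta(e, e_obs):
    """Evidence potential: 1 if e equals the observed value, else 0."""
    return 1.0 if e == e_obs else 0.0

# Hypothetical factor phi(y, e); the numbers are invented for illustration.
phi = {(0, 0): 0.2, (0, 1): 0.8, (1, 0): 0.6, (1, 1): 0.4}

def tau_by_summation(e_obs):
    """tau(y, e_obs) = sum over e of phi(y, e) * delta(e, e_obs)."""
    return {y: sum(phi[(y, e)] * delta(e, e_obs) for e in (0, 1)) for y in (0, 1)}

def tau_by_restriction(e_obs):
    """The same factor, obtained by restricting phi to e = e_obs."""
    return {y: phi[(y, e_obs)] for y in (0, 1)}

print(tau_by_summation(1))
print(tau_by_restriction(1))
```

This is why conditioning can be handled by the same sum-product machinery as marginalization: the evidence potential is just one more factor in the product.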

Inference on General GM via Variable Elimination

General idea:

  • Write query in the form

$$P(X_1, e) = \sum_{x_n} \cdots \sum_{x_3} \sum_{x_2} \prod_{i} P(x_i \mid pa_i)$$

      • this suggests an "elimination order" of latent variables to be marginalized
  • Iteratively
      • Move all irrelevant terms outside of innermost sum
      • Perform innermost sum, getting a new term
      • Insert the new term into the product
  • wrap-up

$$P(X_1 \mid e) = \frac{\phi(X_1, e)}{\sum_{x_1} \phi(X_1, e)}$$


The elimination algorithm

Procedure Elimination (
    G,  // the GM
    E,  // evidence
    Z,  // Set of variables to be eliminated
    X   // query variable(s)
)
  1. Initialize (G, Z)
  2. Evidence (E)
  3. Sum-Product-Variable-Elimination (F, Z, ≺)
  4. Normalization (φ∗)

The elimination algorithm

Procedure Initialize (G, Z)
  1. Let Z1, . . . ,Zk be an ordering of Z such that Zi ≺ Zj iff i < j
  2. Initialize F with the full set of factors

Procedure Evidence (E)
  1. for each i∈ΙE ,
       F = F ∪ {δ(Ei, ei)}

Procedure Sum-Product-Variable-Elimination (F, Z, ≺)
  1. for i = 1, . . . , k
       F ← Sum-Product-Eliminate-Var(F, Zi)
  2. φ∗ ← ∏φ∈F φ
  3. return φ∗
  4. Normalization (φ∗)


The elimination algorithm

Procedure Normalization (φ∗)
  1. P(X|E) = φ∗(X) / ∑x φ∗(X)

Procedure Sum-Product-Eliminate-Var (
    F,  // Set of factors
    Z   // Variable to be eliminated
)
  1. F′ ← {φ ∈ F : Z ∈ Scope[φ]}
  2. F′′ ← F − F′
  3. ψ ← ∏φ∈F′ φ
  4. τ ← ∑Z ψ
  5. return F′′ ∪ {τ}
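The procedures above can be sketched in runnable form. This is a minimal illustration, not the lecture's implementation: the `Factor` class, the dictionary-based tables, the `card` map of variable cardinalities, and the three-node chain with made-up numbers are all assumptions introduced here. Evidence is introduced as an extra indicator factor, as in Procedure Evidence.

```python
from itertools import product

class Factor:
    """A table over a tuple of variable names (hypothetical helper type)."""
    def __init__(self, scope, table):
        self.scope = tuple(scope)
        self.table = dict(table)  # maps assignment tuples to nonnegative numbers

def multiply(f, g, card):
    """Pointwise product of two factors over the union of their scopes."""
    scope = f.scope + tuple(v for v in g.scope if v not in f.scope)
    table = {}
    for assign in product(*(range(card[v]) for v in scope)):
        a = dict(zip(scope, assign))
        table[assign] = (f.table[tuple(a[v] for v in f.scope)]
                         * g.table[tuple(a[v] for v in g.scope)])
    return Factor(scope, table)

def sum_out(f, var):
    """Marginalize var out of factor f."""
    idx = f.scope.index(var)
    table = {}
    for assign, val in f.table.items():
        key = assign[:idx] + assign[idx + 1:]
        table[key] = table.get(key, 0.0) + val
    return Factor(f.scope[:idx] + f.scope[idx + 1:], table)

def eliminate_var(factors, var, card):
    """Sum-Product-Eliminate-Var: returns F'' plus the new factor tau."""
    touching = [f for f in factors if var in f.scope]   # F'
    rest = [f for f in factors if var not in f.scope]   # F''
    if not touching:
        return rest
    psi = touching[0]
    for f in touching[1:]:
        psi = multiply(psi, f, card)
    return rest + [sum_out(psi, var)]

def variable_elimination(factors, order, card):
    """Eliminate variables in the given order, multiply what is left, normalize."""
    for var in order:
        factors = eliminate_var(factors, var, card)
    phi = factors[0]
    for f in factors[1:]:
        phi = multiply(phi, f, card)
    z = sum(phi.table.values())
    return Factor(phi.scope, {k: v / z for k, v in phi.table.items()})

# Hypothetical chain A -> B -> C with evidence C = 1; query P(A | C = 1).
card = {"A": 2, "B": 2, "C": 2}
p_a = Factor(("A",), {(0,): 0.6, (1,): 0.4})
p_b_a = Factor(("A", "B"), {(0, 0): 0.7, (0, 1): 0.3, (1, 0): 0.2, (1, 1): 0.8})
p_c_b = Factor(("B", "C"), {(0, 0): 0.9, (0, 1): 0.1, (1, 0): 0.5, (1, 1): 0.5})
evidence_c1 = Factor(("C",), {(0,): 0.0, (1,): 1.0})  # delta(C, 1)
posterior = variable_elimination([p_a, p_b_a, p_c_b, evidence_c1], ["C", "B"], card)
print(posterior.scope, posterior.table)
```

For these made-up tables the unnormalized values are 0.132 and 0.168, so the posterior over A is about 0.44 and 0.56. The sketch assumes at least one factor remains after elimination and does not try to choose a good elimination order.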

A more complex network

A food web

[Figure: a food web network over nodes A, B, C, D, E, F, G, H]

What is the probability that hawks are leaving given that the grass condition is poor?


Example: Variable Elimination

  • Query: P(A |h)
  • Need to eliminate: B,C,D,E,F,G,H
  • Initial factors:

$$P(a)\,P(b)\,P(c \mid b)\,P(d \mid a)\,P(e \mid c, d)\,P(f \mid a)\,P(g \mid e)\,P(h \mid e, f)$$

  • Choose an elimination order: H,G,F,E,D,C,B
  • Step 1:
      • Conditioning (fix the evidence node (i.e., h) on its observed value (i.e., $\tilde{h}$)):

$$m_h(e, f) = p(h = \tilde{h} \mid e, f)$$

      • This step is isomorphic to a marginalization step:

$$m_h(e, f) = \sum_{h} p(h \mid e, f)\,\delta(h = \tilde{h})$$

[Figure: the network over A through H; remaining graph after this step: A, B, C, D, E, F, G]

Example: Variable Elimination

  • Query: P(A |h)
  • Need to eliminate: B,C,D,E,F,G
  • Initial factors, reduced by the previous step:

$$P(a)\,P(b)\,P(c \mid b)\,P(d \mid a)\,P(e \mid c, d)\,P(f \mid a)\,P(g \mid e)\,P(h \mid e, f)$$
$$\Rightarrow P(a)\,P(b)\,P(c \mid b)\,P(d \mid a)\,P(e \mid c, d)\,P(f \mid a)\,P(g \mid e)\,m_h(e, f)$$

  • Step 2: Eliminate G
      • compute

$$m_g(e) = \sum_{g} p(g \mid e) = 1$$

$$\Rightarrow P(a)\,P(b)\,P(c \mid b)\,P(d \mid a)\,P(e \mid c, d)\,P(f \mid a)\,m_g(e)\,m_h(e, f) = P(a)\,P(b)\,P(c \mid b)\,P(d \mid a)\,P(e \mid c, d)\,P(f \mid a)\,m_h(e, f)$$

[Figure: remaining graph after this step: A, B, C, D, E, F]


Example: Variable Elimination

  • Query: P(A |h)
  • Need to eliminate: B,C,D,E,F
  • Initial factors, reduced by the previous steps:

$$P(a)\,P(b)\,P(c \mid b)\,P(d \mid a)\,P(e \mid c, d)\,P(f \mid a)\,m_h(e, f)$$

  • Step 3: Eliminate F
      • compute

$$m_f(a, e) = \sum_{f} p(f \mid a)\,m_h(e, f)$$

$$\Rightarrow P(a)\,P(b)\,P(c \mid b)\,P(d \mid a)\,P(e \mid c, d)\,m_f(a, e)$$

[Figure: remaining graph after this step: A, B, C, D, E]

Example: Variable Elimination

  • Query: P(A |h)
  • Need to eliminate: B,C,D,E
  • Initial factors, reduced by the previous steps:

$$P(a)\,P(b)\,P(c \mid b)\,P(d \mid a)\,P(e \mid c, d)\,m_f(a, e)$$

  • Step 4: Eliminate E
      • compute

$$m_e(a, c, d) = \sum_{e} p(e \mid c, d)\,m_f(a, e)$$

$$\Rightarrow P(a)\,P(b)\,P(c \mid b)\,P(d \mid a)\,m_e(a, c, d)$$

[Figure: remaining graph after this step: A, B, C, D]


Example: Variable Elimination

  • Query: P(A |h)
  • Need to eliminate: B,C,D
  • Initial factors, reduced by the previous steps:

$$P(a)\,P(b)\,P(c \mid b)\,P(d \mid a)\,m_e(a, c, d)$$

  • Step 5: Eliminate D
      • compute

$$m_d(a, c) = \sum_{d} p(d \mid a)\,m_e(a, c, d)$$

$$\Rightarrow P(a)\,P(b)\,P(c \mid b)\,m_d(a, c)$$

[Figure: remaining graph after this step: A, B, C]

Example: Variable Elimination

  • Query: P(A |h)
  • Need to eliminate: B,C
  • Initial factors, reduced by the previous steps:

$$P(a)\,P(b)\,P(c \mid b)\,m_d(a, c)$$

  • Step 6: Eliminate C
      • compute

$$m_c(a, b) = \sum_{c} p(c \mid b)\,m_d(a, c)$$

$$\Rightarrow P(a)\,P(b)\,m_c(a, b)$$

[Figure: remaining graph after this step: A, B]


Example: Variable Elimination

  • Query: P(A |h)
  • Need to eliminate: B
  • Initial factors, reduced by the previous steps:

$$P(a)\,P(b)\,m_c(a, b)$$

  • Step 7: Eliminate B
      • compute

$$m_b(a) = \sum_{b} p(b)\,m_c(a, b)$$

$$\Rightarrow P(a)\,m_b(a)$$

[Figure: remaining graph after this step: A]

Example: Variable Elimination

  • Query: P(A |h)
  • Need to eliminate: nothing (B,C,D,E,F,G,H have all been eliminated)
  • Initial factors, reduced by the previous steps:

$$P(a)\,m_b(a)$$

  • Step 8: Wrap-up

$$p(a, \tilde{h}) = p(a)\,m_b(a), \qquad p(\tilde{h}) = \sum_{a} p(a)\,m_b(a)$$

$$\Rightarrow P(a \mid \tilde{h}) = \frac{p(a)\,m_b(a)}{\sum_{a} p(a)\,m_b(a)}$$
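The sequence of intermediate factors in the food web example can be reproduced by tracking factor scopes only, without any numbers. The sketch below is an illustration under that simplification: the slides give no CPT values, so none are invented, and the function name `eliminate_scopes` and the lowercase single-letter variable names are choices made here. The evidence node h is treated as the first eliminated variable, matching the remark that conditioning is isomorphic to a marginalization step.

```python
def eliminate_scopes(scopes, order):
    """Track only factor scopes while eliminating variables in the given order.

    Returns a list of (eliminated variable, scope of the new factor m_var).
    """
    scopes = [frozenset(s) for s in scopes]
    messages = []
    for var in order:
        touching = [s for s in scopes if var in s]   # factors that mention var
        rest = [s for s in scopes if var not in s]   # factors left untouched
        new_scope = frozenset().union(*touching) - {var}
        messages.append((var, tuple(sorted(new_scope))))
        scopes = rest + [new_scope]
    return messages

# Scopes of P(a) P(b) P(c|b) P(d|a) P(e|c,d) P(f|a) P(g|e) P(h|e,f)
food_web = ["a", "b", "cb", "da", "ecd", "fa", "ge", "hef"]
steps = eliminate_scopes(food_web, "hgfedcb")
for var, scope in steps:
    print(f"m_{var}({', '.join(scope)})")
```

With the order H,G,F,E,D,C,B this yields m_h(e, f), m_g(e), m_f(a, e), m_e(a, c, d), m_d(a, c), m_c(a, b), m_b(a), the same factors as in the worked steps. The largest one, m_e(a, c, d), is the step that dominates the cost for this order.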


Complexity of variable elimination

Suppose in one elimination step we compute

$$m_x(y_1, \ldots, y_k) = \sum_{x} m'_x(x, y_1, \ldots, y_k), \qquad m'_x(x, y_1, \ldots, y_k) = \prod_{i=1}^{k} m_i(x, \mathbf{y}_{c_i})$$

This requires

  • $k \cdot |\mathrm{Val}(X)| \cdot \prod_{i} |\mathrm{Val}(\mathbf{Y}_{C_i})|$ multiplications
      • For each value for x, y1, …, yk, we do k multiplications
  • $|\mathrm{Val}(X)| \cdot \prod_{i} |\mathrm{Val}(\mathbf{Y}_{C_i})|$ additions
      • For each value of y1, …, yk , we do |Val(X)| additions

Complexity is exponential in number of variables in the intermediate factor
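The two operation counts can be evaluated directly for a given step. The sketch below simply plugs numbers into the formulas above; the function name `elimination_step_cost` and the example cardinalities are hypothetical, and the list `cards_yc` is assumed to hold |Val(Y_Ci)| for each of the k factors being multiplied.

```python
from math import prod

def elimination_step_cost(card_x, cards_yc):
    """Return (multiplications, additions) for one elimination step.

    card_x   -- |Val(X)|, number of values of the variable being eliminated
    cards_yc -- list of |Val(Y_Ci)|, one entry per factor m_i(x, y_ci)
    """
    k = len(cards_yc)
    joint_y = prod(cards_yc)
    multiplications = k * card_x * joint_y
    additions = card_x * joint_y
    return multiplications, additions

# Hypothetical step: binary X, two factors whose other variables have 4 and 2 joint values.
print(elimination_step_cost(2, [4, 2]))  # (32, 16)
```

Both counts grow with the product of the cardinalities of the variables in the intermediate factor, so the elimination order matters through the scopes it creates.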