Bucket-elimination COMPSCI 276, Spring 2011 Class 5: Rina Dechter



slide-1
SLIDE 1

1

Exact Inference Algorithms Bucket-elimination

COMPSCI 276, Spring 2011 Class 5: Rina Dechter

(Reading: class notes chapter 4, Darwiche chapter 6)

slide-2
SLIDE 2

2

Belief Updating

[Figure: Bayesian network over Smoking, Lung Cancer, Bronchitis, X-ray, Dyspnoea]

P(lung cancer = yes | smoking = no, dyspnoea = yes) = ?

slide-3
SLIDE 3
slide-4
SLIDE 4

4

Probabilistic Inference Tasks

  • Belief updating: E is a subset of {X1,…,Xn}, Y is a subset of X−E; compute P(Y=y | E=e)

$BEL(X_i) = P(X_i = x_i \mid \text{evidence})$

  • Probability of evidence: P(e) = ?
  • Finding the most probable explanation (MPE)

$x^* = \arg\max_{x} P(x, e)$

  • Finding the maximum a-posteriori (MAP) hypothesis

$(a_1^*, \ldots, a_k^*) = \arg\max_{a} \sum_{X/A} P(x, e)$, where $A \subseteq X$ are the hypothesis variables

  • Finding the maximum-expected-utility (MEU) decision

$(d_1^*, \ldots, d_k^*) = \arg\max_{d} \sum_{X/D} P(x, e)\, U(x)$, where $D \subseteq X$ are the decision variables and $U(x)$ is the utility function

slide-5
SLIDE 5

5

Belief updating is NP-hard

  • Each SAT formula can be mapped to a Bayesian network query.
  • Example: is (u ∨ ¬v ∨ w) ∧ (¬u ∨ ¬w ∨ y) satisfiable?
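One way to picture the reduction, as a hedged sketch rather than the construction used in class: give every propositional variable a uniform prior, attach a deterministic node to each clause, and ask whether the probability that all clause nodes are true is positive. The code below does this by plain enumeration for the example formula; all names (`CLAUSES`, `clause_cpt`, `prob_formula_true`) are illustrative assumptions.

```python
from itertools import product

# Clauses as lists of (variable, is_positive) literals:
# (u or not v or w) and (not u or not w or y)
CLAUSES = [[("u", True), ("v", False), ("w", True)],
           [("u", False), ("w", False), ("y", True)]]
VARS = ["u", "v", "w", "y"]


def clause_cpt(clause, assignment):
    """Deterministic CPT: P(clause node = true | its variables) is 0 or 1."""
    return 1.0 if any(assignment[v] == pos for v, pos in clause) else 0.0


def prob_formula_true(clauses, variables):
    """P(all clause nodes = true) under uniform priors on the variables."""
    total = 0.0
    for values in product([False, True], repeat=len(variables)):
        assignment = dict(zip(variables, values))
        weight = 0.5 ** len(variables)      # product of the uniform priors
        for clause in clauses:
            weight *= clause_cpt(clause, assignment)
        total += weight
    return total


p = prob_formula_true(CLAUSES, VARS)
print(p > 0)   # the formula is satisfiable iff this probability is positive
```

Enumeration is exponential, of course; the point is only that answering the probabilistic query answers the SAT question.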

slide-6
SLIDE 6

6

Motivation

  • How can we compute P(D)? P(D|A=0)? P(A|D=0)?
  • Brute force: O(k^4)
  • Maybe O(4k^2)

[Figure: network over the variables A, B, C, D]

Given:
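The figure and the given tables did not survive extraction. As a hedged illustration only, assume a chain A → B → C → D with tables P(A), P(B|A), P(C|B), P(D|C) and k values per variable (the chain structure, the random tables, and all names are assumptions). The sketch compares summing the full joint, which touches k^4 terms, with pushing the sums inward, which costs on the order of k^2 per eliminated variable.

```python
from itertools import product
import random

k = 3            # domain size of every variable (assumed)
random.seed(0)


def random_cpt(k):
    """Return a k x k table t[parent][child] whose rows sum to 1."""
    rows = []
    for _ in range(k):
        w = [random.random() for _ in range(k)]
        s = sum(w)
        rows.append([x / s for x in w])
    return rows


w = [random.random() for _ in range(k)]
p_a = [x / sum(w) for x in w]                        # P(A)
p_b_a, p_c_b, p_d_c = random_cpt(k), random_cpt(k), random_cpt(k)


def brute_force():
    """P(D) by summing the full joint: k^4 terms."""
    p_d = [0.0] * k
    for a, b, c, d in product(range(k), repeat=4):
        p_d[d] += p_a[a] * p_b_a[a][b] * p_c_b[b][c] * p_d_c[c][d]
    return p_d


def eliminate():
    """P(D) by eliminating A, then B, then C: three k x k passes."""
    msg = p_a
    for cpt in (p_b_a, p_c_b, p_d_c):
        msg = [sum(msg[i] * cpt[i][j] for i in range(k)) for j in range(k)]
    return msg


print(brute_force())
print(eliminate())
```

Both functions return the same distribution over D; only the amount of work differs.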

slide-7
SLIDE 7
slide-8
SLIDE 8
slide-9
SLIDE 9
slide-10
SLIDE 10

10

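Belief updating: P(X|evidence)=?

[Figure: “moral” graph over the nodes A, B, C, D, E]

$P(a \mid e=0) \propto P(a, e=0) = \sum_{e=0,\,d,\,c,\,b} P(a)\,P(b \mid a)\,P(c \mid a)\,P(d \mid b,a)\,P(e \mid b,c)$

$= P(a) \sum_{e=0} \sum_{d} \sum_{c} P(c \mid a) \sum_{b} P(b \mid a)\,P(d \mid b,a)\,P(e \mid b,c)$

Variable Elimination

$h^{B}(a,d,c,e) = \sum_{b} P(b \mid a)\,P(d \mid b,a)\,P(e \mid b,c)$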
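A small numeric check of this rearrangement, offered as a sketch: the CPT values are random and binary domains are assumed (neither comes from the slides). It confirms that P(a) times the nested sums, with the innermost sum packaged as h^B(a,d,c,e), equals the brute-force sum of the joint for e = 0.

```python
from itertools import product
import random

random.seed(1)
V = (0, 1)


def cpt(n_parents):
    """Random binary CPT: parent tuple -> [P(x=0 | pa), P(x=1 | pa)]."""
    table = {}
    for pa in product(V, repeat=n_parents):
        p0 = random.random()
        table[pa] = [p0, 1.0 - p0]
    return table


P_a = cpt(0)      # P(a)
P_b = cpt(1)      # P(b|a), keyed by (a,)
P_c = cpt(1)      # P(c|a), keyed by (a,)
P_d = cpt(2)      # P(d|b,a), keyed by (b, a)
P_e = cpt(2)      # P(e|b,c), keyed by (b, c)


def joint(a, b, c, d, e):
    return (P_a[()][a] * P_b[(a,)][b] * P_c[(a,)][c]
            * P_d[(b, a)][d] * P_e[(b, c)][e])


def brute(a, e=0):
    """P(a, e) by summing the full joint over b, c, d."""
    return sum(joint(a, b, c, d, e) for b, c, d in product(V, repeat=3))


def h_B(a, d, c, e):
    """h^B(a,d,c,e) = sum_b P(b|a) P(d|b,a) P(e|b,c)."""
    return sum(P_b[(a,)][b] * P_d[(b, a)][d] * P_e[(b, c)][e] for b in V)


def eliminated(a, e=0):
    """P(a) * sum_d sum_c P(c|a) * h^B(a,d,c,e), with e clamped to the evidence."""
    return P_a[()][a] * sum(P_c[(a,)][c] * h_B(a, d, c, e)
                            for d in V for c in V)


unnorm = [eliminated(a) for a in V]
posterior = [x / sum(unnorm) for x in unnorm]    # P(a | e=0)
print(posterior)
```

Here `h_B` is recomputed on every call for brevity; an actual elimination algorithm would tabulate it once and reuse it.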

slide-11
SLIDE 11

11

slide-12
SLIDE 12

12

“Moral” graph

[Figure: graph over the nodes A, B, C, D, E]

slide-13
SLIDE 13

13

slide-14
SLIDE 14
slide-15
SLIDE 15
slide-16
SLIDE 16
slide-17
SLIDE 17
slide-18
SLIDE 18

18

Bucket elimination

Algorithm BE-bel (Dechter 1996)

Elimination operator: $\sum_{b}$

bucket B: P(b|a) P(d|b,a) P(e|b,c)
bucket C: P(c|a) $h^{B}(a,d,c,e)$
bucket D: $h^{C}(a,d,e)$
bucket E: e=0 $h^{D}(a,e)$
bucket A: P(a) $h^{E}(a)$

Result: P(a|e=0)

W*=4 “induced width” (max clique size)
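The bucket schematic can be sketched generically as follows. This is a minimal illustration under stated assumptions, not Dechter's reference implementation: factors are `(scope, table)` pairs, domains are binary, the CPT numbers are made up, and the names `multiply_and_sum`, `be_bel`, and `cpt` are hypothetical.

```python
from itertools import product


def multiply_and_sum(factors, var, domain=(0, 1)):
    """Multiply the factors of one bucket and sum out `var`."""
    scope = sorted({v for s, _ in factors for v in s} - {var})
    table = {}
    for vals in product(domain, repeat=len(scope)):
        env = dict(zip(scope, vals))
        total = 0.0
        for x in domain:
            env[var] = x
            term = 1.0
            for s, t in factors:
                term *= t[tuple(env[v] for v in s)]
            total += term
        table[vals] = total
    return tuple(scope), table


def be_bel(factors, ordering, domain=(0, 1)):
    """Process buckets from the last variable of `ordering` down to the second;
    return the unnormalized belief over the first variable."""
    pos = {v: i for i, v in enumerate(ordering)}
    buckets = {v: [] for v in ordering}
    for f in factors:
        # A factor goes into the bucket of its latest variable in the ordering.
        buckets[max(f[0], key=pos.get)].append(f)
    for var in reversed(ordering[1:]):
        h = multiply_and_sum(buckets[var], var, domain)
        target = max(h[0], key=pos.get) if h[0] else ordering[0]
        buckets[target].append(h)
    result = []
    for x in domain:
        term = 1.0
        for s, t in buckets[ordering[0]]:
            term *= t[tuple(x for _ in s)]   # scope is () or (first variable,)
        result.append(term)
    return result


def cpt(child, parents, p_one):
    """Binary CPT factor; p_one[i] = P(child=1 | i-th parent assignment)."""
    table = {}
    for i, pa in enumerate(product((0, 1), repeat=len(parents))):
        table[(0,) + pa] = 1.0 - p_one[i]
        table[(1,) + pa] = p_one[i]
    return (child,) + tuple(parents), table


factors = [
    cpt("A", [], [0.3]),
    cpt("B", ["A"], [0.2, 0.7]),
    cpt("C", ["A"], [0.6, 0.1]),
    cpt("D", ["B", "A"], [0.1, 0.5, 0.4, 0.9]),
    cpt("E", ["B", "C"], [0.8, 0.3, 0.5, 0.2]),
    (("E",), {(0,): 1.0, (1,): 0.0}),        # evidence e = 0 as an indicator
]
bel = be_bel(factors, ["A", "E", "D", "C", "B"])   # buckets processed B, C, D, E
posterior = [x / sum(bel) for x in bel]            # P(a | e=0)
print(posterior)
```

With the ordering (A, E, D, C, B), the messages created along the way have the scopes of h^B, h^C, h^D, and h^E in the schematic.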

slide-19
SLIDE 19

19

BE-BEL

slide-20
SLIDE 20

Student Network example

[Figure: Bayesian network over Intelligence, Difficulty, Grade, Letter, SAT, Job, Apply]

  • P(J) = ?

slide-21
SLIDE 21

21

[Figure: the graph over A, B, C, D, E drawn along two different orderings]

slide-22
SLIDE 22

22

Complexity of elimination

$O(n \cdot \exp(w^*(d)))$

$w^*(d)$ = the induced width of the moral graph along ordering $d$

The effect of the ordering:

$w^*(d_1) = 4$, $w^*(d_2) = 2$

[Figure: “moral” graph over the nodes A, B, C, D, E, and the same graph drawn along the orderings $d_1$ and $d_2$]
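The effect of the ordering can be checked mechanically. The sketch below computes the induced width of a graph along an ordering. Two things are assumptions: the edge list is the moral graph of the running example P(a)P(b|a)P(c|a)P(d|b,a)P(e|b,c), and the orderings d1 = (A, E, D, C, B) and d2 = (A, B, C, D, E) are one reading of the garbled figure, with the last variable processed first.

```python
def induced_width(edges, ordering):
    """Induced width of an undirected graph along `ordering`.

    Nodes are processed from last to first; the earlier neighbours of each
    node are connected to one another, and the width is the largest number
    of earlier neighbours encountered.
    """
    adj = {v: set() for v in ordering}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    pos = {v: i for i, v in enumerate(ordering)}
    width = 0
    for v in reversed(ordering):
        earlier = {u for u in adj[v] if pos[u] < pos[v]}
        width = max(width, len(earlier))
        for u in earlier:
            adj[u] |= earlier - {u}      # add the induced (fill) edges
    return width


# Moral graph of the example network; B-C is the edge added by moralizing E.
MORAL = [("A", "B"), ("A", "C"), ("A", "D"), ("B", "D"),
         ("B", "E"), ("C", "E"), ("B", "C")]
d1 = ["A", "E", "D", "C", "B"]
d2 = ["A", "B", "C", "D", "E"]
print(induced_width(MORAL, d1), induced_width(MORAL, d2))
```

Under these assumptions the two orderings reproduce the widths 4 and 2 stated on the slide.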

slide-23
SLIDE 23

23

More accurately: $O(r \cdot \exp(w^*(d)))$, where r is the number of CPTs. For Bayesian networks r = n. For Markov networks?

BE-BEL

slide-24
SLIDE 24

24

slide-25
SLIDE 25

25

The impact of observations

[Figure panels: induced graph, ordered graph, ordered conditioned graph]

slide-26
SLIDE 26

26

Use the ancestral graph only

BE-BEL

[Figure: “moral” graph over the nodes A, B, C, D, E]

slide-27
SLIDE 27
slide-28
SLIDE 28
slide-29
SLIDE 29
slide-30
SLIDE 30
slide-31
SLIDE 31
slide-32
SLIDE 32

32

Probabilistic Inference Tasks

  • Belief updating:

$BEL(X_i) = P(X_i = x_i \mid \text{evidence})$

  • Finding the most probable explanation (MPE)

$x^* = \arg\max_{x} P(x, e)$

  • Finding the maximum a-posteriori (MAP) hypothesis

$(a_1^*, \ldots, a_k^*) = \arg\max_{a} \sum_{X/A} P(x, e)$, where $A \subseteq X$ are the hypothesis variables

  • Finding the maximum-expected-utility (MEU) decision

$(d_1^*, \ldots, d_k^*) = \arg\max_{d} \sum_{X/D} P(x, e)\, U(x)$, where $D \subseteq X$ are the decision variables and $U(x)$ is the utility function

slide-33
SLIDE 33

33

Finding $MPE = \max_{x} P(x)$

Algorithm elim-mpe (Dechter 1996)

$\sum$ is replaced by $\max$:

$MPE = \max_{a,e,d,c,b} P(a)\,P(c \mid a)\,P(b \mid a)\,P(d \mid a,b)\,P(e \mid b,c)$

Elimination operator: $\max_{b}$

bucket B: P(b|a) P(d|b,a) P(e|b,c)
bucket C: P(c|a) $h^{B}(a,d,c,e)$
bucket D: $h^{C}(a,d,e)$
bucket E: e=0 $h^{D}(a,e)$
bucket A: P(a) $h^{E}(a)$

Result: MPE

W*=4 “induced width” (max clique size)

slide-34
SLIDE 34

34

Generating the MPE-tuple

B: P(b|a) P(d|b,a) P(e|b,c)
C: P(c|a) $h^{B}(a,d,c,e)$
D: $h^{C}(a,d,e)$
E: e=0 $h^{D}(a,e)$
A: P(a) $h^{E}(a)$

1. $a' = \arg\max_{a} P(a) \cdot h^{E}(a)$
2. $e' = 0$
3. $d' = \arg\max_{d} h^{C}(a', d, e')$
4. $c' = \arg\max_{c} P(c \mid a') \cdot h^{B}(a', d', c, e')$
5. $b' = \arg\max_{b} P(b \mid a') \cdot P(d' \mid b, a') \cdot P(e' \mid b, c')$

Return $(a', b', c', d', e')$
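The forward (max) pass and the five backward steps can be traced on the running example. This is a hedged sketch: binary domains and random CPT values are assumptions, and the h functions are recomputed on demand instead of being stored in buckets.

```python
from itertools import product
import random

random.seed(2)
V = (0, 1)


def cpt(n_parents):
    """Random binary CPT: parent tuple -> [P(x=0 | pa), P(x=1 | pa)]."""
    out = {}
    for pa_vals in product(V, repeat=n_parents):
        p0 = random.random()
        out[pa_vals] = [p0, 1.0 - p0]
    return out


Pa, Pb, Pc, Pd, Pe = cpt(0), cpt(1), cpt(1), cpt(2), cpt(2)


def pa(a):
    return Pa[()][a]


def pb(b, a):
    return Pb[(a,)][b]


def pc(c, a):
    return Pc[(a,)][c]


def pd(d, b, a):
    return Pd[(b, a)][d]


def pe(e, b, c):
    return Pe[(b, c)][e]


# Forward pass: max replaces sum in every bucket.
def hB(a, d, c, e):
    return max(pb(b, a) * pd(d, b, a) * pe(e, b, c) for b in V)


def hC(a, d, e):
    return max(pc(c, a) * hB(a, d, c, e) for c in V)


def hD(a, e):
    return max(hC(a, d, e) for d in V)


def hE(a):
    return hD(a, 0)                      # evidence e = 0


# Backward pass: steps 1-5 of the slide.
a1 = max(V, key=lambda a: pa(a) * hE(a))
e1 = 0
d1 = max(V, key=lambda d: hC(a1, d, e1))
c1 = max(V, key=lambda c: pc(c, a1) * hB(a1, d1, c, e1))
b1 = max(V, key=lambda b: pb(b, a1) * pd(d1, b, a1) * pe(e1, b, c1))

mpe_tuple = (a1, b1, c1, d1, e1)
mpe_value = pa(a1) * hE(a1)
print(mpe_tuple, mpe_value)
```

The value computed in bucket A equals the joint probability of the decoded tuple, which is what makes the backward pass sufficient.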

slide-35
SLIDE 35

35

slide-36
SLIDE 36

Algorithm BE-MPE

36

slide-37
SLIDE 37

37

slide-38
SLIDE 38

Algorithm BE-MAP

38

Variable ordering is restricted: max buckets should be processed after sum buckets.
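A two-variable toy case shows why the restriction matters: max and sum do not commute, so the summed variables must be eliminated before the maximized ones. The table `f` is a made-up unnormalized joint over a hypothesis variable A (rows) and a summed variable B (columns).

```python
# f[a][b]: made-up unnormalized joint over hypothesis A and summed variable B
f = [[0.4, 0.1],
     [0.1, 0.4]]

# Correct MAP value: sum out B first, then maximize over A.
map_value = max(sum(row) for row in f)                            # max_a sum_b f(a,b)

# Wrong order: maximizing over A inside the sum only yields an upper bound.
swapped = sum(max(f[a][b] for a in range(2)) for b in range(2))   # sum_b max_a f(a,b)

print(map_value, swapped)
```

Here the correct value is about 0.5 while the swapped order gives about 0.8, so processing a max bucket before a sum bucket would overestimate the MAP value.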

slide-39
SLIDE 39

39

More accurately: $O(r \cdot \exp(w^*(d)))$, where r is the number of CPTs. For Bayesian networks r = n. For Markov networks?

slide-40
SLIDE 40

40

Finding small induced-width

  • NP-complete
  • A tree has induced-width of ?
  • Greedy algorithms:
      • Min width
      • Min induced-width
      • Max-cardinality
      • Fill-in (thought to be the best)
      • See anytime min-width (Gogate and Dechter)

slide-41
SLIDE 41

41

Min-width ordering

Proposition: algorithm min-width finds a min-width ordering of a graph

slide-42
SLIDE 42

Greedy orderings heuristics

42

min-induced-width (miw)
input: a graph G = (V, E), V = {v1, ..., vn}
output: an ordering of the nodes d = (v1, ..., vn)
  1. for j = n to 1 by -1 do
  2.   r ← a node in V with smallest degree
  3.   put r in position j
  4.   connect r's neighbors: E ← E ∪ {(vi, vj) | (vi, r) ∈ E, (vj, r) ∈ E}
  5.   remove r from the resulting graph: V ← V − {r}

min-fill (min-fill)
input: a graph G = (V, E), V = {v1, ..., vn}
output: an ordering of the nodes d = (v1, ..., vn)
  1. for j = n to 1 by -1 do
  2.   r ← a node in V whose neighbors need the fewest fill edges to become connected
  3.   put r in position j
  4.   connect r's neighbors: E ← E ∪ {(vi, vj) | (vi, r) ∈ E, (vj, r) ∈ E}
  5.   remove r from the resulting graph: V ← V − {r}

Theorem: A graph is a tree iff it has both width and induced-width of 1.
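Both greedy procedures share the same loop and differ only in how the next node is scored, which the following sketch makes explicit. The function names, the alphabetical tie-breaking, and the example edge list (the moral graph of the running five-variable network) are assumptions. The function also returns the induced width of the ordering it builds, since that is just the largest neighbour set seen at removal time.

```python
def greedy_ordering(nodes, edges, score):
    """Fill positions n..1: pick the best-scoring node, connect its
    neighbours, remove it. Returns (ordering, induced width of ordering)."""
    adj = {v: set() for v in nodes}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    ordering = [None] * len(adj)
    width = 0
    for j in range(len(ordering) - 1, -1, -1):
        r = min(sorted(adj), key=lambda v: score(adj, v))   # ties: alphabetical
        ordering[j] = r
        nbrs = adj.pop(r)                # step 5: remove r from the graph
        width = max(width, len(nbrs))
        for u in nbrs:                   # step 4: connect r's neighbours
            adj[u] |= nbrs - {u}
            adj[u].discard(r)
    return ordering, width


def degree(adj, v):
    """min-induced-width score: current degree of v."""
    return len(adj[v])


def fill_edges(adj, v):
    """min-fill score: edges missing among v's current neighbours."""
    nbrs = sorted(adj[v])
    return sum(1 for i, a in enumerate(nbrs)
               for b in nbrs[i + 1:] if b not in adj[a])


MORAL = [("A", "B"), ("A", "C"), ("A", "D"), ("B", "D"),
         ("B", "E"), ("C", "E"), ("B", "C")]
print(greedy_ordering(list("ABCDE"), MORAL, degree))
print(greedy_ordering(list("ABCDE"), MORAL, fill_edges))
```

On a tree, the minimum-degree rule always removes a leaf, so the returned width is 1, consistent with the theorem above.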

slide-43
SLIDE 43

43

Different Induced-graphs

slide-44
SLIDE 44

Fall 2003 ICS 275A - Constraint Networks 44

Min-induced-width

slide-45
SLIDE 45


Min-fill algorithm

  • Prefers a node that adds the least number of fill-in arcs.
  • Empirically, fill-in is the best among the greedy algorithms (MW, MIW, MF, MC).

slide-46
SLIDE 46


Chordal graphs and max-cardinality ordering

A graph is chordal if every cycle of length at least 4 has a chord.

Finding w* over a chordal graph is easy using the max-cardinality ordering.

If G* is an induced graph, it is chordal.

K-trees are special chordal graphs. (A graph is a k-tree if all its maximal cliques are of size k+1; it is created recursively by connecting a new node to k earlier nodes that form a clique.)

Finding the max-clique in chordal graphs is easy (just enumerate all cliques in a max-cardinality ordering).

slide-47
SLIDE 47

47

Max-cardinality ordering

Figure 4.5 The max-cardinality (MC) ordering procedure.
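The figure itself is not recoverable from this transcript. As a hedged sketch of the usual max-cardinality rule (not a reproduction of Figure 4.5): start from some node, then repeatedly place the node with the most already-placed neighbours. Function names, alphabetical tie-breaking, and the example graph (the moral graph of the running example, which happens to be chordal) are assumptions.

```python
def build_adj(nodes, edges):
    """Adjacency sets of an undirected graph."""
    adj = {v: set() for v in nodes}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    return adj


def max_cardinality_ordering(nodes, edges):
    """Order nodes first to last; each next node has the largest number of
    neighbours among the nodes already placed (ties: alphabetical)."""
    adj = build_adj(nodes, edges)
    ordering, placed = [], set()
    while len(ordering) < len(adj):
        candidates = [v for v in sorted(adj) if v not in placed]
        v = max(candidates, key=lambda u: len(adj[u] & placed))
        ordering.append(v)
        placed.add(v)
    return ordering


def width_along(nodes, edges, ordering):
    """Width of the ordering: the largest number of earlier neighbours."""
    adj = build_adj(nodes, edges)
    pos = {v: i for i, v in enumerate(ordering)}
    return max(sum(1 for u in adj[v] if pos[u] < pos[v]) for v in ordering)


CHORDAL = [("A", "B"), ("A", "C"), ("A", "D"), ("B", "D"),
           ("B", "E"), ("C", "E"), ("B", "C")]
mc = max_cardinality_ordering(list("ABCDE"), CHORDAL)
print(mc, width_along(list("ABCDE"), CHORDAL, mc))
```

For a chordal graph, processing a max-cardinality ordering from last to first adds no fill edges, so the plain width computed here coincides with the induced width w*.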