Look-Ahead with Mini-Bucket Heuristics for MPE Rina Dechter 1 , - - PowerPoint PPT Presentation




slide-1
SLIDE 1

Look-Ahead with Mini-Bucket Heuristics for MPE

Rina Dechter (1), Kalev Kask (1), William Lam (1), Javier Larrosa (2)

(1) University of California, Irvine
(2) UPC Barcelona Tech

slide-2
SLIDE 2

Outline

  • Background
      • Graphical models, MPE, Branch-and-Bound
      • Look-ahead
      • Mini-bucket Heuristic
  • Bucket Error
  • Look-ahead with Bucket Error
  • Experiments
  • Conclusions

slide-3
SLIDE 3

Graphical Models and Finding an Optimal Assignment [Marinescu and Dechter 2008]

Functions (value for each assignment of the scope):

  function  scope   (0,0)  (0,1)  (1,0)  (1,1)
  f2        A,B     2      0      1      4
  f3        A,D     3      0      0      1
  f4        A,G     0      3      2      0
  f5        B,C     2      0      0      2
  f6        B,D     0      1      2      4
  f7        B,E     4      2      1      0
  f8        B,F     3      2      1      0
  f9        C,D     1      4      0      0
  f10       C,E     1      0      0      2

  f1(A): f1(0) = 3, f1(1) = 2

[Figure: primal graph, pseudo-tree, and AND/OR search space over the variables A, B, C, D, E, F, G]

Approach: Search

  • Depth-first AND/OR Branch and Bound (AOBB)
  • Heuristic: mini-bucket elimination (MBE) + variational cost-shifting [Ihler et al 2012, Otten et al 2012]
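To make the optimization task concrete, here is a minimal brute-force sketch over the ten tables listed on this slide. It assumes the tables are costs to be summed and minimized (the min-sum reading used later in the deck) and that all variables are binary; the helper names `table`, `cost`, and `brute_force_min` are illustrative and not from the slides.

```python
from itertools import product

def table(scope, values):
    """Map each 0/1 assignment of `scope` (lexicographic order) to its value."""
    return scope, dict(zip(product((0, 1), repeat=len(scope)), values))

# Tables f1..f10 exactly as listed on the slide.
FUNCTIONS = [
    table("A",  [3, 2]),          # f1
    table("AB", [2, 0, 1, 4]),    # f2
    table("AD", [3, 0, 0, 1]),    # f3
    table("AG", [0, 3, 2, 0]),    # f4
    table("BC", [2, 0, 0, 2]),    # f5
    table("BD", [0, 1, 2, 4]),    # f6
    table("BE", [4, 2, 1, 0]),    # f7
    table("BF", [3, 2, 1, 0]),    # f8
    table("CD", [1, 4, 0, 0]),    # f9
    table("CE", [1, 0, 0, 2]),    # f10
]
VARIABLES = "ABCDEFG"

def cost(assignment):
    """Sum of all functions under a full assignment {variable: value}."""
    return sum(tab[tuple(assignment[v] for v in scope)]
               for scope, tab in FUNCTIONS)

def brute_force_min():
    """Enumerate all 2^7 assignments and return (best cost, best assignment)."""
    best_cost, best_assignment = None, None
    for values in product((0, 1), repeat=len(VARIABLES)):
        assignment = dict(zip(VARIABLES, values))
        c = cost(assignment)
        if best_cost is None or c < best_cost:
            best_cost, best_assignment = c, assignment
    return best_cost, best_assignment

best_cost, best_assignment = brute_force_min()
```

Enumeration is only feasible because the example has seven binary variables; the search and elimination schemes on the following slides exist precisely to avoid it.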

slide-4
SLIDE 4

Depth-First Branch and Bound (DFBB)

[Figure: search tree over the variables, showing the explored solutions and a node n]

Each node n is a subproblem (defined by the current conditioning).

  • g(n) = cost to get to node n
  • h(n) = lower bound on the best solution in the sub-tree below n
  • f(n) = g(n) + h(n): estimate of the overall solution given the conditioning to node n

If f(n) is worse than the current best solution, then prune.
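The pruning rule can be sketched as follows. This is a minimal illustration on a hypothetical three-variable problem (the tables, the variable order, and the simple heuristic are assumptions, not the AOBB/MBE setup of the talk). The heuristic `h` sums, for every function not yet fully assigned, its smallest entry consistent with the current conditioning, which is a valid lower bound. The sketch also prunes on ties, since a node whose bound equals the incumbent cannot improve it.

```python
# Hypothetical toy problem: three binary variables, three pairwise cost tables.
VARIABLES = ["X", "Y", "Z"]
FUNCTIONS = [
    (("X", "Y"), {(0, 0): 2, (0, 1): 0, (1, 0): 1, (1, 1): 4}),
    (("Y", "Z"), {(0, 0): 3, (0, 1): 1, (1, 0): 2, (1, 1): 5}),
    (("X", "Z"), {(0, 0): 0, (0, 1): 3, (1, 0): 2, (1, 1): 0}),
]

def g(assignment):
    """Cost accumulated so far: functions whose scope is fully assigned."""
    return sum(t[tuple(assignment[v] for v in s)]
               for s, t in FUNCTIONS if all(v in assignment for v in s))

def h(assignment):
    """Lower bound on the remaining cost: each open function contributes
    its minimum over the entries consistent with the assignment."""
    total = 0
    for s, t in FUNCTIONS:
        if all(v in assignment for v in s):
            continue
        total += min(c for key, c in t.items()
                     if all(assignment.get(v, k) == k for v, k in zip(s, key)))
    return total

def dfbb():
    """Depth-first branch and bound; returns (cost, assignment, nodes expanded)."""
    best = {"cost": float("inf"), "assignment": None}
    expanded = 0

    def visit(assignment, depth):
        nonlocal expanded
        expanded += 1
        if depth == len(VARIABLES):
            best["cost"], best["assignment"] = g(assignment), dict(assignment)
            return
        var = VARIABLES[depth]
        for value in (0, 1):
            assignment[var] = value
            # f(n) = g(n) + h(n); descend only if it can beat the incumbent.
            if g(assignment) + h(assignment) < best["cost"]:
                visit(assignment, depth + 1)
            del assignment[var]

    visit({}, 0)
    return best["cost"], best["assignment"], expanded

best_cost, best_assignment, expanded = dfbb()
```

The tighter `h` is, the earlier `f(n)` reaches the incumbent and the more of the tree is cut, which is the motivation for improving the heuristic in the rest of the talk.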

slide-5
SLIDE 5

Look-Ahead in Search

  • Let $s_1, \dots, s_t$ be the child nodes of $s$ in the search space, let $w(s, s_i)$ be the weight of the arc from $s$ to $s_i$, and let $h^{lh(d)}$ be the d-level look-ahead function of $s$. Then:

$$h^{lh(1)}(s) = \min_{s_i \in child(s)} \{ w(s, s_i) + h(s_i) \}$$

$$h^{lh(d)}(s) = \min_{s_i \in child(s)} \{ w(s, s_i) + h^{lh(d-1)}(s_i) \}$$

  • The (1-level) residual: $res_h(s) = h^{lh(1)}(s) - h(s)$
  • Can be viewed as a search problem over the next d levels
  • Our focus: Can we cost-effectively improve our heuristic with look-ahead?

[Figure: node s with heuristic value h(s) and children s1, s2, ..., st; the arcs carry weights w(s,s1), w(s,s2), ..., w(s,st) and the children carry heuristic values h(s1), h(s2), ..., h(st)]
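The recursion can be written down directly. The sketch below uses a hypothetical explicit search tree (`CHILDREN`) and hypothetical heuristic values (`H`); it treats the 0-level look-ahead as h itself and, as an added assumption, falls back to h at nodes without children.

```python
# Hypothetical search tree: CHILDREN[s] lists (child, arc weight) pairs.
CHILDREN = {
    "s":  [("s1", 1.0), ("s2", 2.0)],
    "s1": [("s11", 4.0), ("s12", 3.0)],
    "s2": [("s21", 1.0)],
}
# Hypothetical heuristic values h(s).
H = {"s": 2.0, "s1": 1.5, "s2": 0.5, "s11": 0.0, "s12": 0.0, "s21": 0.0}

def lookahead(s, d):
    """d-level look-ahead value of s (h itself for d = 0 or at a leaf)."""
    kids = CHILDREN.get(s, [])
    if d == 0 or not kids:
        return H[s]
    return min(w + lookahead(child, d - 1) for child, w in kids)

def residual(s):
    """1-level residual: how much one level of look-ahead improves h(s)."""
    return lookahead(s, 1) - H[s]
```

In this toy tree, one level of look-ahead raises the value at `s` from 2.0 to 2.5 and two levels raise it to 3.0. The cost is that each extra level multiplies the number of nodes visited by the branching factor, which is why the talk asks when look-ahead is worth paying for.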

slide-6
SLIDE 6

Bucket and Mini-Bucket Elimination

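[Figure: primal graph and pseudo-tree over the variables A, B, C, D, E, F, G]

Bucket Elimination (BE) (Dechter 1999)

  • Solves the min-sum problem by eliminating variables one at a time.
  • Complexity: exponential in the induced width w* of the underlying primal graph.

Buckets in BE (functions placed in the bucket → message the bucket generates):

  Bucket D: f(A,D), f(B,D), f(C,D) → λ_D(A,B,C)
  Bucket E: f(B,E), f(C,E) → λ_E(B,C)
  Bucket C: f(B,C) → λ_C(A,B)
  Bucket G: f(A,G), f(F,G) → λ_G(A,F)
  Bucket F: f(B,F) → λ_F(A,B)
  Bucket B: f(A,B) → λ_B(A)
  Bucket A: f(A)

Mini-Bucket Elimination (MBE) (Dechter and Rish 2001)

  • We can approximate BE by solving a relaxation created by duplicating variables to bound the w* by a parameter known as the i-bound.

Mini-buckets in MBE (bucket D is split in two, duplicating D):

  Bucket D: f(B,D), f(C,D) → λ_D(B,C)
  Bucket D: f(A,D) → λ_D(A)
  Bucket E: f(B,E), f(C,E) → λ_E(B,C)
  Bucket C: f(B,C) → λ_C(B)
  Bucket G: f(A,G), f(F,G) → λ_G(A,F)
  Bucket F: f(B,F) → λ_F(A,B)
  Bucket B: f(A,B) → λ_B(A)
  Bucket A: f(A)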
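As a rough illustration of "eliminating variables one at a time", the following sketch runs exact bucket elimination for min-sum on a hypothetical three-variable problem. The tables and the names `eliminate` and `bucket_elimination` are assumptions for illustration; this is not the authors' implementation.

```python
from itertools import product

# Hypothetical toy problem: binary variables, tables keyed by scope assignment.
FUNCTIONS = [
    (("X", "Y"), {(0, 0): 2, (0, 1): 0, (1, 0): 1, (1, 1): 4}),
    (("Y", "Z"), {(0, 0): 3, (0, 1): 1, (1, 0): 2, (1, 1): 5}),
    (("X", "Z"), {(0, 0): 0, (0, 1): 3, (1, 0): 2, (1, 1): 0}),
]

def eliminate(functions, var, domain=(0, 1)):
    """Process the bucket of `var`: sum its functions, minimize `var` out."""
    bucket = [f for f in functions if var in f[0]]
    rest = [f for f in functions if var not in f[0]]
    # Scope of the outgoing message: every other variable in the bucket.
    scope = sorted({v for s, _ in bucket for v in s if v != var})
    message = {}
    for values in product(domain, repeat=len(scope)):
        context = dict(zip(scope, values))
        message[values] = min(
            sum(t[tuple({**context, var: x}[v] for v in s)] for s, t in bucket)
            for x in domain
        )
    return rest + [(tuple(scope), message)]

def bucket_elimination(functions, order):
    """Eliminate the variables in `order`; return the optimal min-sum cost."""
    for var in order:
        functions = eliminate(functions, var)
    # All remaining functions are constants with an empty scope.
    return sum(t[()] for _, t in functions)

optimum = bucket_elimination(FUNCTIONS, ["Z", "Y", "X"])
```

The message table has one entry per assignment of its scope, so its size grows exponentially with the number of variables in the bucket. MBE caps that size by splitting a bucket whenever its scope would exceed the i-bound.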

slide-7
SLIDE 7

Bucket and Mini-Bucket Elimination


$$\min_D [f(A,D) + f(B,D) + f(C,D)] \ge \min_D [f(A,D)] + \min_D [f(B,D) + f(C,D)]$$

$$\lambda_D(A,B,C) \ge \lambda_D(A) + \lambda_D(B,C)$$

  • Bucket Error: $Err_D(A,B,C) = \lambda_D(A,B,C) - (\lambda_D(A) + \lambda_D(B,C))$
  • Captures the local error of the bucket.
  • When a bucket is not partitioned, Err is trivially zero.
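A worked instance of the bucket error for bucket D, using the tables f3(A,D), f6(B,D), and f9(C,D) from the earlier slide. This is a sketch assuming binary variables and the partition {f(A,D)}, {f(B,D), f(C,D)} shown above; the Python names are illustrative.

```python
from itertools import product

DOMAIN = (0, 1)
f_AD = {(0, 0): 3, (0, 1): 0, (1, 0): 0, (1, 1): 1}   # f3(A,D)
f_BD = {(0, 0): 0, (0, 1): 1, (1, 0): 2, (1, 1): 4}   # f6(B,D)
f_CD = {(0, 0): 1, (0, 1): 4, (1, 0): 0, (1, 1): 0}   # f9(C,D)

# Exact message of bucket D: minimize the full sum over D.
lam_ABC = {
    (a, b, c): min(f_AD[a, d] + f_BD[b, d] + f_CD[c, d] for d in DOMAIN)
    for a, b, c in product(DOMAIN, repeat=3)
}

# Mini-bucket messages: each mini-bucket minimizes over its own copy of D.
lam_A = {a: min(f_AD[a, d] for d in DOMAIN) for a in DOMAIN}
lam_BC = {
    (b, c): min(f_BD[b, d] + f_CD[c, d] for d in DOMAIN)
    for b, c in product(DOMAIN, repeat=2)
}

# Bucket error: exact message minus the sum of the mini-bucket messages.
err = {
    (a, b, c): lam_ABC[a, b, c] - (lam_A[a] + lam_BC[b, c])
    for a, b, c in product(DOMAIN, repeat=3)
}
```

With these tables the error is 3, 1, 3, 2 for A = 0 (over B,C = 00, 01, 10, 11) and 0 everywhere for A = 1, so the relaxation is loose only in part of the table. The error is never negative, in line with the inequality above.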

slide-8
SLIDE 8

Mini-Bucket Errors and the i-bound

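[Figure: primal graph over the variables A, B, C, D, E, F, G]

i-bound=3:

$$Err_D(A,B,C) = \lambda_D(A,B,C) - (\lambda_D(A) + \lambda_D(B,C))$$

  Bucket D: f(B,D), f(C,D) → λ_D(B,C)
  Bucket D: f(A,D) → λ_D(A)
  Bucket E: f(B,E), f(C,E) → λ_E(B,C)
  Bucket C: f(B,C) → λ_C(B)
  Bucket G: f(A,G), f(F,G) → λ_G(A,F)
  Bucket F: f(B,F) → λ_F(A,B)
  Bucket B: f(A,B) → λ_B(A)
  Bucket A: f(A)

i-bound=2:

$$Err_D(A,B,C) = \lambda_D(A,B,C) - (\lambda_D(A) + \lambda_D(B) + \lambda_D(C))$$

$$Err_E(B,C) = \lambda_E(B,C) - (\lambda_E(B) + \lambda_E(C))$$

$$Err_G(A,F) = \lambda_G(A,F) - (\lambda_G(A) + \lambda_G(F))$$

  Bucket D: f(A,D) → λ_D(A)
  Bucket D: f(B,D) → λ_D(B)
  Bucket D: f(C,D) → λ_D(C)
  Bucket E: f(B,E) → λ_E(B)
  Bucket E: f(C,E) → λ_E(C)
  Bucket C: f(B,C) → λ_C(B)
  Bucket G: f(A,G) → λ_G(A)
  Bucket G: f(F,G) → λ_G(F)
  Bucket F: f(B,F) → λ_F(B)
  Bucket B: f(A,B) → λ_B(A)
  Bucket A: f(A)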
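The slides do not say how a bucket is split into mini-buckets. The sketch below uses a first-fit greedy rule, which is one common choice and an assumption here, and reads the i-bound as the maximum number of variables in a mini-bucket's combined scope (the reading that matches the partitions shown for i-bound=3 and i-bound=2). The function name `partition_bucket` is hypothetical.

```python
def partition_bucket(scopes, i_bound):
    """First-fit partition of a bucket's function scopes into mini-buckets
    whose combined scope has at most `i_bound` variables."""
    mini_buckets = []  # each entry: (set of variables, list of member scopes)
    for scope in scopes:
        for variables, members in mini_buckets:
            if len(variables | set(scope)) <= i_bound:
                variables.update(scope)   # grow this mini-bucket in place
                members.append(scope)
                break
        else:
            # No existing mini-bucket can take the function: open a new one.
            mini_buckets.append((set(scope), [scope]))
    return [members for _, members in mini_buckets]

bucket_D = [("B", "D"), ("C", "D"), ("A", "D")]
split_3 = partition_bucket(bucket_D, i_bound=3)
split_2 = partition_bucket(bucket_D, i_bound=2)
```

With this input order, `split_3` groups f(B,D) with f(C,D) and leaves f(A,D) alone, as in the i-bound=3 diagram, while `split_2` puts every function in its own mini-bucket. First-fit is order dependent: feeding f(A,D) first would group it with f(B,D) instead, which gives a different relaxation and a different bucket error.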

slide-9
SLIDE 9

Bucket Error Evaluation (BEE)

slide-10
SLIDE 10

Complexity of BEE

$psw_j$: the pseudo-width of bucket j, i.e. the number of variables in the bucket at the time of processing.

slide-11
SLIDE 11

Residual and Bucket Error

slide-12
SLIDE 12

Minimal Look-ahead Subtree

[Figure: pseudo-tree over the variables A, B, C, F, G, D, E, H, I. Red nodes are look-ahead relevant (relative error above a certain threshold).]

Look-ahead subtrees from A, by depth:

  • Depth 1: A, B
  • Depth 3: A, B, F, G
  • Depth 5: A, B, C, F, G, D, H
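One way to compute such a pruned subtree is sketched below. The tree shape and the set of relevant nodes are hypothetical (the figure itself is not recoverable from the text); they were chosen so that the result reproduces the three subtrees listed above. The function name `lookahead_subtree` is illustrative.

```python
# Hypothetical pseudo-tree: CHILDREN[node] lists the node's children.
CHILDREN = {
    "A": ["B"],
    "B": ["C", "F"],
    "C": ["D", "E"],
    "F": ["G"],
    "D": ["H"],
    "E": ["I"],
}
# Hypothetical look-ahead relevant ("red") nodes.
RELEVANT = {"B", "G", "H"}

def lookahead_subtree(children, relevant, root, depth):
    """Nodes within `depth` levels of `root` that lie on a path from `root`
    to some relevant node; the root itself is always included."""
    def visit(node, remaining):
        kept = set()
        if remaining > 0:
            for child in children.get(node, []):
                kept |= visit(child, remaining - 1)
        # Keep a node only if it is relevant or leads to a relevant node.
        if kept or node in relevant:
            kept.add(node)
        return kept
    return visit(root, depth) | {root}

depth_1 = lookahead_subtree(CHILDREN, RELEVANT, "A", 1)
depth_3 = lookahead_subtree(CHILDREN, RELEVANT, "A", 3)
depth_5 = lookahead_subtree(CHILDREN, RELEVANT, "A", 5)
```

Branches that contain no relevant node within the depth limit (E and I in this toy tree) are dropped, so look-ahead spends no effort where the bucket error indicates the heuristic is already accurate.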

slide-13
SLIDE 13

Sample of a part of a pseudotree (pedigree40)

[Figure: part of the pedigree40 pseudo-tree with depth = 5, threshold = 0; labeled nodes: X406, X629, X410]

slide-14
SLIDE 14

Sample of a part of a pseudotree (pedigree40)

[Figure: part of the pedigree40 pseudo-tree with depth = 3, threshold = 0; labeled nodes: X406, X629, X410, X627, X834, X833, X836, X835, X408]

slide-15
SLIDE 15

Sample of a part of a pseudotree (pedigree40)

[Figure: part of the pedigree40 pseudo-tree with depth = 5, threshold = 0; labeled nodes: X406, X629, X410, X627, X834, X833, X836, X835, X408]

slide-16
SLIDE 16

Sample of a part of a pseudotree (pedigree40)

[Figure: part of the pedigree40 pseudo-tree with depth ≥ 3, threshold = 9.5; labeled nodes: X406, X629, X410, X834, X833, X408]

slide-17
SLIDE 17

Bucket-Errors: Across i-bounds (pedigree40)

slide-18
SLIDE 18

Experiments

  • AOBB on the context-minimal AND/OR graph
  • MBE-MM heuristic with different i-bounds
  • Time limit: 6 hours
  • MBE memory: 4GB
  • Pruned look-ahead trees with BEE
      • Compute error functions based on sampling at most 10^5 entries. (Exact if there are fewer entries.)
      • Error threshold of 0.01.
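The sampling rule can be sketched as follows. The slide does not say how sampled entries are summarized; averaging them is an assumption made here, as are the function name `average_error`, its parameters, and the uniform sampling of table entries.

```python
import random
from itertools import product

def average_error(err, domains, max_samples=10**5, seed=0):
    """Mean of err(*entry) over the table spanned by `domains`.

    Exact when the table has at most `max_samples` entries; otherwise an
    estimate from `max_samples` uniformly sampled entries.
    """
    size = 1
    for domain in domains:
        size *= len(domain)
    if size <= max_samples:
        entries = product(*domains)       # enumerate the whole table
        count = size
    else:
        rng = random.Random(seed)
        entries = (tuple(rng.choice(d) for d in domains)
                   for _ in range(max_samples))
        count = max_samples
    return sum(err(*entry) for entry in entries) / count

# Exact case: a hypothetical 2x2 error table with entries 0, 1, 1, 2.
exact = average_error(lambda a, b: a + b, [(0, 1), (0, 1)])
```

A bucket whose estimated error falls below the threshold would then be treated as not look-ahead relevant. Capping the number of sampled entries keeps this preprocessing cost bounded even for buckets with very large scopes.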
slide-19
SLIDE 19

Experiments

 Benchmark Statistics

slide-20
SLIDE 20

Experiments

  pedigree7            i=5                 i=8
  look-ahead depth     time     nodes      time     nodes
  none                 1262     826K       35       23K
  1                    912      564K       20       13K
  3                    691      311K       12       6K
  6                    300      66K        13       2K

  lf3_11_53            i=17                i=18
  look-ahead depth     time     nodes      time     nodes
  none                 1042 7   6730K      4349     2809K
  1                    8611     4875K      3653     2116K
  3                    5481     1674K      2750     901K
  6                    2014 7   583K       1091 8   323K

slide-21
SLIDE 21

Experiments: Summary over Instances


slide-22
SLIDE 22

Experiments: Summary over Instances

slide-23
SLIDE 23

Conclusion

  • We introduced the notion of bucket error to estimate the accuracy of the MBE heuristic.
  • We can make look-ahead cost-effective for MBE heuristics using bucket errors.
  • Future work: applying the techniques described here to best-first search and working towards anytime algorithms that produce lower and upper bounds.

slide-24
SLIDE 24

Thanks!