Look-Ahead with Mini-Bucket Heuristics for MPE
Rina Dechter¹, Kalev Kask¹, William Lam¹, Javier Larrosa²
¹University of California, Irvine; ²UPC Barcelona Tech
Outline
Background: graphical models, MPE, Branch-and-Bound
Cost functions over binary variables. Each row gives the values (x, y) of a function's two arguments, in the order listed in its header:

x y | f2(A,B)  f3(A,D)  f4(A,G)  f5(B,C)  f6(B,D)  f7(B,E)  f8(B,F)  f9(C,D)  f10(C,E)
0 0 |    2        3        0        2        0        4        3        1        1
0 1 |    0        0        3        0        1        2        2        4        0
1 0 |    1        0        2        0        2        1        1        0        0
1 1 |    4        1        0        2        4        0        0        0        2
[Figure: primal graph, pseudo-tree, and AND/OR search space over the variables A, B, C, F, G, D, E]
Approach:
Search: AND/OR Branch and Bound (AOBB)
Heuristic: Mini-Bucket Elimination (MBE) + variational cost-shifting [Ihler et al. 2012, Otten et al. 2012]
f1(A): f1(0) = 3, f1(1) = 2
[Figure: search tree showing the explored region and the solutions found]
Each node n is a subproblem (defined by the current conditioning).
g(n) = cost to get to node n
h(n) = lower bound on the best solution in the sub-tree below n
f(n) = g(n) + h(n) = estimate (lower bound) of the best solution cost given the conditioning at node n
If f(n) is no better than the current best solution, prune n.
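The pruning rule above can be illustrated with a minimal sketch. This is a plain depth-first branch and bound over an OR search tree, not AOBB itself. It uses only the tables f1(A), f2(A,B) and f5(B,C) from the example as a small sub-problem. The heuristic `h` here (each function that is not yet fully assigned contributes its smallest entry) is an assumption chosen because it is an easy admissible lower bound; it is not the mini-bucket heuristic from the slides. All function and variable names are hypothetical.

```python
from itertools import product

# Cost tables taken from the example: f1(A), f2(A,B), f5(B,C).
FUNCS = [
    (("A",), {(0,): 3, (1,): 2}),
    (("A", "B"), {(0, 0): 2, (0, 1): 0, (1, 0): 1, (1, 1): 4}),
    (("B", "C"), {(0, 0): 2, (0, 1): 0, (1, 0): 0, (1, 1): 2}),
]
ORDER = ["A", "B", "C"]  # static variable ordering (assumption)


def g(assign):
    """Cost of every function whose scope is fully assigned."""
    return sum(table[tuple(assign[v] for v in scope)]
               for scope, table in FUNCS
               if all(v in assign for v in scope))


def h(assign):
    """Simple admissible bound: pending functions contribute their minimum."""
    return sum(min(table.values())
               for scope, table in FUNCS
               if not all(v in assign for v in scope))


def branch_and_bound():
    best = {"cost": float("inf"), "assign": None}
    stats = {"expanded": 0, "pruned": 0}

    def dfs(assign):
        f = g(assign) + h(assign)
        if f >= best["cost"]:          # f(n) no better than incumbent: prune
            stats["pruned"] += 1
            return
        if len(assign) == len(ORDER):  # full assignment: new incumbent
            best["cost"], best["assign"] = f, dict(assign)
            return
        stats["expanded"] += 1
        var = ORDER[len(assign)]
        for val in (0, 1):
            assign[var] = val
            dfs(assign)
            del assign[var]

    dfs({})
    return best, stats


best, stats = branch_and_bound()
print(best["cost"], stats)
```

On this sub-problem the search returns an optimal cost of 3 and prunes several nodes without expanding them; a stronger lower bound h would prune earlier.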
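Look-ahead heuristic of depth d at a node s with children s_1, …, s_t:
$$h^{lh(1)}(s) = \min_{s_i \in child(s)} \{ w(s, s_i) + h(s_i) \}$$
$$h^{lh(d)}(s) = \min_{s_i \in child(s)} \{ w(s, s_i) + h^{lh(d-1)}(s_i) \}$$
[Figure: node s with children s_1, s_2, …, s_t, arc weights w(s, s_i), and heuristic values h(s) and h(s_i)]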
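The depth-d look-ahead recursion can be sketched as a short recursive function. This is a minimal illustration under stated assumptions: the toy tree, its arc weights, and the base heuristic values are made up for the example, and the convention that a node without children simply returns its base heuristic is an assumption not stated in the slides. Depth 0 is taken to mean the base heuristic h itself, which is consistent with the depth-1 definition.

```python
# Hypothetical toy search tree: node -> list of (child, arc weight w(s, child)).
TREE = {
    "s": [("s1", 2), ("s2", 1)],
    "s1": [("a", 0), ("b", 3)],
    "s2": [("c", 4), ("d", 5)],
}
# Hypothetical base heuristic values h(n).
H = {"s": 1, "s1": 0, "s2": 1, "a": 1, "b": 0, "c": 0, "d": 0}


def children(node):
    """Children of a node with their arc weights (empty for leaves)."""
    return TREE.get(node, [])


def lookahead(node, depth):
    """Depth-`depth` look-ahead value h^{lh(depth)}(node).

    depth == 0 returns the base heuristic; a node without children also
    falls back to the base heuristic (assumption).
    """
    kids = children(node)
    if depth == 0 or not kids:
        return H[node]
    return min(w + lookahead(child, depth - 1) for child, w in kids)


for d in range(3):
    print(d, lookahead("s", d))
```

On this toy tree the values at s are 1, 2 and 3 for depths 0, 1 and 2: deeper look-ahead gives a tighter lower bound, at the cost of visiting more nodes per evaluation.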
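Bucket Elimination (BE) (Dechter 1999)
Variables: A, B, C, D, E, F, G. Ordering: A, B, C, F, G, D, E.
Each bucket lists its original functions and the message it generates:
bucket D: f(A,D), f(B,D), f(C,D) => λD(A,B,C)
bucket E: f(B,E), f(C,E) => λE(B,C)
bucket G: f(A,G), f(F,G) => λG(A,F)
bucket F: f(B,F) => λF(A,B)
bucket C: f(B,C) => λC(A,B)
bucket B: f(A,B) => λB(A)
bucket A: f(A)

Mini-Bucket Elimination (MBE) (Dechter and Rish 2001)
Bucket D is split into two mini-buckets; the other buckets keep the same functions as in BE:
bucket D, mini-bucket 1: f(A,D) => λD(A)
bucket D, mini-bucket 2: f(B,D), f(C,D) => λD(B,C)
bucket E: f(B,E), f(C,E) => λE(B,C)
bucket G: f(A,G), f(F,G) => λG(A,F)
bucket F: f(B,F) => λF(A,B)
bucket C: f(B,C) => λC(B)
bucket B: f(A,B) => λB(A)
bucket A: f(A)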
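Splitting bucket D can only lower the bound:
$$\min_D [f(A,D) + f(B,D) + f(C,D)] \ge \min_D [f(A,D)] + \min_D [f(B,D) + f(C,D)]$$
$$\lambda_D(A,B,C) \ge \lambda_D(A) + \lambda_D(B,C)$$
Bucket error:
$$Err_D(A,B,C) = \lambda_D(A,B,C) - (\lambda_D(A) + \lambda_D(B,C))$$
When a bucket is not partitioned into mini-buckets, Err is trivially zero.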
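As a worked example of the bucket error, the sketch below evaluates Err_D(A,B,C) for bucket D using the tables f3(A,D), f6(B,D) and f9(C,D) listed earlier in the deck. It assumes binary domains and the partition {f(A,D)} and {f(B,D), f(C,D)}; the variable names are hypothetical.

```python
from itertools import product

DOM = (0, 1)  # binary domains

# Tables copied from the example functions.
f_AD = {(0, 0): 3, (0, 1): 0, (1, 0): 0, (1, 1): 1}  # f3(A,D)
f_BD = {(0, 0): 0, (0, 1): 1, (1, 0): 2, (1, 1): 4}  # f6(B,D)
f_CD = {(0, 0): 1, (0, 1): 4, (1, 0): 0, (1, 1): 0}  # f9(C,D)

# Exact message: eliminate D from the whole bucket.
lam_exact = {
    (a, b, c): min(f_AD[a, d] + f_BD[b, d] + f_CD[c, d] for d in DOM)
    for a, b, c in product(DOM, repeat=3)
}

# Mini-bucket messages: eliminate D from each mini-bucket separately.
lam_A = {a: min(f_AD[a, d] for d in DOM) for a in DOM}
lam_BC = {
    (b, c): min(f_BD[b, d] + f_CD[c, d] for d in DOM)
    for b, c in product(DOM, repeat=2)
}

# Bucket error: exact message minus the sum of the mini-bucket messages.
err = {
    (a, b, c): lam_exact[a, b, c] - (lam_A[a] + lam_BC[b, c])
    for a, b, c in product(DOM, repeat=3)
}

for key in sorted(err):
    print(key, lam_exact[key], lam_A[key[0]] + lam_BC[key[1:]], err[key])
```

With these tables the error is 3, 1, 3, 2 for A = 0 (over (B,C) = 00, 01, 10, 11) and 0 everywhere for A = 1, so the mini-bucket bound is exact on half of the assignments and loose by up to 3 on the rest.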
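Bucket errors for two i-bounds (variables A, B, C, D, E, F, G)

i-bound=3: only bucket D is partitioned.
bucket D, mini-bucket 1: f(A,D) => λD(A)
bucket D, mini-bucket 2: f(B,D), f(C,D) => λD(B,C)
bucket E: f(B,E), f(C,E) => λE(B,C)
bucket G: f(A,G), f(F,G) => λG(A,F)
bucket F: f(B,F) => λF(A,B)
bucket C: f(B,C) => λC(B)
bucket B: f(A,B) => λB(A)
bucket A: f(A)
$$Err_D(A,B,C) = \lambda_D(A,B,C) - (\lambda_D(A) + \lambda_D(B,C))$$

i-bound=2: buckets D, E and G are partitioned.
bucket D: f(A,D) => λD(A); f(B,D) => λD(B); f(C,D) => λD(C)
bucket E: f(B,E) => λE(B); f(C,E) => λE(C)
bucket G: f(A,G) => λG(A); f(F,G) => λG(F)
bucket F: f(B,F) => λF(B)
bucket C: f(B,C) => λC(B)
bucket B: f(A,B) => λB(A)
bucket A: f(A)
$$Err_D(A,B,C) = \lambda_D(A,B,C) - (\lambda_D(A) + \lambda_D(B) + \lambda_D(C))$$
$$Err_E(B,C) = \lambda_E(B,C) - (\lambda_E(B) + \lambda_E(C))$$
$$Err_G(A,F) = \lambda_G(A,F) - (\lambda_G(A) + \lambda_G(F))$$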
pswj: the pseudo-width of bucket j is the number of variables in the bucket at the time of processing.
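The definition can be read as "count the distinct variables appearing in the scopes of everything in the bucket when it is processed". The sketch below is one hedged reading of that definition. The function name `pseudo_width` is hypothetical, and the contents of bucket C (its original function plus the messages it receives from buckets D and E) are reconstructed from the bucket schematics under the usual message-passing rule rather than stated explicitly in the slides.

```python
def pseudo_width(scopes):
    """Number of distinct variables over all function scopes in a bucket."""
    return len(set().union(*scopes))


# Bucket C under exact bucket elimination (assumed contents):
# f(B,C), λD(A,B,C), λE(B,C)
bucket_C_exact = [("B", "C"), ("A", "B", "C"), ("B", "C")]

# Bucket C under mini-bucket elimination with i-bound=3 (assumed contents):
# f(B,C), λD(B,C), λE(B,C); λD(A) bypasses this bucket
bucket_C_mbe = [("B", "C"), ("B", "C"), ("B", "C")]

print(pseudo_width(bucket_C_exact))  # 3
print(pseudo_width(bucket_C_mbe))    # 2
```

Under this reading, partitioning bucket D lowers the count for bucket C from 3 to 2, because the message that mentions A no longer passes through it.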
[Figure: pseudo-tree over the variables A, B, C, F, G, D, E, H, I. Red nodes are look-ahead relevant (relative error above a certain threshold).]
Look-ahead subtrees from A, by depth:
Depth 1: A, B
Depth 3: A, B, F, G
Depth 5: A, B, C, F, G, D, H
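[Figure: look-ahead subtrees on a benchmark instance; each panel is labeled by depth and threshold and shows the variables listed]
depth = 5, threshold = 0: X406, X629, X410
depth = 3, threshold = 0: X406, X629, X410, X627, X834, X833, X836, X835, X408
depth = 5, threshold = 0: X406, X629, X410, X627, X834, X833, X836, X835, X408
depth ≥ 3, threshold = 9.5: X406, X629, X410, X834, X833, X408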
instance     look-ahead depth |  time     nodes   |  time     nodes
pedigree7                     |  i=5              |  i=8
             none             |  1262     826K    |  35       23K
             1                |  912      564K    |  20       13K
             3                |  691      311K    |  12       6K
             6                |  300      66K     |  13       2K
lf3_11_53                     |  i=17             |  i=18
             none             |  1042 7   6730K   |  4349     2809K
             1                |  8611     4875K   |  3653     2116K
             3                |  5481     1674K   |  2750     901K
             6                |  2014 7   583K    |  1091 8   323K
significant