SLIDE 1 INFERENCE SCHEMES FOR M BEST SOLUTIONS FOR SOFT CSPS
Emma Rollón, Natalia Flerova and Rina Dechter
erollon@lsi.upc.edu, flerova@ics.uci.edu, dechter@ics.uci.edu
Universitat Politècnica de Catalunya; University of California, Irvine
SLIDE 2 Summary
Optimization problems:
  Finding the best solution
  Finding the m-best solutions
Applications of the m-best solutions:
  A set of diverse solutions is desired (e.g., haplotyping)
  Constraints are hard to formalize (e.g., portfolio management)
  Sensitivity analysis (e.g., biological sequence alignment)
SLIDE 3 Summary
Previous works on the m-best tasks:
  Compute the m-best solutions by successively computing the best solution, each time using a slightly different reformulation of the original problem.
    Lawler, 1972; Nilsson, 1998; Yanover and Weiss, 2004
  Compute the m-best solutions in a single pass of the algorithm, using message passing/propagation.
    Seroussi and Golmard, 1994; Elliot, 2007
SLIDE 4 Summary
Our contribution:
  We provide a formalization of the m-best task within the unifying framework of c-semirings, making many known inference schemes immediately applicable.
  In particular, we focus on Graphical Models and extend:
    Bucket Elimination (exact algorithm)
    Mini-Bucket Elimination (bounding algorithm)
  We show how to tighten the bound on the best solution using a bound on the m-best solutions.
SLIDE 5 1-best vs. m-best optimization
Variables: $X = \{X_1, X_2, X_3, \ldots, X_n\}$
Finite domain values: $D = \{D_1, D_2, \ldots, D_n\}$
Objective function: $F : \prod_{i=1}^{n} D_i \to A$, where A is a totally ordered set (<) of valuations
1-best optimization: find $F(t)$ such that $\forall t' \neq t : F(t) \le F(t')$
m-best optimization: find $F(t_1), \ldots, F(t_m)$ such that $\forall t' \neq t_i, t_j,\ 1 \le i < j \le m : F(t_i) \le F(t_j) \le F(t')$
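To make the two tasks concrete, the following minimal Python sketch enumerates every assignment of a tiny problem and reads off the best and the m best costs. The objective `toy_cost` and the helper `m_best_bruteforce` are hypothetical names invented for illustration, and the sketch assumes costs are minimized; it is not the method proposed in these slides.

```python
import heapq
from itertools import product

def m_best_bruteforce(domains, cost, m):
    """Return the m lowest-cost (cost, assignment) pairs by full enumeration."""
    scored = ((cost(t), t) for t in product(*domains))
    return heapq.nsmallest(m, scored)

# Hypothetical toy objective over three binary variables (an assumption).
def toy_cost(t):
    x1, x2, x3 = t
    return 3 * x1 + 2 * x2 + 4 * x3 + 5 * (x1 == x2)

domains = [(0, 1)] * 3
best = m_best_bruteforce(domains, toy_cost, 1)        # 1-best task
three_best = m_best_bruteforce(domains, toy_cost, 3)  # m-best task, m = 3
print(best)
print(three_best)
```

Enumeration is exponential in the number of variables, which is why the rest of the deck turns to inference schemes that exploit the structure of the objective.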
SLIDE 6 Graphical Model
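$X = \{x_1, \ldots, x_n\}$: a set of variables
$D = \{D_1, \ldots, D_n\}$: a set of domain values
$\{f_1, \ldots, f_e\}$: a set of local functions, $f_j : D_Y \to A$, where $Y \subseteq X$ is the scope of $f_j$
$\otimes$: combination operator over functions
Interaction graph: [Figure: graph over the nodes $x_1, x_2, x_3, \ldots, x_n$]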
SLIDE 7 Graphical Model
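Global view (objective function): $F(X) = \bigotimes_{k=1}^{e} f_k$
Reasoning task: $\Downarrow_X F(X)$
Particular instantiations:
  Task    A                               $\otimes$    $\Downarrow$
  WCSP    $\mathbb{N} \cup \{\infty\}$    $+$          $\min$
  MPE     $[0 \ldots 1]$                  $\times$     $\max$
  P(e)    $[0 \ldots 1]$                  $\times$     $\sum$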
SLIDE 8 Bucket Elimination
[Figure: bucket elimination (BE) schematic over variables x1, ..., x5. Loop: select a variable, apply combination, apply marginalization, produce the output.]
Task: $\Downarrow_X \left( \bigotimes_{k=1}^{e} f_k \right)$
Complete and correct: whenever the task can be defined over a semiring
[Shafer et al., Srivanas et al., Kohlas et al.]
SLIDE 9 Bucket Elimination
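[Figure: bucket elimination (BE) schematic over variables x1, ..., x5. Loop: select a variable, apply combination, apply marginalization, produce the output.]
Task: $\Downarrow_X \left( \bigotimes_{k=1}^{e} f_k \right)$
elim-opt (BE for WCSP): marginalization is min; output () = c1 = WCSP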
SLIDE 10 Bucket Elimination
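[Figure: bucket elimination (BE) schematic over variables x1, ..., x5. Loop: select a variable, apply combination, apply marginalization, produce the output.]
Task: $\Downarrow_X \left( \bigotimes_{k=1}^{e} f_k \right)$
elim-opt: marginalization is min; output () = c1 = WCSP
elim-m-opt: combination ??, marginalization ??; output () = {c1, ..., cm} = m-best WCSP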
SLIDE 11 Bucket Elimination
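[Figure: bucket elimination (BE) schematic over variables x1, ..., x5. Loop: select a variable, apply combination, apply marginalization, produce the output.]
Task: $\Downarrow_X \left( \bigotimes_{k=1}^{e} f_k \right)$
elim-opt: marginalization is min; output () = c1 = WCSP
  Each tuple is the best cost extension to x1
elim-m-opt: combination ??, marginalization ??; output () = {c1, ..., cm} = m-best WCSP
  Each tuple has to be the m-best cost extensions to x1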
SLIDE 12 From 1-best to m-best optimization
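WCSP vs. m-best WCSP (m = 2)
Operators: combination and min for WCSP; their m-best counterparts for m-best WCSP
Functions:
  WCSP: $f : l(Y) \to [1 \ldots]$, $f(t) = c_1$
  m-best WCSP: $f : l(Y) \to [1 \ldots]^m$, $f(t) = \{c_1, c_2\}$
Examples:
  WCSP: $f(t) = 6;\ g(t) = 5$ and $f(x=a) = 6;\ f(x=b) = 5$
  m-best WCSP: $f(t) = \{3, 5\};\ g(t) = \{1, 6\}$ and $f(x=a) = \{3, 5\};\ f(x=b) = \{1, 6\}$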
SLIDE 13 From 1-best to m-best optimization
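WCSP vs. m-best WCSP (m = 2)
Operators: combination and min for WCSP; their m-best counterparts for m-best WCSP
Functions:
  WCSP: $f : l(Y) \to [1 \ldots]$, $f(t) = c_1$
  m-best WCSP: $f : l(Y) \to [1 \ldots]^m$, $f(t) = \{c_1, c_2\}$
Combination:
  WCSP: $f(t) = 6;\ g(t) = 5 \Rightarrow f(t) + g(t) = 6 + 5 = 11$
  m-best WCSP: $f(t) = \{3, 5\};\ g(t) = \{1, 6\} \Rightarrow$ the 2 best of the sums $\{4, 9, 6, 11\}$ are $\{4, 6\}$
Marginalization:
  WCSP: $f(x=a) = 6;\ f(x=b) = 5 \Rightarrow \min_x f = \min\{6, 5\} = 5$
  m-best WCSP: $f(x=a) = \{3, 5\};\ f(x=b) = \{1, 6\} \Rightarrow$ the 2 best of $\{3, 5, 1, 6\}$ are $\{1, 3\}$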
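The two m-best operators on this slide can be sketched in a few lines of Python. The function names `combine_m` and `marginalize_m` are hypothetical, and the sketch takes the naive route of forming every candidate and keeping the m smallest, which is an assumption made for clarity rather than the procedure the slides use.

```python
from itertools import product

def combine_m(a, b, m):
    """m-best combination: the m smallest pairwise sums of two cost lists."""
    return sorted(x + y for x, y in product(a, b))[:m]

def marginalize_m(cost_lists, m):
    """m-best marginalization: the m smallest costs over all eliminated values."""
    return sorted(c for costs in cost_lists for c in costs)[:m]

# Values taken from the slide (m = 2).
combined = combine_m([3, 5], [1, 6], 2)            # sums are 4, 9, 6, 11
eliminated = marginalize_m([[3, 5], [1, 6]], 2)    # f(x=a) and f(x=b)
print(combined, eliminated)
```

With m = 1 both functions reduce to ordinary addition and min over single costs, which matches the WCSP column of the slide.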
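SLIDE 14 How to combine two ordered sets
S = {1, 3, 6}, T = {2, 4, 5}
Sums computed so far: 1 + 2 = 3
SLIDE 15 How to combine two ordered sets
S = {1, 3, 6}, T = {2, 4, 5}
Sums computed so far: 1 + 2 = 3
Selected: 1st best = 3
SLIDE 16 How to combine two ordered sets
S = {1, 3, 6}, T = {2, 4, 5}
Sums computed so far: 1 + 2 = 3, 1 + 4 = 5
Selected: 1st best = 3
SLIDE 17 How to combine two ordered sets
S = {1, 3, 6}, T = {2, 4, 5}
Sums computed so far: 1 + 2 = 3, 3 + 2 = 5, 1 + 4 = 5
Selected: 1st best = 3
SLIDE 18 How to combine two ordered sets
S = {1, 3, 6}, T = {2, 4, 5}
Sums computed so far: 1 + 2 = 3, 3 + 2 = 5, 1 + 4 = 5
Selected: 1st best = 3, 2nd best = 5
SLIDE 19 How to combine two ordered sets
S = {1, 3, 6}, T = {2, 4, 5}
Sums computed so far: 1 + 2 = 3, 3 + 2 = 5, 1 + 4 = 5, 1 + 5 = 6
Selected: 1st best = 3, 2nd best = 5
SLIDE 20 How to combine two ordered sets
S = {1, 3, 6}, T = {2, 4, 5}
Sums computed so far: 1 + 2 = 3, 3 + 2 = 5, 1 + 4 = 5, 3 + 4 = 7, 1 + 5 = 6
Selected: 1st best = 3, 2nd best = 5
SLIDE 21 How to combine two ordered sets
S = {1, 3, 6}, T = {2, 4, 5}
Sums computed so far: 1 + 2 = 3, 3 + 2 = 5, 1 + 4 = 5, 3 + 4 = 7, 1 + 5 = 6
Selected: 1st best = 3, 2nd best = 5, 3rd best = 5
SLIDE 22 How to combine two ordered sets
S = {1, 3, 6}, T = {2, 4, 5}
Sums computed so far: 1 + 2 = 3, 3 + 2 = 5, 1 + 4 = 5, 3 + 4 = 7, 1 + 5 = 6
Selected: 1st best = 3, 2nd best = 5, 3rd best = 5
SLIDE 23 How to combine two ordered sets
S = {1, 3, 6}, T = {2, 4, 5}
Sums computed so far: 1 + 2 = 3, 3 + 2 = 5, 1 + 4 = 5, 3 + 4 = 7, 1 + 5 = 6, 6 + 2 = 8
Selected: 1st best = 3, 2nd best = 5, 3rd best = 5
SLIDE 24 How to combine two ordered sets
S = {1, 3, 6}, T = {2, 4, 5}
Sums computed so far: 1 + 2 = 3, 3 + 2 = 5, 1 + 4 = 5, 3 + 4 = 7, 1 + 5 = 6, 6 + 2 = 8
Selected: 1st best = 3, 2nd best = 5, 3rd best = 5, 4th best = 6
SLIDE 25 How to combine two ordered sets
S = {1, 3, 6}, T = {2, 4, 5}
Sums computed so far: 1 + 2 = 3, 3 + 2 = 5, 1 + 4 = 5, 3 + 4 = 7, 1 + 5 = 6, 6 + 2 = 8
Selected: 1st best = 3, 2nd best = 5, 3rd best = 5, 4th best = 6
$O(m^2)$    $O(m \cdot \log(m+1))$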
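The incremental procedure walked through on these slides can be sketched with a priority queue: start from the pair of smallest elements and, each time a sum is selected, compute only the sums of its two neighbours. The function name `combine_sorted` and the use of Python's `heapq` are assumptions of this sketch. The queue grows by at most one entry per selected sum, which is one way to arrive at a bound of the form O(m log(m+1)); whether this is exactly the implementation behind the figure quoted on the slide is not stated in the source.

```python
import heapq

def combine_sorted(s, t, m):
    """Return the m smallest sums s[i] + t[j] of two ascending lists."""
    if not s or not t:
        return []
    heap = [(s[0] + t[0], 0, 0)]   # candidate sums with their index pair
    seen = {(0, 0)}                # index pairs already pushed
    result = []
    while heap and len(result) < m:
        total, i, j = heapq.heappop(heap)
        result.append(total)
        # Only the two neighbours of the selected pair become candidates.
        for ni, nj in ((i + 1, j), (i, j + 1)):
            if ni < len(s) and nj < len(t) and (ni, nj) not in seen:
                seen.add((ni, nj))
                heapq.heappush(heap, (s[ni] + t[nj], ni, nj))
    return result

four_best = combine_sorted([1, 3, 6], [2, 4, 5], 4)
print(four_best)
```

On the sets from the slides the sketch computes the same six sums as the walkthrough (3, 5, 5, 7, 6, 8) and never touches the remaining three pairs.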
SLIDE 26 Bucket Elimination
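[Figure: bucket elimination (BE) schematic over variables x1, ..., x5. Loop: select a variable, apply combination, apply marginalization, produce the output.]
Task: $\Downarrow_X \left( \bigotimes_{k=1}^{e} f_k \right)$
elim-opt: marginalization is min; output () = c1 = WCSP
elim-m-opt: marginalization is the m-best counterpart of min; output () = {c1, ..., cm} = m-best WCSP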
SLIDE 27 Bucket Elimination
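[Figure: bucket elimination (BE) schematic over variables x1, ..., x5. Loop: select a variable, apply combination, apply marginalization, produce the output.]
Task: $\Downarrow_X \left( \bigotimes_{k=1}^{e} f_k \right)$
elim-opt: marginalization is min; output () = c1 = WCSP
elim-m-opt: marginalization is the m-best counterpart of min; output () = {c1, ..., cm} = m-best WCSP
Correct and complete: the m-best problem can be formulated as a commutative semiring using the new operators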
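The point of this slide, that the same elimination loop serves both tasks once the operators are swapped, can be illustrated with a small generic sketch. Everything below is an assumption made for illustration: the function `bucket_elimination`, the table representation, the toy cost functions `f1` and `f2`, and the naive m-best operators. It is not the authors' implementation of elim-opt or elim-m-opt.

```python
from itertools import product

def bucket_elimination(domains, functions, order, combine, marginalize, unit):
    """Eliminate the variables in `order` and return the final valuation.

    functions: list of (scope, table); table maps value tuples to valuations.
    combine(a, b) merges two valuations, marginalize(list) eliminates one
    variable, and unit is the identity of combine.
    """
    pool = list(functions)
    for var in order:
        bucket = [f for f in pool if var in f[0]]
        pool = [f for f in pool if var not in f[0]]
        if not bucket:
            continue
        scope = sorted({v for s, _ in bucket for v in s if v != var})
        table = {}
        for values in product(*(domains[v] for v in scope)):
            assignment = dict(zip(scope, values))
            per_value = []
            for x in domains[var]:
                assignment[var] = x
                total = unit
                for s, tab in bucket:
                    total = combine(total, tab[tuple(assignment[v] for v in s)])
                per_value.append(total)
            table[values] = marginalize(per_value)
        pool.append((tuple(scope), table))
    result = unit
    for _, tab in pool:            # only empty-scope functions remain
        result = combine(result, tab[()])
    return result

# Hypothetical toy WCSP over binary variables a, b, c.
domains = {"a": [0, 1], "b": [0, 1], "c": [0, 1]}
f1 = (("a", "b"), {(0, 0): 4, (0, 1): 1, (1, 0): 2, (1, 1): 6})
f2 = (("b", "c"), {(0, 0): 3, (0, 1): 5, (1, 0): 7, (1, 1): 2})
order = ["c", "b", "a"]

# 1-best: valuations are single costs, operators are + and min.
best_cost = bucket_elimination(domains, [f1, f2], order, lambda x, y: x + y, min, 0)

# m-best: valuations are sorted tuples of at most M costs.
M = 3

def combine_m(x, y):
    return tuple(sorted(p + q for p, q in product(x, y))[:M])

def marginalize_m(valuations):
    return tuple(sorted(c for v in valuations for c in v)[:M])

def lift(function):
    scope, table = function
    return scope, {key: (cost,) for key, cost in table.items()}

m_best_costs = bucket_elimination(
    domains, [lift(f1), lift(f2)], order, combine_m, marginalize_m, (0,))
print(best_cost, m_best_costs)
```

The eight complete assignments of this toy problem cost 7, 9, 8, 3, 5, 7, 13 and 8, so the three best costs are 3, 5 and 7.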
SLIDE 28 Semirings
A commutative semiring is a triplet (A,⊗,⊕) whose operators satisfy three axioms:
  A1. The operation ⊕ is associative, commutative and idempotent, and there is an additive identity element called 0 such that a ⊕ 0 = a for all a ∈ A.
  A2. The operation ⊗ is also associative and commutative, and there is a multiplicative identity element called 1 such that a ⊗ 1 = a for all a ∈ A.
  A3. ⊗ distributes over ⊕, i.e., (a ⊗ b) ⊕ (a ⊗ c) = a ⊗ (b ⊕ c).
Example: the MPE task is defined over semiring K = (R,×,max), a CSP is defined over semiring K = ({0, 1}, ∧, ∨), and a Weighted CSP is defined over semiring K = (N ∪ {∞},+,min).
It was shown that the correctness of inference algorithms over a reasoning task P is ensured whenever P is defined over a semiring.
[Shafer et al., Srivanas et al., Kohlas et al.]
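As a sanity check on the three example semirings, the axioms A1 to A3 can be tested on a handful of sample elements. The helper `satisfies_axioms` and the sample values are hypothetical, the MPE samples are restricted to non-negative values (an assumption, since 0 is then the identity of max), and passing on samples is evidence rather than a proof.

```python
import math
from itertools import product

def satisfies_axioms(elements, times, plus, zero, one):
    """Check A1-A3 on every triple of sample elements."""
    for a, b, c in product(elements, repeat=3):
        ok = (
            plus(a, plus(b, c)) == plus(plus(a, b), c)              # A1 associative
            and plus(a, b) == plus(b, a)                            # A1 commutative
            and plus(a, a) == a                                     # A1 idempotent
            and plus(a, zero) == a                                  # A1 identity 0
            and times(a, times(b, c)) == times(times(a, b), c)      # A2 associative
            and times(a, b) == times(b, a)                          # A2 commutative
            and times(a, one) == a                                  # A2 identity 1
            and plus(times(a, b), times(a, c)) == times(a, plus(b, c))  # A3
        )
        if not ok:
            return False
    return True

# (sample elements, combination, marginalization, 0-element, 1-element)
SEMIRINGS = {
    "WCSP": ([0, 1, 2, 7, math.inf], lambda a, b: a + b, min, math.inf, 0),
    "MPE": ([0.0, 0.25, 0.5, 1.0], lambda a, b: a * b, max, 0.0, 1.0),
    "CSP": ([0, 1], lambda a, b: a & b, lambda a, b: a | b, 0, 1),
}

results = {name: satisfies_axioms(*spec) for name, spec in SEMIRINGS.items()}
print(results)
```

Replacing min by ordinary addition in the WCSP entry makes the check fail on idempotency, which shows that the choice of marginalization operator matters for A1.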
SLIDE 29 m-space and semirings
Consider an optimization problem: $F : \prod_{i=1}^{n} D_i \to A$
Let S be a subset of a set of valuations A. We define the set of ordered m-best elements of S as $S^m = \{s_1, \ldots, s_j\}$ such that $s_1 \le s_2 \le \ldots \le s_j$ and $\forall s \in S \setminus S^m : s_j \le s$, where j = m if |S| ≥ m and j = |S| otherwise.
m-space of A: denoted $A^m$, is the set of subsets of ordered m-best elements of A.
Operators $\otimes^m$ (m-best combination) and $\min^m$ (m-best marginalization) described above are defined over m-space.
Theorem 1: the triplet $(A^m, \otimes^m, \min^m)$ is a semiring, defining the m-best WCSP task.
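Part of what Theorem 1 asserts can be probed numerically. The sketch below draws random elements of the m-space and checks associativity, commutativity and distributivity of naive m-best operators. The names `min_m`, `combine_m` and `check_samples` are hypothetical, and the sketch keeps tied costs (as in the combination walkthrough, where 5 appears twice), which is an assumption about how ties are treated. Idempotency is deliberately not checked, and random testing is not a proof.

```python
import random
from itertools import product

M = 3  # number of best costs kept

def min_m(a, b, m=M):
    """m-best marginalization of two elements: m smallest of their union."""
    return tuple(sorted(a + b)[:m])

def combine_m(a, b, m=M):
    """m-best combination: m smallest pairwise sums."""
    return tuple(sorted(x + y for x, y in product(a, b))[:m])

def random_element(rng, m=M):
    """A sorted tuple of between 1 and m small integer costs."""
    size = rng.randint(1, m)
    return tuple(sorted(rng.randint(0, 9) for _ in range(size)))

def check_samples(trials=500, seed=0):
    rng = random.Random(seed)
    for _ in range(trials):
        a, b, c = (random_element(rng) for _ in range(3))
        assert min_m(a, b) == min_m(b, a)
        assert min_m(a, min_m(b, c)) == min_m(min_m(a, b), c)
        assert combine_m(a, b) == combine_m(b, a)
        assert combine_m(a, combine_m(b, c)) == combine_m(combine_m(a, b), c)
        # Distributivity: (a x b) + (a x c) == a x (b + c) in semiring terms.
        assert min_m(combine_m(a, b), combine_m(a, c)) == combine_m(a, min_m(b, c))
    return True

print(check_samples())
```

Distributivity is the property that lets elimination push the marginalization inside the combination, so it is the check most relevant to the correctness claim for bucket elimination.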
SLIDE 30 Bucket Elimination
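[Figure: bucket elimination (BE) schematic over variables x1, ..., x5. Loop: select a variable, apply combination, apply marginalization, produce the output.]
Task: $\Downarrow_X \left( \bigotimes_{k=1}^{e} f_k \right)$
elim-opt: marginalization is min; output () = c1 = WCSP
  Time: $O(n \cdot \exp(w^* + 1))$
  Space: $O(n \cdot \exp(w^*))$
elim-m-opt: marginalization is the m-best counterpart of min; output () = {c1, ..., cm} = m-best WCSP
  Time: $O(n \cdot m \cdot \exp(w^* + 1) \cdot \log(m + 1))$
  Space: $O(n \cdot m \cdot \exp(w^* + 1))$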
SLIDE 31 Mini-Bucket Elimination
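[Figure: mini-bucket elimination schematic, MBE(z = 3), over variables x1, ..., x5. Loop: select a variable, apply combination, apply marginalization, produce the output.]
Task: $\Downarrow_X \left( \bigotimes_{k=1}^{e} f_k \right)$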
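SLIDE 32 Mini-Bucket Elimination
[Figure: mini-bucket elimination schematic, MBE(z = 3), over variables x1, ..., x5. Loop: select a variable, apply combination, apply marginalization, produce the output.]
Task: $\Downarrow_X \left( \bigotimes_{k=1}^{e} f_k \right)$
mbe-opt: marginalization is min; output () = a1 ≤ Cost(WCSP)
  Time: $O(n \cdot \exp(z + 1))$
  Space: $O(n \cdot \exp(z))$
mbe-m-opt: marginalization is the m-best counterpart of min; output () = {a1, ..., am} ? m-best WCSP
  Time: $O(n \cdot m \cdot \exp(z + 1) \cdot \log(m + 1))$
  Space: $O(n \cdot m \cdot \exp(z + 1))$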
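Why the mini-bucket output a1 is a lower bound on the WCSP cost can be seen on a single bucket. In the sketch below the bucket of x holds two functions, f(x, y) and g(x, w); exact elimination minimizes their sum, while the mini-bucket relaxation minimizes each function separately and adds the results. The tables and function names are hypothetical, and the split into two one-function mini-buckets is an assumption chosen to keep the example small.

```python
from itertools import product

def exact_elimination(f, g, domain):
    """min over x of f(x, y) + g(x, w), for every pair (y, w)."""
    return {(y, w): min(f[x, y] + g[x, w] for x in domain)
            for y, w in product(domain, repeat=2)}

def mini_bucket_elimination(f, g, domain):
    """Eliminate x from f and from g separately, then add the two results."""
    f_min = {y: min(f[x, y] for x in domain) for y in domain}
    g_min = {w: min(g[x, w] for x in domain) for w in domain}
    return {(y, w): f_min[y] + g_min[w] for y, w in product(domain, repeat=2)}

domain = (0, 1)
f = {(0, 0): 4, (0, 1): 1, (1, 0): 2, (1, 1): 6}   # f(x, y), hypothetical costs
g = {(0, 0): 3, (0, 1): 5, (1, 0): 7, (1, 1): 2}   # g(x, w), hypothetical costs

exact = exact_elimination(f, g, domain)
bound = mini_bucket_elimination(f, g, domain)
is_lower_bound = all(bound[k] <= exact[k] for k in exact)
print(exact, bound, is_lower_bound)
```

The relaxed function never records all three variables x, y and w together, which is the source of the savings in time and space, at the price of a bound that can be loose (5 versus 7 for y = 0, w = 0 here).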
SLIDE 33 mbe-m-opt Output
[Figure: mini-bucket schematic over variables x1, ..., x5 processed by mbe-m-opt]
m-best WCSP = {c1, c2, c3,..., ck}
c1 ≤ c2 ≤ c3 ≤ ... ≤ ck
SLIDE 34 mbe-m-opt Output
[Figure: mini-bucket schematic over variables x1, ..., x5 processed by mbe-m-opt]
m-best WCSP = {c1, c2, c3,..., ck}
c1 ≤ c2 ≤ c3 ≤ ... ≤ ck
SLIDE 35 mbe-m-opt Output
[Figure: mini-bucket schematic over variables x1, ..., x5 processed by mbe-m-opt]
m-best WCSP = {c1, c2, c3,..., ck}
c1 ≤ c2 ≤ c3 ≤ ... ≤ ck
() = {a1, a2, a3, c1, a4, a5, c2, a6, a7, a8, ..., aj, c3, ..., aj+l,..., ck}
SLIDE 36 mbe-m-opt Output
[Figure: mini-bucket schematic over variables x1, ..., x5 processed by mbe-m-opt]
m-best WCSP = {c1, c2, c3,..., ck}
c1 ≤ c2 ≤ c3 ≤ ... ≤ ck
() = {a1, a2, a3, c1, a4, a5, c2, a6, a7, a8, ..., aj, c3, ..., aj+l,..., ck}
a1 ≤ a2 ≤ a3 ≤ c1 ≤ a4 ≤ ... ≤ ck
SLIDE 37 mbe-m-opt Output
[Figure: mini-bucket schematic over variables x1, ..., x5 processed by mbe-m-opt]
m-best WCSP = {c1, c2, c3,..., ck}
c1 ≤ c2 ≤ c3 ≤ ... ≤ ck
() = {a1, a2, a3, c1, a4, a5, c2, a6, a7, a8, ..., aj, c3, ..., aj+l,..., ck}
a1 ≥ a2 ≥ a3 ≥ c1 ≥ ... ≥ ck
SLIDE 38 mbe-m-opt Output
[Figure: mini-bucket schematic over variables x1, ..., x5 processed by mbe-m-opt]
m-best WCSP = {c1, c2, c3,..., ck}
c1 ≤ c2 ≤ c3 ≤ ... ≤ ck
() = {a1, a2, a3 ..., aj, c3, ..., aj+l,..., ck}
a1 ≥ a2 ≥ a3 ≥ c1 ≥ ... ≥ ck
SLIDE 39 Empirical Evaluation
Benchmarks (UAI 2008 competition):
  Linkage analysis networks (pedigree)
  Grid networks
  WCSPs
Algorithm: mbe-m-opt(z = 10)
Evaluate:
  The runtime as a function of the number of solutions m
  The improvement of the 1-bound as a function of m
SLIDE 40 Runtime as a function of m
[Plot]
SLIDE 41 Runtime as a function of m
[Plot]
SLIDE 42 Bound’s improvement as a function of m
[Plot: pedigree20. x-axis: the index of the solution (1 to 199); y-axis: log(UB), with ticks from -51.27 to -50.77.]
SLIDE 43 Bound’s improvement as a function of m
[Plot: WCSP instances. y-axis: log(LB).]
SLIDE 44 Conclusions
We presented two new bucket elimination algorithms for solving the m-best task by extending the combination and marginalization operators.
The same extension yields a general formalization of the m-best task over a semiring and can be used for:
  solving other optimization problems
  applying other known inference algorithms to the m-best task
Future work:
  Improve the empirical evaluation.
  Investigate an extension of loopy belief propagation for the m-best task.
  Investigate the use of bounds on m-best solutions as possible heuristics.