Higher-order Segmentation Functionals:
Entropy, Color Consistency, Curvature, etc.
Yuri Boykov jointly with
Andrew Delong
- M. Tang
- I. Ben Ayed
- O. Veksler
- H. Isack
- L. Gorelick
- C. Nieuwenhuis
- E. Toppe
- C. Olsson
- A. Osokin
- A. Delong
[Figure: surface representations and the type of optimization each leads to: mesh (continuous), level-sets (mixed), graph labeling and point cloud labeling (combinatorial, $s_p \in \{0,1\}$)]
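Implicit surfaces/boundary
Thresholding intensities $I_p$: compare $\Pr(I_p \mid Fg)$ with $\Pr(I_p \mid Bg)$
$f_p = -\ln \frac{\Pr(I_p \mid fg)}{\Pr(I_p \mid bg)}$
$E(S) = \sum_p f_p\, s_p$, with $s_p \in \{0,1\}$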
Examples of potential functions $f_p$ in $\sum_p f_p\, s_p$, with $s_p \in \{0,1\}$
- $f_p = (I_p - c)^2$
- $f_p = \pm 1$ (constant potential)
- $f_p = -\ln \Pr(I_p)$
Pair-wise discontinuities: second-order terms
$\sum_{pq \in N} [s_p \neq s_q]$, with $s_p \in \{0,1\}$
The indicator is quadratic in the labels:
$[s_p \neq s_q] = s_p (1 - s_q) + (1 - s_p)\, s_q$
Examples of discontinuity penalties
Second-order terms: $\sum_{pq \in N} w_{pq}\, [s_p \neq s_q]$, with $s_p \in \{0,1\}$
- boundary length: $w_{pq} = 1$
  - grids [B&K, 2003], via integral geometry
  - complexes [Sullivan 1994]
- intensity-dependent weights: $w_{pq} = \exp\{-(I_p - I_q)^2\}$
- can be minimized exactly via graph cuts [Greig et al.'91, Sullivan'94, Boykov-Jolly'01]
[Figure: s-t graph with n-links of weight $w_{pq}$ and a cut]
Any (binary) segmentation energy $E(S)$ is a set function $E: 2^{\Omega} \to \mathbb{R}$, where $S \subseteq \Omega$.
A set function is submodular if for any $S, T \subseteq \Omega$
$E(S \cup T) + E(S \cap T) \le E(S) + E(T)$
An alternative equivalent definition with an intuitive interpretation ("diminishing returns"): for any $S \subseteq T \subseteq \Omega$ and $v \notin T$
$E(T \cup \{v\}) - E(T) \le E(S \cup \{v\}) - E(S)$
This follows easily from the previous definition.
Significance: any submodular set function can be globally optimized in polynomial time [Grötschel et al. 1981, 88; Schrijver 2000]
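The diminishing-returns definition can be checked by brute force on a tiny ground set. The sketch below is only an illustration under stated assumptions: the 4-node graph and its edge weights are hypothetical, and exhaustive enumeration is feasible only for very small $\Omega$.

```python
from itertools import combinations

# Hypothetical toy graph: undirected edges with non-negative weights.
EDGES = {(0, 1): 1.0, (1, 2): 2.0, (2, 3): 0.5, (0, 3): 1.5, (0, 2): 1.0}
OMEGA = frozenset(range(4))

def cut_energy(S):
    """E(S): total weight of edges with exactly one endpoint in S."""
    return sum(w for (p, q), w in EDGES.items() if (p in S) != (q in S))

def squared_cardinality(S):
    """|S|^2: a convex function of cardinality, used as a counterexample."""
    return len(S) ** 2

def subsets(ground):
    """Enumerate all subsets of a finite ground set."""
    items = sorted(ground)
    for k in range(len(items) + 1):
        for c in combinations(items, k):
            yield frozenset(c)

def is_submodular(E, ground, tol=1e-12):
    """Check E(T+v) - E(T) <= E(S+v) - E(S) for all S subset of T, v not in T."""
    for T in subsets(ground):
        for S in subsets(T):
            for v in ground - T:
                if E(T | {v}) - E(T) > E(S | {v}) - E(S) + tol:
                    return False
    return True
```

With these toy inputs the cut energy passes the check while $|S|^2$ fails it, which matches the later remark that convex functions of cardinality are non-submodular.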
Assume a set $\Omega$ and a 2nd-order (quadratic) function of indicator variables $s_p \in \{0,1\}$
$E(S) = \sum_{pq \in N} E_{pq}(s_p, s_q)$
Function $E(S)$ is submodular if for any $pq \in N$
$E_{pq}(0,0) + E_{pq}(1,1) \le E_{pq}(0,1) + E_{pq}(1,0)$
Significance: a submodular 2nd-order boolean (set) function can be globally optimized in polynomial time, $O(|N| \cdot |\Omega|^2)$, by graph cuts [Hammer 1968, Picard & Ratliff 1973] [Boros & Hammer 2000, Kolmogorov & Zabih 2003]
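The pair-wise condition is a one-line test per clique. A minimal sketch, assuming each term $E_{pq}$ is given as a Python callable on two binary labels (the helper names are hypothetical):

```python
def is_pairwise_submodular(E_pq, tol=1e-12):
    """Check E_pq(0,0) + E_pq(1,1) <= E_pq(0,1) + E_pq(1,0)."""
    return E_pq(0, 0) + E_pq(1, 1) <= E_pq(0, 1) + E_pq(1, 0) + tol

def potts(w):
    """Discontinuity penalty w * [s_p != s_q] as a callable."""
    return lambda sp, sq: w * (sp != sq)
```

A discontinuity penalty with $w_{pq} \ge 0$ satisfies the condition; a negative weight violates it.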
Combinatorial: submodularity
Continuous: convexity
Assume a Gibbs distribution over binary random variables $s_p \in \{0,1\}$
$\Pr(s_1, \dots, s_n) \propto \exp(-E(S))$ for $S = \{p \mid s_p = 1\}$
Theorem [Boykov, Delong, Kolmogorov, Veksler, in unpublished book, 2014?]: all random variables $s_p$ are positively correlated iff the set function $E(S)$ is submodular.
That is, submodularity implies an MRF with a "smoothness" prior.
$E(S) = \sum_p f_p\, s_p + \sum_{pq \in N} w_{pq}\, [s_p \neq s_q]$
First term: segment region/appearance. Second term: boundary smoothness.
This talk:
- Curvature (3rd order)
- Convexity (3rd order)
- Shape priors (N-th order)
- Connectivity (N-th order)
- Cardinality potentials (N-th order)
- Appearance Entropy (N-th order)
- Color consistency (N-th order)
- Distribution consistency (N-th order)
High-order functionals, optimization:
- submodular approximations [our work: Trust Region 13, Auxiliary Cuts 13]
- global minimum [our work: One Cut 2014]
- block-coordinate descent [Zhu & Yuille 96, GrabCut 04]
[Boykov & Jolly, ICCV 2001] image segmentation, graph cut
$E(S \mid \theta_0, \theta_1) = -\sum_p \ln \Pr(I_p \mid \theta_{s_p}) + \sum_{pq \in N} w_{pq}\, [s_p \neq s_q]$
- unary (linear) term + pair-wise (quadratic) term
- $I_p \in RGB$, $s_p \in \{0,1\}$
- assuming known $\theta_0, \theta_1$
- guaranteed globally optimal $S$
[Rother et al., SIGGRAPH 2004] iterative image segmentation, GrabCut (block coordinate descent)
$E(S, \theta_0, \theta_1) = -\sum_p \ln \Pr(I_p \mid \theta_{s_p}) + \sum_{pq \in N} w_{pq}\, [s_p \neq s_q]$
- mixed optimization term + pair-wise (quadratic) term
- $I_p \in RGB$, $s_p \in \{0,1\}$
- models $\theta_0, \theta_1$ are extra variables, iteratively re-estimated (from the initial box)
- NP-hard mixed optimization! [Vicente et al., ICCV'09]
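$E(S, \theta_0, \theta_1) = -\sum_p \ln \Pr(I_p \mid \theta_{s_p}) + \sum_{pq \in N} w_{pq}\, [s_p \neq s_q]$ (the pair-wise term is fixed for $S$ = const)
$= -\sum_{p: s_p = 1} \ln \Pr(I_p \mid \theta_1) - \sum_{p: s_p = 0} \ln \Pr(I_p \mid \theta_0) + \sum_{pq \in N} w_{pq}\, [s_p \neq s_q]$
For fixed $S$ the optimal models are $\hat\theta_1 = p^S$ and $\hat\theta_0 = p^{\bar S}$ (binary case):
- $p^S$: distribution of intensities in the current obj. segment $S = \{p : s_p = 1\}$
- $p^{\bar S}$: distribution of intensities in the current bkg. segment $\bar S = \{p : s_p = 0\}$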
BCD minimization of $E(S, \theta_0, \theta_1)$ converges to a local minimum (binary case, $s_p \in \{0,1\}$)
$E(S, \theta_0, \theta_1) = -\sum_p \ln \Pr(I_p \mid \theta_{s_p}) + \sum_{pq \in N} w_{pq}\, [s_p \neq s_q]$
- start from models $\theta_0, \theta_1$ inside and outside some given box
- iterate graph cuts and model re-estimation until convergence to a local minimum
- the solution is sensitive to the initial box
[Figure: results for different initial boxes, with $E = 2.37 \times 10^6$, $E = 2.41 \times 10^6$, $E = 1.39 \times 10^6$, $E = 1.410 \times 10^6$]
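The alternation between segmentation and model re-estimation can be sketched on a toy problem. This is not GrabCut itself: it assumes a 1D chain of pixels with integer color bins, histogram models, and a single constant weight `w`, and it swaps graph cuts for dynamic programming (Viterbi), which is exact on a chain. All names and the demo data are hypothetical.

```python
import numpy as np

def neg_log_lik(I, theta):
    """Per-pixel cost -ln Pr(I_p | theta); infinite where theta is zero."""
    with np.errstate(divide="ignore"):
        return -np.log(theta[I])

def histogram(I, mask, n_bins):
    """Maximum-likelihood histogram model of the pixels selected by mask."""
    counts = np.bincount(I[mask], minlength=n_bins).astype(float)
    return counts / counts.sum()

def energy(I, s, theta0, theta1, w):
    """E(S, theta0, theta1) on a chain with constant pair-wise weight w."""
    unary = np.where(s == 1, neg_log_lik(I, theta1), neg_log_lik(I, theta0)).sum()
    return unary + w * np.count_nonzero(s[1:] != s[:-1])

def viterbi(cost0, cost1, w):
    """Exact minimizer of unary costs plus w * [s_p != s_{p+1}] on a chain."""
    cost = np.stack([cost0, cost1], axis=1)
    n = len(cost)
    best = cost[0].copy()
    back = np.zeros((n, 2), dtype=int)
    for p in range(1, n):
        new = np.empty(2)
        for b in (0, 1):
            cand = best + w * (np.arange(2) != b)  # cost of arriving at label b
            back[p, b] = int(np.argmin(cand))
            new[b] = cand[back[p, b]] + cost[p, b]
        best = new
    s = np.empty(n, dtype=int)
    s[-1] = int(np.argmin(best))
    for p in range(n - 1, 0, -1):
        s[p - 1] = back[p, s[p]]
    return s

def bcd(I, s_init, w, n_bins, max_iter=50):
    """Alternate model re-estimation and exact chain segmentation."""
    s = s_init.copy()
    energies = []
    for _ in range(max_iter):
        if s.min() == s.max():  # one segment is empty: nothing to re-estimate
            break
        theta0 = histogram(I, s == 0, n_bins)
        theta1 = histogram(I, s == 1, n_bins)
        s_new = viterbi(neg_log_lik(I, theta0), neg_log_lik(I, theta1), w)
        energies.append(energy(I, s_new, theta0, theta1, w))
        if np.array_equal(s_new, s):
            break
        s = s_new
    return s, energies

# Toy data: the initial "box" covers pixels 4..8.
I_demo = np.array([0, 0, 1, 0, 0, 1, 2, 3, 3, 2, 3, 3])
s_box = np.zeros(len(I_demo), dtype=int)
s_box[4:9] = 1
s_final, energies = bcd(I_demo, s_box, w=0.5, n_bins=4)
```

Each step minimizes the energy over one block of variables with the other fixed, so the recorded energies never increase. The result still depends on `s_box`, which is the sensitivity to the initial box noted above.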
(interactivity a la "snakes")

Could be used for more than 2 labels, $s_p \in \{0, 1, 2, \dots\}$
$E(S, \theta_0, \theta_1, \theta_2, \dots) = -\sum_p \ln \Pr(I_p \mid \theta_{s_p}) + \sum_{pq \in N} w_{pq}\, [s_p \neq s_q] + |labels|$

Using level sets + merging heuristic:
- initialize models $\theta_0, \theta_1, \theta_2, \dots$ from many randomly sampled boxes
- iterate segmentation and model re-estimation until convergence
- models compete, stable result if sufficiently many

Using α-expansion (graph cuts):
- initialize models $\theta_0, \theta_1, \theta_2, \dots$ from many randomly sampled boxes
- iterate segmentation and model re-estimation until convergence
- models compete, stable result if sufficiently many

Geometric model fitting, using α-expansion (graph cuts), $s_p \in \{0, 1, 2, \dots\}$
$E(S, \theta_0, \theta_1, \theta_2, \dots) = \sum_p \|p - \theta_{s_p}\| + \sum_{pq \in N} w_{pq}\, [s_p \neq s_q] + |labels|$
- initialize plane models $\theta_0, \theta_1, \theta_2, \dots$ from many randomly sampled SIFT matches in 2 images of the same scene
- initialize Fundamental matrices $\theta_0, \theta_1, \theta_2, \dots$ from many randomly sampled SIFT matches in 2 consecutive frames in video
- iterate segmentation and model re-estimation until convergence
- models compete, stable result if sufficiently many
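[Tang et al. ICCV 2013]
- $S_i = \{p \in S \mid I_p = i\}$: pixels of color $i$ in $S$
- $p_i^S = |S_i| / |S|$: probability of intensity $i$ in $S$
- $p^S = \{p_1^S, p_2^S, \dots, p_n^S\}$: distribution of intensities in $S$
For a given distribution $\theta = \{p_1, p_2, \dots, p_n\}$:
$-\sum_{p \in S} \ln \Pr(I_p \mid \theta) = -\sum_i |S_i| \ln p_i = -|S| \sum_i p_i^S \ln p_i = |S| \cdot H(S \mid \theta)$ (cross entropy)
[Figure: segment $S$ partitioned into color bins $S_1, S_2, S_3, S_4, S_5, \dots, S_i$]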
Mixed optimization, joint estimation of $S$ and color models [Rother et al., SIGGRAPH'04, ICCV'09]:
$E(S, \theta_0, \theta_1) = -\sum_{p: s_p = 1} \ln \Pr(I_p \mid \theta_1) - \sum_{p: s_p = 0} \ln \Pr(I_p \mid \theta_0) + \sum_{pq \in N} w_{pq}\, [s_p \neq s_q]$
$= |S| \cdot H(S \mid \theta_1) + |\bar S| \cdot H(\bar S \mid \theta_0) + \sum_{pq \in N} w_{pq}\, [s_p \neq s_q]$ (cross-entropy)

Binary optimization, minimization of segments entropy [Tang et al., ICCV 2013]:
$E(S) = |S| \cdot H(S) + |\bar S| \cdot H(\bar S) + \sum_{pq \in N} w_{pq}\, [s_p \neq s_q]$ (entropy)
- $H(S)$: entropy of intensities in $S$
- $H(\bar S)$: entropy of intensities in $\bar S$

Note: $H(P \mid Q) \ge H(P)$ for any two distributions (equality when $Q = P$). Hence $\min_{\theta_0, \theta_1} E(S, \theta_0, \theta_1) = E(S)$.
Common energy for categorical clustering, e.g. [Li et al. ICML'04]
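Both the cross-entropy form of the log-likelihood term and the inequality $H(P \mid Q) \ge H(P)$ can be checked numerically. A minimal sketch, assuming integer color bins and hypothetical toy data:

```python
import numpy as np

def distribution(I, mask, n_bins):
    """Empirical distribution p^S of color bins among the pixels in mask."""
    counts = np.bincount(I[mask], minlength=n_bins).astype(float)
    return counts / counts.sum()

def cross_entropy(p, q):
    """H(P|Q) = -sum_i p_i ln q_i, with the convention 0 ln 0 = 0."""
    nz = p > 0
    with np.errstate(divide="ignore"):
        return float(-(p[nz] * np.log(q[nz])).sum())

def entropy(p):
    """H(P) = H(P|P)."""
    return cross_entropy(p, p)

# Hypothetical toy data: 8 pixels, 3 color bins, S = first four pixels.
I_demo = np.array([0, 0, 1, 2, 2, 2, 1, 0])
S_mask = np.array([1, 1, 1, 1, 0, 0, 0, 0], dtype=bool)
theta = np.array([0.5, 0.3, 0.2])  # some given model, not the optimal one

p_S = distribution(I_demo, S_mask, 3)
log_lik_term = float(-np.log(theta[I_demo[S_mask]]).sum())
cross_entropy_term = S_mask.sum() * cross_entropy(p_S, theta)
```

Here `log_lik_term` and `cross_entropy_term` agree, and replacing `theta` by `p_S` can only lower the value, which is why re-estimating the model from the current segment never increases the energy.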
Unsupervised image segmentation (like in Chan-Vese)
$E(S) = |S| \cdot H(S) + |\bar S| \cdot H(\bar S) + \sum_{pq \in N} w_{pq}\, [s_p \neq s_q]$
- break image into two coherent segments with low entropy of intensities
- more general than Chan-Vese (colors can vary within each segment)
[Figure: a high entropy segmentation and a low entropy segmentation of the same image into $S$ and $\bar S$]
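Minimization of entropy encourages pixels $\Omega_i$ of the same color bin $i$ to be segmented together (proof: see next page)
$E(S) = |S| \cdot H(S) + |\bar S| \cdot H(\bar S) + \sum_{pq \in N} w_{pq}\, [s_p \neq s_q]$
[Figure: all pixels $\Omega$ partitioned into color bins $\Omega_1, \dots, \Omega_5$]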
$|S| \cdot H(S) + |\bar S| \cdot H(\bar S)$
$= -\sum_i |S_i| \ln p_i^S - \sum_i |\bar S_i| \ln p_i^{\bar S}$
$= -\sum_i |S_i| \ln \frac{|S_i|}{|S|} - \sum_i |\bar S_i| \ln \frac{|\bar S_i|}{|\bar S|}$
$= \left( |S| \ln |S| + |\bar S| \ln |\bar S| \right) - \sum_i \left( |S_i| \ln |S_i| + |\bar S_i| \ln |\bar S_i| \right)$
- volume balancing: $|S| \ln |S| + |\bar S| \ln |\bar S|$ is a convex function of cardinality $|S|$ (non-submodular)
- color consistency: $-\sum_i \left( |S_i| \ln |S_i| + |\bar S_i| \ln |\bar S_i| \right)$ is a concave function of cardinality $|S_i|$ (submodular); pixels in each color bin $i$ prefer to be together (either inside the object or inside the background)
[Figure: both terms plotted against cardinality ($|S|$ and $|S_i|$), and a segmentation $S$ with better color consistency]
Graph-cut constructions for similar cardinality terms (for superpixel consistency) [Kohli et al. IJCV'09]
In many applications, the volume balancing term can be either dropped or replaced with simple unary ballooning [Tang et al. ICCV 2013]
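The split of the entropy energy into volume balancing and color consistency can be verified numerically. A minimal sketch, assuming integer color bins and hypothetical toy segmentations (the pair-wise term is left out):

```python
import numpy as np

def xlnx(x):
    """x ln x with the convention 0 ln 0 = 0 (scalars or arrays)."""
    x = np.asarray(x, dtype=float)
    return x * np.log(np.where(x > 0, x, 1.0))

def entropy_energy(I, s, n_bins):
    """|S| H(S) + |S_bar| H(S_bar), computed from the definition of entropy."""
    total = 0.0
    for label in (0, 1):
        counts = np.bincount(I[s == label], minlength=n_bins).astype(float)
        n = counts.sum()
        if n > 0:
            total += -n * xlnx(counts / n).sum()
    return float(total)

def decomposed_energy(I, s, n_bins):
    """Return (volume balancing, color consistency) for segmentation s."""
    fg = np.bincount(I[s == 1], minlength=n_bins)
    bg = np.bincount(I[s == 0], minlength=n_bins)
    volume = xlnx(fg.sum()) + xlnx(bg.sum())
    consistency = -(xlnx(fg) + xlnx(bg)).sum()
    return float(volume), float(consistency)

# Hypothetical toy image with two color bins and two segmentations of equal size.
I_demo = np.array([0, 0, 0, 0, 1, 1, 1, 1])
s_consistent = np.array([1, 1, 1, 1, 0, 0, 0, 0])  # each bin kept together
s_mixed = np.array([1, 1, 0, 0, 1, 1, 0, 0])       # each bin split in half
```

Both segmentations have the same volume balancing value, so the difference in energy comes entirely from color consistency, which favors keeping each bin together.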
$E(S) = \left( |S| \ln |S| + |\bar S| \ln |\bar S| \right) - \sum_i \left( |S_i| \ln |S_i| + |\bar S_i| \ln |\bar S_i| \right) + \sum_{pq \in N} w_{pq}\, [s_p \neq s_q]$
- volume balancing: convex function of cardinality $|S|$ (non-submodular). In many applications, this term can be either dropped or replaced with simple unary ballooning [Tang et al. ICCV 2013]
- color consistency (also, simpler construction): connect pixels in each color bin to corresponding auxiliary nodes. L1 color separation works better in practice [Tang et al. ICCV 2013]
- boundary smoothness
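The auxiliary-node construction can be illustrated without building a graph. The sketch below rests on assumptions the slides do not spell out: the L1 color separation is taken to be $\sum_i \big| |S_i| - |\bar S_i| \big|$, and each pixel of bin $i$ is linked to a binary auxiliary node $a_i$ by an edge of weight `beta`. Minimizing over each $a_i$ then costs `beta * min(|S_i|, |S_bar_i|)`, which equals the negated L1 separation up to a constant and a factor of 2.

```python
import numpy as np

def aux_node_cost(I, s, n_bins, beta=1.0):
    """Sum over bins of min over a_i in {0,1} of beta * sum_{p in bin i} [s_p != a_i]."""
    total = 0.0
    for i in range(n_bins):
        labels_in_bin = s[I == i]
        cost_if_a0 = np.count_nonzero(labels_in_bin != 0)  # links cut when a_i = 0
        cost_if_a1 = np.count_nonzero(labels_in_bin != 1)  # links cut when a_i = 1
        total += beta * min(cost_if_a0, cost_if_a1)
    return total

def l1_separation(I, s, n_bins):
    """Assumed L1 separation: sum_i | |S_i| - |S_bar_i| | on unnormalized histograms."""
    fg = np.bincount(I[s == 1], minlength=n_bins)
    bg = np.bincount(I[s == 0], minlength=n_bins)
    return int(np.abs(fg - bg).sum())

# Hypothetical toy data: 6 pixels, 3 color bins.
I_demo = np.array([0, 0, 1, 1, 1, 2])
s_demo = np.array([1, 0, 1, 1, 0, 0])
```

Because `min(a, b) = (a + b - |a - b|) / 2`, the auxiliary-node cost equals `(len(I) - l1_separation) / 2`, so lowering it is the same as raising the L1 separation between the two histograms.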
One Cut [Tang et al., ICCV'13]
- connect pixels in each color bin to corresponding auxiliary nodes
- guaranteed global minimum
- GrabCut is sensitive to bin size
- box segmentation: linear ballooning inside the box
- segmentation from seeds: ballooning from hard constraints
- saliency-based segmentation: linear ballooning from saliency measure
Color consistency can be integrated into: photo-consistency + smoothness + color consistency
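Trust region
$E(S) = H(S) + B(S)$, where $H(S)$ is hard and $B(S)$ is submodular (easy)
$U_0(S)$: 1st-order approximation for $H(S)$ at the current solution $S_0$
Trust region sub-problem:
minimize $\tilde E(S) = U_0(S) + B(S)$ s.t. $\|S - S_0\| \le d$
Lagrangian form:
minimize $L_\lambda(S) = U_0(S) + B(S) + \lambda \|S - S_0\|$
[Figure: trust region of size $d$ around $S_0$]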
$\|S - S_0\|$ can be approximated with unary terms [Boykov, Kolmogorov, Cremers, Delong, ECCV'06]
$\langle dC, dC \rangle = \int_{C_0} dC_s^2\, ds \approx 2 \int_{\Delta C} |d_p|\, dp$
Unary potentials [Boykov et al. ECCV 2006]: $2 \sum_p d_p\, (s_p - s_p^0)$
$d_p$: signed distance map from $C_0$
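Example: volume constraint
- submodular terms: appearance log-likelihoods $-\sum_p \ln \Pr(I_p \mid \theta_{s_p})$ and boundary length $\sum_{pq \in N} w_{pq}\, [s_p \neq s_q]$
- non-submodular term: volume constraint on $|S|$, replaced by its linear approx. at $S_0$ (submodular approx.)
- trust region: $\sum_p d_p\, (s_p - s_p^0)$, the L2 distance to $S_0$
[Figure: result with Log-Lik. + length only, compared with the trust region result]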
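Interactive segmentation with box
$E(S) = \left( |S| \ln |S| + |\bar S| \ln |\bar S| \right) - \sum_i \left( |S_i| \ln |S_i| + |\bar S_i| \ln |\bar S_i| \right) + \sum_{pq \in N} w_{pq}\, [s_p \neq s_q]$
- submodular terms: color consistency, boundary smoothness
- non-submodular term: volume balancing
[Figure: global minimum compared with approximations (local minima near the box)]
Surprisingly, TR outperforms QPBO, DD, TRWS, BP, etc. on non-submodular problems [arXiv13]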
[Figure: results with 4-neighborhood and 8-neighborhood]
multi-view reconstruction [Vogiatzis et al. 2005]
- Li & Zucker 2010 (loopy belief propagation)
- Woodford et al. 2009 (fusion of proposals, QPBO)
- Olsson et al. 2012-13 (fusion of planes, nearly submodular)

- Schoenemann et al. 2009 (complex, LP relaxation, many extra variables)
- Strandmark & Kahl 2011 (complex, LP relaxation,…)
- El-Zehiry & Grady 2010 (grid, 3-clique, only 90 degree accurate, QPBO)
- Shekhovtsov et al. 2012 (grid patches, approximately learned, QPBO)
- Olsson et al. 2013 (grid patches, integral geometry, partial enumeration)
- Nieuwenhuis et al. 2014? (grid, 3-cliques, integral geometry, trust region)
This talk: good approximation of curvature, better and faster optimization. Practical!
[Olsson, Ulen, Boykov, Kolmogorov - ICCV 2013]
[Nieuwenhuis, Toppe, Gorelick, Veksler, Boykov - arXiv 2013]
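Motivating example: for any convex shape $S$
$\int_{\partial S} |\kappa|\, ds = 2\pi$
Easy to estimate via approximating polygons: $\sum_n \theta_n = 2\pi$, where $\theta_n$ are the turning angles at the polygon vertices
Polygons also work for $\int_{\partial S} |\kappa|^p\, ds$ [Bruckstein et al. 2001]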
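The polygon estimate of total curvature is easy to reproduce. A minimal sketch, assuming the boundary is given as counter-clockwise polygon vertices (the ellipse is a hypothetical example):

```python
import numpy as np

def turning_angles(vertices):
    """Signed exterior (turning) angle at each vertex of a closed polygon."""
    v = np.asarray(vertices, dtype=float)
    e_in = v - np.roll(v, 1, axis=0)     # edge arriving at each vertex
    e_out = np.roll(v, -1, axis=0) - v   # edge leaving each vertex
    cross = e_in[:, 0] * e_out[:, 1] - e_in[:, 1] * e_out[:, 0]
    dot = (e_in * e_out).sum(axis=1)
    return np.arctan2(cross, dot)

# Hypothetical convex shape: an ellipse sampled at 100 points, counter-clockwise.
t = np.linspace(0.0, 2.0 * np.pi, 100, endpoint=False)
ellipse = np.stack([3.0 * np.cos(t), 1.0 * np.sin(t)], axis=1)
total_turning = turning_angles(ellipse).sum()
```

For a convex polygon all turning angles are positive and sum to $2\pi$ regardless of how coarsely the boundary is sampled.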
Curvature on a cell complex (standard geometry)
[Figure: boundary turns on the cell complex with angles $\pi/2$, $\pi/4$, $\pi/4$, $\pi/2$]
- 4- or 3-cliques on a cell complex, solved via LP relaxations
- cell-patch cliques on a complex: partial enumeration + TRWS, reduction to a pair-wise Constraint Satisfaction Problem, zero gap
[Figure: patches P1, ..., P6 and patch configurations with costs A, B, C, D, E, F, G, H]
Constraints on the patch costs:
- $2A + B = \pi/2$
- $A + F + G + H = \pi/4$
- $D + E + F = \pi/2$
- $A + C = \pi/4$
Curvature on a pixel grid (integral geometry)
- representative cell-patches and representative pixel-patches
- 2x2 patches, 3x3 patches, 5x5 patches: zero gap
Squared curvature $\int_{\partial S} \kappa^2\, ds$ with 3-cliques
- 3-cliques $(p^-, p, p^+)$ with configurations (0,1,0) and (1,0,1)
- general intuition: more responses where curvature is higher
$\int_C \kappa^2(s)\, ds \approx \sum_{n=1}^{N} \kappa_n^2\, \Delta s_n$
[Figure: boundary points 1, 2, 3, ..., n-1, n, n+1, n+2, ..., N-1, N in a 5x5 neighborhood, with a zoom-in on a boundary arc of radius $r = 1/\kappa$]
Thus, appropriately weighted 3-cliques estimate the squared curvature integral
Example: $\int_{Circle(r)} \kappa^2\, ds = 2\pi / r$
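The circle value $2\pi / r$ can be reproduced with a polygon-based estimate. This is not the weighted 3-clique construction from the talk: the sketch assumes the discretization $\kappa_n \approx \theta_n / \Delta s_n$, with $\theta_n$ the turning angle at vertex $n$ and $\Delta s_n$ the mean length of its two adjacent edges, so that $\sum_n \kappa_n^2 \Delta s_n = \sum_n \theta_n^2 / \Delta s_n$.

```python
import numpy as np

def squared_curvature_estimate(vertices):
    """Polygon estimate of the squared curvature integral: sum_n theta_n^2 / ds_n."""
    v = np.asarray(vertices, dtype=float)
    e_in = v - np.roll(v, 1, axis=0)     # edge arriving at each vertex
    e_out = np.roll(v, -1, axis=0) - v   # edge leaving each vertex
    cross = e_in[:, 0] * e_out[:, 1] - e_in[:, 1] * e_out[:, 0]
    dot = (e_in * e_out).sum(axis=1)
    theta = np.arctan2(cross, dot)       # turning angle at each vertex
    ds = 0.5 * (np.linalg.norm(e_in, axis=1) + np.linalg.norm(e_out, axis=1))
    return float((theta ** 2 / ds).sum())

def circle(r, n):
    """Vertices of a regular n-gon inscribed in a circle of radius r."""
    t = np.linspace(0.0, 2.0 * np.pi, n, endpoint=False)
    return np.stack([r * np.cos(t), r * np.sin(t)], axis=1)

estimate_r3 = squared_curvature_estimate(circle(3.0, 2000))
estimate_r6 = squared_curvature_estimate(circle(6.0, 2000))
```

The estimate approaches $2\pi / r$ as the sampling gets finer, so doubling the radius halves the value. Length-based regularization scales the opposite way, since boundary length grows with $r$.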
Optimization uses local submodular approximations (unless non-submodular regularization is very weak)
[Figure: comparison of length-based regularization, elastica [Heber, Ranftl, Pock, 2012], 90-degree curvature [El-Zehiry & Grady, 2010], and squared curvature; panels labeled 7x7 neighborhood, 7x7 neighborhood, 2x2 neighborhood, length, squared curvature]