Dense Random Fields
Philipp Krähenbühl Stanford University
[Slide 2: Zoo of computer vision problems: example images labeled skin; paper, cloth, wood; kitten; pumpkin; tiger; car; bottle (x3); a caption "Emma in her hat looking super cute"; an action "playing tennis"]
Labeling problems
[Slide 3: Labeling granularity: sparse vs. dense vs. per-image labels over the same example images]
[Slide 4: Dense labeling example: skin, paper, cloth, wood, car, bottles]
[Slide 5: Dense labeling example: skin, paper, cloth, wood, car, bottles]
[Slide 6: Segmentation example: table, chair, background]
[Slide 7: Segmentation example: sheep, grass]
[Slide 8: Unary classifier responses ω(grass), ω(sheep), and object boundaries]
[1] TextonBoost for Image Understanding: Multi-Class Object Recognition and Segmentation by Jointly Modeling Texture, Layout, and Context, Shotton et al., 2009
[Slide 9]
[Slide 10: Energy of a labeling]
E(X) = \sum_i \psi_i(X_i) + \sum_{i,j \in \mathcal{N}} \psi_{ij}(X_i, X_j)
(unary term + pairwise term), with the corresponding Gibbs distribution P(X) = \frac{1}{Z} \exp(-E(X)).
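The energy and its Gibbs distribution become concrete on a model small enough to enumerate. A minimal sketch for a 3-pixel chain with 2 labels; the unary and pairwise values are made-up illustration numbers, not from the talk:

```python
import itertools
import math

# Tiny 3-pixel chain with 2 labels; values are illustrative only.
unary = [[0.5, 1.0], [1.2, 0.3], [0.4, 0.9]]   # psi_i(X_i)
neighbors = [(0, 1), (1, 2)]                    # N: the chain's neighbor pairs

def pairwise(xi, xj):
    # Potts penalty: constant cost when neighboring labels disagree
    return 1.0 if xi != xj else 0.0

def energy(x):
    # E(X) = sum_i psi_i(X_i) + sum_{i,j in N} psi_ij(X_i, X_j)
    return (sum(unary[i][x[i]] for i in range(len(x))) +
            sum(pairwise(x[i], x[j]) for i, j in neighbors))

# Gibbs distribution P(X) = exp(-E(X)) / Z, Z summed over all labelings
states = list(itertools.product([0, 1], repeat=3))
Z = sum(math.exp(-energy(x)) for x in states)
P = {x: math.exp(-energy(x)) / Z for x in states}
best = min(states, key=energy)   # MAP labeling = lowest energy, here (0, 0, 0)
```

Minimizing E(X) is the same as maximizing P(X); here the Potts term flips the middle pixel (whose unary prefers label 1) to agree with its neighbors.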
The same energy E(X) = \sum_i \psi_i(X_i) + \sum_{i,j \in \mathcal{N}} \psi_{ij}(X_i, X_j) is repeated on each of the following slides with a different pairwise term:
[Slide 11] \psi_{ij}(X_i, X_j) = 0: the unary term alone.
[Slide 12] \psi_{ij}(X_i, X_j) = [X_i \neq X_j]: a conditional random field with a Potts pairwise term.
[Slide 13] \psi_{ij}(X_i, X_j) = 100\,[X_i \neq X_j]: stronger smoothing.
[Slide 14] \psi_{ij}(X_i, X_j) = 100\,[X_i \neq X_j] (result shown).
[Slide 15] \psi_{ij}(X_i, X_j) = w_{ij}\,[X_i \neq X_j] with contrast-sensitive weights w_{ij} = \exp(-\alpha (c_i - c_j)^2), shown as horizontal and vertical weight maps.
[Slide 16] \psi_{ij}(X_i, X_j) = 100\,w_{ij}\,[X_i \neq X_j]: a color-sensitive conditional random field.
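The contrast-sensitive Potts potential can be sketched directly from the formula; α and the pixel colors below are illustrative values, not from the talk:

```python
import math

# Contrast-sensitive Potts weight between neighboring pixels i and j:
# w_ij = exp(-alpha * (c_i - c_j)^2), psi_ij = 100 * w_ij * [X_i != X_j].
alpha = 0.05   # illustrative value

def weight(ci, cj):
    return math.exp(-alpha * (ci - cj) ** 2)

def psi(ci, cj, xi, xj):
    return 100.0 * weight(ci, cj) * (xi != xj)

# Similar colors -> strong penalty for a label change; a strong edge
# (large color difference) makes label disagreement nearly free.
flat_cost = psi(100, 102, 0, 1)   # near-identical colors: large penalty
edge_cost = psi(100, 180, 0, 1)   # strong intensity edge: ~0 penalty
```

This is exactly why the labeling snaps to image edges: the model only pays for label changes where the image is smooth.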
[Slide 17: Pros and cons of grid CRFs; information has to propagate step by step across the grid]
[Slide 18: classifier labeling]
[Slide 19: classifier log-likelihood]
[Slide 20: log-likelihood blurred with a Gaussian, τs = 2 px]
[Slide 21: labeling blurred with a Gaussian, τs = 2 px]
[Slide 22: labeling blurred with a Gaussian, τs = 6 px]
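Blurring per-pixel classifier scores with a Gaussian, as on these slides, can be sketched in one dimension. The score row, the threshold, and the kernel parametrization below are illustrative assumptions:

```python
import numpy as np

# Smooth a noisy per-pixel class score with a small Gaussian (scale ~2 px),
# a cheap stand-in for pairwise regularization. Scores are illustrative.
def gaussian_kernel(tau_s, radius):
    x = np.arange(-radius, radius + 1, dtype=float)
    k = np.exp(-x**2 / tau_s**2)
    return k / k.sum()   # normalize so the blur preserves total mass

scores = np.array([0., 0., 1., 0., 1., 1., 1., 0.])  # noisy "sheep" score row
k = gaussian_kernel(tau_s=2.0, radius=3)
blurred = np.convolve(scores, k, mode='same')

# Thresholding the blurred scores: the isolated spike at index 2 is
# suppressed and the one-pixel gap at index 3 is filled in.
labels = (blurred > 0.5).astype(int)
```

This shows both the benefit (noise removal, gap filling) and the cost: the blur is oblivious to image edges, which motivates the bilateral version on the later slides.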
[Slide 23: Conditional Random Field (CRF)]
[Slide 24: labeling blurred with a Gaussian, τs = 6 px]
[Slide 25: labeling blurred with a bilateral filter, τs = 60 px, τc = 15]
[Slide 26: color-sensitive Conditional Random Field (CRF)]
[Slide 27: High-dimensional (bilateral) filtering]
\tilde{v}_i = \sum_j w_{ij} v_j, \quad w_{ij} = \exp(-(s_i - s_j)^2/\tau_s)\,\exp(-(c_i - c_j)^2/\tau_c)
where s denotes pixel position and c color.
[2] Fast High-Dimensional Filtering Using the Permutohedral Lattice, Adams et al., 2010
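A brute-force version of this filter makes the weights concrete; the permutohedral lattice [2] computes the same sum in linear time, so the O(N²) loop below is only a reference sketch, with illustrative positions, colors, and τ values:

```python
import numpy as np

# Brute-force bilateral filter on a 1-D signal: each output is a weighted
# average over ALL positions, with spatial and color Gaussian weights.
def bilateral(values, positions, colors, tau_s, tau_c):
    values = np.asarray(values, dtype=float)
    out = np.empty_like(values)
    for i in range(len(values)):
        w = (np.exp(-(positions[i] - positions) ** 2 / tau_s) *
             np.exp(-(colors[i] - colors) ** 2 / tau_c))
        out[i] = (w * values).sum() / w.sum()
    return out

pos = np.arange(6, dtype=float)
col = np.array([10., 10., 10., 200., 200., 200.])   # a hard color edge at 2|3
val = np.array([1., 0., 1., 0., 1., 0.])            # noisy values
smoothed = bilateral(val, pos, col, tau_s=4.0, tau_c=50.0)
# Averaging stays inside each color region, so the edge is preserved
# while the noise within each region is smoothed out.
```

Unlike the plain Gaussian blur, the color term keeps the two regions from bleeding into each other.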
[Slide 28: Filtering the values v_1 ... v_24 into \tilde{v}_i with the bilateral weights w_{ij} = \exp(-(s_i - s_j)^2/\tau_s)\,\exp(-(c_i - c_j)^2/\tau_c). Pros: information travels over large distances; cons shown on the slide]
[Slide 29]
[Slide 30: Energy]
E(X) = \sum_i \psi_i(X_i) + \sum_{i,j \in \mathcal{N}} \psi_{ij}(X_i, X_j)
(unary term + pairwise term)
[Slide 31]
[Slide 32: Pros]
[Slide 33: Cons: convergence in 3 days]
[Slide 34: Gaussians]
[Slide 35: Fully connected model]
E(X) = \sum_i \psi_i(X_i) + \sum_{i>j} \psi_{ij}(X_i, X_j)
\psi_{ij}(X_i, X_j) = \sum_m \mu^{(m)}(X_i, X_j)\,k^{(m)}(f_i, f_j)
with Gaussian kernels k^{(m)} and label compatibilities \mu^{(m)}, e.g.:

µ       GRASS   SHEEP   WATER   ...
GRASS           1       1       ...
SHEEP   1               10      ...
WATER   1       10              ...
...     ...     ...     ...
[Slide 36: Two Gaussian kernels]
\psi_{ij}(X_i, X_j) = \mu_1(X_i, X_j)\,\exp\Big(-\frac{|s_i - s_j|^2}{2\sigma_\alpha^2} - \frac{|c_i - c_j|^2}{2\sigma_\beta^2}\Big) + \mu_2(X_i, X_j)\,\exp\Big(-\frac{|s_i - s_j|^2}{2\sigma_\gamma^2}\Big)
The first (appearance) kernel compares both position s and color c; the second (smoothness) kernel compares position only. The label compatibilities \mu_1, \mu_2 start out as all-ones tables and are then left as unknowns ("?") to be learned.
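The two-kernel potential can be sketched directly from the formula; the σ values, feature vectors, and Potts-style compatibilities below are illustrative assumptions, not the talk's learned parameters:

```python
import numpy as np

# Appearance kernel (position + color) plus smoothness kernel (position
# only), each weighted by its own label compatibility. Values illustrative.
sigma_a, sigma_b, sigma_g = 30.0, 10.0, 3.0
mu1 = mu2 = np.array([[0., 1.], [1., 0.]])   # Potts compatibility

def psi_pair(si, sj, ci, cj, xi, xj):
    appearance = np.exp(-np.sum((si - sj)**2) / (2 * sigma_a**2)
                        - np.sum((ci - cj)**2) / (2 * sigma_b**2))
    smoothness = np.exp(-np.sum((si - sj)**2) / (2 * sigma_g**2))
    return mu1[xi, xj] * appearance + mu2[xi, xj] * smoothness

s0, s1 = np.array([0., 0.]), np.array([1., 0.])        # adjacent pixels
c0, c1 = np.array([10., 10., 10.]), np.array([12., 10., 10.])  # similar colors
same = psi_pair(s0, s1, c0, c1, 0, 0)   # equal labels: zero penalty
diff = psi_pair(s0, s1, c0, c1, 0, 1)   # both kernels penalize the change
```

With similar colors and nearby positions both kernels fire, so a label change between these two pixels costs nearly the maximum of 2; across a strong color edge the appearance term would vanish and only the short-range smoothness term would remain.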
[Slide 37: Inference]
E(X) = \sum_i \psi_i(X_i) + \sum_{i>j} \psi_{ij}(X_i, X_j)
Find the most likely assignment (MAP): \hat{x} = \arg\max_X P(X), where P(X) = \frac{1}{Z}\exp(-E(X)).
Mean field approximation: find Q(X) = \prod_i Q_i(X_i) close to P(X) in terms of the KL-divergence D(Q\|P); then \hat{x} \approx \arg\max_X Q(X).
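The mean field objective can be made concrete on a model small enough to enumerate: restrict Q to factorized distributions and evaluate D(Q||P) exactly. All numeric values below are illustrative, not from the talk:

```python
import itertools
import math

# Two variables, two labels, a strong smoothness penalty; illustrative only.
unary = [[0.0, 1.0], [1.0, 0.0]]        # node 0 prefers label 0, node 1 label 1
def energy(x):
    return unary[0][x[0]] + unary[1][x[1]] + 2.0 * (x[0] != x[1])

states = list(itertools.product([0, 1], repeat=2))
Z = sum(math.exp(-energy(x)) for x in states)
P = {x: math.exp(-energy(x)) / Z for x in states}   # exact Gibbs distribution

def kl(Q):
    # D(Q||P) = sum_X Q(X) log(Q(X)/P(X)), with Q fully factorized
    d = 0.0
    for x in states:
        q = Q[0][x[0]] * Q[1][x[1]]
        if q > 0.0:
            d += q * math.log(q / P[x])
    return d

uniform = [[0.5, 0.5], [0.5, 0.5]]
disagree = [[0.9, 0.1], [0.1, 0.9]]   # confident but fights the smoothness term
# Mean field picks the factorized Q with the smallest divergence;
# here the disagreeing Q is measurably worse than the uniform one.
```

The factorized family generally does not contain P itself; mean field settles for the closest member it can represent.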
[Slide 38]
[Slide 39: Mean field algorithm]
Initialize: Q_i(l) = \frac{1}{Z_i}\exp(-\psi_i(l))   [O(N)]
Until convergence:
  Message passing: \tilde{Q}^{(m)}_i(l) = \sum_j k^{(m)}(f_i, f_j)\,Q_j(l)   [O(N^2)]
  Compatibility transform: \hat{Q}^{(m)}_i(l') = \sum_l \mu^{(m)}(l', l)\,\tilde{Q}^{(m)}_i(l)   [O(N)]
  Local update and normalization: Q_i(l) \propto \exp\big(-\psi_i(l) - \sum_m \hat{Q}^{(m)}_i(l)\big)   [O(N)]
(\mu^{(m)} as in the compatibility table on slide 35.) Every step is linear in N except message passing, which sums over all pairs.
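The three update steps can be written down naively, which also makes the bottleneck visible: the explicit pair sum in message passing is the O(N²) step. Features, unaries, and the Potts compatibility below are illustrative values:

```python
import numpy as np

# Naive mean field for a dense CRF: message passing, compatibility
# transform, local update + normalization. Illustrative toy inputs.
rng = np.random.default_rng(0)
N, L = 6, 2                          # variables, labels
feats = rng.normal(size=(N, 2))      # features f_i (e.g. position + color)
psi_u = rng.normal(size=(N, L))      # unary potentials psi_i(l)
mu = np.array([[0., 1.], [1., 0.]])  # Potts compatibility mu(l', l)

def kernel(fi, fj):
    return np.exp(-np.sum((fi - fj) ** 2))

Q = np.exp(-psi_u)
Q /= Q.sum(axis=1, keepdims=True)    # initialize Q_i(l) ~ exp(-psi_i(l))

for _ in range(10):
    Qt = np.zeros_like(Q)            # message passing: loops over O(N^2) pairs
    for i in range(N):
        for j in range(N):
            if i != j:
                Qt[i] += kernel(feats[i], feats[j]) * Q[j]
    Qh = Qt @ mu.T                   # compatibility: sum_l mu(l', l) Qt_i(l)
    Q = np.exp(-psi_u - Qh)          # local update
    Q /= Q.sum(axis=1, keepdims=True)

labels = Q.argmax(axis=1)            # approximate MAP from the marginals
```

Everything except the double loop is a cheap per-variable operation, which is exactly what the filtering trick on the next slides exploits.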
[Slide 40: The message-passing bottleneck]
\tilde{Q}^{(m)}_i(l) = \sum_j k^{(m)}(f_i, f_j)\,Q_j(l)
[Slide 41: Mean field with high-dimensional filtering]
The same algorithm, but message passing is computed with a high-dimensional filter, making it O(N) as well: every step is then linear in the number of variables and independent of the number of pairwise terms.
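The filtering trick can be illustrated in one dimension: with purely spatial features, the message-passing sum is exactly a Gaussian convolution of the Q maps, so one blur replaces the O(N²) pair loop (for position + color features, the permutohedral lattice [2] plays the same role). All values below are illustrative:

```python
import numpy as np

# Message passing Qt_i(l) = sum_j k(f_i, f_j) Q_j(l) with f_i = position i
# is a Gaussian convolution of each column of Q: O(N) per iteration
# instead of O(N^2).
N, L, sigma = 64, 2, 3.0
pos = np.arange(N, dtype=float)

def blur(q_col):
    # truncated (3-sigma) unnormalized Gaussian convolution
    x = np.arange(-3 * int(sigma), 3 * int(sigma) + 1, dtype=float)
    k = np.exp(-x**2 / (2 * sigma**2))
    return np.convolve(q_col, k, mode='same')

rng = np.random.default_rng(1)
Q = rng.random((N, L))
Q /= Q.sum(axis=1, keepdims=True)

# Explicit O(N^2) pair sum, kept only to check the filter against it
K = np.exp(-(pos[:, None] - pos[None, :])**2 / (2 * sigma**2))
brute = K @ Q
fast = np.stack([blur(Q[:, l]) for l in range(L)], axis=1)
# brute and fast agree up to the tiny truncated Gaussian tail
```

The small discrepancy comes only from truncating the kernel at 3σ; the speedup is what makes dense-CRF mean field practical on full images.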
[Slide 42: Convergent inference for negative definite label compatibility]
[3] Parameter Learning and Convergent Inference for Dense Random Fields, Krähenbühl and Koltun, ICML 2013
[Slide 43: Plot: KL-divergence vs. number of iterations, for θα = θβ in {10, 30, 50, 70, 90}]
[Slides 44-47: Mean field iterations visualized as the marginals Q(bird), Q(sky)]
[Slide 48: Qualitative results, unary vs. grid vs. fully connected, on scenes labeled bird/water/grass, bird/water/grass/tree, car/road/tree/building/sky, and cow/grass]
[Slide 49: Quantitative results, grid vs. fully connected vs. unary]
MSRC dataset:
            TIME     GLOBAL   AVERAGE
UNARY       -        -        76.6
GRID CRF    1 s      84.6     77.2
FC CRF      0.2 s    86.0     78.3
FILTER      0.05 s   85.0     77.5
[Slide 50: Example: tree, sky, grass]
[Slide 51: Example: tree, sky, grass]
[Slide 52: Accurate ground truth vs. grid CRF vs. fully connected CRF on a tree/sky/grass scene]
MSRC, accurate annotations:
            GLOBAL       AVERAGE
UNARY       83.2 ± 1.5   80.6 ± 2.3
GRID CRF    84.8 ± 1.5   82.4 ± 1.8
FC CRF      88.2 ± 0.7   84.7 ± 0.7
[Slide 53: PASCAL VOC 2010]
            TIME    IOU ACCURACY
UNARY       -       -
GRID CRF    2.5 s   28.3
FC CRF      0.5 s   29.1
Qualitative comparisons: ground truth vs. fully connected CRF for a boat/background scene and a sheep/background scene.
[Slide 54]
[Slide 55: Fully connected CRF on a cat/background scene. "Fully connect all the things"]