SLIDE 1 Object Detection and Segmentation from Joint Embedding of Parts and Pixels
Michael Maire¹, Stella X. Yu², Pietro Perona¹
¹California Institute of Technology, Pasadena, CA 91125
²Boston College, Chestnut Hill, MA 02467
SLIDE 3 Segmentation Detection
- Perceptual Grouping Framework
SLIDE 7 Ingredients
Plug in state-of-the-art components:
◮ low-level cues: color, texture, edges [Arbeláez, Maire, Fowlkes, Malik, PAMI 2011]
◮ top-down parts: poselets for person detection [Bourdev, Maji, Brox, Malik, ECCV 2010]
PASCAL VOC 2010 Person Category: Improved Detection and Segmentation
SLIDE 8
Grouping Relationships
SLIDE 10
Pixel Affinity: Color, Texture Similarity
SLIDE 13
Part Affinity: Geometric Compatibility
SLIDE 21
[Figure: graph over pixels and parts; relationship labels: pixels, parts, surround, figure/ground prior]
Graph ⇒ Angular Embedding ⇒ figure/ground segmentation
SLIDE 27 Angular Embedding
Given:
◮ Relative ordering $\Theta(\cdot, \cdot)$
◮ Confidence on relationships $C(\cdot, \cdot)$
Compute:
◮ Global ordering $\theta(\cdot)$
◮ Embed into unit circle: $p \rightarrow z(p) = e^{i\theta(p)}$
Subject to:
◮ Linear constraints on embedding solution in columns of $U$
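SLIDE 30
[Figure: unit circle in the complex plane (marks 1, −1, i) with points $z(p)$, $z(q)$, $z(r)$, $z(q)e^{i\Theta(p,q)}$, $z(r)e^{i\Theta(p,r)}$, $\tilde{z}(p)$; edges labeled $C(p,q)$, $C(p,r)$, $\Theta(p,q)$, $\Theta(p,r)$]
minimize: $\varepsilon = \sum_{p} \frac{\sum_{q} C(p,q)}{\sum_{p,q} C(p,q)} \cdot |z(p) - \tilde{z}(p)|^2$
[Yu, PAMI 2011]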
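A small numerical sketch of the objective above. The slide text does not spell out $\tilde{z}(p)$; the code assumes it is the confidence-weighted average of the neighbor estimates $z(q)e^{i\Theta(p,q)}$ drawn in the figure, as in the cited angular embedding formulation. The function name and the toy values are illustrative only.

```python
import cmath

def angular_embedding_error(theta, C, Theta):
    """Confidence-weighted error between each z(p) and its neighbor-based estimate."""
    n = len(theta)
    z = [cmath.exp(1j * t) for t in theta]       # z(p) = e^{i theta(p)}
    total = sum(sum(row) for row in C)           # sum over p,q of C(p,q)
    err = 0.0
    for p in range(n):
        degree = sum(C[p])                       # sum over q of C(p,q)
        if degree == 0:
            continue                             # isolated node contributes nothing
        # Assumed definition of z~(p): weighted average of z(q) e^{i Theta(p,q)}.
        z_tilde = sum(C[p][q] * cmath.exp(1j * Theta[p][q]) * z[q]
                      for q in range(n)) / degree
        err += (degree / total) * abs(z[p] - z_tilde) ** 2
    return err

# Toy problem: three nodes, uniform confidence, orderings consistent with theta_true.
theta_true = [0.0, 0.5, 1.2]
n = len(theta_true)
C = [[0.0 if p == q else 1.0 for q in range(n)] for p in range(n)]
Theta = [[theta_true[p] - theta_true[q] for q in range(n)] for p in range(n)]

consistent = angular_embedding_error(theta_true, C, Theta)
perturbed = angular_embedding_error([0.0, 0.9, 1.2], C, Theta)
```

With pairwise orderings that agree with a single global ordering, the error vanishes at that ordering and grows once any node is moved away from it.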
SLIDE 31
[Figure: graph annotated with $C_p$, $C_q$, $(C_s, \Theta_s)$, $(C_f, \Theta_f)$, and $U$]
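SLIDE 32
[Block matrices; entries listed, layout not shown. Block labels: pixels, surround, prior]
$C$ blocks: $C_p$, $\alpha \cdot C_q$, $\beta \cdot C_s$, $\gamma \cdot C_f$, $\beta \cdot C_s^T$, $\gamma \cdot C_f^T$
$\Theta = \Sigma^{-1}[\,\cdot\,]$ with blocks: $\Theta_s$, $\Theta_f$, $-\Theta_s^T$, $-\Theta_f^T$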
SLIDE 33
Angular Embedding
Relax to generalized eigenproblem $QPQz = \lambda z$:
  $P = D^{-1}W$
  $Q = I - D^{-1}U(U^T D^{-1} U)^{-1}U^T$
with $D$ and $W$ defined as:
  $D = \mathrm{Diag}(C 1_n)$
  $W = C \bullet e^{i\Theta}$
Eigenvectors $\{z_0, z_1, \ldots, z_{m-1}\}$ embed pixels and parts into $\mathbb{C}^m$
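The matrices above can be assembled directly with NumPy. This is a minimal dense sketch, not the authors' implementation: it assumes `•` is the elementwise product, that every node has nonzero total confidence, and that the eigenvectors of interest are those with the largest eigenvalues. The function name, the optional `U` argument, and the toy data are illustrative.

```python
import numpy as np

def angular_embedding(C, Theta, U=None, m=2):
    """Return the m leading eigenvalues and eigenvectors (columns) of QPQ."""
    n = C.shape[0]
    d = C.sum(axis=1)                       # diagonal of D = Diag(C 1_n)
    W = C * np.exp(1j * Theta)              # W = C • e^{iΘ}, elementwise
    P = W / d[:, None]                      # P = D^{-1} W
    Q = np.eye(n, dtype=complex)
    if U is not None:
        DinvU = U / d[:, None]              # D^{-1} U
        # Q = I - D^{-1} U (U^T D^{-1} U)^{-1} U^T
        Q = Q - DinvU @ np.linalg.solve(U.T @ DinvU, U.T)
    vals, vecs = np.linalg.eig(Q @ P @ Q)
    order = np.argsort(-vals.real)          # assumed: largest eigenvalues first
    return vals[order][:m], vecs[:, order[:m]]

# Toy problem: four nodes whose pairwise orderings agree with theta_true.
theta_true = np.array([0.0, 0.4, 0.9, 1.5])
C = np.ones((4, 4)) - np.eye(4)
Theta = theta_true[:, None] - theta_true[None, :]

vals, Z = angular_embedding(C, Theta, m=2)
z0 = Z[:, 0]
recovered = np.angle(z0 * np.conj(z0[0]))   # ordering relative to node 0
```

Eigenvectors are defined only up to a complex scale, so the example reads the ordering relative to a reference node. For image-sized graphs a sparse eigensolver would be the natural substitute for the dense `np.linalg.eig` call.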
SLIDE 35
Angular Embedding
$\angle z_0$ encodes global ordering
$z_1, z_2, \ldots, z_{m-1}$ encode grouping
if $\Theta = 0$ ⇒ Normalized Cuts (grouping without ordering)
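A toy check of the $\Theta = 0$ case, assuming no constraint matrix so that the problem reduces to the eigenvectors of $P = D^{-1}W$. With zero ordering, $W$ equals $C$ and $P$ is the random-walk matrix used in the Normalized Cuts relaxation. The affinities below are made up for illustration.

```python
import numpy as np

# Two tight groups of three nodes joined by weak links (toy affinities).
C = np.full((6, 6), 0.01)
C[:3, :3] = 1.0
C[3:, 3:] = 1.0
np.fill_diagonal(C, 0.0)

Theta = np.zeros_like(C)                    # no ordering information
W = C * np.exp(1j * Theta)                  # equals C when Theta = 0
P = W / C.sum(axis=1)[:, None]              # D^{-1} W

vals, vecs = np.linalg.eig(P)
order = np.argsort(-vals.real)
z0, z1 = vecs[:, order[0]], vecs[:, order[1]]

# z0 is constant across nodes, so its angle carries no ordering.
angles = np.angle(z0 * np.conj(z0[0]))
spread = angles.max() - angles.min()

# z1 takes opposite signs on the two groups, which is the grouping signal.
signs = np.sign((z1 * np.conj(z1[0])).real)
```

Here `spread` is numerically zero and `signs` splits the nodes into the two groups, matching the slide's reading of grouping without ordering.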
SLIDE 36
Decoding Eigenvectors: Object Detection
[Figure: eigenvector panels $\Re(z_0)$, $\Re(z_1)$, $\Re(z_2)$ and $\Im(z_0)$, $\Im(z_1)$, $\Im(z_2)$; labels: Ordering, Grouping]
SLIDE 45
Decoding Eigenvectors: Figure/Ground
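[Figure: real and imaginary parts $\Re(z)$, $\Im(z)$ of eigenvectors $z_0, z_1, z_2, z_3, z_4$; derived quantities $\angle z_0$, $\nabla z_1$, $\nabla z_2$, $\nabla z_3$, $\nabla z_4$]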
SLIDE 46 Decoding Eigenvectors: Segmentation
[Figure: $\angle z_0$, $\nabla z_1$, $\nabla z_2$, $\nabla z_3$, $\nabla z_4$ derived from $\Re(z)$, $\Im(z)$]
Outputs: Figure/Ground; Hierarchical Segmentation [Arbeláez, Maire, Fowlkes, Malik, PAMI 2011]
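A rough sketch of how the per-pixel quantities named on this slide could be computed from an embedding. It assumes the pixel rows of the eigenvector matrix are stored in row-major image order, and it combines the gradients of the real and imaginary parts into one magnitude, which is a choice made here rather than something the slides specify. The function name and toy data are hypothetical.

```python
import numpy as np

def decode_features(Z, height, width):
    """Per-pixel angle of z0 and gradient magnitude of each remaining eigenvector."""
    z0 = Z[:, 0].reshape(height, width)
    ordering = np.angle(z0)                       # angle of z0 at every pixel
    grads = []
    for k in range(1, Z.shape[1]):
        zk = Z[:, k].reshape(height, width)
        gy_r, gx_r = np.gradient(zk.real)         # finite differences, rows then columns
        gy_i, gx_i = np.gradient(zk.imag)
        # Assumed combination: joint magnitude over real and imaginary parts.
        grads.append(np.sqrt(gx_r**2 + gy_r**2 + gx_i**2 + gy_i**2))
    return ordering, grads

# Toy 2x3 image: z0 carries a ramp of angles, z1 has a step along each row.
height, width = 2, 3
ramp = np.linspace(0.0, 1.0, height * width)
step = np.tile([0.0, 0.0, 1.0], height)
Z = np.stack([np.exp(1j * ramp), step.astype(complex)], axis=1)

ordering, grads = decode_features(Z, height, width)
```

The gradient map responds where the eigenvector changes, which is the kind of boundary signal a hierarchical segmentation stage could consume.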
SLIDE 47 Decoding Eigenvectors: Object Segmentation
Assign pixels $p_k$ to objects $Q_i$ via parts $q_j$:
$p_k \rightarrow \operatorname{argmin}_{Q_i,\; q_j \in Q_i} \{\mathrm{Dist}(p_k, q_j)\}$
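One reading of the rule above, sketched in plain Python: each pixel takes the label of the object that owns its nearest part. The slide does not define Dist, so Euclidean distance between embedding coordinates is assumed here. All names and values are illustrative.

```python
def assign_pixels_to_objects(pixel_coords, part_coords, part_object):
    """For each pixel p_k, return the object Q_i that owns its nearest part q_j.

    pixel_coords: embedding coordinates per pixel (sequences of complex numbers)
    part_coords:  embedding coordinates per part
    part_object:  part_object[j] is the index i of the object containing part j
    """
    def dist(a, b):
        # Assumed Dist: Euclidean distance in the embedding space.
        return sum(abs(x - y) ** 2 for x, y in zip(a, b)) ** 0.5

    labels = []
    for p in pixel_coords:
        nearest = min(range(len(part_coords)),
                      key=lambda j: dist(p, part_coords[j]))
        labels.append(part_object[nearest])
    return labels

# Toy embedding in one complex dimension: parts 0 and 1 form object 0, part 2 is object 1.
parts = [(0.0 + 0.0j,), (0.1 + 0.0j,), (1.0 + 1.0j,)]
part_object = [0, 0, 1]
pixels = [(0.02 + 0.02j,), (0.9 + 1.1j,), (0.2 - 0.1j,)]

labels = assign_pixels_to_objects(pixels, parts, part_object)
```

Taking the minimum jointly over objects and their parts picks the same object as first finding each object's closest part and then comparing objects.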
SLIDE 49
Decoding Eigenvectors
SLIDE 50
Results: PASCAL 2010 Person Category
[Figure columns: Detections, Poselet Mask, F/G Mask, Segmentation]
SLIDE 53 Results: PASCAL 2010 Person Category
◮ Segmentation task score: 41.1 (35.5 for poselet baseline)
◮ 11% relative improvement due to better detection
SLIDE 58 Summary
◮ Simultaneous segmentation and detection:
    ◮ Part detectors → figure pop-out, object grouping
    ◮ Color, texture → pixel grouping
◮ Graph:
    ◮ Parts and pixels as nodes
    ◮ Links encode multiple relationship types
◮ Embedding: graph nodes → $\mathbb{C}^m$
◮ Decode:
    ◮ Figure/ground
    ◮ Image segmentation
    ◮ Detected objects
    ◮ Segmentation of each object instance
◮ Better person detection and segmentation on PASCAL
SLIDE 59
Thank You