Convex Methods for Dense Semantic 3D Reconstruction, Christian Häne (PowerPoint presentation)



SLIDE 1

Convex Methods for Dense Semantic 3D Reconstruction

Christian Häne

Computer Vision and Geometry Group, ETHZ

May 2014

Christian Häne (ETHZ) Semantic 3D Reconstruction May 2014 1 / 45

SLIDE 2

Outline

1. Convex Multi-Label Formulation
2. Joint 3D Scene Reconstruction and Class Segmentation
3. Class Specific 3D Object Shape Priors Using Surface Normals

SLIDE 3

Outline

1. Convex Multi-Label Formulation
2. Joint 3D Scene Reconstruction and Class Segmentation
3. Class Specific 3D Object Shape Priors Using Surface Normals

  • C. Zach, C. Häne, M. Pollefeys, What Is Optimized in Convex Relaxations for Multi-Label Problems: Connecting Discrete and Continuously-Inspired MAP Inference, TPAMI 2014
  • C. Zach, C. Häne, M. Pollefeys, What Is Optimized in Tight Convex Relaxations for Multi-Label Problems?, CVPR 2012

SLIDE 4

Labeling Problems

  • Given a set of nodes (pixels, superpixels, voxels)
  • Goal: assign one out of L labels to each node
  • Local preference per node plus regularization

Energy minimization problem

  • Omnipresent in computer vision
  • Most multi-label (L > 2) instances are NP-hard

SLIDE 5

Approaches

Different solution approaches

  • Graph-Cuts
  • Belief propagation
  • Convex relaxation
  • ...

This talk: Convex relaxation only

  • Discrete domain (graphical model)
  • Continuous domain

SLIDE 6

Discrete Domain

Describe domain by a graph

  • For images, a node per pixel and edges to the neighbors

Assign one out of L labels to each node

  • $\theta_s^i$: Cost for assigning label i at node s
  • $\theta_{st}^{ij}$: Cost for assigning i at s and j at t

Find assignment that has minimal cost

[Figure: two neighboring nodes s and t connected by an edge]
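As a minimal sketch of what "minimal cost" means here, the following evaluates the cost of one labeling on a small graph. The function name, the three-node chain and the assumption that every edge shares the same pairwise cost table are illustrative choices, not part of the slides.

```python
def labeling_energy(labels, unary, pairwise, edges):
    """Cost of one labeling: sum of node costs plus sum of edge costs.

    labels[s]       label chosen at node s
    unary[s][i]     theta_s^i, cost of label i at node s
    pairwise[i][j]  theta^{ij}, cost of labels (i, j) on an edge
                    (assumed identical for every edge in this sketch)
    edges           list of node pairs (s, t)
    """
    node_cost = sum(unary[s][labels[s]] for s in range(len(labels)))
    edge_cost = sum(pairwise[labels[s]][labels[t]] for s, t in edges)
    return node_cost + edge_cost


# Hypothetical toy instance: three nodes in a chain, two labels.
unary = [[0.0, 2.0], [1.0, 1.0], [2.0, 0.0]]
pairwise = [[0.0, 1.0], [1.0, 0.0]]
edges = [(0, 1), (1, 2)]

energy_split = labeling_energy([0, 0, 1], unary, pairwise, edges)
energy_constant = labeling_energy([0, 0, 0], unary, pairwise, edges)
print(energy_split, energy_constant)
```

Enumerating all labelings this way grows exponentially with the number of nodes, which is why the relaxations on the following slides are of interest.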

SLIDE 7

LP Relaxation

LP relaxation by Schlesinger et al. 1976, review by Tomas Werner 2007

\[
\min_x \; \sum_{s,i} \theta_s^i x_s^i + \sum_{s,t} \sum_{i,j} \theta_{st}^{ij} x_{st}^{ij}
\]
\[
\text{s.t.} \quad x_s^i = \sum_j x_{st}^{ij}, \quad x_t^i = \sum_j x_{st}^{ji}, \quad \sum_i x_s^i = 1, \quad x_s^i \ge 0, \quad x_{st}^{ij} \ge 0 \quad \forall s, t, i, j
\]

  • $\theta_s^i$ and $\theta_{st}^{ij}$: cost for assigning a label or a transition
  • $x_s^i \in \{0, 1\}$ and $x_{st}^{ij} \in \{0, 1\}$: exact solution but non-convex problem
  • Relaxed to linear program: $x_s^i \in [0, 1]$ and $x_{st}^{ij} \in [0, 1]$
  • Label assignment through thresholding
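To make the constraint set concrete, the sketch below encodes a labeling as an integral point of the LP, checks the normalization and marginalization constraints, and evaluates the objective. It also evaluates one fractional point that the relaxation admits. It does not solve the LP; all function names and the toy costs are assumptions made for illustration.

```python
def lift_labeling(labels, edges, num_labels):
    """Encode a labeling as the integral LP point (x_s^i, x_st^ij)."""
    x_node = [[1.0 if labels[s] == i else 0.0 for i in range(num_labels)]
              for s in range(len(labels))]
    x_edge = {(s, t): [[x_node[s][i] * x_node[t][j] for j in range(num_labels)]
                       for i in range(num_labels)]
              for s, t in edges}
    return x_node, x_edge


def is_feasible(x_node, x_edge, tol=1e-9):
    """Check normalization, marginalization and non-negativity."""
    num_labels = len(x_node[0])
    for xs in x_node:
        if abs(sum(xs) - 1.0) > tol or min(xs) < -tol:
            return False
    for (s, t), xst in x_edge.items():
        for i in range(num_labels):
            row_sum = sum(xst[i][j] for j in range(num_labels))  # sum_j x_st^{ij}
            col_sum = sum(xst[j][i] for j in range(num_labels))  # sum_j x_st^{ji}
            if abs(row_sum - x_node[s][i]) > tol:
                return False
            if abs(col_sum - x_node[t][i]) > tol:
                return False
            if min(xst[i]) < -tol:
                return False
    return True


def lp_objective(x_node, x_edge, unary, pairwise):
    """Linear objective; the same pairwise table is assumed on every edge."""
    num_labels = len(x_node[0])
    node_term = sum(unary[s][i] * x_node[s][i]
                    for s in range(len(x_node)) for i in range(num_labels))
    edge_term = sum(pairwise[i][j] * xst[i][j]
                    for xst in x_edge.values()
                    for i in range(num_labels) for j in range(num_labels))
    return node_term + edge_term


# Hypothetical toy instance: three nodes in a chain, two labels.
unary = [[0.0, 2.0], [1.0, 1.0], [2.0, 0.0]]
pairwise = [[0.0, 1.0], [1.0, 0.0]]
edges = [(0, 1), (1, 2)]

x_node, x_edge = lift_labeling([0, 0, 1], edges, num_labels=2)
integral_value = lp_objective(x_node, x_edge, unary, pairwise)

# A fractional point that is also feasible for the relaxed problem.
half_node = [[0.5, 0.5] for _ in range(3)]
half_edge = {e: [[0.5, 0.0], [0.0, 0.5]] for e in edges}
fractional_value = lp_objective(half_node, half_edge, unary, pairwise)
print(integral_value, fractional_value)
```

For an integral point the LP objective coincides with the discrete labeling cost, so the relaxation can only lower the optimal value, never raise it.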

SLIDE 8

Metrication artifacts

Grid graph based representation

  • Smoothness cost
  • Measured by crossing edges

Inpainting example

  • Multiple equally good solutions

Penalize true boundary length: Continuous formulation
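The effect can be reproduced with a few lines. The sketch below counts the 4-neighborhood edges crossed by two different boundaries on an n × n grid: a diagonal one and an axis-aligned corner. Both receive the same grid cost, although the diagonal is clearly shorter in the Euclidean sense. The grid size, the two example regions and the approximate Euclidean length are illustrative assumptions.

```python
import math


def crossing_edges(labels):
    """Number of 4-neighborhood edges whose end points carry different labels."""
    rows, cols = len(labels), len(labels[0])
    count = 0
    for r in range(rows):
        for c in range(cols):
            if c + 1 < cols and labels[r][c] != labels[r][c + 1]:
                count += 1
            if r + 1 < rows and labels[r][c] != labels[r + 1][c]:
                count += 1
    return count


n = 8
# Region below the anti-diagonal versus region cut off by an L-shaped corner.
diagonal = [[1 if r + c >= n else 0 for c in range(n)] for r in range(n)]
corner = [[1 if r >= 1 and c >= 1 else 0 for c in range(n)] for r in range(n)]

grid_cost_diagonal = crossing_edges(diagonal)
grid_cost_corner = crossing_edges(corner)
# Rough Euclidean length of the diagonal boundary, for comparison only.
euclidean_diagonal = (n - 1) * math.sqrt(2.0)
print(grid_cost_diagonal, grid_cost_corner, round(euclidean_diagonal, 2))
```

Because the grid cost cannot distinguish these boundaries, a minimizer is free to return staircase or blocky solutions, which is the metrication artifact named on the slide.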

SLIDE 9

Continuously Inspired Formulation [Chambolle et al. 2008]

  • Domain continuous (e.g. image plane), label space discrete
  • Domain segmented into areas that have one out of L labels assigned
  • Smoothness cost $\theta^{ij}$ times boundary length between labels i and j
  • Smoothness needs to form a metric over the label space
  • Original formulation: continuous primal-dual saddle point
  • Our version: discretized pure primal formulation [Zach et al. 2012]

\[
\min_{x,y} \; \sum_{s,i} \theta_s^i x_s^i + \sum_s \sum_{i,j:i<j} \theta_s^{ij} \, \| y_s^{ij} \|_2
\]
\[
\text{s.t.} \quad \nabla x_s^i = \sum_{j:j<i} y_s^{ji} - \sum_{j:j>i} y_s^{ij}, \quad \sum_i x_s^i = 1, \quad x_s^i \ge 0 \quad \forall s, i, j
\]

SLIDE 10

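Interpretation of $y_s^{ij}$

Consider the following segmentation result

[Figure: two adjacent regions, one with $x^i = 1$ and one with $x^j = 1$; on the boundary $\nabla x_s^i = y_s^{ij}$]

Constraint

\[
\nabla x_s^i = \sum_{j:j<i} y_s^{ji} - \sum_{j:j>i} y_s^{ij}
\]

It follows $\nabla x_s^i = y_s^{ij}$ and $\nabla x_s^j = -y_s^{ij}$ (up to the sign fixed by the ordering of i and j)

$y_s^{ij}$: normal direction of the boundary between i and j at position s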
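A small numerical check of this interpretation, assuming forward differences as the discrete gradient (the slides do not fix the discretization at this point): for two complementary indicator functions the gradients cancel, are non-zero only at the boundary, and point along the boundary normal. The grid size and names are illustrative.

```python
def forward_gradient(x):
    """Forward differences (dx, dy) per pixel; zero where the neighbor is missing."""
    rows, cols = len(x), len(x[0])
    grad = []
    for r in range(rows):
        row = []
        for c in range(cols):
            dx = x[r][c + 1] - x[r][c] if c + 1 < cols else 0.0
            dy = x[r + 1][c] - x[r][c] if r + 1 < rows else 0.0
            row.append((dx, dy))
        grad.append(row)
    return grad


# Label i occupies the two left columns, label j the two right columns.
rows, cols = 3, 4
x_i = [[1.0 if c < 2 else 0.0 for c in range(cols)] for _ in range(rows)]
x_j = [[1.0 - v for v in row] for row in x_i]

grad_i = forward_gradient(x_i)
grad_j = forward_gradient(x_j)

# At the vertical boundary the two gradients are opposite and horizontal,
# i.e. aligned with the boundary normal; elsewhere both vanish.
print(grad_i[0][1], grad_j[0][1], grad_i[0][0])
```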

SLIDE 11

Extensions

Original continuously inspired formulation: only metric smoothness

  • $\theta_s^{ii} = 0$
  • $\theta_s^{ij} \le \theta_s^{ik} + \theta_s^{kj}$

Metric smoothness meaningful for e.g. image denoising

  • Not meaningful for semantic segmentation

Anisotropic smoothness sometimes desired

  • Aligning segmentation boundary direction with image edge
  • Well known for binary segmentation [Esedoglu and Osher 2004]

Goal: Formulation for non-metric and anisotropic smoothness
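The two conditions listed above are easy to test for a given cost table. The following sketch checks them for one position s; the function name and the two example tables are assumptions chosen for illustration.

```python
def is_metric_smoothness(theta, tol=1e-12):
    """Check theta^{ii} = 0 and theta^{ij} <= theta^{ik} + theta^{kj}."""
    num_labels = len(theta)
    if any(abs(theta[i][i]) > tol for i in range(num_labels)):
        return False
    return all(theta[i][j] <= theta[i][k] + theta[k][j] + tol
               for i in range(num_labels)
               for j in range(num_labels)
               for k in range(num_labels))


# Potts model: every label change costs the same, which is a metric.
potts = [[0.0, 1.0, 1.0],
         [1.0, 0.0, 1.0],
         [1.0, 1.0, 0.0]]

# Hypothetical semantic costs: the direct transition 0 -> 2 is expensive,
# but going through label 1 is cheap, so the triangle inequality fails.
semantic = [[0.0, 1.0, 5.0],
            [1.0, 0.0, 1.0],
            [5.0, 1.0, 0.0]]

print(is_metric_smoothness(potts), is_metric_smoothness(semantic))
```

A table like the second one is what semantic labels tend to require, for example when one class should rarely touch another directly, which motivates the non-metric extension.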

SLIDE 12

Anisotropic Smoothness [Esedoglu and Osher 2004]

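[Figure: boundary between the regions $x^i = 1$ and $x^j = 1$, made of two segments with lengths $\ell_1$, $\ell_2$ and normals $n_1^{ij}$, $n_2^{ij}$]

Goal: Penalize boundary length ℓ weighted by its direction n

  • Exchange $\theta_s^{ij} \| y_s^{ij} \|_2$ by $\phi_s^{ij}(y_s^{ij})$
  • $\phi_s^{ij}(\cdot): \mathbb{R}^N \to \mathbb{R}_0^+$ is a convex positively 1-homogeneous function
  • How do we specify such functions? Next slide

$n_s^{ij}$: normal of boundary between labels i and j at position s

Regularizer penalizes by boundary length times $\phi_s^{ij}(n_s^{ij})$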

SLIDE 13

Wulff Shape [Esedoglu and Osher 2004]

Specifying a function φ(·) can be hard

Wulff shape $W_\phi$

  • Convex shape
  • All possible φ(·) can be specified by $\phi(y) = \max_{\mu \in W_\phi} \mu \cdot y$
  • Defining φ(·) through $W_\phi$ often easier

[Figure: polar plot of an example Wulff shape]
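As a sketch of how a Wulff shape defines φ, the code below evaluates the support function of a convex polygon in 2D, where the maximum is always attained at a vertex. The two example shapes are assumptions chosen because their support functions are known: a square gives the L1 norm and a (finely sampled) disc approximates the L2 norm.

```python
import math


def support_function(wulff_vertices, y):
    """phi(y) = max over mu in W of <mu, y>, for W given by polygon vertices."""
    return max(mx * y[0] + my * y[1] for mx, my in wulff_vertices)


# Square [-1, 1]^2: its support function is the L1 norm.
square = [(1.0, 1.0), (1.0, -1.0), (-1.0, 1.0), (-1.0, -1.0)]

# Unit disc sampled with 360 vertices: support function close to the L2 norm.
disc = [(math.cos(math.radians(a)), math.sin(math.radians(a)))
        for a in range(360)]

y = (3.0, -4.0)
phi_square = support_function(square, y)
phi_disc = support_function(disc, y)
# Positive 1-homogeneity: scaling y by 2 scales phi by 2.
phi_square_scaled = support_function(square, (2.0 * y[0], 2.0 * y[1]))
print(phi_square, round(phi_disc, 3), phi_square_scaled)
```

Stretching or truncating the shape in some direction raises or lowers the cost of boundaries whose normal points that way, which is the sense in which defining φ through its Wulff shape is often easier.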

SLIDE 14

Non-Metric

LP-relaxation does allow for arbitrary smoothness

Continuously inspired formulation allows only for metrics

Where is the difference?

  • LP relaxation contains $x_{st}^{ij}$ variables that have to be non-negative
  • Continuously inspired formulation contains $y_s^{ij}$ that are in $[-1, 1]^N$
  • And hence, no non-negative $x_s^{ii}$

Fixed by introducing non-negative pseudo-marginals

  • Split positive and negative part of $y_s^{ij}$ into individual variables
  • $x_s^{ij} := \max\{0, y_s^{ij}\}$ and $x_s^{ji} := -\min\{0, y_s^{ij}\}$
  • $y_s^{ij} = x_s^{ij} - x_s^{ji}$ still present in the formulation
  • $x_s^{ii}$ and non-negativity constraint added

This allows non-metric smoothness for the discretized case
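The split into positive and negative parts can be written down directly. The sketch below applies it componentwise to one vector $y_s^{ij}$ and checks that the difference of the two parts gives back the original; the function name and the sample vector are illustrative.

```python
def split_signed(y):
    """Componentwise split of y into non-negative parts x_ij and x_ji."""
    x_ij = [max(0.0, c) for c in y]    # max{0, y}
    x_ji = [max(0.0, -c) for c in y]   # equals -min{0, y}
    return x_ij, x_ji


y = [0.5, -0.25, 0.0]
x_ij, x_ji = split_signed(y)
recombined = [a - b for a, b in zip(x_ij, x_ji)]
print(x_ij, x_ji, recombined)
```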

SLIDE 15

Final Convex Multi-Label Formulation

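The final formulation allows for non-metric and anisotropic smoothness at the same time

\[
\min_x \; \sum_{s,i} \theta_s^i x_s^i + \sum_s \sum_{i,j:i<j} \phi_s^{ij}\left( x_s^{ij} - x_s^{ji} \right)
\]
\[
\text{s.t.} \quad x_s^i = \sum_j \left( x_s^{ij} \right)_k, \quad x_s^i = \sum_j \left( x_{s - e_k}^{ji} \right)_k, \quad \sum_i x_s^i = 1, \quad x_s^i \ge 0, \quad x_s^{ij} \ge 0 \quad \forall s, i, j, k
\]

$e_k$: k-th canonical basis vector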

SLIDE 16

Summary

LP-Relaxation

  • Arbitrary smoothness cost
  • Metrication artifacts

Continuously inspired formulation

  • Penalizes boundary length
  • Only metric smoothness

Extensions

  • Anisotropic costs in multi-label case
  • Non-metric smoothness

SLIDE 17

Outline

1. Convex Multi-Label Formulation
2. Joint 3D Scene Reconstruction and Class Segmentation
3. Class Specific 3D Object Shape Priors Using Surface Normals

  • C. Häne, C. Zach, A. Cohen, R. Angst, M. Pollefeys, Joint 3D Scene Reconstruction and Class Segmentation, CVPR 2013

SLIDE 18

Idea

Two intrinsically ill-posed problems

  • Image segmentation
  • Dense 3D modeling

Object category influences desired surface smoothness

Optimize both jointly

SLIDE 19

Formulation

Baseline Method: Volumetric depth map fusion

  • Segmentation of a voxel space into free and occupied space: $u_s \in [0, 1]$

Our joint fusion

  • Labeling of a voxel space into L labels: $x_s^i \in [0, 1]$ and $\sum_i x_s^i = 1$
  • We use: free space, building, ground, vegetation, clutter

Convex Energy

  • Unary term: Connects image appearance and depth maps
  • Smoothness term: Dependent on surface orientation and involved labels

SLIDE 20

Energy

Objective

\[
E(x, y) = \sum_{s \in \Omega} \left( \sum_i \rho_s^i x_s^i + \sum_{i,j:i<j} \phi^{ij}(y_s^{ij}) \right)
\]

Subject to marginalization and normalization constraints

  • $x_s^i \in [0, 1]$: indicating whether label i is chosen at voxel s
  • $y_s^{ij} \in [-1, 1]^3$: represents the local surface orientation
  • $\rho_s^i$: joint unary term (from depth maps and class likelihoods)
  • $\phi^{ij}$: convex smoothness term (trained from cadastral city model)

Optimized using primal-dual algorithm [Chambolle and Pock 2011]
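To give a feel for the kind of primal-dual iteration cited above, here is a sketch on a much simpler stand-in problem: 1D total-variation denoising, min_u 0.5‖u − f‖² + λ‖Du‖₁. This is not the multi-label energy of the talk; the problem, step sizes, iteration count and all names are assumptions chosen so the example stays short and self-contained.

```python
def tv_energy(u, f, lam):
    """0.5 * ||u - f||^2 + lam * sum_k |u[k+1] - u[k]|."""
    data = 0.5 * sum((a - b) ** 2 for a, b in zip(u, f))
    smooth = lam * sum(abs(u[k + 1] - u[k]) for k in range(len(u) - 1))
    return data + smooth


def tv_denoise_1d(f, lam, iterations=500, tau=0.45, sigma=0.45):
    """Primal-dual iteration for 1D TV denoising (toy stand-in problem).

    Step sizes satisfy tau * sigma * ||D||^2 < 1, using ||D||^2 <= 4 in 1D.
    """
    n = len(f)
    u = list(f)
    u_bar = list(f)
    p = [0.0] * (n - 1)  # one dual variable per forward difference
    for _ in range(iterations):
        # Dual ascent on p, then projection onto the interval [-lam, lam].
        for k in range(n - 1):
            step = p[k] + sigma * (u_bar[k + 1] - u_bar[k])
            p[k] = max(-lam, min(lam, step))
        # Primal descent: proximal step of the quadratic data term.
        u_old = u
        u = []
        for k in range(n):
            # (D^T p)_k = p[k-1] - p[k], with zero outside the valid range.
            left = p[k - 1] if k > 0 else 0.0
            right = p[k] if k < n - 1 else 0.0
            u.append((u_old[k] - tau * (left - right) + tau * f[k]) / (1.0 + tau))
        # Over-relaxation of the primal variable.
        u_bar = [2.0 * a - b for a, b in zip(u, u_old)]
    return u


# Noisy step signal (hand-written values, no randomness).
f = [0.0, 0.2, -0.1, 0.1, 0.0, 1.0, 0.9, 1.2, 1.0, 1.1]
lam = 0.3
u = tv_denoise_1d(f, lam)
print(round(tv_energy(f, f, lam), 3), round(tv_energy(u, f, lam), 3))
```

The structure carries over in spirit: a dual variable per constraint or smoothness term is updated by an ascent step and a projection, and the primal variables by a descent step, with the projection set playing the role that the Wulff shape plays for the anisotropic terms.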

SLIDE 21

Joint Fusion: Training Overview

  • Image based classifier
  • Geometric priors

SLIDE 22

Joint Fusion: Inference Overview

Input

  • Camera poses
  • Vertical direction [Cohen et al. 2012]

[Figure: inputs combined (+) and passed to Joint Fusion (→ result)]

SLIDE 23

Appearance Likelihoods

Images classified using a boosted decision tree classifier [STAIR Vision Library, Gould et al. 2010]

  • 5 classes: sky, building, ground, vegetation, clutter
  • Negative log likelihoods $\sigma_{\text{class } i}$, per superpixel

Figure: Best cost labels
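A best cost label is simply the class with the smallest negative log likelihood. The sketch below assumes the classifier output for one superpixel is available as a list of class probabilities; that input format, the clamping constant and the sample numbers are assumptions, not details taken from the slides or from the STAIR Vision Library.

```python
import math

CLASSES = ["sky", "building", "ground", "vegetation", "clutter"]


def negative_log_likelihoods(probabilities, eps=1e-12):
    """Turn class probabilities into costs; eps guards against log(0)."""
    return [-math.log(max(p, eps)) for p in probabilities]


def best_cost_label(costs):
    """Index of the class with the smallest cost."""
    return min(range(len(costs)), key=costs.__getitem__)


# Hypothetical classifier output for a single superpixel.
probabilities = [0.05, 0.60, 0.20, 0.10, 0.05]
costs = negative_log_likelihoods(probabilities)
best = CLASSES[best_cost_label(costs)]
print(best, [round(c, 2) for c in costs])
```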

SLIDE 24

Unary Term I

Computed from depth maps and class likelihoods

[Figure: weight profiles along a ray around the surface. Volumetric fusion (depth only, weight β) + class likelihoods ($\sigma_{\text{class } 1}$, $\sigma_{\text{class } 2}$) = joint fusion (solid classes, weights β combined with $\sigma_{\text{class } 1}$, $\sigma_{\text{class } 2}$)]

SLIDE 25

Unary Term II

Cost induced for a single ray

[Figure: volumetric fusion (depth only). Labels: weight β, $u_s = 0$, $u_s = 1$, optimized surface, depth measurement, cost = −]

[Figure: joint fusion (depth and class likelihoods). Labels: weight β, $\sigma_{\text{class } 2}$, $x_s^i = 0$, $x_s^i = 1$, optimized surface, depth measurement, cost = $+\sigma_{\text{class } 2}$]

  • Cost for free-space indirect through solid classes
  • High likelihood for sky but no depth → free space preference along ray
  • Approximates true ray likelihoods faithfully for the important cases
  • Assuming only one transition occurs in the band around the measured depth

SLIDE 26

Penalization According to Normal Orientation

A vertical ground to free-space transition is penalized differently than e.g. a building to free-space transition

Convex Smoothness Term

  • $\psi^{ij}(\cdot; \theta^{ij}): S^2 \to \mathbb{R}^+$: Direction dependent cost of normal
  • Parameterized according to parameters $\theta^{ij}$
  • Cadastral 3D city model as training data
  • Maximum-likelihood estimation (MLE)

SLIDE 27

Choices for $\psi^{ij}$

$\psi^{ij}$ is the support function of a convex Wulff shape $W_{\psi^{ij}}$ [Esedoglu and Osher 2004, Zach et al. 2009]

Two shapes

  • Line Segment
  • Half-sphere plus spherical cap

Parameters of Wulff shape trained via MLE

  • Which shape
  • Shape parameters: grid search

Including relative frequency of transitions

  • Adding constant $C^{ij}$
  • Independent of normal n
  • Trained independently

\[
\phi^{ij}(n) = \psi^{ij}(n) + C^{ij}
\]

[Figure: polar plots of the two Wulff shapes]

SLIDE 28

Results

SLIDE 29

Evolution During Optimization

SLIDE 30

Advantages to Image Based Segmentation I

[Figure: three panels: Input Image, [Ladicky et al. ICCV 2009], Joint Fusion]

SLIDE 31

Advantages to Image Based Segmentation II

[Figure: three rows: Images, [Ladicky et al. ICCV 2009], Joint Fusion]

Single image segmentations not consistent

  • Ambiguous cases, e.g. ground or building

Joint fusion disambiguates

SLIDE 32

Strength of the Relaxation

Relaxation: integral solution not guaranteed

Experiments show relaxation is strong

  • Similar relaxations experimentally shown to be strong [Chambolle et al. 2008]

Figure: Slice through volume shows that relaxation is strong
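One simple way to quantify how close a relaxed solution is to an integral one is to count the voxels whose largest label value is almost 1, and to round every voxel to its largest entry. The slides only show a slice through the volume; the measure, the tolerance and the sample values below are assumptions for illustration.

```python
def round_to_labels(x):
    """Pick the label with the largest relaxed value at every voxel."""
    return [max(range(len(xs)), key=xs.__getitem__) for xs in x]


def fraction_near_integral(x, tol=0.05):
    """Share of voxels whose largest entry is within tol of 1."""
    return sum(1 for xs in x if max(xs) >= 1.0 - tol) / len(x)


# Hypothetical relaxed solution for four voxels and three labels.
relaxed = [
    [0.99, 0.01, 0.00],
    [0.02, 0.98, 0.00],
    [0.40, 0.35, 0.25],  # a genuinely fractional voxel
    [0.00, 0.00, 1.00],
]
labels = round_to_labels(relaxed)
share = fraction_near_integral(relaxed)
print(labels, share)
```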

SLIDE 33

Comparison to Volumetric Fusion

SLIDE 34

Conclusion

Framework to jointly optimize for segmentation and labeling

  • To our knowledge: first volumetric approach

Geometry and class segmentation coupled tightly

  • Previous joint methods on depth maps use height only [Ladicky et al. 2012]
  • We use priors on the normal direction

Convex energy can be optimized globally

Relaxation strong

Geometric priors trained from cadastral 3D model

Joint formulation improves best cost labeling and geometry

SLIDE 35

Outline

1. Convex Multi-Label Formulation
2. Joint 3D Scene Reconstruction and Class Segmentation
3. Class Specific 3D Object Shape Priors Using Surface Normals

  • C. Häne, N. Savinov, M. Pollefeys, Class Specific 3D Object Shape Priors Using Surface Normals, CVPR 2014

SLIDE 36

Idea

Some object classes hard to reconstruct

  • Lack of texture
  • Transparency
  • Reflection

Solution: shape prior

  • Shapes within object class similar
  • Local distribution of surface normals

SLIDE 37

Formulation

Baseline Method: Volumetric depth map fusion

  • Segmentation of a voxel space into free and occupied space: $u_s \in [0, 1]$

Shape prior formulation

  • Voxel space aligned with object of known class
  • Labeling of a voxel space into 3 labels: $x_s^i \in [0, 1]$ and $\sum_i x_s^i = 1$
  • free space, ground, object

Convex Energy

  • Unary term: Computed from depth maps, local preference for solid class
  • Smoothness term: Dependent on surface orientation, position and involved labels

SLIDE 38

Convex Energy

Objective

\[
E(x, y) = \sum_{s \in \Omega} \left( \sum_i \rho_s^i x_s^i + \sum_{i,j:i<j} \phi_s^{ij}(y_s^{ij}) \right)
\]

Subject to marginalization and normalization constraints

  • $x_s^i \in [0, 1]$: indicating whether label i is chosen at voxel s
  • $y_s^{ij} \in [-1, 1]^3$: represents the local surface orientation
  • $\rho_s^i$: unary term computed from the depth maps
  • $\phi_s^{ij}$: convex smoothness term, defines the shape prior

Optimized using primal-dual algorithm [Chambolle and Pock 2011]

SLIDE 39

Shape Prior Training

Training data, mesh models

Transformed into volumetric models

Per voxel s

  • Assemble normals of all training samples
  • Generate histogram over normal directions
  • Define anisotropic smoothness $\phi_s$ that reflects the distribution
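The per-voxel steps above can be sketched as follows for 2D normals. The slides do not state how the histogram is turned into the smoothness $\phi_s$; the negative-log mapping with additive smoothing used here is purely an illustrative assumption, as are the bin count, the function names and the sample normals.

```python
import math


def normal_histogram(normals, num_bins):
    """Count 2D normals per angular bin of width 2*pi / num_bins."""
    counts = [0] * num_bins
    for nx, ny in normals:
        angle = math.atan2(ny, nx) % (2.0 * math.pi)
        counts[int(angle / (2.0 * math.pi) * num_bins) % num_bins] += 1
    return counts


def costs_from_histogram(counts, smoothing=1.0):
    """Assumed mapping: frequent directions get a low cost, rare ones a high cost."""
    total = sum(counts) + smoothing * len(counts)
    return [-math.log((c + smoothing) / total) for c in counts]


# Hypothetical normals observed at one voxel across the training samples.
normals = [(0.3, 1.0), (0.3, 1.0), (0.3, 1.0), (1.0, 0.2)]
counts = normal_histogram(normals, num_bins=8)
costs = costs_from_histogram(counts)
print(counts, [round(c, 2) for c in costs])
```

With such per-direction costs, a surface whose normal at voxel s agrees with the training data is cheap, while an unusual orientation is penalized.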

SLIDE 40

Discrete Wulff Shape

$\phi_s(\cdot)$: support function of a Wulff shape $W_{\phi_s}$ [Esedoglu and Osher 2004]

  • Wulff shape: convex shape

Use intersection of half spaces as parameterization of $W_{\phi_s}$

  • n: half space normal
  • $d_s^n$: distance of half-space boundary to origin

We have $\phi_s(n) = d_s^n$ [Esedoglu and Osher 2004]

  • $d_s^n$ determined by training data

[Figure: half space with normal n whose boundary lies at distance $d_s^n$ from the origin]

SLIDE 41

Unary data term

[Figure: cost along a single ray. Labels: weight β, $x_s^0 = 1$, $x_s^1 = 1$, optimized surface, depth measurement, cost = −]

  • Both solid classes same data term
  • Label chosen solely based on the shape prior

SLIDE 42

Shape variations

Can we handle shape variations?

Synthetic example: Rotating object such as doors

  • Box shape trained with 32 different rotations
  • Volumetric space fixed with respect to the frame

Reconstruction

  • Box with random rotation angle
  • Synthetic sparse and noisy depth maps

SLIDE 43

Video

SLIDE 44

Conclusion

  • Shape prior based on surface normals
  • Allows for shape variation
  • Preserves expected surface details
  • Segments object from ground
  • Support points are directly inferred by the optimization

SLIDE 45

Summary

Continuously inspired multi-label segmentation formulation

  • Non-metric smoothness
  • Anisotropic smoothness

Application of the formulation

  • Joint reconstruction and segmentation
  • Shape priors

Questions?
