

slide-1
SLIDE 1

Using 3D data for image interpretation and geometric reasoning

Martial Hebert Abhinav Gupta David Fouhey, Adrien Matricon, Wajahat Hussain

slide-2
SLIDE 2
slide-3
SLIDE 3
  • Can sparse mid-level primitives be used to transfer geometric information?

  • Can this help in detection and matching tasks?

  • Can geometric reasoning use this local evidence to produce a consistent geometric interpretation?

slide-4
SLIDE 4

Primitives

Visually Discriminative

Image

Geometrically Informative

Surface Normals

Saurabh Singh et al. Discriminative Mid-Level Patches

slide-5
SLIDE 5

NYU v2 Dataset (Silberman et al., 2012)

slide-6
SLIDE 6

Learning primitives

slide-7
SLIDE 7

Representation

Instances Detector Canonical Form

slide-8
SLIDE 8
slide-9
SLIDE 9
slide-10
SLIDE 10
slide-11
SLIDE 11

Learning Primitives

Approach: iterative procedure

slide-12
SLIDE 12

Inference

Sparse Transfer …


slide-13
SLIDE 13

Inference

Sparse Transfer …

slide-14
SLIDE 14

Inference

Sparse Transfer

slide-15
SLIDE 15

Inference

Dense Transfer
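
The transfer step on the Inference slides can be illustrated with a small sketch. It is not the authors' implementation: it assumes each detection carries a window, a detector score, and a single representative unit normal (the slides show a richer "canonical form"), and the names `transfer_normals` and `min_score` are hypothetical. Raising `min_score` keeps only confident detections and leaves most pixels unlabeled (a sparse result); lowering it covers more of the image.

```python
import math

def transfer_normals(height, width, detections, min_score=0.0):
    """Paste detection normals into an image grid (illustrative only).

    detections: list of dicts with keys x, y, w, h, score, normal (3-tuple).
    Returns a height x width grid of unit normals, or None where no
    detection with score >= min_score covers the pixel.
    """
    # Accumulate score-weighted normals per pixel.
    acc = [[[0.0, 0.0, 0.0] for _ in range(width)] for _ in range(height)]
    for d in detections:
        if d["score"] < min_score:
            continue
        for r in range(max(0, d["y"]), min(height, d["y"] + d["h"])):
            for c in range(max(0, d["x"]), min(width, d["x"] + d["w"])):
                for k in range(3):
                    acc[r][c][k] += d["score"] * d["normal"][k]
    # Renormalize each accumulated vector to unit length.
    out = [[None] * width for _ in range(height)]
    for r in range(height):
        for c in range(width):
            n = math.sqrt(sum(v * v for v in acc[r][c]))
            if n > 0:
                out[r][c] = tuple(v / n for v in acc[r][c])
    return out
```

Where two detections overlap, the result is the normalized score-weighted average of their normals; this averaging rule is an assumption made for the sketch.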

slide-16
SLIDE 16

Sample Results – Qualitative

795 /654

slide-17
SLIDE 17

Confidence

Most Confident Result Least Confident Result

rank

slide-18
SLIDE 18

Failures

slide-19
SLIDE 19

Summary Stats (°) (Lower Better) and % Good Pixels (Higher Better)

Method             Mean    Median   RMSE    11.25°   22.5°   30°
3D Primitives      33.0    28.3     40.0    18.8     40.7    52.4
Karsch et al.      40.8    37.8     46.9     7.9     25.8    38.2
Hoiem et al.       41.2    34.8     49.3     9.0     31.7    43.9
Singh et al.       35.0    32.4     40.6    11.2     32.1    45.8
Saxena et al.      47.1    42.3     56.3    11.2     28.0    37.4
RF + Dense SIFT    36.0    33.4     41.7    11.4     31.1    44.2
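
The columns reported on this slide (mean, median, and RMSE of the angular error, plus the percentage of pixels under 11.25°, 22.5°, and 30°) can be computed as in the sketch below. The function names are hypothetical, and two details are assumptions rather than something the slides state: the error is taken as the angle between predicted and ground-truth normals, and a pixel counts as "good" when its error is strictly below the threshold.

```python
import math
from statistics import mean, median

def angular_error_deg(n_pred, n_true):
    """Angle in degrees between two 3D vectors (normalized internally)."""
    dot = sum(p * t for p, t in zip(n_pred, n_true))
    norm = math.sqrt(sum(p * p for p in n_pred)) * math.sqrt(sum(t * t for t in n_true))
    cos = max(-1.0, min(1.0, dot / norm))  # clamp against rounding error
    return math.degrees(math.acos(cos))

def normal_metrics(pred, true, thresholds=(11.25, 22.5, 30.0)):
    """Summary statistics over per-pixel angular errors."""
    errors = [angular_error_deg(p, t) for p, t in zip(pred, true)]
    stats = {
        "mean": mean(errors),
        "median": median(errors),
        "rmse": math.sqrt(mean([e * e for e in errors])),
    }
    for t in thresholds:
        # Percentage of pixels whose error is below threshold t.
        stats[f"good_{t}"] = 100.0 * sum(e < t for e in errors) / len(errors)
    return stats
```

The median and the thresholded percentages are less sensitive to a few badly wrong pixels than the mean and RMSE, which is one reason to report both groups.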

slide-20
SLIDE 20

More general environments?

slide-21
SLIDE 21

KITTI Dataset: Geiger, Lenz, Urtasun, ‘12

slide-22
SLIDE 22
  • Large regions without surface interpretation
  • Fewer linear/planar structures to anchor
  • Irregular distribution of 3D training data
slide-23
SLIDE 23
slide-24
SLIDE 24
slide-25
SLIDE 25

Discovered Primitives (Examples)

747/203

slide-26
SLIDE 26
slide-27
SLIDE 27

Contact points

slide-28
SLIDE 28

Object surfaces + Contact points

slide-29
SLIDE 29

Failures

slide-30
SLIDE 30

Failures

slide-31
SLIDE 31

Digression

slide-32
SLIDE 32

Style and structure

slide-33
SLIDE 33

Style vs. structure?

Tenenbaum & Freeman. Separating Style and Content with Bilinear Models. Neural Computation. 2000.

Lee, Efros, Hebert. Style-aware Mid-level Representation for Discovering Visual Connections in Space and Time. 2013.

slide-34
SLIDE 34

Casablanca Hotel, New York

slide-35
SLIDE 35
slide-36
SLIDE 36
slide-37
SLIDE 37

Meritan Apartments, Sydney

Sheraton Hotels (North America)

slide-38
SLIDE 38

Using geometric and physical constraints

slide-39
SLIDE 39

The Story So Far

slide-40
SLIDE 40

The Story So Far

slide-41
SLIDE 41

Adding Physical/Geometric Constraints

slide-42
SLIDE 42

Adding Physical/Geometric Constraints

slide-43
SLIDE 43
slide-44
SLIDE 44

Concave ( - ) Convex ( + )

Edges between surfaces

slide-45
SLIDE 45

Parameterization

vp1, vp2, vp3

slide-46
SLIDE 46

Parameterization

vp1, vp2, vp3

slide-47
SLIDE 47

Parameterization

vp1, vp2, vp3
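
Assuming "vp" on these slides denotes a vanishing point (the slides do not expand the abbreviation), the sketch below shows the standard way to estimate one from two image line segments: take the homogeneous line through each segment and intersect the two lines with a cross product. The function names are hypothetical, and this is a generic projective-geometry illustration, not the parameterization the slides go on to build.

```python
def cross(a, b):
    """Cross product of two 3-vectors."""
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )

def line_through(p, q):
    """Homogeneous line through two image points given as (x, y)."""
    return cross((p[0], p[1], 1.0), (q[0], q[1], 1.0))

def vanishing_point(seg_a, seg_b, eps=1e-12):
    """Intersection of the lines supporting two segments.

    Each segment is a pair of (x, y) endpoints. Returns (x, y), or None
    when the lines are parallel in the image (intersection at infinity).
    """
    x, y, w = cross(line_through(*seg_a), line_through(*seg_b))
    if abs(w) < eps:
        return None
    return (x / w, y / w)
```

In practice many segments vote for each vanishing point and the estimate is made robust to outliers; two segments are the minimal case.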

slide-48
SLIDE 48

Parameterization

32/64

slide-49
SLIDE 49

Parameterization

slide-50
SLIDE 50

Parameterization

slide-51
SLIDE 51

Labeling

: is cell i on?

slide-52
SLIDE 52

Unary terms

Should cell i be on?

slide-53
SLIDE 53

Binary Potentials


slide-54
SLIDE 54

Binary terms

slide-55
SLIDE 55

Binary terms

slide-56
SLIDE 56

Binary terms

slide-57
SLIDE 57

Constraints

Gurobi (branch and bound)
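
The labeling slides describe one on/off variable per cell, unary terms, binary terms, and constraints, solved with Gurobi. The sketch below shows the shape of such a problem on a toy instance. It deliberately swaps in exhaustive enumeration for Gurobi, so it is only practical for a handful of cells, and the objective form (maximize unary plus pairwise scores, with pairwise exclusion constraints) and all names are assumptions made for illustration.

```python
from itertools import product

def solve_labeling(unary, binary, conflicts):
    """Exhaustively maximize a binary quadratic objective.

    unary:     list of scores, unary[i] is gained when cell i is on.
    binary:    dict {(i, j): w}, w is gained when cells i and j are both on.
    conflicts: list of (i, j) pairs that may not both be on.
    Returns (best_assignment, best_score).
    """
    best_x, best_score = None, float("-inf")
    for x in product((0, 1), repeat=len(unary)):
        # Skip assignments that violate a hard constraint.
        if any(x[i] and x[j] for i, j in conflicts):
            continue
        score = sum(u * xi for u, xi in zip(unary, x))
        score += sum(w * x[i] * x[j] for (i, j), w in binary.items())
        if score > best_score:
            best_x, best_score = x, score
    return best_x, best_score
```

A solver based on branch and bound explores the same search space but prunes assignments whose relaxed bound cannot beat the best solution found so far, which is what makes realistic problem sizes tractable.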

slide-58
SLIDE 58

Qualitative Results

Projected 3D Primitives 3D Primitives Proposed Input Ground Truth

slide-59
SLIDE 59

Projected 3D Primitives 3D Primitives Proposed Input Ground Truth

slide-60
SLIDE 60

Random Qualitative Results

Proposed 3D Primitives

slide-61
SLIDE 61

Quantitative Results

Summary Stats (°) (Lower Better) and % Good Pixels (Higher Better)

Method           Mean    Median   RMSE    11.25°   22.5°   30°
Proposed         37.5    17.2     53.2    41.9     53.9    58.0
3D Primitives    38.5    19.0     54.2    41.7     52.4    56.3
Hedau et al.     43.2    24.8     59.4    39.1     48.8    52.3
Lee et al.       47.6    43.4     60.6    28.1     39.7    43.9
Karsch et al.    46.6    43.0     53.6     5.4     19.9    31.5
Hoiem et al.     45.6    38.2     55.1     8.6     30.5    41.0

slide-62
SLIDE 62
slide-63
SLIDE 63

Now:

  • Better reasoning
  • Semantic information
  • Less structured environments
  • Coarse-to-fine depth

slide-64
SLIDE 64

Martial Hebert Abhinav Gupta David Fouhey, Adrien Matricon, Wajahat Hussain

ONR MURI, NDSEG, Bosch R&D