Geometry Beyond 3D, Noah Snavely (Google Inc., Cornell University)




slide-1
SLIDE 1

Geometry Beyond 3D

Noah Snavely Google Inc., Cornell University

Bay Area Vision Meeting, 2014

slide-2
SLIDE 2

Are we done with 3D modeling?

  • Huge progress in the last 10 years

[Pollefeys et al. IJCV04] [Snavely et al. SIGGRAPH06] [Zhou & Koltun, SIGGRAPH14] Aerial models

slide-3
SLIDE 3

Are we done with 3D modeling?

[Klingner et al., ICCV 2013] [Agarwal et al. ICCV 2009]

slide-4
SLIDE 4

Are we done with 3D modeling?

  • Not until we have a fully realistic, editable, semantically meaningful model of the entire world
  • Realistic = correct geometry, materials, lighting; high-resolution; dynamic
  • In other words, a model you can feed into your holodeck

See also the Visual Turing Test [Shan et al., 3DV 2013]

slide-5
SLIDE 5

Times Square

slide-6
SLIDE 6

What are the key challenges?

  • Scale – we have made great progress here
  • Robustness
  • Time
  • Materials
  • Semantics / grounding
  • My own biased view
slide-7
SLIDE 7

Robustness

slide-8
SLIDE 8

Are two things the same?

  • How do we know whether what we are looking at is the same or different?

slide-9
SLIDE 9

Structural similarities break SfM

slide-10
SLIDE 10

Structural similarities break SfM

slide-11
SLIDE 11

Other examples

  • St. Paul’s Cathedral
  • Notre Dame Cathedral

slide-12
SLIDE 12

Tracks should contain one 3D point

slide-13
SLIDE 13

Tracks can conflate distinct points

slide-14
SLIDE 14

SfM Disambiguation

  • Most methods reason about inconsistencies across many images
  • Inconsistencies in
    – Loops of pairwise geometries
    – Visibility
    – Sequencing
    – Global geometry

[Zach et al., CVPR 2008], [Zach et al., CVPR 2010], [Roberts et al., CVPR 2011], [Jiang et al., CVPR 2012]
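As one illustration of the "loops of pairwise geometries" cue, here is a minimal sketch of a loop-consistency check on relative rotations. It assumes the convention that `R_ab` maps camera a's frame to camera b's frame, so chaining the rotations around a loop a → b → c → a should give the identity. The helper names (`rot_z`, `loop_error_deg`) are made up for this example, and it is not the exact test used in any of the cited papers.

```python
import math

def rot_z(deg):
    """3x3 rotation about the z axis (enough for a toy example)."""
    c, s = math.cos(math.radians(deg)), math.sin(math.radians(deg))
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def matmul(A, B):
    """Plain 3x3 matrix product."""
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def loop_error_deg(R_ab, R_bc, R_ca):
    """Angle (degrees) by which the chained loop rotation differs from identity."""
    M = matmul(R_ca, matmul(R_bc, R_ab))
    cos_angle = (M[0][0] + M[1][1] + M[2][2] - 1.0) / 2.0
    cos_angle = max(-1.0, min(1.0, cos_angle))  # guard against rounding
    return math.degrees(math.acos(cos_angle))

# Consistent loop: 10 + 20 - 30 degrees returns to the start.
consistent = loop_error_deg(rot_z(10), rot_z(20), rot_z(-30))
# Inconsistent loop: one pairwise rotation is wrong (e.g. a false match).
inconsistent = loop_error_deg(rot_z(10), rot_z(20), rot_z(-10))
```

A large loop error says that at least one pairwise geometry in the loop is wrong, but not which one. This is why such methods reason over many loops.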

slide-15
SLIDE 15

SfM Disambiguation in the Large

  • We wanted a solution that was
    – As simple as possible
    – Scalable to huge image collections
  • Intuition: visibility of points is (often) transitive

[Wilson & Snavely, Network Principles for SfM. ICCV 2013]

slide-16
SLIDE 16

Graph topology is a cue for ambiguities

Schematic of a scene with an ambiguous feature (in red). Note that the two sides of the scene have different backgrounds (blue and green). [Wilson & Snavely, Network Principles for SfM. ICCV 2013]

slide-17
SLIDE 17

Graph topology is a cue for ambiguities

This structure can be seen in the visibility graph [Wilson & Snavely, Network Principles for SfM. ICCV 2013]

slide-18
SLIDE 18

Larger example

Bad tracks have more than one cluster of context. Measure this with the bipartite local clustering coefficient.

slide-19
SLIDE 19

Larger example

Bad tracks have more than one cluster of context. Measure this with the bipartite local clustering coefficient.

slide-20
SLIDE 20

blcc is analogous to the local clustering coefficient
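To make the analogy concrete, here is a minimal sketch of one bipartite clustering coefficient on a camera/track visibility graph. It assumes the two-mode-network definition in which the score of a track is the fraction of 4-paths centered on it (track, camera, center track, camera, track) that close into a 6-cycle. Whether this matches the exact formula in the paper is an assumption, and the toy graph and function name are hypothetical.

```python
from itertools import combinations

def blcc(track, track_cams, cam_tracks):
    """Fraction of 4-paths centered at `track` that close into a 6-cycle."""
    paths = closed = 0
    for cam_b, cam_c in combinations(sorted(track_cams[track]), 2):
        for a in cam_tracks[cam_b] - {track}:
            for d in cam_tracks[cam_c] - {track}:
                if a == d:
                    continue  # the five path nodes must be distinct
                paths += 1
                # Closed if the two end tracks share a camera outside the path.
                if (track_cams[a] & track_cams[d]) - {cam_b, cam_c}:
                    closed += 1
    # Undefined when there are no 4-paths; return 0.0 by convention here.
    return closed / paths if paths else 0.0

# Toy scene: side A (tracks p, q) and side B (tracks r, s) never appear
# together, but the ambiguous track x is "seen" from both sides.
cam_tracks = {
    "A1": {"p", "q", "x"}, "A2": {"p", "q", "x"}, "A3": {"p", "q", "x"},
    "B1": {"r", "s", "x"}, "B2": {"r", "s", "x"}, "B3": {"r", "s", "x"},
}
track_cams = {}
for cam, tracks in cam_tracks.items():
    for t in tracks:
        track_cams.setdefault(t, set()).add(cam)

score_good = blcc("p", track_cams, cam_tracks)  # one cluster of context
score_bad = blcc("x", track_cams, cam_tracks)   # two clusters of context
```

In this toy graph the good track scores 1.0 and the ambiguous track scores 0.25, because paths that cross from side A to side B through x never close.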

slide-21
SLIDE 21

Filtering by blcc removes bad tracks

Algorithm:

  • 1. Compute a covering subgraph
  • 2. Compute blcc for each track
  • 3. Remove tracks whose blcc is lower than a threshold. Use the lowest threshold that separates the graph into a user-predetermined number of components.
  • 4. Reconstruct each component separately
  • 5. Rigidly merge components if possible

ROC curve for classifying bad tracks. Solid line: thresholding tracks on blcc. Dotted line: same, but on a more uniform subgraph.
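The threshold selection in step 3 can be sketched as follows. This assumes per-track blcc scores are already available (the values below are made up for a toy graph), counts components over cameras linked by surviving tracks, and treats "separates the graph into k components" as "at least k". The covering subgraph, reconstruction, and merging steps are not shown, and the function names are hypothetical.

```python
from collections import defaultdict, deque

def camera_components(cam_tracks, kept):
    """Count groups of cameras connected through the kept tracks."""
    track_cams = defaultdict(set)
    for cam, tracks in cam_tracks.items():
        for t in tracks:
            if t in kept:
                track_cams[t].add(cam)
    seen, count = set(), 0
    for start in cam_tracks:
        if start in seen:
            continue
        count += 1
        seen.add(start)
        queue = deque([start])
        while queue:  # breadth-first search over camera-track-camera links
            cam = queue.popleft()
            for t in cam_tracks[cam]:
                for other in track_cams.get(t, ()):
                    if other not in seen:
                        seen.add(other)
                        queue.append(other)
    return count

def lowest_separating_threshold(cam_tracks, scores, k):
    """Lowest score threshold whose removal of lower-scoring tracks yields >= k components."""
    for thresh in sorted(set(scores.values())):
        kept = {t for t, s in scores.items() if s >= thresh}
        if camera_components(cam_tracks, kept) >= k:
            return thresh, kept
    return None, set(scores)  # no threshold among the scores separates the graph

cam_tracks = {
    "A1": {"p", "q", "x"}, "A2": {"p", "q", "x"}, "A3": {"p", "q", "x"},
    "B1": {"r", "s", "x"}, "B2": {"r", "s", "x"}, "B3": {"r", "s", "x"},
}
scores = {"p": 1.0, "q": 1.0, "r": 1.0, "s": 1.0, "x": 0.25}  # hypothetical blcc values
thresh, kept = lowest_separating_threshold(cam_tracks, scores, k=2)
```

Here the ambiguous track x is the only link between the two sides, so dropping it splits the cameras into two groups that can be reconstructed separately.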

slide-22
SLIDE 22

Disambiguation results

Sacre Coeur Basilica, Paris

slide-23
SLIDE 23

Disambiguation results

Notre Dame Cathedral, Paris

Before After

slide-24
SLIDE 24

Disambiguation results

Seville Cathedral

slide-25
SLIDE 25

Disambiguation results

Outside the Louvre, Paris

slide-26
SLIDE 26

Network Principles for SfM

+ Extremely fast method
+ Based on simple local reasoning
+ Very simple to implement
– Can sometimes oversegment models
– Theoretical guarantees?

See also [Heinly et al. ECCV 2014]

slide-27
SLIDE 27

Feature matching as recognition

  • Can’t we just solve this problem using appearance alone?

  • Better features or image metrics?
slide-28
SLIDE 28

Time

slide-29
SLIDE 29

Places are dynamic

slide-30
SLIDE 30

5pointz, Queens

slide-31
SLIDE 31

5pointz

[Graffiti Archaeology, Cassidy Curtis]

How do we model these time-varying scenes?

slide-32
SLIDE 32

4D Cities

[Frank Dellaert, Grant Schindler, et al.]

slide-33
SLIDE 33

Scene Chronology

Step 1: Download photos from Flickr

Step 2: Reconstruct a single 3D model with all times mixed up together

Step 3: Recover the chronology of the scene

Kevin Matzen and Noah Snavely, Scene Chronology, ECCV 2014 Best Paper Award Winner

slide-34
SLIDE 34

Per-Point Time Observations

Single 3D Model (from ~100,000 images)

slide-35
SLIDE 35
slide-36
SLIDE 36
slide-37
SLIDE 37

Space-Time Point Clustering

Exploded View across Time

slide-38
SLIDE 38
slide-39
SLIDE 39

Re-time-stamping

Blue: original timestamp. Red: our predicted timestamp.
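A deliberately naive sketch of the idea behind per-point time observations and re-time-stamping, not the method from the Scene Chronology paper. It assumes each 3D point exists during the span between its earliest and latest observing photo, and that a photo's plausible capture time lies in the intersection of the spans of the points it sees. All names and numbers below are hypothetical, and a real system would also need to handle wrong timestamps and negative observations.

```python
def point_interval(obs_times):
    """Naive existence interval of a 3D point: earliest to latest observation."""
    return min(obs_times), max(obs_times)

def retimestamp(photo_points, intervals):
    """Intersect the intervals of all points a photo sees; None if they conflict."""
    lo = max(intervals[p][0] for p in photo_points)
    hi = min(intervals[p][1] for p in photo_points)
    return (lo, hi) if lo <= hi else None

# Hypothetical observation times (fractional years) for three scene points.
observations = {
    "mural_a": [2008.1, 2009.5, 2010.0],
    "mural_b": [2009.8, 2011.2],
    "wall": [2007.0, 2012.0],
}
intervals = {p: point_interval(ts) for p, ts in observations.items()}

# A photo with an untrusted timestamp that shows mural_a and mural_b together
# must have been taken while both existed.
predicted = retimestamp(["mural_a", "mural_b"], intervals)
```

With these numbers the predicted window is 2009.8 to 2010.0, the only period in which both murals were observed.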

slide-40
SLIDE 40

Eisenstaedt, 1945; Times Square, 1922; People; Physics; Weather

slide-41
SLIDE 41

Materials

slide-42
SLIDE 42
slide-43
SLIDE 43
slide-44
SLIDE 44

Sean Bell, Paul Upchurch, Noah Snavely, Kavita Bala, SIGGRAPH 2013, http://opensurfaces.cs.cornell.edu/

slide-45
SLIDE 45
slide-46
SLIDE 46

Sean Bell, Kavita Bala, Noah Snavely, SIGGRAPH 2014, http://intrinsic.cs.cornell.edu

slide-47
SLIDE 47

Semantics / Grounding

slide-48
SLIDE 48

Every image tells a story…

José Luis Murillo; Vivienne Gucwa

slide-49
SLIDE 49

Grounding vision in the world

OpenStreetMap, 3D city models, weather data, bus schedules

slide-50
SLIDE 50

https://nycopendata.socrata.com (https://data.sfgov.org/, https://data.seattle.gov/, …)

slide-51
SLIDE 51

Grounding vision in the world

  • Which direction is north?
  • What is the shape of the buildings?
  • What was the weather like?
  • Where are streets?
  • What is the #51 bus schedule in Rome?

Goal: Integrate images into this ecosystem of geographic data

slide-52
SLIDE 52

First steps: NYC3DCars

[Kevin Matzen and Noah Snavely, ICCV 2013]

slide-53
SLIDE 53

NYCOpenData Roadbeds

slide-54
SLIDE 54

Vision grounded in the real world

Input photo; overlaid GIS data (roads / sidewalks / medians); overlaid Google Earth models
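Overlaying georegistered data on a photo comes down to projecting world-space points through the photo's camera. The sketch below assumes a simple pinhole model with a known rotation `R`, translation `t`, focal length `f`, and principal point `(cx, cy)`. These parameter names and values are illustrative, not taken from the NYC3DCars pipeline, and lens distortion is ignored.

```python
def project(point_world, R, t, f, cx, cy):
    """Project a 3D world point to pixel coordinates with a pinhole camera."""
    # World to camera coordinates: Xc = R * Xw + t
    xc = [sum(R[i][j] * point_world[j] for j in range(3)) + t[i]
          for i in range(3)]
    if xc[2] <= 0:
        return None  # behind the camera, not visible
    # Perspective division, then shift by the principal point.
    return (f * xc[0] / xc[2] + cx, f * xc[1] / xc[2] + cy)

identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
# A hypothetical road-boundary vertex 10 m in front of the camera.
pixel = project([1.0, 2.0, 10.0], identity, [0.0, 0.0, 0.0],
                f=1000.0, cx=640.0, cy=360.0)
```

Projecting every vertex of a roadbed polygon this way and connecting the results gives the kind of overlay shown on the slide, provided the camera pose is expressed in the same geographic frame as the GIS data.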

slide-55
SLIDE 55

Annotated 3D Vehicles

slide-56
SLIDE 56

Video

slide-57
SLIDE 57

3D Detection

slide-58
SLIDE 58

Ground coverage score, elevation score, 3D orientation score, appearance score

slide-59
SLIDE 59

Results

Precision / Recall; Orientation similarity / Recall

slide-60
SLIDE 60

http://nyc3d.cs.cornell.edu/

slide-61
SLIDE 61

Summary

  • Many interesting challenges in modeling the world
  • Contributions from every area (cf. much wonderful recent work):
    – Scene understanding, object detection, material recognition, illumination modeling, …
    – Learning?

slide-62
SLIDE 62

Acknowledgements

  • National Science Foundation
  • Intel Center for Science and Technology – Visual Computing
  • Amazon AWS for Education

Students and collaborators: Daniel Hauagge, Sean Bell, Song Cao, Chun-Po Wang, Kyle Wilson, Scott Wehrwein, Kevin Matzen, Paul Upchurch, Yunpeng Li, Kavita Bala, Dan Huttenlocher, Dave Crandall

slide-63
SLIDE 63

Thank you!

More information at http://www.cs.cornell.edu/~snavely/