Geometry Beyond 3D Noah Snavely Google Inc., Cornell University

Mar 06, 2023

SLIDE 1

Geometry Beyond 3D

Noah Snavely Google Inc., Cornell University

Bay Area Vision Meeting, 2014

SLIDE 2

Are we done with 3D modeling?

Huge progress in the last 10 years

[Pollefeys et al. IJCV04] [Snavely et al. SIGGRAPH06] [Zhou & Koltun, SIGGRAPH14]

Aerial models

SLIDE 3

Are we done with 3D modeling?

[Klingner et al., ICCV 2013] [Agarwal et al. ICCV 2009]

SLIDE 4

Are we done with 3D modeling?

Not until we have a fully realistic, editable, semantically meaningful model of the entire world

Realistic = correct geometry, materials, lighting; high-resolution; dynamic

In other words, a model you can feed into your holodeck

See also the Visual Turing Test [Shan et al., 3DV 2013]

SLIDE 5

Times Square

SLIDE 6

What are the key challenges?

Scale – we have made great progress here
Robustness
Time
Materials
Semantics / grounding
My own biased view

SLIDE 7

Robustness

SLIDE 8

Are two things the same?

How do we know what we are looking at is the same or different?

SLIDE 9

Structural similarities break SfM

SLIDE 10

Structural similarities break SfM

SLIDE 11

Other examples

St. Paul’s Cathedral

Notre Dame Cathedral

SLIDE 12

Tracks should contain one 3D point

SLIDE 13

Tracks can conflate distinct points
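
A minimal sketch of how conflated tracks can show up in practice, assuming features are identified by hypothetical (image_id, feature_id) pairs and that tracks are formed by chaining pairwise matches. One simple symptom is a track containing two different features from the same image, since a single 3D point cannot appear twice in one photo. This check is only illustrative and is not the method presented in the talk.

```python
from collections import defaultdict


def build_tracks(matches):
    """Group pairwise feature matches into tracks with union-find.

    Each feature is a hypothetical (image_id, feature_id) tuple and
    `matches` is an iterable of (feature_a, feature_b) pairs.
    """
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for a, b in matches:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_a] = root_b

    groups = defaultdict(set)
    for feature in list(parent):
        groups[find(feature)].add(feature)
    return list(groups.values())


def is_inconsistent(track):
    """True if the track holds two features from the same image."""
    images = [image_id for image_id, _ in track]
    return len(images) != len(set(images))


if __name__ == "__main__":
    matches = [
        (("A", 1), ("B", 7)),
        (("B", 7), ("C", 3)),
        (("C", 3), ("A", 9)),  # chains back to a second feature in image A
        (("D", 0), ("E", 0)),
    ]
    for track in build_tracks(matches):
        print(sorted(track), "inconsistent" if is_inconsistent(track) else "ok")
```

Tracks that pass this check can still conflate distinct points when the two look-alike structures never appear in the same image, which is the harder case the following slides address.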

SLIDE 14

SfM Disambiguation

Most methods reason about inconsistencies across many images

Inconsistencies in

– Loops of pairwise geometries
– Visibility
– Sequencing
– Global geometry

[Zach et al., CVPR 2008], [Zach et al., CVPR 2010], [Roberts et al., CVPR 2011], [Jiang et al., CVPR 2012]
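
To illustrate the "loops of pairwise geometries" cue, here is a small sketch of a loop-consistency check on relative rotations. It assumes each pairwise rotation maps coordinates from one camera to the next around the loop, so that chaining them should return the identity. The helper names are hypothetical, and this is a simplification of the general idea rather than the algorithm of any of the cited papers.

```python
import math


def matmul(a, b):
    """Multiply two 3x3 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def rot_z(degrees):
    """Rotation about the z axis, used here only to build test data."""
    t = math.radians(degrees)
    c, s = math.cos(t), math.sin(t)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]


def loop_error_deg(rotations):
    """Angle in degrees by which the chained rotations differ from identity.

    `rotations` lists the relative rotations in the order they are
    traversed around the loop (assumed convention).
    """
    total = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
    for r in rotations:
        total = matmul(r, total)
    cos_angle = (total[0][0] + total[1][1] + total[2][2] - 1.0) / 2.0
    cos_angle = max(-1.0, min(1.0, cos_angle))  # guard against rounding
    return math.degrees(math.acos(cos_angle))


if __name__ == "__main__":
    print(loop_error_deg([rot_z(30), rot_z(50), rot_z(-80)]))  # close to 0
    print(loop_error_deg([rot_z(30), rot_z(50), rot_z(-60)]))  # about 20
```

A loop with a large residual angle suggests that at least one of its pairwise geometries is wrong, for example because it matched two different but similar-looking structures.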

SLIDE 15

SfM Disambiguation in the Large

We wanted a solution that was

– As simple as possible
– Scalable to huge image collections

Intuition: visibility of points is (often) transitive

[Wilson & Snavely, Network Principles for SfM. ICCV 2013]

SLIDE 16

Graph topology is a cue for ambiguities

Schematic of a scene with an ambiguous feature (in red). Note that the two sides of the scene have different backgrounds (blue and green).

[Wilson & Snavely, Network Principles for SfM. ICCV 2013]

SLIDE 17

Graph topology is a cue for ambiguities

This structure can be seen in the visibility graph

[Wilson & Snavely, Network Principles for SfM. ICCV 2013]

SLIDE 18

Larger example

Bad tracks have more than one cluster of context. Measure this with the bipartite local clustering coefficient (blcc).

SLIDE 19

Larger example

Bad tracks have more than one cluster

f context. Measure this with the

bipartite local clustering coefficient.

SLIDE 20

blcc is analogous to the local clustering coefficient
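
A rough sketch of one way to compute a bipartite local clustering coefficient on an image/track visibility graph. The slides do not give the formula, so the definition used here is an assumption: one minus the ratio of distinct two-step neighbors to the number of two-step neighbors that would exist if none were shared. Check the ICCV 2013 paper for the exact definition. The function and node names are hypothetical.

```python
from collections import defaultdict


def build_visibility_graph(observations):
    """Build a bipartite graph from (image, track) visibility pairs."""
    graph = defaultdict(set)
    for image, track in observations:
        graph[("image", image)].add(("track", track))
        graph[("track", track)].add(("image", image))
    return graph


def blcc(graph, node):
    """Assumed definition: 1 - |distinct 2-step neighbors| / |possible ones|.

    A node whose neighbors share most of their other neighbors scores
    high; a node bridging unrelated contexts scores low.
    """
    second_step = set()
    possible = 0
    for neighbor in graph[node]:
        others = graph[neighbor] - {node}
        second_step |= others
        possible += len(others)
    if possible == 0:
        return 0.0  # undefined in this case; 0.0 is an arbitrary choice
    return 1.0 - len(second_step) / possible


if __name__ == "__main__":
    observations = [
        ("I1", "good"), ("I1", "a"), ("I1", "b"), ("I1", "bad"),
        ("I2", "good"), ("I2", "a"), ("I2", "b"),
        ("I3", "c"), ("I3", "d"), ("I3", "bad"),
        ("I4", "c"), ("I4", "d"),
    ]
    graph = build_visibility_graph(observations)
    print(blcc(graph, ("track", "good")))
    print(blcc(graph, ("track", "bad")))
```

In this toy graph the track named "bad" is seen alongside two disjoint sets of other tracks, so it scores lower than "good", whose images share the same context.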

SLIDE 21

Filtering by blcc removes bad tracks

ROC curve for classifying bad tracks. Solid line: thresholding tracks on blcc. Dotted line: same, but on a more uniform subgraph.

Algorithm:

1. Compute a covering subgraph
2. Compute blcc for each track
3. Remove tracks with blcc lower than a threshold. Use the lowest threshold that separates the graph into a user-predetermined number of components.
4. Reconstruct each component separately
5. Rigidly merge components if possible
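
The threshold rule in the algorithm above can be sketched as follows, assuming per-track blcc scores are already available and that "components" means groups of images still linked by at least one kept track. The data layout and function names are hypothetical, and the covering-subgraph, reconstruction, and merging steps are omitted.

```python
def count_image_components(image_tracks, kept_tracks):
    """Count groups of images connected through shared kept tracks."""
    parent = {image: image for image in image_tracks}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    first_image = {}  # track -> first image in which it was seen
    for image, tracks in image_tracks.items():
        for track in tracks:
            if track not in kept_tracks:
                continue
            if track in first_image:
                parent[find(image)] = find(first_image[track])
            else:
                first_image[track] = image
    return len({find(image) for image in image_tracks})


def lowest_separating_threshold(image_tracks, scores, target_components):
    """Lowest score threshold whose filtering yields enough components.

    Tracks scoring below the threshold are removed. Returns None if no
    candidate threshold reaches `target_components`.
    """
    for threshold in sorted(set(scores.values())):
        kept = {t for t, s in scores.items() if s >= threshold}
        if count_image_components(image_tracks, kept) >= target_components:
            return threshold
    return None


if __name__ == "__main__":
    image_tracks = {
        "I1": {"good", "a", "b", "bad"},
        "I2": {"good", "a", "b"},
        "I3": {"c", "d", "bad"},
        "I4": {"c", "d"},
    }
    # Illustrative scores only, not results from the talk.
    scores = {"good": 0.4, "a": 0.4, "b": 0.4, "c": 0.5, "d": 0.5, "bad": 0.0}
    print(lowest_separating_threshold(image_tracks, scores, 2))
```

Sweeping candidate thresholds from low to high keeps as many tracks as possible while still splitting the graph, which matches the "lowest threshold" wording on the slide.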

SLIDE 22

Disambiguation results

Sacre Coeur Basilica, Paris

SLIDE 23

Disambiguation results

Notre Dame Cathedral, Paris

Before After

SLIDE 24

Disambiguation results

Seville Cathedral

SLIDE 25

Disambiguation results

Outside the Louvre, Paris

SLIDE 26

Network Principles for SfM

+ Extremely fast method
+ Based on simple local reasoning
+ Very simple to implement

Can sometimes oversegment models
Theoretical guarantees?

SLIDE 27

Feature matching as recognition

Can’t we just solve this problem using appearance alone?

Better features or image metrics?

SLIDE 28

Time

SLIDE 29

Places are dynamic

SLIDE 30

5pointz, Queens

SLIDE 31

5pointz

[Graffiti Archaeology, Cassidy Curtis]

How do we model these time-varying scenes?

SLIDE 32

4D Cities

[Frank Dellaert, Grant Schindler, et al.]

SLIDE 33

Scene Chronology

Step 1: Download photos from Flickr
Step 2: Reconstruct a single 3D model with all times mixed up together
Step 3: Recover the chronology of the scene

Kevin Matzen and Noah Snavely, Scene Chronology, ECCV 2014 Best Paper Award Winner
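
To make Step 3 concrete, here is a toy sketch of estimating when a single 3D point existed from timestamped observations. It assumes each observation records whether the point was seen (positive) or should have been visible but was not (negative), and it picks the interval that agrees with the most observations by brute force. This is an illustrative simplification under those assumptions, not the method of the Scene Chronology paper, and all names are hypothetical.

```python
def best_interval(observations):
    """Pick the (start, end) interval most consistent with observations.

    `observations` is a list of (timestamp, seen) pairs for one 3D point.
    An observation agrees with an interval if it is positive and inside,
    or negative and outside. Candidate endpoints are the observed
    timestamps; ties keep the first interval found.
    """
    if not observations:
        raise ValueError("need at least one observation")
    times = sorted({t for t, _ in observations})
    best = None
    for i, start in enumerate(times):
        for end in times[i:]:
            agree = sum((start <= t <= end) == seen
                        for t, seen in observations)
            if best is None or agree > best[0]:
                best = (agree, start, end)
    return best[1], best[2]


if __name__ == "__main__":
    observations = [
        (2005, False),
        (2007, True),
        (2008, True),
        (2010, True),
        (2012, False),
    ]
    print(best_interval(observations))
```

The brute-force search is cubic in the number of observations, which is fine for a sketch but would need a smarter formulation at the scale of a model built from many thousands of photos.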

SLIDE 34

Per-Point Time Observations

Single 3D Model (from ~100,000 images)

SLIDE 35

SLIDE 36

SLIDE 37

Space-Time Point Clustering

Exploded View across Time

SLIDE 38

SLIDE 39

Re-time-stamping

Blue: original timestamp
Red: our predicted timestamp

SLIDE 40

Eisenstadt, 1945
Times Square, 1922

People
Physics
Weather

SLIDE 41

Materials

SLIDE 42

SLIDE 43

SLIDE 44

Sean Bell, Paul Upchurch, Noah Snavely, Kavita Bala, SIGGRAPH 2013 http://opensurfaces.cs.cornell.edu/

SLIDE 45

SLIDE 46

Sean Bell, Kavita Bala, Noah Snavely, SIGGRAPH 2014, http://intrinsic.cs.cornell.edu

SLIDE 47

Semantics / Grounding

SLIDE 48

Every image tells a story…

José Luis Murillo
Vivienne Gucwa

SLIDE 49

Grounding vision in the world

OpenStreetMap
3D city models
Weather data
Bus schedules

SLIDE 50

https://nycopendata.socrata.com (https://data.sfgov.org/, https://data.seattle.gov/, …)

SLIDE 51

Grounding vision in the world

Which direction is north?
What is the shape of the buildings?
What was the weather like?
Where are streets?
What is the #51 bus schedule in Rome?

Goal: Integrate images into this ecosystem of geographic data

SLIDE 52

First steps: NYC3DCars

[Kevin Matzen and Noah Snavely, ICCV 2013]

SLIDE 53

NYCOpenData Roadbeds

SLIDE 54

Vision grounded in the real world

Input photo
Overlaid GIS data (roads / sidewalks / medians)
Overlaid Google Earth models
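
Overlaying geographic data on a photo comes down to projecting 3D points through the photo's camera. A minimal pinhole-projection sketch, assuming the camera pose (rotation and translation) and intrinsics are already known, for example from a georegistered reconstruction, and ignoring lens distortion. The parameter names are hypothetical and this is not the NYC3DCars pipeline itself.

```python
def project(point, rotation, translation, focal, cx, cy):
    """Project a 3D world point to pixel coordinates.

    `rotation` is a 3x3 nested list and `translation` a 3-vector that
    map world coordinates into the camera frame (assumed convention).
    Returns None if the point lies behind the camera.
    """
    cam = [sum(rotation[i][k] * point[k] for k in range(3)) + translation[i]
           for i in range(3)]
    if cam[2] <= 0:
        return None
    return (focal * cam[0] / cam[2] + cx, focal * cam[1] / cam[2] + cy)


def project_polyline(points, rotation, translation, focal, cx, cy):
    """Project a road or sidewalk outline, dropping points behind the camera."""
    pixels = (project(p, rotation, translation, focal, cx, cy) for p in points)
    return [px for px in pixels if px is not None]


if __name__ == "__main__":
    identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
    outline = [(1.0, 2.0, 10.0), (-1.0, 2.0, 20.0), (0.0, 0.0, -5.0)]
    print(project_polyline(outline, identity, [0.0, 0.0, 0.0],
                           1000.0, 320.0, 240.0))
```

A real overlay would also clip to the image bounds and handle occlusion by buildings, which the 3D city models make possible.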

SLIDE 55

Annotated 3D Vehicles

SLIDE 56

Video

SLIDE 57

3D Detection

SLIDE 58

Ground coverage score
Elevation score
3D orientation score
Appearance score

SLIDE 59

Results

Precision / Recall
Orientation similarity / Recall

SLIDE 60

http://nyc3d.cs.cornell.edu/

SLIDE 61

Summary

Many interesting challenges in modeling the world

Contributions from every area (cf. much wonderful recent work):

– Scene understanding, object detection, material recognition, illumination modeling, …
– Learning?

SLIDE 62

Acknowledgements

National Science Foundation
Intel Center for Science and Technology – Visual Computing
Amazon AWS for Education

Students and collaborators: Daniel Hauagge, Sean Bell, Song Cao, Chun-Po Wang, Kyle Wilson, Scott Wehrwein, Kevin Matzen, Paul Upchurch, Yunpeng Li, Kavita Bala, Dan Huttenlocher, Dave Crandall

SLIDE 63

Thank you!

More information at http://www.cs.cornell.edu/~snavely/