Reconstruction and recognition for realistic augmented reality



slide-1
SLIDE 1

Reconstruction and recognition for realistic augmented reality

Professor Anton van den Hengel

Director, Australian Centre for Visual Technologies

Professor, Adelaide University, South Australia

Director, Punchcard Visual Technologies

slide-2
SLIDE 2

The ACVT

Australia’s largest Computer Vision research group

Working in

Machine Learning

5 CVPR’11 papers, 2 ICCV’11, JMLR, ...

Parameter Estimation

CVPR’10 Best Paper Prize, CVPR’11, ICCV’11, PAMI...

Video Surveillance

ICCV’11, 2 start-ups, ...

Structure from X

PAMI, JMIV, Siggraph’07, ToG, ...

slide-3
SLIDE 3

The ACVT

slide-4
SLIDE 4

There is Demand for 3D

slide-5
SLIDE 5

Even kids want 3D

slide-6
SLIDE 6

UCC made the Web

Blogs, Wikis, Social networking sites, Advertising, Fanfiction, News Sites, Trip planners, Mobile Photos & Videos, Customer review sites, Forums, Experience and photo sharing sites, Audio, Video games, Maps and location systems, and more

Associated Content, Atom.com, BatchBuzz.com, Brickfish, CreateDebate, Dailymotion, Deviant Art, Demotix, Digg, eBay, Eventful, Fark, Epinions, Facebook, Filemobile, Flickr, Forelinksters, Friends Reunited, GiantBomb, Helium.com, HubPages, InfoBarrel, iStockphoto, Justin.tv, JayCut, Mahalo, Metacafe, Mouthshut.com, MySpace, Newgrounds, Orkut, OpenStreetMap, Picasa, Photobucket, PhoneZoo, Revver, Scribd, Second Life, Shutterstock, Shvoong, Skyrock, Squidoo, TripAdvisor, The Politicus, TypePad, Twitter, Urban Dictionary, Veoh, Vimeo, Widgetbox, Wigix, Wikia, WikiMapia, Wikinvest, Wikipedia, Wix.com, WordPress, Yelp, YouTube, YoYoGames, Zooppa

slide-7
SLIDE 7

Where is the 3D AR?

slide-8
SLIDE 8

The Problem

slide-9
SLIDE 9

3D Modelling is Hard

slide-10
SLIDE 10

The Solution

Images are everywhere

A good source of 3D information

Easily accessible

They’re typically captured anyway

Almost everything has a camera attached

Humans are very good at interpreting them

slide-11
SLIDE 11
slide-12
SLIDE 12

Image-based 3D UCC

The image is the interface

People can’t help but see images in 3D

Most image sets embody 3D

Powerful way to model real objects

Varying levels of interaction

Varying types of models

Helps even in modelling imaginary objects

slide-13
SLIDE 13

Video is Easy

slide-14
SLIDE 14

A Point Cloud is not Shape

Point clouds are not a useful model of scene shape

Too much information

More than 3 points to a plane

Too little information

No object parameters, boundaries, relationships, textures, ...

But they do contain critical shape information

We want to exploit the point cloud and the image set to get exactly the shape information we require

slide-15
SLIDE 15

Interaction

slide-16
SLIDE 16

Relationships

Specify constraints on object parameters

Allow use of sparser point clouds

Inform interactions

Otherwise higher-dimensional actions can be reduced to 2D

Increase modelling power

is the key relationship

slide-17
SLIDE 17

Optimisation

Graph-based

One node per object

One observation per object node

Three likelihoods

2D, 3D and User

One node per relationship

Links between relationship node and nodes of objects related
slide-18
SLIDE 18

Repetition

slide-19
SLIDE 19

Repetition

Specifies a relationship

Constrains parameters

Allows modelling of parts of the scene not visible in the image set

Or parts that don’t exist

Maintains existing relationships (which simplifies interaction)

Complicates optimisation

slide-20
SLIDE 20

Optimisation

Three likelihoods per object node

3D likelihood is robust

2D likelihood is fragile, but locally accurate

User likelihood is globally robust, but locally inaccurate

One likelihood per relationship node

Reflects degree to which relationship fulfilled

No loops in the graph
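
The factorisation described above can be sketched as a sum of log-likelihoods: three terms per object node and one per relationship node. The slides do not say how the terms are combined or what form they take, so the product rule, the Gaussian shapes, the sigma values, the scalar object parameter and the "equal size" relationship below are all assumptions made for illustration.

```python
def gaussian_log_likelihood(value, observed, sigma):
    """Log of an unnormalised Gaussian centred on the observation."""
    return -0.5 * ((value - observed) / sigma) ** 2


def object_log_likelihood(value, obs_2d, obs_3d, obs_user):
    """Sum of the three per-object terms (2D, 3D and User).

    Assumed sigmas: narrow for 2D (locally accurate), wide for User
    (locally inaccurate). These numbers are illustrative only.
    """
    return (gaussian_log_likelihood(value, obs_2d, 0.05)
            + gaussian_log_likelihood(value, obs_3d, 0.2)
            + gaussian_log_likelihood(value, obs_user, 1.0))


def relationship_log_likelihood(a, b, sigma=0.1):
    """Hypothetical 'equal size' relationship: fully satisfied when a == b."""
    return gaussian_log_likelihood(a, b, sigma)


def total_log_likelihood(params, observations, relationships):
    """Score one assignment of object parameters over the whole graph.

    params:        {object name: parameter value}
    observations:  {object name: (obs_2d, obs_3d, obs_user)}
    relationships: list of (object name, object name) pairs
    """
    total = sum(object_log_likelihood(params[name], *observations[name])
                for name in params)
    total += sum(relationship_log_likelihood(params[a], params[b])
                 for a, b in relationships)
    return total
```

Because the graph has no loops, an assignment maximising a score of this kind can in principle be found exactly by message passing on the tree; the sketch only evaluates the score for a given assignment.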

slide-21
SLIDE 21

Optimisation

2D likelihood

3D likelihood

User likelihood

slide-22
SLIDE 22

Results

slide-23
SLIDE 23

Primitive Modelling

There’s not all that much of the world that can be modelled as a set of cubes

Medium-level primitives

Planes, NURBS surfaces

Simple enough to be flexible

High-level enough to be useful

The kinds of primitives that modelling packages use

slide-24
SLIDE 24

Sketch-based interface?

slide-25
SLIDE 25

Input

slide-26
SLIDE 26

Modelling

slide-27
SLIDE 27

Results

slide-28
SLIDE 28

Modelling

slide-29
SLIDE 29

Results

slide-30
SLIDE 30

Results

slide-31
SLIDE 31

Put your truck into a game

slide-32
SLIDE 32

Put your truck into a game

slide-33
SLIDE 33

Modelling architecture

slide-34
SLIDE 34

Structure from motion

slide-35
SLIDE 35

Line of sight

Fitting planar faces

Image plane

Object points

slide-36
SLIDE 36

Hierarchical RANSAC

Generate bounded plane hypotheses

Tests

Support from point cloud

Reprojects within new image boundaries

Constraints on relative edge length and face size

Colour histogram matching on faces

Colour matching on edge projections

Reprojection is not self-occluding
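
The first of these tests, support from the point cloud, can be illustrated with a plain RANSAC plane fit. This is a minimal sketch and not the hierarchical, bounded-plane method of the talk: it fits unbounded planes, uses only the point-support test, and the function names, threshold and iteration count are assumptions.

```python
import random


def plane_from_points(p, q, r):
    """Return (unit normal, d) with normal . x + d = 0, or None if collinear."""
    u = [q[i] - p[i] for i in range(3)]
    v = [r[i] - p[i] for i in range(3)]
    n = [u[1] * v[2] - u[2] * v[1],
         u[2] * v[0] - u[0] * v[2],
         u[0] * v[1] - u[1] * v[0]]
    norm = sum(c * c for c in n) ** 0.5
    if norm < 1e-12:
        return None
    n = [c / norm for c in n]
    d = -sum(n[i] * p[i] for i in range(3))
    return n, d


def point_plane_distance(point, plane):
    """Unsigned distance from a 3D point to the plane."""
    n, d = plane
    return abs(sum(n[i] * point[i] for i in range(3)) + d)


def ransac_plane(points, threshold=0.02, iterations=200, seed=0):
    """Keep the 3-point plane hypothesis supported by the most points."""
    rng = random.Random(seed)
    best_plane, best_inliers = None, []
    for _ in range(iterations):
        plane = plane_from_points(*rng.sample(points, 3))
        if plane is None:
            continue  # degenerate (collinear) sample
        inliers = [pt for pt in points
                   if point_plane_distance(pt, plane) < threshold]
        if len(inliers) > len(best_inliers):
            best_plane, best_inliers = plane, inliers
    return best_plane, best_inliers
```

In a fuller version each surviving hypothesis would then be passed through the remaining image-based tests listed above before being accepted as a face.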

slide-37
SLIDE 37

Extrusion

slide-38
SLIDE 38

Mirroring

slide-39
SLIDE 39

2D Curves

slide-40
SLIDE 40

3D Curves

slide-41
SLIDE 41

A 3D curve from 2 drawn lines

Match 2 hand-drawn curves in 2 images

Curves generally not drawn accurately

May not match image features

Many-to-many matching

Generally different start and end points

Interactive

Speed

Editing

slide-42
SLIDE 42

A 3D curve from 2 drawn lines

slide-43
SLIDE 43

A 3D curve from 2 drawn lines

slide-44
SLIDE 44

A 3D curve from one drawn line

slide-45
SLIDE 45

A 3D curve from one drawn line

Seek the 3D (scene) curve which best matches a curve drawn over one image from the set

Drawn curve specifies a set of possible 3D curves

Image set used to select from amongst that set

Uses an MRF and graph cuts

Much like dense matching using graph cuts
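
The selection step can be pictured as a labelling problem: each sample along the drawn curve picks one depth along its viewing ray, paying a data cost from the other images plus a smoothness cost between neighbouring samples. The sketch below is a simplification, not the method of the talk: it treats the curve as a chain and solves it exactly with dynamic programming rather than graph cuts, and the cost table, the absolute-difference smoothness term and all names are assumptions.

```python
def select_depths(unary, depths, smoothness=1.0):
    """Pick one depth per curve sample on a chain-structured MRF.

    unary[i][k] is the (hypothetical) image-consistency cost of giving
    sample i the depth depths[k]. The total cost minimised is
    sum_i unary[i][k_i] + smoothness * sum_i |d_i - d_(i+1)|.
    """
    n, k = len(unary), len(depths)
    cost = list(unary[0])   # best cost of any path ending in each label
    back = []               # back-pointers, one list per sample after the first
    for i in range(1, n):
        new_cost, pointers = [], []
        for b in range(k):
            # Best predecessor label for label b at sample i.
            best_a = min(
                range(k),
                key=lambda a: cost[a] + smoothness * abs(depths[a] - depths[b]),
            )
            step = smoothness * abs(depths[best_a] - depths[b])
            new_cost.append(cost[best_a] + step + unary[i][b])
            pointers.append(best_a)
        cost = new_cost
        back.append(pointers)
    # Trace the cheapest path back from the last sample.
    label = min(range(k), key=lambda b: cost[b])
    labels = [label]
    for pointers in reversed(back):
        label = pointers[label]
        labels.append(label)
    labels.reverse()
    return [depths[l] for l in labels]
```

With the smoothness weight set to zero each sample is chosen independently; raising it pulls an isolated, weakly supported depth back onto the curve defined by its neighbours.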

slide-46
SLIDE 46

A 3D curve from one drawn line

slide-47
SLIDE 47

A 3D curve from one drawn line

slide-48
SLIDE 48

A 3D curve from one drawn line

slide-49
SLIDE 49
slide-50
SLIDE 50

Multiple intersecting 3D curves

slide-51
SLIDE 51

Multiple intersecting 3D curves

Can’t estimate intersecting curves independently

Curves don’t necessarily intersect at end points

Anton van den Hengel, Anthony Dick, Thorsten Thormählen, Ben Ward, and Philip H. S. Torr, Eurographics 2006, September 2006, Vienna, Austria.

slide-52
SLIDE 52

Multiple intersecting 3D curves

Anton van den Hengel, Anthony Dick, Thorsten Thormählen, Ben Ward, and Philip H. S. Torr, Eurographics 2006, September 2006, Vienna, Austria.

slide-53
SLIDE 53

Multiple intersecting 3D curves

slide-54
SLIDE 54

Multiple intersecting 3D curves

slide-55
SLIDE 55

Dense surface reconstruction

slide-56
SLIDE 56

Video is a 3D medium

slide-57
SLIDE 57

Video editing requires 3D

We have Photoshop, but where is Videoshop?

What is missing is the 3D

slide-58
SLIDE 58

Video editing requires models

slide-59
SLIDE 59

Video editing requires models

slide-60
SLIDE 60

Lighting is 3D, as are materials

  • A. van den Hengel, D. Sale, A. Dick, Computer Graphics Forum, 2009

slide-61
SLIDE 61

Modelling for/with AR

slide-62
SLIDE 62

The Problem

slide-63
SLIDE 63

User-created 2D content for AR

slide-64
SLIDE 64

Google-created content for AR

slide-65
SLIDE 65

Live modelling

Most geometry cannot be modelled beforehand

You can’t tell where it will be

Modelling the whole world won’t work

Need to generate models in situ

While you’re there

slide-66
SLIDE 66

Videotrace - Live

slide-67
SLIDE 67

Physical Interaction

slide-68
SLIDE 68

Occlusion

slide-69
SLIDE 69

Fun

slide-70
SLIDE 70

Getting Occlusion Right

Occlusion is a key depth cue

But there is always some misalignment between model and reality

Solve using a live segmentation of the real object from the video
slide-71
SLIDE 71

Occlusion boundary refinement

slide-72
SLIDE 72

Occlusion boundary refinement

Graph cut gives a hard segmentation

Fix with an alpha matte

Blends between foreground and synthetic object

Fixes some holes in the cut
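
The blend can be written per pixel as alpha times the real foreground plus (1 - alpha) times the synthetic object. The sketch below shows the difference between a hard 0/1 mask and a soft matte on a single image row; the 3-tap box filter is only a stand-in for a real matting step, and the function names and greyscale pixel values are assumptions.

```python
def soften_row(mask_row):
    """Turn one row of a hard 0/1 mask into a soft matte.

    Uses a 3-tap box filter with clamped borders. This is a placeholder
    for a proper alpha-matting step, chosen only to keep the sketch short.
    """
    n = len(mask_row)
    return [
        (mask_row[max(i - 1, 0)] + mask_row[i] + mask_row[min(i + 1, n - 1)]) / 3.0
        for i in range(n)
    ]


def composite(foreground, synthetic, alpha):
    """Blend two greyscale images given as lists of rows.

    alpha = 1 keeps the real foreground pixel, alpha = 0 shows the
    synthetic object, and values in between mix the two.
    """
    return [
        [a * f + (1.0 - a) * s for f, s, a in zip(f_row, s_row, a_row)]
        for f_row, s_row, a_row in zip(foreground, synthetic, alpha)
    ]
```

With a hard mask the transition between real and synthetic pixels is a single-pixel step; with the softened matte the boundary pixels become mixtures, which hides small misalignments along the occlusion edge.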

slide-73
SLIDE 73

Live modelling for AR

slide-74
SLIDE 74

AR modelling for other purposes

slide-75
SLIDE 75

AR modelling for other purposes

slide-76
SLIDE 76

In-camera special effects

A lot of video goes straight from the camera to distribution

YouTube, news, Facebook, MMS, ...

There are a lot of video cameras sitting in cupboards

There is a lot of video that’s not worth watching

slide-77
SLIDE 77

Minimal interaction modelling

Use the camera as the modelling tool

The user only specifies the object, the rest is done with the camera

Projective texturing

Some compensation for Visual Hull

slide-78
SLIDE 78

Silhouette modelling

slide-79
SLIDE 79

Minimal interaction modelling

slide-80
SLIDE 80

SLAM is Fragile

SLAM is

Designed for robot navigation

Dependent on continuous tracks

Solitary (rather than collaborative)

Inflexible

Dependent on geometry estimation

Overkill

slide-81
SLIDE 81

STAR

Simultaneous Tracking and Recognition

slide-82
SLIDE 82

Training

slide-83
SLIDE 83

A Forest of Ferns

“Fern”

slide-84
SLIDE 84

Classifier Efficiency

slide-85
SLIDE 85

Classifiers as Feature Transforms

  • 1. Train a set of N classifiers
  • 2. Apply each to the image patch
  • 3. Each element of the feature vector is the output of one classifier

This means we can use the same classifiers for Recognition and Tracking
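
The three steps above can be sketched with fern-style classifiers, where each fern is a short list of pixel-pair intensity comparisons and its output is the integer formed by the comparison bits. This is a minimal sketch under assumptions: the ferns are generated at random rather than trained, the learned class posteriors a full fern classifier would store are omitted, and all names and sizes are hypothetical.

```python
import random


def make_fern(patch_size, depth, rng):
    """Build one fern as `depth` random pixel-pair comparisons.

    Each test is (row1, col1, row2, col2) inside a square patch.
    Random generation stands in for the training step.
    """
    return [
        (rng.randrange(patch_size), rng.randrange(patch_size),
         rng.randrange(patch_size), rng.randrange(patch_size))
        for _ in range(depth)
    ]


def apply_fern(fern, patch):
    """Return the fern's output: one bit per comparison, packed into an int."""
    code = 0
    for r1, c1, r2, c2 in fern:
        code = (code << 1) | (1 if patch[r1][c1] > patch[r2][c2] else 0)
    return code


def feature_vector(ferns, patch):
    """One element per classifier: the output of that fern on the patch."""
    return [apply_fern(fern, patch) for fern in ferns]


def mismatch_count(a, b):
    """Number of ferns whose outputs differ between two feature vectors."""
    return sum(1 for x, y in zip(a, b) if x != y)
```

The same vector can then be compared against stored vectors for recognition, or against the vector from the previous frame for tracking. Because each test only compares two pixels, adding a constant brightness offset to the patch leaves the vector unchanged.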

slide-86
SLIDE 86

Sharing

slide-87
SLIDE 87
slide-88
SLIDE 88

What’s next?

New interactions, applications and data sources

Interactive SFM

Videoshop

slide-89
SLIDE 89

How to get Videotrace

It’s available on free beta test

Just register at www.punchcard.com.au

They will email you a link

It’s a real beta

Hopefully the final version will be free too