Tracking wrapup Course recap Tuesday, Dec 1 Announcements Pset 4 - - PDF document




12/1/2009 1

Tracking wrapup Course recap

Tuesday, Dec 1

Announcements

  • Pset 4 grades and solutions available today
  • Reminder: Pset 5 due 12/4, extended to 12/8 if needed

– Choose between Section I (short answers) and II (program)
– Extra credit only given for Section III

  • Final exam is Monday, 12/14

– Today’s handout has example final exams

  • Thursday in class: exam review

Previously

  • Tracking as inference

– Goal: estimate posterior of object position given measurement

  • Linear models of dynamics

– Represent state evolution and measurement models

  • Kalman filters

– Recursive prediction/correction updates to refine measurement

  • General tracking challenges

Last time: Tracking as inference

  • The hidden state consists of the true parameters we care about, denoted X.
  • The measurement is our noisy observation that results from the underlying state, denoted Y.
  • At each time step, state changes (from Xt-1 to Xt) and we get a new observation Yt.
  • Our goal: recover most likely state Xt given

– All observations seen so far.
– Knowledge about dynamics of state transitions.
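
One way to make this goal concrete is the standard recursive form of the posterior. This is a sketch under two assumptions that the slide does not state explicitly: the state is Markov (X_t depends only on X_{t-1}), and each measurement depends only on the current state.

```latex
\begin{align*}
\text{Predict:}\quad
P(X_t \mid y_0,\dots,y_{t-1})
  &= \int P(X_t \mid X_{t-1})\, P(X_{t-1} \mid y_0,\dots,y_{t-1})\, dX_{t-1} \\
\text{Correct:}\quad
P(X_t \mid y_0,\dots,y_t)
  &= \frac{P(y_t \mid X_t)\, P(X_t \mid y_0,\dots,y_{t-1})}
          {\int P(y_t \mid X_t)\, P(X_t \mid y_0,\dots,y_{t-1})\, dX_t}
\end{align*}
```

The first line uses the knowledge about state dynamics; the second folds in the newest observation.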


Last time: Tracking as inference

[Figure: belief over the state at time t and time t+1. Labels: old belief, belief: prediction, measurement, corrected prediction.]

Last time: Linear dynamic model

  • Describe the a priori knowledge about

– System dynamics model: represents evolution of state over time, with noise.

$x_t \sim N(D x_{t-1}; \Sigma_d)$

– Measurement model: at every time step we get a noisy measurement of the state.

$y_t \sim N(M x_t; \Sigma_m)$
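
To see what the two models generate, here is a minimal Python sketch that samples states and measurements from a linear dynamic model. The function name `simulate_linear_model`, the constant-velocity choice of D and M, and the restriction to diagonal covariances (given as per-component standard deviations) are illustrative assumptions, not part of the lecture.

```python
import random


def simulate_linear_model(steps, D, M, sigma_d, sigma_m, x0, seed=0):
    """Sample x_t ~ N(D x_{t-1}; Sigma_d) and y_t ~ N(M x_t; Sigma_m).

    Assumption: Sigma_d and Sigma_m are diagonal, so sigma_d and sigma_m
    hold one standard deviation per state / measurement component.
    """
    rng = random.Random(seed)
    x = list(x0)
    states, measurements = [], []
    for _ in range(steps):
        # Dynamics: multiply the previous state by D, then add Gaussian noise.
        x = [sum(D[i][j] * x[j] for j in range(len(x))) + rng.gauss(0.0, sigma_d[i])
             for i in range(len(D))]
        # Measurement: project the state through M, then add Gaussian noise.
        y = [sum(M[i][j] * x[j] for j in range(len(x))) + rng.gauss(0.0, sigma_m[i])
             for i in range(len(M))]
        states.append(x)
        measurements.append(y)
    return states, measurements


# Hypothetical constant-velocity example: state = [position, velocity],
# and only the position is measured.
D = [[1.0, 1.0],
     [0.0, 1.0]]
M = [[1.0, 0.0]]
states, measurements = simulate_linear_model(
    steps=5, D=D, M=M, sigma_d=[0.1, 0.05], sigma_m=[0.5], x0=[0.0, 2.0])
```

Setting both noise levels to zero reduces the model to deterministic motion, which is a quick way to check that D and M encode the intended dynamics.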


Last time: Kalman filter

Time update (“Predict”)

  • Know corrected state from previous time step, and all measurements up to the current one
  • Predict distribution over next state: $P(X_t \mid y_0, \dots, y_{t-1})$
  • Mean and std. dev. of predicted state: $\mu_t^-, \sigma_t^-$

Receive measurement

Measurement update (“Correct”)

  • Know prediction of state, and next measurement
  • Update distribution over current state: $P(X_t \mid y_0, \dots, y_t)$
  • Mean and std. dev. of corrected state: $\mu_t^+, \sigma_t^+$

Time advances: t++
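
The predict/correct cycle can be sketched for a scalar state as follows. This is a minimal illustration, assuming a 1D model x_t ~ N(d x_{t-1}, var_d) and y_t ~ N(m x_t, var_m); the function and parameter names are hypothetical, and the code tracks variances rather than the standard deviations shown on the slide.

```python
def kalman_predict(mu_plus, var_plus, d, var_d):
    """Time update: push the corrected estimate through the dynamics."""
    mu_minus = d * mu_plus
    var_minus = var_d + d * d * var_plus  # uncertainty grows by the process noise
    return mu_minus, var_minus


def kalman_correct(mu_minus, var_minus, y, m, var_m):
    """Measurement update: blend the prediction with the new measurement y."""
    denom = var_m + m * m * var_minus
    mu_plus = (mu_minus * var_m + m * y * var_minus) / denom
    var_plus = var_m * var_minus / denom  # never larger than var_minus
    return mu_plus, var_plus


def kalman_filter_1d(measurements, mu0, var0, d=1.0, m=1.0, var_d=1.0, var_m=1.0):
    """Run predict/correct once per measurement; return (mean, variance) pairs."""
    mu, var = mu0, var0
    estimates = []
    for y in measurements:
        mu, var = kalman_predict(mu, var, d, var_d)
        mu, var = kalman_correct(mu, var, y, m, var_m)
        estimates.append((mu, var))
    return estimates


# Hypothetical usage: noisy readings of a roughly constant quantity.
estimates = kalman_filter_1d([3.0, 2.8, 3.1, 3.0], mu0=0.0, var0=1.0)
```

The corrected mean is a weighted average of the prediction and the measurement, with each weighted by the other's variance, so a noisier sensor pulls the estimate less.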

Kalman filter: pros and cons

  • Gaussian densities, linear dynamic model: $x \sim N(\mu, \Sigma)$

+ Simple updates, compact and efficient
– But, restricted class of motions defined by linear model
– Unimodal distribution = only single hypothesis


When is a single hypothesis too limiting?

[Figure: four x-y plots labeled initial position, prediction, measurement, update. Figure from Thrun & Kosecka]

Consider this example: say we are tracking the face on the right using a skin color blob to get our measurement.

Video from Jojic & Frey


Alternative: particle-filtering and non-Gaussian densities

  • Can represent distribution with set of weighted samples (“particles”)
  • Allows us to maintain multiple hypotheses.

For details: CONDENSATION -- conditional density propagation for visual tracking, by Michael Isard and Andrew Blake, Int. J. Computer Vision, 29, 1, 5--28, (1998)
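
As a rough illustration of the weighted-sample idea, here is a minimal 1D bootstrap particle filter step (predict, weight, resample). It is a generic sketch and not the CONDENSATION implementation from the cited paper; the random-walk dynamics, the Gaussian measurement likelihood, and all names and parameter values are assumptions made for the example.

```python
import math
import random


def particle_filter_step(particles, y, rng, sigma_d=1.0, sigma_m=1.0):
    """One predict / weight / resample cycle for a scalar state."""
    # Predict: move every particle through the (assumed random-walk) dynamics.
    predicted = [x + rng.gauss(0.0, sigma_d) for x in particles]
    # Weight: Gaussian likelihood of the measurement y given each particle.
    weights = [math.exp(-0.5 * ((y - x) / sigma_m) ** 2) for x in predicted]
    if sum(weights) == 0.0:
        # Every likelihood underflowed; fall back to uniform weights.
        weights = [1.0] * len(predicted)
    # Resample: draw particles with probability proportional to their weight.
    return rng.choices(predicted, weights=weights, k=len(predicted))


# Hypothetical usage: start from a broad prior, then absorb a few measurements.
rng = random.Random(0)
particles = [rng.uniform(-10.0, 10.0) for _ in range(500)]
for y in [2.0, 2.1, 1.9, 2.0]:
    particles = particle_filter_step(particles, y, rng)
estimate = sum(particles) / len(particles)
```

Because the particle set is not forced into a single Gaussian, clusters of particles can survive around several candidate states at once, which is what allows multiple hypotheses to be maintained.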


Alternative: particle-filtering and non-Gaussian densities

Monitor is a distractor, multiple hypotheses necessary. Kalman filter fails once it starts tracking the monitor.

http://www.robots.ox.ac.uk/~vdg/dynamics.html
Visual Dynamics Group, Dept. Engineering Science, University of Oxford, 1998

Tracking people by learning their appearance

  • D. Ramanan, D. Forsyth, and A. Zisserman. Tracking People by Learning their Appearance. PAMI 2007.

Tracker

Source: Lana Lazebnik


Tracking people by learning their appearance

Use a part-based model to encode part appearance + relative geometry.

Bottom-up initialization: Clustering

  • D. Ramanan, D. Forsyth, and A. Zisserman. Tracking People by Learning their Appearance. PAMI 2007.

Source: Lana Lazebnik


Top-down initialization: Exploit “easy” poses

  • D. Ramanan, D. Forsyth, and A. Zisserman. Tracking People by Learning their Appearance. PAMI 2007.

Tracking by model detection

  • D. Ramanan, D. Forsyth, and A. Zisserman. Tracking People by Learning their Appearance. PAMI 2007.

Example results

http://www.ics.uci.edu/~dramanan/papers/pose/index.html


Tracking: summary

  • Tracking as inference

– Goal: estimate posterior of object position given measurement

  • Linear models of dynamics

– Represent state evolution and measurement models

  • Kalman filters

– Recursive prediction/correction updates to refine measurement
– Single hypothesis can be limiting

  • General tracking challenges
  • Tracking via detection is one way to mitigate drift (though it means losing out on help from prediction).

Course recap


Features and filters

Transforming and describing images; textures, colors, edges

Grouping & fitting

[fig from Shi et al]

Clustering, segmentation, fitting; what parts belong together?


Multiple views

Multi-view geometry, matching, invariant features, stereo vision

Hartley and Zisserman Lowe Fei-Fei Li

Recognition and learning

Recognizing objects and categories, learning techniques


Motion and tracking

Tracking objects, video analysis, low level motion, optical flow

Tomas Izo

Computer Vision

  • Automatic understanding of images and video

  • 1. Computing properties of the 3D world from visual data (measurement)


  • 1. Vision for measurement

[Figures: real-time stereo; structure from motion; tracking. Credits: NASA Mars Rover; Demirdjian et al.; Snavely et al.; Wang et al.]

Computer Vision

  • Automatic understanding of images and video

  • 1. Computing properties of the 3D world from visual data (measurement)
  • 2. Algorithms and representations to allow a machine to recognize objects, people, scenes, and activities. (perception and interpretation)

  • 2. Vision for perception, interpretation

[Figure: amusement park photo annotated with labels for objects, activities, scenes, locations, text / writing, faces, gestures, motions, emotions… Example labels: sky, water, tree, Ferris wheel, carousel, ride, bench, deck, umbrellas, pedestrians, people waiting in line, people sitting on ride, amusement park, Cedar Point, Lake Erie, The Wicked Twister, maxair, 12 E.]

Computer Vision

  • Automatic understanding of images and video

  • 1. Computing properties of the 3D world from visual data (measurement)
  • 2. Algorithms and representations to allow a machine to recognize objects, people, scenes, and activities. (perception and interpretation)
  • 3. Algorithms to mine, search, and interact with visual data (search and organization)


  • 3. Visual search, organization

Image or video archives Query Relevant content

Visual data in 1963

  • L. G. Roberts, Machine Perception of Three Dimensional Solids, Ph.D. thesis, MIT Department of Electrical Engineering, 1963.


Visual data in 2009

  • Personal photo albums
  • Movies, news, sports
  • Surveillance and security
  • Medical and scientific images

Slide credit: L. Lazebnik

Why vision?

  • As image sources multiply, so do applications

– Relieve humans of boring, easy tasks
– Enhance human abilities
– Advance human-computer interaction, visualization
– Perception for robotics / autonomous agents
– Organize and give access to visual content


Faces and digital cameras

Setting camera focus via face detection

Camera waits for everyone to smile to take a photo [Canon]

Linking to info with a mobile device

Situated search: Yeh et al., MIT; kooaba; MSR Lincoln


Video-based interfaces

Human joystick: NewsBreaker Live

Assistive technology systems: Camera Mouse, Boston College

Vision for medical & neuroimages

fMRI data: Golland et al.

Image guided surgery: MIT AI Vision Group


Special visual effects

The Matrix What Dreams May Come

Mocap for Pirates of the Caribbean, Industrial Light and Magic

Source: S. Seitz

Safety & security

Navigation, driver safety

Monitoring pool (Poseidon)

Surveillance

Pedestrian detection: MERL, Viola et al.