Perceptive Context for Pervasive Computing Trevor Darrell Vision - - PowerPoint PPT Presentation



slide-1
SLIDE 1

Perceptive Context for Pervasive Computing Trevor Darrell Vision Interface Group MIT AI Lab

slide-2
SLIDE 2

Perceptually Aware Displays

Camera associated with display

Display should respond to user

  • font size
  • attentional load
  • passive acknowledgement

e.g., “Magic Mirror”, Interval; Compaq’s Smart Kiosk; ALIVE, MIT Media Lab

Camera Display

slide-3
SLIDE 3

Example: A Face Responsive Display

  • Faces are natural interfaces!
  • Ubiquitous, fast, expressive, general.
  • Want machines to generate and perceive faces.
  • A Face Responsive Display...
  • Knows when it’s being observed
  • Recognizes returning observers
  • Tracks head pose
  • Robust to changing lighting, moving backgrounds…
slide-4
SLIDE 4

A Face Responsive Display

Tasks

  • Detection
  • Identification
  • Tracking

How? Exploit multiple visual modalities:

  • Shape
  • Color
  • Pattern
slide-5
SLIDE 5

Tasks and Visual Modalities

                 shape                      color                pattern
detection        silhouette classifier      skin classifier      face detection
identification   biometrics                 flesh hue            face recognition
tracking         coarse motion estimation   clothing histogram   fine motion estimation / pose tracking

slide-6
SLIDE 6

Mode and Task Matrix

                 shape                   color                pattern
detection        silhouette classifier   skin classifier      face detection
identification   biometrics              flesh hue            face recognition
tracking         shape change            clothing histogram   appearance change

slide-7
SLIDE 7

Finding Features

2D Head / hands localization

  • contour analysis: mark extremal points (highest curvature or distance from center of body) as hand features
  • use skin color model when region of hand or face is found (color model is independent of flesh tone intensity)

slide-8
SLIDE 8

Flesh color tracking

  • Often the simplest, fastest face detector!
  • Initialize region of hue space

[ Crowley, Coutaz, Berard, INRIA ]

slide-9
SLIDE 9

Color Processing

  • Train two-class classifier with examples of skin and not skin
  • Typical approaches: Gaussian, Neural Net, Nearest Neighbor
  • Use features invariant to intensity

Log color-opponent [Fleck et al.]:

(log(r) - log(g), log(b) - log((r+g)/2))

Hue & Saturation
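A minimal sketch of the log color-opponent features written above, for a single RGB pixel. The function name and the small `floor` clamp (used only to avoid log(0)) are assumptions for illustration, not part of the slide.

```python
import math

def log_opponent(r, g, b, floor=1e-6):
    """Return (log r - log g, log b - log((r+g)/2)) for one RGB pixel.

    floor is an assumed lower clamp that avoids log(0); it is not from the slide.
    """
    r, g, b = max(r, floor), max(g, floor), max(b, floor)
    rg = math.log(r) - math.log(g)            # red vs. green opponent channel
    by = math.log(b) - math.log((r + g) / 2)  # blue vs. yellow opponent channel
    return rg, by

# Scaling all three channels by the same factor leaves both features unchanged.
dim = log_opponent(100, 80, 60)
bright = log_opponent(200, 160, 120)
```

Because both features are differences of logs, a common brightness factor cancels, which is what makes them suitable inputs for a skin / not-skin classifier.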

slide-10
SLIDE 10

Flesh color tracking

Can use the CAMSHIFT algorithm in Intel’s OpenCV library for robust real-time tracking (open-source implementation available!).

[ Bradski, Intel ]
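CAMSHIFT builds on mean shift: a search window is moved repeatedly to the centroid of the skin-probability mass beneath it. The sketch below shows only that core iteration on a small 2-D probability map with a fixed window size; it is not the OpenCV implementation and omits CAMSHIFT's window resizing. The function name and arguments are hypothetical, and the window is assumed to fit inside the map.

```python
def mean_shift(prob, window, max_iter=20):
    """Move a fixed-size window to the local centroid of a probability map.

    prob   : list of rows of non-negative skin probabilities
    window : (x, y, w, h), with (x, y) the top-left corner
    """
    x, y, w, h = window
    rows, cols = len(prob), len(prob[0])
    for _ in range(max_iter):
        m00 = m10 = m01 = 0.0
        for j in range(max(y, 0), min(y + h, rows)):
            for i in range(max(x, 0), min(x + w, cols)):
                p = prob[j][i]
                m00 += p        # zeroth moment: total mass in the window
                m10 += p * i    # first moment in x
                m01 += p * j    # first moment in y
        if m00 == 0:
            break               # no skin mass under the window
        cx, cy = m10 / m00, m01 / m00
        # Re-center the window on the centroid, clamped to the image.
        nx = min(max(int(round(cx - (w - 1) / 2)), 0), cols - w)
        ny = min(max(int(round(cy - (h - 1) / 2)), 0), rows - h)
        if (nx, ny) == (x, y):
            break               # converged
        x, y = nx, ny
    return x, y, w, h

# A 2x2 blob of skin probability; the window starts off-center and drifts onto it.
prob = [[0.0] * 10 for _ in range(10)]
for j in (6, 7):
    for i in (6, 7):
        prob[j][i] = 1.0
tracked = mean_shift(prob, (3, 3, 4, 4))
```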

slide-11
SLIDE 11

Intel’s computer vision library

slide-12
SLIDE 12

Detection with multiple visual modes

  • Shape: find head-sized peaks in 2-D or 3-D.
  • Flesh color detection: detect skin pigment in a hue-based color space.
  • Face pattern detection: classify the intensity vector corresponding to the face class.

slide-13
SLIDE 13

Common Detection Failure Modes

  • Shape: fooled by head-shaped peaks
  • Flesh color detection: fooled by flesh-colored objects
  • Face pattern detection: misses out-of-plane rotation or expression

slide-14
SLIDE 14

Robust real-time performance

Integrated Face Detection Algorithm (temporally asynch. voting scheme)

Shape, Flesh Color Detection, Face Pattern Detection
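The slide does not spell out the voting scheme, so the following is only one possible reading, offered as a sketch: each detector reports at its own rate with a timestamp, and a face is declared when enough sufficiently recent reports agree. The function name, the `max_age` window, and the `min_votes` threshold are all hypothetical.

```python
def fuse_votes(reports, now, max_age=0.5, min_votes=2):
    """Combine the latest report from each detector into one decision.

    reports : dict mapping detector name -> (timestamp_seconds, detected)
    A report counts as a vote only if it is positive and not older than max_age.
    """
    votes = sum(
        1 for stamp, detected in reports.values()
        if detected and now - stamp <= max_age
    )
    return votes >= min_votes

# Shape and color reported recently; the slower pattern detector is stale.
latest = {
    "shape": (10.0, True),
    "color": (10.3, True),
    "pattern": (9.0, True),
}
face_present = fuse_votes(latest, now=10.4)
```

Letting stale reports expire means a slow detector never blocks the fast ones, while a single module fooled by a flesh-colored or head-shaped object cannot trigger a detection alone.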

slide-15
SLIDE 15

Mode and Task Matrix

                 shape                   color                pattern
detection        silhouette classifier   skin classifier      face detection
identification   biometrics              flesh hue            face recognition
tracking         shape change            clothing histogram   appearance change

slide-16
SLIDE 16

A Key Technology: Video-Rate Stereo

  • Two cameras -> stereo range estimation; disparity inversely proportional to depth

  • Depth makes tracking people easy
  • segmentation
  • shape characterization
  • pose tracking
  • Real-time implementations becoming commercially available.
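For a rectified camera pair, range follows from disparity as Z = f · B / d, so disparity shrinks as depth grows. A minimal sketch, assuming a focal length in pixels and a baseline in meters (the numbers are illustrative and not from the slides):

```python
def depth_from_disparity(disparity_px, focal_px, baseline_m):
    """Depth in meters for a rectified stereo pair: Z = f * B / d."""
    if disparity_px <= 0:
        return float("inf")   # zero disparity corresponds to a point at infinity
    return focal_px * baseline_m / disparity_px

# Assumed rig: 500-pixel focal length, 10 cm baseline.
near = depth_from_disparity(20, focal_px=500, baseline_m=0.1)
far = depth_from_disparity(10, focal_px=500, baseline_m=0.1)
```

Halving the disparity doubles the estimated depth, which is why range resolution degrades for distant people.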

slide-17
SLIDE 17

Video-rate stereo

Left and right images
Computed disparity
Foreground pixels; grouped by local connectivity
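Grouping foreground pixels by local connectivity can be sketched as connected-component labeling. The version below assumes a binary foreground mask and 4-connectivity; both choices, and the function name, are assumptions rather than details from the slide.

```python
from collections import deque

def label_regions(mask):
    """Label 4-connected foreground regions of a binary mask.

    Returns (labels, count), where labels holds 0 for background and
    1..count for each connected region.
    """
    rows, cols = len(mask), len(mask[0])
    labels = [[0] * cols for _ in range(rows)]
    count = 0
    for r in range(rows):
        for c in range(cols):
            if mask[r][c] and labels[r][c] == 0:
                count += 1
                labels[r][c] = count
                queue = deque([(r, c)])
                while queue:   # breadth-first flood fill of this region
                    y, x = queue.popleft()
                    for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                        if (0 <= ny < rows and 0 <= nx < cols
                                and mask[ny][nx] and labels[ny][nx] == 0):
                            labels[ny][nx] = count
                            queue.append((ny, nx))
    return labels, count

mask = [
    [1, 1, 0, 0],
    [0, 1, 0, 1],
    [0, 0, 0, 1],
]
labels, count = label_regions(mask)
```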

slide-18
SLIDE 18

RGBZ input

slide-19
SLIDE 19

RGBZ input

slide-20
SLIDE 20

RGBZ input

slide-21
SLIDE 21

Range feature for ID!

  • Body shape characteristics -- e.g., height measure.
  • Normalize for motion/pose: median filter over time
  • Near future: full vision-based kinematic estimation and tracking -- active research topic in many labs.

Trevor Mike Gaile
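The height measure with a median filter over time can be sketched as follows. The per-frame heights (in meters) and the window length are made-up illustration values, and the function name is hypothetical.

```python
from statistics import median

def stable_height(per_frame_heights, window=15):
    """Median of the most recent per-frame height estimates.

    The median discards brief outliers caused by motion or pose,
    such as a frame where the person bends down.
    """
    return median(per_frame_heights[-window:])

# One frame (1.62) was measured mid-crouch; the median ignores it.
heights = [1.80, 1.79, 1.62, 1.81, 1.80]
height = stable_height(heights)
```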

slide-22
SLIDE 22

Color feature for ID!

  • For long-term tracking / identification, measure color hue and saturation values of hair and skin.
  • For same-day ID, use histogram of entire body / clothing.

Gaile Mike Trevor
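A sketch of same-day identification by clothing color: build a normalized hue/saturation histogram per person and compare histograms by intersection. The bin count, the intersection measure, and the sample pixel values are assumptions chosen for illustration; the slide does not name a specific comparison.

```python
import colorsys

def hs_histogram(pixels, bins=8):
    """Normalized hue/saturation histogram of (r, g, b) pixels in 0..255."""
    hist = [0.0] * (bins * bins)
    for r, g, b in pixels:
        h, s, _ = colorsys.rgb_to_hsv(r / 255, g / 255, b / 255)
        hi = min(int(h * bins), bins - 1)
        si = min(int(s * bins), bins - 1)
        hist[hi * bins + si] += 1   # value (brightness) is deliberately ignored
    total = sum(hist) or 1.0
    return [v / total for v in hist]

def intersection(h1, h2):
    """Histogram intersection: 1.0 for identical histograms, 0.0 for disjoint ones."""
    return sum(min(a, b) for a, b in zip(h1, h2))

red_shirt = hs_histogram([(200, 30, 30)] * 10)
red_in_shadow = hs_histogram([(100, 15, 15)] * 10)   # same hue and saturation, darker
blue_shirt = hs_histogram([(30, 30, 200)] * 10)

same = intersection(red_shirt, red_in_shadow)
different = intersection(red_shirt, blue_shirt)
```

Dropping the brightness channel keeps the match stable when the same clothing moves between light and shadow.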

slide-23
SLIDE 23

Mode and Task Matrix

                 shape                   color                pattern
detection        silhouette classifier   skin classifier      face detection
identification   biometrics              flesh hue            face recognition
tracking         shape change            clothing histogram   appearance change

See lectures by Trevor later in the course

slide-24
SLIDE 24

Robust, Multi-modal Algorithm

Combine modules for detection:

  • Silhouette finds body
  • Color tracks extremities
  • Pattern discriminates head from hands.

Use each also to recognize returning people:

  • Face recognition
  • Biometrics (skeletal structure)
  • Hair and Skin hue
  • Clothing (intra-day)

[ CVPR ‘98; T. Darrell, G. Gordon, M. Harville, J. Woodfill ]

slide-25
SLIDE 25

System Overview

slide-26
SLIDE 26

Classic Background Subtraction model

  • Background is assumed to be mostly static
  • Each pixel is modeled by a Gaussian distribution in YUV space
  • Model mean is usually updated using a recursive low-pass filter

Given a new image, generate the silhouette by marking those pixels that are significantly different from the “background” value.
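A minimal per-pixel sketch of this model. It assumes a fixed, shared variance and uses a hypothetical learning rate `alpha` and threshold `k` (in standard deviations); none of these values appear on the slide, and real systems typically estimate the variance per pixel as well.

```python
class PixelModel:
    """Gaussian background model for one pixel in YUV space (fixed variance)."""

    def __init__(self, yuv, variance=25.0):
        self.mean = [float(v) for v in yuv]
        self.variance = variance   # assumed constant for simplicity

    def is_foreground(self, yuv, k=3.0):
        """True if any channel lies more than k standard deviations from the mean."""
        limit = k * k * self.variance
        return any((v - m) ** 2 > limit for v, m in zip(yuv, self.mean))

    def update(self, yuv, alpha=0.05):
        """Recursive low-pass filter: mean <- (1 - alpha) * mean + alpha * new."""
        self.mean = [(1 - alpha) * m + alpha * v for m, v in zip(self.mean, yuv)]

model = PixelModel((100, 128, 128))
noise = model.is_foreground((102, 128, 129))    # small change: background
person = model.is_foreground((180, 128, 128))   # large change: silhouette pixel
model.update((110, 128, 128), alpha=0.1)
```

A common design choice, assumed here rather than taken from the slide, is to call `update` only for pixels classified as background so that a person standing still is not absorbed into the model.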

slide-27
SLIDE 27

Static Background Modeling Examples

[MIT Media Lab Pfinder / ALIVE System]

slide-28
SLIDE 28

Static Background Modeling Examples

[MIT Media Lab Pfinder / ALIVE System]

slide-29
SLIDE 29

Static Background Modeling Examples

[MIT Media Lab Pfinder / ALIVE System]

slide-30
SLIDE 30

The ALIVE System

User, Video Screen, Autonomous Agents, Camera

slide-31
SLIDE 31

ALIVE

  • Real sensing for virtual world
  • Tightly coupled sensing-behavior-action
  • Vision routines: body/head/hand tracking

Kinematics / Rendering, Camera, Projector, Vision, Behaviors / Goals, User, Agents

[ Blumberg, Darrell, Maes, Pentland, Wren, … 1995 ]

slide-32
SLIDE 32

ALIVE system, MIT

http://vismod.www.media.mit.edu/cgi-bin/tr_pagemaker (TR 257)

slide-33
SLIDE 33

http://vismod.www.media.mit.edu/cgi-bin/tr_pagemaker (TR 257)

slide-34
SLIDE 34

A Face Responsive Display

Video Display Stereo Cameras

slide-35
SLIDE 35

Vision-only Application: Interactive Video Effects

slide-36
SLIDE 36

end