
School of Informatics, University of Edinburgh

LOCAL ACTION RECOGNITION

ACTION PRIMITIVES, NOT SEQUENCES, E.G. A HAND WAVE

  • TEMPORALLY LOCAL/SHORT-TERM IMAGE ANALYSIS/INSTANTANEOUS
  • APPEARANCE BASED/VIEWPOINT SPECIFIC

DAVIS & BOBICK

ECVision Summer School: 4 - Local Action Recognition Fisher slide 1 School of Informatics, University of Edinburgh

PROBLEM

DISTINGUISH THESE 18 ACTIONS, GIVEN ONLY ABOUT 10 CONSECUTIVE FRAMES


CLASSIFICATION OF APPROACHES

ACTION RECOGNITION

  • USING HUMANOID GEOMETRIC MODELS: 2D MODELS, 3D MODELS
  • IMAGE APPEARANCE CHANGES


KEY STEPS (DETAILS TO COME)

  • 1. BACKGROUND SUBTRACTION & THRESHOLDING
  • 2. TEMPORAL SEGMENTATION
  • 3. ACCUMULATE MEI & MHI
  • 4. COMPUTE 14 MOMENT INVARIANTS
  • 5. MAHALANOBIS CLASSIFIER
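
Step 1 can be sketched as follows. This is a minimal illustration, assuming grey-level frames stored as NumPy arrays and a static background image; the function name `threshold_difference` and the threshold value 30 are hypothetical choices, not taken from the slides.

```python
import numpy as np

def threshold_difference(frame, background, thresh=30):
    """Return a binary mask D: 1 where |frame - background| exceeds thresh."""
    # Cast to a signed type so subtracting uint8 images cannot wrap around.
    diff = np.abs(frame.astype(np.int32) - background.astype(np.int32))
    return (diff > thresh).astype(np.uint8)

# Toy data: a flat background and a frame with a bright 2x2 "moving" patch.
background = np.full((4, 4), 100, dtype=np.uint8)
frame = background.copy()
frame[1:3, 1:3] = 200
D = threshold_difference(frame, background)
print(D)
```

The resulting mask D is the input to the MEI and MHI accumulation in step 3.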


MOTION ENERGY IMAGE (MEI)

GIVEN D(x, y, t): THRESHOLDED DIFFERENCE BETWEEN FRAME t AND BACKGROUND AT PIXEL (x, y)

MEI: $M(x, y, t) = \bigcup_{i=0}^{\tau-1} D(x, y, t - i)$

“WHERE” MOTION OCCURS
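
A minimal sketch of the MEI, assuming the binary masks D are stacked in a NumPy array with the newest frame last and that the combining operator is a union (logical OR) over the last τ masks. The name `motion_energy_image` and the toy data are illustrative only.

```python
import numpy as np

def motion_energy_image(D_stack, tau):
    """Union of the last tau binary masks; D_stack has shape (T, H, W)."""
    recent = D_stack[-tau:]
    return np.any(recent > 0, axis=0).astype(np.uint8)

# Toy data: one pixel moving left to right along a 1x5 row over 5 frames.
D_stack = np.zeros((5, 1, 5), dtype=np.uint8)
for t in range(5):
    D_stack[t, 0, t] = 1

M = motion_energy_image(D_stack, tau=3)
print(M)  # the three most recent positions are set
```

With τ = 3 only the last three positions of the moving pixel are marked, so the MEI records where motion happened inside the window but not in which order.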


MOTION ENERGY IMAGE EXAMPLE


VIEW BASED REPRESENTATION

  • MULTIPLE 2D VIEWS, NOT 3D
  • SAMPLE AZIMUTH EVERY 30 DEGREES IN TRAINING


MOTION HISTORY IMAGE (MHI)

GIVEN D(x, y, t): THRESHOLDED FRAME DIFFERENCE

MHI: $H(x, y, t) = \begin{cases} \tau & \text{IF } D(x, y, t) = 1 \\ \max(0, H(x, y, t - 1) - 1) & \text{ELSE} \end{cases}$

“ORDER” MOTION OCCURS
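
The MHI recurrence translates almost directly into code. The sketch below assumes integer-valued arrays and an all-zero initial history; `update_mhi` is a hypothetical helper name.

```python
import numpy as np

def update_mhi(H_prev, D, tau):
    """One MHI step: set to tau where D == 1, otherwise decay by 1 down to 0."""
    return np.where(D == 1, tau, np.maximum(0, H_prev - 1))

# Toy data: one pixel moving left to right along a 1x5 row over 5 frames.
H = np.zeros((1, 5), dtype=np.int64)
for t in range(5):
    D = np.zeros((1, 5), dtype=np.uint8)
    D[0, t] = 1
    H = update_mhi(H, D, tau=3)

print(H)  # most recent position holds tau, older ones hold smaller values
```

The values fall off along the path of the pixel, which is how the MHI encodes the order of motion. Thresholding H above zero gives back the MEI for the same τ.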


MHI EXAMPLE

MORE RECENT PIXELS BRIGHTER


HU’S MOMENT INVARIANTS

  • MOMENT INVARIANT (HERE) IS A NUMERICAL PROPERTY OF A WHOLE IMAGE
  • USED TO SUMMARIZE MEI AND MHI IMAGES
  • HU’S INVARIANTS INDEPENDENT OF: TRANSLATION, ROTATION, SCALE, INVERSION


INITIAL VALUES

LET I(x, y) BE THE INITIAL IMAGE (BINARY OR GREY)

AREA: $N = \sum_x \sum_y I(x, y)$

CENTER OF MASS:

$c_x = \frac{1}{N} \sum_x \sum_y x \, I(x, y)$

$c_y = \frac{1}{N} \sum_x \sum_y y \, I(x, y)$
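
As a small worked example, the area and center of mass can be computed as below. The sketch assumes x indexes columns and y indexes rows, and that the image is not all zero; `area_and_centroid` is an illustrative name.

```python
import numpy as np

def area_and_centroid(I):
    """Return (N, cx, cy) for a binary or grey image I; assumes I.sum() > 0."""
    I = np.asarray(I, dtype=float)
    ys, xs = np.mgrid[0:I.shape[0], 0:I.shape[1]]  # row (y) and column (x) coordinates
    N = I.sum()
    cx = (xs * I).sum() / N
    cy = (ys * I).sum() / N
    return N, cx, cy

# A 3x3 block occupying rows 1..3 and columns 2..4 of a 5x5 image.
I = np.zeros((5, 5))
I[1:4, 2:5] = 1
N, cx, cy = area_and_centroid(I)
print(N, cx, cy)
```

For this block the area is 9 and the center of mass sits at the middle pixel, (x, y) = (3, 2).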


CENTRAL MOMENTS

TRANSLATION INVARIANT: $m_{pq} = \sum_x \sum_y (x - c_x)^p (y - c_y)^q I(x, y)$

ADD SCALE INVARIANCE: $\mu_{pq} = \frac{m_{pq}}{N^{(p+q)/2+1}}$
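
The two definitions can be checked numerically. The sketch below uses hypothetical helper names (`central_moment`, `normalized_moment`, `square`) and compares a square, a shifted copy, and a copy twice the size.

```python
import numpy as np

def central_moment(I, p, q):
    """m_pq = sum over pixels of (x - cx)^p (y - cy)^q I(x, y)."""
    I = np.asarray(I, dtype=float)
    ys, xs = np.mgrid[0:I.shape[0], 0:I.shape[1]]
    N = I.sum()  # assumed non-zero
    cx, cy = (xs * I).sum() / N, (ys * I).sum() / N
    return ((xs - cx) ** p * (ys - cy) ** q * I).sum()

def normalized_moment(I, p, q):
    """mu_pq = m_pq / N^((p + q) / 2 + 1)."""
    N = np.asarray(I, dtype=float).sum()
    return central_moment(I, p, q) / N ** ((p + q) / 2 + 1)

def square(n, size=40, offset=2):
    """Binary image containing an n x n square (test data only)."""
    img = np.zeros((size, size))
    img[offset:offset + n, offset:offset + n] = 1
    return img

small, shifted, large = square(8), square(8, offset=20), square(16)
for img in (small, shifted, large):
    print(normalized_moment(img, 2, 0))
```

The shifted square gives the same value as the original, while the larger square gives a value that is close but not identical: on a pixel grid the scale invariance is only approximate, which is one source of the quantization sensitivity these moments can show.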


ADDING ROTATION INVARIANCE

7 MOMENT INVARIANTS:

$I_1 = \mu_{20} + \mu_{02}$

$I_2 = (\mu_{20} - \mu_{02})^2 + 4(\mu_{11})^2$

$I_3 = (\mu_{30} - 3\mu_{12})^2 + (\mu_{03} - 3\mu_{21})^2$

$I_4 = (\mu_{30} + \mu_{12})^2 + (\mu_{03} + \mu_{21})^2$

. . .

USEFUL TO NORMALIZE $I'_n = f(I_n)$ TO SIMILAR VALUE RANGE

CAN BE SENSITIVE TO NOISE AND IMAGE QUANTIZATION
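
A sketch of the first four invariants, using the standard form of Hu's first invariant, µ20 + µ02. The slides do not say which normalizing function f is used; the signed log below is only one common choice and is an assumption here, as are the helper names `mu`, `hu_first_four` and `log_normalize`.

```python
import numpy as np

def mu(I, p, q):
    """Scale-normalized central moment mu_pq of image I (I.sum() assumed > 0)."""
    I = np.asarray(I, dtype=float)
    ys, xs = np.mgrid[0:I.shape[0], 0:I.shape[1]]
    N = I.sum()
    cx, cy = (xs * I).sum() / N, (ys * I).sum() / N
    m = ((xs - cx) ** p * (ys - cy) ** q * I).sum()
    return m / N ** ((p + q) / 2 + 1)

def hu_first_four(I):
    """First four Hu moment invariants I1..I4."""
    m20, m02, m11 = mu(I, 2, 0), mu(I, 0, 2), mu(I, 1, 1)
    m30, m03, m21, m12 = mu(I, 3, 0), mu(I, 0, 3), mu(I, 2, 1), mu(I, 1, 2)
    I1 = m20 + m02
    I2 = (m20 - m02) ** 2 + 4 * m11 ** 2
    I3 = (m30 - 3 * m12) ** 2 + (m03 - 3 * m21) ** 2
    I4 = (m30 + m12) ** 2 + (m03 + m21) ** 2
    return np.array([I1, I2, I3, I4])

def log_normalize(v, eps=1e-30):
    """Assumed choice of f: signed log magnitude, to bring values to a similar range."""
    v = np.asarray(v, dtype=float)
    return -np.sign(v) * np.log10(np.abs(v) + eps)

# An asymmetric L-shaped blob and the same blob rotated by 90 degrees.
shape = np.zeros((6, 6))
shape[1:5, 1] = 1
shape[4, 1:4] = 1
original = hu_first_four(shape)
rotated = hu_first_four(np.rot90(shape))
print(original, rotated, log_normalize(original))
```

A 90 degree rotation maps the pixel grid onto itself, so the two sets of invariants agree to floating-point precision. For other angles the image has to be resampled and the agreement is only approximate.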


ENCODING & MATCHING MEI & MHI

  • EACH FRAME ENCODED AS A VECTOR $\vec{h}$ WITH 14 VALUES: 7 HU MOMENTS EACH FOR MEI & MHI
  • DO FOR ALL TRAINING SEQUENCES i OVER ALL ACTIONS a ∈ A AND ALL VIEWS v ∈ V: $\{\vec{h}_{avi}\}$
  • COMPUTE $\vec{b}_{av} = \mathrm{mean}_i(\{\vec{h}_{avi}\})$ OVER MULTIPLE EXAMPLES
  • COMPUTE COVARIANCE MATRIX $R_{av}$
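
The training step amounts to one mean vector and one covariance matrix per (action, view) pair. The sketch below assumes the 14-value descriptors are already computed and grouped in a dictionary; the function name `fit_class_models`, the action label "wave", the view angles and the random data are all placeholders.

```python
import numpy as np

def fit_class_models(descriptors):
    """Map each (action, view) key to (mean vector b_av, covariance matrix R_av).

    descriptors: dict of (action, view) -> array of shape (n_examples, 14).
    """
    models = {}
    for key, h in descriptors.items():
        h = np.asarray(h, dtype=float)
        b = h.mean(axis=0)
        R = np.cov(h, rowvar=False)  # rows are examples, columns are the 14 features
        models[key] = (b, R)
    return models

# Synthetic stand-in data: 50 examples for each of two views of one action.
rng = np.random.default_rng(0)
descriptors = {
    ("wave", 0): rng.normal(loc=0.0, size=(50, 14)),
    ("wave", 30): rng.normal(loc=1.0, size=(50, 14)),
}
models = fit_class_models(descriptors)
b, R = models[("wave", 0)]
print(b.shape, R.shape)
```

With fewer examples than features the covariance estimate is singular, so in practice each (action, view) pair needs more than 14 examples or some form of regularization.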


ACTION RECOGNITION

FOR AN UNKNOWN FRAME WITH DESCRIPTION $\vec{x}$, PICK THE ACTION a AND VIEWPOINT v MINIMIZING MAHALANOBIS DISTANCE:

$(\vec{x} - \vec{b}_{av})' R_{av}^{-1} (\vec{x} - \vec{b}_{av})$
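
A minimal sketch of the classifier, assuming one (mean, covariance) pair per (action, view) key. The example uses 2-dimensional descriptors and made-up class labels purely to keep the numbers easy to follow; the real descriptors have 14 values.

```python
import numpy as np

def mahalanobis_sq(x, b, R):
    """Squared Mahalanobis distance (x - b)' R^{-1} (x - b)."""
    d = np.asarray(x, dtype=float) - b
    # Solve R z = d instead of forming the inverse explicitly.
    return float(d @ np.linalg.solve(R, d))

def classify(x, models):
    """Return the (action, view) key whose model is closest to x."""
    return min(models, key=lambda k: mahalanobis_sq(x, *models[k]))

# Hypothetical 2D models: (mean, covariance) per (action, view).
models = {
    ("sit", 0): (np.array([0.0, 0.0]), np.eye(2)),
    ("wave", 30): (np.array([5.0, 5.0]), 4.0 * np.eye(2)),
}
print(classify([1.0, 1.0], models))
print(classify([4.0, 4.0], models))
```

The point (1, 1) is at squared distance 2 from the first model and 8 from the second, so it is assigned to ("sit", 0); the point (4, 4) is at 32 and 0.5 respectively and goes to ("wave", 30).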


EXP 1: 1 VIEW, 18 CLASSES


EXP 2: 2 VIEWS, 18 CLASSES


TEMPORAL SEGMENTATION

  • HOW TO CHOOSE TEMPORAL WINDOW SIZE FOR MEI/MHI COMPUTATION?
  • HAS EFFICIENT SCHEME FOR COMPUTING SEVERAL τ = 11 ... 19 (1-2 SEC)
  • TRIED ALL τ, USED BEST RESULT?
  • LOTS OF MISSING DETAILS
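
The slides do not describe the efficient scheme. One possibility that follows from the MHI recurrence (decay by 1 per frame) is to keep only the MHI for the largest τ and obtain any smaller window by subtracting the difference and clipping at zero. The sketch below checks this against a direct computation on random masks; the helper names and the data are illustrative assumptions.

```python
import numpy as np

def update_mhi(H_prev, D, tau):
    """One MHI step: tau where D == 1, otherwise decay by 1 down to 0."""
    return np.where(D == 1, tau, np.maximum(0, H_prev - 1))

def mhi_for_smaller_tau(H_large, tau_large, tau_small):
    """Derive the MHI for tau_small from the MHI computed with tau_large."""
    return np.maximum(0, H_large - (tau_large - tau_small))

# Random binary motion masks: 40 frames of 8x8 pixels.
rng = np.random.default_rng(0)
D_stack = (rng.random((40, 8, 8)) < 0.1).astype(np.uint8)

# Direct computation for both window sizes.
H19 = np.zeros((8, 8), dtype=np.int64)
H11 = np.zeros((8, 8), dtype=np.int64)
for D in D_stack:
    H19 = update_mhi(H19, D, 19)
    H11 = update_mhi(H11, D, 11)

derived = mhi_for_smaller_tau(H19, 19, 11)
print(np.array_equal(derived, H11))
```

This works because a pixel last seen moving k frames ago holds max(0, τ − k), so changing τ only shifts the stored value. The matching MEI for each window is then the set of pixels where the derived MHI is above zero.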


OPEN ISSUES

  • ACTION TRANSITIONS
  • RECOGNIZING ACTION AT MIDDLE FRAME INSTEAD OF END
  • SOME MOMENTS FRAGILE, MAYBE NOT ALL USEFUL
  • SOME TEMPORAL QUANTIZATION EFFECTS OBVIOUS IN DATA


WHAT WE HAVE LEARNED

  • 1. LOCAL SPATIO-TEMPORAL REPRESENTATION
  • 2. MOTION DESCRIPTIONS GIVE GOOD HYPOTHESIS ABOUT CURRENT ACTIVITY, IN A RESTRICTED DOMAIN
  • 3. APPEARANCE BASED
  • 4. LOTS OF IMPROVEMENTS POSSIBLE


LECTURE PROBLEM

HOW CAN WE DETERMINE HOW MANY VIEWPOINTS SHOULD BE USED IN THE REPRESENTATION?
