Temporal Segmentation of Egocentric Videos. Yair Poleg, Chetan Arora, Shmuel Peleg. CVPR 2014 - PowerPoint PPT Presentation



SLIDE 1

Temporal Segmentation of Egocentric Videos

Yair Poleg Chetan Arora Shmuel Peleg CVPR 2014

Presenter: Hsin-Ping Huang

SLIDE 2
Egocentric Video

  • Browsing long unstructured videos is time consuming!
  • Video examples: Policeman, UN Inspectors in Syria, Google Glass

SLIDE 3

Video credit: HUJI EgoSeg Dataset

SLIDE 4

Related Work

  • Understanding objects and activities
  • Unsupervised segmentation

– Clustering: no semantic meaning
– Hard to generalize
– Short-term (seconds) vs. long-term (minutes/hours)

[Fathi et al., ICCV 2011] [Ryoo et al., CVPR 2013] [Kitani et al., CVPR 2011]

SLIDE 5

Related Work

Story-Driven Summarization

[Lu et al., CVPR 2013]

SLIDE 6

Contribution

  • Temporally segment the video into a hierarchy of motion classes
  • Detect fixations of the wearer’s gaze
SLIDE 7

Difficulty

  • Two sources of information

– Motion of the wearer
– Objects and activities

  • Hard to find ego-motion

– Head rotation
– Depth variations
– Dynamic objects

[Figures: feature tracking, optical flow. Image credit: Voodoo Camera Tracker (top)]

SLIDE 8

Classification of Wearer’s Motion

SLIDE 9

Instantaneous Displacement (ID)

  • Compute the ID at patches

[Figure: instantaneous displacement of one patch during forward motion; motion detector]
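The per-patch displacement step can be sketched as below; a minimal sketch, assuming a dense optical-flow field as input and a fixed coarse grid (the paper computes flow per patch directly, so the averaging here is a stand-in):

```python
import numpy as np

def patch_displacements(flow, grid=(10, 5)):
    """Average a dense optical-flow field over a coarse grid of patches.

    flow: (H, W, 2) array of per-pixel (dx, dy) displacements.
    Returns a (grid_rows, grid_cols, 2) array: one instantaneous
    displacement (ID) per patch. Grid size is an assumed value.
    """
    h, w, _ = flow.shape
    rows, cols = grid
    out = np.zeros((rows, cols, 2))
    for i in range(rows):
        for j in range(cols):
            # Slice out this patch and average its flow vectors.
            patch = flow[i * h // rows:(i + 1) * h // rows,
                         j * w // cols:(j + 1) * w // cols]
            out[i, j] = patch.reshape(-1, 2).mean(axis=0)
    return out
```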

SLIDE 10

Cumulative Displacement (CD)

  • Compute the CD by integrating the ID

  • Outside scene: expanding curve
  • Inside scene: horizontal expanding curve

[Plot labels: horizontal, right of focus, left of focus]
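Integrating the ID into a CD is a running sum over time; a one-line sketch, assuming per-frame displacements stacked along the first axis:

```python
import numpy as np

def cumulative_displacement(instantaneous):
    """Integrate per-frame instantaneous displacements (ID) over time.

    instantaneous: (T, 2) array of per-frame (dx, dy) for one patch.
    Returns the (T, 2) running sum, so long-term trends (e.g. the
    steadily expanding curves of forward motion) dominate the
    short-term oscillations caused by head motion.
    """
    return np.cumsum(instantaneous, axis=0)
```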

SLIDE 11

Motion Vector and Radial Projection Response

Focus of expansion

  • Compute motion vectors as the slopes of smoothed CDs
  • Compute radial projection response
  • Video

[Figure: angle between a patch’s motion vector and the ray from the focus of expansion, compared with threshold φ]
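The two bullets above can be sketched as follows; the smoothing window, the slope fit, and the cosine-based response are assumptions standing in for the paper's exact formulation:

```python
import numpy as np

def motion_vector(cd, window=30):
    """Slope of a smoothed cumulative-displacement curve, per axis.

    cd: (T, 2) cumulative displacement of one patch.
    window: smoothing window in frames (assumed value).
    Returns (slope_x, slope_y): the patch's motion vector.
    """
    kernel = np.ones(window) / window
    sx = np.convolve(cd[:, 0], kernel, mode='valid')
    sy = np.convolve(cd[:, 1], kernel, mode='valid')
    t = np.arange(len(sx))
    # Fit a line to the smoothed curve; the slope is the motion vector.
    return np.polyfit(t, sx, 1)[0], np.polyfit(t, sy, 1)[0]

def radial_projection_response(vectors, centers, foe):
    """Mean cosine between patch motion vectors and the rays from an
    assumed focus of expansion (FOE); close to 1 when motion is
    radially outwards, as in forward walking.
    """
    rays = (centers - foe)
    rays = rays / np.linalg.norm(rays, axis=1, keepdims=True)
    v = vectors / np.linalg.norm(vectors, axis=1, keepdims=True)
    return float(np.mean(np.sum(rays * v, axis=1)))
```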

SLIDE 12

Video credit: Shmuel Peleg

SLIDE 13

Motion Vector and Radial Projection Response

  • Walking: large instantaneous displacement vectors; motion vectors point radially outwards; high radial projection response
  • Standing: mixed displacements (head motion); low radial projection response
  • Riding Bus: small displacements (global motion); low radial projection response in the outside region

SLIDE 14

Feature

  • AVG of top/bottom 6% motion vectors
  • DIFF of top/bottom 6% motion vectors
  • AVG of motion vectors
  • Motion vectors
  • # of successful flow computations
  • AVG and SD of instantaneous displacements
  • Radial projection response
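A hedged sketch of assembling the scalar features listed above into one vector (the 6% fraction follows the slide; the ordering, the per-frame aggregation, and the omission of the raw motion-vector entries are assumptions):

```python
import numpy as np

def frame_features(motion_vectors, displacements, flow_ok,
                   radial_response, frac=0.06):
    """Build a feature vector for one temporal block.

    motion_vectors: (P,) magnitudes of per-patch motion vectors.
    displacements:  (P,) instantaneous displacement magnitudes.
    flow_ok:        (P,) booleans, True where flow computation succeeded.
    radial_response: scalar radial projection response.
    (The full feature set also includes the raw motion vectors.)
    """
    k = max(1, int(frac * len(motion_vectors)))
    srt = np.sort(motion_vectors)
    top, bot = srt[-k:].mean(), srt[:k].mean()
    return np.array([
        (top + bot) / 2,          # AVG of top/bottom 6% motion vectors
        top - bot,                # DIFF of top/bottom 6% motion vectors
        motion_vectors.mean(),    # AVG of motion vectors
        flow_ok.sum(),            # number of successful flow computations
        displacements.mean(),     # AVG of instantaneous displacements
        displacements.std(),      # SD of instantaneous displacements
        radial_response,          # radial projection response
    ])
```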
SLIDE 15
Classifier

  • Train an SVM classifier for each binary classification task in the proposed class hierarchy
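Classification then walks the hierarchy, applying one binary decision per internal node. A minimal sketch; the node names are hypothetical, and toy threshold rules stand in for the trained SVMs:

```python
def classify_hierarchy(x, tree, classifiers):
    """Walk a binary class hierarchy from the root to a leaf.

    tree: {node: (left_child, right_child)}; leaves have no entry.
    classifiers: {node: f(x) -> bool}; True means take the left branch.
    Returns the leaf label reached.
    """
    node = 'root'
    while node in tree:
        left, right = tree[node]
        node = left if classifiers[node](x) else right
    return node

# Toy example: a two-level hierarchy over a scalar "speed" feature.
# Node names and thresholds are illustrative, not the paper's.
tree = {'root': ('Stationary', 'InTransit'),
        'InTransit': ('Walking', 'Wheels')}
clf = {'root': lambda s: s < 1.0,
       'InTransit': lambda s: s < 5.0}
```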

SLIDE 16

Detecting Period of Gaze Fixation

SLIDE 17

Gaze

Cumulative Displacement

[Plot: original vs. smoothed CD curve, showing left motion and right motion]

SLIDE 18

Cumulative Difference

  • Compute the cumulative difference between the original and smoothed CD curves, separately for its positive and negative parts
  • Motion detector: peaks higher than 1 standard deviation become gaze hypotheses
  • Gaze: hypotheses passing the > 80% threshold
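A minimal sketch of this step, assuming a 1-D CD curve, a fixed smoothing window, and a mean-plus-one-sigma peak rule (all assumed details; the 80% acceptance stage is not shown):

```python
import numpy as np

def cumulative_difference(cd, window=30):
    """Cumulative positive and negative differences between the original
    and smoothed CD curves. Window size is an assumed value.
    """
    kernel = np.ones(window) / window
    smooth = np.convolve(cd, kernel, mode='same')
    d = cd - smooth
    pos = np.cumsum(np.clip(d, 0, None))   # integrated positive part
    neg = np.cumsum(np.clip(-d, 0, None))  # integrated negative part
    return pos, neg

def gaze_hypotheses(signal, k_sigma=1.0):
    """Flag samples exceeding the mean by k_sigma standard deviations,
    a stand-in for the 'higher peaks' threshold on the slide."""
    return signal > signal.mean() + k_sigma * signal.std()
```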

SLIDE 19

Experiment

SLIDE 20

Dataset

  • > 65 hours of egocentric video
  • Manually annotated as one of the leaf classes
  • Video
SLIDE 21

Video credit: HUJI EgoSeg Dataset

SLIDE 22

Classification of Wearer’s Motion

  • Leaf-node accuracy and inner-node accuracy (average: 70%, best: 97%)
  • Hardest pairs: Sitting vs. Standing, Bus vs. Standing

SLIDE 23
Detecting Period of Gaze Fixation

  • Valid gaze fixation: a head fixation > 5 seconds
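The 5-second validity rule amounts to filtering fixation runs by duration; a sketch, assuming a per-frame boolean fixation mask and a 30 fps frame rate (the fps value is an assumption):

```python
def valid_fixations(mask, fps=30, min_seconds=5):
    """Keep only head-fixation runs longer than min_seconds.

    mask: iterable of booleans, one per frame (True = fixating).
    Returns a list of (start_frame, end_frame) half-open intervals.
    """
    runs, start = [], None
    # Append a sentinel False so a run at the very end is closed.
    for i, m in enumerate(list(mask) + [False]):
        if m and start is None:
            start = i                       # run begins
        elif not m and start is not None:
            if i - start > min_seconds * fps:
                runs.append((start, i))     # long enough: keep it
            start = None
    return runs
```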

SLIDE 24

Conclusion

SLIDE 25
Weakness

  • Mixed features from adjacent activities
      – e.g., short-term sitting while riding

SLIDE 26
Weakness

  • Mixed activities
      – Waiting in line = Standing + Walking
      – Riding an open train = Open or Riding?
      – Standing while coming into the station = Static or Box?
  • Ambiguity in gaze fixation
      – A left and right turn in quick succession
      – A person turning in place

SLIDE 27

Strength

  • Simple, efficient and robust
  • Use only the recorded video
  • Make no assumptions on the scene structure
  • Focus on long-term activities to prevent over-segmentation of the video

SLIDE 28

Extension

  • Use bilateral filter to find long-term trends
  • Use a regularization framework like MRF on the classification results

  • Handle the ambiguity in gaze fixation
  • Combine with external sources such as GPS and inertial sensors

  • Generalize to detect short-term activities
  • Aid video summarization