Detecting Activities of Daily Living in First-person Camera Views - PowerPoint PPT Presentation

SLIDE 1
Detecting Activities of Daily Living in First-person Camera Views

Hamed Pirsiavash, Deva Ramanan Computer Science Department, UC Irvine

SLIDE 2

Motivation

A sample video of Activities of Daily Living

SLIDE 3

Applications

Tele-rehabilitation

  • Kopp et al., Archives of Physical Medicine and Rehabilitation, 1997.
  • Catz et al., Spinal Cord, 1997.

Long-term at-home monitoring

SLIDE 4

Applications

Life-logging

  • Gemmell et al., “MyLifeBits: a personal database for everything,” Communications of the ACM, 2006.
  • Hodges et al., “SenseCam: A retrospective memory aid,” UbiComp, 2006.

So far, mostly “write-only” memory! This is the right time for the computer vision community to get involved.

SLIDE 5

Related work: action recognition

There are quite a few video benchmarks for action recognition. Collecting interesting but natural video is surprisingly hard, and it is difficult to define action categories outside the “sports” domain.

Benchmarks: KTH (ICPR’04), UCF Sports (CVPR’08), UCF YouTube (CVPR’08), Hollywood (CVPR’09), Olympic Sports (BMVC’10), VIRAT (CVPR’11)

SLIDE 6

Wearable ADL detection

It is easy to collect natural data

SLIDE 7

Wearable ADL detection

ADL actions are derived from the medical literature on patient rehabilitation. It is easy to collect natural data.

SLIDE 8
Outline

  • Challenges
    – What features to use?
    – Appearance model
    – Temporal model
  • Our model
    – “Active” vs “passive” objects
    – Temporal pyramid
  • Dataset
  • Experiments

SLIDE 9-10

Challenges

What features to use?

Feature spectrum: low-level features (weak semantics) to high-level features (strong semantics).

  • Low-level: space-time interest points (Laptev, IJCV’05)
  • High-level: human pose, object-centric features

Difficulties of pose:
  • Detectors are not accurate enough
  • Not useful in first-person camera views

SLIDE 11-12

Challenges

Occlusion / Functional state

“Classic” data vs. wearable data (example images)

SLIDE 13-14

Challenges

Long-scale temporal structure

“Classic” data: boxing. Wearable data: making tea.

Timeline of making tea: start boiling water → do other things (while waiting) → pour in cup → drink tea

It is difficult for HMMs to capture such long-term temporal dependencies.

SLIDE 15
Outline

  • Challenges
    – What features to use?
    – Appearance model
    – Temporal model
  • Our model
    – “Active” vs “passive” objects
    – Temporal pyramid
  • Dataset
  • Experiments

SLIDE 16-17

“Passive” vs “active” objects

Passive / Active (example images)

SLIDE 18

“Passive” vs “active” objects

Distinguishing passive from active objects gives:
  • Better object detection (visual phrases, CVPR’11)
  • Better features for action classification (active vs. passive)

SLIDE 19-20

Appearance feature: bag of objects

Pipeline: video clip → bag of detected objects → SVM classifier

The histogram keeps separate bins for active and passive instances of each object category (e.g. active fridge, active stove, passive fridge), rather than a single bin per category (fridge, TV, stove).
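Concretely, a minimal sketch of this feature in Python, assuming per-clip detection lists are already available (the object subset, toy clips, and labels below are illustrative, not the paper’s code or data):

```python
import numpy as np
from sklearn.svm import LinearSVC

OBJECTS = ["fridge", "tv", "stove"]  # illustrative subset of the 24 categories
STATES = ["active", "passive"]

def bag_of_objects(detections):
    """detections: (object, state) pairs pooled over all frames of a clip.
    Returns a normalized histogram with one bin per (state, object) pair,
    so an active fridge and a passive fridge land in different bins."""
    hist = np.zeros(len(STATES) * len(OBJECTS))
    for name, state in detections:
        hist[STATES.index(state) * len(OBJECTS) + OBJECTS.index(name)] += 1
    return hist / max(hist.sum(), 1.0)  # normalize away clip length

# Two toy clips, just to exercise the pipeline end to end.
clips = [
    [("fridge", "active"), ("stove", "active"), ("fridge", "passive")],
    [("tv", "active"), ("tv", "active"), ("fridge", "passive")],
]
labels = ["making tea", "watching tv"]
clf = LinearSVC().fit(np.stack([bag_of_objects(c) for c in clips]), labels)
```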

SLIDE 21

Temporal pyramid

Inspired by “Spatial Pyramid” (CVPR’06) and “Pyramid Match Kernels” (ICCV’05): coarse-to-fine correspondence matching with a multi-layer pyramid.

Pipeline: video clip → temporal pyramid descriptor (histograms pooled over nested temporal segments: the whole clip, its halves, and so on along the time axis) → SVM classifier
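A minimal sketch of such a descriptor, assuming one bag-of-objects histogram per frame as input; the number of levels and the plain sum-pooling are assumptions in the spirit of spatial pyramids, not necessarily the paper’s exact weighting:

```python
import numpy as np

def temporal_pyramid(frame_hists, levels=3):
    """frame_hists: (T, D) array, one object histogram per frame.
    Level l splits the clip into 2**l equal temporal segments and pools
    one histogram per segment; all segments are concatenated, giving
    D * (2**levels - 1) dimensions (7 * D for levels=3)."""
    frame_hists = np.asarray(frame_hists, dtype=float)
    T = max(len(frame_hists), 1)
    parts = []
    for l in range(levels):
        for segment in np.array_split(frame_hists, 2 ** l):
            parts.append(segment.sum(axis=0) / T)  # normalize by clip length
    return np.concatenate(parts)

# Example: a 100-frame clip with 6-bin per-frame histograms -> 42-D descriptor
descriptor = temporal_pyramid(np.random.rand(100, 6))
assert descriptor.shape == (42,)
```

The resulting descriptor feeds the same SVM classifier as the flat bag-of-objects feature; the finer levels reward matches that occur at roughly the same point in the clip.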

SLIDE 22
Outline

  • Challenges
    – What features to use?
    – Appearance model
    – Temporal model
  • Our model
    – “Active” vs “passive” objects
    – Temporal pyramid
  • Dataset
  • Experiments

SLIDE 23

Sample video with annotations

SLIDE 24

Wearable ADL data collection

  • 20 persons
  • 20 different apartments
  • 10 hours of HD video
  • 170-degree viewing angle
  • Annotated (see the illustrative record after this list):
    – Actions
    – Object bounding boxes
    – Active/passive objects
    – Object IDs
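For illustration only, one frame’s annotations might be organized like the record below; the field names and values are hypothetical, not the dataset’s actual schema:

```python
# Hypothetical annotation record; the fields mirror the annotation types
# listed above, but this schema is an assumption, not the released format.
annotation = {
    "frame": 1042,
    "action": "making tea",             # action label covering this frame
    "objects": [
        {"id": 7,                       # object ID, stable across frames
         "category": "kettle",
         "bbox": [412, 180, 590, 330],  # [x1, y1, x2, y2] in pixels
         "state": "active"},            # active vs. passive annotation
    ],
}
```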

Prior work:

  • Lee et al., CVPR’12
  • Fathi et al., CVPR’11, CVPR’12
  • Kitani et al., CVPR’11
  • Ren et al., CVPR’10
SLIDE 25

Average object locations

Average location maps for active and passive objects (figure panels labeled Active / Passive)

Active objects tend to appear on the right-hand side and closer to the camera.

  – Right-handed people are dominant
  – We cannot mirror-flip images in training

SLIDE 26
Outline

  • Challenges
    – What features to use?
    – Appearance model
    – Temporal model
  • Our model
    – “Active” vs “passive” objects
    – Temporal pyramid
  • Dataset
  • Experiments

SLIDE 27

Experiments

  • Our model (high-level features): object-centric features over 24 object categories
  • Baseline (low-level features): space-time interest points (STIP), Laptev et al., BMVC’09

SLIDE 28-29

Accuracy on 18 action categories

  • Our model: 40.6%
  • STIP baseline: 22.8%
SLIDE 30-34

Classification accuracy

  • The temporal model helps
  • Our object-centric features outperform STIP
  • Visual phrases improve accuracy
  • Ideal object detectors double the performance

Results on temporally continuous video and the taxonomy loss are included in the paper.

SLIDE 35-37

Summary

Data and code will be available soon!

SLIDE 38

Thanks!