
Much Ado About Time: Exhaustive Annotation of Temporal Data


  1. Much Ado About Time: Exhaustive Annotation of Temporal Data. Gunnar A. Sigurdsson, Olga Russakovsky, Ali Farhadi, Ivan Laptev, Abhinav Gupta.

  2. Datasets drive computer vision progress. Figure: computer vision capabilities plotted against dataset scale and complexity, with three datasets and the algorithms each enabled. Caltech 101 [Fei-Fei ’04]; algorithms: [Berg ’05], [Grauman ’05], [Zhang ’06], [Lazebnik ’06], [Jain ’08], [Boiman ’08], [Yang ’09], [Maji ’09], [Wang ’10], [Zhou ’10], [Feng ’11], [Jiang ’11], … PASCAL VOC [Everingham ’07]; algorithms: [Chum ’07], [Felzenszwalb ’08], [Wang ’09], [Harzallah ’09], [Bourdev ’09], [Vedaldi ’09], [Lin ’09], [Lampert ’09], [Carreira ’10], [Wang ’10], [Song ’11], [vanDeSande ’11], … ImageNet [Deng ’09]; algorithms: [Deng ’10], [Sanchez ’11], [Lin ’11], [Krizhevsky ’12], [Zeiler ’13], [Wang ’13], [Sermanet ’13], [Simonyan ’14], [Lin ’14], [Girshick ’14], [Szegedy ’14], [He ’15], … Need: (1) dense, detailed, multi-label annotations; (2) large-scale annotated video datasets. Slide footer: Much Ado About Time: Exhaustive Annotation of Temporal Data, http://allenai.org/plato/charades/

  3. Multi-label video annotation. [Figure: a label matrix with one row per video (10,000 videos) and one column per label (100-200 labels, e.g. opens book, puts book on shelf, walks, turns on stove, eats, sits down, sneezes); each cell is marked + or -.]

  4. Multi-label video annotation. [Figure: the same label matrix (opens book, puts book on shelf, walks, turns on stove, eats, sits down, sneezes), with a single label's column marked unknown (?) for every video.]

  5. Multi-label video annotation. [Figure: the same label matrix (opens book, puts book on shelf, walks, turns on stove, eats, sits down, sneezes), with every label of a single video's row marked unknown (?).]

  6. Which interface is better? One-label (☐ Opens book; repeat N times for N labels): expect better annotation accuracy. All-labels (☐ Opens book ☐ Puts book on shelf ☐ Walks ☐ Turns on stove ☐ Eats ☐ Sits down …): expect better annotation time.

  7. Which interface is better? One-label (☐ Opens book; repeat N times for N labels) vs All-labels (☐ Opens book ☐ Puts book on shelf ☐ Walks ☐ Turns on stove …). Data: 140 videos, each ~30 secs long. Labels: 52 human actions. Charades dataset of [Sigurdsson ECCV 2016]. Experiment on Amazon Mechanical Turk. Time: many-labels is better. Accuracy: few-labels is better [Miller Psychological Review 1956].

  8. Many-labels is better. Improving annotation time. Consistency in the few-labels setting: ask the same worker about the same actions for multiple videos => 13.6% reduction in annotation time. [Figure: Worker 1 sees ☐ Opens book, ☐ Opens book, ☐ Opens book across videos, vs Worker 1 sees ☐ Opens book, ☐ Walks, ☐ Sits down.] Play video at 2x speed [Lasecki UIST 2014]. Semantic hierarchy of labels [Deng CHI 2014].

  9. Few-labels is better. Improving recall. [Figure: many-labels checklist: ☐ Opens book ☐ Puts book on shelf ☐ Walks ☐ Turns on stove ☐ Eats ☐ Sits down ☐ Sneezes ☐ Picks up a cup ☐ Holds a dish …] Video summary: request a 20-word description of the video => no effect on recall, 40% slower. Forced response: request a yes/no response for every label => actually drops recall! (annoys workers?) Consensus annotation: rely on multiple rounds of annotation with different workers => recall improves from 58.0% to 83.3% with 3 rounds [Krishna CHI 2016].

  10. Bringing it all together. Data: 1,815 videos, each ~30 secs long, 2x speed. Labels: 157 human actions, organized into a hierarchy with 52 high-level actions. Charades dataset of [Sigurdsson ECCV 2016]. Experiments on Amazon Mechanical Turk. A label is positive if >= 1 worker marks it as positive. [Figure: two plots, precision and recall, each against cumulative time [min] (0 to 10), comparing the many-label interface (26) with the few-label interface (5); points are marked at the 1st round, 3 rounds, and 7 rounds.]

  11. Conclusions • Quantitative analysis of multi-label video annotation • Many-labels interface is better than the few-labels interface • Annotation of 157 human actions on 9,848 videos (incl. temporal extent). Download dataset at http://allenai.org/plato/charades [Figure: a video played at 3x speed alongside its annotated actions.]
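
The consensus rule on slides 9 and 10 (a label is positive if at least one worker marks it, with results accumulated over rounds) can be sketched as below. This is a minimal illustration, not the authors' code: the function names, the per-round data layout (one set of positive labels per round), and the toy ground truth are all assumptions made for the example, and the slides do not specify how precision and recall were computed.

```python
from typing import List, Set, Tuple


def consensus_labels(rounds: List[Set[str]]) -> Set[str]:
    """Merge rounds with the union rule: positive if >= 1 worker marked it."""
    merged: Set[str] = set()
    for positives in rounds:
        merged |= positives
    return merged


def precision_recall(predicted: Set[str], truth: Set[str]) -> Tuple[float, float]:
    """Precision and recall of a predicted label set against a reference set."""
    true_pos = len(predicted & truth)
    # Convention assumed here: empty denominators score 1.0.
    precision = true_pos / len(predicted) if predicted else 1.0
    recall = true_pos / len(truth) if truth else 1.0
    return precision, recall


def cumulative_scores(
    rounds: List[Set[str]], truth: Set[str]
) -> List[Tuple[float, float]]:
    """Score the consensus after round 1, after rounds 1-2, and so on."""
    return [
        precision_recall(consensus_labels(rounds[:k]), truth)
        for k in range(1, len(rounds) + 1)
    ]


if __name__ == "__main__":
    # Toy data for one video (made up for illustration only).
    truth = {"opens book", "walks", "sits down"}
    rounds = [{"opens book"}, {"walks", "eats"}, {"opens book", "sits down"}]
    for k, (p, r) in enumerate(cumulative_scores(rounds, truth), start=1):
        print(f"after {k} round(s): precision={p:.2f} recall={r:.2f}")
```

Under the union rule recall can only grow as rounds are added, while precision can fall whenever a later worker adds a false positive, which is the trade-off the precision and recall plots on slide 10 track against cumulative time.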
