
George Awad National Institute of Standards and Technology Dakota - PowerPoint PPT Presentation



  1. TRECVID 2016. TRECVID-2016 Concept Localization: Overview. George Awad, National Institute of Standards and Technology; Dakota Consulting, Inc

  2. Goal:
     • Make concept detection more precise in time and space than the current shot-level evaluation.
     • Encourage context-independent concept design to increase reusability.
     Task setup:
     • For each of the 10 new test concepts, NIST provided a set of ≈ 1000 shots.
     • Any shot may or may not contain the target concept.
     Task:
     • For each I-frame within the shot that contains the target, return the x,y coordinates of the upper-left (UL) and lower-right (LR) vertices of a bounding rectangle containing all of the target concept and as little more as possible.
     • Systems were allowed to submit more than one bounding box per I-frame, but only the ones with the maximum F-score were scored.
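The rule that only the maximum F-score box is scored when a system submits several boxes for one I-frame can be illustrated with a small sketch. This is not NIST's scoring code. The `Box` type, the function names, and the choice of inclusive integer pixel coordinates are assumptions made here for illustration; the slides only state that boxes are given by their UL and LR vertices and that spatial scores are pixel-based.

```python
from typing import NamedTuple, Sequence, Tuple


class Box(NamedTuple):
    """Axis-aligned rectangle. Inclusive pixel coordinates are an assumption."""
    x_ul: int  # upper-left x
    y_ul: int  # upper-left y
    x_lr: int  # lower-right x
    y_lr: int  # lower-right y


def area(box: Box) -> int:
    """Number of pixels covered by the box (0 if the box is empty)."""
    width = box.x_lr - box.x_ul + 1
    height = box.y_lr - box.y_ul + 1
    return max(width, 0) * max(height, 0)


def overlap_area(a: Box, b: Box) -> int:
    """Number of pixels shared by two boxes."""
    inter = Box(max(a.x_ul, b.x_ul), max(a.y_ul, b.y_ul),
                min(a.x_lr, b.x_lr), min(a.y_lr, b.y_lr))
    return area(inter)


def pixel_scores(submitted: Box, truth: Box) -> Tuple[float, float, float]:
    """Pixel-level (precision, recall, F-score) of one submitted box."""
    hit = overlap_area(submitted, truth)
    precision = hit / area(submitted) if area(submitted) else 0.0
    recall = hit / area(truth) if area(truth) else 0.0
    total = precision + recall
    f_score = 2 * precision * recall / total if total else 0.0
    return precision, recall, f_score


def best_box(candidates: Sequence[Box], truth: Box) -> Box:
    """Pick the submitted box with the highest pixel F-score for one I-frame."""
    return max(candidates, key=lambda box: pixel_scores(box, truth)[2])


# Hypothetical example: two submitted boxes against one ground-truth box.
truth_box = Box(0, 0, 9, 9)
shifted = Box(5, 0, 14, 9)   # half of it overlaps the truth
partial = Box(0, 0, 9, 4)    # fully inside the truth, covers half of it
chosen = best_box([shifted, partial], truth_box)
```

In this hypothetical example the `partial` box is chosen: it has perfect precision and a recall of 0.5, which gives a higher F-score than the shifted box with precision and recall both at 0.5.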

  3. 10 new evaluated concepts (a mix of non-action concepts and new action concepts): Animal, Bicycling, Boy, Dancing, Baby, Instrumental_musician, Running, Sitting_down, Skier, Explosion_fire

  4. NIST evaluation framework
     • Testing data: IACC.2.A-C (600 h, used from 2013 to 2015 in the semantic indexing task).
     • About 1000 shots per concept were sampled from the ground truth (with true positive (TP) clips: max = 300, avg = 178, min = 12).
     • A total of 9 587 shots and 2 205 140 I-frames were distributed to systems.
     • Human assessors were given all the I-frames (55 789 images in total) of all TP shots to create the ground truth by drawing a bounding box around the concept if it exists.
     • Human assessors had to watch the video clips of the images to verify the concepts.

  5. Evaluation metrics
     • Temporal localization: precision, recall, and F-score based on the judged I-frames.
     • Spatial localization: precision, recall, and F-score based on the located pixels representing the concept.
     • Precision, recall, and F-score for temporal and spatial localization are averaged across all I-frames for each concept and for each run.
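As a rough illustration of the temporal metric, the sketch below computes precision, recall, and F-score over sets of I-frame identifiers and then averages per-concept scores into one figure per run. It is a minimal sketch, not the official scoring tool: the function names, the string frame identifiers, and the unweighted mean over concepts are assumptions made here.

```python
from typing import Dict, Iterable, Tuple

Scores = Tuple[float, float, float]  # (precision, recall, F-score)


def temporal_scores(submitted_frames: Iterable[str],
                    truth_frames: Iterable[str]) -> Scores:
    """I-frame level scores for one concept.

    truth_frames: judged I-frames that contain the concept.
    submitted_frames: I-frames for which the system returned a box.
    """
    submitted = set(submitted_frames)
    truth = set(truth_frames)
    hit = len(submitted & truth)
    precision = hit / len(submitted) if submitted else 0.0
    recall = hit / len(truth) if truth else 0.0
    total = precision + recall
    f_score = 2 * precision * recall / total if total else 0.0
    return precision, recall, f_score


def mean_scores(per_concept: Dict[str, Scores]) -> Scores:
    """Unweighted mean of each metric across concepts (an assumption)."""
    if not per_concept:
        return 0.0, 0.0, 0.0
    n = len(per_concept)
    sums = [sum(scores[i] for scores in per_concept.values()) for i in range(3)]
    return sums[0] / n, sums[1] / n, sums[2] / n


# Hypothetical frame identifiers for a single concept.
example = temporal_scores(["f1", "f2", "f3", "f4"], ["f2", "f3", "f5"])
```

Here two of the four submitted I-frames are correct and two of the three true I-frames are found, so precision is 1/2, recall is 2/3, and the F-score is 4/7.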

  6. Participants (finishers: 3 out of 21)
     • 3 teams submitted 11 runs:
     • TokyoTech (4 runs): Tokyo Institute of Technology
     • NII_Hitachi_UIT (3 runs): National Institute of Informatics; Hitachi, Ltd; University of Information Technology
     • UTS_CMU_D2DCRC (4 runs): University of Technology, Sydney; Carnegie Mellon University; D2DCRC

  7. Temporal localization results by run (sorted by F-score). [Bar chart: mean per run across all concepts; series: I-frame F-score, I-frame precision, I-frame recall; value axis from 0 to 1.]

  8. Temporal localization results in previous years. [Bar charts for 2013, 2014, and 2015: mean per run across all concepts, value axis from 0 to 1; runs shown from CCNY, insightdcu, MediaMill_Qualcomm, PicSOM, TokyoTech, and Trimps.] Notes on the slide: only TP shots were given to systems to localize; 2016 (mainly action) >> 2013 & 2014 (mainly objects).

  9. Spatial localization results by run (sorted by F-score). [Bar chart: mean per run across all concepts; series: mean pixel F-score, mean pixel precision, mean pixel recall; value axis from 0 to 1.] Note on the slide: harder than temporal localization.

  10. Spatial localization results in previous years. [Bar charts for 2013, 2014, and 2015: mean per run across all concepts, value axis from 0 to 1; runs shown from CCNY, insightdcu, MediaMill_Qualcomm, PicSOM, TokyoTech, and Trimps.] Notes on the slide: only TP shots were given to systems to localize; 2016 (actions) ~ 2014 (objects); 2016 (actions) > 2013 (objects).

  11. Results per concept, top 10 runs. [Two charts, spatial localization and temporal localization: F-score per concept for runs 1 to 10 plus the median; value axis from 0 to 1.]
     • Most concepts perform better in temporal than in spatial localization.
     • A lot of resemblance between same concepts.

  12. Results per concept across all runs. [Two precision versus recall plots, temporal localization and spatial localization: both axes from 0 to 1; labeled concepts include baby, Inst_musi, and bicycling.]
     • Temporal localization: many systems submitted a lot of non-target I-frames, while few found a good balance.
     • Spatial localization: many systems are good at finding the real box sizes. Submitted bounding boxes approximate the size of ground truth boxes and overlap with them.

  13. General observations
     • Consistent observations in the last 4 years:
        • Temporal localization is easier than spatial localization.
        • Systems report approximate ground truth (G.T.) box sizes.
     • Performance on action/dynamic concepts is higher than on the object concepts tested in 2013 to 2014.
     • Assessment of action/dynamic concepts proved to be challenging for the human assessors in many cases.
     • The percentage of teams that finished is low compared to signups.

  14. Next team talks
     • TokyoTech
     • UTS_CMU_D2DCRC
