Columbia University TRECVID-2006 High-Level Feature Extraction - PowerPoint PPT Presentation

Columbia University TRECVID-2006 High-Level Feature Extraction Shih-Fu Chang, Winston Hsu, Wei Jiang, Lyndon Kennedy, Dong Xu, Akira Yanagawa, and Eric Zavesky Digital Video and Multimedia Lab, Columbia University


  1. Columbia University TRECVID-2006 High-Level Feature Extraction. Shih-Fu Chang, Winston Hsu, Wei Jiang, Lyndon Kennedy, Dong Xu, Akira Yanagawa, and Eric Zavesky. Digital Video and Multimedia Lab, Columbia University. http://www.ee.columbia.edu/dvmm

  2. Overview – 5 methods & 6 submitted runs. 5 methods: (1) baseline, (2) context-based concept fusion, (3) lexicon-spatial pyramid matching, (4) text feature, (5) event detection; some of these are marked visual-based on the slide. 6 runs: baseline, context, LSPM, text, visual_concept adaptive, multi-model_concept adaptive.

  3. Overview – performance. [Bar chart: MAP (axis 0 to 0.16) of the six runs A_CL1_1, A_CL2_2, A_CL3_3, A_CL4_4, A_CL5_5, A_CL6_6; bar labels: multi-model_concept adaptive, visual_concept adaptive, text, LSPM, context, baseline; annotations: visual-text, best all, visual-based, best visual.] Every method contributes incrementally to the final detection: • context > baseline: context-based concept fusion (CBCF) improves baseline • LSPM > context: lexicon-spatial pyramid matching (LSPM) further improves detection • text > LSPM: text features improve visual

  4. Overview – performance. [Bar chart: MAP (axis 0 to 0.16) of the six runs A_CL1_1, A_CL2_2, A_CL3_3, A_CL4_4, A_CL5_5, A_CL6_6; bar labels: multi-model_concept adaptive, visual_concept adaptive, text, spatial pyramid, context, baseline; annotations: visual-text, best all, visual-based, best visual.] • visual_concept adaptive > LSPM (also > context > baseline): best-of-visual selection works • text > multi-model_concept adaptive: best-of-all selection does not work well, probably due to overfitting of the text tool

  5. Outline – New Algorithms • Baseline • Context-based concept fusion (CBCF) • Lexicon-spatial pyramid matching (LSPM) • Text features • Event detection

  7. Individual Methods: (1) Baseline. Average fusion of two SVM baseline classification results, based on 3 visual features: • color moments over 5x5 fixed grid partitions • Gabor texture • edge direction histogram from the whole image. Classifier 1: fixed/global features (color, texture, …, edge) fed to Support Vector Machines (SVM), covering coarse local features, layout, and global appearance.
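
As a rough illustration of the first baseline feature, the sketch below computes color moments over a 5x5 fixed grid. The slide does not say which moments or which color space were used; the choice of mean, standard deviation, and skewness per channel, and the function name `grid_color_moments`, are assumptions made for this example.

```python
import numpy as np

def grid_color_moments(image, grid=5):
    """Return mean, std and skewness per channel for each grid cell.

    image: H x W x C float array (color space is an assumption here).
    Output length: grid * grid * C * 3.
    """
    h, w, c = image.shape
    # Cell boundaries; linspace copes with sizes not divisible by `grid`.
    ys = np.linspace(0, h, grid + 1).astype(int)
    xs = np.linspace(0, w, grid + 1).astype(int)
    feats = []
    for i in range(grid):
        for j in range(grid):
            cell = image[ys[i]:ys[i + 1], xs[j]:xs[j + 1]].reshape(-1, c)
            mean = cell.mean(axis=0)
            std = cell.std(axis=0)
            # Signed cube root of the third central moment.
            skew = np.cbrt(((cell - mean) ** 3).mean(axis=0))
            feats.extend([mean, std, skew])
    return np.concatenate(feats)

rng = np.random.default_rng(0)
features = grid_color_moments(rng.random((50, 60, 3)))
```

For a 3-channel image this yields a 225-dimensional vector (25 cells, 3 channels, 3 moments). Images smaller than the grid are not handled in this sketch.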

  8. Individual Methods: (1) Baseline. Average fusion of two SVM baseline classification results, based on 3 visual features: • color moments over 5x5 fixed grid partitions • Gabor texture • edge direction histogram from the whole image. Classifier 2: fixed/global features (color, texture, …, edge) fed to an ensemble classifier. Features and models available for download soon! Yanagawa et al., Tech. Rep., Columbia Univ., 2006, http://www.ee.columbia.edu/dvmm/newPublication.htm
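
Average fusion of two classifiers' outputs can be sketched as below. This is only an illustration: it assumes both classifiers emit scores on a comparable scale (for example, probabilities), which the slides do not state, and the helper name `average_fusion` and the sample scores are made up.

```python
import numpy as np

def average_fusion(scores_a, scores_b):
    """Average two classifiers' per-shot scores for one concept."""
    a = np.asarray(scores_a, dtype=float)
    b = np.asarray(scores_b, dtype=float)
    if a.shape != b.shape:
        raise ValueError("score arrays must have the same shape")
    return (a + b) / 2.0

# Hypothetical scores for three shots from the two baseline classifiers.
fused = average_fusion([0.9, 0.2, 0.6], [0.7, 0.4, 0.1])
ranking = np.argsort(-fused)  # shot indices, most confident first
```

Ranking shots by the fused score is what a per-concept AP evaluation would then consume.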

  11. Individual Methods: (2) CBCF. Background on Context Fusion. Hard/specific concept: the “Government-Leader” detector faces large variance in appearance (different person, different view). Generic concepts: the “Face” detector and the “outdoor” detector. A context-based model combines the Government-Leader detector with context information from Face and Outdoor.

  12. Individual Methods: (2) CBCF. Formulation: the independent outdoor, government-leader, and face detectors output P(outdoor|image), P(government-leader|image), and P(face|image); a context-based model (Naphade et al. 2002) takes these as input and outputs updated estimates of P(government-leader|image), P(face|image), and P(outdoor|image).

  13. Individual Methods: (2) CBCF. Our approach: Discriminative + Generative. For an image I with concepts $C_1$, $C_2$, $C_3$, the independent detectors (outdoor, government-leader, face) output P(outdoor|image), P(government-leader|image), and P(face|image), which serve as the observations $x_1$, $x_2$, $x_3$. A Conditional Random Field (Jiang, Chang, et al., ICIP 2006; diagram node labels: outdoor, airplane, office) turns these into updated posteriors for government-leader, face, and outdoor: $p(y_1 = 1 \mid X)$, $p(y_2 = 1 \mid X)$, $p(y_3 = 1 \mid X)$.

  14. Individual Methods: (2) CBCF. Our approach: Discriminative + Generative. For an image I with concepts $C_1$, $C_2$, $C_3$, the independent detectors (outdoor, government-leader, face) output P(outdoor|image), P(government-leader|image), and P(face|image), which serve as the observations $x_1$, $x_2$, $x_3$. The Conditional Random Field objective $\min J = -\prod_{I} \prod_{C_i} p(y_i = 1 \mid X)^{(1 + y_i)/2}\, p(y_i = -1 \mid X)^{(1 - y_i)/2}$ is iteratively minimized by boosting, giving updated posteriors for government-leader, face, and outdoor: $p(y_1 = 1 \mid X)$, $p(y_2 = 1 \mid X)$, $p(y_3 = 1 \mid X)$.

  15. Individual Methods: (2) CBCF. The objective $\min J = -\prod_{I} \prod_{C_i} p(y_i = 1 \mid X)^{(1 + y_i)/2}\, p(y_i = -1 \mid X)^{(1 - y_i)/2}$ is iteratively minimized by boosting. During each iteration t, two SVM classifiers are trained for each concept: 1. using the input independent detection results; 2. using the updated posteriors from iteration t-1. Classifier 2 keeps updating through the iterations and captures inter-conceptual influences. Without classifier 2, this is traditional AdaBoost.
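
To make the objective J concrete, here is a small sketch that evaluates it for given labels and posteriors. It assumes labels $y_i \in \{+1, -1\}$ and $p(y_i = -1 \mid X) = 1 - p(y_i = 1 \mid X)$; the function name `crf_objective` is hypothetical, and the boosting procedure that minimizes J is not shown.

```python
import numpy as np

def crf_objective(labels, p_pos):
    """J = -prod p(y=1|X)^((1+y)/2) * p(y=-1|X)^((1-y)/2).

    labels: array of +1/-1, shape (n_images, n_concepts).
    p_pos:  posteriors p(y=1|X), same shape.
    """
    y = np.asarray(labels, dtype=float)
    p = np.asarray(p_pos, dtype=float)
    # Each factor picks p when y = +1 and (1 - p) when y = -1.
    factors = p ** ((1 + y) / 2) * (1 - p) ** ((1 - y) / 2)
    return -np.prod(factors)

# One image, two concepts: first present (+1), second absent (-1).
j_rough = crf_objective([[1, -1]], [[0.9, 0.2]])
j_sharp = crf_objective([[1, -1]], [[0.99, 0.01]])
```

Posteriors that agree more closely with the labels give a lower (more negative) J, which is why minimizing J pushes the updated posteriors toward the ground truth. In practice a sum of log terms would be numerically safer than a raw product over many images.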

  16. Individual Methods: (2) CBCF. Database & lexicon for context • Predefined lexicon to provide context (observation) -- 374 concepts from the LSCOM ontology: airplane, building, car, boat, person, outdoor, sports, etc. • Independent detector -- our baseline • Test concepts (updated posteriors) -- the 39 concepts defined by NIST

  17. Individual Methods: (2) CBCF. Experimental results over the TRECVID 2005 development set. [Bar chart: AP (axis 0 to 1.2) for concepts 1 to 39, independent detector vs. Boosted CRF context-based fusion.] 24 concepts improve, 15 degrade.
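
The per-concept AP values in these charts can be read with a standard non-interpolated average precision in mind. The sketch below is a generic version, not the official evaluation tool; it assumes every relevant shot appears somewhere in the ranked list, and the name `average_precision` is chosen for this example.

```python
def average_precision(ranked_relevance):
    """Non-interpolated AP of a ranked list of 0/1 relevance flags."""
    hits = 0
    precision_sum = 0.0
    for rank, relevant in enumerate(ranked_relevance, start=1):
        if relevant:
            hits += 1
            # Precision at each rank where a relevant shot is found.
            precision_sum += hits / rank
    return precision_sum / hits if hits else 0.0

ap_before = average_precision([0, 1, 0, 1])  # positives ranked 2nd and 4th
ap_after = average_precision([1, 0, 1, 0])   # positives moved forward
```

Moving relevant shots toward the top of the list raises AP, which is the effect a successful fusion step has on a concept's ranking.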

  18. Selective Application of Context • Not every concept classification benefits from context-based fusion. This is consistent with previous context-based fusion work: IBM: no more than 8 out of 17 concepts gained performance [Amir et al., TRECVID Workshop, 2003]; Mediamill: 80 out of 101 concepts [Snoek et al., TRECVID Workshop, 2005] • Is there a way to predict when it works?

  19. Predict When Context Helps. Why may CBCF not help every concept? • Complex inter-conceptual relationships vs. limited training samples • Strong classifiers may suffer from fusion with weak context. Avoid using CBCF for $C_i$ if $C_i$ is strong and with weak context. Use CBCF for concept $C_i$ if $C_i$ is weak or with strong context: $E(C_i) > \lambda$ (weak concept) or $\frac{\sum_{C_j, j \neq i} I(C_i; C_j)\, E(C_j)}{\sum_{C_j, j \neq i} I(C_i; C_j)} < \beta$ (strong context), where $I(C_i; C_j)$ is the mutual information between $C_i$ and $C_j$ and $E(C_i)$ is the error rate of the independent detector for $C_i$.
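
The selection rule on this slide can be sketched as follows. Mutual information is estimated here from binary ground-truth labels, which is an assumption about how $I(C_i; C_j)$ is obtained; the thresholds `lam` and `beta`, the function names, and the toy data are all hypothetical, since the slides give no values.

```python
import numpy as np

def mutual_information(a, b):
    """Mutual information (in nats) between two binary 0/1 label vectors."""
    a = np.asarray(a, dtype=int)
    b = np.asarray(b, dtype=int)
    mi = 0.0
    for va in (0, 1):
        for vb in (0, 1):
            p_ab = np.mean((a == va) & (b == vb))
            p_a = np.mean(a == va)
            p_b = np.mean(b == vb)
            if p_ab > 0:
                mi += p_ab * np.log(p_ab / (p_a * p_b))
    return mi

def use_cbcf(i, labels, error_rates, lam, beta):
    """Apply the rule: E(C_i) > lam, or context quality < beta."""
    others = [j for j in range(labels.shape[1]) if j != i]
    mi = np.array([mutual_information(labels[:, i], labels[:, j]) for j in others])
    err = np.array([error_rates[j] for j in others])
    weak_concept = error_rates[i] > lam
    total_mi = mi.sum()
    # With no measurable dependence there is no context to exploit.
    strong_context = total_mi > 0 and (mi * err).sum() / total_mi < beta
    return bool(weak_concept or strong_context)

# Toy data: concepts 0 and 1 always co-occur, concept 2 is independent.
labels = np.array([[1, 1, 0], [1, 1, 1], [0, 0, 0], [0, 0, 1]])
errors = [0.1, 0.05, 0.5]
decisions = [use_cbcf(i, labels, errors, lam=0.3, beta=0.1) for i in range(3)]
```

In the toy data, concept 0 is selected because its only informative neighbor has a low error rate (strong context), and concept 2 is selected because its own detector is weak.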

  20. Predict When Context Helps. Change parameters to predict different numbers of concepts:

      # predicted   # concepts improved   precision of prediction   MAP gain
      39            24                    62%                       3.0%
      20            15                    75%                       9.5%
      16            14                    88%                       14%
      9             9                     100%                      7.2%
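
The precision-of-prediction figures on this slide are consistent with the number of improved concepts divided by the number of predicted concepts, rounded to a whole percent. A quick check, assuming the pairs (39, 24), (20, 15), (16, 14), and (9, 9) belong together:

```python
rows = [(39, 24), (20, 15), (16, 14), (9, 9)]  # (# predicted, # improved)
precision = [round(100 * improved / predicted) for predicted, improved in rows]
```

This reproduces 62%, 75%, 88%, and 100%, so tightening the parameters trades coverage (fewer concepts predicted) for a more reliable prediction.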

  21. Example: Fighter_Combat, Military, Individual, House, ...

  22. Independent Detector: Example

  23. Context-based concept fusion: Example

  24. Context-based concept fusion: Example (House)

  25. Context-based concept fusion: Example. Positive frames are moved forward with the help of Fighter_Combat.

  26. Context-Based Fusion + Baseline, TRECVID 2005 development set. [Bar chart: values 0 to 1 for concepts 1 to 16; legend: baseline, context; labels: R6, R5.] All get improved! MAP gain: 14%.

  27. Context-Based Fusion + Baseline, TRECVID 2006 evaluation, 4 concepts. [Bar chart: AP (axis 0 to 0.3) for concepts 1 to 4, baseline vs. context.] Similar to results over the TRECVID 2005 set!

  28. Discussion. Quality of context: $\frac{\sum_{C_j, j \neq i} I(C_i; C_j)\, E(C_j)}{\sum_{C_j, j \neq i} I(C_i; C_j)}$ (the smaller the better). Concepts with performance improved: 3.23. Concepts with performance degraded: 4.17. Adding context – strong relationship and robust.

  31. Individual Methods: (3) LSPM. Local features (SIFT) plus spatial layout (e.g., sky, tree, water). Spatial Pyramid Matching (SPM) [Lazebnik et al., CVPR, 2006]: multi-resolution histogram matching in the spatial domain, bags-of-features. Open question: what is the appropriate size for the visual lexicon? Lexicon-Spatial Pyramid Matching (LSPM): SPM matching guided by multi-resolution lexicons.
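
For orientation, the sketch below implements plain spatial pyramid matching with histogram intersection, using the level weights from Lazebnik et al. It is not the LSPM method itself: it uses a single fixed lexicon, and the function names, normalized coordinates in [0, 1), and toy inputs are assumptions of this example.

```python
import numpy as np

def spatial_histograms(points, words, n_words, level):
    """Per-cell visual-word counts on a 2^level x 2^level grid.

    points: (n, 2) array with coordinates in [0, 1); words: (n,) word ids.
    """
    cells = 2 ** level
    hist = np.zeros((cells, cells, n_words))
    idx = np.minimum((np.asarray(points) * cells).astype(int), cells - 1)
    for (cx, cy), w in zip(idx, words):
        hist[cx, cy, w] += 1
    return hist.ravel()

def spm_kernel(pts_a, words_a, pts_b, words_b, n_words, max_level=2):
    """Spatial pyramid match kernel with histogram intersection."""
    L = max_level
    total = 0.0
    for level in range(L + 1):
        ha = spatial_histograms(pts_a, words_a, n_words, level)
        hb = spatial_histograms(pts_b, words_b, n_words, level)
        intersection = np.minimum(ha, hb).sum()
        # Level 0 gets weight 1/2^L; level l >= 1 gets 1/2^(L - l + 1).
        weight = 1.0 / 2 ** L if level == 0 else 1.0 / 2 ** (L - level + 1)
        total += weight * intersection
    return total

# Two images, one feature each, same visual word, different positions.
far = spm_kernel([[0.1, 0.1]], [0], [[0.9, 0.9]], [0], n_words=3)
near = spm_kernel([[0.1, 0.1]], [0], [[0.3, 0.3]], [0], n_words=3)
```

Features that share a visual word score higher when they also fall in the same cell at finer grid levels, so `near` exceeds `far`. LSPM, as described on the slide, additionally varies the lexicon resolution.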

  32. Individual Methods: (3) LSPM. [Diagram: SIFT features mapped to a two-level lexicon.] Lexicon level 0: t1, t2, t3, t4, t5, ..., tn. Lexicon level 1: t1_1, t1_2, t2_1, t2_2, t3_1, t3_2, t4_1, t4_2, t5_1, t5_2, ..., tn_1, tn_2.
