Context-based Visual Concept Detection Using Domain Adaptive Semantic Diffusion



slide-1
SLIDE 1

Context-based Visual Concept Detection Using Domain Adaptive Semantic Diffusion

Yu-Gang Jiang‡, Jun Wang‡, Shih-Fu Chang‡, Chong-Wah Ngo†

† VIREO Research Group (VIREO), City University of Hong Kong
‡ Digital Video and Multimedia Lab (DVMM), Columbia University

NIST TRECVID Workshop, Nov. 2009

slide-2
SLIDE 2

Overview: framework

[Figure: framework diagram. Blocks: Local Feature and Global Feature feeding SVM Classifiers (labelled 6 and 5); VIREO-374: 374 LSCOM concept detectors; Domain Adaptive Semantic Diffusion (labelled 1-4).]

slide-3
SLIDE 3

Overview: performance

[Figure: mean inferred average precision of the 222 system runs, with three runs highlighted: DASD, local + global features, and local feature alone.]

  • Local feature is still the most powerful component (MAP=0.150)
  • Global features help a little bit (MAP=0.156)
  • DASD further contributes incrementally to the final detection

slide-4
SLIDE 4

Overview: framework

[Figure: framework diagram. Blocks: Local Feature and Global Feature feeding SVM Classifiers (labelled 6 and 5); VIREO-374: 374 LSCOM concept detectors; Domain Adaptive Semantic Diffusion (labelled 1-4).]

slide-5
SLIDE 5

Local feature representation

[Figure: local feature representation in SIFT feature space.]

Chang et al., TRECVID 2008; Jiang, Yang, Ngo & Hauptmann, IEEE TMM, to appear

slide-6
SLIDE 6

Context-based concept detection

[Figure: framework diagram with the context stage highlighted. Blocks: Local Feature and Global Feature feeding SVM Classifiers (labelled 6 and 5); VIREO-374: 374 LSCOM concept detectors; DASD: Domain Adaptive Semantic Diffusion (labelled 1-4).]

slide-7
SLIDE 7

DASD - motivation

  • Most existing methods aim at the assignment of concept labels individually
– but concepts do not occur in isolation!

[Figure: example keyframe annotated with co-occurring concepts: military personnel, smoke, building, explosion_fire, vehicle, road, outdoor.]

slide-8
SLIDE 8

DASD - motivation

  • Most existing methods aim at the assignment of concept labels individually
– but concepts do not occur in isolation!
  • Domain change between training and testing data was not considered

[Figure: example domains: Broadcast News Videos and Documentary Videos.]

slide-9
SLIDE 9

DASD - overview

[Figure: detection scores for the concepts road, vehicle, water and sky.]

Jiang, Wang, Chang & Ngo, ICCV 2009

slide-10
SLIDE 10

DASD - overview

  • Domain adaptive semantic diffusion (DASD)
– Semantic graph
  • Nodes are concepts
  • Edges represent concept correlation
– Graph diffusion
  • Smooth concept detection scores w.r.t. the concept correlation

[Figure: semantic graph over the concepts road, vehicle, water and sky, each with a vector of detection scores.]

slide-11
SLIDE 11

DASD - formulation

  • Energy function

[Equation: energy function, annotated with "detection score of concept ci on test samples" and "concept affinity".]
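A minimal sketch of an energy of this kind, assuming the normalized graph-smoothness form E = 1/2 Σ_ij W_ij ||g(c_i)/√d_i − g(c_j)/√d_j||², where g(c_i) is the detection score vector of concept c_i on the test samples, W is the concept affinity matrix and d_i = Σ_j W_ij. The normalization by d_i, the function name `diffusion_energy` and the toy values are assumptions made for illustration. Under this form the energy is low when strongly related concepts have similar score vectors.

```python
import math


def diffusion_energy(scores, affinity):
    """Graph-smoothness energy over concept score vectors (assumed form).

    scores[i]      : detection scores of concept i on the n test samples
    affinity[i][j] : symmetric, non-negative affinity between concepts i and j
    """
    m = len(scores)
    degree = [sum(affinity[i]) for i in range(m)]
    energy = 0.0
    for i in range(m):
        for j in range(m):
            if affinity[i][j] == 0.0:
                continue  # unrelated concepts contribute nothing
            diff_sq = sum(
                (gi / math.sqrt(degree[i]) - gj / math.sqrt(degree[j])) ** 2
                for gi, gj in zip(scores[i], scores[j])
            )
            energy += 0.5 * affinity[i][j] * diff_sq
    return energy


# Toy example (hypothetical values): two correlated concepts, two test shots.
toy_affinity = [[0.0, 1.0], [1.0, 0.0]]
consistent = [[0.9, 0.1], [0.9, 0.1]]    # scores agree with the correlation
inconsistent = [[0.9, 0.1], [0.1, 0.9]]  # scores contradict it

print(diffusion_energy(consistent, toy_affinity))
print(diffusion_energy(inconsistent, toy_affinity))
```

The second call yields the larger energy, which is the property the diffusion step exploits.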

slide-12
SLIDE 12

DASD - formulation (cont.)

  • Gradually smoothing the function brings the detection scores into accordance with the concept relationships

[Figure: detection score smoothing process.]
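The smoothing process can be sketched as gradient descent on a graph-smoothness energy with respect to the scores. This sketch assumes the energy E = 1/2 Σ_ij W_ij ||g_i/√d_i − g_j/√d_j||² with d_i = Σ_j W_ij, whose gradient for concept i is proportional to g_i − Σ_j S_ij g_j, where S_ij = W_ij/√(d_i d_j); the constant factor is folded into the step size. The function name `smooth_scores`, the step size and the iteration count are hypothetical.

```python
import math


def smooth_scores(scores, affinity, step=0.1, iterations=20):
    """Iteratively pull each concept's scores toward those of related concepts.

    scores[i]      : detection scores of concept i on the n test samples
    affinity[i][j] : symmetric, non-negative concept affinity
    Returns a new list of score vectors; the input is not modified.
    """
    m, n = len(scores), len(scores[0])
    degree = [sum(row) for row in affinity]
    # Normalized affinity S_ij = W_ij / sqrt(d_i * d_j); zero for isolated concepts.
    norm = [
        [
            affinity[i][j] / math.sqrt(degree[i] * degree[j])
            if degree[i] > 0 and degree[j] > 0 else 0.0
            for j in range(m)
        ]
        for i in range(m)
    ]
    current = [list(row) for row in scores]
    for _ in range(iterations):
        updated = []
        for i in range(m):
            if degree[i] == 0:
                updated.append(list(current[i]))  # no neighbours: leave as is
                continue
            neighbour = [
                sum(norm[i][j] * current[j][k] for j in range(m))
                for k in range(n)
            ]
            updated.append(
                [g - step * (g - nb) for g, nb in zip(current[i], neighbour)]
            )
        current = updated
    return current


# Toy example (hypothetical values): "road" and "vehicle" are correlated,
# but the vehicle detector missed the first shot.
toy_scores = [[0.9, 0.2], [0.1, 0.2]]
toy_affinity = [[0.0, 1.0], [1.0, 0.0]]
print(smooth_scores(toy_scores, toy_affinity))
```

In the toy run the two scores on the first shot move toward each other, while the second shot, where the detectors already agree, is left unchanged.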

slide-13
SLIDE 13

DASD - formulation (cont.)

  • Graph adaptation

[Figure: graph adaptation process.]

slide-14
SLIDE 14

Graph adaptation - example

[Figure: graph adaptation example. Six snapshots of a semantic subgraph over the concepts WEAPON, CLOUDS, DESERT, VEHICLE, SKY, CAR and PARKING_LOT at iterations 0, 4, 8, 12, 16 and 20, adapting from the broadcast news video domain to the documentary video domain. Edge weights shift over the iterations: for example, one weight falls from 0.24 to 0.00, another rises from 0.10 to 0.27, and another stays at 0.64.]

slide-15
SLIDE 15

Experiments on TV ’05-’07

  • Baseline detectors
– VIREO-374
  • Graph construction:
– Ground-truth labels on TRECVID 2005

[Figure: example concepts for TRECVID 05/06 (Broadcast News Videos) and TRECVID 07 (Documentary Videos). Concept labels shown: SPORTS, WEATHER, WALKING, PEOPLE, MAP, OFFICE, CLASSROOM, BUS, BUILDING, DESERT, MOUNTAIN, PEOPLE-MARCHING, EXPLOSION-FIRE, TRUCK, CORP. LEADER, WATER, POLICE, MILITARY, ANIMAL, TWO PEOPLE, NIGHT TIME, TELEPHONE, STREET.]
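Graph construction from ground-truth labels can be sketched as follows, assuming the affinity between two concepts is the Pearson correlation of their binary label vectors over the annotated shots, with negative correlations dropped. Both the correlation measure and the clipping are assumptions; the slide only states that the graph is built from ground-truth labels on TRECVID 2005. The function names and the toy labels are hypothetical.

```python
import math


def label_correlation(labels_a, labels_b):
    """Pearson correlation between two equal-length 0/1 label vectors."""
    n = len(labels_a)
    mean_a = sum(labels_a) / n
    mean_b = sum(labels_b) / n
    cov = sum((a - mean_a) * (b - mean_b) for a, b in zip(labels_a, labels_b))
    var_a = sum((a - mean_a) ** 2 for a in labels_a)
    var_b = sum((b - mean_b) ** 2 for b in labels_b)
    if var_a == 0 or var_b == 0:
        return 0.0  # a concept that never (or always) occurs carries no signal
    return cov / math.sqrt(var_a * var_b)


def build_semantic_graph(labels):
    """Build a symmetric concept graph from ground-truth labels.

    labels: dict mapping concept name -> list of 0/1 labels over the same shots.
    Returns a dict of dicts: graph[a][b] is the affinity between a and b.
    """
    concepts = sorted(labels)
    graph = {concept: {} for concept in concepts}
    for idx, a in enumerate(concepts):
        for b in concepts[idx + 1:]:
            weight = max(0.0, label_correlation(labels[a], labels[b]))
            if weight > 0.0:
                graph[a][b] = graph[b][a] = weight
    return graph


# Toy annotations over six shots (hypothetical values).
toy_labels = {
    "road":    [1, 1, 0, 0, 1, 0],
    "vehicle": [1, 1, 0, 0, 0, 0],
    "sky":     [0, 0, 1, 1, 0, 1],
}
toy_graph = build_semantic_graph(toy_labels)
print(toy_graph)
```

In this toy set "road" and "vehicle" co-occur and receive an edge, while "sky" never co-occurs with either and stays unconnected.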

slide-16
SLIDE 16

Results on TV ’05-’07

  • Performance gain on TRECVID 05-07

Dataset                   TRECVID 2005   TRECVID 2006   TRECVID 2007
# of evaluated concepts   39             20             20
Baseline (MAP)            0.166          0.154          0.099
SD                        11.8%          15.6%          12.1%
DASD                      11.9%          17.5%          16.2%

SD: semantic diffusion (without graph adaptation)
DASD: domain adaptive semantic diffusion

Consistent improvement over all 3 data sets
Graph adaptation further improves the performance
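To read the table in absolute terms, the short calculation below converts the reported gains into implied MAP values. It assumes the SD and DASD rows are relative improvements over the baseline MAP, which the slide does not state explicitly; the helper name `implied_map` is hypothetical.

```python
# Baseline MAP and reported gains, copied from the table above.
baseline_map = {"TRECVID 2005": 0.166, "TRECVID 2006": 0.154, "TRECVID 2007": 0.099}
reported_gain = {
    "SD":   {"TRECVID 2005": 0.118, "TRECVID 2006": 0.156, "TRECVID 2007": 0.121},
    "DASD": {"TRECVID 2005": 0.119, "TRECVID 2006": 0.175, "TRECVID 2007": 0.162},
}


def implied_map(baseline, gain):
    """MAP implied by a relative gain over the baseline (assumed interpretation)."""
    return baseline * (1.0 + gain)


for method, gains in reported_gain.items():
    for dataset, gain in gains.items():
        value = implied_map(baseline_map[dataset], gain)
        print(f"{method:5s} {dataset}: {value:.3f}")
```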

slide-17
SLIDE 17

Results on TV ’05-’07 (cont.)

[Figure: average precision on TRECVID 2006 test data, Baseline vs. Semantic Graph Diffusion.]

Comparison with the state of the art

TRECVID   Jiang et al.   Aytar et al.   Weng et al.   DASD
2005      2.2%           4.0%           N/A           11.9%
2006      N/A            N/A            16.7%         17.5%

slide-18
SLIDE 18

Results on TRECVID ’09

[Figure: results on TRECVID ’09 for the runs A_vireo.localglobal_5 and A_vireo.dasd20fcs_2; plot annotations: 30%, 10%, 5%.]

slide-19
SLIDE 19

Results on TRECVID ’09 (cont.)

  • Quality of contextual detectors (VIREO-374)

[Figure: mean inferred average precision across the 222 system runs, annotated with the DASD performance gain for different context detector sets. Labels shown: VIREO-374 detectors, TV06 detectors, TV07 detectors, TV09 detectors; gain values shown: 5%, 16%, 18%.]

slide-20
SLIDE 20

DASD - computational time

  • Complexity is O(mn)
– m: # concepts; n: # video shots
  • Only 2 milliseconds per shot/keyframe!

       TRECVID 05   TRECVID 06   TRECVID 07
SD     59s          84s          12s
DASD   89s          165s         28s

slide-21
SLIDE 21

Summary

  • A well-designed approach using local features achieves good results for concept detection.
  • Context information is helpful!
– Domain adaptive semantic diffusion
  • effective for enhancing concept detection accuracy
  • can alleviate the effect of data domain changes
  • highly efficient!
– Future directions include:
  • detector reliability: diffusion over directed graph
  • web data annotation: utilize contextual information to improve the quality of tags
– Source code available for download from DVMM lab research page
