VIREO-TNO @ TRECVID 2014 Zero-Shot Event Detection and Recounting



slide-1
SLIDE 1

VIREO-TNO @ TRECVID 2014 Zero-Shot Event Detection and Recounting

Speaker: Maaike de Boer (TNO)

Yi-Jie Lu (1), Hao Zhang (1), Chong-Wah Ngo (1), Maaike de Boer (2), John Schavemaker (2), Klamer Schutte (2), Wessel Kraaij (2)

(1) VIREO Group, City University of Hong Kong, Hong Kong
(2) Netherlands Organization for Applied Scientific Research (TNO), Netherlands

slide-2
SLIDE 2

Outline

 0-Shot System

– System Overview
– Findings

 MER System

– System Workflow
– Results

slide-3
SLIDE 3

 Semantic Query Generation (SQG)

– Given an event query, SQG translates the query description into a representation of semantic concepts

Event Query (Attempting a Bike Trick) → SQG → Semantic Query

Semantic Query: relevant concepts with relevance scores

< Objects >

  • Bike 0.60
  • Motorcycle 0.60
  • Mountain bike 0.60

< Actions >

  • Bike trick 1.00
  • Riding bike 0.62
  • Flipping bike 0.61

< Scenes >

  • Parking lot 0.01

Concept Bank: TRECVID SIN, Research Collection, HMDB51, UCF101, ImageNet
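
A minimal sketch of how a semantic query could be produced by exact matching against a concept bank. The function name `generate_semantic_query`, the tiny concept bank, and the scoring rule (fraction of a concept's words found in the event text) are assumptions made for illustration; the slides do not state how the relevance scores are computed.

```python
import re

def tokenize(text):
    """Lowercase a string and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())

def generate_semantic_query(event_text, concept_bank):
    """Map an event description to {concept: relevance} by exact word matching.

    Relevance is the fraction of the concept's words that occur in the event
    text (an illustrative rule, not the scoring used in the slides).
    """
    query_words = set(tokenize(event_text))
    semantic_query = {}
    for concept in concept_bank:
        words = tokenize(concept)
        hits = sum(1 for w in words if w in query_words)
        if hits:
            semantic_query[concept] = hits / len(words)
    return semantic_query

# Hypothetical concept bank using names that appear on the slide.
CONCEPT_BANK = ["bike", "motorcycle", "mountain bike", "bike trick", "parking lot"]

sq = generate_semantic_query("Attempting a bike trick", CONCEPT_BANK)
for concept, score in sorted(sq.items(), key=lambda kv: -kv[1]):
    print(f"{concept}: {score:.2f}")
```

Concepts with no word in common with the event text, such as "motorcycle", are left out of the query in this sketch.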
slide-4
SLIDE 4

 Concept Bank

– Research collection (497 concepts)
– ImageNet ILSVRC’12 (1000 concepts)
– SIN’14 (346 concepts)

Concept Bank: TRECVID SIN, Research Collection, HMDB51, UCF101, ImageNet
slide-5
SLIDE 5

 Event Search

– Ranking according to the SQ and concept responses

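Semantic Query → Event Search → Video Ranking

< Objects >

  • Bike 0.60
  • Motorcycle 0.60
  • Mountain bike 0.60

< Actions >

  • Bike trick 1.00
  • Riding bike 0.62
  • Flipping bike 0.61

< Scenes >

  • Parking lot 0.01

Event Search scores video $i$ as

$s_i = q \cdot c_i$

where $q$ is the semantic query (the concept relevance scores) and $c_i$ is the concept response of video $i$.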
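
The ranking step can be sketched as follows, assuming the score of a video is the dot product between the semantic query weights and that video's concept responses. The function names and the detector responses are made up for illustration.

```python
def score_video(semantic_query, concept_response):
    """Weighted sum of detector responses over the concepts in the query."""
    return sum(weight * concept_response.get(concept, 0.0)
               for concept, weight in semantic_query.items())

def rank_videos(semantic_query, responses_by_video):
    """Return (video ids sorted by descending score, score per video)."""
    scores = {vid: score_video(semantic_query, resp)
              for vid, resp in responses_by_video.items()}
    return sorted(scores, key=scores.get, reverse=True), scores

# Relevance scores taken from the semantic query on the slide.
SEMANTIC_QUERY = {"bike trick": 1.00, "riding bike": 0.62, "bike": 0.60}

# Hypothetical detector responses in [0, 1] for three videos.
RESPONSES = {
    "video_a": {"bike trick": 0.9, "riding bike": 0.5, "bike": 0.5},
    "video_b": {"bike": 0.5},
    "video_c": {"riding bike": 0.5, "bike": 0.5},
}

ranking, scores = rank_videos(SEMANTIC_QUERY, RESPONSES)
print(ranking)
```

A concept that a video has no response for contributes zero, so videos that respond to more of the highly weighted concepts rise in the ranking.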

slide-6
SLIDE 6

Outline

 0-Shot System

– System Overview
– Findings

 MER System

– System Workflow
– Results

slide-7
SLIDE 7

 SQG Experiments

– Exact matching vs. WordNet/ConceptNet matching
– How many concepts are used to represent an event?
– To further improve the weighting:

  • TF-IDF
  • Term specificity
slide-8
SLIDE 8

 Exact matching vs. WordNet matching

[Plot: Average Precision per Event ID for WordNet, Exact Matching and EM-TOP; annotation on the plot: 7%]

EM-TOP: exact matching, but only retains the top few concepts

slide-9
SLIDE 9

 Number of concepts used to represent an event

[Plot: Mean Average Precision, MAP(all), against Top k Concepts]

The best MAP is reached by retaining only the Top 8 concepts
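
Retaining only the top k concepts amounts to truncating the semantic query by relevance score. A minimal sketch, using the relevance scores shown earlier for "Attempting a bike trick"; the helper name `retain_top_k` is hypothetical and k = 3 is chosen only to keep the example short (the slide reports the best MAP at k = 8).

```python
def retain_top_k(semantic_query, k):
    """Keep the k concepts with the highest relevance scores."""
    top = sorted(semantic_query.items(), key=lambda kv: kv[1], reverse=True)[:k]
    return dict(top)

SEMANTIC_QUERY = {
    "bike": 0.60,
    "motorcycle": 0.60,
    "mountain bike": 0.60,
    "bike trick": 1.00,
    "riding bike": 0.62,
    "flipping bike": 0.61,
    "parking lot": 0.01,
}

top3 = retain_top_k(SEMANTIC_QUERY, 3)
print(top3)
```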

slide-10
SLIDE 10

Insights

[Plot: Average Precision against Top k Concepts for Event 21]

Event 21: Attempting a bike trick

Concepts: Trick, Wheel, Paddle wheel, Car wheel, Potter wheel, Person riding, Jumping

slide-11
SLIDE 11

Insights

[Plot: Average Precision against Top k Concepts for Event 31]

Event 31: Beekeeping

  • Honeycomb (ImageNet)
  • Bee (ImageNet)
  • Bee house (ImageNet)
  • Cutting (research collection)
  • Cutting down tree (research collection)

slide-12
SLIDE 12

Insights

[Plot: Average Precision against Top k Concepts for Event 23]

Event 23: Dog show

  • Brush dog (research collection)
  • Dog show (research collection)

slide-13
SLIDE 13

 Improvements by TF-IDF and word specificity

Method                                       MAP (on MED14-Test)
Exact Matching Only                          0.0306
Exact Matching + TF                          0.0420
Exact Matching + TFIDF                       0.0495
Exact Matching + TFIDF + Word Specificity    0.0502
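
A minimal sketch of TF-IDF weighting for query terms, to illustrate why it can help: terms that occur in every event description receive little or no weight, while terms specific to one event are emphasized. The toy corpus, the function name `tf_idf_weights`, and the plain `log(N / df)` form of IDF are assumptions; the slides do not specify the exact formulation used.

```python
import math
from collections import Counter

def tf_idf_weights(query_tokens, corpus):
    """Weight each query term by term frequency times inverse document frequency.

    corpus is a list of token lists; IDF is log(N / df), with df floored at 1.
    """
    tf = Counter(query_tokens)
    n_docs = len(corpus)
    weights = {}
    for term, count in tf.items():
        df = sum(1 for doc in corpus if term in doc)
        idf = math.log(n_docs / max(df, 1))
        weights[term] = (count / len(query_tokens)) * idf
    return weights

# Hypothetical tokenized event descriptions.
CORPUS = [
    ["bike", "trick", "person", "bike"],
    ["dog", "show", "person"],
    ["bee", "honeycomb", "person"],
    ["car", "parking", "person"],
]

weights = tf_idf_weights(CORPUS[0], CORPUS)
for term, w in sorted(weights.items(), key=lambda kv: -kv[1]):
    print(f"{term}: {w:.3f}")
```

In this toy corpus "person" appears in every description, so its weight is zero, while "bike" outweighs "trick" because it occurs twice in the query.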

slide-14
SLIDE 14

Findings

  1. Exact matching performs better than matching with WordNet and/or ConceptNet
  2. Performance is even better when retaining only the top few exactly matched concepts
  3. Adding both TF-IDF and Word Specificity increases performance

slide-15
SLIDE 15

 Why would ontology-based mapping not work?

A sample query in TRECVID 2009

slide-16
SLIDE 16

 Why would ontology-based mapping not work?

Event: Dog Show
Concept: “dog”
Related concepts in the ontology: cat, horse, mammal, carnivore, animal, kit fox, red wolf (from SIN and ImageNet)

slide-17
SLIDE 17

 Why would ConceptNet mapping not work?

Event: Tailgating
Terms in the ConceptNet diagram: car, food, helmet, team uniform, portable shelter, parking lot, driver, engine, tailgating, bus
Relation shown: desires

slide-18
SLIDE 18

Findings

 It is difficult to

– harness the ontology-based mapping while constraining the mapping by event context

slide-19
SLIDE 19

 In the Ad-Hoc event “Extinguishing a Fire”

– Key concepts are missing:

  • Fire extinguisher
  • Firefighter
slide-20
SLIDE 20

Findings

 It is reasonable to

– Scale up the number of concepts, thus increasing the chance of exact matching

slide-21
SLIDE 21

MED14-Eval-Full Results

 PS 000Ex

– Automatic semantic query generation and search
– Fusion of 0-Shot and OCR system
– Achieves a MAP of 5.2

 AH 000Ex

– System is the same as in PS 000Ex
– Achieves a MAP of 2.6
– Performance drops due to the lack of key concepts

slide-22
SLIDE 22

Outline

 0-Shot System

– System Overview
– Findings

 MER System

– System Workflow
– Results

slide-23
SLIDE 23

MER System

 In algorithm design, we aim to optimize

– Concept-to-event relevancy
– Evidence diversity
– Viewing time of evidential shots

slide-24
SLIDE 24

MER System

 In algorithm design, we aim to optimize

– Concept-to-event relevancy

  • First, we require that candidate shots are relevant to the event;
  • Second, we do concept-to-shot alignment.

– Evidence diversity
– Viewing time of evidential shots

slide-25
SLIDE 25

MER System

 In algorithm design, we aim to optimize

– Concept-to-event relevancy

  • First, we require that candidate shots are relevant to the event;
  • Second, we do concept-to-shot alignment.

– Evidence diversity

  • In concept-to-shot alignment, we recount each shot with a unique concept, different from those of the other shots.

– Viewing time of evidential shots

slide-26
SLIDE 26

MER System

 In algorithm design, we aim to optimize

– Concept-to-event relevancy

  • First, we require that candidate shots are relevant to the event;
  • Second, we do concept-to-shot alignment.

– Evidence diversity

  • In concept-to-shot alignment, we recount each shot with a unique concept, different from those of the other shots.

– Viewing time of evidential shots

  • Select only the three most confident shots as key evidence
  • Each shot lasts about 5 seconds
slide-27
SLIDE 27

Outline

 0-Shot System

– System Overview
– Findings

 MER System

– System Workflow
– Results

slide-28
SLIDE 28

 Key Evidence Localization

Extract keyframes uniformly

slide-29
SLIDE 29

 Key Evidence Localization

Apply concept detectors to obtain Concept Responses

Concept Bank: TRECVID SIN, Research Collection, HMDB51, UCF101, ImageNet
slide-30
SLIDE 30

 Key Evidence Localization

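Choose keyframes that are most relevant to this event

  • All concepts in the semantic query are taken into account by calculating the weighted sum

$s_i = w \cdot r_i$

where $s_i$ is the score of keyframe $i$, $w$ holds the concept weights from the semantic query, and $r_i$ holds the concept responses of keyframe $i$.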

slide-31
SLIDE 31

 Key Evidence Localization

Expand keyframes to shots

slide-32
SLIDE 32

 Key Evidence Localization

The top 3 shots are selected as key evidence

slide-33
SLIDE 33

 Key Evidence Localization

The rest are non-key evidence
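
The localization steps (score keyframes, expand them to shots, keep the top 3) can be sketched as below. The 5-second shot length and the top-3 cut-off follow the slides; centring each shot on its keyframe, the function name, and all sample values are assumptions for illustration.

```python
def localize_key_evidence(weights, keyframe_times, keyframe_responses,
                          shot_length=5.0, top_n=3):
    """Split keyframes into key and non-key evidence shots.

    Each keyframe is scored by the weighted sum of its concept responses and
    expanded to a shot of shot_length seconds centred on it (an assumption).
    Returns (key_shots, non_key_shots) as lists of (start, end, score).
    """
    scores = [sum(w * resp.get(c, 0.0) for c, w in weights.items())
              for resp in keyframe_responses]
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    half = shot_length / 2
    shots = [(max(0.0, keyframe_times[i] - half), keyframe_times[i] + half, scores[i])
             for i in order]
    return shots[:top_n], shots[top_n:]

# Hypothetical concept weights, keyframe timestamps (seconds) and responses.
WEIGHTS = {"bike trick": 1.0, "bike": 0.6}
TIMES = [0.0, 10.0, 20.0, 30.0, 40.0]
RESPONSES = [
    {"bike": 0.5},
    {"bike trick": 0.8, "bike": 0.5},
    {},
    {"bike trick": 0.5},
    {"bike trick": 0.1},
]

key, non_key = localize_key_evidence(WEIGHTS, TIMES, RESPONSES)
for start, end, score in key:
    print(f"key shot {start:.1f}s to {end:.1f}s, score {score:.2f}")
```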

slide-34
SLIDE 34

 Concept-to-Shot Alignment

The top concept in each key evidence shot is selected as its representative concept

* We choose a unique concept for each shot

Semantic Query

< Objects >

  • Bike
  • Motorcycle
  • Mountain bike

< Actions >

  • Bike trick
  • Riding bike
  • Flipping bike

< Scenes >

  • Parking lot

[Diagram: Key and Non-Key shots, with key shots labelled by the concepts Riding bike, Bike trick and Bike]
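
One way to give every shot a unique representative concept is a greedy pass over the shots, most confident first, skipping concepts already taken. This is a sketch under that assumption; the slides do not say how uniqueness is enforced, and the function name and response values are hypothetical.

```python
def align_concepts(weights, shot_responses):
    """Assign each shot its top concept, never reusing a concept across shots.

    Shots are processed in the given order (assumed most confident first).
    Returns one concept name per shot, or None if no unused concept responds.
    """
    used = set()
    assignment = []
    for resp in shot_responses:
        candidates = [(weights[c] * resp.get(c, 0.0), c)
                      for c in weights if c not in used]
        best_score, best = max(candidates, default=(0.0, None))
        if best is None or best_score <= 0.0:
            assignment.append(None)
            continue
        used.add(best)
        assignment.append(best)
    return assignment

WEIGHTS = {"bike trick": 1.00, "riding bike": 0.62, "bike": 0.60}

# Hypothetical concept responses for the three key evidence shots.
KEY_SHOTS = [
    {"bike trick": 0.9, "riding bike": 0.7, "bike": 0.8},
    {"bike trick": 0.8, "riding bike": 0.9, "bike": 0.5},
    {"bike trick": 0.7, "bike": 0.6},
]

labels = align_concepts(WEIGHTS, KEY_SHOTS)
print(labels)
```

Although "bike trick" has the highest weighted response in all three shots, only the first shot receives it; the later shots fall back to their best remaining concept, which keeps the recounted evidence diverse.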

slide-35
SLIDE 35

MER14 Results

Percentage of “strongly agree” responses

(a) Evidence quality (b) Event query quality

[Bar charts, 0% to 30%. Team order in one chart: VIREO, Team1, Team2, Team3, Team4, Team6, Team5; in the other: Team2, VIREO, Team4, Team3, Team6, Team1, Team5]

slide-36
SLIDE 36

MER14 Results

Percentage of “agree” and “strongly agree” responses combined

(a) Evidence quality (b) Event query quality

[Bar charts, 0% to 90% and 0% to 70%. Team order in one chart: Team1, Team2, Team3, VIREO, Team4, Team5, Team6; in the other: Team2, VIREO, Team4, Team1, Team6, Team5, Team3]

slide-37
SLIDE 37

Summary

 0-Shot System

– Simple exact matching performs the best
– The quality of the concepts selected to represent an event is more important than their quantity
– How to harness ontology-based mapping remains an open problem

slide-38
SLIDE 38

Summary

 MER System

– In key evidence localization, we emphasize the event relevancy first, then the hot concepts
– We recommend three shots as key evidence, each about 5 seconds long

slide-39
SLIDE 39

Thanks!