They Are Not Equally Reliable: Semantic Event Search using Differentiated Concept Classifiers



slide-1
SLIDE 1

They Are Not Equally Reliable: Semantic Event Search using Differentiated Concept Classifiers

Inkyu An

slide-2
SLIDE 2

2

  • 1. Motivation
  • 2. Previous paper
  • 3. Goal
  • 4. Related Work
  • 5. Approach
  • 6. Result

Content

slide-3
SLIDE 3

3

MOTIVATION

They Are Not Equally Reliable: Semantic Event Search using Differentiated Concept Classifiers

slide-4
SLIDE 4

4

Motivation | Semantic image retrieval

Person interacting with panda

<Query>

Is it better to use the meaning of the sentence?

<Query sentence>

slide-5
SLIDE 5

5

PREVIOUS PAPER

They Are Not Equally Reliable: Semantic Event Search using Differentiated Concept Classifiers

slide-6
SLIDE 6

6

Person interacting with panda

Query image

Person feeding panda Person holding animals Person feeding calf

Implied-by Type-of Mutual-exclusive Result images

Previous paper | Semantic image retrieval

slide-7
SLIDE 7

7

[Girls doing handstand]

CNN feature

Extracting image features (CNN)

Word Vector

Word2Vector (Skip-grams)

Query

Sentence Image

  • Extract the image features and word vectors

Previous paper | Semantic image retrieval

slide-8
SLIDE 8

8

[Girls doing handstand] CNN feature Word Vector Nonsimilar Similar Query CNN feature Word Vector CNN feature Word Vector

System

→ Measure scores of Mutually exclusive, Implied-by and Type-of

Training …

[Girl dancing on beach] [Girl doing cartwheel]

Previous paper | Semantic image retrieval

slide-9
SLIDE 9

9

Previous paper | Semantic image retrieval

slide-10
SLIDE 10

10

Previous paper | Semantic image retrieval

  • There are scalability and time-consumption issues.
  • $C = C_{acc} + \alpha_r C_{rec} + \alpha_n C_{nlp} + \alpha_c C_{cons} + \lambda \|W\|_2^2$

CNN feature dimension: 4096
Action dimension: 27,425; related images: 100
Embedding dimension: $n$ (64)

These issues could be especially fatal in video search algorithms.

slide-11
SLIDE 11

11

GOAL

They Are Not Equally Reliable: Semantic Event Search using Differentiated Concept Classifiers

slide-12
SLIDE 12

12

Goal | Semantic event search from videos

Input sentence: "Horse Riding Competition" (without video)

Video Search System

Result videos Test videos 98,000 videos

slide-13
SLIDE 13

13

Goal | Semantic event search from videos

< Main Contribution >

  • 1. Unsupervised learning
  • 2. Solves the scalability issue
  • 3. Faster than other methods
  • 4. Differentiated Concept Classifiers

slide-14
SLIDE 14

14

APPROACH

They Are Not Equally Reliable: Semantic Event Search using Differentiated Concept Classifiers

slide-15
SLIDE 15

15

Related Work |

  • 1. Skip-grams
  • Weight vectors of actions(Sentence)
  • 2. Spectral meta-learning
  • Unsupervised Learning Method
slide-16
SLIDE 16

16

Approach | Proposed framework

Detected Videos

slide-17
SLIDE 17

17

Approach | Proposed framework

Detected Videos

Relevance Vector [Binary Vector] Warped Spectral meta-learning [Unsupervised & Fast convergence]

slide-18
SLIDE 18

18

Approach | Unsupervised Learning

Test Videos, $v$

$1, 2, 3, \dots, v$

"Horse riding competition": true or false?

[Figure: every test video is labeled "???"]

  • Because this is unsupervised learning, we don't know whether a test video is "Horse riding competition" or not.

slide-19
SLIDE 19

19

Approach | Unsupervised Learning

Test Videos, $v$

$1, 2, 3, \dots, v$

"Horse riding competition": true or false?

[Figure: the proposed system, using Word2Vector, assigns each test video a true/false label.]

  • Because this is unsupervised learning, we don't know in advance whether a test video is "Horse riding competition" or not.

slide-20
SLIDE 20

20

Approach | Word to Vector

"Horse riding competition"

Concept Vocabulary (total $m$): Bee, Biking, Horse Riding, Blowing Candle

[Figure: the event description and each concept in the vocabulary pass through the Skip-Gram model, giving the word vectors $V_{\text{Horse riding competition}}$, $V_{\text{Bee}}$, $V_{\text{Biking}}$, $V_{\text{Horse Riding}}$ and $V_{\text{Blowing Candle}}$.]

  • Apply the Skip-Gram method to both the event and the concepts
slide-21
SLIDE 21

21

Approach | Relevance Score Vector

[Figure: the event word vector $V_{\text{Horse riding competition}}$ is compared ("Compute distance") with the concept word vectors $V_{\text{People}}$, $V_{\text{Horse}}$, $V_{\text{Show jumping}}$ and $V_{\text{Field}}$ from the concept vocabulary (total $m$). Relevance scores shown: 0.8726, 0.7647, 0.7256, 0.0624, with "Too High" and "Too Low" annotations. The relevance score vector is converted into the binary relevance vector "w".]

  • Compute distances between the event and the concepts to build the relevance vector
  • The relevance vector (a binary vector) indicates how similar the event is to each concept
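
The relevance step on this slide can be sketched as follows. This is only an illustration under stated assumptions: the slide says "compute distance" without naming the measure, so cosine similarity, the single `threshold` value, the helper name `relevance_vector`, and the toy 3-dimensional vectors are all hypothetical choices, not details from the paper.

```python
import numpy as np

def relevance_vector(event_vec, concept_vecs, threshold=0.5):
    """Return (scores, w): cosine similarity per concept and its 0/1 thresholding.

    Assumption: cosine similarity stands in for the slide's unspecified distance.
    """
    event = event_vec / np.linalg.norm(event_vec)
    concepts = concept_vecs / np.linalg.norm(concept_vecs, axis=1, keepdims=True)
    scores = concepts @ event                  # one relevance score per concept
    w = (scores >= threshold).astype(int)      # binary relevance vector
    return scores, w

# Toy "word vectors", made up for illustration (not real Skip-Gram output).
event = np.array([1.0, 1.0, 0.0])
concepts = np.array([
    [1.0, 0.9, 0.1],   # hypothetical concept close to the event
    [0.8, 1.0, 0.0],   # another close concept
    [0.0, 0.1, 1.0],   # unrelated concept
])
scores, w = relevance_vector(event, concepts)
print(np.round(scores, 3), w)
```

In practice the rows of `concepts` would come from a trained Skip-Gram model, one row per entry of the concept vocabulary.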

slide-22
SLIDE 22

22

Approach | Proposed framework

Detected Videos

Relevance Vector [Binary Vector] Warped Spectral meta-learning [Unsupervised & Fast convergence]

slide-23
SLIDE 23

23

Approach | Differentiated Concept Classifier

Concepts: $1, 2, 3, \dots, m$; Test videos: $1, 2, 3, \dots, v$

Compute Similarity:

$S_{1,1}, S_{1,2}, \dots, S_{1,m}$
$S_{2,1}, S_{2,2}, \dots, S_{2,m}$
$S_{3,1}, S_{3,2}, \dots, S_{3,m}$
$\dots$
$S_{v,1}, S_{v,2}, \dots, S_{v,m}$

$S_{i,j} \in \{-1, 1\}$

  • The Differentiated Concept Classifier measures the similarity between each test video and each concept. If the 1st video is similar to concept 1, $S_{1,1}$ is 1. If the 1st video isn't similar to concept 1, $S_{1,1}$ is -1.
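
A minimal sketch of how such a matrix of +1/-1 votes could be built from raw detector scores. The slide does not say how similarity is decided, so the fixed `threshold` and the helper name `binarize_scores` are assumptions made for illustration.

```python
import numpy as np

def binarize_scores(raw_scores, threshold=0.5):
    """Map raw detector scores (videos x concepts) to +1 / -1 votes.

    Assumption: a single global threshold decides "similar" vs "not similar".
    """
    return np.where(np.asarray(raw_scores) >= threshold, 1, -1)

# Two videos, two concepts; the scores are invented.
raw = np.array([[0.9, 0.2],
                [0.4, 0.7]])
S = binarize_scores(raw)
print(S)
```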

slide-24
SLIDE 24

24

Approach | Spectral meta-learning

Test Videos, $v$

$1, 2, 3, \dots, v$

"Horse riding competition": true or false?

[Figure: every test video is still labeled "???"]

  • Because this is unsupervised learning, the true labels of the test videos are unknown.

$y^* = \mathrm{sign}\left(\sum_{i=1}^{m} S_i(v)\,(2\pi_i - 1)\right) \approx \mathrm{sign}\left(\sum_{i=1}^{m} S_i(v)\,u_i\right)$

→ Estimate the eigenvector $u_i$ of the concept classifiers' covariance matrix to find the optimal solution

$\pi_i$: accuracy of the $i$-th concept classifier

slide-25
SLIDE 25

25

Approach | Generalized Conditional Gradient (GCG)

  • Because this is unsupervised learning, the eigenvector $u_i$ of the covariance matrix has to be estimated without labels; they used the Generalized Conditional Gradient (GCG) algorithm to find it.
  • The GCG algorithm converges quickly.

$\min_{u} \sum_{i \neq j} \left( (uu^T)_{i,j} - Q_{i,j} \right)^2 + \lambda \|u\|^2$

  • Update the eigenvector $u$

Repeat until convergence { … $u \leftarrow$ leading eigenvector of $-G$ … }

  • Local minimizer
  • Rank the test videos using the equation below

$y^* \approx \mathrm{sign}\left(\sum_{i=1}^{m} S_i(v)\,u_i\right)$
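
To make the objective on this slide concrete, the sketch below minimizes the same off-diagonal rank-one fit with plain gradient descent. This is deliberately not GCG: it is a swapped-in, simpler optimizer used only to show what is being fitted. The function name, step size, iteration count, and starting point are all assumptions.

```python
import numpy as np

def fit_rank_one_offdiag(Q, lam=0.0, step=0.05, iters=2000):
    """Minimize sum_{i != j} ((u u^T)_ij - Q_ij)^2 + lam * ||u||^2 by gradient descent.

    Note: plain gradient descent, not the GCG algorithm named on the slide.
    """
    m = Q.shape[0]
    off = ~np.eye(m, dtype=bool)                       # mask selecting i != j
    u = np.full(m, 0.5)                                # arbitrary positive start
    for _ in range(iters):
        E = np.where(off, np.outer(u, u) - Q, 0.0)     # off-diagonal residual
        grad = 4.0 * (E @ u) + 2.0 * lam * u           # gradient of the objective
        u = u - step * grad
    return u

# Toy check: build Q whose off-diagonal part is exactly v v^T.
v_true = np.array([0.8, 0.6, 0.4, 0.2])
Q = np.outer(v_true, v_true)
np.fill_diagonal(Q, 1.0)                               # diagonal is ignored by the fit
u = fit_rank_one_offdiag(Q)
print(np.round(u, 3))
```

Because the diagonal of `Q` is masked out, the fit depends only on the pairwise covariances, matching the $i \neq j$ sum in the objective.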
slide-26
SLIDE 26

26

RESULT

They Are Not Equally Reliable: Semantic Event Search using Differentiated Concept Classifiers

slide-27
SLIDE 27

27

Result | Speed comparison on synthetic data

  • It is faster than previous work
slide-28
SLIDE 28

28

Result | Mean average precision result

The number of concepts: 3,135 ($m$)

slide-29
SLIDE 29

29

Summary |

  • Solve scalability & time-consuming issues in unsupervised learning.
  • They used Skip-grams to convert a word to a vector.
  • They used the Spectral meta-learning method to solve the unsupervised problem.
  • They used the Generalized Conditional Gradient (GCG) algorithm to improve the calculation speed.

slide-30
SLIDE 30

30

  • Thank you.

Q & A |

slide-31
SLIDE 31

31

APPENDIX

They Are Not Equally Reliable: Semantic Event Search using Differentiated Concept Classifiers

slide-32
SLIDE 32

32

Appendix | Spectral meta-learning

$p_i = \Pr(S_i(v) = 1 \mid y = 1)$
$n_i = \Pr(S_i(v) = -1 \mid y = -1)$
$\pi_i = \frac{p_i + n_i}{2}$

  • $\pi_i$: the accuracy of the $i$-th concept classifier on video $v$

$y^* = \arg\max_y \mathcal{L}(S_1(v), \dots, S_m(v); y)$

$\prod_{i=1}^{m} \Pr(S_i(v) \mid y) = \mathcal{L}(S_1(v), \dots, S_m(v); y)$

→ Find the $y$ that maximizes the likelihood

$y^* = \mathrm{sign}\left(\sum_{i=1}^{m} S_i(v)\,(2\pi_i - 1)\right) \approx \mathrm{sign}\left(\sum_{i=1}^{m} S_i(v)\,u_i\right)$

→ Estimate the eigenvector $u_i$ by finding the optimal solution rather than $\pi_i$
→ Because $u_i \propto 2\pi_i - 1$

slide-33
SLIDE 33

33

Appendix | Spectral meta-learning

$\min_{R \succeq 0,\ \mathrm{rank}(R) = 1} \sum_{i \neq j} \left( Q_{i,j} - R_{i,j} \right)^2$

Rank-one matrix $R = \lambda u u^T$ ($\lambda$: eigenvalue, $u$: eigenvector)

  • Covariance matrix $Q$ between concept $i$ and concept $j$ over the videos $v$

$Q_{i,j} = E_v\left[(S_i(v) - \mu_i)(S_j(v) - \mu_j)\right] = \begin{cases} 1 - \mu_i^2, & i = j \\ (2\pi_i - 1)(2\pi_j - 1)(1 - b^2), & i \neq j \end{cases}$

Ranking and combining multiple predictors without labeled data [PNAS, 2014]

  • Mean prediction $\mu_i$ of concept $i$

$\mu_i = E_v[S_i(v)]$
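
The off-diagonal covariance formula on this slide can be checked numerically. The sketch below assumes two conditionally independent classifiers and takes $b$ to be the class imbalance $\Pr(y=1) - \Pr(y=-1)$, as in the cited PNAS paper. The sensitivities, specificities, and class prior are invented for the check.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200_000
prior_pos = 0.3                        # made-up Pr(y = 1)
b = prior_pos - (1.0 - prior_pos)      # class imbalance, here -0.4

sens = np.array([0.9, 0.6])            # p_i = Pr(S_i = 1 | y = 1), invented
spec = np.array([0.7, 0.8])            # n_i = Pr(S_i = -1 | y = -1), invented
pi = (sens + spec) / 2.0               # balanced accuracies

y = np.where(rng.random(n) < prior_pos, 1, -1)
p_correct = np.where(y[:, None] == 1, sens, spec)   # per-video chance of a correct vote
correct = rng.random((n, 2)) < p_correct
S = np.where(correct, y[:, None], -y[:, None])      # +1/-1 votes

empirical = np.cov(S, rowvar=False)[0, 1]
predicted = (2 * pi[0] - 1) * (2 * pi[1] - 1) * (1 - b ** 2)
print(round(empirical, 4), round(predicted, 4))
```

The two printed numbers should agree up to sampling noise, which is why the off-diagonal entries of $Q$ carry information about the unknown accuracies $\pi_i$.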

slide-34
SLIDE 34

34

Appendix | warping function

𝑒𝑗 𝑀 = 𝑔

𝑗 𝑇𝑗 𝑀

= π‘₯𝑗𝑇𝑗 𝑀 , 𝑗 = 1, … , 𝑛

  • To incorporate the relevance vector β€˜w’ (page 16), They made

warping functions

  • Also, covariance matrix 𝑅 and mean is converted into

𝑅𝑔 and πœˆπ‘”

𝑅𝑗,π‘˜

𝑔 =

1 π‘œ βˆ’ 1

𝑙=1 π‘œ

𝑒𝑗 𝑀𝑙 βˆ’ πœˆπ‘—

𝑔

π‘’π‘˜ 𝑀𝑙 βˆ’ πœˆπ‘˜

𝑔 ,

𝑣𝑗

𝑔 = 1

π‘œ

𝑙=1 π‘œ

𝑒𝑗 𝑀𝑙 𝑧𝑔 = π‘‘π‘—π‘œπ‘•

𝑗=1 𝑛

𝑔

𝑗 𝑇𝑗 𝑣𝑗

  • The spectral meta-learner for the warped classifiers
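
A small worked example of the warping step, assuming a binary relevance vector so that irrelevant concepts are simply zeroed out. The vote matrix `S`, the vector `w`, and the helper name `warp_votes` are made up for illustration.

```python
import numpy as np

def warp_votes(S, w):
    """Apply t_i(v) = w_i * S_i(v); return warped votes, their mean and covariance."""
    T = S * w                          # broadcasts w across the n videos (rows)
    mu_f = T.mean(axis=0)              # warped mean, one entry per concept
    Q_f = np.cov(T, rowvar=False)      # np.cov uses the 1/(n-1) normalisation by default
    return T, mu_f, Q_f

# 4 videos x 3 concepts of +1/-1 votes; the third concept is marked irrelevant.
S = np.array([[ 1,  1, -1],
              [ 1, -1,  1],
              [-1, -1,  1],
              [-1,  1, -1]])
w = np.array([1, 1, 0])
T, mu_f, Q_f = warp_votes(S, w)
print(T)
print(Q_f)
```

With `w[2] = 0`, the third concept contributes nothing to the warped covariance, so it cannot influence the eigenvector used for ranking.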
slide-35
SLIDE 35

35

Appendix | Generalized Conditional Gradient (GCG)

𝐻 = 𝑔 𝑦 = 0, 𝑦 = 0 𝑆𝑗,π‘˜ βˆ’ 𝑅𝑗,π‘˜, 𝑦 β‰  0, {𝑆 ← 𝑣𝑒𝑣𝑒

π‘ˆ}

min

𝑉 π‘—β‰ π‘˜

π‘£π‘£π‘ˆ 𝑗,π‘˜ βˆ’ 𝑅𝑗,π‘˜

2

+ πœ‡ 𝑣 2

  • Update the eigenvector 𝑣

𝑣 ← π‘šπ‘“π‘π‘’π‘—π‘œπ‘• π‘“π‘—π‘•π‘“π‘œπ‘€π‘“π‘‘π‘’π‘π‘  𝑝𝑔 βˆ’ 𝐻 Repeat until convergence … { }

π‘§βˆ— β‰ˆ π‘‘π‘—π‘•π‘œ 𝑗=1

𝑛 𝑇𝑗 𝑀 𝑣𝑗

  • Local minimizer
  • Rank Test videos using below equation
slide-36
SLIDE 36

36

Appendix | Generalized Conditional Gradient (GCG)

$\min_{R \succeq 0,\ \mathrm{rank}(R) = 1} \sum_{i \neq j} \left( Q_{i,j} - R_{i,j} \right)^2$

Rank-one matrix $R = \lambda u u^T$ ($\lambda$: eigenvalue, $u$: eigenvector)

  • Because minimizing $\sum_{i \neq j} (Q_{i,j} - R_{i,j})^2$ under the rank-one constraint is not a convex problem, we need another objective function to reach convergence.
  • GCG is an algorithm for solving an optimization problem quickly.

$\min_{U} \sum_{i \neq j} \left( (UU^T)_{i,j} - Q_{i,j} \right)^2 + \lambda \|U\|^2$

  • Update the eigenvector $u$ every iteration

$G = \nabla_R \left[ \sum_{i \neq j} \left( R_{i,j} - Q_{i,j} \right)^2 \right]$

$u = \arg\max_{\|z\|_2 \le 1} z^T G z$,