
SLIDE 1

Content-Based Video Copy Detection: PRISMA at TRECVID 2010

Juan Manuel Barrios and Benjamin Bustos

PRISMA Research Group, Department of Computer Science, University of Chile

{jbarrios,bebustos}@dcc.uchile.cl

November 17, 2010

PRISMA (University of Chile) CCD Task November 17, 2010 1 / 25

SLIDE 2

PRISMA System Overview

Copy detection system developed for TRECVID 2010. Three global descriptors. No audio information. Pivot-based index with approximate search. Voting algorithm for copy localization. Implemented in C with the OpenCV library. The system is divided into five tasks (steps).

SLIDE 3

PRISMA System Overview

[Figure: pipeline diagram with five steps: (1) Preprocessing, (2) Frame Sampling, (3) Feature Extraction, (4) Similarity Search, (5) Copy Localization. Inputs: query videos and reference videos. Output: detection result.]

SLIDE 4

System Tasks

1 Preprocessing:

Skip irrelevant frames. Remove black borders. Apply inverse transformations for camcording, picture-in-picture (PIP) and flip.

The number of query videos increased from 1,608 to 5,378. The number of reference videos stayed at 11,524.

SLIDE 6

System Tasks

2 Frame Sampling:

Divides each video into groups of similar consecutive frames (GF). Uniform subsampling at 3 frames per second. The similarity between frames is defined as the maximum difference between pixel intensities.

Query videos are divided into 1,000,000 groups. Reference videos are divided into 4,000,000 groups.

[Figure: a video timeline segmented into consecutive frame groups GF1 to GF13.]
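The grouping rule can be sketched as follows. This is a minimal sketch rather than the PRISMA implementation: frames are assumed to be flat lists of gray intensities that were already subsampled at 3 frames per second, the threshold value is hypothetical, and comparing each frame against the first frame of the current group is an assumption (the slides do not say which frame acts as the reference).

```python
def frame_difference(a, b):
    """Maximum absolute difference between the pixel intensities of two frames."""
    return max(abs(x - y) for x, y in zip(a, b))


def group_frames(frames, threshold=30):
    """Split a frame sequence into groups of similar consecutive frames.

    `threshold` is a hypothetical value. A new group starts when a frame
    differs from the first frame of the current group by more than it.
    """
    groups = []
    for index, frame in enumerate(frames):
        if groups and frame_difference(frames[groups[-1][0]], frame) <= threshold:
            groups[-1].append(index)
        else:
            groups.append([index])
    return groups


# Four tiny 3-pixel frames: the third one changes sharply in its first pixel.
frames = [[10, 10, 10], [12, 11, 10], [200, 10, 10], [205, 12, 9]]
groups = group_frames(frames)
```

Each inner list holds the frame indices of one group, so the group boundaries can be mapped back to timestamps.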

SLIDE 7

System Tasks

3 Feature Extraction:

The descriptor of a group is the average of the descriptors of its frames. Three global visual descriptors are extracted (1 byte per dimension):

EH: Edge Histogram (4 × 4 × 10 = 160 dimensions)
GH: Gray Histogram (3 × 3 × 20 = 180 dimensions)
CH: RGB Histogram (2 × 2 × 48 = 192 dimensions)

[Figure: each frame group GF1 to GF13 is mapped to three descriptor vectors: EH, GH and CH.]
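As an illustration of the group descriptor, the sketch below computes a gray histogram and averages it over the frames of a group. It assumes that 3 × 3 × 20 means a 3 × 3 grid of zones with 20 intensity bins each, and that each zone histogram is normalized by its pixel count; neither detail is stated on the slide, and the quantization to 1 byte per dimension is omitted. The function names are hypothetical.

```python
def gray_histogram(frame, zones=3, bins=20):
    """Zone-wise gray histogram of a frame given as rows of values in 0..255."""
    height, width = len(frame), len(frame[0])
    counts = [[0] * bins for _ in range(zones * zones)]
    for y, row in enumerate(frame):
        for x, value in enumerate(row):
            zone = (y * zones // height) * zones + (x * zones // width)
            counts[zone][value * bins // 256] += 1
    descriptor = []
    for histogram in counts:
        total = sum(histogram) or 1  # avoid dividing by zero for empty zones
        descriptor.extend(count / total for count in histogram)
    return descriptor


def group_descriptor(frame_descriptors):
    """Descriptor of a group: component-wise average over its frames."""
    n = len(frame_descriptors)
    return [sum(values) / n for values in zip(*frame_descriptors)]


# A group made of one black and one white 6 x 6 frame.
dark = [[0] * 6 for _ in range(6)]
bright = [[255] * 6 for _ in range(6)]
descriptor = group_descriptor([gray_histogram(dark), gray_histogram(bright)])
```

With these two frames, every zone ends up with half of its mass in the darkest bin and half in the brightest one.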

SLIDE 8

System Tasks

4 Similarity Search:

Compares descriptors of query groups with descriptors of reference groups. DIST(Gi, Gj) is a distance function that measures how dissimilar groups Gi and Gj are. DIST is defined as a combination of two descriptors:

Run ehdNgryhst: DIST combines EH and GH.
Run ehdNclrhst: DIST combines EH and CH.

SLIDE 9

Similarity Search Task

The distance between groups is a static weighted combination of the distances between descriptors (γ):

\[ \delta(G_i, G_j) = w_1 \, \gamma_1(G_i, G_j) + w_2 \, \gamma_2(G_i, G_j) \]

We defined γ as the L1 (Manhattan) distance for the EH, GH and CH vectors:

\[ L_1(x, y) = \sum_{i=0}^{d} |x_i - y_i| \]

The final distance between groups is the average of δ over three consecutive groups:

\[ \mathrm{DIST}(G_i, G_j) = \frac{\delta(G_{i-1}, G_{j-1}) + \delta(G_i, G_j) + \delta(G_{i+1}, G_{j+1})}{3} \]

DIST requires more than 1,000 operations to be evaluated.
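The definitions above translate almost directly into code. The following is a minimal sketch: the data layout (each group as a dictionary from descriptor name to vector) and the function names are assumptions, and boundary groups without a predecessor or successor are not handled because the slides do not describe that case.

```python
def l1(x, y):
    """L1 (Manhattan) distance between two vectors."""
    return sum(abs(a - b) for a, b in zip(x, y))


def delta(g, h, weights):
    """Weighted combination of the L1 distances of each descriptor."""
    return sum(w * l1(g[name], h[name]) for name, w in weights.items())


def dist(query, i, reference, j, weights):
    """Average of delta over three consecutive group pairs centered at (i, j).

    Assumes i and j are interior positions (boundary handling is not covered).
    """
    return sum(delta(query[i + k], reference[j + k], weights) for k in (-1, 0, 1)) / 3


weights = {"EH": 0.068, "GH": 0.090}
query = [{"EH": [0, 0], "GH": [0]} for _ in range(3)]
reference = [{"EH": [k, 0], "GH": [0]} for k in (1, 2, 3)]
value = dist(query, 1, reference, 1, weights)  # 0.068 * (1 + 2 + 3) / 3
```

With the real descriptor sizes, one δ evaluation for EH+GH touches 160 + 180 = 340 dimensions, so DIST touches 3 × 340 = 1,020 of them (1,056 for EH+CH), which is consistent with the figure of more than 1,000 operations.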

SLIDE 10

Similarity Search Task

We set the weight of each descriptor using a histogram of distances between pairs of vectors. Each weight normalizes to 100 the distance that covers 0.01% of the pairs in the corresponding histogram:

EH: 100 / 1469 = 0.068
GH: 100 / 1106 = 0.090
CH: 100 / 660 = 0.152

ehdNgryhst: δ = 0.068 × EH + 0.090 × GH
ehdNclrhst: δ = 0.068 × EH + 0.152 × CH
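A small sketch of this normalization follows. It reads "the distance that covers 0.01% of pairs" as the 0.01% quantile of the pairwise distances, which is an interpretation rather than something the slide spells out; the function name and the toy data are hypothetical.

```python
import itertools


def l1(x, y):
    """L1 (Manhattan) distance between two vectors."""
    return sum(abs(a - b) for a, b in zip(x, y))


def normalizing_weight(vectors, fraction=0.0001, target=100.0):
    """Weight that maps the `fraction` quantile of pairwise distances to `target`."""
    distances = sorted(l1(a, b) for a, b in itertools.combinations(vectors, 2))
    index = max(0, int(fraction * len(distances)) - 1)
    return target / distances[index]


# Toy data: pairwise distances are 1, 2, 3, 4, 6, 7, so the 50% quantile is 3.
toy_weight = normalizing_weight([[0], [1], [3], [7]], fraction=0.5)

# The three weights reported on the slide follow from its quantile distances.
slide_weights = {"EH": round(100 / 1469, 3), "GH": round(100 / 1106, 3), "CH": round(100 / 660, 3)}
```

After this normalization a combined distance of about 100 per descriptor marks the same very-close fraction of pairs for each descriptor, which makes their contributions comparable.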

SLIDE 11

Similarity Search Task

The intrinsic dimensionality \( \mu^2 / (2\sigma^2) \), where \( \mu \) and \( \sigma^2 \) are the mean and variance of the histogram of distances, quantifies how hard it is to search in a metric space [Chávez et al, 2001]. Move w2 to a value that locally maximizes the intrinsic dimensionality of δ.

An iterative algorithm converged to:

ehdNgryhst: δ = 0.068 × EH + 0.090 × GH
ehdNclrhst: δ = 0.068 × EH + 0.045 × CH
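The slide does not describe the iterative algorithm, so the sketch below substitutes a plain hill climb on w2 with a fixed step. The step size, the stopping rule, the function names and the sample distances are all assumptions made for illustration.

```python
import statistics


def intrinsic_dimensionality(distances):
    """mu^2 / (2 sigma^2) of a sample of distances."""
    mu = statistics.mean(distances)
    variance = statistics.pvariance(distances)
    return mu * mu / (2 * variance)


def combined_rho(gamma1, gamma2, w1, w2):
    """Intrinsic dimensionality of delta = w1 * gamma1 + w2 * gamma2."""
    return intrinsic_dimensionality([w1 * a + w2 * b for a, b in zip(gamma1, gamma2)])


def tune_w2(gamma1, gamma2, w1, w2, step=0.005, max_iter=1000):
    """Hill climbing on w2 (hypothetical stand-in for the iterative algorithm)."""
    for _ in range(max_iter):
        candidates = [w for w in (w2 - step, w2 + step) if w > 0]
        best = max(candidates, key=lambda w: combined_rho(gamma1, gamma2, w1, w))
        if combined_rho(gamma1, gamma2, w1, best) <= combined_rho(gamma1, gamma2, w1, w2):
            break  # no neighbor improves the value: local maximum reached
        w2 = best
    return w2


# Hypothetical descriptor distances for six sampled pairs of groups.
gamma1 = [1469, 1600, 1510, 1720, 1490, 1650]
gamma2 = [660, 1900, 800, 1500, 1200, 700]
start_w2 = 0.152
tuned_w2 = tune_w2(gamma1, gamma2, w1=0.068, w2=start_w2)
```

Only w2 moves while w1 stays fixed, which mirrors the slide: the EH weight is 0.068 in both runs and only the second weight changes.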

SLIDE 12

Similarity Search Task

The output of the Similarity Search task is a nearest-neighbors (NN) table with the most similar reference groups for each query group.

Query          NN 1               NN 2               NN 3
Query1Group1   Vid07_Grp54 dist   Vid08_Grp73 dist   Vid01_Grp68 dist
Query1Group2   Vid09_Grp13 dist   Vid02_Grp34 dist   Vid02_Grp33 dist
Query1Group3   Vid07_Grp34 dist   Vid03_Grp54 dist   Vid09_Grp14 dist
Query1Group4   Vid09_Grp15 dist   Vid02_Grp13 dist   Vid03_Grp65 dist
Query1Group5   Vid01_Grp88 dist   Vid01_Grp12 dist   Vid07_Grp58 dist
Query1Group6   Vid09_Grp54 dist   Vid09_Grp17 dist   Vid07_Grp59 dist
Query1Group7   Vid01_Grp45 dist   Vid03_Grp43 dist   Vid03_Grp20 dist
Query1Group8   Vid09_Grp19 dist   Vid01_Grp12 dist   Vid07_Grp61 dist
...            ...                ...                ...

A naive approach would evaluate DIST 1,000,000 × 4,000,000 times (this would take about 11 months!).
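The 11-month estimate can be turned into an implied throughput. The back-calculation below assumes 30-day months and a single sequential process, neither of which is stated on the slide.

```python
query_groups = 1_000_000
reference_groups = 4_000_000
evaluations = query_groups * reference_groups  # 4 * 10^12 DIST evaluations

seconds_in_11_months = 11 * 30 * 24 * 3600  # assuming 30-day months
implied_rate = evaluations / seconds_in_11_months  # DIST evaluations per second
```

Under these assumptions the estimate corresponds to roughly 140,000 DIST evaluations per second.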

SLIDE 13

Similarity Search Task

DIST complies with the metric properties: reflexivity, non-negativity, symmetry and the triangle inequality. Let q be a group of frames from a query video, and v be a group of frames from a reference video. A lower bound for DIST(q, v) can be calculated with pivots.

Lower bound for a single pivot p:

\[ \mathrm{DIST}(q, v) \ge |\mathrm{DIST}(p, q) - \mathrm{DIST}(p, v)| \]

Let S = {p1, ..., pm} be a set of pivots, then:

\[ \mathrm{DIST}(q, v) \ge \max_{p \in S} |\mathrm{DIST}(p, q) - \mathrm{DIST}(p, v)| \]
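The bound follows from the triangle inequality applied twice: DIST(p, q) ≤ DIST(p, v) + DIST(q, v) and DIST(p, v) ≤ DIST(p, q) + DIST(q, v). The sketch below checks it numerically, using the plain L1 distance on two-dimensional points as a stand-in for DIST; the pivots and points are made up for the example.

```python
def l1(x, y):
    """L1 (Manhattan) distance, used here as a stand-in for DIST."""
    return sum(abs(a - b) for a, b in zip(x, y))


def lower_bound(query_to_pivots, reference_to_pivots):
    """Largest |d(p, q) - d(p, v)| over all pivots p."""
    return max(abs(dq - dv) for dq, dv in zip(query_to_pivots, reference_to_pivots))


pivots = [[0, 0], [10, 0], [0, 10]]
q = [2, 3]
v = [7, 1]

q_to_pivots = [l1(p, q) for p in pivots]  # computed once per query
v_to_pivots = [l1(p, v) for p in pivots]  # precomputed in the index
bound = lower_bound(q_to_pivots, v_to_pivots)
actual = l1(q, v)
```

The bound never exceeds the actual distance, so objects with a large bound can be ruled out without evaluating the expensive distance.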

SLIDE 16

Similarity Search Task

Index creation:

The system selects 4 sets of 9 pivots with the incremental SSS algorithm [Bustos et al, 2008].

Each set requires a table with 9 × 4,000,000 distances.

The system compares the 4 sets and selects the set that has the greatest average lower bound and discards the others [Zezula et al, 2005].
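The comparison of pivot sets can be sketched as follows. Random sampling replaces the incremental SSS algorithm here, so this illustrates only the selection criterion (greatest average lower bound); estimating that average over a sample of object pairs is also an assumption, as are the function names and the toy data.

```python
import random


def l1(x, y):
    """L1 (Manhattan) distance, used here as a stand-in for DIST."""
    return sum(abs(a - b) for a, b in zip(x, y))


def lower_bound(a, b, pivots):
    """Pivot-based lower bound for the distance between a and b."""
    return max(abs(l1(p, a) - l1(p, b)) for p in pivots)


def average_lower_bound(pivots, sample_pairs):
    """Mean lower bound over a sample of object pairs."""
    return sum(lower_bound(a, b, pivots) for a, b in sample_pairs) / len(sample_pairs)


def best_pivot_set(candidate_sets, sample_pairs):
    """Keep the candidate set with the greatest average lower bound."""
    return max(candidate_sets, key=lambda s: average_lower_bound(s, sample_pairs))


rng = random.Random(0)
objects = [[rng.randrange(256) for _ in range(8)] for _ in range(200)]
candidate_sets = [rng.sample(objects, 9) for _ in range(4)]  # 4 sets of 9 pivots
sample_pairs = [tuple(rng.sample(objects, 2)) for _ in range(100)]
chosen = best_pivot_set(candidate_sets, sample_pairs)
```

A higher average lower bound means tighter bounds, so more reference groups can be discarded before any actual distance is computed.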

SLIDE 18

Similarity Search Task

Similarity search for a query group q:

For every pivot p, evaluate DIST(q, p). For every reference group v, calculate a lower bound for DIST(q, v).

Only 9 operations are needed to calculate each lower bound.

Select the 4,000 objects (0.1%) with the lowest lower bounds. Calculate the actual DIST(q, v) only for those 4,000 objects and select the NNs among them.
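The search steps above can be sketched as below. The plain L1 distance stands in for DIST, the reference set is small random data, and the pivots are sampled at random; the function names are hypothetical.

```python
import heapq
import random


def l1(x, y):
    """L1 (Manhattan) distance, used here as a stand-in for DIST."""
    return sum(abs(a - b) for a, b in zip(x, y))


def build_pivot_table(references, pivots):
    """Index: distance from every reference object to every pivot."""
    return [[l1(p, v) for p in pivots] for v in references]


def approximate_knn(q, references, pivots, pivot_table, k=3, fraction=0.001):
    """Evaluate the real distance only for the lowest lower bounds."""
    q_to_pivots = [l1(p, q) for p in pivots]
    bounds = [max(abs(dq - dv) for dq, dv in zip(q_to_pivots, row)) for row in pivot_table]
    n_candidates = max(k, int(fraction * len(references)))
    candidates = heapq.nsmallest(n_candidates, range(len(references)), key=bounds.__getitem__)
    return heapq.nsmallest(k, ((l1(q, references[i]), i) for i in candidates))


rng = random.Random(1)
references = [[rng.randrange(256) for _ in range(8)] for _ in range(2000)]
pivots = rng.sample(references, 9)
table = build_pivot_table(references, pivots)
result = approximate_knn(list(references[42]), references, pivots, table)
```

Because only the candidates with the lowest bounds are examined, true neighbors with a loose bound can be missed, which is why the search is approximate. At the scale on the slide, 4,000 of 4,000,000 reference groups is 0.1%, so 99.9% of the DIST evaluations are skipped for each query group.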

SLIDE 24

System Tasks

5 Copy Localization:

Takes the NN table and searches for chains of groups that belong to the same reference video with temporal coherence. The voting algorithm is based on NN rank, NN distance and the spread of votes in the chain. The copy location is set to the start and end of the chain.

Query          NN 1               NN 2               NN 3
Query1Group1   Vid07_Grp54 dist   Vid08_Grp73 dist   Vid01_Grp68 dist
Query1Group2   Vid09_Grp13 dist   Vid02_Grp34 dist   Vid02_Grp33 dist
Query1Group3   Vid07_Grp34 dist   Vid03_Grp54 dist   Vid09_Grp14 dist
Query1Group4   Vid09_Grp15 dist   Vid02_Grp13 dist   Vid03_Grp65 dist
Query1Group5   Vid01_Grp88 dist   Vid01_Grp12 dist   Vid07_Grp58 dist
Query1Group6   Vid09_Grp54 dist   Vid09_Grp17 dist   Vid07_Grp59 dist
Query1Group7   Vid01_Grp45 dist   Vid03_Grp43 dist   Vid03_Grp20 dist
Query1Group8   Vid09_Grp19 dist   Vid01_Grp12 dist   Vid07_Grp61 dist
...            ...                ...                ...

score Vid07 = 2.2
score Vid09 = 3.7
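The chain search can be illustrated on the example table. The sketch below groups votes by reference video and temporal offset (reference group number minus query group number) and scores each chain with 1/rank per vote. This scoring is hypothetical: it ignores NN distance and vote spread, so it does not reproduce the slide's scores of 2.2 and 3.7, only their ordering.

```python
from collections import defaultdict

# Example NN table: query group number -> (video, group) for NN 1, NN 2, NN 3.
NN_TABLE = {
    1: [("Vid07", 54), ("Vid08", 73), ("Vid01", 68)],
    2: [("Vid09", 13), ("Vid02", 34), ("Vid02", 33)],
    3: [("Vid07", 34), ("Vid03", 54), ("Vid09", 14)],
    4: [("Vid09", 15), ("Vid02", 13), ("Vid03", 65)],
    5: [("Vid01", 88), ("Vid01", 12), ("Vid07", 58)],
    6: [("Vid09", 54), ("Vid09", 17), ("Vid07", 59)],
    7: [("Vid01", 45), ("Vid03", 43), ("Vid03", 20)],
    8: [("Vid09", 19), ("Vid01", 12), ("Vid07", 61)],
}


def find_chains(nn_table):
    """Collect votes by (reference video, temporal offset)."""
    chains = defaultdict(list)
    for query_group, neighbors in nn_table.items():
        for rank, (video, ref_group) in enumerate(neighbors, start=1):
            chains[(video, ref_group - query_group)].append((query_group, rank))
    return chains


def score(chain):
    """Hypothetical score: each vote counts 1/rank."""
    return sum(1.0 / rank for _, rank in chain)


chains = find_chains(NN_TABLE)
best_key = max(chains, key=lambda key: score(chains[key]))
best_chain = chains[best_key]
start, end = min(g for g, _ in best_chain), max(g for g, _ in best_chain)
```

On this table the winning chain is Vid09 with a constant offset of 11, spanning query groups 2 to 8, and the Vid07 chain with offset 53 comes second.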

SLIDE 27

Results

SLIDE 28

Results

Submitted Runs:

balanced.ehdNgryhst: δ = 0.068 × EH + 0.090 × GH
balanced.ehdNclrhst: δ = 0.068 × EH + 0.045 × CH
nofa.ehdNgryhst: same as balanced.ehdNgryhst, with a stricter voting algorithm.
nofa.ehdNghT10: same as nofa.ehdNgryhst, but with a different threshold.

The analysis focuses on Optimal NDCR (Normalized Detection Cost Rate). EH+GH is slightly better than EH+CH. Results are better in the NOFA (no false alarms) profile than in the Balanced profile.

SLIDE 29

Results nofa.ehdNgryhst

Optimal NDCR:

Lower NDCR than the median for each transformation. Better results for Insertion of Pattern and Strong Reencoding.

[Figure: minimal NDCR per transformation (logarithmic axis from 0.001 to 9999.999). Transformations: Camcording, PIP, Insertion of Pattern, Strong reencoding, Change gamma, Decrease quality, Postproduction, Random combination.]

SLIDE 31

Results nofa.ehdNgryhst

Optimal F1:

Good localization for PIP and bad localization for Camcording and Change in gamma.

[Figure: optimal mean F1 for true positives (TPs) per transformation (axis from 0.0 to 1.0). Transformations: Camcording, PIP, Insertion of Pattern, Strong reencoding, Change gamma, Decrease quality, Postproduction, Random combination.]

SLIDE 32

Results nofa.ehdNgryhst

Mean Time:

Slightly higher than the median, especially for camcording and PIP.

[Figure: mean processing time in seconds per transformation (logarithmic axis from 0.001 to 9999.999). Transformations: Camcording, PIP, Insertion of Pattern, Strong reencoding, Change gamma, Decrease quality, Postproduction, Random combination.]

SLIDE 33

Comparison

Comparison using Optimal NDCR averaged over all transformations. 22 teams submitted 41 runs for the balanced profile and 37 for the nofa profile.

Run                    Avg Opt NDCR   Global rank   Video-only rank
balanced.ehdNgryhst    0.597          14th of 41    1st of 15
balanced.ehdNclrhst    0.658          16th of 41    3rd of 15
nofa.ehdNgryhst        0.611          10th of 37    1st of 14
nofa.ehdNghT10         0.611          11th of 37    2nd of 14

Run                    Avg Opt F1     Global rank   Video-only rank
balanced.ehdNgryhst    0.820          15th of 41    2nd of 15
balanced.ehdNclrhst    0.820          16th of 41    3rd of 15
nofa.ehdNgryhst        0.828          14th of 37    1st of 14
nofa.ehdNghT10         0.828          15th of 37    2nd of 14

SLIDE 34

Comparison

[Figure: No False Alarms profile. Average Optimal F1 (0.2 to 1.0) plotted against Average Optimal NDCR (logarithmic scale, 0.1 to 1000) for the submitted runs, marked as video only, audio only or audio+video, with PRISMA labeled.]

SLIDE 35

Comparison

[Figure: Balanced profile. Average Optimal F1 (0.2 to 1.0) plotted against Average Optimal NDCR (0.5 to 2.0) for the submitted runs, marked as video only, audio only or audio+video, with PRISMA labeled.]

SLIDE 36

Conclusions

Acceptable overall results:

Global descriptors can achieve competitive results on the TRECVID transformations. The pivot-based approximation makes it possible to discard 99.9% of the distance computations while keeping good effectiveness.

Two novel techniques:

Set the weights by maximizing the intrinsic dimensionality. Calculate the actual distance only for the 0.1% of objects with the lowest lower bounds.

Future work:

Improve the efficiency of the preprocessing task. Test other distances between descriptors instead of L1 (in particular, some non-metric similarity measure). Test the inclusion of audio information and local descriptors.

SLIDE 37

Thank you!
