Category-specific video summarization
Speaker: Danila Potapov Joint work with: Matthijs Douze Zaid Harchaoui Cordelia Schmid LEAR team, Inria Grenoble Rhône-Alpes Christmas Colloquium on Computer Vision Moscow, 28.12.2015
1 / 22
◮ size of video data is growing
◮ 300 hours of video are uploaded to YouTube every minute
◮ types of video data: user-generated, sports, news, movies
◮ common need for structuring video data
2 / 22
3 / 22
◮ Recognize events accurately and efficiently ◮ Identify the most important moments in videos ◮ Quantitative evaluation of video analysis algorithms
4 / 22
◮ supervised approach to video summarization
◮ temporal localization at test time
◮ MED-Summaries dataset for evaluation of video summarization
◮ D. Potapov, M. Douze, Z. Harchaoui, C. Schmid. Category-specific video summarization. ECCV 2014
◮ MED-Summaries dataset online http://lear.inrialpes.fr/people/potapov/med_summaries
5 / 22
◮ evaluation benchmark for video summarization ◮ subset of TRECVID Multimedia Event Detection 2011 dataset ◮ 10 categories
[Figure: comparison of summarization datasets — UTE, SumMe, YouTube Highlights, and MED-Summaries — by total duration, number of annotators per video, and number of segments]
6 / 22
◮ built from a subset of temporal segments of the original video
◮ conveys the most important details of the video
7 / 22
◮ produce visually coherent temporal segments
◮ no shot boundaries, camera shake, etc. inside segments
◮ identify important parts
◮ category-specific importance: a measure of relevance to the category
[Figure: pipeline — the input video (category: Working on a sewing project) is cut into KTS segments; per-segment classification scores are computed; maxima of the scores yield the output summary]
8 / 22
◮ specialized domains
◮ Lu and Grauman [2013], Lee et al. [2012]: summarization of egocentric video
◮ Khosla et al. [2013]: keyframe summaries, canonical views mined from web images
◮ Sun et al. [2014] “Ranking Domain-specific Highlights by
◮ automatic approach for harvesting data ◮ highlight detection vs. temporally coherent summarization
◮ Gygli et al. [2014] “Creating Summaries from User Videos”
◮ cinematic rules for segmentation ◮ small set of informative descriptors 9 / 22
◮ goal: group similar frames so that semantic changes occur only at segment boundaries
◮ kernelized Multiple Change-Point Detection algorithm
◮ change-points divide the video into temporal segments
◮ input: robust frame descriptor (SIFT + Fisher Vector)
10 / 22
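The segmentation idea above can be sketched as a dynamic program. The sketch below is a deliberately simplified version: it uses the plain squared-Euclidean scatter of each segment instead of a kernel Gram matrix, and a fixed number of change points instead of the penalized count used in the actual algorithm.

```python
import numpy as np

def kts_sketch(X, n_cp):
    """Choose n_cp change points minimizing total within-segment scatter.

    Simplified sketch of Kernel Temporal Segmentation: plain Euclidean
    scatter, fixed number of change points (both simplifications).
    """
    n, d = X.shape
    # Prefix sums give O(1) evaluation of any segment's scatter.
    csum = np.vstack([np.zeros((1, d)), np.cumsum(X, axis=0)])
    csq = np.concatenate([[0.0], np.cumsum((X ** 2).sum(axis=1))])

    def cost(t, tp):  # scatter of frames X[t:tp]
        s = csum[tp] - csum[t]
        return csq[tp] - csq[t] - (s @ s) / (tp - t)

    n_seg = n_cp + 1
    INF = float("inf")
    # dp[k][t]: best cost of splitting X[:t] into k segments
    dp = [[INF] * (n + 1) for _ in range(n_seg + 1)]
    back = [[0] * (n + 1) for _ in range(n_seg + 1)]
    dp[0][0] = 0.0
    for k in range(1, n_seg + 1):
        for t in range(k, n + 1):
            for s0 in range(k - 1, t):
                if dp[k - 1][s0] == INF:
                    continue
                c = dp[k - 1][s0] + cost(s0, t)
                if c < dp[k][t]:
                    dp[k][t] = c
                    back[k][t] = s0
    # Backtrack to recover the change points.
    cps, t = [], n
    for k in range(n_seg, 0, -1):
        t = back[k][t]
        cps.append(t)
    return sorted(cps)[1:]  # drop the leading 0
```

On a toy signal of 10 constant frames followed by 10 different constant frames, `kts_sketch(X, 1)` recovers the boundary at index 10.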
◮ within-segment variance of segment $[t, t')$, in kernelized form:

$$v_{t,t'} \;=\; \sum_{i=t}^{t'-1} K(x_i, x_i) \;-\; \frac{1}{t'-t} \sum_{i,j=t}^{t'-1} K(x_i, x_j)$$

◮ objective: minimize the total variance $\sum_i v_{t_i, t_{i+1}}$ over the change points, plus a penalty on the number of segments
11 / 22
◮ Training: train a linear SVM from a set of videos with just video-level category labels
◮ Testing: score segment descriptors with the classifiers
[Figure: pipeline example — per-segment classification scores on KTS segments; maxima of the scores form the output summary (category: Working on a sewing project)]
12 / 22
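The "maxima" step above can be sketched as follows. This is a hypothetical helper, not the paper's code: it keeps segments whose classification score is a strict local maximum and above a threshold.

```python
import numpy as np

def select_summary_segments(scores, min_score=0.0):
    """Indices of segments whose score is a strict local maximum and at
    least min_score. Hypothetical sketch of the 'maxima' selection step."""
    scores = np.asarray(scores, dtype=float)
    keep = []
    for i, s in enumerate(scores):
        left = scores[i - 1] if i > 0 else -np.inf
        right = scores[i + 1] if i + 1 < len(scores) else -np.inf
        if s > left and s > right and s >= min_score:
            keep.append(i)
    return keep
```

For scores [0.1, 0.9, 0.2, 0.5, 0.7, 0.3] with threshold 0.4, segments 1 and 4 are selected.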
◮ 100 test videos (≈ 4 hours) from TRECVID MED 2011
◮ multiple annotators
◮ 2 annotation tasks:
◮ segment boundaries (median duration: 3.5 sec.)
◮ segment importance (grades from 0 to 3)
◮ 0 = not relevant to the category
◮ 3 = highest relevance
13 / 22
14 / 22
15 / 22
◮ often based on user studies
◮ time-consuming, costly and hard to reproduce
◮ Our approach: rely on the annotation of test videos
◮ ground-truth segments {S_i}, i = 1..m
◮ computed summary {S̄_j}, j = 1..m̄
◮ coverage criterion:
[Figure: matching between the ground-truth segmentation and the summary — a summary segment matches a ground-truth segment when it covers that segment's period; a ground-truth segment not covered by the summary yields no match]
◮ importance ratio for summary evaluation
16 / 22
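The importance ratio can be illustrated with a small sketch. This is an assumption-laden reading of the criterion, not the released evaluation code: the numerator sums the importance grades of ground-truth segments matched by the summary, and the denominator is the best total importance achievable by matching the same number of segments.

```python
def importance_ratio(matched_importances, all_importances):
    """Sketch (not the exact MED-Summaries code): summed importance of
    the matched ground-truth segments, divided by the best total
    achievable when matching the same number of segments."""
    n = len(matched_importances)
    best = sum(sorted(all_importances, reverse=True)[:n])
    return sum(matched_importances) / best if best else 0.0
```

If the video's segments have grades [1, 3, 2, 0] and the summary matches the grade-3 and grade-1 segments, the ratio is (3 + 1) / (3 + 2) = 0.8.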
◮ a meaningful summary covers a ground-truth segment of the highest importance
[Figure: example — ground-truth segments with importance grades 1, 3, 2 and classification scores 0.7, 0.5, 0.9; here 3 segments must be selected before the summary covers an importance-3 segment]
◮ segmentation F-score: segments match when overlap/union > β
17 / 22
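The segmentation F-score can be sketched like this: temporal intersection-over-union between segments, a greedy one-to-one matching at threshold β, then the harmonic mean of precision and recall. This is an illustrative reading of the criterion, not the released evaluation code.

```python
def iou(a, b):
    """Temporal intersection-over-union of two intervals (start, end)."""
    inter = max(0.0, min(a[1], b[1]) - max(a[0], b[0]))
    union = (a[1] - a[0]) + (b[1] - b[0]) - inter
    return inter / union if union > 0 else 0.0

def segmentation_fscore(gt, pred, beta=0.5):
    """Greedily match each ground-truth segment to the best unused
    predicted segment with IoU > beta; return the F-score."""
    if not gt or not pred:
        return 0.0
    used, matches = set(), 0
    for g in gt:
        best_j, best_v = None, beta
        for j, p in enumerate(pred):
            if j not in used and iou(g, p) > best_v:
                best_j, best_v = j, iou(g, p)
        if best_j is not None:
            used.add(best_j)
            matches += 1
    prec, rec = matches / len(pred), matches / len(gt)
    return 2 * prec * rec / (prec + rec) if prec + rec else 0.0
```

With ground truth [(0, 10), (10, 20)] and predictions [(0, 9), (12, 20)], both pairs overlap with IoU above 0.5, so the F-score is 1.0.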
◮ Users: keep 1 user in turn as ground truth for evaluation of the remaining users
◮ SD + SVM: shot detector of Massoudi et al. [2006] for temporal segmentation
◮ KTS + Cluster: Kernel Temporal Segmentation + k-means
◮ sort segments by increasing distance to centroid
18 / 22
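The ranking step of the KTS + Cluster baseline could look like the sketch below: segments are ordered by increasing distance to their cluster centroid, so the most "typical" segments come first. The labels and centroids are assumed to come from any k-means run over the segment descriptors; this is not the authors' implementation.

```python
import numpy as np

def rank_by_centroid_distance(descs, labels, centroids):
    """Order segment indices by increasing distance to the centroid of
    their k-means cluster. Sketch of the KTS + Cluster baseline ranking;
    labels/centroids would come from a k-means run over descs."""
    d = np.linalg.norm(descs - centroids[labels], axis=1)
    return list(np.argsort(d))
```

For four descriptors at distances 0, 1, 0.5, and 4 from their centroids, the ranking is [0, 2, 1, 3].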
[Figure: results — summary duration in seconds (lower is better) and importance ratio (higher is better) for Users, SD + SVM, KTS + Cluster, KVS-SIFT, and KVS-MBH]
19 / 22
20 / 22
◮ KVS delivers short and highly informative summaries, with a good trade-off between duration and importance
◮ the temporal segmentation algorithm produces visually coherent segments
◮ KVS is trained in a weakly-supervised way
◮ does not require segment annotations in the training set
◮ MED-Summaries — a dataset for evaluation of video summarization
◮ annotations and evaluation code available online
◮ D. Potapov, M. Douze, Z. Harchaoui, C. Schmid. Category-specific video summarization. ECCV 2014
◮ MED-Summaries dataset online http://lear.inrialpes.fr/people/potapov/med_summaries
21 / 22
22 / 22