SLIDE 1

L1-regularized Logistic Regression Stacking and Transductive CRF Smoothing for Action Recognition in Video

Svebor Karaman, Lorenzo Seidenari, Andrew D. Bagdanov, Alberto Del Bimbo

Media Integration and Communication Center (MICC) University of Florence, Florence, Italy {svebor.karaman, lorenzo.seidenari}@unifi.it, {bagdanov, delbimbo}@dsi.unifi.it http://www.micc.unifi.it/vim/people

Svebor Karaman et al. (MICC) THUMOS Submission 40 December 7, 2013 1 / 18

slide-2
SLIDE 2

THUMOS Workshop

First International Workshop on Action Recognition with a Large Number of Classes

101 classes, 5 types: Human-Object Interaction, Human-Human Interaction, Body-Motion Only, Playing Musical Instruments, Sports.
13320 videos (25 groups).
Pre-computed and pre-encoded (hard-assigned 4000-word BoW) low-level features: STIP, Dense Trajectory Features (MBH, HOG, HOF, TR).
3 splits: 2/3 train, 1/3 test (disjoint groups in train/test).

SLIDE 3

Introduction

Our game plan and our goals

Priority: establish a working BoW pipeline on the given hard-assigned encoded features (MBH, HOG, HOF, STIP, TR) to establish our baseline.

Limitations:
- Loss due to hard assignment
- No contextual features
- Lots and lots of classes and features; unclear how to fuse

Goal 1: improve the features in our baseline
- Use better encoding of the provided features (after re-extraction)
- Add static contextual features extracted from keyframes

Goal 2: experiment with fusion schemes
- Regularized stacking of experts
- Transductive smoothing of expert outputs

Note: we did not use any external data or the provided attributes.

SLIDE 4

Baseline with provided features (Run-1)

Run 1: a respectable baseline

Late fusion (sum) of 1-vs-all SVM classifiers (Histogram Intersection Kernel) learned on the M = 5 provided features:

class(x) = argmax_c Σ_{f∈F_org} E_c^f(x)    (1)

Performance: 74.6% (Split 1: 72.85%, Split 2: 74.96%, Split 3: 75.97%)
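As a sketch of Eq. (1): late fusion reduces to summing the per-feature expert scores and taking the arg max over classes. The scores below are toy values, not actual SVM outputs:

```python
import numpy as np

def late_fusion_predict(expert_scores):
    """Sum per-feature expert scores and pick the highest-scoring class.

    expert_scores: array of shape (M, C) -- decision values of M
    per-feature 1-vs-all experts for C classes on one video.
    """
    fused = expert_scores.sum(axis=0)   # sum over the M feature channels
    return int(np.argmax(fused))        # class(x) = argmax_c sum_f E_c^f(x)

# Toy example: M = 2 features, C = 3 classes.
scores = np.array([[0.2, 0.9, 0.1],
                   [0.3, 0.4, 0.6]])
pred = late_fusion_predict(scores)      # class 1 wins: 0.9 + 0.4 = 1.3
```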

SLIDE 5

Better encoding of dense trajectories features

Extraction of dense trajectories [Wang:2013]

- On a modest cluster of 20 CPUs: 5 nodes, quad-core 2.7 GHz CPUs, 48 GB total RAM
- Total extraction time: 25 h
- Disk usage: 660 GB

Extracted features:

- Separate x- and y-components (MBHx and MBHy)
- Standard concatenation of the two local descriptors (MBH)
- Histogram of Gradients (HOG)

Fisher encoding of all features independently:

- 256 Gaussians with diagonal covariance
- Gradients with respect to means and covariances
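A minimal sketch of this encoding step, using scikit-learn's `GaussianMixture` with toy descriptors in place of the real 256-component GMM and dense trajectory features; as on the slide, only the gradients with respect to means and (diagonal) covariances are kept:

```python
import numpy as np
from sklearn.mixture import GaussianMixture

def fisher_vector(local_descs, gmm):
    """Fisher encoding: gradients w.r.t. means and diagonal covariances
    of a GMM, accumulated over the soft-assigned local descriptors."""
    Q = gmm.predict_proba(local_descs)              # (T, K) soft assignments
    T, D = local_descs.shape
    K = gmm.n_components
    mu, var, pi = gmm.means_, gmm.covariances_, gmm.weights_
    d_mu = np.zeros((K, D))
    d_var = np.zeros((K, D))
    for k in range(K):
        diff = (local_descs - mu[k]) / np.sqrt(var[k])   # normalized residual
        d_mu[k] = (Q[:, k, None] * diff).sum(0) / (T * np.sqrt(pi[k]))
        d_var[k] = (Q[:, k, None] * (diff ** 2 - 1)).sum(0) / (T * np.sqrt(2 * pi[k]))
    return np.hstack([d_mu.ravel(), d_var.ravel()])      # length 2*K*D

rng = np.random.default_rng(0)
descs = rng.normal(size=(500, 8))                    # toy local descriptors
gmm = GaussianMixture(4, covariance_type='diag', random_state=0).fit(descs)
fv = fisher_vector(descs, gmm)                       # 2 * 4 * 8 = 64 dims
```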

SLIDE 7

Is context relevant for action recognition?

We extract the central frame of each video as keyframe. Visualizing the mean keyframe of each class is illuminating:

[Mean keyframes shown for: Basketball, Playing Cello, Ice Dancing, Soccer Penalty]

SLIDE 8

Additional contextual features

Densely sampled Pyramidal-SIFT [Seidenari:2013] features (P-SIFT and P-OpponentSIFT) on keyframes:

- Pyramidal-SIFT: three pooling levels, corresponding to 2×2, 4×4, and 6×6 pooling regions. Each level has its own dictionary: 1500, 2500, and 3000 words respectively.
- Spatial pyramid configuration: 1×1, 2×2, 1×3
- Locality-constrained Linear Coding and max pooling [Wang:2010]

SLIDE 9

Late fusion with all features (Run-2)

Run-2: more features, better encoding

The Fisher-encoded MBH, MBHx, MBHy and the LLC-encoded P-SIFT and P-OSIFT are fed to linear 1-vs-all SVMs. Combined with the provided feature histograms: a total of M = 11 features.

Performance: 82.46% (Split 1: 81.47%, Split 2: 83.01%, Split 3: 82.88%); Run-1: 74.6%

SLIDE 10

Stacking

Stacking: learn a classifier on top of the concatenation of expert decisions:

S(x) = [E_i^j], for j ∈ {1, …, M}, i ∈ {1, …, N}    (2)

Having lots of class/feature experts makes THUMOS an excellent playground for this type of fusion approach. Our idea: use L1-regularized LR for class/feature expert selection.

SLIDE 11

Stacking

Doing it wrong: decision values on training samples from classifiers trained on those same samples.

(a) Train (b) Test

SLIDE 12

Stacking

Doing it right: reconstruct the decisions on the training samples by running multiple held-out training/test folds.

(a) Train hold-out (b) Test
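The held-out reconstruction above can be sketched with scikit-learn's `cross_val_predict`, which trains on k−1 folds and scores only the held-out fold, shown here for a single toy feature channel:

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_predict
from sklearn.svm import LinearSVC

# One feature channel: reconstruct expert decision values on the
# training set via held-out folds, so the stacker never sees scores
# produced by a classifier trained on the same samples.
X, y = make_classification(n_samples=300, n_classes=3, n_informative=6,
                           random_state=0)
held_out_scores = cross_val_predict(LinearSVC(), X, y, cv=5,
                                    method='decision_function')
# held_out_scores[i] was produced by a model that never saw sample i;
# stacking S(x) concatenates such per-channel score vectors.
```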

SLIDE 13

Logistic regression for stacking (Run-3)

Run-3: L1-regularized logistic stacking

Motivation: a smart weighting/selection scheme. The model (β_c, b_c) of class c is obtained by minimizing the loss:

(β_c, b_c) = argmin_{β,b} ||β||_1 + C Σ_{i=1}^{n} ln(1 + e^{−y_i(β^T S(x_i) + b)})    (3)

Performance: 84.44% (Split 1: 83.70%, Split 2: 85.56%, Split 3: 84.07%); Run-2: 82.46%
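A sketch of the stacker in Eq. (3) using scikit-learn's L1-penalized `LogisticRegression` on synthetic stacked scores; the liblinear solver minimizes the same ||β||_1 + C·logistic-loss objective, and the L1 penalty drives most expert weights to exactly zero:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

# Synthetic stand-in for the stacked expert scores S(x): 50 "experts",
# only a few of which are informative for the class at hand.
S, y = make_classification(n_samples=400, n_features=50, n_informative=8,
                           random_state=0)
stacker = LogisticRegression(penalty='l1', solver='liblinear', C=1.0)
stacker.fit(S, y)
n_selected = int(np.count_nonzero(stacker.coef_))   # experts kept by L1
```

Sweeping `C` trades sparsity against fit, which is how the model can end up sparser for easy classes than for hard ones.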

SLIDE 14

Experts/Non-experts usage analysis

Analysis: classes ranked as easy or hard by the mAP of their own experts.

SLIDE 15

Experts/Non-experts usage analysis

Easy classes rely more on their own experts and have lower total energy.

SLIDE 16

Experts/Non-experts usage analysis

L1LRS model of “easiest” class: “Billiards”


SLIDE 17

Experts/Non-experts usage analysis

Hard classes rely more on other classes' experts and have higher total energy.

SLIDE 18

Experts/Non-experts usage analysis

L1LRS model of “hardest” class: “Handstand Walking”


SLIDE 19

Features/Experts usage analysis

L1LRS is able to select the most relevant features...

[Figure: portion of L1LRS energy per feature for each of the 101 classes, ordered by usage of our features. Ours: P-SIFT, P-OSIFT, MBHx, MBHy, MBH, HOG; given: HOG, HOF, TR, STIP, MBH.]
SLIDE 20

Features/Experts usage analysis

Classes relying most on contextual features: 1 - “Breaststroke”, 26 - “Cutting In Kitchen”, 9 - “Field Hockey Penalty”, 29 - “Baseball Pitch”, 51 - “Playing Piano”

SLIDE 21

Features/Experts usage analysis

Classes relying most on MBHx features: 14 - “Hammer Throw”, 4 - “Pommel Horse”, 1 - “Breaststroke”, 22 - “Throw Discus”, 60 - “Rowing”

SLIDE 22

Features/Experts usage analysis

Classes relying most on MBHy features: 31 - “Soccer Juggling”, 23 - “Pole Vault”, 8 - “High Jump”, 2 - “Body Weight Squats”, 35 - “Bench Press”

SLIDE 23

Features/Experts usage analysis

... and L1LRS can also discard the least relevant features

SLIDE 24

Transductive labelling

Obtain a more consistent labelling using unsupervised local constraints. Previously applied to re-identification [Karaman:2012]; first try on another task.

CRF defined as a graph G = (V, E), where the nodes V are all samples and the edges E are those of a k-NN graph. Energy minimization formulation:

W(ĉ) = Σ_{i∈V} φ_i(ĉ_i) + λ Σ_{(v_i,v_j)∈E} ψ_ij(ĉ_i, ĉ_j)    (4)

Data cost uses the L1LRS output: φ_i(ĉ_i) = e^{−(β_{ĉ_i}^T S(x_i) + b_{ĉ_i})}

Smoothness cost: ψ_ij(ĉ_i, ĉ_j) = ψ_ij ψ(ĉ_i, ĉ_j)

- Similarities between stacked expert outputs create and weight the edges of the k-NN graph: ψ_ij = exp(−||S(x_i) − S(x_j)||² / (σ_i σ_j))
- Label cost ψ(ĉ_i, ĉ_j) inversely proportional to the confusability between labels
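The k-NN graph construction and the edge weights ψ_ij can be sketched as follows; setting the local scale σ_i to the distance to the k-th neighbour is an assumption on our part, since the slides do not specify how σ is chosen:

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors

def knn_edge_weights(S, k=5):
    """Build the CRF edges: a k-NN graph over stacked expert outputs S(x),
    weighted by psi_ij = exp(-||S(x_i) - S(x_j)||^2 / (sigma_i * sigma_j)).
    sigma_i is taken as the distance to the k-th neighbour (an assumption)."""
    nn = NearestNeighbors(n_neighbors=k + 1).fit(S)
    dist, idx = nn.kneighbors(S)           # column 0 is the point itself
    sigma = dist[:, -1]                    # local scale per node
    edges, weights = [], []
    for i in range(len(S)):
        for pos in range(1, k + 1):        # skip self-match at position 0
            j = idx[i, pos]
            w = np.exp(-dist[i, pos] ** 2 / (sigma[i] * sigma[j]))
            edges.append((i, j))
            weights.append(w)
    return edges, np.array(weights)

rng = np.random.default_rng(0)
S = rng.normal(size=(50, 10))              # toy stacked score vectors
edges, weights = knn_edge_weights(S, k=5)  # 50 * 5 directed edges
```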

SLIDE 25

Transductive labelling (Run-4)

Run-4: the whole shebang

Energy minimization solved by Graph-Cut [Boykov:2001].
Performance: 85.71% (Split 1: 85.32%, Split 2: 86.64%, Split 3: 85.16%); Run-3: 84.44%
Improves the labeling of ambiguous samples that are given similar scores by several classifiers [Karaman:PR]; similar training and test samples in the stacked feature space enable this.

        Forg  Fours  L1LRS  CRF   Accuracy
Run-1    X                        74.6%
Run-2    X     X                  82.4%
Run-3    X     X      X           84.4%
Run-4    X     X      X     X     85.7%

Table 1: Summary of our four runs.

SLIDE 26

Results

#   Participant                 Avg.     Split 1   Split 2   Split 3
1   ID39 INRIA                  85.900   84.734    85.862    87.105
2   ID40 Florence               85.708   85.319    86.642    85.164
3   ID35 Canberra               85.437   84.761    86.367    85.183
4   ID38 CAS-SIAT               84.164   83.515    84.607    84.368
5   ID25 Nanjing                83.979   83.111    84.597    84.229
6   ID34 UCF-BoyrazTappen       82.829   82.640    83.352    82.496
7   ID36 UCSD-MSRA-SJTU         80.895   79.410    81.251    82.025
8   ID28 USC                    77.360   76.154    77.704    78.222
9   ID31 NII                    73.389   71.102    73.671    75.393
10  ID44 UNITN                  70.504   70.446    69.797    71.270

Table 2: Top 10 results of the challenge.

SLIDE 27

Discussion

Conclusion
- Better encoding makes a big difference
- Logistic regression for stacking is an interesting way to leverage the power of several class/feature experts:
  - automatically adjusts sparsity for easy/hard classes
  - selects the relevant class/feature experts
- The CRF incorporates local similarity constraints to obtain a more reliable labelling

Future work
- Test logistic regression for stacking with many class/feature experts
- Spatial/temporal pooling

SLIDE 28

References

[Boykov:2001] Y. Boykov, O. Veksler, and R. Zabih. Fast approximate energy minimization via graph cuts. IEEE Transactions on Pattern Analysis and Machine Intelligence, 23(11):1222–1239, 2001.

[Karaman:2012] S. Karaman and A. D. Bagdanov. Identity inference: generalizing person re-identification scenarios. In Proceedings of ECCV Workshops and Demonstrations, pages 443–452, 2012.

[Karaman:PR] S. Karaman, G. Lisanti, A. D. Bagdanov, and A. Del Bimbo. Leveraging local neighborhood topology for large scale person re-identification. Submitted to Pattern Recognition.

[Seidenari:2013] L. Seidenari, G. Serra, A. D. Bagdanov, and A. Del Bimbo. Local pyramidal descriptors for image recognition. IEEE Transactions on Pattern Analysis and Machine Intelligence, in press, 2013.

[Wang:2010] J. Wang, J. Yang, K. Yu, F. Lv, T. Huang, and Y. Gong. Locality-constrained linear coding for image classification. In Proc. of CVPR, 2010.

[Wang:2013] H. Wang, A. Kläser, C. Schmid, and C.-L. Liu. Dense trajectories and motion boundary descriptors for action recognition. International Journal of Computer Vision, pages 1–20, 2013.
