Localization with Spatio-Temporal Selective Search and SPPnet

Ryosuke Yamamoto, Nakamasa Inoue, Koichi Shinoda
Tokyo Institute of Technology
TRECVID 2015 TokyoTech

Outline
● Previous works
– Selective Search
– Spatial Pyramid Pooling (SPP) net
● Our Methods
1. Spatio-Temporal Selective Search
2. Multi-Frame Score Fusion
3. Neighbor-Frame Score Boosting
● Experiments, Results and Conclusion

Selective Search
● Selective Search produces a large number of object region proposals from an image
– It combines several grouping strategies, so many of the resulting proposals are useless
The image is from the paper: J. R. R. Uijlings, K. E. A. van de Sande, T. Gevers, A. W. M. Smeulders, Selective search for object recognition. In IJCV, vol.104, pp.154-171, 2013

Spatial Pyramid Pooling (SPP) net
● An efficient method to extract CNN scores from a large number of object regions of an image
– CNN layers shared among all regions
– SVMs computed for each region
– Selective Search is used for region proposals
[Figure: region proposals by Selective Search [2] and the shared CNN output feed, per region, an SPP layer, FC, ReLU, FC, ReLU, and an SVM that outputs a score]
K. He, X. Zhang, S. Ren, J. Sun, Spatial pyramid pooling in deep convolutional networks for visual recognition. In IEEE Transactions on Pattern Analysis and Machine Intelligence, pp.1904-1916, 2015
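The pooling step on this slide can be illustrated with a minimal sketch. It assumes a single-channel feature map stored as a nested list, a region given as a half-open box in feature-map coordinates, and max pooling over a two-level pyramid; the function name `spp_pool` and the toy data are hypothetical and not taken from the slides. The point it shows is that regions of different sizes yield vectors of the same length, which is what lets one shared CNN pass serve every region proposal.

```python
import math

def spp_pool(feature_map, region, levels=(1, 2)):
    """Max-pool one region of a single-channel feature map into a fixed-length vector."""
    x0, y0, x1, y1 = region  # half-open box: columns x0..x1-1, rows y0..y1-1
    w, h = x1 - x0, y1 - y0
    pooled = []
    for n in levels:  # an n x n grid of bins per pyramid level
        for by in range(n):
            # floor/ceil boundaries keep every bin non-empty for any region size >= 1
            ys = y0 + math.floor(by * h / n)
            ye = y0 + math.ceil((by + 1) * h / n)
            for bx in range(n):
                xs = x0 + math.floor(bx * w / n)
                xe = x0 + math.ceil((bx + 1) * w / n)
                pooled.append(max(feature_map[y][x]
                                  for y in range(ys, ye)
                                  for x in range(xs, xe)))
    return pooled

# Toy 4x4 feature map (hypothetical values)
fm = [[0, 1, 2, 3],
      [4, 5, 6, 7],
      [8, 9, 10, 11],
      [12, 13, 14, 15]]

whole = spp_pool(fm, (0, 0, 4, 4))   # [15, 5, 7, 13, 15]
part = spp_pool(fm, (1, 1, 4, 3))    # [11, 6, 7, 10, 11]
```

Both calls return five values (1 + 4 bins) even though the regions differ in size, so the fully connected layers that follow can take either as input.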

1. Spatio-Temporal Region Proposals (1)
● Selective Search with region proposals extended along the temporal dimension
– Produces temporally continuous regions
– The output contains a large number of meaningless regions
– Each video is split at each I-frame before segmentation, since computational time is limited
[Figure: image pixels become video voxels along the time axis; edges are weighted with similarity]
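The move from image pixels to video voxels can be sketched as a graph construction. This is an illustration only, under several assumptions: the video is a nested list of grayscale frames, each voxel is linked to its right, lower, and next-frame neighbor, and the edge weight is the absolute intensity difference (a dissimilarity, so a low weight means high similarity). The slides do not state which similarity measure or connectivity the authors used, and `voxel_edges` is a hypothetical name.

```python
def voxel_edges(video):
    """Return (weight, voxel_a, voxel_b) edges between spatially and temporally adjacent voxels."""
    T, H, W = len(video), len(video[0]), len(video[0][0])
    edges = []
    for t in range(T):
        for y in range(H):
            for x in range(W):
                # right neighbor, lower neighbor, same position in the next frame
                for dt, dy, dx in ((0, 0, 1), (0, 1, 0), (1, 0, 0)):
                    t2, y2, x2 = t + dt, y + dy, x + dx
                    if t2 < T and y2 < H and x2 < W:
                        weight = abs(video[t][y][x] - video[t2][y2][x2])
                        edges.append((weight, (t, y, x), (t2, y2, x2)))
    return edges

# Two hypothetical 2x2 grayscale frames
video = [[[10, 10], [10, 200]],
         [[10, 12], [10, 198]]]
edges = voxel_edges(video)
```

With a single frame the same function yields only the spatial edges of an ordinary image graph; the temporal edges are what allow a grouping step to merge voxels across frames into temporally continuous regions.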

1. Spatio-Temporal Region Proposals (2)
[Figure: nested regions (⊃) arranged by hierarchy level and by time]
Regions are hierarchical and temporally continuous

2. Multi-Frame Score Fusion (1)
● Basic idea
– Some frames contain noise or object deformation, which makes detection harder
– Results of ST-Region Proposals contain many meaningless region proposals
➔ Information from neighbor frames provides robustness
● Fuse feature maps among several frames
– This requires temporally continuous region proposals
– ST-Region Proposals are adopted for this reason

2. Multi-Frame Score Fusion (2)
– In experiments, we concluded that late fusion is the best
[Figure: I-frames and P-frames each pass through CNN, SPP, FC, ReLU, FC, ReLU, SVM, and Score; candidate fusion points are marked at the FC stage and at the SVM stage]
FC: Fully connected layer
SPP: Spatial Pyramid Pooling layer
ReLU: Rectified linear unit
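A minimal sketch of late fusion follows, assuming it means combining the per-frame SVM scores of one temporally continuous region into a single score. The slides do not say which operator is used, so the plain average here is an assumption, and the region names and score values are hypothetical.

```python
def late_fusion(per_frame_scores):
    """Fuse the SVM scores of one spatio-temporal region across frames by averaging (assumed operator)."""
    return sum(per_frame_scores) / len(per_frame_scores)

# Hypothetical per-frame SVM scores for two spatio-temporal regions
region_scores = {
    "region_a": [0.9, -0.2, 0.8, 0.7],   # one noisy frame among confident ones
    "region_b": [-0.5, 0.6, -0.4, -0.6],  # one spurious high score
}
fused = {name: late_fusion(scores) for name, scores in region_scores.items()}
```

In this toy case the single low frame of `region_a` and the single high frame of `region_b` are both outvoted by their neighbors, which is the robustness the basic idea appeals to. An earlier fusion point would instead combine feature vectors before the SVM is applied.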

3. Neighbor-Frame Score Boosting
● Basic idea
– Based on the same observation as the score fusion above
– Objects will appear in several consecutive frames
➔ Information from neighbor frames provides robustness
● Boost the scores of I-frames that lie between positives by increasing their scores by a constant
[Figure: a sequence of I-frames along the time axis; the I-frames between positives are marked "Boosted"]
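The boosting rule can be sketched as below. The slides do not define "between positives" precisely, so this version makes an assumption: an I-frame is boosted when both its previous and its next I-frame score at or above the detection threshold. The function name, threshold, and constant are hypothetical.

```python
def boost_between_positives(scores, threshold, boost):
    """Add `boost` to every I-frame whose previous and next I-frames are both positive."""
    boosted = list(scores)
    for i in range(1, len(scores) - 1):
        # Decisions use the original scores, so one boost cannot trigger another
        if scores[i - 1] >= threshold and scores[i + 1] >= threshold:
            boosted[i] = scores[i] + boost
    return boosted

# Hypothetical I-frame scores for one concept, in temporal order
iframe_scores = [0.0, 1.0, 0.25, 1.0, 0.0]
boosted_scores = boost_between_positives(iframe_scores, threshold=0.5, boost=0.5)
# boosted_scores == [0.0, 1.0, 0.75, 1.0, 0.0]
```

The middle frame moves from below the threshold to above it because its neighbors are confident detections, while the isolated low frames at both ends are left unchanged.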

Experiments – Manual Annotations
● Airplane, Boat_Ship, Bridges, Bus, Motorcycle, Telephones, Flags, Quadruped – provided
● Anchorperson – annotated 12k I-frames
● Computers – annotated 7k I-frames

Experiments – Training
● Deciding the threshold and the fusion method
– Used last year's dataset and concepts
– Train: IACC_2_A
– Val: IACC_2_B
● Submitted runs
– Train: IACC_2_A including additional annotations, IACC_2_B
– Test: IACC_2_C

Results
● Multi-Frame Score Fusion and Neighbor-Frame Score Boosting improved the score
● We achieved 3rd place among all teams in the harmonic mean of F-scores

Harmonic mean of F-scores
Run ID          Method                                             Val      Test
(Base)          Selective Search + SPPnet                          0.4481   0.5656
Multiple        + ST-Region Proposals, Multi-Frame Score Fusion    0.4518   0.5716
Multiple_Aug3   + Neighbor-Frame Score Boosting                    0.4569   0.5750
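For readers unfamiliar with the ranking measure, the sketch below shows how a harmonic mean of two F-scores behaves, assuming it combines the I-frame F-score and the mean pixel F-score reported in these results. The input values are hypothetical and are not the scores of any run in the table.

```python
def f_score(precision, recall):
    """F1 score: harmonic mean of precision and recall (0.0 when both are zero)."""
    total = precision + recall
    return 2 * precision * recall / total if total > 0 else 0.0

def harmonic_mean(a, b):
    """Harmonic mean of two non-negative scores (0.0 when both are zero)."""
    total = a + b
    return 2 * a * b / total if total > 0 else 0.0

# Hypothetical scores for illustration only
iframe_f = 0.6
pixel_f = 0.4
combined = harmonic_mean(iframe_f, pixel_f)  # about 0.48, below the arithmetic mean of 0.5
```

Because the harmonic mean is pulled toward the smaller value, a run cannot rank well on it by maximizing one of the two F-scores at the expense of the other.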

Results
[Figure: bar chart of I-frame F-score, mean pixel F-score, and harmonic mean; vertical axis from 0 to 0.8]

Results – Examples
● The system output is sometimes better than the ground truth (GT)
[Figure: system output shown next to the ground truth]

Results – Spatial Score
● We achieved 1st place in Mean Pixel F-score by limiting the number of positives to reduce false positives (FPs)
– Naturally, the I-frame F-score suffers as a result
[Figure: I-frame F-score and mean pixel F-score for the runs Multiple_Spat, Multiple_Aug3, Multiple, and Single2; vertical axis from 0.2 to 0.8]
● Mean Pixel F-score is calculated from true positive and false positive I-frames, which is not intuitive

Conclusion
● We developed a localization system using ST-Region Proposals and CNN with SPP-net
● Multi-Frame Score Fusion with ST-Region Proposals and Neighbor-Frame Score Boosting improved the score
● Problem: The detection results strongly depend on the quality of ST-Region Proposals
– Improve ST-Region Proposals quality
– Localization without region candidates
