Very Deep ConvNets for Large-Scale Image Recognition
Karen Simonyan, Andrew Zisserman
Visual Geometry Group, University of Oxford
ILSVRC Workshop, 12 September 2014
Summary of VGG Submission
Localisation task: 1st place, 25.3% error
image → conv-64 → conv-64 → maxpool → conv-128 → conv-128 → maxpool → conv-256 → conv-256 → maxpool → conv-512 → conv-512 → maxpool → conv-512 → conv-512 → maxpool → FC-4096 → FC-4096 → FC-1000 → softmax
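As a rough check on the layer list from this slide, read in input-to-output order, the sketch below counts the weight layers and follows the spatial size of the feature map. The slide gives no kernel sizes, strides or padding here, so the code assumes that every conv layer preserves spatial size and every maxpool halves it. The 224 input size is taken from the crop size mentioned elsewhere in the deck, and `summarize` is a hypothetical helper name.

```python
# Layer names as they appear on the slide, ordered from input to output.
LAYERS = [
    "conv-64", "conv-64", "maxpool",
    "conv-128", "conv-128", "maxpool",
    "conv-256", "conv-256", "maxpool",
    "conv-512", "conv-512", "maxpool",
    "conv-512", "conv-512", "maxpool",
    "FC-4096", "FC-4096", "FC-1000",
]


def summarize(layers, input_size=224):
    """Return (weight_layers, final_spatial_size, final_conv_channels).

    Assumptions not stated on the slide: conv layers keep the spatial
    size unchanged, and each maxpool halves it.
    """
    size = input_size
    weight_layers = 0
    conv_channels = 3  # RGB input
    for name in layers:
        if name == "maxpool":
            size //= 2
        elif name.startswith("conv-"):
            conv_channels = int(name.split("-")[1])
            weight_layers += 1
        elif name.startswith("FC-"):
            weight_layers += 1
    return weight_layers, size, conv_channels


if __name__ == "__main__":
    print(summarize(LAYERS))
```

Under these assumptions the list contains 13 weight layers (10 conv and 3 FC), which matches the "13 layers" configuration named in the results charts.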
[Figure: 1st 3x3 conv. layer and 2nd 3x3 conv. layer stacked, covering a 5x5 input region]
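The 5x5 figure for two stacked 3x3 conv. layers can be reproduced with a short receptive-field calculation. This is a minimal sketch that assumes stride 1 by default; `stacked_receptive_field` is a hypothetical helper name, not something from the slides.

```python
def stacked_receptive_field(num_layers, kernel=3, stride=1):
    """Side length of the input region seen by one output unit
    after `num_layers` identical conv layers."""
    rf = 1    # a single output unit covers one position before any layer
    jump = 1  # distance in input pixels between adjacent units
    for _ in range(num_layers):
        rf += (kernel - 1) * jump
        jump *= stride
    return rf


if __name__ == "__main__":
    for n in (1, 2, 3):
        side = stacked_receptive_field(n)
        print(f"{n} stacked 3x3 layer(s): {side}x{side}")
```

With stride 1, each additional 3x3 layer widens the receptive field by 2, so two layers cover 5x5 and three cover 7x7.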
(scale jittering)
[Figure: training images rescaled to smallest side 256 (other side N≥256) or 384 (other side N≥384), with 224x224 crops]
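Scale jittering with a 224x224 crop might be sketched as follows. The [256;512] range comes from the results charts in this deck. Sampling the scale and the crop position uniformly is an assumption, and `jittered_crop_box` is a hypothetical helper that only computes sizes and coordinates, with no image library involved.

```python
import random

CROP = 224  # crop side taken from the slide


def jittered_crop_box(width, height, s_min=256, s_max=512, rng=random):
    """Pick a training scale S, rescale so the smallest side equals S,
    and choose a CROP x CROP window inside the rescaled image.

    Returns (new_width, new_height, (left, top, right, bottom)).
    Uniform sampling of S and of the crop position is an assumption.
    """
    s = rng.randint(s_min, s_max)  # smallest side after rescaling
    scale = s / min(width, height)
    new_w = max(s, round(width * scale))
    new_h = max(s, round(height * scale))
    left = rng.randint(0, new_w - CROP)
    top = rng.randint(0, new_h - CROP)
    return new_w, new_h, (left, top, left + CROP, top + CROP)


if __name__ == "__main__":
    print(jittered_crop_box(500, 375, rng=random.Random(0)))
```

Setting `s_min` equal to `s_max` gives the fixed-scale cases (256 or 384) shown in the figure.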
to multiple crops
[Figure: image → conv. layers → class score map → pooling → class scores]
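The final pooling step, which turns a class score map into one vector of class scores, can be illustrated with a small sketch. The slide only says "pooling", so averaging over spatial positions is an assumption here, and `pool_score_map` is a hypothetical helper working on nested lists rather than real network outputs.

```python
def pool_score_map(score_map):
    """Average a class score map over all spatial positions.

    score_map[y][x] is a list of per-class scores at one location.
    Returns a list with one pooled score per class.
    Average pooling is an assumption; the slide does not name the type.
    """
    positions = [cell for row in score_map for cell in row]
    num_classes = len(positions[0])
    return [
        sum(cell[c] for cell in positions) / len(positions)
        for c in range(num_classes)
    ]


if __name__ == "__main__":
    # A toy 2x2 score map with two classes.
    toy_map = [
        [[1.0, 0.0], [0.0, 1.0]],
        [[1.0, 0.0], [1.0, 0.0]],
    ]
    print(pool_score_map(toy_map))
```

Because the pooled output has a fixed length regardless of the map's spatial extent, the same step works for input images of any size.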
image batch
[Figure: Top-5 Classification Error (Val. Set) by training image smallest side (256, 384, [256;512]) for nets with 13, 16 and 19 layers; lower is better. Bar values in extraction order: 9.4, 8.8, 9, 9.3, 8.7, 8.7, 8.2, 7.6, 7.5]
[Figure: Top-5 Classification Error (Test Set), multiple nets vs. single net; lower is better. Bar values in extraction order: 7.3, 7, 6.7, 8.1, 11.7, 8.4, 7.3, 7.9, 9.1, 12.5]
1. Localisation ConvNet predicts a set of bounding boxes
2. Bounding boxes are merged
3. Resulting boxes are scored by a classification ConvNet
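Steps 2 and 3 of the pipeline above might look roughly like the sketch below. The slide does not say how boxes are merged, so the greedy rule used here (averaging any pair whose intersection-over-union exceeds a threshold) is an assumption, as is the 0.5 threshold. The two ConvNets are not modelled: `boxes` stands in for the localisation output and `score_fn` is a hypothetical stand-in for the classification ConvNet.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def merge_boxes(boxes, threshold=0.5):
    """Greedily replace overlapping pairs with their coordinate-wise mean.

    This merge rule is an assumption, not the method from the slides.
    """
    boxes = [tuple(float(v) for v in b) for b in boxes]
    merged = True
    while merged:
        merged = False
        for i in range(len(boxes)):
            for j in range(i + 1, len(boxes)):
                if iou(boxes[i], boxes[j]) > threshold:
                    avg = tuple((p + q) / 2 for p, q in zip(boxes[i], boxes[j]))
                    rest = [b for k, b in enumerate(boxes) if k not in (i, j)]
                    boxes = rest + [avg]
                    merged = True
                    break
            if merged:
                break
    return boxes


def localise(boxes, score_fn, threshold=0.5):
    """Merge candidate boxes, then rank them by score (highest first)."""
    candidates = merge_boxes(boxes, threshold)
    return sorted(((score_fn(b), b) for b in candidates), reverse=True)


if __name__ == "__main__":
    predicted = [(0, 0, 10, 10), (1, 1, 11, 11), (50, 50, 60, 60)]
    # Toy scoring function: prefers boxes closer to the left edge.
    print(localise(predicted, score_fn=lambda b: -b[0]))
```

Each merge reduces the number of boxes by one, so the loop always terminates.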
[Figure: 224x224 crop and bbox]
[Figure: Top-5 Localisation Error (Test Set); lower is better. Bar values in extraction order: 25.3, 26.4, 29.9, 31.9]
[Figure: VGG Team ILSVRC Progress: 27 (2012), 15.2 (2013), 7 (2014)]
We gratefully acknowledge the support of NVIDIA Corporation with the donation of the GPUs used for this research.