Return of the Devil in the Details: Delving Deep into Convolutional - PowerPoint PPT Presentation

Return of the Devil in the Details: Delving Deep into Convolutional Nets
Ken Chatfield, Karen Simonyan, Andrea Vedaldi, and Andrew Zisserman
Visual Geometry Group, Department of Engineering Science, University of Oxford
Hilal E. Akyüz

Slide by Chatfield et al.

Slide by Chatfield et al.

What Has Changed Since 2011?
● Different deep architectures
● The latest generation of CNNs has achieved impressive results
● It is unclear how the recently introduced methods compare to each other and to shallow methods

Overview of the Paper
● The paper compares the latest methods (up to 2014) on common ground
● It examines several properties of CNN-based representations and data augmentation techniques
● It compares both different pre-trained network architectures and different learning heuristics

Dataset (pre-training)
● ILSVRC-2012
– Contains 1,000 object categories from ImageNet
– ~1.2M training images
– 50,000 validation images
– 100,000 test images
● Performance is evaluated using top-5 classification error
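As a minimal sketch of the top-5 classification error mentioned above: a prediction counts as correct when the true label is among the five highest-scoring classes. The function name `top5_error` and the array layout (one row of class scores per image) are assumptions made for illustration, not part of the slides.

```python
import numpy as np

def top5_error(scores, labels, k=5):
    """Fraction of samples whose true label is NOT among the k top-scoring classes.

    scores: (num_samples, num_classes) array of class scores (assumed layout).
    labels: (num_samples,) array of integer class indices.
    """
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels)
    # Indices of the k largest scores per row; order inside the top k is irrelevant.
    topk = np.argpartition(-scores, k - 1, axis=1)[:, :k]
    hit = (topk == labels[:, None]).any(axis=1)
    return 1.0 - hit.mean()
```

With 1,000 ILSVRC classes, `scores` would have 1,000 columns; the sketch only requires at least `k` columns.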

Datasets (training, fine-tuning)
● Pascal VOC 2007
– Multi-label dataset
– Contains ~10,000 images
– 20 object classes
– Images split into train, validation and test sets
● Pascal VOC 2012
– Multi-label dataset
– Contains roughly twice as many images
– Does not include a test set; instead, evaluation uses the official PASCAL Evaluation Server
● Performance is measured as mean Average Precision (mAP)
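The mAP measure can be sketched as follows. This assumes the 11-point interpolated average precision used by the VOC 2007 protocol (later VOC editions compute the area under the precision-recall curve instead); the function names and the one-column-per-class layout are hypothetical.

```python
import numpy as np

def average_precision_11pt(scores, is_positive):
    """11-point interpolated AP for one class (VOC 2007 style, assumed here)."""
    scores = np.asarray(scores, dtype=float)
    is_positive = np.asarray(is_positive, dtype=bool)
    order = np.argsort(-scores, kind="stable")      # rank images by descending score
    tp = is_positive[order]
    cum_tp = np.cumsum(tp)
    precision = cum_tp / np.arange(1, len(tp) + 1)
    recall = cum_tp / max(tp.sum(), 1)
    ap = 0.0
    for t in np.linspace(0.0, 1.0, 11):
        # Small tolerance guards against floating-point error in the recall thresholds.
        mask = recall >= t - 1e-9
        ap += precision[mask].max() if mask.any() else 0.0
    return ap / 11.0

def mean_average_precision(score_matrix, label_matrix):
    """Mean of per-class APs; columns are classes, rows are images."""
    score_matrix = np.asarray(score_matrix, dtype=float)
    label_matrix = np.asarray(label_matrix, dtype=bool)
    aps = [average_precision_11pt(score_matrix[:, c], label_matrix[:, c])
           for c in range(score_matrix.shape[1])]
    return float(np.mean(aps))
```

Because the datasets are multi-label, each class is scored as an independent ranking problem over all images, and the per-class APs are then averaged.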

Datasets (training, fine-tuning)
● Caltech-101
– 101 classes
– Three random splits
– 30 training and 30 testing images per class
● Caltech-256
– 256 classes
– Two random splits
– 60 training images per class; the rest are used for testing
● Performance is measured using mean class accuracy
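Mean class accuracy averages the per-class accuracies, so every class counts equally regardless of how many test images it has. A minimal sketch, with the function name `mean_class_accuracy` chosen for illustration:

```python
import numpy as np

def mean_class_accuracy(y_true, y_pred):
    """Average of per-class accuracies over the classes present in y_true."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    per_class = [np.mean(y_pred[y_true == c] == c) for c in np.unique(y_true)]
    return float(np.mean(per_class))
```

For example, with true labels `[0, 0, 0, 1]` and predictions `[0, 0, 1, 1]`, the plain accuracy is 3/4, while the mean class accuracy is (2/3 + 1) / 2 = 5/6.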

Outline
● 3 scenarios:
– Shallow representation
– Deep representation (CNN) with pre-training
– Deep representation (CNN) with pre-training and fine-tuning
● Different pre-trained networks
– CNN-S, CNN-M, CNN-F
● Scenario-specific best practices
– Reducing CNN final layer output dimensionality
● Generally-applicable best practices
– Data augmentation (for both CNN and IFV)
– Color information
– Feature normalisation (for both CNN and IFV)

Data Augmentation (slide by Chatfield et al.)

Slide by Chatfield et al.

Scenario 1: Shallow Representation (Improved Fisher Vector, IFV)
● IFV usually outperformed related encoding methods
● Power normalization for improved …

IFV Details
● Multi-scale dense sampling
● SIFT features
● Soft-quantized using a GMM with K=256 components
● Spatial pyramid (1x1, 3x1, 2x2)
● 3 modifications:
– Intra-normalisation
   ● L2 norm is applied to the sub-blocks
– Spatially-extended local descriptors
   ● More memory-efficient than SPM
– Color features
   ● Local Color Statistics
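The two normalisation steps named on the IFV slides can be sketched as below. This is an illustration under stated assumptions, not the authors' implementation: the exponent `alpha=0.5` (signed square root) is the commonly used power-normalisation setting, and the encoding is assumed to be laid out as `num_blocks` contiguous, equally sized sub-blocks. The function names are hypothetical.

```python
import numpy as np

def power_normalize(v, alpha=0.5):
    """Signed power normalisation: sign(v) * |v|**alpha (alpha=0.5 is an assumption)."""
    v = np.asarray(v, dtype=float)
    return np.sign(v) * np.abs(v) ** alpha

def intra_normalize(v, num_blocks, eps=1e-12):
    """L2-normalise each sub-block of the encoding independently.

    Assumes v is a concatenation of num_blocks equally sized, contiguous blocks.
    """
    v = np.asarray(v, dtype=float)
    blocks = v.reshape(num_blocks, -1)
    norms = np.linalg.norm(blocks, axis=1, keepdims=True)
    # eps avoids division by zero for all-zero blocks, which stay all-zero.
    return (blocks / np.maximum(norms, eps)).reshape(-1)
```

Normalising per block keeps a few very active blocks from dominating the whole vector, which is the motivation usually given for intra-normalisation.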

Scenario 2: Deep Representation (CNN) with Pre-training
● Pre-trained on ImageNet
● 3 different pre-trained networks

Pre-Trained Networks (slide by Chatfield et al.)

Scenario 3: Deep Representation (CNN) with Pre-training & Fine-tuning
● Pre-trained on one dataset and applied to another
● Improves performance
● The representation becomes dataset-specific

CNN Details
● All networks trained with the same training protocol and the same implementation
● Caffe framework
● L2 normalization of CNN features
– Applied before the features are passed to the SVM

CNN Training
● Gradient descent with momentum
– Momentum is 0.9
– Weight decay is 5x10^-4
– Learning rate is 10^-2, decreased by a factor of 10
● Data augmentation
– Random crops
– Flips
– RGB jittering
● 3 weeks with a Titan Black (Slow architecture)
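The hyperparameters above plug into the standard momentum update, sketched here in NumPy. The update form (weight decay added to the gradient, velocity carried between steps) is the usual textbook formulation and is assumed rather than taken from the slides; `sgd_momentum_step` and `step_lr` are hypothetical helper names.

```python
import numpy as np

def sgd_momentum_step(w, v, grad, lr=1e-2, momentum=0.9, weight_decay=5e-4):
    """One SGD step with momentum and L2 weight decay.

    v_new = momentum * v - lr * (grad + weight_decay * w)
    w_new = w + v_new
    """
    w = np.asarray(w, dtype=float)
    v = np.asarray(v, dtype=float)
    grad = np.asarray(grad, dtype=float)
    v_new = momentum * v - lr * (grad + weight_decay * w)
    return w + v_new, v_new

def step_lr(base_lr, num_drops, factor=10.0):
    """Learning rate after it has been divided by `factor` num_drops times."""
    return base_lr / factor ** num_drops
```

When to drop the learning rate is not stated on the slide, so `step_lr` leaves that decision to the caller.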

CNN Fine-tuning
● Only last layer
● For VOC: classification hinge loss (CNN-S TUNE-CLS) or ranking hinge loss (CNN-S TUNE-RNK)
● For Caltech-101: softmax regression loss
● Lower initial learning rate (VOC & Caltech)
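To make the difference between the two VOC losses concrete, here is a sketch for a single multi-label image. It assumes a one-vs-rest hinge loss for the classification variant and a pairwise hinge loss that pushes every positive class above every negative class for the ranking variant; the exact formulation, margin, and any normalisation used by the authors may differ, and the function names are illustrative.

```python
import numpy as np

def classification_hinge_loss(scores, labels):
    """One-vs-rest hinge loss; labels are +1 / -1 per class."""
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=float)
    return float(np.maximum(0.0, 1.0 - labels * scores).sum())

def ranking_hinge_loss(scores, labels):
    """Pairwise hinge loss: each positive class should outscore each negative by a margin of 1."""
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels)
    pos = scores[labels > 0]
    neg = scores[labels <= 0]
    margins = 1.0 - (pos[:, None] - neg[None, :])   # one entry per (positive, negative) pair
    return float(np.maximum(0.0, margins).sum())
```

The classification loss penalises each class score against a fixed threshold, whereas the ranking loss only cares about the relative order of positive and negative classes, which is closer in spirit to the mAP evaluation measure.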

Analysis

VOC 2007 Results (slide by Chatfield et al.)

Take Home Messages
● Data augmentation helps a lot, for both deep and shallow methods
● Fine-tuning makes a difference, and a ranking loss can be preferred
● Smaller filters and deeper networks help, although feature computation is slower
● CNN-based methods >> shallow methods
● We can transfer tricks from deep features to shallow features
● We can achieve very low-dimensional (~128D) but performant features with CNN-based methods
● If you get the details right, it's possible to get to state-of-the-art with very simple methods!

Thank You for Listening. Q&A? (DEMO)
Hilal E. Akyüz

DEMO
CNN Model    Pascal VOC 2007 mAP
CNN-S        76.10
CNN-M        76.11
AlexNet      71.40
GoogleNet    80.91
ResNet       83.06
VGG19        81.01

DEMO
Model        FPS (batch size = 1)
CNN-M        169
CNN-S        151
ResNet       11
GoogleNet    71
VGG19        50

Extras (slide by Chatfield et al.)
