Underwater sparse image classification using deep convolutional neural networks
Mohamed Elawady
Heriot-Watt University
VIBOT MSc 2014
26 Nov 2015, Deep Learning Workshop, Lyon
About Me!
2014 - Current: PhD in Image Processing (… University [FR]); thesis "… arts", supervised by Christophe Ducottet, Cecile Barat, Philippe Colantoni.
MSc in Computer Vision and Robotics (VIBOT, 2014) (University of Burgundy [FR], University of Girona [SP], Heriot-Watt University [UK]); thesis "Underwater sparse image classification using deep convolutional neural networks", supervised by Neil Robertson, David Lane.
BSc [Computer Science] (Faculty of Computers & Informatics, Suez Canal University [EG]).
Coralbots (with Neil Robertson, David Lane and David Corne): a robotics project to help save the world's coral reefs; this VIBOT MSc work (2013 - 2015) is part of it.
– http://www.coralbots.org/
– https://youtu.be/MJ-_d3HZOi4
– https://youtu.be/6q4UiuiqZuA
Coral consists of tiny animals (not plants), takes a long time to grow (0.5 - 2 cm per year), and exists in more than 200 countries; it generates 29.8 billion dollars per year through different ecosystem services. 10% of the world's coral reefs are dead, and more than 60% are at risk due to human-related activities. By 2050, all coral reefs will be in danger.
Diver-based strategy: coral gardening involving SCUBA divers in coral reef reassembly and transplantation. Examples: the Reefscapers Project (2001, Maldives) and Save Coral Reefs (2012, Thailand). Limitations: time and depth per dive session.
Robot-based strategy: deep-sea coral restoration through intelligent autonomous underwater vehicles (AUVs) that grasp cold-water coral samples and replant them in damaged reef areas.
Dense classification: millions of coral images and thousands of hours of underwater videos; annotating every pixel inside each coral image or video frame takes a massive number of hours.
Manual sparse classification: images are manually annotated by coral experts, who match some random uniform pixels to target classes; more than 400 hours are required to annotate 1000 images (200 labelled coral points per image).
Automatic sparse classification: a supervised learning algorithm annotates images autonomously; the input data are ROIs around the random points.
Moorea Labeled Corals (MLC): University of California, San Diego (UCSD); Island of Moorea in French Polynesia; ~2000 images (2008, 2009, 2010); 200 labelled points per image.
5 Coral Classes
4 Non-coral Classes
Atlantic Deep Sea (ADS): Heriot-Watt University (HWU); North Atlantic, west of Scotland and Ireland; ~160 images (2012); 200 labelled points per image.
5 Coral Classes
4 Non-coral Classes
Sparse (Point-Based) Classification
Shallow vs Deep Classification:
Traditional (shallow) architectures extract hand-designed key features based on human analysis of the input data. Modern (deep) architectures learn features across hidden layers, starting from low-level details and building up to high-level details.
Structure of Network Hidden Layers:
Convolution layers: trainable weights and biases, modelling independent relationships within local regions. Subsampling layers: pre-defined range measures for further, faster calculation.
LeNet (LeCun et al., 1998): the first back-propagation convolutional neural network (CNN) for handwritten digit recognition.
Buyssens (2012): cancer cell image classification. Krizhevsky (2013): large-scale visual recognition challenge 2012. Girshick (2013): PASCAL visual object detection. Syafeeza (2014): face recognition system. Pinheiro (2014): scene labelling.
Object detection system overview (Girshick): more than 10% better than the top contest performer.
Network pipeline: colour enhancement; 3 basic channels (RGB) plus extra channels (feature maps); finding suitable weights for the convolution kernels and additive biases; classification layer.
Three different-sized patches are selected around each annotated point (61x61, 121x121, 181x181). Scaling patches up to the size of the largest patch (181x181) blurs inter-shape coral details while keeping the coral's edges and corners. Scaling patches down to the size of the smallest patch (61x61) gives fast classification computation.
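The multi-scale patch extraction above can be sketched as follows (a minimal NumPy sketch; the function names, the edge padding, and the nearest-neighbour resampling are illustrative assumptions, not the thesis implementation):

```python
import numpy as np

def extract_patch(image, cy, cx, size):
    # Crop a size x size patch centred on (cy, cx); edge-pad so that
    # points near the border still yield full patches (padding choice
    # is an assumption).
    half = size // 2
    padded = np.pad(image, ((half, half), (half, half), (0, 0)), mode="edge")
    return padded[cy:cy + size, cx:cx + size]

def rescale_nearest(patch, out_size):
    # Nearest-neighbour rescale to out_size x out_size (a stand-in for
    # whatever interpolation the original pipeline used).
    idx = np.arange(out_size) * patch.shape[0] // out_size
    return patch[np.ix_(idx, idx)]

# Three scales around one labelled point, unified down to 61x61.
image = np.random.default_rng(0).random((480, 640, 3))
patches = [rescale_nearest(extract_patch(image, 200, 300, s), 61)
           for s in (61, 121, 181)]
```

The same helper can be called with out_size=181 for the up-scaling variant discussed in the results.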
Zero Component Analysis (ZCA) whitening makes the data less redundant by removing correlations between adjacent pixels. The Weber Local Descriptor (WLD) gives a robust edge representation of high-texture images that is resilient to noisy changes in scene illumination. Phase Congruency (PC) uses the Fourier transform to represent image features in a form that is high in information and low in redundancy.
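Of the three auxiliary channels, ZCA whitening is the simplest to sketch. A minimal version, assuming patches are flattened into rows of a matrix (the eps regulariser and the sizes are illustrative):

```python
import numpy as np

def zca_whiten(X, eps=1e-5):
    # ZCA whitening: decorrelate features while staying close to the
    # original pixel space (unlike PCA whitening, which rotates it).
    Xc = X - X.mean(axis=0)
    cov = Xc.T @ Xc / Xc.shape[0]
    U, S, _ = np.linalg.svd(cov)
    W = U @ np.diag(1.0 / np.sqrt(S + eps)) @ U.T  # symmetric ZCA transform
    return Xc @ W

# Flattened 8x8 gray-scale patches (illustrative sizes).
X = np.random.default_rng(0).random((100, 64))
Xw = zca_whiten(X)
```

After whitening, the sample covariance of `Xw` is approximately the identity, i.e. neighbouring-pixel correlations have been removed.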
Bazeille'06 addresses the difficulty of capturing good-quality underwater images under non-uniform lighting and underwater perturbation. Iqbal'07 corrects underwater lighting problems due to light absorption, vertical polarization, and sea structure. Beijbom'12 compensates for colour differences caused by underwater turbidity and illumination.
The network initializes the biases to zero, and the kernel weights from a uniform random distribution over the range [-sqrt(6 / ((Nin + Nout) * k^2)), +sqrt(6 / ((Nin + Nout) * k^2))], where Nin and Nout are the numbers of input and output maps for each hidden layer (e.g. the number of input maps for layer 1 is 1 for a gray-scale image or 3 for a colour image), and k is the size of the convolution kernel for each hidden layer.
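A sketch of this initialisation; the exact constant follows the Glorot-style rule used in Palm's toolbox (cited in the references), which is an assumption about the slide's elided formula:

```python
import numpy as np

def init_layer(n_in, n_out, k, rng=None):
    # Biases start at zero; kernel weights are uniform in [-r, r] with
    # r = sqrt(6 / ((n_in + n_out) * k * k)).
    rng = rng or np.random.default_rng(0)
    r = np.sqrt(6.0 / ((n_in + n_out) * k * k))
    weights = rng.uniform(-r, r, size=(n_out, n_in, k, k))
    biases = np.zeros(n_out)
    return weights, biases

# Layer 1 of a colour network: 3 input maps, 6 output maps, 5x5 kernels.
W, b = init_layer(n_in=3, n_out=6, k=5)
```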
The convolution layer constructs output maps by convolving trainable kernels over the input maps to extract and combine features, using the following equation:

x_j^l = f( sum_i ( x_i^{l-1} * k_ij^l ) + b_j^l )

where x_i^{l-1} and x_j^l are output maps of the previous (l-1) and current (l) layers with convolution kernel numbers (input i and output j), k_ij^l is the trainable kernel weight, f(.) is the sigmoid activation function applied to the summed maps, and b_j^l is the additive bias of the current layer l for output kernel number j.
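The equation above maps directly onto code. A naive (slow) NumPy sketch with a 'valid' convolution and sigmoid activation:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def conv_layer(inputs, kernels, biases):
    # x_j = f(sum_i x_i * k_ij + b_j): each output map j sums 'valid'
    # convolutions of every input map i with its kernel k_ij, plus bias b_j.
    k = kernels[0][0].shape[0]
    out_h = inputs[0].shape[0] - k + 1
    out_w = inputs[0].shape[1] - k + 1
    outputs = []
    for row, b in zip(kernels, biases):
        acc = np.full((out_h, out_w), b, dtype=float)
        for x, ker in zip(inputs, row):
            flipped = ker[::-1, ::-1]  # true convolution flips the kernel
            for u in range(out_h):
                for v in range(out_w):
                    acc[u, v] += np.sum(x[u:u + k, v:v + k] * flipped)
        outputs.append(sigmoid(acc))
    return outputs

# One 5x5 input map, one output map, 3x3 all-ones kernel, zero bias.
out = conv_layer([np.ones((5, 5))], [[np.ones((3, 3))]], [0.0])
```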
The down-sampling layer performs dimensionality reduction on the feature maps through the network's layers, starting from the input image and ending with a sufficiently small feature representation, which makes the network's matrix computation fast. It uses the following equation:

y_j^l = h_n( w * x_j^l )

where h_n is a non-overlapping averaging function of size n x n with neighbourhood weight w, applied to the convolved map x of kernel number j at layer l to obtain the lower-dimensional output map y of kernel number j at layer l (e.g. a 64x64 input map is reduced with n=2 to 32x32).
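A minimal sketch of the n x n average pooling described above (the scalar weight w multiplies each block mean; parameter names are illustrative):

```python
import numpy as np

def downsample(x, n=2, w=1.0):
    # Non-overlapping n x n average pooling scaled by a weight w:
    # y = w * mean of each n x n block of x.
    H, W = x.shape
    blocks = x[:H - H % n, :W - W % n].reshape(H // n, n, W // n, n)
    return w * blocks.mean(axis=(1, 3))

# A 4x4 map reduces to 2x2 with n=2, as a 64x64 map would reduce to 32x32.
y = downsample(np.arange(16, dtype=float).reshape(4, 4), n=2)
```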
An adaptive learning rate is used rather than a constant one, adjusting to the network's status and performance: α_n = g(…), where α_n and α_{n-1} are the learning rates of the current and previous iterations (for the first iteration, the previous learning rate is the initial learning rate given as network input), n and N are the current iteration number and the total number of iterations, e_n is the back-propagated error of the current iteration, and g(.) is a linear limitation function that keeps the learning rate in the range (0, 1].
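The slide's exact update formula is not recoverable from this transcript; the sketch below is only a hypothetical schedule built from the same ingredients (previous rate, progress n/N, current error e_n, and a clamp g into (0, 1]):

```python
def g(alpha, lo=1e-6, hi=1.0):
    # Linear limitation: clamp the learning rate into (0, 1].
    return min(max(alpha, lo), hi)

def next_rate(alpha_prev, n, N, err):
    # Hypothetical adaptive update (NOT the thesis formula): decay with
    # training progress n/N, scaled up while the back-propagated error
    # is still large, then clamped by g.
    return g(alpha_prev * (1.0 - n / (N + 1.0)) * (1.0 + err))

alpha = next_rate(alpha_prev=1.0, n=5, N=10, err=0.3)
```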
The network is trained by back-propagation with a squared-error loss function:

E = 1/(2N) * sum_{n=1..N} sum_{c=1..C} (t_c^n - y_c^n)^2

where N and C are the numbers of training samples and output classes, and t and y are the target and actual outputs.
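As code, using the common 1/(2N) scaling (the scaling constant is an assumption about the slide's elided equation):

```python
import numpy as np

def squared_error_loss(t, y):
    # E = 1/(2N) * sum over N samples and C classes of (t - y)^2.
    return np.sum((t - y) ** 2) / (2.0 * t.shape[0])

# Two samples, two classes: one-hot targets vs network outputs.
t = np.array([[1.0, 0.0], [0.0, 1.0]])
y = np.array([[0.8, 0.2], [0.4, 0.6]])
E = squared_error_loss(t, y)  # (0.04 + 0.04 + 0.16 + 0.16) / 4 = 0.1
```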
Ratio of training/test sets: 2:1
Size of hybrid input image: (61 x 61), (121 x 121), (181 x 181)
Number of input channels: 3 (RGB), 4 (+ WLD, PC, or ZCA), 6 (+ WLD + PC + ZCA)
Number of samples per class: 300
Enhancement for RGB input: Bazeille'06, Iqbal'07, Beijbom'12, NoEnhance
Normalization method: min-max
Initial learning rate: 1
Network batch size: 3
Number of network epochs: 10
Number of hidden output maps: (6-12), (12-24), (24-48)
Size of last hidden output maps: 4 x 4
Number of output classes: 9
Results (MLC vs ADS):
Unified-scaling multi-size image patches have lower error rates than single-size image patches. Up-scaling the multi-size patches gives the best comparison results across different measurements. Hybrid down-scaling (61) is finally selected for fast computation.
Results (MLC vs ADS):
The combination of the three feature-based maps gives slightly better classification results than the basic colour channels alone. In conclusion, additional feature-based channels besides the basic colour channels can be useful for coral discrimination in both datasets (MLC, ADS)!
Results (MLC vs ADS):
Bazeille'06 is the best colour enhancement algorithm among those tested (Iqbal'07, Beijbom'12); however, raw image data without any enhancement is the best pre-processing choice for network classification.
Results (MLC vs ADS):
An excessive number (24-48) of hidden output maps does not improve the classification output; (6-12) and (12-24) have similar classification rates!
Size of hybrid input image: (61 x 61), (121 x 121), (181 x 181)
Number of input channels: 3 (RGB), 4 (+ WLD, PC, or ZCA), 6 (+ WLD + PC + ZCA)
Enhancement for RGB input: Bazeille'06, Iqbal'07, Beijbom'12, NoEnhance
Number of hidden output maps: (6-12), (12-24), (24-48)
Number of network epochs: 50
Results (MLC vs ADS):
In the MLC dataset, the testing phase has almost the same results, and the training phase has better results, with (12-24) hidden output maps and additional feature-based maps as supplementary channels. In the ADS dataset, the testing phase has the most significant accuracy results with the same selected configuration.
Results (MLC vs ADS):
In the MLC dataset, the best classification is for Acrop (coral) and Sand (non-coral), and the lowest for Pavon (coral) and Turf (non-coral). Pavon is misclassified as Monti/Macro, and Turf as Macro/CCA/Sand, due to similarity in their shape properties or growth environment. In the ADS dataset, DRK (non-coral) is classified perfectly due to its distinct nature (an almost plain dark-blue image), and LEIO (coral) is classified excellently due to its distinctive colour property (orange).
Overall accuracy: 56% (MLC), 81% (ADS).
– … processing.
– … cold-water coral reefs.
– … images, manipulating point-based multi-channel input data.
– … detection.
– … environments.
– Dealing with N-dimensional data --> split it into two deep models (Basic, Extra) and then fuse them.
– Dealing with large-size images with a high-texture background --> windowing + CNN.
– Add scheduled noise at certain intervals to avoid local minima.
– Try data augmentation (e.g. rotation, scaling, …) and use a small initial learning rate for better results.
– LAB + Max Wavelet Response + SVM (Beijbom, UCSD): 73.3%
– Complex Algorithm (Shihavuddin, Girona): 85.5%
– CNN (Caffe) (Elawady, HWU): 70%
– DBN (VisualRBM) (Elawady, HWU): 20%
– Scattering Transform + SVM (LibSVM) (Elawady, HWU): 61%
References:
– A. Shihavuddin, N. Gracias, R. Garcia, A. Gleason, and B. Gintert, "Image-Based Coral Reef Classification and Thematic Mapping," Remote Sensing, vol. 5, 2013.
– O. Beijbom, P. J. Edmunds, D. I. Kline, B. G. Mitchell, and D. Kriegman, "Automated annotation of coral reef survey images," 2012 IEEE CVPR, pp. 1170–1177, 2012.
– Y. LeCun, L. Bottou, G. B. Orr, and K.-R. Müller, "Efficient BackProp," in Neural Networks: Tricks of the Trade, pp. 9–48, Springer, 2012.
– R. B. Palm, "Prediction as a candidate for learning deep hierarchical models of data," MSc thesis, Technical University of Denmark, 2012.
– Y. LeCun, L. Bottou, Y. Bengio, and P. Haffner, "Gradient-based learning applied to document recognition," Proceedings of the IEEE, vol. 86, pp. 2278–2324, 1998.