HUMIES @ GECCO 2018
DENSER: Deep Evolutionary Network Structured Representation
Filipe Assunção, Nuno Lourenço, Penousal Machado and Bernardete Ribeiro
University of Coimbra, Coimbra, Portugal
{fga, naml, machado, bribeiro}@dei.uc.pt
automated deep neural network design
convolutional neural network
feature extraction / representation learning, followed by classification
denser
ANN structure
<features> ::= <convolution> | <pooling>
<convolution> ::= layer:conv [num-filters,int,1,32,256] [filter-shape,int,1,1,5] [stride,int,1,1,3] <padding> <activation> <bias> <batch-normalisation> <merge-input>
<batch-normalisation> ::= batch-normalisation:True | batch-normalisation:False
<merge-input> ::= merge-input:True | merge-input:False
<pooling> ::= <pool-type> [kernel-size,int,1,1,5] [stride,int,1,1,3] <padding>
<pool-type> ::= layer:pool-avg | layer:pool-max
<padding> ::= padding:same | padding:valid
<classification> ::= <fully-connected>
<fully-connected> ::= layer:fc <activation> [num-units,int,1,128,2048] <bias>
<activation> ::= act:linear | act:relu | act:sigmoid
<bias> ::= bias:True | bias:False
<softmax> ::= layer:fc act:softmax num-units:10 bias:True
<learning> ::= learning:gradient-descent [lr,float,1,0.0001,0.1]
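To make the grammar more concrete, here is a minimal Python sketch of how a fragment of it could be stored and randomly expanded into a layer description. It is not the authors' implementation: the dictionary layout and the names `GRAMMAR`, `sample_block` and `expand` are illustrative assumptions, and the reading of a block as `[name,type,number of values,min,max]` is inferred from the productions above.

```python
import random

# Fragment of the grammar above (batch-normalisation and merge-input are
# left out for brevity). Each non-terminal maps to a list of alternatives;
# each alternative is a list of symbols.
GRAMMAR = {
    "<features>": [["<convolution>"], ["<pooling>"]],
    "<convolution>": [["layer:conv", "[num-filters,int,1,32,256]",
                       "[filter-shape,int,1,1,5]", "[stride,int,1,1,3]",
                       "<padding>", "<activation>", "<bias>"]],
    "<pooling>": [["<pool-type>", "[kernel-size,int,1,1,5]",
                   "[stride,int,1,1,3]", "<padding>"]],
    "<pool-type>": [["layer:pool-avg"], ["layer:pool-max"]],
    "<padding>": [["padding:same"], ["padding:valid"]],
    "<activation>": [["act:linear"], ["act:relu"], ["act:sigmoid"]],
    "<bias>": [["bias:True"], ["bias:False"]],
}


def sample_block(block, rng):
    """Sample a block assumed to mean [name,type,num-values,min,max]."""
    name, kind, count, low, high = block.strip("[]").split(",")
    if kind == "int":
        values = [rng.randint(int(low), int(high)) for _ in range(int(count))]
    else:
        values = [rng.uniform(float(low), float(high)) for _ in range(int(count))]
    return name, values[0] if len(values) == 1 else values


def expand(symbol, rng):
    """Recursively expand a symbol into a flat dict of layer settings."""
    if symbol.startswith("["):            # numeric parameter block
        name, value = sample_block(symbol, rng)
        return {name: value}
    if symbol not in GRAMMAR:             # terminal such as padding:same
        key, value = symbol.split(":")
        return {key: value}
    layer = {}
    for child in rng.choice(GRAMMAR[symbol]):   # pick one alternative
        layer.update(expand(child, rng))
    return layer


rng = random.Random(0)
layer = expand("<features>", rng)
print(layer)
```

Each call to `expand("<features>", rng)` yields either a convolution or a pooling layer whose numeric settings fall inside the ranges given in the grammar.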
layers: convolution (layer:conv), pooling (layer:pool-avg, layer:pool-max) and fully-connected (layer:fc)
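closed-choice parameters (a fixed set of alternatives, e.g. padding:same | padding:valid)
real-valued parameters (numeric blocks with a range, e.g. [lr,float,1,0.0001,0.1])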
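The two parameter kinds can be told apart mechanically. The sketch below is only an illustration: the `Block` type, the regular expression and the helper names are hypothetical, and the field order (name, type, number of values, minimum, maximum) is an assumption that happens to be consistent with every block in the grammar.

```python
import re
from typing import List, NamedTuple


class Block(NamedTuple):
    name: str
    kind: str      # "int" or "float"
    count: int     # assumed: how many values the block holds
    low: float     # assumed: lower bound
    high: float    # assumed: upper bound


# Matches blocks such as [lr,float,1,0.0001,0.1]
BLOCK_RE = re.compile(r"\[([\w-]+),(int|float),(\d+),([\d.]+),([\d.]+)\]")


def real_valued_blocks(production: str) -> List[Block]:
    """Extract every numeric block from the right-hand side of a production."""
    return [Block(n, k, int(c), float(lo), float(hi))
            for n, k, c, lo, hi in BLOCK_RE.findall(production)]


def closed_choices(production: str) -> List[str]:
    """Split a right-hand side made only of terminals into its alternatives."""
    return [alt.strip() for alt in production.split("|")]


learning = "learning:gradient-descent [lr,float,1,0.0001,0.1]"
padding = "padding:same | padding:valid"

print(real_valued_blocks(learning))
print(closed_choices(padding))
```

A closed-choice parameter is then fully described by the index of the selected alternative, while a real-valued one also needs the sampled number to be stored.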
example of a candidate solution
GA level (sequence of layers):
<features> <features> <features> <classification> <softmax> <learning>

DSGE level (expansion of one <features> unit):
<features>      [{DSGE: 1, {}}]
<pooling>       [{DSGE: 0, {kernel-size: 4, stride: 2}}]
<pooling-type>  [{DSGE: 1, {}}]
<padding>       [{DSGE: 0, {}}]

Resulting layer:
Layer type: pooling
Pooling func.: max
Kernel size: 4 x 4
Stride: 2 x 2
Padding: same
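The mapping from the DSGE entries to the resulting pooling layer can be sketched in a few lines of Python. This is a simplified illustration rather than the authors' decoder. It assumes that each DSGE value is the zero-based index of the chosen alternative in the grammar. It also assumes the pairing of non-terminals and entries implied by the resulting layer (pooling, max, kernel size 4, stride 2, padding same), and it keeps a single entry per non-terminal. The names `GRAMMAR`, `GENOTYPE` and `decode` are hypothetical.

```python
# Productions needed for the example; alternatives are listed in the
# order in which they appear in the grammar, indexed from 0.
GRAMMAR = {
    "<features>": ["<convolution>", "<pooling>"],
    "<pooling>": ["<pooling-type> kernel-size stride <padding>"],
    "<pooling-type>": ["layer:pool-avg", "layer:pool-max"],
    "<padding>": ["padding:same", "padding:valid"],
}

# Genotype of one <features> unit: for each non-terminal, the index of the
# chosen alternative plus any numeric parameters (pairing is an assumption).
GENOTYPE = {
    "<features>": {"DSGE": 1, "params": {}},
    "<pooling>": {"DSGE": 0, "params": {"kernel-size": 4, "stride": 2}},
    "<pooling-type>": {"DSGE": 1, "params": {}},
    "<padding>": {"DSGE": 0, "params": {}},
}


def decode(symbol, genotype, grammar):
    """Expand a non-terminal using the indices stored in the genotype."""
    gene = genotype[symbol]
    chosen = grammar[symbol][gene["DSGE"]]
    layer = {}
    for token in chosen.split():
        if token in grammar:              # non-terminal: recurse
            layer.update(decode(token, genotype, grammar))
        elif ":" in token:                # terminal such as padding:same
            key, value = token.split(":")
            layer[key] = value
        else:                             # numeric parameter stored in the gene
            layer[token] = gene["params"][token]
    return layer


print(decode("<features>", GENOTYPE, GRAMMAR))
```

Under these assumptions the decoded dictionary describes a max-pooling layer with kernel size 4, stride 2 and `same` padding, matching the layer shown on the slide.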
hinton
denser benchmarking
denser vs. other automatic design methods
CIFAR-10, accuracy (%):
CoDeepNEAT: 92.7
CGP-CNN (ConvSet): 93.25
Fractional Max-Pooling: 93.63
CGP-CNN (ResSet): 94.02
DENSER: 94.13
denser vs. human-designed networks
CIFAR-10, accuracy (%):
VGG: 92.26
ResNet: 93.39
Human Performance: 94
DENSER: 94.13
DenseNet: 94.76
MNIST, accuracy (%):
ResNet: 99.68
Fractional Max-Pooling: 99.68
VGG: 99.68
DENSER: 99.7
Fashion-MNIST, accuracy (%):
Human Performance: 83.5
VGG: 93.5
DENSER: 94.7
ResNet: 94.9
DenseNet: 95.4
CIFAR-100, accuracy (%):
ResNet: 71.14
VGG: 71.95
Fractional Max-Pooling: 73.61
DenseNet: 75.58
DENSER: 77.51
robustness, generalisation, scalability
DENSER accuracy (%) per dataset:
CIFAR-10: 94.13
MNIST: 99.7
Fashion-MNIST: 94.7
CIFAR-100: 77.51
why the best entry?
search of Deep Artificial Neural Networks (DANNs);
DENSER can effectively discover (and even surpass)
generalisable, and scalable;
Figure (network topology). Node labels: Conv:165:5:1:valid:norm:bias; Input; Merge; Activation: ReLU; Conv:250:5:1:same:none:none; Merge; Activation: Linear; MaxPool:5:1:valid; Conv:165:5:1:same:norm:bias; Merge; Activation: ReLU; Conv:218:5:3:same:norm:bias; Activation: Linear; Conv:165:5:1:same:norm:bias; Merge; Activation: ReLU; Conv:157:4:2:same:none:bias; Merge; Activation: Linear; MaxPool:5:2:same; FC:1948:bias; Activation: ReLU; Activation: Sigmoid; FC:10:bias; Activation: Softmax; MaxPool:3:2:same; MaxPool:3:2:same; MaxPool:4:3:same; MaxPool:3:2:same; MaxPool:2:1:same; FC:495:bias; Argmax; Output
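The node labels in this figure appear to follow the parameters of the grammar. The Python sketch below parses them under an assumed field order that is not stated on the slide and is only inferred from the grammar ranges: `Conv:num-filters:filter-shape:stride:padding:batch-normalisation:bias`, `MaxPool:kernel-size:stride:padding` and `FC:num-units:bias`. The function name `parse_node` is hypothetical.

```python
def parse_node(label):
    """Parse a layer label from the figure (field order is an assumption)."""
    kind, *fields = label.split(":")
    if kind == "Conv":
        filters, shape, stride, padding, norm, bias = fields
        return {
            "layer": "conv",
            "num-filters": int(filters),
            "filter-shape": int(shape),
            "stride": int(stride),
            "padding": padding,
            "batch-normalisation": norm == "norm",   # "none" means disabled
            "bias": bias == "bias",                  # "none" means disabled
        }
    if kind == "MaxPool":
        kernel, stride, padding = fields
        return {
            "layer": "pool-max",
            "kernel-size": int(kernel),
            "stride": int(stride),
            "padding": padding,
        }
    if kind == "FC":
        units, bias = fields
        return {"layer": "fc", "num-units": int(units), "bias": bias == "bias"}
    raise ValueError(f"unknown layer label: {label}")


for label in ["Conv:165:5:1:valid:norm:bias", "MaxPool:5:2:same", "FC:1948:bias"]:
    print(parse_node(label))
```

Read this way, every value in the figure falls inside the corresponding range of the grammar (for example 165 filters within 32 to 256, and 1948 units within 128 to 2048), which supports the assumed field order without confirming it.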