Adaptive Neural Trees
Ryutaro Tanno, Kai Arulkumaran, Daniel C. Alexander, Antonio Criminisi, Aditya Nori
Two Paradigms of Machine Learning

Deep Neural Networks: "hierarchical representation of data"
Decision Trees: "hierarchical clustering of data"

Example (DT): super-resolution of dMR brain images with a DT [Alexander et al., NeuroImage 2017]
[Figure: a decision tree clusters brain-image voxels into water, grey matter and white matter]
Example (CNN): ImageNet classifiers with CNNs [Zeiler and Fergus, ECCV 2014]
[Figure: a CNN pipeline from low-level features (oriented edges & colours) through mid-level features (textures & patterns) to high-level features (object parts), feeding a trainable classifier]
Deep Neural Networks: "hierarchical representation of data"
+ learn features of data
+ scalable learning with stochastic optimisation
− use every parameter of the model for each input

Decision Trees: "hierarchical clustering of data"
+ architectures are learned from data
+ lightweight inference, activating only a fraction of the parameters for each input
Adaptive Neural Trees

ANTs unify the two paradigms and generalise previous work:
+ learn features of data
+ scalable learning with stochastic optimisation
+ architectures are learned from data
+ lightweight inference, activating only a fraction of the parameters for each input
Two key ingredients:
(1) DTs that use NNs in every path and routing decision.
(2) DT-like architecture growth using SGD: at each target node, either (a) split the node or (b) deepen the edge.
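Ingredient (1) can be illustrated with a toy differentiable tree: each internal node has a small NN "router" that outputs a soft branching probability, and each leaf has a small "solver" that predicts class probabilities. This is a minimal sketch, not the authors' implementation; all names (`Router`, `Leaf`, `ANTNode`) are illustrative, and the per-edge "transformer" modules of the full ANT are omitted for brevity.

```python
# Toy sketch of a tree whose routing decisions and leaf predictors are
# neural modules, so the whole tree is differentiable end-to-end.
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

class Router:
    """One-layer NN producing the probability of taking the left branch."""
    def __init__(self, dim):
        self.w = rng.normal(size=dim)
        self.b = 0.0
    def __call__(self, x):
        return sigmoid(self.w @ x + self.b)

class Leaf:
    """Linear 'solver' producing a class distribution at a leaf."""
    def __init__(self, dim, n_classes):
        self.W = rng.normal(size=(n_classes, dim))
    def predict(self, x):
        logits = self.W @ x
        e = np.exp(logits - logits.max())   # stable softmax
        return e / e.sum()

class ANTNode:
    """Internal node: mixes its children's predictions by the router output."""
    def __init__(self, router, left, right):
        self.router, self.left, self.right = router, left, right
    def predict(self, x):
        p = self.router(x)                  # soft routing probability
        return p * self.left.predict(x) + (1 - p) * self.right.predict(x)

dim, n_classes = 4, 3
tree = ANTNode(Router(dim), Leaf(dim, n_classes), Leaf(dim, n_classes))
probs = tree.predict(rng.normal(size=dim))  # convex mixture of softmaxes
print(probs)
```

Because the prediction is a smooth function of the router and solver weights, the whole tree can be trained with stochastic gradient descent, which is what makes ingredient (2)'s grow-and-train loop possible.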
[Figure: test errors vs. number of parameters for three variants, ANT 1–3, on MNIST (%), CIFAR10 (%) and SARCOS (mse), comparing multi-path and single-path inference. Model size drops under single-path inference.]
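The two inference modes compared in the plots can be sketched as follows, assuming a tree with sigmoid NN routers (an illustrative toy, not the authors' code): multi-path inference averages the leaf predictions weighted by the full path probabilities, while single-path inference follows only the most probable branch at each router, so unvisited subtrees are never evaluated and the effective model size drops.

```python
# Multi-path vs. single-path inference on a random toy routing tree.
import numpy as np

rng = np.random.default_rng(1)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def make_node(dim, n_classes, depth):
    if depth == 0:                       # leaf: a linear solver
        return {"W": rng.normal(size=(n_classes, dim))}
    return {"w": rng.normal(size=dim),   # router weights
            "left": make_node(dim, n_classes, depth - 1),
            "right": make_node(dim, n_classes, depth - 1)}

def multi_path(node, x):
    """Weighted mixture over all root-to-leaf paths."""
    if "W" in node:
        return softmax(node["W"] @ x)
    p = sigmoid(node["w"] @ x)
    return p * multi_path(node["left"], x) + (1 - p) * multi_path(node["right"], x)

def single_path(node, x):
    """Greedy descent: only the most probable branch is evaluated."""
    if "W" in node:
        return softmax(node["W"] @ x)
    p = sigmoid(node["w"] @ x)
    return single_path(node["left"] if p >= 0.5 else node["right"], x)

tree = make_node(dim=8, n_classes=10, depth=3)
x = rng.normal(size=8)
mp, sp = multi_path(tree, x), single_path(tree, x)
print(mp.argmax(), sp.argmax())
```

Both modes return valid class distributions; single-path trades a small accuracy loss for touching only one leaf's worth of parameters per input.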
Models are trained on subsets of 50, 250, 500, 2.5k, 5k, 25k and 45k examples.
Please come & see me at poster #82 for details!