Capsule Architectures
Sara Sabour
Google Brain, University of Toronto
Neural Architects Workshop, 28th October, ICCV 2019
Joint work with Geoff Hinton (Google Brain), Nicholas Frosst (Google Brain), and Adam Kosiorek (Oxford University & DeepMind).
Idea: Agreement.
Why: Viewpoint.
How: Iterative algorithm; Optimization.
How a standard neuron works:
1. Each neuron is multiplied by a trainable parameter, giving a vote.
2. The incoming votes are summed.
3. A nonlinearity (ReLU) is applied, where a higher sum means more activated.

Consider these three cases of incoming votes:
(1, 1, 2, 1, 1)   (10, 1, 2, 3, 4)   (1, 2, 2, 2, 2)

Dictatorship: support comes from a confident shouter! Under SUM + ReLU the three cases score 6, 20, and 9, so the middle case wins purely on the strength of its single large vote.
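A minimal sketch of this computation on the three cases (plain NumPy; for simplicity the trainable weights are folded in, so the numbers above are treated as the votes themselves):

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def neuron(votes):
    # Standard neuron: sum the incoming votes, then apply ReLU.
    return relu(np.sum(votes))

cases = [(1, 1, 2, 1, 1), (10, 1, 2, 3, 4), (1, 2, 2, 2, 2)]
for votes in cases:
    print(votes, "->", neuron(np.array(votes)))
# -> 6.0, 20.0, 9.0: the single loud vote of 10 dominates.
# SUM + ReLU rewards a confident shouter.
```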
Democracy: support comes from a coordinated mass! Change the recipe:
1. Each neuron is multiplied by a trainable parameter.
2. Do the incoming votes agree with each other? Replace SUM + ReLU with a Count of agreeing votes.

Now (1, 1, 2, 1, 1) and (1, 2, 2, 2, 2) each have four agreeing votes, while the confident shouter in (10, 1, 2, 3, 4) agrees with no one. The same holds when every vote is multiplied by 5: (5, 5, 10, 5, 5), (50, 5, 10, 15, 20), (5, 10, 10, 10, 10).
The full recipe:
1. Each neuron is multiplied by a trainable parameter.
2. Do the votes agree with each other? (Agree?)
3. What are they agreeing upon? (On?)

No loss of information! If 5 is multiplied to everything, what the votes are agreeing upon is also multiplied by 5: (1, 1, 2, 1, 1) agrees on 1, while (5, 5, 10, 5, 5) agrees on 5.
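A toy sketch of this agreement non-linearity (my own minimal formulation, not the exact one from the talk; the tolerance `tol` is a hypothetical stand-in for whatever agreement measure the network uses): the presence is the size of the largest group of votes that agree within the tolerance, and the value is what they agree upon.

```python
import numpy as np

def agreement(votes, tol=0.5):
    # Presence = size of the largest group of votes within `tol` of each other;
    # Value = the mean of that group (what they agree upon).
    votes = np.asarray(votes, dtype=float)
    counts = np.array([(np.abs(votes - v) <= tol).sum() for v in votes])
    winner = votes[counts.argmax()]
    cluster = votes[np.abs(votes - winner) <= tol]
    return len(cluster), cluster.mean()

for votes in [(1, 1, 2, 1, 1), (10, 1, 2, 3, 4), (1, 2, 2, 2, 2),
              (5, 5, 10, 5, 5), (50, 5, 10, 15, 20), (5, 10, 10, 10, 10)]:
    presence, value = agreement(votes)
    print(votes, "-> presence", presence, "value", value)
# Scaling all votes by 5 leaves each presence unchanged and scales each
# value by 5 -- no loss of information.
```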
Training with this non-linearity
Agreement finding becomes stronger and more robust with multi-dimensional votes. Consider the same three cases with 2D votes:
(1,0) (1,1) (2,8) (1,1) (1,2)
(1,0) (2,1) (2,1) (2,1) (2,1)
(10,0) (1,1) (2,2) (3,3) (4,4)
The recipe is unchanged: each neuron is multiplied by a trainable parameter, then we ask whether the votes agree with each other and what they are agreeing upon. Coincidental agreement on an entire vector is far less likely than on a single number.
The agreement non-linearity asks how many votes are the same rather than whose vote is larger:
○ The output is a Presence plus a Value.
○ The Value can be multi-dimensional.
These new neurons are Capsules; a sketch of the vector case follows below.
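The same toy counting rule as before, extended to vector votes (again my own minimal formulation, with the same hypothetical tolerance): presence is the size of the largest agreeing cluster, value is the multi-dimensional point it agrees on.

```python
import numpy as np

def capsule(votes, tol=0.5):
    # Vector votes: presence = size of the largest agreeing cluster,
    # value = the (multi-dimensional) point they agree upon.
    votes = np.asarray(votes, dtype=float)
    dists = np.linalg.norm(votes[:, None, :] - votes[None, :, :], axis=-1)
    counts = (dists <= tol).sum(axis=1)
    members = dists[counts.argmax()] <= tol
    return members.sum(), votes[members].mean(axis=0)

cases = [
    [(1, 0), (1, 1), (2, 8), (1, 1), (1, 2)],
    [(1, 0), (2, 1), (2, 1), (2, 1), (2, 1)],
    [(10, 0), (1, 1), (2, 2), (3, 3), (4, 4)],
]
for votes in cases:
    presence, value = capsule(votes)
    print("presence", presence, "value", value)
# Agreement on a whole vector is much harder to hit by chance than
# agreement on a single number, so the signal is more robust.
```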
A network of Capsules: each capsule's presence says whether its entity exists, its value says how it is present, and a capsule becomes active only when its incoming votes agree.
When do a roof and a wall make a house?
1. Both of the parts should exist.
○ Image 1 is not a house.
2. How the roof and the walls exist should match a common house.
○ Images 2 & 3 are not houses.
[Figure: three candidate arrangements of a roof and walls, labeled 1, 2, 3.]
The relation between a part and the whole stays constant in the camera coordinate frame: between the Roof arrows and the House arrows, and between the Wall arrows and the House arrows. Given the Roof arrow transformation, the House arrows can be predicted; given the Wall arrow transformation, likewise.

Input to the layer: how to transform the Camera arrows into Roof and Wall arrows.
Output of the layer: how to transform the Camera arrows into House arrows.
What we learn: how to transform the transformations.

Each part produces its own prediction of the House arrows; compare the House arrow predictions to measure agreement.
Each Capsule represents a part or an object:
○ The presence of a capsule represents whether that entity exists in the image.
○ The value of a capsule carries the spatial pose of that entity: the transformation between the coordinate frame of the camera and the entity.
○ The trainable parameter between two capsules is the transformation between their coordinate-frame transformations as a part and a whole.
The same trained transformation works for all viewpoints of the input:
○ When the input is transformed, the value of the output capsule is transformed accordingly. The value is viewpoint equivariant.
○ The agreement of the parts does not change. The presence is viewpoint invariant.
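A small sketch of the "transform the transformations" idea, with 2D poses as 3x3 homogeneous matrices and made-up numbers for the house and its parts: the learned part-to-whole transforms stay fixed, the votes agree, and changing the viewpoint transforms the agreed value while leaving the agreement intact.

```python
import numpy as np

def pose(theta, tx, ty):
    # 3x3 homogeneous 2D pose: rotation by theta, translation by (tx, ty).
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s, tx], [s, c, ty], [0.0, 0.0, 1.0]])

# Hypothetical house pose in the camera frame, and the fixed poses of the
# roof and wall inside the house's own coordinate frame.
house = pose(0.1, 2.0, 3.0)
roof_in_house = pose(0.0, 0.0, 1.0)
wall_in_house = pose(0.0, 0.0, -1.0)

# What the part capsules see: each part's pose in the camera frame.
roof = house @ roof_in_house
wall = house @ wall_in_house

# The trainable parameters are the constant part-to-whole transforms.
roof_to_house = np.linalg.inv(roof_in_house)
wall_to_house = np.linalg.inv(wall_in_house)

# Each part votes for the house pose; the votes agree.
assert np.allclose(roof @ roof_to_house, wall @ wall_to_house)

# Move the camera: every part pose is premultiplied by the same viewpoint change.
view = pose(0.7, -1.0, 4.0)
new_roof_vote = (view @ roof) @ roof_to_house
new_wall_vote = (view @ wall) @ wall_to_house
assert np.allclose(new_roof_vote, new_wall_vote)   # presence: still agree (invariant)
assert np.allclose(new_roof_vote, view @ house)    # value: moves with the view (equivariant)
print("agreement preserved; house pose is viewpoint equivariant")
```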
Matrix Capsules with EM Routing (joint work with Geoff Hinton and Nick Frosst, ICLR 2018)
Visualizing 2D capsules:
○ Position shows their 2D value.
○ Radius shows their presence.
○ Given the capsules of Layer L, what is the value and presence of the capsules in Layer L+1?
Transform each Layer L capsule with its trainable transformation to get votes for Layer L+1. Is there any agreement among the votes?
Euclidean distance between votes is the raw signal; to find the clusters, use Expectation Maximization for fitting a Mixture of Gaussians.
[Figure: transformed votes in 2D; the Gaussian clusters tighten over routing iterations 1, 2, and 3.]
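A bare-bones sketch of that clustering step (a toy stand-in, not the full EM routing of the paper, which also weighs votes by the lower capsules' activations and updates the variances; the vote coordinates and `var` here are made up):

```python
import numpy as np

def em_mixture_toy(votes, iters=3, var=0.5):
    # Fit two isotropic Gaussians (fixed variance, uniform mixing) to the
    # votes with a few EM iterations.
    means = votes[[0, 4]].astype(float)  # initialize at two of the votes
    for _ in range(iters):
        # E-step: responsibility of each Gaussian for each vote.
        d2 = ((votes[:, None, :] - means[None, :, :]) ** 2).sum(-1)
        resp = np.exp(-d2 / (2 * var))
        resp /= resp.sum(axis=1, keepdims=True)
        # M-step: move each mean to the responsibility-weighted average.
        means = (resp[:, :, None] * votes[:, None, :]).sum(0) / resp.sum(0)[:, None]
    return means, resp

votes = np.array([[2.0, 1.0], [2.1, 0.9], [1.9, 1.1], [2.0, 1.0],  # agreeing votes
                  [10.0, 0.0], [6.0, 5.0]])                        # stray votes
means, resp = em_mixture_toy(votes)
print("cluster means:\n", means.round(2))
print("responsibilities:\n", resp.round(2))
# The agreeing votes settle into a tight Gaussian; in the full algorithm it is
# the tightness of such a cluster that activates the higher-level capsule.
```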
Generalization to novel viewpoints (test error %):
            CNN     Capsule
Azimuth     20%     13.5%
Elevation   17.8%   12.3%
Code available at: https://github.com/google-research/google-research/tree/master/capsule_em
Iterative Routing
○ Explicit group equivariance [3]
○ Sinkhorn iteration
[1] Dilin Wang and Qiang Liu. An Optimization View on Dynamic Routing Between Capsules. 2018.
[2] Mohammad Taha Bahadori. Spectral Capsule Networks. 2018.
[3] Jan Eric Lenssen, Matthias Fey, and Pascal Libuschewski. Group Equivariant Capsule Networks. NIPS 2018.
[4] Anonymous ICLR 2020 submission.
[5] Hongyang Li, Xiaoyang Guo, Bo Dai, Wanli Ouyang, and Xiaogang Wang. Neural Network Encapsulation. ECCV 2018.
Can we learn a neural network to do the clustering, rather than running an explicit clustering algorithm?
[Figure: previously, a linear transform produced the votes and an explicit clustering algorithm grouped them; now, a neural network does the grouping. It should still be true that transforming the input transforms the output accordingly.]
Optimize the mixture-model log-likelihood. Each object capsule uses a single linear decoder to make predictions for its part capsules.

Two stages: a Part Capsule Autoencoder and an Object Capsule Autoencoder.
[Figure: the Part Capsule Autoencoder infers part presences & values from the image with learned templates and reassembles the image (image likelihood); the Object Capsule Autoencoder predicts the parts, and each part is explained as a mixture of object predictions (part likelihood).]
Unsupervised!
Adam Kosiorek et al., Stacked Capsule Autoencoders, NeurIPS 2019.
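A toy version of that part likelihood (my own simplification of the objective, with hypothetical poses, predictions, and `var`): each detected part pose is scored under a mixture whose components are the object capsules' predictions, weighted by the object presences, and the resulting log-likelihood is what training would maximize.

```python
import numpy as np

def part_log_likelihood(part_poses, predictions, presences, var=0.1):
    # part_poses: (n_parts, d) poses of the detected parts.
    # predictions: (n_objects, n_parts, d) object capsules' predictions.
    # presences: (n_objects,) mixing weights -- the object presences.
    diff2 = ((predictions - part_poses[None]) ** 2).sum(-1)       # (n_objects, n_parts)
    log_comp = -diff2 / (2 * var) + np.log(presences + 1e-9)[:, None]
    m = log_comp.max(axis=0)                                      # log-sum-exp over objects,
    return (m + np.log(np.exp(log_comp - m).sum(axis=0))).sum()   # summed over parts

parts = np.array([[0.0, 1.0], [1.0, 0.0], [2.0, 2.0]])
preds = np.array([
    [[0.0, 1.0], [1.0, 0.1], [2.0, 2.0]],   # object 0 explains the parts well
    [[3.0, 3.0], [4.0, 4.0], [5.0, 5.0]],   # object 1 does not
])
print(part_log_likelihood(parts, preds, presences=np.array([0.9, 0.1])))
print(part_log_likelihood(parts, preds, presences=np.array([0.1, 0.9])))
# Switching on the object capsule that explains the parts raises the
# likelihood, which is the signal the autoencoder is trained on.
```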
Train with 24 object capsules. Clustering the capsule presences gives 98.7% accuracy, with no image augmentation.
[Figure: t-SNE of capsule presences.]
[Figure: reconstructions, learned templates, affine-transformed templates, and part-capsule reconstructions.]
What capsules buy us:
○ Better viewpoint generalization.
○ Better unsupervised training.
Still open:
○ The background.
○ The texture.