Similarity of Neural Network Representations Revisited - PowerPoint PPT Presentation

SLIDE 1

Simon Kornblith, Mohammad Norouzi, Honglak Lee, Geoffrey Hinton

Similarity of Neural Network Representations Revisited

SLIDE 2

Motivation

  • We need tools to understand trained neural networks
    ○ Neural network training involves interactions between an algorithm and structured data
    ○ We don’t know the structure of the data
  • One way to understand trained neural networks is by comparing their representations

SLIDE 3

What is a Representation?

[Figure: (centered) examples-by-features matrices for Net A and Net B]

SLIDE 4

Comparing Features = Comparing Examples

Sum of squared dot products (similarities) between features = dot product between reshaped inter-example similarity matrices
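
The equality between the feature view and the example view can be checked numerically. The following is a minimal sketch on synthetic data, assuming X and Y are (centered) examples-by-features matrices for Net A and Net B; the variable names and sizes are illustrative and not taken from the slides.

```python
import numpy as np

rng = np.random.default_rng(0)
n, p1, p2 = 20, 5, 7  # examples, Net A features, Net B features (arbitrary sizes)

# Synthetic stand-ins for the two representations, centered per feature.
X = rng.standard_normal((n, p1))
Y = rng.standard_normal((n, p2))
X -= X.mean(axis=0)
Y -= Y.mean(axis=0)

# Feature view: sum of squared dot products between every pair of features.
feature_side = np.sum((Y.T @ X) ** 2)

# Example view: dot product between the flattened n-by-n
# inter-example similarity (Gram) matrices.
example_side = np.dot((X @ X.T).ravel(), (Y @ Y.T).ravel())

identity_holds = bool(np.allclose(feature_side, example_side))
```

The feature view costs on the order of n·p1·p2 operations, while the example view builds two n-by-n matrices, so which side is cheaper depends on whether there are more examples or more features.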

SLIDE 5

Comparing Features = Comparing Examples

SLIDE 6

Comparing Features = Comparing Examples

  • Centered kernel alignment (CKA) (Cortes et al., 2012)
  • RV-coefficient (Robert & Escoufier, 1976)
  • Tucker’s congruence coefficient (Tucker, 1951)
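
As a rough illustration, the linear form of this index can be sketched as the sum of squared feature dot products divided by the corresponding self-similarity norms of each representation. The function name `linear_cka` is hypothetical, and the inputs are assumed to be examples-by-features arrays computed on the same examples.

```python
import numpy as np

def linear_cka(X, Y):
    """Linear CKA between two (examples x features) activation matrices."""
    # Center each feature so dot products measure covariance-like similarity.
    X = X - X.mean(axis=0, keepdims=True)
    Y = Y - Y.mean(axis=0, keepdims=True)
    # Numerator: sum of squared dot products between features of X and Y.
    cross = np.linalg.norm(Y.T @ X, ord="fro") ** 2
    # Denominator: normalize by each representation's self-similarity.
    norm_x = np.linalg.norm(X.T @ X, ord="fro")
    norm_y = np.linalg.norm(Y.T @ Y, ord="fro")
    return cross / (norm_x * norm_y)
```

With this normalization the value lies between 0 and 1, and a representation compared with itself scores 1.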

SLIDE 7

The Kernel Trick

H is the centering matrix
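
To show where the centering matrix H enters, here is a minimal sketch that assumes the HSIC-based form of kernel CKA: HSIC(K, L) = tr(KHLH) / (n - 1)^2 and CKA(K, L) = HSIC(K, L) / sqrt(HSIC(K, K) HSIC(L, L)), where K and L are n-by-n kernel (inter-example similarity) matrices. The function names and the RBF bandwidth parameter `sigma` are illustrative choices, not something specified on the slide.

```python
import numpy as np

def centering_matrix(n):
    """H = I - (1/n) * 1 1^T; multiplying by H removes the mean."""
    return np.eye(n) - np.ones((n, n)) / n

def hsic(K, L):
    """Empirical HSIC between two n x n kernel matrices."""
    n = K.shape[0]
    H = centering_matrix(n)
    return np.trace(K @ H @ L @ H) / (n - 1) ** 2

def kernel_cka(K, L):
    """CKA as normalized HSIC; any positive semidefinite kernel can be used."""
    return hsic(K, L) / np.sqrt(hsic(K, K) * hsic(L, L))

def rbf_kernel(X, sigma):
    """RBF kernel between the rows (examples) of X; sigma is the bandwidth."""
    sq_norms = np.sum(X ** 2, axis=1)
    sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * (X @ X.T)
    sq_dists = np.maximum(sq_dists, 0.0)  # guard against small negative round-off
    return np.exp(-sq_dists / (2.0 * sigma ** 2))
```

Passing linear kernels `X @ X.T` and `Y @ Y.T` recovers the linear index computed from features, while `rbf_kernel(X, sigma)` swaps in a nonlinear similarity between examples without changing the rest of the computation.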

SLIDE 8

A Sanity Check for Similarity

[Figure: both axes list layers conv1 through conv8, then avgpool]

Given two architecturally identical networks A and B trained from different random initializations, a layer from net A should be most similar to the architecturally corresponding layer in net B
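
The sanity check above can be sketched as follows: compute a layer-by-layer similarity matrix between the two networks, then measure how often the best match for a layer of net A is the corresponding layer of net B. This is an illustrative sketch only; the helper names are hypothetical, linear CKA is used as the similarity index, and the "layers" are synthetic arrays (net B is a rotated, slightly noisy copy of net A) rather than real network activations.

```python
import numpy as np

def linear_cka(X, Y):
    """Linear CKA between two (examples x features) activation matrices."""
    X = X - X.mean(axis=0, keepdims=True)
    Y = Y - Y.mean(axis=0, keepdims=True)
    cross = np.linalg.norm(Y.T @ X, ord="fro") ** 2
    return cross / (np.linalg.norm(X.T @ X, ord="fro") * np.linalg.norm(Y.T @ Y, ord="fro"))

def layer_similarity_matrix(layers_a, layers_b):
    """Entry (i, j) is the similarity of layer i in net A to layer j in net B."""
    return np.array([[linear_cka(a, b) for b in layers_b] for a in layers_a])

def sanity_check_accuracy(sim):
    """Fraction of net A layers whose most similar net B layer has the same index."""
    best_match = np.argmax(sim, axis=1)
    return float(np.mean(best_match == np.arange(sim.shape[0])))

# Synthetic stand-in for two networks with the same architecture.
rng = np.random.default_rng(0)
n_examples, n_features, n_layers = 200, 10, 4
layers_a = [rng.standard_normal((n_examples, n_features)) for _ in range(n_layers)]
layers_b = []
for a in layers_a:
    # Random rotation plus small noise: same information, different basis.
    q, _ = np.linalg.qr(rng.standard_normal((n_features, n_features)))
    layers_b.append(a @ q + 0.01 * rng.standard_normal(a.shape))

sim = layer_similarity_matrix(layers_a, layers_b)
accuracy = sanity_check_accuracy(sim)
```

With real networks, `layers_a` and `layers_b` would hold activations of each layer on the same set of examples, and any other similarity index could be substituted for `linear_cka` to compare how well it passes the check.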

SLIDE 9

A Sanity Check for Similarity

SLIDE 10

A Sanity Check for Similarity

SLIDE 11

CKA Reveals Network Pathology

[Figure panels: 1x Depth (94.1% on CIFAR-10), 2x Depth (95.0%), 4x Depth (93.2%), 8x Depth (91.9%)]

SLIDE 12

CKA Reveals Network Pathology

[Figure panels: 1x Depth (94.1%), 2x Depth (95.0%), 4x Depth (93.2%), 8x Depth (91.9%)]

SLIDE 13

CKA Reveals Network Pathology

[Figure panels: 1x Depth (94.1%), 2x Depth (95.0%), 4x Depth (93.2%), 8x Depth (91.9%)]

SLIDE 14

CKA Reveals Network Pathology

SLIDE 15

CKA Reveals Network Pathology

SLIDE 16

Thank You!

cka-similarity.github.io