Interpretable Deep Learning: Towards Understanding & Explaining DNNs (PowerPoint PPT Presentation)
Part 2: Methods of Explanation
Wojciech Samek, Grégoire Montavon, Klaus-Robert Müller


SLIDE 1

Interpretable Deep Learning: Towards Understanding & Explaining DNNs

Part 2: Methods of Explanation

Wojciech Samek, Grégoire Montavon, Klaus-Robert Müller

SLIDE 2

What Will be Covered in Part 2

  • interpreting predicted classes
  • explaining individual decisions

SLIDE 3

Explaining Individual Decisions

Q: Where in the image does the neural network see evidence for a car?

car / non-car

SLIDE 4

Examples of Methods that Explain Decisions

SLIDE 5

Explaining Individual Decisions

Q: In which proportion has each car contributed to the prediction?

car / non-car

SLIDE 6

Explaining by Decomposing

Goal: Determine the share of the output that should be attributed to each input variable.

Decomposition property: the relevances sum to the function output, Σ_i R_i = f(x).

(diagram: input → DNN)

SLIDE 7

Explaining by Decomposing

Goal: Determine the share of the output that should be attributed to each input variable.

Decomposing a prediction is generally difficult.

SLIDE 8

Sensitivity Analysis

Computes for each pixel the squared partial derivative: R_i = (∂f/∂x_i)²

explanation for "car" (heatmap): evidence for "car"

(diagram: input → DNN)
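A minimal numpy sketch of this idea, assuming a made-up two-layer ReLU network (not the network from the slides): the relevance of each input feature is the squared partial derivative of the output, computed here by manual backpropagation.

```python
import numpy as np

# Sensitivity analysis: R_i = (df/dx_i)^2 for a toy two-layer ReLU net.
# Weights and input are illustrative random values, not from the slides.

rng = np.random.default_rng(0)
W1 = rng.standard_normal((4, 6))   # input dim 4 -> hidden dim 6
w2 = rng.standard_normal(6)        # hidden dim 6 -> scalar output

def forward(x):
    z = x @ W1                     # pre-activations
    return np.maximum(z, 0) @ w2, z

def sensitivity(x):
    """Squared gradient of f at x, via the chain rule."""
    _, z = forward(x)
    mask = (z > 0).astype(float)   # derivative of ReLU
    grad = W1 @ (mask * w2)        # df/dx
    return grad ** 2

x = rng.standard_normal(4)
R = sensitivity(x)
print(R)   # one non-negative score per input feature (the "heatmap")
```

Note that each score is non-negative by construction, which already hints at the question on the next slides: these scores measure sensitivity, not the share of the output value itself.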

SLIDE 9

Sensitivity Analysis

Question: If sensitivity analysis computes a decomposition of something, then what does it decompose?

SLIDE 10

Sensitivity Analysis

Sensitivity analysis explains a variation of the function, not the function value itself.

(figure labels: explanation for "car"; input; variation = makes something appear less/more of a car)

SLIDE 11

The Taylor Expansion Approach

1. Take a linear model.
2. First-order expansion at a root point.
3. Identifying linear terms: a decomposition.

Observation: the explanation depends on the root point.
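The three steps above can be sketched for a linear model, with made-up weights and input: choose a root point x̃ where f(x̃) = 0, then the first-order terms R_i = w_i (x_i − x̃_i) decompose f(x) exactly.

```python
import numpy as np

# First-order Taylor decomposition of a linear model f(x) = w.x + b
# around a root point x_tilde with f(x_tilde) = 0.
# Weights and input are made-up illustration values.

w = np.array([1.0, -2.0, 0.5])
b = -1.0
f = lambda x: w @ x + b

x = np.array([2.0, 0.5, 1.0])

# Nearest root of a linear model: move from x along w until f hits zero.
x_tilde = x - f(x) * w / (w @ w)
assert abs(f(x_tilde)) < 1e-9

# Linear terms of the expansion: a decomposition of f(x).
R = w * (x - x_tilde)
print(R, R.sum(), f(x))   # R sums exactly to f(x)
```

Moving x̃ (e.g. along a different direction) changes the individual R_i while keeping their sum fixed, which is exactly the observation that the explanation depends on the root point.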

SLIDE 12

The Taylor Expansion Approach

Obtained relevance scores.

How to choose the root point?

  • Closeness to the actual data point
  • Membership to the input domain (e.g. pixel space)
  • Membership to the data manifold

SLIDE 13

Non-Linear Models

Second-order terms are hard to interpret and can be very large. Simple Taylor decomposition is not suitable for highly nonlinear models.

(diagram: nonlinear model)

SLIDE 14

Overcoming Non-Linearity

Integrated Gradients [Sundararajan'17]:

  • Fully decomposable
  • Requires computing an integral (expensive)
  • Which integration path?

[Sundararajan'17] Axiomatic Attribution for Deep Networks. ICML 2017: 3319-3328
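A minimal sketch of integrated gradients, assuming a made-up smooth function with a hand-written gradient: integrate the gradient along the straight path from a baseline x̄ to x, then multiply by (x − x̄). The attributions sum to f(x) − f(x̄) (the completeness axiom), and the integral is approximated with a Riemann sum, which is what makes the method expensive.

```python
import numpy as np

# Integrated gradients for a toy function f(x) = tanh(x0) * x1^2.
# Function, gradient, baseline, and input are made-up illustration values.

def f(x):
    return np.tanh(x[0]) * x[1] ** 2

def grad_f(x):
    return np.array([(1 - np.tanh(x[0]) ** 2) * x[1] ** 2,
                     2 * np.tanh(x[0]) * x[1]])

def integrated_gradients(x, x_bar, steps=1000):
    alphas = (np.arange(steps) + 0.5) / steps        # midpoint rule
    total = np.zeros_like(x)
    for a in alphas:                                  # integrate along path
        total += grad_f(x_bar + a * (x - x_bar))
    return (x - x_bar) * total / steps

x, x_bar = np.array([1.0, 2.0]), np.zeros(2)
R = integrated_gradients(x, x_bar)
print(R.sum(), f(x) - f(x_bar))   # approximately equal (completeness)
```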

SLIDE 15

Overcoming Non-Linearity

Special case: when the origin is a root point and the gradient along the integration path is constant:

gradient × input
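This special case can be checked on a made-up ReLU network without biases: such a function is positively homogeneous, so f(0) = 0 and the gradient is constant on the segment from the origin to x, and the integral collapses to R_i = (∂f/∂x_i)(x) · x_i.

```python
import numpy as np

# Gradient x input for a bias-free ReLU network (made-up weights).
# Here the origin is a root point and the ReLU activation pattern is
# constant along the segment [0, x], so integrated gradients reduces to
# the element-wise product of gradient and input, and R sums to f(x).

rng = np.random.default_rng(1)
W1 = rng.standard_normal((3, 5))
w2 = rng.standard_normal(5)

def forward(x):
    z = x @ W1
    return np.maximum(z, 0) @ w2, z

def grad(x):
    _, z = forward(x)
    return W1 @ ((z > 0).astype(float) * w2)   # df/dx at x

x = rng.standard_normal(3)
R = grad(x) * x                                # gradient x input
fx, _ = forward(x)
print(R.sum(), fx)                             # equal up to float error
</imports>```

The exact equality R.sum() == f(x) here is Euler's theorem for positively homogeneous functions; adding biases breaks it, which is why the general method needs the integral.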

SLIDE 16

Let’s consider a different approach ...

SLIDE 17

Overcoming Non-Linearity

View the decision as a graph computation instead of a function evaluation, and propagate the decision backwards until the input is reached.

SLIDE 18

Layer-Wise Relevance Propagation (LRP) [Bach'15]

SLIDE 19

Gradient-Based vs. LRP

SLIDE 20

Layer-Wise Relevance Propagation (LRP) [Bach'15]

Carefully engineered propagation rule:

(annotated rule: neuron contribution; available for redistribution; normalization term; pooling received messages)
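The annotated terms can be spelled out as an explicit double sum, here for a single dense layer with made-up values: a_j·w_jk is the neuron contribution, the denominator is the normalization term, and the sum over k pools the messages received from the layer above.

```python
import numpy as np

# Basic LRP rule, written out term by term:
#   R_j = sum_k  (a_j * w_jk) / (sum_j' a_j' * w_j'k)  *  R_k
# Activations, weights, and upper-layer relevance are made-up values.

a = np.array([1.0, 0.5, 0.0, 2.0])            # lower-layer activations
W = np.array([[ 1.0, -0.5,  0.2],
              [ 0.3,  0.8, -0.1],
              [-0.7,  0.4,  0.6],
              [ 0.5,  0.1,  0.9]])
R_upper = np.array([1.0, 0.5, 0.25])          # relevance of upper layer

R = np.zeros_like(a)
for j in range(len(a)):
    for k in range(len(R_upper)):
        z_k = sum(a[jp] * W[jp, k] for jp in range(len(a)))  # normalization
        R[j] += a[j] * W[j, k] / z_k * R_upper[k]            # redistribute

print(R, R.sum())   # relevance is conserved: R.sum() == R_upper.sum()
```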

SLIDE 21

LRP Propagation Rules: Two Views

View 1: neuron contribution; available for redistribution; normalization term; pooling received messages.

View 2: neuron activation; available for redistribution; normalization term; weighted sum.

SLIDE 22

Implementing Propagation Rules (1)

(annotated rule: neuron activation; available for redistribution; normalization term; weighted sum)

Element-wise operations vs. vector operations

SLIDE 23

Implementing Propagation Rules (2)

Code that reuses forward and gradient computations:

(annotated rule: neuron activation; available for redistribution; normalization term; weighted sum)

See also http://www.heatmapping.org/tutorial
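A sketch of the reuse pattern from the heatmapping.org tutorial, on a dense layer with the same made-up values as before: a forward pass, an element-wise division, a gradient-style backward pass, and an element-wise product. For a dense layer the "gradient of z·s with respect to a" is just W @ s, which an autodiff framework would compute automatically.

```python
import numpy as np

# LRP via reused forward and gradient computations (4-step pattern).
# Activations, weights, and upper-layer relevance are made-up values.

a = np.array([1.0, 0.5, 0.0, 2.0])           # lower-layer activations
W = np.array([[ 1.0, -0.5,  0.2],
              [ 0.3,  0.8, -0.1],
              [-0.7,  0.4,  0.6],
              [ 0.5,  0.1,  0.9]])
R = np.array([1.0, 0.5, 0.25])               # upper-layer relevance

z = a @ W + 1e-9        # step 1: forward pass (normalization terms)
s = R / z               # step 2: element-wise division
c = W @ s               # step 3: backward ("gradient") pass
R_lower = a * c         # step 4: element-wise product with activations

print(R_lower, R_lower.sum())   # sum is approximately conserved
```

Because every step is a standard tensor operation, the same four lines run unchanged on GPU tensors, which is why LRP scales to large models.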

SLIDE 24

How Fast is LRP?

GPU-based implementation of LRP: check out iNNvestigate [Alber'18]

https://github.com/albermax/innvestigate

SLIDE 25

Is there an underlying mathematical framework for LRP?

SLIDE 26

Question: Suppose that we have propagated the relevance until a given layer. How should it be propagated one layer further?

Idea: By performing a Taylor expansion of the relevance.

Deep Taylor Decomposition [Montavon'17]

SLIDE 27

The Structure of Relevance

Observation: Relevance at each layer is a product of the activation and an approximately locally constant term.

Reminder: (annotated rule: neuron activation; available for redistribution; normalization term; weighted sum)

SLIDE 28

Deep Taylor Decomposition

Relevance neuron / Taylor expansion / Redistribution (equations shown on slide)
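One concrete redistribution that Deep Taylor yields (with the "rescaled excitations" root point, see the next slide) is the z+ rule, equivalent to LRP-α1β0: only the positive contributions a_j·w_jk⁺ receive relevance. A sketch with made-up values:

```python
import numpy as np

# Deep Taylor redistribution via the z+ rule (LRP-alpha1beta0):
# relevance flows only through positive contributions a_j * max(w_jk, 0).
# Activations, weights, and upper-layer relevance are made-up values.

a = np.array([1.0, 0.5, 0.0, 2.0])
W = np.array([[ 1.0, -0.5,  0.2],
              [ 0.3,  0.8, -0.1],
              [-0.7,  0.4,  0.6],
              [ 0.5,  0.1,  0.9]])
R_upper = np.array([1.0, 0.5, 0.25])

Wp = np.maximum(W, 0)                 # keep positive weights only
z = a @ Wp + 1e-9                     # positive normalization terms
R = a * (Wp @ (R_upper / z))          # z+ redistribution
print(R)                              # non-negative, sum conserved
```

Non-negativity is the practical payoff of this root-point choice: every lower-layer neuron receives a non-negative share of the upper-layer relevance.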

SLIDE 29

Choosing the Root Point (Deep Taylor generic)

Choice of root point:
1. nearest root
2. rescaled excitations ✔ (same as LRP-α1β0)

SLIDE 30

Choosing the Root Point (Deep Taylor generic)

Pixel domain: choice of root point; resulting propagation rule (equations shown on slide)

SLIDE 31

Choosing the Root Point (Deep Taylor generic)

Word embedding (adapted from Tensorflow tutorial): choice of root point; resulting propagation rule (equations shown on slide)

(figure: king, man, queen, woman)

SLIDE 32

DTD View on Explaining a ConvNet [Montavon'17]

SLIDE 33

DTD View on Explaining an OCSVM [Kauffmann'18]

Outlier score: one-class SVM rewritten as a min-pooling over distances.

Deep Taylor decomposition: (equation shown on slide)
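A heavily simplified sketch of this view, with made-up support vectors (not the actual [Kauffmann'18] formulation): the outlier score is a min-pooling over squared distances to support vectors, and the squared terms of the winning distance decompose that score exactly over input dimensions.

```python
import numpy as np

# Toy outlier score o(x) = min_k ||x - u_k||^2 as min-pooling over
# distances, with relevance redistributed onto input dimensions.
# Support vectors u_k and the input x are made-up illustration values.

U = np.array([[0.0, 0.0],
              [1.0, 1.0],
              [2.0, 0.0]])            # support vectors u_k
x = np.array([1.2, 0.9])

d2 = ((x - U) ** 2).sum(axis=1)       # squared distance to each u_k
k = d2.argmin()                       # min-pooling selects nearest u_k
outlier_score = d2[k]

# Per-dimension relevance: the square terms of the winning distance.
R = (x - U[k]) ** 2
print(outlier_score, R)               # R sums to the outlier score
```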

SLIDE 34

DTD / OCSVM on MNIST

(figure: dataset; outlier digits; pixel-wise explanation why they are outliers)

SLIDE 35

DTD / OCSVM on Images

(figure: input; explanation for outlierness)

SLIDE 36

Conclusion for Part 2

Explaining deep neural networks is non-trivial. Simple gradient-based methods either do not ask the right question, or are difficult to scale to deep models. Propagation-based approaches (e.g. LRP) seem to work better on complex DNN models. (This will be validated in Part 3). Deep Taylor Decomposition provides a theoretical framework for understanding and deriving LRP-type explanation procedures.