Statistical Neurodynamics of Deep Networks. Shun-ichi Amari, RIKEN



slide-1
SLIDE 1

Statistical Neurodynamics of Deep Networks

Shun‐ichi Amari

RIKEN Brain Science Institute

slide-2
SLIDE 2

Statistical Neurodynamics

Rozonoer (1969); Amari (1971; 197…); Amari et al. (2013); Toyoizumi et al. (2015); Poole, …, Ganguli (2016)

$w_{ij} \sim N(0, 1)$

Macroscopic behaviors common to almost all (typical) networks

slide-3
SLIDE 3

Macroscopic variables

activity: $A = \frac{1}{n} \sum_i x_i^2$

distance: $D = D[\mathbf{x} : \mathbf{x}']$

curvature

$A_{l+1} = F(A_l)$

$D_{l+1} = K(D_l)$

slide-4
SLIDE 4

Deep Networks

$x_i^{l+1} = \varphi\Big(\sum_j w_{ij}\, x_j^{l} + w_{0i}\Big)$

$A_l = \frac{1}{n_l} \sum_i (x_i^{l})^2$, $A_{l+1} = F(A_l)$

$w_{ij} \sim N(0, 1/n)$, […], $\varphi'(0) = \text{const}$

slide-5
SLIDE 5

Pullback Metric

$ds_l^2 = \sum_{a,b} g^{l}_{ab}\, dx^a dx^b = \frac{1}{n_l}\, d\mathbf{x}^l \cdot d\mathbf{x}^l$

slide-6
SLIDE 6

$g_{ab} = \frac{1}{n_l}\, \mathbf{e}_a \cdot \mathbf{e}_b$

slide-7
SLIDE 7

1 l l

n n

 

slide-8
SLIDE 8

Poole et al (2016) Deep neural networks

slide-9
SLIDE 9

Dynamics of Activity

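$\tilde{y}_k = \varphi(\mathbf{w}_k \cdot \mathbf{y}) = \varphi(u_k)$, $u_k \sim N(0, A)$

$\tilde{A} = \frac{1}{n} \sum_k \tilde{y}_k^2 = E[\varphi(u_k)^2]$

$A_{l+1} = F(A_l)$, $F(A) = \int \varphi(\sqrt{A}\, v)^2\, Dv$, $v \sim N(0, 1)$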
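As a rough numerical illustration of the activity recursion $A_{l+1} = F(A_l)$, the sketch below evaluates $F(A) = \int \varphi(\sqrt{A}\,v)^2\,Dv$ by Gauss-Hermite quadrature and compares one step with a simulated random layer. The activation `tanh(gain * u)`, the value `gain = 2.0`, and all function names are assumptions made for this example; none of them come from the slides.

```python
import numpy as np
from numpy.polynomial.hermite_e import hermegauss

# Gauss-Hermite rule for the standard normal measure Dv.
NODES, WEIGHTS = hermegauss(80)
WEIGHTS = WEIGHTS / np.sqrt(2.0 * np.pi)  # rescale so the weights sum to 1


def phi(u, gain=2.0):
    """Assumed activation tanh(gain * u); gain plays the role of phi'(0)."""
    return np.tanh(gain * u)


def activity_map(A, gain=2.0):
    """F(A) = E[phi(sqrt(A) v)^2] with v ~ N(0, 1)."""
    return float(np.sum(WEIGHTS * phi(np.sqrt(A) * NODES, gain) ** 2))


def iterate_activity(A0, depth, gain=2.0):
    """Return [A_0, A_1, ..., A_depth] under A_{l+1} = F(A_l)."""
    trajectory = [A0]
    for _ in range(depth):
        trajectory.append(activity_map(trajectory[-1], gain))
    return trajectory


def simulate_layer(x, rng, gain=2.0):
    """One random layer with w_ij ~ N(0, 1/n): returns phi(W x)."""
    n = x.size
    W = rng.normal(0.0, 1.0 / np.sqrt(n), size=(n, n))
    return phi(W @ x, gain)


# Compare the macroscopic prediction with one simulated layer.
rng = np.random.default_rng(0)
x = rng.normal(size=2000)
A_in = float(np.mean(x ** 2))
A_sim = float(np.mean(simulate_layer(x, rng) ** 2))
A_theory = activity_map(A_in)
print(A_sim, A_theory)
print(iterate_activity(0.1, 30)[-1], iterate_activity(3.0, 30)[-1])
```

Under these assumptions, trajectories started from different initial activities should settle on the same value, which is the kind of convergence the activity dynamics describe.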

     

     

     

  

  

slide-10
SLIDE 10

(0) (0) 1 ( ) converge

i

A A x         

slide-11
SLIDE 11

Dynamics of Metric

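$d\tilde{\mathbf{y}} = B\, d\mathbf{y}$, $B = (B_{kj})$, $B_{kj} = \varphi'(u_k)\, w_{kj}$

$\tilde{\mathbf{e}}_a = B\, \mathbf{e}_a$

$\tilde{g}_{ab} = \frac{1}{n}\, \tilde{\mathbf{e}}_a \cdot \tilde{\mathbf{e}}_b = \frac{1}{n}\, \mathbf{e}_a^\top B^\top B\, \mathbf{e}_b$

mean field approximation: $E[\varphi'(u_k)^2\, w_{kj} w_{kj'}] = E[\varphi'(u_k)^2]\, E[w_{kj} w_{kj'}]$

$\chi_1(A) = \int \varphi'(\sqrt{A}\, v)^2\, Dv$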

slide-12
SLIDE 12

$\tilde{g}_{ab} = \chi_1(A)\, g_{ab}$: conformal transformation!

$g^{l+1}_{ab} = \chi_1(A_l)\, g^{l}_{ab}$

rotation, expansion
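The conformal scaling of the metric can be checked numerically. The sketch below assumes the factor is $\chi_1(A) = \int \varphi'(\sqrt{A}\,v)^2\,Dv$, pushes one random tangent vector through one random layer, and compares the ratio of the new to the old squared length with that factor. The activation `tanh(gain * u)`, the rectangular layer shape (`n_in` inputs, `n_out` outputs, each metric normalised by its own width), and all names are assumptions for illustration.

```python
import numpy as np
from numpy.polynomial.hermite_e import hermegauss

nodes, weights = hermegauss(80)
weights = weights / np.sqrt(2.0 * np.pi)  # weights now sum to 1


def dphi(u, gain=2.0):
    """Derivative of the assumed activation tanh(gain * u)."""
    return gain / np.cosh(gain * u) ** 2


def chi1(A, gain=2.0):
    """chi_1(A) = E[phi'(sqrt(A) v)^2] with v ~ N(0, 1)."""
    return float(np.sum(weights * dphi(np.sqrt(A) * nodes, gain) ** 2))


def metric_ratio(n_in, n_out, A, rng, gain=2.0):
    """Empirical g_tilde / g for one random layer and one random tangent vector."""
    y = rng.normal(size=n_in)
    y *= np.sqrt(A * n_in) / np.linalg.norm(y)  # enforce (1/n) sum y_j^2 = A
    e = rng.normal(size=n_in)                   # tangent vector at y
    W = rng.normal(0.0, 1.0 / np.sqrt(n_in), size=(n_out, n_in))
    u = W @ y
    e_tilde = dphi(u, gain) * (W @ e)           # B e with B_kj = phi'(u_k) w_kj
    g = e @ e / n_in
    g_tilde = e_tilde @ e_tilde / n_out
    return g_tilde / g


rng = np.random.default_rng(1)
print(metric_ratio(500, 8000, 1.0, rng), chi1(1.0))
```

The agreement is statistical, so it improves with the layer width; a tangent vector aligned with the input itself would violate the independence used in the mean field step.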

slide-13
SLIDE 13

Dynamics of Curvature

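$H_{ab} = \partial_a \mathbf{e}_b = \partial_a \partial_b\, \mathbf{y}$

$\tilde{H}_{ab} = \varphi''(u)\, (\mathbf{w} \cdot \mathbf{e}_a)(\mathbf{w} \cdot \mathbf{e}_b) + \varphi'(u)\, \mathbf{w} \cdot H_{ab}$

curvature magnitude: $|H_{ab}|^2$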

     

slide-14
SLIDE 14

$\chi_2(A) = \int \varphi''(\sqrt{A}\, v)^2\, Dv$

$|H^{l}_{ab}|^2$: […]

exponential expansion!
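The curvature dynamics rest on the chain rule for the second derivative of one unit, $\partial_a \partial_b\, \varphi(\mathbf{w} \cdot \mathbf{y}) = \varphi''(u)(\mathbf{w} \cdot \mathbf{e}_a)(\mathbf{w} \cdot \mathbf{e}_b) + \varphi'(u)\, \mathbf{w} \cdot H_{ab}$. The sketch below checks this identity by finite differences on a one-parameter input curve (so $a = b$). The curve, the activation `tanh`, and all names are hypothetical choices for the example.

```python
import numpy as np


def phi(u):
    return np.tanh(u)


def dphi(u):
    return 1.0 / np.cosh(u) ** 2


def d2phi(u):
    return -2.0 * np.tanh(u) / np.cosh(u) ** 2


def curve(t, a, b, c):
    """Hypothetical smooth input curve y(t) in R^n."""
    return a * np.cos(t) + b * np.sin(t) + c * t ** 2


def curve_d1(t, a, b, c):
    """Tangent vector e = dy/dt."""
    return -a * np.sin(t) + b * np.cos(t) + 2.0 * c * t


def curve_d2(t, a, b, c):
    """Second derivative H = d^2 y / dt^2."""
    return -a * np.cos(t) - b * np.sin(t) + 2.0 * c


def embedded_second_derivative(t, W, a, b, c):
    """phi''(u) (w.e)^2 + phi'(u) (w.H), evaluated for every output unit."""
    u = W @ curve(t, a, b, c)
    e = curve_d1(t, a, b, c)
    H = curve_d2(t, a, b, c)
    return d2phi(u) * (W @ e) ** 2 + dphi(u) * (W @ H)


def finite_difference_second_derivative(t, W, a, b, c, h=1e-4):
    """Central difference of t -> phi(W y(t))."""
    def out(s):
        return phi(W @ curve(s, a, b, c))
    return (out(t + h) - 2.0 * out(t) + out(t - h)) / h ** 2


rng = np.random.default_rng(0)
n, m = 5, 7
W = rng.normal(size=(m, n)) / np.sqrt(n)
a, b, c = rng.normal(size=(3, n))
print(embedded_second_derivative(0.3, W, a, b, c))
print(finite_difference_second_derivative(0.3, W, a, b, c))
```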

       

slide-15
SLIDE 15

Dynamics of Distance (Amari, 1974)

 

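$D(\mathbf{x}, \mathbf{x}') = \frac{1}{n} \sum_i (x_i - x_i')^2$

$C(\mathbf{x}, \mathbf{x}') = \frac{1}{n} \sum_i x_i x_i' = \frac{1}{n}\, \mathbf{x} \cdot \mathbf{x}'$

$D = A + A' - 2C$

$u_k = \mathbf{w}_k \cdot \mathbf{y}$, $u_k' = \mathbf{w}_k \cdot \mathbf{y}'$, $(u_k, u_k') \sim N(0, V)$, $V = \begin{pmatrix} A & C \\ C & A' \end{pmatrix}$

$\tilde{C} = E\big[\varphi(\sqrt{A - C}\, \varepsilon + \sqrt{C}\, \nu)\, \varphi(\sqrt{A' - C}\, \varepsilon' + \sqrt{C}\, \nu)\big]$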

slide-16
SLIDE 16

$D_{l+1} = K(D_l)$

$\dfrac{dD_{l+1}}{dD_l}$ […] $1$
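A minimal numerical sketch of the distance recursion follows. It tracks the two activities $A$, $A'$ and the overlap $C$ of a pair of inputs through the layers, using $D = A + A' - 2C$ and $\tilde{C} = E[\varphi(u)\varphi(u')]$ with $(u, u')$ jointly Gaussian with covariance $\begin{pmatrix} A & C \\ C & A' \end{pmatrix}$. The Gaussian pair is generated through a Cholesky-style factorisation, which is a substitute for the three-variable form and also works for negative $C$. The activation `tanh(gain * u)`, `gain = 2.0`, and all names are assumptions for the example.

```python
import numpy as np
from numpy.polynomial.hermite_e import hermegauss

nodes, weights = hermegauss(64)
weights = weights / np.sqrt(2.0 * np.pi)            # 1-D weights sum to 1
V1, V2 = np.meshgrid(nodes, nodes, indexing="ij")   # 2-D quadrature grid
W2 = np.outer(weights, weights)


def phi(u, gain=2.0):
    """Assumed activation tanh(gain * u)."""
    return np.tanh(gain * u)


def activity_map(A, gain=2.0):
    """E[phi(u)^2] for u ~ N(0, A)."""
    return float(np.sum(weights * phi(np.sqrt(A) * nodes, gain) ** 2))


def overlap_map(A, A_prime, C, gain=2.0):
    """E[phi(u) phi(u')] for (u, u') ~ N(0, [[A, C], [C, A']])."""
    rho = np.clip(C / np.sqrt(A * A_prime), -1.0, 1.0)
    u = np.sqrt(A) * V1
    u_prime = np.sqrt(A_prime) * (rho * V1 + np.sqrt(1.0 - rho ** 2) * V2)
    return float(np.sum(W2 * phi(u, gain) * phi(u_prime, gain)))


def distance_trajectory(A, A_prime, C, depth, gain=2.0):
    """Return [D_0, ..., D_depth] with D = A + A' - 2C at every layer."""
    distances = [A + A_prime - 2.0 * C]
    for _ in range(depth):
        # The right-hand side is evaluated with the previous layer's values.
        A, A_prime, C = (activity_map(A, gain),
                         activity_map(A_prime, gain),
                         overlap_map(A, A_prime, C, gain))
        distances.append(A + A_prime - 2.0 * C)
    return distances


near_pair = distance_trajectory(1.0, 1.0, 0.9, 200)  # initially close inputs
far_pair = distance_trajectory(0.3, 2.0, 0.1, 200)   # initially distant inputs
print(near_pair[0], near_pair[-1])
print(far_pair[0], far_pair[-1])
```

With these assumed settings, both pairs should end at the same distance, regardless of how close they started.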

 

  

slide-17
SLIDE 17

Poole et al (2016) Deep neural networks

slide-18
SLIDE 18

Problem! $D_l(\mathbf{x}, \mathbf{x}') \to \bar{D}$, $\bar{D} = K(\bar{D})$: equidistance property

slide-19
SLIDE 19

Shuttering

Multiplicity

Dynamics of recurrent net

Dropout and backprop

slide-20
SLIDE 20

Multilayer Perceptrons

 

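$y = \sum_i v_i\, \varphi(\mathbf{w}_i \cdot \mathbf{x})$

$f(\mathbf{x}, \boldsymbol{\theta}) = \sum_i v_i\, \varphi(\mathbf{w}_i \cdot \mathbf{x})$

$\mathbf{x} = (x_1, x_2, \dots, x_n)$

$\boldsymbol{\theta} = (\mathbf{w}_1, \dots, \mathbf{w}_m; v_1, \dots, v_m)$

[Figure: network diagram with input $\mathbf{x}$, hidden units $\varphi(\mathbf{w} \cdot \mathbf{x})$, and output $y$]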

slide-21
SLIDE 21

Multilayer Perceptron

   

 

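$y = f(\mathbf{x}, \boldsymbol{\theta}) = \sum_i v_i\, \varphi(\mathbf{w}_i \cdot \mathbf{x})$, $\boldsymbol{\theta} = (\mathbf{w}_1, \dots, \mathbf{w}_m; v_1, \dots, v_m)$

[Figure: the neuromanifold in the space of functions $S$]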

slide-22
SLIDE 22

singularities

slide-23
SLIDE 23

Geometry of singular model

 

$y = v\, \varphi(\mathbf{w} \cdot \mathbf{x}) + n$

$v\, |\mathbf{w}| = 0$
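One way to see why the set $v\,|\mathbf{w}| = 0$ is singular is to look at the Fisher information matrix of the single-unit model, which loses rank there. The sketch below assumes unit-variance Gaussian noise, so that the Fisher matrix is the average of $\nabla f\, \nabla f^\top$ over the inputs, and uses `tanh` as a stand-in activation. The function name and the sample sizes are hypothetical.

```python
import numpy as np


def fisher_matrix(w, v, X):
    """Sample Fisher matrix of y = v * tanh(w . x) + n, n ~ N(0, 1).

    Parameters are ordered as (w_1, ..., w_d, v); X has shape (N, d).
    """
    u = X @ w
    grad_w = (v / np.cosh(u) ** 2)[:, None] * X   # df/dw = v * phi'(u) * x
    grad_v = np.tanh(u)[:, None]                  # df/dv = phi(u)
    jac = np.hstack([grad_w, grad_v])
    return jac.T @ jac / X.shape[0]


rng = np.random.default_rng(0)
X = rng.normal(size=(500, 3))
w_regular = np.array([1.0, -0.5, 0.3])

# eigvalsh returns eigenvalues in ascending order.
eig_regular = np.linalg.eigvalsh(fisher_matrix(w_regular, 0.8, X))
eig_v_zero = np.linalg.eigvalsh(fisher_matrix(w_regular, 0.0, X))
eig_w_zero = np.linalg.eigvalsh(fisher_matrix(np.zeros(3), 0.8, X))
print(eig_regular)
print(eig_v_zero)   # v = 0: the w directions do not change the output
print(eig_w_zero)   # w = 0: the v direction does not change the output
```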

slide-24
SLIDE 24

     

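Natural Gradient Stochastic Descent

$\boldsymbol{\theta}_{t+1} = \boldsymbol{\theta}_t - \eta_t\, G^{-1}(\boldsymbol{\theta}_t)\, \nabla l(\mathbf{x}_t, y_t; \boldsymbol{\theta}_t)$

$G$: Fisher Information Matrix

invariant; steepest descent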
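The invariance of the natural gradient can be illustrated on a toy problem. The sketch below uses a linear model with unit-variance Gaussian noise (a deliberate simplification, not the multilayer perceptron of these slides), writes it in two coordinate systems related by $\boldsymbol{\theta} = M \boldsymbol{\psi}$, and takes one step in each. The matrix `M`, the learning rate, and all names are assumptions for the example.

```python
import numpy as np


def natural_gradient_step(theta, grad, fisher, lr):
    """One update theta - lr * G^{-1} grad (no damping; G must be invertible)."""
    return theta - lr * np.linalg.solve(fisher, grad)


def grad_and_fisher(theta, X, y):
    """Toy model y = theta . x + unit-variance Gaussian noise.

    Returns the gradient of the average loss 0.5 * (y - theta . x)^2 and the
    Fisher information matrix E[x x^T], both estimated on the sample.
    """
    residual = y - X @ theta
    grad = -X.T @ residual / len(y)
    fisher = X.T @ X / len(y)
    return grad, fisher


rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
y = X @ np.array([1.0, -2.0, 0.5]) + rng.normal(size=200)

# The same model in two coordinate systems: theta = M @ psi.
M = rng.normal(size=(3, 3)) + 3.0 * np.eye(3)
theta = np.ones(3)
psi = np.linalg.solve(M, theta)
lr = 0.5

g_theta, G_theta = grad_and_fisher(theta, X, y)
g_psi, G_psi = grad_and_fisher(psi, X @ M, y)   # features seen from psi

theta_nat = natural_gradient_step(theta, g_theta, G_theta, lr)
psi_nat = natural_gradient_step(psi, g_psi, G_psi, lr)
theta_gd = theta - lr * g_theta                 # ordinary gradient step
psi_gd = psi - lr * g_psi

print(np.allclose(M @ psi_nat, theta_nat))      # same point in both systems
print(np.allclose(M @ psi_gd, theta_gd))        # ordinary gradient: differs
```

The natural gradient step lands on the same model in both parameterisations, while the ordinary gradient step depends on the choice of coordinates.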

   

    

slide-25
SLIDE 25

model: 2 hidden neurons

         

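$f(\mathbf{x}, \boldsymbol{\theta}) = w_1\, \varphi(\mathbf{J}_1 \cdot \mathbf{x}) + w_2\, \varphi(\mathbf{J}_2 \cdot \mathbf{x})$

$y = f(\mathbf{x}, \boldsymbol{\theta}) + \varepsilon$

$\varphi(u) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{u} e^{-t^2/2}\, dt$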

slide-26
SLIDE 26

Singular Region in Parameter Space

  

    

     

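$R(w, \mathbf{J}) = \{\mathbf{J}_1 = \mathbf{J}_2 = \mathbf{J},\ w_1 + w_2 = w\} \cup \{w_1 = 0,\ \mathbf{J}_2 = \mathbf{J},\ w_2 = w\} \cup \{w_2 = 0,\ \mathbf{J}_1 = \mathbf{J},\ w_1 = w\}$

$f(\mathbf{x}, \boldsymbol{\theta}) = w_1\, \varphi(\mathbf{J}_1 \cdot \mathbf{x}) + w_2\, \varphi(\mathbf{J}_2 \cdot \mathbf{x})$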
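The defining property of the singular region is that every parameter in it realises the same single-unit function $w\, \varphi(\mathbf{J} \cdot \mathbf{x})$. The sketch below checks this for the three pieces of the region (coinciding units, $w_1 = 0$, $w_2 = 0$). It uses `tanh` as a stand-in activation, since the identity does not depend on the particular $\varphi$; the sample values and names are hypothetical.

```python
import numpy as np


def f(X, J1, J2, w1, w2, phi=np.tanh):
    """Two-hidden-unit model evaluated on a batch X of shape (N, d)."""
    return w1 * phi(X @ J1) + w2 * phi(X @ J2)


rng = np.random.default_rng(0)
X = rng.normal(size=(50, 4))
J = rng.normal(size=4)
J_free = rng.normal(size=4)   # arbitrary weight vector of a silent unit
w = 1.3

reduced = w * np.tanh(X @ J)                  # single-unit function
on_overlap = f(X, J, J, 0.4, w - 0.4)         # J1 = J2 = J, w1 + w2 = w
on_w1_zero = f(X, J_free, J, 0.0, w)          # w1 = 0, J2 = J, w2 = w
on_w2_zero = f(X, J, J_free, w, 0.0)          # w2 = 0, J1 = J, w1 = w
off_region = f(X, J, J_free, 0.4, w - 0.4)    # generic point outside R

print(np.allclose(on_overlap, reduced),
      np.allclose(on_w1_zero, reduced),
      np.allclose(on_w2_zero, reduced),
      np.allclose(off_region, reduced))
```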

slide-27
SLIDE 27

Coordinate transformation

 

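$(\mathbf{J}_1, \mathbf{J}_2, w_1, w_2) \to (\mathbf{v}, \mathbf{u}, w, z)$

$\mathbf{v} = \dfrac{w_1 \mathbf{J}_1 + w_2 \mathbf{J}_2}{w_1 + w_2}$, $\mathbf{u} = \mathbf{J}_2 - \mathbf{J}_1$, $w = w_1 + w_2$, $z = \dfrac{w_1 - w_2}{w_1 + w_2}$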
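A small sketch of the change of coordinates, assuming it reads $\mathbf{v} = (w_1 \mathbf{J}_1 + w_2 \mathbf{J}_2)/(w_1 + w_2)$, $\mathbf{u} = \mathbf{J}_2 - \mathbf{J}_1$, $w = w_1 + w_2$, $z = (w_1 - w_2)/(w_1 + w_2)$. The sign convention of $z$ and the function names are assumptions; the inverse map is derived from these formulas and is not taken from the slides.

```python
import numpy as np


def to_singular_coords(J1, J2, w1, w2):
    """(J1, J2, w1, w2) -> (v, u, w, z); requires w1 + w2 != 0."""
    w = w1 + w2
    v = (w1 * J1 + w2 * J2) / w
    u = J2 - J1
    z = (w1 - w2) / w
    return v, u, w, z


def from_singular_coords(v, u, w, z):
    """Inverse map, obtained by solving the four defining equations."""
    w1 = 0.5 * w * (1.0 + z)
    w2 = 0.5 * w * (1.0 - z)
    J1 = v - 0.5 * (1.0 - z) * u
    J2 = v + 0.5 * (1.0 + z) * u
    return J1, J2, w1, w2


rng = np.random.default_rng(0)
J1, J2 = rng.normal(size=(2, 4))
w1, w2 = 0.7, 0.5

coords = to_singular_coords(J1, J2, w1, w2)
J1_back, J2_back, w1_back, w2_back = from_singular_coords(*coords)
print(np.allclose(J1_back, J1), np.allclose(J2_back, J2), w1_back, w2_back)

# The three pieces of the singular region in the new coordinates.
print(to_singular_coords(J1, J1, w1, w2)[1])     # coinciding units: u = 0
print(to_singular_coords(J1, J2, 0.0, 1.2)[3])   # w1 = 0: z = -1
print(to_singular_coords(J1, J2, 1.2, 0.0)[3])   # w2 = 0: z = +1
```

Under this convention the singular region becomes the union of $\mathbf{u} = 0$ and $z = \pm 1$, independent of the sign chosen for $z$.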

slide-28
SLIDE 28

Singular Region

     

$R(w, \mathbf{J}) = \{\mathbf{u} = 0\} \cup \{z = \pm 1\}$

slide-29
SLIDE 29

Milnor attractor

slide-30
SLIDE 30

Topology of singular R

 

   

2 2 1 2 3 2

blow-down coordinates , , 1 , 1 , , 1

n

c z u u c z z u S             : = e u u e e u 

slide-31
SLIDE 31

Dynamic vector fields: Redundant case

slide-32
SLIDE 32