About me: ENS -> MVA -> FAIR (engineer) -> FAIR (PhD, 3rd year). About my PhD: interested in sign matrices and tensors (graphs / multi-graphs); observe a few entries, predict the remaining edges; factorization.


  1. Learning: Gradient Descent. Back to minimizing. Problem: how do we compute the gradient $\nabla_\theta \frac{1}{n} \sum_{i=1}^{n} \ell(f(x_i; \theta), y_i)$? Large $n$ -> Stochastic Gradient Descent. Complicated function (neural net) -> BackProp.

  2. Learning: Gradient Descent. Back to minimizing. Problem: how do we compute the gradient $\nabla_\theta \frac{1}{n} \sum_{i=1}^{n} \ell(f(x_i; \theta), y_i)$? Large $n$ -> Stochastic Gradient Descent.
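As a point of comparison before the next slides kill the sum over $n$, here is a minimal numpy sketch of the full-batch gradient, assuming a hypothetical linear model $f(x;\theta) = x^\top \theta$ with squared loss (my choice, not from the slides):

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 100, 5
X, y = rng.normal(size=(n, d)), rng.normal(size=n)   # n examples (x_i, y_i)
theta = np.zeros(d)

# Full-batch gradient of (1/n) * sum_i 0.5 * (x_i @ theta - y_i)^2:
# every update touches all n examples, which is what hurts for large n.
grad = X.T @ (X @ theta - y) / n
```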

  3. Learning: Stochastic Gradient Descent. Killing $n$: $\nabla_\theta \frac{1}{n} \sum_{i=1}^{n} \ell(f(x_i; \theta), y_i) = \frac{1}{n} \sum_{i=1}^{n} \nabla_\theta \ell(f(x_i; \theta), y_i) \approx \nabla_\theta \ell(f(x_j; \theta), y_j)$, a single function evaluation.

  4. Learning: Stochastic Gradient Descent. Killing $n$: the gradient of the average, $\nabla_\theta \frac{1}{n} \sum_{i=1}^{n} \ell(f(x_i; \theta), y_i) = \frac{1}{n} \sum_{i=1}^{n} \nabla_\theta \ell(f(x_i; \theta), y_i) \approx \nabla_\theta \ell(f(x_j; \theta), y_j)$.

  5. Learning: Stochastic Gradient Descent. Killing $n$: the gradient of the average is the average of the gradients, $\nabla_\theta \frac{1}{n} \sum_{i=1}^{n} \ell(f(x_i; \theta), y_i) = \frac{1}{n} \sum_{i=1}^{n} \nabla_\theta \ell(f(x_i; \theta), y_i) \approx \nabla_\theta \ell(f(x_j; \theta), y_j)$.

  6. Learning: Stochastic Gradient Descent. Killing $n$: the gradient of the average is the average of the gradients, $\nabla_\theta \frac{1}{n} \sum_{i=1}^{n} \ell(f(x_i; \theta), y_i) = \frac{1}{n} \sum_{i=1}^{n} \nabla_\theta \ell(f(x_i; \theta), y_i) \approx \nabla_\theta \ell(f(x_j; \theta), y_j)$, where the approximation holds in expectation, for uniform $j$.
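A minimal numpy sketch of this identity under the same hypothetical linear-model/squared-loss setup as above (an assumption, not from the slides): the gradient of the average equals the average of the per-example gradients, and the single-sample gradient at a uniform $j$ matches it in expectation:

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 100, 5
X, y = rng.normal(size=(n, d)), rng.normal(size=n)
theta = rng.normal(size=d)

def grad_one(theta, x, y):
    # Gradient of the single-example loss 0.5 * (x @ theta - y)^2.
    return (x @ theta - y) * x

# The gradient of the average equals the average of the gradients.
grad_avg = np.mean([grad_one(theta, X[i], y[i]) for i in range(n)], axis=0)

# A single-sample gradient at uniform j matches it in expectation
# (estimated here by averaging many independent draws of j).
js = rng.integers(0, n, size=20000)
est = np.mean([grad_one(theta, X[j], y[j]) for j in js], axis=0)
print(np.allclose(est, grad_avg, atol=0.05))   # True (up to sampling noise)
```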

  7. Learning: Stochastic Gradient Descent. Killing $n$. For some number of iterations: pick a random example $(x_j, y_j)$ and take a gradient step $\theta_{n+1} \leftarrow \theta_n - \eta \, \nabla_\theta \ell(f(x_j; \theta_n), y_j)$, where $\eta$ is the learning rate.
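A sketch of the SGD loop this slide describes, again assuming a linear model with squared loss; the learning rate and iteration count are arbitrary choices:

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 100, 5
X, y = rng.normal(size=(n, d)), rng.normal(size=n)
theta = np.zeros(d)
eta = 0.05                                    # learning rate (eta)

for _ in range(1000):                         # "for some number of iterations"
    j = rng.integers(0, n)                    # pick a random example (x_j, y_j)
    grad = (X[j] @ theta - y[j]) * X[j]       # grad of 0.5*(x_j @ theta - y_j)^2
    theta = theta - eta * grad                # theta_{n+1} <- theta_n - eta * grad
```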

  8. Learning: Back Propagation. Computing the gradient. Problem: how do we compute the gradient $\nabla_\theta \frac{1}{n} \sum_{i=1}^{n} \ell(f(x_i; \theta), y_i)$? Complicated function (neural net) -> BackProp.

  9. Learning: BackProp. Computing the gradient. Hidden layer $i$: $f_i(x) = \sigma(A_i x + b_i)$.

  10. Learning: BackProp. Computing the gradient. Hidden layer $i$: $f_i(x) = \sigma(A_i x + b_i)$. Complete neural network: $f = f_h(f_{h-1}(f_{h-2}(\dots))) = (f_h \circ f_{h-1} \circ \dots \circ f_1)(x)$.

  11. Learning: BackProp. Computing the gradient. Hidden layer $i$: $f_i(x) = \sigma(A_i x + b_i)$. Complete neural network: $f = (f_h \circ f_{h-1} \circ \dots \circ f_1)(x)$. So how do we get $\nabla_\theta \ell(f(x_i; \theta), y_i)$?
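A sketch of the composition above, with the sigmoid as a concrete stand-in for $\sigma$ and arbitrary layer widths (both assumptions):

```python
import numpy as np

def sigma(z):
    # Sigmoid as a concrete stand-in for the nonlinearity sigma.
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)
sizes = [4, 8, 8, 2]                  # layer widths (arbitrary choices)
params = [(rng.normal(size=(m, k)), rng.normal(size=m))
          for k, m in zip(sizes[:-1], sizes[1:])]   # (A_i, b_i) per layer

def f(x):
    # f = f_h o f_{h-1} o ... o f_1, with f_i(x) = sigma(A_i @ x + b_i).
    for A, b in params:
        x = sigma(A @ x + b)
    return x

print(f(rng.normal(size=4)))          # a 2-dimensional output
```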

  12. Learning: BackProp. Computing the gradient. Chain rule: $\frac{\partial f}{\partial x} = \frac{\partial f}{\partial y} \frac{\partial y}{\partial x}$.
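A quick numerical sanity check of the chain rule, for the arbitrary choice $f(y) = \sin(y)$, $y(x) = x^2$:

```python
import numpy as np

# Chain rule check for f(y) = sin(y), y(x) = x**2 at x = 1.3.
x = 1.3
analytic = np.cos(x**2) * 2 * x       # (df/dy) * (dy/dx)
eps = 1e-6
numeric = (np.sin((x + eps)**2) - np.sin((x - eps)**2)) / (2 * eps)
print(np.isclose(analytic, numeric))  # True
```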

  13. Learning: BackProp. Computing the gradient. The network as a chain: $x \to y_1(x) \to y_2(y_1) \to \dots \to y_{h-1}(y_{h-2}) \to y_h(y_{h-1}) \to \ell(y_h)$, with parameters $\theta_1, \theta_2, \dots, \theta_{h-1}, \theta_h$ attached to the corresponding layers.

  14. Learning: BackProp. Computing the gradient $\nabla_{\theta_h} \ell(y_h)$: the loss at the end of the chain, $\ell(y_h)$, depends on the last layer's parameters $\theta_h$.

  15. Learning: BackProp. Computing the gradient $\nabla_{\theta_h} \ell(y_h)$ with respect to the last layer's parameters $\theta_h$.

  16. Learning: BackProp. Computing the gradient $\nabla_{\theta_h} \ell(y_h)$. Chain rule: $\frac{\partial \ell(y_h)}{\partial \theta_{h,i}} = \frac{\partial \ell(y_h)}{\partial y_h} \frac{\partial y_h}{\partial \theta_{h,i}}$.

  17. Learning: BackProp. Computing the gradient $\nabla_{\theta_h} \ell(y_h)$: in $\frac{\partial \ell(y_h)}{\partial \theta_{h,i}} = \frac{\partial \ell(y_h)}{\partial y_h} \frac{\partial y_h}{\partial \theta_{h,i}}$, the first factor doesn't depend on the current layer; it only depends on $\ell$.

  18. Learning: BackProp. Computing the gradient $\nabla_{\theta_h} \ell(y_h)$: in $\frac{\partial \ell(y_h)}{\partial \theta_{h,i}} = \frac{\partial \ell(y_h)}{\partial y_h} \frac{\partial y_h}{\partial \theta_{h,i}}$, the second factor only depends on the current layer.
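To make the two factors concrete, a sketch for a single sigmoid layer $y_h = \sigma(A_h y_{h-1} + b_h)$ with a squared-error loss (both concrete choices of mine, not from the slides): the first factor comes from the loss alone, the second from the layer alone:

```python
import numpy as np

rng = np.random.default_rng(0)
y_prev = rng.normal(size=3)                          # y_{h-1}, from the layer below
A, b = rng.normal(size=(2, 3)), rng.normal(size=2)   # theta_h = (A_h, b_h)
target = rng.normal(size=2)

y_h = 1.0 / (1.0 + np.exp(-(A @ y_prev + b)))        # y_h = sigma(A_h y_{h-1} + b_h)

# Factor 1: dl/dy_h -- depends only on the loss, here l(y_h) = 0.5*||y_h - target||^2.
dl_dy = y_h - target

# Factor 2: dy_h/d(theta_h) -- depends only on the current layer.
dy_dz = y_h * (1.0 - y_h)                            # sigmoid derivative w.r.t. z

# Combine via the chain rule.
delta = dl_dy * dy_dz
dl_dA = np.outer(delta, y_prev)                      # dl/dA_h
dl_db = delta                                        # dl/db_h
```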

  19. Learning: BackProp. Computing the gradient $\nabla_{\theta_{h-1}} \ell(y_h) = \Phi_{h-1}(\theta_{h-1}, \nabla_{y_{h-1}} \ell(y_h))$.

  20. Learning: BackProp. Computing the gradient $\nabla_{\theta_{h-1}} \ell(y_h) = \Phi_{h-1}(\theta_{h-1}, \nabla_{y_{h-1}} \ell(y_h))$, where $\Phi_{h-1}$ depends on the current layer's structure.

  21. Learning: BackProp. Computing the gradient $\nabla_{\theta_{h-1}} \ell(y_h) = \Phi_{h-1}(\theta_{h-1}, \nabla_{y_{h-1}} \ell(y_h))$, where $\Phi_{h-1}$ depends on the current layer's structure and $\theta_{h-1}$ is known.

  22. Learning: BackProp. Computing the gradient $\nabla_{\theta_{h-1}} \ell(y_h) = \Phi_{h-1}(\theta_{h-1}, \nabla_{y_{h-1}} \ell(y_h))$, where $\Phi_{h-1}$ depends on the current layer's structure, $\theta_{h-1}$ is known, and $\nabla_{y_{h-1}} \ell(y_h)$ has already been computed.
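A sketch of one such $\Phi$ step for the sigmoid layer used in the earlier sketches (my concrete choice): it takes the layer's known parameters and the already-computed gradient with respect to the layer's output, and returns the parameter gradients plus the gradient to pass further down:

```python
import numpy as np

def phi(A, y_in, y_out, grad_out):
    """One backward step Phi for a sigmoid layer y_out = sigma(A @ y_in + b):
    A is known; grad_out (= grad of the loss w.r.t. y_out) has already been
    computed by the layer above."""
    delta = grad_out * y_out * (1.0 - y_out)   # back through the sigmoid
    grad_A = np.outer(delta, y_in)             # grad w.r.t. the A part of theta
    grad_b = delta                             # grad w.r.t. the b part of theta
    grad_in = A.T @ delta                      # grad w.r.t. y_in, passed downward
    return grad_A, grad_b, grad_in
```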

  23. Learning: BackProp. It's backwards! First compute $\nabla_{y_h}(\ell(y_h))$.

  24. Learning: BackProp. It's backwards! First compute $\nabla_{y_h}(\ell(y_h))$, then $\nabla_{\theta_h}(\ell(y_h))$, and continue down the chain.
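Putting the pieces together, a sketch of the full backward sweep over the sigmoid network from the earlier sketches (same assumed setup): the loop literally runs backwards, starting from $\nabla_{y_h} \ell$ and reusing each layer's cached activations:

```python
import numpy as np

def sigma(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)
sizes = [4, 8, 8, 2]
params = [(rng.normal(size=(m, k)), rng.normal(size=m))
          for k, m in zip(sizes[:-1], sizes[1:])]
x, target = rng.normal(size=4), rng.normal(size=2)

# Forward pass, caching every intermediate y_i for the backward pass.
ys = [x]
for A, b in params:
    ys.append(sigma(A @ ys[-1] + b))

# Backward pass: start from nabla_{y_h} l for l(y_h) = 0.5*||y_h - target||^2,
# then walk the layers in reverse, reusing the gradient from the layer above.
grad_y = ys[-1] - target
grads = []
for (A, b), y_in, y_out in zip(reversed(params), reversed(ys[:-1]), reversed(ys[1:])):
    delta = grad_y * y_out * (1.0 - y_out)
    grads.append((np.outer(delta, y_in), delta))   # nabla_{theta_i} l
    grad_y = A.T @ delta                           # nabla_{y_{i-1}} l, passed down
grads.reverse()                                    # reorder as layer 1 ... h
```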
