Gated Orthogonal Recurrent Units: On Learning to Forget
Li Jing, Çağlar Gülçehre, John Peurifoy, Yichen Shen, Max Tegmark, Marin Soljačić, Yoshua Bengio
Gradient Vanishing/Explosion Problem

During backpropagation through time,
the hidden-to-hidden Jacobian matrix is multiplied many times, so gradients shrink or grow exponentially. This makes RNNs hard to train.
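The effect is easy to see numerically: repeatedly applying a fixed transition matrix (standing in for the hidden-to-hidden Jacobian) rescales a vector's norm by roughly the matrix's spectral radius at every step. A minimal NumPy sketch (the scale factors 1.2 and 0.5 are illustrative, not from the paper):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 64
v = rng.standard_normal(n)
v /= np.linalg.norm(v)  # start from a unit vector

def repeated_apply(scale, steps=100):
    # Random transition matrix whose eigenvalues fill a disk of radius ~scale.
    W = scale * rng.standard_normal((n, n)) / np.sqrt(n)
    x = v.copy()
    for _ in range(steps):
        x = W @ x  # backprop through time multiplies the Jacobian like this
    return np.linalg.norm(x)

print(repeated_apply(1.2))  # norm blows up (gradient explosion)
print(repeated_apply(0.5))  # norm collapses toward 0 (gradient vanishing)
```

After 100 steps the two regimes differ by tens of orders of magnitude, which is exactly why plain RNNs struggle with long sequences.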
Unitary/Orthogonal Matrices

Unitary/orthogonal matrices preserve the norm of vectors. By enforcing the hidden-to-hidden transition matrix to be unitary/orthogonal, the norm of the gradient stays the same through backpropagation.
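Norm preservation is easy to verify numerically. A minimal sketch, drawing a random orthogonal matrix via QR decomposition:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 64
# QR decomposition of a Gaussian matrix yields a random orthogonal Q.
Q, _ = np.linalg.qr(rng.standard_normal((n, n)))

x = rng.standard_normal(n)
y = x.copy()
for _ in range(1000):  # even after 1000 applications...
    y = Q @ y
print(np.linalg.norm(x), np.linalg.norm(y))  # ...the norm is unchanged
```

Since every application of Q leaves the norm fixed, the product of any number of such Jacobians neither vanishes nor explodes.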
[GORU cell diagram: input x enters through Wx; the hidden state h passes through the orthogonal matrix U, gated by a reset gate r; the sum goes through a modReLU activation; an update gate z interpolates (z, 1−z) between the old hidden state and the new candidate to produce the output.]
Gated Orthogonal Recurrent Unit

Unitary/orthogonal matrices capture long-term dependencies; the gated mechanism enables forgetting.
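Putting the pieces together, here is a minimal NumPy sketch of one GORU step. The gating follows the GRU convention (a reset gate r inside the candidate state, an update gate z interpolating old and new state), with U the orthogonal hidden-to-hidden matrix and a real-valued modReLU activation. Parameter names and the exact sign conventions are illustrative, not taken verbatim from the paper:

```python
import numpy as np

def modrelu(a, b):
    # Real-valued modReLU: shift the magnitude by bias b, keep the sign,
    # and zero out entries whose shifted magnitude is negative.
    mag = np.abs(a) + b
    return np.sign(a) * np.maximum(mag, 0.0)

def goru_step(x, h, params):
    """One GORU step (sketch). params["U"] is the orthogonal
    hidden-to-hidden matrix; r and z are GRU-style gates."""
    sigmoid = lambda a: 1.0 / (1.0 + np.exp(-a))
    r = sigmoid(params["Wr"] @ x + params["Ur"] @ h + params["br"])  # reset gate
    z = sigmoid(params["Wz"] @ x + params["Uz"] @ h + params["bz"])  # update gate
    # Candidate state: orthogonal transition of h, gated by r, plus input.
    h_tilde = modrelu(params["W"] @ x + r * (params["U"] @ h) + params["b"],
                      params["b_mod"])
    # Interpolate between the old state and the candidate (the z / 1-z paths).
    return z * h + (1.0 - z) * h_tilde
```

Only the main transition matrix U is constrained to be orthogonal; the gate matrices are unconstrained, which is what lets the cell forget selectively.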
Synthetic Tasks: GORU is the only model that succeeds on all tasks.
Real Tasks: GORU outperforms all other models