Recurrent Neural Network
Xiaogang Wang
xgwang@ee.cuhk.edu.hk
February 26, 2019
Xiaogang Wang (CUHK) Recurrent Neural Network February 26, 2019 1 / 52
Outline
1 Recurrent neural networks
  ◮ Recurrent neural networks
  ◮ BP on RNN
  ◮ Variants
◮ Complexity would grow without limit as the number of observations increases.
◮ Pick h1 at random from the distribution P(h1). Pick x1 at random from the distribution p(x1|h1).
◮ For t = 2 to T:
  ⋆ Choose ht at random from the distribution p(ht|ht−1)
  ⋆ Choose xt at random from the distribution p(xt|ht)
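The ancestral-sampling procedure above can be sketched for a toy discrete hidden Markov model; the particular distributions below are made-up illustrative values, not anything from the slides.

```python
import numpy as np

rng = np.random.default_rng(0)

P_h1 = np.array([0.6, 0.4])          # P(h1) over 2 hidden states
P_trans = np.array([[0.7, 0.3],      # p(ht | ht-1)
                    [0.2, 0.8]])
P_emit = np.array([[0.9, 0.1],       # p(xt | ht) over 2 observation symbols
                   [0.3, 0.7]])

def sample_sequence(T):
    """Sample (h_1..h_T, x_1..x_T) by ancestral sampling."""
    h = [rng.choice(2, p=P_h1)]                  # pick h1 from P(h1)
    x = [rng.choice(2, p=P_emit[h[0]])]          # pick x1 from p(x1|h1)
    for t in range(1, T):
        h.append(rng.choice(2, p=P_trans[h[-1]]))  # ht ~ p(ht|ht-1)
        x.append(rng.choice(2, p=P_emit[h[-1]]))   # xt ~ p(xt|ht)
    return h, x

h, x = sample_sequence(5)
```

Each step only conditions on the previous hidden state, which is exactly what keeps the model's complexity fixed as T grows.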
Left: physical implementation of an RNN, seen as a circuit. The black square indicates a delay of one time step. Right: the same network seen as an unfolded flow graph, where each node is now associated with one particular time instance.
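A minimal sketch of the unfolded flow graph: the same weights are re-applied at every time step, and the circuit's one-step delay becomes the dependence of h[t] on h[t−1]. The dimensions and random weights below are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
D, H, O, T = 3, 4, 2, 6                 # input, hidden, output dims; steps
Wxh = rng.normal(0, 0.1, (H, D))        # shared across all time steps
Whh = rng.normal(0, 0.1, (H, H))
Why = rng.normal(0, 0.1, (O, H))

xs = rng.normal(size=(T, D))            # an input sequence x_1..x_T
h = np.zeros(H)                         # initial state h_0
ys = []
for t in range(T):                      # one graph node per time instance
    h = np.tanh(Wxh @ xs[t] + Whh @ h)  # h_t = tanh(W_xh x_t + W_hh h_{t-1})
    ys.append(Why @ h)                  # y_t = W_hy h_t
ys = np.stack(ys)
```

Note that only three weight matrices exist no matter how long the sequence is; unfolding duplicates the computation, not the parameters.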
The per-step losses L_t are summed over the sequence: L = ∑_{t=1}^{T} L_t.
A fixed-length context vector can be provided to the RNN:
◮ as an extra input at each time step, or
◮ as the initial state h0, or
◮ both
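The two conditioning options can be sketched as follows; the projection matrix Wch and all shapes here are my own illustrative assumptions, not the lecture's notation.

```python
import numpy as np

rng = np.random.default_rng(1)
D, H, T = 3, 4, 5
Wxh = rng.normal(0, 0.1, (H, D))
Whh = rng.normal(0, 0.1, (H, H))
Wch = rng.normal(0, 0.1, (H, H))   # projects the context into each update
c = rng.normal(size=H)             # fixed-size context vector

def run(xs, use_as_h0, use_as_input):
    h = c if use_as_h0 else np.zeros(H)        # option: c as initial state h0
    for x in xs:
        extra = Wch @ c if use_as_input else 0.0  # option: c as extra input
        h = np.tanh(Wxh @ x + Whh @ h + extra)
    return h

xs = rng.normal(size=(T, D))
h_init = run(xs, True, False)   # context as initial state only
h_both = run(xs, True, True)    # context as initial state and per-step input
```

Feeding c at every step keeps its influence from fading over long sequences, whereas using it only as h0 lets the recurrence gradually overwrite it.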
◮ In speech recognition, the correct interpretation of the current sound as a phoneme may depend on the next few phonemes because of co-articulation, and potentially even on the next few words.
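A bidirectional RNN addresses this by running one recurrence left-to-right and another right-to-left, then combining the two states at each step; the sketch below (with assumed shapes and concatenation as the combiner) shows how the representation at time t can then depend on both past and future inputs.

```python
import numpy as np

rng = np.random.default_rng(2)
D, H, T = 3, 4, 5
Wf, Uf = rng.normal(0, 0.1, (H, D)), rng.normal(0, 0.1, (H, H))  # forward
Wb, Ub = rng.normal(0, 0.1, (H, D)), rng.normal(0, 0.1, (H, H))  # backward
xs = rng.normal(size=(T, D))

hf = np.zeros(H)
forward = []
for t in range(T):                       # left-to-right pass
    hf = np.tanh(Wf @ xs[t] + Uf @ hf)
    forward.append(hf)

hb = np.zeros(H)
backward = [None] * T
for t in reversed(range(T)):             # right-to-left pass
    hb = np.tanh(Wb @ xs[t] + Ub @ hb)
    backward[t] = hb

# the state at step t sees x_1..x_t (forward) and x_t..x_T (backward)
states = np.stack([np.concatenate([f, b]) for f, b in zip(forward, backward)])
```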
∂L_T/∂θ = ∑_t (∂L_T/∂h_T) (∂h_T/∂h_t) (∂F_θ(h_{t−1}, x_t)/∂θ)
The factor ∂h_T/∂h_t is a product of the step Jacobians ∂h_{t+1}/∂h_t over the span. Hence ∂L_T/∂θ is a weighted sum of terms over spans T − t, with weights that become exponentially smaller for longer spans when the Jacobian norms are below 1.
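This exponential behavior of the span weights can be checked numerically for a toy recurrence h_t = tanh(W h_{t−1}): the norm of the running Jacobian product shrinks roughly geometrically with the span. The weights and dimensions below are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
H, T = 4, 20
W = rng.normal(0, 0.1, (H, H))      # small weights -> contracting dynamics

h = rng.normal(size=H)
J = np.eye(H)                       # running product of step Jacobians
norms = []
for t in range(T):
    h = np.tanh(W @ h)
    # Jacobian of one step h_t = tanh(W h_{t-1}): diag(1 - h_t^2) @ W
    J = (np.diag(1.0 - h**2) @ W) @ J
    norms.append(np.linalg.norm(J, 2))

# norms[k] decays roughly geometrically: long spans contribute tiny weights
```

With a spectral norm above 1 the same product can instead explode, which is the other half of the well-known vanishing/exploding-gradient problem.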
With such connections (via the Jacobians ∂h_t/∂h_{t−1}), the unfolded recurrent network now has paths through which gradients can flow over long time spans.
◮ For example, a video sequence may be composed of subsequences corresponding to different actions or scenes.
The candidate state is h̃_t = tanh(Wx_t + r_t ⊙ Uh_{t−1}), where the reset gate r_t controls how much of h_{t−1} enters the candidate. The update gate z_t then decides what to keep from the candidate h̃_t and what from the previous state h_{t−1}: h_t = z_t ⊙ h_{t−1} + (1 − z_t) ⊙ h̃_t.
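One GRU step with these gates can be sketched as follows; the weight values and dimensions are toy assumptions, and the gate matrices (Wr, Ur, Wz, Uz) follow the standard GRU parameterization.

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

rng = np.random.default_rng(0)
D, H = 3, 4
Wr, Ur = rng.normal(0, 0.1, (H, D)), rng.normal(0, 0.1, (H, H))
Wz, Uz = rng.normal(0, 0.1, (H, D)), rng.normal(0, 0.1, (H, H))
W,  U  = rng.normal(0, 0.1, (H, D)), rng.normal(0, 0.1, (H, H))

def gru_step(x, h_prev):
    r = sigmoid(Wr @ x + Ur @ h_prev)            # reset gate
    z = sigmoid(Wz @ x + Uz @ h_prev)            # update gate
    h_tilde = np.tanh(W @ x + r * (U @ h_prev))  # candidate state
    return z * h_prev + (1.0 - z) * h_tilde      # blend old and candidate

h = np.zeros(H)
for x in rng.normal(size=(5, D)):
    h = gru_step(x, h)
```

When z_t is close to 1, h_{t−1} passes through nearly unchanged; this near-identity path is what lets gradients survive over long spans.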
The model reads an input sentence “ABC” and produces “WXYZ” as the output sentence. The model stops making predictions after outputting the end-of-sentence token.
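The read-then-generate loop can be sketched as a toy encoder-decoder with greedy decoding; the vocabulary, weights, and the use of the EOS id as the decoder's start token are all simplifying assumptions of this sketch.

```python
import numpy as np

rng = np.random.default_rng(0)
V, D, H, EOS = 6, 3, 4, 0            # vocab size, embed dim, hidden dim, EOS id
E = rng.normal(0, 0.1, (V, D))       # token embeddings
We, Ue = rng.normal(0, 0.1, (H, D)), rng.normal(0, 0.1, (H, H))  # encoder
Wd, Ud = rng.normal(0, 0.1, (H, D)), rng.normal(0, 0.1, (H, H))  # decoder
Wo = rng.normal(0, 0.1, (V, H))      # output projection to vocab scores

def encode(src_ids):
    h = np.zeros(H)
    for i in src_ids:                # read the source sentence ("A B C")
        h = np.tanh(We @ E[i] + Ue @ h)
    return h                         # final state summarizes the input

def greedy_decode(h, max_len=10):
    out, tok = [], EOS               # feed EOS as the start token (assumption)
    for _ in range(max_len):
        h = np.tanh(Wd @ E[tok] + Ud @ h)
        tok = int(np.argmax(Wo @ h)) # greedy: pick the most likely token
        if tok == EOS:               # stop after the end-of-sentence token
            break
        out.append(tok)
    return out

ys = greedy_decode(encode([1, 2, 3]))
```

The encoder's final state is the only channel from input to output here, which is exactly the bottleneck that later attention mechanisms relax.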
The figure shows a 2-dimensional PCA projection of the LSTM hidden states obtained after processing the phrases in the figure. The phrases are clustered by meaning, which would be difficult to capture with a bag-of-words model. The figure clearly shows that the representations are sensitive to the order of words, while being fairly insensitive to the replacement of an active voice with a passive voice.