SLIDE 23 3/3/2020 23
- Reset gates $r_t$ control which parts of the state get used to compute the
next target state
- This introduces an additional nonlinear effect in the relationship between
the past state and the future state
Gated Recurrent Unit (GRU)
$h_t = (1 - z_t) \odot h_{t-1} + z_t \odot \tilde{h}_t$
$z_t = \sigma(W_z \cdot [h_{t-1}, x_t] + b_z)$
$r_t = \sigma(W_r \cdot [h_{t-1}, x_t] + b_r)$
$\tilde{h}_t = \tanh(W \cdot [r_t \odot h_{t-1}, x_t] + b)$
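The GRU update above can be sketched as a single time step in plain Python. This is a minimal illustration, not a reference implementation: the names `gru_step`, `init_params`, `sigmoid`, and `affine`, the random initialization, and the tiny sizes are all assumptions made for the example. Here $z_t$ plays the role of the update gate and $r_t$ the reset gate, and `[h_{t-1}, x_t]` is treated as vector concatenation.

```python
import math
import random


def sigmoid(v):
    """Element-wise logistic function on a list of floats."""
    return [1.0 / (1.0 + math.exp(-a)) for a in v]


def affine(W, v, b):
    """Compute W @ v + b, with W given as a list of rows."""
    return [sum(w * a for w, a in zip(row, v)) + bias for row, bias in zip(W, b)]


def init_params(n_hidden, n_input, seed=0):
    """Hypothetical helper: small random weights and zero biases."""
    rng = random.Random(seed)
    cols = n_hidden + n_input

    def matrix():
        return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(n_hidden)]

    # Order: W_z, b_z, W_r, b_r, W, b
    return (matrix(), [0.0] * n_hidden,
            matrix(), [0.0] * n_hidden,
            matrix(), [0.0] * n_hidden)


def gru_step(h_prev, x, params):
    """One GRU step: returns h_t given h_{t-1} and x_t (lists of floats)."""
    Wz, bz, Wr, br, W, b = params
    hx = h_prev + x                                     # concatenation [h_{t-1}, x_t]
    z = sigmoid(affine(Wz, hx, bz))                     # update gate z_t
    r = sigmoid(affine(Wr, hx, br))                     # reset gate r_t
    gated = [ri * hi for ri, hi in zip(r, h_prev)] + x  # [r_t * h_{t-1}, x_t]
    h_tilde = [math.tanh(a) for a in affine(W, gated, b)]  # candidate state
    # Interpolate between the old state and the candidate state.
    return [(1.0 - zi) * hi + zi * ci for zi, hi, ci in zip(z, h_prev, h_tilde)]


# Run a short input sequence through the cell.
params = init_params(n_hidden=2, n_input=1)
h = [0.0, 0.0]
for x_t in ([1.0], [0.5], [-1.0]):
    h = gru_step(h, x_t, params)
print(h)
```

Two limiting cases follow directly from the equations: when $z_t \approx 0$ the state is copied forward unchanged, and when $r_t \approx 0$ the candidate state is computed from $x_t$ alone, which is how the reset gate lets the cell drop irrelevant history.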
Comparison LSTM and GRU
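LSTM
$f_t = \sigma(W_f \cdot [h_{t-1}, x_t] + b_f)$
$i_t = \sigma(W_i \cdot [h_{t-1}, x_t] + b_i)$
$\tilde{C}_t = \tanh(W_C \cdot [h_{t-1}, x_t] + b_C)$
$C_t = f_t \odot C_{t-1} + i_t \odot \tilde{C}_t$
$o_t = \sigma(W_o \cdot [h_{t-1}, x_t] + b_o)$
$h_t = o_t \odot \tanh(C_t)$
GRU
$z_t = \sigma(W_z \cdot [h_{t-1}, x_t] + b_z)$
$r_t = \sigma(W_r \cdot [h_{t-1}, x_t] + b_r)$
$\tilde{h}_t = \tanh(W \cdot [r_t \odot h_{t-1}, x_t] + b)$
$h_t = (1 - z_t) \odot h_{t-1} + z_t \odot \tilde{h}_t$
[Figure: LSTM and GRU cell diagrams, labeled with $h_{t-1}$, $C_{t-1}$, $x_t$, $C_t$, $h_t$]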
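To make the comparison concrete, here is a minimal sketch of one LSTM step in plain Python, plus a parameter count for both cells. It assumes the standard LSTM with forget, input, and output gates and a separate cell state $C_t$; the names `lstm_step` and `param_count` and the helper functions are hypothetical and chosen for illustration only.

```python
import math


def sigmoid(v):
    """Element-wise logistic function on a list of floats."""
    return [1.0 / (1.0 + math.exp(-a)) for a in v]


def affine(W, v, b):
    """Compute W @ v + b, with W given as a list of rows."""
    return [sum(w * a for w, a in zip(row, v)) + bias for row, bias in zip(W, b)]


def lstm_step(h_prev, c_prev, x, params):
    """One LSTM step: returns (h_t, C_t) given h_{t-1}, C_{t-1} and x_t."""
    Wf, bf, Wi, bi, Wc, bc, Wo, bo = params
    hx = h_prev + x                                   # concatenation [h_{t-1}, x_t]
    f = sigmoid(affine(Wf, hx, bf))                   # forget gate
    i = sigmoid(affine(Wi, hx, bi))                   # input gate
    c_tilde = [math.tanh(a) for a in affine(Wc, hx, bc)]  # candidate cell state
    c = [fi * ci + ii * cti for fi, ci, ii, cti in zip(f, c_prev, i, c_tilde)]
    o = sigmoid(affine(Wo, hx, bo))                   # output gate
    h = [oi * math.tanh(ci) for oi, ci in zip(o, c)]
    return h, c


def param_count(n_hidden, n_input, n_blocks):
    """Each block is one weight matrix of shape n_hidden x (n_hidden + n_input)
    plus one bias vector of length n_hidden."""
    return n_blocks * (n_hidden * (n_hidden + n_input) + n_hidden)


# The LSTM uses 4 weight/bias blocks (f, i, C, o); the GRU uses 3 (z, r, candidate).
print("LSTM:", param_count(128, 64, 4))
print("GRU: ", param_count(128, 64, 3))
```

Under the concatenated formulation used on these slides, the GRU carries three weight blocks against the LSTM's four, so for the same hidden size it has about three quarters of the parameters. It also keeps a single state $h_t$, whereas the LSTM passes both $h_t$ and $C_t$ to the next step.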