How recurrent networks implement contextual processing in sentiment analysis
Niru Maheswaranathan and David Sussillo
Google Research, ICML 2020
@niru_m
Sentiment classification using RNNs
That restaurant
Maheswaranathan*, Williams* et al, NeurIPS 2019
Approximate line attractor dynamics explain most of the RNN’s performance
[Figure label: Line attractor]
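The line attractor picture can be illustrated with a toy linear system. This is a minimal sketch and not the trained network from the talk: it assumes a recurrent matrix with one slow direction (eigenvalue close to 1) and fast decay everywhere else, so valence inputs accumulate along the slow direction. The function name, the `leak` and `off_decay` values, and the per-token valences are all hypothetical.

```python
import numpy as np

def simulate_line_attractor(inputs, slow_dir, leak=0.99, off_decay=0.2):
    """Run h_t = A h_{t-1} + x_t, where A has eigenvalue `leak` along
    `slow_dir` and eigenvalue `off_decay` on the orthogonal complement."""
    n = slow_dir.shape[0]
    r = slow_dir / np.linalg.norm(slow_dir)
    A = off_decay * np.eye(n) + (leak - off_decay) * np.outer(r, r)
    h = np.zeros(n)
    states = []
    for x in inputs:
        h = A @ h + x
        states.append(h.copy())
    return np.array(states)

rng = np.random.default_rng(0)
n = 8
slow_dir = rng.normal(size=n)
unit = slow_dir / np.linalg.norm(slow_dir)

# Hypothetical valence per token; unlisted tokens are treated as neutral.
valence = {"awesome": 2.0, "like": 1.0, "awful": -2.0}
tokens = ["this", "movie", "is", "awesome", "i", "like", "it"]
inputs = [valence.get(tok, 0.0) * unit for tok in tokens]

states = simulate_line_attractor(inputs, slow_dir)
# Reading out along the slow direction gives a running sentiment estimate.
logits = states @ unit
```

Because the slow eigenvalue is close to 1, the evidence from "awesome" is still present several tokens later, which is the integration behavior an approximate line attractor provides.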
Probe sentences
[Figure: model prediction (logit) vs. time (t) for three probe sentences]
Baseline: "This movie is awesome. I like it."
Negation: "This movie is not awesome ..."
Intensifier: "This movie is extremely awesome. I definitely like it."
Data-driven method to identify contextual inputs
Analysis of the strength and timing of modifier effects
Experiments that demonstrate the identified mechanisms are necessary and sufficient for RNN performance
Modifier token
[Figure: change in input Jacobian, $\|\Delta J^{\mathrm{inp}}\|_F$ (color scale from 0.00 to 0.30), plotted over modifier component 1 and modifier component 2; labeled tokens include "not", "extremely", "never", and "but"]
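The quantity on the color scale, the Frobenius norm of the change in the input Jacobian, can be sketched for a simple recurrent cell. The talk's network is a GRU; the code below swaps in a vanilla tanh RNN so the Jacobian has a closed form, and it assumes the change is measured between the state reached after a modifier token and a baseline state. All parameter values and names here are hypothetical.

```python
import numpy as np

def input_jacobian(h_prev, x, W, U, b):
    """Jacobian of h_next = tanh(W h_prev + U x + b) with respect to x."""
    h_next = np.tanh(W @ h_prev + U @ x + b)
    # d tanh(z)/dz = 1 - tanh(z)^2, applied row-wise to U.
    return (1.0 - h_next ** 2)[:, None] * U

def delta_jinp_norm(h_modified, h_baseline, x, W, U, b):
    """Frobenius norm of the change in input Jacobian between two states."""
    dJ = input_jacobian(h_modified, x, W, U, b) - input_jacobian(h_baseline, x, W, U, b)
    return np.linalg.norm(dJ, ord="fro")

rng = np.random.default_rng(0)
n, d = 6, 4                        # hidden size, input (embedding) size
W = 0.5 * rng.normal(size=(n, n))  # toy recurrent weights
U = 0.5 * rng.normal(size=(n, d))  # toy input weights
b = np.zeros(n)
x = np.zeros(d)                    # expansion point for the input

h_base = np.zeros(n)               # stand-in for a state on the attractor
h_mod = rng.normal(size=n)         # stand-in for the state after a modifier

change = delta_jinp_norm(h_mod, h_base, x, W, U, b)
```

A token that moves the state to a region where `change` is large alters how the next input is processed, which is one way to flag it as a candidate modifier.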
[Figure: state-space trajectories for the modifier tokens "not" and "extremely"; axes: principal component #1 and modifier component #1]
[Figure, panels (a) and (b): distance from line attractor vs. time (t), shown alongside the principal component view of "extremely" and "not"; annotations: "not" 2 tokens, "extremely" 1 token]
[Figure labels: Line attractor, Modifier subspace]
Bag of Words (Baseline): 93.6%
Augmented Bag-of-Words (includes modifier effects): 95.5%
RNN (GRU): 95.8%
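The difference between the two bag-of-words rows can be sketched in a few lines. This is an illustrative guess at the idea, not the model evaluated in the talk: it assumes each modifier rescales the valence of the tokens that follow it for a fixed number of tokens. The valences, scales, and spans are hypothetical hand-picked values rather than learned parameters.

```python
def bag_of_words_logit(tokens, valence):
    """Plain bag of words: sum per-token valences, ignoring order."""
    return sum(valence.get(tok, 0.0) for tok in tokens)

def augmented_bow_logit(tokens, valence, modifiers):
    """Bag of words where each modifier rescales the valence of the
    tokens that follow it. `modifiers` maps token -> (scale, span)."""
    logit = 0.0
    active = []  # (scale, tokens remaining) for modifiers still in effect
    for tok in tokens:
        scale = 1.0
        for s, _ in active:
            scale *= s
        logit += scale * valence.get(tok, 0.0)
        # Age the active modifiers and drop the ones that have expired.
        active = [(s, rem - 1) for s, rem in active if rem > 1]
        if tok in modifiers:
            active.append(modifiers[tok])
    return logit

# Hypothetical weights for illustration only.
valence = {"awesome": 2.0, "like": 1.0}
modifiers = {"not": (-1.0, 2), "extremely": (1.5, 1)}

baseline = "this movie is awesome".split()
negation = "this movie is not awesome".split()
intensifier = "this movie is extremely awesome".split()

plain_scores = [bag_of_words_logit(s, valence) for s in (baseline, negation, intensifier)]
augmented_scores = [augmented_bow_logit(s, valence, modifiers) for s in (baseline, negation, intensifier)]
```

The plain model scores all three sentences identically because it ignores order, while the augmented model flips the sign after "not" and boosts the magnitude after "extremely".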
Original network vs. perturbed network
[Figure: model prediction (logit) vs. time (t) for the baseline, intensifier, and negation probe sentences; panels: full network, project out of modifier subspace, project out of random subspace]
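The perturbation named in the panel titles, projecting a subspace out of the hidden state, can be sketched as follows. This is a minimal illustration under assumptions: the bases below are random stand-ins (the actual modifier subspace would come from the analysis of the trained network), and the projection would presumably be applied to the hidden state at each time step.

```python
import numpy as np

def project_out(h, basis):
    """Remove from h its component in the column span of `basis`."""
    Q, _ = np.linalg.qr(basis)   # orthonormal basis for the subspace
    return h - Q @ (Q.T @ h)

rng = np.random.default_rng(0)
n, k = 16, 3                              # hidden size, subspace dimension
modifier_basis = rng.normal(size=(n, k))  # hypothetical modifier subspace
random_basis = rng.normal(size=(n, k))    # control subspace of equal dimension

h = rng.normal(size=n)                    # a hidden state to perturb
h_perturbed = project_out(h, modifier_basis)
h_control = project_out(h, random_basis)
```

Comparing against a random subspace of the same dimension controls for the effect of simply removing `k` dimensions from the state.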