Towards a Deep and Unified Understanding of Deep Neural Models in NLP


  1. Towards a Deep and Unified Understanding of Deep Neural Models in NLP. Chaoyu Guan*2, Xiting Wang*2, Quanshi Zhang1, Runjin Chen1, Di He2, Xing Xie2 (* equal contribution). 1 John Hopcroft Center and the MoE Key Lab of Artificial Intelligence, AI Institute, Shanghai Jiao Tong University, Shanghai, China. 2 Microsoft Research Asia, Beijing, China.

  2. Introduction
     A key task in explainable AI is to associate latent representations with input units by quantifying the layerwise information discarding of inputs. Most explanation methods (e.g., DNN visualization) suffer from coherency and generality issues:
     • Coherency requires that a method generate consistent explanations across different neurons, layers, and models.
     • Generality: existing measures are usually defined under certain restrictions on model architectures or tasks.

  3. Our Solution
     We consider both coherency and generality:
     • A unified information-based measure that quantifies the information of each input word that is encoded in an intermediate layer of a deep NLP model.
     • The information-based measure as a tool for evaluating different explanation methods and for explaining different deep NLP models.
     • This measure enriches our capability of explaining DNNs.

  4. Problem
     • Quantification of sentence-level information discarding: quantify the information of an entire sentence $\mathbf{y}$ that is encoded in $\mathbf{t}$.
     • Quantification of word-level information discarding: quantify the information of each specific word $\mathbf{y}_j$ that is encoded in $\mathbf{t}$.
     • Fine-grained analysis of word attributes: analyze the fine-grained reason why $\mathbf{t}$ uses the information of $\mathbf{y}_j$.
     Notation: $\mathbf{y} = [\mathbf{y}_1^\top, \ldots, \mathbf{y}_n^\top]^\top \in \mathbf{Y}$ is the input sentence, where $\mathbf{y}_j$ is the embedding of the $j$-th word; $\mathbf{t} = \Phi(\mathbf{y}) \in \mathbf{T}$ is the hidden state, where $\Phi(\cdot)$ is the function of the intermediate layer. A toy instantiation of this notation follows below.
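     As a concrete reading of the notation, here is a minimal PyTorch sketch in which a toy LSTM plays the role of $\Phi(\cdot)$; the vocabulary size, dimensions, and token ids are all arbitrary placeholders, not anything from the paper:

```python
import torch
import torch.nn as nn

vocab_size, K, d = 100, 16, 32
embed = nn.Embedding(vocab_size, K)          # token id -> word embedding y_j
phi = nn.LSTM(K, d, batch_first=True)        # the intermediate layer Phi(.)

token_ids = torch.tensor([[5, 17, 42, 8]])   # one toy sentence, n = 4 words
y = embed(token_ids)                         # y in Y, shape [1, n, K]
t, _ = phi(y)                                # t = Phi(y) in T, shape [1, n, d]
```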

  5. Word Information Quantification: Multi-Level Quantification
     • Corpus level: $MI(\mathbf{Y};\mathbf{T}) = H(\mathbf{Y}) - H(\mathbf{Y}\mid\mathbf{T})$, where $H(\mathbf{Y}\mid\mathbf{T}) = \int_{\mathbf{t}\in\mathbf{T}} p(\mathbf{t})\, H(\mathbf{Y}\mid\mathbf{t})\, d\mathbf{t}$.
     • Sentence level: $H(\mathbf{Y}\mid\mathbf{t}) = -\int_{\mathbf{y}'\in\mathbf{Y}} p(\mathbf{y}'\mid\mathbf{t}) \log p(\mathbf{y}'\mid\mathbf{t})\, d\mathbf{y}'$.
     • Word level: $H(\mathbf{Y}\mid\mathbf{t}) \stackrel{*}{=} \sum_j H(\mathbf{Y}_j\mid\mathbf{t})$, where $H(\mathbf{Y}_j\mid\mathbf{t}) = -\int_{\mathbf{y}_j'\in\mathbf{Y}_j} p(\mathbf{y}_j'\mid\mathbf{t}) \log p(\mathbf{y}_j'\mid\mathbf{t})\, d\mathbf{y}_j'$. Here $H(\mathbf{Y}_j\mid\mathbf{t}=\Phi(\mathbf{y}))$ reflects how much information from the word $\mathbf{y}_j$ is discarded by $\mathbf{t}$ during the forward propagation.
     * Suppose the words in one sentence are independent.
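     The starred equality follows in one step from the independence assumption, under which $p(\mathbf{y}'\mid\mathbf{t}) = \prod_{j=1}^{n} p(\mathbf{y}_j'\mid\mathbf{t})$:

```latex
H(\mathbf{Y}\mid\mathbf{t})
  = -\int p(\mathbf{y}'\mid\mathbf{t}) \sum_{j=1}^{n} \log p(\mathbf{y}_j'\mid\mathbf{t})\, d\mathbf{y}'
  = -\sum_{j=1}^{n} \int p(\mathbf{y}_j'\mid\mathbf{t}) \log p(\mathbf{y}_j'\mid\mathbf{t})\, d\mathbf{y}_j'
  = \sum_{j=1}^{n} H(\mathbf{Y}_j\mid\mathbf{t})
```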

  6. Word Information Quantification: Perturbation-Based Approximation
     We use $H(\tilde{\mathbf{Y}}_j\mid\mathbf{t})$ to approximate $H(\mathbf{Y}_j\mid\mathbf{t})$ by minimizing the following loss:
     $L(\boldsymbol{\sigma}) = \mathbb{E}_{\boldsymbol{\epsilon}} \left[ \| \Phi(\tilde{\mathbf{y}}) - \mathbf{t} \|^2 \right] - \lambda \sum_{j=1}^{n} H(\tilde{\mathbf{Y}}_j\mid\mathbf{t})$, where the perturbed word is $\tilde{\mathbf{y}}_j = \mathbf{y}_j + \boldsymbol{\epsilon}_j$ with $\boldsymbol{\epsilon}_j \sim \mathcal{N}(\mathbf{0}, \sigma_j^2 \mathbf{I})$.
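     A minimal sketch of this optimization, assuming PyTorch and using the closed-form entropy of a $K$-dimensional Gaussian, $H(\tilde{\mathbf{Y}}_j\mid\mathbf{t}) = \frac{K}{2}\log(2\pi e \sigma_j^2)$; the function `phi` and all hyperparameters below are placeholders, not the authors' released implementation:

```python
import math
import torch

def estimate_sigma(phi, y, lam=1.0, n_samples=16, steps=200, lr=0.01):
    """Learn per-word noise scales sigma_j for an embedded sentence y [n, K].

    phi must map a batch of embedded sentences [..., n, K] to hidden states
    and be differentiable; a larger learned sigma_j means more of word j's
    information is discarded by t = phi(y).
    """
    n, K = y.shape
    t = phi(y.unsqueeze(0)).detach()                 # target hidden state
    log_sigma = torch.zeros(n, requires_grad=True)   # optimize log(sigma_j)
    opt = torch.optim.Adam([log_sigma], lr=lr)
    for _ in range(steps):
        sigma = log_sigma.exp()
        # Reparameterized perturbation: y~_j = y_j + sigma_j * eps_j
        eps = torch.randn(n_samples, n, K)
        y_tilde = y.unsqueeze(0) + sigma[None, :, None] * eps
        recon = ((phi(y_tilde) - t) ** 2).mean()     # E_eps ||phi(y~) - t||^2
        # Gaussian entropy: H(Y~_j | t) = K/2 * log(2*pi*e*sigma_j^2)
        entropy = 0.5 * K * (math.log(2 * math.pi * math.e) + 2 * log_sigma)
        loss = recon - lam * entropy.sum()
        opt.zero_grad()
        loss.backward()
        opt.step()
    return log_sigma.exp().detach()                  # sigma_j per word
```

     The learned $\sigma_j$ acts as the word-level measure: the larger the noise a word tolerates without changing $\Phi$'s output, the less of that word's information the hidden state retains.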

  7. Fine-Grained Analysis of Word Attributes
     Disentangle the information of a common concept $\mathbf{c}$ away from each word $\mathbf{y}_j$:
     • Importance of the $j$-th word w.r.t. random words: $A_j = \log p(\mathbf{y}_j\mid\mathbf{t}) - \mathbb{E}_{\mathbf{y}_j'\in\mathbf{Y}_j} \log p(\mathbf{y}_j'\mid\mathbf{t})$.
     • Importance of the common concept $\mathbf{c}$ w.r.t. random words: $A_{\mathbf{c}} = \mathbb{E}_{\mathbf{y}_j'\in\mathbf{Y}_{\mathbf{c}}} \log p(\mathbf{y}_j'\mid\mathbf{t}) - \mathbb{E}_{\mathbf{y}_j'\in\mathbf{Y}_j} \log p(\mathbf{y}_j'\mid\mathbf{t})$.
     • $r_{j,\mathbf{c}} = A_j - A_{\mathbf{c}}$ indicates the remaining information of the word $\mathbf{y}_j$ when we remove the information of the common attribute $\mathbf{c}$ from the word.
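     A toy numeric sketch of this disentanglement; the log-density values are made up, and `log_p_random` / `log_p_concept` stand for samples of random replacement words and of words sharing the common concept $\mathbf{c}$:

```python
import numpy as np

def remaining_information(log_p_word, log_p_random, log_p_concept):
    """r_{j,c} = A_j - A_c: word-specific information beyond concept c."""
    A_j = log_p_word - np.mean(log_p_random)              # word vs. random
    A_c = np.mean(log_p_concept) - np.mean(log_p_random)  # concept vs. random
    return A_j - A_c

# Made-up log-densities log p(. | t): the word is far better explained by t
# than random words are (A_j = 8), but much of that is shared with its
# concept class (A_c = 5), so r = 3 word-specific nats remain.
r = remaining_information(-2.0, np.array([-9.0, -11.0]), np.array([-4.0, -6.0]))
print(r)  # 3.0
```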

  8. Comparative Study
     • Three baselines: LRP, gradient-based, and perturbation methods.
     • Conclusion: our method provides the most faithful explanations in across-timestamp analysis, across-layer analysis, and across-model analysis.
     • Our method clearly shows that the model gradually focuses on the most important parts of the sentence.

  9. Understanding Neural Models in NLP
     We explain four NLP models (BERT, Transformer, LSTM, and CNN):
     • What information is leveraged for prediction?
     • How does the information flow through layers?
     • How do the models evolve during training?

  10. Understanding Neural Models in NLP
     • BERT and the Transformer use words for prediction, while the LSTM and CNN use subsequences of the sentence for prediction.
     • Different models process the input sentence in different manners.

  11. Towards a Deep and Unified Understanding of Deep Neural Models in NLP. Please visit our poster at #62!
