

SLIDE 1

Towards Interpretable Deep Learning for Natural Language Processing

Roy Schwartz

University of Washington & Allen Institute for Artificial Intelligence

December 2018

SLIDE 2

(Deep-Learning-Based) AI Today


SLIDE 4

Deep Learning
❯ backpropagation
❯ stochastic gradient descent
❯ PyTorch, TensorFlow, AllenNLP
❯ state-of-the-art

SLIDE 5

Deep Learning
❯ backpropagation
❯ stochastic gradient descent
❯ PyTorch, TensorFlow, AllenNLP
❯ state-of-the-art
❉ architecture engineering

SLIDE 6

[WFSA diagram: s0 → s1]

Deep Learning
❯ backpropagation
❯ stochastic gradient descent
❯ PyTorch, TensorFlow, AllenNLP
❯ state-of-the-art
❉ architecture engineering

SLIDE 7

[WFSA diagram: s0 → s1]

Deep Learning
❯ backpropagation
❯ stochastic gradient descent
❯ PyTorch, TensorFlow, AllenNLP
❯ state-of-the-art
❉ architecture engineering

Weighted Finite-State Automata
❯ widely studied
❯ understandable
❯ interpretable
❯ informed model development

SLIDE 8

[WFSA diagram: s0 → s1]

Deep Learning
❯ backpropagation
❯ stochastic gradient descent
❯ PyTorch, TensorFlow, AllenNLP
❯ state-of-the-art
❉ architecture engineering

Weighted Finite-State Automata
❯ widely studied
❯ understandable
❯ interpretable
❯ informed model development
❉ low performance


SLIDE 10

Deep Learning Models for NLP: Overview

Case Study: Sentiment Analysis

"I saw such a great talk today" → v1 v2 v3 v4 v5 v6 v7 → v1:7 → Classify (❯/❉)

input (words) → word embeddings → sequence encoders → output (prediction)

SLIDE 11

Deep Learning Models for NLP: Overview

Case Study: Sentiment Analysis

"I saw such a great talk today" → v1 v2 v3 v4 v5 v6 v7 → v1:7 → Classify (❯/❉)

input (words) → word embeddings → sequence encoders → output (prediction)

Main component in:
◮ Machine translation
◮ Question answering
◮ Text summarization
◮ Sentiment analysis
◮ Information extraction
◮ ...


SLIDE 13

Overview

◮ Background: Weighted Finite-State Automata
◮ Neural Weighted Finite-State Automata
◮ Existing Deep Models as Weighted Finite-State Automata
◮ Case Study: Convolutional neural networks


SLIDE 15

Background: Finite-State Automata

Regular Expressions (Patterns)

s0 →such→ s1 →a→ s2 →great→ s3 →talk→ s4

SLIDE 16

Background: Finite-State Automata

Regular Expressions (Patterns)

s0 →such→ s1 →a→ s2 →great→ s3 →talk→ s4

Pattern: such a great talk

SLIDE 17

Background: Weighted Finite-State Automata (WFSA)

Each Transition Defines a Weight Function

s0 →such/0.7→ s1 →a/1.3→ s2 →great/0.3→ s3 →talk/0.4→ s4

◮ (Weighted) pattern: such a great talk
◮ Weights are typically pre-specified

SLIDE 18

Background: Weighted Finite-State Automata (WFSA)

Each Transition Defines a Weight Function

s0 →such/0.7→ s1 →a/1.3→ s2 →great/0.3→ s3 →talk/0.4→ s4

◮ (Weighted) pattern: such a great talk
◮ Weights are typically pre-specified
◮ The score of a sequence is the sum of transition scores

SLIDE 19

Background: Weighted Finite-State Automata (WFSA)

Each Transition Defines a Weight Function

s0 →such/0.7→ s1 [0.7] →a/1.3→ s2 →great/0.3→ s3 →talk/0.4→ s4

◮ (Weighted) pattern: such a great talk
◮ Weights are typically pre-specified
◮ The score of a sequence is the sum of transition scores

SLIDE 20

Background: Weighted Finite-State Automata (WFSA)

Each Transition Defines a Weight Function

s0 →such/0.7→ s1 [0.7] →a/1.3→ s2 [2.0] →great/0.3→ s3 →talk/0.4→ s4

◮ (Weighted) pattern: such a great talk
◮ Weights are typically pre-specified
◮ The score of a sequence is the sum of transition scores

SLIDE 21

Background: Weighted Finite-State Automata (WFSA)

Each Transition Defines a Weight Function

s0 →such/0.7→ s1 [0.7] →a/1.3→ s2 [2.0] →great/0.3→ s3 [2.3] →talk/0.4→ s4

◮ (Weighted) pattern: such a great talk
◮ Weights are typically pre-specified
◮ The score of a sequence is the sum of transition scores

SLIDE 22

Background: Weighted Finite-State Automata (WFSA)

Each Transition Defines a Weight Function

s0 →such/0.7→ s1 [0.7] →a/1.3→ s2 [2.0] →great/0.3→ s3 [2.3] →talk/0.4→ s4 [2.7]

◮ (Weighted) pattern: such a great talk
◮ Weights are typically pre-specified
◮ The score of a sequence is the sum of transition scores
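The score computation on slides 17–22 fits in a few lines of Python. A minimal sketch, using the transition weights from the diagram; the dictionary encoding and the score_sequence helper are illustrative choices, not from the talk:

```python
# Minimal sketch of the WFSA above: each transition carries a pre-specified
# weight, and a sequence's score is the sum of its transition scores.
transitions = {  # (state, word) -> (next_state, weight)
    (0, "such"):  (1, 0.7),
    (1, "a"):     (2, 1.3),
    (2, "great"): (3, 0.3),
    (3, "talk"):  (4, 0.4),
}

def score_sequence(words):
    """Return the total score if the WFSA accepts the sequence, else None."""
    state, total = 0, 0.0
    for word in words:
        if (state, word) not in transitions:
            return None  # hard pattern: any unknown word rejects the sequence
        state, weight = transitions[(state, word)]
        total += weight
    return total if state == 4 else None

print(score_sequence("such a great talk".split()))  # 0.7+1.3+0.3+0.4 = 2.7
```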

SLIDE 23

Overview

◮ Background: Weighted Finite-State Automata
◮ Neural Weighted Finite-State Automata
◮ Existing Deep Models as Weighted Finite-State Automata
◮ Case Study: Convolutional neural networks


SLIDE 25

Motivation: Soft Pattern Matching

◮ such a great talk
◮ such a wonderful talk, such a lovely talk

SLIDE 26

Motivation: Soft Pattern Matching

◮ such a great talk
◮ such a wonderful talk, such a lovely talk
◮ Naive solution:

s0 →such/0.7→ s1 →a/1→ s2 →wonderful/0.3, lovely/0.25, great/0.3→ s3 →talk/0.4→ s4

SLIDE 27

Motivation: Soft Pattern Matching

◮ such a great talk
◮ such a wonderful talk, such a lovely talk
◮ Naive solution:

s0 →such/0.7→ s1 →a/1→ s2 →wonderful/0.3, lovely/0.25, great/0.3→ s3 →talk/0.4→ s4

◮ Problem: not scalable
  ◮ what a great talk, such an awesome talk

SLIDE 28

Solution: Neural Transitions

Schwartz et al., ACL 2018

s0 →great/0.3→ s1   ⇒   s0 → s1

SLIDE 29

Solution: Neural Transitions

Schwartz et al., ACL 2018

s0 →great/0.3→ s1   ⇒   s0 →v→ s1

◮ Step 1: word → R^d
  ◮ Word embeddings
  ◮ Similar words are encoded in similar vectors

SLIDE 30

Solution: Neural Transitions

Schwartz et al., ACL 2018

s0 →great/0.3→ s1   ⇒   s0 →∀v→ s1

◮ Step 1: word → R^d
  ◮ Word embeddings
  ◮ Similar words are encoded in similar vectors
◮ Step 2: Accept all word vectors

SLIDE 31

Solution: Neural Transitions

Schwartz et al., ACL 2018

s0 →great/0.3→ s1   ⇒   s0 →∀v / fθ(v)→ s1

◮ Step 1: word → R^d
  ◮ Word embeddings
  ◮ Similar words are encoded in similar vectors
◮ Step 2: Accept all word vectors
◮ Step 3: Weights: fθ : R^d → R
  ◮ These functions favor specific words
  ◮ θ parameters are learned

SLIDE 32

Solution: Neural Transitions

Schwartz et al., ACL 2018

s0 →∀v / fθ(v)→ s1    (θ: learnable parameters; v: word vector)

◮ Neural transitions accept all words,
◮ but favor specific words

SLIDE 33

Solution: Neural Transitions

Schwartz et al., ACL 2018

s0 →∀v / fθ(v)→ s1    (θ: learnable parameters; v: word vector)

◮ Neural transitions accept all words,
◮ but favor specific words
◮ Example 1: great
  ◮ high score: great, awesome, good
  ◮ low score: bad, child, three

SLIDE 34

Solution: Neural Transitions

Schwartz et al., ACL 2018

s0 →∀v / fθ(v)→ s1    (θ: learnable parameters; v: word vector)

◮ Neural transitions accept all words,
◮ but favor specific words
◮ Example 1: great
  ◮ high score: great, awesome, good
  ◮ low score: bad, child, three
◮ Example 2: the
  ◮ high score: the, a, an
  ◮ low score: car, love, well
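A minimal sketch of a single neural transition, assuming the simple linear score fθ(v) = θ · v that the ConvNet correspondence later in the talk uses; the paper's exact parameterization may differ, and all data here are toy stand-ins:

```python
import numpy as np

# Sketch of one neural transition f_theta: R^d -> R. It accepts every word
# vector, but assigns higher scores to words that theta "prefers".
d = 4                       # toy embedding size
rng = np.random.default_rng(0)
theta = rng.normal(size=d)  # learnable parameters of this transition

def f(v, theta):
    """Linear transition score for word vector v."""
    return float(theta @ v)

v_great = rng.normal(size=d)                     # stand-in word embedding
v_awesome = v_great + 0.1 * rng.normal(size=d)   # similar words ~ similar vectors
print(f(v_great, theta), f(v_awesome, theta))    # hence similar scores
```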

SLIDE 35

Neural Weighted Finite-State Automata

Schwartz et al., ACL 2018

s0 →fθ0(v)→ s1 →fθ1(v)→ s2 →fθ2(v)→ s3 →fθ3(v)→ s4

v: word vectors; θ = (θ0, θ1, θ2, θ3): learned parameters

◮ Neural WFSAs accept any sequence,¹ but prefer certain sequences

¹ Pending length constraints

SLIDE 36

Neural Weighted Finite-State Automata

Schwartz et al., ACL 2018

s0 →fθ0(v)→ s1 →fθ1(v)→ s2 →fθ2(v)→ s3 →fθ3(v)→ s4

v: word vectors; θ = (θ0, θ1, θ2, θ3): learned parameters

◮ Neural WFSAs accept any sequence,¹ but prefer certain sequences
◮ Example 1: such a great talk
  ◮ high score: what a great talk, such an awesome talk
  ◮ low score: such a horrible talk, such a black cat, john went to school

¹ Pending length constraints

SLIDE 37

Neural Weighted Finite-State Automata

Schwartz et al., ACL 2018

s0 →fθ0(v)→ s1 →fθ1(v)→ s2 →fθ2(v)→ s3 →fθ3(v)→ s4

v: word vectors; θ = (θ0, θ1, θ2, θ3): learned parameters

◮ Neural WFSAs accept any sequence,¹ but prefer certain sequences
◮ Example 1: such a great talk
  ◮ high score: what a great talk, such an awesome talk
  ◮ low score: such a horrible talk, such a black cat, john went to school
◮ Example 2: is not very exciting
  ◮ high score: is not particularly exciting, are not very inspiring

¹ Pending length constraints

SLIDE 38

Training Procedure

Formally

End-to-end training:
◮ Input
  ◮ s0 →fθ0(v)→ s1 →fθ1(v)→ s2 →fθ2(v)→ s3 →fθ3(v)→ s4
  ◮ Word embeddings: word → R^d
  ◮ Training data: <document, sentiment label> pairs
◮ Output
  ◮ Parameter values: θ

SLIDE 39

Training Procedure

Formally

End-to-end training:
◮ Input
  ◮ s0 →fθ0(v)→ s1 →fθ1(v)→ s2 →fθ2(v)→ s3 →fθ3(v)→ s4
  ◮ Word embeddings: word → R^d
  ◮ Training data: <document, sentiment label> pairs
◮ Output
  ◮ Parameter values: θ

Test:
◮ Input
  ◮ s0 →fθ0(v)→ s1 →fθ1(v)→ s2 →fθ2(v)→ s3 →fθ3(v)→ s4
  ◮ Word embeddings: word → R^d
  ◮ Learned parameters: θ
  ◮ New data: <document>
◮ Output
  ◮ Prediction: <sentiment label>

SLIDE 40

Training Procedure

Formally

End-to-end training:
◮ Input
  ◮ s0 →fθ0(v)→ s1 →fθ1(v)→ s2 →fθ2(v)→ s3 →fθ3(v)→ s4
  ◮ Word embeddings: word → R^d
  ◮ Training data: <document, sentiment label> pairs
◮ Output
  ◮ Parameter values: θ

Test:
◮ Input
  ◮ s0 →fθ0(v)→ s1 →fθ1(v)→ s2 →fθ2(v)→ s3 →fθ3(v)→ s4
  ◮ Word embeddings: word → R^d
  ◮ Learned parameters: θ
  ◮ New data: <document>
◮ Output
  ◮ Prediction: <sentiment label>

◮ Standard training procedure
  ◮ Backpropagation
  ◮ Stochastic gradient descent
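A toy PyTorch sketch of this end-to-end setup, assuming the linear transition scores and the max-pooling over windows described elsewhere in the talk; the shapes, data, and hyperparameters are stand-ins, not the paper's:

```python
import torch

# Sketch: the WFSA's transition parameters theta are trained with
# backpropagation + SGD from <document, sentiment label> pairs.
d, pattern_len, n_docs, doc_len = 50, 4, 32, 20
theta = torch.randn(pattern_len, d, requires_grad=True)  # one vector per transition
docs = torch.randn(n_docs, doc_len, d)                   # pre-embedded documents
labels = torch.randint(0, 2, (n_docs,)).float()          # toy sentiment labels

opt = torch.optim.SGD([theta], lr=0.1)
for _ in range(5):  # a few SGD epochs
    # score every length-4 window: sum_j f_{theta_j}(v_{i+j}), then max-pool
    windows = docs.unfold(1, pattern_len, 1)             # (docs, windows, d, 4)
    scores = torch.einsum("nwdj,jd->nwj", windows, theta).sum(-1)
    doc_score = scores.max(dim=1).values                 # best match per document
    loss = torch.nn.functional.binary_cross_entropy_with_logits(doc_score, labels)
    opt.zero_grad()
    loss.backward()
    opt.step()
```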

SLIDE 41

Benefits of Neural WFSAs 1: Informed Model Development

s0 → s1 → s2 → s3 → s4
Fixed length: such a great talk

SLIDE 42

Benefits of Neural WFSAs 1: Informed Model Development

s0 → s1 → s2 → s3 → s4
Fixed length: such a great talk

s0 → s1 → s2 → s3 → s4 (with self-loops)
Self loops: such a great, wonderful, funny talk

SLIDE 43

Benefits of Neural WFSAs 1: Informed Model Development

s0 → s1 → s2 → s3 → s4
Fixed length: such a great talk

s0 → s1 → s2 → s3 → s4 (with self-loops)
Self loops: such a great, wonderful, funny talk

s0 → s1 → s2 → s3 → s4 (with ε-transitions: fθε())
Epsilon transitions: such great shoes

SLIDE 44

Benefits of Neural WFSAs 1: Informed Model Development

s0 → s1 → s2 → s3 → s4
Fixed length: such a great talk

s0 → s1 → s2 → s3 → s4 (with self-loops)
Self loops: such a great, wonderful, funny talk

s0 → s1 → s2 → s3 → s4 (with ε-transitions: fθε())
Epsilon transitions: such great shoes

s0 → s1 → s2 → s3 → s4   s0 → s1 → s2 → s3 → s4
...

SLIDE 45

Benefits of Neural WFSAs 2:

◮ They are neural
  ◮ Backpropagation
  ◮ Stochastic gradient descent
  ◮ PyTorch, TensorFlow, AllenNLP

SLIDE 46

Benefits of Neural WFSAs 2:

◮ They are neural
  ◮ Backpropagation
  ◮ Stochastic gradient descent
  ◮ PyTorch, TensorFlow, AllenNLP
◮ Coming up:
  ◮ Many deep models are mathematically equivalent to neural WFSAs
  ◮ A (new) joint framework
  ◮ Allows extension of these models

SLIDE 47

Overview

◮ Background: Weighted Finite-State Automata
◮ Neural Weighted Finite-State Automata
◮ Existing Deep Models as Weighted Finite-State Automata
◮ Case Study: Convolutional neural networks


SLIDE 49

Case Study: Convolutional Neural Networks (ConvNets)

A Linear-Kernel Filter with Max-Pooling

v1 v2 v3 v4 v5 v6 v7

SLIDE 50

Case Study: Convolutional Neural Networks (ConvNets)

A Linear-Kernel Filter with Max-Pooling

v1 v2 v3 v4 v5 v6 v7

Sθ(v1:v4) = Σ_{j=1..4} θj · vj    (θ: learnable parameters; v: word vectors)

SLIDE 51

Proposition 1: ConvNet Filters are Computing WFSA Scores

Schwartz et al., ACL 2018

s0 → s1 → s2 → s3 → s4

SLIDE 52

Proposition 1: ConvNet Filters are Computing WFSA Scores

Schwartz et al., ACL 2018

s0 → s1 → s2 → s3 → s4

◮ fθj(v) = θj · v

SLIDE 53

Proposition 1: ConvNet Filters are Computing WFSA Scores

Schwartz et al., ACL 2018

s0 → s1 → s2 → s3 → s4

◮ fθj(v) = θj · v
◮ sθ(v1:v4) = Σ_{j=1..4} fθj(vj) = Σ_{j=1..4} (θj · vj)

SLIDE 54

ConvNets are (Implicitly) Computing WFSA Scores!

ConvNet:     Sθ(v1:vd) = Σ_{j=1..d} (θj · vj)    (1)
Neural WFSA: sθ(v1:vd) = Σ_{j=1..d} (θj · vj)    (2)

SLIDE 55

ConvNets are (Implicitly) Computing WFSA Scores!

ConvNet:     Sθ(v1:vd) = Σ_{j=1..d} (θj · vj)    (1)
Neural WFSA: sθ(v1:vd) = Σ_{j=1..d} (θj · vj)    (2)

Benefits:
❉ Interpret ConvNets
❉ Improve ConvNets
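The proposition can be checked numerically. A small sketch with toy data, in which theta doubles as the filter weights and the transition parameters:

```python
import numpy as np

# Sketch: a linear-kernel ConvNet filter score and the neural-WFSA path score
# are the same computation, per equations (1) and (2) above.
rng = np.random.default_rng(0)
d, width = 5, 4
theta = rng.normal(size=(width, d))   # filter weights == transition parameters
v = rng.normal(size=(width, d))       # one window of word vectors v_1..v_4

convnet_score = np.sum(theta * v)                        # sum_j theta_j . v_j
wfsa_score = sum(theta[j] @ v[j] for j in range(width))  # sum of transition scores
assert np.isclose(convnet_score, wfsa_score)
```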

SLIDE 56

A ConvNet Learns a Fixed-Length Soft-Pattern!

Schwartz et al., ACL 2018

s0 → s1 → s2 → s3 → s4

◮ E.g., "such a great talk"
  ◮ what a great song
  ◮ such an awesome movie

SLIDE 57

Improving ConvNets: SoPa (Soft Patterns)

Schwartz et al., ACL 2018

◮ Language patterns are often of flexible length
  ◮ such a great talk
  ◮ such a great, funny, interesting talk
  ◮ such great shoes

SLIDE 58

Improving ConvNets: SoPa (Soft Patterns)

Schwartz et al., ACL 2018

◮ Language patterns are often of flexible length
  ◮ such a great talk
  ◮ such a great, funny, interesting talk
  ◮ such great shoes

Convolutional Neural Network: Sθ(v1:vd) = Σ_{j=1..d} (θj · vj)

SLIDE 59

Improving ConvNets: SoPa (Soft Patterns)

Schwartz et al., ACL 2018

◮ Language patterns are often of flexible length
  ◮ such a great talk
  ◮ such a great, funny, interesting talk
  ◮ such great shoes

Weighted Finite-State Automaton:

s0 →such→ s1 →a→ s2 →great→ s3 →talk→ s4

SLIDE 60

Improving ConvNets: SoPa (Soft Patterns)

Schwartz et al., ACL 2018

◮ Language patterns are often of flexible length
  ◮ such a great talk
  ◮ such a great, funny, interesting talk
  ◮ such great shoes

Weighted Finite-State Automaton:

s0 →such→ s1 →a→ s2 →great→ s3 (self-loop: funny, interesting) →talk→ s4

SLIDE 61

Improving ConvNets: SoPa (Soft Patterns)

Schwartz et al., ACL 2018

◮ Language patterns are often of flexible length
  ◮ such a great talk
  ◮ such a great, funny, interesting talk
  ◮ such great shoes

Weighted Finite-State Automaton:

s0 →such→ s1 →a / ε→ s2 →great→ s3 (self-loop: funny, interesting) →talk→ s4


SLIDE 63

Sentiment Analysis Experiments

"I saw such a great talk today" → v1 v2 v3 v4 v5 v6 v7 → v1:7 → Classify (❯/❉)

SLIDE 64

Sentiment Analysis Experiments

"I saw such a great talk today" → v1 v2 v3 v4 v5 v6 v7 → v1:7 → Classify (❯/❉)

Sequence encoders:
◮ SoPa (ours)
◮ ConvNet

SLIDE 65

Sentiment Analysis Experiments

"I saw such a great talk today" → v1 v2 v3 v4 v5 v6 v7 → v1:7 → Classify (❯/❉)

Sequence encoders:
◮ SoPa (ours)
◮ ConvNet
◮ LSTM

SLIDE 66

Sentiment Analysis Results

Schwartz et al., ACL 2018

[Plots: classification accuracy (y-axis) vs. number of training samples, 100–10,000 (x-axis), on SST and Amazon]

SLIDE 67

Sentiment Analysis Results

Schwartz et al., ACL 2018

[Plots: classification accuracy (y-axis) vs. number of training samples, 100–10,000 (x-axis), on SST and Amazon; curves: SoPa (ours), ConvNet, LSTM]


SLIDE 69

Interpreting SoPa

Soft Patterns!

◮ For each learned pattern, extract the 4 top-scoring phrases in the training set

SLIDE 70

Interpreting SoPa

Soft Patterns!

◮ For each learned pattern, extract the 4 top-scoring phrases in the training set

Highest-Scoring Phrases, Patt. 1:
  mesmerizing portrait of a
  engrossing portrait of a
  clear-eyed portrait of an
  fascinating portrait of a

Pattern 1: s0 → s1 →portrait→ s2 →of→ s3 →a→ s4

SLIDE 71

Interpreting SoPa

Soft Patterns!

◮ For each learned pattern, extract the 4 top-scoring phrases in the training set

Highest-Scoring Phrases, Patt. 1:
  mesmerizing portrait of a
  engrossing portrait of a
  clear-eyed portrait of an
  fascinating portrait of a

Highest-Scoring Phrases, Patt. 2:
  honest , and enjoyable
  forceful , and beautifully
  energetic , and surprisingly

Pattern 1: s0 → s1 →portrait→ s2 →of→ s3 →a→ s4
Pattern 2: s0 → s1 →,→ s2 →and→ s3 → s4
SLIDE 72

Interpreting SoPa

Soft Patterns!

◮ For each learned pattern, extract the 4 top-scoring phrases in the training set

Highest-Scoring Phrases, Patt. 1:
  mesmerizing portrait of a
  engrossing portrait of a
  clear-eyed portrait of an
  fascinating portrait of a

Highest-Scoring Phrases, Patt. 2:
  honest , and enjoyable
  forceful , and beautifully
  energetic , and surprisingly
  unpretentious , charming^SL , quirky

Pattern 1: s0 → s1 →portrait→ s2 →of→ s3 →a→ s4
Pattern 2: s0 → s1 →,→ s2 →and→ s3 → s4
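A sketch of this interpretation recipe under the fixed-length, linear-score view of patterns used earlier; the patterns, vocabulary, and sentences are toy stand-ins, not learned values:

```python
import numpy as np

# Sketch: for each learned pattern, score every length-4 span in the training
# set and keep the 4 top-scoring phrases.
rng = np.random.default_rng(0)
d, width = 5, 4
patterns = [rng.normal(size=(width, d)) for _ in range(2)]
vocab = ["mesmerizing", "portrait", "of", "a", "great", "talk"]
emb = {w: rng.normal(size=d) for w in vocab}
sentences = [["a", "mesmerizing", "portrait", "of", "a", "great", "talk"]]

for p, theta in enumerate(patterns):
    spans = []
    for sent in sentences:
        for i in range(len(sent) - width + 1):
            span = sent[i:i + width]
            score = sum(theta[j] @ emb[w] for j, w in enumerate(span))
            spans.append((score, " ".join(span)))
    top4 = sorted(spans, reverse=True)[:4]
    print(f"Pattern {p + 1}:", [phrase for _, phrase in top4])
```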

SLIDE 74

s0 → s1 → s2 → s3 → s4

SLIDE 75

s0 → s1 → s2 → s3 → s4   ⇒   s0 → s1 → s2 → s3 → s4 (self-loops, ε-transitions: fθε())

More expressive WFSA

SLIDE 76

s0 → s1 → s2 → s3 → s4   ⇒   s0 → s1 → s2 → s3 → s4 (self-loops, ε-transitions: fθε())

More expressive WFSA ⇒ an interpretable, more robust convolutional neural network

SLIDE 77

Many Existing Deep Models are Neural WFSAs!

Peng, Schwartz et al., EMNLP 2018

◮ Mikolov et al., arXiv 2014
◮ Balduzzi and Ghifary, ICML 2016
◮ Bradbury et al., ICLR 2017
◮ Lei et al., EMNLP 2018
◮ Lei et al., NAACL 2016
◮ Foerster et al., ICML 2017

SLIDE 78

Many Existing Deep Models are Neural WFSAs!

Peng, Schwartz et al., EMNLP 2018

s0 → s1: Mikolov et al., arXiv 2014; Balduzzi and Ghifary, ICML 2016; Bradbury et al., ICLR 2017; Lei et al., EMNLP 2018
s0 → s1 → s2: Lei et al., NAACL 2016
s0 → s1 → s2 → s3: Foerster et al., ICML 2017

SLIDE 79

Many Existing Deep Models are Neural WFSAs!

Peng, Schwartz et al., EMNLP 2018

s0 → s1: Mikolov et al., arXiv 2014; Balduzzi and Ghifary, ICML 2016; Bradbury et al., ICLR 2017; Lei et al., EMNLP 2018
s0 → s1 → s2: Lei et al., NAACL 2016
s0 → s1 → s2 → s3: Foerster et al., ICML 2017

◮ Six recent recurrent neural network (RNN) models are also implicitly computing WFSA scores

SLIDE 80

Developing more Robust WFSA Models

S2: s0 → s1 — Mikolov et al. (2014); Balduzzi and Ghifary (2016); Bradbury et al. (2017); Lei et al. (2018)
S3: s0 → s1 → s2 — Lei et al. (2016)

SLIDE 81

Developing more Robust WFSA Models

S2: s0 → s1 — Mikolov et al. (2014); Balduzzi and Ghifary (2016); Bradbury et al. (2017); Lei et al. (2018)
S3: s0 → s1 → s2 — Lei et al. (2016)
S2,3: s0 → s1 → s2 — Peng, Schwartz et al. (2018)

SLIDE 82

Sentiment Analysis Results

Peng, Schwartz et al., EMNLP 2018

[Results figure]

SLIDE 83

Language Modeling Results

Peng, Schwartz et al., EMNLP 2018

[Results figure; lower is better]

SLIDE 84

[WFSA diagram: s0 → s1]

Deep Learning
❯ backpropagation
❯ stochastic gradient descent
❯ PyTorch, TensorFlow, AllenNLP
❯ state-of-the-art
❉ architecture engineering

Weighted Finite-State Automata
❯ widely studied
❯ understandable
❯ interpretable
❯ informed model development
❉ low performance

SLIDE 85

Work in Progress 1: Are All Deep Models for NLP Equivalent to WFSAs?

◮ Elman RNN: hi = σ(W hi−1 + U vi + b)
◮ The interaction between hi and hi−1 is via affine transformations followed by nonlinearities
◮ Same for LSTM
◮ Most probably not equivalent to a WFSA
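For concreteness, a sketch of the Elman update; note that the nonlinearity σ wraps the affine map of hi−1, which is what blocks the sum-of-products (WFSA-style) unrolling used for the rational models above. Shapes and data are toy stand-ins:

```python
import numpy as np

# Sketch of the Elman update on this slide: h_i = sigma(W h_{i-1} + U v_i + b).
rng = np.random.default_rng(0)
d_h, d_v = 3, 4
W = rng.normal(size=(d_h, d_h))
U = rng.normal(size=(d_h, d_v))
b = rng.normal(size=d_h)

def sigma(x):
    return 1.0 / (1.0 + np.exp(-x))

def elman_step(h_prev, v):
    # h_{i-1} passes through an affine map and then a nonlinearity
    return sigma(W @ h_prev + U @ v + b)

h = np.zeros(d_h)
for v in rng.normal(size=(5, d_v)):  # run over a 5-token toy sequence
    h = elman_step(h, v)
print(h)
```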

SLIDE 86

Work in Progress 2: Automatic Model Development

s0 → s1 → s2 → s3 → s4   s0 → s1 → s2 → s3 → s4

SLIDE 87

Work in Progress 2: Automatic Model Development

s0 → s1 → s2 → s3 → s4   s0 → s1 → s2 → s3 → s4
s0 → s1 → s2 → s3 → s4   s0 → s1 → s2 → s3 → s4

SLIDE 88

Work in Progress 2: Automatic Model Development

s0 → s1 → s2 → s3 → s4   s0 → s1 → s2 → s3 → s4
s0 → s1 → s2 → s3 → s4   s0 → s1 → s2 → s3 → s4

Deep learning: model engineering

SLIDE 89

Work in Progress 2: Automatic Model Development

s0 → s1 → s2 → s3 → s4   s0 → s1 → s2 → s3 → s4
s0 → s1 → s2 → s3 → s4   s0 → s1 → s2 → s3 → s4

Deep learning: model engineering
SoPa: informed model development

SLIDE 90

Work in Progress 2: Automatic Model Development

s0 → s1 → s2 → s3 → s4   s0 → s1 → s2 → s3 → s4
s0 → s1 → s2 → s3 → s4   s0 → s1 → s2 → s3 → s4

Deep learning: model engineering
SoPa: informed model development
New: automatic model development

SLIDE 91

Other Projects

"I saw such a great talk today" → v1 v2 v3 v4 v5 v6 v7 → v1:7 → Classify (❯/❉)

◮ input (words)
◮ word embeddings
◮ sequence encoders
◮ output (prediction)

SLIDE 92

Other Projects

"I saw such a great talk today" → v1 v2 v3 v4 v5 v6 v7 → v1:7 → Classify (❯/❉)

◮ input (words): Schwartz et al., EMNLP 2013; Schwartz et al., COLING 2014
◮ word embeddings
◮ sequence encoders
◮ output (prediction)

SLIDE 93

Other Projects

"I saw such a great talk today" → v1 v2 v3 v4 v5 v6 v7 → v1:7 → Classify (❯/❉)

◮ input (words): Schwartz et al., EMNLP 2013; Schwartz et al., COLING 2014
◮ word embeddings: Schwartz et al., CoNLL 2015; Rubinstein et al., ACL 2015; Schwartz et al., NAACL 2016; Vulić et al., CoNLL 2017; Peters et al., 2018
◮ sequence encoders
◮ output (prediction)

SLIDE 94

Other Projects

"I saw such a great talk today" → v1 v2 v3 v4 v5 v6 v7 → v1:7 → Classify (❯/❉)

◮ input (words): Schwartz et al., EMNLP 2013; Schwartz et al., COLING 2014
◮ word embeddings: Schwartz et al., CoNLL 2015; Rubinstein et al., ACL 2015; Schwartz et al., NAACL 2016; Vulić et al., CoNLL 2017; Peters et al., 2018
◮ sequence encoders: Schwartz et al., ACL 2018; Peng et al., EMNLP 2018; Liu et al., RepL4NLP 2018 *best paper award*
◮ output (prediction)

SLIDE 95

Other Projects

"I saw such a great talk today" → v1 v2 v3 v4 v5 v6 v7 → v1:7 → Classify (❯/❉)

◮ input (words): Schwartz et al., EMNLP 2013; Schwartz et al., COLING 2014
◮ word embeddings: Schwartz et al., CoNLL 2015; Rubinstein et al., ACL 2015; Schwartz et al., NAACL 2016; Vulić et al., CoNLL 2017; Peters et al., 2018
◮ sequence encoders: Schwartz et al., ACL 2018; Peng et al., EMNLP 2018; Liu et al., RepL4NLP 2018 *best paper award*
◮ output (prediction)
◮ Labeled datasets (<sentence, label> pairs): Schwartz et al., ACL 2011; Schwartz et al., COLING 2012; Schwartz et al., CoNLL 2017; Gururangan et al., NAACL 2018; Kang et al., NAACL 2018; Zellers et al., EMNLP 2018

SLIDE 96

Annotation Artifacts in NLP Datasets

Schwartz et al., CoNLL 2017; Gururangan, Swayamdipta, Levy, Schwartz et al., NAACL 2018

Premise: A person is running on the beach
Hypothesis: The person is sleeping

Textual Entailment (state-of-the-art ∼90% accuracy)

SLIDE 97

Annotation Artifacts in NLP Datasets

Schwartz et al., CoNLL 2017; Gururangan, Swayamdipta, Levy, Schwartz et al., NAACL 2018

Premise: A person is running on the beach
Hypothesis: The person is sleeping
Label: entailment? contradiction? neutral?

Textual Entailment (state-of-the-art ∼90% accuracy)

SLIDE 98

Annotation Artifacts in NLP Datasets

Schwartz et al., CoNLL 2017; Gururangan, Swayamdipta, Levy, Schwartz et al., NAACL 2018

Premise: A person is running on the beach
Hypothesis: The person is sleeping
Label: entailment? contradiction? neutral?

Textual Entailment (state-of-the-art ∼90% accuracy)

AllenNLP Demo!

SLIDE 99

Annotation Artifacts in NLP Datasets

Schwartz et al., CoNLL 2017; Gururangan, Swayamdipta, Levy, Schwartz et al., NAACL 2018

Premise: A person is running on the beach
Hypothesis: The person is sleeping
Label: entailment? contradiction? neutral?

Textual Entailment (state-of-the-art ∼90% accuracy)

◮ The word "sleeping" is over-represented in the training data with the contradiction label
  ◮ an annotation artifact
◮ State-of-the-art models focus on this word rather than understanding the text

SLIDE 100

Annotation Artifacts in NLP Datasets

Schwartz et al., CoNLL 2017; Gururangan, Swayamdipta, Levy, Schwartz et al., NAACL 2018

Premise: A person is running on the beach
Hypothesis: The person is sleeping
Label: entailment? contradiction? neutral?

Textual Entailment (state-of-the-art ∼90% accuracy)

◮ The word "sleeping" is over-represented in the training data with the contradiction label
  ◮ an annotation artifact
◮ State-of-the-art models focus on this word rather than understanding the text
◮ Models are not as strong as we think they are

SLIDE 101

Long Term Vision

SLIDE 102

Long Term Vision

◮ Explainable models
◮ Unbiased models

SLIDE 103

Special Thanks to...

Li Zilles, Dana Rubinstein, Effi Levi


SLIDE 106

s0 → s1 → s2 → s3 → s4   ⇒   s0 → s1 → s2 → s3 → s4 (self-loops, ε-transitions: fθε())

More expressive WFSA ⇒ an interpretable, more robust convolutional neural network

Thank you!

Roy Schwartz
homes.cs.washington.edu/~roysch/
roysch@cs.washington.edu

SLIDE 107

Neural WFSAs as Sequence Encoders

"I saw such a great talk today" → v1 v2 v3 v4 v5 v6 v7 → v1:7 → Classify (❯/❉)

SLIDE 108

Neural WFSAs as Sequence Encoders

"I saw such a great talk today" → v1 v2 v3 v4 v5 v6 v7

s(v1:v4, θ(1)):  s0 →fθ(1)_0(v1) "such"→ s1 →fθ(1)_1(v2) "a"→ s2 →fθ(1)_2(v3) "great"→ s3 →fθ(1)_3(v4) "talk"→ s4

v1:7 → Classify (❯/❉)

SLIDE 109

Neural WFSAs as Sequence Encoders

"I saw such a great talk today" → v1 v2 v3 v4 v5 v6 v7

s(v2:v5, θ(1)):  s0 →fθ(1)_0(v2) "such"→ s1 →fθ(1)_1(v3) "a"→ s2 →fθ(1)_2(v4) "great"→ s3 →fθ(1)_3(v5) "talk"→ s4

v1:7 → Classify (❯/❉)

SLIDE 110

Neural WFSAs as Sequence Encoders

"I saw such a great talk today" → v1 v2 v3 v4 v5 v6 v7

s(v3:v6, θ(1)):  s0 →fθ(1)_0(v3) "such"→ s1 →fθ(1)_1(v4) "a"→ s2 →fθ(1)_2(v5) "great"→ s3 →fθ(1)_3(v6) "talk"→ s4

v1:7 → Classify (❯/❉)

SLIDE 111

Neural WFSAs as Sequence Encoders

"I saw such a great talk today" → v1 v2 v3 v4 v5 v6 v7

s(v4:v7, θ(1)):  s0 →fθ(1)_0(v4) "such"→ s1 →fθ(1)_1(v5) "a"→ s2 →fθ(1)_2(v6) "great"→ s3 →fθ(1)_3(v7) "talk"→ s4

v1:7 → Classify (❯/❉)

SLIDE 112

Neural WFSAs as Sequence Encoders

"I saw such a great talk today" → v1 v2 v3 v4 v5 v6 v7

max_i s(vi:vi+3, θ(1)):  s0 →fθ(1)_0(v4) "such"→ s1 →fθ(1)_1(v5) "a"→ s2 →fθ(1)_2(v6) "great"→ s3 →fθ(1)_3(v7) "talk"→ s4

v1:7 → Classify (❯/❉)

SLIDE 113

Neural WFSAs as Sequence Encoders

"I saw such a great talk today" → v1 v2 v3 v4 v5 v6 v7

max_i s(vi:vi+3, θ(1)), θ = θ(1):  s0 →fθ(1)_0(v4) "such"→ s1 →fθ(1)_1(v5) "a"→ s2 →fθ(1)_2(v6) "great"→ s3 →fθ(1)_3(v7) "talk"→ s4

v1:7 → Classify (❯/❉)

SLIDE 114

Neural WFSAs as Sequence Encoders

"I saw such a great talk today" → v1 v2 v3 v4 v5 v6 v7

max_i s(vi:vi+3, θ(2)), θ = θ(2):  s0 →fθ(2)_0(v4) "is"→ s1 →fθ(2)_1(v5) "remarkably"→ s2 →fθ(2)_2(v6) "dull"→ s3 →fθ(2)_3(v7) "!"→ s4

v1:7 → Classify (❯/❉)

SLIDE 115

Neural WFSAs as Sequence Encoders

"I saw such a great talk today" → v1 v2 v3 v4 v5 v6 v7

max_i s(vi:vi+3, θ(k)), θ = θ(k):  s0 →fθ(k)_0(v4) "gorgeous"→ s1 →fθ(k)_1(v5) "and"→ s2 →fθ(k)_2(v6) "witty"→ s3 →fθ(k)_3(v7) "movie"→ s4

v1:7 → Classify (❯/❉)
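Putting slides 107–115 together, a sketch of the encoder: one max-pooled score per pattern, concatenated into the document representation fed to the classifier. All names and data here are toy stand-ins:

```python
import numpy as np

# Sketch: each pattern k yields one feature max_i s(v_{i:i+3}, theta^(k)).
rng = np.random.default_rng(0)
n_patterns, width, d, n_tokens = 3, 4, 5, 7
thetas = rng.normal(size=(n_patterns, width, d))   # theta^(1) .. theta^(k)
v = rng.normal(size=(n_tokens, d))                 # v_1 .. v_7

def encode(v, thetas):
    feats = []
    for theta in thetas:
        scores = [sum(theta[j] @ v[i + j] for j in range(width))
                  for i in range(len(v) - width + 1)]
        feats.append(max(scores))                  # max-pool over windows
    return np.array(feats)

print(encode(v, thetas))  # one score per pattern; classify on this vector
```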

SLIDE 116

SoPa Complexity

◮ Running the Viterbi (1967) algorithm on a sequence of n tokens and a WFSA of d states typically takes O(d³ + d²n)
◮ We only allow zero or one ε-transition at a time ⇒ O(d²n)
◮ We only allow self-loop and main-path transitions ⇒ O(dn)
◮ Scores for all patterns can be computed in parallel
◮ GPU optimization further reduces the observed runtime to be sublinear in d
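A sketch of the O(dn) case, assuming only self-loop and main-path transitions (no ε), additive scores, and max-sum (Viterbi-style) semantics over a path that reads the whole sequence; the per-token score arrays are toy stand-ins:

```python
import numpy as np

NEG_INF = float("-inf")

rng = np.random.default_rng(0)
d, n = 5, 7                         # d states, n tokens
selfloop = rng.normal(size=(d, n))  # score of staying in state s on token i
main = rng.normal(size=(d - 1, n))  # score of moving from state s to s+1 on token i

def best_score(selfloop, main):
    """Max-scoring path from state 0 reading all n tokens, ending anywhere."""
    d, n = selfloop.shape
    score = np.full(d, NEG_INF)
    score[0] = 0.0
    for i in range(n):                               # O(d) work per token
        new = np.full(d, NEG_INF)
        new[0] = score[0] + selfloop[0, i]           # stay in start state
        for s in range(1, d):
            new[s] = max(score[s] + selfloop[s, i],      # self-loop
                         score[s - 1] + main[s - 1, i])  # advance one state
        score = new
    return score.max()

print(best_score(selfloop, main))
```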

SLIDE 117

Interpreting SoPa

Visualizing Sentiment Predictions

◮ Leave-one-out method on all patterns
◮ Visualize the spans with the largest positive and negative contributions

Analyzed documents:
  it's dumb, but more importantly, it's just not scary
  While its careful pace and seemingly opaque story may not satisfy every moviegoer's appetite, the film's final scene is soaringly, transparently moving

SLIDE 118

LSTMs Exploit Linguistic Attributes of Data

Liu, Levy, Schwartz et al., RepL4NLP 2018, best paper award

◮ Non-linguistic task

SLIDE 119

LSTMs Exploit Linguistic Attributes of Data

Liu, Levy, Schwartz et al., RepL4NLP 2018, best paper award

◮ Non-linguistic task
◮ Although they weren't designed that way, LSTMs do much better when trained on language data

SLIDE 120

Case Study 2: Recurrent Neural Networks (RNN)

Interpretable, more robust RNNs

SLIDE 121

Case Study 2: Recurrent Neural Networks (RNN)

s0 → s1

Interpretable, more robust RNNs

SLIDE 122

Case Study 2: Recurrent Neural Networks (RNN)

s0 → s1   ⇒   s0 → s1 → s2
More expressive WFSA

Interpretable, more robust RNNs


SLIDE 124

Recurrent Neural Networks: Hidden States

"I saw such a great talk today" → (v1, h1) (v2, h2) (v3, h3) (v4, h4) (v5, h5) (v6, h6) (v7, h7) → v1:7

SLIDE 125

Recurrent Neural Networks: Hidden States

"I saw such a great talk today" → (v1, h1) (v2, h2) (v3, h3) (v4, h4) (v5, h5) (v6, h6) (v7, h7) → v1:7

Recurrent function: hi = f(hi−1, vi)

SLIDE 126

Multiple Variants of Recurrent Neural Networks

◮ Elman (1990)
◮ LSTM (Hochreiter and Schmidhuber, 1997)
◮ GRU (Cho et al., 2014)
◮ SGU (Gao and Glowacka, 2016)
◮ RAN (Lee et al., 2017)
◮ SCRN (Mikolov et al., 2014)
◮ T-RNN (Balduzzi and Ghifary, 2016)
◮ RCNN (Lei et al., 2016)
◮ Q-RNN (Bradbury et al., 2017)
◮ ISAN (Foerster et al., 2017)
◮ SoPa (Schwartz et al., 2018)
◮ SRU (Lei et al., 2018)

SLIDE 127

Multiple Variants of Recurrent Neural Networks

◮ Elman (1990)
◮ LSTM (Hochreiter and Schmidhuber, 1997)
◮ GRU (Cho et al., 2014)
◮ SGU (Gao and Glowacka, 2016)
◮ RAN (Lee et al., 2017)
◮ SCRN (Mikolov et al., 2014)
◮ T-RNN (Balduzzi and Ghifary, 2016)
◮ RCNN (Lei et al., 2016)
◮ Q-RNN (Bradbury et al., 2017)
◮ ISAN (Foerster et al., 2017)
◮ SoPa (Schwartz et al., 2018)
◮ SRU (Lei et al., 2018)

◮ What do different RNN variants have in common?
◮ What are they learning?
◮ Can we improve them?

SLIDE 128

Example: Strongly-Typed Recurrent Neural Networks

Balduzzi and Ghifary (2016)

◮ A simple, competitive RNN
◮ Draws inspiration from physics and functional programming
◮ hi = zi · hi−1 + ui
◮ zi, ui are non-linear parameterized functions of vi

SLIDE 129

Example: Strongly-Typed Recurrent Neural Networks

Balduzzi and Ghifary (2016)

◮ A simple, competitive RNN
◮ Draws inspiration from physics and functional programming
◮ hi = zi · hi−1 + ui
◮ zi, ui are non-linear parameterized functions of vi
◮ Let xi = [xi]k (a single coordinate):

  hn = zn · hn−1 + un

SLIDE 130

Example: Strongly-Typed Recurrent Neural Networks

Balduzzi and Ghifary (2016)

◮ A simple, competitive RNN
◮ Draws inspiration from physics and functional programming
◮ hi = zi · hi−1 + ui
◮ zi, ui are non-linear parameterized functions of vi
◮ Let xi = [xi]k (a single coordinate):

  hn = zn · hn−1 + un
     = zn · (zn−1 · hn−2 + un−1) + un

SLIDE 131

Example: Strongly-Typed Recurrent Neural Networks

Balduzzi and Ghifary (2016)

◮ A simple, competitive RNN
◮ Draws inspiration from physics and functional programming
◮ hi = zi · hi−1 + ui
◮ zi, ui are non-linear parameterized functions of vi
◮ Let xi = [xi]k (a single coordinate):

  hn = zn · hn−1 + un
     = zn · (zn−1 · hn−2 + un−1) + un
     = zn · (zn−1 · (zn−2 · hn−3 + un−2) + un−1) + un

SLIDE 132

Example: Strongly-Typed Recurrent Neural Networks

Balduzzi and Ghifary (2016)

◮ A simple, competitive RNN
◮ Draws inspiration from physics and functional programming
◮ hi = zi · hi−1 + ui
◮ zi, ui are non-linear parameterized functions of vi
◮ Let xi = [xi]k (a single coordinate):

  hn = zn · hn−1 + un
     = zn · (zn−1 · hn−2 + un−1) + un
     = zn · (zn−1 · (zn−2 · hn−3 + un−2) + un−1) + un
     = ...
     = Σ_{i=1..n−1} ( ui · Π_{j=i+1..n} zj ) + un
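The unrolled form can be sanity-checked numerically (elementwise, per the single-coordinate view above); a minimal sketch with random gates and updates:

```python
import numpy as np

# Sketch: h_n from the recurrence h_i = z_i * h_{i-1} + u_i equals
# sum_{i<n} u_i * prod_{j>i} z_j + u_n (with h_0 = 0).
rng = np.random.default_rng(0)
n, dim = 6, 3
z = rng.uniform(0, 1, size=(n + 1, dim))   # z_1..z_n (index 0 unused)
u = rng.normal(size=(n + 1, dim))          # u_1..u_n (index 0 unused)

h = np.zeros(dim)
for i in range(1, n + 1):                  # run the recurrence
    h = z[i] * h + u[i]

closed = u[n].copy()                       # sum-of-products form
for i in range(1, n):
    closed += u[i] * np.prod(z[i + 1:n + 1], axis=0)

assert np.allclose(h, closed)
```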

SLIDE 133

Weighted Finite-State Automata!

s0 →f0→1(v, θ)→ s1, with self-loops: 1 on s0, f1→1(v, θ) on s1

SLIDE 134

Weighted Finite-State Automata!

s0 →f0→1(v, θ)→ s1, with self-loops: 1 on s0, f1→1(v, θ) on s1

◮ Soft Pattern: W
◮ Ignore the self-loops for simplicity

SLIDE 135

Weighted Finite-State Automata!

s0 →f0→1(v, θ)→ s1, with self-loops: 1 on s0, f1→1(v, θ) on s1

◮ Soft Pattern: W
◮ Ignore the self-loops for simplicity
◮ S2(v1:vn) = Σ_{i=1..n−1} ( f0→1(vi, θ) · Π_{j=i+1..n} f1→1(vj, θ) ) + f0→1(vn, θ)

SLIDE 136

Strongly-Typed RNNs are Rational!

Can Be Computed Using a Set of WFSAs

hn = Σ_{i=1..n−1} ( ui · Π_{j=i+1..n} zj ) + un

S2(v1:vn) = Σ_{i=1..n−1} ( f0→1(vi, θ) · Π_{j=i+1..n} f1→1(vj, θ) ) + f0→1(vn, θ)

SLIDE 137

Work in Progress 3: Make your own Deep Model!

Deep Model

s0 → s1 → s2 → s3 → s4

SLIDE 138

Work in Progress 3: Make your own Deep Model!

Deep Model

s0 → s1 → s2 → s3 → s4
s0 → s1 → s2 → s3 → s4
s0 → s1 → s2 → s3 → s4

SLIDE 139

David Balduzzi and Muhammad Ghifary. 2016. Strongly-typed recurrent neural networks. In Proc. of ICML.

James Bradbury, Stephen Merity, Caiming Xiong, and Richard Socher. 2017. Quasi-recurrent neural networks. In Proc. of ICLR.

Kyunghyun Cho, Bart van Merrienboer, Dzmitry Bahdanau, and Yoshua Bengio. 2014. On the properties of neural machine translation: Encoder-decoder approaches. In Proc. of SSST.

Jeffrey L. Elman. 1990. Finding structure in time. Cognitive Science, 14(2):179–211.

Jakob N. Foerster, Justin Gilmer, Jan Chorowski, Jascha Sohl-Dickstein, and David Sussillo. 2017. Intelligible language modeling with input switched affine networks. In Proc. of ICML.

Yuan Gao and Dorota Glowacka. 2016. Deep gate recurrent neural network. In Proc. of ACML, pages 350–365.

Sepp Hochreiter and Jürgen Schmidhuber. 1997. Long short-term memory. Neural Computation, 9(8):1735–1780.

Kenton Lee, Omer Levy, and Luke Zettlemoyer. 2017. Recurrent additive networks. arXiv:1705.07393.

SLIDE 140

Tao Lei, Hrishikesh Joshi, Regina Barzilay, Tommi Jaakkola, Kateryna Tymoshenko, Alessandro Moschitti, and Lluís Màrquez. 2016. Semi-supervised question retrieval with gated convolutions. In Proc. of NAACL.

Tao Lei, Yu Zhang, Sida I. Wang, Hui Dai, and Yoav Artzi. 2018. Simple recurrent units for highly parallelizable recurrence. In Proc. of EMNLP.

Tomas Mikolov, Armand Joulin, Sumit Chopra, Michaël Mathieu, and Marc'Aurelio Ranzato. 2014. Learning longer memory in recurrent neural networks. arXiv:1412.7753.

Marcel Paul Schützenberger. 1961. On the definition of a family of automata. Information and Control, 4(2–3):245–270.

Roy Schwartz, Sam Thomson, and Noah A. Smith. 2018. SoPa: Bridging CNNs, RNNs, and weighted finite-state machines. In Proc. of ACL.

A. Viterbi. 1967. Error bounds for convolutional codes and an asymptotically optimum decoding algorithm. IEEE Transactions on Information Theory, 13(2):260–269.