Analyzing Neural Language Models Introduction
Shane Steinert-Threlkeld Jan 9, 2020
Today's Plan
- Motivation / background
  - NLP's ImageNet moment
  - NLP's Clever Hans moment
- 15 minute break
- Course information /
CVPR ‘09
NeurIPS 2012 paper: “AlexNet”
VGG16, Inception, ResNet (34 layers above; up to 152 in paper)
“We use features extracted from the OverFeat network as a generic image representation to tackle the diverse range of recognition tasks of object image classification, scene recognition, fine grained recognition, attribute detection and image retrieval applied to a diverse set of datasets. … Astonishingly, we report consistent superior results compared to the highly tuned state-of-the-art systems in all the visual classification tasks on various datasets”
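The recipe in this quote (freeze a pretrained network, treat its activations as generic features, and fit only a simple classifier on top) can be sketched in a few lines. This is a toy stand-in, not OverFeat itself: a fixed random projection plays the role of the frozen pretrained network, and the downstream task is two synthetic Gaussian classes.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for a pretrained network: a frozen random ReLU projection.
# In practice this would be OverFeat/ResNet activations; here it only
# illustrates the freeze-then-fit transfer recipe.
W_frozen = rng.normal(size=(10, 32))

def extract_features(x):
    """Frozen 'pretrained' feature extractor: never updated on the new task."""
    return np.maximum(x @ W_frozen, 0.0)

# Toy downstream task: two well-separated Gaussian classes.
x = np.concatenate([rng.normal(-1, 1, (100, 10)), rng.normal(1, 1, (100, 10))])
y = np.array([0] * 100 + [1] * 100)

# Train only a linear classifier (logistic regression by gradient descent)
# on top of the frozen, standardized features.
feats = extract_features(x)
feats = (feats - feats.mean(axis=0)) / (feats.std(axis=0) + 1e-8)
w, b = np.zeros(feats.shape[1]), 0.0
for _ in range(500):
    p = 1.0 / (1.0 + np.exp(-(feats @ w + b)))
    grad = p - y
    w -= 0.1 * feats.T @ grad / len(y)
    b -= 0.1 * grad.mean()

acc = ((1.0 / (1.0 + np.exp(-(feats @ w + b))) > 0.5) == y).mean()
```

Only `w` and `b` are ever updated; the "pretrained" weights stay frozen, which is exactly what made the OverFeat result astonishing at the time.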
CVPR ’17 paper
Pre-trained ResNet
https://twitter.com/rgblong/status/916062474545319938?lang=en
Peters et al. (2018)
Source → Nearest Neighbors

GloVe: “play” → playing, game, games, played, players, plays, player, Play, football, multiplayer

biLM:
“Chico Ruiz made a spectacular play on Alusik’s grounder…” → “Kieffer, the only junior in the group, was commended for his ability to hit in the clutch, as well as his all-round excellent play.”
“Olivia De Havilland signed to do a Broadway play for Garson…” → “…they were actors who had been handed fat roles in a successful play, and had talent enough to fill the roles competently, with nice understatement.”

Peters et al. (2018)
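The contrast in this table is between one vector per word type (GloVe) and one vector per word occurrence (the biLM). A nearest-neighbor lookup over static embeddings can be sketched with cosine similarity; the vectors below are made up for illustration, so with a static table every sense of "play" shares the same neighbors regardless of context.

```python
import numpy as np

# Toy static ("GloVe-style") embeddings: one vector per word type.
# Values are invented for illustration only.
emb = {
    "play":    np.array([0.9, 0.8, 0.1]),
    "playing": np.array([0.85, 0.75, 0.15]),
    "game":    np.array([0.7, 0.9, 0.0]),
    "theatre": np.array([0.1, 0.2, 0.9]),
    "drama":   np.array([0.15, 0.1, 0.85]),
}

def cosine(u, v):
    return u @ v / (np.linalg.norm(u) * np.linalg.norm(v))

def nearest(word, k=2):
    """k nearest neighbors of `word` by cosine similarity (excluding itself)."""
    sims = {w: cosine(emb[word], v) for w, v in emb.items() if w != word}
    return sorted(sims, key=sims.get, reverse=True)[:k]

print(nearest("play"))  # → ['playing', 'game']
```

A contextual model like ELMo instead produces a different vector for each occurrence of "play", which is why the biLM rows above retrieve sense-matched sentences rather than a single fixed neighbor list.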
SQuAD = Stanford Question Answering Dataset; SNLI = Stanford Natural Language Inference Corpus; SST = Stanford Sentiment Treebank
figure: Matthew Peters
BERT: Bidirectional Encoder Representations from Transformers (Devlin et al., 2019)
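BERT's pretraining objective corrupts the input and asks the model to recover it: roughly 15% of positions become prediction targets, and of those 80% are replaced by [MASK], 10% by a random word, and 10% left unchanged. A toy, word-level sketch of that corruption step (real BERT operates on WordPiece ids, not whole words):

```python
import random

def mask_tokens(tokens, vocab, p=0.15, seed=0):
    """Toy sketch of BERT-style masked-LM corruption (Devlin et al. 2019):
    pick ~p of positions as targets; of those, 80% become [MASK], 10% a
    random vocab word, 10% stay unchanged. Illustrative only."""
    rng = random.Random(seed)
    out, targets = list(tokens), {}
    for i, tok in enumerate(tokens):
        if rng.random() < p:
            targets[i] = tok  # the model must predict the original token here
            r = rng.random()
            if r < 0.8:
                out[i] = "[MASK]"
            elif r < 0.9:
                out[i] = rng.choice(vocab)
            # else: leave the token unchanged (but it is still a target)
    return out, targets

sent = "the quick brown fox jumps over the lazy dog".split()
corrupted, targets = mask_tokens(sent, vocab=sent)
```

Because the encoder sees both left and right context of every target, the objective is bidirectional, unlike a left-to-right language model.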
https://www.blog.google/products/search/search-language-understanding-bert/
General Language Understanding Evaluation (GLUE) / SuperGLUE
Mikolov et al. (2013a), the OG word2vec paper
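The famous word2vec result is that analogies can be answered by vector arithmetic: "man is to king as woman is to ?" via king − man + woman. A sketch with hand-built toy vectors (constructed so king = man + royal and queen = woman + royal, so the arithmetic works exactly; real word2vec vectors only satisfy it approximately):

```python
import numpy as np

# Hand-built toy vectors; dimensions loosely read as [male, female, royal].
emb = {
    "man":   np.array([1.0, 0.0, 0.0]),
    "woman": np.array([0.0, 1.0, 0.0]),
    "king":  np.array([1.0, 0.0, 1.0]),
    "queen": np.array([0.0, 1.0, 1.0]),
    "boy":   np.array([0.9, 0.1, 0.0]),
    "girl":  np.array([0.1, 0.9, 0.0]),
}

def analogy(a, b, c):
    """Answer 'a is to b as c is to ?' via b - a + c, excluding the three
    query words, as in the standard evaluation from Mikolov et al. (2013a)."""
    target = emb[b] - emb[a] + emb[c]
    def cos(u, v):
        return u @ v / (np.linalg.norm(u) * np.linalg.norm(v))
    candidates = {w: v for w, v in emb.items() if w not in (a, b, c)}
    return max(candidates, key=lambda w: cos(candidates[w], target))

print(analogy("man", "king", "woman"))  # → queen
```

Excluding the query words from the candidate set matters: without it, the nearest vector to b − a + c is very often b or c itself.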
McCoy et al. (2019)
(performance improves if fine-tuned on this challenge set)
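One of the shallow heuristics McCoy et al. (2019) probe with their HANS challenge set is lexical overlap: predict "entailment" whenever every word of the hypothesis appears in the premise. A toy version of that heuristic shows both why it often works and how it fails on word-order contrasts:

```python
def lexical_overlap_entailment(premise, hypothesis):
    """The 'lexical overlap' heuristic probed by HANS (McCoy et al. 2019):
    guess entailment iff every hypothesis word occurs in the premise.
    It is right on many natural NLI pairs but ignores word order entirely."""
    return set(hypothesis.lower().split()) <= set(premise.lower().split())

# Right, more or less by accident:
print(lexical_overlap_entailment("the doctor paid the actor",
                                 "the doctor paid"))                 # True
# Wrong: same words, reversed roles, heuristic still says entailment.
print(lexical_overlap_entailment("the doctor paid the actor",
                                 "the actor paid the doctor"))       # True
```

A model that scores near chance on HANS's non-entailed overlap cases has plausibly learned something close to this heuristic rather than actual inference.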
data, training protocols, …; architectures (LSTMs vs. Transformers); similarities to human learning biases
[architectures, tasks, data, …]
[visualization, probing classifiers, artificial data, …]
[guest lecture by Rachel Rudinger]
1ziyww5J49iQ7iE8ElgzMX6rl2cR_cbcKxK89pvZDPF4/edit?usp=sharing