Part-of-Speech Tagging for Historical English
Yi Yang and Jacob Eisenstein
Georgia Tech
Digital humanities research
How does the portrayal of men and women differ in Shakespeare’s plays?
What are the language use patterns in North American slave narratives?
[Muralidharan and Hearst, 2011&2012]
Hee said nobody had said anything agt mee .
[Henry Oxinden, 1660]
Contemporary forms: Hee = He, agt = against, mee = me
[Figure: Stanford tagger output compared with gold tags for this sentence]
Error rate (%) [Rayson et al., 2007]
Modern English: 3.0
Early Modern English: 18.0
contemporary forms.
Rayson et al. (2007); Scheible et al. (2011); Bollmann (2011)
representation learning.
Yang & Eisenstein (2014); Yang & Eisenstein (2015)
[VARD; Baron and Rayson, 2008]
Original: Hee said nobody had said anything agt mee .
Normalized: Hee said nobody had said anything aged me .
Normalization errors: "agt" becomes "aged" (should be "against"); "Hee" is left unchanged (should be "He").
[Figure: Stanford tagger output compared with gold tags for the normalized sentence]
Hee said nobody had said anything agt mee .
OOV word: Hee. Context: said, was, came, told, …
IV words: He, I, We, … Context: said, was, came, told, …
[FEMA; Yang and Eisenstein, 2015]
Hee said nobody had said anything agt mee .
Features for "Hee", one per template (1 to 4):
1. CurrWord = hee
2. NextWord = said
3. Prefix1 = h
4. Suffix1 = e
…
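The feature templates listed above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' feature extractor: the function name `extract_features`, the `Name=value` string format, and the `</s>` padding symbol for the last token are assumptions introduced here. Lowercasing follows the slide, which shows "hee" for "Hee".

```python
def extract_features(tokens, i):
    """Return one feature string per template for the token at position i.

    Only the four templates shown on the slide are implemented; the slide's
    trailing ellipsis indicates that the real template set is larger.
    """
    word = tokens[i].lower()
    # "</s>" is a hypothetical padding symbol for the end of the sentence.
    next_word = tokens[i + 1].lower() if i + 1 < len(tokens) else "</s>"
    return [
        f"CurrWord={word}",
        f"NextWord={next_word}",
        f"Prefix1={word[:1]}",
        f"Suffix1={word[-1:]}",
    ]


tokens = "Hee said nobody had said anything agt mee .".split()
features = extract_features(tokens, 0)
print(features)
# ['CurrWord=hee', 'NextWord=said', 'Prefix1=h', 'Suffix1=e']
```

Each token is thus represented by one active feature per template, which is what allows one feature of a token to be predicted from another.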
Input embedding: $u_2$. Output embeddings: $v_1$, $v_3$, $v_4$.
$p(f_t \mid f_2) \propto \exp\left(u_2^\top v_t\right)$
Objective: $\sum_{t \neq 2}^{T} \log p(f_t \mid f_2)$
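The objective above can be evaluated numerically as follows. This is a sketch under stated assumptions: the normalization is a full softmax over a small, made-up set of candidate values per template (the slide only gives the proportionality), the embeddings are random, and the names `log_softmax`, `skipgram_objective`, `output_tables`, and `observed` are hypothetical.

```python
import numpy as np


def log_softmax(scores):
    """Numerically stable log of the softmax of a score vector."""
    shifted = scores - scores.max()
    return shifted - np.log(np.exp(shifted).sum())


def skipgram_objective(u, output_tables, observed):
    """Sum over the other templates t of log p(f_t | f_2).

    u             : input embedding of the conditioning feature, shape (d,)
    output_tables : template id -> output embeddings, shape (n_values, d)
    observed      : template id -> row index of the observed feature value
    """
    total = 0.0
    for t, V in output_tables.items():
        # log p(f_t | f_2) = log softmax(V u)[observed value of template t]
        total += log_softmax(V @ u)[observed[t]]
    return total


rng = np.random.default_rng(0)
d = 8
u2 = rng.normal(size=d)  # input embedding for template 2
# Templates 1, 3, 4 each get five hypothetical candidate values.
output_tables = {t: rng.normal(size=(5, d)) for t in (1, 3, 4)}
observed = {1: 0, 3: 2, 4: 1}
objective = skipgram_objective(u2, output_tables, observed)
print(objective)
```

Training would adjust the input and output embeddings to increase this quantity over all tokens; the optimizer is left out of the sketch.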
[word2vec; Mikolov et al., 2013]
features (1 to 4): CurrWord = hee, NextWord = said, Prefix1 = h, Suffix1 = e, …
words (1 to 4): hee, said, nobody, had, …
[FEMA; Yang and Eisenstein, 2015]
involves in two domains.
[FEMA; Yang and Eisenstein, 2015]
Hee said nobody had said anything agt mee .
Domain Attributes: Genre = letters, Epoch = 1600+
features (1 to 4): CurrWord = hee, NextWord = said, Prefix1 = h, Suffix1 = e, …
Each feature embedding = (shared) + (letters) + (1600+)
$u_2 = h_2^{(\mathrm{shared})} + h_2^{(\mathrm{letters})} + h_2^{(\mathrm{1600+})}$
$p(f_t \mid f_2) \propto \exp\left(u_2^\top v_t\right)$
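The additive composition of the input embedding can be sketched as a lookup-and-sum over embedding tables. The table sizes, random values, and the names `compose_embedding` and `tables` are assumptions for illustration; only the form "shared + one vector per domain attribute" comes from the slides.

```python
import numpy as np


def compose_embedding(tables, feature_id, attributes):
    """Input embedding = shared vector + one vector per active domain attribute."""
    u = tables["shared"][feature_id].copy()
    for attr in attributes:
        u += tables[attr][feature_id]
    return u


rng = np.random.default_rng(0)
n_features, d = 10, 4  # hypothetical sizes
tables = {
    name: rng.normal(size=(n_features, d))
    for name in ("shared", "letters", "1600+")
}

# The example sentence comes from a letter written after 1600.
u2 = compose_embedding(tables, feature_id=2, attributes=["letters", "1600+"])
print(u2)
```

A text with different attributes reuses the same shared table and swaps in its own attribute tables, so statistics common to all domains are learned once.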
Modern British English (MBE), # of tokens
1700-1769: 343,024
1770-1839: 427,424
1840-1914: 322,255
Early Modern English (EME), # of tokens
1500-1569: 640,255
1570-1639: 706,587
1640-1710: 614,315
[Kroch and Taylor, 2000; Kroch et al., 2004]
[Moon and Baldridge, 2007]
Tag mapping, PCHE to PTB:
ADJ -> JJ
ADV, ALSO -> RB
VB, VBI -> VB
…
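Applying such a tag mapping is a dictionary lookup. The sketch below contains only the rows visible on the slide; the full mapping is elided there and is not reproduced here. The function name `map_tags`, the choice to pass unmapped tags through unchanged, and the example tokens are assumptions.

```python
# Partial PCHE -> PTB mapping: only the rows shown on the slide.
PCHE_TO_PTB = {
    "ADJ": "JJ",
    "ADV": "RB",
    "ALSO": "RB",
    "VB": "VB",
    "VBI": "VB",
}


def map_tags(tagged_tokens, mapping=PCHE_TO_PTB):
    """Convert (word, PCHE tag) pairs to (word, PTB tag) pairs.

    Tags missing from the mapping are passed through unchanged, which is a
    simplification made for this sketch.
    """
    return [(word, mapping.get(tag, tag)) for word, tag in tagged_tokens]


# Hypothetical tagged tokens, used only to exercise the function.
example = [("good", "ADJ"), ("also", "ALSO"), ("goe", "VBI")]
print(map_tags(example))
# [('good', 'JJ'), ('also', 'RB'), ('goe', 'VB')]
```

Because several PCHE tags collapse onto one PTB tag, the mapping is many-to-one and cannot be inverted.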
[Blitzer et al., 2006; Brown et al., 1992; Mikolov et al., 2013]
Modern British English (MBE) and Early Modern English (EME) are each split by period into Train, Test 1, and Test 2.
Average error rate (%)
Baseline: 4.6
SCL: 4.3
Brown: 4.2
word2vec: 4.4
FEMA (our method): 3.7 (-0.9)
Average error rate (%)
Baseline: 9.4
SCL: 8.2
Brown: 8.0
word2vec: 8.3
FEMA (our method): 6.6 (-2.8)
# of tokens
Penn Treebank (Train): 969,905
Modern British English (Test 1): 1,092,703
Early Modern English (Test 2): 1,961,157
Standard evaluation scenario for English POS tagging.
Insufficient annotated data for historical texts.
Error rate (%)
Baseline: 18.9
SCL: 18.4
Brown: 18.4
word2vec: 18.3
FEMA (our method): 17.5 (-1.4)
Error rate (%)
Baseline: 25.9
SCL: 24.1
Brown: 24.0
word2vec: 24.2
FEMA (our method): 22.1 (-3.8)
Error rate (%)
Baseline: 25.9
VARD (spelling normalization): 23.3 (-2.6)
FEMA (representation learning): 22.1 (-3.8)
FEMA + VARD (representation learning + normalization): 21.0 (-4.9)
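The reductions quoted in parentheses follow directly from the error rates. A short check, using only the numbers on the slide (the variable names are illustrative):

```python
# Error rates (%) as reported on the slide.
error_rates = {
    "Baseline": 25.9,
    "VARD": 23.3,
    "FEMA": 22.1,
    "FEMA + VARD": 21.0,
}

baseline = error_rates["Baseline"]
# Absolute reduction in error rate relative to the baseline, in points.
reductions = {
    name: round(baseline - rate, 1)
    for name, rate in error_rates.items()
    if name != "Baseline"
}
print(reductions)
# {'VARD': 2.6, 'FEMA': 3.8, 'FEMA + VARD': 4.9}
```

The combined reduction (4.9) is larger than either single reduction but smaller than their sum (2.6 + 3.8 = 6.4), which suggests the two techniques overlap partly while still adding to each other.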
token | annotations in PCHE | annotations in PTB
, (comma) | , (comma; 83.4%) . (period; 16.6%) | , (comma)
. (period) | , (comma; 12.3%) . (period; 87.7%) | . (period)
to | TO (54.6%) IN (44.3%) | TO
all/any/every | JJ (quantifier) | DT
exploiting task-specific information in feature templates.
complementary for improving tagging performance.
taggers for historical English.