Character Eyes: Seeing Language through Character-Level Taggers
Yuval Pinter Marc Marone Jacob Eisenstein
@yuvalpi @ruyimarone @jacobeisenstein
https://github.com/ruyimarone/character-eyes
Blackbox NLP 2019
Taggers: N sg, V past, RB, DET
Agglutination
Prefixing morphology (e.g. Coptic)
Introflexive morphology (Hebrew, Arabic)
Characterize languages based on model analysis; help engineer language-aware systems
The cat walked fast
Hidden unit #n
Hidden unit #m
○ ev ‘house’; evler ‘houses’; evleriniz ‘your houses’; evlerinizden ‘from your houses’
Unit 124 (←); Unit 3 (→)
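To make the direction arrows concrete, here is a minimal sketch of how a forward unit and a backward unit read the same word. It uses a toy tanh RNN with random weights as a stand-in for the tagger's character-level LSTM. The function name `run_char_rnn`, the alphabet, the hidden size, and the weights are all assumptions for illustration, not the trained model from the talk.

```python
import math
import random

ALPHABET = "abcdefghijklmnopqrstuvwxyz"  # toy alphabet, enough for the example words

def run_char_rnn(word, hidden_size=4, seed=0, reverse=False):
    """Return (char, hidden_state) after each step of a toy tanh RNN (random weights)."""
    rng = random.Random(seed)
    # Hypothetical parameters: one input vector per character plus recurrent weights.
    # The same toy weights are reused for both directions to keep the sketch short.
    w_in = {c: [rng.uniform(-1, 1) for _ in range(hidden_size)] for c in ALPHABET}
    w_rec = [[rng.uniform(-0.5, 0.5) for _ in range(hidden_size)]
             for _ in range(hidden_size)]
    chars = reversed(word) if reverse else word
    h = [0.0] * hidden_size
    trace = []
    for c in chars:
        h = [math.tanh(w_in[c][i] + sum(w_rec[i][j] * h[j] for j in range(hidden_size)))
             for i in range(hidden_size)]
        trace.append((c, h))
    return trace

short = run_char_rnn("ev")
long_ = run_char_rnn("evlerinizden")
back = run_char_rnn("evlerinizden", reverse=True)

# A forward unit has the same state after "ev" in both words; the suffixes come later.
print(short[-1][1] == long_[1][1])
# A backward unit starts from the end of the word, so it meets the suffixes first.
print([c for c, _ in back][:3])
```

In this toy setting, a forward unit's final state is the one computed after the last suffix character, while a backward unit's final state is computed after the stem.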
○ POS tags + Morphosyntactic Descriptions
○ 5 agglutinative languages
○ 2 introflexive languages
○ 3 isolating, 14 fusional
○ (All) 1 prefixing language
○ 2 non-affixing
○ 2 equally pre- and suffixing
○ 19 suffixing
Source for language classes: WALS
○ POS tags + Morphosyntactic Descriptions
○ Linguistic diversity (synthesis + affixation)
○ (Not analyzed)
○ No word embeddings
○ Char embedding size: 256
Unit 42: [0.0,0.1) [0.1,0.2) … [0.9,1.0)
NOUN: 8 2 … 40
VERB: 20 … 4
…
ADJ: 10 10 … 10
Example activation: 0.42
○ PDI: POS Discrimination Index (higher PDI = better discriminator)
Unit 40, Unit 41: same bins and POS rows as Unit 42
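The per-unit tables above count, for each POS tag, how many items fall into each of ten bins from [0.0,0.1) to [0.9,1.0). The sketch below builds such a table from (tag, activation) pairs and then scores how well the unit separates tags. The slides do not give the PDI formula, so `discrimination_score` uses mutual information between tag and bin as a plainly labeled stand-in. Both function names and the sample data are hypothetical.

```python
import math
from collections import defaultdict

def bin_activations(samples, n_bins=10):
    """Count items per (POS tag, bin); assumes activations lie in [0.0, 1.0]."""
    table = defaultdict(lambda: [0] * n_bins)
    for tag, activation in samples:
        # Clamp so that 1.0 (or a stray out-of-range value) lands in an edge bin.
        idx = min(max(int(activation * n_bins), 0), n_bins - 1)
        table[tag][idx] += 1
    return dict(table)

def discrimination_score(table):
    """Stand-in score (NOT the PDI formula): mutual information of tag and bin, in bits."""
    total = sum(sum(row) for row in table.values())
    n_bins = len(next(iter(table.values())))
    bin_totals = [sum(row[b] for row in table.values()) for b in range(n_bins)]
    mi = 0.0
    for row in table.values():
        tag_total = sum(row)
        for b, count in enumerate(row):
            if count:
                mi += (count / total) * math.log2(count * total / (tag_total * bin_totals[b]))
    return mi

# Made-up activations for one unit: nouns high, verbs low.
samples = [("NOUN", 0.95), ("NOUN", 0.91), ("VERB", 0.05), ("VERB", 0.02), ("ADJ", 0.42)]
table = bin_activations(samples)
print(table["ADJ"])                  # the 0.42 activation falls in bin [0.4,0.5)
print(discrimination_score(table))
```

A unit whose tags occupy disjoint bins gets a high score under this stand-in, and a unit whose tags are spread identically over the bins gets zero, which matches the intent of "higher = better discriminator".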
○ Summing total PDI mass
○ Reporting % of forward units before the mass median
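The two aggregates above can be sketched as follows, under one plausible reading of "before the mass median": rank units by PDI from strongest to weakest, walk down the ranking until half of the total mass is covered, and report the share of forward units in that head of the ranking (including the unit that crosses the halfway point). The function name, the `"fwd"`/`"bwd"` labels, and the example values are assumptions, not taken from the talk.

```python
def mass_summary(units):
    """units: non-empty list of (direction, pdi) pairs, direction in {"fwd", "bwd"}."""
    if not units:
        raise ValueError("need at least one unit")
    total_mass = sum(pdi for _, pdi in units)
    ranked = sorted(units, key=lambda u: u[1], reverse=True)  # strongest units first
    running, head = 0.0, []
    for direction, pdi in ranked:
        head.append(direction)
        running += pdi
        if running >= total_mass / 2:  # reached the mass median
            break
    forward_share = 100.0 * head.count("fwd") / len(head)
    return total_mass, forward_share

# Made-up PDI values for five units.
units = [("fwd", 3.0), ("bwd", 2.0), ("fwd", 1.0), ("bwd", 1.0), ("bwd", 1.0)]
total, share = mass_summary(units)
print(total, share)
```

Under this reading, a large total mass means many units discriminate POS well, and a forward share far from 50% means one direction carries most of that ability.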
○ Large mass (easy to distinguish POS based on char sequence)
○ Forward-heavy (71%)
○ Small mass (hard to capture POS)
○ Backward-heavy (80%)
Total PDI mass
○ Especially on agglutinative languages and
○ Fully-forward better than fully-backward
○ MAJOR caveat – 128x128 > 2*(64x64)
[Figure: Tagging accuracy change, grouped by synthesis (Agglutinative, Introflexive, Fusional) and by affixation (Strongly Suffixing, Little Affixation, Prefixing)]
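The caveat "128x128 > 2*(64x64)" can be checked with a little arithmetic. On one reading, it says that a single 128-unit direction has more recurrent weights than two 64-unit directions, so a fully unidirectional model is not size-matched to a balanced bidirectional one. The sketch below assumes a standard LSTM parameterization (four gates, one bias vector each) and assumes the 256-dimensional character embedding is the LSTM input. Neither detail is confirmed by the slides, and `lstm_params` is a hypothetical helper.

```python
def lstm_params(input_size, hidden_size):
    """Parameters of one standard LSTM layer: 4 gates, each with input weights,
    recurrent weights, and a single bias vector (an assumed parameterization)."""
    return 4 * (hidden_size * (input_size + hidden_size) + hidden_size)

CHAR_EMB = 256  # char embedding size from the deck; assumed to be the LSTM input size

# Hidden-to-hidden weights per gate, as written in the caveat.
recurrent_uni = 128 * 128      # one 128-unit direction
recurrent_bi = 2 * (64 * 64)   # two 64-unit directions

# Full layer sizes under the assumptions above.
uni_total = lstm_params(CHAR_EMB, 128)
bi_total = 2 * lstm_params(CHAR_EMB, 64)

print(recurrent_uni, recurrent_bi)
print(uni_total, bi_total)
```

Either way of counting gives the unidirectional model more parameters, so part of any accuracy gain from a fully-forward or fully-backward setup may come from capacity and not from direction.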
○ Extensible to any <instance, unit> metric on any neural classifier
○ A saturation effect?
○ Fault in assuming PDI measures unit importance?
uvp@gatech.edu mmarone6@gatech.edu https://github.com/ruyimarone/character-eyes