Character Eyes: Seeing Language through Character-Level Taggers - PowerPoint PPT Presentation



SLIDE 1

Character Eyes: Seeing Language through Character-Level Taggers

Yuval Pinter Marc Marone Jacob Eisenstein

@yuvalpi @ruyimarone @jacobeisenstein

https://github.com/ruyimarone/character-eyes

Blackbox NLP 2019

SLIDE 2

Taggers

The/DET cat/Nsg walked/Vpast fast/RB

SLIDE 3

Neural Taggers

The/DET cat/Nsg walked/Vpast fast/RB

SLIDE 4

Character-level Neural Taggers

The/DET cat/Nsg walked/Vpast fast/RB (input read as characters: T h e | c a t | w a l k e d | f a s t)

SLIDE 5

Character-level Recurrent Neural Taggers

The/DET cat/Nsg walked/Vpast fast/RB (input read as characters: T h e | c a t | w a l k e d | f a s t)


SLIDE 7

Recurrent Taggers – Good at Finding Morphemes?

The/DET cat/Nsg walked/Vpast fast/RB (characters: T h e | c a t | w a l k e d | f a s t)

Agglutination


SLIDE 9

Recurrent Taggers – Good at Prefixes and Suffixes?

thecat/Nsg;def walked/Vpast fast/RB (characters: t h e c a t | w a l k e d | f a s t)

Prefixing morphology (e.g. Coptic)


SLIDE 11

Recurrent Taggers – Can They Handle diSCoNtinUiTY?

The/DET cat/Nsg waeldk/Vpast fast/RB (characters: T h e | c a t | w a e l d k | f a s t)

Introflexive morphology (Hebrew, Arabic)


SLIDE 13

Main Idea(s)

[Figure: character sequences (“w a l k e d”, “w a e l d k”, “t h e c a t”) between a Language and a Model]

measure how models encode different linguistic patterns

SLIDE 14

Main Idea(s)

characterize languages based on model analysis; help engineer language-aware systems


SLIDE 18

Analysis Primitive – Unit Decomposition

The/DET cat/Nsg walked/Vpast fast/RB (characters: T h e | c a t | w a l k e d | f a s t)

[Figure: activation traces of hidden unit #n and hidden unit #m over the character sequence]

  • Assumption: units are “in charge” of tracking morphemes that help predict POS
  • Hypothesis: easy for agglutinations, difficult for introflexions
  • Hypothesis: unit’s direction affects ease of tracking suffixes vs. prefixes


SLIDE 21

Evidence?

  • Turkish is an agglutinative language
    ○ ev ‘house’; evler ‘houses’; evleriniz ‘your houses’; evlerinizden ‘from your houses’

[Figure: activation traces for Unit 124 and Unit 3 (→)]


SLIDE 25

Model & Data

  • Universal Dependencies (n=24)
    ○ POS tags + Morphosyntactic Descriptions
  • Linguistic diversity – morph. synthesis:
    ○ 5 agglutinative languages
    ○ 2 introflexive languages
    ○ 3 isolating, 14 fusional
  • Linguistic diversity – affixation (all 24):
    ○ 1 prefixing language
    ○ 2 non-affixing
    ○ 2 equally pre- and suffixing
    ○ 19 suffixing

Source for language classes: WALS


SLIDE 27

Model & Data

  • Universal Dependencies (n=24)
    ○ POS tags + Morphosyntactic Descriptions
    ○ Linguistic diversity (synthesis + affixation)
  • Word → Tag: Bidirectional LSTM + MLP
    ○ (Not analyzed)
    ○ No word embeddings
  • Char → Word: Bidirectional LSTM
    ○ Char embedding size: 256
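A minimal sketch of the char → word step described above, with a plain tanh RNN cell standing in for the LSTM. Only the char embedding size (256) comes from the slide; the per-direction hidden size, the weight initializations, and the alphabet are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)
EMB = 256   # char embedding size (from the slide)
HID = 64    # per-direction hidden size (an assumption)

# One embedding vector per character; one weight set per direction
emb = {c: rng.normal(size=EMB) for c in "abcdefghijklmnopqrstuvwxyz"}
weights = {d: (rng.normal(scale=0.01, size=(HID, EMB)),
               rng.normal(scale=0.01, size=(HID, HID))) for d in ("fwd", "bwd")}

def run(word, direction):
    """Read the word's characters in one direction; return the final hidden state."""
    W_x, W_h = weights[direction]
    h = np.zeros(HID)
    for c in (word if direction == "fwd" else reversed(word)):
        h = np.tanh(W_x @ emb[c] + W_h @ h)  # tanh RNN in place of the LSTM cell
    return h

def word_vector(word):
    # Concatenate the final forward and backward states into the word
    # representation that the (not analyzed) word-level tagger would consume.
    return np.concatenate([run(word, "fwd"), run(word, "bwd")])

assert word_vector("walked").shape == (2 * HID,)
```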


slide-33
SLIDE 33

0.42

Analysis Metrics

  • Run model on training data words
  • Collect activation levels for each unit
  • Aggregate to single measure

(e.g. average absolute or max-delta)

  • Bin per unit over parts of speech
  • Mutual Information metric – POS

Discrimination Index, or PDI

○ (Higher PDI = better discriminator)

16

Unit 42 [0.0,0.1) [0.1,0.2) … [0.9,1.0) NOUN 8 2 … 40 VERB 20 … 4 … … … … … ADJ 10 10 … 10
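The PDI above is a mutual-information metric over per-unit count tables like this one. A sketch of one plausible formulation, the mutual information (in bits) between POS tag and activation bin, computed from a `{tag: [bin counts]}` table; the paper's exact binning and normalization may differ:

```python
from math import log2

def pdi(counts):
    """Mutual information I(POS; bin) from a {tag: [bin counts]} table."""
    total = sum(sum(row) for row in counts.values())
    p_tag = {t: sum(row) / total for t, row in counts.items()}
    n_bins = len(next(iter(counts.values())))
    p_bin = [sum(row[j] for row in counts.values()) / total for j in range(n_bins)]
    mi = 0.0
    for t, row in counts.items():
        for j, c in enumerate(row):
            if c:  # zero cells contribute nothing to the sum
                p = c / total
                mi += p * log2(p / (p_tag[t] * p_bin[j]))
    return mi

# A unit whose activation bin fully determines the tag is a perfect discriminator...
assert abs(pdi({"NOUN": [10, 0], "VERB": [0, 10]}) - 1.0) < 1e-9
# ...while one whose activations are independent of the tag carries no information.
assert abs(pdi({"NOUN": [10, 10], "VERB": [10, 10]})) < 1e-9
```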


SLIDE 35

Analysis Metrics

  • Aggregate across units by
    ○ Summing total mass
    ○ Reporting % of forward units before mass median

[Figure: activation-bin tables for Units 40, 41, and 42, with the mass median marked]
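The two aggregations can be sketched as follows, under one plausible reading of "mass median": rank all units by PDI, walk down the ranking until half the total mass is covered, and report the forward share among those top units (the paper's exact procedure may differ):

```python
def aggregate(pdi_fwd, pdi_bwd):
    """Return (total PDI mass, share of forward units among the highest-PDI
    units that together account for half the mass)."""
    units = sorted([(v, "fwd") for v in pdi_fwd] + [(v, "bwd") for v in pdi_bwd],
                   reverse=True)
    total = sum(v for v, _ in units)
    mass, top = 0.0, []
    for v, d in units:
        if mass >= total / 2:  # stop once half the mass is covered
            break
        mass += v
        top.append(d)
    return total, top.count("fwd") / len(top)

# Toy values: forward units dominate, so they make up all of the top half-mass
total, fwd_share = aggregate([0.9, 0.7, 0.1], [0.3, 0.2, 0.1])
assert abs(total - 2.3) < 1e-9
assert fwd_share == 1.0
```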

SLIDE 36

Findings (Cherry Pick)

  • Coptic: agglutinative, prefixing
    ○ Large mass (easy to distinguish POS based on char sequence)
    ○ Forward-heavy (71%)
  • English: fusional, suffixing
    ○ Small mass (hard to capture POS)
    ○ Backward-heavy (80%)


SLIDE 39

Findings (General Trends)

  • 4/5 agglutinatives hold 4/6 top total-mass positions
  • 2/2 introflexives in bottom 2/4 spots (Persian and Hindi below, both fusional w/ non-Latin charsets)

[Figure: Total PDI mass per language]


SLIDE 43

Direction Balance Study

  • Some languages might not need two equal LSTM directions
  • What if… they don’t need one of them at all?
  • What if they need them in a different balance? Somewhere in the middle?
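The study varies how a fixed budget of recurrent units is split between the two directions. A toy sketch of such an asymmetric encoder (a plain tanh RNN in place of the LSTM; the budget of 128, the weights, and the character featurization are all illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
TOTAL = 128  # fixed budget of recurrent units to split between directions

def word_vector(word, n_forward):
    """Encode a word with n_forward left-to-right units and TOTAL - n_forward
    right-to-left units; either side may be empty (fully unidirectional)."""
    parts = []
    for n, seq in ((n_forward, word), (TOTAL - n_forward, word[::-1])):
        if n == 0:
            continue
        W_x = rng.normal(scale=0.01, size=(n, 16))
        W_h = rng.normal(scale=0.01, size=(n, n))
        h = np.zeros(n)
        for c in seq:
            x = np.full(16, (ord(c) % 64) / 64.0)  # crude stand-in for a char embedding
            h = np.tanh(W_x @ x + W_h @ h)
        parts.append(h)
    return np.concatenate(parts)

# Every balance, from fully backward to fully forward, yields the same output size
for n_fwd in (0, 32, 64, 96, 128):
    assert word_vector("walked", n_fwd).shape == (TOTAL,)
```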


SLIDE 49

Balance Study – Results

  • Can unidirectional models outperform bidirectionals?
  • Yes.
    ○ Especially on agglutinative languages and on suffixing languages
    ○ Fully-forward better than fully-backward
    ○ MAJOR caveat – 128x128 > 2*(64x64)
  • Is there a sweet spot in the middle?
  • Not that we can tell.

[Figure: tagging accuracy change, grouped by morphology (Agglutinative / Introflexive / Fusional) and by affixation (Strongly Suffixing / Little Affixation / Prefixing)]


SLIDE 57

Summary + Open Ends

  • Introduced PDI to aggregate information from hidden units, some applicability to language characterization
    ○ Extensible to any <instance, unit> metric on any neural classifier
  • Found substantial differences between differently-balanced recurrent models
  • Are we quantifying data instead of languages?
  • Affixing: many languages (e.g. English) have higher PDI for backward units, but fare better with more forward units. Is this:
    ○ A saturation effect?
    ○ A fault in assuming PDI measures unit importance?

SLIDE 58

Thank You!

uvp@gatech.edu mmarone6@gatech.edu https://github.com/ruyimarone/character-eyes