SLIDE 1

Lexical Normalization for Neural Network Parsing

Rob van der Goot, Gertjan van Noord
University of Groningen
r.van.der.goot@rug.nl
26-01-2018

1 / 31

SLIDE 2

Last Year (CLIN27)

Pipeline (figure): the original "kheb da gzien" is tokenized, then candidates are generated by lookup (word2vec, Aspell):

  kheb → ik heb, hb, kga, heb, khee
  da → daar, wa, dat, dea, doa
  gzien → gezien, gzn, geziene

N-gram features and a classifier then select the final normalization: "ik heb dat gezien".
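A minimal sketch of this candidate-generation-plus-classifier pipeline, with the candidate lists from the slide but a toy scoring function standing in for the real word2vec/Aspell/N-gram classifier:

```python
# Hedged sketch of last year's pipeline (slide 2): generate
# normalization candidates per token, then let a scorer pick one.
# The scorer here is a toy stand-in, not the real classifier.

CANDIDATES = {
    "kheb": ["ik heb", "hb", "kga", "heb", "khee"],
    "da": ["daar", "wa", "dat", "dea", "doa"],
    "gzien": ["gezien", "gzn", "geziene"],
}

def normalize(tokens, score):
    """Replace each token by its highest-scoring candidate (or itself)."""
    out = []
    for tok in tokens:
        cands = CANDIDATES.get(tok, [tok])
        out.append(max(cands, key=lambda c: score(tok, c)))
    return " ".join(out)

# Toy scorer that happens to prefer the gold replacements:
GOLD = {"kheb": "ik heb", "da": "dat", "gzien": "gezien"}
score = lambda tok, cand: 1.0 if GOLD.get(tok) == cand else 0.0
result = normalize(["kheb", "da", "gzien"], score)  # "ik heb dat gezien"
```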

SLIDE 3

This Year

  • Use normalization to adapt neural network dependency parsers
  • Evaluate the effect of normalization versus externally trained word embeddings and character-level models
  • See if we can exploit top-n candidates
  • New treebank to evaluate domain adaptation


SLIDE 4

New Treebank

Why?

  • Manually corrected train data
  • Gold normalization available
  • Data should be non-canonical
  • UD format


SLIDE 5

New Treebank

  • Pre-filtered to contain non-standard words
  • Data from Li and Liu (2015): Owoputi and LexNorm
  • 600 tweets / 10,000 words
  • UD 2.1 format


SLIDE 6

New Treebank

Example (figure): "I feel so bad .. Not so sure about wrk tomarra", annotated with the dependency relations root, nsubj, xcomp, advmod, parataxis, punct, obl, case, and nmod:tmod.

SLIDE 7

New Treebank

Experimental setup:

  • Train: English Web Treebank
  • Dev: Owoputi
  • Test: LexNorm


SLIDE 8

Neural Network parser

Configuration (figure): a stack (s0-s2) and a buffer (b0-b3) over the words of "the brown fox jumped over the lazy dog" plus ROOT.

Scoring (figure): every word is encoded by a BiLSTM; its vector V_w concatenates the forward and backward LSTM states, and an MLP over selected vectors outputs (ScoreLeftArc, ScoreRightArc, ScoreShift).

Taken from Kiperwasser and Goldberg (2016)

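As a rough sketch of the scoring step above, the following replaces the MLP with a single linear layer over the concatenated BiLSTM vectors of a few stack/buffer positions; all weights and vectors are random placeholders, not the trained model:

```python
# Hedged sketch of the transition scoring on slide 8 (after
# Kiperwasser and Goldberg, 2016). A linear layer stands in for the
# MLP; the feature vectors are random stand-ins for BiLSTM states.
import random

random.seed(0)
DIM = 4  # toy BiLSTM vector size

def mlp_scores(features, w, b):
    """Linear layer over the concatenated feature vector, returning
    (ScoreLeftArc, ScoreRightArc, ScoreShift)."""
    x = [v for f in features for v in f]  # concatenate feature vectors
    return [sum(wi * xi for wi, xi in zip(row, x)) + bi
            for row, bi in zip(w, b)]

# Features: two stack tops (s0, s1) and the buffer front (b0).
s0, s1, b0 = ([random.random() for _ in range(DIM)] for _ in range(3))
w = [[random.random() for _ in range(3 * DIM)] for _ in range(3)]
b = [0.0, 0.0, 0.0]

scores = mlp_scores([s0, s1, b0], w, b)
best = max(range(3), key=scores.__getitem__)  # greedy transition choice
```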

SLIDE 9

Neural Network parser

UUparser (de Lhoneux et al., 2017):

  • Performs well
  • Relatively easy to adapt
  • No POS tags
  • Characters + external embeddings


SLIDE 10

Neural Network parser

Figure: each word_i is represented by the concatenation of three components t_i, c_i, and e_i (word, character-level, and external embeddings), which feeds the forward and backward LSTMs.
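A minimal sketch of this input representation, assuming the three components are simply concatenated (dimensions and values are illustrative):

```python
# Toy version of the word representation on slide 10: a trained word
# embedding (t), a character-level vector (c), and an external
# embedding (e) are concatenated into one input vector per word.

def word_representation(t, c, e):
    """Concatenate the three embedding components into one input vector."""
    return t + c + e  # list concatenation acts as vector concatenation

t1 = [0.1, 0.2]       # trained word embedding
c1 = [0.3]            # character-level vector (toy, 1-dimensional)
e1 = [0.4, 0.5, 0.6]  # external pre-trained embedding

x1 = word_representation(t1, c1, e1)  # 6-dimensional input vector
```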

SLIDE 11

Use Normalization as Pre-processing

Figure: dependency parse of the raw sentence "new pix comming tomorroe" (relations: root, compound, compound, amod).

SLIDE 12

Use Normalization as Pre-processing

Figure: the normalized sentence "new pix coming tomorrow" aligned above the raw "new pix comming tomorroe", with the corrected parse (relations include root, obl, obj, and amod).

SLIDE 13

Use Normalization as Pre-processing

Figure: bar chart (y-axis 48-64) of parsing accuracy for the settings base, +char, +ext, and +char+ext, comparing raw input (raw) with normalized input (norm.).


SLIDE 17

Use Normalization as Pre-processing

Figure: the same bar chart (48-64) over base, +char, +ext, and +char+ext, now with gold normalization added: raw, norm., gold.

SLIDE 18

Integrate Normalization

new pix comming tomorroe


SLIDE 19

Integrate Normalization

Top-5 normalization candidates and probabilities for "new pix comming tomoroe":

  new                pix                 comming              tomoroe
  new        0.9466  pix        0.7944   coming      0.5684   tomorrow    0.5451
  news       0.0315  selfies    0.0882   comming     0.4314   tomoroe     0.3946
  knew       0.0111  pictures   0.0559   combing     0.0002   tomorrow's  0.0191
  now        0.0063  photos     0.0449   comping    <0.0001   Tagore      0.0174
  newt       0.0045  pic        0.0165   common     <0.0001   tomorrows   0.0173

SLIDE 20

Integrate Normalization

Figure: the same input representation as on slide 10 (t_i, c_i, and e_i per word_i, feeding the forward and backward LSTMs); the normalization candidates are integrated into these word vectors.

SLIDE 21

Integrate Normalization

  • w_i = Σ_{j=0}^{n} P_ij * n_ij

(P_ij: probability of the j-th normalization candidate for word i; n_ij: the embedding of that candidate.)

slide-22
SLIDE 22

Integrate Normalization

  • w_1 = (new * 0.9466) + (news * 0.0315) + (knew * 0.0111) + (now * 0.0063) + (newt * 0.0045)
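The integration formula can be sketched directly, using the probabilities shown for "new" above; the 2-dimensional embeddings are made up for illustration:

```python
# Sketch of the integration on slides 21-22: w_i = sum_j P_ij * n_ij,
# a probability-weighted sum of candidate embeddings. The embedding
# values below are hypothetical; the probabilities are from slide 22.

def integrate(candidates, embed):
    """Probability-weighted sum of candidate embeddings."""
    dim = len(embed(candidates[0][0]))
    w = [0.0] * dim
    for cand, p in candidates:
        vec = embed(cand)
        for k in range(dim):
            w[k] += p * vec[k]
    return w

toy_embeddings = {  # hypothetical 2-dimensional embeddings
    "new": [1.0, 0.0], "news": [0.0, 1.0], "knew": [1.0, 1.0],
    "now": [0.5, 0.5], "newt": [0.0, 0.0],
}
candidates = [("new", 0.9466), ("news", 0.0315), ("knew", 0.0111),
              ("now", 0.0063), ("newt", 0.0045)]
w1 = integrate(candidates, toy_embeddings.__getitem__)
```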

SLIDE 23

Integrate Normalization

Figure: the bar chart (48-64) over base, +char, +ext, and +char+ext, now also showing the integrated model: raw, norm., integr., gold.

SLIDE 24

Integrate Normalization

But what about in-domain performance?


SLIDE 25

Integrate Normalization

Figure: in-domain bar chart (y-axis 80-90) comparing base and +norm.

SLIDE 26

Integrate Normalization

Figure: the same in-domain comparison with the y-axis zoomed in (84.20-84.30); base and +norm differ only marginally.

SLIDE 27

Integrate Normalization

Test data:

  Model            UAS     LAS
  raw              70.47   60.16
  normalization:
    direct         71.03*  61.83*
    integrated     71.15   62.30*
    gold           71.45   63.16*

Table: * indicates statistical significance compared to the previous entry.

SLIDE 28

Integrate Normalization

Conclusions:

  • Normalization is still helpful on top of character and external embeddings
  • Integrating normalization leads to a small but consistent/significant improvement
  • Performance stays around 60% (LAS) even when using gold normalization
  • The new dataset will be made available; it provides a nice benchmark for domain adaptation


SLIDE 29

Next CLIN

  • Effect of different categories of normalization replacements
  • Get closer to gold normalization


SLIDE 30

Bibliography

Miryam de Lhoneux, Yan Shao, Ali Basirat, Eliyahu Kiperwasser, Sara Stymne, Yoav Goldberg, and Joakim Nivre. From raw text to universal dependencies - look, no tags! In Proceedings of the CoNLL 2017 Shared Task: Multilingual Parsing from Raw Text to Universal Dependencies, pages 207-217, Vancouver, Canada, August 2017. Association for Computational Linguistics.

Eliyahu Kiperwasser and Yoav Goldberg. Simple and accurate dependency parsing using bidirectional LSTM feature representations. TACL, 4:313-327, 2016.

Chen Li and Yang Liu. Joint POS tagging and text normalization for informal text. In Proceedings of the Twenty-Fourth International Joint Conference on Artificial Intelligence, IJCAI 2015, Buenos Aires, Argentina, July 25-31, 2015, pages 1263-1269, 2015.


SLIDE 31

Integrate Normalization

  • Foster: not noisy, constituency
  • Denoised Web Treebank: no train
  • Tweebank: no train
  • Foreebank: not noisy
