Dependency Parsing as Head Selection



SLIDE 1

Dependency Parsing as Head Selection

Xingxing Zhang, Jianpeng Cheng, Mirella Lapata

Institute for Language, Cognition and Computation
University of Edinburgh
x.zhang@ed.ac.uk

April 6, 2017

Zhang et al. (Univ. of Edinburgh) DeNSe: Dependency Neural Selection April 6, 2017 1 / 18

SLIDE 3

Dependency Parsing

Dependency parsing is the task of transforming a sentence S = (root, w1, w2, ..., wN) into a directed tree originating out of root.

Parsing Algorithms

Transition-based Parsing
Graph-based Parsing

Our parser is neither transition-based nor graph-based (during training).

SLIDE 4

Transition-based Parsing

Data Structure

Buffer, Stack, Arc Set

Parsing:

Choose an action from: SHIFT, REDUCE-Left, REDUCE-Right
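The three actions can be illustrated with a small sketch. The slide only names the data structures and actions, so the exact semantics below are an assumption (an arc-standard style system), and `run_transitions` is a hypothetical helper, not code from the talk.

```python
from typing import List, Set, Tuple

Arc = Tuple[int, int]  # (head index, dependent index)


def run_transitions(n_tokens: int, actions: List[str]) -> Set[Arc]:
    """Apply a sequence of actions and return the resulting arc set.

    Token 0 is the artificial root; tokens 1..n_tokens-1 are the words.
    Assumed (arc-standard style) semantics:
      SHIFT         move the front of the buffer onto the stack
      REDUCE-Left   the stack top becomes the head of the item below it
      REDUCE-Right  the item below the stack top becomes the head of the top
    """
    stack: List[int] = [0]
    buffer: List[int] = list(range(1, n_tokens))
    arcs: Set[Arc] = set()
    for action in actions:
        if action == "SHIFT":
            if not buffer:
                raise ValueError("SHIFT with an empty buffer")
            stack.append(buffer.pop(0))
        elif action == "REDUCE-Left":
            # Needs two words above the root so the root never gets a head.
            if len(stack) < 3:
                raise ValueError("REDUCE-Left needs two words on the stack")
            top = stack.pop()
            below = stack.pop()
            arcs.add((top, below))
            stack.append(top)
        elif action == "REDUCE-Right":
            if len(stack) < 2:
                raise ValueError("REDUCE-Right needs two items on the stack")
            top = stack.pop()
            arcs.add((stack[-1], top))
        else:
            raise ValueError(f"unknown action: {action}")
    return arcs


# "root kids love candy" with indices root=0, kids=1, love=2, candy=3.
arcs = run_transitions(
    4,
    ["SHIFT", "SHIFT", "REDUCE-Left", "SHIFT", "REDUCE-Right", "REDUCE-Right"],
)
print(sorted(arcs))
```

In this run, love becomes the head of kids and candy, and root becomes the head of love.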

SLIDE 5

Graph-based Parsing

A Sentence → A Directed Complete Graph

(Graphs from Kübler et al., 2009)

Parsing: Finding the Maximum Spanning Tree

Chu-Liu-Edmonds algorithm (Chu and Liu, 1965)
Eisner algorithm (Eisner, 1996)
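To make the objective concrete, here is a sketch that finds the highest-scoring tree over the complete graph by exhaustive search. This is deliberately not Chu-Liu-Edmonds or Eisner: brute force is swapped in because it is short and easy to verify, and it is only practical for very short sentences. The score matrix and the function names are hypothetical.

```python
from itertools import product
from typing import List, Optional, Sequence


def is_tree(heads: Sequence[int]) -> bool:
    """True if every word reaches the root (index 0) without a cycle."""
    for word in range(1, len(heads)):
        seen = set()
        node = word
        while node != 0:
            if node in seen:
                return False
            seen.add(node)
            node = heads[node]
    return True


def best_tree_exhaustive(scores: Sequence[Sequence[float]]) -> Optional[List[int]]:
    """Return the head list of the highest-scoring tree.

    scores[h][d] is the score of the arc h -> d; index 0 is the root.
    heads[d] is the chosen head of word d, and heads[0] is -1.
    """
    n = len(scores)
    candidates = [[h for h in range(n) if h != d] for d in range(1, n)]
    best_heads, best_total = None, float("-inf")
    for choice in product(*candidates):
        heads = [-1, *choice]
        if not is_tree(heads):
            continue
        total = sum(scores[h][d] for d, h in enumerate(heads) if d > 0)
        if total > best_total:
            best_heads, best_total = heads, total
    return best_heads


# Toy arc scores for "root kids love candy" (made-up numbers).
scores = [
    [0, 1, 9, 1],  # root -> kids / love / candy
    [0, 0, 2, 1],  # kids -> ...
    [0, 8, 0, 7],  # love -> ...
    [0, 1, 3, 0],  # candy -> ...
]
print(best_tree_exhaustive(scores))
```

The search space grows exponentially with sentence length, which is why the algorithms named on the slide are used in practice.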

SLIDE 6

Recent Advances

Mostly replacing discrete features with neural network features.

Transition-based Parsers

Feed-Forward NN features (Chen and Manning, 2014)
Bi-LSTM features (Kiperwasser and Goldberg, 2016)
Stack LSTM: Buffer, Stack and Action Sequences modeled by Stack-LSTMs (Dyer et al., 2015)

Graph-based Parsers

Tensor Decomposition features (Lei et al., 2014)
Feed-Forward NN features (Pei et al., 2015)
Bi-LSTM features (Kiperwasser and Goldberg, 2016)

SLIDE 9

Do we need a transition system or graph algorithm?

root kids love candy

An important fact: Every word has only one head!

Why not just learn to select the head?

SLIDE 14

Dependency Parsing as Head Selection

DeNSe: Dependency Neural Selection

$$P_{head}(\text{root} \mid \text{love}, S) = \frac{\exp(\mathrm{MLP}(a_{\text{root}}, a_{\text{love}}))}{\sum_{k=0}^{3} \exp(\mathrm{MLP}(a_k, a_{\text{love}}))}$$
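The head-selection probability is a softmax over all candidate positions k = 0..N, which the following sketch spells out. The slide does not give the form of the MLP, so `toy_mlp` is a hypothetical stand-in scorer, and the vectors are made-up values standing in for the representations a_k.

```python
import math
from typing import Callable, List, Sequence

Vector = Sequence[float]


def toy_mlp(a_head: Vector, a_dep: Vector) -> float:
    """Hypothetical stand-in for the MLP scorer: sum of tanh(head + dep)."""
    return sum(math.tanh(h + d) for h, d in zip(a_head, a_dep))


def head_distribution(
    vectors: Sequence[Vector],
    dep: int,
    score: Callable[[Vector, Vector], float] = toy_mlp,
) -> List[float]:
    """P_head(k | dep, S) for every position k, following the formula as written.

    vectors[0] represents root; the sum runs over all k = 0..N.
    """
    raw = [score(a_k, vectors[dep]) for a_k in vectors]
    shift = max(raw)  # subtract the maximum for numerical stability
    exps = [math.exp(r - shift) for r in raw]
    total = sum(exps)
    return [e / total for e in exps]


# root, kids, love, candy (made-up 2-dimensional representations).
vectors = [[0.9, 0.4], [-0.3, 0.2], [0.5, -0.1], [0.1, -0.6]]
p_love = head_distribution(vectors, dep=2)
print(p_love)
```

Training would then maximise the log-probability of the gold head for each word, which is what makes a transition system or a graph algorithm unnecessary at that stage.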

SLIDE 17

Decoding

Greedy Decoding: The output may not be a (projective) tree!

Greedy Decoding

Dataset         #Sent (Dev)   Tree (%)   Proj (%)
PTB (English)   1,700         95.1       86.6
CTB (Chinese)   803           87.0       73.1
Czech           374           87.7       65.5
German          367           96.7       67.3

Decoding with a Maximum Spanning Tree Algorithm (relatively rare)

Projective Parsing: Eisner Algorithm
Non-projective Parsing: Chu-Liu-Edmonds Algorithm
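A short sketch of why greedy decoding can fail: each word picks its most probable head independently, so nothing prevents cycles or crossing arcs. The probability table is made up, and the helper names are hypothetical. The projectivity test assumes the artificial root sits at position 0, in which case a tree is projective exactly when no two arcs cross.

```python
from typing import List, Sequence


def greedy_heads(probs: Sequence[Sequence[float]]) -> List[int]:
    """probs[d][h] is P_head(h | d); row 0 (root) is unused."""
    n = len(probs)
    heads = [-1]
    for d in range(1, n):
        heads.append(max((h for h in range(n) if h != d), key=lambda h: probs[d][h]))
    return heads


def is_tree(heads: Sequence[int]) -> bool:
    """True if every word reaches the root (index 0) without a cycle."""
    for word in range(1, len(heads)):
        seen = set()
        node = word
        while node != 0:
            if node in seen:
                return False
            seen.add(node)
            node = heads[node]
    return True


def is_projective(heads: Sequence[int]) -> bool:
    """True if no two arcs cross (root assumed at position 0)."""
    spans = [tuple(sorted((h, d))) for d, h in enumerate(heads) if d > 0]
    for a, b in spans:
        for c, d in spans:
            if a < c < b < d:
                return False
    return True


# "root kids love candy": kids and love select each other, forming a cycle.
probs = [
    [],
    [0.1, 0.0, 0.8, 0.1],  # kids  -> love
    [0.4, 0.5, 0.0, 0.1],  # love  -> kids
    [0.1, 0.1, 0.8, 0.0],  # candy -> love
]
heads = greedy_heads(probs)
print(heads, is_tree(heads))
```

One way to use these checks is to fall back to a maximum spanning tree decoder only when `is_tree` (or `is_projective`, for projective treebanks) returns False.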

SLIDE 18

Labelled Parser

A two-layer Rectifier Network (Glorot et al., 2011)

Dependent Word:

Bi-LSTM Feature
Word Embedding
PoS Embedding

Head Word:

Bi-LSTM Feature
Word Embedding
PoS Embedding
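A rough sketch of such a label classifier follows. It reads "two-layer rectifier network" as two ReLU hidden layers followed by a softmax over labels, which is an assumption; the label set, layer sizes, feature values, and random weights are all hypothetical placeholders.

```python
import math
import random
from typing import List, Sequence, Tuple

LABELS = ["nsubj", "dobj", "root"]  # hypothetical label set
Layer = Tuple[List[List[float]], List[float]]  # (weights, bias)


def init_layer(n_out: int, n_in: int, rng: random.Random) -> Layer:
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


def affine(layer: Layer, x: Sequence[float]) -> List[float]:
    weights, bias = layer
    return [sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(weights, bias)]


def relu(x: Sequence[float]) -> List[float]:
    return [max(0.0, v) for v in x]


def softmax(z: Sequence[float]) -> List[float]:
    shift = max(z)
    exps = [math.exp(v - shift) for v in z]
    total = sum(exps)
    return [e / total for e in exps]


def label_distribution(
    dep_feats: Sequence[float], head_feats: Sequence[float], layers: Sequence[Layer]
) -> List[float]:
    """Concatenate dependent and head features, apply two ReLU layers, then softmax."""
    x = list(dep_feats) + list(head_feats)
    h1 = relu(affine(layers[0], x))
    h2 = relu(affine(layers[1], h1))
    return softmax(affine(layers[2], h2))


rng = random.Random(0)
# Each word: [Bi-LSTM feature | word embedding | PoS embedding], 2 dims each.
dep_feats = [0.2, -0.1, 0.4, 0.3, -0.5, 0.1]
head_feats = [0.6, 0.0, -0.2, 0.7, 0.1, -0.3]
layers = [init_layer(8, 12, rng), init_layer(8, 8, rng), init_layer(len(LABELS), 8, rng)]
probs = label_distribution(dep_feats, head_feats, layers)
print(dict(zip(LABELS, probs)))
```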

SLIDE 19

Experiments

SLIDE 22

Projective Parsing Results (PTB; English)

NN (Chen & Manning, 2014); S-LSTM (Dyer et al., 2015); Bi-LSTM (Kiperwasser & Goldberg, 2016); SynNet (Andor et al., 2016)

SLIDE 23

Projective Parsing Results (CTB; Chinese)

NN (Chen & Manning, 2014); S-LSTM (Dyer et al., 2015); Bi-LSTM (Kiperwasser & Goldberg, 2016); 3rd-cubic (Zhang & McDonald, 2014)

SLIDE 25

Non-projective Parsing Results (German)

MST-1st, MST-2nd (McDonald et al., 2005)
Turbo-1st, Turbo-3rd (Martins et al., 2013)
RBG-1st, RBG-3rd (Lei et al., 2014)

SLIDE 27

Non-projective Parsing Results (Czech)

MST-1st, MST-2nd (McDonald et al., 2005)
Turbo-1st, Turbo-3rd (Martins et al., 2013)
RBG-1st, RBG-3rd (Lei et al., 2014)

SLIDE 28

Unlabeled Exact Match

           PTB             CTB
Parser     Dev     Test    Dev     Test
C&M14      43.35   40.93   32.75   32.20
Dyer15     51.94   50.70   39.72   37.23
DeNSe      51.24   49.34   34.74   33.66
DeNSe+E    52.47   50.79   36.49   35.13

Table: UEM results on PTB and CTB.
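For reference, the two metrics used in these slides can be computed as in the sketch below: UAS is the percentage of words with the correct head, and UEM is the percentage of sentences whose heads are all correct. The function names and the toy data are hypothetical, and the sketch ignores details such as punctuation handling, which evaluation scripts often treat specially.

```python
from typing import Sequence


def uas(gold: Sequence[Sequence[int]], pred: Sequence[Sequence[int]]) -> float:
    """Unlabeled attachment score: percentage of words with the correct head."""
    correct = sum(g == p for gs, ps in zip(gold, pred) for g, p in zip(gs, ps))
    total = sum(len(gs) for gs in gold)
    return 100.0 * correct / total


def uem(gold: Sequence[Sequence[int]], pred: Sequence[Sequence[int]]) -> float:
    """Unlabeled exact match: percentage of sentences with every head correct."""
    exact = sum(list(gs) == list(ps) for gs, ps in zip(gold, pred))
    return 100.0 * exact / len(gold)


# Two toy sentences; each inner list holds the head index of every word.
gold = [[2, 0, 2], [0, 1]]
pred = [[2, 0, 2], [0, 0]]
print(uas(gold, pred), uem(gold, pred))
```

A single wrong head costs the whole sentence under UEM, which is why UEM figures sit far below UAS figures for the same parser.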

SLIDE 29

UAS vs. Length

[Figure: UAS (%) against PTB sentence length for C&M14, DeNSe+E, and Dyer15.]

SLIDE 30

UAS vs. Length

[Figure: UAS (%) against sentence length for C&M14, DeNSe+E, and Dyer15; panel labelled CTB.]

SLIDE 31

Conclusions

We propose treating dependency parsing as greedily selecting the head of each word in a sentence.
Combining the greedy model with an MST algorithm can further increase performance.
Code available: https://github.com/XingxingZhang/dense parser

SLIDE 32

Thanks

Q & A
