SLIDE 1

Combining Global Models for Parsing Universal Dependencies

Team C2L2 —

Tianze Shi, Felix G. Wu, Xilun Chen, Yao Cheng

Cornell University

SLIDE 2

Overview — Scope of Our System

What we did:

  • Projective Parsing
  • Dependency Arc Labeling
  • Delexicalized Parsing

What we didn’t do:

  • Word Segmentation
  • Sentence Boundary Detection
  • POS Tagging
  • Morphology Analysis
  • Non-projective Parsing
  • Unlabeled data
SLIDE 3

Overview — Highlights

  • Global transition-based models (argmax over 𝑧 ∈ 𝒵)
  • Bi-LSTM-powered compact features
  • High efficiency, low resource demand
  • Delexicalized syntactic transfer (e.g., fi → sme)

Rankings: 1st on Small Treebanks and Surprise Languages, 2nd Overall

SLIDE 4

Overview — System Pipeline

I. UDPipe Pre-process → II. Feature Extraction → III. Unlabeled Parsing → IV. Arc Labeling

SLIDE 5


Raw Text → UDPipe → sentence-delimited & tokenized text

SLIDE 6


Language | Word OOV rate ↓
ko – Korean | 43.68%
la – Latin | 41.22%
sk – Slovak | 36.51%
… | …
Average | 14.4%

* Measured on development set

SLIDE 7


Character-level bi-directional LSTM: composes the characters "p a r s i n g" into a vector for the word "parsing" (figure)

SLIDE 8


Word-level bi-directional LSTM: runs over the word vectors of the sentence "Universal dependency parsing" to produce contextual features (figure)
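
A minimal sketch of the two Bi-LSTM stages (character-level word vectors feeding a word-level contextual encoder), assuming PyTorch; the module names, dimensions, and the ord()-based character ids are illustrative, not the team's implementation:

    import torch
    import torch.nn as nn

    class BiLSTMFeatures(nn.Module):
        """A character-level Bi-LSTM builds word vectors (robust to OOV words);
        a word-level Bi-LSTM then builds contextual features for the parser."""

        def __init__(self, n_chars=128, char_dim=32, word_dim=64, hidden_dim=128):
            super().__init__()
            self.char_emb = nn.Embedding(n_chars, char_dim)
            self.char_lstm = nn.LSTM(char_dim, word_dim // 2,
                                     batch_first=True, bidirectional=True)
            self.word_lstm = nn.LSTM(word_dim, hidden_dim // 2,
                                     batch_first=True, bidirectional=True)

        def forward(self, char_ids_per_word):
            # char_ids_per_word: list of 1-D LongTensors, one per word
            word_vecs = []
            for chars in char_ids_per_word:
                _, (h_n, _) = self.char_lstm(self.char_emb(chars).unsqueeze(0))
                # concatenate the final forward and backward hidden states
                word_vecs.append(torch.cat([h_n[0, 0], h_n[1, 0]], dim=-1))
            sent = torch.stack(word_vecs).unsqueeze(0)   # (1, n_words, word_dim)
            feats, _ = self.word_lstm(sent)              # (1, n_words, hidden_dim)
            return feats.squeeze(0)

    # toy usage on the sentence "Universal dependency parsing"
    extractor = BiLSTMFeatures()
    words = [torch.tensor([ord(c) for c in w])
             for w in ["Universal", "dependency", "parsing"]]
    features = extractor(words)   # one contextual feature vector per word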

SLIDE 9


Bi-LSTM features feed three parsers: Eisner's, global arc-eager, and global arc-hybrid; their outputs are combined by reparsing with Eisner's algorithm (Sagae and Lavie, 2006)
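
A minimal sketch of the reparsing step: the parsers' arc scores are combined into one matrix and decoded into the best projective tree with first-order Eisner decoding; the NumPy implementation and the simple score averaging are illustrative, not the exact combination scheme of the system:

    import numpy as np

    def eisner(scores):
        """First-order Eisner decoding.
        scores[h, m] is the score of the arc h -> m; index 0 is the root.
        Returns heads[m] for every word m (heads[0] is unused)."""
        n = scores.shape[0]
        complete = np.full((n, n, 2), -np.inf)    # d=0: head at right end, d=1: head at left end
        incomplete = np.full((n, n, 2), -np.inf)
        comp_bp = np.zeros((n, n, 2), dtype=int)
        inc_bp = np.zeros((n, n, 2), dtype=int)
        for i in range(n):
            complete[i, i, :] = 0.0

        for k in range(1, n):                     # span length
            for s in range(n - k):
                t = s + k
                # incomplete spans: add the arc between the two endpoints
                cand = complete[s, s:t, 1] + complete[s + 1:t + 1, t, 0]
                r = int(np.argmax(cand))
                incomplete[s, t, 0] = cand[r] + scores[t, s]   # arc t -> s
                incomplete[s, t, 1] = cand[r] + scores[s, t]   # arc s -> t
                inc_bp[s, t, 0] = inc_bp[s, t, 1] = s + r
                # complete span with head at the right endpoint t
                cand = complete[s, s:t, 0] + incomplete[s:t, t, 0]
                r = int(np.argmax(cand))
                complete[s, t, 0], comp_bp[s, t, 0] = cand[r], s + r
                # complete span with head at the left endpoint s
                cand = incomplete[s, s + 1:t + 1, 1] + complete[s + 1:t + 1, t, 1]
                r = int(np.argmax(cand))
                complete[s, t, 1], comp_bp[s, t, 1] = cand[r], s + 1 + r

        heads = [-1] * n
        def backtrack(s, t, d, is_complete):
            if s == t:
                return
            if is_complete:
                r = comp_bp[s, t, d]
                if d == 0:
                    backtrack(s, r, 0, True)
                    backtrack(r, t, 0, False)
                else:
                    backtrack(s, r, 1, False)
                    backtrack(r, t, 1, True)
            else:
                r = inc_bp[s, t, d]
                heads[s if d == 0 else t] = t if d == 0 else s
                backtrack(s, r, 1, True)
                backtrack(r + 1, t, 0, True)
        backtrack(0, n - 1, 1, True)
        return heads

    # ensemble reparsing: average the three parsers' arc-score matrices, then decode
    rng = np.random.default_rng(0)
    score_matrices = [rng.normal(size=(5, 5)) for _ in range(3)]  # stand-ins for the 3 parsers
    print(eisner(np.mean(score_matrices, axis=0)))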

SLIDE 10


* Shi, Huang and Lee (2017, EMNLP)

Global Transition-based Parsing

  • O(n³) exact decoders
  • Arc-eager and Arc-hybrid systems
  • Large-margin global training (schematic below)
  • Dynamic programming (Huang and Sagae, 2010; Kuhlmann, Gómez-Rodríguez and Satta, 2011)
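
A schematic of the large-margin global training step in Python; the score_fn and decode arguments stand in for the sequence scorer and the exact DP decoder, and the Hamming cost over transitions is an illustrative choice, not necessarily the system's:

    def hamming_cost(predicted, gold):
        # illustrative cost: number of positions where the transition differs
        return sum(p != g for p, g in zip(predicted, gold))

    def margin_loss(score_fn, decode, sentence, gold_seq):
        """Structured hinge loss for global training (schematic).
        score_fn(seq): model score of a complete transition sequence.
        decode(sentence, cost_against=...): exact search for the
        highest-scoring sequence, optionally cost-augmented."""
        predicted = decode(sentence, cost_against=gold_seq)
        violation = (score_fn(predicted) + hamming_cost(predicted, gold_seq)
                     - score_fn(gold_seq))
        return max(0.0, violation)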

SLIDE 11


Compact feature set (2 positions); scoring function: deep bi-affine (Dozat and Manning, 2017)

Parser | Feature positions (2)
Eisner's | head, modifier
Arc-eager | stack top, buffer top
Arc-hybrid | stack top, buffer top
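
A minimal sketch of a deep bi-affine scorer over the two feature positions (e.g. stack top and buffer top), in the spirit of Dozat and Manning (2017); PyTorch, the MLP sizes, and the bias handling are illustrative assumptions:

    import torch
    import torch.nn as nn

    class BiaffineScorer(nn.Module):
        """Scores one candidate arc/transition from two Bi-LSTM feature vectors."""

        def __init__(self, feat_dim=128, hidden_dim=100):
            super().__init__()
            # separate MLPs project the two positions before the bi-affine product
            self.mlp_a = nn.Sequential(nn.Linear(feat_dim, hidden_dim), nn.ReLU())
            self.mlp_b = nn.Sequential(nn.Linear(feat_dim, hidden_dim), nn.ReLU())
            self.W = nn.Parameter(torch.empty(hidden_dim + 1, hidden_dim))
            nn.init.xavier_uniform_(self.W)

        def forward(self, feat_a, feat_b):
            a = self.mlp_a(feat_a)                      # e.g. stack-top features
            b = self.mlp_b(feat_b)                      # e.g. buffer-top features
            a = torch.cat([a, torch.ones_like(a[..., :1])], dim=-1)  # append bias term
            return (a @ self.W * b).sum(-1)             # bi-affine score

    # score one configuration from (random stand-in) Bi-LSTM features
    scorer = BiaffineScorer()
    score = scorer(torch.randn(128), torch.randn(128))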

SLIDE 12


Ensembling

Model | LAS
Full ensemble | 75.00
Single Arc-eager | 74.32
Single Arc-hybrid | 74.00
Single Eisner's | 73.75

SLIDE 13


concat(head, modifier) → multi-layer perceptron → dependency label (nsubj, …)
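
A minimal sketch of the arc labeler, assuming PyTorch; the label list is truncated and the dimensions are illustrative:

    import torch
    import torch.nn as nn

    UD_LABELS = ["nsubj", "obj", "obl", "nmod", "amod", "root"]  # truncated label set

    class ArcLabeler(nn.Module):
        """Predicts the dependency label of an arc from the Bi-LSTM
        features of its head and modifier."""

        def __init__(self, feat_dim=128, hidden_dim=100, n_labels=len(UD_LABELS)):
            super().__init__()
            self.mlp = nn.Sequential(
                nn.Linear(2 * feat_dim, hidden_dim),   # concat(head, modifier)
                nn.ReLU(),
                nn.Linear(hidden_dim, n_labels),       # one score per label
            )

        def forward(self, head_feat, mod_feat):
            return self.mlp(torch.cat([head_feat, mod_feat], dim=-1))

    # label one arc given (random stand-in) Bi-LSTM features
    labeler = ArcLabeler()
    logits = labeler(torch.randn(128), torch.randn(128))
    print(UD_LABELS[int(logits.argmax())])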

SLIDE 14


Effect of Ensemble

Model | LAS
Full ensemble | 75.00
Single labeler | 74.69

SLIDE 15

Results — Official Ranking

Category | Ranking
Big Treebanks | 2
Small Treebanks | 1
PUD Treebanks | 2
Surprise Languages | 1
Overall | 2

SLIDE 16

Strategies — Small Treebanks

Train on {fr, fr_partut, fr_sequoia} (all tasks) → combined model

Fine-tune the combined model on each treebank (fr, fr_partut, fr_sequoia; all tasks) → fr model, fr_partut model, fr_sequoia model

Each treebank-specific model is then fine-tuned per task.
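
A schematic of the combined-then-fine-tune strategy in Python; the train_epoch helper, the data format, and the epoch counts are hypothetical, and the subsequent per-task fine-tuning step is omitted:

    import copy

    def train_small_treebank_models(model, treebanks, train_epoch,
                                    combined_epochs=20, finetune_epochs=5):
        """treebanks: dict mapping a treebank name (e.g. 'fr_partut') to its
        training examples. Returns one fine-tuned model per treebank."""
        # stage 1: one combined model on the union of the related treebanks
        combined_data = [ex for data in treebanks.values() for ex in data]
        for _ in range(combined_epochs):
            train_epoch(model, combined_data)
        # stage 2: fine-tune a copy of the combined model on each treebank
        finetuned = {}
        for name, data in treebanks.items():
            specific = copy.deepcopy(model)
            for _ in range(finetune_epochs):
                train_epoch(specific, data)
            finetuned[name] = specific
        return finetuned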

SLIDE 17

Results — Small Treebanks

UAS by train treebank (rows) and test treebank (columns):

Train \ Test | fr | fr_partut | fr_sequoia
fr | 84.09 | – | –
fr_partut | – | 79.53 | –
fr_sequoia | – | – | 84.65
Combined | 87.57 | 85.57 | 82.80
+Finetune | 87.87 | 86.65 | 86.37

* UAS results on dev set, using gold segmentation

SLIDE 18

Strategies — Surprise Languages

  • Train on a source language (selected via WALS)
  • Delexicalized parser: in place of the character-level Bi-LSTM word vectors, each word is represented as concat(UPOS tag embedding, bag of morphology), where the bag of morphology is a max pooling over the word's morphology tag embeddings (see the sketch below)
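
A minimal sketch of the delexicalized word representation, assuming PyTorch; the tag inventory sizes, embedding dimensions, and ids are illustrative:

    import torch
    import torch.nn as nn

    class DelexicalizedRepr(nn.Module):
        """Word representation with no word form: a UPOS tag embedding
        concatenated with a max-pooled bag of morphology-tag embeddings."""

        def __init__(self, n_upos=20, n_morph=200, upos_dim=32, morph_dim=32):
            super().__init__()
            self.upos_emb = nn.Embedding(n_upos, upos_dim)
            self.morph_emb = nn.Embedding(n_morph, morph_dim)

        def forward(self, upos_id, morph_ids):
            # morph_ids: LongTensor of the word's morphology-feature ids
            if morph_ids.numel() == 0:
                bag = torch.zeros(self.morph_emb.embedding_dim)
            else:
                bag = self.morph_emb(morph_ids).max(dim=0).values  # max pooling
            return torch.cat([self.upos_emb(upos_id), bag], dim=-1)

    # e.g. a NOUN (illustrative id 7) with two morphology features
    repr_layer = DelexicalizedRepr()
    vec = repr_layer(torch.tensor(7), torch.tensor([11, 42]))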

SLIDE 19

Results — Surprise Languages

Target | Source* | Ranking
Buryat | Hindi | 2
Upper Sorbian | Czech | 1
Kurmanji | Persian | 1
North Sámi | Finnish | 1
Average | – | 1

*selected via WALS

SLIDE 20

Implementation

  • Neural networks
  • Parsing algorithms
  • Hardware × 2
  • Training time: approx. 1 week
SLIDE 21

Efficiency

System | LAS | Runtime (Hours)* | CPUs | RAM
Stanford (Stanford) | 76.30 | 16.27 | 4 | 16
C2L2 (Ithaca) | 75.00 | 4.64 | 2 | 8
IMS (Stuttgart) | 74.42 | 26.17 | 12 | 64
HIT-SCIR (Harbin) | 72.11 | 8.88 | 1 | 8
LATTICE (Paris) | 70.93 | 5.96 | 8 | 32

* Not benchmark results

SLIDE 22

Combining Global Models for Parsing Universal Dependencies

Team C2L2 — Tianze Shi, Felix G. Wu, Xilun Chen, Yao Cheng

  • Global transition-based models (argmax over 𝑧 ∈ 𝒵)
  • Ensemble
  • Two-stage fine-tuning

https://github.com/CoNLL-UD-2017/C2L2