SLIDE 1

Combining Global Models for Parsing Universal Dependencies

Team C2L2 —

Tianze Shi, Felix G. Wu, Xilun Chen, Yao Cheng

Cornell University

SLIDE 2

Overview — Scope of Our System

What we did:

  • Projective Parsing
  • Dependency Arc Labeling
  • Delexicalized Parsing

What we didn’t do:

  • Word Segmentation
  • Sentence Boundary Detection
  • POS Tagging
  • Morphology Analysis
  • Non-projective Parsing
  • Unlabeled data
SLIDE 3

Overview — Highlights

  • Global transition-based models (argmax over 𝑧 ∈ 𝒵)
  • Bi-LSTM-powered compact features
  • High efficiency, low resource demand
  • Delexicalized syntactic transfer (e.g., fi → sme)

Rankings: 1st on Small Treebanks and Surprise Languages, 2nd Overall

SLIDE 4

Overview — System Pipeline

I. UDPipe Pre-process → II. Feature Extraction → III. Unlabeled Parsing → IV. Arc Labeling

SLIDE 5


Raw Text → UDPipe → sentence-delimited & tokenized text

SLIDE 6


Language | Word OOV rate ↓
ko – Korean | 43.68%
la – Latin | 41.22%
sk – Slovak | 36.51%
… | …
Average | 14.4%

* Measured on development set

SLIDE 7


Character-level bi-directional LSTM: composes the characters "p a r s i n g" into a vector for the word "parsing" (figure)

SLIDE 8


Word-level bi-directional LSTM: runs over the word vectors of the sentence "Universal dependency parsing" to produce contextual features (figure)
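
A minimal sketch of the two Bi-LSTM stages (character-level word vectors feeding a word-level contextual encoder), assuming PyTorch; the module names, dimensions, and the ord()-based character ids are illustrative, not the team's implementation:

    import torch
    import torch.nn as nn

    class BiLSTMFeatures(nn.Module):
        """A character-level Bi-LSTM builds word vectors (robust to OOV words);
        a word-level Bi-LSTM then builds contextual features for the parser."""

        def __init__(self, n_chars=128, char_dim=32, word_dim=64, hidden_dim=128):
            super().__init__()
            self.char_emb = nn.Embedding(n_chars, char_dim)
            self.char_lstm = nn.LSTM(char_dim, word_dim // 2,
                                     batch_first=True, bidirectional=True)
            self.word_lstm = nn.LSTM(word_dim, hidden_dim // 2,
                                     batch_first=True, bidirectional=True)

        def forward(self, char_ids_per_word):
            # char_ids_per_word: list of 1-D LongTensors, one per word
            word_vecs = []
            for chars in char_ids_per_word:
                _, (h_n, _) = self.char_lstm(self.char_emb(chars).unsqueeze(0))
                # concatenate the final forward and backward hidden states
                word_vecs.append(torch.cat([h_n[0, 0], h_n[1, 0]], dim=-1))
            sent = torch.stack(word_vecs).unsqueeze(0)   # (1, n_words, word_dim)
            feats, _ = self.word_lstm(sent)              # (1, n_words, hidden_dim)
            return feats.squeeze(0)

    # toy usage on the sentence "Universal dependency parsing"
    extractor = BiLSTMFeatures()
    words = [torch.tensor([ord(c) for c in w])
             for w in ["Universal", "dependency", "parsing"]]
    features = extractor(words)   # one contextual feature vector per word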

SLIDE 9


Bi-LSTM features feed three parsers: Eisner's, global arc-eager, and global arc-hybrid; their outputs are combined by reparsing with Eisner's algorithm (Sagae and Lavie, 2006)
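
A minimal sketch of the reparsing step: the parsers' arc scores are combined into one matrix and decoded into the best projective tree with first-order Eisner decoding; the NumPy implementation and the simple score averaging are illustrative, not the exact combination scheme of the system:

    import numpy as np

    def eisner(scores):
        """First-order Eisner decoding.
        scores[h, m] is the score of the arc h -> m; index 0 is the root.
        Returns heads[m] for every word m (heads[0] is unused)."""
        n = scores.shape[0]
        complete = np.full((n, n, 2), -np.inf)    # d=0: head at right end, d=1: head at left end
        incomplete = np.full((n, n, 2), -np.inf)
        comp_bp = np.zeros((n, n, 2), dtype=int)
        inc_bp = np.zeros((n, n, 2), dtype=int)
        for i in range(n):
            complete[i, i, :] = 0.0

        for k in range(1, n):                     # span length
            for s in range(n - k):
                t = s + k
                # incomplete spans: add the arc between the two endpoints
                cand = complete[s, s:t, 1] + complete[s + 1:t + 1, t, 0]
                r = int(np.argmax(cand))
                incomplete[s, t, 0] = cand[r] + scores[t, s]   # arc t -> s
                incomplete[s, t, 1] = cand[r] + scores[s, t]   # arc s -> t
                inc_bp[s, t, 0] = inc_bp[s, t, 1] = s + r
                # complete span with head at the right endpoint t
                cand = complete[s, s:t, 0] + incomplete[s:t, t, 0]
                r = int(np.argmax(cand))
                complete[s, t, 0], comp_bp[s, t, 0] = cand[r], s + r
                # complete span with head at the left endpoint s
                cand = incomplete[s, s + 1:t + 1, 1] + complete[s + 1:t + 1, t, 1]
                r = int(np.argmax(cand))
                complete[s, t, 1], comp_bp[s, t, 1] = cand[r], s + 1 + r

        heads = [-1] * n
        def backtrack(s, t, d, is_complete):
            if s == t:
                return
            if is_complete:
                r = comp_bp[s, t, d]
                if d == 0:
                    backtrack(s, r, 0, True)
                    backtrack(r, t, 0, False)
                else:
                    backtrack(s, r, 1, False)
                    backtrack(r, t, 1, True)
            else:
                r = inc_bp[s, t, d]
                heads[s if d == 0 else t] = t if d == 0 else s
                backtrack(s, r, 1, True)
                backtrack(r + 1, t, 0, True)
        backtrack(0, n - 1, 1, True)
        return heads

    # ensemble reparsing: average the three parsers' arc-score matrices, then decode
    rng = np.random.default_rng(0)
    score_matrices = [rng.normal(size=(5, 5)) for _ in range(3)]  # stand-ins for the 3 parsers
    print(eisner(np.mean(score_matrices, axis=0)))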

SLIDE 10


* Shi, Huang and Lee (2017, EMNLP)

Global Transition-based Parsing

  • O(n³) exact decoders
  • Arc-eager and Arc-hybrid systems
  • Large-margin global training (schematic below)
  • Dynamic programming (Huang and Sagae, 2010; Kuhlmann, Gómez-Rodríguez and Satta, 2011)
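
A schematic of the large-margin global training step in Python; the score_fn and decode arguments stand in for the sequence scorer and the exact DP decoder, and the Hamming cost over transitions is an illustrative choice, not necessarily the system's:

    def hamming_cost(predicted, gold):
        # illustrative cost: number of positions where the transition differs
        return sum(p != g for p, g in zip(predicted, gold))

    def margin_loss(score_fn, decode, sentence, gold_seq):
        """Structured hinge loss for global training (schematic).
        score_fn(seq): model score of a complete transition sequence.
        decode(sentence, cost_against=...): exact search for the
        highest-scoring sequence, optionally cost-augmented."""
        predicted = decode(sentence, cost_against=gold_seq)
        violation = (score_fn(predicted) + hamming_cost(predicted, gold_seq)
                     - score_fn(gold_seq))
        return max(0.0, violation)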

SLIDE 11


Compact feature set (2 positions); scoring function: deep bi-affine (Dozat and Manning, 2017)

Parser | Feature positions (2)
Eisner's | head, modifier
Arc-eager | stack top, buffer top
Arc-hybrid | stack top, buffer top
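
A minimal sketch of a deep bi-affine scorer over the two feature positions (e.g. stack top and buffer top), in the spirit of Dozat and Manning (2017); PyTorch, the MLP sizes, and the bias handling are illustrative assumptions:

    import torch
    import torch.nn as nn

    class BiaffineScorer(nn.Module):
        """Scores one candidate arc/transition from two Bi-LSTM feature vectors."""

        def __init__(self, feat_dim=128, hidden_dim=100):
            super().__init__()
            # separate MLPs project the two positions before the bi-affine product
            self.mlp_a = nn.Sequential(nn.Linear(feat_dim, hidden_dim), nn.ReLU())
            self.mlp_b = nn.Sequential(nn.Linear(feat_dim, hidden_dim), nn.ReLU())
            self.W = nn.Parameter(torch.empty(hidden_dim + 1, hidden_dim))
            nn.init.xavier_uniform_(self.W)

        def forward(self, feat_a, feat_b):
            a = self.mlp_a(feat_a)                      # e.g. stack-top features
            b = self.mlp_b(feat_b)                      # e.g. buffer-top features
            a = torch.cat([a, torch.ones_like(a[..., :1])], dim=-1)  # append bias term
            return (a @ self.W * b).sum(-1)             # bi-affine score

    # score one configuration from (random stand-in) Bi-LSTM features
    scorer = BiaffineScorer()
    score = scorer(torch.randn(128), torch.randn(128))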

SLIDE 12


Ensembling

Model | LAS
Full ensemble | 75.00
Single Arc-eager | 74.32
Single Arc-hybrid | 74.00
Single Eisner's | 73.75

SLIDE 13


concat(head, modifier) → multi-layer perceptron → dependency label (nsubj, …)
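
A minimal sketch of the arc labeler, assuming PyTorch; the label list is truncated and the dimensions are illustrative:

    import torch
    import torch.nn as nn

    UD_LABELS = ["nsubj", "obj", "obl", "nmod", "amod", "root"]  # truncated label set

    class ArcLabeler(nn.Module):
        """Predicts the dependency label of an arc from the Bi-LSTM
        features of its head and modifier."""

        def __init__(self, feat_dim=128, hidden_dim=100, n_labels=len(UD_LABELS)):
            super().__init__()
            self.mlp = nn.Sequential(
                nn.Linear(2 * feat_dim, hidden_dim),   # concat(head, modifier)
                nn.ReLU(),
                nn.Linear(hidden_dim, n_labels),       # one score per label
            )

        def forward(self, head_feat, mod_feat):
            return self.mlp(torch.cat([head_feat, mod_feat], dim=-1))

    # label one arc given (random stand-in) Bi-LSTM features
    labeler = ArcLabeler()
    logits = labeler(torch.randn(128), torch.randn(128))
    print(UD_LABELS[int(logits.argmax())])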

SLIDE 14


Effect of Ensemble

Model | LAS
Full ensemble | 75.00
Single labeler | 74.69

SLIDE 15

Results — Official Ranking

Category | Ranking
Big Treebanks | 2
Small Treebanks | 1
PUD Treebanks | 2
Surprise Languages | 1
Overall | 2

SLIDE 16

Strategies — Small Treebanks

Train on {fr, fr_partut, fr_sequoia} (all tasks) → combined model

Fine-tune the combined model on each treebank (fr, fr_partut, fr_sequoia; all tasks) → fr model, fr_partut model, fr_sequoia model

Each treebank-specific model is then fine-tuned per task.
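
A schematic of the combined-then-fine-tune strategy in Python; the train_epoch helper, the data format, and the epoch counts are hypothetical, and the subsequent per-task fine-tuning step is omitted:

    import copy

    def train_small_treebank_models(model, treebanks, train_epoch,
                                    combined_epochs=20, finetune_epochs=5):
        """treebanks: dict mapping a treebank name (e.g. 'fr_partut') to its
        training examples. Returns one fine-tuned model per treebank."""
        # stage 1: one combined model on the union of the related treebanks
        combined_data = [ex for data in treebanks.values() for ex in data]
        for _ in range(combined_epochs):
            train_epoch(model, combined_data)
        # stage 2: fine-tune a copy of the combined model on each treebank
        finetuned = {}
        for name, data in treebanks.items():
            specific = copy.deepcopy(model)
            for _ in range(finetune_epochs):
                train_epoch(specific, data)
            finetuned[name] = specific
        return finetuned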

SLIDE 17

Results — Small Treebanks

UAS by train treebank (rows) and test treebank (columns):

Train \ Test | fr | fr_partut | fr_sequoia
fr | 84.09 | – | –
fr_partut | – | 79.53 | –
fr_sequoia | – | – | 84.65
Combined | 87.57 | 85.57 | 82.80
+Finetune | 87.87 | 86.65 | 86.37

* UAS results on dev set, using gold segmentation

SLIDE 18

Strategies — Surprise Languages

  • Train on a source language (selected via WALS)
  • Delexicalized parser: in place of the character-level Bi-LSTM word vectors, each word is represented as concat(UPOS tag embedding, bag of morphology), where the bag of morphology is a max pooling over the word's morphology tag embeddings (see the sketch below)
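
A minimal sketch of the delexicalized word representation, assuming PyTorch; the tag inventory sizes, embedding dimensions, and ids are illustrative:

    import torch
    import torch.nn as nn

    class DelexicalizedRepr(nn.Module):
        """Word representation with no word form: a UPOS tag embedding
        concatenated with a max-pooled bag of morphology-tag embeddings."""

        def __init__(self, n_upos=20, n_morph=200, upos_dim=32, morph_dim=32):
            super().__init__()
            self.upos_emb = nn.Embedding(n_upos, upos_dim)
            self.morph_emb = nn.Embedding(n_morph, morph_dim)

        def forward(self, upos_id, morph_ids):
            # morph_ids: LongTensor of the word's morphology-feature ids
            if morph_ids.numel() == 0:
                bag = torch.zeros(self.morph_emb.embedding_dim)
            else:
                bag = self.morph_emb(morph_ids).max(dim=0).values  # max pooling
            return torch.cat([self.upos_emb(upos_id), bag], dim=-1)

    # e.g. a NOUN (illustrative id 7) with two morphology features
    repr_layer = DelexicalizedRepr()
    vec = repr_layer(torch.tensor(7), torch.tensor([11, 42]))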

SLIDE 19

Results — Surprise Languages

Target | Source* | Ranking
Buryat | Hindi | 2
Upper Sorbian | Czech | 1
Kurmanji | Persian | 1
North Sámi | Finnish | 1
Average | – | 1

*selected via WALS

SLIDE 20

Implementation

  • Neural networks
  • Parsing algorithms
  • Hardware × 2
  • Training time: approx. 1 week
SLIDE 21

Efficiency

System | LAS | Runtime (Hours)* | CPUs | RAM
Stanford (Stanford) | 76.30 | 16.27 | 4 | 16
C2L2 (Ithaca) | 75.00 | 4.64 | 2 | 8
IMS (Stuttgart) | 74.42 | 26.17 | 12 | 64
HIT-SCIR (Harbin) | 72.11 | 8.88 | 1 | 8
LATTICE (Paris) | 70.93 | 5.96 | 8 | 32

* Not benchmark results

SLIDE 22

Combining Global Models for Parsing Universal Dependencies

Team C2L2 — Tianze Shi, Felix G. Wu, Xilun Chen, Yao Cheng

  • Global transition-based models (argmax over 𝑧 ∈ 𝒵)
  • Ensemble
  • Two-stage fine-tuning

https://github.com/CoNLL-UD-2017/C2L2