Soft Cross-lingual Syntax Projection for Dependency Parsing



slide-1
SLIDE 1

Zhenghua Li, Min Zhang, Wenliang Chen {zhli13, minzhang, wlchen}@suda.edu.cn Soochow University, China

Soft Cross‐lingual Syntax Projection for Dependency Parsing

slide-2
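SLIDE 2

Dependency parsing

• A bilingual example

English: $0 I1 eat2 the3 fish4 with5 a6 fork7
[Figure: dependency tree; arc labels shown: root, subj, obj, pmod, obj, det, det]

Chinese: $0 我1 用2 叉子3 吃4 鱼5
[Figure: dependency tree; arc labels shown: root, subj, obj, obj, vv; glosses shown: fish, eat]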

slide-3
SLIDE 3

Big picture (semi-supervised)

[Pipeline diagram. Components: English Treebank; English Parser; Bitext (example pair: "I love this game" / 我 爱 这 运动); "Project English parse trees into Chinese"; Chinese labeled data with partial trees; Chinese Treebank; Larger training data]

slide-4
SLIDE 4

Syntax projection

English: $0 I1 eat2 the3 fish4 with5 a6 fork7
Chinese: $0 我1 用2 叉子3 吃4 鱼5
[Figure: word-aligned sentence pair with dependency arcs; glosses shown: fish, eat]

slide-5
SLIDE 5

Challenges

• Syntactic non-isomorphism across languages
• Different annotation choices (guidelines)
• Partial (incomplete) parse trees resulting from projection
• Parsing errors on the source side
• Word alignment errors

slide-6
SLIDE 6

Cross-language non-isomorphism

English: $0 I1 eat2 the3 fish4 with5 a6 fork7
Chinese: $0 我1 用2 叉子3 吃4 鱼5
[Figure: word-aligned sentence pair with dependency arcs; glosses shown: use (verb), eat]

slide-7
SLIDE 7

Different annotation choices

• Coordination structure as an example

[Figure: the phrase "fish and bird" shown five times, each with its own dependency annotation]

slide-8
SLIDE 8

Challenges

• Syntactic non-isomorphism across languages
• Different annotation choices (guidelines)
• Partial (incomplete) parse trees resulting from projection
• Parsing errors on the source side
• Word alignment errors

All these factors can lead to bad projections!

slide-9
SLIDE 9

Why is it called soft projection?

Project fewer but more reliable dependencies; put quality before quantity

Careful/gentle/conservative projection
Wrong projection -> training noise

slide-10
SLIDE 10

Big picture (semi-supervised)

[Pipeline diagram. Components: English Treebank; English Parser; Bitext (example pair: "I love this game" / 我 爱 这 运动); "Project English parse trees into Chinese"; filtering; Chinese Parser; Chinese labeled data with partial trees; Chinese Treebank; Larger training data]

slide-11
SLIDE 11

Step 1: word alignment and English parsing on bitext

[Pipeline diagram. Components: English Treebank; English Parser; Bitext (example pair: "I love this game" / 我 爱 这 运动)]

English: $0 I1 eat2 the3 fish4 with5 a6 fork7
Chinese: $0 我1 用2 叉子3 吃4 鱼5

slide-12
SLIDE 12

Step 2: project English tree into Chinese (direct correspondence assumption)

[Pipeline diagram. Components: English Treebank; English Parser; Bitext (example pair: "I love this game" / 我 爱 这 运动); "Project English parse trees into Chinese"; Chinese labeled data with partial trees]

slide-13
SLIDE 13

English: $0 I1 eat2 the3 fish4 with5 a6 fork7
Chinese: $0 我1 用2 叉子3 吃4 鱼5

Step 2: project English tree into Chinese (direct correspondence assumption)
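
A minimal Python sketch of projection under the direct correspondence assumption: if two source words stand in a head-modifier relation and both are aligned, the aligned target words receive the same relation. The function name, the one-to-one alignment dictionary, and the English parse and alignment in the example are illustrative assumptions, not taken from the slides.

```python
from typing import Dict, List, Optional

def project_heads(src_heads: List[int],
                  tgt_to_src: Dict[int, int],
                  tgt_len: int) -> List[Optional[int]]:
    """Project source-side heads onto target words (direct correspondence).

    src_heads[i]: head index of source word i (index 0 is the root $0, entry unused).
    tgt_to_src:   hypothetical one-to-one alignment, target index -> source index.
    Returns a list whose entry j is the projected head of target word j,
    or None if no head could be projected (the word stays unattached).
    """
    src_to_tgt = {s: t for t, s in tgt_to_src.items()}
    src_to_tgt[0] = 0  # the two root symbols correspond to each other
    tgt_heads: List[Optional[int]] = [None] * (tgt_len + 1)
    for t, s in tgt_to_src.items():
        src_head = src_heads[s]
        if src_head in src_to_tgt:  # the source head is aligned as well
            tgt_heads[t] = src_to_tgt[src_head]
    return tgt_heads

# Assumed English parse of "I1 eat2 the3 fish4 with5 a6 fork7" (heads by index).
en_heads = [-1, 2, 0, 4, 2, 2, 7, 5]
# Assumed alignment: 我1-I1, 用2-with5, 叉子3-fork7, 吃4-eat2, 鱼5-fish4.
full_alignment = {1: 1, 2: 5, 3: 7, 4: 2, 5: 4}
print(project_heads(en_heads, full_alignment, 5))     # [None, 4, 4, 2, 0, 4]

# Without the 用2-with5 link, 用2 and 叉子3 stay unattached: a partial tree.
partial_alignment = {1: 1, 3: 7, 4: 2, 5: 4}
print(project_heads(en_heads, partial_alignment, 5))  # [None, 4, None, None, 0, 4]
```

The second call shows how a single missing alignment link already yields an incomplete tree on the Chinese side.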

slide-14
SLIDE 14

Step 3: filter projected structures with baseline Chinese Parser

[Pipeline diagram. Components: English Treebank; English Parser; Bitext (example pair: "I love this game" / 我 爱 这 运动); "Project English parse trees into Chinese"; filtering; Chinese Parser; Chinese Treebank; Chinese labeled data with partial trees]

slide-15
SLIDE 15

Relationship between prob and accuracy

slide-16
SLIDE 16

Step 3: filter projected structures with baseline Chinese Parser

English: $0 I1 eat2 the3 fish4 with5 a6 fork7
Chinese: $0 我1 用2 叉子3 吃4 鱼5
[Figure: projected dependency arcs checked by the Chinese Parser; glosses shown: use, eat]

slide-17
SLIDE 17

Step 3: filter projected structures with baseline Chinese Parser

English: $0 I1 eat2 the3 fish4 with5 a6 fork7
Chinese: $0 我1 用2 叉子3 吃4 鱼5
[Figure: glosses shown: use, eat]

slide-18
SLIDE 18

Step 3: filter projected structures with baseline Chinese Parser

English: $0 I1 eat2 the3 fish4 with5 a6 fork7
Chinese: $0 我1 用2 叉子3 吃4 鱼5
[Figure: glosses shown: use, eat]
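
A minimal sketch of the filtering step, assuming the baseline Chinese parser exposes a marginal probability for each candidate arc and that a projected arc is dropped when its marginal falls below a threshold. The function name, the dictionary layout, the probabilities, and the threshold value are made up for illustration.

```python
from typing import Dict, List, Optional, Tuple

def filter_projected(tgt_heads: List[Optional[int]],
                     marginals: Dict[Tuple[int, int], float],
                     threshold: float) -> List[Optional[int]]:
    """Keep a projected arc only if the baseline parser finds it plausible.

    tgt_heads[m]:        projected head of target word m, or None (index 0 is $0).
    marginals[(h, m)]:   assumed marginal probability of arc h -> m under the
                         baseline parser; missing arcs count as probability 0.
    """
    kept = list(tgt_heads)
    for mod in range(1, len(tgt_heads)):
        head = tgt_heads[mod]
        if head is not None and marginals.get((head, mod), 0.0) < threshold:
            kept[mod] = None  # discard the arc; the word becomes unattached
    return kept

# Hypothetical projected heads for 我1 用2 叉子3 吃4 鱼5 and made-up marginals.
projected = [None, 4, 4, 2, 0, 4]
probs = {(4, 1): 0.9, (4, 2): 0.2, (2, 3): 0.8, (0, 4): 0.95, (4, 5): 0.9}
print(filter_projected(projected, probs, threshold=0.5))  # [None, 4, None, 2, 0, 4]
```

Raising the threshold keeps fewer arcs, which trades quantity for reliability.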

slide-19
SLIDE 19

Step 4: combine the data to train a new Chinese Parser

[Pipeline diagram. Components: English Treebank; English Parser; Bitext (example pair: "I love this game" / 我 爱 这 运动); "Project English parse trees into Chinese"; filtering; Chinese Parser; Chinese Treebank; Chinese labeled data with partial trees; Larger training data]

slide-20
SLIDE 20

How to handle data with partial tree annotation

• Convert partial tree annotation into forest annotation (ambiguous labelings)
  • For an unattached word, add links from all other words to it.

Chinese: $0 我1 用2 叉子3 吃4 鱼5
[Figure: partial tree expanded into a forest; glosses shown: use, eat]
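
A minimal sketch of the conversion, representing a forest as a set of allowed heads per word. An attached word keeps its single head; an unattached word may take any other word as its head. Whether the root $0 counts as a possible head for unattached words is not stated on the slide; the sketch includes it as an assumption.

```python
from typing import Dict, List, Optional, Set

def partial_tree_to_forest(tgt_heads: List[Optional[int]]) -> Dict[int, Set[int]]:
    """Map each word to the set of heads it may take in the forest.

    tgt_heads[m]: head of word m, or None if unattached (index 0 is $0).
    """
    n = len(tgt_heads) - 1
    forest: Dict[int, Set[int]] = {}
    for mod in range(1, n + 1):
        if tgt_heads[mod] is not None:
            forest[mod] = {tgt_heads[mod]}  # annotated word: head is fixed
        else:
            # Unattached word: every other position is a candidate head
            # (including $0, which is an assumption of this sketch).
            forest[mod] = {h for h in range(n + 1) if h != mod}
    return forest

# Hypothetical partial tree for 我1 用2 叉子3 吃4 鱼5 with 用2 unattached.
print(partial_tree_to_forest([None, 4, None, 2, 0, 4]))
```

A fully annotated tree yields singleton sets everywhere, so it is the special case of a forest that contains exactly one tree.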

slide-21
SLIDE 21

How to handle data with partial tree annotation

• Maximize the mixed likelihood of manually labeled data with tree annotation and auto-collected data with forest annotation
  • Tree annotation can be understood as a special case of forest annotation
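
One way to write the mixed likelihood, as a sketch under assumed notation: $\mathcal{D}$ is the manually labeled data with trees $d$, $\mathcal{D}'$ is the auto-collected data with forests $\mathcal{F}$, and $\theta$ are the parser parameters. The slides do not say whether the two terms are weighted, so none is shown.

```latex
\mathcal{L}(\theta)
  = \sum_{(x,\,d) \in \mathcal{D}} \log p(d \mid x; \theta)
  + \sum_{(x,\,\mathcal{F}) \in \mathcal{D}'} \log \sum_{d' \in \mathcal{F}} p(d' \mid x; \theta)
```

When a forest contains a single tree, $\mathcal{F} = \{d\}$, the inner sum reduces to $p(d \mid x; \theta)$, which is why tree annotation is a special case of forest annotation.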

How to train a parser using data with forest annotation?

slide-22
SLIDE 22

Train with ambiguous labelings

• Refer to Tackstrom+ 13 and several earlier papers

Maximize the likelihood of the data
Maximize the probability of a forest
Maximize the sum probability of all the trees in the forest

The training problem can be solved with the inside-outside algorithm
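
To make "the sum probability of all the trees in the forest" concrete, here is a brute-force Python sketch for very short sentences. It is not the inside-outside algorithm and not the second-order model used in the talk: it assumes a first-order arc-factored model, enumerates every head assignment, and ignores projectivity. All names and scores are hypothetical.

```python
import itertools
import math
from typing import Dict, Sequence, Set, Tuple

def is_tree(heads: Sequence[int]) -> bool:
    """heads[m-1] is the head of word m (0 = root). True if every word reaches the root."""
    for start in range(1, len(heads) + 1):
        seen = set()
        cur = start
        while cur != 0:
            if cur in seen:  # cycle (this also catches self-loops)
                return False
            seen.add(cur)
            cur = heads[cur - 1]
    return True

def forest_probability(scores: Dict[Tuple[int, int], float],
                       forest: Dict[int, Set[int]]) -> float:
    """Summed probability of all trees whose arcs are allowed by the forest.

    scores[(h, m)]: assumed arc score (missing arcs score 0.0);
                    a tree's weight is exp(sum of its arc scores).
    forest[m]:      allowed heads of word m, for m = 1..n.
    """
    n = len(forest)
    z_all = 0.0      # total weight of all trees (the partition function)
    z_forest = 0.0   # total weight of trees inside the forest
    for heads in itertools.product(range(n + 1), repeat=n):
        if not is_tree(heads):
            continue
        weight = math.exp(sum(scores.get((h, m), 0.0)
                              for m, h in enumerate(heads, start=1)))
        z_all += weight
        if all(h in forest[m] for m, h in enumerate(heads, start=1)):
            z_forest += weight
    return z_forest / z_all

# Two words, uniform scores: 3 trees exist; the forest below admits 2 of them.
print(forest_probability({}, {1: {0}, 2: {0, 1}}))
```

Enumeration grows exponentially with sentence length, which is why a dynamic program such as inside-outside is needed in practice.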

slide-23
SLIDE 23

Experiments

• Data statistics
• Parser
  • Second-order dependency parser (McDonald & Pereira 06) (CRF-based, probabilistic)
  • SGD training (20K + 1M training data)

slide-24
SLIDE 24

Relationship between prob and accuracy

slide-25
SLIDE 25

Effect of filtering threshold

Proj ratio: 44% 31% 26%

slide-26
SLIDE 26

Supplement the projected structures with baseline Chinese parser

• Even after filtering, the projected structures may still contain wrong dependencies
• Use the baseline Chinese parser to add more high-probability dependencies (multiple heads for a single word, to decrease potential noise)

slide-27
SLIDE 27

Supplement the projected structures with baseline Chinese parser

English: $0 I1 eat2 the3 fish4 with5 a6 fork7
Chinese: $0 我1 用2 叉子3 吃4 鱼5
[Figure: glosses shown: use, eat]

slide-28
SLIDE 28

Supplement the projected structures with baseline Chinese parser

English: $0 I1 eat2 the3 fish4 with5 a6 fork7
Chinese: $0 我1 用2 叉子3 吃4 鱼5
[Figure: glosses shown: use, eat]

slide-29
SLIDE 29

Effect of supplement threshold

slide-30
SLIDE 30

Effect of supplement threshold

slide-31
SLIDE 31

Effect of supplement threshold

slide-32
SLIDE 32

Final results on CTB5 test

slide-33
SLIDE 33

Comparison with (Jiang+ 10) on CTB5X test

slide-34
SLIDE 34

Recent works on multilingual dependency parsing

• Semi-supervised
  • Bilingual word reordering info (Huang & Sagae 09)
  • Project to build a local classifier (Jiang & Liu 10)
• Unsupervised
  • Projection (Ganchev+ 09)
  • Delexicalized (McDonald+ 11; Tackstrom+ 12, 13)
  • Hybrid (McDonald+ 11; Ma & Xia 14)

slide-35
SLIDE 35

Conclusions

• We propose a simple semi-supervised framework to derive high-quality labeled training data from bitext
  • Use target-language marginal probabilities to control the quality of the projected structures (quite simple and effective)
  • Use a forest-based training method to make use of partial annotations (a very general framework)

slide-36
SLIDE 36

Future directions

• Project more dependencies from source-language parse trees?
  • When two target-language words align to the same source-language word?
  • More complex correspondences between source-target trees?

slide-37
SLIDE 37

Future directions

• More elegant ways to handle
  • word alignment errors (word alignment prob?)
  • source-language parsing errors (parsing prob?)
  • cross-lingual non-isomorphism (very difficult!)
  • annotation guideline differences
• Universal dependency parsing? (earlier invited talk by Prof. Nivre)
• Joint word alignment and bilingual dependency parsing?
  • Handle all of the above issues in a unified framework

slide-38
SLIDE 38

Thanks for your time! Questions?

slide-39
SLIDE 39

Build local classifiers via projection (Jiang & Liu 10)

• Semi-supervised; project edges
  • Step 1: projection to obtain dependency/non-dependency classification instances
  • Step 2: build a target-language local dependency/non-dependency classifier
  • Step 3: feed the outputs of the classifier into a supervised parser as extra weights during the test phase

slide-40
SLIDE 40

Supplement the projected structures with baseline Chinese parser

If: a word obtains a head from projection (and that head survives filtering), and the baseline Chinese parser suggests another high-probability candidate head
Then: insert the candidate head into the projected structure.
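
The rule above can be sketched as follows, assuming arc marginals from the baseline Chinese parser are available as a dictionary and that "high-prob" means at or above a supplement threshold. The function name, data layout, probabilities, and threshold are illustrative assumptions.

```python
from typing import Dict, List, Optional, Set, Tuple

def supplement_heads(kept_heads: List[Optional[int]],
                     marginals: Dict[Tuple[int, int], float],
                     threshold: float) -> Dict[int, Set[int]]:
    """Add high-probability alternative heads to words that kept a projected head.

    kept_heads[m]:      projected head of word m after filtering, or None.
    marginals[(h, m)]:  assumed marginal probability of arc h -> m under the
                        baseline parser.
    Returns, for each word with a projected head, the set of allowed heads.
    """
    allowed: Dict[int, Set[int]] = {}
    for mod in range(1, len(kept_heads)):
        projected = kept_heads[mod]
        if projected is None:
            continue  # the rule only applies to words that kept a head
        heads = {projected}
        for (head, m), prob in marginals.items():
            if m == mod and head != projected and prob >= threshold:
                heads.add(head)  # another plausible head: keep both
        allowed[mod] = heads
    return allowed

# Hypothetical filtered heads for 我1 用2 叉子3 吃4 鱼5 and made-up marginals.
kept = [None, 4, None, 2, 0, 4]
probs = {(4, 1): 0.9, (2, 1): 0.05, (2, 3): 0.6, (4, 3): 0.35,
         (0, 4): 0.95, (4, 5): 0.9}
print(supplement_heads(kept, probs, threshold=0.3))
```

In this made-up example word 3 ends up with two candidate heads, so a possibly wrong projected arc no longer forces a single analysis during training.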

slide-41
SLIDE 41

Multilingual dependency parsing has become a hot topic

• Pioneered by Hwa+ 05
• Motivations
  • A more accurate parser for one language may help a less accurate one for another language (this paper)
  • A syntactic ambiguity that is difficult in one language may be easy to resolve in another language
  • Rich labeled resources in one language can be transferred to build parsers for another language (unsupervised)