slide-1
SLIDE 1

Projection of Trees across Parallel Texts

Daniel Zeman, Rudolf Rosa

April 17, 2020

NPFL120 Multilingual Natural Language Processing

Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics

slide-2
SLIDE 2

Projection of Trees across Parallel Texts

  • Rebecca Hwa, Philip Resnik, Amy Weinberg, Clara Cabezas, Okan Kolak (2004). Bootstrapping Parsers via Syntactic Projection across Parallel Texts. In Natural Language Engineering 1 (1): 1–15. Cambridge University Press.
  • Source: English
  • Target: Spanish, Chinese
  • Dependency trees (not phrase structure)


slide-3
SLIDE 3

Projection System Architecture


slide-7
SLIDE 7

Direct Projection

Given sentence pair (E, F) and a set of syntactic relations for E, where E = e1, ..., en is an English sentence and F = f1, ..., fm is its non-English parallel, syntactic relations R(x, y) are projected from English as follows:

  • one-to-one – ei aligned with a unique fx and ej aligned with a unique fy – then R(ei, ej) ⇒ R(fx, fy)
  • unaligned English – ej not aligned with any word in F – create a new empty word fy so that for any ei aligned with a unique fx, R(ei, ej) ⇒ R(fx, fy) and R(ej, ei) ⇒ R(fy, fx)
  • one-to-many – ei aligned with fx, ..., fy – then create a new empty word fz, parent of fx, ..., fy, and set ei to align to fz instead
  • many-to-one – ei, ..., ej uniquely aligned to fx – then keep only the head of ei, ..., ej aligned to fx, and delete the other alignments
  • many-to-many – decompose: first one-to-many, then many-to-one
  • unaligned foreign – leave them out of the projected tree
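The cascade above can be sketched in code. This is a minimal illustration, not the authors' implementation: it covers the one-to-one, one-to-many, and many-to-one cases and simply drops unaligned words on both sides (the empty-node rule for unaligned English words is omitted for brevity). The index conventions and function shape are assumptions.

```python
from collections import defaultdict

def project_tree(e_heads, align, f_len):
    """Project an English dependency tree onto a foreign sentence.

    e_heads: {e_child: e_head}, with -1 as the root's head
    align:   iterable of (e_index, f_index) word-alignment pairs
    f_len:   number of foreign tokens; indices >= f_len denote new
             empty nodes created for one-to-many alignments
    Returns {f_child: f_head} for the projected tree.
    """
    align = set(align)
    next_empty = f_len

    # many-to-one: keep only the alignment of the e-word heading the
    # group (assumes, as the slide notes, the group is a one-headed phrase)
    by_f = defaultdict(list)
    for e, f in align:
        by_f[f].append(e)
    for f, es in by_f.items():
        if len(es) > 1:
            head = next(e for e in es if e_heads.get(e, -1) not in es)
            align -= {(e, f) for e in es if e != head}

    # one-to-many: create an empty node fz governing fx..fy and
    # realign ei to fz instead
    f_heads = {}
    by_e = defaultdict(list)
    for e, f in align:
        by_e[e].append(f)
    for e, fs in by_e.items():
        if len(fs) > 1:
            empty, next_empty = next_empty, next_empty + 1
            for f in fs:
                f_heads[f] = empty
                align.discard((e, f))
            align.add((e, empty))

    # alignment is now one-to-one: copy each R(ei, ej) to R(fx, fy)
    e2f = dict(align)
    for e_child, e_head in e_heads.items():
        if e_child in e2f:
            f_heads[e2f[e_child]] = e2f.get(e_head, -1)
    return f_heads
```

With a fully one-to-one alignment the foreign tree is an exact copy of the English one, which is the direct correspondence assumption in its purest form.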


slide-12
SLIDE 12

Direct Projection Example

English: He took a picture of my daughter
Czech: f1 f2 f3 Vyfotil si f6 moji dceru

[Figure: the English dependency tree (relations nsubj, obj, det, nmod, case) projected onto the Czech sentence, with empty nodes f1, f2, f3, f6 created for English words without a Czech counterpart]


slide-15
SLIDE 15

Direct Projection Example 2

English: He took a picture of my daughter
Czech: f1 Vyfotil si f6 moji dceru

[Figure: the projected Czech tree with empty nodes f1 and f6; relations nsubj, obj, det, nmod, case]


slide-20
SLIDE 20

Direct Projection Example 3

English: He took a picture of my daughter
Czech: f1 f2 Vyfotil f4 si f6 moji dceru

[Figure: the projected Czech tree with empty nodes f1, f2, f4, f6; relations nsubj, obj, det, nmod, case; deleted alignments marked ×]


slide-22
SLIDE 22

Many-to-One Assumption: ei, ..., ej Is a Phrase with One Head. What if Not?

English: He took a picture of my daughter
Czech: Vyfotil si moji dceru

[Figure: an alignment in which the English words aligned to one Czech word do not form a phrase with a single head; relations nsubj, obj, det, nmod, case]


slide-23
SLIDE 23

Experiments with Direct Projection

  • 100 gold trees projected from English to Spanish
  • 88 gold trees projected from English to Chinese
  • Word alignments are gold-standard too!
  • The goal is just to check the direct correspondence assumption.
  • Compared with target gold-standard trees
  • Spanish unlabeled F-score = 37%
  • Chinese unlabeled F-score = 38%


slide-25
SLIDE 25

Problems

  • Many-to-one deletes alignments ⇒ tree is not connected
  • Possible solution: transitive closure?
  • Unaligned foreign words remain unattached
  • Possible solution: postprocessing with target language knowledge

English: He took a picture of my daughter
Czech: f1 Vyfotil si f6 moji dceru

[Figure: the projected Czech tree with unattached nodes; relations nsubj, obj, det, nmod, case, obj:nmod]


slide-28
SLIDE 28

Postprocessing Rules

  • A few dozen rules, less than a month of work
  • Spanish example:
  • A reflexive clitic should modify the verb to its left.
  • Chinese example:
  • An aspectual marker should modify the verb to its left.
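Rules of this kind are easy to express in code. A minimal sketch of the reflexive-clitic rule above (not the authors' actual code; the token representation and clitic list are assumptions):

```python
# Tokens are (form, upos) pairs; heads maps token index -> head index,
# with -1 marking a node left unattached by the projection.

REFLEXIVE_CLITICS = {"se", "si"}  # Spanish/Czech-style clitics (assumption)

def attach_reflexive_clitics(tokens, heads):
    """Attach each unattached reflexive clitic to the nearest verb on its left."""
    for i, (form, upos) in enumerate(tokens):
        if form.lower() in REFLEXIVE_CLITICS and heads.get(i, -1) == -1:
            for j in range(i - 1, -1, -1):  # scan leftward for a verb
                if tokens[j][1] == "VERB":
                    heads[i] = j
                    break
    return heads
```

For the running example, the unattached "si" in "Vyfotil si moji dceru" gets attached to the verb "Vyfotil" immediately to its left.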


slide-29
SLIDE 29

Experiments with Postprocessing on Gold Data

  • 100 gold trees projected from English to Spanish
  • 88 gold trees projected from English to Chinese
  • Word alignments are gold-standard too!
  • Compared with target gold-standard trees
  • Spanish unlabeled F-score = 70%
  • Chinese unlabeled F-score = 67%


slide-30
SLIDE 30

Real-World Setting

  • Collins Model 2 (1997) English parser trained on Penn Treebank / WSJ
  • Converted to dependencies (Magerman 1994, Xia and Palmer 2001)
  • Word alignments computed with GIZA++ (Och and Ney 2003)
  • 100K en-es sentence pairs (Bible, Foreign Broadcast Information Service, United Nations Parallel Corpus)
  • 240K en-zh sentence pairs (Foreign Broadcast Information Service)
  • Project trees using direct correspondence + postprocessing
  • Aggressive filtering: discard projected trees of poor quality
  • Train Collins dependency parser (1999) on remaining trees
  • Apply the parser to unseen target-language sentences


slide-33
SLIDE 33

Pruning Criteria

  • Based on tuning on development set, discard if…
  • > 20% of the English words have no Spanish counterpart
  • > 30% of the Spanish words have no English counterpart
  • > 4 Spanish words were aligned to the same English word
  • Additional criteria for English-Chinese:
  • Crossing dependencies
  • Number of unattached nodes after postprocessing
  • Number of words with unknown POS category
  • 20K projected Spanish trees after filtering
  • 50K projected Chinese trees after filtering
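The English–Spanish criteria amount to three threshold checks per sentence pair. A minimal sketch (thresholds from the slide; the function shape and input representation are assumptions):

```python
from collections import Counter

def keep_tree(n_en, n_es, align):
    """Decide whether to keep a projected tree.

    n_en, n_es: English / Spanish sentence lengths in words
    align:      set of (en_index, es_index) word-alignment pairs
    """
    en_aligned = {e for e, _ in align}
    es_aligned = {f for _, f in align}
    # discard if > 20% of the English words have no Spanish counterpart
    if (n_en - len(en_aligned)) / n_en > 0.20:
        return False
    # discard if > 30% of the Spanish words have no English counterpart
    if (n_es - len(es_aligned)) / n_es > 0.30:
        return False
    # discard if > 4 Spanish words were aligned to the same English word
    fanout = Counter(e for e, _ in align)
    if fanout and max(fanout.values()) > 4:
        return False
    return True
```

The extra English–Chinese criteria (crossing dependencies, unattached nodes, unknown POS) would be further checks of the same form on the projected tree itself.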


slide-38
SLIDE 38

Experiments

  • Spanish
  • Baseline (left-to-right) unlabeled F-score = 33.8%
  • Parser on unfiltered data (98K) F = 67.3%
  • Parser on filtered data (20K) F = 72.1%
  • Commercial parser F = 69.2%
  • Chinese
  • Baseline (left-to-right) F = 35.1%
  • Baseline + postprocessing F = 44.3%
  • Parser on filtered data (50K) F = 53.9%
  • Parser on Penn Chinese Treebank (10K) F = 64.3%
  • Learning curve: the projected parser is roughly equivalent to one trained on about 2K manually annotated sentences

