
SLIDE 1

Parser Self-Training for Syntax-Based Machine Translation

Nara Institute of Science and Technology, Augmented Human Communication Laboratory
Makoto Morishita, Koichi Akabe, Yuto Hatakoshi, Graham Neubig, Koichiro Yoshino, Satoshi Nakamura

2015/12/03, IWSLT 2015

SLIDE 2

Background

SLIDE 3

Phrase-Based Machine Translation

๏ Translate and reorder by phrases.

  • Easy to learn the translation model.
  • Low translation accuracy on language pairs with different word order.

(Figure: "John hit a ball" → ジョンは / 打った / ボールを via the translation model, then reordered to ジョンは ボールを 打った via the reordering model.)

[Koehn et al., 2003]

SLIDE 4

Tree-to-String Machine Translation

๏ Use the source language parse tree in translation.

  • High translation accuracy on language pairs with different word order.
  • Translation accuracy is greatly affected by parser accuracy.

(Figure: parse tree of "John hit a ball" — S(NP0(NN John), VP(VBD hit, NP1(DT a, NN ball))) — with the tree-to-string rule x0:NP0 は x1:NP1 を 打った.)

[Liu et al., 2006]
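To make the mechanism concrete, here is a minimal Python sketch (with assumed data structures, not the actual decoder's implementation) of how a tree-to-string rule like the one above matches a parse tree and fills its target-side variables:

```python
# Illustrative sketch of how a tree-to-string rule fires
# (assumed data structures, not a real decoder).

# A tree node is (label, [children]); leaves are plain word strings.
tree = ("S", [
    ("NP0", [("NN", ["John"])]),
    ("VP", [("VBD", ["hit"]),
            ("NP1", [("DT", ["a"]), ("NN", ["ball"])])]),
])

def match(pattern, node, bindings):
    """Match a tree pattern; 'x0:NP0' binds a whole subtree to x0."""
    if isinstance(pattern, str):
        if ":" in pattern:                      # variable, e.g. "x0:NP0"
            var, label = pattern.split(":")
            if isinstance(node, tuple) and node[0] == label:
                bindings[var] = node
                return True
            return False
        return node == pattern                  # terminal word
    label, children = pattern
    return (isinstance(node, tuple) and node[0] == label
            and len(children) == len(node[1])
            and all(match(p, c, bindings)
                    for p, c in zip(children, node[1])))

# The rule from the figure: S(x0:NP0, VP(VBD(hit), x1:NP1))
#   -> x0 は x1 を 打った
pattern = ("S", ["x0:NP0", ("VP", [("VBD", ["hit"]), "x1:NP1"])])
target = ["x0", "は", "x1", "を", "打った"]

bindings = {}
if match(pattern, tree, bindings):
    # A real decoder would translate the bound subtrees recursively;
    # here we just show which subtree fills each target-side slot.
    print([bindings.get(t, t) for t in target])
```

A real system searches over many competing rules and translates the bound subtrees (x0, x1) recursively; this sketch only shows the matching step.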

SLIDE 5

Forest-to-String Machine Translation

๏ Use the source language parse forest in translation.

  • The decoder can choose, from the parse tree candidates, a tree with high translation probability.

(Figure: source language parse forest → Forest-to-String decoder → target language sentence.)

[Mi et al., 2008] [Zhang et al., 2012]

SLIDE 6

Parser Self-Training

๏ Use the parser output as training data [McClosky et al., 2006].

(Figure: input sentence → Parser → parse tree → used as training data.)

๏ Improves parser accuracy.

  • The parser is adapted to the target domain.
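As a minimal sketch, plain self-training is the following loop (assuming a hypothetical `Parser` object with `parse` and `train` methods; the experiments in this talk use the Egret parser):

```python
# Minimal sketch of plain self-training (assumed Parser interface).

def self_train(parser, treebank, unlabeled_sentences):
    """One round of self-training [McClosky et al., 2006]."""
    # 1. Parse unlabeled in-domain sentences with the current parser.
    auto_trees = [parser.parse(s) for s in unlabeled_sentences]
    # 2. Retrain on the original treebank plus the parser's own output,
    #    adapting the parser to the target domain.
    parser.train(treebank + auto_trees)
    return parser
```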
SLIDE 7

Self-Training for Preordering

๏ Selecting which parse trees to use makes self-training more effective (targeted self-training) [Katz-Brown et al., 2011].

  • Use only high-scoring parse trees.
  • However, this method needs hand-aligned data.
  • Hand-aligned data is costly to create.

(Figure: input sentence → Parser → candidate preordering parse trees → evaluation against correct preordering data → high-scoring parse tree used as training data.)

SLIDE 8

Proposed Method

SLIDE 9

Proposed Method

๏ Targeted self-training using MT automatic evaluation metrics.

  • Low-cost and accurate evaluation.

(Figure: input sentence → Parser → parse forest → Forest-to-String decoder → translated sentence and the parse tree used in translation → evaluation with MT automatic evaluation metrics → high-scoring parse tree used as training data.)
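A rough sketch of one pass of this pipeline, assuming hypothetical `parser`, `decoder`, and `bleu_plus_one` interfaces (the experiments use Egret, Travatar, and BLEU+1; the n-best size is an assumed value):

```python
# Sketch of the proposed targeted self-training pipeline
# (interfaces are assumed, not the authors' code).

def collect_training_trees(parser, decoder, bitext, threshold=0.8):
    """Select (sentence, tree) pairs whose oracle translation scores
    at least `threshold` (0.8 is one of the thresholds explored later)."""
    selected = []
    for source, reference in bitext:
        forest = parser.parse_forest(source)
        # Each candidate carries a translation and the tree it used.
        candidates = decoder.translate_nbest(forest, n=100)  # assumed API
        oracle = max(candidates,
                     key=lambda c: bleu_plus_one(c.translation, reference))
        if bleu_plus_one(oracle.translation, reference) >= threshold:
            selected.append((source, oracle.tree))
    return selected
```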

SLIDE 10

Selection Methods

๏ Sentence selection

  • Select the sentences to use from the entire corpus.

๏ Parse tree selection

  • Select the parse tree to use from a single sentence.

(Figure: several sentences → sentence selection → sentences to be used; one sentence → several parse tree candidates → parse tree selection → high-scoring parse tree.)

SLIDE 11

Parse Tree Selection

๏ Parser 1-best

  • Use the parser's 1-best tree.
  • Traditional self-training [McClosky et al., 2006].

๏ Decoder 1-best

  • Use the parse tree used in translation.

๏ Evaluation 1-best

  • Among the translation candidates, use the parse tree used in the highest-scoring translation.
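The three strategies differ only in which candidate tree is kept per sentence; a compact sketch under the same assumed interfaces as the pipeline sketch above:

```python
# The three parse tree selection strategies side by side (illustrative).

def parser_1best(parser, source):
    # Traditional self-training: keep the parser's own best tree.
    return parser.parse(source)

def decoder_1best(parser, decoder, source):
    # Keep the tree behind the decoder's best-scoring translation.
    forest = parser.parse_forest(source)
    return decoder.translate(forest).tree

def evaluation_1best(parser, decoder, source, reference):
    # Keep the tree behind the candidate translation that scores
    # highest against the reference (the "oracle" translation).
    forest = parser.parse_forest(source)
    candidates = decoder.translate_nbest(forest, n=100)  # assumed API
    oracle = max(candidates,
                 key=lambda c: bleu_plus_one(c.translation, reference))
    return oracle.tree
```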

SLIDE 12

Decoder 1-best

๏ Decoder 1-best

  • Use the parse tree used in translation.

(Figure: input sentence → Parser → parse forest → Forest-to-String decoder → translated sentence and the parse tree used in translation.)

SLIDE 13

Evaluation 1-best

๏ Evaluation 1-best

  • Among the translation candidates, use the parse tree used in the highest-scoring translation.
  • This highest-scoring translation is called the oracle translation.

(Figure: input sentence → Parser → parse forest → Forest-to-String decoder → translation and parse tree candidates → automatic evaluation → high-scoring translation and parse tree (oracle translation).)
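The automatic metric used in the experiments is BLEU+1, i.e. sentence-level BLEU with add-one smoothing of the n-gram precisions (commonly attributed to Lin and Och, 2004). A simplified sketch; implementations differ in details such as whether the unigram precision is also smoothed:

```python
import math
from collections import Counter

def bleu_plus_one(hypothesis, reference, max_n=4):
    """Sentence-level BLEU with add-one smoothed n-gram precisions."""
    hyp, ref = hypothesis.split(), reference.split()
    log_prec = 0.0
    for n in range(1, max_n + 1):
        hyp_ngrams = Counter(tuple(hyp[i:i + n])
                             for i in range(len(hyp) - n + 1))
        ref_ngrams = Counter(tuple(ref[i:i + n])
                             for i in range(len(ref) - n + 1))
        # Clipped n-gram matches, as in corpus BLEU.
        matches = sum(min(c, ref_ngrams[g]) for g, c in hyp_ngrams.items())
        total = max(len(hyp) - n + 1, 0)
        # Add-one smoothing keeps the sentence score nonzero even when
        # some n-gram order has no matches.
        log_prec += math.log((matches + 1) / (total + 1))
    brevity = min(1.0, math.exp(1 - len(ref) / max(len(hyp), 1)))
    return brevity * math.exp(log_prec / max_n)
```

For example, `bleu_plus_one("john hit a ball", "john hit a ball")` returns 1.0, while partial matches yield scores between 0 and 1.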

SLIDE 14

Sentence Selection

๏ Random

  • Select sentences randomly from the corpus.
  • Traditional self-training.

๏ Threshold of the evaluation score

  • Use sentences that score over the threshold.

๏ Gain of the evaluation score

  • Use sentences that have a large gain in score between the decoder 1-best and the oracle translation.
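The two non-random criteria reduce to simple filters over per-sentence scores; a minimal sketch (the 0.1 minimum gain is an assumed value for illustration; the appendix slides explore thresholds of 0.7 to 0.9):

```python
# The two non-random sentence selection criteria (illustrative sketch;
# the score lists hold per-sentence BLEU+1 values).

def select_by_threshold(sentences, oracle_scores, threshold=0.8):
    """Keep sentences whose oracle translation scores over a threshold."""
    return [s for s, score in zip(sentences, oracle_scores)
            if score >= threshold]

def select_by_gain(sentences, decoder_scores, oracle_scores, min_gain=0.1):
    """Keep sentences where the oracle translation improves most over
    the decoder 1-best, i.e. where choosing a better tree matters.
    The 0.1 minimum gain is a hypothetical value."""
    return [s for s, d, o in zip(sentences, decoder_scores, oracle_scores)
            if o - d >= min_gain]
```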

SLIDE 15

Threshold of the Evaluation Score

๏ Threshold of the evaluation score

  • Use sentences that score over the threshold.

(Figure: oracle translations and parse trees → selection based on the score: Score ≧ Threshold → use; Score < Threshold → do not use.)

SLIDE 16

Gain of the Evaluation Score

๏ Gain of the evaluation score

  • Use sentences that have a large gain in score between the decoder 1-best and the oracle translation.

(Figure: decoder 1-best translation and parse tree vs. oracle translation and parse tree → selection based on the gain of the score: large gain → use; small gain → do not use.)

SLIDE 17

Experiments

SLIDE 18

Experimental Setup (for Self-Training)

๏ Setup:

  • Parser: Egret (existing model trained on the Japanese Dependency Corpus, 7k trees)
  • Decoder: Travatar (Forest-to-String)
  • Evaluation: BLEU+1
  • Self-training data: ASPEC parallel corpus (2.0M sentence pairs, source and target languages)

SLIDE 19

Experimental Setup (for Evaluation)

In these experiments, we focused on Japanese-English and Japanese-Chinese translation.

  • Decoder training: ASPEC parallel corpus (2.0M sentence pairs).
  • Decoder dev/test: ASPEC (dev: 2k, test: 2k).

(Figure: the existing parser (Egret) and the self-trained parser (Egret) each feed the Forest-to-String decoder (Travatar), which produces the translated sentences to be evaluated.)

SLIDE 20

Experiment Results (Japanese-English Translation)

Tree Selection | Sentence Selection | Sentences (k) | BLEU  | RIBES
Baseline       | —                  | —             | 23.83 | 72.27
Parser 1-best  | Random             | 96            | 23.66 | 71.77
Decoder 1-best | Random             | 97            | 23.81 | 72.04
Oracle         | Random             | 97            | 23.93 | 72.09

SLIDE 21

Oracle Translation Score Distribution

  • The oracle translations contain a lot of noisy sentences.

(Figure: histogram of oracle translation scores; x-axis: BLEU+1 score, 0.0 to 0.9; y-axis: number of sentences, up to 25k.)

SLIDE 22

Experiment Results (Japanese-English Translation)

๏ Self-training significantly improved translation accuracy.

Tree Selection | Sentence Selection | Sentences (k) | BLEU    | RIBES
Baseline       | —                  | —             | 23.83   | 72.27
Parser 1-best  | Random             | 96            | 23.66   | 71.77
Decoder 1-best | Random             | 97            | 23.81   | 72.04
Oracle         | Random             | 97            | 23.93   | 72.09
Oracle         | BLEU+1 Threshold   | 120           | 24.26** | 72.38
Oracle         | BLEU+1 Gain        | 100           | 24.22*  | 72.32

** : p < 0.01, * : p < 0.05

SLIDE 23

Manual Evaluation

Tree selection | Sentence selection | Score | Significant vs. Baseline | Significant vs. Parser 1-best
Baseline       | —                  | 2.38  | —                        | —
Parser 1-best  | Random             | 2.42  | No                       | —
Oracle         | BLEU+1 Threshold   | 2.50  | Yes (99% level)          | Yes (90% level)

Score range is 1 to 5.

๏ The manual evaluation confirms that our method is effective.

SLIDE 24

Example of an Improvement

Source: C投与群ではRの活動を240分にわたって明らかに増強した
Reference: in the C - administered group, thermal reaction clearly increased the activity of R for 240 minutes.
Baseline: for 240 minutes clearly enhanced the activity of C administration group R.
Self-Trained: for 240 minutes clearly enhanced the activity of R in the C - administration group.

SLIDE 25

Before Self-Training

(Figure: parse tree of the source fragment 「C投与群ではRの活動を」 before self-training; node labels include PP, NP, P, N, SYMP, AUX_SYMP, SYM, AUX_SYM, AUX_VP; glosses: administered group / in / TOP / activity / OBJ.)
SLIDE 26

After Self-Training

(Figure: parse tree of the same fragment 「C投与群ではRの活動を」 after self-training; node labels include VP, PP, NP, N, SYM, P, ADV; glosses: administered group / in / TOP / activity / OBJ.)

SLIDE 27

Experiment Results (Japanese-Chinese Translation)

๏ Self-training significantly improved translation accuracy.
๏ Using the ja-en self-trained model also improved accuracy.

Tree Selection | Sentence Selection | Sentences (k) | BLEU  | RIBES
Baseline       | —                  | —             | 29.60 | 81.32
Parser 1-best  | Random             | 129           | 29.75 | 81.55
Decoder 1-best | Random             | 130           | 29.76 | 81.53
Oracle         | Random             | 130           | 29.89 | 81.66
Oracle         | BLEU+1 Threshold   | 82            | 29.86 | 81.60
Oracle         | BLEU+1 Gain        | 100           | 29.85 | 81.59
Oracle (ja-en) | BLEU+1 Threshold   | 120           | 29.87 | 81.58

** : p < 0.01, * : p < 0.05


SLIDE 29

Parser Accuracy

SLIDE 30

Experimental Setup

๏ Evalb: a tool for scoring parsing accuracy, based on [Collins, 1997].
๏ We test the Ja-En parsers on 100 manually annotated trees.

(Figure: test sentences → Parser → Evalb.)

SLIDE 31

Experiment Results

Tree selection | Sentence selection | Recall | Precision | F-Measure
Baseline       | —                  | 84.88  | 84.77     | 84.83
Parser 1-best  | Random             | 86.52  | 86.41     | 86.46*
Oracle         | BLEU+1 Threshold   | 88.13  | 88.01     | 88.07**

** : p < 0.01, * : p < 0.05

๏ Our method improves not only MT results, but also parser accuracy itself.
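Evalb-style scores compare the labeled constituent spans of the test tree against the gold tree; a rough sketch of the idea (the real Evalb tool additionally handles details such as punctuation and the root bracket):

```python
from collections import Counter

def brackets(tree, start=0):
    """Collect (label, start, end) spans from a (label, children) tree;
    leaves are plain word strings."""
    label, children = tree
    spans, pos = [], start
    for child in children:
        if isinstance(child, tuple):
            child_spans, pos = brackets(child, pos)
            spans.extend(child_spans)
        else:
            pos += 1                      # a leaf consumes one word
    spans.append((label, start, pos))
    return spans, pos

def evalb_scores(gold_tree, test_tree):
    """Labeled bracketing precision/recall/F, in the spirit of Evalb."""
    gold = Counter(brackets(gold_tree)[0])
    test = Counter(brackets(test_tree)[0])
    correct = sum((gold & test).values())  # clipped span matches
    precision = correct / sum(test.values())
    recall = correct / sum(gold.values())
    f = 2 * precision * recall / (precision + recall)
    return precision, recall, f
```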

SLIDE 32

Conclusion

SLIDE 33

Conclusion

๏ Our proposed self-training method improved both translation and parser accuracy.

๏ Self-training does not rely on the target language.

  • Using the Ja-En self-trained model, Ja-Zh translation accuracy also improved.

๏ Future work

  • Verify that this method is applicable to other languages.
  • Self-training using data from several target languages.
  • Test the effect of performing parser self-training repeatedly.

SLIDE 34

END

SLIDE 35

Experiment Results (Japanese-English Translation)

Tree Selection | Sentence Selection | Sentences (k) | BLEU    | RIBES
Baseline       | —                  | —             | 23.83   | 72.27
Parser 1-best  | Random             | 96            | 23.66   | 71.77
Decoder 1-best | Random             | 97            | 23.81   | 72.04
Oracle         | Random             | 97            | 23.93   | 72.09
Oracle         | BLEU+1 ≧ 0.7       | 206           | 24.27** | 72.38
Oracle         | BLEU+1 ≧ 0.8       | 120           | 24.26** | 72.38
Oracle         | BLEU+1 ≧ 0.9       | 58            | 24.26** | 72.49
Oracle         | BLEU+1 Gain        | 100           | 24.22*  | 72.32

** : p < 0.01, * : p < 0.05

SLIDE 36

Experiment Results (Japanese-Chinese Translation)

Tree Selection | Sentence Selection | Sentences (k) | BLEU  | RIBES
Baseline       | —                  | —             | 29.60 | 81.32
Parser 1-best  | Random             | 129           | 29.75 | 81.55
Decoder 1-best | Random             | 130           | 29.76 | 81.53
Oracle         | Random             | 130           | 29.89 | 81.66
Oracle         | BLEU+1 ≧ 0.7       | 240           | 29.86 | 81.60
Oracle         | BLEU+1 ≧ 0.8       | 150           | 29.91 | 81.47
Oracle         | BLEU+1 ≧ 0.9       | 82            | 29.86 | 81.60
Oracle         | BLEU+1 Gain        | 100           | 29.85 | 81.59
Oracle (ja-en) | BLEU+1 ≧ 0.8       | 120           | 29.87 | 81.58

** : p < 0.01, * : p < 0.05

SLIDE 37

Why is the decoder 1-best parse tree better than the parser 1-best?

๏ Probabilities considered in Forest-to-String translation:

  • Parse tree probability
  • Translation model
  • Language model

๏ Rules that use the correct tree have high translation model probability.

  • Rules that use an incorrect tree have low probability.

๏ With the language model, the correct parse tree tends to be chosen.

  • The correct tree has high language model probability.
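A toy numerical illustration of this point: in a log-linear combination of the three scores, a tree the parser slightly prefers can lose to the correct tree once translation and language model probabilities are taken into account (all weights and probabilities below are made-up values):

```python
import math

# Toy log-linear score for a (tree, translation) pair, combining the
# three probabilities listed above. Weights are assumed values.
WEIGHTS = {"parse": 0.3, "tm": 0.4, "lm": 0.3}

def decoder_score(p_parse, p_tm, p_lm):
    """Weighted sum of log probabilities: a tree with a slightly lower
    parse probability can still win if it enables better TM/LM scores."""
    return (WEIGHTS["parse"] * math.log(p_parse)
            + WEIGHTS["tm"] * math.log(p_tm)
            + WEIGHTS["lm"] * math.log(p_lm))

# The parser prefers the wrong tree, but the correct tree licenses
# rules and output that the TM and LM score much higher, so the
# decoder ends up choosing the correct tree.
wrong   = decoder_score(p_parse=0.6, p_tm=0.02, p_lm=0.01)  # ≈ -3.10
correct = decoder_score(p_parse=0.4, p_tm=0.10, p_lm=0.05)  # ≈ -2.10
assert correct > wrong
```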