Semi-supervised Learning for Neural Machine Translation Yong Cheng - - PowerPoint PPT Presentation



SLIDE 1

Semi-supervised Learning for Neural Machine Translation

Yong Cheng

joint work with Wei Xu, Zhongjun He, Wei He, Hua Wu, Maosong Sun, Yang Liu

SLIDE 2

Machine Translation


Automated translation using computer software

SLIDE 3

Machine Translation

Rule-based Machine Translation (1970s)
Example-based Machine Translation (1984)
Statistical Machine Translation (SMT) (1993)
Neural Machine Translation (NMT) (2014)

Trend: learning to translate from DATA

SLIDE 4

Machine Translation

Parallel Corpora vs. Monolingual Corpora

Parallel corpora are usually limited in quantity, quality, and coverage.

SLIDE 5

Monolingual Corpora Used in SMT and NMT

N-gram language model in SMT. Koehn et al. [2007]
Monolingual corpora as decipherment. Ravi and Knight [2011]
Integrate a neural language model into NMT. Gulcehre et al. [2015]
Additional pseudo-parallel corpus. Sennrich et al. [2016]

SLIDE 6

Supervised Training

Parallel Corpus → Objective: maximize the likelihood of target sentences given their sources.
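Concretely, supervised training is standard maximum-likelihood estimation on the parallel corpus; a sketch in the usual notation (the symbols are assumed here, the slide itself only shows the diagram):

```latex
% Supervised MLE objective on a parallel corpus {(x^{(n)}, y^{(n)})}_{n=1}^{N}
J(\theta) = \sum_{n=1}^{N} \log P\bigl(y^{(n)} \mid x^{(n)}; \theta\bigr)
```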

SLIDE 7

Unsupervised Training

Monolingual Corpus


SLIDE 8

Our Approach: Autoencoders

x: bushi yu shalong juxing le huitan

SLIDE 9

Our Approach: Autoencoders

x: bushi yu shalong juxing le huitan
P(y | x; θ→)

SLIDE 10

Our Approach: Autoencoders

x: bushi yu shalong juxing le huitan
y (latent): Bush held a talk with Sharon
P(y | x; θ→)

SLIDE 11

Our Approach: Autoencoders

x: bushi yu shalong juxing le huitan
y (latent): Bush held a talk with Sharon
P(y | x; θ→)   P(x | y; θ←)
(θ→: source-to-target model, θ←: target-to-source model)

SLIDE 12

Our Approach: Autoencoders

x: bushi yu shalong juxing le huitan
y (latent): Bush held a talk with Sharon
x′: bushi yu shalong juxing le huitan
P(y | x; θ→)   P(x | y; θ←)

SLIDE 13

Our Approach: Autoencoders

source autoencoder: x → y → x′
target autoencoder: y → x → y′
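The source autoencoder treats the target-language sentence as a latent variable: translate x into y with the source-to-target model, then reconstruct x from y with the target-to-source model. A sketch of the reconstruction probability, with θ→ and θ← for the two model directions (the marginalization over y is assumed from the latent-variable view on the slides):

```latex
P\bigl(x' \mid x; \overrightarrow{\theta}, \overleftarrow{\theta}\bigr)
  = \sum_{y} P\bigl(y \mid x; \overrightarrow{\theta}\bigr)\,
             P\bigl(x' \mid y; \overleftarrow{\theta}\bigr)
```

The target autoencoder is symmetric: translate y into a latent x with the target-to-source model, then reconstruct y with the source-to-target model.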

SLIDE 14

Unsupervised Training (Autoencoders)

Monolingual Corpus


target autoencoder
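The sum over all latent translations y is intractable, so it can be approximated with a few translations sampled from the source-to-target model. A minimal pure-Python sketch; the function names and the toy two-sample scheme are illustrative assumptions, not the authors' exact implementation:

```python
import math

def reconstruction_logprob(x, sample_translations, reconstruct_logprob):
    """Approximate log P(x' = x | x) = log sum_y P(y|x) P(x|y)
    using a small set of sampled latent translations y."""
    # sample_translations(x) -> list of (y, log P(y|x)) pairs
    samples = sample_translations(x)
    # each term: log [ P(y|x) * P(x|y) ]
    terms = [lp_y + reconstruct_logprob(x, y) for y, lp_y in samples]
    # numerically stable log-sum-exp over the sampled terms
    m = max(terms)
    return m + math.log(sum(math.exp(t - m) for t in terms))

# Toy models: two candidate latent translations with probabilities
# 0.6 and 0.4, and reconstruction probability 0.5 for either one.
def toy_sampler(x):
    return [("y1", math.log(0.6)), ("y2", math.log(0.4))]

def toy_reconstructor(x, y):
    return math.log(0.5)

lp = reconstruction_logprob("bushi yu shalong juxing le huitan",
                            toy_sampler, toy_reconstructor)
# reconstruction probability = 0.6*0.5 + 0.4*0.5 = 0.5
```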

SLIDE 15

Semi-supervised Training


Training Objective
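The semi-supervised objective combines the supervised likelihood on the parallel corpus with the two autoencoder terms on target and source monolingual data. A sketch, following the notation of the autoencoder slides (λ1 and λ2 are interpolation weights, an assumption about how the terms are balanced):

```latex
J(\overrightarrow{\theta}, \overleftarrow{\theta})
  = \sum_{n} \log P\bigl(y^{(n)} \mid x^{(n)}; \overrightarrow{\theta}\bigr)
  + \sum_{n} \log P\bigl(x^{(n)} \mid y^{(n)}; \overleftarrow{\theta}\bigr)
  + \lambda_{1} \sum_{t} \log P\bigl(x^{(t)} \mid x^{(t)};
      \overrightarrow{\theta}, \overleftarrow{\theta}\bigr)
  + \lambda_{2} \sum_{s} \log P\bigl(y^{(s)} \mid y^{(s)};
      \overrightarrow{\theta}, \overleftarrow{\theta}\bigr)
```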

SLIDE 16

Translation Results

Compared with Moses (SMT) and RNNSearch (NMT)



SLIDE 21

Translation Results

Compared with Sennrich et al. [2015a]


SLIDE 22

Example Translation of Monolingual Corpus


SLIDE 23

Conclusion

Monolingual corpora are an important resource for neural machine translation.

We have proposed a semi-supervised approach to training bidirectional neural machine translation models that exploits monolingual corpora.

As our method is sensitive to the OOVs present in monolingual corpora, we plan to integrate Jean et al. (2015)'s technique for using a very large vocabulary into our approach.

SLIDE 24

Thank You!

SLIDE 25

Effect of Sample Size

ZH-EN EN-ZH


SLIDE 26

Effect of OOV ratio

ZH-EN EN-ZH
