Microsoft Translator at WMT 2019
Research · Production · Shared task
System                   newstest2016  newstest2017
WMT18                            33.9          29.0
Random                           16.2          14.1
LangID+Random                    26.6          23.3
LangID+Adeq                      35.1          30.2
  Ablation: no LangID            15.4          12.7
  Ablation: no AbsDiff           33.8          29.3
  Ablation: no CE-Weight         31.7          27.4
LangID+Adeq+Dom                  36.0          31.0
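The LangID+Adeq rows and the AbsDiff / CE-Weight ablations match the ingredients of dual conditional cross-entropy filtering (Junczys-Dowmunt, 2018). Below is a minimal sketch of how such a score could be combined with a LangID gate; the function names, the toy language identifier, and the threshold are illustrative assumptions, not the exact recipe behind these numbers.

```python
import math

def langid_of(text: str) -> str:
    # Hypothetical stand-in: a real pipeline would call a trained
    # language identifier (e.g. fastText's lid.176 model) here.
    return "de" if any(c in "äöüß" for c in text) else "en"

def adequacy_score(h_fwd: float, h_bwd: float) -> float:
    """Dual conditional cross-entropy score for one sentence pair.

    h_fwd / h_bwd are word-normalized cross-entropies of the pair under
    a src->tgt and a tgt->src model. The absolute-difference term is
    what the "no AbsDiff" ablation removes; the average cross-entropy
    is what the "no CE-Weight" ablation removes. Higher is better.
    """
    abs_diff = abs(h_fwd - h_bwd)
    ce_weight = 0.5 * (h_fwd + h_bwd)
    return math.exp(-(abs_diff + ce_weight))

def keep_pair(src: str, tgt: str, h_fwd: float, h_bwd: float,
              threshold: float = 0.3) -> bool:
    # LangID gate first (dropped in the "no LangID" ablation), then the
    # adequacy threshold. The threshold value is illustrative only.
    if langid_of(src) != "de" or langid_of(tgt) != "en":
        return False
    return adequacy_score(h_fwd, h_bwd) >= threshold
```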
System                     newstest2016  newstest2017  newstest2018
WMT18-Microsoft                    38.6          31.3          46.5
WMT18-FAIR                            -          32.7          44.9
WMT19-baseline                     37.7          30.3          46.5
 + data-filtering                  38.3          31.1          46.6
 + noisy back-translation          38.9          32.8          46.3
 + fine-tuning                     40.6          33.6          48.9
System                       en    de  both
WMT18-Microsoft            41.1  35.5  39.1
WMT18-FAIR                    -     -     -
WMT19-baseline             41.8  32.5  38.2
 + data-filtering          41.7  34.0  39.0
 + noisy back-translation  38.9  40.4  39.7
 + fine-tuning             42.2  39.2  41.2
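Both tables include a "+ noisy back-translation" step. Below is a minimal sketch of the kind of source-side noise used in noisy back-translation (in the spirit of Edunov et al., 2018): word dropout, filler-token replacement, and local shuffling. The noise types and probabilities are illustrative defaults, not the settings behind these scores.

```python
import random

def add_noise(tokens, drop_prob=0.1, blank_prob=0.1, max_shuffle_distance=3):
    """Noise a back-translated source sentence: randomly drop words,
    replace words with a filler token, and locally shuffle word order."""
    noised = []
    for tok in tokens:
        r = random.random()
        if r < drop_prob:
            continue                    # drop the token entirely
        if r < drop_prob + blank_prob:
            noised.append("<BLANK>")    # replace with a filler token
        else:
            noised.append(tok)
    # Local shuffle: sort by original index plus a bounded random offset,
    # so tokens move at most a few positions.
    keys = [i + random.uniform(0, max_shuffle_distance)
            for i in range(len(noised))]
    return [tok for _, tok in sorted(zip(keys, noised))]

print(add_noise("der neue Polizeichef besuchte die Universität".split()))
```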
BPE                                    SentencePiece
Poli@@ zei | ch@@ ef                   ▁Polizei chef
ver@@ hän@@ g | nis@@ vollen           ▁ver h äng nis vollen
Universit@@ ä@@ t | s @-@ Mitarbeiter  ▁Universität s - Mitarbeiter
Schie@@ ß | en                         ▁Schieß en
be|su@@ cht | en                       ▁besucht en
auf@@ gere@@ g | t                     ▁a uf | gereg | t
Be@@ urlau@@ b | ung                   ▁Be urlaub ung
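For reference, a hedged sketch of how the two segmentation styles above can be produced with the sentencepiece and subword-nmt packages; the file names, vocabulary size, and the example splits in the comments are assumptions, not the settings used for these systems.

```python
import sentencepiece as spm
from subword_nmt.apply_bpe import BPE

# SentencePiece: trained on raw text; '▁' marks a preceding space.
spm.SentencePieceTrainer.train(
    input="train.de", model_prefix="spm_de", vocab_size=32000)
sp = spm.SentencePieceProcessor(model_file="spm_de.model")
print(sp.encode("Polizeichef", out_type=str))   # e.g. ['▁Polizei', 'chef']

# subword-nmt BPE: applied to pre-tokenized text; '@@ ' marks a split
# inside a word (the codes file comes from learn_bpe beforehand).
with open("bpe.codes", encoding="utf-8") as codes:
    bpe = BPE(codes)
print(bpe.process_line("Polizeichef"))          # e.g. 'Poli@@ zei@@ chef'
```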
• Microsoft Translator at WMT 2019: Towards Large-Scale Document-Level Neural Machine Translation
  https://arxiv.org/abs/1907.06170
• Improving Deep Transformer with Depth-Scaled Initialization and Merged Attention
  https://arxiv.org/abs/1908.11365
• On the Evaluation of Machine Translation Systems Trained with Back-Translation
  https://arxiv.org/abs/1908.05204
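The depth-scaled initialization paper above shrinks the initialization of deeper sub-layers so activation variance stays stable as the network grows. Below is a minimal PyTorch sketch of that idea; the 1/sqrt(depth) scaling and the helper name are assumptions standing in for the paper's precise formulation.

```python
import math
import torch.nn as nn

def ds_init_(linear: nn.Linear, depth: int, gain: float = 1.0) -> None:
    """Depth-scaled initialization sketch: shrink the usual Xavier
    initialization of a sub-layer at depth l by 1/sqrt(l)."""
    nn.init.xavier_uniform_(linear.weight, gain=gain / math.sqrt(depth))
    if linear.bias is not None:
        nn.init.zeros_(linear.bias)

# Usage: scale each layer's projection by its depth index l (1-based).
layers = [nn.Linear(1024, 1024) for _ in range(12)]
for l, layer in enumerate(layers, start=1):
    ds_init_(layer, depth=l)
```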
Model       Parameters  Layers  Dim
BERT/GPT-2  117M        12      768/4096
BERT/GPT-2  345M        24      1024/4096
GPT-2       762M        36      1280/4096
GPT-2       1542M       48      1600/4096

Model                      Parameters      Layers     Dim
Nematus RNN                25M (95MB)      1/1 (2/2)  512/1024
Transformer (Base)         30M (117MB)     6/6        512/2048
Transformer (Big)          209M (790MB)    6/6        1024/4096
Transformer (Bigger)       386M (1,471MB)  12/12      1024/4096
Transformer (Even Bigger)  570M            18/18      1024/4096
Transformer (Biggest)      750M            24/24      1024/4096
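A rough back-of-the-envelope count shows how these parameter totals scale with depth and width; the vocabulary size and tied embeddings below are assumptions, and biases and layer-norm parameters are ignored.

```python
def transformer_params(enc_layers, dec_layers, d_model, d_ffn,
                       vocab=32000, tied_embeddings=True):
    """Rough parameter count for a vanilla Transformer."""
    attn = 4 * d_model * d_model          # Q, K, V and output projections
    ffn = 2 * d_model * d_ffn             # two feed-forward matrices
    enc = enc_layers * (attn + ffn)       # self-attention + FFN per layer
    dec = dec_layers * (2 * attn + ffn)   # self- and cross-attention + FFN
    emb = vocab * d_model * (1 if tied_embeddings else 3)
    return enc + dec + emb

# Transformer (Big): 6/6 layers, 1024/4096 dims -> ~209M parameters,
# matching the row in the table above.
print(f"{transformer_params(6, 6, 1024, 4096):,}")
```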
Microsoft.com/Translator
blogs.msdn.com/translator
twitter.com/MSTranslator
facebook.com/BingTranslator