Statistical Machine Translation
Josef van Genabith DFKI GmbH Josef.van_Genabith@dfki.de
Language Technology II SS 2014
May 13th, 2014
With some additional slides from Chris Dyer (MT Marathon 2011) and Sabine Hunsicker (LT SS 2012)
Overview
Language Technology II (SS 2014): Statistical Machine Translation 2 Josef.van_Genabith@dfki.de
$$P(e \mid f) = \frac{P(e, f)}{P(f)}$$
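The definition of conditional probability, P(e | f) = P(e, f) / P(f), can be checked on a small table. The sketch below is only an illustration: the sentence pairs and their joint probabilities are invented, and the helper name `conditional` is hypothetical.

```python
# Hypothetical joint probabilities P(e, f) over toy (English, German) pairs.
# The numbers are made up for illustration and sum to 1.
joint = {
    ("the house", "das Haus"): 0.30,
    ("the home", "das Haus"): 0.10,
    ("the home", "das Heim"): 0.55,
    ("the house", "das Heim"): 0.05,
}


def conditional(joint, f):
    """Return P(e | f) = P(e, f) / P(f) for every e that occurs with f."""
    # Marginal P(f): sum the joint over all e paired with this f.
    p_f = sum(p for (_, f2), p in joint.items() if f2 == f)
    if p_f == 0:
        raise ValueError(f"P(f) is zero for f={f!r}")
    return {e: p / p_f for (e, f2), p in joint.items() if f2 == f}


posterior = conditional(joint, "das Haus")
for e, p in posterior.items():
    print(f"P({e!r} | 'das Haus') = {p:.2f}")
```

With these toy numbers, P(f) for "das Haus" is 0.40, so "the house" receives roughly three quarters of the posterior mass and "the home" the rest.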
$$P(e \mid f) = \frac{P(f \mid e) \times P(e)}{P(f)}$$
$$\hat{e} = \arg\max_{e} P(e \mid f) = \arg\max_{e} \frac{P(f \mid e) \times P(e)}{P(f)}$$
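For a given observed sentence f, the denominator P(f) is the same for every candidate e, so it does not change which e attains the maximum. A short derivation of that step, written with e for the candidate translation and f for the observed sentence:

```latex
\begin{align*}
\hat{e} &= \arg\max_{e} P(e \mid f) \\
        &= \arg\max_{e} \frac{P(f \mid e)\, P(e)}{P(f)} \\
        &= \arg\max_{e} P(f \mid e)\, P(e)
        \qquad \text{since } P(f) \text{ does not depend on } e
\end{align*}
```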
Source e, P(e) → Channel P(f | e) → Observed f
What is the most likely e?
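The source-channel picture can be turned into a toy decoder: score each candidate e by P(f | e) × P(e) and keep the best one. The sketch below is a minimal illustration, not a real SMT decoder. The candidate sentences, all probabilities, and the names `language_model`, `translation_model` and `decode` are assumptions made up for this example, and the candidate set is enumerated by hand rather than searched.

```python
import math

# Hypothetical source model P(e): how plausible is e as an English sentence?
language_model = {
    "the house is small": 0.02,
    "small the house is": 0.0001,
    "the home is little": 0.005,
}

# Hypothetical channel model P(f | e) for a single observed sentence f.
translation_model = {
    ("das Haus ist klein", "the house is small"): 0.30,
    ("das Haus ist klein", "small the house is"): 0.30,
    ("das Haus ist klein", "the home is little"): 0.10,
}


def decode(f, candidates):
    """Return the candidate e maximising P(f | e) * P(e), scored in log space."""
    def score(e):
        p_channel = translation_model.get((f, e), 0.0)
        p_source = language_model.get(e, 0.0)
        if p_channel == 0.0 or p_source == 0.0:
            return float("-inf")  # unseen events get the worst possible score
        return math.log(p_channel) + math.log(p_source)

    return max(candidates, key=score)


best = decode("das Haus ist klein", list(language_model))
print(best)
```

In this toy setup the channel model alone cannot separate the first two candidates (both have P(f | e) = 0.30); the source model P(e) is what prefers the fluent word order.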
Slide: Chris Dyer, MT Marathon 2011
Language Technology II (SS 2012): Machine Translation 40 sabine.hunsicker@dfki.de
[Figure: MT between L1 and L2, L3, … Ln, and between L1 and L2]
Publishable quality can only be authored by humans; translation memories and CAT tools are mandatory for professional translators
Daily throughput of > 500 M words
Topic of many running and completed research projects (VerbMobil, TC Star, TransTac, …)
The US military uses systems for spoken MT
…for understanding the source text
…for generating well-formed target text
syntactic disambiguation (POS, attachments)
semantic disambiguation (collocations, scope, word sense)
reference resolution
lexical choice in target language
application-specific terminology, register, connotations, good style
…
“statistical semantic studies”
Language Weaver, aixplain GmbH, Linear B Ltd., esteam, Google Labs