
10/26/2008

Large-scale deployment of statistical machine translation

Example: Microsoft

Chris.Wendt@microsoft.com
Microsoft Research – Machine Translation

Agenda

  • Microsoft MT engine basics
  • Architecture and design for scale
  • Translator in Practice
  • Microsoft internal use: Human Translation and Raw Publishing


Time Line

1991

  • Microsoft Research is founded, with NLP as one of its first research areas
  • NLP team is active in rule-based parsing and grammar checking

1996

  • Grammar Checker in Word ‘97

1999

  • Work on Machine Translation begins

2003

  • V1: First public visibility with the Microsoft Knowledge Base
  • Example-based system: V1 of Microsoft Translator

2005

  • V2: Switch to treelet systems for all from-English language pairs
  • The treelet system constitutes V2 of Microsoft Translator

2007

  • First consumer availability at http://translator.live.com
  • Mixed Systran and Microsoft Translator V2 deployment

2008

  • Adding phrasal systems for all to-English language pairs
  • http://translator.live.com powered exclusively by Microsoft's own systems

Microsoft’s Statistical MT Engine

[Architecture diagram] Two decoding paths share common components and models:

  • Common: HTML handling, sentence breaking, rule-based post processing, case restoration
  • Languages with a source parser: source language parser → syntactic tree based decoder
  • Other source languages: source language word breaker → surface string based decoder
  • Models: syntactic reordering model, contextual translation model, syntactic word insertion and deletion model, target language model, distance- and word-based reordering


Training Architecture

[Architecture diagram] From training data to runtime models:

  • Parallel data → source/target word breaking → word alignment
  • Source language parsing → treelet + syntactic structure extraction → syntactic models training: syntactic reordering model, contextual translation models, syntactic word insertion and deletion model; treelet table extraction
  • Phrase table extraction and surface reordering training → distance- and word-based reordering
  • Target language monolingual data → language model training → target language model
  • Case restoration model
  • Discriminative training of model weights

Runtime Architecture

[Architecture diagram] Requests flow from the Internet through traffic distribution to front door machines (#1…#n: user interface, sentence breaking), which distribute sentences across translators (#1…#n) backed by model servers (#1…#n). Watchdogs monitor, reset and restart each tier.
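As a rough illustration of the translator tier, a distributor might fan sentences out round-robin across translator leaves and route around leaves the watchdog has marked down. All names and the failure-handling policy here are assumptions for the sketch, not the production design:

```python
import itertools

class Distributor:
    """Toy sketch: round-robin distribution over translator leaves,
    skipping leaves that have been marked unhealthy."""

    def __init__(self, leaves):
        self.leaves = leaves                      # list of translator callables
        self.healthy = {i: True for i in range(len(leaves))}
        self._rr = itertools.cycle(range(len(leaves)))

    def translate(self, sentence):
        for _ in range(len(self.leaves)):         # try each position at most once
            i = next(self._rr)
            if not self.healthy[i]:
                continue
            try:
                return self.leaves[i](sentence)
            except Exception:
                self.healthy[i] = False           # a watchdog would reset/restart it
        raise RuntimeError("no healthy translator leaves")
```

In the real deployment the watchdog is a separate process that monitors, resets and restarts leaves; here the unhealthy flag simply stands in for that.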


Front Door

  • Microsoft Internet Information Server
  • Landing Page

– HTTP interface for Bilingual Viewer
– Fetches web page, performs sentence & HTML breaking, creates a marked-up version
– Sends page to client, asynchronously fills translation requests

  • Distributor

– SOAP API
– Distributes sentences to multiple leaves
– In-memory cache of sentence translations
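The distributor's in-memory cache of sentence translations might look like the following sketch. Class and method names are illustrative assumptions, not the actual implementation; the point is that a cache keyed by language pair and sentence lets repeated sentences skip the translator leaves entirely:

```python
import threading

class SentenceCache:
    """Sketch of an in-memory sentence-translation cache keyed by
    (source language, target language, sentence)."""

    def __init__(self):
        self._lock = threading.Lock()
        self._store = {}

    def get_or_translate(self, src, tgt, sentence, translate):
        key = (src, tgt, sentence)
        with self._lock:
            if key in self._store:
                return self._store[key]           # cache hit: no leaf involved
        result = translate(sentence)              # cache miss: forward to a leaf
        with self._lock:
            self._store[key] = result
        return result
```

A production cache would also need eviction (e.g. LRU) to bound memory; that is omitted here.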

Automatic evaluation: BLEU

  • A fully automated MT evaluation metric

– Modified n-gram precision, comparing a test sentence to reference sentences

  • Automatic and cheap: runs daily and for every check-in
  • Standard in the MT community

– Immediate, simple to administer
– Shown to correlate with human judgments

  • Warning: does not compare between engines or between languages.
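The clipping at the heart of modified n-gram precision can be sketched as follows. This is a simplified illustration, not the full BLEU score, which combines several n-gram orders and a brevity penalty:

```python
from collections import Counter

def modified_ngram_precision(candidate, references, n):
    """Modified n-gram precision: each candidate n-gram is credited at most
    as many times as it occurs in any single reference (clipping)."""
    def ngrams(tokens, n):
        return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

    cand_counts = ngrams(candidate, n)

    # Clip each candidate n-gram count by its maximum count in any reference.
    max_ref_counts = Counter()
    for ref in references:
        for gram, count in ngrams(ref, n).items():
            max_ref_counts[gram] = max(max_ref_counts[gram], count)

    clipped = sum(min(count, max_ref_counts[gram])
                  for gram, count in cand_counts.items())
    total = sum(cand_counts.values())
    return clipped / total if total else 0.0
```

Clipping is what penalizes a degenerate candidate like "the the the the the the the": against the reference "the cat is on the mat" its unigram precision is 2/7, not 7/7.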


Human evaluations

  • 3 to 5 independent human evaluators are asked to rank translation quality for 250 sentences on a scale of 1 to 4

– Comparing to a human-translated sentence
– No source language knowledge required

  • Rating scale:

– 4 = Ideal: grammatically correct, all information included
– 3 = Acceptable: not perfect, but definitely comprehensible, and with accurate transfer of all important information
– 2 = Possibly Acceptable: may be interpretable given context/time, some information transferred accurately
– 1 = Unacceptable: absolutely not comprehensible and/or little or no information transferred accurately

  • Each sentence is evaluated by all raters, and scores are averaged
  • Relative evaluations

– Track progress against ourselves and a competitor

Language pairs on translator.live.com

  • en_es 18%
  • en_de 15%
  • en_pt 13%
  • en_zh-chs 7%
  • en_fr 6%
  • es_en 6%
  • pt_en 4%
  • en_it 4%
  • en_ar 3%
  • en_ja 3%
  • en_zh-cht 3%
  • en_ko 3%
  • de_en 2%
  • Other: remaining pairs

The fact to note in this distribution is the relative popularity of the English>German language pair among consumers, in contrast to the lack of popularity for this language pair among the technical audience.


Products

  • Bilingual Viewer

– Used by Live Search results page

  • Translator landing page
  • Toolbar Translator Button
  • Translator Add-in for 3rd party pages
  • Internet Explorer 8 accelerator
  • Community built Firefox version
  • Translator Bot (mtbot@hotmail.com)
  • Office Research Pane
  • SOAP API for product team use
  • Microsoft Localization

– CSS KB, MSDN, TechNet, Products

Two ways to apply MT in a product

  • Post-Editing

– Increase the human translator's productivity
– In practice: 0% to 25% productivity increase
– Varies by content, style and language

  • Raw publishing

– Publish the output of the MT system directly to the end user
– Best with bilingual UI
– Good results with IT Pro and Developer audiences

→ Increasing the extent of localization


MT with post-editing

[Diagram] Source text is first matched against the translation memory (TM): segments with a >85% fuzzy match reuse the TM translation, MT is applied to the rest, and human editing produces the final target.
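The routing above can be sketched as follows. This is a hypothetical illustration using difflib's similarity ratio as the fuzzy-match score; real TM tools use their own edit-distance-based scoring, and the function names are assumptions:

```python
import difflib

TM_THRESHOLD = 0.85  # "apply TM on >85% match"

def route(sentence, memory, mt_translate):
    """Route a sentence: reuse the best TM translation above the fuzzy-match
    threshold, otherwise fall back to MT (whose output then goes to a
    human post-editor)."""
    best_score, best_target = 0.0, None
    for source, target in memory.items():
        score = difflib.SequenceMatcher(None, sentence, source).ratio()
        if score > best_score:
            best_score, best_target = score, target
    if best_score > TM_THRESHOLD:
        return best_target, "TM"
    return mt_translate(sentence), "MT"
```

The same routine with the threshold raised to an exact (100%) match corresponds to the raw-publishing pipeline shown later, where no human editing step follows.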

Product 1: Post-editing results (without specific post-editor training)

[Chart: productivity gain/loss per language — French, Italian, German, Spanish, Chinese Simplified, Chinese Traditional, Japanese]


Product 2: Post-Editing Results A couple of weeks later: with training

Productivity gain by language:

  • French: 14.5%
  • Brazilian: 20.0%
  • Swedish: 8%
  • Danish: 28.6%
  • Czech: 6.1%
  • Dutch: 14.7%

Product 3: Post-editing Results With training

[Chart: productivity gain/loss per language — French, Italian, German, Spanish, Chinese Simplified, Chinese Traditional, Japanese, Brazilian]


Post-Editing: Lessons Learned

  • Training of the translator is required

– Understand the peculiarities of the engine used
– Always read the source sentence first
– Understand when to discard the MT

  • "Two seconds is too much"
  • Acknowledge different suitability for different style and terminology
  • Customize terminology per individual project – use of dictionary
  • Productivity gains of 5% to 25% are achievable, but investment is required

Raw MT Publishing

[Diagram] Source text is matched against the translation memory (TM): segments with a 100% match reuse the TM translation, MT is applied to the rest, and the output is published directly as the target.



History of MT in Customer Support

  • Since 2003, CSS has been actively using Machine Translation for Knowledge Base articles

– Spanish was the first language deployed
– Japanese went live one year later

  • Current languages

– 10 languages deployed: Spanish, German, French, Italian, Japanese, Portuguese, Brazilian Portuguese, Chinese Simplified, Chinese Traditional, Arabic
– 3 languages in testing: Korean, Turkish and Russian

Microsoft Knowledge Base

Articles human translated, or originally authored in the language:

  Language                  Articles    % of English
  English                    235,425    100%
  Japanese                    70,684     27%
  French                      35,310     14%
  German                      30,459     12%
  Spanish                     16,980      7%
  Italian                     14,401      6%
  Chinese (Simplified)        12,873      5%
  Chinese (Traditional)       10,372      4%
  Portuguese (Brazil)         10,205      4%
  Portuguese (Iberian)         7,129      3%
  Arabic                       2,152      1%

MT & HT distribution across languages

Traffic to the knowledge base is fairly unevenly distributed. By targeting human translation to the high page view articles, 80% of the Japanese total page views are for human translated articles. Even in Arabic 54% of the page views end up on human quality articles.


Customer Feedback: KB Inline Survey

Knowledge Base – average resolve rate of human translated vs. machine translated articles

[Chart: average resolve rate, machine translation vs. human translation, for Arabic, Chinese (Simplified), Chinese (Traditional), French, German, Italian, Japanese, Portuguese, Portuguese (Brazil) and Spanish; English baseline: 25.5%]


Global English

  • Support started rewriting source language to account for MT in 2003 (6 months after Spanish MT was deployed)
  • Retrained the writers to write with a global audience and MT in mind
  • Top five rules to make source language content suitable for MT:

1. Use standard English writing style
2. Use correct punctuation – especially the following:
   – Missing punctuation causing incorrect sentence breaks
   – Hyphens
   – Commas
3. Eliminate long sentences
4. Use capitalization correctly
5. Use correct spelling

Impact of Global English

Resolve rate of articles authored to standard guidelines

[Chart: "% Yes" resolve rate of Global English articles vs. old articles, for DE (German), ES (Spanish), FR (French), IT (Italian), JA (Japanese), PT (Portuguese), PT-BR (Portuguese Brazil), ZH-CN (Chinese Simplified), ZH-TW (Chinese Traditional), and all MT languages combined]


Translation Wiki – Benefits of using MT

  • Larger language set: localize into more languages without increasing budget
  • Localize more: increase the extent of localization without a proportional budget increase
  • Higher productivity: 5% to 25% productivity increase; >25% in software localization
  • Faster availability: removes the delay in translation; especially desired by the technical audience


Conclusions

  • Automation enriches the customer's experience
  • Long-term investments in tools and processes are required, and patience in seeing results is needed
  • Reaching customers through multiple forums and media is important
  • Metrics are more useful than opinions
  • Using customer feedback and community provides better solutions

Thank you


References

  • Menezes, Arul, Kristina Toutanova and Chris Quirk. Microsoft Research Treelet translation system: NAACL 2006 Europarl evaluation. Workshop on Machine Translation, NAACL 2006.
  • Quirk, Chris and Arul Menezes. Dependency Treelet Translation: The convergence of statistical and example-based machine translation? Machine Translation, March 2006, pp. 43–65.