Machine Translation at Booking.com: Journey and Lessons Learned
User Track
May 30, 2017, Prague
Pavel Levin, Nishikant Dhanuka, Maxim Khalilov
Who am I?
Master in Computer Science (NLP) from IIT Mumbai; 8 years of work experience
Partner Services Department (Scaled Content)
About Booking.com
World’s #1 website for booking hotels and other accommodations
Agenda.
➢ Why MT is critical for Booking.com’s localization process
➢ MT Model & Experiments
➢ Evaluation Results
➢ Interesting Examples
Mission: Empower people to book any hotel in the world, while browsing high quality content in their own language.
… thus it is important to have locally relevant content at scale
How Locally Relevant?
➢ Serve the website in the language of the user
➢ Let users consume and produce content in their own language (customer reviews, customer service support)

Why At Scale?
➢ The platform is growing very fast
➢ Content is updated every second
Currently, hotel descriptions are translated by humans into 43 languages, based on visitor demand.

[Chart: Translation Coverage vs. Demand Coverage by language; approximate numbers]
Example of a lost business opportunity caused by a highly manual and slow process:
➢ A new hotel in China comes online with initial content only in English & Chinese
➢ A German customer visits the profile and sees the description in English
➢ Either the customer still makes the booking (success), or drops off (lost business)
➢ The description enters the human translation pipeline only if this happens often: a chicken-and-egg problem that Machine Translation can break
How do we balance quality, speed and cost effectiveness?
Our Journey to discover the awesomeness of NMT
Two axes: engine type (SMT vs. NMT) and training data (General Purpose: trained on general purpose data; Booking.com: trained on in-domain data).
Phase 1: In-domain SMT > General Purpose SMT
Phase 2: General Purpose NMT > In-domain SMT
Phase 3: In-domain NMT > General Purpose NMT
Lots of in-domain data to train the MT system
Pair               Language   Parallel Sentences   # of Words   Vocab Size   Avg. Len
English -> German  German     10.5 M               171 M        845 K        16.3
                   English                         174 M        583 K        16.5
English -> French  French     11.3 M               193 M        588 K        17.7
                   English                         188 M        581 K        16.7
Our NMT Model: Configuration Details

Pipeline stages: Data Preparation -> Model -> Training -> Translate -> Evaluate

Data Preparation
  Split Data: Train, Val, Test
  Input Text Unit: Word Level
  Tokenization: Aggressive
  Max Sentence Length: 50
  Vocabulary Size: 50,000

Model
  Model Type: seq2seq
  Input Embedding Dimension: 1,000
  RNN Type: LSTM
  # of Hidden Layers: 4
  Hidden Layer Dimension: 1,000
  Attention Mechanism: Global Attention

Training
  Optimization Method: Stochastic Gradient Descent
  Initial Learning Rate: 1
  Decay Rate: 0.5
  Decay Strategy: Decrease in Validation Perplexity <= 0
  Number of Epochs: 5 - 13
  Stopping Criteria: BLEU + sensitive sentences + constraints
  Dropout: 0.3
  Batch Size: 250

Translate
  Beam Size: 30
  Unknown Words Handling: Source with Highest Attention

Evaluate
  Auto: BLEU, WER
  Human: A/F (Adequacy/Fluency)
  Other: Length, A/B Test

** Approx. 220 Million Parameters
** 1 Epoch takes approx. 2 days on a single NVIDIA Tesla K80 GPU
** MT pipeline based on the Harvard implementation (OpenNMT)
Data Preparation
  Split Data: Train, Val, Test
  Input Text Unit: Word Level
  Tokenization: Aggressive
  Max Sentence Length: 50
  Vocabulary Size: 50,000

Tokenization example:
  EN (raw): The rooms at the Prague Mandarin Oriental feature underfloor heating, and guests can choose from various bed linen and pillows.
  EN (tokenized): The rooms at the Prague Mandarin Oriental feature underfloor heating , and guests can choose from various bed linen and pillows .
  DE (raw): Die Zimmer im Prague Mandarin Oriental bieten eine Fußbodenheizung und eine Auswahl an Bettwäsche und Kissen.
  DE (tokenized): Die Zimmer im Prague Mandarin Oriental bieten eine Fußbodenheizung und eine Auswahl an Bettwäsche und Kissen .

Vocabulary (first ids):
  EN: <blank> 1, <unk> 2, <s> 3, </s> 4, a 5, and 6, the 7, is 8, with 9, in 10
  DE: <blank> 1, <unk> 2, <s> 3, </s> 4, und 5, sie 6, mit 7, einen 8, der 9, ein 10

Tokenized text is represented as a sequence of vocabulary ids. "Aggressive" tokenization only keeps sequences of letters or of numbers, i.e. it does not allow alphanumeric mixes such as "E65" or "soft-landing".
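To make the "aggressive" rule concrete, here is a minimal Python sketch; the regex is an assumption standing in for the actual tokenizer, which the slides do not specify:

```python
# A minimal sketch of "aggressive" tokenization: keep only runs of letters
# OR runs of digits, so mixed tokens like "E65" split into "E" and "65".
import re

def aggressive_tokenize(text: str) -> list[str]:
    # Letter-only runs, digit-only runs, or single punctuation marks.
    return re.findall(r"[^\W\d_]+|\d+|[^\w\s]", text)

print(aggressive_tokenize("Room E65 offers a soft-landing experience."))
# ['Room', 'E', '65', 'offers', 'a', 'soft', '-', 'landing', 'experience', '.']
```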
Model
  Model Type: seq2seq
  Input Embedding Dimension: 1,000
  RNN Type: LSTM
  # of Hidden Layers: 4
  Hidden Layer Dimension: 1,000
  Attention Mechanism: Global Attention

[Encoder-decoder diagram: source "Includes Wifi ." is encoded and decoded into "Umfasst wifi ."; training runs on an NVIDIA Tesla K80 GPU]
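As a sanity check on the "approx. 220 million parameters" footnote, here is a back-of-the-envelope calculation. It assumes a standard Luong-style global-attention LSTM encoder-decoder with input feeding; the exact count depends on implementation details the slides do not give:

```python
# Rough parameter count for the configuration above.
vocab, emb, hidden, layers = 50_000, 1_000, 1_000, 4

def lstm_params(input_size: int, hidden_size: int) -> int:
    # 4 gates, each with input weights, recurrent weights and a bias vector.
    return 4 * (input_size * hidden_size + hidden_size * hidden_size + hidden_size)

embeddings = 2 * vocab * emb                         # source + target lookup tables
encoder = lstm_params(emb, hidden) + (layers - 1) * lstm_params(hidden, hidden)
# Input feeding concatenates the previous attention output to the embedding.
decoder = lstm_params(emb + hidden, hidden) + (layers - 1) * lstm_params(hidden, hidden)
attention = hidden * hidden + 2 * hidden * hidden    # score matrix + output projection
generator = hidden * vocab + vocab                   # softmax layer over the vocabulary

total = embeddings + encoder + decoder + attention + generator
print(f"~{total / 1e6:.0f}M parameters")             # ~221M, in line with the slide
```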
Training
  Optimization Method: Stochastic Gradient Descent
  Initial Learning Rate: 1
  Decay Rate: 0.5
  Decay Strategy: Decrease in Validation Perplexity <= 0
  Number of Epochs: 5 - 13
  Stopping Criteria: BLEU + sensitive sentences + constraints
  Dropout: 0.3
  Batch Size: 250

[Charts: model perplexity (y-axis 1.6-2.2) and BLEU score (y-axis 40-54) development over epochs 1-11]

Stopping Criteria, sensitive sentence example: "The neighborhood is very nice and safe" vs. "There is a safe installed in this very nice neighborhood"
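A minimal sketch of the decay strategy above, assuming plain SGD with the slide's hyperparameters: the learning rate is halved whenever the decrease in validation perplexity is <= 0.

```python
# Halve the learning rate whenever validation perplexity stops decreasing.
class PerplexityDecay:
    def __init__(self, lr: float = 1.0, decay_rate: float = 0.5):
        self.lr, self.decay_rate = lr, decay_rate
        self.best_ppl = float("inf")

    def step(self, val_ppl: float) -> float:
        """Call at the end of each epoch; returns the next learning rate."""
        if self.best_ppl - val_ppl <= 0:   # decrease in validation perplexity <= 0
            self.lr *= self.decay_rate
        self.best_ppl = min(self.best_ppl, val_ppl)
        return self.lr

sched = PerplexityDecay()
for ppl in [2.2, 1.9, 1.8, 1.8, 1.75]:     # toy per-epoch validation perplexities
    print(sched.step(ppl))                  # 1.0, 1.0, 1.0, 0.5, 0.5
```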
Translate
  Beam Size: 30
  Unknown Words Handling: Source with Highest Attention

Example (good and bad sides of <unk> replacement):
  Source: Offering a restaurant, Hodor Eco-lodge is located in Winterfell. Free access to The Game entertainment Centre
  Human Translation: Das Hodor Eco-Lodge begrüßt Sie in Winterfell mit einem Restaurant. Kostenfreier Zugang zum Unterhaltungszentrum The Game
  Raw Output: Das <unk> <unk> in <unk> bietet ein Restaurant. Kostenfreier Zugang zum <unk>
  Output with <unk> replaced: Das Hodor Eco-lodge in Winterfell bietet ein Restaurant. Kostenfreier Zugang zum Centre
  Good: the property name and city are copied correctly. Bad: only "Centre" is copied instead of the entertainment centre's name.
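A minimal sketch of "source with highest attention" unknown-word handling, assuming we kept the decoder's attention weights for each output step; tokens and weights below are illustrative only:

```python
# Replace each <unk> in the output with the source token that received the
# highest attention weight at that decoding step.
def replace_unks(src_tokens, out_tokens, attention):
    """attention[t][s]: weight on source token s at output step t."""
    replaced = []
    for t, token in enumerate(out_tokens):
        if token == "<unk>":
            s = max(range(len(src_tokens)), key=lambda i: attention[t][i])
            replaced.append(src_tokens[s])  # copy the aligned source word
        else:
            replaced.append(token)
    return replaced

src = ["Hodor", "Eco-lodge", "offers", "a", "restaurant", "."]
out = ["Das", "<unk>", "<unk>", "bietet", "ein", "Restaurant", "."]
attn = [[0.1] * len(src) for _ in out]      # toy weights; a real model supplies these
attn[1][0], attn[2][1] = 0.9, 0.9
print(replace_unks(src, out, attn))
# ['Das', 'Hodor', 'Eco-lodge', 'bietet', 'ein', 'Restaurant', '.']
```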
Evaluate
  Auto: BLEU, WER
  Human: A/F (Adequacy/Fluency)
  Other: Length, A/B Test

BLEU
➢ Based on the # of words shared between MT output and human reference
➢ Benefits sequential words
➢ Penalizes short translations

WER
➢ Variation of the word-level Levenshtein distance
➢ Measures the distance by counting insertions, deletions & substitutions
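For concreteness, a small sketch of both automatic metrics. The BLEU call assumes the sacrebleu package purely for illustration (the slides do not name a tool); WER is implemented as the classic word-level Levenshtein distance normalized by reference length:

```python
import sacrebleu

def wer(reference: str, hypothesis: str) -> float:
    ref, hyp = reference.split(), hypothesis.split()
    # d[i][j] = edits to turn the first i reference words into the first j hypothesis words
    d = [[i + j if i * j == 0 else 0 for j in range(len(hyp) + 1)]
         for i in range(len(ref) + 1)]
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            sub = d[i - 1][j - 1] + (ref[i - 1] != hyp[j - 1])
            d[i][j] = min(sub, d[i - 1][j] + 1, d[i][j - 1] + 1)  # sub, deletion, insertion
    return d[-1][-1] / len(ref)

hyps = ["Die Zimmer bieten eine Fußbodenheizung ."]
refs = [["Die Zimmer bieten eine Fußbodenheizung ."]]
print(sacrebleu.corpus_bleu(hyps, refs).score)   # 100.0 for an exact match
print(wer(refs[0][0], hyps[0]))                  # 0.0 for an exact match
```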
A/F Framework
➢ 3 evaluators per language
➢ Provided with original text and MT hypotheses, including human reference
➢ Not aware which system produced which hypothesis
➢ Asked to assess the quality of 150 random sentences from the test corpus
➢ 4-level scale for both Adequacy & Fluency, illustrated with examples of minor and major mistakes
Evaluation Results 1/5: BLEU Score for German & French

[Bar charts, BLEU: German: SMT 35, NMT 46, GP-SMT 28, GP-NMT 31; French: SMT 36, NMT 53, GP-SMT 30, GP-NMT 32]

➢ Our in-domain NMT system significantly outperforms all other engines
➢ Both neural systems consistently outperform their statistical counterparts
➢ In-domain SMT beats General Purpose NMT
➢ Compared to German, French improved much more from SMT to NMT
Evaluation Results 2/5: Adequacy/Fluency Scores for German

[Bar charts, A/F for German: SMT 3.62, NMT 3.90, GP-SMT 3.57, GP-NMT 3.65 and SMT 3.15, NMT 3.78, GP-SMT 3.37, GP-NMT 3.57; human references 3.82 and 3.96]

➢ Our in-domain NMT system still outperforms all other MT engines
➢ Both neural systems still consistently outperform their statistical counterparts
➢ However, General Purpose NMT now beats In-domain SMT
➢ In particular, the fluency score of our in-domain NMT approaches human level
Evaluation Results 3/5: Adequacy/Fluency Scores for French

[Bar charts, A/F for French: SMT 3.28, NMT 3.40, GP-SMT 3.31, GP-NMT 3.41 and SMT 3.40, NMT 3.67, GP-SMT 3.32, GP-NMT 3.78; human references 3.75 and 3.70]

➢ For French, the General Purpose NMT system scores highest, which conflicts with BLEU
➢ Apparently, General Purpose NMT even outperforms human level
➢ Adequacy of both neural engines is almost at human level; fluency is still far off, though
➢ Compared to German, A/F scores are lower for French, again conflicting with BLEU
Evaluation Results 4/5: BLEU by Sentence Length for German and French

[Line charts: BLEU vs. sentence length for German and French]

➢ Initially performance increases with sentence length, but it soon reaches a peak and then starts to decline
➢ For sentences longer than 27 tokens, NMT quality degrades faster than SMT
➢ Even for longer sentences, though performance degrades, NMT still outperforms SMT
Evaluation Results 5/5: Minus WER by Sentence Length for German and French

[Line charts: -WER vs. sentence length for German and French]

➢ Same picture as with BLEU: performance peaks early, NMT degrades faster than SMT beyond 27 tokens, yet NMT still outperforms SMT even for longer sentences
A/B Tests to validate the hypothesis that machine-translated hotel descriptions convert better than no translation at all.

Example source (EN): Offering free WiFi and a garden, VSG Apartment Petrska is situated in Prague, 900 metres from Old Town Square. Prague Astronomical Clock is 1 km away. The accommodation comes with a seating and dining area. All units have a kitchen with a dishwasher and microwave. A fridge and coffee machine are also provided. Towels and bed linen are provided. Wenceslas Square is 1.2 km from VSG Apartment Petrska. The nearest airport is Prague Airport, 12 km from the property.

Machine translation (DE): Mit kostenfreiem WLAN und einem Garten erwartet Sie das VSG Apartment Petrska in Prag, 900 m vom Altstädter Ring entfernt. Die Astronomische Uhr von Prag erreichen Sie nach 1 km. Die Unterkunft verfügt über einen Sitz- und Essbereich. Alle Unterkünfte verfügen über eine Küche mit einem Geschirrspüler und einer Mikrowelle. Ein Kühlschrank und eine Kaffeemaschine sind ebenfalls vorhanden. Handtücher und Bettwäsche werden gestellt. Der Wenzelsplatz liegt 1.2 km vom VSG Apartment Petrska entfernt. Der nächste Flughafen ist der 12 km von der Unterkunft entfernte Flughafen Prag.

Setup: each German visitor is randomly assigned, so that 50% see the base (untranslated English description) and 50% see the variant (machine translation).
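The slides do not describe the experimentation tooling, but a deterministic 50/50 split along the following lines is a common approach; the function and names below are a purely hypothetical sketch:

```python
# Hash the visitor id together with the experiment name so that the same
# visitor always lands in the same bucket, independent of traffic order.
import hashlib

def variant(visitor_id: str, experiment: str = "mt_hotel_descriptions") -> str:
    digest = hashlib.md5(f"{experiment}:{visitor_id}".encode()).hexdigest()
    return "machine_translation" if int(digest, 16) % 2 else "no_translation_base"

print(variant("visitor-42"))  # stable assignment across repeated visits
```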
Few Interesting Examples from French Translations

The Good (word sense disambiguation):
  Source: The neighbourhood is very safe. There is a safe installed in the room.
  Translation: Le quartier est très sûr. Vous trouverez un coffre-fort dans la chambre.
  (Both senses of "safe" are resolved correctly: "sûr" for the adjective, "coffre-fort" for the strongbox.)

The Bad (out-of-domain sentence):
  Source: The owners are super right wing.
  Translation: Les propriétaires se trouvent dans une aile droite.
  (Literally "The owners are located in a right wing", as of a building; the political sense is lost.)

The Ugly (OOV words):
  Source: Sdfdlsfsldk offers free breakfast
  Translation: Le offers sert un petit-déjeuner gratuit.
  (The gibberish name is dropped and the English word "offers" leaks into the output.)
Conclusion.
➢ In-domain NMT clearly outperforms both SMT and general purpose engines in our application
➢ Unlike NMT, SMT does not degrade as quickly with increased sentence length
Future Work.
➢ Explore open vocabulary techniques for UNK handling, e.g. sub-word tokenization using byte pair encoding (see the sketch after this list)
➢ Address adequacy errors such as ‘free’ being translated to ‘available’
➢ Support more languages, particularly Asian languages like Chinese and Japanese
➢ Cover user generated content like customer reviews and messages
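To illustrate the byte pair encoding idea from the first future-work item, here is a minimal BPE learner in the spirit of Sennrich et al. (2016); a production system would use an existing implementation such as subword-nmt or SentencePiece:

```python
# Learn sub-word units by repeatedly merging the most frequent adjacent
# symbol pair; words start as space-separated characters plus an
# end-of-word marker </w>.
import re
from collections import Counter

def get_pair_stats(vocab):
    """Count adjacent symbol pairs, weighted by word frequency."""
    pairs = Counter()
    for word, freq in vocab.items():
        symbols = word.split()
        for a, b in zip(symbols, symbols[1:]):
            pairs[a, b] += freq
    return pairs

def merge_pair(pair, vocab):
    """Fuse every occurrence of the given pair into a single symbol."""
    pattern = re.compile(r"(?<!\S)" + re.escape(" ".join(pair)) + r"(?!\S)")
    return {pattern.sub("".join(pair), word): freq for word, freq in vocab.items()}

vocab = {"l o w </w>": 5, "l o w e r </w>": 2,
         "n e w e s t </w>": 6, "w i d e s t </w>": 3}
for _ in range(10):
    best = get_pair_stats(vocab).most_common(1)[0][0]
    vocab = merge_pair(best, vocab)
print(vocab)  # frequent sub-words like "est</w>" emerge as single symbols
```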