Machine Translation: Examples

CS 188: Artificial Intelligence, Spring 2006
Lecture 28: Machine Translation, 5/2/2006
Dan Klein, UC Berkeley


  1. Levels of Transfer (Vauquois triangle)

     [Figure: the Vauquois triangle. The analysis side rises from Source Text through Morphological Analysis, Word Structure, Syntactic Analysis, Syntactic Structure, Semantic Analysis, Semantic Structure and Semantic Composition to the Interlingua. The generation side descends through Semantic Decomposition, Semantic Structure, Semantic Generation, Syntactic Structure, Syntactic Generation, Word Structure and Morphological Generation to Target Text. Transfer can cross at each level: Direct (word structure), Syntactic Transfer, Semantic Transfer.]

     General Approaches

     - Rule-based approaches
       - Expert system style rewrite systems
       - Interlingua methods (analyze and generate)
       - Lexicons come from humans or dictionaries
       - Can be very fast, and can accumulate a lot of knowledge over time (e.g. Systran)
     - Statistical approaches
       - Noisy channel systems
       - Lower-level transfer
       - Lexicons discovered using parallel corpora
       - Require little human declaration of knowledge

     The Coding View

     - "One naturally wonders if the problem of translation could conceivably be treated as a problem in cryptography. When I look at an article in Russian, I say: 'This is really written in English, but it has been coded in some strange symbols. I will now proceed to decode.'"
     - Warren Weaver (1955:18, quoting a letter he wrote in 1947)

     MT System Components

     [Figure: noisy channel diagram. The source is the Language Model P(e), which emits an English sentence e. The channel is the Translation Model P(f|e), which emits the French sentence f. The decoder takes the observed f and returns the best e.]

     $$\arg\max_{e} P(e \mid f) = \arg\max_{e} P(f \mid e)\, P(e)$$

     - Finds an English translation which is both fluent and semantically faithful to the French source
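
The decoding rule on the MT System Components slide can be sketched in a few lines. This is a minimal illustration, assuming a fixed French input and a small hand-written candidate list; the probability values and the names `LM`, `TM` and `decode` are invented for the example and do not come from the lecture.

```python
import math

# Toy numbers, invented purely for illustration (not from the lecture).
# Language model P(e): how fluent is each English candidate?
LM = {
    "I will now speak briefly in Irish": 0.020,
    "I will now speak briefly in Ireland": 0.015,
    "now I speak short Irish in": 0.0001,
}
# Translation model P(f | e) for one fixed French input f:
# how faithful is each candidate to the source?
TM = {
    "I will now speak briefly in Irish": 0.30,
    "I will now speak briefly in Ireland": 0.05,
    "now I speak short Irish in": 0.40,
}

def decode(candidates, lm, tm):
    """Return argmax_e P(f|e) * P(e), computed in log space."""
    def score(e):
        return math.log(tm[e]) + math.log(lm[e])
    return max(candidates, key=score)

best = decode(list(LM), LM, TM)
print(best)  # I will now speak briefly in Irish
```

The third candidate has the highest translation-model score but loses because the language model finds it disfluent, which is the point of multiplying the two terms. A real decoder cannot enumerate candidates like this and has to search.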

  2. Language Models

     - Any probabilistic model capable of assigning probabilities to sentences
     - Usually n-gram models, but also PCFGs
     - Exact same technology (and software) as in ASR
     - Train on a huge collection of monolingual corpora (documents in the target language)

     [Figure: word chain START, w_1, w_2, ..., w_{n-1}, STOP]

     Parallel Corpora

     - Parallel corpora (or bitexts)
       - Collection of source-target translation pairs
       - Main resource for learning a translation model
       - Either naturally occurring (e.g. parliamentary proceedings, news translation services) or commissioned

     Building a Translation Model

     - Steps in building a simple statistical translation model
       - Match up words in the training sentence pairs (word alignment)
       - Learn a lexicon from these alignments
       - Learn larger phrases

     [Figure: word alignment between "What is the anticipated cost of collecting fees under the new proposal ?" and "En vertu de les nouvelles propositions , quel est le coût prévu de perception de les droits ?"]

     1-to-Many Alignments (figure only)

     Many-to-Many Alignments (figure only)

     The HMM Alignment Model

     - The HMM model (Vogel 96)
     - Re-estimate using the forward-backward algorithm
     - Handling nulls requires some care
     - Note: alignments are not provided, but induced

     [Figure: axis with values -2, -1, 0, 1, 2, 3]
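
To make "alignments are not provided, but induced" concrete, here is a minimal sketch of learning a lexicon from sentence pairs with EM. It uses IBM Model 1 rather than the HMM model named on the slide. Model 1 is not mentioned in the lecture; it is swapped in here because its E-step has a simple closed form and needs no forward-backward pass. The three-sentence bitext and the names `train_model1` and `align` are invented for illustration.

```python
from collections import defaultdict

def train_model1(bitext, iterations=10):
    """EM for IBM Model 1: estimate t(f | e) from sentence pairs alone."""
    f_vocab = {f for fs, _ in bitext for f in fs}
    # Uniform initialisation of the lexicon t(f | e).
    t = defaultdict(lambda: 1.0 / len(f_vocab))
    for _ in range(iterations):
        count = defaultdict(float)  # expected count of (f, e) links
        total = defaultdict(float)  # expected count of e
        for fs, es in bitext:
            es = ["NULL"] + es      # NULL lets a French word align to nothing
            for f in fs:
                # E-step: posterior over which English word generated f.
                z = sum(t[(f, e)] for e in es)
                for e in es:
                    p = t[(f, e)] / z
                    count[(f, e)] += p
                    total[e] += p
        # M-step: renormalise expected counts into probabilities.
        for (f, e), c in count.items():
            t[(f, e)] = c / total[e]
    return t

def align(fs, es, t):
    """For each French word, pick the English word with the highest t(f | e)."""
    return [(f, max(es, key=lambda e: t[(f, e)])) for f in fs]

# Invented toy bitext: (French tokens, English tokens).
bitext = [
    ("la maison".split(), "the house".split()),
    ("la fleur".split(), "the flower".split()),
    ("maison bleue".split(), "blue house".split()),
]
t = train_model1(bitext)
print(align("maison bleue".split(), "blue house".split(), t))
```

No alignment is ever given to the model. The co-occurrence of "la" with "the" in two sentences is enough for EM to pull the remaining words toward their correct partners. The HMM model adds a dependence between neighbouring alignment positions on top of this lexicon.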

  3. Phrases vs Word Models (figure only)

     Examples: Translation and Fertility

     - he is nodding / il hoche la tête

     Extracting Phrases (figure only)

     Basic Phrase-Based Model [Koehn et al, 2003]

     - Segmentation
     - Translation
     - Distortion

     Decoding (figure only)

     The Pharaoh Decoder

     - Now we have a phrase table:
       - A huge list of translation phrases (e.g. 1M phrases)
       - Each phrase has a probability P(f|e)
     - When we see a new input sentence:
       - Grow a translation left to right
       - Extend translation using known phrases
       - Also multiply by language model score
     - Probabilities at each step include LM and TM
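
The left-to-right procedure described for the decoder can be sketched as a small beam search. This is a simplified illustration and not Pharaoh's actual implementation: it is monotone (it ignores distortion, so phrases are translated in source order), and the phrase table, bigram scores and all names are hypothetical. Only the sentence pair "il hoche la tête" / "he is nodding" comes from the slides.

```python
import math

# Hypothetical phrase table: French phrase -> [(English phrase, P(f|e))].
PHRASES = {
    ("il",): [("he", 0.9), ("it", 0.4)],
    ("hoche",): [("nods", 0.3), ("shakes", 0.3)],
    ("la",): [("the", 0.8)],
    ("tête",): [("head", 0.9)],
    ("hoche", "la", "tête"): [("is nodding", 0.6)],
}
# Hypothetical bigram language model; unseen bigrams get a small floor.
BIGRAMS = {
    ("<s>", "he"): 0.3, ("<s>", "it"): 0.2,
    ("he", "is"): 0.4, ("is", "nodding"): 0.1, ("nodding", "</s>"): 0.5,
    ("he", "nods"): 0.05, ("nods", "the"): 0.1,
    ("the", "head"): 0.1, ("head", "</s>"): 0.3,
}
UNSEEN = 1e-4

def lm_logprob(prev, words):
    """Sum of log bigram probabilities for `words`, given the previous word."""
    total = 0.0
    for w in words:
        total += math.log(BIGRAMS.get((prev, w), UNSEEN))
        prev = w
    return total

def decode(french, beam_size=5, max_phrase_len=3):
    """Monotone beam search; returns (log score, English words) or None."""
    # beams[i] holds hypotheses (log score, English words so far) that
    # have translated exactly the first i French words.
    beams = [[] for _ in range(len(french) + 1)]
    beams[0] = [(0.0, ["<s>"])]
    for i in range(len(french)):
        beams[i] = sorted(beams[i], reverse=True)[:beam_size]  # prune
        for score, english in beams[i]:
            for j in range(i + 1, min(i + max_phrase_len, len(french)) + 1):
                for e_phrase, p in PHRASES.get(tuple(french[i:j]), []):
                    words = e_phrase.split()
                    # Each extension multiplies in both TM and LM scores.
                    new = score + math.log(p) + lm_logprob(english[-1], words)
                    beams[j].append((new, english + words))
    finished = [(s + lm_logprob(e[-1], ["</s>"]), e[1:]) for s, e in beams[-1]]
    return max(finished) if finished else None

score, words = decode("il hoche la tête".split())
print(" ".join(words))  # he is nodding
```

With these toy numbers the three-word phrase "hoche la tête" beats the word-by-word path "nods the head", because one confident phrase pair plus a fluent bigram sequence outscores three separate extensions. Adding distortion would require tracking which source words are covered instead of a single position.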

  4. Some Output

     - Madame la présidente, votre présidence de cette institution a été marquante.
       - Mrs Fontaine, your presidency of this institution has been outstanding.
       - Madam President, president of this house has been discoveries.
       - Madam President, your presidency of this institution has been impressive.
     - Je vais maintenant m'exprimer brièvement en irlandais.
       - I shall now speak briefly in Irish.
       - I will now speak briefly in Ireland.
       - I will now speak briefly in Irish.
     - Nous trouvons en vous un président tel que nous le souhaitions.
       - We think that you are the type of president that we want.
       - We are in you a president as the wanted.
       - We are in you a president as we the wanted.

     Translations

     - Even human translators aren't perfect:
       - In an Austrian ski hotel: Not to perambulate the corridors in the hours of repose in the boots of ascension.
       - In a Copenhagen airline ticket office: We take your bags and send them in all directions.
       - From a brochure of a car rental firm in Tokyo: When passenger of foot heave in sight, tootle the horn. Trumpet him melodiously at first, but if he still obstacles your passage then tootle him with vigor.
     - http://www.englishfirst.org/13166/funnytranslations.htm
