
Language and Computers: Machine Translation - PowerPoint PPT Presentation



  1. Language and Computers: Machine Translation
     Based on Dickinson, Brew, & Meurers (2013)
     Outline:
       ◮ Introduction
       ◮ Examples for Translations
       ◮ Background: Dictionaries
       ◮ Linguistic knowledge based systems
         ◮ Direct transfer systems
         ◮ Interlingua-based systems
       ◮ Machine learning based systems
         ◮ Alignment
         ◮ Statistical Modeling
         ◮ Phrase-Based Translation
       ◮ What makes MT hard?
       ◮ Evaluating MT systems
       ◮ References

  2. What is Machine Translation?
     Translation is the process of:
       ◮ moving texts from one (human) language (source language) to another (target language),
       ◮ in a way that preserves meaning.
     Machine translation (MT) automates (part of) the process:
       ◮ Fully automatic translation
       ◮ Computer-aided (human) translation

  3. What is MT good for?
       ◮ When you need the gist of something and there are no human translators around:
         ◮ translating e-mails & webpages
         ◮ obtaining information from sources in multiple languages (e.g., search engines)
       ◮ If you have a limited vocabulary and a small range of sentence types:
         ◮ translating weather reports
         ◮ translating technical manuals
         ◮ translating terms in scientific meetings
       ◮ If you want your human translators to focus on interesting/difficult sentences while avoiding lookup of unknown words and translation of mundane sentences.

  4. Is MT needed?
       ◮ Translation is of immediate importance for multilingual countries (Canada, India, Switzerland, ...), international institutions (United Nations, International Monetary Fund, World Trade Organization, ...), and multinational or exporting companies.
       ◮ The European Union has 23 official languages. All federal laws and other documents have to be translated into all languages.

  5. Example translations: The simple case
       ◮ It will help to look at a few examples of real translation before talking about how a machine does it.
       ◮ Take the simple Spanish sentence and its English translation below:
         (1) (Yo) hablo           español.
             I    speak[1st, sg]  Spanish
             ‘I speak Spanish.’
       ◮ Words in this example pretty much translate one-for-one.
       ◮ But we have to make sure hablo matches with Yo, i.e., that the subject agrees with the form of the verb.

  6. Example translations: A slightly more complex case
     The order and number of words can differ:
         (2) a. Tú  hablas          español?
                You speak[2nd, sg]  Spanish
                ‘Do you speak Spanish?’
             b. Hablas          español?
                Speak[2nd, sg]  Spanish
                ‘Do you speak Spanish?’

  7. What goes into a translation
     Some things to note about these examples and thus what we might need to know to translate:
       ◮ Words have to be translated → dictionaries
       ◮ Words are grouped into meaningful units → syntax
       ◮ Word order can differ from language to language
       ◮ The forms of words within a sentence are systematic, e.g., verbs have to be conjugated, etc.
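
A minimal sketch of why a dictionary alone is not enough, assuming a tiny hand-made Spanish-English lexicon (the `LEXICON` table and the `word_for_word` helper are hypothetical names, not part of the slides). Word-for-word lookup handles example (1) but not the subjectless question in (2b):

```python
# Toy Spanish-to-English lexicon; the entries are illustrative assumptions.
LEXICON = {
    "yo": "I",
    "tú": "you",
    "hablo": "speak",
    "hablas": "speak",
    "español": "Spanish",
}


def word_for_word(sentence: str) -> str:
    """Translate each word independently, ignoring syntax and agreement."""
    words = sentence.strip("¿?.!").lower().split()
    # Unknown words are passed through in angle brackets.
    return " ".join(LEXICON.get(w, f"<{w}>") for w in words)


print(word_for_word("Yo hablo español."))  # I speak Spanish
print(word_for_word("Hablas español?"))    # speak Spanish
```

The second output has no subject and no "do", so getting ‘Do you speak Spanish?’ needs knowledge of verb forms and English word order on top of the lexicon.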

  8. Different approaches to MT
     We’ll look at some basic approaches to MT:
       ◮ Systems based on linguistic knowledge (Rule-Based MT (RBMT))
         ◮ Direct transfer systems
       ◮ Machine learning approaches, i.e., statistical machine translation (SMT)
         ◮ SMT is the most popular form of MT right now

  9. Dictionaries
     An MT dictionary differs from a “paper” dictionary:
       ◮ must be computer-usable (electronic form, indexed)
       ◮ needs to be able to handle various word inflections
       ◮ can contain (syntactic and semantic) restrictions that a word places on other words
         ◮ e.g., subcategorization information: give needs a giver, a person given to, and an object that is given
         ◮ e.g., selectional restrictions: if X eats, X must be animate
       ◮ contains frequency information
         ◮ for SMT, may be the only piece of additional information
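
One way such a dictionary entry might be laid out, as a sketch only: the `LexEntry` class, its field names, the role labels, and the frequency values are all assumptions made for illustration. It indexes inflected forms and checks the subcategorization and selectional restrictions mentioned above.

```python
from dataclasses import dataclass, field


@dataclass
class LexEntry:
    """One entry in a toy MT dictionary (all field names are hypothetical)."""
    lemma: str
    forms: dict = field(default_factory=dict)      # inflected form -> features
    subcat: tuple = ()                             # roles the word requires
    selection: dict = field(default_factory=dict)  # role -> required property
    frequency: float = 0.0                         # made-up relative frequency


GIVE = LexEntry(
    lemma="give",
    forms={"give": "base", "gives": "3sg", "gave": "past", "given": "participle"},
    subcat=("giver", "recipient", "object"),
    frequency=0.002,
)
EAT = LexEntry(
    lemma="eat",
    forms={"eat": "base", "eats": "3sg", "ate": "past", "eaten": "participle"},
    subcat=("eater",),
    selection={"eater": "animate"},
    frequency=0.001,
)

# Index every inflected form so lookup works on running text.
FORM_INDEX = {form: entry for entry in (GIVE, EAT) for form in entry.forms}


def check(form, args):
    """Return the restrictions that `args` (role -> set of properties) violates."""
    entry = FORM_INDEX[form]
    problems = [f"missing {role}" for role in entry.subcat if role not in args]
    for role, prop in entry.selection.items():
        if role in args and prop not in args[role]:
            problems.append(f"{role} must be {prop}")
    return problems


print(check("eats", {"eater": {"inanimate"}}))
print(check("gives", {"giver": {"animate"}, "object": set()}))
```

The first call reports the selectional violation on the eater; the second reports that the person given to is missing.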

  10. Direct transfer systems
     A direct transfer system consists of:
       ◮ A source language grammar
       ◮ A target language grammar
       ◮ Rules relating source language underlying representation (UR) to target language UR
         ◮ A direct transfer system has a transfer component which relates a source language representation with a target language representation.
         ◮ This can also be called a comparative grammar.
     We’ll walk through the following French to English example:
         (3) Londres plaît       à  Sam.
             London  is pleasing to Sam
             ‘Sam likes London.’

  11. Steps in a transfer system
       1. The source language grammar analyzes the input and puts it into an underlying representation (UR).
            Londres plaît à Sam → Londres plaire Sam (source UR)
       2. The transfer component relates this source language UR (French UR) to a target language UR (English UR).
            French UR          English UR
            X plaire Y    ↔    Eng(Y) like Eng(X)
            (where Eng(X) means the English translation of X)
            Londres plaire Sam (source UR) → Sam like London (target UR)
       3. The target language grammar translates the target language UR into an actual target language sentence.
            Sam like London → Sam likes London
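
The three steps above can be sketched as a small pipeline. This is only an illustration under strong assumptions: the lookup tables, the function names `analyze`, `transfer`, and `generate`, and the triple used as a UR are all hypothetical stand-ins for real grammars, and they cover nothing beyond example (3).

```python
# Stand-ins for the grammars and lexicon; all tables are illustrative assumptions.
FRENCH_LEMMAS = {"plaît": "plaire"}          # inflected verb -> lemma
FRENCH_FUNCTION_WORDS = {"à"}                # dropped in the underlying representation
ENG = {"Londres": "London", "Sam": "Sam"}    # Eng(X): English translation of X


def analyze(sentence):
    """Step 1: French sentence -> source UR as a triple (X, verb lemma, Y)."""
    tokens = [t for t in sentence.rstrip(".").split()
              if t not in FRENCH_FUNCTION_WORDS]
    x, verb, y = tokens
    return (x, FRENCH_LEMMAS[verb], y)


def transfer(source_ur):
    """Step 2: apply the rule  X plaire Y  <->  Eng(Y) like Eng(X)."""
    x, verb, y = source_ur
    if verb != "plaire":
        raise ValueError(f"no transfer rule for {verb!r}")
    return (ENG[y], "like", ENG[x])


def generate(target_ur):
    """Step 3: target UR -> English sentence.

    Assumes a third-person singular subject in the present tense,
    so the verb simply takes -s.
    """
    subject, verb, obj = target_ur
    return f"{subject} {verb}s {obj}."


source_ur = analyze("Londres plaît à Sam.")
target_ur = transfer(source_ur)
print(source_ur)             # ('Londres', 'plaire', 'Sam')
print(target_ur)             # ('Sam', 'like', 'London')
print(generate(target_ur))   # Sam likes London.
```

Keeping generation separate from transfer is what lets the English agreement rule (like → likes) live in one place, independent of the source language.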

  12. Notes on transfer systems
       ◮ The transfer mechanism is in theory reversible; e.g., the plaire rule works in both directions
         ◮ Not clear if this is desirable: e.g., Dutch aanvangen should be translated into English as begin, but begin should be translated as beginnen.
       ◮ Because we have a separate target language grammar, we are able to ensure that the rules of English apply; like → likes.
       ◮ RBMT systems are still in use today, especially for more exotic language pairs
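
A short sketch of the reversibility problem, assuming a two-entry Dutch-to-English table (the entry for beginnen, the `PREFERRED` table, and the function names are assumptions added for illustration). Inverting the table mechanically leaves two candidates for begin, so the reverse direction needs an explicit preference:

```python
from collections import defaultdict

# Dutch -> English; aanvangen is from the example above, beginnen is assumed.
NL_TO_EN = {"aanvangen": "begin", "beginnen": "begin"}


def invert(table):
    """Reverse a translation table, keeping every candidate source word."""
    inverted = defaultdict(list)
    for source, target in table.items():
        inverted[target].append(source)
    return dict(inverted)


EN_TO_NL_CANDIDATES = invert(NL_TO_EN)

# Hypothetical preference table for the English -> Dutch direction.
PREFERRED = {"begin": "beginnen"}


def en_to_nl(word):
    """Pick the preferred Dutch word, falling back to the first candidate."""
    return PREFERRED.get(word, EN_TO_NL_CANDIDATES[word][0])


print(EN_TO_NL_CANDIDATES)  # {'begin': ['aanvangen', 'beginnen']}
print(en_to_nl("begin"))    # beginnen
```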
