machine translation

Machine Translation: Symbolic Methods - PowerPoint PPT Presentation



  1. Slide 1. Machine Translation: Symbolic Methods. Martin Kay, Stanford University and The University of the Saarland.
     Slide 2. Abstraction. Elimination of: special cases, exceptions, spelling rules, punctuation, declensions, conjugations, cases, prepositions, moods, …
     Slide 3. Morphographemic Abstraction. walking → walk +ing; rubbing → rub +ing; walks → walk +s; tries → try +s. Spelling idiosyncrasies no longer matter; they no longer get in the way.
     Slide 4. Morphographemics. Kind, Kinder, Kindern; love, loves, loving; run, runs, running; manger, mange, mangeons; try, trying, tries; tie, tying, ties. Diacritics. medico, medici; arco, arche.
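The morphographemic abstraction illustrated above (walking → walk +ing, tries → try +s) can be sketched in a few lines of Python. This is only an illustration, not the finite-state machinery the slides have in mind: the function name `analyze`, the tiny `LEXICON`, and the handful of hand-written spelling reversals are all assumptions made for the example.

```python
# Hypothetical toy lexicon; a real system would use a full stem list.
LEXICON = {"walk", "rub", "try", "tie", "love", "run"}

def analyze(word, lexicon=LEXICON):
    """Return 'stem +suffix' analyses whose stem is in the lexicon."""
    candidates = []
    if word.endswith("ing"):
        base = word[:-3]
        candidates.append((base, "+ing"))                   # walking -> walk
        candidates.append((base + "e", "+ing"))             # loving  -> love
        if len(base) >= 2 and base[-1] == base[-2]:
            candidates.append((base[:-1], "+ing"))          # rubbing -> rub
        if base.endswith("y"):
            candidates.append((base[:-1] + "ie", "+ing"))   # tying   -> tie
    if word.endswith("s"):
        candidates.append((word[:-1], "+s"))                # walks   -> walk
        if word.endswith("ies"):
            candidates.append((word[:-3] + "y", "+s"))      # tries   -> try
    # Keep only candidates licensed by the lexicon.
    return [f"{stem} {suffix}" for stem, suffix in candidates if stem in lexicon]

print(analyze("rubbing"))
print(analyze("tries"))
```

The lexicon check is what filters out spurious candidates such as "rubb +ing"; without it the spelling reversals would overgenerate.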

  2. Slide 5. Morphological Abstraction. dogs → dog, Plural; schemata → schema, Plural; children → child, Plural; sheep → sheep, Singular; sheep → sheep, Plural. Paradigms and exceptions no longer matter.
     Slide 6. Morphological Abstraction. dem → der: Masc/Neut, Sing, Dat. Männer → Mann: Masc, Plur, Nom/Acc/Gen. Jungen → Junge: Masc, Sing Acc/Gen/Dat or Plur Nom/Acc/Gen/Dat.
     Slide 7. Word-level Processes. Suffixing, prefixing, circumfixing, infixing; umlauting, vowel harmony, shortening, lengthening, reduplication. Inflexional morphology, derivational morphology, word formation.
     Slide 8. Morphemes vs. Structure.
       (Io) sono arrivato: I arrived
       (Loro) sono arrivati: They (You) arrived
       Il faut qu'il le fasse: He must do it
       Qu'il le fasse: I hope he does it
       Hans schwimmt gern: Hans likes swimming
       Sie können gern eins nehmen: Feel free to take one
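A minimal sketch of the morphological abstraction shown above, in which a surface form is replaced by a lemma plus features (dogs → dog, Plural) and an ambiguous form such as "sheep" yields more than one analysis. The names `EXCEPTIONS` and `analyze_noun` are made up for this example, and the fallback "strip a final s" rule is a deliberately crude assumption that mishandles words like "bus".

```python
# Irregular and ambiguous forms are listed explicitly (examples from the slide).
EXCEPTIONS = {
    "schemata": [("schema", "Plural")],
    "children": [("child", "Plural")],
    "sheep": [("sheep", "Singular"), ("sheep", "Plural")],
}

def analyze_noun(form):
    """Return a list of (lemma, number) analyses for an English noun form."""
    form = form.lower()
    if form in EXCEPTIONS:
        return EXCEPTIONS[form]
    # Crude regular rule (assumption): a final "s" marks the plural.
    if form.endswith("s") and len(form) > 1:
        return [(form[:-1], "Plural")]
    return [(form, "Singular")]

print(analyze_noun("dogs"))
print(analyze_noun("sheep"))
```

Downstream components then see only lemmas and features, which is the sense in which paradigms and exceptions no longer matter to them.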

  3. Slide 9. "I have to have this injection every week. It is quite painful, so I like to have it done on the weekend." "What do you do for exercise?" "I like swimming."
     Slide 10. "I have to have this injection every week. It is quite painful, so I like having it done on the weekend." "I like to swim."
     Slide 11. Syntactic Abstraction.
       They sent the final report to the minister
       They sent the minister the final report
       The final report, they sent to the minister
       To the minister they sent the final report
       The final report was sent to the minister (by them)
       Common structure: send (past). Agent: pro (human) (plur). Patient: report (def), modified by final. Recipient: minister (def).
     Slide 12. Syntactic Abstraction. How much abstraction is enough/too much? Information structure.
       John gave this perfect stranger a lot of money
       John gave a lot of money to this perfect stranger
       Broccoli, I cannot stand!
       One thing I cannot stand is broccoli.
       The more broccoli there is, the less I like it.
       It is Ivan that caused all the trouble in the first place.
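One way to picture the syntactic abstraction above is as a single predicate-argument frame from which several surface orders can be produced. The sketch below covers only the two ditransitive orders from the examples; the `Frame` class and `realizations` function are hypothetical names, and the verb is stored already inflected to keep the example short.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Frame:
    predicate: str   # surface verb form, already inflected (simplification)
    agent: str
    patient: str
    recipient: str

def realizations(f):
    """Return two surface orders that share the same abstract frame."""
    return [
        f"{f.agent} {f.predicate} {f.patient} to {f.recipient}",   # prepositional
        f"{f.agent} {f.predicate} {f.recipient} {f.patient}",      # double object
    ]

frame = Frame("sent", "They", "the final report", "the minister")
for sentence in realizations(frame):
    print(sentence)
```

The topicalized, passive, and cleft variants differ in information structure, which this frame does not record; that is the "how much abstraction is too much" question the slides raise.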

  4. Slide 13. Topicalization.
       His clever brother always stood in his light: Er stand immer im Schatten seines klugen Bruders
       He will not be here until Monday: Er wird erst Montag da sein
       Cela vous plaît?: Do you like that?
       Hans schwimmt gern: Hans likes swimming/to swim
     Slide 14. Other Levels. What does it mean in English/German?
     Slides 15-16. [Table of ✔ / ✘ / ? judgments for each question; the rotated column headings and the column alignment are not recoverable. Marks are listed in extraction order.]
       How did you get here? ? ✔ ✔ ✔ ✔ ?
       Where did you leave your wallet? ✔ ✔ ✔ ✘
       Where shall we put aunt Agatha? ✘ ✔
       Where is the fire extinguisher? ? ✔ ✔ ✔ ✘
       Where shall I put this cushion? ✔ ✔

  5. Slide 17. Syntax? Adjective order: Opinion, Size, Age, Shape, Color, Origin, Material, Purpose. Examples: Fine big old wooden storage boxes; little blue Mexican model; Funny round meeting room; farm vegetable product. How to classify organic, recursive, soft, running, …?
     Slide 18. The Vauquois Triangle (diagram). Levels of increasing abstraction between Source and Target: Phonology, Morphology, Syntax, Semantics.
     Slide 19. The Transfer Approach. Analyze to some level of abstraction L; Transfer; Generate.
     Slide 20. The Vauquois Triangle (diagram). Analysis on the Source side and Synthesis on the Target side, each passing through Phonology, Morphology, Syntax, Semantics, with Transfer between the two sides.
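The adjective-order constraint above (Opinion, Size, Age, Shape, Color, Origin, Material, Purpose) can be checked mechanically once each modifier has a category. In this sketch the category assignments in `CATEGORY` are hand-made assumptions for a few words from the examples; how to classify words like "organic" or "running" is exactly the open question the slide poses.

```python
ORDER = ["Opinion", "Size", "Age", "Shape", "Color", "Origin", "Material", "Purpose"]
RANK = {category: i for i, category in enumerate(ORDER)}

# Hypothetical hand-assigned categories for illustration only.
CATEGORY = {
    "fine": "Opinion", "funny": "Opinion",
    "big": "Size", "little": "Size",
    "old": "Age", "round": "Shape", "blue": "Color",
    "mexican": "Origin", "wooden": "Material", "storage": "Purpose",
}

def follows_order(modifiers):
    """True if the modifiers appear in non-decreasing category rank."""
    ranks = [RANK[CATEGORY[m.lower()]] for m in modifiers]
    return all(a <= b for a, b in zip(ranks, ranks[1:]))

print(follows_order(["Fine", "big", "old", "wooden", "storage"]))
print(follows_order(["wooden", "big", "old"]))
```

A word missing from `CATEGORY` raises `KeyError`, which keeps the unclassified cases visible rather than silently accepted.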

  6. Slide 21. Commercial Systems. Do not follow the model closely:
       Levels of abstraction are not strongly separated and are weakly formalized at best
       Generation levels are largely eliminated
     Commercial systems are almost entirely deterministic. Aim for speed.
     Slide 22. The Vauquois Triangle (diagram). Phonology, Morphology, Syntax, Semantics along the Abstraction axis; Analysis, Transfer; Source, Target.
     Slide 23. The Standard Approach (diagram). Source → shallow, ad hoc parse → Transformer → Target.
     Slide 24. Commercial Systems. Rely on:
       Tuning the lexicon to the domain
       Huge inventories of set phrases
       Selectional restrictions

  7. Slide 25. Assessment of the Standard Approach.
       • Robust
       • Can produce word salad
       • Ad hoc and hard to maintain
       • Bilingual and unidirectional
     Slide 26. Academic Approaches (diagram). Analysis, Transfer, Synthesis across Phonology, Morphology, Syntax, Semantics; Source, Target.
     Slide 27. Orthography. Easy technology ~ finite-state. English Morphographemics.
     Slide 28.
       die      dies      dying       died
       dye      dyes      dyeing      dyed
       coax     coaxes    coaxing     coaxed
       singe    singes    singeing    singed
       watch    watches   watching    watched
       develop  develops  developing  developed
       wash     washes    washing     washed
       stoop    stoops    stooping    stooped
       veto     vetoes    vetoing     vetoed
       enter    enters    entering    entered
       tie      ties      tying       tied
       bare     bares     baring      bared
       ski      skis      skiing      skied
       hop      hops      hopping     hopped
       play     plays     playing     played
       travel   travels   traveling   traveled
       travel   travels   travelling  travelled
       humbug   humbugs   humbugging  humbugged
       panic    panics    panicking   panicked
       bus      buses     bussing     bussed
       bus      buses     busing      bused
       hoe      hoes      hoeing      hoed
       pass     passes    passing     passed
       buzz     buzzes    buzzing     buzzed

  8. Slide 29.
       define sib [j | s | x | z | s h | c h] ;
       define consonant [ b | c | d | f | g | h | j | k | l | m | n | p | q | r | s | t | v | w | x | y | z ] ;
       define vowel [a | e | i | o | u] ;
       define boundary [.#. | %+] ;
       define optional [ %? (->) 0] ;
       define YtoIE [ y -> i e || consonant _ EM alpha] ;
       define IEtoY [ i e -> y || _ EM i ] ;
       define Edeletion1 [ e -> 0 || vowel consonant _ EM vowel ] ;
       define Edeletion2 [ e EM e -> EM e ] ;
       define Einsertion [ [..] -> e || [sib | o] (diacritic) EM _ s EM ] ;
       define gemination [ b -> b b, c -> c k, d -> d d, f -> f f, g -> g g,
                           l -> l l, m -> m m, n -> n n, p -> p p, r -> r r,
                           s -> s s, t -> t t || vowel _ EM vowel ] ;
       define DiacriticDeletion [ diacritic -> 0 ] ;
       define BoundaryDeletion [ [BM | EM] -> 0] ;
     Slide 30.
       define Word [[ preamble .o. optional .o. YtoIE .o. IEtoY .o. Einsertion .o.
                      gemination .o. DiacriticDeletion .o. Edeletion1 .o. Edeletion2 .o.
                      BoundaryDeletion] | 0 ];
     Slide 31. Morphology. Generally finite-state. Can be ambiguous, but not all that often. Prefix, suffix, infix, circumfix. Ablaut, umlaut, intercalation. Agglutinating, polysynthetic languages. Compounding.
     Slide 32. Morphology.
       English inflexion ~ easy, robust. Irregular and suppletive forms.
       English derivation ~ complex, fairly robust. Most people pretend it is not there. Occasional "syntactic" ambiguities: untiable, undoable. Segmentation ambiguities: unionize. Overgeneration: redecomposablizationally.
       Others can be hard: Bantu, Finnish, Sanskrit ...
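For readers without an xfst environment, the cascade of replace rules defined above (YtoIE, IEtoY, Einsertion, Edeletion1, Edeletion2, BoundaryDeletion) can be approximated with ordered regular-expression substitutions in Python. This is a rough sketch under stated assumptions, not a translation of the original transducer: "+" stands in for the EM boundary symbol, `alpha` is taken to mean any letter, the diacritic and `preamble` machinery is ignored, and gemination is left out because the slide's rule evidently relies on lexical marking that is not visible here (so "hop" + "ing" is not handled).

```python
import re

V = "aeiou"
C = "bcdfghjklmnpqrstvwxyz"

# Rules are applied once each, in the order of the composition on the slide.
RULES = [
    (rf"(?<=[{C}])y(?=\+[a-z])", "ie"),      # YtoIE:      try+s   -> trie+s
    (r"ie(?=\+i)", "y"),                     # IEtoY:      tie+ing -> ty+ing
    (r"([jsxzo]|sh|ch)\+s$", r"\1+es"),      # Einsertion: watch+s -> watch+es
    (rf"(?<=[{V}][{C}])e(?=\+[{V}])", ""),   # Edeletion1: bare+ing -> bar+ing
    (r"e\+e", "+e"),                         # Edeletion2: die+ed  -> di+ed
    (r"\+", ""),                             # BoundaryDeletion
]

def spell(stem, suffix):
    """Join a stem and a suffix ('s', 'ing', 'ed') and apply the spelling rules."""
    form = f"{stem}+{suffix}"
    for pattern, replacement in RULES:
        form = re.sub(pattern, replacement, form)
    return form

for stem, suffix in [("try", "s"), ("tie", "ing"), ("singe", "ing"), ("veto", "s")]:
    print(stem, suffix, spell(stem, suffix))
```

As with the composed transducer, rule order matters: YtoIE first turns try+ing into trie+ing, and IEtoY then restores the y, which is why "trying" comes out unchanged.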

  9. Slide 33. What to do with Morphology?
       • Type/token ratio
       • POS Tag
       • Shallow Syntax: NP Chunking
       • Deep Syntax
     Slide 34. Deep(?) Syntax.
       • Probabilistic phrase structure/dependency grammar
       • Dependency parsing
       • LFG/HPSG/CCG ...
     Slide 35. Deep Syntax.
       • Hugely ambiguous. Gepard: average ambiguity over a corpus of newspaper text (avg. 11.43 words): 78 readings
       • Not robust. Language boundary is not well defined; subcategorization; "constructions"
     Slide 36. Shallow Parsing.
       • Captures local phenomena at best
       • Fast: essentially finite-state
       • Result may not be grammatical
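The "type/token ratio" item above is simple to compute. The sketch below takes one plausible reading of the bullet, namely that replacing word forms by their stems lowers the ratio and so reduces data sparseness; that reading, the function name `type_token_ratio`, and the toy word lists are assumptions for the example.

```python
def type_token_ratio(tokens):
    """Number of distinct tokens divided by the total number of tokens."""
    if not tokens:
        raise ValueError("type/token ratio is undefined for an empty sequence")
    return len(set(tokens)) / len(tokens)

surface = ["walk", "walks", "walking", "try", "tries", "trying"]
stems = ["walk", "walk", "walk", "try", "try", "try"]  # after morphological analysis

print(type_token_ratio(surface))
print(type_token_ratio(stems))
```

On this toy input every surface form is its own type, while the stemmed version has only two types for six tokens.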
