
Machine Translation: Classical and Statistical Approaches. Session 2: Syntactic Transfer

Session 2: Syntactic Transfer. Steps: analysis, transfer, generation. How are the various types of divergence dealt with?


  1. Machine Translation: Classical and Statistical Approaches
     Session 2: Syntactic Transfer
     Jonas Kuhn
     Universität des Saarlandes, Saarbrücken / The University of Texas at Austin
     jonask@coli.uni-sb.de
     DGfS/CL Fall School 2005, Ruhr-Universität Bochum, September 19-30, 2005

     Session 2: Syntactic Transfer
     • Syntactic Transfer
       • Steps: Analysis, Transfer, Generation
       • How are the various types of divergence dealt with?
     • For lab exercise: Quick Prolog Intro/Recap
       • Basic Prolog Terminology and Syntax
       • Lists and Definite Clause Grammars (DCGs)

     Syntactic Transfer
     [Diagram: source string and its syntactic structure; target syntactic structure and its string]
     • Source-language syntactic analysis: construct SL analysis tree
     • Transfer: tree-to-tree transformations applied recursively to the SL tree: construct TL tree
       • Recursive, non-deterministic top-down process
     • (No syntactic generation required in TL)
     • Morphological generation
     • Consolidation: applying TL grammar constraints to the TL structure to enforce grammaticality (and fill in underspecified values)

     Syntactic Transfer: Resources
     • The translation process is governed by three sets of rules
       • Standard grammar specification for source language analysis (e.g., context-free grammars)
       • Transfer "grammar": transformation rules
         • Include translation variables (e.g., tv(X) in Trujillo's notation)
         • The set of transformation rules is applied recursively to each occurrence of translation variables
       • Standard grammar specification for target language generation

  2. Transfer grammar
     English grammar:
       NP → Det N1
       N1 → Adj N
       Det → a
       Adj → delicious
       N → soup
     Spanish grammar:
       NP → Det N1
       N1 → N Adj
       Det → una
       Adj → deliciosa
       N → sopa
     Transfer rules:
       NP[ tv(X) tv(Y) ] ⇔ NP[ tv(X) tv(Y) ]
       N1[ Adj:tv(A) N:tv(B) ] ⇔ N1[ N:tv(B) Adj:tv(A) ]
       Det: a ⇔ una
       Adj: delicious ⇔ deliciosa
       N: soup ⇔ sopa

     Example: Tree-to-tree transformation
       English: NP( Det(a), N1( Adj(delicious), N(soup) ) )
       Spanish: NP( Det(una), N1( N(sopa), Adj(deliciosa) ) )

     Transformations: Prolog notation
     • Actual Prolog code by Trujillo (slightly different structural analysis than in the textbook)
     • We will come back to the details of this notation…

       [np|_]/_ dtrs [ DetE, N1E ] <==>
       [np|_]/_ dtrs [ DetS, N1S ] :-
           DetE <==> DetS,
           N1E <==> N1S.

       [n1|_]/_ dtrs [ [ap|_]/_ dtrs [ AdjvE ], [n1|_]/_ dtrs [ NE ]] <==>
       [n1|_]/_ dtrs [ [n1|_]/_ dtrs [ NS ], [ap|_]/_ dtrs [ AdjvS ]] :-
           AdjvE <==> AdjvS,
           NE <==> NS.

       [n|_]/soup <==> [n|_]/sopa.
       [adjv|_]/delicious <==> [adjv|_]/deliciosa.
       [det|_]/a <==> [det|_]/una.

     Divergences in syntactic transfer
     • Thematic divergence
       • En: You like her
       • Sp: Ella te gusta
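The transfer rules for "a delicious soup" can be approximated in plain Prolog without Trujillo's operator notation. The following is only a minimal sketch under stated assumptions: the predicate name `transfer/2` and the tree encoding (`np/2`, `n1/2`, `det/1`, `adj/1`, `n/1` terms) are hypothetical and are not taken from Trujillo's code.

```prolog
% Hypothetical sketch: transfer(EnglishTree, SpanishTree).
% Trees are ordinary Prolog terms, not the Cat/Word "dtrs" notation above.

% Structural rule: NP keeps its daughter order; recurse into both daughters
% (these play the role of the translation variables tv(X), tv(Y)).
transfer(np(DetE, N1E), np(DetS, N1S)) :-
    transfer(DetE, DetS),
    transfer(N1E, N1S).

% Structural rule: English Adj-N order corresponds to Spanish N-Adj order.
transfer(n1(AdjE, NE), n1(NS, AdjS)) :-
    transfer(AdjE, AdjS),
    transfer(NE, NS).

% Lexical rules.
transfer(det(a), det(una)).
transfer(adj(delicious), adj(deliciosa)).
transfer(n(soup), n(sopa)).
```

With this sketch loaded, the query `?- transfer(np(det(a), n1(adj(delicious), n(soup))), T).` binds `T` to `np(det(una), n1(n(sopa), adj(deliciosa)))`. Because the rules are stated as a relation, the same clauses can also be queried with the Spanish tree given and the English tree left as a variable.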

  3. Divergences in syntactic transfer
     • Head switching
       • En: The baby just ate
       • Sp: El bebé acaba de comer

     Divergences in syntactic transfer
     • Structural
       • En: Luisa entered the house
       • Sp: Luisa entró a la casa

     Divergences in syntactic transfer
     • Categorial
       • En: a little bread
       • Sp: un poco de pan

     Divergences in syntactic transfer
     • Lexical gaps (conflational divergence)
       • En: Camillo got up early
       • Sp: Camillo madrugó

  4. Divergences in syntactic transfer
     • Lexicalization (lexical divergence)
       • En: Susan swam across the channel
       • Sp: Susan cruzó el canal nadando

     Divergences in syntactic transfer
     • Collocational
       • En: Jan made a decision
       • Sp: Jan tomó/*hizo una decisión

     Divergences in syntactic transfer
     • Idiomatic
       • En: Socrates kicked the bucket
       • Sp: Socrates estiró la pata

     Quick Prolog Intro/Recap
     • Compare: Blackburn, Bos & Striegnitz: Learn Prolog Now! [www.coli.uni-sb.de/~kris/learn-prolog-now/]
     • Freely available Prolog system: SWI Prolog
       • Developed since 1987 at the University of Amsterdam, The Netherlands
       • http://www.swi-prolog.org/
       • Available for MS-Windows, Mac, and Linux
     • Logic programming, i.e., a Prolog program is (mostly) not a sequence of commands, but a set of facts and rules used to prove or refute new facts

  5. Interpreter and knowledge base
     • How we communicate with the system
       • Knowledge base (file we can edit)
           woman(mia).
           playsAirGuitar(jody).
       • Interpreter (shell in which we can type queries)
           ?-
           ?- woman(mia).
           Yes
     • In order to use a knowledge base, we have to load or consult one
           ?- ['my-knowledge-base-file.pl'].
       • With SWI running under MS Windows, the File menu can be used
     • To quit the interpreter at the end of your session type
           ?- halt.

     Terminology
     • Knowledge base:
       • Facts
       • Rules: inference rules to derive new facts from given facts
           listensToMusic(yolanda) :- happy(yolanda).
         Read: "If Yolanda is happy, then she listens to music."
       • Facts and rules define predicates. Examples: happy, listensToMusic
     • Interpreter:
       • Query: a clause for which we ask: is there a proof from the knowledge base?

     Prolog rules
     • A predicate definition may consist of several clauses
       • Disjunctive interpretation
           playsAirGuitar(butch) :- happy(butch).
           playsAirGuitar(butch) :- listensToMusic(butch).
     • Each clause ends in a period
     • The condition part (right-hand side) of a rule may contain several terms
       • Conjunctive interpretation
           playsAirGuitar(vincent) :-
               listensToMusic(vincent),
               happy(vincent).
     • The consequence part (left-hand side) may only contain one term

     Variables
     • Capitalized identifiers are interpreted as variables (undergoing unification)
           woman(mia).
           woman(jody).
           woman(yolanda).
           loves(vincent,mia).
           loves(marcellus,mia).
           loves(pumpkin,honey_bunny).
           loves(honey_bunny,pumpkin).

           ?- woman(X).
           X=mia ;
           X=jody
     • Hitting semicolon tells Prolog to find alternative solutions
       • Backtracking
           jealous(X,Y) :- loves(X,Z), loves(Y,Z).

           ?- jealous(marcellus,W).
           W=vincent
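The knowledge base from the Variables slide can be saved to a file and consulted as is. The sketch below repeats those facts and the `jealous/2` rule, and adds two helper predicates that are not in the slides (`all_women/1` and `jealous_of/2`, both hypothetical names). They use the standard built-in `findall/3` to collect in one list every answer that repeatedly typing `;` would enumerate.

```prolog
% Facts from the slides.
woman(mia).
woman(jody).
woman(yolanda).

loves(vincent, mia).
loves(marcellus, mia).
loves(pumpkin, honey_bunny).
loves(honey_bunny, pumpkin).

% Rule from the slides: X is jealous of Y if both love the same Z.
jealous(X, Y) :-
    loves(X, Z),
    loves(Y, Z).

% Helpers (not in the slides): gather all solutions found by backtracking.
all_women(Ws) :-
    findall(X, woman(X), Ws).

jealous_of(X, Ys) :-
    findall(Y, jealous(X, Y), Ys).
```

The query `?- all_women(Ws).` gives `Ws = [mia, jody, yolanda]`, and `?- jealous_of(marcellus, Ys).` gives `Ys = [vincent, marcellus]`. The second answer shows that, as written, the rule also makes Marcellus jealous of himself, since nothing requires X and Y to be different.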

  6. Variables
     • The match predicate "=" can be used to state that two things are the same
           jealous(X,Y) :- loves(X,U), loves(Y,V), U=V.
     • Normally, variables are simply re-used in predicate definitions in order to express that two argument positions have to be the same
     • When a variable is used just once, this is often due to a typo
       • Prolog will issue a warning for variables used only once in a clause
       • To suppress the warning, a leading underscore can be used
           in_love(X) :- loves(X,_Someone).

     Variables
     • Special variable: _ (the "anonymous" variable)
       • Can match any arbitrary value, even if used several times in the same clause!
           in_love(X) :- loves(X,_).

     Prolog survival guide
     • Clauses (facts/rules/queries) end in a period
     • Uppercase identifiers are variables; functors/atoms have to start with a lowercase letter!
     • Prolog variables are logical variables tied to a particular value within the scope of a clause (unlike variables in other programming languages, where values of variables can be changed)
     • Don't forget to consult your knowledge base (and to re-consult it after making changes)
     • To exit the Prolog interpreter type
           ?- halt.
       (and don't forget the period!)

     Prolog lists
     • Important data structure for linguistic tasks
     • List elements can be enumerated within brackets
           [fred, ann, pete]
     • Special case: the empty list: []
     • For flexible access to list elements, Prolog has a built-in operator for decomposing lists into head and tail: the "|" operator
           ?- [ X | Y ] = [fred, ann, pete].
           X = fred
           Y = [ann, pete]
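The head/tail operator and the anonymous variable are typically used together. As a small illustration (the predicate names `first/2`, `second/2` and `last_element/2` are made up for this sketch and are not from the slides):

```prolog
% first(List, X): X is the head of a non-empty list; the tail is ignored.
first([X|_], X).

% second(List, X): X is the second element; the first element and the
% remaining tail are both matched by (independent) anonymous variables.
second([_, X|_], X).

% last_element(List, X): walk down the tail until one element is left.
last_element([X], X).
last_element([_|T], X) :-
    last_element(T, X).
```

For the list `[fred, ann, pete]`, `first/2` yields `fred`, `second/2` yields `ann`, and `last_element/2` yields `pete`. All three predicates fail on the empty list, because `[]` cannot be decomposed into a head and a tail.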

  7. Prolog lists
     • Lists are typically manipulated in recursive predicates
           trans(eins,one).
           trans(zwei,two).
           trans(drei,three).

           trans_list([],[]).
           trans_list([H|T],[H1|T1]) :-
               trans(H,H1),
               trans_list(T,T1).
     • Example application:
           ?- trans_list([zwei,eins,drei],X).
           X = [two,one,three]

     Built-in list predicates
     • Some important, generic list predicates are predefined in most Prolog versions ("built-in")
     • member/2
       • member(X,L) is true if and only if X is an element of the list L
       • Examples: member(b,[a,b,c]), member([2,3],[1,[2,3]])
     • append/3
       • append(L1,L2,L3) is true if and only if L3 is the concatenation of lists L1 and L2
       • Examples: append([a],[b,c],[a,b,c]), append([],[1,2],[1,2])
     • reverse/2
       • reverse(L1,L2) is true if and only if L1 is the reversed version of list L2
       • Examples: reverse([a,b,c],[c,b,a])
     • length/2
       • length(L,N) is true if and only if the integer N is the length (number of elements) of list L
       • Examples: length([a,b,c],3), length([],0)

     Definite Clause Grammars (DCGs)
     • Simple built-in grammar formalism
     • Rewrite rules for (augmented) context-free grammars
           s --> np, vp.
           np --> det, n.
           vp --> v, np.
           vp --> v.
           det --> [the].
           n --> [dog].
           v --> [barks].

     Definite Clause Grammars (DCGs)
     • Internally, the rewrite rule notation is compiled out as follows (using a "difference list notation" for phrase coverage):
           s(X,Z) :- np(X,Y), vp(Y,Z).
           np(X,Z) :- det(X,Y), n(Y,Z).
           vp(X,Z) :- v(X,Y), np(Y,Z).
           vp(X,Z) :- v(X,Z).
           det([the|T],T).
           n([dog|T],T).
           v([barks|T],T).
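To try the DCG from the slides, it is enough to consult the rewrite rules and query them through the built-in `phrase/2` and `phrase/3`. The sketch below assumes a system with standard DCG support such as SWI Prolog. The clauses a given system actually generates may differ in detail from the compiled form shown on the slide, but the two extra list arguments behave as described there.

```prolog
% Grammar from the slides, in DCG notation.
s   --> np, vp.
np  --> det, n.
vp  --> v, np.
vp  --> v.
det --> [the].
n   --> [dog].
v   --> [barks].
```

Example queries, assuming the grammar is loaded:

```prolog
?- phrase(s, [the, dog, barks]).            % succeeds via vp --> v
?- phrase(s, [the, dog, barks, the, dog]).  % succeeds via vp --> v, np
?- phrase(s, [dog, the, barks]).            % fails: no np at the start
?- phrase(np, [the, dog, barks], Rest).     % Rest = [barks]
?- s([the, dog, barks], []).                % same as the first query
```

The last query calls the compiled predicate directly: the first argument is the input list and the second is what remains after an `s` has been recognized, which must be the empty list for a complete sentence.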
