Tecto to AMR and translation
Ondřej Bojar, Silvie Cinková, Ondřej Dušek, Tim O'Gorman, Martin Popel, Roman Sudarikov, Zdeňka Urešová
August 1, 2014
Introduction
Motivation
◮ We are investigating the value of parallel Abstract Meaning Representations (AMRs).
◮ Question 1: How similar are AMRs made in different languages? How do you compare them?
◮ Question 2: How could we get a large corpus of parallel AMRs?
AMRICA
◮ (AMR Inspector with Cross-language Alignment)
◮ The usual evaluation and alignment methods break across languages.
◮ An extension of Smatch (Cai & Knight 2012).
Smatch Classic
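Smatch scores a pair of AMRs by the triple-overlap F1 under the best mapping between their variables. A minimal sketch of just the scoring step, assuming the variable mapping is already fixed (real Smatch hill-climbs over candidate mappings; all variable names and triples below are illustrative):

```python
def smatch_f1(triples_a, triples_b, var_map):
    """Triple-overlap F1 between two AMRs, given a fixed mapping from
    variables of graph A to variables of graph B.
    (Real Smatch searches over such mappings by hill climbing.)"""
    mapped = {(var_map.get(s, s), r, var_map.get(t, t))
              for s, r, t in triples_a}
    matched = len(mapped & set(triples_b))
    if matched == 0:
        return 0.0
    precision = matched / len(triples_a)
    recall = matched / len(triples_b)
    return 2 * precision * recall / (precision + recall)

# Two AMRs of "Peter is eager to please" with different variable names.
a = {("e", "instance", "eager-41"), ("p", "instance", "person"),
     ("pl", "instance", "please-01"), ("e", "arg0", "p"), ("e", "arg1", "pl")}
b = {("x", "instance", "eager-41"), ("y", "instance", "person"),
     ("z", "instance", "please-01"), ("x", "arg0", "y"), ("x", "arg1", "z")}
print(smatch_f1(a, b, {"e": "x", "p": "y", "pl": "z"}))  # 1.0
```

With the correct mapping the two graphs match exactly; a wrong mapping only salvages the triples whose images happen to coincide.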
AMRICA
T-layer to AMR conversion
◮ PCEDT: a large parallel corpus (WSJ) annotated with the tectogrammatical layer (t-layer) for English and Czech.
◮ T-layer-to-AMR conversion would provide a large static parallel AMR corpus.
◮ It could also be used dynamically to turn a t-layer parser into an AMR parser.
Why this might work
◮ AMR and the t-layer are very similar:
◮ Both abstract away from syntax.
◮ Both represent all semantic links in a sentence as a graph.
◮ Both annotate coreference.
◮ There are various minor structural differences.
◮ AMR is more abstract and makes more inferences.
“Peter is eager to please”
[Figure: t-tree — be.ENUNC heads Peter (ACT) and eager (PAT); please (PAT of eager) has a generated ACT node #Cor, coreferent with Peter, and a generic PAT #Gen — aligned with the AMR: eager-41 with arg0 (person :name (name :op1 "Peter")) and arg1 (please-01 :arg0 person).]
Merging of Coreferent Nodes
[Figure: the coreference edge between #Cor and Peter is collapsed — please's ACT slot now points directly at the Peter node, matching the AMR's reentrant person node.]
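The coreference-merging step can be sketched as a single edge-rewriting pass over the graph's triples (a hypothetical sketch with illustrative node names, not the actual conversion code):

```python
def merge_coreferent(edges, keep, drop):
    """Collapse a coreference pair: drop the coreference edge itself and
    redirect every edge touching `drop` onto `keep`."""
    merged = set()
    for src, rel, tgt in edges:
        if rel == "coref":
            continue
        src = keep if src == drop else src
        tgt = keep if tgt == drop else tgt
        merged.add((src, rel, tgt))
    return merged

# "Peter is eager to please": the generated node #Cor is coreferent
# with Peter, so please's ACT slot is redirected to Peter.
t_edges = {("eager", "PAT", "please"), ("eager", "ACT", "Peter"),
           ("please", "ACT", "#Cor"), ("#Cor", "coref", "Peter")}
print(merge_coreferent(t_edges, "Peter", "#Cor"))
```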
Elimination of semantically light words
[Figure: the semantically light copula be.ENUNC and the generated node #Gen are removed; eager becomes the top node, matching the AMR's eager-41.]
Semantic Roles and Senses
[Figure: t-layer functors are mapped to AMR roles (ACT → arg0, PAT → arg1) and lemmas to senses (eager → eager-41, please → please-01).]
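The role-and-sense step amounts to table look-ups; a toy sketch with hypothetical mapping tables (a full system would condition the role mapping on the predicate's valency frame):

```python
# Hypothetical look-up tables: t-layer functors -> AMR roles and
# lemmas -> PropBank-style senses (illustrative entries only).
ROLE_MAP = {"ACT": "arg0", "PAT": "arg1"}
SENSE_MAP = {"eager": "eager-41", "please": "please-01"}

def map_roles_and_senses(edges):
    """Relabel edges with AMR roles and nodes with sensed concepts;
    anything missing from the tables is passed through unchanged."""
    return {(SENSE_MAP.get(s, s), ROLE_MAP.get(r, r), SENSE_MAP.get(t, t))
            for s, r, t in edges}

print(map_roles_and_senses({("eager", "ACT", "Peter"),
                            ("eager", "PAT", "please")}))
```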
Add Named Entities
[Figure: the bare node Peter is expanded into the AMR named-entity structure person :name (name :op1 "Peter").]
Conversion Procedures
◮ Converted t-trees to AMR format.
◮ Added named entities using NER systems (Stanford NER and NameTag).
◮ Tried two strategies for making more complex changes to the graphs:
◮ PML-TQ
◮ Tsurgeon
◮ List-based verbalization and semantic role mapping.
PML-TQ rules
◮ Based on the AMR guidelines (generalized).
◮ For copulas, attributes, non-core roles, . . .
[Figure: a PML-TQ rule. LHS (PML-TQ query), conditions on a t-subtree: a t-node b with t_lemma in {be, become, remain}; an ACT child with formeme n:.*; a PAT child with formeme adj:compl, linked to a surface a-node with tag IN and governing a further PAT t-node. RHS (AMR subtree): a root r with ARG0 and ARG1 children.]
Guidelines example: The boy is responsible for the work.
PML-TQ rules
[Figure: rule application. Matching t-tree (zone=en): be.enunc (PRED, v:fin) with children Ondrej (ACT, n:subj), nervous (PAT, adj:compl), and presentation (PAT, n:about+X). Conversion result: n/nervous with ARG0 p2/person :name (n2/name :op1 "Ondrej") and ARG1 p/presentation.]
Matching sentence: Ondrej was nervous about the presentation.
Tsurgeon tree transformation rules
◮ We converted the graphs to constituency trees so that we could use a tree transformation tool, Tsurgeon (Levy and Andrews 2006), to quickly implement hand-written rules.
Tsurgeon tree transformation rules
◮ Many of the structural differences are just notational differences:
[Figure: t-layer coordination — "and" (CONJ) heads banana and apple, each marked PAT of eat — vs. AMR — eat :arg1 (and :op1 banana :op2 apple).]
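Such a notational rewrite of coordination can be sketched as a recursive pass over a nested-dict tree (a hypothetical representation for illustration, not the actual Tsurgeon rules):

```python
def convert_coordination(node):
    """Rewrite t-layer coordination (the conjunction heads the conjuncts,
    each carrying the shared role, e.g. PAT) into AMR style (the shared
    role points at the conjunction; conjuncts become op1, op2, ...)."""
    for child in node.get("children", []):
        convert_coordination(child)
        if child.get("functor") == "CONJ":
            child["functor"] = child["children"][0]["functor"]  # shared role
            for i, conjunct in enumerate(child["children"], start=1):
                conjunct["functor"] = f"op{i}"
    return node

# "eat ... bananas and apples"
tree = {"lemma": "eat", "children": [
    {"lemma": "and", "functor": "CONJ", "children": [
        {"lemma": "banana", "functor": "PAT"},
        {"lemma": "apple", "functor": "PAT"}]}]}
convert_coordination(tree)
```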
List-based Methods
◮ Verbalizations are based on dictionary look-ups:
◮ beekeeper → person :ARG0-of keep-01 :ARG1 bee
◮ As are complex predications:
[Figure: the light-verb construction give (ACT John, CPHR blessing, APP Mary) converts to bless (ACT John, PAT Mary).]
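A verbalization look-up is just a dictionary from lemmas to AMR fragments; a toy sketch (the entries and PENMAN strings are illustrative, not the project's actual lists):

```python
# Hypothetical verbalization list mapping lemmas to AMR fragments
# (in PENMAN notation); unlisted lemmas fall back to a plain concept.
VERBALIZE = {
    "beekeeper": "(p / person :ARG0-of (k / keep-01 :ARG1 (b / bee)))",
}

def verbalize(lemma):
    return VERBALIZE.get(lemma, f"({lemma[0]} / {lemma})")

print(verbalize("beekeeper"))
```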
Using Existing Resources
Available resources: other WSJ annotation, the PropBank lexicon, lexical lists, Vallex.
◮ Map t-layer roles to AMR roles (3 of the resources)
◮ Verbalize nouns/adjectives (2 resources)
◮ Introduce inferrable predicates (1 resource)
◮ Named Entities (2 resources)
Results of EN t-to-AMR Conversion

System                        Role map.  Verb. lists  NEs | Smatch  Smatch w/o senses
Baseline (direct conversion)      -          -         -  |   20          28
Baseline (direct conversion)      X          -         -  |   33          41
Baseline (direct conversion)      X          X         -  |   37          45
Baseline (direct conversion)      X          X         X  |   40          48
PML-TQ (guidelines-based)         X          X         -  |   35          43
PML-TQ (guidelines-based)         X          X         X  |   38          47
Tsurgeon (rule-based)             X          X         X  |   44          52
JAMR                              -          -         -  |   44          45