
Designing and improving FRMG, a wide coverage French meta-grammar



  1. Designing and improving FRMG, a wide coverage French meta-grammar
     http://alpage.inria.fr
     Éric de la Clergerie <Eric.De_La_Clergerie@inria.fr>
     INRIA Paris-Rocquencourt / Univ. Paris Diderot
     A Coruña, 9 October 2012

  2. Introduction
     FRMG is a wide coverage French meta-grammar:
     - fast initial development in 2005 (EASy campaign)
     - continuous improvement since then
     - now usable at large scale with good coverage and accuracy
     - online demo at http://alpage.inria.fr/parserdemo
     - note: SPMG exists for Spanish (Victoria project)
     Objectives of this talk:
     (1) provide some background on TAGs, tree factoring and meta-grammars
     (2) present FRMG
     (3) illustrate the descriptive power of meta-grammars on a class of complements
     (4) present some results on the improvement of FRMG

  3. Outline
     (1) Designing
     (2) Using
     (3) Improving

  4. Pros and cons of TAGs
     Tree Adjoining Grammars provide the advantage of an extended domain of locality:
     - subcategorization frames
     - long distance dependencies (extractions/movements)
     The drawback is an explosion in the number of trees:
     - several (tens of) thousands of trees ⇒ efficiency problems during parsing
     - many common sub-trees ⇒ development and maintenance problems
     [Figure: example TAG trees (nodes S, S', NP0, NP1, VP, V; anchors "que" and "mange"; feature t.wh=+) sharing common sub-trees]

  5. Solutions
     For FRMG, the choice is:
     - to use factoring operators within trees, although complex factorized trees are difficult to write directly
     - to use a meta-grammar: a modular and factorized description of syntactic phenomena, from which the factorized trees are generated

  6. Tree factoring
     Principle: combine several trees into a single one, sharing common subparts, either through
     several traversal paths in a tree (Harbusch) or through regular operators within trees (DyALog):
     - disjunctions: $T[t_1; t_2] \equiv T[t_1] \cup T[t_2]$
     - repetitions (Kleene stars): $T[t@*] \equiv \{ T[\epsilon], T[t], T[(t, t)], \ldots \}$
     - interleaving (free ordering within sequences): $(t_1, t_2) \#\# t_3 \equiv (t_1, t_2, t_3;\ t_1, t_3, t_2;\ t_3, t_1, t_2)$
     - optionality: $t? \equiv (t; \epsilon)$
     - guards (nodes with guards): $T[G_+, t; G_-] \equiv T[t].\sigma_+ \cup T[\epsilon].\sigma_-$
       where guards are Boolean formulae over equations on feature paths and values
     Tree factoring does not change the expressive power or the complexity of TAGs,
     but unfactoring ⇒ exponential increase in the size of the grammar.
     Factorized trees are used directly by DyALog.
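
The operators on this slide can be illustrated with a small Python sketch that "unfactors" a factored sequence into the set of plain sequences it stands for. This is only an illustration under simplifying assumptions: it works on flat sequences of node labels rather than on trees, it ignores guards, and the names `seq`, `alt`, `opt`, `star`, `shuffle` and `expand` are hypothetical helpers, not part of DyALog. The Kleene star is bounded by `max_rep` so that the result stays finite.

```python
# Toy "unfactoring" of regular operators over sequences of node labels.
# All helper names are hypothetical; this is not DyALog's representation.

def seq(*parts):
    return ("seq", parts)          # ordered sequence (t1, t2, ...)

def alt(*parts):
    return ("alt", parts)          # disjunction t1 ; t2 ; ...

def opt(part):
    return alt(part, seq())        # t? == (t ; epsilon)

def star(part, max_rep):
    # Bounded Kleene star: 0..max_rep repetitions of `part`.
    return alt(*[seq(*[part] * n) for n in range(max_rep + 1)])

def shuffle(*parts):
    return ("shuffle", parts)      # interleaving t1 ## t2 ## ...

def interleave(a, b):
    """All merges of tuples a and b that keep each tuple's internal order."""
    if not a:
        return {b}
    if not b:
        return {a}
    return ({(a[0],) + rest for rest in interleave(a[1:], b)}
            | {(b[0],) + rest for rest in interleave(a, b[1:])})

def expand(expr):
    """Return the set of plain label sequences denoted by a factored expression."""
    if isinstance(expr, str):
        return {(expr,)}
    kind, parts = expr
    if kind == "alt":
        return set().union(*(expand(p) for p in parts))
    if kind == "seq":
        results = {()}
        for p in parts:
            results = {r + s for r in results for s in expand(p)}
        return results
    if kind == "shuffle":
        results = {()}
        for p in parts:
            results = {m for r in results for s in expand(p)
                       for m in interleave(r, s)}
        return results
    raise ValueError(f"unknown operator: {kind}")

# (t1, t2) ## t3 yields the three orderings listed on the slide.
for sequence in sorted(expand(shuffle(seq("t1", "t2"), "t3"))):
    print(sequence)
```

Each operator adds only a constant amount to the factored description, while `expand` may return a set whose size is the product of the sizes of its sub-expansions, which is the blow-up the slide warns about.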

  7. Disjunction
     Several possible realizations for the subject (NP, cln, S, ...)
     [Figure: three separate trees S → subject VP (VP → V(✸v) ↓NP1), with the subject realized as ↓NP0, ⋄cln, or ↓S (t.mode=inf), factored into a single tree whose subject node is an alternative among ↓NP0, ↓S and ⋄cln]

  8. Guards
     In French, the subject may be missing under certain conditions.
     [Figure: factored tree S → (subject alternative: ↓NP0 | ⋄cln | ↓S) VP (V(✸v) ↓NP1); the subject is guarded: present when V.top.mode = ¬(inf|imp|part), absent when V.top.mode = inf|imp|part]

  9. Free ordering
     The verbal arguments are not ordered (rough approximation).
     [Figure: factored tree S → (subject alternative: ↓NP0 | ↓S | ⋄cln) VP, where VP contains V(✸v) and the arguments ↓NP1 and ↓PP2 in free order (##)]
     Unfactoring: (1 no subject + 3 subjects) × (1 no argument + 2 with one argument + 2 with two arguments) = 20 trees
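
The count of 20 unfactored trees can be checked by brute-force enumeration. The sketch below assumes, as the arithmetic on the slide suggests, that the subject is either absent or one of three realizations, and that the two arguments are each optional and freely ordered with respect to each other. The function names are hypothetical.

```python
from itertools import permutations

# Subject: absent, or one of the three realizations shown on the slide.
SUBJECTS = [None, "NP0", "cln", "S"]
# Verbal arguments: both optional, in any relative order.
ARGS = ["NP1", "PP2"]

def argument_orders(args):
    """Every ordered selection of 0, 1, ..., len(args) arguments."""
    orders = []
    for k in range(len(args) + 1):
        orders.extend(permutations(args, k))
    return orders

def unfactored_trees():
    """One (subject, argument order) pair per unfactored tree."""
    return [(subj, order)
            for subj in SUBJECTS
            for order in argument_orders(ARGS)]

print(len(argument_orders(ARGS)))  # 1 + 2 + 2 = 5
print(len(unfactored_trees()))     # 4 * 5 = 20
```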

  10. Repetition (Kleene star)
     Natural description of coordination by repetition.
     [Figure: coordination tree rooted at NP, with foot node ⋆NP1, a starred (@*) sequence of "," ↓NP2, the coordination anchor ✸coo, and ↓NP3]

  11. Meta-Grammars
     Definition (meta-grammar): a modular description with classes grouping constraints, and with inheritance.

  12. Meta-Grammars (continued)
     Elements of a class:
     - Inheritance (<:)
     - Constraints
       - dominance (>> and >>+)
       - precedence (<)
       - equality (=)
       - decorations (FS) on nodes and on classes
       - equations between paths (.), starting from a node (node psubj), the class (desc), or a variable ($arg)
     - Resources (+) / Needs (−)
       - namespaces (::)
     - Guards (=>)
     Example:
         class collect_real_subject_canonical {
           <: collect_real_subject;
           $arg.extracted = value(~cleft);
           S >> VSubj; V >> psubj;
           VSubj < V; VMod < psubj;
           node psubj : [cat: N2, id: subject,
                         top: [wh: -, sat: +]];
           - psubj::agreement; psubj = psubj::N;
           psubj =>
             node(Infl).bot.inv = value(+),
             $arg.extracted = value(-),
             $arg.real = value(N2),
             desc.extraction = value(~-),
             node(V).top.mode = value(~inf|imp|...);
           ~psubj => node(Infl).bot.inv = value(~+);
         }

  13. Compiling Meta-Grammars
     MGCOMP, developed with DyALog.
     Step 1: Terminal classes
     - constraint inheritance by the terminal classes (+ constraint checking)
     Step 2: Neutral classes
     - crossing of terminal classes to neutralize resources and needs:
       $C_1[-R \cup K_1] \times C_2[+R \cup K_2] = (C_1 \times C_2)[{=}R \cup K_1 \cup K_2]$
     - namespaces ⇒ import of the producing class with renaming:
       $C_1[-N{::}R \cup K_1] \times C_2[+R \cup K_2] = (C_1 \times N{::}C_2)[{=}N{::}R \cup K_1 \cup N{::}K_2]$
     - guard reduction (whenever possible)
     - constraint checking
     Step 3: TAG trees
     - use of the constraints of neutral classes to build trees
     - underspecified precedence between sibling nodes ⇒ interleaving
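
The crossing rule of Step 2 can be sketched in Python as follows. This is a toy model, not MGCOMP's implementation: classes are reduced to sets of provided resources (+R), needed resources (−R), neutralized resources (=R) and constraints (K), namespaces and guards are left out, and the class `adverb_modifier` with its constraint string is a made-up example.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class MGClass:
    """Toy meta-grammar class; field names are assumptions for illustration."""
    name: str
    provides: frozenset = frozenset()     # resources (+R)
    needs: frozenset = frozenset()        # needs (-R)
    neutralized: frozenset = frozenset()  # matched resources (=R)
    constraints: frozenset = frozenset()  # constraints (K)

    def is_neutral(self):
        # Neutral: nothing left to provide and nothing left to require.
        return not self.provides and not self.needs

def cross(consumer, producer, resource):
    """C1[-R u K1] x C2[+R u K2] = (C1 x C2)[=R u K1 u K2]."""
    if resource not in consumer.needs:
        raise ValueError(f"{consumer.name} does not need {resource}")
    if resource not in producer.provides:
        raise ValueError(f"{producer.name} does not provide {resource}")
    return MGClass(
        name=f"{consumer.name}x{producer.name}",
        provides=consumer.provides | (producer.provides - {resource}),
        needs=(consumer.needs - {resource}) | producer.needs,
        neutralized=consumer.neutralized | producer.neutralized | {resource},
        constraints=consumer.constraints | producer.constraints,
    )

# Hypothetical consumer class needing the shallow_auxiliary resource.
consumer = MGClass(
    name="adverb_modifier",
    needs=frozenset({"shallow_auxiliary"}),
    constraints=frozenset({"Foot < Adv"}),
)
producer = MGClass(
    name="shallow_auxiliary",
    provides=frozenset({"shallow_auxiliary"}),
    constraints=frozenset({"Root >> Foot"}),
)

merged = cross(consumer, producer, "shallow_auxiliary")
print(merged.name, merged.is_neutral(), sorted(merged.constraints))
```

In this model a crossed class is handed to tree construction only once `is_neutral()` holds, mirroring the move from Step 2 to Step 3.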

  14. FRMG: a French meta-grammar
     - Verb subcategorization: subject (subj), attribute (acomp), object, vcomp, scomp, wh-comp, prep-vcomp, prep-scomp, prep-object, prep-acomp; at most 3 arguments (subject included)
     - auxiliary verbs, control verbs
     - various realizations (NP, clitics, infinitive, completive clauses, ...) and subject positions (pre, post, post-clitic)
     - extraction of arguments and adjuncts (wh, relatives, clefts, topicalization)
     - active and passive voices
     - coordination (without ellipsis), comparatives, superlatives
     - verb/sentence modifiers (with parenthetical insertions) at various positions (participial clauses, PP, adv, ...)
     - support verbs ("prendre conscience de")
     - punctuation

  15. A use case: handling clause complements
     A large and heterogeneous class of complements, difficult to characterize:
     - they are modifiers bringing information on many aspects (tense, duration, space, manner, cause, intensity, quantity, ...)
     - they have many realizations: adverbs, prepositional phrases, conjunctive subordinate clauses, participials, adjectival phrases, (idiomatic) clauses, concessives, ...
     - they are mobile within a clause, with several possible anchoring points (clauses, coordinations, prepositions, event nouns, ...)
     - they can be set off parenthetically (parentheses, commas, dashes, ...), more or less obligatorily depending on the complement and its position
     - they include idiomatic constructions, and are often semantically restricted (tense, space, body part, ...)
       ⇒ importance of the semantic features provided by the lexicon (Lefff)

  16. A few illustrative sentences covered by FRMG
     - Il a parfois envie de partir.
     - Désormais, il veut partir.
     - Avec son ami, il a décidé de partir sans tarder.
     - Il est arrivé pendant que tu parlais.
     - Sa société n’allant pas bien, il doit la vendre.
     - Il prend le train, soucieux des deniers publics.
     - Il a, le premier, fini l’exercice.
     - Mains sur la tête, il recule contre le mur.
     - Couloir de droite, vous avez la classe de Mr Louis.
     - Vous trouverez, chapitre 22, les explications nécessaires à ce devoir.
     - Il attend, rue des Bourdonnais, que ses amis arrivent.
     - Il a, lui aussi, décidé de partir.
     - Il est parti, il n’y a pas deux jours, avec des amis.
     - Il est parti, voici deux semaines, avec des amis.
     - Service oblige, je dois vous quitter.
     - Il a, quoi que tu en penses, toutes ses chances.
     - Paul a, plus que son frère, le sens de la famille.
     - Il mange une pomme et, parfois, une poire.
     - Il part, avec, toutefois, une pointe de regret.
     - L’annonce, ce matin, d’un remaniement a surpris tous les commentateurs.

  17. TAG auxiliary trees
     Clause complements are modifiers, hence represented by TAG auxiliary trees, with a foot node sharing a common category with the root node.
         class auxiliary {
           % Class for TAG auxiliary trees
           Root >>+ Foot;                       % root dominates foot
           node(Root).cat = node(Foot).cat;     % same cat. on foot and root
           node Root : [id: Root];
           node Foot : [id: Foot];
           node(Foot).type = value(foot);
           node(Root).type = value(std);
           node(Foot).top = node(Foot).bot;
         }
     [Figure: root node $x (id=Root) dominating foot node ⋆$x (id=Foot, top=_1, bot=_1)]

  18. Shallow auxiliary
     Actually, shallow auxiliary trees are sufficient: the root node is the parent of the foot node.
         class shallow_auxiliary {
           <: auxiliary;
           + shallow_auxiliary;    %% provide functionality
           Root >> Foot;           %% root is parent of foot
         }
     [Figure: root node $x with the foot node ⋆$x as an immediate child]
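
The difference between the two dominance constraints, `>>+` (dominance at any depth) in the auxiliary class and `>>` (immediate parent) in the shallow auxiliary class, can be made concrete with a small Python check. The tree encoding (a dictionary mapping each child to its parent, plus a dictionary of node categories) and all function names are assumptions made for this sketch; the feature equations of the meta-grammar classes are not modelled.

```python
def is_parent(parents, a, b):
    """a >> b : a is the immediate parent of b."""
    return parents.get(b) == a

def dominates(parents, a, b):
    """a >>+ b : a is a proper ancestor of b."""
    current = parents.get(b)
    while current is not None:
        if current == a:
            return True
        current = parents.get(current)
    return False

def is_auxiliary(parents, cats, root, foot):
    """Root dominates foot and both carry the same category."""
    return dominates(parents, root, foot) and cats[root] == cats[foot]

def is_shallow_auxiliary(parents, cats, root, foot):
    """Auxiliary tree whose root is the immediate parent of the foot."""
    return (is_auxiliary(parents, cats, root, foot)
            and is_parent(parents, root, foot))

# Shallow tree: the foot is a direct child of the root (labels are made up).
shallow_parents = {"Foot": "Root", "Adv": "Root"}
shallow_cats = {"Root": "V", "Foot": "V", "Adv": "adv"}

# Deeper tree: an intermediate node sits between root and foot.
deep_parents = {"Mid": "Root", "Foot": "Mid"}
deep_cats = {"Root": "S", "Mid": "VP", "Foot": "S"}

print(is_shallow_auxiliary(shallow_parents, shallow_cats, "Root", "Foot"))
print(is_auxiliary(deep_parents, deep_cats, "Root", "Foot"))
print(is_shallow_auxiliary(deep_parents, deep_cats, "Root", "Foot"))
```

The second tree satisfies the constraints of `auxiliary` but not the extra `Root >> Foot` constraint added by `shallow_auxiliary`.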
