Designing and improving FRMG, a wide coverage French meta-grammar



SLIDE 1

Designing and improving FRMG, a wide coverage French meta-grammar

http://alpage.inria.fr Éric de la Clergerie

<Eric.De_La_Clergerie@inria.fr>

INRIA Paris-Rocquencourt / Univ. Paris Diderot

A Coruña – 9 de octubre 2012

INRIA É. de la Clergerie FRMG 2012/10/09 1 / 49

SLIDE 2

Introduction

FRMG is a wide coverage French meta-grammar

- fast initial development in 2005 (EASy campaign)
- continuous improvement since then
- now usable at large scale with good coverage and accuracy

- Online demo at http://alpage.inria.fr/parserdemo

Note: a similar meta-grammar, SPMG, exists for Spanish (Victoria project).

Objectives of this talk:

1. provide some background on TAGs, tree factoring and meta-grammars
2. present FRMG
3. illustrate the descriptive power of meta-grammars on a class of complements
4. present some results on the improvement of FRMG

SLIDE 3

Outline

1. Designing
2. Using
3. Improving

SLIDE 4

Pros and cons of TAGs

Tree Adjoining Grammars provide the advantage of an extended domain of locality:
- subcategorization frames
- long-distance dependencies (extractions/movements)

But the drawback is an explosion of the number of trees, into the (tens of) thousands:
- efficiency problems during parsing
- many common sub-trees ⇒ development and maintenance problems

[Figure: three TAG trees anchored by «mange»: the canonical transitive tree (S → ↓NP0 VP[V ✸v ↓NP1]), a wh-extraction tree (↓NP1 extracted, t.wh=+), and the «que» relative auxiliary tree (NP → ⋆NP S’[que S]).]

SLIDE 5

Solutions

For FRMG, the choice is:
- to use factoring operators within trees (but complex factorized trees are difficult to write directly)
- to use a meta-grammar: a modular and factorized description of syntactic phenomena, from which the factorized trees are generated

SLIDE 6

Tree factoring

Principle: combine several trees into a single one, sharing common subparts:

- several traversal paths in a tree (Harbusch)
- or using regular operators within trees (DYALOG):
  - disjunctions: T[t1;t2] ≡ T[t1] ∪ T[t2]
  - repetitions (Kleene stars): T[t@*] ≡ {T[ε], T[t], T[(t,t)], …}
  - interleaving (free ordering within sequences): (t1,t2)##t3 ≡ (t1,t2,t3 ; t1,t3,t2 ; t3,t1,t2)
  - optionality: t? ≡ (t ; ε)
  - guards (nodes with guards): T[G+, t; G−] ≡ T[t].σ+ ∪ T[ε].σ−, where guards are Boolean formulae over equations on feature paths and values

Tree factoring does not change the expressive power or the complexity of TAGs, but unfactoring ⇒ an exponential increase of the grammar. The factored trees are used directly by DYALOG.
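The semantics of these operators can be made concrete with a small enumeration sketch (a toy interpreter, not DYALOG's actual machinery; the term syntax below is invented for illustration):

```python
from itertools import product

def expand(t):
    """Enumerate the plain sequences denoted by a factored term.
    Toy syntax: a string is a leaf; ("seq", ...) concatenates;
    (";", ...) is a disjunction; ("?", t) is optionality;
    ("*", t, bound) is a Kleene star truncated at `bound` repetitions
    so the enumeration stays finite."""
    if isinstance(t, str):
        return [(t,)]
    op = t[0]
    if op == "seq":
        parts = [expand(a) for a in t[1:]]
        return [sum(combo, ()) for combo in product(*parts)]
    if op == ";":
        return [s for alt in t[1:] for s in expand(alt)]
    if op == "?":
        return expand(t[1]) + [()]
    if op == "*":
        sub, bound = t[1], t[2]
        return [s for n in range(bound + 1)
                for s in expand(("seq",) + (sub,) * n)]
    raise ValueError(op)

# Subject disjunction (NP0 | cln | S) before V, with an optional NP1:
subj_v = ("seq", (";", "NP0", "cln", "S"), "V", ("?", "NP1"))
print(len(expand(subj_v)))  # 3 subjects x 2 object choices = 6 plain trees
```

This makes the "exponential increase" above tangible: each operator multiplies the number of plain trees denoted by the single factored tree.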

SLIDE 7

Disjunction

Several possible realizations for the subject (NP, cln, S, …):

[Figure: three separate trees (NP subject, clitic subject, infinitive S subject with t.mode=inf) and the single factored tree where the subject position is a disjunction node (↓NP0 ; ⋄cln ; ↓S t.mode=inf) above VP → V ✸v ↓NP1.]

SLIDE 8

Guards

In French, the subject may be missing, under conditions. In the factored tree, a pair of guards on the subject node enforces this: the subject is realized when V.top.mode = ¬(inf|imp|part), and is absent when V.top.mode = inf|imp|part.
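The guard pair can be read as a two-way choice conditioned on the verb's mode; a toy rendering (function and value names are illustrative, not FRMG's):

```python
def subject_realizations(mode):
    """Guarded subject node: the positive guard (subject realized as
    NP0, cln or S) requires mode not in {inf, imp, part}; the negative
    guard (empty subject) requires mode in {inf, imp, part}."""
    if mode in ("inf", "imp", "part"):
        return [None]              # subject must stay empty
    return ["NP0", "cln", "S"]     # one of the disjunctive realizations

print(subject_realizations("ind"))  # ['NP0', 'cln', 'S']
print(subject_realizations("inf"))  # [None]
```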

SLIDE 9

Free ordering

The verbal arguments are not ordered (a rough approximation): the postverbal arguments ↓NP1 and ↓PP2 are attached under a free-order (##) operator. Unfactoring: (1 no subject + 3 subject realizations) × (1 no argument + 2 one-argument + 2 two-argument orders) = 20 trees.
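The 20-tree count can be checked by brute enumeration (a sketch; node names as on the slide):

```python
from itertools import permutations

# 1 missing subject + 3 realizations (NP0 | cln | S)
subjects = [None, "NP0", "cln", "S"]
# 0, 1 or 2 postverbal arguments drawn from {NP1, PP2}, in free order
arg_orders = [p for k in range(3) for p in permutations(["NP1", "PP2"], k)]

trees = [(s, order) for s in subjects for order in arg_orders]
print(len(arg_orders), len(trees))  # 5 argument patterns, 4 * 5 = 20 trees
```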

SLIDE 10

Repetition (Kleene star)

Natural description of coordination by repetition: an auxiliary tree NP → ⋆NP1 (, ↓NP2)@* ✸coo ↓NP3, where the (, ↓NP2) segment is under a Kleene star.

SLIDE 11

Meta-Grammars

Definition (meta-grammar): a modular description with classes grouping constraints, and with inheritance.

SLIDE 12

Meta-Grammars

Definition (meta-grammar): a modular description with classes grouping constraints, and with inheritance.

class collect_real_subject_canonical {
  <: collect_real_subject;
  $arg.extracted = value(~cleft);
  S >> VSubj; V >> psubj;
  VSubj < V; VMod < psubj;
  node psubj : [cat: N2, id: subject, top: [wh: -, sat: +]];
  - psubj::agreement;
  psubj = psubj::N;
  psubj => node(Infl).bot.inv = value(+),
           $arg.extracted = value(-),
           $arg.real = value(N2),
           desc.extraction = value(~-),
           node(V).top.mode = value(~inf|imp|...);
  ~psubj => node(Infl).bot.inv = value(~+);
}

- Inheritance (<:)
- Constraints: dominance (>> and >>+), precedence (<), equality (=)
- Decorations (feature structures): on nodes and on the class
- Equations between paths (.): node (node psubj), class (desc), variable ($arg)
- Resources (+) / Needs (−)
- Namespace (::)
- Guards (=>)

SLIDE 13

Compiling Meta-Grammars

MGCOMP, developed with DYALOG

Step 1 (terminal classes): constraint inheritance by the terminal classes (+ constraint checking).

Step 2 (neutral classes): crossing of terminal classes to neutralize resources and needs:
- C1[−R ∪ K1] × C2[+R ∪ K2] = (C1×C2)[=R ∪ K1 ∪ K2]
- (Namespace) ⇒ import of the producing class with renaming: C1[−N::R ∪ K1] × C2[+R ∪ K2] = (C1×N::C2)[=N::R ∪ K1 ∪ N::K2]
- guard reduction (whenever possible), constraint checking

Step 3 (TAG trees): use the constraints of the neutral classes to build the trees; underspecified precedence between sibling nodes ⇒ interleaving.
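Step 2 can be sketched as a set computation (a simplification, not MGCOMP itself; class contents are reduced to plain sets, with resource names borrowed from later slides):

```python
def cross(c1, c2):
    """Cross two classes: a need (-R) of c1 is neutralized by a
    matching resource (+R) of c2; the remaining needs, resources and
    constraints are unioned. Returns None when nothing matches."""
    matched = c1["needs"] & c2["resources"]
    if not matched:
        return None
    return {
        "needs": (c1["needs"] - matched) | c2["needs"],
        "resources": c1["resources"] | (c2["resources"] - matched),
        "constraints": c1["constraints"] | c2["constraints"],
    }

modifier = {"needs": {"shallow_auxiliary"}, "resources": set(),
            "constraints": {"Root >> Incise"}}
auxiliary = {"needs": set(), "resources": {"shallow_auxiliary"},
             "constraints": {"Root >> Foot"}}

neutral = cross(modifier, auxiliary)
print(neutral["needs"], neutral["resources"])  # both empty: a neutral class
```

A class whose needs and resources both come out empty is "neutral" and can feed Step 3.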

SLIDE 14

FRMG: A French meta-grammar

Verb subcategorization: subject (subj), attribute (acomp), object, vcomp, scomp, wh-comp, prep-vcomp, prep-scomp, prep-object, prep-acomp; at most 3 arguments (subject included).

- Aux. verbs, control verbs
- various realizations (NP, clitics, infinitive, completive, …) and subject positions (pre, post, post-clitics)
- extraction of arguments and adjuncts (wh, relatives, clefted, topicalization)
- active and passive voices
- coordination (without ellipses), comparatives, superlatives
- verb/sentence modifiers (with incises) at various positions (participle sentences, PP, adv, …), «support» verbs (prendre conscience de)
- punctuation

SLIDE 15

A use case: handling clause complements

A large and heterogeneous class of complements, difficult to characterize:
- they are modifiers bringing information on many aspects (tense, duration, space, manner, cause, intensity, quantity, …)
- they have many realizations: adverbs, prepositional phrases, conjunctive subordinates, participials, adjectival phrases, (idiomatic) clauses, concessives, …
- they are mobile within a clause, with several possible anchoring points (clauses, coordination, prepositions, event nouns, …)
- they can be set off by incise marks (parentheses, commas, dashes, …), in a more or less mandatory way depending on the complement and its position
- they include idiomatic constructions, and are often semantically restricted (tense, space, body part, …)

⇒ importance of the semantic features provided by the lexicon (LEFFF)

SLIDE 16

A few illustrative sentences covered by FRMG

Il a parfois envie de partir. Désormais, il veut partir. Avec son ami, il a décidé de partir sans tarder. Il est arrivé pendant que tu parlais. Sa société n’allant pas bien, il doit la vendre. Il prend le train, soucieux des deniers publics. Il a, le premier, fini l’exercice. Mains sur la tête, il recule contre le mur. Couloir de droite, vous avez la classe de Mr Louis. Vous trouverez, chapitre 22, les explications nécessaires à ce devoir. Il attend, rue des Bourdonnais, que ses amis arrivent. Il a, lui aussi, décidé de partir. Il est parti, il n’y a pas deux jours, avec des amis. Il est parti, voici deux semaines, avec des amis. Service oblige, je dois vous quitter. Il a, quoi que tu en penses, toutes ses chances. Paul a, plus que son frère, le sens de la famille. Il mange une pomme et, parfois, une poire. Il part, avec, toutefois, une pointe de regret. L ’annonce, ce matin, d’un remaniement a surpris tous les commentateurs.

SLIDE 17

TAG auxiliary trees

Clause complements are modifiers, hence represented by TAG auxiliary trees, with a foot node sharing a common category with the root node.

class auxiliary {
  % Class for TAG auxiliary trees
  Root >>+ Foot;                      % root dominates foot
  node(Root).cat = node(Foot).cat;    % same cat on foot and root
  node Root : [id: Root];
  node Foot : [id: Foot];
  node(Foot).type = value(foot);
  node(Root).type = value(std);
  node(Foot).top = node(Foot).bot;
}

[Tree schema: a root node $x (id=Root) dominating a foot node ⋆$x (id=Foot, top=_1, bot=_1).]
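The two defining constraints (strict dominance of a single same-category foot node) can be checked mechanically; a toy tree representation, not FRMG's actual data structures:

```python
def is_auxiliary(tree):
    """tree: {'cat': str, 'type': 'std'|'foot', 'children': [...]}.
    True iff exactly one foot node is strictly dominated by the root
    and it carries the same category as the root."""
    def feet(node):
        if node["type"] == "foot":
            yield node
        for child in node.get("children", []):
            yield from feet(child)
    # start from the children: the root itself cannot be the foot (>>+)
    found = [f for c in tree.get("children", []) for f in feet(c)]
    return len(found) == 1 and found[0]["cat"] == tree["cat"]

aux = {"cat": "S", "type": "std", "children": [
    {"cat": "adv", "type": "std", "children": []},
    {"cat": "S", "type": "foot", "children": []}]}
print(is_auxiliary(aux))  # True
```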

SLIDE 18

shallow auxiliary

Actually, shallow aux. trees are sufficient: the root node is a parent of the foot node

class shallow_auxiliary {
  <: auxiliary;
  + shallow_auxiliary;   %% provide functionality
  Root >> Foot;          %% root is parent of foot
}

[Tree schema: root $x immediately dominating foot ⋆$x.]

SLIDE 19

Modifier on category X

Clause complements are modifiers (on a category X to be specified). Modifiers may be preceded and followed by incise marks. Avec son ami, il a décidé de partir sans tarder. Il a (souvent) faim.

class modifier_on_x {
  + x_modifier;
  - shallow_auxiliary;   % require functionality
  Root >> Incise;        % root is parent of node incise
  node Incise : [cat: incise, id: incise, type: std];
  Incise >>+ Modifier;   % incise dominates the modifier
  % incise_kind controls the marks
  node(Incise).bot.incise_kind = value(coma|comastrict|par|dash);
}

[Tree schema: $x → incise(,) $mod incise(,) next to the foot ⋆$x, i.e. the modifier wrapped by optional incise marks.]

SLIDE 20

Position of modifiers wrt the modified

Modifiers may precede or follow the modified element, with possibly distinct properties. Désormais, Paul part plus tôt. Paul est parti tard ce matin.

class modifier_before_x {
  <: modifier_on_x;
  Incise < Foot;    % ante modifier
  node(Incise).adj = node(Incise).ante.adj;
  desc.position = value(ante);
}

class modifier_after_x {
  <: modifier_on_x;
  Foot < Incise;    % post modifier
  node(Incise).adj = node(Incise).post.adj;
  desc.position = value(post);
}

[Tree schemas: ante position ($x → incise $mod ⋆$x) and post position ($x → ⋆$x incise $mod).]

SLIDE 21

Attach points

Clause complements may attach on the clause root (beginning or end of a clause), but also on various attach points provided by VMod nodes. ce matin, Paul part tôt. Paul, ce matin, part très tôt.

class modifier_at_S_level {
  + s_modifier;
  - x_modifier;
  - s_adj_pos;
  node(Root).bot = node(Foot).top;   %% same info on Root and Foot
  node(Root).cat = value(S|VMod);    %% may adjoin on S or VMod
}

class modifier_on_S {   %% specialization for S
  + s_adj_pos;
  node(Root).cat = value(S);
}

[Tree schemas: adjunction on S (S bot=_1 → incise $mod ⋆S top=_1) and on VMod (VMod → ⋆VMod incise $mod).]

SLIDE 22

Realization: adverbs

We can now implement the many realizations of clause complements; the most frequent is provided by (some) adverbs.

class adv_s {   %% Adverbs on sentences
  <: adv;       %% inherit the properties of adverbs
  - s_modifier;
  Adv = Modifier;
  node(Foot).dummy.cat = value(adv);
  %% restrictions on the allowed adverbs
  node(Adv).bot.adv_kind = value(~très|intensive|equalizer);
  ...
}

[Resulting trees: an incise-wrapped ✸adv (top.adv_kind: ¬(très|intensive|equalizer)) adjoined before or after ⋆S.]

SLIDE 23

Realization: temporal adverbials

Another frequent realization is provided by temporal nominal phrases: il a, cet après-midi, fini sa présentation.

class cnoun_as_adv {
  node N2 : [cat: N2, id: time_mod, type: subst];
  %% N2 should carry a time property (other than -)
  node(N2).top.time = value(~-);
  - s_modifier;
  N2 = Modifier;
  node(Foot).dummy.cat = value(adv);   %% N2 acts as an adverb
}

[Tree: incise ↓N2 (top.time: ¬-) adjoined to ⋆S.]

SLIDE 24

Realization: body part

We have more exotic realizations, with semantic restrictions and idiomatic constructions: cheveux au vent, il fait face dans la tempête. il me fit promettre, les mains sur la Bible, que je viendrais Mme Bovary, le dos tourné, avait la figure posée contre un carreau

class bodypart_cnoun_as_modifier {
  <: cnoun;                                    %% inherit from nominal phrases
  desc.@kind0 = value(-);
  - s_modifier;
  Modifier = N2Root;
  node(Foot).dummy.cat = value(adv);           %% s_modifier acting as adv
  node(Root).cat = value(~N2);                 %% not attached on event nouns
  node(Root).bot = node(Foot).top;
  node(Anchor).bot.semtype = value(bodypart);  %% semantic property
  node(N2).adj = value(strict);                %% mandatory right adjoining
  node(N2).adjleft = value(no);
  node(N2).adjwrap = value(no);
}

SLIDE 25

Realization: temporal with "il y a"

We have idiomatic constructions, frozen or semi-frozen, like il y a or X oblige, that should inherit from verbal structures. elle a été adoptée il n’y a pas trois mois. je ne conclurai pas, exception culturelle oblige, par une citation. Il est parti, voici deux semaines, avec des amis.

class verb_ilya_as_time_mod {
  <: _verb_canonical;             %% inherit the canonical verb construction
  - s_modifier;
  S = Modifier;
  node v : [cat: v, lex: avoir];
  desc.ht.imp = value(+);         %% impersonal subject expected
  desc.ht.extraction = value(-);  %% no argument extraction
  ...
}

But how can a large number of such idiomatic constructions be captured?

SLIDE 26

Attach points

New attach points are regularly found and easily added, as illustrated for coordinations (similar after prepositions): Paul mange une pomme et, parfois, une poire. le texte est signé par le Premier ministre et, le cas échéant, par les ministres responsables.

class modifier_on_coo {
  + s_adj_pos;
  node(Root).cat = value(coo);
  desc.position = value(post);   %% modifier on the right side of coo
  %% restriction: no more than one modifier per coo node
  node(Foot).top.modifier = value(-);
  node(Root).bot.modifier = value(+);
}

[Tree: coo (bot.mod: +) → ⋆coo (bot.mod: -) incise ✸adv (top.adv_kind: ¬(très|intensive|equalizer)).]

SLIDE 27

Attaching on event nouns

And also after event nouns (semantic constraints on the governor): L'annonce, ce matin, d'un remaniement a surpris tous les commentateurs.

class mod_on_N2 {
  + s_modifier;
  - x_modifier;
  - s_adj_pos;
  node(Root).cat = value(N2);
  node(Root).bot.semtype = value(event);   %% semantic property: event
}

[Tree: N2 (bot.semtype: event) → ⋆N2 incise ✸adv (top.adv_kind: ¬(très|intensive|equalizer)).]

SLIDE 28

Results: a few figures

The various combinations (realizations × attach points × positions) generate a relatively large number of trees: 79. To be contrasted with only 42 trees anchored by verbs (canonical constructions + passive + extractions [rel, wh, clefted, topic]). However, the trees for the clause complements are generally quite simple.

A count of the trees used for parsing 10,096 sentences of the French TreeBank (journalistic style):
- the 79 trees ⇒ ∼ 8% of tree occurrences
- PP on VMod: 4.3%
- PP on S (beginning of sentence): 0.8%
- temporal NP: 0.35%
- conjunctive subordinate: 0.13%

We observe a fast decrease of the frequencies.

SLIDE 29

Quantitative FRMG (2012/10/07)

Distribution per tree kind:

Classes | Trees | Init. | Aux. | Wrap Aux. | Left Aux. | Right Aux.
279 | 333+33 | 78 | 288 | 45 | 83 | 159

Distribution per anchor kind:

anchor | v | coo | adv | adj | csu | prep | aux | np | nc | det | pro | ¬anchor
229 | 42 | 42 | 44 | 21 | 9 | 10 | 2 | 3 | 25 | 1 | 6 | 104

Distribution per syntactic phenomenon:

Canonical | Extr. | Active | Passive | Quest. | Rel. | Cleft | Topic
8 | 18 | 25 | 9 | 3 | 3 | 7 | 5

Use of factoring operators:

Guards | Disjunctions | Interleaving | Kleene stars
3853 | 214 | 40 | 48

SLIDE 30

Tree #198: canonical verb (simplified)

[Figure: the factored tree #198 for canonical verbs, with disjunction nodes for the subject realizations (⋄cln, ↓N2, ↓CS, ↓S, ↓PP), the clitic sequence, and free-ordered postverbal arguments (canonical subject arg0, postverbal subject arg0, arg1).]

36 terminal classes; 106 guards, 6 disjunctions, 1 free-order operator ⇒ hand-crafting such a tree is impossible.

SLIDE 31

lexicon/grammar interaction: hypertag

FRMG grammar hypertag #198:

arg0: extracted -, kind subj, pcas -, real real0 ∈ {- | CS | N2 | PP | S | cln | prel | pri}
arg1: extracted -, kind kind1 ∈ {- | acomp | obj | prepacomp | prepobj}, pcas pcas1 ∈ {+ | - | apres | à | avec | de | par | …}, real real1 ∈ {- | CS | N | N2 | PP | S | V | adj | cla | …}
arg2: extracted -, kind kind2 ∈ {- | prepacomp | prepobj | prepscomp | prepvcomp | scomp | vcomp | whcomp}, pcas pcas2 ∈ {- | + | apres | à | …}, real real2 ∈ {- | CS | N | N2 | PP | S | …}
cat: v — diathesis: active — refl: refl

SLIDE 32

lexicon/grammar interaction: hypertag

FRMG grammar hypertag #198 (as on the previous slide), unified with the LEFFF lexicon hypertag for «promettre»:

arg0: kind subj | -, pcas -
arg1: kind obj | scomp | -, pcas -
arg2: kind prepobj | -, pcas à | -
refl: -

SLIDE 33

lexicon/grammar interaction: hypertag

FRMG grammar hypertag #198, unified with the LEFFF lexicon hypertag for «promettre» (both as on the previous slides).

The variables propagate the hypertag values to nodes and guards ⇒ blocking the non-valid paths in the factored trees.

SLIDE 34

Degree & impact of factoring

- complete tree unfactoring ⇒ ∼ 2 million trees (FRMG 2007)
- partial unfactoring of tree #198 (keeping disjunctions) + intersection with the 195 LEFFF subcategorization frames ⇒ 5729 trees (+ 206), too large for computing the left-corner relation
- test of the 2 versions on 4000 EasyDev sentences ⇒ factoring: no overhead and better optimization

parser | avg | median | 90% | 99%
+fact. -lc (207 trees) | 1.33 | 0.46 | 2.63 | 12.24
-fact. -lc (5935 trees) | 1.43 | 0.44 | 2.89 | 14.94
+fact. +lc (207 trees) | 0.64 | 0.16 | 1.14 | 6.22

SLIDE 35

Outline

1. Designing
2. Using
3. Improving

SLIDE 36

Parser functionalities

- detection of TIG (left and right) auxiliary trees and of wrapping TAG auxiliary trees
- compilation of a tabular parser, using 2SA meta-transitions to describe tree traversals
- DAGs of words as input (SXPIPE + lexical information from LEFFF)
- returns a shared forest of all possible analyses
- may try some corrections (relaxation of agreement)
- possibility to return partial parses when there is no full parse
- 'just-in-time' mode: lexical tree filtering (only a part of the grammar is loaded)
- use of the left-corner relation
- identification of features untouched by adjoining
- …

SLIDE 37

Quel livre Paul propose-t-il (plus abruptement qu'elle) de faire lire à Marie ?

[Figure: the full dependency graph for the sentence, with nodes labelled lemma:cat:tree (e.g. proposer:v:177, faire:v:101, lire:v:178) and edges labelled subject, object, prep, det, incise, causative_prep, …]

SLIDE 38

Quel livre Paul propose-t-il (plus abruptement qu'elle) de faire lire à Marie ?

[Figure: zoom on the sub-graph for «de faire lire à Marie ?», showing the causative_prep and subject dependencies around faire:v:101 and lire:v:178.]

SLIDE 39

Disambiguation

The shared dependency forest may be disambiguated:
- 1-best dynamic programming, rule-based algorithm (in DYALOG)
- summation of weights on each dependency
- the weights are provided by heuristic rules on the dependency and its neighbour dependencies
- hand-crafted weights on the rules
- time-costly (similar to parsing, or worse!)

%% Penalize inverted subjects
edge_cost_elem('-INVERTED_SUBJ',
               edge{ label  => subject,
                     source => node{ cluster => cluster{ right => R } },
                     target => node{ cluster => cluster{ left  => L } } },
               -1000
              ) :- R =< L.
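In Python terms, the rule above amounts to an edge-scoring function (a paraphrase of the DYALOG rule with assumed field names, not FRMG's actual representation):

```python
def inverted_subject_penalty(edge):
    """-1000 for a 'subject' edge whose governor ends at or before the
    start of the subject word, i.e. a postverbal (inverted) subject."""
    if edge["label"] == "subject" and edge["source_right"] <= edge["target_left"]:
        return -1000
    return 0

RULES = [inverted_subject_penalty]

def edge_weight(edge):
    """Each dependency accumulates the weights of all matching rules;
    the 1-best algorithm keeps the derivation with the highest total."""
    return sum(rule(edge) for rule in RULES)

inverted = {"label": "subject", "source_right": 3, "target_left": 5}
canonical = {"label": "subject", "source_right": 5, "target_left": 1}
print(edge_weight(inverted), edge_weight(canonical))  # -1000 0
```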

SLIDE 40

Outline

1. Designing
2. Using
3. Improving

SLIDE 41

Improving FRMG

Try to improve along 3 axes:
- increasing the coverage in terms of full parses ⇒ unsupervised machine-learning methods on large corpora: error mining
- improving parsing accuracy ⇒ supervised methods, using treebanks
- reducing parsing time and disambiguation time

Important but not essential: use of machine grids (GRID5000). These axes are partially contradictory.

SLIDE 42

Evolution of efficiency (during 2008)

Easydev: 3868 sentences, 58.20% couverture, timeout (20s): 0.4% Multiples runs on test suites and corpora, with statistics many optimizations tested, most of them useless but unstable gains: important variations wrt the grammar

SLIDE 43

Improving the accuracy (old way)

Through the exploitation of reference annotations (treebanks):
- EasyDev (EASy format: 6 kinds of chunks and 14 kinds of relations)
- (in 2012) French TreeBank (FTB) in CONLL dependency format

Methods: evaluations and feedback; exploration of the annotated sentences; confusion matrices (EASy chunks and relations; CONLL dependencies):

réf ⇒ hyp | B_GN | GN | E_GN | BE_GN | B_GP
B_GN | 5459 (91%) | 52 (0.88%) | 14 (0.28%) | 115 (1.94%) | 118 (1.99%)
GN | 50 (4.65%) | 725 (67%) | 149 (13.86%) | 19 (1.77%) | 12 (1.12%)
E_GN | 4 (0.07%) | 68 (1.15%) | 5345 (90.04%) | 140 (2.36%) | 10 (0.17%)
BE_GN | 106 (3.22%) | 45 (1.37%) | 78 (2.37%) | 2663 (80.97%) | 7 (0.21%)
B_GP | 166 (2.02%) | 30 (0.37%) | 2 (0.02%) | 30 (0.37%) | 7760 (94.61%)

also difference matrices between 2 runs

SLIDE 44

Performance evolution on EasyDev

Run | full parses | Chunks prec. | Chunks recall | Chunks f | Rel. prec. | Rel. recall | Rel. f
R006 (05/2007) | 42.16% | 78.12% | 71.27% | 74.54% | 62.29% | 46.63% | 53.34%
R027 (12/2007) | 56.06% | 83.66% | 82.90% | 83.28% | 64.27% | 55.66% | 59.65%
R076 (02/2009) | 59.47% | 84.23% | 82.91% | 85.56% | 63.36% | 55.62% | 59.24%
R101 (06/2009) | 59.56% | 83.24% | 79.63% | 81.40% | 63.1% | 53.40% | 57.85%
R139 (09/2009) | 64.73% | 87.41% | 86.00% | 86.70% | 65.10% | 59.03% | 61.92%
R157 (10/2009) | 67.03% | 87.71% | 86.84% | 87.28% | 65.62% | 60.26% | 62.82%
R240 (11/2010) | 69.01% | 88.24% | 89.20% | 88.72% | 66.36% | 63.32% | 64.81%
R374 (12/2011) | 82.57% | 89.46% | 89.90% | 89.68% | 67.81% | 66.17% | 66.98%
05/2012 | 83.58% | 89.15% | 89.43% | 89.29% | 69.56% | 67.50% | 68.52%

Campaign | f-measure chunks | f-measure relations
2004 | 69% | 41%
2007 | 89% | 63%

Note: our evaluation tools give lower figures than the official tools.

SLIDE 45

Coverage on several corpora (2010)

Corpus | #sentences | Cov. | t. avg. (s) | amb. avg. | 09/09
EUROTRA | 334 | 100% | 0.09 | 0.81 | 10/10
TSNLP | 1661 | 95.18% | 0.04 | 0.48 | 11/10
EasyDev | 3879 | 69.01% | 0.46 | 1.10 | 11/10
JRCacquis | 1.1M | 51.26% | 1.41 | 1.1 | 59.46%
Europarl | 0.8M | 70.19% | 1.69 | 1.36 | 78.33%
EstRep | 1.6M | 67.05% | 0.69 | 0.92 | 75.06%
Wikipedia | 2.2M | 69.11% | 0.49 | 0.87 | 79.48%
Wikisource | 1.5M | 61.08% | 0.71 | 0.89 | 66.79%
AFP | 1.6M | 52.15% | 0.51 | 1.06 |

SLIDE 46

Improving accuracy with FTB (new approach)

French TreeBank: journalistic style (Le Monde), with Penn-TreeBank-like annotations [Abeillé], converted to CONLL dependencies [Candito & al.]

Preliminary results with FRMG (run inline.old43, 2012-06-09):

Corpus | #sent. | Cover. (%) | Time (s)
ftb train | 9,881 | 91.22 | 0.70
ftb dev | 1,235 | 90.68 | 0.59
ftb test | 1,235 | 90.59 | 0.56
ftb all | 12,351 | 91.11 | 0.72

SLIDE 47

CONLL format (converted from FRMG dependencies)

id | form | lemma | POS | head | label
1 | par | par | P | 15 | p_obj
2 | qui | qui | PRO | 1 | obj
3 | a | avoir | V | 5 | aux_tps
4 | -t-elle | -t-elle | CL | 5 | suj
5 | voulu | vouloir | V | 0 | root
6 | que | que | C | 5 | obj
7 | ces | ce | D | 9 | det
8 | deux | deux | A | 9 | mod
9 | livres | livre | N | 15 | suj
10 | et | et | C | 9 | coord
11 | ce | ce | D | 12 | det
12 | DVD | DVD | N | 10 | dep_coord
13 | lui | lui | CL | 15 | a_obj
14 | soient | être | V | 15 | aux_pass
15 | rendus | rendre | V | 6 | obj
16 | ? | ? | PONCT | 5 | ponct
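Such 6-column lines are easy to load back for evaluation; a minimal reader (a sketch using the column layout shown above; encoding the root's head as 0 follows the usual CONLL convention and is an assumption of this sketch):

```python
def read_conll(block):
    """Parse whitespace-separated 6-column lines: id form lemma POS head label."""
    rows = []
    for line in block.strip().splitlines():
        idx, form, lemma, pos, head, label = line.split()
        rows.append({"id": int(idx), "form": form, "lemma": lemma,
                     "pos": pos, "head": int(head), "label": label})
    return rows

sample = """
1 par par P 15 p_obj
5 voulu vouloir V 0 root
15 rendus rendre V 6 obj
"""
deps = read_conll(sample)
print(deps[1]["label"], deps[0]["head"])  # root 15
```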

SLIDE 48

Disambiguation tuning

Conversion to CONLL format was originally done for evaluation purposes on the FTB. But can we use it to tune the weights of the disambiguation rules?

- There is no simple one-to-one mapping from FRMG dependencies to FTB dependencies (no inverse conversion).
- Minimal hypothesis: if the FTB dependency on occurrence w_{i,j} is OK, then the FRMG dependency d̂_{i,j} on w_{i,j} (the one kept when disambiguating) should also be OK, while the other d′ ∈ D̄_{i,j} are bad.

⇒ we should reinforce when correctly kept (or unkept), and penalize when incorrectly kept (or unkept). Actually 4 cases:

    | kept      | unkept
ok  | reinforce | penalize
bad | penalize  | ???
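The four cases can be written down directly (an assumed reading of the table; +1/-1 stand for reinforcement and penalty):

```python
def feedback(kept, matches_ref, oracle_says_keep=None):
    """kept: the dependency survived disambiguation.
    matches_ref: it agrees with the FTB reference.
    oracle_says_keep: only consulted in the 4th case (discarded and
    not matching the reference), where an oracle must decide."""
    if matches_ref:
        return +1 if kept else -1      # correctly kept / incorrectly discarded
    if kept:
        return -1                      # incorrectly kept
    if oracle_says_keep is None:
        return 0                       # 4th case, no oracle: undecided
    return -1 if oracle_says_keep else +1

print(feedback(True, True), feedback(False, False))  # 1 0
```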

SLIDE 49

Semi-supervised Machine Learning

For the 4th case, we need an oracle indicating which dependency d′ ∈ D̄_{i,j} should have been kept:
- no ambiguity when only one dependency d′ remains
- use statistics to determine d′
- (recent) information from the reference (samehead, sametype, …)

For a disambiguation rule τ and a set of features S, we determine how often it was OK to keep or discard the dependencies d satisfying τ and S:

rule | # | delta(s) | w | k+ | k- | k-good? | u+ | u- | u-good? | features
+XCOMP_TOPIC | 88 | -6.0/-27.0 | +1994.8 | 6.8 | 6.1 | {+2.8} | 62.5 | 24.6 | {+21.8} | label=xcomp type=subst +slemma=dire +scat=v
+XCOMP_TOPIC | 85 | +46.0/+28.0 | +2200.2 | 23.5 | 2.7 | {+39.8} | 67.1 | 6.7 | {+40.9} | label=xcomp type=subst +slemma=expliquer +scat=v

⇒ an iterative algorithm, where we compute δ^i_{τ,S} depending on k-good? and u-good? (and k− and u−) and on a temperature θ (which decreases at each step); the new edge weight for d is Σ_τ (w_τ + Σ_S Σ_i δ^i_{τ,S}); δ^i_{τ,S} is forgotten if the last iteration i was not beneficial.
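The weight update can be sketched as follows (an assumed simplification of the slide's formula Σ_τ (w_τ + Σ_S Σ_i δ^i_{τ,S}); rule and feature names are placeholders):

```python
def edge_score(firing, deltas):
    """firing: list of ((rule_name, base_weight), feature_sets) pairs
    matching the edge. deltas: {(rule_name, S): accumulated delta}."""
    return sum(w + sum(deltas.get((name, S), 0.0) for S in feats)
               for (name, w), feats in firing)

def update(deltas, gradients, theta):
    """One iteration: add the theta-scaled corrections; theta is the
    temperature, decreased at each step by the caller."""
    for key, grad in gradients.items():
        deltas[key] = deltas.get(key, 0.0) + theta * grad
    return deltas

deltas = update({}, {("XCOMP_TOPIC", "slemma=dire"): 2.0}, theta=0.5)
score = edge_score([(("XCOMP_TOPIC", 10.0), ["slemma=dire"])], deltas)
print(score)  # 11.0
```

The "forget if not beneficial" step would simply roll back the last call to `update` when the evaluation score drops.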

SLIDE 50

Evaluation scores on FTB

system | FTB train | FTB dev | FTB test | EasyDev rels
BKY | – | 86.5 | 86.8 | –
MALT | – | 86.9 | 87.3 | –
MST | – | 87.5 | 88.2 | –
FRMG before tuning | 80.2 | 81.1 | 82.4 | 66.76
FRMG after tuning | 86.1 | 84.8 | 85.9 | 68.52

[Plot: LAS over the tuning iterations: 80.2, 83.1, 85.5, 86.2, 86.3, 86.2, 86.2.]

SLIDE 51

Complements on tuning

Features: form, lemma, POS, capitalization, suffix and cluster on the source and target nodes; edge label and type; distance between source and target; direction; position in the sentence; subcategorization info on source and target; tree on source and target; …; similar information on the grand-parent node when the source node is empty (unlexicalized trees).

The algorithm looks like a kind of perceptron with an oracle. Evolution:

1. improve the oracle
2. re-inject it inside FRMG's disambiguation to get an FRMG reference version of the FTB
3. post-correct the reference
4. use a perceptron on the reference

SLIDE 52

Conclusion

Good to excellent coverage using meta-grammars and TAGs (+ LEFFF, SXPIPE):
- relatively easy to add new phenomena
- reasonable grammar size thanks to tree factoring
- possible future extensions with MC-TAGs (deep extractions)

Reasonable efficiency, especially using machine grids:
- FRMG usable to parse very large corpora (∼ 700 million words): French Wikipedia, AFP news, literary texts from Wikisource, EuroParl, …
- efficiency could be improved using probabilistic DYALOG (beam search, n-best) and (non-deterministic) tagging, supertagging, and hypertagging

Good accuracy, thanks to new semi-supervised ML techniques:
- still a few points below the best French stochastic parsers (why?), even though tested and trained on the same French TreeBank
- better stability on other corpora (to be checked)
- better extensibility, to deal with very rare phenomena

FRMG is used on large corpora for (linguistic) knowledge acquisition: terminology and semantic networks.
