
Parsing beyond context-free grammar: Range Concatenation Grammar Parsing



  1. Parsing beyond context-free grammar: Range Concatenation Grammar Parsing
     Laura Kallmeyer, Wolfgang Maier, University of Tübingen
     ESSLLI Course 2008 (Kallmeyer/Maier ESSLLI 2008)

     Overview
     1. Range Concatenation Grammars (RCG)
     2. Parsing RCG
        (a) Directional top-down parsing
        (b) Earley-style parsing
     3. Uses of RCG

     Range Concatenation Grammar
     The idea behind range concatenation grammar (RCG) is comparable to the idea behind MCFG.
     • Predicate-rewriting clauses describe ranges which are not necessarily adjacent.
     • One predicate can be true or false for a certain string.
     • Some string $w$ is in the language of an RCG if the start predicate is true for $w$.
     • While in MCFG a string is generated, in RCG a string is reduced to $\epsilon$.

     Expressivity of RCG
     • RCG exactly covers the class of PTIME recognizable languages (Bertsch & Nederhof, 2001).
     • Simple RCG (basically non-deleting, non-copying RCG) is equivalent to MCFG.
     • RCG can represent languages beyond mild context-sensitivity.

  2. Definition of RCGs: Grammar Definition
     An RCG is a tuple $G = \langle N, T, V, P, S \rangle$ such that
     • $N$ is a finite set of predicates, each with a fixed arity,
     • $T$ and $V$ are disjoint finite sets of terminals and variables,
     • $S \in N$ is the start predicate of arity 1, and
     • $P$ is a finite set of clauses of the form
       $A_0(x_{01}, \dots, x_{0a_0}) \to \epsilon$
       or
       $A_0(x_{01}, \dots, x_{0a_0}) \to A_1(x_{11}, \dots, x_{1a_1}) \dots A_n(x_{n1}, \dots, x_{na_n})$
       with $n \geq 1$, $A_i \in N$, $x_{ij} \in (T \cup V)^*$, and $a_i$ being the arity of $A_i$.
     A predicate $A_n(x_{n1}, \dots, x_{na_n})$ can be written as $A_n(\vec{x}_n)$.

     Definition of RCGs: Instantiation
     A given clause $C$ is instantiated with respect to a string $w$ if variables and arguments are consistently replaced by ranges of $w$.
     Example: $A(\langle i, j \rangle) \to B(\langle i+1, j \rangle)$ is an instantiation of the clause $A(a X_1) \to B(X_1)$ if $w_{i+1} = a$.

     Definition of RCGs: Derivation Relation, Language
     • The derivation relation is defined as follows: for a predicate $A$ of arity $k$, a clause $A(\dots) \to \dots$, and ranges $\langle i_1, j_1 \rangle, \dots, \langle i_k, j_k \rangle$ with respect to a given $w$: if there is an instantiation of this clause with LHS $A(\langle i_1, j_1 \rangle, \dots, \langle i_k, j_k \rangle)$, then $A(\langle i_1, j_1 \rangle, \dots, \langle i_k, j_k \rangle)$ can be replaced with the RHS of this instantiation.
     • The language of an RCG $G$ is the set of strings that can be reduced to the empty word:
       $L(G) = \{ w \mid S(\langle 0, |w| \rangle) \Rightarrow^{*} \epsilon \text{ with respect to } w \}$.

     A sample RCG (1)
     Sample RCG $G$ for the string language $\{ a^n b^k a^n \mid k, n \in \mathbb{N} \}$:
     RCG with $N = \{S, A, B\}$, $T = \{a, b\}$, $V = \{X, Y, Z\}$, start predicate $S$ and clauses
     • $S(XYZ) \to A(X, Z)\, B(Y)$,
     • $A(aX, aY) \to A(X, Y)$,
     • $B(bX) \to B(X)$,
     • $A(\epsilon, \epsilon) \to \epsilon$,
     • $B(\epsilon) \to \epsilon$.
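The grammar definition can be made concrete with a small data structure. The following Python sketch is only an illustration: the class names `Clause` and `RCG`, the tuple encoding of arguments, and the helper `check` are assumptions of this example, not part of the slides. It encodes the sample grammar for $\{ a^n b^k a^n \}$ and tests the conditions of the definition (disjoint $T$ and $V$, start predicate of arity 1, consistent arities, arguments over $T \cup V$).

```python
from dataclasses import dataclass


@dataclass
class Clause:
    # lhs is (predicate name, arguments); each argument is a tuple of symbols.
    lhs: tuple
    # rhs is a tuple of (predicate name, arguments); empty for an epsilon clause.
    rhs: tuple = ()


@dataclass
class RCG:
    arity: dict          # the predicates N with their arities
    terminals: frozenset
    variables: frozenset
    clauses: tuple
    start: str


def check(g):
    """Return a list of violations of the definition (empty if none are found)."""
    errors = []
    if g.terminals & g.variables:
        errors.append("T and V are not disjoint")
    if g.arity.get(g.start) != 1:
        errors.append("start predicate must have arity 1")
    symbols = g.terminals | g.variables
    for c in g.clauses:
        for name, args in (c.lhs, *c.rhs):
            if g.arity.get(name) != len(args):
                errors.append(f"wrong arity for {name}")
            for arg in args:
                for sym in arg:
                    if sym not in symbols:
                        errors.append(f"unknown symbol {sym!r}")
    return errors


# The sample grammar from the slides, in the encoding assumed above.
SAMPLE = RCG(
    arity={"S": 1, "A": 2, "B": 1},
    terminals=frozenset("ab"),
    variables=frozenset("XYZ"),
    clauses=(
        Clause(("S", (("X", "Y", "Z"),)), (("A", (("X",), ("Z",))), ("B", (("Y",),)))),
        Clause(("A", (("a", "X"), ("a", "Y"))), (("A", (("X",), ("Y",))),)),
        Clause(("B", (("b", "X"),)), (("B", (("X",),)),)),
        Clause(("A", ((), ()))),   # A(eps, eps) -> eps
        Clause(("B", ((),))),      # B(eps) -> eps
    ),
    start="S",
)
```

Calling `check(SAMPLE)` returns an empty list, since the sample grammar satisfies every condition checked here.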

  3. A sample RCG (2)
     As an example consider the reduction of $w = aabaa$. Here $w_{i,j}$ denotes the range $\langle i, j \rangle$ of $w$.
     Instantiating $S(XYZ) \to A(X, Z)\, B(Y)$ with $X = w_{0,2} = aa$, $Y = w_{2,3} = b$, $Z = w_{3,5} = aa$ gives
       $S(w_{0,5}) \Rightarrow A(w_{0,2}, w_{3,5})\, B(w_{2,3})$.

     A sample RCG (3)
     Then $B(bX) \to B(X)$, instantiated with $b = w_{2,3}$ and $X = w_{3,3} = \epsilon$, and $B(\epsilon) \to \epsilon$ lead to
       $A(w_{0,2}, w_{3,5})\, B(w_{2,3}) \Rightarrow A(w_{0,2}, w_{3,5})\, B(w_{3,3}) \Rightarrow A(w_{0,2}, w_{3,5})$.

     A sample RCG (4)
     $A(aX, aY) \to A(X, Y)$, instantiated with the first $a = w_{0,1}$, $X = w_{1,2} = a$, the second $a = w_{3,4}$, and $Y = w_{4,5} = a$, leads to
       $A(w_{0,2}, w_{3,5}) \Rightarrow A(w_{1,2}, w_{4,5})$.

     A sample RCG (5)
     Then $A(aX, aY) \to A(X, Y)$, instantiated with the first $a = w_{1,2}$, $X = w_{2,2} = \epsilon$, the second $a = w_{4,5}$, and $Y = w_{5,5} = \epsilon$, and $A(\epsilon, \epsilon) \to \epsilon$ lead to
       $A(w_{1,2}, w_{4,5}) \Rightarrow A(w_{2,2}, w_{5,5}) \Rightarrow \epsilon$.
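The reduction of $w = aabaa$ can be replayed mechanically. The Python sketch below is a naive recognizer written for this illustration; it is not one of the parsing algorithms of the course. It assumes that every variable occurs at most once in the left-hand side and that right-hand-side arguments are single variables (both hold for the sample grammar), and that the grammar has no cyclic reductions. The names `CLAUSES`, `bindings`, and `recognize` are hypothetical.

```python
from functools import lru_cache

TERMINALS = {"a", "b"}

# Clause encoding (an assumption of this sketch):
# (LHS predicate, LHS arguments, RHS as a list of (predicate, variables)).
CLAUSES = [
    ("S", (("X", "Y", "Z"),), [("A", ("X", "Z")), ("B", ("Y",))]),
    ("A", (("a", "X"), ("a", "Y")), [("A", ("X", "Y"))]),
    ("B", (("b", "X"),), [("B", ("X",))]),
    ("A", ((), ()), []),   # A(eps, eps) -> eps
    ("B", ((),), []),      # B(eps) -> eps
]


def bindings(arg, i, j, w, env):
    """Yield every extension of env that matches arg against the range <i, j> of w."""
    if not arg:
        if i == j:                      # the whole range has been consumed
            yield env
        return
    head, rest = arg[0], arg[1:]
    if head in TERMINALS:
        if i < j and w[i] == head:      # terminal must match the next symbol
            yield from bindings(rest, i + 1, j, w, env)
    else:
        for m in range(i, j + 1):       # try every range <i, m> for the variable
            yield from bindings(rest, m, j, w, {**env, head: (i, m)})


def recognize(w):
    """Return True if S(<0, |w|>) can be reduced to epsilon with respect to w."""

    @lru_cache(maxsize=None)
    def holds(pred, ranges):
        for lhs_pred, lhs_args, rhs in CLAUSES:
            if lhs_pred != pred or len(lhs_args) != len(ranges):
                continue
            envs = [{}]
            for arg, (i, j) in zip(lhs_args, ranges):
                envs = [e2 for e in envs for e2 in bindings(arg, i, j, w, e)]
            for env in envs:
                # An instantiation succeeds if all RHS predicates reduce to epsilon.
                if all(holds(p, tuple(env[v] for v in vs)) for p, vs in rhs):
                    return True
        return False

    return holds("S", ((0, len(w)),))
```

With this sketch, `recognize("aabaa")` is true and follows the same instantiations as the slides, while `recognize("aabab")` is false because no split of the string satisfies both `A` and `B`.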

  4. Definition of RCGs: Other properties (1)
     • An RCG with maximal predicate arity $k$ is called an RCG of arity $k$ (also called a $k$-RCG).
     • An RCG is called non-combinatorial if each of the arguments in the right-hand sides of the productions is a single variable.
     • An RCG is called linear if no variable appears more than once in the left-hand sides of the productions and no variable appears more than once in the right-hand side of the productions.

     Definition of RCGs: Other properties (2)
     • An RCG is called non-erasing if for each production, each variable occurring in the left-hand side occurs also in the right-hand side and vice versa.
     • An RCG is called simple if it is non-combinatorial, linear and non-erasing.
     • A simple RCG is called ordered simple if the range variables are ordered the same way in the RHS and the LHS predicates.
     Ordered simple RCG is equivalent to simple RCG.

     RCG parsing: Treatment of terminals
     Without loss of generality, we presuppose that all non-$\epsilon$ clauses contain no terminals in their arguments.
     For each $t \in T$, we introduce a new clause $T_t(t) \to \epsilon$, and for each clause $C \in P$,
     • we replace each occurrence $t'$ of $t$ in all arguments of all predicates with a variable $V_{t'}$,
     • for each $V_{t'}$, we add the predicate $T_t(V_{t'})$ to the RHS of $C$.
     Furthermore, for all clauses we assume that their variables are continuously numbered from 1 to some $j$.

     RCG parsing: Range vectors
     We will use range vectors similar to those used for MCFG parsing. Range vectors are used to describe variable bindings.
     • $\phi = (\langle x_1, y_1 \rangle, \dots, \langle x_k, y_k \rangle)$ is a range vector in $w$ if all $\langle x_i, y_i \rangle$ are ranges in $w$ for $1 \leq i \leq k$.
     • $\phi = (\langle x_1, y_1 \rangle, \dots, \langle x_k, y_k \rangle)$ is a range constraint vector if it contains pairs $\langle x, y \rangle$ where $x, y \in \mathrm{Pos}(w) \cup V_r$ ($V_r$ is a set $\{r_1, r_2, \dots\}$ of range boundary variables) such that if $\langle x, y \rangle \in \mathrm{Pos}(w)^2$ then it is a range.
     • $k$ is called the dimension of $\phi$.
     • $\phi(i).l$ denotes the first component and $\phi(i).r$ the second component of the $i$-th element of $\phi$.
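The treatment of terminals described above is a simple grammar transformation. The Python sketch below shows one possible reading of it, under two assumptions made for this example: $\epsilon$-clauses are left unchanged (the text only requires non-$\epsilon$ clauses to be terminal-free), and the fresh variables get hypothetical names of the form `V_a1`, `V_a2`, and so on. The function name `remove_terminals` and the clause encoding are likewise illustrative, and the final renumbering of variables from 1 to $j$ is not performed.

```python
def remove_terminals(clauses, terminals):
    """Move terminals out of the arguments of non-epsilon clauses.

    clauses: list of (lhs, rhs); lhs and every rhs element are (name, args)
    pairs, where args is a tuple of arguments and each argument is a tuple
    of symbols. An empty rhs encodes an epsilon clause.
    """
    # One new clause T_t(t) -> epsilon per terminal t.
    out = [((f"T_{t}", ((t,),)), ()) for t in sorted(terminals)]
    for lhs, rhs in clauses:
        if not rhs:                      # epsilon clause: kept unchanged
            out.append((lhs, rhs))
            continue
        counter = 0
        extra = []                       # T_t(V) predicates to append to the RHS

        def rewrite(pred):
            nonlocal counter
            name, args = pred
            new_args = []
            for arg in args:
                new_arg = []
                for sym in arg:
                    if sym in terminals:
                        counter += 1
                        var = f"V_{sym}{counter}"   # fresh variable for this occurrence
                        extra.append((f"T_{sym}", ((var,),)))
                        new_arg.append(var)
                    else:
                        new_arg.append(sym)
                new_args.append(tuple(new_arg))
            return (name, tuple(new_args))

        new_lhs = rewrite(lhs)
        new_rhs = tuple(rewrite(p) for p in rhs)
        out.append((new_lhs, new_rhs + tuple(extra)))
    return out
```

Applied to the sample clause $A(aX, aY) \to A(X, Y)$, this sketch yields a clause of the shape $A(V_1 X, V_2 Y) \to A(X, Y)\, T_a(V_1)\, T_a(V_2)$, together with the clauses $T_a(a) \to \epsilon$ and $T_b(b) \to \epsilon$.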

  5. RCG parsing: Variable constraint vectors
     The variable constraint vector $\phi$ of a non-$\epsilon$ clause $A(\vec{x}) \to \Phi$ is a range constraint vector of dimension $j$, $j$ being the highest variable index in the clause. It contains only pairs from $V_r \times V_r$ and must be consistent with variable adjacencies in the clause.
     Formally, the elements of $\phi$ are pairs from $V_r \times V_r$ such that $\phi(h).r = \phi(i).l$ iff $X_h X_i$ occurs as a substring in one of the arguments of the clause.

     Update of range vectors
     We define an update $\phi'$ of a range constraint vector $\phi$ with respect to an identity $x = y$, $x, y \in \mathrm{Pos}(w) \cup V_r$, as follows:
     • if $x = y$, then $\phi' = \phi$;
     • else if $x \in V_r$ and the result $\psi$ of replacing all occurrences of $x$ in $\phi$ with $y$ is a range constraint vector, then $\phi' = \psi$;
     • else if $y \in V_r$ and the result $\psi$ of replacing all occurrences of $y$ in $\phi$ with $x$ is a range constraint vector, then $\phi' = \psi$;
     • otherwise, $\phi'$ is undefined.

     Directional top-down parsing
     Corresponds to the algorithm presented in Boullier (2000).
     Item form:
     • Active items: $[A(\vec{X}) \to \Phi \bullet \Psi, \phi]$
     • Passive items: $[A, \psi, \mathit{flag}]$
     where
     • $\phi$ is a range vector of dimension $j$, $j$ being the highest variable index in the clause,
     • $\psi$ is a range vector of dimension $k$, $k$ being the arity of $A$,
     • $\mathit{flag} \in \{p, c\}$ indicates whether a passive item is predicted or completed.

     Directional top-down parsing (axiom and goal)
     • Axiom: $[S, (\langle 0, n \rangle), p]$
     • The goal item is $[S, (\langle 0, n \rangle), c]$.
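The update operation on range constraint vectors can be sketched in a few lines of Python. The encoding is an assumption of this example: positions in $\mathrm{Pos}(w)$ are integers between 0 and $n = |w|$, range boundary variables from $V_r$ are strings such as `"r1"`, a vector is a tuple of pairs, and an undefined update is returned as `None`. The function names `is_range_constraint_vector` and `update` are hypothetical.

```python
def is_range_constraint_vector(phi, n):
    """Check that every pair made of two positions is a range in a word of length n."""
    for left, right in phi:
        for bound in (left, right):
            if isinstance(bound, int) and not 0 <= bound <= n:
                return False
        if isinstance(left, int) and isinstance(right, int) and left > right:
            return False
    return True


def update(phi, x, y, n):
    """Update phi with respect to the identity x = y; None means 'undefined'."""
    if x == y:
        return phi
    # First try to replace x by y (if x is a variable), then y by x.
    for var, value in ((x, y), (y, x)):
        if isinstance(var, str):
            psi = tuple(
                tuple(value if bound == var else bound for bound in pair)
                for pair in phi
            )
            if is_range_constraint_vector(psi, n):
                return psi
    return None


# Variable constraint vector for a clause such as S(X1 X2 X3) -> A(X1, X3) B(X2):
# adjacent variables share a boundary variable.
PHI = (("r1", "r2"), ("r2", "r3"), ("r3", "r4"))
```

For example, `update(PHI, "r1", 0, 5)` binds the left boundary of the first variable to position 0, whereas an update that would turn some pair into something like $\langle 2, 1 \rangle$ is undefined and yields `None` in this sketch.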
