
Pumping and Ogden Properties of Multiple Context-Free Grammars

Makoto Kanazawa, National Institute of Informatics and SOKENDAI, Japan


  1. Slide 2: Biography
     • 1994: Ph.D. in Linguistics, Stanford University
     • 1994: Department of Behavioral Sciences, Faculty of Letters, Chiba University
     • 2000: Interfaculty Initiative in Information Studies, University of Tokyo
     • 2004: National Institute of Informatics
     • 2018: Department of Advanced Sciences, Faculty of Science and Engineering, Hosei University

     Slide 3: Multiple Context-Free Grammars
     Notes: Multiple context-free grammars (MCFGs) arose from concerns in computational linguistics. CFGs are almost good enough for natural-language grammars, but not quite; a mild extension of CFGs is needed. Several criteria were put forward as to what constitutes a "mild" extension.
     • Introduced by Seki, Matsumura, Fujii, and Kasami (1987–1991)
     • Independently by Vijay-Shanker, Weir, and Joshi (1987)
     • Many equivalent models
     • Often thought to be an adequate formalization of mildly context-sensitive grammars (Joshi 1985)

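  2. Slide 4: Context-Free Grammars
     A production has the form $A \to w_0 B_1 w_1 \dots B_n w_n$ with $B_i \in N$ and $w_j \in \Sigma^*$.
     Top-down derivation:
     • $S \Rightarrow_G^* S$
     • if $S \Rightarrow_G^* \beta A \gamma$ and $A \to \alpha \in P$, then $S \Rightarrow_G^* \beta \alpha \gamma$
     $L(G) = \{ w \in \Sigma^* \mid S \Rightarrow_G^* w \}$

     Slide 5: Bottom-Up Interpretation
     If $B_i \Rightarrow_G^* v_i$ for $i = 1, \dots, n$ and $A \to w_0 B_1 w_1 \dots B_n w_n \in P$, then $A \Rightarrow_G^* w_0 v_1 w_1 \dots v_n w_n$.
     $L(G) = \{ w \in \Sigma^* \mid S \Rightarrow_G^* w \}$

     Slide 6: CFGs as Logic Programs on Strings
     The production $A \to w_0 B_1 w_1 \dots B_n w_n$ corresponds to the Horn clause
     $A(w_0 x_1 w_1 \dots x_n w_n) \leftarrow B_1(x_1), \dots, B_n(x_n)$.
     $L(G) = \{ w \in \Sigma^* \mid G \vdash S(w) \}$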
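The bottom-up and Horn-clause readings of a CFG can be sketched in Python as a fixpoint computation over facts A(w). This is only an illustration under assumptions that are not in the slides: the function name `bottom_up`, the list encoding of rule bodies, the length bound `max_len`, and the toy grammar S → a S b | ε are all hypothetical.

```python
from itertools import product

def bottom_up(rules, nonterminals, max_len):
    """Collect all facts A(w) with len(w) <= max_len derivable from the rules.

    Each rule is (head, body), where body lists terminal strings and
    nonterminal names in order, mirroring A -> w0 B1 w1 ... Bn wn.
    """
    facts = {A: set() for A in nonterminals}
    changed = True
    while changed:
        changed = False
        for head, body in rules:
            # A nonterminal contributes every string derived so far;
            # a terminal string contributes only itself.
            options = [sorted(facts[s]) if s in nonterminals else [s] for s in body]
            for combo in product(*options):
                w = "".join(combo)
                if len(w) <= max_len and w not in facts[head]:
                    facts[head].add(w)
                    changed = True
    return facts

# Hypothetical toy grammar: S -> a S b | epsilon
toy_rules = [("S", ["a", "S", "b"]), ("S", [])]
derived = bottom_up(toy_rules, {"S"}, max_len=6)["S"]
```

The length bound is there only to make the fixpoint finite; the definition of L(G) itself has no such bound.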

  3. Slide 7: Multiple Context-Free Grammars
     Notes: It's best to think of an MCFG as a kind of logic program. Each rule is a definite clause. Nonterminals are predicates on strings.
     A rule has the form
     $A(\alpha_1, \dots, \alpha_q) \leftarrow B_1(x_{1,1}, \dots, x_{1,q_1}), \dots, B_n(x_{n,1}, \dots, x_{n,q_n})$
     where $n \ge 0$, $q, q_i \ge 1$, $\alpha_k \in (\Sigma \cup \{ x_{i,j} \mid i \in [1,n],\ j \in [1,q_i] \})^*$, and each $x_{i,j}$ occurs exactly once in $(\alpha_1, \dots, \alpha_q)$.
     • $q = \dim(A)$ (the dimension of $A$)
     • $\dim(S) = 1$
     • $L(G) = \{ w \in \Sigma^* \mid G \vdash S(w) \}$

     Slide 8
     Notes: An m-MCFG is an MCFG with nonterminal dimension not exceeding m. 1-MCFG = CFG. A derivation tree for w is a proof of S(w).
     Example (a 2-MCFG with 2-ary branching) for $\{ w \# w^R \mid w \in D_1^* \}$:
     $S(x_1 \# x_2) \leftarrow D(x_1, x_2)$
     $D(\varepsilon, \varepsilon) \leftarrow$
     $D(x_1 y_1, y_2 x_2) \leftarrow E(x_1, x_2), D(y_1, y_2)$
     $E(a x_1 \bar{a}, \bar{a} x_2 a) \leftarrow D(x_1, x_2)$
     Derivation tree:
     S(aaāāaā#āaāāaa)
       D(aaāāaā, āaāāaa)
         E(aaāā, āāaa)
           D(aā, āa)
             E(aā, āa)
               D(ε, ε)
             D(ε, ε)
         D(aā, āa)
           E(aā, āa)
             D(ε, ε)
           D(ε, ε)

     Slide 9
     Notes: The languages of MCFGs form an infinite hierarchy.
     A non-branching m-MCFG:
     $S(x_1 \dots x_m) \leftarrow A(x_1, \dots, x_m)$
     $A(\varepsilon, \dots, \varepsilon) \leftarrow$
     $A(a_1 x_1 a_2, \dots, a_{2m-1} x_m a_{2m}) \leftarrow A(x_1, \dots, x_m)$
     Its language is $\{ a_1^n a_2^n \dots a_{2m-1}^n a_{2m}^n \mid n \ge 0 \}$.
     Hierarchy diagram: 1-MCFL = CFL, 2-MCFL, …, (m − 1)-MCFL, m-MCFL, with this language in m-MCFL but outside (m − 1)-MCFL (Seki et al. 1991).
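Reading the 2-MCFG for { w#w^R | w ∈ D1* } as a logic program, its derivable facts can be enumerated bottom-up. The sketch below hard-codes the four rules of that grammar; the function name `close_reversal_grammar`, the bound `max_len` on component length, and the use of the letter `A` in place of ā are assumptions made for illustration.

```python
def close_reversal_grammar(max_len):
    """Enumerate S-facts of the example 2-MCFG, with components of length <= max_len.

    'A' stands for the barred letter ā.
    """
    D = {("", "")}   # D(ε, ε) ←
    E = set()
    changed = True
    while changed:
        changed = False
        # E(a x1 ā, ā x2 a) ← D(x1, x2)
        new_E = {("a" + x1 + "A", "A" + x2 + "a") for (x1, x2) in D}
        # D(x1 y1, y2 x2) ← E(x1, x2), D(y1, y2)
        new_D = {(x1 + y1, y2 + x2) for (x1, x2) in E for (y1, y2) in D}
        for store, new in ((E, new_E), (D, new_D)):
            for fact in new:
                if len(fact[0]) <= max_len and fact not in store:
                    store.add(fact)
                    changed = True
    # S(x1 # x2) ← D(x1, x2)
    return {x1 + "#" + x2 for (x1, x2) in D}

language_sample = close_reversal_grammar(6)
```

Each pair stored in `D` or `E` is one node label of a derivation tree, so the string derived on slide 8 appears in `language_sample` as `aaAAaA#AaAAaa`.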

  4. Slide 10: Chomsky Hierarchy
     Rewriting grammars | Machines         | Logic programs on strings                                            | Languages
     Unrestricted       | Turing           | Elementary Formal Systems (Smullyan 1961)                            | r.e.
     Context-Sensitive  | LBA              | Length-Bounded EFS (Arikawa et al. 1989)                             | CSL = NSPACE(n)
     -                  | Poly-time Turing | Simple LMG (Groenink 1997) / Hereditary EFS (Ikeda and Arimura 1997) | P
     -                  | -                | MCFG                                                                 | MCFL
     Context-Free       | PDA              | Simple EFS (Arikawa 1970)                                            | CFL
     Right-Linear       | FA               | -                                                                    | Reg

     Slide 11: Which Properties of CFGs Are Shared by/Generalize to MCFGs?
     • Membership in LOGCFL
     • Semilinearity
     • …

     Slide 12: Pumping
     Notes: Derivation trees of MCFGs are similar to those of CFGs. When the same nonterminal occurs twice on the same path of a derivation tree,…
     $S(x_1 \# x_2) \leftarrow D(x_1, x_2)$
     $D(\varepsilon, \varepsilon) \leftarrow$
     $D(x_1 y_1, y_2 x_2) \leftarrow E(x_1, x_2), D(y_1, y_2)$
     $E(a x_1 \bar{a}, \bar{a} x_2 a) \leftarrow D(x_1, x_2)$
     Derivation tree:
     S(aaāāaā#āaāāaa)
       D(aaāāaā, āaāāaa)
         E(aaāā, āāaa)
           D(aā, āa)
             E(aā, āa)
               D(ε, ε)
             D(ε, ε)
         D(aā, āa)
           E(aā, āa)
             D(ε, ε)
           D(ε, ε)

  5. Slide 13
     Notes: You can decompose the derivation tree into three parts, and the middle part can be iterated any number of times, including zero times. In the overall derivation tree, the variables $x_1, x_2, y_1, y_2$ are instantiated by … The number of iterated substrings (factors) is larger than two.
     Top part:
     S(y₁#y₂)
       D(y₁, y₂)
     Middle part (the pump):
     D(a x₁ āaā, āaā x₂ a)
       E(a x₁ ā, ā x₂ a)
         D(x₁, x₂)
       D(aā, āa)
         E(aā, āa)
           D(ε, ε)
         D(ε, ε)
     Bottom part:
     D(aā, āa)
       E(aā, āa)
         D(ε, ε)
       D(ε, ε)
     Iterating the middle part n times gives $a^n a\bar{a} (\bar{a}a\bar{a})^n \# (\bar{a}a\bar{a})^n \bar{a}a\, a^n \in L(G)$.

     Slide 14: Iterative Properties
     Notes: For MCFGs, we need to consider a generalized form of the condition of the pumping lemma. Not straightforward; an open question for a long time.
     $L$ is $k$-iterative iff
     $\exists p\, \forall z \in L\, (|z| \ge p \Rightarrow \exists u_1 \dots u_{k+1}\, v_1 \dots v_k\, (z = u_1 v_1 \dots u_k v_k u_{k+1} \wedge v_1 \dots v_k \ne \varepsilon \wedge \forall n \ge 0\, (u_1 v_1^n \dots u_k v_k^n u_{k+1} \in L)))$
     • $L \in \mathrm{CFL} \Rightarrow L$ is 2-iterative
     • $L \in m\text{-MCFL} \Rightarrow L$ is $2m$-iterative? (wrong claim in 1991)

     Slide 15: Difficulty with Pumping
     Notes: The middle part of the derivation tree may look like this.
     A pump from $A(x_1, x_2)$ up to $A(v_1 x_1 v_2, v_3 x_2 v_4)$, stacked twice, gives $A(v_1^2 x_1 v_2^2, v_3^2 x_2 v_4^2)$.
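The pumped family from slide 13 can be checked against the k-iterative condition with k = 4. The sketch below is an illustration only: the helper names `pump`, `is_dyck`, and `in_language`, the use of `A` for ā, and the particular split of the string into factors are assumptions chosen to match the derivation tree, not something stated on the slides.

```python
def pump(us, vs, n):
    """Build u1 v1^n u2 v2^n ... uk vk^n u_{k+1}."""
    assert len(us) == len(vs) + 1
    out = [us[0]]
    for v, u in zip(vs, us[1:]):
        out.append(v * n)
        out.append(u)
    return "".join(out)

def is_dyck(w):
    """Balanced over the bracket pair a / A (A stands for ā)."""
    depth = 0
    for ch in w:
        depth += 1 if ch == "a" else -1
        if depth < 0:
            return False
    return depth == 0

def in_language(z):
    """Membership in { w#w^R | w in D1* }."""
    if z.count("#") != 1:
        return False
    w, r = z.split("#")
    return set(w) <= {"a", "A"} and r == w[::-1] and is_dyck(w)

# z = aaAAaA#AaAAaa split as u1 v1 u2 v2 u3 v3 u4 v4 u5 (u1 and u5 empty)
us = ["", "aA", "#", "Aa", ""]
vs = ["a", "AaA", "AaA", "a"]
pumped_ok = all(in_language(pump(us, vs, n)) for n in range(10))
```

With n = 1 the factors reassemble the original string, and with n = 0 they give `aA#Aa`, the string derived by the top and bottom parts alone.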

  6. Slide 16: Difficulty with Pumping
     Notes: Or like this.
     A pump from $A(x_1, x_2)$ up to $A(v_1 x_1 v_2 x_2 v_3, v_4)$, stacked twice, gives $A(v_1^2 x_1 v_2 x_2 v_3 v_2 v_4 v_3, v_4)$. This is an "uneven pump".

     Slide 17
     Notes: The pumping lemma fails for 3-MCFGs.
     $S(x_1 \# x_2 \# x_3) \leftarrow A(x_1, x_2, x_3)$
     $A(a x_1, y_1 c x_2 \bar{c} d y_2 \bar{d} x_3, y_3 b) \leftarrow A(x_1, x_2, x_3), A(y_1, y_2, y_3)$
     $A(a, \varepsilon, b) \leftarrow$
     The language of this grammar is not k-iterative for any k (Kanazawa et al. 2014).
     Hierarchy diagram: 1-MCFL = CFL, 2-MCFL, 3-MCFL, …, m-MCFL.

     Slide 18: Pumping Lemma for Subclasses
     Notes: Pumping is possible for special cases: well-nested MCFGs.
     Diagram: the classes m-MCFL and m-MCFL_wn, 2-MCFL and 2-MCFL_wn, and 1-MCFL = 1-MCFL_wn = CFL, with the labels "2m-iterative", "4-iterative", and "well-nested MCFGs" (Kanazawa 2009).
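The two pump shapes can be composed symbolically to see why one iterates cleanly and the other does not. In this sketch a pump is a pair of symbol lists over the names v1…v4, x1, x2; the function name `substitute` and this encoding are hypothetical and chosen only for illustration.

```python
def substitute(pump, args):
    """Replace the variables x1, x2 in `pump` by the components of `args`.

    Both arguments are pairs of symbol lists; the result is the pump
    obtained by stacking `pump` on top of `args`.
    """
    env = {"x1": args[0], "x2": args[1]}
    return tuple(
        [s for sym in component for s in env.get(sym, [sym])]
        for component in pump
    )

even = (["v1", "x1", "v2"], ["v3", "x2", "v4"])
uneven = (["v1", "x1", "v2", "x2", "v3"], ["v4"])

even_twice = substitute(even, even)
uneven_twice = substitute(uneven, uneven)
```

Stacking the even pump repeats each v_i next to itself, so every factor is simply raised to a power. Stacking the uneven pump interleaves v2, v4, and v3 after x2, which is the situation the slides call an "uneven pump".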

  7. Slide 19: Well-Nestedness
     Notes: Well-nestedness has a natural equivalent characterization: $\mathrm{yCFT}_{\mathrm{sp}}$.
     Well-nested grammar for $\{ w \# w^R \mid w \in D_1^* \}$:
     $S(x_1 \# x_2) \leftarrow D(x_1, x_2)$
     $D(\varepsilon, \varepsilon) \leftarrow$
     $D(x_1 y_1, y_2 x_2) \leftarrow E(x_1, x_2), D(y_1, y_2)$
     $E(a x_1 \bar{a}, \bar{a} x_2 a) \leftarrow D(x_1, x_2)$
     Non-well-nested grammar for $\{ w \# w \mid w \in D_1^* \}$:
     $S(x_1 \# x_2) \leftarrow D(x_1, x_2)$
     $D(\varepsilon, \varepsilon) \leftarrow$
     $D(x_1 y_1, x_2 y_2) \leftarrow E(x_1, x_2), D(y_1, y_2)$
     $E(a x_1 \bar{a}, a x_2 \bar{a}) \leftarrow D(x_1, x_2)$
     $\{ w \# w \mid w \in D_1^* \} \notin \mathrm{MCFL}_{\mathrm{wn}}$ (Kanazawa and Salvati 2010)

     Slide 20: Difficulty with Pumping
     Notes: Pumping is not easy to prove even for well-nested MCFGs: this situation can still arise.
     A pump from $A(x_1, x_2)$ up to $A(v_1 x_1 v_2 x_2 v_3, v_4)$, stacked twice, gives $A(v_1^2 x_1 v_2 x_2 v_3 v_2 v_4 v_3, v_4)$ (an "uneven pump").

     Slide 21
     Notes: A very simple example. The only choice you can make is the number of times you use the second rule. The language is actually 2-iterative, but there is no straightforward connection between the iterated substrings and parts of derivation trees.
     $S(x_1 x_2) \leftarrow A(x_1, x_2)$
     $A(a x_1 b x_2 c, d) \leftarrow A(x_1, x_2)$
     $A(\varepsilon, \varepsilon) \leftarrow$
     (non-branching ⊆ well-nested)
     Derivations, by the number i of uses of the second rule:
     • i = 0: A(ε, ε), S(ε)
     • i = 1: A(ε, ε), A(abc, d), S(abcd)
     • i = 2: A(ε, ε), A(abc, d), A(aabcbdc, d), S(aabcbdcd)
     • i = 3: A(ε, ε), A(abc, d), A(aabcbdc, d), A(aaabcbdcbdc, d), S(aaabcbdcbdcd)
     $L(G) = \{ \varepsilon \} \cup \{ a^{i-1}\, abc\, (bdc)^{i-1}\, d \mid i \ge 1 \}$
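The simple non-branching grammar on slide 21 is small enough to run directly. The sketch below applies the second rule i times, compares the result with the closed form, and tries one 2-factor pumping decomposition. The function names and the particular decomposition (v1 = `a`, v2 = `bdc`) are assumptions for illustration; the slides only state that the language is 2-iterative.

```python
def derive(i):
    """Apply A(a x1 b x2 c, d) <- A(x1, x2) i times to A(ε, ε), then S(x1 x2)."""
    x1, x2 = "", ""
    for _ in range(i):
        x1, x2 = "a" + x1 + "b" + x2 + "c", "d"
    return x1 + x2

def closed_form(i):
    """The string a^(i-1) abc (bdc)^(i-1) d for i >= 1, and ε for i = 0."""
    return "" if i == 0 else "a" * (i - 1) + "abc" + "bdc" * (i - 1) + "d"

def pumped(i, n):
    """For i >= 2, split derive(i) as u1 v1 u2 v2 u3 and return u1 v1^n u2 v2^n u3."""
    u1, v1 = "a" * (i - 1), "a"
    u2, v2 = "bc" + "bdc" * (i - 2), "bdc"
    u3 = "d"
    return u1 + v1 * n + u2 + v2 * n + u3
```

In this decomposition the factor `bdc` takes its `d` from the second component of an A-fact and its `b` and `c` from rule applications at different levels, so it does not correspond to any single part of the derivation tree.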

  8. Slide 22
     Notes: If the derivation tree contains an even m-pump, the string is 2m-pumpable. Otherwise, the string is in the language of some well-nested (m − 1)-MCFG, and therefore is 2(m − 1)-pumpable (disregarding finitely many exceptions). Proof by induction on m.
     Figure ("even m-pump"): a tree segment from $B(x_1, \dots, x_m)$ up to $B(v_1 x_1 v_2, \dots, v_{2m-1} x_m v_{2m})$.
     • If $G$ is a well-nested $m$-MCFG, $\{ T \mid T \text{ is a derivation tree of } G \text{ without even } m\text{-pumps} \}$ may not be finite.
     • But there is a well-nested $(m-1)$-MCFG generating $\{ \mathrm{yield}(T) \mid T \text{ is a derivation tree of } G \text{ without even } m\text{-pumps} \}$.

     Slide 23: Pumping Lemma for Subclasses
     Notes: My proof of the pumping lemma for m-MCFL_wn and 2-MCFL is not straightforward.
     Diagram: the classes m-MCFL and m-MCFL_wn, 2-MCFL and 2-MCFL_wn, and 1-MCFL = 1-MCFL_wn = CFL, with the labels "2m-iterative" and "4-iterative" (Kanazawa 2009, by grammar splitting and transformation).
     What about Ogden's Lemma?

     Slide 24: Ogden's Lemma for CFL
     Notes: Can be used to show inherent ambiguity of some CFLs, e.g., $\{ a^m b^n c^p \mid m = n \vee n = p \}$.
     $L \in \mathrm{CFL} \Rightarrow \exists p\, \forall z \in L\, ($at least $p$ positions of $z$ are marked $\Rightarrow \exists u_1 u_2 u_3 v_1 v_2\, (z = u_1 v_1 u_2 v_2 u_3 \wedge (u_1, v_1, u_2$ each contain a marked position $\vee\ u_2, v_2, u_3$ each contain a marked position$) \wedge v_1 u_2 v_2$ contains no more than $p$ marked positions $\wedge \forall n \ge 0\, (u_1 v_1^n u_2 v_2^n u_3 \in L)))$
     (Ogden 1968)
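The side conditions of Ogden's lemma on a given decomposition can be written out as a small checker. This is a sketch under assumptions that are not in the slides: the function names, the 0-based encoding of marked positions, the toy language { a^k b^k }, and the value p = 3 are all hypothetical.

```python
def ogden_conditions(parts, marked, p):
    """Check the marking conditions for parts = (u1, v1, u2, v2, u3).

    `marked` holds 0-based positions of z = u1 v1 u2 v2 u3.
    """
    bounds = [0]
    for part in parts:
        bounds.append(bounds[-1] + len(part))
    # has_mark[j] is True when the j-th factor contains a marked position
    has_mark = [any(bounds[j] <= i < bounds[j + 1] for i in marked) for j in range(5)]
    left = has_mark[0] and has_mark[1] and has_mark[2]
    right = has_mark[2] and has_mark[3] and has_mark[4]
    # marked positions inside v1 u2 v2
    inner = sum(1 for i in marked if bounds[1] <= i < bounds[4])
    return (left or right) and inner <= p

def pump(parts, n):
    u1, v1, u2, v2, u3 = parts
    return u1 + v1 * n + u2 + v2 * n + u3

# z = aaabbb with all three a's marked, split as a | a | ab | b | b
parts = ("a", "a", "ab", "b", "b")
ok = ogden_conditions(parts, marked={0, 1, 2}, p=3)
```

The checker covers only the conditions on marked positions; membership of every pumped string in L still has to be verified separately, which for this toy language amounts to `pump(parts, n)` having the form a^k b^k.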

  9. Slide 25
     Notes: There are various ways of generalizing Ogden's lemma suitable for MCFGs. At least this much should be implied.
     $L$ has the weak Ogden property iff
     $\exists p\, \forall z \in L\, ($at least $p$ positions of $z$ are marked $\Rightarrow \exists k \ge 1\, \exists u_1 \dots u_{k+1}\, v_1 \dots v_k\, (z = u_1 v_1 \dots u_k v_k u_{k+1} \wedge \exists i\, (v_i$ contains a marked position$) \wedge \forall n \ge 0\, (u_1 v_1^n \dots u_k v_k^n u_{k+1} \in L)))$

     Slide 26: The Failure of Ogden's Lemma
     Notes: This is the first new result in this talk.
     Diagram: the classes m-MCFL, m-MCFL_wn, 3-MCFL_wn, 2-MCFL, 2-MCFL_wn, and 1-MCFL = 1-MCFL_wn = CFL, with the labels "2m-iterative", "6-iterative", and "4-iterative".
     The weak Ogden property fails for 3-MCFL_wn and 2-MCFL.

     Slide 27
     Notes: A language for which the weak Ogden property fails.
     $\{ a^{i_1} b^{i_0} \$ a^{i_2} b^{i_1} \$ a^{i_3} b^{i_2} \$ \dots \$ a^{i_n} b^{i_{n-1}} \mid n \ge 3,\ i_0, \dots, i_n \ge 0 \}$
     Diagram: the language shown relative to 3-MCFL_wn, 2-MCFL, 2-MCFL_wn, and CFL.
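The language on slide 27 links each block to its neighbor: the number of a's in one block must equal the number of b's in the next. A membership test makes that chain of constraints concrete. The function name `in_chain_language` and the sample strings are assumptions for illustration.

```python
import re

def in_chain_language(s):
    """Membership in { a^i1 b^i0 $ a^i2 b^i1 $ ... $ a^in b^i(n-1) | n >= 3 }."""
    blocks = s.split("$")
    if len(blocks) < 3:
        return False
    counts = []
    for block in blocks:
        m = re.fullmatch(r"(a*)(b*)", block)
        if m is None:
            return False
        counts.append((len(m.group(1)), len(m.group(2))))
    # Block j is a^(i_j) b^(i_(j-1)), so its a-count must equal
    # the b-count of block j + 1; i_0 and i_n are unconstrained.
    return all(counts[j][0] == counts[j + 1][1] for j in range(len(counts) - 1))
```

For instance, `ab$aab$bb` passes the test, while `ab$aab$a` does not, because the two a's of the second block are not matched by two b's in the third.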
