SLIDE 1
Extracting semi-Dyck words from fsa using the CYK algorithm
Thomas Ruprecht
November 30, 2018
SLIDE 2
Outline
▶ Motivation
▶ Finding appropriate restrictions
▶ CYK algorithm for extraction of semi-Dyck words
SLIDE 3
Motivation: Chomsky-Schützenberger parsing
▶ Chomsky-Schützenberger (ChoSchü) theorem [CS63]: decompose a context-free language 𝑀 into
  ▶ a regular language 𝑆
  ▶ an alphabetic string homomorphism ℎ
  ▶ a semi-Dyck language D
  such that 𝑀 = ℎ(𝑆 ∩ D)
▶ ChoSchü parsing [Hul11]:
  ▶ the definitions of 𝑆 and D from the grammar imply
    ▶ a bijection between 𝑆 ∩ D and derivation trees
    ▶ a bijection between 𝑆 ∩ D ∩ ℎ⁻¹(𝑥) and derivation trees for 𝑥
▶ goal: extract semi-Dyck words from the regular language 𝑆 ∩ ℎ⁻¹(𝑥)
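A toy instance of the decomposition 𝑀 = ℎ(𝑆 ∩ D) may help. The sketch below is not the grammar-based construction of [Hul11]; it assumes a hand-picked regular language `(*)*`, a one-bracket Dyck language, and a homomorphism sending `(` to `a` and `)` to `b`, which together yield the words aⁿbⁿ. All function names are illustrative.

```python
import re
from itertools import product

H = {"(": "a", ")": "b"}  # assumed alphabetic homomorphism h


def in_regular_language(word):
    """Membership in the assumed regular language: opening brackets, then closing ones."""
    return re.fullmatch(r"\(*\)*", word) is not None


def is_dyck(word):
    """Membership in the Dyck language over the single bracket pair ( )."""
    depth = 0
    for ch in word:
        depth += 1 if ch == "(" else -1
        if depth < 0:
            return False
    return depth == 0


def language_up_to(max_len):
    """Apply h to every word of the intersection with length at most max_len."""
    result = set()
    for length in range(max_len + 1):
        for letters in product("()", repeat=length):
            word = "".join(letters)
            if in_regular_language(word) and is_dyck(word):
                result.add("".join(H[ch] for ch in word))
    return result


small_words = language_up_to(4)
```

Here `small_words` contains the empty word, `ab` and `aabb`; brute-force enumeration is only meant to make the three components visible, not to be efficient.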
SLIDE 6
Motivation: existing algorithm to extract Dyck words [Hul11]
Require: finite state automaton 𝒜 = (𝑅, 𝛵 ∪ 𝛵̄, 𝑟init, 𝑟fin, 𝑈)
Ensure: enumerate words in L(𝒜) ∩ D(𝛵)
procedure extractDyck(𝒜)
  𝐵, 𝐷 ∶= {(𝑞, 𝜏𝜏̄, 𝑠) ∣ (𝑞, 𝜏, 𝑟), (𝑟, 𝜏̄, 𝑠) ∈ 𝑈}, ∅
  for (𝑞, 𝑤, 𝑟) ∈ 𝐵 do
    𝐵 ∖= {(𝑞, 𝑤, 𝑟)}; 𝐷 ∪= {(𝑞, 𝑤, 𝑟)}
    if (𝑞, 𝑟) = (𝑟init, 𝑟fin) then yield 𝑤
    𝐵 ∪= {(𝑞, 𝑤𝑥, 𝑠) ∣ (𝑟, 𝑥, 𝑠) ∈ 𝐷} ∖ 𝐷
    𝐵 ∪= {(𝑝, 𝑣𝑤, 𝑟) ∣ (𝑝, 𝑣, 𝑞) ∈ 𝐷} ∖ 𝐷
    𝐵 ∪= {(𝑝, 𝜏𝑤𝜏̄, 𝑠) ∣ (𝑝, 𝜏, 𝑞), (𝑟, 𝜏̄, 𝑠) ∈ 𝑈} ∖ 𝐷
▶ relies on the recursive structure of Dyck words: concatenation and bracketing
▶ dynamic programming: store intermediate results (backlinks) for state
▶ backlinks are equivalent to a reduct grammar [BPS61]
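The agenda-driven procedure above can be sketched in Python as a generator. This is a minimal reading of the pseudocode, not the implementation from [Hul11]: the transition set is assumed to be a collection of `(state, symbol, state)` triples, `match` is an assumed mapping from opening to closing brackets, and the example automaton is made up for illustration. On automata with cycles the generator may never finish, so it should be consumed lazily (for example with `itertools.islice`).

```python
from collections import deque


def extract_dyck(transitions, match, q_init, q_fin):
    """Yield the non-empty Dyck words accepted between q_init and q_fin."""
    closers = set(match.values())
    opens = [t for t in transitions if t[1] in match]
    closes = [t for t in transitions if t[1] in closers]
    # initial items: an opening bracket directly followed by its closing bracket
    agenda = deque((p, a + b, s) for (p, a, q) in opens
                   for (r, b, s) in closes if q == r and match[a] == b)
    seen = set(agenda)   # items that were ever put on the agenda
    done = set()         # items already processed
    while agenda:
        item = agenda.popleft()
        p, v, q = item
        done.add(item)
        if (p, q) == (q_init, q_fin):
            yield v
        new_items = []
        # concatenation with processed items on the right and on the left
        new_items += [(p, v + w, r) for (q1, w, r) in done if q1 == q]
        new_items += [(o, u + v, q) for (o, u, p1) in done if p1 == p]
        # bracketing with a matching pair of transitions around the item
        new_items += [(o, a + v + b, s) for (o, a, p1) in opens if p1 == p
                      for (q1, b, s) in closes if q1 == q and match[a] == b]
        for candidate in new_items:
            if candidate not in seen:
                seen.add(candidate)
                agenda.append(candidate)


# Hypothetical acyclic automaton accepting "()[]", "([])" and the non-Dyck word "(]".
MATCH = {"(": ")", "[": "]"}
EXAMPLE_TRANSITIONS = {
    (0, "(", 1), (1, ")", 2), (2, "[", 3), (3, "]", 4),
    (0, "(", 5), (5, "[", 6), (6, "]", 7), (7, ")", 4),
    (0, "(", 8), (8, "]", 4),
}
found = sorted(extract_dyck(EXAMPLE_TRANSITIONS, MATCH, 0, 4))
```

Each item carries its full word, which is why the item set can grow without bound on cyclic automata; storing backlinks instead of words avoids this.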
SLIDE 11
𝑜-centered semi-Dyck languages
▶ example: [()]{([ ]⟦{}⟧)} is 3-centered
▶ an 𝑜-centered semi-Dyck word is of the form 𝑥_0 (_1 )_1 𝑥_1 … (_𝑜 )_𝑜 𝑥_𝑜 where
  ▶ 𝑥_𝑗 ∈ 𝛵̄* ⋅ 𝛵* (closing brackets followed by opening brackets)
  ▶ 𝑥_0 (_1 )_1 𝑥_1 … (_𝑜 )_𝑜 𝑥_𝑜 ∈ D(𝛵)
▶ C(𝛵, 𝑜) ⊆ D(𝛵)
▶ C(𝛵, ≤𝑜) = ⋃_{𝑜′≤𝑜} C(𝛵, 𝑜′)
▶ C(𝛵, ≤∞) = ⋃_{𝑜′∈ℕ} C(𝛵, 𝑜′) = D(𝛵)
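To make the notion of a center concrete: in a semi-Dyck word, every opening bracket that is immediately followed by a closing bracket forms one center, and the blocks between centers consist of closing brackets followed by opening brackets. The sketch below counts centers this way; the bracket table `PAIRS` and the function names are assumptions made for the example.

```python
PAIRS = {"(": ")", "[": "]", "{": "}", "⟦": "⟧"}  # assumed bracket alphabet
CLOSERS = set(PAIRS.values())


def is_semi_dyck(word):
    """Check that all brackets in word are well nested and matched."""
    stack = []
    for ch in word:
        if ch in PAIRS:
            stack.append(PAIRS[ch])
        elif ch in CLOSERS:
            if not stack or stack.pop() != ch:
                return False
        else:
            return False
    return not stack


def count_centers(word):
    """Count positions where an opening bracket is directly followed by a closing one."""
    return sum(1 for a, b in zip(word, word[1:]) if a in PAIRS and b in CLOSERS)


example = "[()]{([]⟦{}⟧)}"  # the slide's example without the space
example_centers = count_centers(example)
```

For the slide's example the three centers are `()`, `[]` and `{}`, so `example_centers` is 3.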
SLIDE 24
(At most) 𝑜-centered regular languages
▶ a (≤𝑜)-centered regular word is of the form 𝑥_0 (_1 )_1 𝑥_1 … (_𝑛 )_𝑛 𝑥_𝑛 where 𝑛 ≤ 𝑜 and no 𝑥_𝑗 contains a subsequence in 𝛵 ⋅ 𝛵̄
▶ 𝒜 = (𝑅, 𝛵 ∪ 𝛵̄, 𝑟init, 𝑟fin, 𝑈) is (≤𝑜)-centered
  ▶ surjective function 𝑔∶ 𝑅 → {0, …, 𝑜}:
    (𝑞, 𝜏, 𝑟) ∈ 𝑈 ⇒ 𝑔(𝑞) < 𝑔(𝑠) − 1 if (𝑟, 𝜏̄, 𝑠) ∈ 𝑈, and 𝑔(𝑞) = 𝑔(𝑟) otherwise
    vice versa for (𝑞, 𝜏̄, 𝑟)
  ▶ ≈ state partition with ordered cells
▶ 𝑜̂ smallest number s.t. 𝒜 is (≤𝑜̂)-centered ⇒ 𝑔 is surjective
[Figure: example automaton whose transitions are labelled with the brackets (, [, ⟦, {, ), ], }, ⟧]
SLIDE 26
Closure properties
𝑀 is a (≤ℓ)-centered and 𝑁 a (≤𝑛)-centered reg. language over 𝛵, for ℓ, 𝑛 ∈ ℕ ∪ {∞}
▶ 𝑀 ∩ 𝑁 is (≤min(ℓ, 𝑛))-centered
▶ 𝑀 ∪ 𝑁 is (≤max(ℓ, 𝑛))-centered
▶ 𝑀̄ is (≤∞)-centered
▶ 𝑀 ∖ 𝑁 is (≤ℓ)-centered
▶ 𝑀 ∩ D(𝛵) ⊆ C(𝛵, ≤ℓ)
SLIDE 33
CYK algorithm for extraction of semi-Dyck words: example
▶ 𝑜-CYK algorithm applicable for (≤𝑜)-centered automata
▶ span 𝑔(𝑝), 𝑔(𝑠): fill backlinks for sub-runs accepting semi-Dyck words
  ▶ initial: 𝑝, 𝑠 → 𝜏𝜏̄ for (𝑝, 𝜏, 𝑞), (𝑞, 𝜏̄, 𝑠) ∈ 𝑈
  ▶ concatenation: 𝑝, 𝑠 → (𝑝, 𝑞)(𝑞, 𝑠)
  ▶ bracketing: 𝑝, 𝑠 → 𝜏(𝑞, 𝑟)𝜏̄ for (𝑝, 𝜏, 𝑞), (𝑟, 𝜏̄, 𝑠) ∈ 𝑈
[Figure: example automaton with states 𝑟0 (start), 𝑟1, 𝑟2 and bracket-labelled transitions]
CYK chart for the example:
▶ spans of length 1:
  ▶ (𝑟0, 𝑟1): [ ], ( (𝑟0, 𝑟1) )
  ▶ (𝑟1, 𝑟2): { }
▶ span of length 2:
  ▶ (𝑟0, 𝑟2): ⟦ ⟧, (𝑟0, 𝑟1)(𝑟1, 𝑟2), ( (𝑟0, 𝑟2) ), [ (𝑟0, 𝑟2) ]
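The chart entries can be read as rules of a grammar whose nonterminals are state pairs. The sketch below transcribes the example entries by hand, taking the last entry to belong to the pair (𝑟0, 𝑟2) since it is built by concatenating (𝑟0, 𝑟1) and (𝑟1, 𝑟2), and expands them into words up to a bounded derivation depth. The dictionary layout and the function `words` are illustrative assumptions, not the enumeration procedure of the talk.

```python
# Backlinks of the example, keyed by state pair; tuples on a right-hand side
# refer to other state pairs, strings are terminal bracket sequences.
RULES = {
    ("r0", "r1"): [["[]"], ["(", ("r0", "r1"), ")"]],
    ("r1", "r2"): [["{}"]],
    ("r0", "r2"): [["⟦⟧"],
                   [("r0", "r1"), ("r1", "r2")],
                   ["(", ("r0", "r2"), ")"],
                   ["[", ("r0", "r2"), "]"]],
}


def words(item, depth):
    """All words derivable from a state pair with derivation depth at most depth."""
    if depth == 0:
        return set()
    result = set()
    for rhs in RULES[item]:
        partial = {""}
        for symbol in rhs:
            if isinstance(symbol, tuple):
                choices = words(symbol, depth - 1)
            else:
                choices = {symbol}
            partial = {left + right for left in partial for right in choices}
        result |= partial
    return result


shallow = words(("r0", "r2"), 2)
```

With depth 2 this yields `⟦⟧`, `[]{}`, `(⟦⟧)` and `[⟦⟧]`; the depth bound is needed because the bracketing rules are recursive and the full language is infinite.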
SLIDE 42
CYK algorithm for extraction of semi-Dyck words
Require: (≤𝑜)-centered automaton 𝒜 = (𝑅, 𝛵 ∪ 𝛵̄, 𝑟init, 𝑟fin, 𝑈)
Ensure: enumerates elements of L(𝒜) ∩ D(𝛵)
procedure extractDyck(𝒜)
  𝒜′ ∶= normalForm(𝒜) ▷ combine transitions (𝑝, 𝜏, 𝑞), (𝑞, 𝜏̄, 𝑟) to (𝑝, 𝜏𝜏̄, 𝑟)
  𝐷 ∶= cyk(𝒜′)
  enumerate(𝐷, 𝑟init, 𝑟fin) ▷ cf. Huang and Chiang [HC05]
function cyk(𝒜)
  for 𝑠 ∈ {1, …, 𝑜} do
    for 𝑚 ∈ {0, …, 𝑜 − 1} do
      𝑇_{𝑚,𝑚+𝑠} ∶= {(𝑞, 𝑟) ∣ (𝑞, 𝜏𝜏̄, 𝑟) ∈ 𝑈, 𝑔(𝑞) = 𝑚, 𝑔(𝑟) = 𝑚 + 𝑠}
      for 𝑛 ∈ {1, …, 𝑠 − 1} do
        𝑇_{𝑚,𝑚+𝑠} ∪= {(𝑝, 𝑟) ∣ (𝑝, 𝑞) ∈ 𝑇_{𝑚,𝑚+𝑛}, (𝑞, 𝑟) ∈ 𝑇_{𝑚+𝑛,𝑚+𝑠}}
      𝑇_{𝑚,𝑚+𝑠} ∪= ⋃_{(𝑞,𝑟)∈𝑇_{𝑚,𝑚+𝑠}} R(𝑞, 𝑟) ▷ transitively reachable (𝑝, 𝑠) via (𝑝, 𝜏, 𝑞), (𝑟, 𝜏̄, 𝑠) ∈ 𝑈
  return (𝑇_{𝑗,𝑘} ∣ 𝑗 ∈ {0, …, 𝑜 − 1}, 𝑘 ∈ {𝑗 + 1, …, 𝑜})
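The chart construction can be sketched in Python under a few stated assumptions: the cell function 𝑔 is given as a dictionary `cell`, combined center transitions always lead to a strictly higher cell, all other bracket transitions stay inside one cell, and the closure R(𝑞, 𝑟) is computed with a small agenda. The start index only runs up to `n - span` so that no cell beyond 𝑜 is created. The automaton at the bottom is hypothetical and was chosen so that its chart has the shape of the example entries; none of the names come from the talk.

```python
from itertools import product


def cyk_chart(n, cell, centers, opens, closes, match):
    """Return chart[(lo, hi)]: a dict from state pairs (p, s) to their backlinks.

    cell    -- dict mapping each state to its cell in 0..n (assumed given)
    centers -- combined transitions (p, word, r) of the normal form
    opens   -- transitions (p, a, q) on opening brackets
    closes  -- transitions (r, b, s) on closing brackets
    match   -- dict mapping opening brackets to closing brackets
    """
    chart = {}
    for span in range(1, n + 1):
        for lo in range(0, n - span + 1):
            hi = lo + span
            items = {}
            # initial: one center leading from cell lo to cell hi
            for (p, word, r) in centers:
                if cell[p] == lo and cell[r] == hi:
                    items.setdefault((p, r), []).append(("init", word))
            # concatenation: split the span at every inner cell
            for mid in range(lo + 1, hi):
                for (p, q), (q2, s) in product(chart[(lo, mid)], chart[(mid, hi)]):
                    if q == q2:
                        items.setdefault((p, s), []).append(("concat", (p, q), (q, s)))
            # bracketing closure: wrap items with matching transitions in the same cells
            agenda = list(items)
            while agenda:
                q, r = agenda.pop()
                for (p, a, q1) in opens:
                    if q1 != q or cell[p] != lo:
                        continue
                    for (r1, b, s) in closes:
                        if r1 == r and match[a] == b and cell[s] == hi:
                            if (p, s) not in items:
                                agenda.append((p, s))
                            items.setdefault((p, s), []).append(("bracket", a, (q, r), b))
            chart[(lo, hi)] = items
    return chart


# Hypothetical 2-centered automaton in normal form.
CELL = {"r0": 0, "r1": 1, "r2": 2}
CENTERS = {("r0", "[]", "r1"), ("r1", "{}", "r2"), ("r0", "⟦⟧", "r2")}
OPENS = {("r0", "(", "r0"), ("r0", "[", "r0")}
CLOSES = {("r1", ")", "r1"), ("r2", ")", "r2"), ("r2", "]", "r2")}
MATCH = {"(": ")", "[": "]"}
chart = cyk_chart(2, CELL, CENTERS, OPENS, CLOSES, MATCH)
```

In this run `chart[(0, 2)]` holds four backlinks for the pair `("r0", "r2")`: one initial, one concatenation and two bracketings. Each state pair is put on the agenda at most once, so the closure terminates even though the bracket transitions are loops.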
SLIDE 43
Conclusion
▶ application for Chomsky-Schützenberger parsing [Hul11; Den17]:
  ▶ 𝑆, ℎ⁻¹(𝑥) are (≤∞)-centered
  ▶ 𝑆 ∩ ℎ⁻¹(𝑥) is (≤|𝑥|)-centered for ε-free grammars
  ▶ size of closure R(𝑞, 𝑟) depends on chain rules
▶ CYK parsing of cfg without binarization
▶ closure properties:
  ▶ parse multiple words at the same time
  ▶ even using different grammars