Coarse-to-fine recognition for weighted tree-stack automata
Max Korn
27 October 2017
Motivation
◮ Problem: Parsing with complicated grammars and recognition with complicated automata are time-intensive
◮ Example: multiple context-free grammars and tree-stack automata
◮ Solution: use a less complex grammar/automaton
◮ set C (of configurations)
◮ set P (of predicates) with P ⊆ 𝒫(C)
◮ set R (of instructions) with R ⊆ 𝒫(C × C)
◮ initial configuration c_i ∈ C
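The four components above can be made concrete with a small sketch. It assumes a plain pushdown as the storage (configurations are stacks of symbols), with the predicates `true` and `top(γ)` and the instructions `id`, `push(γ)` and `pop` that appear on the automaton slides. The type and function names (`Config`, `Pred`, `Instr`, `holds`, `apply`, `initial`) are hypothetical and are not taken from rustomata.

```rust
/// Assumed configuration set C: stack contents, top of stack last.
type Config = Vec<char>;

/// Predicates are sets of configurations; each variant stands for one such set.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Pred {
    True,
    Top(char),
}

/// Instructions are relations on configurations.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Instr {
    Id,
    Push(char),
    Pop,
}

impl Pred {
    /// Membership test: does configuration `c` belong to this predicate?
    fn holds(self, c: &Config) -> bool {
        match self {
            Pred::True => true,
            Pred::Top(symbol) => c.last() == Some(&symbol),
        }
    }
}

impl Instr {
    /// All successors of `c` under this instruction (here: at most one,
    /// because these particular instructions are partial functions).
    fn apply(self, c: &Config) -> Vec<Config> {
        match self {
            Instr::Id => vec![c.clone()],
            Instr::Push(symbol) => {
                let mut next = c.clone();
                next.push(symbol);
                vec![next]
            }
            Instr::Pop => {
                if c.is_empty() {
                    Vec::new() // pop is undefined on the empty stack
                } else {
                    let mut next = c.clone();
                    next.pop();
                    vec![next]
                }
            }
        }
    }
}

/// Assumed initial configuration c_i: the empty stack.
fn initial() -> Config {
    Vec::new()
}
```

Returning a `Vec` of successors keeps the sketch close to the relational reading R ⊆ 𝒫(C × C), even though every instruction shown here yields at most one successor.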
[Figure: automaton with states q0, q1, q2, q3 and transition labels (a, true, push(A), 1), (a, true, push(B), 1), (ε, true, id, 0), (b, top(B), pop, 2), (b, top(A), pop, 2)]
≈A
[Figure: automaton with states q0, q1, q2, q3 and transition labels (a, true, push(C), 1), (a, true, push(C), 1), (ε, true, id, 0), (b, top(C), pop, 2), (b, top(C), pop, 2)]
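The step from the first automaton to the second can be sketched as a relabelling of transition labels. The sketch assumes that the approximation sends both stack symbols A and B to a single class C and leaves input symbols and weights untouched, which is what the two label lists above suggest. All names (`Label`, `class_of`, `relabel`) are hypothetical.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
enum Pred {
    True,
    Top(char),
}

#[derive(Debug, Clone, Copy, PartialEq)]
enum Instr {
    Id,
    Push(char),
    Pop,
}

/// Transition label as written on the slide:
/// (input symbol, predicate, instruction, weight); `None` stands for ε.
#[derive(Debug, Clone, Copy, PartialEq)]
struct Label {
    input: Option<char>,
    pred: Pred,
    instr: Instr,
    weight: u32,
}

/// Assumed equivalence relation on stack symbols: A and B share the class C.
fn class_of(symbol: char) -> char {
    match symbol {
        'A' | 'B' => 'C',
        other => other,
    }
}

/// Replace every stack symbol in the predicate and the instruction by its class.
fn relabel(label: Label) -> Label {
    let pred = match label.pred {
        Pred::Top(symbol) => Pred::Top(class_of(symbol)),
        other => other,
    };
    let instr = match label.instr {
        Instr::Push(symbol) => Instr::Push(class_of(symbol)),
        other => other,
    };
    Label { pred, instr, ..label }
}
```

After relabelling, the two push transitions carry the same label, so the coarse automaton can no longer tell them apart; this is where the approximation loses information.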
◮ Ignoring tree structures inspired by Burden and Ljunglöf […]
◮ Relabelling to equivalence classes of stack symbols by […]
◮ Reducing the number of pushdown configurations to a finite […]
1: $M' \leftarrow {\approx_A}(M)$
2: $P_f \leftarrow \emptyset$
3: $P_c \leftarrow R_{M'}(w)$
4: while $|P_f| < n$ or $\max_{\theta \in P_f} wt(\theta) > \min_{\theta' \in P_c} wt({\approx_A^{-1}}(\theta'))$ do
5:   […]
6:   […]
7:   […] ${\approx_A^{-1}}(\theta)$ do
8:     […]
9: return $P_f$
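The loop condition in line 4 of the listing can be read as a stopping test, sketched below. This covers only that condition; the loop body did not survive on the slide and is not reconstructed here. The sketch assumes that weights are plain `f64` values and that the value $wt({\approx_A^{-1}}(\theta'))$ for each remaining coarse run has already been computed and is passed in as `coarse_bounds`. The function name and both parameter names are hypothetical.

```rust
/// Evaluates the condition of line 4:
/// |P_f| < n  or  max over P_f of wt  >  min over P_c of wt(approx^-1(theta')).
///
/// `fine_weights`  : weights of the runs collected in P_f so far.
/// `coarse_bounds` : one precomputed value per run still in P_c (assumption).
fn must_continue(fine_weights: &[f64], coarse_bounds: &[f64], n: usize) -> bool {
    if fine_weights.len() < n {
        // Fewer than n runs found. Taken literally, the condition stays true
        // even when P_c is empty; a practical loop would presumably also
        // stop once P_c is exhausted (assumption, not shown on the slide).
        return true;
    }
    // Maximum weight among the runs already found.
    let max_found = fine_weights
        .iter()
        .copied()
        .fold(f64::NEG_INFINITY, f64::max);
    // Minimum bound among the coarse runs not yet refined
    // (+infinity when none are left, so the comparison below is false).
    let min_remaining = coarse_bounds
        .iter()
        .copied()
        .fold(f64::INFINITY, f64::min);
    max_found > min_remaining
}
```

For example, with n = 2, found weights 1.0 and 3.0, and remaining bounds 2.0 and 5.0, the test says to continue, because a remaining coarse run might still refine to a run that beats the one with weight 3.0.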
[Figure: chain of approximations $\approx_{A_1}$, $\approx_{A_2}$, $\approx_{A_3}$, …, $\approx_{A_m}$]
1: $M_1 \leftarrow {\approx_{A_1}}(M)$
2: $M_2 \leftarrow {\approx_{A_2}}(M_1)$
3: $M_m \leftarrow {\approx_{A_m}}(M_{m-1})$
4: $P_f \leftarrow \emptyset$
5: $P_m \leftarrow R_{M_m}(w)$
6: while $|P_f| < n$ or $\max_{\theta \in P_f} wt(\theta) > \min_{\theta' \in P_m} wt({\approx_A^{-1}}(\theta'))$ do
7:   […]
8:   […]
9:   […] ${\approx_{A_m}^{-1}}(\theta_m)$ do
10:     […]
11:     […] ${\approx_{A_{m-1}}^{-1}}(\theta_{m-1})$ do
12:       […]
13:       […] ${\approx_{A_1}^{-1}}(\theta_1)$ do
14:         […]
15: return $P_f$
PTK, filter via $M_{\mathrm{Rel}}$
RLB, filter via $M_{\mathrm{PD}}$
TTS, filter via $M$
Using rustomata (https://github.com/tud-fop/rustomata):

    for run1 in app3.recognise(word).take(n) {
        let trans_runs1 = ctf_level(run1, &ptk, &app2);
        for run2 in trans_runs1 {
            let trans_runs2 = ctf_level(run2, &rlb, &app1);
            for run3 in trans_runs2 {
                let trans_runs3 = ctf_level(run3, &tts, &app0);
                for run4 in trans_runs3 {
                    println!("{:?}", run4);
                }
            }
        }
    }
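The slide does not show what `ctf_level` does internally. The following is one possible reading of a single coarse-to-fine step, not rustomata's implementation. It assumes that a run is a sequence of transition identifiers, that the approximation strategy is given as a map from each coarse transition to the finer transitions it stands for, and that the finer automaton is represented by a validity check. The names `Run`, `ctf_level_sketch`, `preimage` and `is_valid` are hypothetical.

```rust
use std::collections::HashMap;

/// Assumed representation of a run: a sequence of transition identifiers.
type Run = Vec<u32>;

/// One coarse-to-fine step (sketch): expand each coarse transition into the
/// finer transitions it approximates, build every combination, and keep only
/// the candidates that the finer automaton accepts (`is_valid`).
fn ctf_level_sketch(
    coarse_run: &Run,
    preimage: &HashMap<u32, Vec<u32>>,
    is_valid: &dyn Fn(&Run) -> bool,
) -> Vec<Run> {
    let no_transitions: Vec<u32> = Vec::new();
    // Start with the single empty prefix and extend it transition by transition.
    let mut candidates: Vec<Run> = vec![Vec::new()];
    for coarse_transition in coarse_run {
        let fine_options = preimage
            .get(coarse_transition)
            .unwrap_or(&no_transitions);
        let mut extended = Vec::new();
        for prefix in &candidates {
            for &fine_transition in fine_options {
                let mut run = prefix.clone();
                run.push(fine_transition);
                extended.push(run);
            }
        }
        candidates = extended;
    }
    // Filter via the finer automaton.
    candidates.into_iter().filter(|run| is_valid(run)).collect()
}
```

Building the full product before filtering is the simplest way to show the idea; checking validity on prefixes while extending them would prune earlier and avoid the exponential blow-up.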
◮ Grammars created by using the first 5, 10, 15 and 20 […]
◮ Grammars converted into automata by rustomata
◮ 2 equivalence classes, one using fanout and the other using […]
◮ PTK heights of 5, 10, 15 and 20
◮ recognising three sentences contained in the corresponding […]
2 http://www.coli.uni-saarland.de/projects/sfb378/negra-corpus/negra-corpus.html
3 http://www.coli.uni-saarland.de/projects/sfb378/negra-corpus/stts.asc
[Plot: number of configurations (ticks 0.2 to 1 ·10^6) and time (s) (ticks 20 to 80); series: no Appr., 1 layer, 2 layer, 3 layer]
[Plot: number of configurations (ticks 0.5 to 3 ·10^5) and time (s) (ticks 5 to 20); series: no Appr., 1 layer, 2 layer, 3 layer]
[1] Håkan Burden and Peter Ljunglöf. “[…] Rewriting Systems”. In: Proceedings of the Ninth IWPT (2005), […]
[2] Eugene Charniak et al. “Multilevel coarse-to-fine PCFG parsing”. In: Proceedings of the HLT-NAACL. 2006.
[3] Andreas van Cranenburgh. “Efficient Parsing with Linear Context-Free Rewriting Systems”. In: Proceedings of the 13th Conference of the […]
[4] Tobias Denkinger. “Approximation of Weighted Automata with Storage”. In: Proceedings Eighth International Symposium on GandALF. 2017, […]
[5] Mark-Jan Nederhof. “Regular approximations of CFLs: A grammatical view”. In: Proceedings of the IWPT (1997), pp. 159–170.
◮ push(γ) = replace(ε, γ) for all γ ∈ Γ
◮ pop(γ) = replace(γ, ε) for all γ ∈ Γ
◮ id(γ) = replace(γ, γ) for all γ ∈ Γ
◮ id = replace(ε, ε)
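The four identities above can be checked with a small sketch. It assumes that replace(u, v) is defined on a stack whenever the word u sits at the top of the stack, and that it then swaps u for v; the slide does not spell this out, so this semantics is an assumption. Stacks are slices of `char` with the top of stack last, and the function names (`replace`, `push`, `pop`, `id_top`, `id`) are hypothetical.

```rust
/// replace(old, new) under the assumed semantics: defined only if `old` is
/// the topmost part of the stack, in which case it is swapped for `new`.
fn replace(stack: &[char], old: &[char], new: &[char]) -> Option<Vec<char>> {
    if !stack.ends_with(old) {
        return None; // instruction undefined on this configuration
    }
    let mut result = stack[..stack.len() - old.len()].to_vec();
    result.extend_from_slice(new);
    Some(result)
}

/// push(γ) = replace(ε, γ)
fn push(stack: &[char], symbol: char) -> Option<Vec<char>> {
    replace(stack, &[], &[symbol])
}

/// pop(γ) = replace(γ, ε)
fn pop(stack: &[char], symbol: char) -> Option<Vec<char>> {
    replace(stack, &[symbol], &[])
}

/// id(γ) = replace(γ, γ): leaves the stack unchanged but requires γ on top.
fn id_top(stack: &[char], symbol: char) -> Option<Vec<char>> {
    replace(stack, &[symbol], &[symbol])
}

/// id = replace(ε, ε): always defined, never changes the stack.
fn id(stack: &[char]) -> Option<Vec<char>> {
    replace(stack, &[], &[])
}
```

Expressing everything through one `replace` instruction means that an approximation strategy only has to say how it treats `replace`; the other instructions follow from the identities.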