Motivation Approximations Shift-Resolve Parsing Ambiguity Detection Conclusion
Approximating Context-Free Grammars for Parsing and Verification
Sylvain Schmitz
LORIA, INRIA Nancy - Grand Est
October 18, 2007
A Syntax Issue
Milner et al. [1997]
datatype 'a option = NONE | SOME of 'a
fun filter pred l =
  let fun filterP (x::r, l) =
        case (pred x)
          | NONE => filterP(r, l)
        | filterP ([], l) = rev l
  in filterP (l, []) end
◮ MLton
Error: match.sml 9.25. Syntax error: replacing EQUALOP with DARROW
◮ Moscow ML
! Toplevel input:
! | filterP ([], l) = rev l
! ^
! Syntax error.
◮ Poly/ML
Error: => expected but = was found
◮ SML/NJ
stdIn:7.24-7.29 Error: syntax error: deleting EQUALOP ID
A Syntax Issue
[Diagram: Program text, Compiler (containing a Parser), Executable code, Language Specification]
[Diagram: Context-free grammar, Parser generator, Parser; Input tokens, Parse tree]
[Figure: parse tree fragments over the tokens | NONE => filterP(r, l) | filterP ([], l) = rev l . . ., with nodes fvalbind, sfvalbind, atpats, exp, match, mrule and pat]
Grammar Conflicts
◮ GNU Bison
state 20
    6 exp: "case" exp "of" match .
    8 match: match . '|' mrule
    '|'  shift, and go to state 24
    '|'  [reduce using rule 6 (exp)]
◮ Restricted grammar class
[Diagram: LALR(1) drawn inside CFG]
Grammar Conflicts
An Objective Measure [Malloy et al., 2002] on a C# Grammar
[Plot: LALR(1) conflicts (vertical axis, 100 to 700) against parser versions (horizontal axis, 2 to 20)]
Grammar Conflicts
A Subjective Measure
Courtesy of http://www.phdcomics.com.
Solutions
◮ LR(k) [Knuth, 1965]
◮ LR-Regular [Čulik and Cohen, 1973]
◮ Generalized LR [Tomita, 1986]
◮ Unambiguous CFGs [Cantor, 1962, Chomsky and Schützenberger, 1963]
◮ Horizontal and vertical unambiguity test [Brabrand et al., 2007]
[Diagram, built up step by step: the grammar classes LALR(1), LR(k), LR-Regular, HVRU and UCFG drawn inside CFG, with regions labelled Safe and Unsafe]
Generalized LR on the input:
case a of b => case b of c => c | d => d
[Figure: Context-free grammar, Parser generator, Parser; Input tokens, Parse forest. The parse forest over this input is built from exp, match, mrule and pat nodes.]
Solutions
◮ Noncanonical parsing methods [Szymanski and Williams, 1976, Tai, 1979]
  ◮ Noncanonical LALR(1)
  ◮ Shift-Resolve
◮ Noncanonical unambiguity test
◮ Framework for grammar approximations
[Diagram, built up step by step: NLALR(1), ShRe and NU added to the classes LALR(1), LR(k), LR-Regular, HVRU, UCFG and CFG]
Grammar Approximations
G = ⟨N, T, P, S⟩, V = N ∪ T
dec −1→ fun fvalbind
fvalbind −2→ sfvalbind
fvalbind −3→ fvalbind '|' sfvalbind
sfvalbind −4→ vid atpats = exp
exp −5→ case exp of match
match −6→ mrule
match −7→ match '|' mrule
mrule −8→ pat => exp
atpats −9→ atpat
atpats −10→ atpats atpat
pat −11→ vid atpat
atpat −12→ vid
Grammar Approximations
G_b = ⟨N, T_b, P_b, S⟩, V_b = N ∪ T_b
dec −1→ d1 fun fvalbind r1
fvalbind −2→ d2 sfvalbind r2
fvalbind −3→ d3 fvalbind '|' sfvalbind r3
sfvalbind −4→ d4 vid atpats = exp r4
exp −5→ d5 case exp of match r5
match −6→ d6 mrule r6
match −7→ d7 match '|' mrule r7
mrule −8→ d8 pat => exp r8
atpats −9→ d9 atpat r9
atpats −10→ d10 atpats atpat r10
pat −11→ d11 vid atpat r11
atpat −12→ d12 vid r12
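A minimal Python sketch of the bracketing step, assuming rules are stored as (left-hand side, right-hand side) pairs in slide order. The `bracket` helper and this representation are illustrative choices, not the author's tooling.

```python
# Rules of the example grammar G, numbered from 1 as on the slide.
RULES = [
    ("dec", ["fun", "fvalbind"]),
    ("fvalbind", ["sfvalbind"]),
    ("fvalbind", ["fvalbind", "'|'", "sfvalbind"]),
    ("sfvalbind", ["vid", "atpats", "=", "exp"]),
    ("exp", ["case", "exp", "of", "match"]),
    ("match", ["mrule"]),
    ("match", ["match", "'|'", "mrule"]),
    ("mrule", ["pat", "=>", "exp"]),
    ("atpats", ["atpat"]),
    ("atpats", ["atpats", "atpat"]),
    ("pat", ["vid", "atpat"]),
    ("atpat", ["vid"]),
]


def bracket(rules):
    """Wrap the right-hand side of rule i between the new terminals d_i and r_i."""
    return [(lhs, [f"d{i}", *rhs, f"r{i}"])
            for i, (lhs, rhs) in enumerate(rules, start=1)]


BRACKETED = bracket(RULES)
for i, (lhs, rhs) in enumerate(BRACKETED, start=1):
    print(f"{i:2}  {lhs} -> {' '.join(rhs)}")
```

Each rule keeps its number, so a bracket pair d_i ... r_i identifies which rule was applied at that point of a derivation.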
Grammar Approximations
Left-to-right Walks in Trees
[Figure: derivation tree for fvalbind −3→ fvalbind '|' sfvalbind, with the inner fvalbind −2→ sfvalbind and the right-hand sfvalbind −4→ vid atpats = exp]
The dot marks the current position in the bracketed tree; the label in parentheses is the transition just taken:
d3 d2 sfvalbind r2 '|' · d4 vid atpats = exp r4 r3
d3 d2 sfvalbind r2 '|' d4 · vid atpats = exp r4 r3    (d4)
d3 d2 sfvalbind r2 '|' d4 vid atpats = exp r4 · r3    (sfvalbind)
d3 d2 sfvalbind r2 '|' d4 vid atpats = exp r4 r3 ·    (r3)
[Figure: the resulting graph of positions, with transitions labelled d2, r2, d3, r3, d4, r4]
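The left-to-right walk can be sketched in Python as a tree traversal that emits d_i on entering a node for rule i and r_i on leaving it. The nested-tuple tree encoding and the `bracketed` helper are assumptions made for this illustration.

```python
def bracketed(tree):
    """Left-to-right walk: emit d_i when entering a rule-i node, r_i when leaving it."""
    if isinstance(tree, str):      # a leaf: a terminal, or a nonterminal left unexpanded
        return [tree]
    rule, children = tree
    out = [f"d{rule}"]
    for child in children:
        out.extend(bracketed(child))
    out.append(f"r{rule}")
    return out


# fvalbind -3-> fvalbind '|' sfvalbind, where the inner fvalbind uses rule 2
# and the right-hand sfvalbind uses rule 4.
TREE = (3, [(2, ["sfvalbind"]), "'|'", (4, ["vid", "atpats", "=", "exp"])])
print(" ".join(bracketed(TREE)))
# d3 d2 sfvalbind r2 '|' d4 vid atpats = exp r4 r3
```

A position of the walk is then simply a place between two consecutive symbols of this output.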
Grammar Approximations
Definition
Γ/≡ is the quotient of Γ by an equivalence relation ≡ between positions.
Theorem (Language over-approximation)
L(G_b) ⊆ L(Γ/≡) ∩ T_b^*
Grammar Approximations
[Figure: derivation tree in which sfvalbind −4→ vid atpats = exp occurs twice, bracketed by d4 and r4; the two positions before = belong to the same equivalence class]
◮ equivalence class [sfvalbind −4→ vid atpats · = exp]
◮ LR(0) items
◮ Γ/item0: nondeterministic LR(0) automaton
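Grammar Approximations
[Figure: fragment of Γ/item0; each state is an equivalence class]
[fvalbind −2→ · sfvalbind] --d4--> [sfvalbind −4→ · vid atpats = exp]
[fvalbind −3→ fvalbind '|' · sfvalbind] --d4--> [sfvalbind −4→ · vid atpats = exp]
[sfvalbind −4→ · vid atpats = exp] --vid--> [sfvalbind −4→ vid · atpats = exp]
[sfvalbind −4→ vid · atpats = exp] --atpats--> [sfvalbind −4→ vid atpats · = exp]
[sfvalbind −4→ vid atpats · = exp] --=--> [sfvalbind −4→ vid atpats = · exp]
[sfvalbind −4→ vid atpats = · exp] --exp--> [sfvalbind −4→ vid atpats = exp ·]
[sfvalbind −4→ vid atpats = exp ·] --r4--> [fvalbind −2→ sfvalbind ·]
[sfvalbind −4→ vid atpats = exp ·] --r4--> [fvalbind −3→ fvalbind '|' sfvalbind ·]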
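A hedged Python sketch of Γ/item0 for rules 2, 3 and 4 of the example grammar, where a state is an LR(0) item (rule, dot). The entry and exit states `qs` and `qf`, the dictionary encoding and both function names are assumptions of this sketch. It accepts the bracketed form from the walk in the tree, and also a form that opens rule 2 but closes rule 3, which shows that the quotient over-approximates.

```python
# Fragment of the example grammar, keeping the slide's rule numbers.
RULES = {
    2: ("fvalbind", ["sfvalbind"]),
    3: ("fvalbind", ["fvalbind", "'|'", "sfvalbind"]),
    4: ("sfvalbind", ["vid", "atpats", "=", "exp"]),
}


def item0_automaton(rules, start):
    """Transition table (state, label) -> set of states; a state is an item (rule, dot)."""
    delta = {}

    def add(src, label, dst):
        delta.setdefault((src, label), set()).add(dst)

    for i, (lhs, rhs) in rules.items():
        if lhs == start:                       # hypothetical entry/exit states qs and qf
            add("qs", f"d{i}", (i, 0))
            add((i, len(rhs)), f"r{i}", "qf")
        for k, sym in enumerate(rhs):
            add((i, k), sym, (i, k + 1))       # move the dot over one symbol
            for j, (lhs_j, rhs_j) in rules.items():
                if lhs_j == sym:
                    add((i, k), f"d{j}", (j, 0))               # enter rule j
                    add((j, len(rhs_j)), f"r{j}", (i, k + 1))  # leave rule j, whatever the context
    return delta


def accepts(delta, word):
    """Run the nondeterministic automaton on a list of symbols."""
    states = {"qs"}
    for label in word:
        states = set().union(*(delta.get((q, label), set()) for q in states))
    return "qf" in states


DELTA = item0_automaton(RULES, "fvalbind")
walk = "d3 d2 sfvalbind r2 '|' d4 vid atpats = exp r4 r3".split()
mixed = "d2 d4 vid atpats = exp r4 r3".split()   # opens rule 2 but closes rule 3
print(accepts(DELTA, walk), accepts(DELTA, mixed))   # True True
```

The second word is accepted because the automaton has no stack: after r4 it may continue in either item that has the dot just after sfvalbind.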
Grammar Approximations
◮ general framework for approximations
◮ applications:
  ◮ parser construction
  ◮ ambiguity detection
  ◮ XML validation [Segoufin and Vianu, 2002]?
  ◮ symbolic supertagging [Boullier, 2003]?
Parsing Principles
◮ noncanonical
◮ k = 1 reduced lookahead symbol
◮ resolve = reduce + pushback: emulates a bounded reduced lookahead without any preset bound
Parsing Example
| NONE => filterP(r, l) | filterP ([], l) = rev l . . .
[Animation: the parse of this input is built step by step from the nodes pat, exp, mrule, match, atpats, sfvalbind and fvalbind]
Parser Construction
Principle
◮ d_i transitions denote traditional item closures
◮ r_i transitions denote a phrase that should be reduced
◮ other transitions denote shifts
◮ items in the construction hold
Parser Construction
Example
State items (the transition that introduces an item is given in parentheses where the slide shows it):
  exp → case exp of match ·, 0, 0
  match → match · '|' mrule, 0, 0
  sfvalbind → vid atpats = exp ·, 5, 0    (r5)
  fvalbind → fvalbind '|' sfvalbind ·, 5, 0    (r4)
  fvalbind → sfvalbind ·, 5, 0    (r4)
  fvalbind → fvalbind · '|' sfvalbind, 5, 0
  dec → fun fvalbind ·, 5, 0
  S' → dec · $, 5, 0
Transition on '|' to:
  fvalbind → fvalbind '|' · sfvalbind, 5, 1
  match → match '|' · mrule, 0, 0
  mrule → · pat => exp, 0, 0    (d8)
  pat → · vid atpat, 0, 0
  sfvalbind → · vid atpats = exp, 0, 0
Further items reached in the first state:
  mrule → pat => exp ·, 5, 0    (r5)
  match → mrule ·, 5, 0
  match → match · '|' mrule, 5, 0
  match → match · '|' mrule, 0, 0
Parser Construction
◮ |Γ/≡|: size of the position automaton; |Γ/item0| = O(|G|)
◮ |A|: size of the parser: O(2^{|Γ/≡| |P|})
◮ parsing time complexity for input w: O(|w|)
− incomparable with classical parsing techniques + subset construction mendable
Motivation Approximations Shift-Resolve Parsing Ambiguity Detection Conclusion Parser Construction
− incomparable with classical parsing techniques + subset construction mendable
Motivation Approximations Shift-Resolve Parsing Ambiguity Detection Conclusion Parser Construction
◮ Shift Resolve parsers
◮ 2-steps construction
Ambiguity Detection
◮ a bracketed sentence = a derivation tree
◮ ambiguity = more than one tree with the same yield
d6 d8 d13 vid r13 => d5 case d14 vid r14 of d7 d6 d8 d13 vid r13 => d14 vid r14 r8 r6 '|' d8 d13 vid r13 => d14 vid r14 r8 r7 r5 r8 r6
d7 d6 d8 d13 vid r13 => d5 case d14 vid r14 of d6 d8 d13 vid r13 => d14 vid r14 r8 r6 r5 r8 r6 '|' d8 d13 vid r13 => d14 vid r14 r8 r7
◮ construct a FSA A such that L(G_b) ⊆ L(A), and look for bracketed sentences with the same yield
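A short Python check of the two bullets above, on the two bracketed sentences for vid => case vid of vid => vid | vid => vid. Here the inner single-rule match of the second tree is written with d6 ... r6, as rule 6 (match → mrule) requires. The helper names `h` and `well_bracketed` are assumptions of this sketch; `h` plays the role of the homomorphism that erases brackets.

```python
import re


def h(word):
    """Erase the brackets d_i and r_i, keeping only the yield."""
    return [s for s in word if not re.fullmatch(r"[dr]\d+", s)]


def well_bracketed(word):
    """Check that every d_i is closed by the matching r_i, as in a derivation tree."""
    stack = []
    for s in word:
        if re.fullmatch(r"d\d+", s):
            stack.append(s[1:])
        elif re.fullmatch(r"r\d+", s):
            if not stack or stack.pop() != s[1:]:
                return False
    return not stack


# Tree 1: the second mrule belongs to the inner case expression.
T1 = ("d6 d8 d13 vid r13 => d5 case d14 vid r14 of "
      "d7 d6 d8 d13 vid r13 => d14 vid r14 r8 r6 '|' "
      "d8 d13 vid r13 => d14 vid r14 r8 r7 r5 r8 r6").split()
# Tree 2: the second mrule belongs to the outer match.
T2 = ("d7 d6 d8 d13 vid r13 => d5 case d14 vid r14 of "
      "d6 d8 d13 vid r13 => d14 vid r14 r8 r6 r5 r8 r6 '|' "
      "d8 d13 vid r13 => d14 vid r14 r8 r7").split()

print(" ".join(h(T1)))
print(T1 != T2 and h(T1) == h(T2))   # True: two trees, one yield
```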
Regular Unambiguity
◮ G is regular unambiguous for ≡ of finite index, if there does not exist w_b ≠ w'_b in L(Γ/≡) ∩ T_b^* with h(w_b) = h(w'_b)
◮ LR(0) ⊄ RU(item0)
◮ regular approximations are too weak
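A brute-force Python sketch of the regular unambiguity test on a hypothetical grammar that is not from the talk: S −1→ a A b, S −2→ c A b, A −3→ x. This grammar is unambiguous, yet Γ/item0 merges the two return sites of A, so two different bracketed words share each yield. The states `qs` and `qf`, the enumeration bound and all function names are assumptions of this sketch; only terminal and bracket transitions are kept, to stay within T_b^*.

```python
from collections import defaultdict

RULES = {1: ("S", ["a", "A", "b"]), 2: ("S", ["c", "A", "b"]), 3: ("A", ["x"])}


def item0_edges(rules, start):
    """Edges of the item0 quotient, restricted to terminal and bracket labels."""
    nonterminals = {lhs for lhs, _ in rules.values()}
    edges = defaultdict(set)                      # state -> {(label, next state)}
    for i, (lhs, rhs) in rules.items():
        if lhs == start:
            edges["qs"].add((f"d{i}", (i, 0)))
            edges[(i, len(rhs))].add((f"r{i}", "qf"))
        for k, sym in enumerate(rhs):
            if sym not in nonterminals:
                edges[(i, k)].add((sym, (i, k + 1)))
                continue
            for j, (lhs_j, rhs_j) in rules.items():
                if lhs_j == sym:
                    edges[(i, k)].add((f"d{j}", (j, 0)))
                    edges[(j, len(rhs_j))].add((f"r{j}", (i, k + 1)))
    return edges


def accepted_words(edges, max_len):
    """All words of at most max_len symbols leading from qs to qf."""
    found, todo = set(), [("qs", ())]
    while todo:
        state, word = todo.pop()
        if state == "qf":
            found.add(word)
        if len(word) < max_len:
            for label, nxt in edges[state]:
                todo.append((nxt, word + (label,)))
    return found


def h(word):
    """Erase brackets d_i and r_i."""
    return tuple(s for s in word if not (s[0] in "dr" and s[1:].isdigit()))


by_yield = defaultdict(set)
for w in accepted_words(item0_edges(RULES, "S"), 7):
    by_yield[h(w)].add(w)
clashes = {y: ws for y, ws in by_yield.items() if len(ws) > 1}
for y, ws in sorted(clashes.items()):
    print(" ".join(y), "<-", sorted(" ".join(w) for w in ws))
```

The yield a x b is reached both by d1 a d3 x r3 b r1 and by d1 a d3 x r3 b r2, so the test reports a potential ambiguity although the grammar has none.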
Noncanonical Unambiguity
◮ SF(G_b) ⊆ L(Γ/≡)
◮ look for two different bracketed sentential forms in L(Γ/≡)
d6 d8 pat => d5 case exp of d7 match '|' mrule r7 r5 r8 r6
d7 d6 d8 pat => d5 case exp of match r5 r8 r6 '|' mrule r7
◮ a nonterminal transition represents exactly its derived context-free language
Noncanonical Unambiguity
◮ between pairs of states of Γ/≡, (q1, q2)
◮ synchronized left-to-right walks from an initial pair (qs, qs)
d6 d8 d14 vid r14 => d5 case exp of d7 match '|' mrule r7 r5 r8 r6
d7 d6 d8 d14 vid r14 => d5 case exp of match r5 r8 r6 '|' mrule r7
[Animation: the two forms are walked in parallel. The successive steps are labelled epsilon: mae (three times), shift: mas, nothing!, then, with d14 vid r14 folded into pat, shift: mas, conflict: mac (three times), shift: mas, reduce: mar, conflict: mac]
Noncanonical Unambiguity
◮ ma = mas ∪ mae ∪ mac ∪ mar
◮ G is noncanonically unambiguous if there does not exist a relation (qs, qs) ma* (qf, qf) that uses mac at some step
◮ Computation in O(|Γ/≡|^2) in space
Noncanonical Unambiguity
◮ Regular Unambiguity RU(≡)
◮ Bounded-length detection schemes
◮ LR(k) and LR-Regular (LR(Π))
◮ Horizontal and vertical ambiguity (HVRU(≡))
Noncanonical Unambiguity
[Gorn, 1963, Cheung and Uzgalis, 1995, Schr¨
◮ generate sentences
◮ not conservative
◮ prefix_m prevents false positives in sentences of length < m
◮ need to generate a^{2^n+1} to find G_4^n ambiguous, but G_4^n ∉ NU(item0)
S → A | B_n a, A → Aaa | a, B_1 → aa, B_2 → B_1 B_1, ..., B_n → B_{n−1} B_{n−1}    (G_4^n)
Noncanonical Unambiguity
[Knuth, 1965, Hunt III et al., 1975, Čulik and Cohen, 1973, Heilbrunner, 1983]
◮ conservative tests
◮ define item_Π s.t. LR(Π) ⊂ NU(item_Π)
◮ need an LR(2^n) test to prove G_3^n unambiguous, but G_3^n ∈ NU(item0)
S → A | B_n, A → Aaa | a, B_1 → aa, B_2 → B_1 B_1, ..., B_n → B_{n−1} B_{n−1}    (G_3^n)
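A small Python experiment on the two grammar families, counting derivation trees of a^m by length. It illustrates why a bounded-length search must reach length 2^n + 1 before it sees the ambiguity of G_4^n. The `grammar` and `count_trees` helpers and the dictionary encoding are assumptions of this sketch; the count relies on every symbol deriving at least one letter.

```python
from functools import lru_cache


def grammar(n, trailing_a):
    """Rules of G_4^n (trailing_a=True) or G_3^n (trailing_a=False) over the terminal a."""
    rules = {
        "S": [["A"], [f"B{n}", "a"] if trailing_a else [f"B{n}"]],
        "A": [["A", "a", "a"], ["a"]],
        "B1": [["a", "a"]],
    }
    for k in range(2, n + 1):
        rules[f"B{k}"] = [[f"B{k - 1}", f"B{k - 1}"]]
    return rules


def count_trees(rules, m, start="S"):
    """Number of derivation trees of a^m from `start`."""

    @lru_cache(maxsize=None)
    def trees(sym, length):
        if sym == "a":
            return 1 if length == 1 else 0
        return sum(split(tuple(rhs), length) for rhs in rules[sym])

    @lru_cache(maxsize=None)
    def split(rhs, length):
        if len(rhs) == 1:
            return trees(rhs[0], length)
        # The first symbol takes k letters, leaving at least one per remaining symbol.
        return sum(trees(rhs[0], k) * split(rhs[1:], length - k)
                   for k in range(1, length - len(rhs) + 2))

    return trees(start, m)


n = 3
g4 = grammar(n, trailing_a=True)
print([m for m in range(1, 2 ** n + 2) if count_trees(g4, m) > 1])   # [9]
```

For G_3^n the same count never exceeds one, but a bounded search can only confirm this up to the lengths it tries; it is not a proof of unambiguity.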
Experimental Results
◮ For the whole SML grammar:
  ◮ conflicts in the LALR(1) parser
    sml.y: conflicts: 223 shift/reduce, 35 reduce/reduce
  ◮ Our tool:
    89 potential ambiguities with LR(1) precision detected
◮ For the SML grammar fragment:
  2 potential ambiguities with LR(0) precision detected:
  (match -> mrule . , match -> match . '|' mrule )
  (match -> match . '|' mrule , match -> match '|' mrule . )
◮ NU(item1) correctly identifies 87% of our unambiguous grammars, and 73% of the non-LALR(1) ones
Experimental Results
◮ conservative ambiguity detection
◮ provably better than several other techniques
◮ also experimentally better
Closing Comments
◮ Main issues in parser development:
  ◮ nondeterminism
  ◮ ambiguity in particular
◮ Deterministic parsers for larger classes of grammars
◮ Ambiguity detection algorithm
Future Work
◮ Linear time parsing for NU(≡) grammars?
◮ Improved implementation
◮ Noncanonical languages
◮ Regular approximations
Shift/Reduce Conflict
GNU Bison
state 20
    6 exp: "case" exp "of" match .
    8 match: match . '|' mrule
    '|'  shift, and go to state 24
    '|'  [reduce using rule 6 (exp)]
Which action to choose?
[Animation, first parse tree: Reduce? ends with error!; Shift? completes the tree with match, mrule, pat and exp under fvalbind]
[Animation, on the input | NONE => filterP(r, l) | filterP ([], l) = rev l: Reduce? completes the tree with sfvalbind and fvalbind; Shift? ends with error!]
Ambiguity Report
◮ grambiguity [Brabrand et al., 2007]
*** horizontal ambiguity at E[plus]: Exp <--> '+' Exp
ambiguous string: "x+x+x"
◮ ANTLRWorks [Parr, 2007]
◮ memory requirements: a solution could be a NLALR test
◮ dynamic disambiguation: inverse problem, some means of deciding equivalence needed
References
context-free grammars. Master’s thesis, Centrum voor Wiskunde en Informatica, Universiteit van Amsterdam, Aug. 2007.
parsing-based approach. In IWPT’03, pages 55–65,
ftp://ftp.inria.fr/INRIA/Projects/Atoll/ Pierre.Boullier/supertaggeur final.pdf.
ambiguity of context-free grammars. In J. Holub and J. Žďárek, editors, CIAA'07, 2007. URL http://www.brics.dk/~brabrand/grambiguity/. To appear in Lecture Notes in Computer Science.