Approximating Context-Free Grammars for Parsing and Verification

Sylvain Schmitz, LORIA, INRIA Nancy - Grand Est, October 18, 2007

Outline: Motivation, Approximations, Shift-Resolve Parsing, Ambiguity Detection, Conclusion


  1. A Syntax Issue: Parsers
     Pipeline: a context-free grammar is given to a parser generator, which produces a parser; the parser turns input tokens into a parse tree.
     Context-free grammar:
         /**
          * smlfvalbind.y
          *
          * Standard ML function declarations.
          * See _The_definition_of_standard_ML_, Milner et al., 1997,
          * ISBN 0-262-63181-4.
          */
         %token CASE "case"
         %token FUN "fun"
         %token MATCH "=>"
         %token OF "of"
         %token VID
         %start dec
         %%
         dec: "fun" fvalbind ;
         fvalbind: sfvalbind | fvalbind '|' sfvalbind ;
         sfvalbind: VID atpats '=' exp ;
         exp: VID | "case" exp "of" match ;
         match: mrule | match '|' mrule ;
         mrule: pat "=>" exp ;
         atpats: atpat | atpats atpat ;
         atpat: VID ;
         pat: VID atpat ;
         %%
     Input tokens:
         . . . | NONE => filterP(r, l) | filterP ([], l) = rev l
     [Figure: parse tree over these tokens, with nodes ⟨fvalbind⟩, ⟨sfvalbind⟩, ⟨atpats⟩, ⟨exp⟩, ⟨match⟩, ⟨mrule⟩ and ⟨pat⟩.]

  2. Grammar Conflicts: LALR(1) Parser Generator
     ◮ GNU Bison
           state 20
               6 exp: "case" exp "of" match .
               8 match: match . '|' mrule
               '|'  shift, and go to state 24
               '|'  [reduce using rule 6 (exp)]
     ◮ Restricted grammar class
     [Diagram: LALR(1) drawn as a class inside CFG.]

  4. Grammar Conflicts: Dealing with Conflicts. An Objective Measure
     [Figure: number of LALR(1) conflicts (y-axis, up to 700) per parser version (x-axis, up to 20) on a C# grammar [Malloy et al., 2002].]

  5. Grammar Conflicts: Dealing with Conflicts. A Subjective Measure
     [Figure courtesy of http://www.phdcomics.com.]


  8. Solutions: State of the Art
     ◮ LR(k) [Knuth, 1965]
     ◮ LR-Regular [Čulik and Cohen, 1973]
     ◮ Generalized LR [Tomita, 1986]
     ◮ Unambiguous CFGs [Cantor, 1962, Chomsky and Schützenberger, 1963]
     ◮ Horizontal and vertical unambiguity test [Brabrand et al., 2007]
     [Diagram: grammar classes LALR(1), LR(k) and LR-Regular drawn inside CFG.]

  11. Solutions: Ambiguity
      Pipeline: the context-free grammar smlfvalbind.y (Standard ML function declarations) is given to a parser generator, which produces a parser; on ambiguous input the parser turns the input tokens into a parse forest.
      Input tokens:
          case a of b => case b of c => c | d => d
      [Figure: parse forest with two trees over these tokens, built from ⟨exp⟩, ⟨match⟩, ⟨mrule⟩ and ⟨pat⟩ nodes; the trailing "| d => d" belongs to the inner case in one tree and to the outer case in the other.]
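The ambiguity on this slide can be checked mechanically by counting derivation trees. The following is a minimal sketch, not part of the talk: it assumes a simplified fragment in which both ⟨pat⟩ and ⟨exp⟩ may derive a single vid (the identifiers a, b, c, d are all lexed as `vid`), and the names `RULES`, `TOKENS` and `count_trees` are made up for the illustration.

```python
from functools import lru_cache

# Assumed simplified fragment: both pat and exp may derive a single vid.
RULES = {
    "exp": [("vid",), ("case", "exp", "of", "match")],
    "match": [("mrule",), ("match", "|", "mrule")],
    "mrule": [("pat", "=>", "exp")],
    "pat": [("vid",)],
}

def count_trees(rules, start, tokens):
    """Count the derivation trees of `tokens` from `start` (no empty rules)."""
    tokens = tuple(tokens)

    @lru_cache(maxsize=None)
    def count(symbol, i, j):
        if symbol not in rules:  # terminal: must match exactly one token
            return 1 if j == i + 1 and tokens[i] == symbol else 0
        return sum(count_seq(rhs, i, j) for rhs in rules[symbol])

    @lru_cache(maxsize=None)
    def count_seq(rhs, i, j):
        if not rhs:
            return 1 if i == j else 0
        head, tail = rhs[0], rhs[1:]
        # Every symbol covers at least one token, so spans shrink and the
        # recursion terminates even with the left-recursive match rule.
        return sum(count(head, i, k) * count_seq(tail, k, j)
                   for k in range(i + 1, j - len(tail) + 1))

    return count(start, 0, len(tokens))

# "case a of b => case b of c => c | d => d", identifiers lexed as vid
TOKENS = "case vid of vid => case vid of vid => vid | vid => vid".split()
print(count_trees(RULES, "exp", TOKENS))
```

Under these assumptions the counter finds two trees for the input, one per attachment of the last match rule, while an input with a single `|`-free match has exactly one.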

  13. Solutions: State of the Art (diagram build-up)
      [Diagram: the reference list of slide 8, shown next to the grammar classes CFG, UCFG, HVRU, LR-Regular, LR(k) and LALR(1), with regions labelled "Unsafe" and "Safe".]

  17. Solutions: Contributions
      ◮ Noncanonical parsing methods [Szymanski and Williams, 1976, Tai, 1979]
      ◮ Noncanonical LALR(1)
      ◮ Shift-Resolve
      ◮ Noncanonical unambiguity test
      ◮ Framework for grammar approximations
      [Diagram: the classes NLALR(1), ShRe and NU are added in turn to a picture showing CFG, UCFG, HVRU, LR-Regular and LR(k).]

  22. Grammar Approximations: Bracketed Grammars
      G = ⟨N, T, P, S⟩, V = N ∪ T
           1  ⟨dec⟩ → fun ⟨fvalbind⟩
           2  ⟨fvalbind⟩ → ⟨sfvalbind⟩
           3  ⟨fvalbind⟩ → ⟨fvalbind⟩ '|' ⟨sfvalbind⟩
           4  ⟨sfvalbind⟩ → vid ⟨atpats⟩ = ⟨exp⟩
           5  ⟨exp⟩ → case ⟨exp⟩ of ⟨match⟩
           6  ⟨match⟩ → ⟨mrule⟩
           7  ⟨match⟩ → ⟨match⟩ '|' ⟨mrule⟩
           8  ⟨mrule⟩ → ⟨pat⟩ => ⟨exp⟩
           9  ⟨atpats⟩ → ⟨atpat⟩
          10  ⟨atpats⟩ → ⟨atpats⟩ ⟨atpat⟩
          11  ⟨pat⟩ → vid ⟨atpat⟩
          12  ⟨atpat⟩ → vid

  23. Grammar Approximations: Bracketed Grammars
      G_b = ⟨N, T_b, P_b, S⟩, V_b = N ∪ T_b
           1  ⟨dec⟩ → d1 fun ⟨fvalbind⟩ r1
           2  ⟨fvalbind⟩ → d2 ⟨sfvalbind⟩ r2
           3  ⟨fvalbind⟩ → d3 ⟨fvalbind⟩ '|' ⟨sfvalbind⟩ r3
           4  ⟨sfvalbind⟩ → d4 vid ⟨atpats⟩ = ⟨exp⟩ r4
           5  ⟨exp⟩ → d5 case ⟨exp⟩ of ⟨match⟩ r5
           6  ⟨match⟩ → d6 ⟨mrule⟩ r6
           7  ⟨match⟩ → d7 ⟨match⟩ '|' ⟨mrule⟩ r7
           8  ⟨mrule⟩ → d8 ⟨pat⟩ => ⟨exp⟩ r8
           9  ⟨atpats⟩ → d9 ⟨atpat⟩ r9
          10  ⟨atpats⟩ → d10 ⟨atpats⟩ ⟨atpat⟩ r10
          11  ⟨pat⟩ → d11 vid ⟨atpat⟩ r11
          12  ⟨atpat⟩ → d12 vid r12
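The step from G to G_b is purely mechanical: rule i gets an opening bracket d_i in front of its right-hand side and a closing bracket r_i behind it. A minimal sketch, assuming rules are stored as (left-hand side, right-hand side) pairs numbered from 1; the names `RULES` and `bracket` are illustrative only.

```python
def bracket(rules):
    """Wrap the right-hand side of rule i between the brackets d_i and r_i."""
    return [(lhs, ("d%d" % i, *rhs, "r%d" % i))
            for i, (lhs, rhs) in enumerate(rules, start=1)]

# First three rules of the grammar on the slide, in the same order.
RULES = [
    ("dec", ("fun", "fvalbind")),
    ("fvalbind", ("sfvalbind",)),
    ("fvalbind", ("fvalbind", "|", "sfvalbind")),
]

for lhs, rhs in bracket(RULES):
    print(lhs, "->", " ".join(rhs))
```

Only the terminal alphabet grows (T_b adds the d_i and r_i symbols); the nonterminals and the start symbol are unchanged.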

  24. Grammar Approximations: Positions
      [Figure: derivation tree with root ⟨fvalbind⟩ and children ⟨fvalbind⟩ '|' ⟨sfvalbind⟩; the left ⟨fvalbind⟩ derives ⟨sfvalbind⟩, the right ⟨sfvalbind⟩ derives vid ⟨atpats⟩ = ⟨exp⟩.]
      Position (dotted bracketed form of the tree):
          d3 d2 ⟨sfvalbind⟩ r2 '|' · d4 vid ⟨atpats⟩ = ⟨exp⟩ r4 r3

  25. Grammar Approximations: Position Graph Γ. Left-to-right Walks in Trees
      Transitions between positions of the tree ⟨fvalbind⟩ → ⟨fvalbind⟩ '|' ⟨sfvalbind⟩, starting with the dot just after '|':
          on d4 (enter the subtree):             d3 d2 ⟨sfvalbind⟩ r2 '|' d4 · vid ⟨atpats⟩ = ⟨exp⟩ r4 r3
          on ⟨sfvalbind⟩ (skip the subtree):     d3 d2 ⟨sfvalbind⟩ r2 '|' d4 vid ⟨atpats⟩ = ⟨exp⟩ r4 · r3
          then on r3 (leave the tree):           d3 d2 ⟨sfvalbind⟩ r2 '|' d4 vid ⟨atpats⟩ = ⟨exp⟩ r4 r3 ·
      [Figure: overview of the position graph, with transitions labelled d2, d3, d4, r2, r3, r4, vid, =, '|', ⟨fvalbind⟩, ⟨sfvalbind⟩, ⟨atpats⟩ and ⟨exp⟩.]

  29. Grammar Approximations: Position Automaton Γ/≡
      Definition: Γ/≡ is the quotient of Γ by an equivalence relation ≡ between positions.
      Theorem (language over-approximation): L(G_b) ⊆ L(Γ/≡) ∩ T_b*

  30. Grammar Approximations: Example: item_0 Equivalence
      ◮ equivalence class [⟨sfvalbind⟩ →₄ vid ⟨atpats⟩ · = ⟨exp⟩] (the arrow subscript is the rule number)
      ◮ LR(0) items
      ◮ Γ/item_0: nondeterministic LR(0) automaton
      [Figure: two positions of the position graph, in different trees, that fall in this same equivalence class.]

  31. Grammar Approximations: Example: item_0 Equivalence
      Fragment of Γ/item_0, one equivalence class per state:
          [⟨fvalbind⟩ →₂ · ⟨sfvalbind⟩]  and  [⟨fvalbind⟩ →₃ ⟨fvalbind⟩ '|' · ⟨sfvalbind⟩]
              both go on d4 to
          [⟨sfvalbind⟩ →₄ · vid ⟨atpats⟩ = ⟨exp⟩]
              on vid to        [⟨sfvalbind⟩ →₄ vid · ⟨atpats⟩ = ⟨exp⟩]
              on ⟨atpats⟩ to   [⟨sfvalbind⟩ →₄ vid ⟨atpats⟩ · = ⟨exp⟩]
              on = to          [⟨sfvalbind⟩ →₄ vid ⟨atpats⟩ = · ⟨exp⟩]
              on ⟨exp⟩ to      [⟨sfvalbind⟩ →₄ vid ⟨atpats⟩ = ⟨exp⟩ ·]
              which goes on r4 to both
          [⟨fvalbind⟩ →₂ ⟨sfvalbind⟩ ·]  and  [⟨fvalbind⟩ →₃ ⟨fvalbind⟩ '|' ⟨sfvalbind⟩ ·]
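The fragment above suggests how Γ/item_0 can be built directly from the grammar. The sketch below is one reading of the slide, not the author's implementation: states are LR(0) items written as (rule number, dot index), a d_i transition leaves an item whose dot stands before the left-hand side of rule i, and an r_i transition leaves the completed item of rule i and lands just after that nonterminal. The names `RULES` and `item0_automaton` are made up for the example.

```python
# Rules 1 to 12 of the slides, as (lhs, rhs) pairs.
RULES = [
    ("dec", ("fun", "fvalbind")),
    ("fvalbind", ("sfvalbind",)),
    ("fvalbind", ("fvalbind", "|", "sfvalbind")),
    ("sfvalbind", ("vid", "atpats", "=", "exp")),
    ("exp", ("case", "exp", "of", "match")),
    ("match", ("mrule",)),
    ("match", ("match", "|", "mrule")),
    ("mrule", ("pat", "=>", "exp")),
    ("atpats", ("atpat",)),
    ("atpats", ("atpats", "atpat")),
    ("pat", ("vid", "atpat")),
    ("atpat", ("vid",)),
]

def item0_automaton(rules):
    """Return the transitions (source item, label, target item) of Γ/item_0."""
    nonterminals = {lhs for lhs, _ in rules}
    transitions = set()
    for i, (_, rhs) in enumerate(rules, start=1):
        for dot, symbol in enumerate(rhs):
            source, target = (i, dot), (i, dot + 1)
            transitions.add((source, symbol, target))  # terminal or nonterminal step
            if symbol in nonterminals:
                for j, (lhs2, rhs2) in enumerate(rules, start=1):
                    if lhs2 == symbol:
                        # d_j enters rule j, r_j returns from its completed item
                        transitions.add((source, "d%d" % j, (j, 0)))
                        transitions.add(((j, len(rhs2)), "r%d" % j, target))
    return transitions

AUTOMATON = item0_automaton(RULES)
print(len(AUTOMATON), "transitions")
```

With this encoding, the two d4 and the two r4 transitions of the fragment correspond to the items (2, 0), (3, 2), (4, 0), (4, 4), (2, 1) and (3, 3).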

  32. Grammar Approximations: Summary
      ◮ general framework for approximations
      ◮ applications:
        ◮ parser construction
        ◮ ambiguity detection
        ◮ XML validation [Segoufin and Vianu, 2002]?
        ◮ symbolic supertagging [Boullier, 2003]?

  34. Parsing: Principles. Shift-Resolve Parsing
      ◮ noncanonical
      ◮ k = 1 reduced lookahead symbol
      ◮ resolve = reduce + pushback: emulates a bounded reduced lookahead without any preset bound

  36. Parsing: Example. Shift-Resolve Parse
      Input tokens:
          . . . | NONE => filterP(r, l) | filterP ([], l) = rev l
      [Figure: step-by-step shift-resolve parse of these tokens, building the nodes ⟨atpats⟩, ⟨exp⟩, ⟨pat⟩, ⟨mrule⟩, ⟨match⟩, ⟨sfvalbind⟩ and ⟨fvalbind⟩.]

  42. Parser Construction: Generating the Parser
      1. position automaton
      2. determinization by subset construction

  43. Parser Construction: Subset Construction. Principle
      ◮ d_i transitions denote traditional item closures
      ◮ r_i transitions denote a phrase that should be reduced
      ◮ other transitions denote shifts
      ◮ items in the construction hold
        1. a state of the position automaton
        2. a parsing action
        3. a pushback length
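For orientation, here is the textbook subset construction on which the principle above builds. It is a sketch of the classical NFA determinization only: closure transitions (the role played by the d_i) are followed silently, and every other label is a move. It does not track the parsing action and pushback length that shift-resolve items carry, and the toy automaton `NFA` and the function names are assumptions made for the example.

```python
def closure(states, transitions, is_closure_label):
    """Add every state reachable through closure (epsilon-like) transitions."""
    result, todo = set(states), list(states)
    while todo:
        q = todo.pop()
        for src, label, dst in transitions:
            if src == q and is_closure_label(label) and dst not in result:
                result.add(dst)
                todo.append(dst)
    return frozenset(result)

def determinize(initial, transitions, is_closure_label):
    """Classical subset construction; returns the start set and the DFA table."""
    start = closure({initial}, transitions, is_closure_label)
    dfa, todo = {}, [start]
    while todo:
        current = todo.pop()
        if current in dfa:
            continue
        moves = {}
        for src, label, dst in transitions:
            if src in current and not is_closure_label(label):
                moves.setdefault(label, set()).add(dst)
        dfa[current] = {label: closure(dsts, transitions, is_closure_label)
                        for label, dsts in moves.items()}
        todo.extend(dfa[current].values())
    return start, dfa

# Toy nondeterministic automaton: "eps" stands in for a closure transition.
NFA = [(0, "eps", 1), (0, "a", 0), (1, "a", 2), (1, "b", 2)]
START, DFA = determinize(0, NFA, lambda label: label == "eps")
print(len(DFA), "deterministic states")
```

The number of subsets can be exponential in the number of input states, which is the source of the exponential bound on parser size quoted later in the talk.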

  45. Parser Construction: Subset Construction. Example
      Items are written: state of the position automaton, parsing action, pushback length.
      Initial items:
          ⟨exp⟩ → case ⟨exp⟩ of ⟨match⟩ ·, 0, 0
          ⟨match⟩ → ⟨match⟩ · '|' ⟨mrule⟩, 0, 0
      Added through r5:
          ⟨sfvalbind⟩ → vid ⟨atpats⟩ = ⟨exp⟩ ·, 5, 0
      Added through r4:
          ⟨fvalbind⟩ → ⟨fvalbind⟩ '|' ⟨sfvalbind⟩ ·, 5, 0
          ⟨fvalbind⟩ → ⟨sfvalbind⟩ ·, 5, 0
      Added next:
          ⟨fvalbind⟩ → ⟨fvalbind⟩ · '|' ⟨sfvalbind⟩, 5, 0
          ⟨dec⟩ → fun ⟨fvalbind⟩ ·, 5, 0
          S′ → ⟨dec⟩ · $, 5, 0
      New item set reached on '|':
          ⟨fvalbind⟩ → ⟨fvalbind⟩ '|' · ⟨sfvalbind⟩, 5, 1
          ⟨match⟩ → ⟨match⟩ '|' · ⟨mrule⟩, 0, 0
      Added to it through d8 and further closures:
          ⟨mrule⟩ → · ⟨pat⟩ => ⟨exp⟩, 0, 0
          ⟨pat⟩ → · vid ⟨atpat⟩, 0, 0
          ⟨sfvalbind⟩ → · vid ⟨atpats⟩ = ⟨exp⟩, 0, 0

  53. Parser Construction: Construction Failure
      Item set of the example:
          ⟨exp⟩ → case ⟨exp⟩ of ⟨match⟩ ·, 0, 0
          ⟨match⟩ → ⟨match⟩ · '|' ⟨mrule⟩, 0, 0
          ⟨sfvalbind⟩ → vid ⟨atpats⟩ = ⟨exp⟩ ·, 5, 0
          ⟨fvalbind⟩ → ⟨fvalbind⟩ '|' ⟨sfvalbind⟩ ·, 5, 0
          ⟨fvalbind⟩ → ⟨sfvalbind⟩ ·, 5, 0
          ⟨fvalbind⟩ → ⟨fvalbind⟩ · '|' ⟨sfvalbind⟩, 5, 0
          ⟨dec⟩ → fun ⟨fvalbind⟩ ·, 5, 0
          S′ → ⟨dec⟩ · $, 5, 0
      Also added through r5:
          ⟨mrule⟩ → ⟨pat⟩ => ⟨exp⟩ ·, 5, 0
          ⟨match⟩ → ⟨mrule⟩ ·, 5, 0
          ⟨match⟩ → ⟨match⟩ · '|' ⟨mrule⟩, 5, 0
      The item ⟨match⟩ → ⟨match⟩ · '|' ⟨mrule⟩ now occurs with both action 0 and action 5.

  56. Parser Construction: Complexity
      ◮ |Γ/≡|: size of the position automaton; |Γ/item_0| = O(|G|)
      ◮ |A|: size of the parser: O(2^(|Γ/≡| |P|))
      ◮ parsing time complexity for input w: O(|w|)

  58. Parser Construction: Limitations
      − incomparable with classical parsing techniques
      + the subset construction can be amended

  60. Parser Construction: Summary
      ◮ Shift-Resolve parsers
        1. Large class of grammars accepted
        2. Unambiguity
        3. Linear time parsing
      ◮ Two-step construction
        1. Simple
        2. Flexible

  61. Principles
      ◮ a bracketed sentence = a derivation tree
      ◮ ambiguity = more than one tree with the same yield
            d6 d8 d13 vid r13 => d5 case d14 vid r14 of d7 d6 d8 d13 vid r13 => d14 vid r14 r8 r6 '|' d8 d13 vid r13 => d14 vid r14 r8 r7 r5 r8 r6
            d7 d6 d8 d13 vid r13 => d5 case d14 vid r14 of d6 d8 d13 vid r13 => d14 vid r14 r8 r6 r5 r8 r6 '|' d8 d13 vid r13 => d14 vid r14 r8 r7
      ◮ construct an FSA A such that L(G_b) ⊆ L(A), and look for bracketed sentences with the same yield
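The two principles can be made concrete with the bracket-erasing homomorphism h used later in the talk: two different well-bracketed sentences with the same image under h witness an ambiguity. A minimal sketch, assuming sentences are space-separated token strings and that brackets are exactly the tokens of the form d or r followed by digits; the inner single-rule match of the second sentence is written with rule 6 brackets, as the grammar requires. The names `h`, `well_bracketed`, `W1` and `W2` are illustrative.

```python
import re

BRACKET = re.compile(r"[dr]\d+")

def h(sentence):
    """Erase the d_i and r_i brackets, keeping the yield."""
    return [t for t in sentence.split() if not BRACKET.fullmatch(t)]

def well_bracketed(sentence):
    """Check that every d_i is closed by the matching r_i, properly nested."""
    stack = []
    for t in sentence.split():
        if BRACKET.fullmatch(t):
            if t[0] == "d":
                stack.append(t[1:])
            elif not stack or stack.pop() != t[1:]:
                return False
    return not stack

MRULE = "d8 d13 vid r13 => d14 vid r14 r8"
# Inner case owns both match rules.
W1 = f"d6 d8 d13 vid r13 => d5 case d14 vid r14 of d7 d6 {MRULE} r6 | {MRULE} r7 r5 r8 r6"
# Outer match owns the second match rule.
W2 = f"d7 d6 d8 d13 vid r13 => d5 case d14 vid r14 of d6 {MRULE} r6 r5 r8 r6 | {MRULE} r7"

print(" ".join(h(W1)))
print(W1 != W2 and h(W1) == h(W2))
```

Both sentences are well bracketed and differ, yet they share the yield `vid => case vid of vid => vid | vid => vid`.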

  64. Regular Unambiguity: RU(≡)
      ◮ G is regular unambiguous for ≡ of finite index if there do not exist w_b ≠ w′_b in L(Γ/≡) ∩ T_b* with h(w_b) = h(w′_b)
      ◮ LR(0) ⊄ RU(item_0)
      ◮ regular approximations are too weak

  66. Noncanonical Unambiguity: Nonterminal Transitions
      ◮ SF(G_b) ⊆ L(Γ/≡)
      ◮ look for two different bracketed sentential forms in L(Γ/≡)
            d6 d8 ⟨pat⟩ => d5 case ⟨exp⟩ of d7 ⟨match⟩ '|' ⟨mrule⟩ r7 r5 r8 r6
            d7 d6 d8 ⟨pat⟩ => d5 case ⟨exp⟩ of ⟨match⟩ r5 r8 r6 '|' ⟨mrule⟩ r7
      ◮ a nonterminal transition represents exactly its derived context-free language

  69. Motivation Approximations Shift-Resolve Parsing Ambiguity Detection Conclusion Noncanonical Unambiguity Mutual Accessibility Relations ◮ between pairs of states of Γ/ ≡ , ( q 1 , q 2 ) ◮ synchronized left-to-right walks from an initial pair ( q s , q s ) d 6 d 8 d 14 vid r 14 = > d 5 case � exp � of d 7 � match � ′ | ′ � mrules � r 7 r 5 r 8 r 6 ′ | ′ � mrules � r 7 d 7 d 6 d 8 d 14 vid r 14 = > d 5 case � exp � of � match � r 5 r 8 r 6 epsilon: mae

  70. Motivation Approximations Shift-Resolve Parsing Ambiguity Detection Conclusion Noncanonical Unambiguity Mutual Accessibility Relations ◮ between pairs of states of Γ/ ≡ , ( q 1 , q 2 ) ◮ synchronized left-to-right walks from an initial pair ( q s , q s ) d 6 d 8 d 14 vid r 14 = > d 5 case � exp � of d 7 � match � ′ | ′ � mrules � r 7 r 5 r 8 r 6 ′ | ′ � mrules � r 7 d 7 d 6 d 8 d 14 vid r 14 = > d 5 case � exp � of � match � r 5 r 8 r 6 epsilon: mae

  71. Motivation Approximations Shift-Resolve Parsing Ambiguity Detection Conclusion Noncanonical Unambiguity Mutual Accessibility Relations ◮ between pairs of states of Γ/ ≡ , ( q 1 , q 2 ) ◮ synchronized left-to-right walks from an initial pair ( q s , q s ) d 6 d 8 d 14 vid r 14 = > d 5 case � exp � of d 7 � match � ′ | ′ � mrules � r 7 r 5 r 8 r 6 ′ | ′ � mrules � r 7 d 7 d 6 d 8 d 14 vid r 14 = > d 5 case � exp � of � match � r 5 r 8 r 6 epsilon: mae

  72. Motivation Approximations Shift-Resolve Parsing Ambiguity Detection Conclusion Noncanonical Unambiguity Mutual Accessibility Relations ◮ between pairs of states of Γ/ ≡ , ( q 1 , q 2 ) ◮ synchronized left-to-right walks from an initial pair ( q s , q s ) d 6 d 8 d 14 vid r 14 = > d 5 case � exp � of d 7 � match � ′ | ′ � mrules � r 7 r 5 r 8 r 6 ′ | ′ � mrules � r 7 d 7 d 6 d 8 d 14 vid r 14 = > d 5 case � exp � of � match � r 5 r 8 r 6 shift: mas

  73. Motivation Approximations Shift-Resolve Parsing Ambiguity Detection Conclusion Noncanonical Unambiguity Mutual Accessibility Relations ◮ between pairs of states of Γ/ ≡ , ( q 1 , q 2 ) ◮ synchronized left-to-right walks from an initial pair ( q s , q s ) d 6 d 8 d 14 vid r 14 = > d 5 case � exp � of d 7 � match � ′ | ′ � mrules � r 7 r 5 r 8 r 6 ′ | ′ � mrules � r 7 d 7 d 6 d 8 d 14 vid r 14 = > d 5 case � exp � of � match � r 5 r 8 r 6 nothing!

  74. Motivation Approximations Shift-Resolve Parsing Ambiguity Detection Conclusion Noncanonical Unambiguity Mutual Accessibility Relations ◮ between pairs of states of Γ/ ≡ , ( q 1 , q 2 ) ◮ synchronized left-to-right walks from an initial pair ( q s , q s ) d 6 d 8 � pat � = > d 5 case � exp � of d 7 � match � ′ | ′ � mrules � r 7 r 5 r 8 r 6 ′ | ′ � mrules � r 7 d 7 d 6 d 8 � pat � = > d 5 case � exp � of � match � r 5 r 8 r 6 shift: mas

  75. Motivation Approximations Shift-Resolve Parsing Ambiguity Detection Conclusion Noncanonical Unambiguity Mutual Accessibility Relations ◮ between pairs of states of Γ/ ≡ , ( q 1 , q 2 ) ◮ synchronized left-to-right walks from an initial pair ( q s , q s ) d 6 d 8 � pat � = > d 5 case � exp � of d 7 � match � ′ | ′ � mrules � r 7 r 5 r 8 r 6 ′ | ′ � mrules � r 7 d 7 d 6 d 8 � pat � = > d 5 case � exp � of � match � r 5 r 8 r 6 conflict: mac

  76. Motivation Approximations Shift-Resolve Parsing Ambiguity Detection Conclusion Noncanonical Unambiguity Mutual Accessibility Relations ◮ between pairs of states of Γ/ ≡ , ( q 1 , q 2 ) ◮ synchronized left-to-right walks from an initial pair ( q s , q s ) d 6 d 8 � pat � = > d 5 case � exp � of d 7 � match � ′ | ′ � mrules � r 7 r 5 r 8 r 6 ′ | ′ � mrules � r 7 d 7 d 6 d 8 � pat � = > d 5 case � exp � of � match � r 5 r 8 r 6 conflict: mac

  77. Motivation Approximations Shift-Resolve Parsing Ambiguity Detection Conclusion Noncanonical Unambiguity Mutual Accessibility Relations ◮ between pairs of states of Γ/ ≡ , ( q 1 , q 2 ) ◮ synchronized left-to-right walks from an initial pair ( q s , q s ) d 6 d 8 � pat � = > d 5 case � exp � of d 7 � match � ′ | ′ � mrules � r 7 r 5 r 8 r 6 ′ | ′ � mrules � r 7 d 7 d 6 d 8 � pat � = > d 5 case � exp � of � match � r 5 r 8 r 6 conflict: mac

  78. Motivation Approximations Shift-Resolve Parsing Ambiguity Detection Conclusion Noncanonical Unambiguity Mutual Accessibility Relations ◮ between pairs of states of Γ/ ≡ , ( q 1 , q 2 ) ◮ synchronized left-to-right walks from an initial pair ( q s , q s ) d 6 d 8 � pat � = > d 5 case � exp � of d 7 � match � ′ | ′ � mrules � r 7 r 5 r 8 r 6 ′ | ′ � mrules � r 7 d 7 d 6 d 8 � pat � = > d 5 case � exp � of � match � r 5 r 8 r 6 shift: mas

  79. Motivation Approximations Shift-Resolve Parsing Ambiguity Detection Conclusion Noncanonical Unambiguity Mutual Accessibility Relations ◮ between pairs of states (q_1, q_2) of Γ/≡ ◮ synchronized left-to-right walks from an initial pair (q_s, q_s) [Figure: two copies of the graph over case ⟨exp⟩ of ⟨match⟩ '|' ⟨mrules⟩ and ⟨pat⟩ =>, labels d_5 to d_8 and r_5 to r_8; step shown: reduce (mar)]

  80. Motivation Approximations Shift-Resolve Parsing Ambiguity Detection Conclusion Noncanonical Unambiguity Mutual Accessibility Relations ◮ between pairs of states (q_1, q_2) of Γ/≡ ◮ synchronized left-to-right walks from an initial pair (q_s, q_s) [Figure: two copies of the graph over case ⟨exp⟩ of ⟨match⟩ '|' ⟨mrules⟩ and ⟨pat⟩ =>, labels d_5 to d_8 and r_5 to r_8; step shown: conflict (mac)]

  81. Motivation Approximations Shift-Resolve Parsing Ambiguity Detection Conclusion Noncanonical Unambiguity NU(≡) ◮ ma = mas ∪ mae ∪ mac ∪ mar ◮ G is noncanonically unambiguous if there does not exist a relation (q_s, q_s) ma* (q_f, q_f) that uses mac at some step ◮ Computation in O(|Γ/≡|²) space
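
The NU(≡) criterion above can be read as a reachability question over pairs of states. The following is only a minimal sketch under stated assumptions: the dictionary `ma`, the step labels as strings, and the function name `noncanonically_unambiguous` are hypothetical and not taken from the author's tool. The relation is assumed to be given explicitly as a map from a pair to its successors, each tagged with the component (mas, mae, mac or mar) that produced it.

```python
from collections import deque


def noncanonically_unambiguous(ma, start, final):
    """Return False iff some chain start ma* final uses a mac step.

    ma    -- hypothetical encoding: dict mapping a pair of states to an
             iterable of (kind, next_pair), kind in {"mas", "mae", "mac", "mar"}
    start -- the initial pair (q_s, q_s)
    final -- the final pair (q_f, q_f)
    """
    # Search nodes are (pair, used_mac): the flag records whether the walk
    # that reached this pair has already taken a conflict step.
    seen = {(start, False)}
    queue = deque(seen)
    while queue:
        pair, used_mac = queue.popleft()
        if pair == final and used_mac:
            return False  # a potential ambiguity is reported
        for kind, nxt in ma.get(pair, ()):
            node = (nxt, used_mac or kind == "mac")
            if node not in seen:
                seen.add(node)
                queue.append(node)
    return True


# Toy relation (made-up states, for illustration only).
qs, qf = ("qs", "qs"), ("qf", "qf")
clean = {qs: [("mas", ("a", "a"))], ("a", "a"): [("mar", qf)]}
risky = {qs: [("mas", ("a", "b"))], ("a", "b"): [("mac", ("c", "d"))],
         ("c", "d"): [("mar", qf)]}
```

In this sketch the search visits each pair at most twice (with and without the flag), so its memory use stays proportional to the number of pairs, which is in line with the quadratic space bound stated on the slide.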

  82. Motivation Approximations Shift-Resolve Parsing Ambiguity Detection Conclusion Noncanonical Unambiguity Comparisons ◮ Regular Unambiguity RU(≡) ◮ Bounded-length detection schemes ◮ LR(k) and LR-Regular (LR(Π)) ◮ Horizontal and vertical ambiguity (HVRU(≡))

  83. Motivation Approximations Shift-Resolve Parsing Ambiguity Detection Conclusion Noncanonical Unambiguity Bounded-length detection [Gorn, 1963, Cheung and Uzgalis, 1995, Schröer, 2001, Jampana, 2005] ◮ generate sentences ◮ not conservative ◮ prefix m prevents false positives in sentences of length < m ◮ need to generate a^(2^n + 1) to find G_4^n ambiguous, but G_4^n ∉ NU(item_0) S → A | B_n a, A → Aaa | a, B_1 → aa, B_2 → B_1 B_1, …, B_n → B_(n−1) B_(n−1) (G_4^n)
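
To see why a bounded-length search must reach length 2^n + 1 on G_4^n, one can count parse trees per sentence length. This is a small illustrative sketch, not one of the cited detection tools; the function names are hypothetical. It relies on the alphabet being {a}, so a sentence is identified by its length alone.

```python
def parse_tree_counts(n, max_len):
    """Number of parse trees of a^k in G_4^n, for k = 0..max_len."""
    # Trees rooted at A (A -> Aaa | a): exactly one for every odd length.
    a = [0] * (max_len + 1)
    for k in range(1, max_len + 1):
        a[k] = 1 if k == 1 else (a[k - 2] if k >= 3 else 0)
    # Trees rooted at B_1 (B_1 -> aa).
    b = [0] * (max_len + 1)
    if max_len >= 2:
        b[2] = 1
    # B_i -> B_(i-1) B_(i-1): convolve the previous counts with themselves.
    for _ in range(2, n + 1):
        b = [sum(b[j] * b[k - j] for j in range(k + 1))
             for k in range(max_len + 1)]
    # S -> A | B_n a
    return [a[k] + (b[k - 1] if k >= 1 else 0) for k in range(max_len + 1)]


def shortest_ambiguous_length(n, max_len):
    """Smallest k <= max_len such that a^k has more than one parse tree, or None."""
    counts = parse_tree_counts(n, max_len)
    return next((k for k, c in enumerate(counts) if c > 1), None)
```

For n = 3, a search bounded at length 8 finds nothing, while a bound of 9 = 2^3 + 1 or more exposes the single ambiguous sentence.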

  84. Motivation Approximations Shift-Resolve Parsing Ambiguity Detection Conclusion Noncanonical Unambiguity LR(k) and LR-Regular [Knuth, 1965, Hunt III et al., 1975, Čulik and Cohen, 1973, Heilbrunner, 1983] ◮ conservative tests ◮ define item_Π s.t. LR(Π) ⊂ NU(item_Π) ◮ need an LR(2^n) test to prove G_3^n unambiguous, but G_3^n ∈ NU(item_0) S → A | B_n, A → Aaa | a, B_1 → aa, B_2 → B_1 B_1, …, B_n → B_(n−1) B_(n−1) (G_3^n)

  85. Motivation Approximations Shift-Resolve Parsing Ambiguity Detection Conclusion Experimental Results Implementation ◮ For the whole SML grammar: ◮ conflicts in the LALR(1) parser sml.y: conflicts: 223 shift/reduce, 35 reduce/reduce ◮ Our tool: 89 potential ambiguities with LR(1) precision detected ◮ For the SML grammar fragment: 2 potential ambiguities with LR(0) precision detected: (match -> mrule . , match -> match . ’|’ mrule ) (match -> match . ’|’ mrule , match -> match ’|’ mrule . ) ◮ NU(item_1) correctly identifies 87% of our unambiguous grammars, and 73% of the non-LALR(1) ones

  86. Motivation Approximations Shift-Resolve Parsing Ambiguity Detection Conclusion Experimental Results Summary ◮ conservative ambiguity detection ◮ provably better than several other techniques ◮ also experimentally better

  87. Motivation Approximations Shift-Resolve Parsing Ambiguity Detection Conclusion Closing Comments Conclusion ◮ Main issues in parser development: ◮ nondeterminism ◮ ambiguity in particular ◮ Deterministic parsers for larger classes of grammars ◮ Ambiguity detection algorithm

  88. Motivation Approximations Shift-Resolve Parsing Ambiguity Detection Conclusion Future Work Directions for Future Work ◮ Linear time parsing for NU(≡) grammars? ◮ Improved implementation ◮ Noncanonical languages ◮ Regular approximations

  89. Motivation Approximations Shift-Resolve Parsing Ambiguity Detection Conclusion Future Work Thanks!
