Dependently Typed Grammars (MPC 2010): Kasper Brink, Stefan Holdermans, Andres Löh

slide-1
SLIDE 1

Dependently Typed Grammars

MPC 2010

Kasper Brink, Stefan Holdermans, Andres Löh

June 22, 2010

slide-3
SLIDE 3

Parser Combinators

Expression Grammar

    E → E B N | N
    B → + | −
    N → 0 | 1

    pExpr, pNum :: Parser Int
    pBin        :: Parser (Int → Int → Int)

    pExpr = (λ e b n → b e n) <$> pExpr <*> pBin <*> pNum
        <|> pNum
    pBin  = (+) <$ pSymbol '+'
        <|> (-) <$ pSymbol '-'
    pNum  = 0 <$ pSymbol '0'
        <|> 1 <$ pSymbol '1'

Left recursion → non-termination!
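To make the non-termination concrete, here is a minimal sketch of a list-of-successes `Parser` under which the slide's definitions typecheck. The slide does not say which parser library it uses, so the `Parser` type, its instances, and `pSymbol` below are assumptions, and `pExprLeftRec` is a hypothetical name for the slide's `pExpr`.

```haskell
import Control.Applicative (Alternative (..))

-- Assumed representation: every result pairs a value with the unconsumed input.
newtype Parser a = Parser { runParser :: String -> [(a, String)] }

instance Functor Parser where
  fmap f (Parser p) = Parser $ \s -> [ (f a, rest) | (a, rest) <- p s ]

instance Applicative Parser where
  pure a = Parser $ \s -> [(a, s)]
  Parser pf <*> Parser pa =
    Parser $ \s -> [ (f a, s2) | (f, s1) <- pf s, (a, s2) <- pa s1 ]

instance Alternative Parser where
  empty = Parser (const [])
  Parser p <|> Parser q = Parser $ \s -> p s ++ q s

-- Accept exactly the given character.
pSymbol :: Char -> Parser Char
pSymbol c = Parser $ \s -> case s of
  (x : xs) | x == c -> [(x, xs)]
  _                 -> []

pBin :: Parser (Int -> Int -> Int)
pBin = (+) <$ pSymbol '+' <|> (-) <$ pSymbol '-'

pNum :: Parser Int
pNum = 0 <$ pSymbol '0' <|> 1 <$ pSymbol '1'

-- Left-recursive, as on the slide. Defining it is harmless, but running it
-- never yields a result: the first alternative runs pExprLeftRec again on
-- the same input before consuming anything.
pExprLeftRec :: Parser Int
pExprLeftRec =
  (\e b n -> b e n) <$> pExprLeftRec <*> pBin <*> pNum <|> pNum
```

With these definitions `runParser pNum "1"` evaluates to `[(1, "")]`, whereas `runParser pExprLeftRec "1"` does not terminate.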

slide-5
SLIDE 5

Representing grammars instead of parsers

◮ Represent a grammar as a data value
◮ Analyze and transform
◮ Generate a parser

This talk

◮ Representation in Agda
◮ Transform grammar to remove left recursion

slide-6
SLIDE 6

Outline

◮ Grammar Representation
◮ Left-Corner Transform
◮ (Part of) Correctness Proof
◮ Conclusion

slide-7
SLIDE 7

Grammar Representation

slide-8
SLIDE 8

Symbols

    Terminal : Set
    Terminal = Char

    data Nonterminal : Set where
      E : Nonterminal
      B : Nonterminal
      N : Nonterminal

    data Symbol : Set where
      st : Terminal → Symbol
      sn : Nonterminal → Symbol

slide-9
SLIDE 9

Semantic Types

◮ Parsers: every parser has a result type
◮ Grammars: every nonterminal has a semantic type

    ⟦_⟧ : Nonterminal → Set
    ⟦ E ⟧ = ℕ
    ⟦ B ⟧ = ℕ → ℕ → ℕ
    ⟦ N ⟧ = ℕ

slide-11
SLIDE 11

Semantic Functions

◮ Type of semantic functions determined by ⟦_⟧

    E → E B N    λ e b n → b e n : ⟦ E ⟧ → ⟦ B ⟧ → ⟦ N ⟧ → ⟦ E ⟧
    E → N        id              : ⟦ N ⟧ → ⟦ E ⟧
    N → 1        1               : ⟦ N ⟧

◮ Compute type of semantic function: ⟦_||_⟧
◮ Production A → β has semantic function of type ⟦ β || A ⟧

    ⟦_||_⟧ : Symbols → Nonterminal → Set
    ⟦ [] || A ⟧          = ⟦ A ⟧
    ⟦ st _ :: β || A ⟧   = ⟦ β || A ⟧
    ⟦ sn B :: β || A ⟧   = ⟦ B ⟧ → ⟦ β || A ⟧
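As a worked example, the three defining equations can be unfolded by hand for the production E → E B N, and for a production whose right-hand side is a single terminal. The LaTeX below is only a sketch of that unfolding; it assumes `amsmath` and writes the semantic brackets as `[\![ … ]\!]`.

```latex
\begin{align*}
[\![\, \mathit{sn}\ E :: \mathit{sn}\ B :: \mathit{sn}\ N :: [\,] \mid\mid E \,]\!]
  &= [\![ E ]\!] \to [\![\, \mathit{sn}\ B :: \mathit{sn}\ N :: [\,] \mid\mid E \,]\!] \\
  &= [\![ E ]\!] \to [\![ B ]\!] \to [\![\, \mathit{sn}\ N :: [\,] \mid\mid E \,]\!] \\
  &= [\![ E ]\!] \to [\![ B ]\!] \to [\![ N ]\!] \to [\![\, [\,] \mid\mid E \,]\!] \\
  &= [\![ E ]\!] \to [\![ B ]\!] \to [\![ N ]\!] \to [\![ E ]\!]
\end{align*}

\begin{align*}
[\![\, \mathit{st}\ \texttt{'1'} :: [\,] \mid\mid N \,]\!]
  &= [\![\, [\,] \mid\mid N \,]\!] \\
  &= [\![ N ]\!]
\end{align*}
```

Terminals contribute no argument, so the result for N → 1 is a plain value of type ⟦ N ⟧, matching the semantic function `1` given for that production.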

slide-12
SLIDE 12

Productions

    data Production : Set where
      prod : (A : Nonterminal) → (β : Symbols) → ⟦ β || A ⟧ → Production

Example:

    p1 = prod E (sn E :: sn B :: sn N :: []) (λ e b n → b e n)
    p2 = prod E (sn N :: []) id
    p3 = prod N (st '1' :: []) 1

Of course it is desirable to devise a more convenient input syntax for grammars.

slide-13
SLIDE 13

Generating a Parser

    generateParser : Productions → (S : Nonterminal) → Parser ⟦ S ⟧
    generateParser prods = gen
      where
        mutual
          gen : (A : Nonterminal) → Parser ⟦ A ⟧
          gen A = (foldr _<|>_ pFail ∘ map genAlt ∘ filterLHS A) prods

          genAlt : ∀ {A} → ProductionLHS A → Parser ⟦ A ⟧
          genAlt (prodlhs (prod A β sem)) = buildParser β (pSucceed sem)

          buildParser : ∀ {A} β → Parser ⟦ β || A ⟧ → Parser ⟦ A ⟧
          buildParser []          p = p
          buildParser (st b :: β) p = buildParser β (p <* pTerminal b)
          buildParser (sn B :: β) p = buildParser β (p <*> gen B)
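The shape of `generateParser` can be mirrored in plain Haskell if the semantic functions are dropped. The sketch below is an untyped recognizer, not the typed Agda generator: `Symbol`, `Production`, `Recognizer`, `generateRecognizer`, and `accepts` are hypothetical names, and the example grammar is a right-recursive variant of the expression grammar, chosen because a generated top-down recognizer would loop on the left-recursive original.

```haskell
-- Untyped stand-ins for the Agda types (hypothetical names).
data Symbol = T Char | NT String
  deriving (Eq, Show)

data Production = Production { lhs :: String, rhs :: [Symbol] }
  deriving (Eq, Show)

-- A recognizer maps an input to every possible remaining input.
type Recognizer = String -> [String]

generateRecognizer :: [Production] -> String -> Recognizer
generateRecognizer prods = gen
  where
    -- Try every production for nonterminal a (the role of filterLHS and <|>).
    gen :: String -> Recognizer
    gen a input = concat [ build (rhs p) input | p <- prods, lhs p == a ]

    -- Sequence the symbols of one right-hand side (the role of buildParser).
    build :: [Symbol] -> Recognizer
    build [] input = [input]
    build (T c : rest) (x : xs) | x == c = build rest xs
    build (T _ : _) _ = []
    build (NT b : rest) input = concatMap (build rest) (gen b input)

-- Right-recursive variant of the expression grammar (an assumption made
-- here so that the recognizer terminates).
exprGrammar :: [Production]
exprGrammar =
  [ Production "E" [NT "N", NT "B", NT "E"]
  , Production "E" [NT "N"]
  , Production "B" [T '+']
  , Production "B" [T '-']
  , Production "N" [T '0']
  , Production "N" [T '1']
  ]

-- The input is accepted if some alternative consumes all of it.
accepts :: [Production] -> String -> String -> Bool
accepts prods start input = "" `elem` generateRecognizer prods start input
```

For instance, `accepts exprGrammar "E" "1+1"` holds, while `accepts exprGrammar "E" "1+"` does not.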

slide-16
SLIDE 16

Left-Corner Transform

slide-18
SLIDE 18

Left Corners

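◮ Left corner: X is a left corner of A if A ⇒* X β
◮ Left-corner transform introduces new nonterminals "A−X"
◮ A−X represents the part of an A that follows an X.
◮ Example:

    A ⇒ B β ⇒ a b c β ⇒ a b c d e f g

[Figure: parse tree of A over a b c d e f g. The left corner B covers a b c; the remainder d e f g is the part covered by A−B.]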

slide-19
SLIDE 19

Left-corner Transform

Transformation Rules (Johnson, 1998)

    (1) ∀ A, b :        A   → b A−b
    (2) ∀ C, A → X β :  C−X → β C−A
    (3) ∀ A :           A−A → ε

slide-20
SLIDE 20

Example Transformation

Original:

    E → E B N
    E → N
    B → +
    B → −
    N → 0
    N → 1

Transformed:

    E → + E−+        B → + B−+        N → + N−+
    E → − E−−        B → − B−−        N → − N−−
    E → 0 E−0        B → 0 B−0        N → 0 N−0
    E → 1 E−1        B → 1 B−1        N → 1 N−1

    E−E → B N E−E    B−E → B N B−E    N−E → B N N−E
    E−N → E−E        B−N → B−E        N−N → N−E
    E−+ → E−B        B−+ → B−B        N−+ → N−B
    E−− → E−B        B−− → B−B        N−− → N−B
    E−0 → E−N        B−0 → B−N        N−0 → N−N
    E−1 → E−N        B−1 → B−N        N−1 → N−N

    E−E → ε          B−B → ε          N−N → ε

slide-21
SLIDE 21

New nonterminals

(notation: Original "O…", Transformed "T…")

    data TNonterminal : Set where
      n    : ONonterminal → TNonterminal
      n_−_ : ONonterminal → OSymbol → TNonterminal

    T⟦_⟧ : TNonterminal → Set
    T⟦ n A ⟧          = O⟦ A ⟧
    T⟦ n A − st b ⟧   = O⟦ A ⟧
    T⟦ n A − sn B ⟧   = O⟦ B ⟧ → O⟦ A ⟧

[Figure: parse tree annotated with semantic types. The root A has type ⟦A⟧, the left corner B over a b c has type ⟦B⟧, and the remainder over d e f g h has type ⟦B⟧ → ⟦A⟧.]

slide-22
SLIDE 22

Transforming Grammars

Transformation Rules

    (1) ∀ A, b :        A   → b A−b
    (2) ∀ C, A → X β :  C−X → β C−A
    (3) ∀ A :           A−A → ε

    lct : OProductions → TProductions
    lct ps = concatMap (λ A → map (rule1 A) (terms ps)) (nonterms ps)
          ++ concatMap (λ C → map (rule2 C) ps) (nonterms ps)
          ++ map rule3 (nonterms ps)
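The three rules and the `lct` function translate almost directly to an untyped Haskell sketch, again without semantic functions. The data types, the string encoding of nonterminals, and the choice to let `rule2` return a list (empty for a production with an empty right-hand side) are assumptions made here; the Agda `rule2` on the slides pattern-matches on `X :: β` only.

```haskell
import Data.List (nub)

-- Original symbols: terminal or nonterminal.
data OSymbol = OT Char | ON String
  deriving (Eq, Show)

-- Transformed nonterminals: n A, or n A − X.
data TNonterminal
  = Plain String
  | Minus String OSymbol
  deriving (Eq, Show)

data TSymbol = TT Char | TN TNonterminal
  deriving (Eq, Show)

type OProduction = (String, [OSymbol])
type TProduction = (TNonterminal, [TSymbol])

-- Nonterminals that occur as a left-hand side.
nonterms :: [OProduction] -> [String]
nonterms ps = nub (map fst ps)

-- Terminals that occur in some right-hand side.
terms :: [OProduction] -> [Char]
terms ps = nub [ c | (_, beta) <- ps, OT c <- beta ]

-- Embed original symbols into the transformed grammar.
lift :: [OSymbol] -> [TSymbol]
lift = map liftSym
  where
    liftSym (OT c) = TT c
    liftSym (ON a) = TN (Plain a)

-- Rule (1): A → b A−b
rule1 :: String -> Char -> TProduction
rule1 a b = (Plain a, [TT b, TN (Minus a (OT b))])

-- Rule (2): from A → X β build C−X → β C−A.
rule2 :: String -> OProduction -> [TProduction]
rule2 c (a, x : beta) = [(Minus c x, lift beta ++ [TN (Minus c (ON a))])]
rule2 _ (_, [])       = []

-- Rule (3): A−A → ε
rule3 :: String -> TProduction
rule3 a = (Minus a (ON a), [])

lct :: [OProduction] -> [TProduction]
lct ps =
     concatMap (\a -> map (rule1 a) (terms ps)) (nonterms ps)
  ++ concatMap (\c -> concatMap (rule2 c) ps) (nonterms ps)
  ++ map rule3 (nonterms ps)

exprGrammar :: [OProduction]
exprGrammar =
  [ ("E", [ON "E", ON "B", ON "N"]), ("E", [ON "N"])
  , ("B", [OT '+']), ("B", [OT '-'])
  , ("N", [OT '0']), ("N", [OT '1'])
  ]
```

Applied to the expression grammar, `lct exprGrammar` yields 33 productions: 12 from rule (1), 18 from rule (2), and 3 from rule (3), which is the size of the transformed grammar in the example transformation.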

slide-23
SLIDE 23

Transforming Productions

Rule (2): A → X β  ⟶  C−X → β C−A

    rule2 : ONonterminal → OProduction → TProduction
    rule2 C (O.prod A (X :: β) sem) =
      T.prod (n C − X)
             (lift β ++ [T.sn (n C − O.sn A)])
             (semtrans C A X β sem)

slide-26
SLIDE 26

Transforming Semantics

Use semantic types as specification of semantic transformation

Semantic transformation

    production:  A → B β        ⟶  C−B → β C−A
    semantics:   ⟦ B β || A ⟧   ⟶  ⟦ β C−A || C−B ⟧

    semtrans : ∀ C A B β → O⟦ O.sn B :: β || A ⟧
             → T⟦ lift β ++ [T.sn (n C − O.sn A)] || n C − O.sn B ⟧
    semtrans C A B β = O.foldSymbols (λ f → f)
                                     (λ f → λ g → f ∘ flip g)
                                     (λ f g → g ∘ f)
                                     β
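To see what the type of `semtrans` demands, it helps to write out one instance by hand. The sketch below does this in Haskell for the production E → E B N with C = E, where the transformed production is E−E → B N E−E. It is a hand-written instance following the types above, not the generic fold, and it assumes `Int` for ℕ; the names `semEBN`, `semtransEBN`, and `semEminusE` are hypothetical, as is the choice of `id` for the ε-production E−E → ε.

```haskell
-- Semantic types of the example grammar, with Int standing in for ℕ.
type SemE = Int
type SemB = Int -> Int -> Int
type SemN = Int

-- Original semantics of E → E B N.
semEBN :: SemE -> SemB -> SemN -> SemE
semEBN e b n = b e n

-- Transformed semantics for E−E → B N E−E.
-- The arguments b and n come from β = B N. The argument g is the result of
-- the trailing E−E nonterminal, of type ⟦E⟧ -> ⟦E⟧. The result must again
-- have the type of E−E, so it waits for the left corner's value e.
semtransEBN :: (SemE -> SemB -> SemN -> SemE)
            -> SemB -> SemN -> (SemE -> SemE) -> (SemE -> SemE)
semtransEBN sem b n g = \e -> g (sem e b n)

-- Assumed semantics of E−E → ε: nothing follows the left corner, so the
-- value is passed through unchanged.
semEminusE :: SemE -> SemE
semEminusE = id
```

For the sentence 1 + 1, `semtransEBN semEBN (+) 1 semEminusE 1` evaluates to `2`: the left corner contributes the first `1`, and the remainder `+ 1` is applied to it afterwards.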

slide-27
SLIDE 27

Correctness

slide-28
SLIDE 28

Correctness Criteria

◮ Correctness of the left-corner transform:
    ◮ Transformed grammar recognizes the same language
    ◮ No addition or removal of ambiguity
      (number of parse trees for each sentence is preserved)
    ◮ Left recursion is removed
◮ What we proved (weaker):
    ◮ Transformed grammar recognizes at least the original language:
      L(G) ⊆ L(G′)

slide-31
SLIDE 31

Concepts Involved in Proof

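[Diagram: the LC transform (rule1a–1c) maps G to G′. A derivation S ⇒G w is traversed in left-corner order (LC traversal), giving the original productions in LC-order. These are transformed and assembled by a top-down traversal into a derivation S ⇒G′ w′, where w′ ≅ w (preserving w). Together this gives language inclusion.]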

slide-32
SLIDE 32

Parse Tree Traversals

Top-down traversal: parent recognized before children
Bottom-up traversal: parent recognized after children
Left-corner traversal: parent recognized after left corner, and before other children

[Figure: tree with nodes numbered in left-corner order: left corner 1, parent 2, other children 3, 4, ...]
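The three traversal orders can be sketched on a simple rose tree. The `Tree` type and the node labels below are hypothetical; the example tree is the parse tree of 1 + 1 under the expression grammar, with primes and indices added only to tell repeated labels apart.

```haskell
-- A rose tree standing in for a parse tree (hypothetical type).
data Tree = Node String [Tree]

-- Parent before all children.
topDown :: Tree -> [String]
topDown (Node x ts) = x : concatMap topDown ts

-- Parent after all children.
bottomUp :: Tree -> [String]
bottomUp (Node x ts) = concatMap bottomUp ts ++ [x]

-- Parent after its first child (the left corner), before the other children.
leftCorner :: Tree -> [String]
leftCorner (Node x [])       = [x]
leftCorner (Node x (t : ts)) = leftCorner t ++ [x] ++ concatMap leftCorner ts

-- Parse tree of "1 + 1" under E → E B N | N, B → +, N → 1.
example :: Tree
example =
  Node "E"
    [ Node "E'" [Node "N1" [Node "1a" []]]
    , Node "B"  [Node "+" []]
    , Node "N2" [Node "1b" []]
    ]
```

Here `leftCorner example` gives `["1a","N1","E'","E","+","B","1b","N2"]`: the first token is visited first, then the chain of nodes above it, and only then the remaining children.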

slide-33
SLIDE 33

Left-Corner Traversal

Sentence: 1 + 1

Original productions:

    E → E B N
    E → N
    N → 1
    B → +
    N → 1

Transformed productions:

    E → 1 E−1
    E−1 → E−N
    E−N → E−E
    E−E → B N E−E
    B → + B−+
    B−+ → B−B
    B−B → ε
    N → 1 N−1
    N−1 → N−N
    N−N → ε
    E−E → ε

slide-45
SLIDE 45

Correctness Proof

Function

    (S ⇒G w) → (S ⇒G′ w′)

is a proof that L(G) ⊆ L(G′)

Proof Outline

◮ traverse S ⇒G w in LC-order
◮ transform productions
◮ add productions to S ⇒G′ w′ in top-down order
◮ show that sentence w is preserved

slide-46
SLIDE 46

Conclusion

slide-47
SLIDE 47

Summary

Contributions

◮ Library for representing grammars and semantic functions, and generating parsers
◮ Implementation of the Left-Corner Transform
◮ Proof of a correctness property of our LCT implementation: L(G) ⊆ L(G′)

Conclusions

◮ Dependent types are a natural fit for representing grammars.
◮ Proofs are possible, but a lot of work.
◮ This is just a start . . .

slide-48
SLIDE 48

Future Work

◮ Other grammar transformations (left factoring, . . . )
◮ Grammar combinators
◮ Proof of non-left-recursion (total parser combinators, Danielsson and Norell)