Rewinding the stack for parsing and pretty printing (Mathieu Boespflug)

slide-1
SLIDE 1

Rewinding the stack for parsing and pretty printing

Mathieu Boespflug

McGill University

26 July 2011

1 / 31

slide-2
SLIDE 2

A little primer on HASKELL

◮ The polymorphic type of lists is written [a].
◮ head of type [a] → a is written head :: [a] → a.
◮ () :: ()
◮ new datatype:

data Maybe a = Nothing | Just a

◮ type synonym:

type Env a = [(Int,a)]

2 / 31

slide-3
SLIDE 3

The Problem

3 / 31

slide-5
SLIDE 5

What is a parser?

newtype P a = P (String → Either Error (a,String))

fail :: Error → Either Error (a,String)
success :: (a,String) → Either Error (a,String)

lit :: String → P ()
lit x = P (λs → case stripPrefix x s of
  Nothing → fail "Parse error."
  Just s′ → success ((),s′))

true :: P Bool
true = P (λs → case stripPrefix "true" s of
  Nothing → fail "Parse error."
  Just s′ → success (True,s′))

false :: P Bool
false = P (λs → case stripPrefix "false" s of
  Nothing → fail "Parse error."
  Just s′ → success (False,s′))
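The definitions on this slide can be turned into a small compilable module. The following is a minimal sketch, not the author's code: it assumes Error is a synonym for String, wraps the parser in a newtype with a hypothetical runP accessor, renames fail to failP to avoid the Prelude name, and factors the three parsers through a hypothetical helper keyword.

```haskell
import Data.List (stripPrefix)

-- Assumption: errors are plain strings.
type Error = String

-- A parser consumes a prefix of the input and returns a value plus the rest.
newtype P a = P { runP :: String -> Either Error (a, String) }

-- Renamed from the slide's 'fail' to avoid clashing with the Prelude.
failP :: Error -> Either Error (a, String)
failP = Left

success :: (a, String) -> Either Error (a, String)
success = Right

-- Hypothetical helper: match a fixed keyword and return a given value.
keyword :: String -> a -> P a
keyword kw v = P (\s -> case stripPrefix kw s of
  Nothing -> failP "Parse error."
  Just s' -> success (v, s'))

lit :: String -> P ()
lit x = keyword x ()

true, false :: P Bool
true  = keyword "true" True
false = keyword "false" False
```

For instance, runP true applied to "true!" succeeds with the value True and the remaining input "!", while runP false on the same input reports a parse error.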

4 / 31

slide-6
SLIDE 6

Combining parsers

◮ Parsers are monadic actions.
◮ Can be built compositionally from existing parser combinators, which are also monadic actions.

(Failure handling through Either Error is elided in the definitions below.)

pure f = P (λs → (f,s))
P m ⊗ P k = P (λs → case m s of
  (f,s′) → case k s′ of
    (x,s′′) → (f x,s′′))

5 / 31

slide-7
SLIDE 7

Example parser

pure :: a → P a
(⊗) :: P (a → b) → P a → P b
(⊕) :: P a → P a → P a

E ::= true | false | if E then E else E

data Tm = Boolean Bool | If Tm Tm Tm

tm :: P Tm
tm = pure Boolean ⊗ true
   ⊕ pure Boolean ⊗ false
   ⊕ pure (λ x y z → If x y z) ⊗ lit "if" ⊗ tm ⊗ lit "then" ⊗ tm ⊗ lit "else" ⊗ tm
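A compilable rendering of this example parser might look as follows. This is a sketch under several assumptions: ⊗ is expressed through the standard Applicative operator <*>, ⊕ becomes a hypothetical left-biased operator <+> that retries on the original input, the () results of lit are dropped explicitly in the lambda, and the keywords include their surrounding spaces because the slide's grammar does not say how whitespace is handled.

```haskell
import Data.List (stripPrefix)

type Error = String  -- assumption: errors are plain strings

newtype P a = P { runP :: String -> Either Error (a, String) }

instance Functor P where
  fmap f (P m) = P (\s -> case m s of
    Left e        -> Left e
    Right (x, s') -> Right (f x, s'))

-- pure and <*> play the role of the slide's pure and (⊗).
instance Applicative P where
  pure x = P (\s -> Right (x, s))
  P m <*> P k = P (\s -> case m s of
    Left e        -> Left e
    Right (f, s') -> case k s' of
      Left e         -> Left e
      Right (x, s'') -> Right (f x, s''))

-- Stand-in for (⊕): try the left parser, fall back to the right one.
infixl 3 <+>
(<+>) :: P a -> P a -> P a
P m <+> P k = P (\s -> either (const (k s)) Right (m s))

lit :: String -> P ()
lit x = P (\s -> maybe (Left "Parse error.") (\s' -> Right ((), s')) (stripPrefix x s))

data Tm = Boolean Bool | If Tm Tm Tm deriving (Eq, Show)

tm :: P Tm
tm =  Boolean True  <$ lit "true"
  <+> Boolean False <$ lit "false"
  <+> (\_ x _ y _ z -> If x y z)
        <$> lit "if " <*> tm <*> lit " then " <*> tm <*> lit " else " <*> tm
```

Running runP tm on "if true then false else true" yields the term If (Boolean True) (Boolean False) (Boolean True) with no input left over.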

6 / 31

slide-8
SLIDE 8

Example Pretty Printer

E ::= true | false | if E then E else E

data Tm = Boolean Bool | If Tm Tm Tm

tm :: Tm → String
tm t = case t of
  Boolean True → "true"
  Boolean False → "false"
  If x y z → "if " ++ tm x ++ " then " ++ tm y ++ " else " ++ tm z

7 / 31

slide-11
SLIDE 11

Objective:

◮ Write the parser once, get the pretty printer for free.
◮ Write the pretty printer once, get the parser for free.

Why?

◮ Synchrony!
◮ Synchrony means easier to maintain.
◮ Synchrony means less code.
◮ Less code means fewer bugs.
◮ Pollack consistency.

How?

◮ Write both at the same time.

8 / 31

slide-12
SLIDE 12

The Solution

9 / 31

slide-13
SLIDE 13

A Cassette

10 / 31

slide-15
SLIDE 15

A Kassette in HASKELL

data K7 a b = K7 {sideA :: a,sideB :: b}

(♦) :: K7 (b → c) (c → b) → K7 (a → b) (b → a) → K7 (a → c) (c → a)
~(K7 f f′) ♦ ~(K7 g g′) = K7 (f ◦ g) (g′ ◦ f′)

11 / 31

slide-16
SLIDE 16

The category of cassettes

Can overload (◦) with (♦):

class Category κ where
  id :: κ a a
  (◦) :: κ b c → κ a b → κ a c

instance Category K7 where
  id = K7 id id
  (◦) = (♦)
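To see the Category instance typecheck, the sketch below uses the class from Control.Category in the base library. It makes one assumption that departs from the record shown on the slides: the two sides are declared as functions in opposite directions (a → b and b → a), since with two unconstrained fields K7 id id would not have the type κ a a. The cassettes plusOne, showRead and both are hypothetical toy examples.

```haskell
import Prelude hiding (id, (.))
import Control.Category

-- Assumption: the sides are functions going in opposite directions.
data K7 a b = K7 { sideA :: a -> b, sideB :: b -> a }

instance Category K7 where
  id = K7 id id
  -- Side A composes forwards, side B composes in the reverse order.
  -- Field accessors keep the composition lazy in both arguments.
  k1 . k2 = K7 (sideA k1 . sideA k2) (sideB k2 . sideB k1)

-- Toy cassette: increment on side A, decrement on side B.
plusOne :: K7 Int Int
plusOne = K7 (+ 1) (subtract 1)

-- Toy cassette: show on side A, read on side B.
showRead :: K7 Int String
showRead = K7 show read

-- Composite cassette: side B undoes side A step by step.
both :: K7 Int String
both = showRead . plusOne
```

Here sideA both maps 41 to "42", and sideB both maps "42" back to 41, which is the reversal of composition order that the definition of (♦) encodes.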

12 / 31

slide-17
SLIDE 17

Sequencing

13 / 31

slide-18
SLIDE 18

A tentative parsing and pretty printing cassette

type PP a = K7 (String → Either Error (a,String)) (??)

14 / 31

slide-19
SLIDE 19

A tentative parsing and pretty printing cassette

type PP a = K7 (String → Either Error (a,String)) ((a,String) → String)

14 / 31

slide-20
SLIDE 20

A tentative parsing and pretty printing cassette

type PP a = K7 (String → Either Error (a,String)) (Either Error (a,String) → String)

14 / 31

slide-21
SLIDE 21

A tentative parsing and pretty printing cassette

type PP a = K7 (String → Either Error (a,String)) ((a,String) → String)

14 / 31

slide-22
SLIDE 22

A tentative parsing and pretty printing cassette

type PP a = K7 (String → Either Error (a,String)) (a → String → String)

14 / 31

slide-24
SLIDE 24

A tentative parsing and pretty printing cassette

type PP a = K7 (String → Either Error (a,String)) (a → String → String)

pure (λx y z → If x y z) :: P (Tm → Tm → Tm → Tm)
pure (λx y z → If x y z) ⊗ tm :: P (Tm → Tm → Tm)

K7 (pure (λx y z → If x y z)) (??)
  :: K7 (String → (Tm → Tm → Tm → Tm,String))
        ((Tm → Tm → Tm → Tm) → String → String)

14 / 31

slide-25
SLIDE 25

A tentative parsing and pretty printing cassette

type PP a = K7 (String → Either Error (a,String)) (a → String → String)

pure (λx y z → If x y z) :: P (Tm → Tm → Tm → Tm)
pure (λx y z → If x y z) ⊗ tm :: P (Tm → Tm → Tm)

K7 (pure (λ(x,y,z) → If x y z)) (??)
  :: K7 (String → ((Tm,Tm,Tm) → Tm,String))
        ((Tm → (Tm,Tm,Tm)) → String → String)

14 / 31

slide-26
SLIDE 26

The problem

To summarize:

◮ Need uncurried functions so that the type to parse and the type to pretty print match.
◮ Can inductively construct curried function type a1 → (a2 → (... → an)).
◮ Uncurried function type (a1,a2,...,an−1) → an cannot be inductively constructed.
◮ Cannot feed arguments to an uncurried function incrementally.
◮ Taking tuples as arguments and returning tuples breaks composability.

15 / 31

slide-27
SLIDE 27

Recovering symmetry with continuation passing style

Type of consumer in CPS: (a1 → ... → an → r) → r
Type of producer in CPS: r → a1 → ... → an → r

16 / 31

slide-30
SLIDE 30

Recovering symmetry with continuation passing style

Type of parser in CPS: (String → a1 → ... → an → r) → String → r
Type of pretty printer in CPS: (String → r) → String → a1 → ... → an → r

The same types with explicit parentheses:

Type of parser in CPS: (String → a1 → ... → an → r) → (String → r)
Type of pretty printer in CPS: (String → r) → (String → a1 → ... → an → r)

◮ Both producer and consumer can be curried!
◮ Complete symmetry.

16 / 31

slide-31
SLIDE 31

Composing parsers in CPS

f :: (String → b → r1) → (String → r1)
g :: (String → a → r2) → (String → r2)
f ◦ g :: (String → a → b → r1) → (String → r1)

Unification constraints: r2 = b → r1.

17 / 31

slide-32
SLIDE 32

Composing pretty printers in CPS (Danvy, 1998)

f′ :: (String → r1) → (String → b → r1)
g′ :: (String → r2) → (String → a → r2)
g′ ◦ f′ :: (String → r1) → (String → a → b → r1)

Unification constraints: r2 = b → r1.
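The two composition rules can be checked on concrete CPS parsers and printers using nothing but ordinary function composition. The sketch below is illustrative only: charP, digitP, charQ and digitQ are hypothetical names, the parsers call error on empty input because no failure handling exists at this stage, and the printers prepend to the output string.

```haskell
import Data.Char (digitToInt, intToDigit)

-- CPS parsers: consume input, pass the rest and the parsed value on.
charP :: (String -> Char -> r) -> (String -> r)
charP k (c : cs) = k cs c
charP _ []       = error "unexpected end of input"

digitP :: (String -> Int -> r) -> (String -> r)
digitP k (c : cs) = k cs (digitToInt c)
digitP _ []       = error "unexpected end of input"

-- CPS printers: take one more argument and prepend its text.
charQ :: (String -> r) -> (String -> Char -> r)
charQ k s c = k (c : s)

digitQ :: (String -> r) -> (String -> Int -> r)
digitQ k s n = k (intToDigit n : s)

-- f . g with f = charP and g = digitP; here r2 is instantiated to Char -> r.
charThenDigit :: (String -> Int -> Char -> r) -> (String -> r)
charThenDigit = charP . digitP

-- g' . f' with f' = charQ and g' = digitQ; the order of composition flips.
printCharThenDigit :: (String -> r) -> (String -> Int -> Char -> r)
printCharThenDigit = digitQ . charQ
```

Parsing "x7" with charThenDigit hands the continuation the digit 7 and the character 'x', and printCharThenDigit id "" 7 'x' rebuilds the string "x7". Both composites take their arguments in the same order, which is the symmetry the CPS types are meant to provide.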

18 / 31

slide-33
SLIDE 33

Putting it all together

K7 f f′ ◦ K7 g g′
  :: K7 ((String → a → b → r) → (String → r))
        ((String → r) → (String → a → b → r))

19 / 31

slide-34
SLIDE 34

{0,1}-parsers and {0,1}-printers

Existentially pack answer type:

type PPP a = ∀r. K7 ((String → a → r) → (String → r))
                    ((String → r) → (String → a → r))
type PPP0 = ∀r. K7 ((String → r) → (String → r))
                   ((String → r) → (String → r))

◮ Not closed under composition!
◮ Compose n-parser with (pure) n-consumer to get 1-parser.
◮ Compose n-printer with (pure) n-producer to get 1-printer.
◮ Parser-consumer and printer-producer composition written using (−→) (alias for (♦), but with lower precedence).

20 / 31

slide-37
SLIDE 37

Example: parsing and printing pairs

lit :: String → PPP0
lit x = K7 (λk s → case stripPrefix x s of Just s′ → k s′)
           (λk s → k (x ++ s))

anyChar :: PPP Char
anyChar = K7 (λk s → k (tail s) (head s))
             (λk s x → k ([x] ++ s))

kpair :: K7 ((String → (a,b) → r) → (String → b → a → r))
            ((String → b → a → r) → (String → (a,b) → r))
kpair = K7 (λk s y x → k s (x,y))
           (λk s (x,y) → k s y x)

pair :: PPP (Char,Char)
pair = lit "(" ◦ anyChar ◦ lit "," ◦ anyChar ◦ lit ")" −→ kpair
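The pair example can be made runnable with a few adjustments. The sketch below is one possible rendering, not the author's implementation: the answer type r is an explicit parameter of the type synonyms rather than universally quantified, (♦) is spelled with the hypothetical ASCII operator <.> and its low-precedence alias with -->, unmatched input calls error since failure is only introduced later in the talk, and parsePair and printPair are hypothetical helpers for playing each side.

```haskell
import Data.List (stripPrefix)

data K7 a b = K7 { sideA :: a, sideB :: b }

-- ASCII stand-in for the diamond operator (hypothetical name).
infixr 9 <.>
(<.>) :: K7 (b -> c) (c -> b) -> K7 (a -> b) (b -> a) -> K7 (a -> c) (c -> a)
K7 f f' <.> K7 g g' = K7 (f . g) (g' . f')

-- Same operation at lower precedence, for the final consumer/producer.
infixr 8 -->
(-->) :: K7 (b -> c) (c -> b) -> K7 (a -> b) (b -> a) -> K7 (a -> c) (c -> a)
(-->) = (<.>)

-- Assumption: the answer type r is an explicit parameter.
type PPP0 r  = K7 ((String -> r) -> (String -> r)) ((String -> r) -> (String -> r))
type PPP r a = K7 ((String -> a -> r) -> (String -> r)) ((String -> r) -> (String -> a -> r))

lit :: String -> PPP0 r
lit x = K7 (\k s -> case stripPrefix x s of
              Just s' -> k s'
              Nothing -> error ("expected " ++ show x))  -- no recovery yet
           (\k s -> k (x ++ s))

anyChar :: PPP r Char
anyChar = K7 (\k s -> case s of
                c : cs -> k cs c
                []     -> error "unexpected end of input")
             (\k s x -> k (x : s))

kpair :: K7 ((String -> (a, b) -> r) -> (String -> b -> a -> r))
            ((String -> b -> a -> r) -> (String -> (a, b) -> r))
kpair = K7 (\k s y x -> k s (x, y)) (\k s (x, y) -> k s y x)

pair :: PPP r (Char, Char)
pair = lit "(" <.> anyChar <.> lit "," <.> anyChar <.> lit ")" --> kpair

-- Play side A: the final continuation ignores leftover input.
parsePair :: String -> (Char, Char)
parsePair = sideA pair (\_ p -> p)

-- Play side B: start from the empty string and return what was built.
printPair :: (Char, Char) -> String
printPair = sideB pair id ""
```

With these definitions parsePair turns "(a,b)" into ('a','b') and printPair turns ('x','y') into "(x,y)", both obtained from the single description pair.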

21 / 31

slide-38
SLIDE 38

Choice

22 / 31

slide-40
SLIDE 40

Choice for parsing/printing algebraic datatypes

◮ Need to add throwing and catching exceptions as a side effect:
  1. abort on malformed input.
  2. backtrack to the last choice point on parsing/printing failure.
◮ Can model exceptions through the exception monad.
◮ Parsing is a monad.
  −→ can compose monads to compose effects.
◮ Printing is not a monad.
  −→ cannot compose monads to compose effects.
◮ Answer type must be polymorphic, so it cannot be lifted to a monadic type:

f :: (String → b → m r1) → (String → m r1)
g :: (String → a → m r2) → (String → m r2)
f ◦ g :: ??

Unsatisfiable unification constraint: m r2 = b → m r1.

23 / 31

slide-41
SLIDE 41

Solution: CPS transform a second time!

◮ Obtain 2-CPS 1-parser and 1-printer. Types:

K7 ((String → a → (r → t) → t) → String → (r → t) → t)
   ((String → (r → t) → t) → String → a → (r → t) → t)

◮ Now have a continuation and a meta-continuation.
◮ Pass continuation, meta-continuation first and make meta-continuation constant:

K7 ((t → String → a → t) → t → String → t)
   ((t → String → t) → t → String → a → t)

◮ Cannot be composed! Infinite type during unification: t = a → t.

24 / 31

slide-42
SLIDE 42

Solution: CPS transform a second time!

◮ Must weaken meta-continuation argument of continuation of parser.
◮ Conversely, must strengthen meta-continuation argument of continuation of printer.
◮ Obtained type:

K7 (((a → t) → String → a → t) → (t → String → t))
   ((t → String → t) → ((a → t) → String → a → t))

◮ Composition of cassettes is still pairwise functional composition of components, as before.

25 / 31

slide-43
SLIDE 43

The choice combinator

type PPP a = ∀t. K7 (((a → t) → String → a → t) → (t → String → t))
                    ((t → String → t) → ((a → t) → String → a → t))

(⊕) :: PPP a → PPP a → PPP a
K7 f f′ ⊕ K7 g g′ = K7 (λk k′ s → f k (g k k′ s) s)
                       (λk k′ s x → f′ k (g′ k k′ s) s x)

◮ Reset meta-continuation (aka failure continuation) of f, f′.
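The choice combinator can be exercised on a small example once a primitive that may fail on both sides is available. The following sketch makes several assumptions: the answer type t is an explicit parameter of the synonym PP instead of being quantified, which avoids language extensions; (⊕) is spelled <+>; constant is a hypothetical primitive that parses a keyword into a fixed value and prints that value back as the keyword; and parse and pretty follow the shape of the playing functions shown later in the talk.

```haskell
import Data.List (stripPrefix)

data K7 a b = K7 { sideA :: a, sideB :: b }

-- A continuation that receives its failure continuation first.
type C r = r -> String -> r

-- 1-parser / 1-printer; assumption: t is an explicit parameter.
type PP t a = K7 (C (a -> t) -> C t) (C t -> C (a -> t))

-- Hypothetical primitive: a keyword standing for one specific value.
constant :: Eq a => String -> a -> PP t a
constant kw v = K7
  (\k k' s -> case stripPrefix kw s of
      Just s' -> k (const k') s' v   -- push v; a later failure drops it
      Nothing -> k')                 -- no match: take the failure path
  (\k k' s x -> if x == v
      then k (k' x) (kw ++ s)        -- emit kw; a later failure restores x
      else k' x)                     -- wrong value: fail, handing x back

-- Choice: run the left side with "try the right side" as failure path.
infixl 3 <+>
(<+>) :: PP t a -> PP t a -> PP t a
K7 f f' <+> K7 g g' = K7
  (\k k' s   -> f  k (g  k k' s) s)
  (\k k' s x -> f' k (g' k k' s) s x)

parse :: PP (Maybe a) a -> String -> Maybe a
parse csst = sideA csst (\_ _ x -> Just x) Nothing

pretty :: PP (Maybe String) a -> a -> Maybe String
pretty csst = sideB csst (\_ s -> Just s) (const Nothing) ""

bool :: PP t Bool
bool = constant "true" True <+> constant "false" False
```

Parsing "false" with bool first fails on the "true" alternative and then succeeds with False, and printing False first fails on the True alternative, receives the value back through the failure continuation, and then succeeds with "false".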

26 / 31

slide-45
SLIDE 45

Example: repeating cassettes

kcons = K7 (λk k′ s xs x → k (const (k′ xs x)) s (x : xs))
           (λk k′ s xs → case xs of
              x : xs′ → k (λ_ _ → k′ xs) s xs′ x
              _ → k′ xs)

knil = K7 (λk k′ s → k (const k′) s [])
          (λk k′ s xs → case xs of
             [] → k (k′ xs) s
             _ → k′ xs)

many :: PPP a → PPP [a]
many ppp = (ppp ◦ many ppp −→ kcons) ⊕ knil

◮ many is a derived combinator.
◮ Need lazy semantics to avoid non-termination.
◮ Essential use of answer type polymorphism.

27 / 31

slide-46
SLIDE 46

Playing cassettes

play :: (K7 a b → c) → K7 a b → c
play f csst = f csst

parse :: PPP a → String → Maybe a
parse csst = play sideA csst (λ_ _ x → Just x) Nothing

pretty :: PPP a → a → Maybe String
pretty csst = play sideB csst (λ_ s → Just s) (const Nothing) ""

28 / 31

slide-47
SLIDE 47

Conclusion

29 / 31

slide-48
SLIDE 48

Literature

◮ “Functional unparsing” (Danvy, 1998)
  −→ CPS, only printf, no ADTs.
◮ “There and back again” (Alimarine et al., 2005)
  −→ arrows, needs binary encoding of alternatives, arrows must respect isomorphism laws.
◮ “Invertible Syntax Descriptions: Unifying Parsing and Pretty Printing” (Rendel and Ostermann, 2010)
  −→ applicative functor but not quite, packs all arguments in nested tuples.

30 / 31

slide-49
SLIDE 49

Future work

◮ Fix order of arguments.
◮ Implementation in direct style.
◮ Port all Parsec combinators to cassette framework.
◮ Study initial vs final.

31 / 31