Run your Research: On the Effectiveness of Lightweight Mechanization

SLIDE 1

Run your Research

On the Effectiveness of Lightweight Mechanization

C Klein, J Clements, C Dimoulas, C Eastlund, M Felleisen, M Flatt, J A McCarthy, J Rafkind, S Tobin-Hochstadt, R B Findler

SLIDE 2

The Koala, the Orangutan, and the Walrus

ftp> user anonymous
331 Guest login ok
Password:
230-Welcome to λ.com

int main () {

One day, Koala decided to build an FTP server

SLIDE 3

The Koala, the Orangutan, and the Walrus

230-Welcome to λ.com
int main () { if (!(q = 0)) *((int*)p)=12; }

and made the unfortunate choice to use the programming language C.

SLIDE 4

The Koala, the Orangutan, and the Walrus

230-Welcome to λ.com
int main () { if (!(q = 0)) *((int*)p)=12; }

We must not be surprised by this choice, however, as C is well known to be a programming language that is effective for building systems software.

SLIDE 5

The Koala, the Orangutan, and the Walrus

230-Welcome to λ.com
int main () { if (!(q = 0)) *((int*)p)=12; }

After a few months of effort, Koala produced a functioning server that was rapidly adopted across the internet and widely used.

SLIDE 6

The Koala, the Orangutan, and the Walrus

230-Welcome to λ.com
int main () { if (!(q = 0)) *((int*)p)=12; }

One day, Orangutan decided to apply a new, automated testing technique to Koala’s FTP server and, sure enough, found multiple bugs,

SLIDE 7

The Koala, the Orangutan, and the Walrus

230-Welcome to λ.com
int main () { if (!(q = 0)) *((int*)p)=12; }
p == 0 ∨ *p == *q

unsurprising for software of that complexity implemented in a programming language like C. After all, C is designed for performance and provides no help in maintaining invariants of data structures or in detecting errors early, when they are easy to fix.

SLIDE 8

and the Walrus the Orangutan, The Koala,

}p == 0 *p == *q \[\Gamma\ \vdash\ (\lambda x:\tau_2.e) : \tau_1\rightarrow \tau_2 \]

So, Orangutan decided to write a paper that explained the mathematical techniques it used to uncover the bugs and made the unfortunate choice to use the programming language LaTeX. 8
slide-9
SLIDE 9

The Koala, the Orangutan, and the Walrus

p == 0 ∨ *p == *q
\[\Gamma\ \vdash\ (\lambda x:\tau_2.e) : \tau_1\rightarrow \tau_2 \]

We must not be surprised by this choice, however, as LaTeX is well known to be a programming language that is effective for typesetting mathematical formulas.

SLIDE 10

The Koala, the Orangutan, and the Walrus

p == 0 ∨ *p == *q
\[\Gamma\ \vdash\ (\lambda x:\tau_2.e) : \tau_1\rightarrow \tau_2 \]

After a few months of effort, Orangutan produced a paper extolling the virtues of its new techniques; the ideas were adopted across the software engineering community, and the paper was widely cited.

SLIDE 11

The Koala, the Orangutan, and the Walrus

p == 0 ∨ *p == *q
\[\Gamma\ \vdash\ (\lambda x:\tau_2.e) : \tau_1\rightarrow \tau_2 \]

One day, Walrus decided to apply a new, lightweight mechanized metatheory technique to Orangutan’s paper and, sure enough, found multiple bugs,

SLIDE 12

The Koala, the Orangutan, and the Walrus

p == 0 ∨ *p == *q
\[\Gamma\ \vdash\ (\lambda x:\tau_2.e) : \tau_1\rightarrow \tau_2 \]

unsurprising for a piece of mathematics of that complexity implemented in a programming language like LaTeX. After all, LaTeX is designed for beautiful output and provides no help in checking invariants of mathematical formulas or in running examples to ensure they illustrate the intended points.

SLIDE 13

Moral: bugs are everywhere

SLIDE 14

A niche for mechanized metatheory:

  • lightweight: high level of expressiveness (think scripting language)
  • supports the entire semantics lifecycle: Write-up, Robust model, Prototype model

SLIDE 15

The Semantics Lifecycle

Write-up, Robust model, Prototype model

SLIDE 16

The Semantics Lifecycle

Write-up, Robust model, Prototype model

  • misrenamed non-terminal

SLIDE 17

The Semantics Lifecycle

Write-up, Robust model, Prototype model

  • misrenamed non-terminal
  • forgot typing rule

SLIDE 18

The Semantics Lifecycle

Write-up, Robust model, Prototype model

  • misrenamed non-terminal
  • forgot typing rule
  • lost a case in a helper function

SLIDE 19

The Semantics Lifecycle

Write-up, Robust model, Prototype model

  • misrenamed non-terminal
  • forgot typing rule
  • lost a case in a helper function
  • added a case to wrong fn

SLIDE 20

The Semantics Lifecycle

Write-up, Robust model, Prototype model

  • misrenamed non-terminal
  • forgot typing rule
  • lost a case in a helper function
  • added a case to wrong fn
  • swapped args

SLIDE 21

The Semantics Lifecycle

Write-up, Robust model, Prototype model

  • misrenamed non-terminal
  • forgot typing rule
  • lost a case in a helper function
  • added a case to wrong fn
  • swapped args
  • misused the inductive hyp.

SLIDE 22

The Semantics Lifecycle

Write-up, Robust model, Prototype model

  • misrenamed non-terminal
  • forgot typing rule
  • lost a case in a helper function
  • added a case to wrong fn
  • swapped args
  • misused the inductive hyp.
  • didn’t recheck a lemma

SLIDE 23

The Semantics Lifecycle

Write-up, Robust model, Prototype model

  • misrenamed non-terminal
  • forgot typing rule
  • lost a case in a helper function
  • added a case to wrong fn
  • swapped args
  • misused the inductive hyp.
  • didn’t recheck a lemma
  • transcribed math wrong

SLIDE 24

The Semantics Lifecycle

Write-up, Robust model, Prototype model

  • misrenamed non-terminal
  • forgot typing rule
  • lost a case in a helper function
  • added a case to wrong fn
  • swapped args
  • misused the inductive hyp.
  • didn’t recheck a lemma
  • transcribed math wrong
  • forgot to recheck example

SLIDE 25

Redex

  • Our tool designed to fill this niche

SLIDE 26

Our study:

  • Can random testing find bugs in an existing, well-tested Redex model?
  • Can Redex find bugs in published papers?

SLIDE 27

Our study:

  • Can random testing find bugs in an existing, well-tested Redex model? Yes
  • Can Redex find bugs in published papers? Yes

SLIDE 28

  • 10 papers in Redex
  • 9 ICFP ’09 papers
  • 8 written by others
  • 2 mechanically verified

SLIDE 29

10 papers with errors

  • 10 papers in Redex
  • 9 ICFP ’09 papers
  • 8 written by others
  • 2 mechanically verified

SLIDE 30

Your papers have errors too

SLIDE 31

Copy & Paste Typesetting Error:

SLIDE 32

Copy & Paste Typesetting Error:

SLIDE 33

Copy & Paste Typesetting Error:

Typesetting should be automatic

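One way to read "typesetting should be automatic" is to keep a single machine-readable definition of the grammar and generate the typeset form from it, so nothing is copied by hand. The sketch below is a minimal Python illustration of that idea, not Redex itself; the `GRAMMAR` table and the `render_bnf` helper are hypothetical names, and the three productions are borrowed from the grammar shown near the end of this talk.

```python
# Hypothetical single source of truth: non-terminal -> list of alternatives.
# A model and its typeset form would both be derived from this one table.
GRAMMAR = {
    "p": ["(e ...)"],
    "P": ["(e ... E e ...)"],
    "E": ["(v ... E e ...)", "(+ v ... E e ...)", "[]"],
}

def render_bnf(grammar):
    """Render each production as one 'nt ::= alt | alt' line."""
    return "\n".join(
        f"{nt} ::= {' | '.join(alts)}" for nt, alts in grammar.items()
    )

print(render_bnf(GRAMMAR))
```

Because the rendered text is computed from the same table the model would use, a renamed non-terminal or an added alternative shows up in the write-up without any manual copying.
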
SLIDE 34

Erroneous Example:

SLIDE 35

Erroneous Example:

SLIDE 36

Erroneous Example:

SLIDE 37

Erroneous Example:

Examples can be tested

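The point that examples can be tested can be sketched as a table of (term, claimed result) pairs checked against an executable model. The Python below is only an illustration under stated assumptions: `evaluate` is a hypothetical toy evaluator for nested `+` terms, and the `EXAMPLES` table, including its deliberately wrong last entry, is invented for the demonstration rather than taken from any paper.

```python
def evaluate(term):
    """Evaluate a toy term: an int, or a tuple ("+", arg, ...)."""
    if isinstance(term, int):
        return term
    op, *args = term
    if op != "+":
        raise ValueError(f"unknown operator: {op!r}")
    return sum(evaluate(a) for a in args)

# Hypothetical examples as they might appear in a write-up: (term, claimed result).
EXAMPLES = [
    (("+", 1, 2), 3),
    (("+", ("+", 1, 2), 3), 6),
    (("+", 2, 2), 5),  # deliberately wrong claim, to show what a failure looks like
]

def check_examples(examples):
    """Return (term, claimed, actual) for every example the model disagrees with."""
    failures = []
    for term, claimed in examples:
        actual = evaluate(term)
        if actual != claimed:
            failures.append((term, claimed, actual))
    return failures

for term, claimed, actual in check_examples(EXAMPLES):
    print(f"example {term} claims {claimed} but the model gives {actual}")
```

Rerunning such a check whenever the model changes addresses the "forgot to recheck example" failure mode listed on the lifecycle slides.
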
SLIDE 38

Unexpected Behavior:

select(c, c)

SLIDE 39

Unexpected Behavior:

compile
select(c, c)
⊙c | ~ select(c, c)

SLIDE 40

Unexpected Behavior:

compile
select(c, c) – stuck
⊙c | ~ select(c, c) – loops forever

Deadlock in source but busy waiting in target

SLIDE 41

Unexpected Behavior:

compile
select(c, c) – stuck
⊙c | ~ select(c, c) – loops forever

Deadlock in source but busy waiting in target

Found this by playing with examples

SLIDE 42

False Theorem: If a term reduces with a memo store, then the program without the memo store reduces the same way

SLIDE 43

False Theorem: If a term reduces with a memo store, then the program without the memo store reduces the same way

Counterexample: If σ = {(δ,1) → 2} then (λδ x. x) 1, σ ⇒* 2, σ, but (λδ x. x) 1 ↦ 1

Not a fly-by-night proof; 12 typeset pages in a dissertation chapter

SLIDE 44

False Theorem: If a term reduces with a memo store, then the program without the memo store reduces the same way

Counterexample: If σ = {(δ,1) → 2} then (λδ x. x) 1, σ ⇒* 2, σ, but (λδ x. x) 1 ↦ 1

Not a fly-by-night proof; 12 typeset pages in a dissertation chapter

Random testing easily finds this

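Why random testing finds this kind of counterexample quickly can be seen in a small Python sketch. It is a simplification made up for illustration, not the calculus from the dissertation and not Redex: `run_plain` and `run_memo` are hypothetical stand-ins for the two semantics of the single term (λδ x. x) applied to a number, and `random_store` generates arbitrary stores with no consistency requirement, which is exactly the situation the slide's σ = {(δ,1) → 2} exploits.

```python
import random

TAG = "δ"  # the memo tag carried by the function (λδ x. x)

def run_plain(arg):
    """(λδ x. x) arg without a memo store: plain identity."""
    return arg

def run_memo(arg, store):
    """The same application with a memo store: a hit returns the stored result."""
    return store.get((TAG, arg), arg)

def random_store(rng):
    """An arbitrary store; nothing forces its entries to match the function."""
    size = rng.randint(0, 3)
    return {(TAG, rng.randint(0, 3)): rng.randint(0, 3) for _ in range(size)}

def find_counterexample(trials=1000, seed=0):
    """Search for an (arg, store) pair on which the two semantics disagree."""
    rng = random.Random(seed)
    for _ in range(trials):
        arg, store = rng.randint(0, 3), random_store(rng)
        if run_memo(arg, store) != run_plain(arg):
            return arg, store
    return None

print(find_counterexample())
```

Any store that maps the argument to a different number falsifies the claimed property, so a generator that produces unconstrained stores hits one after only a few attempts.
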
SLIDE 45

Recap:

  • Automatic typesetting
  • Unit Testing
  • Exploring Examples
  • Random testing
SLIDE 46

p ::= (e ...)
e ::= (e e ...) | (λ (x:t ...) e) | x | (+ e ...) | number | (amb e ...)
t ::= (→ t ... t) | num
P ::= (e ... E e ...)
E ::= (v ... E e ...) | (+ v ... E e ...) | []
v ::= (λ (x:t ...) e) | number
Γ ::= · | (x : t Γ)

P[((λ (x:t ...1) e) v ...1)]  →  P[e{x:=v ...}]   [βv]
P[(+ number1 ...)]  →  P[Σ[[number1, ...]]]   [+]
(e1 ... E[(amb e2 ...)] e3 ...)  →  (e1 ... E[e2] ... e3 ...)   [amb]

Γ ⊢ e1 : (→ t2 ... t3)    Γ ⊢ e2 : t2 ...
-----------------------------------------
Γ ⊢ (e1 e2 ...) : t3

(x1 : t1 Γ) ⊢ (λ (x2:t2 ...) e) : (→ t2 ... t)
----------------------------------------------
Γ ⊢ (λ (x1:t1 x2:t2 ...) e) : (→ t1 t2 ... t)

Γ ⊢ e : t
----------------------
Γ ⊢ (λ () e) : (→ t)

(x : t Γ) ⊢ x : t

Γ ⊢ x1 : t1    x1 ≠ x2
------------------------
(x2 : t2 Γ) ⊢ x1 : t1

Γ ⊢ e : num ...
----------------------
Γ ⊢ (+ e ...) : num

Γ ⊢ number : num

Γ ⊢ e : num ...
----------------------
Γ ⊢ (amb e ...) : num

SLIDE 47

p ::= (e ...)
e ::= (e e ...) | (λ (x:t ...) e) | x | (+ e ...) | number | (amb e ...)
t ::= (→ t ... t) | num
P ::= (e ... E e ...)
E ::= (v ... E e ...) | (+ v ... E e ...) | []
v ::= (λ (x:t ...) e) | number
Γ ::= · | (x : t Γ)

P[((λ (x:t ...1) e) v ...1)]  →  P[e{x:=v ...}]   [βv]
P[(+ number1 ...)]  →  P[Σ[[number1, ...]]]   [+]
(e1 ... E[(amb e2 ...)] e3 ...)  →  (e1 ... E[e2] ... e3 ...)   [amb]

Γ ⊢ e1 : (→ t2 ... t3)    Γ ⊢ e2 : t2 ...
-----------------------------------------
Γ ⊢ (e1 e2 ...) : t3

(x1 : t1 Γ) ⊢ (λ (x2:t2 ...) e) : (→ t2 ... t)
----------------------------------------------
Γ ⊢ (λ (x1:t1 x2:t2 ...) e) : (→ t1 t2 ... t)

Γ ⊢ e : t
----------------------
Γ ⊢ (λ () e) : (→ t)

(x : t Γ) ⊢ x : t

Γ ⊢ x1 : t1    x1 ≠ x2
------------------------
(x2 : t2 Γ) ⊢ x1 : t1

Γ ⊢ e : num ...
----------------------
Γ ⊢ (+ e ...) : num

Γ ⊢ number : num

Γ ⊢ e : num ...
----------------------
Γ ⊢ (amb e ...) : num

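Exploring examples amounts to running a term through the reduction rules and looking at everything it can reach. The Python sketch below does this for a cut-down version of the language above, under stated assumptions: it keeps only numbers, `+`, and `amb` (no λ or types), represents a program as a tuple of expressions, and the helper names `step_expr`, `successors`, and `explore` are hypothetical. It follows the [+] and [amb] rules as written on the slide, where an `amb` in evaluation position replaces its expression with one copy per alternative.

```python
def step_expr(e):
    """Return the expressions that replace e after one step, or None if e is a value."""
    if isinstance(e, int):
        return None
    op, *args = e
    if op == "amb":
        return list(args)  # [amb]: one expression per alternative
    # op == "+": reduce the leftmost non-value argument, context (+ v ... E e ...)
    for i, a in enumerate(args):
        if not isinstance(a, int):
            return [(op, *args[:i], r, *args[i + 1:]) for r in step_expr(a)]
    return [sum(args)]  # [+]: every argument is already a number

def successors(program):
    """All programs reachable in one step, one per reducible expression."""
    out = []
    for i, e in enumerate(program):
        replaced = step_expr(e)
        if replaced is not None:
            out.append(program[:i] + tuple(replaced) + program[i + 1:])
    return out

def explore(program):
    """Return (all reachable programs, the ones that cannot step further)."""
    seen, frontier, finals = {program}, [program], set()
    while frontier:
        current = frontier.pop()
        nxt = successors(current)
        if not nxt:
            finals.add(current)
        for q in nxt:
            if q not in seen:
                seen.add(q)
                frontier.append(q)
    return seen, finals

start = (("+", ("amb", 1, 2), ("amb", 10, 20)),)
reachable, finals = explore(start)
print(finals)  # {(11, 21, 12, 22)}
```

Inspecting `reachable` as well as `finals` is what surfaces surprises such as stuck states or loops, which is how the deadlock versus busy-waiting mismatch earlier in the talk was noticed.
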
SLIDE 48

Recap:

  ✓ Automatic typesetting
  ✓ Unit Testing
  ✓ Exploring Examples
  ✓ Random testing

SLIDE 49

Takeaways:

  • Nobody will produce error-free papers
  • Errors introduce friction into our communication
  • Redex can help reduce the errors, with about as much effort as LaTeX requires

SLIDE 50

Thank you.
