SLIDE 1

Mechanising Hankin and Barendregt using the Gordon-Melham axioms

Michael Norrish

Michael.Norrish@nicta.com.au

Merlin’03: Mechanising Hankin and Barendregt using the Gordon-Melham axioms – p.1

SLIDE 2

Motivation & Outline

To investigate the utility of Gordon & Melham’s approach to handling terms identified up to α-equivalence.

Strategy: mechanise a substantial piece of existing theory

  • Hankin, Lambda calculi: a guide for computer scientists. Chapter 2 (basic equational theory), Chapter 3 (reduction).

  • Barendregt, The lambda calculus: its syntax and semantics. Chapter 11 (residuals, finite-ness of developments, standardisation theorem), except for §11.3 (conservation theorem for λI).

SLIDE 3

Why the λ-calculus?

Lots of existing theory (no need for me to be creative).

Replaying the theory requires:

  • development of three “languages”:
      • basic untyped λ-calculus, Λ;
      • λ-calculus with labelled redexes (two sorts of binder), Λ′;
      • Λ′ with weighted variables, Λ′∗
  • definition of functions/relations over these languages
  • many proofs

All this provides quite a work-out for any mechanised technique.

SLIDE 4

The Gordon-Melham approach

Provides a type of terms (term, also Λ) identified up to α-conversion

  • constructors: VAR, CON, LAM, @@.
  • constants:
        FV : term → (string)set
        [ _ / _ ] _ : term → string → term → term
  • “axioms” about them

. . . and it’s all done definitionally on top of core HOL.

SLIDE 5

GM Axioms 1–4

  • 1. specifies the behaviour of the FV constant over the constructors of term.

  • 2. specifies substitution, in particular

        [M/v](LAM v N) = LAM v N
        u ≠ v ∧ u ∉ FV(M) ⇒ [M/v](LAM u N) = LAM u ([M/v]N)

  • 3. α-conversion

        u ∉ FV(LAM v M) ⇒ LAM v M = LAM u ([VAR(u)/v]M)

  • 4. Unique iteration, allowing derivation of an induction principle, and the definition of new functions over term
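The behaviour that axioms 1 to 3 describe can be illustrated with a small Python sketch over raw named syntax. This is not the HOL construction: here α-equivalent terms are distinct values, so a separate `alpha_eq` check stands in for the equality of the GM type. The tuple encoding, the helper `fresh`, and the `z0, z1, ...` naming scheme are all assumptions made for illustration.

```python
from itertools import count

# Raw terms as tuples: ("VAR", s), ("CON", k), ("APP", t, u), ("LAM", v, t).

def fv(t):
    """Free variables, following the shape of GM axiom 1."""
    if t[0] == "VAR":
        return {t[1]}
    if t[0] == "CON":
        return set()
    if t[0] == "APP":
        return fv(t[1]) | fv(t[2])
    return fv(t[2]) - {t[1]}              # LAM v body

def fresh(avoid):
    """Hypothetical helper: first name z0, z1, ... not in `avoid`."""
    return next(f"z{i}" for i in count() if f"z{i}" not in avoid)

def subst(m, v, t):
    """[m/v]t, capture-avoiding."""
    if t[0] == "VAR":
        return m if t[1] == v else t
    if t[0] == "CON":
        return t
    if t[0] == "APP":
        return ("APP", subst(m, v, t[1]), subst(m, v, t[2]))
    u, body = t[1], t[2]
    if u == v:                            # [M/v](LAM v N) = LAM v N
        return t
    if u in fv(m):                        # side condition fails: rename binder first
        z = fresh(fv(m) | fv(body) | {v})
        u, body = z, subst(("VAR", z), u, body)
    return ("LAM", u, subst(m, v, body))  # u != v and u not in FV(m)

def alpha_eq(s, t):
    """Equality up to renaming of bound variables."""
    if s[0] != t[0]:
        return False
    if s[0] in ("VAR", "CON"):
        return s[1] == t[1]
    if s[0] == "APP":
        return alpha_eq(s[1], t[1]) and alpha_eq(s[2], t[2])
    z = ("VAR", fresh(fv(s[2]) | fv(t[2])))
    return alpha_eq(subst(z, s[1], s[2]), subst(z, t[1], t[2]))

# Axiom 3 instance: w is not free in LAM x (x y), so renaming x to w is harmless.
lam_x = ("LAM", "x", ("APP", ("VAR", "x"), ("VAR", "y")))
lam_w = ("LAM", "w", subst(("VAR", "w"), "x", lam_x[2]))
assert alpha_eq(lam_x, lam_w)
```

In the GM type the final assertion would be a plain equation between terms; the renaming branch in `subst` is what the side condition of axiom 2 lets a proof avoid.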

SLIDE 6

GM Axiom 5

Abstraction terms are in bijection with HOL functions of a certain form:

    LAM v M = ABS(λy. [VAR(y)/v]M)

Those functions (of type string → term) that generate LAM terms could be the basis for a Higher Order Abstract Syntax (using ABS instead of LAM).

SLIDE 7

The Induction Principle

A consequence of GM Axiom 4:

    (∀k. P(CON(k))) ∧
    (∀s. P(VAR(s))) ∧
    (∀t, u. P(t) ∧ P(u) ⇒ P(t @@ u)) ∧
    (∀x, t. (∀y. P([VAR(y)/x]t)) ⇒ P(LAM x t))
    ⇒ ∀t. P(t)

It’s straightforward to define size : term → ℕ (for Λ and later types), so I also induct on the size of terms when this is easier.

SLIDE 8

Hankin’s Chapter 2: basics

First important result is the Substitution Lemma:

    x ≠ y ∧ x ∉ FV(L) ⇒ [L/y]([N/x]M) = [[L/y]N/x]([L/y]M)

Easy induction. Later found that I needed this variant:

    z ≠ y ∧ z ∉ FV(M) ∧ z ∉ FV(L) ⇒ [L/y]([N/x]M) = [[L/y]N/z]([L/y]([z/x]M))

More general, as it can be applied left-to-right anywhere; just pick a suitably fresh z. Slightly harder induction.
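A worked instance may help to see why the side condition on x matters. The terms below are chosen purely for illustration (a is assumed to be a variable distinct from x and y); the first pair satisfies the hypotheses of the Substitution Lemma, the second violates x ∉ FV(L).

```latex
% Hypotheses hold: M = x y, N = y, L = \lambda w. w (closed, so x \notin FV(L))
\begin{align*}
[L/y]([N/x]M) &= [L/y]([y/x](x\,y)) = [L/y](y\,y) = L\,L \\
[[L/y]N/x]([L/y]M) &= [L/x](x\,L) = L\,L
\end{align*}
% Hypothesis x \notin FV(L) fails: M = y, N = a, L = x
\begin{align*}
[L/y]([N/x]M) &= [x/y]([a/x]y) = [x/y]y = x \\
[[L/y]N/x]([L/y]M) &= [a/x]([x/y]y) = [a/x]x = a
\end{align*}
```

In the second instance the two sides differ (x versus a), so the lemma cannot be stated without the freshness condition.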

SLIDE 9

Hankin’s Chapter 2: equational theory

Hankin presents

    λ ⊢ (λv. M)N = [N/v]M

    λ ⊢ M = M′
    ──────────────────
    λ ⊢ M N = M′ N

    λ ⊢ M = M′
    ──────────────────────────
    λ ⊢ (λv. M) = (λv. M′)

Mechanised in HOL, this is a simple inductive relation:

    ...
    ∧ (∀M M′ v. M lameq M′ ⇒ LAM v M lameq LAM v M′)
    ∧ ...

Term incompatibility is also easy to mechanise.

SLIDE 10

Hankin’s Chapter 3: reduction

  • General properties of reduction relations.
  • β-reduction is Church-Rosser, using notion of “grand reduction” (↠₁) (gives soundness of equational theory)
  • Newman’s Lemma (Weak Church-Rosser + Strong Normalisation ⇒ CR)
  • Hindley-Rosen Lemma
  • CR for η- and βη-reduction (sketched)
  • δ-rules and Mitschke’s theorem (sketched)
  • Residuals and standardisation (sketched)

SLIDE 11

Chapter 3: 1st Encounter with the BVC

Proving substitutivity of reduction relations, e.g.:

    M →β M′ ⇒ [N/x]M →β [N/x]M′

Proof by rule induction over →β. In abstraction case:

  • Ind. hyp.: ∀N, x. [N/x]M →β [N/x]M′
  • To show: [N/x](LAM v M) →β [N/x](LAM v M′)

With BVC, assume x ≠ v and v ∉ FV(N); push substitution through LAM; apply inductive hypothesis; apply congruence rule; done.

SLIDE 12

Chapter 3: 1st Encounter with the BVC

In the abstraction case of the same rule induction over →β:

  • Ind. hyp.: ∀N, x. [N/x]M →β [N/x]M′
  • To show: [N/x](LAM v M) →β [N/x](LAM v M′)

Without BVC, must instead α-convert abstraction to LAM z ([VAR(z)/v]M), with z fresh. Then result of substitution is LAM z ([N/x]([VAR(z)/v]M)) and inductive hypothesis does not apply.

SLIDE 13

Instead of BVC, use iterated substitution

Previous proof failed because α-conversion produced two substitutions over base term. Strengthen statement to encompass this using iterated substitution,

    ISUB : term → (term × string)list → term

Theorem to be proved becomes

    M →β M′ ⇒ (M ISUB S) →β (M′ ISUB S)

Inductive hypothesis for abstraction case is then

    ∀S. (M ISUB S) →β (M′ ISUB S)

The universal quantification of S then copes with goal including term of form LAM z (([z/v]M) ISUB S)
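Iterated substitution can be read as a left fold of single substitutions over the list. The Python sketch below works on raw named terms (tuples), not on the GM type, and the order of application (head of the list first) is an assumption, chosen so that ([z/v]M) ISUB S coincides with M ISUB ((z, v) :: S). The names `isub`, `subst`, and `fv` are illustrative.

```python
from functools import reduce
from itertools import count

# Raw terms: ("VAR", s), ("CON", k), ("APP", t, u), ("LAM", v, t).

def fv(t):
    if t[0] == "VAR":
        return {t[1]}
    if t[0] == "CON":
        return set()
    if t[0] == "APP":
        return fv(t[1]) | fv(t[2])
    return fv(t[2]) - {t[1]}

def subst(m, v, t):
    """[m/v]t, renaming a binder only when it would capture."""
    if t[0] == "VAR":
        return m if t[1] == v else t
    if t[0] == "CON":
        return t
    if t[0] == "APP":
        return ("APP", subst(m, v, t[1]), subst(m, v, t[2]))
    u, body = t[1], t[2]
    if u == v:
        return t
    if u in fv(m):
        avoid = fv(m) | fv(body) | {v}
        z = next(f"z{i}" for i in count() if f"z{i}" not in avoid)
        u, body = z, subst(("VAR", z), u, body)
    return ("LAM", u, subst(m, v, body))

def isub(t, pairs):
    """t ISUB pairs: apply each (term, name) pair in list order (assumed order)."""
    return reduce(lambda acc, pair: subst(pair[0], pair[1], acc), pairs, t)

# An extra renaming [z/v] is absorbed by consing onto the list.
m = ("APP", ("VAR", "v"), ("VAR", "x"))
s = [(("CON", "c"), "x")]
assert isub(subst(("VAR", "z"), "v", m), s) == isub(m, [(("VAR", "z"), "v")] + s)
```

The closing assertion is the point of the strengthening: the renaming introduced by α-conversion becomes one more list entry, which the universally quantified S in the inductive hypothesis already covers.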

SLIDE 14

Barendregt Chapter 11

  • Section 11.1: introduction of type Λ′ (λ-terms with optionally labelled redexes). β′ = β0 ∪ β1. β0 reduces labelled redexes, β1 unlabelled redexes. Gives alternative proof of CR for β.
  • Section 11.2: the finite-ness of developments. Introduction of residual theory. Proof of SN and WCR for β0. Hence all β0 reduction sequences can be extended to a fixed completion point. Needs definition of new type Λ′∗.
  • Section 11.3: conservation theorem for λI. (Omitted.)
  • Section 11.4: standardisation theorem.

SLIDE 15

Barendregt’s new types

Labelled terms (Λ′): where redexes within the term may be labelled by numbers. Λ′ has an extra constructor: (λᵢx. M)N, taking 4 arguments (i ∈ ℕ is the label).

Weighted terms (Λ′∗): labelled terms where all variables (free and bound) are given strictly positive weights. E.g., λx. x²(y⁴x³). Same variable can get different weights, so weights really attach to variable positions.

Mechanisation must provide substitution, α-conversion, and induction principles for these new types.

SLIDE 16

Defining type Λ′

Find model in subset of Λ, using CON constructor to label certain applications.

  • Λ′ over α is modelled by Λ over ℕ + α (Λ is polymorphic through the CON constructor)
  • Representation of (λᵢx. M)N is CON(left(i)) @@ (LAM x M) @@ N
  • Inductively characterise set of terms that qualify as labelled.
  • Substitution over representation corresponds to substitution over new type.
  • Many theorems about Λ transfer unscathed (including Substitution Lemma and others)

SLIDE 17

Defining type Λ′∗

This type is used in the SN proof for β0: if weighted appropriately, the sum of a term’s weights decreases with β0 reduction.

Following Barendregt’s example, terms are paired with a weighting map w. Values of Λ′∗ are pairs of type

    Λ′ × (term_posn → ℕ).

Characterising change in weighting map after substitutions is painful. (Barendregt completely skims over this.)

SLIDE 18

Labelling Reductions

Barendregt writes M ─∆→ N, with ∆ the redex (sub-term) of M that reduces.

If the reduction is

    (λx. (λy. yx)z) → (λx. zx)

what is the right label? (λy. yx)z ?

But, (λx. (λy. yx)z) ≡α (λw. (λy. yw)z), so (λy. yw)z must also be right. With the GM axioms, α-equivalent terms are really identical, and there’s no clean way of picking x over w, or over any other fresh variable.

SLIDE 19

Term Positions as Reduction Labels

Label reduction arrows with “term positions”:

    term_posn = {Lt, Rt, In}∗

  • Same label can be used for reductions in Λ and Λ′.
  • 1-1 correspondence between labels and reductions.
  • Detailed (fiddly) theory of position sets required. E.g., lam_posns : term → (term_posn)set gives those places in a term where there are abstractions.
  • Lots of term position detail emerges in proofs that Barendregt claims are clear, obvious or trivial.

Most significant divergence from Barendregt’s text.
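Position sets of this kind can be sketched in Python over raw tuple terms. The reading of the three labels is an assumption: `Lt` descends into the operator of an application, `Rt` into its operand, and `In` under a binder. `lam_posns` follows the slide's description; `redex_posns` is a hypothetical companion added here to show that a position names a redex without mentioning any bound variable.

```python
# Raw terms: ("VAR", s), ("CON", k), ("APP", t, u), ("LAM", v, t).
# A position is a tuple of the labels "Lt", "Rt", "In", read from the root.

def lam_posns(t, here=()):
    """Positions in t at which there is an abstraction."""
    if t[0] == "APP":
        return lam_posns(t[1], here + ("Lt",)) | lam_posns(t[2], here + ("Rt",))
    if t[0] == "LAM":
        return {here} | lam_posns(t[2], here + ("In",))
    return set()

def redex_posns(t, here=()):
    """Positions of applications whose operator is an abstraction."""
    if t[0] == "APP":
        found = {here} if t[1][0] == "LAM" else set()
        return (found
                | redex_posns(t[1], here + ("Lt",))
                | redex_posns(t[2], here + ("Rt",)))
    if t[0] == "LAM":
        return redex_posns(t[2], here + ("In",))
    return set()

def example(bound):
    """(λb. (λy. y b) z) for a chosen bound name b."""
    inner = ("LAM", "y", ("APP", ("VAR", "y"), ("VAR", bound)))
    return ("LAM", bound, ("APP", inner, ("VAR", "z")))

# The α-variants with x and with w have the same redex position.
assert redex_posns(example("x")) == redex_posns(example("w")) == {("In",)}
```

Because positions ignore names, the labelling problem of the previous slide does not arise: both α-variants of the example reduce at the same position.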

SLIDE 20

Iterated α-conversion and the BVC

Proof of the standardisation theorem (M ↠β N ⇒ M ↠s N):

  • 1. (Complete) induction on size of N
  • 2. ∃Z. M ↠h Z ∧ Z ↠i N
  • 3. N is of form (λx1 . . . xn. N0 N1 . . . Nm)
  • 4. So, Z is of form (λx1 . . . xn. Z0 Z1 . . . Zm), with Zi ↠β Ni
  • 5. Each Ni is smaller than N, so by inductive hypothesis, each Zi ↠β Ni can be made standard
  • 6. Fit all these together with leading head reductions, and demonstrate complete standard reduction

SLIDE 21

Iterated α-conversion and the BVC

Proof of the standardisation theorem (M ↠β N ⇒ M ↠s N), steps 1 to 3 as on the previous slide, then:

  • 4. So, Z is of form (λx1 . . . xn. Z0 Z1 . . . Zm), with Zi ↠β Ni

Stop: what if xi ∈ FV(Zj)? The BVC has been used to pick xi to be distinct from free variables of Zi.

Need

  • (i) an α-conversion principle to allow a vector of fresh binders to be chosen;
  • (ii) a principle stating that internal reductions under binders preserve those binders

SLIDE 22

The Good Bits

  • I proved all the theorems
  • Most proofs are recognisably Barendregt’s (though if his proof is “Obvious”, mine is not usually so short)
  • The GM axioms didn’t cause any significant problems.

Pain mainly came from elsewhere:

  • Never proved theorems for explicitly calculating residuals
  • Definition of standard reduction path, with indexes into path positions, painful to work with
  • Lots of tedious reasoning about term positions

SLIDE 23

The Bad Bits

I made do with very impoverished technology:

  • Established new types by hand
  • Had very primitive support for defining new functions over Λ
  • Didn’t investigate more novel ideas (e.g., permuting atoms) for easing difficulties

So this is all future work.

Have no accurate figures on time spent, nor any clear basis for comparison with others’ mechanisations of similar material

SLIDE 24

Conclusions

  • The Gordon-Melham axioms are a good vehicle for mechanising reasoning about languages with binders
  • Even with little in the way of reasoning support, a substantial amount of λ-calculus theory was mechanised.
  • Would like to test the approach on
      • Böhm trees
      • first order logic (aiming for cut elimination, say)

SLIDE 25

Automatic Function Definition over Λ

For example:

    size (VAR s) = 1 ∧
    size (CON k) = 1 ∧
    size (t @@ u) = 1 + size t + size u ∧
    size (LAM v t) = 1 + size t

Hope to cope with all definitions where result is invariant under all possible substitutions of variables for variables.
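The size equations transcribe directly into Python over raw tuple terms (an illustrative encoding, not the HOL type). Since the LAM clause never inspects the bound name, the result is the same for any renaming of bound variables, which is the invariance the slide asks for.

```python
# Raw terms: ("VAR", s), ("CON", k), ("APP", t, u), ("LAM", v, t).

def size(t):
    """Number of constructors in t, one clause per equation on the slide."""
    if t[0] in ("VAR", "CON"):
        return 1
    if t[0] == "APP":
        return 1 + size(t[1]) + size(t[2])
    return 1 + size(t[2])                 # LAM v t: the name v is ignored

# Two α-variants of the same abstraction have the same size.
lam_x = ("LAM", "x", ("APP", ("VAR", "x"), ("VAR", "y")))
lam_w = ("LAM", "w", ("APP", ("VAR", "w"), ("VAR", "y")))
assert size(lam_x) == size(lam_w) == 4
```

On the quotient type the same equations need a justification that they respect α-equivalence, which is where the automation described on the slides comes in.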

SLIDE 26

Under the Hood with Function Definition

    f (LAM v t) = g v t (f t)

is turned into

    f (LAM v t) =
      let nv = NEW (FV (LAM v t))
      in g nv ([VAR nv/v] t) (f ([VAR nv/v] t))

If there are no bare references to t or v, and f is variable name indifferent (which needs to be proved by induction), then you can get back your original, pretty equation.

SLIDE 27

The Hardest Bit from Chapter 2

Defining η-normal form. Desired equations:

    enf(VAR(s))   = ⊤
    enf(CON(k))   = ⊤
    enf(M @@ N)   = enf(M) ∧ enf(N)
    enf(LAM v M)  = enf(M) ∧ ¬(is_comb(M) ∧ rand(M) = VAR(v) ∧ v ∉ FV(rator(M)))

In contrast, defining bnf, size, is_comb, rator, rand and others is easy.
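For comparison, the desired equations are easy to write down over raw named syntax, where a bound name is just a concrete string. The Python sketch below uses an illustrative tuple encoding; `is_comb`, `rator`, and `rand` mirror the constants named on the slide, but unlike HOL constants the last two are only meaningful on applications here.

```python
# Raw terms: ("VAR", s), ("CON", k), ("APP", t, u), ("LAM", v, t).

def fv(t):
    if t[0] == "VAR":
        return {t[1]}
    if t[0] == "CON":
        return set()
    if t[0] == "APP":
        return fv(t[1]) | fv(t[2])
    return fv(t[2]) - {t[1]}

def is_comb(t):
    return t[0] == "APP"

def rator(t):
    return t[1]        # only called on applications

def rand(t):
    return t[2]        # only called on applications

def enf(t):
    """True when t contains no η-redex (LAM v (M @@ VAR v) with v not free in M)."""
    if t[0] in ("VAR", "CON"):
        return True
    if t[0] == "APP":
        return enf(t[1]) and enf(t[2])
    v, m = t[1], t[2]
    eta_redex = is_comb(m) and rand(m) == ("VAR", v) and v not in fv(rator(m))
    return enf(m) and not eta_redex

assert not enf(("LAM", "x", ("APP", ("VAR", "f"), ("VAR", "x"))))   # λx. f x
assert enf(("LAM", "x", ("APP", ("VAR", "x"), ("VAR", "x"))))       # λx. x x
```

The difficulty discussed on the slides lies elsewhere: on the quotient type the LAM clause has to be shown independent of the choice of bound name, and that proof needs renaming facts about is_comb, rator, rand and FV.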

SLIDE 28

Function Definition—why enf fails

In the case of enf, the original definition

    enf (LAM x u) =
      enf u ∧
      (is_comb u ⇒ (rand u = VAR x) ⇒ x IN FV (rator u))

SLIDE 29

Function Definition—why enf fails

In the case of enf, the original definition becomes

    enf (LAM x u) =
      let v = NEW (FV (LAM x u))
      in enf ([VAR v/x] u) ∧
         (is_comb ([VAR v/x] u) ⇒
          (rand ([VAR v/x] u) = VAR v) ⇒
          v IN FV (rator ([VAR v/x] u)))

SLIDE 30

Function Definition—why enf fails

To simplify the rewritten definition of enf (LAM x u), with v = NEW (FV (LAM x u)), we already have that

    is_comb ([VAR v/x] t) = is_comb t

. . . and that

    rator ([VAR v/x] t) = [VAR v/x] (rator t)

(and likewise for rand)

SLIDE 31

Function Definition—why enf fails

Also need that

    v ∉ FV(λx. u) ⇒ (v ∈ FV([v/x](rator(u))) ≡ x ∈ FV(rator(u)))

and

    v ∉ FV(λx. u) ⇒ ((rand([v/x]u) = v) ≡ (rand(u) = x))

  • To define enf, I proved these properties by hand
  • Any tool for making definitions would have to store an ever-increasing set of likely properties for all of the previously defined constants
  • Functions that are not renaming-invariant are worse still.
