Concepts of Program Design
Abstract Machines
Gabriele Keller, Ron Vanderfeesten
- What is an abstract machine?
- a set of legal states
- initial and final states as subsets of the legal states
- A set of instructions altering the state of the machine
- it should be possible to implement the operations on a real machine in a
finite (preferably constant) number of steps
- Why use abstract machines at all?
- specifies the semantics of a programming language
- facilitates porting to other architectures
- mobile code (e.g., Java Virtual Machine)
- We have seen this before
- similar to SOS, but with Abstract Machines, we care about performance
- Baseline: the M-machine
- small step semantics for MinHs
- the transition system is essentially a very high-level (concise, but inefficient) abstract machine
- we call it the M-machine
- Characteristics of the M-machine
- substitution as “machine operation”
- why is that bad?
- can be avoided by using an environment
- control-flow is not explicit
- the search rules determine next subexpression to be evaluated
- why is that bad?
Control Flow
- Example:

    Plus(Num 3)(Num 2) ↦ Num 5
    Plus(Plus(Num 3)(Num 2))(Num 4) ↦ Plus(Num 5)(Num 4)
    Plus(Plus(Plus(Num 3)(Num 2))(Num 4))(Num 6) ↦ Plus(Plus(Num 5)(Num 4))(Num 6)

- Rules used:

    (Plus (Num n) (Num m)) ↦ Num (n+m)

    e1 ↦ e1’
    -----------------------------
    (Plus e1 e2) ↦ (Plus e1’ e2)

    e2 ↦ e2’
    ---------------------------------------
    (Plus (Num n) e2) ↦ (Plus (Num n) e2’)

- depending on the size and nesting depth of the expression, searching for the next reducible subexpression can be very expensive!
- Single-step evaluation in Haskell:

    eval (Num n) = Num n
    eval e       = eval (single e)

    single (Plus (Num n1) (Num n2)) = Num (n1 + n2)
    single (Plus (Num n1) e)        = Plus (Num n1) (single e)
    single (Plus e1 e2)             = Plus (single e1) e2
    single (Times ....

- Properties:
- for each step, the expression has to be traversed to find the next subexpression that has to be evaluated
- makes heavy use of the runtime stack
- Explicit control flow: C-machine
- explicit stack
- explicit handling of control flow
- variable binding still handled by substitution
- we call this machine the C-machine
- Machine state
- the current expression (as before)
- a control stack of subcomputations (frames) which have to be performed before
the machine terminates
- Initial and final states
- initial states: closed expression and an empty stack
- final states: expression is a value and the stack is empty
The C-machine
- Example: addition in three stages
- 1. Evaluate first argument
- first argument becomes current expression
- push a frame to remember that the computation continues with the result as first argument
- 2. Evaluate second argument
- second argument becomes current expression
- push a frame to remember that the computation continues with the result as second argument
- 3. Perform addition
- How can we denote a stack frame as a term?
- We use terms with holes; e.g.,

    Plus ☐ e2

- suspended computation of addition
- waits for the value of its first argument
- Inductive definition of frames:

    e expr                v value
    ------------------    ------------------
    (Plus ☐ e) frame      (Plus v ☐) frame

- Plus e1 ☐ is not a frame, because the first argument is evaluated first!
Inductive Definition of Frames
- Addition

    e expr                v value
    ------------------    ------------------
    (Plus ☐ e) frame      (Plus v ☐) frame

- If-expressions

    e1 expr    e2 expr
    --------------------
    (If ☐ e1 e2) frame

- Application

    e expr                 v value
    -------------------    -------------------
    (Apply ☐ e) frame      (Apply v ☐) frame

Stack and Machine Modes
- Stacks: f1 ▷ f2 ▷ ◦
- f1 is the top-most frame
- f2 is the second frame
- ◦ is the empty stack
- Inductive definition:

                   f frame    s stack
                   ------------------
    ◦ stack        f ▷ s stack

- Machine modes: the C-machine operates in two modes:
- s ≻ e : evaluate expression e under stack s
- s ≺ v : return value v to stack s
Transition Rules for MinHs
- Values (integers, booleans, functions): evaluate the value v under stack s

    s ≻ v ↦c s ≺ v

- Addition

    s ≻ (Plus e1 e2) ↦c (Plus ☐ e2) ▷ s ≻ e1
    (Plus ☐ e2) ▷ s ≺ v ↦c (Plus v ☐) ▷ s ≻ e2
    (Plus (Num n1) ☐) ▷ s ≺ Num n2 ↦c s ≺ Num (n1 + n2)
- If-expressions

    s ≻ (If e1 e2 e3) ↦c (If ☐ e2 e3) ▷ s ≻ e1
    (If ☐ e2 e3) ▷ s ≺ True ↦c s ≻ e2
    (If ☐ e2 e3) ▷ s ≺ False ↦c s ≻ e3
- Function application (writing Fun instead of Recfun from now on to save some space)

    s ≻ (Apply e1 e2) ↦c (Apply ☐ e2) ▷ s ≻ e1
    (Apply ☐ e2) ▷ s ≺ v ↦c (Apply v ☐) ▷ s ≻ e2
    (Apply (Fun τ1 τ2 f.x.e) ☐) ▷ s ≺ v ↦c s ≻ e [f := (Fun τ1 τ2 f.x.e), x := v]

- Observations:
- all the inference rules are axioms!
- the definition of single-step evaluation in the C-machine is not recursive
- the full evaluator is tail recursive (can be implemented using a while-loop)
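The observations above can be made concrete with a minimal Haskell sketch of the C-machine for a small fragment (numbers, booleans, addition, if-expressions). The names `BoolLit`, `PlusL`, `PlusR`, `IfC`, `Eval`, `Ret`, `step` and `run` are assumptions made for this illustration and do not come from the slides; function application is left out because it needs substitution.

```haskell
-- Sketch of the C-machine for a fragment of MinHs.
-- All constructor and function names are illustrative assumptions.
data Expr
  = Num Int
  | BoolLit Bool
  | Plus Expr Expr
  | If Expr Expr Expr
  deriving (Eq, Show)

-- A frame is an expression with exactly one hole.
data Frame
  = PlusL Expr      -- Plus ☐ e2
  | PlusR Expr      -- Plus v ☐   (v is already a value)
  | IfC Expr Expr   -- If ☐ e2 e3
  deriving (Eq, Show)

type Stack = [Frame]  -- [] plays the role of ◦, (:) the role of ▷

data State
  = Eval Stack Expr   -- s ≻ e
  | Ret  Stack Expr   -- s ≺ v
  deriving (Eq, Show)

-- One machine transition. Every equation is an axiom: step never calls itself.
step :: State -> Maybe State
step (Eval s (Num n))       = Just (Ret s (Num n))
step (Eval s (BoolLit b))   = Just (Ret s (BoolLit b))
step (Eval s (Plus e1 e2))  = Just (Eval (PlusL e2 : s) e1)
step (Eval s (If e1 e2 e3)) = Just (Eval (IfC e2 e3 : s) e1)
step (Ret (PlusL e2 : s) v)              = Just (Eval (PlusR v : s) e2)
step (Ret (PlusR (Num n1) : s) (Num n2)) = Just (Ret s (Num (n1 + n2)))
step (Ret (IfC e2 _ : s) (BoolLit True))  = Just (Eval s e2)
step (Ret (IfC _ e3 : s) (BoolLit False)) = Just (Eval s e3)
step _ = Nothing  -- final state or stuck (ill-typed) state

-- Tail-recursive driver: the loop that a while-loop would implement.
run :: Expr -> Maybe Expr
run e = go (Eval [] e)
  where
    go (Ret [] v) = Just v          -- final state: value and empty stack
    go st         = step st >>= go  -- otherwise take one more step
```

For instance, `run (Plus (Plus (Num 3) (Num 2)) (Num 4))` walks through the three stages of addition twice and yields `Just (Num 9)`, while an ill-typed term such as `Plus (BoolLit True) (Num 1)` gets stuck and yields `Nothing`.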
Environments
- Now, let’s get rid of substitution!

    (Apply (Fun τ1 τ2 f.x.e) ☐) ▷ s ≺ v ↦c s ≻ e [f := (Fun τ1 τ2 f.x.e), x := v]

- We used an environment for TinyC
- but we can’t just pass it along, because we wouldn’t know when to delete bindings
- we need to save the old environment somewhere, and restore it when returning from the function call
- can we use the stack to keep track of the environment?
The E-machine
- In the E-machine
- we have frames defined exactly as before
- explicit environments, which are a sequence of variable bindings

                 η env
                 --------------
    ● env        x = v, η env

- stacks in the E-machine are sequences of environments and frames

                   f frame    s stack      η env    s stack
                   ------------------      ----------------
    ◦ stack        f ▷ s stack             η ▷ s stack

- states in the E-machine include an environment

    s | η ≻ e        s | η ≺ v
The E-machine: Transition Rules
- Free variables:

    s | η ≻ x ↦E s | η ≺ v,  if x = v ∈ η

- Application:

    (Apply (Fun τ1 τ2 f.x.e) ☐) ▷ s | η ≺ v ↦E η ▷ s | f = (Fun τ1 τ2 f.x.e), x = v, η ≻ e

- Returning values:

    η ▷ s | η’ ≺ v ↦E s | η ≺ v
- Are these rules correct?
- let’s look at two example usages
- Example
- simple function application

    Apply (Fun Int Int (f.x.(Plus x 1))) 3

- nested application (corresponds to a function which accepts two arguments and returns the first one)

    Apply (Apply (Fun (Int→Int) Int (f.x.(Fun Int Int g.y.x))) 3) 1

We omit the type information, abbreviate Apply to App, and write n instead of (Num n)
Example 1

    ◦ | ● ≻ App (Fun (f.x.(Plus x 1))) 3
    ↦E (App ☐ 3) ▷ ◦ | ● ≻ Fun (f.x.(Plus x 1))
    ↦E (App ☐ 3) ▷ ◦ | ● ≺ Fun (f.x.(Plus x 1))
    ↦E (App (Fun (f.x.(Plus x 1))) ☐) ▷ ◦ | ● ≻ 3
    ↦E (App (Fun (f.x.(Plus x 1))) ☐) ▷ ◦ | ● ≺ 3
    ↦E ● ▷ ◦ | x=3, f=Fun(f.x.(Plus x 1)), ● ≻ (Plus x 1)
    ↦E (Plus ☐ 1) ▷ ● ▷ ◦ | x=3, f=Fun(f.x.Plus x 1), ● ≻ x
    ↦E (Plus ☐ 1) ▷ ● ▷ ◦ | x=3, f=Fun(f.x.Plus x 1), ● ≺ 3
    ↦E (Plus 3 ☐) ▷ ● ▷ ◦ | x=3, f=Fun(f.x.Plus x 1), ● ≻ 1
    ↦E (Plus 3 ☐) ▷ ● ▷ ◦ | x=3, f=Fun(f.x.Plus x 1), ● ≺ 1
    ↦E ● ▷ ◦ | x=3, f=Fun(f.x.Plus x 1), ● ≺ 4
    ↦E ◦ | ● ≺ 4
Example 2

    App (App (Fun (f.x.Fun (g.y.x))) 3) 4,  i.e.  (recfun f x = recfun g y = x) 3 4

    ◦ | ● ≻ App (App (Fun (f.x.Fun (g.y.x))) 3) 4
    ↦E (App ☐ 4) ▷ ◦ | ● ≻ App (Fun (f.x.Fun (g.y.x))) 3
    ↦E (App ☐ 3) ▷ (App ☐ 4) ▷ ◦ | ● ≻ Fun (f.x.Fun (g.y.x))
    ↦E (App ☐ 3) ▷ (App ☐ 4) ▷ ◦ | ● ≺ Fun (f.x.Fun (g.y.x))
    ↦E ...
    ↦E (App (Fun (f.x.Fun (g.y.x))) ☐) ▷ (App ☐ 4) ▷ ◦ | ● ≺ 3
    ↦E ● ▷ (App ☐ 4) ▷ ◦ | x=3, f=Fun(f.x.Fun...), ● ≻ Fun (g.y.x)
    ↦E ● ▷ (App ☐ 4) ▷ ◦ | x=3, f=Fun(f.x.Fun...), ● ≺ Fun (g.y.x)
    ↦E (App ☐ 4) ▷ ◦ | ● ≺ Fun (g.y.x)
    ↦E (App (Fun (g.y.x)) ☐) ▷ ◦ | ● ≻ 4
    ↦E (App (Fun (g.y.x)) ☐) ▷ ◦ | ● ≺ 4
    ↦E ● ▷ ◦ | y=4, g=Fun(g.y.x), ● ≻ x
- Something went wrong!
- when returning the function value and restoring the old (empty) environment, we threw away the binding for the variable x
- x now occurs freely in g!
- Problem: functions as results are not handled correctly!
- free variables in function bodies escape the environment they are defined in
- partial application fails
Dealing with partial application
    let f x y = x + y
        g     = f 3
    in let x = 5
       in g x
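The expression above is also valid Haskell, so the intended result can be checked directly. The binding name `example` is an assumption made for this sketch. The point is that `g` has to remember `x = 3` from the call `f 3`, even though a different `x` is in scope where `g` is finally applied.

```haskell
-- The partial-application example as a Haskell definition.
-- `example` is a hypothetical name used only for this illustration.
example :: Int
example =
  let f x y = x + y
      g     = f 3      -- partial application: g must remember x = 3
  in let x = 5         -- an unrelated x, bound later
     in g x            -- evaluates 3 + 5
```

Under lexical scoping this evaluates to 8. A machine that discards the binding `x = 3` when `f 3` returns cannot compute this result.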
- Solution:
- we need to bundle returned functions with the current environment
- we call this a closure
- requires a new form of return values:

    《 η, (Fun τ1 τ2 f.x.e) 》

  where η is the environment which was current when the function value was created
- Closures only appear as values during execution - there is no source form
Transition Rules
- Returning values:

    s | η ≻ (Num n) ↦E s | η ≺ (Num n)
    s | η ≻ (Fun τ1 τ2 f.x.e) ↦E s | η ≺ 《 η, (Fun τ1 τ2 f.x.e) 》
    η ▷ s | η’ ≺ v ↦E s | η ≺ v

- Application of functions:

    (Apply 《 η’, (Fun τ1 τ2 f.x.e) 》 ☐) ▷ s | η ≺ v ↦E η ▷ s | f = (Fun τ1 τ2 f.x.e), x = v, η’ ≻ e

- restore the environment η’ from the closure, add bindings for the argument x and the function f
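The closure rules can be tried out with a minimal Haskell sketch of the E-machine for numbers, addition, variables, functions and application. Everything named here (`Value`, `Closure`, `Saved`, `step`, `run`, and so on) is an assumption for illustration. Two further choices are assumptions rather than something the slides state: a saved environment on the stack is modelled as a `Saved` frame, and `f` is bound to the closure itself, so that looking up `f` yields a value the application rule can use.

```haskell
-- Sketch of the E-machine with closures. Types are omitted, as in the examples.
-- All names are illustrative assumptions.
data Expr
  = Num Int
  | Var String
  | Plus Expr Expr
  | Fun String String Expr          -- Fun f x e  stands for  Fun (f.x.e)
  | App Expr Expr
  deriving (Eq, Show)

data Value
  = NumV Int
  | Closure Env String String Expr  -- 《 η, Fun (f.x.e) 》
  deriving (Eq, Show)

type Env = [(String, Value)]        -- [] plays the role of ●

data Frame
  = PlusL Expr | PlusR Value        -- Plus ☐ e2,  Plus v ☐
  | AppL Expr  | AppR Value         -- App ☐ e2,   App v ☐
  | Saved Env                       -- an environment η saved on the stack
  deriving (Eq, Show)

type Stack = [Frame]

data State
  = Eval Stack Env Expr             -- s | η ≻ e
  | Ret  Stack Env Value            -- s | η ≺ v
  deriving (Eq, Show)

step :: State -> Maybe State
step (Eval s env (Num n))      = Just (Ret s env (NumV n))
step (Eval s env (Var x))      = Ret s env <$> lookup x env
step (Eval s env (Fun f x e))  = Just (Ret s env (Closure env f x e))
step (Eval s env (Plus e1 e2)) = Just (Eval (PlusL e2 : s) env e1)
step (Eval s env (App e1 e2))  = Just (Eval (AppL e2 : s) env e1)
step (Ret (PlusL e2 : s) env v) = Just (Eval (PlusR v : s) env e2)
step (Ret (PlusR (NumV n1) : s) env (NumV n2)) =
  Just (Ret s env (NumV (n1 + n2)))
step (Ret (AppL e2 : s) env v) = Just (Eval (AppR v : s) env e2)
-- Save the caller's environment, continue in the closure's environment.
step (Ret (AppR c@(Closure env' f x e) : s) env v) =
  Just (Eval (Saved env : s) ((f, c) : (x, v) : env') e)
-- Returning past a saved environment restores it.
step (Ret (Saved env : s) _ v) = Just (Ret s env v)
step _ = Nothing                    -- final or stuck state

run :: Expr -> Maybe Value
run e = go (Eval [] [] e)
  where
    go (Ret [] _ v) = Just v
    go st           = step st >>= go
```

With this sketch, the nested application `App (App (Fun "f" "x" (Fun "g" "y" (Var "x"))) (Num 3)) (Num 4)` evaluates to `Just (NumV 3)`: the closure returned by the inner call carries the binding for `x` with it.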
Example 2

    ◦ | ● ≻ App (App (Fun (f.x.Fun (g.y.x))) 3) 4
    ↦E (App ☐ 4) ▷ ◦ | ● ≻ App (Fun (f.x.Fun (g.y.x))) 3
    ↦E (App ☐ 3) ▷ (App ☐ 4) ▷ ◦ | ● ≻ Fun (f.x.Fun (g.y.x))
    ↦E (App (Fun (f.x.Fun (g.y.x))) ☐) ▷ (App ☐ 4) ▷ ◦ | ● ≺ 3
    ↦E ● ▷ (App ☐ 4) ▷ ◦ | x=3, f=Fun(f.x.Fun...), ● ≻ Fun (g.y.x)
    ↦E ● ▷ (App ☐ 4) ▷ ◦ | x=3, f=Fun(f.x.Fun...), ● ≺ 《 x=3, f.., Fun (g.y.x) 》
    ↦E (App ☐ 4) ▷ ◦ | ● ≺ 《 x=3, f..., Fun (g.y.x) 》
    ↦E (App 《 x=3, f..., Fun (g.y.x) 》 ☐) ▷ ◦ | ● ≻ 4
    ↦E (App 《 x=3, f..., Fun (g.y.x) 》 ☐) ▷ ◦ | ● ≺ 4
    ↦E ● ▷ ◦ | y=4, g=Fun(g.y.x), x=3, f..., ● ≻ x
    ....
    ↦E ● ▷ ◦ | y=4, g=Fun(g.y.x), x=3, f..., ● ≺ 3
- Semantics of MinHs using two different approaches
- Control flow implicit: M-machine
- Control flow explicit: C-machine & E-machine
- Are these equivalent?
- for proofs, the more abstract machine is much more convenient
- proof is not obvious, as evaluation methods are very different
- how can we make the connection?
M-machine versus C-machine and E-machine
A single step in the M-machine

    Plus(Plus(Plus(Num 3)(Num 2))(Num 4))(Num 6) ↦M Plus(Plus(Num 5)(Num 4))(Num 6)

corresponds to a sequence of steps in the C-machine:

    ◦ ≻ Plus(Plus(Plus(Num 3)(Num 2))(Num 4))(Num 6)
    ↦C (Plus ☐ (Num 6)) ▷ ◦ ≻ Plus(Plus(Num 3)(Num 2))(Num 4)
    ↦C (Plus ☐ (Num 4)) ▷ (Plus ☐ (Num 6)) ▷ ◦ ≻ Plus(Num 3)(Num 2)
    ↦C (Plus ☐ (Num 2)) ▷ (Plus ☐ (Num 4)) ▷ (Plus ☐ (Num 6)) ▷ ◦ ≻ (Num 3)
    ↦C (Plus ☐ (Num 2)) ▷ (Plus ☐ (Num 4)) ▷ (Plus ☐ (Num 6)) ▷ ◦ ≺ (Num 3)
    ↦C* (Plus ☐ (Num 4)) ▷ (Plus ☐ (Num 6)) ▷ ◦ ≺ (Num 5)

The stack in the C-machine corresponds to the context of the subexpression which is currently evaluated in the M-machine
- Idea: reconstruct the full expression from the stack s and the current expression e:

    (@) :: Stack -> Expr -> Expr
    ◦ @ e = e
    ((Plus ☐ e2) ▷ s) @ e1 = s @ (Plus e1 e2)
    ((Plus v1 ☐) ▷ s) @ e2 = s @ (Plus v1 e2)
    ((If ☐ e2 e3) ▷ s) @ e1 = s @ (If e1 e2 e3)
    .....
- with the reconstructed expression, we can show using rule induction that:
- if s ≻ e ↦c* ◦≺ v then s@e ↦M* v , and
- if e ↦M* v, then ◦ ≻ e ↦c* ◦≺ v
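The reconstruction function s@e can be written as ordinary Haskell. The sketch below covers only the frames listed above; the names `plug`, `PlusL`, `PlusR` and `IfC` are assumptions for this illustration. It uses a named function rather than an operator because `@` is reserved in Haskell for as-patterns and cannot be defined as an operator.

```haskell
-- Sketch of s @ e: rebuild the M-machine expression from a C-machine state.
-- All names are illustrative assumptions.
data Expr
  = Num Int
  | Plus Expr Expr
  | If Expr Expr Expr
  deriving (Eq, Show)

data Frame
  = PlusL Expr      -- Plus ☐ e2
  | PlusR Expr      -- Plus v1 ☐
  | IfC Expr Expr   -- If ☐ e2 e3
  deriving (Eq, Show)

type Stack = [Frame]  -- [] plays the role of ◦, head of the list is the top frame

-- Plug the current expression into the hole of the top frame,
-- then continue outwards through the rest of the stack.
plug :: Stack -> Expr -> Expr
plug []              e  = e
plug (PlusL e2 : s)  e1 = plug s (Plus e1 e2)
plug (PlusR v1 : s)  e2 = plug s (Plus v1 e2)
plug (IfC e2 e3 : s) e1 = plug s (If e1 e2 e3)
```

For example, `plug [PlusL (Num 2), PlusL (Num 4), PlusL (Num 6)] (Num 3)` rebuilds `Plus (Plus (Plus (Num 3) (Num 2)) (Num 4)) (Num 6)`, the expression the M-machine sees while the C-machine is evaluating `Num 3` three frames deep.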
For all stacks s and all expressions e: if s ≻ e ↦c* ◦ ≺ v then s@e ↦M* v
Proof by induction over the length k of the sequence ↦c^k:
- Base case: k = 1
- if s ≻ e ↦c^1 ◦ ≺ v, then e has to be a value (Num or Recfun), and s the empty stack ◦. If e = Num n then
- ◦ ≻ (Num n) ↦c^1 ◦ ≺ (Num n) and
- ◦ @ (Num n) ↦M* (Num n) (same for (Recfun …))
Proof
- Inductive step: k = n+1
- with the Induction Hypothesis: for all stacks s and expressions e, s ≻ e ↦c^n ◦ ≺ v implies s@e ↦M* v
- To prove this, we have to consider every possible case in the evaluation rules of the C-machine
Case 1: show for the first rule of addition:
s ≻(Plus e1 e2) ↦c (Plus ☐ e2) ▷ s ≻ e1
- for e = (Plus e1 e2), show that s ≻ e ↦c^(n+1) ◦ ≺ v implies s@e ↦M* v
- the first step has to be s ≻ (Plus e1 e2) ↦c (Plus ☐ e2) ▷ s ≻ e1, so the remaining n steps are (Plus ☐ e2) ▷ s ≻ e1 ↦c^n ◦ ≺ v
- because of the I.H., this implies ((Plus ☐ e2) ▷ s) @ e1 ↦M* v
- since ((Plus ☐ e2) ▷ s) @ e1 is equal to s @ (Plus e1 e2), we have shown s @ (Plus e1 e2) ↦M* v