Concepts of Program Design: Abstract Machines. Gabriele Keller
Abstract Machines
- What is an abstract machine?
- a set of legal states
- final and initial states as subset
- A set of instructions altering the state of the machine
- it should be possible to implement the operations on a real machine in
a finite (preferably constant) number of steps
- Why use abstract machines at all?
- specifies the semantics of a programming language
- facilitates porting to other architectures
- mobile code (e.g., Java Virtual Machine)
- We have seen this before
- similar to SOS, but with Abstract Machines, we care about performance
Control Flow
- Base line: the M-machine
- small step semantics for MinHs
- transition system embodies essentially a very high-level (= concise, but
inefficient) abstract machine
- we call it the M-machine
- Characteristics of the M-machine
- substitution as “machine operation”
- why is that bad?
- can be avoided by using an environment
- control-flow is not explicit
- the search rules determine next subexpression to be evaluated
- why is that bad?
Plus(Plus(Num 3)(Num 2))(Num 4) ↦ Plus(Num 5)(Num 4)
Plus(Plus(Plus(Num 3)(Num 2))(Num 4))(Num 6) ↦ Plus(Plus(Num 5)(Num 4))(Num 6)
Control Flow
- Example:
(Plus (Num n) (Num m)) ↦ Num (n+m)
(Plus e1 e2) ↦ (Plus e1’ e2), if e1 ↦ e1’
(Plus (Num n) e2) ↦ (Plus (Num n) e2’), if e2 ↦ e2’
e.g., Plus(Num 3)(Num 2) ↦ Num 5
depending on the size & nesting depth of the expression, searching for the next reducible subexpression can be very expensive!!!
Control Flow
- Single-step evaluation in Haskell:
eval (Num n) = Num n
eval e       = eval (single e)

single (Plus (Num n1) (Num n2)) = Num (n1 + n2)
single (Plus (Num n1) e)        = Plus (Num n1) (single e)
single (Plus e1 e2)             = Plus (single e1) e2
single (Times ....
- Properties:
- for each step, the expression has to be traversed to find the next
subexpression that has to be evaluated
- makes heavy use of the runtime stack
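The single-step evaluator above can be made runnable for a cut-down language. The following is only a sketch: the two-constructor `Expr` type, the `Maybe` result of `single`, and the helper `evalCount` are assumptions made for illustration, not part of the slides. Counting the steps shows that every step restarts the search from the root of the expression.

```haskell
-- Hypothetical cut-down expression type: numbers and addition only.
data Expr = Num Int | Plus Expr Expr
  deriving (Eq, Show)

-- One M-machine step; Nothing means the expression is already a value.
single :: Expr -> Maybe Expr
single (Num _)                  = Nothing
single (Plus (Num n1) (Num n2)) = Just (Num (n1 + n2))
single (Plus (Num n1) e2)       = Plus (Num n1) <$> single e2
single (Plus e1 e2)             = (\e1' -> Plus e1' e2) <$> single e1

-- Repeat single steps until a value is reached, counting the steps.
-- Each call to single traverses the expression again from the root.
evalCount :: Expr -> (Expr, Int)
evalCount = go 0
  where
    go k e = case single e of
      Nothing -> (e, k)
      Just e' -> go (k + 1) e'

-- The expression used on the slides.
example :: Expr
example = Plus (Plus (Plus (Num 3) (Num 2)) (Num 4)) (Num 6)
```

Here `evalCount example` takes three steps, and each of them recurses through the nested `Plus` nodes to find the next redex.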
The C-machine
- Explicit control flow: C-machine
- explicit stack
- explicit handling of control flow
- variable binding still handled by substitution
- we call this machine the C-machine
- Machine state
- the current expression (as before)
- a control stack of subcomputations (frames) which have to be performed
before the machine terminates
- Initial and final states
- initial states: closed expression and an empty stack
- final states: expression is a value and the stack is empty
The C-machine
- Example: addition in three stages
- 1. Evaluate first argument
- first argument becomes current expression
- remember to continue with computation, result as first argument
- 2. Evaluate second argument
- second argument becomes current expression
- remember to continue with computation, result as second argument
- 3. Perform addition
The C-machine
- How can we denote a stack frame as a term?
- We use terms with holes; e.g., Plus ☐ e2
- suspended computation of addition
- waits for the value of its first argument
- Inductive definition of frames:
(Plus ☐ e) frame, if e expr
(Plus v ☐) frame, if v value
- Plus e1 ☐ is not a frame, because the first argument is evaluated first!
Inductive Definition of Frames
- Addition
(Plus ☐ e) frame, if e expr
(Plus v ☐) frame, if v value
- If-expressions
(If ☐ e1 e2) frame, if e1 expr and e2 expr
- Application
(Apply ☐ e) frame, if e expr
(Apply v ☐) frame, if v value
Stack and Machine Modes
- Stacks: f1 ▷ f2 ▷ ◦
- f1 is the top-most frame
- f2 is the second frame
- ◦ is the empty stack
- Inductive definition:
◦ stack
f ▷ s stack, if f frame and s stack
- Machine modes: the C-machine operates in two modes:
- s≻e : evaluate expression e under stack s
- s≺v : return value v to stack s
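One possible Haskell encoding of frames, stacks, and the two machine modes is sketched below. The constructor names (`PlusL`, `PlusR`, `Eval`, `Ret`) and the restriction to numbers and addition are assumptions made for illustration.

```haskell
-- Hypothetical cut-down expression type: numbers and addition only.
data Expr = Num Int | Plus Expr Expr
  deriving (Eq, Show)

-- A frame is a term with a hole.
data Frame
  = PlusL Expr   -- (Plus ☐ e2): waiting for the first argument
  | PlusR Expr   -- (Plus v ☐): first argument v is already a value
  deriving (Eq, Show)

-- The head of the list is the top-most frame; [] plays the role of ◦.
type Stack = [Frame]

-- The two machine modes.
data State
  = Eval Stack Expr   -- s ≻ e: evaluate e under stack s
  | Ret  Stack Expr   -- s ≺ v: return value v to stack s
  deriving (Eq, Show)

-- Initial states: an expression and an empty stack.
initial :: Expr -> State
initial e = Eval [] e

-- Final states: a value is returned to the empty stack.
isFinal :: State -> Bool
isFinal (Ret [] (Num _)) = True
isFinal _                = False
```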
Transition Rules for MinHs
- Values (integers, booleans, functions)
s ≻ v ↦c s ≺ v
- the left-hand side evaluates the value v under stack s
- the right-hand side returns the value v to stack s
- Addition
s ≻ (Plus e1 e2) ↦c (Plus ☐ e2) ▷ s ≻ e1
(Plus ☐ e2) ▷ s ≺ v ↦c (Plus v ☐) ▷ s ≻ e2
(Plus (Num n1) ☐) ▷ s ≺ Num n2 ↦c s ≺ Num (n1 + n2)
Transition Rules for MinHs
- if-expressions
s ≻ (If e1 e2 e3) ↦c (If ☐ e2 e3) ▷ s ≻ e1
(If ☐ e2 e3) ▷ s ≺ True ↦c s ≻ e2
(If ☐ e2 e3) ▷ s ≺ False ↦c s ≻ e3
Transition Rules for MinHs
- Function application
s ≻ (Apply e1 e2) ↦c (Apply ☐ e2) ▷ s ≻ e1
(Apply ☐ e2) ▷ s ≺ v ↦c (Apply v ☐) ▷ s ≻ e2
(Apply (Fun τ1 τ2 f.x.e) ☐) ▷ s ≺ v ↦c s ≻ e [f := (Fun τ1 τ2 f.x.e), x := v]
- Observations:
- all the inference rules are axioms!
- the definition of single-step evaluation in the C-machine is not recursive
- the full evaluator is tail recursive (can be implemented using a while-loop)
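The observations can be checked against a small implementation. This is a sketch under assumptions: type annotations are dropped from `Fun`, the constructor names are invented for illustration, and `subst` relies on the substituted values being closed. Note that `step` never calls itself, and `run` is a plain loop.

```haskell
data Expr
  = Num Int
  | Con Bool
  | Var String
  | Plus Expr Expr
  | If Expr Expr Expr
  | Apply Expr Expr
  | Fun String String Expr        -- Fun f.x.e, type annotations omitted
  deriving (Eq, Show)

data Frame
  = PlusL Expr | PlusR Expr       -- (Plus ☐ e2), (Plus v ☐)
  | IfC Expr Expr                 -- (If ☐ e2 e3)
  | AppL Expr | AppR Expr         -- (Apply ☐ e2), (Apply v ☐)
  deriving (Eq, Show)

data State = Eval [Frame] Expr | Ret [Frame] Expr
  deriving (Eq, Show)

-- Substitute the closed value v for the variable x.
subst :: String -> Expr -> Expr -> Expr
subst x v = go
  where
    go (Var y) | y == x    = v
               | otherwise = Var y
    go (Plus a b)  = Plus (go a) (go b)
    go (If c t e)  = If (go c) (go t) (go e)
    go (Apply a b) = Apply (go a) (go b)
    go (Fun f y b)
      | x == f || x == y = Fun f y b        -- x is shadowed here
      | otherwise        = Fun f y (go b)
    go e = e                                -- Num, Con

-- One transition; every clause is an axiom (no recursive call).
step :: State -> Maybe State
step (Eval s (Plus e1 e2))  = Just (Eval (PlusL e2 : s) e1)
step (Eval s (If c t e))    = Just (Eval (IfC t e : s) c)
step (Eval s (Apply e1 e2)) = Just (Eval (AppL e2 : s) e1)
step (Eval _ (Var _))       = Nothing       -- free variable: stuck
step (Eval s v)             = Just (Ret s v)  -- Num, Con, Fun
step (Ret (PlusL e2 : s) v) = Just (Eval (PlusR v : s) e2)
step (Ret (PlusR (Num n1) : s) (Num n2)) = Just (Ret s (Num (n1 + n2)))
step (Ret (IfC t _ : s) (Con True))  = Just (Eval s t)
step (Ret (IfC _ e : s) (Con False)) = Just (Eval s e)
step (Ret (AppL e2 : s) v)  = Just (Eval (AppR v : s) e2)
step (Ret (AppR fn@(Fun f x body) : s) v) =
  Just (Eval s (subst f fn (subst x v body)))
step _ = Nothing                            -- final or stuck state

-- The full evaluator: a tail-recursive loop over step.
run :: Expr -> Maybe Expr
run e = loop (Eval [] e)
  where
    loop (Ret [] v) = Just v
    loop st = case step st of
      Nothing  -> Nothing
      Just st' -> loop st'
```

For instance, `run (Apply (Fun "f" "x" (Plus (Var "x") (Num 1))) (Num 3))` passes through the same kinds of states as the addition and application rules above and ends with `Num 4` returned to the empty stack.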
Environments
- Now, let’s get rid of substitution!
- We used an environment for TinyC
- but we can’t just pass it along, because we wouldn’t know when to delete
bindings
- we need to save the old environment somewhere, and restore it when returning
from the function call
- can we use the stack to keep track of the environment?
(Apply (Fun τ1 τ2 f.x.e) ☐) ▷ s ≺ v ↦c s ≻ e [f := (Fun τ1 τ2 f.x.e), x := v]
The E-machine
- In the E-machine
- we have frames defined exactly as before
- explicit environments, which are a sequence of variable bindings
- stacks in the E-machine are sequences of environments and frames
- states in the E-machine include an environment
- Environments:
● env
x = v, η env, if η env
- Stacks:
◦ stack
f ▷ s stack, if f frame and s stack
η ▷ s stack, if η env and s stack
- States:
s | η ≻ e
s | η ≺ v
The E-machine: Transition Rules
- Free variables:
s | η ≻ x ↦E s | η ≺ v , if x=v ∈ η
- Application:
(Apply(Fun τ1 τ2 f.x.e) ☐) ▷ s | η ≺ v ↦E
η ▷ s| f =(Fun τ1 τ2 f.x.e), x = v, η ≻ e
- Returning values:
η ▷ s | η’ ≺ v ↦E s | η ≺ v
The E-machine: Transition Rules
- Are these rules correct?
- let’s look at two example usages
- Examples
- simple function application:
Apply (Fun int int f.x.(Plus x 1)) 3
- nested application (corresponds to a function which accepts two arguments and returns the first one):
Apply (Apply (Fun (int→int) int f.x.(Fun int int g.y.x)) 3) 1
We omit the type information, abbreviate Apply to App, and write n instead of (Num n).
Example 1
◦ | ● ≻ App (Fun(f.x.(Plus x 1))) 3
↦E (App ☐ 3) ▷ ◦ | ● ≻ Fun(f.x.(Plus x 1))
↦E (App ☐ 3) ▷ ◦ | ● ≺ Fun(f.x.(Plus x 1))
↦E (App (Fun(f.x.(Plus x 1))) ☐) ▷ ◦ | ● ≻ 3
↦E (App (Fun(f.x.(Plus x 1))) ☐) ▷ ◦ | ● ≺ 3
↦E ● ▷ ◦ | x=3, f=Fun(f.x.(Plus x 1)), ● ≻ (Plus x 1)
↦E (Plus ☐ 1) ▷ ● ▷ ◦ | x=3, f=Fun(f.x.Plus x 1), ● ≻ x
↦E (Plus ☐ 1) ▷ ● ▷ ◦ | x=3, f=Fun(f.x.Plus x 1), ● ≺ 3
↦E (Plus 3 ☐) ▷ ● ▷ ◦ | x=3, f=Fun(f.x.Plus x 1), ● ≻ 1
↦E (Plus 3 ☐) ▷ ● ▷ ◦ | x=3, f=Fun(f.x.Plus x 1), ● ≺ 1
↦E ● ▷ ◦ | x=3, f=Fun(f.x.Plus x 1), ● ≺ 4
↦E ◦ | ● ≺ 4
Example 2
App (App (Fun(f.x.Fun(g.y.x))) 3) 4, i.e., (letfun f x = letfun g y = x) 3 4
◦ | ● ≻ App (App (Fun(f.x.Fun(g.y.x))) 3) 4
↦E (App ☐ 4) ▷ ◦ | ● ≻ App (Fun(f.x.Fun(g.y.x))) 3
↦E (App ☐ 3) ▷ (App ☐ 4) ▷ ◦ | ● ≻ Fun(f.x.Fun(g.y.x))
↦E (App ☐ 3) ▷ (App ☐ 4) ▷ ◦ | ● ≺ Fun(f.x.Fun(g.y.x))
↦E ...
↦E (App (Fun(f.x.Fun(g.y.x))) ☐) ▷ (App ☐ 4) ▷ ◦ | ● ≺ 3
↦E ● ▷ (App ☐ 4) ▷ ◦ | x=3, f=Fun(f.x.Fun...), ● ≻ Fun(g.y.x)
↦E ● ▷ (App ☐ 4) ▷ ◦ | x=3, f=Fun(f.x.Fun...), ● ≺ Fun(g.y.x)
↦E (App ☐ 4) ▷ ◦ | ● ≺ Fun(g.y.x)
↦E (App (Fun(g.y.x)) ☐) ▷ ◦ | ● ≻ 4
↦E (App (Fun(g.y.x)) ☐) ▷ ◦ | ● ≺ 4
↦E ● ▷ ◦ | y=4, g=Fun(g.y.x), ● ≻ x
Dealing with partial application
- Something went wrong!
- returning the function value and restoring the old (empty) environment, we
threw away the binding for the variable x
- it now occurs freely in g!
- Problem: functions as results are not handled correctly!
- free variables in the function bodies escape the environment they are
defined in.
- partial application fails
let f x y = x + y
    g = f 3
in let x = 5
   in g x
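For comparison, the same program can be written directly in Haskell, whose implementation keeps the binding of x that was current when `g` was created. The name `result` is an assumption made for illustration.

```haskell
-- g must remember x = 3 from the partial application f 3,
-- regardless of the later, unrelated binding x = 5.
result :: Int
result =
  let f x y = x + y
      g = f 3
  in let x = 5
     in g x
```

With correct static scoping, `result` is 3 + 5 = 8. A machine that looks up the x inside `g` in the caller's environment would compute 5 + 5 instead, or get stuck if x were unbound.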
Dealing with partial application
- Solution:
- we need to bundle returned functions with current environment
- we call this a closure
- requires a new form of return values:
《 η, (Fun τ1 τ2 f.x.e)》
- η is the environment which was current when the function value was created
- Closures only appear as values during execution; there is no source form
Transition Rules
- Returning values:
s | η ≻ (Num n) ↦E s | η ≺ (Num n)
s | η ≻ (Fun τ1 τ2 f.x.e) ↦E s | η ≺ 《 η, (Fun τ1 τ2 f.x.e)》
η ▷ s | η’ ≺ v ↦E s | η ≺ v
- Application of functions:
(Apply 《 η’, (Fun τ1 τ2 f.x.e)》 ☐) ▷ s | η ≺ v ↦E η ▷ s | f = (Fun τ1 τ2 f.x.e), x = v, η’ ≻ e
- restore environment from closure, add binding for argument x and function f
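A sketch of the E-machine with closures is shown below. Several details are assumptions made for illustration: type annotations are omitted, the constructor names are invented, only numbers, addition, and application are covered, and f is bound to the closure itself (rather than to the bare function term) so that looking it up yields a proper value.

```haskell
data Expr
  = Num Int
  | Var String
  | Plus Expr Expr
  | Apply Expr Expr
  | Fun String String Expr             -- Fun f.x.e, types omitted
  deriving (Eq, Show)

data Value
  = IntV Int
  | Closure Env String String Expr     -- 《η, Fun f.x.e》
  deriving (Eq, Show)

type Env = [(String, Value)]           -- [] plays the role of ●

data Frame
  = PlusL Expr | PlusR Value           -- (Plus ☐ e2), (Plus v ☐)
  | AppL Expr  | AppR Value            -- (Apply ☐ e2), (Apply v ☐)
  | Saved Env                          -- an environment on the stack
  deriving (Eq, Show)

data State = Eval [Frame] Env Expr | Ret [Frame] Env Value
  deriving (Eq, Show)

step :: State -> Maybe State
step (Eval s env (Num n))       = Just (Ret s env (IntV n))
-- A function value is bundled with the current environment.
step (Eval s env (Fun f x e))   = Just (Ret s env (Closure env f x e))
step (Eval s env (Var x))       = Ret s env <$> lookup x env
step (Eval s env (Plus e1 e2))  = Just (Eval (PlusL e2 : s) env e1)
step (Eval s env (Apply e1 e2)) = Just (Eval (AppL e2 : s) env e1)
step (Ret (PlusL e2 : s) env v) = Just (Eval (PlusR v : s) env e2)
step (Ret (PlusR (IntV a) : s) env (IntV b)) = Just (Ret s env (IntV (a + b)))
step (Ret (AppL e2 : s) env v)  = Just (Eval (AppR v : s) env e2)
-- Save the caller's environment; run the body in the closure's
-- environment extended with bindings for x and f.
step (Ret (AppR c@(Closure cenv f x e) : s) env v) =
  Just (Eval (Saved env : s) ((x, v) : (f, c) : cenv) e)
-- Returning past a saved environment restores it.
step (Ret (Saved env : s) _ v)  = Just (Ret s env v)
step _ = Nothing                       -- final or stuck state

run :: Expr -> Maybe Value
run e = loop (Eval [] [] e)
  where
    loop (Ret [] _ v) = Just v
    loop st = case step st of
      Nothing  -> Nothing
      Just st' -> loop st'

-- Example 2 from the slides: App (App (Fun(f.x.Fun(g.y.x))) 3) 4
example2 :: Expr
example2 =
  Apply (Apply (Fun "f" "x" (Fun "g" "y" (Var "x"))) (Num 3)) (Num 4)
```

Running `example2` yields the value 3: the closure for g carries the binding x=3 with it, so the lookup of x succeeds after the environment of f has been popped.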
Example 2
◦ | ● ≻ App (App (Fun(f.x.Fun(g.y.x))) 3) 4
↦E (App ☐ 4) ▷ ◦ | ● ≻ App (Fun(f.x.Fun(g.y.x))) 3
↦E (App ☐ 3) ▷ (App ☐ 4) ▷ ◦ | ● ≻ Fun(f.x.Fun(g.y.x))
↦E ...
↦E (App (Fun(f.x.Fun(g.y.x))) ☐) ▷ (App ☐ 4) ▷ ◦ | ● ≺ 3
↦E ● ▷ (App ☐ 4) ▷ ◦ | x=3, f=Fun(f.x.Fun...), ● ≻ Fun(g.y.x)
↦E ● ▷ (App ☐ 4) ▷ ◦ | x=3, f=Fun(f.x.Fun...), ● ≺ 《 x=3, f.., Fun(g.y.x)》
↦E (App ☐ 4) ▷ ◦ | ● ≺ 《 x=3, f..., Fun(g.y.x)》
↦E (App 《 x=3, f..., Fun(g.y.x)》 ☐) ▷ ◦ | ● ≻ 4
↦E (App 《 x=3, f..., Fun(g.y.x)》 ☐) ▷ ◦ | ● ≺ 4
↦E ● ▷ ◦ | y=4, g=Fun(g.y.x), x=3, f..., ● ≻ x
....
↦E ● ▷ ◦ | y=4, g=Fun(g.y.x), x=3, f..., ● ≺ 3
M-machine versus C-machine and E-machine
- Semantics of MinHs using two different approaches
- Control flow implicit: M-machine
- Control flow explicit: C-machine & E-machine
- Are these equivalent?
- proof is not obvious, as evaluation methods are very different
- how can we make the connection?
M-machine versus C-machine and E-machine
A single step in the M-machine
Plus(Plus(Plus(Num 3)(Num 2))(Num 4))(Num 6) ↦M Plus(Plus(Num 5)(Num 4))(Num 6)
corresponds to a sequence of steps in the C-machine:
◦ ≻ Plus(Plus(Plus(Num 3)(Num 2))(Num 4))(Num 6)
↦C (Plus ☐ (Num 6)) ▷ ◦ ≻ Plus(Plus(Num 3)(Num 2))(Num 4)
↦C (Plus ☐ (Num 4)) ▷ (Plus ☐ (Num 6)) ▷ ◦ ≻ Plus(Num 3)(Num 2)
↦C (Plus ☐ (Num 2)) ▷ (Plus ☐ (Num 4)) ▷ (Plus ☐ (Num 6)) ▷ ◦ ≻ (Num 3)
↦C (Plus ☐ (Num 2)) ▷ (Plus ☐ (Num 4)) ▷ (Plus ☐ (Num 6)) ▷ ◦ ≺ (Num 3)
↦C* (Plus ☐ (Num 4)) ▷ (Plus ☐ (Num 6)) ▷ ◦ ≺ (Num 5)
The stack in the C-machine corresponds to the context of the subexpression which is currently evaluated in the M-machine
M-machine versus C-machine and E-machine
- Idea: reconstruct the full expression from the stack s and the current
expression e:
(@) :: Stack -> Expr -> Expr
◦ @ e = e
((Plus ☐ e2) ▷ s) @ e1 = s @ (Plus e1 e2)
((Plus v1 ☐) ▷ s) @ e2 = s @ (Plus v1 e2)
((If ☐ e2 e3) ▷ s) @ e1 = s @ (If e1 e2 e3)
.....
- with the reconstructed expression, we can show using rule induction that:
- if s ≻ e ↦c* ◦≺ v then s@e ↦M* v , and
- if e ↦M* v, then ◦ ≻ e ↦c* ◦≺ v
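The reconstruction function can be sketched in Haskell as follows. Since `@` is reserved syntax in Haskell, the sketch uses the name `plug` instead; that name, the constructor names, and the cut-down expression type are assumptions made for illustration.

```haskell
data Expr
  = Num Int
  | Con Bool
  | Plus Expr Expr
  | If Expr Expr Expr
  deriving (Eq, Show)

data Frame
  = PlusL Expr        -- (Plus ☐ e2)
  | PlusR Expr        -- (Plus v1 ☐)
  | IfC Expr Expr     -- (If ☐ e2 e3)
  deriving (Eq, Show)

type Stack = [Frame]  -- head is the top-most frame, [] is ◦

-- plug s e corresponds to s @ e: fill the hole of the top-most frame
-- with e, then continue with the rest of the stack.
plug :: Stack -> Expr -> Expr
plug []              e  = e
plug (PlusL e2 : s)  e1 = plug s (Plus e1 e2)
plug (PlusR v1 : s)  e2 = plug s (Plus v1 e2)
plug (IfC e2 e3 : s) e1 = plug s (If e1 e2 e3)

-- A C-machine state from the earlier example:
-- (Plus ☐ (Num 2)) ▷ (Plus ☐ (Num 4)) ▷ (Plus ☐ (Num 6)) ▷ ◦ ≻ (Num 3)
exampleStack :: Stack
exampleStack = [PlusL (Num 2), PlusL (Num 4), PlusL (Num 6)]
```

Plugging `Num 3` into `exampleStack` gives back the original M-machine expression Plus(Plus(Plus(Num 3)(Num 2))(Num 4))(Num 6).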
Proof
- for all stacks s, all expressions e: if s ≻ e ↦c* ◦≺ v then s@e ↦M* v
- Induction over the length k of the sequence ↦c^k:
- Base case: k = 1
- if s ≻ e ↦c^1 ◦ ≺ v, then e has to be a value (Num or Fun), and s the empty stack ◦. If e = Num n then
- ◦ ≻ (Num n) ↦c^1 ◦ ≺ (Num n) and ◦@(Num n) ↦M* (Num n) (same for (Fun ...))
- Inductive step: k = n+1
- with the Induction Hypothesis:
- for all stacks s and expressions e,
s ≻ e ↦c^n ◦ ≺ v implies s@e ↦M* v
- To prove this, we have to consider every possible case in the evaluation rules of the C-machine
- Case 1: for e = (Plus e1 e2), show that s ≻ e ↦c^(n+1) ◦ ≺ v