SLIDE 1 15-150 Fall 2020 Lecture 6
Stephen Brookes
Most of the time I don't have much fun. The rest of the time I don't have any fun at all.
SLIDE 2 today
- Sorting an integer list
  to guide program design
- “Helper functions should help!”
SML features
- datatype definitions
- boolean connectives
- case expressions
- <> means ≠
SLIDE 3 datatypes
- ML has datatype declarations
- Allow us to introduce new types,
  with constructors for building values
    datatype order = LESS | EQUAL | GREATER
    datatype 'a option = NONE | SOME of 'a
    NONE : int option
    SOME 42 : int option
- 'a list is a built-in datatype with constructors nil and ::
SLIDE 6 comparing ints
    datatype order = LESS | EQUAL | GREATER
(a datatype definition introducing the type order, with values LESS, EQUAL, GREATER)

compare : int * int -> order
    compare(x,y) = LESS     if x<y
    compare(x,y) = EQUAL    if x=y
    compare(x,y) = GREATER  if x>y

    fun compare(x:int, y:int):order =
        if x<y then LESS else
        if y<x then GREATER else EQUAL
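A minimal sketch of how compare might be used with a case expression. The helper name describe is hypothetical and not from the lecture. Note that the SML Basis Library already provides an order type with the same constructors; redeclaring it as on the slide shadows the built-in one.

```sml
(* Redeclared as on the slide; this shadows the Basis Library's order type. *)
datatype order = LESS | EQUAL | GREATER

fun compare (x : int, y : int) : order =
    if x < y then LESS
    else if y < x then GREATER
    else EQUAL

(* describe is a hypothetical helper used only for illustration. *)
fun describe (x : int, y : int) : string =
    case compare (x, y) of
        LESS    => "less"
      | EQUAL   => "equal"
      | GREATER => "greater"

val r1 = describe (1, 2)
val r2 = describe (2, 2)
val r3 = describe (3, 2)
```

The case expression covers all three constructors, so the match is exhaustive.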
SLIDE 7 properties
- ≤ is a linear ordering
    If a ≤ b and b ≤ a then a = b    (antisymmetric)
    If a ≤ b and b ≤ c then a ≤ c    (transitive)
    Either a ≤ b or b ≤ a            (connected)
- < is defined by
    a < b if and only if (a ≤ b and a ≠ b)
  and satisfies
    a < b or b < a or a = b          (trichotomy)
SLIDE 10 sorted
A list is <-sorted (or just sorted) if and only if each item in the list is ≤ all later items.

sorted : int list -> bool
    fun sorted [ ] = true
      | sorted [x] = true
      | sorted (x::y::L) = (x <= y) andalso sorted(y::L)

For all L : int list,
    sorted(L) = true   if L is sorted
              = false  otherwise
(Prove this, by induction on list length)
(Note the relevance of transitivity etc.)
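A quick sketch of trying sorted on a few inputs. The test values are made up for illustration; the function is the one from the slide.

```sml
(* sorted : int list -> bool, as on the slide *)
fun sorted [] = true
  | sorted [x] = true
  | sorted (x::y::L) = (x <= y) andalso sorted (y::L)

val t1 = sorted []            (* the empty list is sorted *)
val t2 = sorted [1, 2, 2, 3]  (* duplicates are allowed *)
val t3 = sorted [2, 1]        (* an out-of-order pair *)
```

Only adjacent items are compared; transitivity of ≤ is what makes that enough.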
SLIDE 12 specs and code
- We use sorted only in specifications.
- Our sorting functions won’t use it.
- But you could use it for testing...
For every integer list L there is a unique sorted permutation of L
SLIDE 14 insertion sort
Insertion sort is a simple sorting algorithm that builds the sorted list recursively, one item at a time.
- If the list is empty, do nothing.
- Otherwise, each recursive call inserts an item from
the input list into its correct position in the sorted list so far.
(Wikipedia doesn’t give good specs!)
SLIDE 17 insertion sort
- If the list is empty, do nothing.
- Otherwise, recursively sort the tail, then
insert the head item into its correct position in the (already sorted) tail.
… We need a helper function
    ins : int * int list -> int list
    REQUIRES …
    ENSURES …
SLIDE 23 insertion
ins : int * int list -> int list
REQUIRES L is a sorted list
ENSURES ins(x, L) = a sorted permutation of x::L
(inserts x into its correct position in L)

    fun ins (x, [ ]) = [x]
      | ins (x, y::R) = if x > y then y :: ins(x, R)
                                 else x :: (y :: R)

per·mu·ta·tion (noun): a way in which a list of things can be arranged.
"his thoughts raced ahead to fifty different permutations of what he must do"
(Oxford Dictionaries)
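A minimal runnable sketch of ins, with two sample calls chosen for illustration. The REQUIRES clause matters: the second argument must already be sorted.

```sml
(* ins : int * int list -> int list
   REQUIRES L is sorted; ENSURES the result is a sorted permutation of x::L *)
fun ins (x : int, []) = [x]
  | ins (x, y::R) = if x > y then y :: ins (x, R)
                    else x :: (y :: R)

val a = ins (3, [1, 2, 4, 5])
val b = ins (0, [])
```

If the input list is not sorted, ins still returns a permutation of x::L, but nothing guarantees the result is sorted.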
SLIDE 24
ins equations
    ins (x, [ ]) = [x]
    ins (x, y::R) = if x > y then y::ins(x, R) else x::(y::R)

For all values x, y : int and R : int list,
    ins (x, y::R) = y::ins(x, R)   if x > y
                  = x::(y::R)      otherwise
SLIDE 25 correctness
For all sorted integer lists L, all values x:int, ins(x, L) = a sorted permutation of x::L.
Proof: By induction on length of L.
- Base case: When L has length 0, L is [ ].
  [ ] is sorted, and ins(x, [ ]) = [x] is a sorted perm of x::[ ].
- Inductive case: Let k>0 and L be sorted, of length k.
  IH: For all sorted lists A of length < k, all values x, ins(x, A) = a sorted perm of x::A.
  Let y, R be the head, tail of L: so L = y::R.
  R is sorted, of length < k, and y ≤ all of R.
  Need to show: ins(x, y::R) = a sorted perm of x::(y::R)
SLIDE 26 inductive case (some more details)
    ins (x, y::R) = y::ins(x, R)   if x > y
                  = x::(y::R)      otherwise, i.e. if x ≤ y
R is sorted, length < k, and y ≤ all of R.
- By IH, ins(x, R) = a sorted perm of x::R
- If x>y we have ins(x, y::R) = y::ins(x,R)
    This list is sorted because...
    This list is a perm of x::(y::R) because...
- Otherwise, x≤y and ins(x, y::R) = x::(y::R)
    This list is sorted because...
    This list is a perm of x::(y::R) because...
- In all cases, ins(x, y::R) = a sorted perm of x::(y::R)
SLIDE 27 comments
- Fill in the missing details in that proof sketch
- Notice where you use basic properties of ≤
- these properties are crucial
- often used implicitly, without mention
- that’s OK, except that you need to realize it
Now that we have ins, let’s define isort…
SLIDE 31 isort
isort : int list -> int list
ENSURES isort(L) = a sorted perm of L

    fun isort [ ] = [ ]
      | isort (x::R) = ins (x, isort R)

“isort (x::R) inserts x into its correct position in the sorted tail, isort R”
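Putting ins and isort together gives a complete program. The following is a self-contained sketch; the sample inputs are chosen for illustration.

```sml
(* ins inserts x into an already sorted list. *)
fun ins (x : int, []) = [x]
  | ins (x, y::R) = if x > y then y :: ins (x, R)
                    else x :: (y :: R)

(* isort sorts the tail recursively, then inserts the head. *)
fun isort [] = []
  | isort (x::R) = ins (x, isort R)

val s1 = isort [4, 2, 1, 3]
val s2 = isort [3, 1, 2, 1]
```

Each call to ins receives isort R, which is sorted, so the REQUIRES clause of ins is met at every step.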
SLIDE 32 correctness
For all values L: int list, isort L = a sorted permutation of L.
Proof: By structural induction on L.
- Base case: for L = [ ].
  Show that isort [ ] = a sorted perm of [ ].
- Inductive case: for L = y::R.
  IH: isort R = a sorted perm of R.
  Show: isort(y::R) = a sorted perm of y::R.
    isort (y::R) = ins (y, isort R)
    isort R is a sorted perm of R
  By the proven ins spec, it follows that ins (y, isort R) = a sorted perm of y::R
SLIDE 33 comments
- The proof was “by structural induction on L”
- Every list value L is either [ ] (nil)
  or y::R, where R is a “smaller” list value
- We could just as well have said
“by induction on length of L”
- [ ] has length 0
- 0 ≤ length R < length(y::R)
isort (y::R) calls isort R
SLIDE 34
perm facts
In the correctness proof we used some obvious facts about permutations.
- A perm of a perm of L is a perm of L
- y::(a perm of R) is a perm of (y::R)
corollaries
- isort is a total function from int list to int list
- When e evaluates to L, isort e evaluates to the sorted version of L
SLIDE 39
a variation
    fun isort [ ] = [ ]
      | isort (x::R) = ins (x, isort R)

    fun isort' [ ] = [ ]
      | isort' [x] = [x]          (is this clause redundant?)
      | isort' (x::R) = ins (x, isort' R)
SLIDE 40 If in doubt, test, then prove
variation
isort' : int list -> int list
    fun isort' [ ] = [ ]
      | isort' [x] = [x]
      | isort' (x::R) = ins (x, isort' R)
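One way to follow the "test" half of the advice is to run both versions on a handful of inputs and compare. This is only a sketch: the list named tests is an arbitrary sample, and agreement on a sample is evidence, not a proof.

```sml
fun ins (x : int, []) = [x]
  | ins (x, y::R) = if x > y then y :: ins (x, R)
                    else x :: (y :: R)

fun isort [] = []
  | isort (x::R) = ins (x, isort R)

(* The variation with an extra singleton clause. *)
fun isort' [] = []
  | isort' [x] = [x]
  | isort' (x::R) = ins (x, isort' R)

(* tests is a hypothetical sample of inputs, chosen for illustration. *)
val tests = [[], [7], [2, 1], [3, 1, 2, 1], [5, 4, 3, 2, 1]]

(* true when both functions return equal results on every sample input *)
val agree = List.all (fn L => isort L = isort' L) tests
```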
SLIDE 44 equivalent
- isort and isort’ are extensionally equivalent:
  For all L : int list, isort L = isort’ L.
- Proof? See lecture notes…
  OR: Re-do the isort proof for isort’ (easy)
- Hence they satisfy the same spec, so
  For all L : int list, isort L = isort’ L = the sorted perm of L
- No need for extra clause but it doesn’t do any harm
SLIDE 47 work
- Let Wins(n) be the work for ins(x, L)
  when x, L are values and L has length n
- Let Wisort(n) be the work for isort(L)
  when L is a list of length n
- Wins(n) is O(n)
- Wisort(0) = 1
  Wisort(n) = 1 + Wins(n-1) + Wisort(n-1) for n > 0
SLIDE 51 work
- Let Wins(n) be the work for ins(x, L)
  when x, L are values and L has length n
- Let Wisort(n) be the work for isort(L)
  when L is a list of length n
- Wins(n) is O(n)
- Wisort(0) = 1
  Wisort(n) = O(n) + Wisort(n-1) for n > 0
- Wisort(n) is O(n^2)
THIS IS SLOW! WE CAN DO BETTER!
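Unrolling the recurrence shows where the quadratic bound comes from. This sketch assumes a constant c, introduced here only for illustration, such that the O(n) term is at most c n for all n ≥ 1.

```latex
\begin{align*}
W_{isort}(n) &\le c\,n + W_{isort}(n-1) \\
             &\le c\,n + c\,(n-1) + W_{isort}(n-2) \\
             &\le c\,\bigl(n + (n-1) + \cdots + 1\bigr) + W_{isort}(0) \\
             &= c\,\frac{n(n+1)}{2} + 1
\end{align*}
```

The sum of the first n integers is quadratic in n, so under this assumption Wisort(n) is O(n^2).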
SLIDE 58 mergesort
Conceptually, a merge sort works as follows:
- 1. Divide the unsorted list into n sublists,
each containing 1 element.
- 2. Repeatedly Merge sublists to produce new
sublists until there is only 1 sublist left.
Wrong! Wrong! Wrong!
- Doesn’t say “recursive”...
- … what’s n?
- … repeatedly????
- … and then?
- What’s the output? How does it relate to the input?
SLIDE 59 mergesort
A recursive divide-and-conquer algorithm
- If list has length 0 or 1, do nothing.
- Otherwise,
    split the list into two shorter lists,
    sort these two lists,
    merge the (sorted) results
SLIDE 63 implementation
- First, let’s design helper functions
    split : int list -> int list * int list
    merge : int list * int list -> int list
  (what specs should we use?)
    split splits a list into two sublists
    merge combines two sorted lists into one
  (a bit imprecise, but we’ll fix that…)
SLIDE 65 split
split : int list -> int list * int list
ENSURES split(L) = a pair of lists (A, B) such that length(A) and length(B) differ by at most 1, and A@B is a permutation of L.
(write "length(A) and length(B) differ by at most 1" as length(A)≈length(B))
SLIDE 72 split
split : int list -> int list * int list
ENSURES split(L) = a pair of lists (A, B) such that length(A) and length(B) differ by at most 1, and A@B is a permutation of L.

    fun split [ ] = ([ ], [ ])
      | split [x] = ([x], [ ])
      | split (x::y::L) = let val (A, B) = split L
                          in (x::A, y::B) end

(note the use of list patterns and pair patterns)
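A runnable sketch of split with two sample calls. The type annotations are added here so the inferred type matches the int list spec; without them SML would infer a more general polymorphic type.

```sml
(* split deals the items alternately into two lists. *)
fun split ([] : int list) : int list * int list = ([], [])
  | split [x] = ([x], [])
  | split (x::y::L) =
      let val (A, B) = split L
      in (x::A, y::B) end

val s1 = split [4, 2, 1, 3]
val s2 = split [4, 2, 1]
```

For an odd-length input the extra item lands in the first component, so the two lengths differ by exactly 1.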
SLIDE 78
split equations
For all values x, y : int and L : int list,
    split [ ] = ([ ], [ ])
    split [x] = ([x], [ ])
    split (x::y::L) = (x::A, y::B), where (A, B) = split L
Can be used to calculate split R for any value R : int list
    split [4,2,1,3] = ([4,1], [2,3])
    split [4,2,1] = ([4,1], [2])
SLIDE 81
For all L:int list, split(L) = a pair of lists (A, B) such that length(A) ≈ length(B) and A@B is a permutation of L.
- Proof: by (strong) induction on length of L
- Base cases: L = [ ], [x]    EASY
    split [ ] = ([ ], [ ])
    split [x] = ([x], [ ])
- Inductive case: L=x::(y::R). R is shorter than L.
  Assume Induction Hypothesis: split(R) = a pair (A’, B’) such that length(A’)≈length(B’) and A’@B’ is a perm of R.
  Show that split(x::y::R) = a pair (A, B) such that length(A)≈length(B) and A@B is a perm of x::(y::R).
    split (x::y::R) = (x::A’, y::B’)
    length(x::A’) ≈ length(y::B’)
    (x::A’)@(y::B’) is a perm of x::(y::R)
SLIDE 82 comments
- We used strong induction on length of L
Reason: split(x::y::R) calls split(R) and length of R is two less than length of x::y::R.
- If length L = n > 1 and split(L) = (A, B),
A and B are shorter than L
If n is even > 1, length A = length B = n div 2 < n. If n is odd > 1, length A = (n div 2) + 1 < n, length B = n div 2 < n.
SLIDE 89 merge
merge : int list * int list -> int list
REQUIRES A and B are sorted lists
ENSURES merge(A, B) = a sorted perm of A@B

    fun merge (A, [ ]) = A
      | merge ([ ], B) = B
      | merge (x::L, y::R) = case compare(x, y) of
                                LESS => x :: merge(L, y::R)
                              | EQUAL => x :: y :: merge(L, R)
                              | GREATER => y :: merge(x::L, R)

We need a 3-way branch, so cased comparison is better than nested if-then-else
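A runnable sketch of merge. As an assumption for brevity, it calls Int.compare from the SML Basis Library (which returns the built-in order type) instead of the compare function defined earlier in the lecture; on integers the two behave the same.

```sml
(* merge : int list * int list -> int list
   REQUIRES both inputs are sorted *)
fun merge (A : int list, []) = A
  | merge ([], B) = B
  | merge (x::L, y::R) =
      case Int.compare (x, y) of
          LESS    => x :: merge (L, y::R)
        | EQUAL   => x :: y :: merge (L, R)
        | GREATER => y :: merge (x::L, R)

val m1 = merge ([1, 4], [2, 3])
val m2 = merge ([1, 2], [2, 5])
```

The case expression sits in the last clause of the fun declaration, so it needs no surrounding parentheses.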
SLIDE 94 merge equations
For all values x, y : int and A, B : int list,
    merge (A, [ ]) = A
    merge ([ ], B) = B
    merge (x::A, y::B) = x :: merge(A, y::B)        if x<y
                       = x :: y :: merge(A, B)      if x=y
                       = y :: merge(x::A, B)        if x>y
Can be used to evaluate merge(L, R) for all values L, R : int list
    merge([1,4], [2,3]) = [1,2,3,4]
SLIDE 96 correctness?
How do we prove this function satisfies the spec?
    fun merge (A, [ ]) = A
      | merge ([ ], B) = B
      | merge (x::L, y::R) = case compare(x, y) of
                                LESS => x :: merge(L, y::R)
                              | EQUAL => x :: y :: merge(L, R)
                              | GREATER => y :: merge(x::L, R)
- Induction, but on what?
- in base cases, at least one list is empty
- in recursive calls, one or both is shorter
The product of list lengths!
SLIDE 97 correctness
For all sorted lists A and B, merge(A, B) = a sorted permutation of A@B.
Proof: strong induction on product of lengths of A, B.
- Base cases: (A, [ ]) and ([ ], B).
  (i) Show: if A is sorted, merge(A, [ ]) = a sorted perm of A@[ ].
  (ii) Show: if B is sorted, merge([ ], B) = a sorted perm of [ ]@B.
- Inductive case: (x::A, y::B)
  Assume IH: for all pairs of sorted lists (A’, B’) with smaller product of lengths than (x::A, y::B), merge(A’, B’) = a sorted perm of A’@B’.
  Show: if x::A and y::B are sorted, then merge(x::A, y::B) = a sorted perm of (x::A)@(y::B).
Exercise: fill in the details!
SLIDE 98 msort
- We proved that split and merge are correct
- Now let’s use them to define msort
    merge : int list * int list -> int list
    REQUIRES A and B are sorted lists
    ENSURES merge(A, B) = a sorted perm of A@B

    split : int list -> int list * int list
    ENSURES split L = a pair of lists (A, B) such that length(A) ≈ length(B) and A@B is a perm of L

    msort : int list -> int list
    ENSURES msort L = a sorted perm of L
SLIDE 107 msort
msort : int list -> int list
ENSURES msort(L) = a sorted perm of L

    fun msort [ ] = [ ]
      | msort [x] = [x]
      | msort L = let val (A, B) = split L
                  in merge (msort A, msort B) end

msort [4,2,1,3] ⟹* [1,2,3,4]

msort [4,2,1,3] = merge(msort [4,1], msort [2,3])
                = merge([1,4], [2,3])
                = [1,2,3,4]
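A self-contained sketch that assembles split, merge, and msort into one program. As an assumption for brevity, merge uses Int.compare from the SML Basis Library in place of the lecture's compare.

```sml
fun split ([] : int list) : int list * int list = ([], [])
  | split [x] = ([x], [])
  | split (x::y::L) =
      let val (A, B) = split L
      in (x::A, y::B) end

fun merge (A : int list, []) = A
  | merge ([], B) = B
  | merge (x::L, y::R) =
      case Int.compare (x, y) of
          LESS    => x :: merge (L, y::R)
        | EQUAL   => x :: y :: merge (L, R)
        | GREATER => y :: merge (x::L, R)

(* The singleton clause ensures the last clause only sees lists of length >= 2. *)
fun msort [] = []
  | msort [x] = [x]
  | msort L =
      let val (A, B) = split L
      in merge (msort A, msort B) end

val r1 = msort [4, 2, 1, 3]
val r2 = msort [38, 27, 43, 3, 9, 82, 10]
```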
SLIDE 108
[Figure: msort on [38, 27, 43, 3, 9, 82, 10], drawn as a tree of split steps followed by merge steps]
split
    [38, 27, 43, 3, 9, 82, 10]
    [38, 43, 9, 10]   [27, 3, 82]
    [38, 9]   [43, 10]   [27, 82]   [3]
    [38]   [9]   [43]   [10]   [27]   [82]
merge
    [9, 38]   [10, 43]   [27, 82]
    [9, 10, 38, 43]   [3, 27, 82]
    [3, 9, 10, 27, 38, 43, 82]
SLIDE 109
msort equations
For all values x : int and L : int list,
    msort [ ] = [ ]
    msort [x] = [x]
    msort L = merge(msort A, msort B) where (A, B) = split L,   if length L ≥ 2
(where did this side condition come from?)
SLIDE 110 correctness
For all L:int list, msort(L) = a sorted perm of L.
Proof: by strong induction on length of L
- Base cases:
  (i) Show msort [ ] = a sorted perm of [ ]
  (ii) Show msort [x] = a sorted perm of [x]
- Inductive case: suppose length(L)>1.
  Inductive hypothesis: for all shorter lists R, msort R = a sorted perm of R.
  Show that msort L = a sorted perm of L.
A crucial assumption needed in proof details: length L > 1
SLIDE 111 comments
“by (strong) induction on the length of L”
- msort L calls msort A and msort B,
where A and B have shorter length
- It would not have been appropriate to say
“by structural induction on L”
- msort (x::R) doesn’t call msort R
SLIDE 112
work
Wsplit(n) = work of split(L) when length(L)=n
Wmerge(n) = work of merge(A, B) when length(A) + length(B) = n
- Wsplit(n) is O(n)
- Wmerge(n) is O(n)
SLIDE 113
work
Wmsort(n) = work of msort(L) when length(L)=n
    Wmsort(0) = 1
    Wmsort(1) = 1
    Wmsort(n) = Wsplit(n) + 2Wmsort(n div 2) + Wmerge(n) + 1
              = O(n) + 2Wmsort(n div 2)                        for n>1
Wmsort(n) is O(n log n)
SLIDE 114 Deriving the work for msort
Simplify recurrence to:
    W(n) = n + 2 W(n div 2)
This W has same asymptotic behavior as Wmsort
    W(n) = n + 2 W(n div 2)
         = n + 2 (n div 2 + 2 W(n div 4))
         = n + 2(n/2) + 4 W(n/4)
         = n + 2(n/2) + 4(n/4) + 8 W(n/8)
         = n + 2(n/2) + 4(n/4) + … + 2^k (n/2^k)    where k = log_2 n
         = n + n + n + … + n    (k terms)
         = n log_2 n
So Wmsort(n) is O(n log n)
SLIDE 115 summary
msort L = isort L = sorted perm of L
- msort is (more) efficient (than isort)
    Wmsort(n) is O(n log n)
    Wisort(n) is O(n^2)
SLIDE 117 variations on a theme
- Let’s consider some alternative ways
to write this function
- Some will be correct, some not
    fun msort [ ] = [ ]
      | msort [x] = [x]
      | msort L = let val (A, B) = split L
                  in merge (msort A, msort B) end
by msort
SLIDE 118 msort
an alternative version
    fun msort [ ] = [ ]
      | msort [x] = [x]
      | msort L = let
                    val (A, B) = split L
                    val A' = msort A
                    val B' = msort B
                  in
                    merge (A', B')
                  end
✓ correct ✓ work
SLIDE 119 msort
another alternative version
    fun msort L = if length L < 2 then L
                  else let val (A, B) = split L
                       in merge (msort A, msort B) end
✓ correct ✓ work
SLIDE 120 msort
    fun msort [ ] = [ ]
      | msort [x] = [x]          (is this clause redundant?)
      | msort L = let val (A, B) = split L
                  in merge (msort A, msort B) end
SLIDE 124 after deletion
    fun msort [ ] = [ ]
      | msort L = let val (A, B) = split L
                  in merge (msort A, msort B) end
loops forever
SLIDE 128 the problem
- split [x] = ([x], [ ])
- msort [x] ⟹* (fn ... => ...) (msort [x], msort [ ])
leads to infinite computation
Q: What happens if you try to prove msort correct? A: The proof breaks down!
Cannot assume length L > 1 in inductive step
SLIDE 131
- The proof for msort relied only on the
specifications of split and merge
- Can replace split by any other function
with the same specification, and the same proof would work!
the new version
also sorts lists!
SLIDE 132 example
    fun split' [ ] = ([ ], [ ])
      | split' [x] = ([ ], [x])
      | split' (x::y::L) = let val (A, B) = split' L
                           in (x::A, y::B) end

    fun msort' [ ] = [ ]
      | msort' [x] = [x]
      | msort' L = let val (A, B) = split' L
                   in merge(msort' A, msort' B) end
SLIDE 133 example
- split and split’ are not extensionally equivalent
- But they both satisfy the specification
used in the correctness proof for msort and msort’
- ... so msort and msort’ are both correct
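The point can be checked on examples. In this sketch, msortWith is a hypothetical helper (not from the lecture) that builds a mergesort from any splitting function, so msort and msort' differ only in which split they are given. As a further assumption, merge uses Int.compare from the SML Basis Library in place of the lecture's compare.

```sml
fun split ([] : int list) : int list * int list = ([], [])
  | split [x] = ([x], [])
  | split (x::y::L) = let val (A, B) = split L in (x::A, y::B) end

fun split' ([] : int list) : int list * int list = ([], [])
  | split' [x] = ([], [x])
  | split' (x::y::L) = let val (A, B) = split' L in (x::A, y::B) end

fun merge (A : int list, []) = A
  | merge ([], B) = B
  | merge (x::L, y::R) =
      case Int.compare (x, y) of
          LESS    => x :: merge (L, y::R)
        | EQUAL   => x :: y :: merge (L, R)
        | GREATER => y :: merge (x::L, R)

(* msortWith is hypothetical: mergesort parameterized by the split function sp. *)
fun msortWith (sp : int list -> int list * int list) : int list -> int list =
    let
      fun go [] = []
        | go [x] = [x]
        | go L = let val (A, B) = sp L in merge (go A, go B) end
    in
      go
    end

val msort  = msortWith split
val msort' = msortWith split'

val differ = (split [5] <> split' [5])
val same   = (msort [4, 2, 1, 3] = msort' [4, 2, 1, 3])
```

The two splitters disagree on singleton lists, yet both sorts return the same result, as the spec-based argument predicts.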
SLIDE 137 clause order
    fun merge (A, [ ]) = A
      | merge ([ ], B) = B
      | merge (x::A, y::B) = …

    fun merge (x::A, y::B) = …
      | merge (A, [ ]) = A
      | merge ([ ], B) = B

ML tries patterns in the order written
Does clause order matter here? NO
- Patterns are exhaustive
- Overlap of first two clauses is harmless
  Each yields merge([ ], [ ]) = [ ]
SLIDE 138 scope
- The helper functions really helped
- They were also useful for testing
- But we only really cared about isort, msort
Standard ML of New Jersey
    - fun split …
    - fun merge …
    - fun msort … ;
    val ins = fn : int * int list -> int list
    val isort = fn : int list -> int list
    val split = fn : int list -> int list * int list
    val merge = fn : int list * int list -> int list
    val msort = fn : int list -> int list
SLIDE 139 scope
- There may be no good reason to make the
helper functions visible to the entire world
- We can easily make them “private”
Standard ML of New Jersey
    - local
        fun ins …
      in
        fun isort …
      end;
    val isort = fn : int list -> int list

    - local
        fun split …
        fun merge …
      in
        fun msort …
      end;
    val msort = fn : int list -> int list
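A complete, runnable instance of the local pattern for insertion sort, written out as a sketch. After this declaration only isort is bound at top level; a later reference to ins would be rejected by the compiler as an unbound variable.

```sml
local
  (* ins is visible only between local and end. *)
  fun ins (x : int, []) = [x]
    | ins (x, y::R) = if x > y then y :: ins (x, R)
                      else x :: (y :: R)
in
  fun isort [] = []
    | isort (x::R) = ins (x, isort R)
end

val result = isort [3, 1, 2]
```

The same shape works for msort, with split and merge declared between local and in.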
SLIDE 141 conclusion
- We implemented two well known sorting
algorithms for integer lists
- insertion sort
- mergesort
- There are many others…
- quicksort
- bubble sort
- selection sort
SLIDE 142 coming soon
- Generalizing from int to an ordered type
- Generalizing from lists to trees of data
Advantages of functional programming