15-150 Fall 2020 Lecture 8 Stephen Brookes trees vs. lists

SLIDE 1

15-150 Fall 2020

Lecture 8 Stephen Brookes

SLIDE 2

trees vs. lists

  • Representing a collection as a tree may enable a parallel speed-up
  • Using a sorted tree may enable faster code, e.g. for searching
  • With lists, even sorted lists, there’s less potential for parallelism
  • But badly balanced trees are no better than lists, and balance may be hard to achieve!

SLIDE 4

the plan

  • First, a quick review
  • We’ll discuss how to search in lists and trees, under various assumptions (sorted, balanced)
  • Then we’ll implement an algorithm for sorting a tree, prove its correctness, and analyze its work and span

SLIDE 5

but first…

  • Someone asked about naming conventions
  • I prefer T for trees, t for types (and tea to drink)
  • I often use capitalized names for datatype constructors like Node, Empty, SOME, NONE
  • Not required by ML, but you must be consistent

datatype 'a tree = Empty | Node of 'a tree * 'a * 'a tree;

fun size empty = 0
  | size (Node(A, _, B)) = 1 + size A + size B

What happens?
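
One way to see what happens is to look at the first clause on its own. In this minimal sketch (the names sizeWrong, t and n are made up for illustration), the lowercase `empty` is an ordinary variable pattern rather than the constructor Empty, so it matches every tree. A compiler will typically flag the Node clause of the slide's version as redundant for the same reason.

```sml
datatype 'a tree = Empty | Node of 'a tree * 'a * 'a tree

(* `empty` (lowercase) is a variable pattern, not the constructor Empty,
   so this single clause already matches every argument. *)
fun sizeWrong empty = 0

val t = Node (Node (Empty, 1, Empty), 2, Empty)
val n = sizeWrong t   (* 0, although t has two nodes *)
```

This is why consistency in capitalization matters: a misspelled constructor silently becomes a catch-all variable.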

SLIDE 6

balanced trees

We can build a balanced tree from a list…
… and (if we do it right) get the same list back by in-order traversal

[Figure: list2tree turns the list [4,1,2] into a balanced tree containing 1, 4, 2; inord turns that tree back into [4,1,2]]

SLIDE 7

recall

datatype 'a tree = Empty | Node of 'a tree * 'a * 'a tree

fun size Empty = 0
  | size (Node(T1, x, T2)) = 1 + (size T1) + (size T2)

fun depth Empty = 0
  | depth (Node(T1, x, T2)) = 1 + Int.max(depth T1, depth T2)

  • size T = number of nodes
  • depth T = length of longest path from root to leaf
  • A full binary tree of depth d has 2^d - 1 nodes
  • depth T is O(log (size T)) for a balanced tree, depth T is O(size T) otherwise(!)

SLIDE 8

recall

fun inord Empty = [ ]
  | inord (Node(T1, x, T2)) = (inord T1) @ x :: (inord T2)

fun list2tree [ ] = Empty
  | list2tree [x] = Node(Empty, x, Empty)
  | list2tree L =
      let
        val n = length L
        val (A, x::B) = takedrop (n div 2, L)
      in
        Node(list2tree A, x, list2tree B)
      end

  • inord T = inorder traversal list of T
  • length(inord T) = size T
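
The slides use takedrop without showing its definition. The sketch below assumes that takedrop (n, L) returns the first n items of L paired with the remaining items; that reading is an assumption, not the course's own code. With it, the round trip on [4,1,2] can be run directly.

```sml
datatype 'a tree = Empty | Node of 'a tree * 'a * 'a tree

(* Assumed helper: takedrop (n, L) = (first n items of L, remaining items). *)
fun takedrop (0, L) = ([ ], L)
  | takedrop (_, [ ]) = ([ ], [ ])
  | takedrop (n, x::R) =
      let val (A, B) = takedrop (n - 1, R) in (x::A, B) end

fun inord Empty = [ ]
  | inord (Node(T1, x, T2)) = (inord T1) @ x :: (inord T2)

fun list2tree [ ] = Empty
  | list2tree [x] = Node(Empty, x, Empty)
  | list2tree L =
      let
        val n = length L
        (* The second component is non-empty because n div 2 < n here,
           so the pattern x::B matches (the compiler may still warn). *)
        val (A, x::B) = takedrop (n div 2, L)
      in
        Node(list2tree A, x, list2tree B)
      end

val T = list2tree [4,1,2]   (* root 1, left child 4, right child 2 *)
val L' = inord T            (* the original list again *)
```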

SLIDE 9

question

  • Would it have been OK to omit the [x] clause?

fun list2tree [ ] = Empty
  | list2tree L =
      let
        val n = length L
        val (A, x::B) = takedrop (n div 2, L)
      in
        Node(list2tree A, x, list2tree B)
      end

list2tree [4] = ???

SLIDE 10

answer

  • Would it have been OK to omit the [x] clause?

fun list2tree [ ] = Empty
  | list2tree L =
      let
        val n = length L
        val (A, x::B) = takedrop (n div 2, L)
      in
        Node(list2tree A, x, list2tree B)
      end

YES
Correctness proof still works!
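
A quick check of the singleton case, as a sketch. takedrop is not shown in the slides, so the definition here is an assumed one (first n items paired with the rest).

```sml
datatype 'a tree = Empty | Node of 'a tree * 'a * 'a tree

(* Assumed helper (not shown in the slides): split L after its first n items. *)
fun takedrop (0, L) = ([ ], L)
  | takedrop (_, [ ]) = ([ ], [ ])
  | takedrop (n, x::R) =
      let val (A, B) = takedrop (n - 1, R) in (x::A, B) end

(* list2tree without the [x] clause *)
fun list2tree [ ] = Empty
  | list2tree L =
      let
        val n = length L
        val (A, x::B) = takedrop (n div 2, L)
      in
        Node(list2tree A, x, list2tree B)
      end

(* For [4]: n = 1 and n div 2 = 0, so takedrop gives ([ ], 4::[ ])
   and both recursive calls are on the empty list. *)
val single = list2tree [4]
```

The general clause produces the same tree as the omitted [x] clause would have.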

SLIDE 11

precision (or lack thereof)

  • There may be MANY balanced trees with the same inorder traversal list
  • list2tree L builds a balanced tree with inorder traversal list L
  • We don’t need to (or care to) say which one!

Go back and see if/where/why we used imprecise specs before!

SLIDE 15

balanced

  • Empty is balanced
  • Node(A, x, B) is balanced iff |size(A) - size(B)| ≤ 1 and A, B are balanced

A structurally inductive definition

  • If T is balanced, every node of T is balanced
  • If T is balanced, its children each have about half the data

(by definition + an easy structural induction)
(how could you prove this?)
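
The inductive definition can be transcribed into a checker almost word for word. This is a sketch for experimenting with the definition; the function name balanced and the sample trees are illustrative, not taken from the lecture code.

```sml
datatype 'a tree = Empty | Node of 'a tree * 'a * 'a tree

fun size Empty = 0
  | size (Node(T1, _, T2)) = 1 + size T1 + size T2

(* Direct transcription of the inductive definition. *)
fun balanced Empty = true
  | balanced (Node(A, _, B)) =
      abs (size A - size B) <= 1 andalso balanced A andalso balanced B

val leaf = Node(Empty, 0, Empty)
val good = Node(leaf, 1, Node(Empty, 2, leaf))    (* child sizes 1 and 2 *)
val bad  = Node(Empty, 1, Node(Empty, 2, leaf))   (* child sizes 0 and 2 *)
```

Because size is recomputed at every node, this checker is meant for specifications and small tests rather than efficient code.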

SLIDE 16

sorted lists

nil is sorted
x::R is sorted iff x is ≤ every integer in R and R is sorted

also a structurally inductive definition

SLIDE 19

sorted trees

Empty is sorted
Node(A, x, B) is sorted iff every integer in A is ≤ x, every integer in B is ≥ x, and A and B are sorted

Theorem: T is a sorted tree iff inord T is a sorted list
(prove by structural induction)

SLIDE 23

sorted trees

Empty is sorted
Node(A, x, B) is sorted iff every integer in A is ≤ x, every integer in B is ≥ x, and A and B are sorted

[Figure: two example trees, each containing the integers 3, 14, 42, 42, 57, 81, 99]

SLIDE 25

all

all : (int -> bool) * int tree -> bool

REQUIRES p is total (it is enough that p x terminates, for all x in T)
ENSURES all (p, T) = true iff every integer in T satisfies p

fun all (p, Empty) = true
  | all (p, Node(A, x, B)) = (p x) andalso all (p, A) andalso all (p, B)

SLIDE 26

sorted

fun sorted (T : int tree) : bool =
  case T of
      Empty => true
    | Node(A, x, B) =>
        all (fn y => y <= x, A) andalso
        all (fn y => y >= x, B) andalso
        sorted A andalso sorted B

sorted T = true iff T is a sorted tree

SLIDE 27

sorted

fun sorted Empty = true
  | sorted (Node(A, x, B)) =
      all (fn y => y <= x, A) andalso
      all (fn y => y >= x, B) andalso
      sorted A andalso sorted B

sorted T = true iff T is a sorted tree

Useful in specs, never used in code!
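
The theorem relating sorted trees and sorted inorder lists can be spot-checked on examples. This sketch adds a list predicate, sortedList, which is a hypothetical name written here to mirror the sorted-list definition; testing two trees illustrates the theorem but of course does not prove it.

```sml
datatype 'a tree = Empty | Node of 'a tree * 'a * 'a tree

fun inord Empty = [ ]
  | inord (Node(T1, x, T2)) = (inord T1) @ x :: (inord T2)

fun all (p, Empty) = true
  | all (p, Node(A, x, B)) = (p x) andalso all (p, A) andalso all (p, B)

fun sorted Empty = true
  | sorted (Node(A, x, B)) =
      all (fn y => y <= x, A) andalso all (fn y => y >= x, B)
      andalso sorted A andalso sorted B

(* Hypothetical list counterpart of the sorted-list definition. *)
fun sortedList [ ] = true
  | sortedList (x::R) = List.all (fn y => x <= y) R andalso sortedList R

val U1 = Node(Node(Empty, 14, Empty), 42, Node(Empty, 81, Empty))
val U2 = Node(Node(Empty, 81, Empty), 42, Node(Empty, 14, Empty))

(* Both sides of the "iff" agree on a sorted and on an unsorted tree. *)
val agree1 = (sorted U1 = sortedList (inord U1))
val agree2 = (sorted U2 = sortedList (inord U2))
```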

SLIDE 28

motivation

Sorted data may be easier to deal with…

  • That’s why dictionaries are in lexicographic order!

Let’s look at functions for searching data contained in

  • lists (unsorted, sorted)
  • trees (unsorted, sorted)

We’ll contrast the work and span.

SLIDE 29

searching

an unsorted list

mem : int * int list -> bool

REQUIRES true
ENSURES mem (x, L) = true iff x is in L

fun mem (x, [ ]) = false
  | mem (x, y::L) = (x = y) orelse mem (x, L)

W_mem(x, L) is O(length L)
S_mem(x, L) is also O(length L)

SLIDE 30

searching

a sorted list

mem : int * int list -> bool

REQUIRES L is a sorted list
ENSURES mem (x, L) = true iff x is in L

fun mem (x, [ ]) = false
  | mem (x, y::L) =
      case Int.compare(x, y) of
          LESS => false
        | EQUAL => true
        | GREATER => mem (x, L)

W_mem(x, L) is O(length L)
S_mem(x, L) is also O(length L)

SLIDE 31

searching

an unsorted tree

mem : int * int tree -> bool

REQUIRES T is a tree
ENSURES mem (x, T) = true iff x is in T

(* not designed for parallel evaluation *)
fun mem (x, Empty) = false
  | mem (x, Node(A, y, B)) = (x = y) orelse mem (x, A) orelse mem (x, B)

W_mem(x, T) is O(size T)
S_mem(x, T) is also O(size T)

SLIDE 32

searching

an unsorted tree

mem : int * int tree -> bool

(* designed for parallel evaluation *)
fun mem (x, Empty) = false
  | mem (x, Node(A, y, B)) =
      (x = y) orelse
      let val (a, b) = (mem (x, A), mem (x, B)) in a orelse b end

W_mem(x, T) is O(size T)
S_mem(x, T) is O(depth T)

… let’s see why

SLIDE 37

searching

an unsorted tree

fun mem (x, Empty) = false
  | mem (x, Node(A, y, B)) =
      (x = y) orelse
      let val (a, b) = (mem (x, A), mem (x, B)) in a orelse b end

S(mem(x, Empty)) = 1
S(mem(x, Node(A, y, B))) = 1 + max(S(mem(x, A)), S(mem(x, B)))

Let S_mem(d) be the span for mem(x, T) with T of depth d

S_mem(0) = 1
S_mem(d) = 1 + max(S_mem(d-1), S_mem(d-1))

S_mem(d) is O(d), so S(mem(x, T)) is O(depth T)
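
The bound can be made explicit by unrolling the recurrence. Both arguments of max are the same term, so the max collapses and each level contributes one unit:

```latex
\begin{align*}
S_{mem}(d) &= 1 + \max\bigl(S_{mem}(d-1),\, S_{mem}(d-1)\bigr) \\
           &= 1 + S_{mem}(d-1) \\
           &= 2 + S_{mem}(d-2) \\
           &= \cdots \\
           &= d + S_{mem}(0) \;=\; d + 1
\end{align*}
```

A closed form of d + 1 is linear in d, which is where the O(d) span comes from.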

SLIDE 38

searching

a sorted tree

REQUIRES T is a sorted tree
ENSURES mem (x, T) = true iff x is in T

fun mem (x, Empty) = false
  | mem (x, Node(A, y, B)) =
      case Int.compare(x, y) of
          LESS => mem (x, A)
        | EQUAL => true
        | GREATER => mem (x, B)

W_mem(x, T) is O(depth T)
S_mem(x, T) is O(depth T)

check these
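
One way to check these bounds: each call makes at most one recursive call, on a subtree whose depth is at least one smaller, so both work and span satisfy a recurrence of the form W(d) = 1 + W(d-1). The sketch below runs the function on a small sorted tree chosen for illustration; the comments trace which nodes are compared.

```sml
datatype 'a tree = Empty | Node of 'a tree * 'a * 'a tree

(* REQUIRES T is a sorted tree *)
fun mem (x, Empty) = false
  | mem (x, Node(A, y, B)) =
      case Int.compare (x, y) of
          LESS => mem (x, A)
        | EQUAL => true
        | GREATER => mem (x, B)

(* A sorted tree with root 42 and children 14 and 81. *)
val T = Node(Node(Empty, 14, Empty), 42, Node(Empty, 81, Empty))
val found = mem (81, T)     (* compares with 42, then with 81 *)
val missing = mem (57, T)   (* compares with 42, then 81, then reaches Empty *)
```

Only one root-to-leaf path is followed, so there is nothing to run in parallel and the span equals the work.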

SLIDE 39

search

                  work         span
unsorted list     O(length)    O(length)
sorted list       O(length)    O(length)
unsorted tree     O(size)      O(depth)
sorted tree       O(depth)     O(depth)

SLIDE 40

balanced case

  • For a balanced tree T we know that depth T is O(log(size T))

SLIDE 42

search

                        work         span         worst-case (n items)
unsorted list           O(length)    O(length)    work O(n), span O(n)
sorted list             O(length)    O(length)    work O(n), span O(n)
unsorted tree           O(size)      O(depth)
sorted tree             O(depth)     O(depth)
balanced tree           O(size)      O(depth)     work O(n), span O(log n)
balanced sorted tree    O(depth)     O(depth)     work O(log n), span O(log n)

SLIDE 43

motivation

  • Trees may be better than lists… especially balanced trees.
  • And sorted trees may enable even faster code.

Let’s develop a function that sorts a tree (of integers)

SLIDE 46

sorting a tree

  • If the tree is Empty, do nothing
  • Otherwise (recursively) sort the two children, then merge the sorted children, then insert the root value

We’ll design helpers to insert and merge

merge will also need a helper to split a tree in two

SLIDE 47

inserting in a tree

Ins : int * int tree -> int tree

REQUIRES T is a sorted tree
ENSURES Ins(x, T) = a sorted tree consisting of x and T

fun Ins (x, Empty) = Node(Empty, x, Empty)
  | Ins (x, Node(T1, y, T2)) =
      if x > y then Node(T1, y, Ins(x, T2))
      else Node(Ins(x, T1), y, T2)

(contrast with list insertion)
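
For the contrast, here is a sketch that puts a list insertion next to Ins. The list version, called ins here, is a hypothetical reconstruction of the usual sorted-list insertion and is not taken from these slides. It walks past every smaller item, so it is O(length L), while Ins follows a single root-to-leaf path, so it is O(depth T).

```sml
datatype 'a tree = Empty | Node of 'a tree * 'a * 'a tree

(* Hypothetical list counterpart: insert x into a sorted list. *)
fun ins (x, [ ]) = [x]
  | ins (x, y::L) = if x > y then y :: ins (x, L) else x :: y :: L

fun Ins (x, Empty) = Node(Empty, x, Empty)
  | Ins (x, Node(T1, y, T2)) =
      if x > y then Node(T1, y, Ins(x, T2))
      else Node(Ins(x, T1), y, T2)

val L = ins (4, [1,2,3,5,6])
(* 4 > 3 goes right; 4 <= 6 goes left; lands as the left child of 6 *)
val T = Ins (4, Node(Node(Empty, 1, Empty), 3, Node(Empty, 6, Empty)))
```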

SLIDE 48

comments

  • We say “Ins(x, T) = a tree consisting of x and T” and mean “a tree U containing x and the items of T”, i.e. inord U is a perm of x::(inord T)
  • We may abbreviate “all items of T are < x” by just saying that “T < x”, i.e. every integer in inord T is less than x
  • Similarly, merging trees T1 and T2 produces a tree T that “consists of (the items of) T1 and (the items of) T2”, i.e. inord (Merge(T1, T2)) is a perm of (inord T1)@(inord T2)

Informal specs are OK provided they are unambiguous

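SLIDE 52

example

[Figure, written out as terms: a sorted tree with root 3, whose left child 1 has right child 2 and whose right child 6 has left child 5]

Ins(4, Node(Node(Empty, 1, Node(Empty, 2, Empty)), 3, Node(Node(Empty, 5, Empty), 6, Empty)))
= Node(Node(Empty, 1, Node(Empty, 2, Empty)), 3, Ins(4, Node(Node(Empty, 5, Empty), 6, Empty)))
= Node(Node(Empty, 1, Node(Empty, 2, Empty)), 3, Node(Ins(4, Node(Empty, 5, Empty)), 6, Empty))
= Node(Node(Empty, 1, Node(Empty, 2, Empty)), 3, Node(Node(Node(Empty, 4, Empty), 5, Empty), 6, Empty))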

SLIDE 57

merging trees

Merge : int tree * int tree -> int tree

REQUIRES T1 and T2 are sorted trees
ENSURES Merge(T1, T2) = a sorted tree consisting of T1 and T2

Merge (Node(L1, x, R1), T2) = ???

We could split T2 into two subtrees (L2, R2), then do Node(Merge(L1, L2), x, Merge(R1, R2))

But we need to stay sorted and not lose data…
… so split should use x and build (L2, R2) so that L2 ≤ x ≤ R2 …

SLIDE 58

splitting a tree

SplitAt : int * int tree -> int tree * int tree

REQUIRES T is a sorted tree
ENSURES SplitAt(x, T) = a pair of sorted trees (U1, U2) such that U1 ≤ x ≤ U2 and U1, U2 is a perm of T

Not completely specific, but that’s OKAY!

SLIDE 59

Plan

Define SplitAt(x, T) using structural recursion

  • SplitAt(x, Node(T1, y, T2)) should compare x and y, call SplitAt(x, -) on T1 or T2, then build the result

SLIDE 60

SplitAt

SplitAt : int * int tree -> int tree * int tree

REQUIRES T is a sorted tree
ENSURES SplitAt(x, T) = a pair of sorted trees (U1, U2) such that U1 ≤ x ≤ U2 and U1, U2 is a perm of T

fun SplitAt(x, Empty) = (Empty, Empty)
  | SplitAt(x, Node(T1, y, T2)) =
      if y>x
      then let val (L1, R1) = SplitAt(x, T1) in (L1, Node(R1, y, T2)) end
      else let val (L2, R2) = SplitAt(x, T2) in (Node(T1, y, L2), R2) end
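
A small run makes the spec concrete. The tree T and the split value 4 below are chosen for illustration only; the function is the slide's SplitAt.

```sml
datatype 'a tree = Empty | Node of 'a tree * 'a * 'a tree

fun inord Empty = [ ]
  | inord (Node(T1, x, T2)) = (inord T1) @ x :: (inord T2)

fun SplitAt (x, Empty) = (Empty, Empty)
  | SplitAt (x, Node(T1, y, T2)) =
      if y > x
      then let val (L1, R1) = SplitAt (x, T1) in (L1, Node(R1, y, T2)) end
      else let val (L2, R2) = SplitAt (x, T2) in (Node(T1, y, L2), R2) end

(* A sorted tree with inorder list [1,3,5,7,9]. *)
val T = Node(Node(Empty, 1, Node(Empty, 3, Empty)), 5,
             Node(Node(Empty, 7, Empty), 9, Empty))

val (U1, U2) = SplitAt (4, T)
val lower = inord U1   (* items of T that are <= 4 *)
val upper = inord U2   (* items of T that are > 4 *)
```

Note that the split value 4 need not occur in T, and that items equal to x always end up in the first component because the test is y > x.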

SLIDE 61

Merge

Merge : int tree * int tree -> int tree

REQUIRES T1 and T2 are sorted trees
ENSURES Merge(T1, T2) = a sorted tree consisting of T1 and T2

fun Merge (Empty, T2) = T2
  | Merge (Node(L1, x, R1), T2) =
      let
        val (L2, R2) = SplitAt(x, T2)
      in
        Node(Merge(L1, L2), x, Merge(R1, R2))
      end

(as we promised!)

SLIDE 62

example

Standard ML of New Jersey

val A = Node (Node (Empty,1,Node (Empty,2,Node (Empty,3,Empty))),5,Empty)
val B = Node (Node (Node (Empty,0,Node (Empty,7,Empty)),8,Empty),9,Empty)

- val M = Merge(A, B);
val M = Node (Node (Node (Empty,0,Empty),1,Node (Empty,2,Node (Empty,3,Empty))),5, Node (Node (Node (Empty,7,Empty),8,Empty),9,Empty)) : int tree
- inord M;
val it = [0,1,2,3,5,7,8,9] : int list

[Figure: the trees A, B and M = Merge(A, B)]
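
The session above can be reproduced from the definitions on the earlier slides. This sketch gathers SplitAt, Merge and inord into one self-contained script, using the trees A and B exactly as printed.

```sml
datatype 'a tree = Empty | Node of 'a tree * 'a * 'a tree

fun inord Empty = [ ]
  | inord (Node(T1, x, T2)) = (inord T1) @ x :: (inord T2)

fun SplitAt (x, Empty) = (Empty, Empty)
  | SplitAt (x, Node(T1, y, T2)) =
      if y > x
      then let val (L1, R1) = SplitAt (x, T1) in (L1, Node(R1, y, T2)) end
      else let val (L2, R2) = SplitAt (x, T2) in (Node(T1, y, L2), R2) end

fun Merge (Empty, T2) = T2
  | Merge (Node(L1, x, R1), T2) =
      let
        val (L2, R2) = SplitAt (x, T2)   (* L2 <= x <= R2 *)
      in
        Node(Merge(L1, L2), x, Merge(R1, R2))
      end

val A = Node (Node (Empty,1,Node (Empty,2,Node (Empty,3,Empty))),5,Empty)
val B = Node (Node (Node (Empty,0,Node (Empty,7,Empty)),8,Empty),9,Empty)
val M = Merge (A, B)
val items = inord M
```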

SLIDE 63

comments

  • As these examples show, there’s more than one way to split a tree, or to merge two sorted trees.
  • It’s not always easy to see exactly what results you’ll get.
  • IT DOESN’T MATTER!
  • Just need to know that the results will satisfy the SPECIFICATION

A sorted tree containing all the items of T1 and T2
A pair of trees such that…

SLIDE 64

Msort

Msort : int tree -> int tree

REQUIRES true
ENSURES Msort(T) = a sorted tree consisting of the items of T

fun Msort Empty = Empty
  | Msort (Node(T1, x, T2)) = Ins (x, Merge(Msort T1, Msort T2))

SLIDE 65

Correct?

  • Q: How to prove that Msort is correct?
    A: Use structural induction.
  • First prove that the helper functions Merge, SplitAt, Ins are correct. Again use structural induction.
  • The helper specs were carefully chosen to make the proof of Msort straightforward. (An easy structural induction, using the proven facts about helpers.)

See lecture notes

SLIDE 66

example

val A = list2tree [4,1,2]
val B = list2tree [3,5,0]
val T = Node(A, 42, B)

- Msort T;
val it = Node (Node (Node (Node (Empty,0,Empty),1,Empty),2,Node (Empty,3,Empty)),4, Node (Empty,5,Node (Empty,42,Empty))) : int tree
- inord it;
val it = [0,1,2,3,4,5,42] : int list

[Figure: the tree T and the tree Msort T]
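
Putting the pieces together gives a script that reruns this example. As a simplification, A and B are written out here as explicit trees with inorder lists [4,1,2] and [3,5,0] instead of calling list2tree; that substitution is an assumption about the shapes list2tree builds.

```sml
datatype 'a tree = Empty | Node of 'a tree * 'a * 'a tree

fun inord Empty = [ ]
  | inord (Node(T1, x, T2)) = (inord T1) @ x :: (inord T2)

fun Ins (x, Empty) = Node(Empty, x, Empty)
  | Ins (x, Node(T1, y, T2)) =
      if x > y then Node(T1, y, Ins(x, T2))
      else Node(Ins(x, T1), y, T2)

fun SplitAt (x, Empty) = (Empty, Empty)
  | SplitAt (x, Node(T1, y, T2)) =
      if y > x
      then let val (L1, R1) = SplitAt (x, T1) in (L1, Node(R1, y, T2)) end
      else let val (L2, R2) = SplitAt (x, T2) in (Node(T1, y, L2), R2) end

fun Merge (Empty, T2) = T2
  | Merge (Node(L1, x, R1), T2) =
      let val (L2, R2) = SplitAt (x, T2)
      in Node(Merge(L1, L2), x, Merge(R1, R2)) end

(* Sort the children, merge the results, then insert the root value. *)
fun Msort Empty = Empty
  | Msort (Node(T1, x, T2)) = Ins (x, Merge(Msort T1, Msort T2))

(* Explicit stand-ins for list2tree [4,1,2] and list2tree [3,5,0]. *)
val A = Node(Node(Empty, 4, Empty), 1, Node(Empty, 2, Empty))
val B = Node(Node(Empty, 3, Empty), 5, Node(Empty, 0, Empty))
val T = Node(A, 42, B)

val S = Msort T
val items = inord S
```

The root value 42 is inserted last, which is why it ends up at the far right of the result.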

SLIDE 67

demo

- val T = list2tree (upto 1 1000);

val T = Node (Node (Node (Node (Node (#,#,#),63,Node (#,#,#)),126, Node (Node (#,#,#),189,Node (#,#,#))),251, Node (Node (Node (#,#,#),314,Node (#,#,#)),376, Node (Node (#,#,#),439,Node (#,#,#)))),501, Node (Node (Node (Node (#,#,#),564,Node (#,#,#)),626, Node (Node (#,#,#),689,Node (#,#,#))),751, Node (Node (Node (#,#,#),814,Node (#,#,#)),876, Node (Node (#,#,#),939,Node (#,#,#))))) : int tree

- layers T;

val it = [[501],[251,751],[126,376,626,876],[63,189,314,439,564,689,814,939], [32,95,158,220,283,345,408,470,533,595,658,720,783,845,908,970], [16,48,79,111,142,174,205,236,267,299,330,361,392,424,455,486,...], [8,24,40,56,71,87,103,119,134,150,166,182,197,213,228,244,...], [4,12,20,28,36,44,52,60,67,75,83,91,99,107,115,123,...], [2,6,10,14,18,22,26,30,34,38,42,46,50,54,58,62,...], [1,3,5,7,9,11,13,15,17,19,21,23,25,27,29,31,...]] : int list list

- Msort T;

val it = Node (Empty,1, Node (Node (Empty,2,Empty),3, Node (Node (Empty,4,Empty),5,Node (Node (#,#,#),7,Node (#,#,#))))) : int tree

- inord it;

val it = [1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,...] : int list

[Figure: Msort maps the balanced tree T (top layers 501; 251, 751; 126, 376, 626, 876; 63, 189, 314, 439, 564, 689, 814, 939; … and so on) to a tree that starts 1, 3, 2, 5, 4, 7, … and so on]

SLIDE 68

taking stock

  • We’ve defined a function Msort : int tree -> int tree
  • It satisfies the sorting spec
  • We sketched a proof
  • We also tested on some examples
  • It runs pretty fast on a tree of 100000 integers (I checked it!)
  • But it’s not always returning a balanced sorted tree

SLIDE 69

efficiency?

  • We know how to msort a list of n items in O(n log n) time
  • We can hold n items in a tree of size n, in a balanced tree of size n, depth O(log n)
  • The million dollar question: how efficient is Msort T when T is a balanced tree of size n?

SLIDE 70

The Span is…?

What’s the span of Msort T when T is balanced, depth d?

SLIDE 72

Span of Ins

fun Ins (x, Empty) = Node(Empty, x, Empty)
  | Ins (x, Node(T1, y, T2)) =
      if x > y then Node(T1, y, Ins(x, T2))
      else Node(Ins(x, T1), y, T2)

For a balanced tree of depth d>0,

S_Ins(d) = 1 + S_Ins(d-1)      (no way to parallelize!)

S_Ins(d) is O(d)

SLIDE 73

Span of SplitAt

For a balanced tree of depth d>0,

S_SplitAt(d) = 1 + S_SplitAt(d-1)      (similarly)

S_SplitAt(d) is O(d)

SLIDE 74

Span of Merge

fun Merge (Empty, T2) = T2
  | Merge (Node(l1, x, r1), T2) =
      let
        val (l2, r2) = SplitAt(x, T2)
      in
        Node(Merge(l1, l2), x, Merge(r1, r2))
      end

For balanced trees of depth d>0,

S_Merge(d) = S_SplitAt(d) + max(S_Merge(d-1), S_Merge(d-1))      (the two recursive calls are independent)

assuming the trees got by splitting have depth ≤ d-1, we get

S_Merge(d) = S_SplitAt(d) + S_Merge(d-1)
           = O(d) + S_Merge(d-1)

S_Merge(d) is O(d^2)
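
To see where the quadratic bound comes from, the recurrence can be unrolled. Here c is an assumed constant with S_SplitAt(d) ≤ c·d, which exists because S_SplitAt(d) is O(d):

```latex
\begin{align*}
S_{Merge}(d) &\le c\,d + S_{Merge}(d-1) \\
             &\le c\,d + c\,(d-1) + S_{Merge}(d-2) \\
             &\le \cdots \\
             &\le c\,\bigl(d + (d-1) + \cdots + 1\bigr) + S_{Merge}(0) \\
             &= c\,\frac{d(d+1)}{2} + S_{Merge}(0)
\end{align*}
```

The sum of the first d integers is quadratic in d, so S_Merge(d) is O(d^2) under the stated depth assumption.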

SLIDE 75

Span of Msort

fun Msort Empty = Empty
  | Msort (Node(T1, x, T2)) = Ins (x, Merge(Msort T1, Msort T2))

For a balanced tree of depth d > 0

S_Msort(d) = max(S_Msort(d-1), S_Msort(d-1)) + S_Merge(d) + S_Ins(2d)      (the two recursive calls are independent)
           = S_Msort(d-1) + O(d^2)

S_Msort(d) is O(d^3)
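
The cubic bound follows the same pattern as before. With an assumed constant c such that the O(d^2) term is at most c·d^2, unrolling gives a sum of squares:

```latex
\begin{align*}
S_{Msort}(d) &\le c\,d^2 + S_{Msort}(d-1) \\
             &\le c\,\bigl(d^2 + (d-1)^2 + \cdots + 1^2\bigr) + S_{Msort}(0) \\
             &= c\,\frac{d(d+1)(2d+1)}{6} + S_{Msort}(0)
\end{align*}
```

The sum of the first d squares is cubic in d, so S_Msort(d) is O(d^3), again only under the assumption that all intermediate trees stay balanced.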

SLIDE 77

oops

  • We assumed that splitting, merging, inserting with balanced trees produces balanced trees
  • That’s NOT true!

SLIDE 78

losing balance

Msort can produce badly unbalanced trees

[Figure: Msort maps a full tree whose nodes all hold 42 to a list-shaped tree of 42s]

- Msort (Full(42, 3));
val it = Node (Node (Node (Node (Node (#,#,#),42,Empty),42,Empty),42,Empty),42,Empty) : int tree
- layers it;
val it = [[42],[42],[42],[42],[42],[42],[42]] : int list list
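
The loss of balance can be measured directly. Full is not defined in the slides; the sketch assumes Full (x, d) builds the full tree of depth d with x at every node, which is consistent with the seven layers printed above. Because every item is equal, SplitAt sends everything to its first component and Ins always goes left, so the result degenerates into a left spine.

```sml
datatype 'a tree = Empty | Node of 'a tree * 'a * 'a tree

fun size Empty = 0
  | size (Node(T1, _, T2)) = 1 + size T1 + size T2

fun depth Empty = 0
  | depth (Node(T1, _, T2)) = 1 + Int.max (depth T1, depth T2)

(* Assumed reading of Full: the full tree of depth d with x at every node. *)
fun Full (x, 0) = Empty
  | Full (x, d) = let val T = Full (x, d - 1) in Node(T, x, T) end

fun Ins (x, Empty) = Node(Empty, x, Empty)
  | Ins (x, Node(T1, y, T2)) =
      if x > y then Node(T1, y, Ins(x, T2))
      else Node(Ins(x, T1), y, T2)

fun SplitAt (x, Empty) = (Empty, Empty)
  | SplitAt (x, Node(T1, y, T2)) =
      if y > x
      then let val (L1, R1) = SplitAt (x, T1) in (L1, Node(R1, y, T2)) end
      else let val (L2, R2) = SplitAt (x, T2) in (Node(T1, y, L2), R2) end

fun Merge (Empty, T2) = T2
  | Merge (Node(L1, x, R1), T2) =
      let val (L2, R2) = SplitAt (x, T2)
      in Node(Merge(L1, L2), x, Merge(R1, R2)) end

fun Msort Empty = Empty
  | Msort (Node(T1, x, T2)) = Ins (x, Merge(Msort T1, Msort T2))

val T = Full (42, 3)
val U = Msort T
val shapeIn  = (size T, depth T)   (* 7 nodes, depth 3 *)
val shapeOut = (size U, depth U)   (* 7 nodes, depth 7: a list-shaped tree *)
```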

SLIDE 79

results

  • Msort on trees may build list-shaped trees
  • So its worst-case work ends up being no better than that of msort on lists
  • In “average” cases the tree-based method may be faster
  • But we can make no promises :-)

SLIDE 80

towards a solution

  • Merge, Ins don’t preserve balance!
  • We could use a tree balancing function...
  • Or new versions of Ins and Merge that actually preserve balance

fun Msort Empty = Empty
  | Msort (Node(t1, x, t2)) = balance(Ins (x, Merge(Msort t1, Msort t2)))

But perfect balance is hard to achieve…
and there are other solutions…

SLIDE 81

balanced vs sorted

  • Msort produces a sorted tree
  • Maintaining balance (along with sortedness) is a lot of extra work!
  • Later we will see how to build nearly-balanced sorted trees…
  • …with the same asymptotic behavior as perfectly-balanced sorted trees

SLIDE 82

lesson

  • Datatypes allow us to design our own types
  • Structural induction allows us to define functions, identify sets of values with special properties, and reason about program behavior
  • Work and span recurrences are good for estimating how efficient our code is, asymptotically
  • But be careful to do proofs and analysis accurately!
  • Be aware of any assumptions you make

SLIDE 83

Not Monty Python

NOBODY expects the Spanish Inquisition!
Our chief weapon is work… work and span… span and work…
Our two weapons are work and span… and ruthless efficiency.
Our “three” weapons are work, span, and ruthless efficiency… and an almost fanatical devotion to SML.
No! “Amongst” our weaponry… are such elements as work, span, recurrences, specifications, structural induction, induction on size, induction on depth, …
