SLIDE 1 data structures and algorithms 2020 09 10 lecture 4
quiz
insertion sort: worst-case time complexity? best-case time complexity? in-place? recursive version?
merge sort: worst-case time complexity? best-case time complexity? in-place? recurrence equation and recursion tree?
heapsort: worst-case time complexity? best-case time complexity? in-place?
smooth sort
quiz
sort 5 distinct elements {a, b, c, d, e} using at most 7 comparisons
comparison: a < b gives yes or no
can we do better than 7 steps?
this illustrates the lower bound for comparison-based sorting (more later)
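The lower bound mentioned above can be checked numerically. A minimal sketch, assuming the usual decision-tree argument (each yes/no comparison at most halves the set of orderings still possible, so n! orderings need at least ⌈log2(n!)⌉ comparisons in the worst case); the function name `comparison_lower_bound` is made up for this illustration:

```python
import math


def comparison_lower_bound(n: int) -> int:
    """Smallest k with 2**k >= n!, i.e. ceil(log2(n!)), computed exactly."""
    orderings = math.factorial(n)
    # For an integer m >= 1, ceil(log2(m)) equals (m - 1).bit_length().
    return (orderings - 1).bit_length()


# 5! = 120 and 2**6 = 64 < 120 <= 128 = 2**7
print(comparison_lower_bound(5))  # 7
```

Under this argument, 7 comparisons for 5 elements cannot be improved in the worst case.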
heapsort priority queues
SLIDE 2
MaxHeapify: pseudo-code
Algorithm MaxHeapify(A, i):
  l := left(i)
  r := right(i)
  if l ≤ A.heap-size and A[l] > A[i] then
    largest := l
  else
    largest := i
  if r ≤ A.heap-size and A[r] > A[largest] then
    largest := r
  if largest ≠ i then
    swap(A[i], A[largest])
    MaxHeapify(A, largest)
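A possible Python rendering of MaxHeapify, as a sketch rather than the course's reference code. Assumptions not in the slides: the array is 0-based (so the children of i are 2i + 1 and 2i + 2), and the heap size is passed as an explicit parameter `heap_size` instead of being stored in the array object. The swap and the recursive call happen only when a child is larger than node i.

```python
def max_heapify(a: list, i: int, heap_size: int) -> None:
    """Let a[i] sink until the subtree rooted at i is a max-heap.

    Precondition: the subtrees rooted at the children of i are max-heaps.
    """
    left = 2 * i + 1   # 0-based child indices (assumption, slides are 1-based)
    right = 2 * i + 2
    largest = i
    if left < heap_size and a[left] > a[largest]:
        largest = left
    if right < heap_size and a[right] > a[largest]:
        largest = right
    if largest != i:
        # Only swap and recurse when a child is larger than a[i].
        a[i], a[largest] = a[largest], a[i]
        max_heapify(a, largest, heap_size)


a = [4, 14, 10, 8, 7, 9, 3, 2, 1]
max_heapify(a, 0, len(a))
print(a)  # [14, 8, 10, 4, 7, 9, 3, 2, 1]
```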
MaxHeapify: worst-case time complexity via height
intuition: the worst-case time complexity of the down-heap bubble is determined by the height of the heap, so it is in O(log n)
with h the height of the heap and n the number of nodes:
T(h) = T(h − 1) + c if h > 0
gives T(h) ∈ O(h)
then h ∈ Θ(log n) gives T(n) ∈ O(log n)
MaxHeapify: worst-case time complexity via nodes
with n the number of nodes:
T(n) ≤ T(2n/3) + 1 if n > 1
gives T(n) ∈ O(log n)
because in the worst case the bottom level is exactly half full, so a child subtree contains at most 2n/3 of the nodes
"a lot of work" for "a small number of nodes"
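The 2n/3 bound can be checked by counting the nodes in the left subtree of the root directly. A small sketch, using the 1-based indexing of the slides; the helper name `left_subtree_size` is invented for this illustration:

```python
def left_subtree_size(n: int) -> int:
    """Number of nodes in the subtree rooted at node 2 of a 1-based heap with n nodes."""
    size = 0
    first, width = 2, 1  # leftmost index and capacity of the current subtree level
    while first <= n:
        last = min(n, first + width - 1)  # the level may be only partly filled
        size += last - first + 1
        first, width = 2 * first, 2 * width
    return size


# n = 11: bottom level exactly half full, left subtree holds nodes 2, 4, 5, 8, 9, 10, 11
print(left_subtree_size(11), 11)  # 7 11
# The left subtree never exceeds two thirds of all nodes.
print(all(3 * left_subtree_size(n) <= 2 * n for n in range(1, 2000)))  # True
```

The ratio gets closest to 2/3 for n = 3·2^k − 1, which is the "bottom level half full" shape.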
SLIDE 3
MaxHeapify: intuition of correctness
result of MaxHeapify(A, i) is a max-heap, provided the subtrees rooted at left(i) and right(i) are max-heaps
induction on the height of node i
if the height is 0, then immediate
if the height is > 0, then two cases
case largest = i: immediate
case largest = l (and equivalently for largest = r): use induction
building a heap: pseudo-code
Algorithm buildMaxHeap(H):
  H.heap-size := H.length
  for i = ⌊H.length/2⌋ downto 1 do
    MaxHeapify(H, i)
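A hedged Python sketch of buildMaxHeap. It assumes a 0-based array, so the last internal node is at index n // 2 − 1 and the loop runs down to index 0; the helper `max_heapify` is repeated here only to keep the block self-contained.

```python
def max_heapify(a: list, i: int, heap_size: int) -> None:
    """Let a[i] sink until the subtree rooted at i is a max-heap (0-based indices)."""
    left, right, largest = 2 * i + 1, 2 * i + 2, i
    if left < heap_size and a[left] > a[largest]:
        largest = left
    if right < heap_size and a[right] > a[largest]:
        largest = right
    if largest != i:
        a[i], a[largest] = a[largest], a[i]
        max_heapify(a, largest, heap_size)


def build_max_heap(a: list) -> None:
    """Turn an arbitrary array into a max-heap, bottom-up."""
    n = len(a)
    # Indices n // 2, ..., n - 1 are leaves, hence already max-heaps.
    for i in range(n // 2 - 1, -1, -1):
        max_heapify(a, i, n)


h = [4, 1, 3, 2, 16, 9, 10, 14, 8, 7]
build_max_heap(h)
print(h)  # [16, 14, 10, 8, 7, 9, 3, 2, 4, 1]
```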
buildMaxHeap: correctness
use the following loop invariant:
at the start of the for-loop each node i + 1, ..., n is the root of a max-heap
init: for i = ⌊n/2⌋ the nodes i + 1, ..., n are leaves, hence max-heaps
loop: children are max-heaps by induction; use correctness of MaxHeapify
end: for i = 0 the invariant gives correctness of the output
buildMaxHeap: complexity
buildMaxHeap is in O(n)
a proof is in the book
an intuition for the O(n) time complexity can be given in a picture
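In place of the picture, the linear bound can be illustrated by counting swaps. A small experiment sketch, not a proof: each node sinks by at most its own height, and most nodes sit near the bottom, so the total number of swaps stays below n. The function name `build_max_heap_counting_swaps` and the 0-based indexing are assumptions of this illustration.

```python
def build_max_heap_counting_swaps(a: list) -> int:
    """Build a max-heap in place (0-based) and return the number of swaps performed."""
    n = len(a)
    swaps = 0
    for start in range(n // 2 - 1, -1, -1):
        i = start
        while True:  # iterative form of the down-heap bubble
            left, right, largest = 2 * i + 1, 2 * i + 2, i
            if left < n and a[left] > a[largest]:
                largest = left
            if right < n and a[right] > a[largest]:
                largest = right
            if largest == i:
                break
            a[i], a[largest] = a[largest], a[i]
            swaps += 1
            i = largest
    return swaps


# Increasing input: every internal node has to sink.
for n in (10, 100, 1000, 10000):
    swaps = build_max_heap_counting_swaps(list(range(n)))
    print(n, swaps, swaps <= n)
```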
SLIDE 4
heapsort: pseudo-code
H[1..n] an array of integers
directly after building the heap: H.heap-size = H.length
Algorithm heapsort(H):
  buildMaxHeap(H)
  for i = H.length downto 2 do
    swap H[1] and H[i]
    H.heap-size := H.heap-size − 1
    MaxHeapify(H, 1)
heapsort
running time of buildMaxHeap in O(n)
n − 1 calls of MaxHeapify, which is in O(log n)
running time of heapsort in O(n log n)
correctness follows from correctness of buildMaxHeap and MaxHeapify
heapsort is in-place
what happens to a sorted input?
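Putting the pieces together, heapsort can be sketched in Python as follows. This is one possible rendering under stated assumptions: 0-based indexing, and a local variable `heap_size` standing in for H.heap-size. The array is sorted in place and nothing is returned.

```python
def max_heapify(a: list, i: int, heap_size: int) -> None:
    """Let a[i] sink within a[0:heap_size] until the subtree at i is a max-heap."""
    left, right, largest = 2 * i + 1, 2 * i + 2, i
    if left < heap_size and a[left] > a[largest]:
        largest = left
    if right < heap_size and a[right] > a[largest]:
        largest = right
    if largest != i:
        a[i], a[largest] = a[largest], a[i]
        max_heapify(a, largest, heap_size)


def build_max_heap(a: list) -> None:
    for i in range(len(a) // 2 - 1, -1, -1):
        max_heapify(a, i, len(a))


def heapsort(a: list) -> None:
    """Sort a in increasing order, in place."""
    build_max_heap(a)
    heap_size = len(a)
    for end in range(len(a) - 1, 0, -1):
        a[0], a[end] = a[end], a[0]   # move the current maximum to its final place
        heap_size -= 1                # the sorted suffix is no longer part of the heap
        max_heapify(a, 0, heap_size)  # restore the heap property at the root


data = [5, 3, 8, 1, 9, 2]
heapsort(data)
print(data)  # [1, 2, 3, 5, 8, 9]
```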
inspired by heapsort: smooth sort
smooth sort: inventor
Edsger W. Dijkstra 1930–2002 Turing Award 1972
SLIDE 5 quiz
Algorithm Loop1(n):
  s := 0
  for i := 1 to n do
    s := s + i
T(n) is in Θ(?)
quiz
Algorithm Loop2(n):
  p := 1
  for i := 1 to 2n do
    p := p · i
T(n) is in Θ(?)
heapsort priority queues
dynamic set
where is x?
add x
remove x
what is the smallest?
what is the largest?
given x, what is the next one?
given x, what is the previous one?
SLIDE 6 priority queue: properties
data type for maintaining a dynamic set
access via keys, which are totally ordered
queue where the most important element is served first
min-priority queue: minimum is most important
max-priority queue: maximum is most important
max-priority queue: abstract data type
add: insert, with a priority queue and a key as arguments; uses increase-key
increase-key: increases the key of an element in the priority queue to a given key
maximum: gives (but does not remove) the max key
remove: extract-max, removes and returns the max key
(we restrict attention to the keys, mostly ignoring that they are keys of elements)
priority queue: implementation
‘unordered sequence’: add easy but remove difficult
‘ordered sequence’: remove easy but add difficult
use a heap: max-heap of integers (for the keys) for max-priority queue
similarly: min-heap for min-priority queue
maximum: pseudo-code
H a max-heap; return the max key, in Θ(1)
H.size and H.heap-size do not change
Algorithm heapMaximum(H):
  return H[1]
SLIDE 7
remove: pseudo-code
H max-heap; remove and return maximum; error omitted
H.size does not change; H.heap-size decreases
Algorithm heapExtractMax(H):
  max := H[1]
  H[1] := H[H.heap-size]
  H.heap-size := H.heap-size − 1
  MaxHeapify(H, 1)    (this is in O(log n))
  return max
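A sketch of extract-max in Python. It departs from the slide in two labeled ways: the heap is a 0-based Python list whose length plays the role of H.heap-size (so the list shrinks via `pop`), and the error case that the slide omits is handled by raising `IndexError`, which is a choice made for this illustration.

```python
def heap_extract_max(h: list):
    """Remove and return the maximum key of the max-heap h (0-based list)."""
    if not h:
        # Assumption: the slide omits error handling; an exception is one option.
        raise IndexError("extract-max from an empty heap")
    maximum = h[0]
    last = h.pop()  # shrink the heap by one
    if h:
        h[0] = last  # move the last key to the root ...
        i, n = 0, len(h)
        while True:  # ... and let it sink (iterative MaxHeapify)
            left, right, largest = 2 * i + 1, 2 * i + 2, i
            if left < n and h[left] > h[largest]:
                largest = left
            if right < n and h[right] > h[largest]:
                largest = right
            if largest == i:
                break
            h[i], h[largest] = h[largest], h[i]
            i = largest
    return maximum


h = [16, 14, 10, 8, 7, 9, 3, 2, 4, 1]
print(heap_extract_max(h))  # 16
print(h)                    # [14, 8, 10, 4, 7, 9, 3, 2, 1]
```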
insert for max-priority queue: pseudocode
H.size stays the same; H.heap-size increases
Algorithm heapInsert(H, k):
  H.heap-size := H.heap-size + 1
  H[H.heap-size] := −∞
  heapIncreaseKey(H, H.heap-size, k)
Algorithm heapIncreaseKey(H, i, k):
  if k < H[i] then
    return error
  H[i] := k
  while i > 1 and H[parent(i)] < H[i] do
    swap(H[parent(i)], H[i])
    i := parent(i)
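Insert and increase-key might look as follows in Python. This is a sketch under assumptions: a 0-based list that grows with `append` (instead of a fixed array with a separate heap-size), `-math.inf` as the −∞ sentinel, and `ValueError` as the error signalled when the new key is smaller than the current one.

```python
import math


def heap_increase_key(h: list, i: int, key) -> None:
    """Raise the key at index i to `key` and let it bubble up (0-based max-heap)."""
    if key < h[i]:
        raise ValueError("new key is smaller than the current key")
    h[i] = key
    while i > 0 and h[(i - 1) // 2] < h[i]:
        parent = (i - 1) // 2  # 0-based parent index (assumption, slides are 1-based)
        h[parent], h[i] = h[i], h[parent]
        i = parent


def heap_insert(h: list, key) -> None:
    """Add `key` to the max-heap h."""
    h.append(-math.inf)  # new last leaf with the smallest possible key
    heap_increase_key(h, len(h) - 1, key)


h = [16, 14, 10, 8, 7, 9, 3, 2, 4, 1]
heap_insert(h, 15)
print(h)  # [16, 15, 10, 8, 14, 9, 3, 2, 4, 1, 7]
```

The inserted key 15 moves up past 7 and 14 and stops below 16, so the work is bounded by the height of the heap.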
priority queue operations: varia
a way to implement a dynamic set
a key may occur more than once
worst-case time complexity comes from the height and the cost of ‘bubble’
insert in O(log n)
remove in O(log n)
maximum in O(1)
alternative: use an ordered array
alternative: use an unordered array
correctness?
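For comparison, the two array alternatives can be sketched as below. The class names `UnorderedArrayPQ` and `OrderedArrayPQ` are invented for this illustration, and the standard-library `bisect.insort` is used for the ordered insert. The unordered version makes add cheap and remove a linear scan; the ordered version makes remove cheap and add a linear shift.

```python
import bisect


class UnorderedArrayPQ:
    """Max-priority queue on an unordered array: cheap insert, linear extract-max."""

    def __init__(self):
        self._keys = []

    def insert(self, key) -> None:
        self._keys.append(key)  # no ordering maintained

    def extract_max(self):
        # Linear scan for the position of the maximum.
        i = max(range(len(self._keys)), key=self._keys.__getitem__)
        self._keys[i], self._keys[-1] = self._keys[-1], self._keys[i]
        return self._keys.pop()


class OrderedArrayPQ:
    """Max-priority queue on a sorted array: linear insert, cheap extract-max."""

    def __init__(self):
        self._keys = []

    def insert(self, key) -> None:
        bisect.insort(self._keys, key)  # keeps the array sorted, shifting elements

    def extract_max(self):
        return self._keys.pop()  # the maximum is the last element


for pq in (UnorderedArrayPQ(), OrderedArrayPQ()):
    for key in [3, 1, 4, 1, 5]:
        pq.insert(key)
    print([pq.extract_max() for _ in range(5)])  # [5, 4, 3, 1, 1]
```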