CS 225
Data Structures
October 26 – Hashing
G Carl Evans
What if O(log n) is not fast enough?
Do you feel lucky?
A Hash Table based Dictionary
A Hash Table consists of three things:
Client Code:
Dictionary<KeyType, ValueType> d;
d[k] = v;
(Angrave, CS 241) (Beckman, CS 421) (Challen, CS 125) (Davis, CS 101) (Evans, CS 126) (Fagen-Ulmschneider, CS 107) (Gunter, CS 422) (Herman, CS 233)
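The client code and the (key, value) pairs above can be tried out with a short sketch. It assumes std::unordered_map as a stand-in for the slide's Dictionary type; the function name buildCourseDictionary is hypothetical.

```cpp
#include <string>
#include <unordered_map>

// Builds the instructor -> course dictionary from the pairs on the slide.
// Assumption: std::unordered_map plays the role of Dictionary<KeyType, ValueType>.
std::unordered_map<std::string, std::string> buildCourseDictionary() {
    std::unordered_map<std::string, std::string> d;
    d["Angrave"] = "CS 241";            // d[k] = v, as in the client code
    d["Beckman"] = "CS 421";
    d["Challen"] = "CS 125";
    d["Davis"] = "CS 101";
    d["Evans"] = "CS 126";
    d["Fagen-Ulmschneider"] = "CS 107";
    d["Gunter"] = "CS 422";
    d["Herman"] = "CS 233";
    return d;
}
```

A lookup such as buildCourseDictionary().at("Evans") then yields "CS 126" without comparing against the other keys in any particular order.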
Hash function Key Value
Our hash function consists of two parts:
Choosing a good hash function is tricky…
Characteristics of a good hash function:
Keyspaces
Easy to create if: |KeySpace| ~ N
Difficult to create:
In CS 225, we focus on general purpose hash functions. Other hash functions exist with different properties (e.g., cryptographic hash functions).
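A minimal sketch of a general purpose hash function for string keys, split into two steps: a hash that turns the key into an integer, and a compression that maps that integer into the array range [0, N). The two-step split, the multiplier 31, and the names hashString and compress are assumptions made for illustration, not taken from the slides.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Step 1 (hash): turn a string key into an unsigned integer.
// Polynomial hash with multiplier 31, chosen only for illustration.
std::uint64_t hashString(const std::string& key) {
    std::uint64_t h = 0;
    for (unsigned char c : key) {
        h = h * 31 + c;  // unsigned overflow wraps around, which is fine here
    }
    return h;
}

// Step 2 (compression): map the integer into a valid array index in [0, N).
std::size_t compress(std::uint64_t h, std::size_t N) {
    return static_cast<std::size_t>(h % N);
}
```

Both steps are deterministic and run in time proportional to the key length, which matches what one would expect of a general purpose hash function; nothing here aims at cryptographic strength.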
S = { 16, 8, 4, 13, 29, 11, 22 }, |S| = n
h(k) = k % 7, |Array| = N
Array indices: 0 to 6 (Example of open hashing)
Running times (Worst Case / SUHA):
Insert:
Remove/Find:
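The open hashing example can be sketched as separate chaining with one linked list per array slot. This is a minimal sketch that assumes non-negative integer keys and inserts at the front of each chain; the class name ChainedSet is hypothetical.

```cpp
#include <cstddef>
#include <list>
#include <vector>

// Separate chaining: each of the N array slots holds a linked list of keys.
// Assumption: keys are non-negative, so k % N is a valid index.
class ChainedSet {
public:
    explicit ChainedSet(std::size_t N) : buckets_(N) {}

    // Insert at the front of the chain: constant time, no duplicate check.
    void insert(int k) { buckets_[index(k)].push_front(k); }

    // Find walks only the chain that k hashes to.
    bool find(int k) const {
        for (int x : buckets_[index(k)]) {
            if (x == k) return true;
        }
        return false;
    }

    const std::list<int>& chain(std::size_t i) const { return buckets_[i]; }

private:
    std::size_t index(int k) const {
        return static_cast<std::size_t>(k) % buckets_.size();  // h(k) = k % N
    }
    std::vector<std::list<int>> buckets_;
};
```

With N = 7 and the keys of S inserted in the order listed, slot 1 ends up holding the chain 22, 29, 8, slot 2 holds 16, slot 4 holds 11, 4, and slot 6 holds 13.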
S = { 16, 8, 4, 13, 29, 11, 22 }, |S| = n
h(k) = k % 7, |Array| = N
Try h(k) = (k + 0) % 7, if full…
Try h(k) = (k + 1) % 7, if full…
Try h(k) = (k + 2) % 7, if full…
Try …
Array indices: 0 to 6 (Example of closed hashing)
Running times (Worst Case / SUHA):
Insert:
Remove/Find:
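The probe sequence above can be sketched as a small linear probing table. This is a minimal sketch that assumes non-negative keys, uses -1 to mark an empty slot, and does not support removal; the names LinearProbingSet and kEmpty are hypothetical.

```cpp
#include <cstddef>
#include <vector>

const int kEmpty = -1;  // sentinel for an unused slot (assumes keys are >= 0)

// Closed hashing with linear probing: try (k + 0) % N, (k + 1) % N, ...
class LinearProbingSet {
public:
    explicit LinearProbingSet(std::size_t N) : slots_(N, kEmpty) {}

    // Returns false if every slot is occupied. No duplicate check.
    bool insert(int k) {
        const std::size_t N = slots_.size();
        for (std::size_t i = 0; i < N; ++i) {
            const std::size_t idx = (static_cast<std::size_t>(k) + i) % N;
            if (slots_[idx] == kEmpty) {
                slots_[idx] = k;
                return true;
            }
        }
        return false;
    }

    // Follows the same probe sequence; an empty slot means k is absent.
    bool find(int k) const {
        const std::size_t N = slots_.size();
        for (std::size_t i = 0; i < N; ++i) {
            const std::size_t idx = (static_cast<std::size_t>(k) + i) % N;
            if (slots_[idx] == kEmpty) return false;
            if (slots_[idx] == k) return true;
        }
        return false;
    }

    const std::vector<int>& slots() const { return slots_; }

private:
    std::vector<int> slots_;
};
```

Inserting the keys of S in the order listed with N = 7 fills the array as 22, 8, 16, 29, 4, 11, 13 (indices 0 to 6). The last key, 22, needs seven probes because slots 1 through 6 are already occupied.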
Primary clustering:
Description:
Remedy:
S = { 16, 8, 4, 13, 29, 11, 22 }, |S| = n
h(k) = k % 7, |Array| = N
Try h(k) = (k + 0*h2(k)) % 7, if full…
Try h(k) = (k + 1*h2(k)) % 7, if full…
Try h(k) = (k + 2*h2(k)) % 7, if full…
Try …
h(k, i) = (h1(k) + i*h2(k)) % 7
Array indices: 0 to 6 (Example of closed hashing)
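The probe formula h(k, i) = (h1(k) + i*h2(k)) % 7 can be sketched as follows. The extracted text does not give h2, so this sketch assumes h2(k) = 5 - (k % 5), a common textbook choice that never evaluates to 0. The names secondHash, insertDoubleHashing, and kFree are hypothetical, and keys are assumed non-negative.

```cpp
#include <cstddef>
#include <vector>

const int kFree = -1;  // sentinel for an unused slot (assumes keys are >= 0)

// Assumed second hash function (not from the slides): always in 1..5.
std::size_t secondHash(int k) {
    return 5 - static_cast<std::size_t>(k % 5);
}

// Double hashing insert: probe (h1(k) + i * h2(k)) % N for i = 0, 1, 2, ...
// Returns false if no free slot is found within N probes.
bool insertDoubleHashing(std::vector<int>& table, int k) {
    const std::size_t N = table.size();
    const std::size_t h1 = static_cast<std::size_t>(k) % N;
    const std::size_t step = secondHash(k);
    for (std::size_t i = 0; i < N; ++i) {
        const std::size_t idx = (h1 + i * step) % N;
        if (table[idx] == kFree) {
            table[idx] = k;
            return true;
        }
    }
    return false;
}
```

Because the step size depends on the key, two keys that collide at the same slot usually follow different probe sequences, which is what breaks up the runs that linear probing creates. With N = 7 (prime) and a step between 1 and 5, the N probes visit every slot once.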
The expected number of probes for find(key) under SUHA
Linear Probing:
Double Hashing:
Separate Chaining:
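The equations themselves did not survive extraction. For reference, the standard textbook results in terms of the load factor α = n/N are shown below. They are supplied here as an assumption about what the slide showed, and exact constants can differ slightly depending on how probes are counted.

```latex
\begin{align*}
\text{Linear probing:}\quad
  & \text{successful } \tfrac{1}{2}\left(1 + \frac{1}{1-\alpha}\right),
  && \text{unsuccessful } \tfrac{1}{2}\left(1 + \frac{1}{(1-\alpha)^{2}}\right) \\
\text{Double hashing:}\quad
  & \text{successful } \frac{1}{\alpha}\,\ln\frac{1}{1-\alpha},
  && \text{unsuccessful } \frac{1}{1-\alpha} \\
\text{Separate chaining:}\quad
  & \text{successful } 1 + \frac{\alpha}{2},
  && \text{unsuccessful } 1 + \alpha
\end{align*}
```

All of these depend only on α, not on n alone. The two probing strategies grow without bound as α approaches 1, while separate chaining grows linearly in α.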
(Don’t memorize these equations, no need.) Instead, observe:
Linear Probing:
Double Hashing:
What if the array fills?
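One common answer is to allocate a larger array and re-insert every key, since h(k) depends on the array size. The sketch below assumes a linear probing table with -1 as the empty marker and a growth rule of 2N + 1; the growth rule and the names probeInsert, growAndRehash, and kUnused are hypothetical.

```cpp
#include <cstddef>
#include <vector>

const int kUnused = -1;  // sentinel for an empty slot (assumes keys are >= 0)

// Linear probing insert into a table of any size; false if the table is full.
bool probeInsert(std::vector<int>& table, int k) {
    const std::size_t N = table.size();
    for (std::size_t i = 0; i < N; ++i) {
        const std::size_t idx = (static_cast<std::size_t>(k) + i) % N;
        if (table[idx] == kUnused) {
            table[idx] = k;
            return true;
        }
    }
    return false;
}

// Grow to 2N + 1 slots (assumed growth rule) and rehash every stored key.
// Keys cannot simply be copied slot by slot: k % N changes when N changes.
void growAndRehash(std::vector<int>& table) {
    std::vector<int> bigger(2 * table.size() + 1, kUnused);
    for (int k : table) {
        if (k != kUnused) probeInsert(bigger, k);
    }
    table.swap(bigger);
}
```

Each resize costs time proportional to the number of stored keys, but growing geometrically keeps the cost per insert constant on average. In practice the resize is triggered when the load factor passes a threshold rather than when the array is completely full.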
Which collision resolution strategy is better?
What structure do hash tables replace? What constraint exists on hashing that doesn’t exist with BSTs? Why talk about BSTs at all?
                 Hash Table                  AVL    Linked List
Find             Amortized:  Worst Case:
Insert           Amortized:  Worst Case:
Storage Space
std::map
::operator[]
::insert
::erase
::lower_bound(key) → Iterator to first element ≥ key
::upper_bound(key) → Iterator to first element > key
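A short sketch of the two ordered queries on std::map. In the standard library, lower_bound(key) returns an iterator to the first element whose key is not less than key, and upper_bound(key) to the first element whose key is greater. The helper names and the -1 return value for "no such element" are assumptions for illustration.

```cpp
#include <map>
#include <string>

// Smallest stored key that is >= key, or -1 if there is none.
int firstKeyAtLeast(const std::map<int, std::string>& m, int key) {
    auto it = m.lower_bound(key);
    return it == m.end() ? -1 : it->first;
}

// Smallest stored key that is > key, or -1 if there is none.
int firstKeyGreater(const std::map<int, std::string>& m, int key) {
    auto it = m.upper_bound(key);
    return it == m.end() ? -1 : it->first;
}
```

For a map holding the keys 10, 20, and 30, firstKeyAtLeast(m, 20) gives 20 while firstKeyGreater(m, 20) gives 30. Both run in O(log n) because std::map keeps its keys in sorted order.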
std::unordered_map
::operator[]
::insert
::erase
::lower_bound(key) and ::upper_bound(key) are not available (an unordered_map keeps no key order)
::load_factor() → Returns the current load factor
::max_load_factor(ml) → Sets the max load factor
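The load factor calls can be exercised with a small sketch. The function name fillAndReportLoadFactor and its parameters are hypothetical; the member functions max_load_factor, load_factor, and operator[] are the standard std::unordered_map ones.

```cpp
#include <unordered_map>

// Inserts `count` keys after capping the load factor at `maxLoad`,
// then reports the load factor the container actually ended up with.
float fillAndReportLoadFactor(int count, float maxLoad) {
    std::unordered_map<int, int> m;
    m.max_load_factor(maxLoad);  // must be set before the inserts to matter
    for (int i = 0; i < count; ++i) {
        m[i] = i * i;            // may trigger an automatic rehash
    }
    return m.load_factor();      // size() / bucket_count()
}
```

The container adds buckets and rehashes on its own whenever an insert would push the load factor past the maximum, so the reported value stays at or below maxLoad. A lower maximum trades memory for shorter chains.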