Algorithms and Data Structures: Open Addressing, Priority Queue

Algorithms and Data Structures Open Addressing, Priority Queue Albert-Ludwigs-Universität Freiburg Prof. Dr. Rolf Backofen Bioinformatics Group / Department of Computer Science Algorithms and Data Structures, November 2018

Structure

- Hashing
  - Recapitulation
  - Treatment of hash collisions
  - Open Addressing
  - Summary
- Priority Queue
  - Introduction

November 2018 Prof. Dr. Rolf Backofen – Bioinformatics - University Freiburg - Germany

Hashing: Recapitulation

Hashing:
- No hash function is good for all key sets. This cannot work, because a large universe is mapped onto a small set: |U| > m
- For random key sets, even simple hash functions work, e.g. h(x) = x mod m; the randomness of the keys then ensures an even distribution
- To obtain a good hash function for every key set, universal hashing is needed
- Even then, for a fixed key set not every hash function of the class is suitable, only some of them
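The contrast between random and unfavourable key sets can be sketched in a few lines. The table size m = 10, the key ranges and the fixed seed are assumptions made only for this illustration; they do not come from the slides.

```python
import random
from collections import Counter

def h(x, m):
    """Simple hash function from the slide: h(x) = x mod m."""
    return x % m

m = 10                      # assumed table size for the illustration
rng = random.Random(0)      # fixed seed only to make the run repeatable

# Random keys: the buckets fill roughly evenly.
random_keys = [rng.randrange(10**6) for _ in range(1000)]
random_buckets = Counter(h(x, m) for x in random_keys)

# Unfavourable keys: all multiples of m land in bucket 0.
bad_keys = [m * i for i in range(1000)]
bad_buckets = Counter(h(x, m) for x in bad_keys)

print("random keys:", sorted(random_buckets.items()))
print("multiples of m:", sorted(bad_buckets.items()))
```

The second key set shows why no single fixed hash function can be good for every key set.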

Hashing: Recapitulation

Rehashing:
- Universal hashing can still pick a bad hash function, but this is unlikely
- A bad choice can be detected by monitoring the maximum bucket size
- If a pre-defined threshold is exceeded, a rehash is performed

How to rehash?
- Create a new hash table with a new random hash function
- Copy the elements into the new table
- This is expensive but happens rarely, so the average cost is low
- See the amortized analysis in the next lecture
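A minimal sketch of the rehash procedure, under several assumptions that are not taken from the slides: the hash functions are drawn from the class ((a·x + b) mod P) mod m with a fixed prime P, keys are non-negative integers below P, and the table doubles in size on every rehash. The class name `RehashingTable` and the threshold `max_bucket` are hypothetical.

```python
import random

P = 2_147_483_647  # prime 2^31 - 1; assumed to be larger than every key

class RehashingTable:
    def __init__(self, m=8, max_bucket=4, seed=None):
        self.m = m
        self.max_bucket = max_bucket        # pre-defined threshold
        self.rng = random.Random(seed)
        self.rehash_count = 0
        self._new_function()
        self.buckets = [[] for _ in range(m)]

    def _new_function(self):
        # Draw a new random hash function from the assumed class.
        self.a = self.rng.randrange(1, P)
        self.b = self.rng.randrange(0, P)

    def _h(self, x):
        return ((self.a * x + self.b) % P) % self.m

    def insert(self, key, value):
        bucket = self.buckets[self._h(key)]
        for i, (k, _) in enumerate(bucket):
            if k == key:                    # key already present: overwrite
                bucket[i] = (key, value)
                return
        bucket.append((key, value))
        if len(bucket) > self.max_bucket:   # monitored bucket size exceeded
            self._rehash()

    def _rehash(self):
        items = [item for bucket in self.buckets for item in bucket]
        self.rehash_count += 1
        self.m *= 2                         # growing the table is an assumption
        self._new_function()
        self.buckets = [[] for _ in range(self.m)]
        for k, v in items:                  # copy elements into the new table
            self.buckets[self._h(k)].append((k, v))

    def lookup(self, key):
        for k, v in self.buckets[self._h(key)]:
            if k == key:
                return v
        return None

table = RehashingTable(m=8, max_bucket=4, seed=1)
for key in range(100):
    table.insert(key, str(key))
print(table.rehash_count, table.m)
```

The sketch does not re-check the bucket sizes directly after a rehash; an oversized bucket is only noticed on the next insertion into it.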

Hashing: Linked List

Buckets as linked lists:
- Each bucket is a linked list
- Colliding keys are inserted into the linked list of their bucket, either in sorted order or appended at the end
- Figure: hash table with each bucket represented as a linked list, holding the entries (27, "B"), (53, "K"), (13, "R"), (7, "A"), (33, "D"), (2, "E"), (105, "Z"). The lists are unsorted; sorted lists would make an unsuccessful search faster
- Operations in O(1) are possible if a suitable table size and hash function are selected
- Worst case O(n), e.g. with a table size of 1
- A dynamic number of elements is possible
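The bucket approach can be sketched as follows. This is an illustration under assumptions: Python lists stand in for the linked lists, the hash function is key mod m, and the class name `ChainedHashTable` is hypothetical. The entries are the ones shown in the figure; the table size 7 is an assumption, since the figure's size is not recoverable.

```python
class ChainedHashTable:
    def __init__(self, m=7):
        self.m = m
        # One list per bucket; a Python list stands in for a linked list.
        self.buckets = [[] for _ in range(m)]

    def insert(self, key, value):
        bucket = self.buckets[key % self.m]   # assumed hash: key mod m
        for i, (k, _) in enumerate(bucket):
            if k == key:                      # key already present: overwrite
                bucket[i] = (key, value)
                return
        bucket.append((key, value))           # unsorted: append at the end

    def lookup(self, key):
        for k, v in self.buckets[key % self.m]:
            if k == key:
                return v
        return None

entries = [(27, "B"), (53, "K"), (13, "R"), (7, "A"),
           (33, "D"), (2, "E"), (105, "Z")]

table = ChainedHashTable(m=7)
worst = ChainedHashTable(m=1)   # table size 1: every key collides
for key, value in entries:
    table.insert(key, value)
    worst.insert(key, value)

print(table.lookup(33))         # D
print(len(worst.buckets[0]))    # 7
```

With table size 1 all seven entries end up in a single list, which is the O(n) worst case named above.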

Hashing: Open Addressing

- For a colliding key we choose a new free entry of the table
- Static: the number of elements is bounded by the fixed table size
- The probe sequence determines, for each key, the order in which the hash table entries are searched for a free entry
- Insertion: if an entry is already occupied, the next entry of the probe sequence is checked, and so on; as soon as a free entry is found, the element is inserted there
- Lookup: if the first probed entry is occupied but does not hold the element, probing continues until either the element or a free entry is found

Hashing: Open Addressing

Definitions:
- h(s): hash function for key s
- g(s, j): probing function for key s with overflow positions j ∈ {0, ..., m − 1}, e.g. g(s, j) = j
- The probe sequence is calculated by h(s, j) = (h(s) − g(s, j)) mod m ∈ {0, ..., m − 1}

Figure: hash table with indices 0 to 6, four occupied entries (X), and the probe positions h(s, 0) and h(s, 4) of key s

Hashing: Open Addressing - Python

    def insert(s, value):
        # Follow the probe sequence until a free entry is found.
        # t is the table (a list of length m); it must contain a free entry.
        j = 0
        while t[(h(s) - g(s, j)) % m] is not None:
            j += 1
        t[(h(s) - g(s, j)) % m] = (s, value)

Hashing: Open Addressing - Python

    def lookup(s):
        # Follow the probe sequence until the key or a free entry is found.
        # t must contain at least one free entry, otherwise the loop does
        # not terminate for a key that is not stored.
        j = 0
        while t[(h(s) - g(s, j)) % m] is not None:
            if t[(h(s) - g(s, j)) % m][0] == s:
                return t[(h(s) - g(s, j)) % m]
            j += 1
        return None

Hashing: Open Addressing - Linear Probing

Figure: linear probe sequence on a hash table with indices 0 to 6, four occupied entries (X), and the probe positions h(s, 0) and h(s, 4) of key s

- Check the element with the next lower index: g(s, j) := j
  ⇒ Hash function: h(s, j) = (h(s) − j) mod m
- This leads to the following probe sequence:
  h(s), h(s) − 1, h(s) − 2, ..., 0, m − 1, m − 2, ..., h(s) + 1
  where the part m − 1, m − 2, ..., h(s) + 1 results from clipping (wrap-around modulo m)
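The wrap-around in the linear probe sequence can be made visible with a short sketch. The function name `linear_probe_sequence` and the sample values h(s) = 5 and m = 7 are chosen for illustration only.

```python
def linear_probe_sequence(hs, m):
    """All probe positions h(s, j) = (h(s) - j) mod m for j = 0, ..., m - 1."""
    return [(hs - j) % m for j in range(m)]

# Start position 5 in a table of size 7: after index 0 the sequence
# wraps around to m - 1 = 6.
print(linear_probe_sequence(5, 7))   # [5, 4, 3, 2, 1, 0, 6]
```

Every index appears exactly once, so a free entry is always found as long as the table is not full.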

Hashing: Open Addressing - Linear Probing

- Can result in primary clustering
- Resolving a hash collision increases the probability of further hash collisions in nearby entries

Hashing: Open Addressing - Linear Probing

Example:
- Keys: {12, 53, 5, 15, 2, 19}
- Hash function: h(s, j) = (s mod 7 − j) mod 7
- t.insert(12, "A"): h(12, 0) = 5
  Table: [5] 12, A (all other entries free)
- t.insert(53, "B"): h(53, 0) = 4
  Table: [4] 53, B  [5] 12, A

Figure: probe/insertion sequence on a hash map with indices 0 to 6

Hashing: Open Addressing - Linear Probing

Example:
- Hash function: h(s, j) = (s mod 7 − j) mod 7
- t.insert(5, "C"): h(5, 0) = 5, h(5, 1) = 4, h(5, 2) = 3
  Table: [3] 5, C  [4] 53, B  [5] 12, A
- t.insert(15, "D"): h(15, 0) = 1
  Table: [1] 15, D  [3] 5, C  [4] 53, B  [5] 12, A

Figure: probe/insertion sequence on a hash map with indices 0 to 6

Hashing: Open Addressing - Linear Probing

Example:
- Hash function: h(s, j) = (s mod 7 − j) mod 7
- t.insert(2, "E"): h(2, 0) = 2
  Table: [1] 15, D  [2] 2, E  [3] 5, C  [4] 53, B  [5] 12, A
- t.insert(19, "F"): h(19, 0) = 5, h(19, 1) = 4, h(19, 2) = 3, h(19, 3) = 2, h(19, 4) = 1, h(19, 5) = 0
  Table: [0] 19, F  [1] 15, D  [2] 2, E  [3] 5, C  [4] 53, B  [5] 12, A

Figure: probe/insertion sequence on a hash map with indices 0 to 6
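The example can be replayed with a small self-contained table. This is a sketch, not the lecture's implementation: the class name `LinearProbingTable` is hypothetical, and unlike the slide code it bounds the number of probes by m so that a full table raises an error instead of looping forever.

```python
class LinearProbingTable:
    def __init__(self, m):
        self.m = m
        self.t = [None] * m          # None marks a free entry

    def probe(self, s, j):
        # h(s, j) = (s mod m - j) mod m, as in the example
        return (s % self.m - j) % self.m

    def insert(self, s, value):
        for j in range(self.m):
            i = self.probe(s, j)
            if self.t[i] is None or self.t[i][0] == s:
                self.t[i] = (s, value)
                return i             # index where the element was stored
        raise RuntimeError("hash table is full")

    def lookup(self, s):
        for j in range(self.m):
            entry = self.t[self.probe(s, j)]
            if entry is None:        # free entry reached: key is not stored
                return None
            if entry[0] == s:
                return entry
        return None

table = LinearProbingTable(7)
pairs = [(12, "A"), (53, "B"), (5, "C"), (15, "D"), (2, "E"), (19, "F")]
slots = [table.insert(s, v) for s, v in pairs]
print(slots)      # [5, 4, 3, 1, 2, 0]
print(table.t)
```

Key 19 needs six probes before it reaches the free index 0, which illustrates how a cluster of occupied entries lengthens later insertions.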

Hashing: Open Addressing - Squared Probing

Squared probing:
- Motivation: avoid local clustering
- g(s, j) := (−1)^j · ⌈j/2⌉^2
- This leads to the following probe sequence:
  h(s), h(s) + 1, h(s) − 1, h(s) + 4, h(s) − 4, h(s) + 9, h(s) − 9, ...

Figure: squared probe sequence on a hash table with indices 0 to 11, three occupied entries (X), and the probe positions h(s, 0) and h(s, 3) of key s

Hashing: Open Addressing - Squared Probing

Squared probing:
- g(s, j) := (−1)^j · ⌈j/2⌉^2
- If m is a prime number of the form m = 4·k + 3, then the probe sequence is a permutation of the indices of the hash table
- Alternatively: h(s, j) := (h(s) − c1·j + c2·j^2) mod m
- Problem of secondary clustering: there is no local clustering anymore, but keys with the same hash value have the same probe sequence
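The permutation property can be checked numerically with a short sketch. It uses h(s, j) = (h(s) − g(s, j)) mod m with g(s, j) = (−1)^j · ⌈j/2⌉^2; the function names and the sample table sizes 7 and 13 are chosen for illustration.

```python
def g_squared(j):
    """g(s, j) = (-1)^j * ceil(j / 2)^2; the key s is not used."""
    return (-1) ** j * ((j + 1) // 2) ** 2   # (j + 1) // 2 equals ceil(j / 2)

def squared_probe_sequence(hs, m):
    """Probe positions h(s, j) = (h(s) - g(s, j)) mod m for j = 0, ..., m - 1."""
    return [(hs - g_squared(j)) % m for j in range(m)]

# m = 7 = 4*1 + 3 is a prime of the required form: all indices are visited.
seq7 = squared_probe_sequence(3, 7)
print(seq7)                  # [3, 4, 2, 0, 6, 5, 1]

# m = 13 = 4*3 + 1 is prime but not of that form: some indices repeat.
seq13 = squared_probe_sequence(0, 13)
print(len(set(seq13)))
```

For m = 13 part of the table is never probed, so an insertion can fail even though free entries remain.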

Hashing: Open Addressing - Uniform Probing

Uniform probing:
- Motivation: so far, for linear and squared probing, the function g(s, j) uses only the step counter j
  ⇒ Apart from the start position h(s), the probe sequence is independent of the key s
- Uniform probing computes, depending on the key s, a sequence g(s, j) that is a permutation of all possible indices
- Advantage: prevents clustering, because different keys with the same hash value do not produce the same probe sequence
- Disadvantage: hard to implement
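One way to get a feeling for uniform probing is to emulate it with a pseudo-random generator seeded by the key. This is only an illustrative stand-in and an assumption of this sketch, not the method from the lecture: it builds the whole permutation in O(m) time per key, which hints at why uniform probing is considered hard to implement efficiently. The function name `uniform_probe_sequence` is hypothetical and keys are assumed to be integers.

```python
import random

def uniform_probe_sequence(s, m):
    """Key-dependent pseudo-random permutation of the indices 0, ..., m - 1."""
    order = list(range(m))
    # Seeding with the key makes the permutation reproducible for that key.
    random.Random(s).shuffle(order)
    return order

# Keys 10 and 31 have the same value s mod 7, but each gets its own permutation.
print(uniform_probe_sequence(10, 7))
print(uniform_probe_sequence(31, 7))
```

Each call returns a full permutation of the indices, and repeated calls for the same key return the same sequence, which is what insert and lookup need in order to agree.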

Hashing: Open Addressing - Double Hashing

Double hashing:
- Motivation: take the key s into account in the probe sequence
- Use two independent hash functions h1(s), h2(s)
- Hash function: h(s, j) = (h1(s) + j · h2(s)) mod m

Figure: double hashing probe sequence on a hash table with indices 0 to 12 and six occupied entries (X); the probes start at h(s, 0) = h1(s), advance in steps of h2(s), and reach h(s, 3)

Hashing: Open Addressing - Double Hashing

Double hashing:
- Hash function: h(s, j) = (h1(s) + j · h2(s)) mod m
- Probe sequence (each term taken mod m): h1(s), h1(s) + h2(s), h1(s) + 2 · h2(s), h1(s) + 3 · h2(s), ...
- Works well in practical use
- This method is an approximation of uniform probing

Hashing: Open Addressing - Double Hashing - Example

Example:
- h1(s) = s mod 7
- h2(s) = (s mod 5) + 1
- h(s, j) = (h1(s) + j · h2(s)) mod 7

Table: comparing both hash functions

s       10  19  31  22  14  16
h1(s)    3   5   3   1   0   2
h2(s)    1   5   2   3   5   2

The efficiency of double hashing is dependent on h1(s) ≠ h2(s)
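The two hash functions of the example can be evaluated directly. The sketch below recomputes the table rows and prints the full probe sequences of keys 10 and 31, which share the start position h1 = 3; the helper name `probe_sequence` is hypothetical.

```python
def h1(s):
    return s % 7

def h2(s):
    return (s % 5) + 1

def probe_sequence(s, m=7):
    """Probe positions h(s, j) = (h1(s) + j * h2(s)) mod m for j = 0, ..., m - 1."""
    return [(h1(s) + j * h2(s)) % m for j in range(m)]

keys = [10, 19, 31, 22, 14, 16]
print([h1(s) for s in keys])   # [3, 5, 3, 1, 0, 2]
print([h2(s) for s in keys])   # [1, 5, 2, 3, 5, 2]

# Same start position, different step width, so the sequences diverge.
print(probe_sequence(10))      # [3, 4, 5, 6, 0, 1, 2]
print(probe_sequence(31))      # [3, 5, 0, 2, 4, 6, 1]
```

Because 7 is prime and h2(s) lies between 1 and 5, every step width is coprime to the table size, so each probe sequence visits all seven indices.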

Hashing: Open Addressing - Double Hashing - Optimization

Figure: double hashing on a hash table with indices 0 to 12 and three occupied entries (X), showing the position h(s1, 0) of key s1 and the probe positions h(s2, 0), h(s2, 1), h(s2, 2), h(s2, 3) of key s2

Double hashing by Brent:
- Motivation: because different keys have different probe sequences, the order of the insertions has an impact on the efficiency of a successful search
