
Hash Table: Struktur Data & Algoritme (Data Structures & Algorithms) - PowerPoint PPT Presentation



  1. Slide 1: Hash Table
     Struktur Data & Algoritme (Data Structures & Algorithms)
     Denny (denny@cs.ui.ac.id)
     Suryana Setiawan (setiawan@cs.ui.ac.id)
     Fakultas Ilmu Komputer (Faculty of Computer Science), Universitas Indonesia
     Semester Genap (Even Semester) 2004/2005
     Version 2.0 - Internal Use Only

     Slide 2: Review
     • Linked List
       • insert, find, delete operations take O(n)
     • Stack & Queue
       • insert, find, delete operations take O(1)
       • but the access is restricted
     • Binary Search Tree
       • insert, find, delete operations take O(log n) in the average case, but O(n) in the worst case
     • AVL Tree, Red-Black Tree
       • insert, find, delete operations take O(log n)
     SDA/TOPIC/V2.0/2

     Slide 3: Review
     • Array
       • all operations take O(1) time
       • data is accessed using an index (integer)
       • size should be determined first
       • not growable

     Slide 4: Objectives
     • Understand the hash table and its operations
     • Understand the advantages and disadvantages of using a hash table

  2. Slide 5: Outline
     • Hashing
       • Definition
       • Hash function
       • Collision resolution
         • Open hashing
           • Separate chaining
         • Closed hashing (open addressing)
           • Linear probing
           • Quadratic probing
           • Double hashing
         • Primary clustering, secondary clustering
     • Access: insert, find, delete

     Slide 6: Hash Tables
     • Hashing is used for storing a relatively large amount of data in a table called the hash table ADT.
     • The size of the hash table is usually fixed at H-size, which is larger than the amount of data that we want to store.
     • We define the load factor (λ) to be the ratio of the number of stored items to the size of the hash table.
     • The hash function maps an item into an index in the range 0 to H-size - 1.
     [Figure: the key of an item is passed through the hash function, which selects one cell of the hash table, indexed 0, 1, 2, 3, ..., H-1]

     Slide 7: Hash Tables (2)
     • Hashing is a technique used to perform insertions, deletions, and finds in constant average time.
     • To insert or find a piece of data, we assign a key to each element and use a function, called the hash function, to determine the location of the element within the table.
     • Hash tables are arrays of cells with a fixed size, containing data or keys corresponding to data.
     • For each key, we use the hash function to map the key into some number in the range 0 to H-size - 1.

     Slide 8: Hash Function
     • A hash function should have the following features:
       • Easy to compute.
       • Two distinct keys map to two different cells in the array (not true in general; why?).
         • This can be achieved by using a direct-address table when the universal set of keys is reasonably small.
       • Distributes the keys evenly among the cells.
     • One simple hash function is the mod function with a prime number.
     • Any manipulation of digits with low complexity and good distribution can be used.
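
The load factor and the "mod with a prime number" hash function described on the slides above can be sketched in a few lines of Java. This is only an illustration: the class name `HashBasics`, the method names, the table size 11, and the sample keys are assumptions chosen for the example and do not come from the slides.

```java
// Minimal sketch; all names and sample values here are illustrative assumptions.
final class HashBasics {
    // Maps an integer key into the range 0 .. tableSize-1 using the mod function.
    static int indexFor(int key, int tableSize) {
        // Math.floorMod keeps the result non-negative even for negative keys.
        return Math.floorMod(key, tableSize);
    }

    // Load factor (lambda) = number of stored items / size of the hash table.
    static double loadFactor(int itemCount, int tableSize) {
        return (double) itemCount / tableSize;
    }

    public static void main(String[] args) {
        int tableSize = 11; // a prime table size, chosen for this example
        int[] keys = {15, 26, 7};
        for (int key : keys) {
            System.out.println(key + " -> cell " + indexFor(key, tableSize));
        }
        System.out.println("load factor = " + loadFactor(keys.length, tableSize));
    }
}
```

With these sample keys, 15 and 26 both map to cell 4, which illustrates why "two distinct keys map to two different cells" is not true in general: there are usually far more possible keys than cells.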

  3. Slide 9: Hash Function: Truncation
     • Part of the key is simply ignored, with the remainder truncated or concatenated to form the index.

       Phone no.   Index
       731-3018    338
       539-2309    329
       428-1397    217

     Slide 10: Hash Function: Folding
     • The data can be split up into smaller chunks which are then folded together in some form.

       Phone no.   3-group      Index
       7313018     73+13+018    104
       5392309     53+92+309    454
       4281397     42+81+397    520

     Slide 11: Hash Function: Modular arithmetic
     • Convert the data into an integer, divide by the size of the hash table, and take the remainder as the index.

       Groups      Sum     Index
       731+3018    3749    3749 % 100 = 49
       539+2309    2848    2848 % 100 = 48
       428+1397    1825    1825 % 100 = 25

     Slide 12: Choosing a hash function
     • A good hash function should satisfy two criteria:
       1. It should be quick to compute
       2. It should minimize the number of collisions
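
The three phone-number tables above can be reproduced with a short Java sketch. The class and method names are hypothetical. The slides do not say which digits truncation keeps; the choice of the 2nd, 4th and 7th digits below is inferred from the three sample rows and should be read as an assumption.

```java
// Minimal sketch of the truncation, folding and modular-arithmetic examples.
// Input is a 7-digit phone number without the dash, e.g. "7313018".
final class PhoneHashes {
    // Truncation: keep only the 2nd, 4th and 7th digits (an inference from the sample table).
    static int truncate(String digits) {
        return Integer.parseInt("" + digits.charAt(1) + digits.charAt(3) + digits.charAt(6));
    }

    // Folding: split into groups of 2, 2 and 3 digits and add the groups.
    static int fold(String digits) {
        return Integer.parseInt(digits.substring(0, 2))
             + Integer.parseInt(digits.substring(2, 4))
             + Integer.parseInt(digits.substring(4, 7));
    }

    // Modular arithmetic: add the 3-digit prefix and the 4-digit suffix,
    // then take the remainder after dividing by the table size.
    static int modular(String digits, int tableSize) {
        int sum = Integer.parseInt(digits.substring(0, 3))
                + Integer.parseInt(digits.substring(3, 7));
        return sum % tableSize;
    }

    public static void main(String[] args) {
        for (String phone : new String[] {"7313018", "5392309", "4281397"}) {
            System.out.println(phone + ": truncation=" + truncate(phone)
                    + " folding=" + fold(phone)
                    + " modular=" + modular(phone, 100));
        }
    }
}
```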

  4. Slide 13: Example of hash function
     • Hash function for a string
       • $X = 128$
       • $A_3 X^3 + A_2 X^2 + A_1 X^1 + A_0 X^0$
       • $((A_3 X + A_2) X + A_1) X + A_0$
     • The result of the hash function is much larger than the size of the table, so we should take the result modulo the size of the hash table.
     • Modulo
       • (A + B) % C = (A % C + B % C) % C
       • (A * B) % C = (A % C * B % C) % C

     Slide 14: Example of hash function

         int hash(String key, int tableSize) {
             int hashVal = 0;
             for (int i = 0; i < key.length(); i++) {
                 // Reduce at every step so hashVal stays below tableSize.
                 hashVal = (hashVal * 128 + key.charAt(i)) % tableSize;
             }
             return hashVal % tableSize;
         }

     Slide 15: Example of hash function

         int hash(String key, int tableSize) {
             int hashVal = 0;
             for (int i = 0; i < key.length(); i++) {
                 // No reduction inside the loop: hashVal may overflow and turn negative.
                 hashVal = (hashVal * 37 + key.charAt(i));
             }
             hashVal %= tableSize;
             if (hashVal < 0) {
                 // Bring a negative remainder back into the range 0 .. tableSize-1.
                 hashVal += tableSize;
             }
             return hashVal;
         }

     Slide 16: Example of hash function

         int hash(String key, int tableSize) {
             int hashVal = 0;
             for (int i = 0; i < key.length(); i++) {
                 // Simply add up the character codes.
                 hashVal += key.charAt(i);
             }
             return hashVal % tableSize;
         }
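
The two modulo identities are what allow the X = 128 string hash to reduce after every step instead of once at the end. The following sketch checks this by comparing the step-by-step version against an exact evaluation of the polynomial with `java.math.BigInteger`. The class name, method names, sample keys, and the table size 101 are assumptions made for the example.

```java
import java.math.BigInteger;

// Minimal sketch: compares Horner's rule with a reduction at every step
// against an exact evaluation of the same polynomial.
final class HornerModSketch {
    // Horner's rule, reducing modulo tableSize after every step.
    static int hashStepwise(String key, int tableSize) {
        int hashVal = 0;
        for (int i = 0; i < key.length(); i++) {
            hashVal = (hashVal * 128 + key.charAt(i)) % tableSize;
        }
        return hashVal;
    }

    // Reference: evaluate the full polynomial exactly, then reduce once at the end.
    static int hashExact(String key, int tableSize) {
        BigInteger value = BigInteger.ZERO;
        BigInteger x = BigInteger.valueOf(128);
        for (int i = 0; i < key.length(); i++) {
            value = value.multiply(x).add(BigInteger.valueOf(key.charAt(i)));
        }
        return value.mod(BigInteger.valueOf(tableSize)).intValue();
    }

    public static void main(String[] args) {
        int tableSize = 101; // small prime chosen for the example
        for (String key : new String[] {"alpha", "flamingo", "hallmark"}) {
            System.out.println(key + ": stepwise=" + hashStepwise(key, tableSize)
                    + " exact=" + hashExact(key, tableSize));
        }
    }
}
```

Both methods agree as long as the intermediate product in `hashStepwise` fits in an `int`, which holds for small table sizes such as the one used here.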

  5. Slide 17: Collision resolution
     • When two keys map into the same cell, we get a collision.
     • A collision may occur during insertion, so we need a procedure (collision resolution) to resolve it.

     Slide 18: Closed Hashing
     • If a collision occurs, try to find an alternative cell within the table.
     • Closed hashing is also known as open addressing.
     • For insertion, we try cells in sequence by using an incremented function like:
       • $h_i(x) = (\mathrm{hash}(x) + f(i)) \bmod H\text{-size}$, with $f(0) = 0$
     • The function f is the collision resolution strategy.
     • The table is bigger than the number of data items.
     • Different methods to choose the function f:
       • Linear probing
       • Quadratic probing
       • Double hashing

     Slide 19: Linear probing
     • Use a linear function f(i) = i
     • Find the first free position in the table for the key that is close to its actual (hashed) position.
     • Least complex function.
     • May result in primary clustering.
       • Elements that hash to different locations probe the same alternative cells.
     • The complexity of this probing depends on the value of λ (load factor).
     • We do not use this probing if λ > 0.5.

     Slide 20: Hashing - insert
     [Figure: a hash table with cells 0 to 15 (and beyond) containing alpha, crystal, dawn, emerald, flamingo, hallmark, marigold and moon]
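
A minimal sketch of closed hashing with linear probing, f(i) = i, might look as follows. The slides do not state which hash function their word example uses; the first-letter hash below ('a' to 0, 'b' to 1, and so on) and the table size of 26 are assumptions made purely for illustration, and the sketch only accepts keys that start with a lowercase letter.

```java
// Minimal sketch of insert and find with linear probing.
// Class name, method names and the first-letter hash are illustrative assumptions.
final class LinearProbingSketch {
    private final String[] cells;

    LinearProbingSketch(int size) {
        cells = new String[size];
    }

    // Hypothetical hash: first letter 'a'..'z' mapped to 0..25 (lowercase keys only).
    int hash(String key) {
        return (key.charAt(0) - 'a') % cells.length;
    }

    // h_i(x) = (hash(x) + f(i)) mod H-size with f(i) = i.
    int probe(String key, int i) {
        return (hash(key) + i) % cells.length;
    }

    // Returns the index of the cell used, or -1 if the table is full.
    int insert(String key) {
        for (int i = 0; i < cells.length; i++) {
            int index = probe(key, i);
            if (cells[index] == null || cells[index].equals(key)) {
                cells[index] = key;
                return index;
            }
        }
        return -1;
    }

    // Returns the index of the cell holding key, or -1 once an empty cell is reached.
    int find(String key) {
        for (int i = 0; i < cells.length; i++) {
            int index = probe(key, i);
            if (cells[index] == null) {
                return -1;
            }
            if (cells[index].equals(key)) {
                return index;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        LinearProbingSketch table = new LinearProbingSketch(26);
        String[] words = {"alpha", "crystal", "dawn", "emerald", "flamingo", "hallmark", "moon", "marigold"};
        for (String word : words) {
            // marigold collides with moon in cell 12 and moves on to cell 13.
            System.out.println(word + " -> cell " + table.insert(word));
        }
    }
}
```

Under this assumed hash, looking up "cobalt" starts at cell 2 and walks through crystal, dawn, emerald and flamingo before an empty cell ends the search, a small instance of the clustering cost described on the linear probing slide.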

  6. Slide 21: Hashing - lookup
     [Figure: a hash table with 0 alpha, 2 crystal, 3 dawn, 4 emerald, 5 flamingo, 7 hallmark, 12 moon, 13 marigold, 15 private; lookups of cobalt?, marigold? and private? are traced through the table]

     Slide 22: Hashing - delete
     • lazy deletion - why?
     [Figure: the same table after "delete emerald" (cell 4) and "delete moon" (cell 12): 0 alpha, 2 crystal, 3 dawn, 5 flamingo, 7 hallmark, 13 marigold, 15 private]

     Slide 23: Hashing - operation after delete
     [Figure: the table after the deletions (0 alpha, 2 crystal, 3 dawn, 5 flamingo, 7 hallmark, 13 marigold, 15 private), with an insertion of custom and a lookup of marigold? traced through it]

     Slide 24: Primary Clustering
     • Elements that hash to different locations probe the same alternative cells.
     [Figure: two copies of a table containing alpha, canary, crystal, dawn, custom, flamingo, hallmark, marigold and private, with cobalt and dark being inserted]
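
The "lazy deletion - why?" question can be explored with a small sketch: a deleted cell is marked with a tombstone instead of being emptied, so that later lookups keep probing past it. As before, the first-letter hash, the table size of 26, and all names are assumptions for illustration only; the slides do not show an implementation.

```java
// Minimal sketch of lazy deletion in a linear-probing table (lowercase keys only).
final class LazyDeletionSketch {
    // Tombstone marker, compared by identity so it can never equal a real key.
    private static final String DELETED = new String("<deleted>");
    private final String[] cells;

    LazyDeletionSketch(int size) {
        cells = new String[size];
    }

    // Hypothetical hash: first letter 'a'..'z' mapped to 0..25.
    private int hash(String key) {
        return (key.charAt(0) - 'a') % cells.length;
    }

    // Returns the index of key, or -1. Tombstones do not stop the search.
    int find(String key) {
        for (int i = 0; i < cells.length; i++) {
            int index = (hash(key) + i) % cells.length;
            if (cells[index] == null) {
                return -1; // a truly empty cell ends the probe sequence
            }
            if (cells[index] != DELETED && cells[index].equals(key)) {
                return index;
            }
        }
        return -1;
    }

    // Inserts key into the first empty or tombstoned cell; returns its index, or -1 if full.
    int insert(String key) {
        int existing = find(key);
        if (existing >= 0) {
            return existing;
        }
        for (int i = 0; i < cells.length; i++) {
            int index = (hash(key) + i) % cells.length;
            if (cells[index] == null || cells[index] == DELETED) {
                cells[index] = key;
                return index;
            }
        }
        return -1;
    }

    // Marks the cell as deleted instead of clearing it.
    boolean delete(String key) {
        int index = find(key);
        if (index < 0) {
            return false;
        }
        cells[index] = DELETED;
        return true;
    }

    public static void main(String[] args) {
        LazyDeletionSketch table = new LazyDeletionSketch(26);
        for (String w : new String[] {"alpha", "crystal", "dawn", "emerald", "flamingo",
                                      "hallmark", "moon", "marigold", "private"}) {
            table.insert(w);
        }
        table.delete("emerald");
        table.delete("moon");
        System.out.println("marigold found in cell " + table.find("marigold"));
        System.out.println("custom inserted in cell " + table.insert("custom"));
    }
}
```

If `delete` set the cell back to `null` instead, the lookup of marigold would stop at the emptied cell 12 and wrongly report that the key is absent, which is one answer to the "why?" on the delete slide.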

  7. Slide 25: Quadratic probing
     • Eliminates primary clustering by selecting $f(i) = i^2$.
     • Problems become more severe once the hash table is more than half full.
     • You have to select an appropriate table size that is not the square of a number.
     • We can prove that quadratic probing with a prime table size and a table that is at least half empty will always find a location for an element.
     • The next probe can be computed incrementally by noting that $f(i) = i^2 = f(i-1) + 2i - 1$.
     • Elements that hash to the same location will probe the same alternative cells (secondary clustering).

     Slide 26: Double hashing
     • The collision resolution function is another hash function, such as $f(i) = i \cdot \mathrm{hash}_2(x)$.
     • Each time, a multiple of $\mathrm{hash}_2(x)$ is added to the probe.
     • The second hash function must be chosen carefully so that it never evaluates to zero and so that it probes all the cells.
     • It is essential to have a hash table of prime size.

     Slide 27: Double Hashing
     [Figure: two copies of a table containing alpha, canary, crystal, dawn, custom, flamingo, hallmark, marigold and private, with cobalt, dark and done being inserted]

     Slide 28: Open Hashing
     • Collisions are resolved by inserting all elements that hash to the same bucket into a single collection of values.
     • Open hashing:
       • Keep a linked list of all the elements that are hashed to the same cell (separate chaining).
       • Each cell in the hash table contains a pointer to a linked list containing the data.
     • Functions and analysis of open hashing:
       • Inserting a new element into the table: we add the element at the beginning or the end of the appropriate linked list.
         • The choice depends on whether you want to check for duplicates.
         • It also depends on how frequently you expect to access the most recently added elements.
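
The probe sequences for quadratic probing and double hashing, and the separate chaining used by open hashing, can be sketched together as below. Everything here is a hedged illustration: the names are hypothetical, the second hash function R - (key mod R) is a common textbook choice rather than something stated on the slides, the probe helpers assume non-negative integer keys, and the chained table uses Java's built-in `String.hashCode()` as its hash.

```java
import java.util.ArrayList;
import java.util.LinkedList;
import java.util.List;

// Minimal sketch; all names and the choice of hash2 are illustrative assumptions.
final class CollisionSketch {
    // Quadratic probing: f(i) = i^2, built incrementally as f(i) = f(i-1) + 2i - 1.
    static int[] quadraticProbes(int home, int tableSize, int count) {
        int[] probes = new int[count];
        int offset = 0; // f(0) = 0
        for (int i = 0; i < count; i++) {
            if (i > 0) {
                offset += 2 * i - 1;
            }
            probes[i] = (home + offset) % tableSize;
        }
        return probes;
    }

    // Double hashing: f(i) = i * hash2(x), with the assumed hash2(x) = r - (x mod r).
    // For non-negative keys this step is between 1 and r, so it is never zero.
    static int[] doubleHashProbes(int key, int tableSize, int r, int count) {
        int home = key % tableSize;
        int step = r - (key % r);
        int[] probes = new int[count];
        for (int i = 0; i < count; i++) {
            probes[i] = (home + i * step) % tableSize;
        }
        return probes;
    }

    // Open hashing (separate chaining): each cell holds a linked list of keys.
    static final class ChainedTable {
        private final List<LinkedList<String>> buckets = new ArrayList<>();

        ChainedTable(int size) {
            for (int i = 0; i < size; i++) {
                buckets.add(new LinkedList<>());
            }
        }

        private LinkedList<String> bucketFor(String key) {
            return buckets.get(Math.floorMod(key.hashCode(), buckets.size()));
        }

        // Checks for duplicates first, then adds at the front of the list so that
        // recently added elements are reached quickly.
        boolean insert(String key) {
            LinkedList<String> bucket = bucketFor(key);
            if (bucket.contains(key)) {
                return false;
            }
            bucket.addFirst(key);
            return true;
        }

        boolean find(String key) {
            return bucketFor(key).contains(key);
        }

        boolean delete(String key) {
            return bucketFor(key).remove(key);
        }
    }

    public static void main(String[] args) {
        System.out.println(java.util.Arrays.toString(quadraticProbes(3, 11, 5)));
        System.out.println(java.util.Arrays.toString(doubleHashProbes(25, 11, 7, 4)));
        ChainedTable table = new ChainedTable(7);
        table.insert("alpha");
        System.out.println("alpha present: " + table.find("alpha"));
    }
}
```

In the chained table, the duplicate check already walks the whole list, so inserting at the front costs nothing extra; skipping the check would make insertion constant time at either end.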
