Advanced Algorithms COMS31900 Hashing part one Chaining, true - PowerPoint PPT Presentation

Advanced Algorithms – COMS31900
Hashing part one: Chaining, true randomness and universal hashing
Raphaël Clifford
Slides by Benjamin Sach and Markus Jalsenius

Dictionaries

In a dictionary data structure we store (key, value)-pairs such that for any key there is at most one pair (key, value) in the dictionary.

Often we want to perform the following three operations:

• add(x, v): Add the pair (x, v).
• lookup(x): Return v if (x, v) is in the dictionary, or NULL otherwise.
• delete(x): Remove the pair (x, v) (assuming (x, v) is in the dictionary).

There are many data structures that will do this job, e.g.:

• Linked lists
• Red-black trees
• Binary search trees
• Skip lists
• (2,3,4)-trees
• van Emde Boas trees (later in this course)

These data structures all support extra operations beyond the three above, but none of them take O(1) worst-case time for all operations... so maybe there is room for improvement?

Hash tables

We want to store n elements from the universe U in a dictionary. Typically u = |U| is much, much larger than n.

[Figure: the universe U containing u keys (including x, y, z, w) and the array T of size m; h(x) gives the position of x in T, and the pairs (x, v_x), (y, v_y), (z, v_z), (w, v_w) are stored in T.]

T is called a hash table.

A hash function h : U → [m] maps a key to a position in T. We write [m] to denote the set {0, ..., m − 1}.

We want to avoid collisions, i.e. h(x) = h(y) for x ≠ y.

• Collisions can be resolved with chaining, i.e. a linked list.
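
The hash table with chaining described above can be sketched as follows. This is a minimal illustration rather than the slides' own code: the class name ChainedHashTable, the default hash x mod m (which assumes non-negative integer keys) and the use of Python lists in place of linked lists are all assumptions. Note that add here scans the chain first so that each key keeps at most one pair, which costs time proportional to the chain length.

```python
class ChainedHashTable:
    """Dictionary with add/lookup/delete; collisions are resolved by chaining."""

    def __init__(self, m, h=None):
        self.m = m
        # Assumption: default hash is x mod m, so keys are non-negative ints.
        self.h = h if h is not None else (lambda x: x % m)
        # T[i] is the chain (here a Python list) stored at position i.
        self.T = [[] for _ in range(m)]

    def add(self, x, v):
        chain = self.T[self.h(x)]
        for i, (key, _) in enumerate(chain):
            if key == x:
                chain[i] = (x, v)  # key already present: replace its value
                return
        chain.append((x, v))

    def lookup(self, x):
        # May walk the whole chain containing x.
        for key, value in self.T[self.h(x)]:
            if key == x:
                return value
        return None  # plays the role of NULL

    def delete(self, x):
        chain = self.T[self.h(x)]
        for i, (key, _) in enumerate(chain):
            if key == x:
                del chain[i]
                return


t = ChainedHashTable(4)
t.add(1, "a")
t.add(5, "b")        # 5 mod 4 == 1, so this collides with key 1
print(t.lookup(5))   # b
t.delete(1)
print(t.lookup(1))   # None
```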

Time complexity

We cannot avoid collisions entirely since u ≫ m; some keys from the universe are bound to be mapped to the same position. (Remember that u is the size of the universe and m is the size of the table.)

By building a hash table with chaining, we get the following time complexities:

Operation    Worst-case time                    Comment
add(x, v)    O(1)                               Simply add the item to the linked list if necessary.
lookup(x)    O(length of chain containing x)    We might have to search through the whole list containing x.
delete(x)    O(length of chain containing x)    Only O(1) to perform the actual delete... but you have to find x first.

So how long are these chains?
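
To see how much the chain length depends on the keys, the sketch below counts chain lengths for one fixed hash function. The function h(x) = x mod m, the table size m = 8 and both key sets are hypothetical choices made only for illustration.

```python
def chain_lengths(keys, m, h):
    """Return a list whose i-th entry is the number of keys hashed to position i."""
    lengths = [0] * m
    for x in keys:
        lengths[h(x)] += 1
    return lengths


m = 8
h = lambda x: x % m  # hypothetical fixed hash function

spread_keys = list(range(16))          # 0..15: two keys land in each position
bad_keys = [i * m for i in range(16)]  # multiples of m all land in position 0

spread = chain_lengths(spread_keys, m, h)
bad = chain_lengths(bad_keys, m, h)

print(max(spread))  # 2
print(max(bad))     # 16
```

With the second key set every lookup or delete may scan all n = 16 stored pairs, which is the worst case in the table above.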

True randomness

THEOREM. Consider any n fixed inputs to the hash table (which has size m), i.e. any sequence of n add/lookup/delete operations. Pick h uniformly at random from the set of all functions U → [m]. The expected run-time per operation is O(1 + n/m), or simply O(1) if m ≥ n.

PROOF. Let x, y be two distinct keys from U. Let the indicator r.v. I_{x,y} be 1 iff h(x) = h(y) (iff means if and only if). We have that Pr(h(x) = h(y)) = 1/m.
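
The probability 1/m can be checked exactly on a tiny instance by enumerating every function from a small universe to [m]. The sketch below is only an illustration: the function names, the universe {0, ..., u − 1} and the sizes u = 4, m = 3 are assumptions. The second function computes the expected number of stored keys that collide with x, which comes out as (n − 1)/m; using that quantity to bound the chain length is the standard next step and is assumed here, since the proof in the source stops at the probability.

```python
from fractions import Fraction
from itertools import product


def collision_probability(u, m, x, y):
    """Exact Pr[h(x) = h(y)] for h uniform over all functions {0..u-1} -> [m]."""
    total = 0
    collisions = 0
    # Each tuple h is one function: h[k] is the position assigned to key k.
    for h in product(range(m), repeat=u):
        total += 1
        if h[x] == h[y]:
            collisions += 1
    return Fraction(collisions, total)


def expected_collisions_with(u, m, x, stored):
    """Exact expected number of keys y in `stored`, y != x, with h(y) = h(x)."""
    total = 0
    count = 0
    for h in product(range(m), repeat=u):
        total += 1
        count += sum(1 for y in stored if y != x and h[y] == h[x])
    return Fraction(count, total)


p = collision_probability(4, 3, 0, 1)
e = expected_collisions_with(4, 3, 0, [0, 1, 2, 3])
print(p)  # 1/3
print(e)  # 1
```

Here n = 4 keys are stored in a table of size m = 3, so (n − 1)/m = 1, matching the second result.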
