Advanced Algorithms COMS31900 Hashing part one Chaining, true - - PowerPoint PPT Presentation



slide-1
SLIDE 1

Advanced Algorithms – COMS31900 Hashing part one Chaining, true randomness and universal hashing

Raphaël Clifford
Slides by Benjamin Sach and Markus Jalsenius

slide-6
SLIDE 6

Dictionaries

In a dictionary data structure we store (key, value)-pairs such that for any key there is at most one pair (key, value) in the dictionary.

Often we want to perform the following three operations:

add(x, v)

Add the pair (x, v).

lookup(x)

Return v if (x, v) is in the dictionary, or NULL otherwise.

delete(x)

Remove the pair (x, v) (assuming (x, v) is in the dictionary).

There are many data structures that will do this job, e.g.:

Linked lists
Binary search trees
(2,3,4)-trees
Red-black trees
Skip lists
van Emde Boas trees (later in this course)

These data structures all support extra operations beyond the three above, but none of them take O(1) worst case time for all operations... so maybe there is room for improvement?

slide-13
SLIDE 13

Hash tables

Universe U containing u keys. We want to store n elements from the universe U in a dictionary. Typically u = |U| is much, much larger than n.

Array T of size m. T is called a hash table.

A hash function h : U → [m] maps a key to a position in T. We write [m] to denote the set {0, ..., m − 1}.

[Figure: keys x, y, z, w are hashed to positions in T; key x goes to position h(x), and the pairs (x, v_x), (y, v_y), (z, v_z), (w, v_w) are stored in the table.]

We want to avoid collisions, i.e. h(x) = h(y) for x ≠ y.

Collisions can be resolved with chaining, i.e. a linked list at each position.
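The table-plus-chains picture can be sketched in a few lines of Python. This is a minimal illustration rather than the course's implementation: the class name `ChainedHashTable` is made up here, the hash function `h` is supplied by the caller, and Python lists stand in for linked lists. Note that this `add` scans the chain so that each key keeps at most one pair, which costs time proportional to the chain length.

```python
class ChainedHashTable:
    """Dictionary with add/lookup/delete; collisions are resolved by chaining."""

    def __init__(self, m, h):
        self.m = m                           # size of the table T
        self.h = h                           # hash function U -> [m] (assumed given)
        self.T = [[] for _ in range(m)]      # one chain per position of T

    def add(self, x, v):
        chain = self.T[self.h(x)]
        for i, (key, _) in enumerate(chain):
            if key == x:                     # key already present: overwrite its value
                chain[i] = (x, v)
                return
        chain.append((x, v))

    def lookup(self, x):
        for key, value in self.T[self.h(x)]:
            if key == x:
                return value
        return None                          # plays the role of NULL

    def delete(self, x):
        chain = self.T[self.h(x)]
        for i, (key, _) in enumerate(chain):
            if key == x:
                del chain[i]
                return
```

For example, with m = 4 and the (deliberately naive) hash function x mod 4, the keys 1 and 5 collide and end up in the same chain.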

slide-15
SLIDE 15

Time complexity

By building a hash table with chaining, we get the following time complexities:

Operation    Worst case time                    Comment
add(x, v)    O(1)                               Simply add the item to the linked list if necessary.
lookup(x)    O(length of chain containing x)    We might have to search through the whole list containing x.
delete(x)    O(length of chain containing x)    Only O(1) to perform the actual delete... but you have to find x first.

We cannot avoid collisions entirely since u ≫ m; some keys from the universe are bound to be mapped to the same position.
(Remember u is the size of the universe and m is the size of the table.)

So how long are these chains?

slide-24
SLIDE 24

True randomness

THEOREM

Consider any n fixed inputs to the hash table (which has size m), i.e. any sequence of n add/lookup/delete operations. Pick h uniformly at random from the set of all functions U → [m]. The expected run-time per operation is $O(1 + n/m)$, or simply $O(1)$ if $m \geq n$.

PROOF

Let x, y be two distinct keys from U. Let the indicator r.v. $I_{x,y}$ be 1 iff h(x) = h(y) (iff means if and only if).

We have that $\Pr(h(x) = h(y)) = \frac{1}{m}$; this is because h(x) and h(y) are chosen uniformly and independently from [m].

Therefore, $E(I_{x,y}) = \Pr(I_{x,y} = 1) = \Pr(h(x) = h(y)) = \frac{1}{m}$.
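One way to see where the 1/m comes from is to sum over the position that both keys land in. The following is a sketch that uses only the uniformity and independence of h(x) and h(y) stated in the proof; the index i is introduced here purely for illustration.

```latex
\Pr\bigl(h(x) = h(y)\bigr)
  = \sum_{i \in [m]} \Pr\bigl(h(x) = i\bigr)\,\Pr\bigl(h(y) = i\bigr)
  = \sum_{i \in [m]} \frac{1}{m} \cdot \frac{1}{m}
  = m \cdot \frac{1}{m^{2}}
  = \frac{1}{m}.
```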

slide-32
SLIDE 32

True randomness

THEOREM

Consider any n fixed inputs to the hash table (which has size m), i.e. any sequence of n add/lookup/delete operations. Pick h uniformly at random from the set of all functions U → [m]. The expected run-time per operation is $O(1 + n/m)$, or simply $O(1)$ if $m \geq n$.

PROOF

Let x, y be two distinct keys from U. Let the indicator r.v. $I_{x,y}$ be 1 iff h(x) = h(y) (iff means if and only if). We have that $E(I_{x,y}) = \frac{1}{m}$.

Let $N_x$ be the number of keys stored in T that are hashed to h(x), so in the worst case it takes $N_x$ time to look up x in T.

Observe that $N_x = \sum_{y \in T} I_{x,y}$, where the sum ranges over the keys in T.

Finally, by linearity of expectation, we have that

$E(N_x) = E\left(\sum_{y \in T} I_{x,y}\right) = \sum_{y \in T} E(I_{x,y}) = n \cdot \frac{1}{m} = \frac{n}{m}$.
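The n/m bound on the expected chain length can be checked empirically. The sketch below is only an illustration under stated assumptions: the function name `average_chain_length`, the number of trials and the seed are all made up here, and the query key x is assumed to be distinct from the n stored keys. Each trial draws the hash values involved independently and uniformly from [m], which is how a truly random h behaves on a fixed set of keys.

```python
import random


def average_chain_length(n, m, trials=2000, seed=0):
    """Estimate E(N_x): the number of stored keys that collide with a query key x."""
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        hx = rng.randrange(m)                # h(x), uniform over [m]
        # Each of the n stored keys gets an independent uniform position;
        # count how many of them land on h(x).
        total += sum(1 for _ in range(n) if rng.randrange(m) == hx)
    return total / trials
```

With n = 100 and m = 50 the estimate should come out close to n/m = 2, and with m ≥ n it should stay at or below roughly 1.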

slide-42
SLIDE 42

Specifying the hash function

Problem: how do we specify an arbitrary (e.g. a truly random) hash function?

For each key in U we need to specify an arbitrary position in T; this is a number in [m], so it requires ≈ log₂ m bits. So in total we need ≈ u log₂ m bits, which is a ridiculous amount of space! (In particular, it's much bigger than the table.)

Why not pick the hash function as we go? Couldn't we generate h(x) when we first see x? Wouldn't we only use n log₂ m bits? (One hash value per key we actually store.)

The problem with this approach is recalling h(x) the next time we see x. Essentially we'd need to build a dictionary to solve the dictionary problem!

This has become rather cyclic... let's try something else!
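Both points can be made concrete with a short sketch. The names `LazyRandomHash` and `bits_for_full_table` are hypothetical, chosen for this illustration only. The first shows that generating h(x) on first sight forces us to remember it in a dictionary (here a Python `dict`), and the second does the u log₂ m arithmetic, rounding log₂ m up to whole bits.

```python
import math
import random


class LazyRandomHash:
    """Generates h(x) the first time x is seen, then has to remember it."""

    def __init__(self, m, seed=None):
        self.m = m
        self.rng = random.Random(seed)
        self.seen = {}                       # key -> h(key): this is itself a dictionary!

    def __call__(self, x):
        if x not in self.seen:
            self.seen[x] = self.rng.randrange(self.m)
        return self.seen[x]


def bits_for_full_table(u, m):
    """Approximate bits needed to write down h(x) for every key in the universe."""
    return u * math.ceil(math.log2(m))
```

For instance, with 32-bit integer keys (u = 2^32) and a table of size m = 1024, writing down the whole function takes about 2^32 · 10 bits, roughly 5 GiB, to support a table with only about a thousand positions.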

slide-46
SLIDE 46

Specifying the hash function

Problem: how do we specify an arbitrary (e.g. a truly random) hash function?

For each key in U we need to specify an arbitrary position in T; this is a number in [m], so it requires ≈ log₂ m bits. So in total we need ≈ u log₂ m bits, which is a ridiculous amount of space! (In particular, it's much bigger than the table.)

Instead, we define a set, or family, of hash functions: H = {h1, h2, ...}. As part of initialising the hash table, we choose the hash function h from H randomly.

How should we specify the hash functions in H, and how do we pick one at random?

slide-50
SLIDE 50

Weakly universal hashing

A set H of hash functions is weakly universal if for any two distinct keys x, y ∈ U,

$\Pr(h(x) = h(y)) \leq \frac{1}{m}$

where h is chosen uniformly at random from H.

OBSERVE

The randomness here comes from the fact that h is picked randomly.

THEOREM

Consider any n fixed inputs to the hash table (which has size m), i.e. any sequence of n add/lookup/delete operations. Pick h uniformly at random from a weakly universal set H of hash functions. The expected run-time per operation is $O(1)$ if $m \geq n$.

PROOF

The proof we used for true randomness works here too (which is nice).
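To spell out why the earlier argument carries over: the only property of h that the true-randomness proof used was the collision probability of a pair of distinct keys. A sketch of the final step, reusing the indicator $I_{x,y}$ and the chain length $N_x$ from that proof, with the equality replaced by the inequality from the definition of weak universality:

```latex
E(N_x)
  = \sum_{y \in T} E(I_{x,y})
  = \sum_{y \in T} \Pr\bigl(h(x) = h(y)\bigr)
  \leq n \cdot \frac{1}{m}
  = \frac{n}{m}
  \leq 1 \quad \text{when } m \geq n.
```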

slide-54
SLIDE 54

Constructing a weakly universal family of hash functions

Suppose U = [u], i.e. the keys in the universe are integers 0 to u−1. Let p be any prime bigger than u. For a, b ∈ [p], let

$h_{a,b}(x) = ((ax + b) \bmod p) \bmod m$,

$H_{p,m} = \{ h_{a,b} \mid a \in \{1, \ldots, p-1\},\ b \in \{0, \ldots, p-1\} \}$.

THEOREM

$H_{p,m}$ is a weakly universal set of hash functions.

PROOF

See CLRS, Theorem 11.5 (page 267 in 3rd edition).

OBSERVE

ax + b is a linear transformation which "spreads the keys" over p values when taken modulo p. This does not cause any collisions.

Only when taken modulo m do we get collisions.
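The family is easy to turn into code. The sketch below is an illustration under assumptions: the helper names `make_hash` and `max_collision_count` are invented here, and the brute-force check is only feasible for tiny parameters. `make_hash` picks a and b at random as in the definition of H_{p,m}; `max_collision_count` counts, for the worst pair of distinct keys, how many of the p(p − 1) functions make that pair collide, which weak universality says is at most p(p − 1)/m.

```python
import random


def make_hash(p, m, rng=random):
    """Pick h_{a,b} uniformly from H_{p,m}: a in {1..p-1}, b in {0..p-1}."""
    a = rng.randrange(1, p)
    b = rng.randrange(0, p)

    def h(x):
        return ((a * x + b) % p) % m

    return h


def max_collision_count(u, p, m):
    """Largest number of (a, b) pairs that make some two distinct keys of [u] collide."""
    worst = 0
    for x in range(u):
        for y in range(x + 1, u):
            count = sum(
                1
                for a in range(1, p)
                for b in range(p)
                if ((a * x + b) % p) % m == ((a * y + b) % p) % m
            )
            worst = max(worst, count)
    return worst
```

For example, with u = 16, p = 17 and m = 5 there are 17 · 16 = 272 functions in the family, so no pair of distinct keys should collide under more than 272/5 of them.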

slide-59
SLIDE 59

True randomness vs. weakly universal hashing

For both true randomness (h is picked uniformly from the set of all possible hash functions) and weakly universal hashing (h is picked uniformly from a weakly universal set of hash functions), we have seen that when m ≥ n, the expected lookup time in the hash table is O(1).

Since constructing a weakly universal set of hash functions seems much easier than obtaining true randomness, this is all good news! Isn't it?

What about the length of the longest chain (the longest linked list)? If it is very long, some lookups could take a very long time...

slide-63
SLIDE 63

Longest chain – true randomness

LEMMA

If h is selected uniformly at random from all functions U → [m] then, over m fixed inputs,

$\Pr(\text{any chain has length} \geq 3 \log m) \leq \frac{1}{m}$.

OBSERVE

In this lemma we insert m keys, i.e. n = m.

PROOF

The problem is equivalent to showing that if we randomly throw m balls into m bins, the probability of having a bin with at least 3 log m balls is at most $\frac{1}{m}$.
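The balls-into-bins view is simple to simulate. The following sketch is illustrative only: the function names, trial count and seed are made up here, and the logarithm is assumed to be base 2, which the slides do not specify. It throws m balls into m bins repeatedly and reports how often the fullest bin reaches 3 log m.

```python
import math
import random


def max_load(m, rng):
    """Throw m balls into m bins uniformly at random; return the fullest bin's load."""
    bins = [0] * m
    for _ in range(m):
        bins[rng.randrange(m)] += 1
    return max(bins)


def fraction_exceeding(m, trials=200, seed=0):
    """Fraction of trials in which some bin holds at least 3 log m balls."""
    rng = random.Random(seed)
    threshold = 3 * math.log2(m)             # assumption: log is base 2
    hits = sum(1 for _ in range(trials) if max_load(m, rng) >= threshold)
    return hits / trials
```

In practice the fullest bin stays far below the threshold, so the lemma's bound of 1/m is comfortably satisfied in experiments of this size.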

slide-69
SLIDE 69

Longest chain – true randomness

PROOF continued...

Let $X_1$ be the number of balls in the first bin.

Choose any k of the m balls (we'll pick k in a bit); the probability that all of these k balls go into the first bin is $\frac{1}{m^k}$.

So, the union bound gives us

$\Pr(X_1 \geq k) \leq \binom{m}{k} \cdot \frac{1}{m^k} \leq \frac{1}{k!}$,

where $\binom{m}{k}$ is the number of subsets of size k.

THEOREM (union bound)

Let $V_1, \ldots, V_q$ be q events. Then

$\Pr\left(\bigcup_{i=1}^{q} V_i\right) \leq \sum_{i=1}^{q} \Pr(V_i)$.

slide-76
SLIDE 76

Longest chain – true randomness

PROOF

  • continued. . .

Let X1 be the number of balls in the first bin.

  • Pr(X1 k)

m k

  • ·

1 mk 1 k! .

Number of subsets of size k.

Choose any k of the m balls (we’ll pick k in a bit) the probability that all of these k balls go into the first bin is

1 mk .

So, the union bound gives us

m · (m) · (m) · . . . (m) k! k m k

  • =

m! k!(m − k)! = m · (m − 1) · (m − 2) · . . . (m − k + 1) · (m − k)! k!(m − k)!

slide-77
SLIDE 77

Longest chain – true randomness

PROOF

  • continued. . .

Let X1 be the number of balls in the first bin.

  • Pr(X1 k)

m k

  • ·

1 mk 1 k! .

Number of subsets of size k.

Choose any k of the m balls (we’ll pick k in a bit) the probability that all of these k balls go into the first bin is

1 mk .

So, the union bound gives us

m · (m) · (m) · . . . (m) k! mk k! k m k

  • =

m! k!(m − k)! = m · (m − 1) · (m − 2) · . . . (m − k + 1) · (m − k)! k!(m − k)!

slide-78
SLIDE 78

Longest chain – true randomness

PROOF

  • continued. . .

Let X1 be the number of balls in the first bin.

  • Pr(X1 k)

m k

  • ·

1 mk 1 k! .

By using the union bound again, we have that

Number of subsets of size k.

Choose any k of the m balls (we’ll pick k in a bit) the probability that all of these k balls go into the first bin is

1 mk .

So, the union bound gives us

slide-79
SLIDE 79

Longest chain – true randomness

PROOF

  • continued. . .

Let X1 be the number of balls in the first bin.

  • Pr(X1 k)

m k

  • ·

1 mk 1 k! . Pr(at least one bin receives at least k balls) m · Pr(X1 k) m k! .

By using the union bound again, we have that

Number of subsets of size k.

Choose any k of the m balls (we’ll pick k in a bit) the probability that all of these k balls go into the first bin is

1 mk .

So, the union bound gives us

slide-80
SLIDE 80

Longest chain – true randomness

PROOF

  • continued. . .

Let X1 be the number of balls in the first bin.

  • Pr(X1 k)

m k

  • ·

1 mk 1 k! . Now we set k = 3 log m and observe that m k! 1 m

for m 2, and we are done.

Pr(at least one bin receives at least k balls) m · Pr(X1 k) m k! .

By using the union bound again, we have that

Number of subsets of size k.

Choose any k of the m balls (we’ll pick k in a bit) the probability that all of these k balls go into the first bin is

1 mk .

So, the union bound gives us

slide-81
SLIDE 81

Longest chain – true randomness

PROOF

  • continued. . .

Let X1 be the number of balls in the first bin.

  • Pr(X1 k)

m k

  • ·

1 mk 1 k! . Now we set k = 3 log m and observe that m k! 1 m

for m 2, and we are done.

Pr(at least one bin receives at least k balls) m · Pr(X1 k) m k! .

By using the union bound again, we have that

Number of subsets of size k.

Choose any k of the m balls (we’ll pick k in a bit) the probability that all of these k balls go into the first bin is

1 mk .

So, the union bound gives us

Why is m

k! 1 m ? (when k = 3 log m)

k! = k × (k − 1) × (k − 2) . . . × 2 × 1

k terms

k! > 2 × 2 × 2 . . . × 2 × 1 = 2k−1

Let k = 3 log m . . .

k! > 2(3 log m−1) 22 log m = (2log m)2 = m2

so m

k! m m2 = 1 m

slide-82
SLIDE 82

Longest chain – true randomness

LEMMA

If $h$ is selected uniformly at random from all functions $U \to [m]$ then, over $m$ fixed inputs,

$$\Pr(\text{any chain has length} \ge 3 \log m) \le \frac{1}{m}.$$

OBSERVE

In this lemma we insert $m$ keys, i.e. $n = m$.

PROOF

The problem is equivalent to showing that if we randomly throw $m$ balls into $m$ bins, the probability of having a bin with at least $3 \log m$ balls is at most $\frac{1}{m}$.
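The balls-into-bins view is easy to simulate. The sketch below is an illustration under stated assumptions rather than part of the lecture: the helper names are made up, the logarithm is taken base 2, and Python's pseudo-random generator stands in for true randomness.

```python
import random
from collections import Counter
from math import log2


def max_load(m: int, rng: random.Random) -> int:
    """Throw m balls into m bins uniformly at random; return the fullest bin's load."""
    loads = Counter(rng.randrange(m) for _ in range(m))
    return max(loads.values())


def fraction_of_long_chains(m: int, trials: int, seed: int = 0) -> float:
    """Fraction of trials in which some bin receives at least 3 log m balls."""
    rng = random.Random(seed)
    threshold = 3 * log2(m)
    bad = sum(max_load(m, rng) >= threshold for _ in range(trials))
    return bad / trials


if __name__ == "__main__":
    print(fraction_of_long_chains(1024, trials=200))
```

The lemma says the returned fraction should rarely exceed $\frac{1}{m}$. In practice the longest chain tends to be far shorter than $3 \log m$, so the bound is safe but not tight.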
slide-83
SLIDE 83

Longest chain – weakly universal hashing

The conclusion from previous slides is that with true randomness, the longest chain is very short (at most $3 \log m$) with high probability.

LEMMA

If $h$ is picked uniformly at random from a weakly universal set of hash functions then, over $m$ fixed inputs,

$$\Pr\left(\text{any chain has length} \ge 1 + \sqrt{2m}\right) \le \frac{1}{2}.$$

OBSERVE

This rubbish upper bound of $\frac{1}{2}$ does not necessarily rule out the possibility that the tightest upper bound is indeed very small. However, the upper bound of $\frac{1}{2}$ is in fact tight!

slide-86
SLIDE 86

Longest chain – weakly universal hashing

PROOF

For any two keys $x, y$, let the indicator r.v. $I_{x,y}$ be 1 iff $h(x) = h(y)$.

Let r.v. $C$ be the total number of collisions:

$$C = \sum_{x,y \in T,\ x<y} I_{x,y}.$$

Using linearity of expectation and $E(I_{x,y}) \le \frac{1}{m}$ ($h$ is weakly universal),

$$E(C) = E\left(\sum_{x,y \in T,\ x<y} I_{x,y}\right) = \sum_{x,y \in T,\ x<y} E(I_{x,y}) \le \binom{m}{2} \cdot \frac{1}{m} \le \frac{m}{2}.$$

By Markov’s inequality,

$$\Pr(C \ge m) \le \frac{E(C)}{m} \le \frac{1}{2}.$$

Let r.v. $L$ be the length of the longest chain. Then $C \ge \binom{L}{2}$. This is because a chain of length $L$ causes $\binom{L}{2}$ collisions!

Now,

$$\Pr\left(\frac{(L-1)^2}{2} \ge m\right) \le \Pr\left(\binom{L}{2} \ge m\right) \le \Pr(C \ge m) \le \frac{1}{2}.$$

The first inequality holds because

$$\binom{L}{2} = \frac{L!}{2!(L-2)!} = \frac{L \cdot (L-1)}{2} \ge \frac{(L-1)^2}{2}.$$

By rearranging, we have that

$$\Pr\left(L \ge 1 + \sqrt{2m}\right) \le \frac{1}{2},$$

and we are done.
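The quantities $C$ and $L$ from this proof can be computed for a concrete hash function. The sketch below is only an illustration: it assumes the family $h_{a,b}(x) = ((ax + b) \bmod p) \bmod m$ with $p$ prime, a standard weakly universal family that may not be the one used in this course, and all names and the choice of $p$ are hypothetical.

```python
import random
from collections import Counter
from math import comb, sqrt

P = 2_147_483_647  # the prime 2^31 - 1; keys are assumed to lie in [0, P)


def random_hash(m: int, rng: random.Random):
    """Draw h(x) = ((a*x + b) mod P) mod m with a in [1, P-1] and b in [0, P-1]."""
    a = rng.randrange(1, P)
    b = rng.randrange(P)
    return lambda x: ((a * x + b) % P) % m


def collisions_and_longest_chain(keys, h):
    """Return (C, L): the number of colliding key pairs and the longest chain length."""
    chain_lengths = Counter(h(x) for x in keys).values()
    collisions = sum(comb(length, 2) for length in chain_lengths)
    longest = max(chain_lengths)
    return collisions, longest


if __name__ == "__main__":
    rng = random.Random(0)
    m = 1000
    keys = rng.sample(range(P), m)  # m distinct keys, i.e. n = m
    C, L = collisions_and_longest_chain(keys, random_hash(m, rng))
    print("C =", C, "L =", L, "1 + sqrt(2m) =", 1 + sqrt(2 * m))
```

Whatever hash function is drawn, $C \ge \binom{L}{2}$ holds, so any run with $C < m$ also has $L < 1 + \sqrt{2m}$. The randomness only affects how often $C \ge m$ happens.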

slide-98
SLIDE 98

Conclusions

For both true randomness ($h$ is picked uniformly from the set of all possible hash functions) and weakly universal hashing ($h$ is picked uniformly from a weakly universal set of hash functions), we have seen that when $m \ge n$, the expected lookup time in a hash table with chaining is $O(1)$.

LEMMA

If $h$ is selected uniformly at random from all functions $U \to [m]$ then,

$$\Pr(\text{any chain has length} \ge 3 \log m) \le \frac{1}{m}.$$

LEMMA

If $h$ is picked uniformly at random from a weakly universal set of hash functions,

$$\Pr\left(\text{any chain has length} \ge 1 + \sqrt{2m}\right) \le \frac{1}{2}.$$

(Both lemmas hold for any $m$ fixed inputs.)
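To connect these results back to the dictionary operations, here is a minimal sketch of a hash table with chaining. It is not the course's reference implementation. The class name, the restriction to integer keys in $[0, p)$, the hash family $((ax + b) \bmod p) \bmod m$ and the choice to overwrite the value when a key is added twice are all assumptions made for illustration.

```python
import random


class ChainedHashTable:
    """Dictionary with chaining: bucket i holds the (key, value) pairs with h(key) = i."""

    def __init__(self, m: int, seed=None):
        rng = random.Random(seed)
        self.m = m
        self.p = 2_147_483_647  # prime 2^31 - 1; integer keys assumed in [0, p)
        self.a = rng.randrange(1, self.p)
        self.b = rng.randrange(self.p)
        self.buckets = [[] for _ in range(m)]

    def _h(self, x: int) -> int:
        return ((self.a * x + self.b) % self.p) % self.m

    def add(self, x: int, v) -> None:
        chain = self.buckets[self._h(x)]
        for i, (key, _) in enumerate(chain):
            if key == x:  # keep at most one pair per key
                chain[i] = (x, v)
                return
        chain.append((x, v))

    def lookup(self, x: int):
        for key, value in self.buckets[self._h(x)]:
            if key == x:
                return value
        return None  # plays the role of NULL

    def delete(self, x: int) -> None:
        index = self._h(x)
        self.buckets[index] = [(k, v) for k, v in self.buckets[index] if k != x]
```

Each operation scans a single chain, so its running time is proportional to that chain's length, which is the quantity the two lemmas bound.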