The Probabilistic Method: Proof Through a Probabilistic Argument - - PowerPoint PPT Presentation



slide-1
SLIDE 1

The Probabilistic Method: Proof Through a Probabilistic Argument

  • Compute
    \[ \sum_{i=0}^{n} i \binom{n}{i} \left(\frac{1}{2}\right)^{n} \]

slide-2
SLIDE 2

Proof Through a Probabilistic Argument

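  • Compute
    \[
    \begin{aligned}
    \sum_{i=0}^{n} i \binom{n}{i} \left(\frac{1}{2}\right)^{n}
      &= \sum_{i=1}^{n} i \, \frac{n!}{i!\,(n-i)!} \left(\frac{1}{2}\right)^{n} \\
      &= \sum_{i=1}^{n} n \, \frac{(n-1)!}{(i-1)!\,(n-i)!} \left(\frac{1}{2}\right)^{n} \\
      &= \frac{n}{2} \sum_{i=1}^{n} \frac{(n-1)!}{(i-1)!\,(n-i)!} \left(\frac{1}{2}\right)^{n-1} \\
      &= \frac{n}{2} \sum_{i=0}^{n-1} \frac{(n-1)!}{i!\,(n-i-1)!} \left(\frac{1}{2}\right)^{n-1} \\
      &= \frac{n}{2}.
    \end{aligned}
    \]
    The last sum is $\sum_{i=0}^{n-1} \binom{n-1}{i} (1/2)^{n-1} = 1$ by the binomial theorem.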

slide-3
SLIDE 3

Proof Through a Probabilistic Argument

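  • Compute
    \[ \sum_{i=0}^{n} i \binom{n}{i} \left(\frac{1}{2}\right)^{n} \]
  • Let $X \sim B(n, 1/2)$, so $X = \sum_{i=1}^{n} X_i$, with
  • $X_i$ independent r.v. with $\Pr(X_i = 1) = \Pr(X_i = 0) = 1/2$.
    \[ \sum_{i=0}^{n} i \binom{n}{i} \left(\frac{1}{2}\right)^{n} = E[X] = E\left[\sum_{i=1}^{n} X_i\right] = \sum_{i=1}^{n} E[X_i] = \frac{n}{2} \]
  • We prove a deterministic statement using a probabilistic argument!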
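The identity above can be checked numerically for small n. The following is a minimal sketch using exact rational arithmetic; the function name `binomial_mean_sum` is a hypothetical helper, not something from the slides.

```python
from fractions import Fraction
from math import comb


def binomial_mean_sum(n: int) -> Fraction:
    """Return sum_{i=0}^{n} i * C(n, i) * (1/2)^n as an exact fraction."""
    return sum(Fraction(i * comb(n, i), 2 ** n) for i in range(n + 1))


# The sum should agree with E[X] = n/2 for X ~ B(n, 1/2).
for n in range(1, 8):
    assert binomial_mean_sum(n) == Fraction(n, 2)
```

Exact fractions are used instead of floats so that the comparison with n/2 is not affected by rounding.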

slide-4
SLIDE 4

Theorem. Given any graph $G = (V, E)$ with $n$ vertices and $m$ edges, there is a partition of $V$ into two disjoint sets $A$ and $B$ such that at least $m/2$ edges connect a vertex in $A$ to a vertex in $B$.

Proof. Construct the sets $A$ and $B$ by assigning each vertex independently and uniformly at random to one of the two sets. The probability that a given edge connects $A$ to $B$ is $1/2$, so by linearity of expectation the expected number of such edges is $m/2$. A random variable cannot always be smaller than its expectation, so there exists a partition with at least $m/2$ such edges.
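For a small graph the expectation argument can be verified by brute force: averaging the cut size over all $2^n$ assignments gives exactly $m/2$, so the best assignment is at least that large. This is a sketch only; the example graph and the helper names `cut_size` and `average_and_best_cut` are assumptions made for illustration.

```python
from fractions import Fraction
from itertools import product


def cut_size(edges, side):
    """Number of edges whose endpoints lie on different sides."""
    return sum(1 for u, v in edges if side[u] != side[v])


def average_and_best_cut(n, edges):
    """Enumerate all 2^n assignments; return (average cut, maximum cut)."""
    sizes = [cut_size(edges, side) for side in product((0, 1), repeat=n)]
    return Fraction(sum(sizes), len(sizes)), max(sizes)


# Hypothetical example: a 4-cycle 0-1-2-3-0 with the chord 0-2 (m = 5).
edges = [(0, 1), (1, 2), (2, 3), (3, 0), (0, 2)]
average, best = average_and_best_cut(4, edges)
assert average == Fraction(len(edges), 2)   # expectation is m/2
assert best >= average                      # some partition reaches it
```

Enumeration is exponential in n, so this is only a sanity check of the proof, not a way to find good cuts.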

slide-5
SLIDE 5

Sample and Modify

An independent set in a graph $G$ is a set of vertices with no edges between them. Finding the largest independent set in a graph is an NP-hard problem.

Theorem. Let $G = (V, E)$ be a graph on $n$ vertices with $dn/2$ edges, where $d \ge 1$. Then $G$ has an independent set with at least $n/(2d)$ vertices.

Algorithm:

1 Delete each vertex of $G$ (together with its incident edges) independently with probability $1 - 1/d$.

2 For each remaining edge, remove it and one of its endpoints.

slide-6
SLIDE 6

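$X$ = number of vertices that survive the first step of the algorithm.
\[ E[X] = \frac{n}{d}. \]
$Y$ = number of edges that survive the first step. An edge survives if and only if both of its endpoints survive.
\[ E[Y] = \frac{nd}{2} \left(\frac{1}{d}\right)^{2} = \frac{n}{2d}. \]
The second step of the algorithm removes all the remaining edges and at most $Y$ vertices, so the output is an independent set of size at least $X - Y$, and
\[ E[X - Y] = \frac{n}{d} - \frac{n}{2d} = \frac{n}{2d}. \]
Hence some run of the algorithm outputs an independent set with at least $n/(2d)$ vertices.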
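The two steps of the sample-and-modify algorithm can be sketched as follows. This is an illustrative implementation under stated assumptions: vertices are numbered 0..n-1, the graph is given as an edge list, and the names `sample_and_modify` and `is_independent` are hypothetical.

```python
import random


def sample_and_modify(n, edges, d, rng):
    """Return an independent set built by the two-step random procedure."""
    # Step 1: each vertex survives independently with probability 1/d.
    kept = {v for v in range(n) if rng.random() < 1.0 / d}
    # Step 2: for each edge that still has both endpoints, drop one endpoint.
    for u, v in edges:
        if u in kept and v in kept:
            kept.discard(u)
    return kept


def is_independent(vertices, edges):
    """True if no edge has both endpoints inside `vertices`."""
    return not any(u in vertices and v in vertices for u, v in edges)


# Hypothetical example: a cycle on 12 vertices has 12 = d*n/2 edges with d = 2,
# so the theorem promises an independent set of size at least 12/(2*2) = 3.
cycle = [(i, (i + 1) % 12) for i in range(12)]
result = sample_and_modify(12, cycle, 2.0, random.Random(1))
assert is_independent(result, cycle)
```

A single run may return fewer than $n/(2d)$ vertices; the theorem only guarantees that the expected size is at least that large, so repeating the run and keeping the best output is the natural usage.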

slide-7
SLIDE 7

Conditional Expectation

Definition.
\[ E[Y \mid Z = z] = \sum_{y} y \, \Pr(Y = y \mid Z = z), \]
where the summation is over all $y$ in the range of $Y$.

Lemma. For any random variables $X$ and $Y$,
\[ E[X] = \sum_{y} \Pr(Y = y) \, E[X \mid Y = y], \]
where the sum is over all values in the range of $Y$.
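As a small worked instance of the lemma, take two fair dice, let Y be the first die and X the sum of both. The sketch below computes E[X] directly and by conditioning on Y; the dice setup and function names are assumptions chosen for illustration.

```python
from fractions import Fraction
from itertools import product

# Sample space: ordered outcomes of two fair six-sided dice, all equally likely.
outcomes = list(product(range(1, 7), repeat=2))
prob = Fraction(1, len(outcomes))


def expectation_direct():
    """E[X] for X = a + b, summed over the whole sample space."""
    return sum(prob * (a + b) for a, b in outcomes)


def expectation_by_conditioning():
    """E[X] = sum_y Pr(Y = y) * E[X | Y = y], with Y the first die."""
    total = Fraction(0)
    for y in range(1, 7):
        matching = [(a, b) for a, b in outcomes if a == y]
        pr_y = prob * len(matching)
        cond_exp = sum(Fraction(a + b, len(matching)) for a, b in matching)
        total += pr_y * cond_exp
    return total


assert expectation_direct() == expectation_by_conditioning()
```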

slide-8
SLIDE 8

Derandomization using Conditional Expectations

Given a graph G = (V , E) with n vertices and m edges, we showed that there is a partition of V into A and B such that at least m/2 edges connect A to B. How do we find such a partition?

slide-9
SLIDE 9

$C(A, B)$ = number of edges connecting $A$ to $B$. If $A, B$ is a random partition, $E[C(A, B)] = m/2$.

Algorithm:

1 Let $v_1, v_2, \ldots, v_n$ be an arbitrary enumeration of the vertices.
2 Let $x_i$ be the set where $v_i$ is placed ($x_i \in \{A, B\}$).
3 For $i = 1$ to $n$ do:
    1 Place $v_i$ such that
      \[ E[C(A, B) \mid x_1, x_2, \ldots, x_i] \ge E[C(A, B) \mid x_1, x_2, \ldots, x_{i-1}] \ge m/2. \]

slide-10
SLIDE 10

Lemma. For all $i = 1, \ldots, n$ there is an assignment of $v_i$ such that
\[ E[C(A, B) \mid x_1, x_2, \ldots, x_i] \ge E[C(A, B) \mid x_1, x_2, \ldots, x_{i-1}] \ge m/2. \]

slide-11
SLIDE 11

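Proof. By induction on $i$.

For $i = 1$,
\[ E[E[C(A, B) \mid X_1]] = E[C(A, B)] = m/2, \]
so some choice of $x_1$ has $E[C(A, B) \mid x_1] \ge m/2$ (by symmetry, both choices do).

For $i > 1$, if we place $v_i$ randomly in one of the two sets,
\[ E[C(A, B) \mid x_1, x_2, \ldots, x_{i-1}] = \frac{1}{2} E[C(A, B) \mid x_1, x_2, \ldots, x_i = A] + \frac{1}{2} E[C(A, B) \mid x_1, x_2, \ldots, x_i = B]. \]
The larger of the two terms is at least their average, so
\[ \max\left(E[C(A, B) \mid x_1, x_2, \ldots, x_i = A],\; E[C(A, B) \mid x_1, x_2, \ldots, x_i = B]\right) \ge E[C(A, B) \mid x_1, x_2, \ldots, x_{i-1}] \ge m/2. \]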

slide-12
SLIDE 12

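How do we find the placement of $v_i$ that attains
\[ \max\left(E[C(A, B) \mid x_1, x_2, \ldots, x_i = A],\; E[C(A, B) \mid x_1, x_2, \ldots, x_i = B]\right) \ge E[C(A, B) \mid x_1, x_2, \ldots, x_{i-1}] \ge m/2? \]
We just need to consider edges between $v_i$ and $v_1, \ldots, v_{i-1}$: every other edge contributes the same amount to both conditional expectations.

Simple Algorithm:

1 Place $v_1$ arbitrarily.
2 For $i = 2$ to $n$ do:
    1 Place $v_i$ in the set that contains fewer of its already placed neighbors.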
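The simple algorithm can be sketched as a short greedy procedure. This is a minimal illustration assuming vertices 0..n-1 are processed in index order and ties go to set A; the names `greedy_cut` and `cut_size` are hypothetical.

```python
def greedy_cut(n, edges):
    """Place each vertex on the side holding fewer of its placed neighbors."""
    neighbors = [[] for _ in range(n)]
    for u, v in edges:
        neighbors[u].append(v)
        neighbors[v].append(u)

    side = {}  # vertex -> "A" or "B", filled in index order
    for v in range(n):
        in_a = sum(1 for w in neighbors[v] if side.get(w) == "A")
        in_b = sum(1 for w in neighbors[v] if side.get(w) == "B")
        # Joining the side with fewer placed neighbors cuts the other edges.
        side[v] = "A" if in_a <= in_b else "B"
    return side


def cut_size(edges, side):
    """Number of edges connecting A to B."""
    return sum(1 for u, v in edges if side[u] != side[v])


# Hypothetical example: a 4-cycle 0-1-2-3-0 with the chord 0-2 (m = 5).
edges = [(0, 1), (1, 2), (2, 3), (3, 0), (0, 2)]
side = greedy_cut(4, edges)
assert 2 * cut_size(edges, side) >= len(edges)  # at least m/2 edges are cut
```

Each vertex cuts at least half of the edges to its earlier neighbors, and every edge is counted exactly once this way, which is why the final cut has at least $m/2$ edges without any randomness.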

slide-13
SLIDE 13

Perfect Hashing

Goal: Store a static dictionary of $n$ items in a table of $O(n)$ space such that any search takes $O(1)$ time.

slide-14
SLIDE 14

Universal hash functions

Definition. Let $U$ be a universe with $|U| \ge n$ and $V = \{0, 1, \ldots, n-1\}$. A family of hash functions $H$ from $U$ to $V$ is said to be k-universal if, for any distinct elements $x_1, x_2, \ldots, x_k$, when a hash function $h$ is chosen uniformly at random from $H$,
\[ \Pr(h(x_1) = h(x_2) = \cdots = h(x_k)) \le \frac{1}{n^{k-1}}. \]

slide-15
SLIDE 15

Example of 2-Universal Hash Functions

Universe $U = \{0, 1, 2, \ldots, m-1\}$.
Table keys $V = \{0, 1, 2, \ldots, n-1\}$, with $m \ge n$.

A family of hash functions is obtained by choosing a prime $p \ge m$, setting
\[ h_{a,b}(x) = ((ax + b) \bmod p) \bmod n, \]
and taking the family
\[ H = \{ h_{a,b} \mid 1 \le a \le p-1,\; 0 \le b \le p-1 \}. \]

Lemma. $H$ is 2-universal.

slide-16
SLIDE 16

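Lemma. $H$ is 2-universal.

Proof. We first observe that for $x_1, x_2 \in \{0, \ldots, p-1\}$ with $x_1 \ne x_2$,
\[ a x_1 + b \not\equiv a x_2 + b \pmod{p}. \]
Thus, if $h_{a,b}(x_1) = h_{a,b}(x_2)$ there is a pair $(s, r)$ such that $s \ne r$, $s \equiv r \pmod{n}$, and
\[ (a x_1 + b) \bmod p = r, \qquad (a x_2 + b) \bmod p = s. \]
There are $p$ choices of $r$, and for each pair $(r, s)$ there is only one pair $(a, b)$ that satisfies the relation. For each $r$ there are at most $\lceil p/n \rceil - 1$ values $s \ne r$ such that $s \equiv r \pmod{n}$. Thus, the probability of a collision is at most
\[ \frac{p\left(\lceil p/n \rceil - 1\right)}{p(p-1)} \le \frac{1}{n}, \]
where the last step uses $\lceil p/n \rceil - 1 \le (p-1)/n$.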
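For small parameters the lemma can be confirmed by enumerating the whole family. The sketch below assumes $b$ ranges over $0, \ldots, p-1$ (so the family has $p(p-1)$ members, matching the denominator in the proof); the parameter values and the name `max_collision_probability` are hypothetical.

```python
from fractions import Fraction
from itertools import combinations


def max_collision_probability(m, n, p):
    """Largest Pr[h(x1) == h(x2)] over distinct x1, x2 in {0, ..., m-1},
    with h = h_{a,b} drawn uniformly from the p(p-1) functions in H."""
    family = [(a, b) for a in range(1, p) for b in range(p)]
    worst = Fraction(0)
    for x1, x2 in combinations(range(m), 2):
        collisions = sum(
            1
            for a, b in family
            if ((a * x1 + b) % p) % n == ((a * x2 + b) % p) % n
        )
        worst = max(worst, Fraction(collisions, len(family)))
    return worst


# Hypothetical parameters: universe size m = 10, table size n = 4, prime p = 11.
assert max_collision_probability(10, 4, 11) <= Fraction(1, 4)
```

The check is exhaustive rather than sampled, so it is exact for the chosen parameters but only feasible when $p$ is small.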

slide-17
SLIDE 17

Lemma. If $h \in H$ is chosen uniformly at random from a 2-universal family of hash functions mapping the universe $U$ to $[0, n-1]$, then for any set $S \subset U$ of size $m$, with probability $\ge 1/2$ the number of collisions is bounded by $m^2/n$.

Proof. Let $s_1, s_2, \ldots, s_m$ be the $m$ items of $S$. Let $X_{ij}$ be 1 if $h(s_i) = h(s_j)$ and 0 otherwise. Let $X = \sum_{1 \le i < j \le m} X_{ij}$.
\[ E[X] = E\left[\sum_{1 \le i < j \le m} X_{ij}\right] = \sum_{1 \le i < j \le m} E[X_{ij}] \le \binom{m}{2} \frac{1}{n} < \frac{m^2}{2n}. \]
Markov's inequality yields
\[ \Pr(X \ge m^2/n) \le \Pr(X \ge 2E[X]) \le \frac{1}{2}. \]

slide-18
SLIDE 18

Definition. A hash function is perfect for a set $S$ if it maps $S$ with no collisions.

Lemma. If $h \in H$ is chosen uniformly at random from a 2-universal family of hash functions mapping the universe $U$ to $[0, n-1]$, then for any set $S \subset U$ of size $m$ such that $m^2 \le n$, with probability $\ge 1/2$ the hash function is perfect for $S$. (When $m^2 \le n$, fewer than $m^2/n \le 1$ collisions means no collisions at all.)

slide-19
SLIDE 19

Theorem. The two-level approach gives a perfect hashing scheme for $m$ items using $O(m)$ bins.

Level I: use a hash table with $n = m$. Let $X$ be the number of collisions,
\[ \Pr(X \ge m^2/n) \le \Pr(X \ge 2E[X]) \le \frac{1}{2}. \]
When $n = m$, there exists a choice of hash function from the 2-universal family that gives at most $m$ collisions.

slide-20
SLIDE 20

Level II: Let $c_i$ be the number of items in the $i$-th bin. There are $\binom{c_i}{2}$ collisions between items in the $i$-th bin, thus
\[ \sum_{i=1}^{m} \binom{c_i}{2} \le m. \]
For each bin with $c_i > 1$ items, we find a second hash function that gives no collisions using space $c_i^2$. The total number of bins used is bounded above by
\[ m + \sum_{i=1}^{m} c_i^2 \le m + 2 \sum_{i=1}^{m} \binom{c_i}{2} + \sum_{i=1}^{m} c_i \le m + 2m + m = 4m. \]
Hence the total number of bins used is only $O(m)$.
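The two-level construction can be sketched as below. This is an illustration under stated assumptions, not the slides' own code: keys are distinct non-negative integers smaller than the fixed prime `P`, the key set is non-empty, each level simply retries random hash functions until the required bound holds, and all names (`random_hash`, `build_perfect_table`, `contains`) are hypothetical.

```python
import random

P = 2_147_483_647  # the prime 2^31 - 1; assumes every key lies in [0, P)


def random_hash(n, rng):
    """Draw h_{a,b}(x) = ((a*x + b) mod P) mod n uniformly from the family."""
    a = rng.randrange(1, P)
    b = rng.randrange(0, P)
    return lambda x: ((a * x + b) % P) % n


def build_perfect_table(keys, rng):
    """Two-level table for a non-empty list of distinct integer keys."""
    m = len(keys)
    # Level I: retry until the number of colliding pairs is at most m.
    while True:
        h1 = random_hash(m, rng)
        buckets = [[] for _ in range(m)]
        for key in keys:
            buckets[h1(key)].append(key)
        if sum(len(b) * (len(b) - 1) // 2 for b in buckets) <= m:
            break
    # Level II: a bucket with c items gets c^2 slots and a collision-free hash.
    second = []
    for bucket in buckets:
        size = len(bucket) ** 2
        while True:
            h2 = random_hash(size, rng) if size else None
            slots = [None] * size
            collision = False
            for key in bucket:
                j = h2(key)
                if slots[j] is not None:
                    collision = True
                    break
                slots[j] = key
            if not collision:
                break
        second.append((h2, slots))
    return h1, second


def contains(table, key):
    """O(1) lookup: one first-level hash, then one second-level hash."""
    h1, second = table
    h2, slots = second[h1(key)]
    return bool(slots) and slots[h2(key)] == key
```

A possible usage: `table = build_perfect_table([3, 17, 42, 99], random.Random(0))`, after which `contains(table, 42)` is true and `contains(table, 5)` is false. Each retry loop succeeds with probability at least 1/2 by the lemmas above, so the expected number of retries per level is at most 2.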

slide-21
SLIDE 21

The First and Second Moment

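Theorem. For a nonnegative integer random variable $X$,

  • $\Pr(X > 0) = \Pr(X \ge 1) \le E[X]$ (Markov's inequality)
  • $\Pr(X = 0) \le \Pr(|X - E[X]| \ge E[X]) \le \dfrac{\operatorname{Var}[X]}{(E[X])^2}$ (Chebyshev's inequality, for $E[X] > 0$)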

slide-22
SLIDE 22

Application: Number of Isolated Nodes

Let $G_{n,p} = (V, E)$ be a random graph generated as follows:

  • The graph has $n$ nodes.
  • Each of the $\binom{n}{2}$ pairs of vertices is connected by an edge with probability $p$, independently of any other edge in the graph.

A node is isolated if it is incident to no edges. If $p = 0$ all vertices are isolated (have no edges). If $p = 1$ no vertex is isolated. What can we say for $0 < p < 1$?

slide-23
SLIDE 23

Application: Number of Isolated Nodes

Let $G_{n,p} = (V, E)$ be the random graph on $n$ nodes in which each of the $\binom{n}{2}$ pairs of vertices is connected by an edge with probability $p$, independently of any other edge. A node is isolated if it has no edges.

Theorem. For any function $w(n) \to \infty$:

  • If $p = \dfrac{\log n - w(n)}{n}$, then with high probability (whp) the graph has isolated nodes.
  • If $p = \dfrac{\log n + w(n)}{n}$, then whp the graph has no isolated nodes.

slide-24
SLIDE 24

Proof

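For $i = 1, \ldots, n$, let $X_i = 1$ if node $i$ is isolated, otherwise $X_i = 0$. Let $X = \sum_{i=1}^{n} X_i$.
\[ E[X] = n(1 - p)^{n-1}. \]
For $p = \dfrac{\log n + w(n)}{n}$, using $1 - p \le e^{-p}$ and $p \le 1$,
\[ E[X] = n(1 - p)^{n-1} \le e^{\log n - (n-1)p} = e^{-w(n) + p} \le e^{1 - w(n)} \to 0. \]
Thus, for $p = \dfrac{\log n + w(n)}{n}$,
\[ \Pr(X > 0) \le E[X] \to 0. \]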

slide-25
SLIDE 25

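To use the second moment method we need to bound $\operatorname{Var}[X]$.
\[ \operatorname{Var}[X_i] \le E[X_i^2] = E[X_i] = (1 - p)^{n-1} \]
\[ \operatorname{Cov}(X_i, X_j) = E[X_i X_j] - E[X_i]E[X_j] = (1 - p)^{2n-3} - (1 - p)^{2n-2} \]
\[
\begin{aligned}
\operatorname{Var}[X] &= \sum_{i=1}^{n} \operatorname{Var}[X_i] + \sum_{i \ne j} \operatorname{Cov}(X_i, X_j) \\
  &\le n(1 - p)^{n-1} + n(n-1)(1 - p)^{2n-3} - n(n-1)(1 - p)^{2n-2} \\
  &= n(1 - p)^{n-1} + n(n-1)\,p\,(1 - p)^{2n-3}.
\end{aligned}
\]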

slide-26
SLIDE 26

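\[ \operatorname{Var}[X] = \sum_{i=1}^{n} \operatorname{Var}[X_i] + \sum_{i \ne j} \operatorname{Cov}(X_i, X_j) \le n(1 - p)^{n-1} + n(n-1)\,p\,(1 - p)^{2n-3} \]
\[
\begin{aligned}
\Pr(X = 0) \le \Pr(|X - E[X]| \ge E[X]) &\le \frac{\operatorname{Var}[X]}{(E[X])^2} \\
  &\le \frac{n(1 - p)^{n-1} + n(n-1)\,p\,(1 - p)^{2n-3}}{n^2 (1 - p)^{2n-2}} \\
  &= \left(1 - \frac{1}{n}\right) \frac{p}{1 - p} + \frac{1}{n(1 - p)^{n-1}}.
\end{aligned}
\]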

slide-27
SLIDE 27

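For $p = \dfrac{\log n - w(n)}{n}$,
\[ \Pr(X = 0) \le \frac{\operatorname{Var}[X]}{(E[X])^2} \le \left(1 - \frac{1}{n}\right) \frac{p}{1 - p} + \frac{1}{n(1 - p)^{n-1}} \to 0, \]
since $p \to 0$ and, for all sufficiently large $n$,
\[ n(1 - p)^{n-1} \ge n e^{-p(n-1)} \left(1 - p^2 n\right) \ge \frac{1}{2} e^{w(n)}. \]
We use: for $n \ge 1$ and $|x| \le n$,
\[ e^{x} \left(1 - \frac{x^2}{n}\right) \le \left(1 + \frac{x}{n}\right)^{n} \le e^{x}. \]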
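The two regimes of the theorem can be illustrated numerically from the closed forms derived above. The sketch below is an assumption-laden example: it fixes $n = 10^6$ and $w(n) = \log \log n$ (any function tending to infinity would do), and the helper names are hypothetical.

```python
import math


def expected_isolated(n, p):
    """E[X] = n (1 - p)^(n - 1), the expected number of isolated nodes."""
    return n * (1 - p) ** (n - 1)


def second_moment_bound(n, p):
    """Upper bound on Pr(X = 0): (1 - 1/n) p/(1 - p) + 1/(n (1 - p)^(n - 1))."""
    return (1 - 1 / n) * p / (1 - p) + 1 / expected_isolated(n, p)


n = 10**6
w = math.log(math.log(n))          # assumed choice of w(n)
p_high = (math.log(n) + w) / n     # above the threshold
p_low = (math.log(n) - w) / n      # below the threshold

# Above the threshold the first moment is small: E[X] <= e^{p - w(n)}.
assert expected_isolated(n, p_high) <= math.exp(p_high - w)
# Below the threshold E[X] >= e^{w(n)} / 2 and the bound on Pr(X = 0) is small.
assert expected_isolated(n, p_low) >= 0.5 * math.exp(w)
assert second_moment_bound(n, p_low) < 0.1
```

With this slowly growing $w(n)$ the bounds tend to 0 slowly, which is why the last check only asks for a value below 0.1 at $n = 10^6$.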