The Probabilistic Method: Proof Through a Probabilistic Argument




  1. The Probabilistic Method: Proof Through a Probabilistic Argument • Compute $\sum_{i=0}^{n} i \binom{n}{i} \left(\frac{1}{2}\right)^{n}$

  2. Proof Through a Probabilistic Argument • Compute
  $$\sum_{i=0}^{n} i \binom{n}{i} \left(\frac{1}{2}\right)^{n} = \sum_{i=1}^{n} i \, \frac{n!}{i!(n-i)!} \left(\frac{1}{2}\right)^{n} = \sum_{i=1}^{n} \frac{n(n-1)!}{(i-1)!(n-i)!} \left(\frac{1}{2}\right)^{n} = \frac{n}{2} \sum_{i=1}^{n} \frac{(n-1)!}{(i-1)!(n-i)!} \left(\frac{1}{2}\right)^{n-1} = \frac{n}{2} \sum_{i=0}^{n-1} \frac{(n-1)!}{i!(n-1-i)!} \left(\frac{1}{2}\right)^{n-1} = \frac{n}{2}.$$

  3. Proof Through a Probabilistic Argument • Compute $\sum_{i=0}^{n} i \binom{n}{i} \left(\frac{1}{2}\right)^{n}$ • Let $X \sim B(n, 1/2)$, i.e. $X = \sum_{i=1}^{n} X_i$ with the $X_i$ independent random variables such that $\Pr(X_i = 1) = \Pr(X_i = 0) = 1/2$. Then
  $$\sum_{i=0}^{n} i \binom{n}{i} \left(\frac{1}{2}\right)^{n} = E[X] = \sum_{i=1}^{n} E[X_i] = \frac{n}{2}.$$
  • We prove a deterministic statement using a probabilistic argument!
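As a quick numerical sanity check (not part of the original slides), a few lines of Python confirm that the sum equals n/2:

```python
from math import comb

def binomial_sum(n):
    """Compute sum_{i=0}^{n} i * C(n, i) * (1/2)^n directly."""
    return sum(i * comb(n, i) * 0.5 ** n for i in range(n + 1))

for n in (1, 5, 10, 20):
    # The probabilistic argument says this equals E[X] = n/2 for X ~ B(n, 1/2).
    print(n, binomial_sum(n), n / 2)
```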

  4. Theorem Given any graph G = (V, E) with n vertices and m edges, there is a partition of V into two disjoint sets A and B such that at least m/2 edges connect a vertex in A to a vertex in B. Proof. Construct sets A and B by randomly assigning each vertex, independently and uniformly, to one of the two sets. The probability that a given edge connects a vertex in A to a vertex in B is 1/2, so the expected number of such edges is m/2. Since a random partition achieves m/2 crossing edges in expectation, at least one partition must achieve at least m/2. Thus, there exists such a partition.
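A minimal Python sketch of the random-partition argument (the graph representation and function names are illustrative, not from the slides):

```python
import random

def random_cut_size(n, edges):
    """Assign each vertex to side A (True) or B (False) uniformly at random
    and count the edges that cross the partition."""
    side = {v: random.random() < 0.5 for v in range(n)}
    return sum(1 for u, v in edges if side[u] != side[v])

# Example: a 4-cycle has m = 4 edges, so the expected cut size is m/2 = 2.
edges = [(0, 1), (1, 2), (2, 3), (3, 0)]
samples = [random_cut_size(4, edges) for _ in range(10_000)]
print(sum(samples) / len(samples))  # close to 2
```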

  5. Sample and Modify An independent set in a graph G is a set of vertices with no edges between them. Finding the largest independent set in a graph is an NP-hard problem. Theorem Let G = (V, E) be a graph on n vertices with dn/2 edges. Then G has an independent set with at least n/(2d) vertices. Algorithm: (1) Delete each vertex of G (together with its incident edges) independently with probability 1 − 1/d. (2) For each remaining edge, remove it together with one of its endpoints.
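A minimal Python sketch of this sample-and-modify procedure (the representation and names are illustrative, not from the slides):

```python
import random

def sample_and_modify(n, edges, d):
    """Step 1: keep each vertex independently with probability 1/d.
    Step 2: for every surviving edge, delete one of its endpoints."""
    independent = {v for v in range(n) if random.random() < 1.0 / d}
    for u, v in edges:
        if u in independent and v in independent:
            independent.discard(v)  # remove one endpoint of a surviving edge
    return independent

# Example: a cycle on n vertices has average degree d = 2 (i.e. dn/2 = n edges),
# so the expected size of the output is at least n/(2d) = n/4.
n = 1000
edges = [(i, (i + 1) % n) for i in range(n)]
print(len(sample_and_modify(n, edges, 2)))
```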

  6. Let X be the number of vertices that survive the first step of the algorithm; then $E[X] = n/d$. Let Y be the number of edges that survive the first step. An edge survives if and only if both of its endpoints survive, so
  $$E[Y] = \frac{nd}{2} \left(\frac{1}{d}\right)^{2} = \frac{n}{2d}.$$
  The second step of the algorithm removes all the remaining edges, and at most Y vertices. Size of the output independent set:
  $$E[X - Y] = \frac{n}{d} - \frac{n}{2d} = \frac{n}{2d}.$$

  7. Conditional Expectation Definition
  $$E[Y \mid Z = z] = \sum_{y} y \Pr(Y = y \mid Z = z),$$
  where the summation is over all y in the range of Y. Lemma For any random variables X and Y,
  $$E[X] = \sum_{y} \Pr(Y = y) \, E[X \mid Y = y],$$
  where the sum is over all values y in the range of Y.
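As a small illustration (not from the slides; the dice are a hypothetical example), the lemma can be checked numerically with Y the value of the first of two fair dice and X their sum:

```python
from itertools import product

# Joint outcomes of two fair dice: Y is the first die, X is the sum.
outcomes = [(y, y + z) for y, z in product(range(1, 7), repeat=2)]

# Direct computation of E[X].
e_x = sum(x for _, x in outcomes) / 36

# E[X] via the lemma: sum over y of Pr(Y = y) * E[X | Y = y].
e_x_cond = 0.0
for y in range(1, 7):
    conditional = [x for yy, x in outcomes if yy == y]
    e_x_cond += (1 / 6) * (sum(conditional) / len(conditional))

print(e_x, e_x_cond)  # both print 7.0
```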

  8. Derandomization using Conditional Expectations Given a graph G = ( V , E ) with n vertices and m edges, we showed that there is a partition of V into A and B such that at least m / 2 edges connect A to B . How do we find such a partition?

  9. Let C(A, B) be the number of edges connecting A to B. If (A, B) is a random partition, $E[C(A, B)] = m/2$. Algorithm: (1) Let $v_1, v_2, \ldots, v_n$ be an arbitrary enumeration of the vertices. (2) Let $x_i$ be the set where $v_i$ is placed ($x_i \in \{A, B\}$). (3) For i = 1 to n: place $v_i$ such that
  $$E[C(A,B) \mid x_1, x_2, \ldots, x_i] \ge E[C(A,B) \mid x_1, x_2, \ldots, x_{i-1}] \ge m/2.$$

  10. Lemma For all $i = 1, \ldots, n$ there is an assignment of $v_i$ such that
  $$E[C(A,B) \mid x_1, x_2, \ldots, x_i] \ge E[C(A,B) \mid x_1, x_2, \ldots, x_{i-1}] \ge m/2.$$

  11. Proof. By induction on i. For i = 1, $E[\,E[C(A,B) \mid X_1]\,] = E[C(A,B)] = m/2$. For i > 1, if we place $v_i$ randomly in one of the two sets,
  $$E[C(A,B) \mid x_1, \ldots, x_{i-1}] = \tfrac{1}{2} E[C(A,B) \mid x_1, \ldots, x_{i-1}, x_i = A] + \tfrac{1}{2} E[C(A,B) \mid x_1, \ldots, x_{i-1}, x_i = B].$$
  Hence
  $$\max\bigl(E[C(A,B) \mid x_1, \ldots, x_{i-1}, x_i = A],\; E[C(A,B) \mid x_1, \ldots, x_{i-1}, x_i = B]\bigr) \ge E[C(A,B) \mid x_1, \ldots, x_{i-1}] \ge m/2.$$

  12. How do we decide which of $E[C(A,B) \mid x_1, \ldots, x_{i-1}, x_i = A]$ and $E[C(A,B) \mid x_1, \ldots, x_{i-1}, x_i = B]$ is larger? We only need to consider the edges between $v_i$ and $v_1, \ldots, v_{i-1}$. Simple Algorithm: (1) Place $v_1$ arbitrarily. (2) For i = 2 to n: place $v_i$ in the set containing the smaller number of its already-placed neighbors.
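A minimal Python sketch of this greedy derandomized algorithm (the representation and names are illustrative, not from the slides):

```python
from collections import defaultdict

def greedy_cut(n, edges):
    """Place each vertex, in order, on the side that currently contains
    fewer of its already-placed neighbors."""
    neighbors = defaultdict(list)
    for u, v in edges:
        neighbors[u].append(v)
        neighbors[v].append(u)

    side = {}  # vertex -> 'A' or 'B'
    for v in range(n):
        placed = [w for w in neighbors[v] if w in side]
        in_a = sum(1 for w in placed if side[w] == 'A')
        in_b = len(placed) - in_a
        side[v] = 'B' if in_a >= in_b else 'A'

    cut = sum(1 for u, v in edges if side[u] != side[v])
    return side, cut
```

By the lemma, each placement keeps the conditional expectation at or above m/2, so the returned cut always has at least m/2 crossing edges.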

  13. Perfect Hashing Goal: Store a static dictionary of n items in a table of O(n) space such that any search takes O(1) time.

  14. Universal Hash Functions Definition Let U be a universe with |U| ≥ n and V = {0, 1, ..., n − 1}. A family of hash functions H from U to V is said to be k-universal if, for any k distinct elements $x_1, x_2, \ldots, x_k$, when a hash function h is chosen uniformly at random from H,
  $$\Pr(h(x_1) = h(x_2) = \cdots = h(x_k)) \le \frac{1}{n^{k-1}}.$$

  15. Example of 2-Universal Hash Functions Universe U = {0, 1, 2, ..., m − 1}, table keys V = {0, 1, 2, ..., n − 1}, with m ≥ n. A family of hash functions is obtained by choosing a prime p ≥ m and setting
  $$h_{a,b}(x) = ((ax + b) \bmod p) \bmod n,$$
  with the family $H = \{h_{a,b} \mid 1 \le a \le p − 1,\ 0 \le b \le p − 1\}$. Lemma H is 2-universal.
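A minimal Python sketch of drawing a hash function from this family (the prime p = 97 and the table size below are illustrative choices):

```python
import random

def make_hash(p, n):
    """Draw h_{a,b}(x) = ((a*x + b) mod p) mod n uniformly from the family,
    with a in {1, ..., p-1} and b in {0, ..., p-1}."""
    a = random.randint(1, p - 1)
    b = random.randint(0, p - 1)
    return lambda x: ((a * x + b) % p) % n

# Example: universe {0, ..., 96} (p = 97 is prime), table of size n = 10.
h = make_hash(97, 10)
print(h(13), h(42))
```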

  16. Lemma H is 2-universal. Proof. We first observe that for $x_1, x_2 \in \{0, \ldots, p − 1\}$ with $x_1 \ne x_2$, $ax_1 + b \not\equiv ax_2 + b \pmod{p}$. Thus, if $h_{a,b}(x_1) = h_{a,b}(x_2)$ there is a pair (r, s) with $r \ne s$ and $s \equiv r \pmod{n}$ such that
  $$(ax_1 + b) \bmod p = r, \qquad (ax_2 + b) \bmod p = s.$$
  There are p choices for r, and for each pair (r, s) there is exactly one pair (a, b) that satisfies these equations. For each r there are at most $\lceil p/n \rceil − 1$ values $s \ne r$ with $s \equiv r \pmod{n}$. Thus, the probability of a collision is at most
  $$\frac{p\,(\lceil p/n \rceil − 1)}{p(p − 1)} \le \frac{1}{n},$$
  since $\lceil p/n \rceil − 1 \le (p − 1)/n$.

  17. Lemma If h ∈ H is chosen uniformly at random from a 2-universal family of hash functions mapping the universe U to [0, n − 1], then for any set S ⊂ U of size m, with probability at least 1/2 the number of collisions is bounded by $m^2/n$. Proof. Let $s_1, s_2, \ldots, s_m$ be the m items of S. Let $X_{ij} = 1$ if $h(s_i) = h(s_j)$ and 0 otherwise, and let $X = \sum_{1 \le i < j \le m} X_{ij}$. Then
  $$E[X] = E\Bigl[\sum_{1 \le i < j \le m} X_{ij}\Bigr] = \sum_{1 \le i < j \le m} E[X_{ij}] \le \binom{m}{2} \frac{1}{n} < \frac{m^2}{2n},$$
  and Markov's inequality yields $\Pr(X \ge m^2/n) \le \Pr(X \ge 2E[X]) \le \tfrac{1}{2}$.

  18. Definition A hash function h is perfect for a set S if it maps S with no collisions. Lemma If h ∈ H is chosen uniformly at random from a 2-universal family of hash functions mapping the universe U to [0, n − 1], then for any set S ⊂ U of size m with $m^2 \le n$, with probability at least 1/2 the hash function is perfect for S.

  19. Theorem The two-level approach gives a perfect hashing scheme for m items using O(m) bins. Level I: use a hash table with n = m bins. Let X be the number of collisions; then
  $$\Pr(X \ge m^2/n) \le \Pr(X \ge 2E[X]) \le \tfrac{1}{2}.$$
  When n = m, there exists a choice of hash function from the 2-universal family that gives at most m collisions.

  20. Level II: Let $c_i$ be the number of items in the i-th bin. There are $\binom{c_i}{2}$ collisions between items in the i-th bin, thus
  $$\sum_{i=1}^{m} \binom{c_i}{2} \le m.$$
  For each bin with $c_i > 1$ items, we find a second hash function that gives no collisions using space $c_i^2$. The total number of bins used is bounded above by
  $$m + \sum_{i=1}^{m} c_i^2 = m + \sum_{i=1}^{m} \Bigl( 2\binom{c_i}{2} + c_i \Bigr) \le m + 2m + m = 4m.$$
  Hence the total number of bins used is only O(m).
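A minimal Python sketch of the two-level scheme, assuming the items are integers smaller than a chosen prime p; the retry loops and names are illustrative, not from the slides:

```python
import random

def make_hash(p, n):
    """h_{a,b}(x) = ((a*x + b) mod p) mod n, drawn uniformly from the 2-universal family."""
    a, b = random.randint(1, p - 1), random.randint(0, p - 1)
    return lambda x: ((a * x + b) % p) % n

def build_perfect_table(items, p):
    """Level I hashes the m items into m bins (retrying until there are at most
    m collisions); level II gives each bin of size c a collision-free table of c*c slots."""
    m = len(items)
    while True:  # succeeds with probability >= 1/2 per attempt
        h1 = make_hash(p, m)
        bins = [[] for _ in range(m)]
        for x in items:
            bins[h1(x)].append(x)
        if sum(c * (c - 1) // 2 for c in map(len, bins)) <= m:
            break
    tables = []
    for bucket in bins:
        size = max(1, len(bucket) ** 2)
        while True:  # a random h is perfect for the bucket with probability >= 1/2
            h2 = make_hash(p, size)
            slots = [None] * size
            ok = True
            for x in bucket:
                if slots[h2(x)] is not None:
                    ok = False
                    break
                slots[h2(x)] = x
            if ok:
                tables.append((h2, slots))
                break
    return h1, tables

def contains(x, h1, tables):
    h2, slots = tables[h1(x)]
    return slots[h2(x)] == x
```

Usage example (p = 97 is an illustrative prime larger than every item): `h1, tables = build_perfect_table([3, 19, 42, 77], 97); print(contains(42, h1, tables))`.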

  21. The First and Second Moment Methods Theorem For a nonnegative integer-valued random variable X,
  • $\Pr(X > 0) = \Pr(X \ge 1) \le E[X]$ (by Markov's inequality),
  • $\Pr(X = 0) \le \Pr(|X − E[X]| \ge E[X]) \le \dfrac{\operatorname{Var}[X]}{(E[X])^2}$ (by Chebyshev's inequality).

  22. Application: Number of Isolated Nodes Let $G_{n,p} = (V, E)$ be a random graph generated as follows: • The graph has n nodes. • Each of the $\binom{n}{2}$ pairs of vertices is connected by an edge with probability p, independently of any other edge in the graph. A node is isolated if it is adjacent to no edges. If p = 0, all vertices are isolated; if p = 1, no vertex is isolated. What can we say for 0 < p < 1?

  23. Application: Number of Isolated Nodes Let $G_{n,p} = (V, E)$ be a random graph generated as follows: • The graph has n nodes. • Each of the $\binom{n}{2}$ pairs of vertices is connected by an edge with probability p, independently of any other edge in the graph. A node is isolated if it has no edges. Theorem For any function $w(n) \to \infty$:
  • If $p = \dfrac{\log n − w(n)}{n}$, then with high probability (whp) the graph has isolated nodes.
  • If $p = \dfrac{\log n + w(n)}{n}$, then whp the graph has no isolated nodes.

  24. Proof For $i = 1, \ldots, n$, let $X_i = 1$ if node i is isolated and $X_i = 0$ otherwise, and let $X = \sum_{i=1}^{n} X_i$. Then $E[X] = n(1 − p)^{n−1}$. For $p = \dfrac{\log n + w(n)}{n}$,
  $$E[X] = n(1 − p)^{n−1} \le e^{\log n − (n−1)p} \le e^{−w(n)} \to 0.$$
  Thus, for $p = \dfrac{\log n + w(n)}{n}$,
  $$\Pr(X > 0) \le E[X] \to 0.$$

  25. To use the second moment method we need to bound $\operatorname{Var}[X]$. We have
  $$\operatorname{Var}[X_i] \le E[X_i^2] = E[X_i] = (1 − p)^{n−1}, \qquad \operatorname{Cov}(X_i, X_j) = (1 − p)^{2n−3} − (1 − p)^{2n−2} \quad (i \ne j),$$
  so
  $$\operatorname{Var}[X] \le \sum_{i=1}^{n} \operatorname{Var}[X_i] + \sum_{i \ne j} \operatorname{Cov}(X_i, X_j) \le n(1 − p)^{n−1} + n(n − 1)(1 − p)^{2n−3} − n(n − 1)(1 − p)^{2n−2} = n(1 − p)^{n−1} + n(n − 1)p(1 − p)^{2n−3}.$$
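A small simulation sketch (not from the slides; the values of n and w(n) below are arbitrary illustrative choices) that contrasts p just below and just above the $\log n / n$ threshold:

```python
import math
import random

def count_isolated(n, p):
    """Sample G(n, p) and count the isolated vertices."""
    has_edge = [False] * n
    for i in range(n):
        for j in range(i + 1, n):
            if random.random() < p:
                has_edge[i] = has_edge[j] = True
    return sum(1 for v in range(n) if not has_edge[v])

n = 500
w = math.log(math.log(n))            # a slowly growing w(n)
p_below = (math.log(n) - w) / n      # expect some isolated nodes (about e^w of them)
p_above = (math.log(n) + w) / n      # expect essentially none (about e^{-w} in expectation)
print(count_isolated(n, p_below), count_isolated(n, p_above))
```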
