Indicators: Now With Pair-wise Flavor!
- Recall $I_i$ is the indicator variable for event $A_i$ when:
  $$I_i = \begin{cases} 1 & \text{if } A_i \text{ occurs} \\ 0 & \text{otherwise} \end{cases}$$
- Let X = # of events that occur:
  $$X = \sum_{i=1}^{n} I_i$$
  $$E[X] = E\left[\sum_{i=1}^{n} I_i\right] = \sum_{i=1}^{n} E[I_i] = \sum_{i=1}^{n} P(A_i)$$
- Now consider a pair of events $A_i A_j$ occurring
- $I_i I_j = 1$ if both events $A_i$ and $A_j$ occur, 0 otherwise
- Number of pairs of events that occur is
  $$\sum_{i<j} I_i I_j = \binom{X}{2}$$
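The pair-counting identity can be checked on a concrete outcome. The following is a minimal sketch; the indicator vector and the helper name `count_pairs` are hypothetical and chosen only for illustration.

```python
from itertools import combinations
from math import comb

def count_pairs(indicators):
    """Sum I_i * I_j over all index pairs i < j."""
    return sum(a * b for a, b in combinations(indicators, 2))

# Hypothetical outcome: events 1, 3 and 4 occur, events 2 and 5 do not.
indicators = [1, 0, 1, 1, 0]
x = sum(indicators)  # X = number of events that occur

# Both values count the pairs of events that occurred together.
print(count_pairs(indicators), comb(x, 2))
```

Each product I_i I_j contributes 1 exactly when both events occur, so the sum counts the ways to choose 2 of the X events that occurred.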
From Event Pairs to Variance
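- Expected number of pairs of events:
  $$E\left[\binom{X}{2}\right] = E\left[\sum_{i<j} I_i I_j\right] = \sum_{i<j} E[I_i I_j] = \sum_{i<j} P(A_i A_j)$$
  $$E\left[\frac{X(X-1)}{2}\right] = \frac{1}{2}\left(E[X^2] - E[X]\right) = \sum_{i<j} P(A_i A_j)$$
  $$E[X^2] - E[X] = 2\sum_{i<j} P(A_i A_j) \;\Rightarrow\; E[X^2] = 2\sum_{i<j} P(A_i A_j) + E[X]$$
- Recall: $\text{Var}(X) = E[X^2] - (E[X])^2$
  $$\text{Var}(X) = 2\sum_{i<j} P(A_i A_j) + E[X] - (E[X])^2$$
  $$\text{Var}(X) = 2\sum_{i<j} P(A_i A_j) + \sum_{i=1}^{n} P(A_i) - \left(\sum_{i=1}^{n} P(A_i)\right)^2$$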
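The variance formula does not require the events to be independent. As a hedged illustration, the sketch below compares it with a direct computation of E[X^2] - (E[X])^2 on a small sample space. The fair die and the three events are assumptions made up for this example, as are the function names.

```python
from fractions import Fraction
from itertools import combinations

OUTCOMES = range(1, 7)  # assumed sample space: one roll of a fair die

# Hypothetical, deliberately dependent events.
EVENTS = [
    {2, 4, 6},  # A_1: roll is even
    {4, 5, 6},  # A_2: roll is greater than 3
    {6},        # A_3: roll is a six
]

def prob(event):
    """P(event) under equally likely outcomes."""
    return Fraction(len(event), len(OUTCOMES))

def variance_from_pairs(events):
    """Var(X) = 2 * sum_{i<j} P(A_i A_j) + sum_i P(A_i) - (sum_i P(A_i))^2."""
    pair_sum = sum(prob(a & b) for a, b in combinations(events, 2))
    single_sum = sum(prob(a) for a in events)
    return 2 * pair_sum + single_sum - single_sum ** 2

def variance_direct(events):
    """Var(X) = E[X^2] - (E[X])^2, computed outcome by outcome."""
    xs = [sum(w in a for a in events) for w in OUTCOMES]
    mean = Fraction(sum(xs), len(xs))
    mean_sq = Fraction(sum(x * x for x in xs), len(xs))
    return mean_sq - mean ** 2

print(variance_from_pairs(EVENTS), variance_direct(EVENTS))
```

Exact fractions are used so the two results can be compared without floating-point noise.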
Let’s Try It With the Binomial
- $X \sim \text{Bin}(n, p)$
- Each trial: $X_i \sim \text{Ber}(p)$, so $E[X_i] = p$
- Let event $A_i$ = trial i is success (i.e., $X_i = 1$)
  $$E[X] = \sum_{i=1}^{n} P(A_i) = np$$
  $$E\left[\binom{X}{2}\right] = \sum_{i<j} E[X_i X_j] = \sum_{i<j} P(A_i A_j) = \sum_{i<j} p^2 = \binom{n}{2} p^2$$
  $$E[X(X-1)] = E[X^2] - E[X] = n(n-1)p^2$$
  $$\text{Var}(X) = E[X^2] - (E[X])^2 = \left(E[X^2] - E[X]\right) + E[X] - (E[X])^2$$
  $$= n(n-1)p^2 + np - (np)^2 = n^2p^2 - np^2 + np - n^2p^2 = np(1-p)$$
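The binomial result can be confirmed numerically. This is a small sketch under assumed values (n = 10, p = 3/10); the function names are hypothetical. It evaluates the pair-based expression and compares it with the variance computed from the binomial probability mass function.

```python
from fractions import Fraction
from math import comb

def binomial_variance_via_pairs(n, p):
    """n(n-1)p^2 + np - (np)^2, i.e. 2*C(n,2)*p^2 + E[X] - (E[X])^2."""
    pair_term = n * (n - 1) * p ** 2
    mean = n * p
    return pair_term + mean - mean ** 2

def binomial_variance_from_pmf(n, p):
    """E[X^2] - (E[X])^2 computed from P(X = k) = C(n,k) p^k (1-p)^(n-k)."""
    pmf = [comb(n, k) * p ** k * (1 - p) ** (n - k) for k in range(n + 1)]
    mean = sum(k * q for k, q in enumerate(pmf))
    mean_sq = sum(k * k * q for k, q in enumerate(pmf))
    return mean_sq - mean ** 2

n, p = 10, Fraction(3, 10)  # assumed example parameters
print(binomial_variance_via_pairs(n, p))
print(binomial_variance_from_pmf(n, p))
print(n * p * (1 - p))
```

All three printed values should agree, since each is an exact rational evaluation of Var(X) = np(1 - p).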
Computer Cluster Utilization
- Computer cluster with N servers
- Requests independently go to server i with probability pi
- Let event Ai = server i receives no requests
- X = # of events A_1, A_2, …, A_N that occur
- Y = # servers that receive ≥ 1 request = N – X
- E[Y] after first n requests?
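- Since requests are independent:
  $$P(A_i) = (1 - p_i)^n$$
  $$E[X] = \sum_{i=1}^{N} P(A_i) = \sum_{i=1}^{N} (1 - p_i)^n$$
  $$E[Y] = N - E[X] = N - \sum_{i=1}^{N} (1 - p_i)^n$$
- When $p_i = \frac{1}{N}$ for $1 \le i \le N$:
  $$E[Y] = N - \sum_{i=1}^{N} \left(1 - \frac{1}{N}\right)^n = N\left(1 - \left(1 - \frac{1}{N}\right)^n\right)$$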
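A short sketch of the E[Y] formula follows, checked against brute-force enumeration of every possible sequence of requests. The server probabilities and request counts are assumed values for illustration, and the function names are hypothetical. The enumeration visits N^n sequences, so it is only practical for tiny inputs.

```python
from fractions import Fraction
from itertools import product
from math import prod

def expected_busy_servers(probs, n):
    """E[Y] = N - sum_i (1 - p_i)^n for n independent requests."""
    return len(probs) - sum((1 - p) ** n for p in probs)

def expected_busy_servers_brute_force(probs, n):
    """Average the number of distinct servers hit over all request sequences."""
    total = 0
    for assignment in product(range(len(probs)), repeat=n):
        weight = prod(probs[s] for s in assignment)  # probability of this sequence
        total += weight * len(set(assignment))
    return total

uniform = [Fraction(1, 4)] * 4                              # assumed: N = 4, p_i = 1/N
skewed = [Fraction(1, 2), Fraction(1, 3), Fraction(1, 6)]   # assumed: N = 3

print(expected_busy_servers(uniform, 3), 4 * (1 - Fraction(3, 4) ** 3))
print(expected_busy_servers(skewed, 4), expected_busy_servers_brute_force(skewed, 4))
```

The first line compares the general formula with the closed form for equal probabilities; the second compares it with the enumeration for unequal probabilities.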
Computer Cluster Utilization (cont.)
- Computer cluster with N servers
- Requests independently go to server i with probability pi
- Let event Ai = server i receives no requests
- X = # of events A_1, A_2, …, A_N that occur
- Y = # servers that receive ≥ 1 request = N – X
- Var(Y) after first n requests?
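- Independent requests:
  $$P(A_i A_j) = (1 - p_i - p_j)^n, \quad i \ne j$$
  $$E[X(X-1)] = E[X^2] - E[X] = 2\sum_{i<j} P(A_i A_j) = 2\sum_{i<j} (1 - p_i - p_j)^n$$
  $$\text{Var}(X) = 2\sum_{i<j} (1 - p_i - p_j)^n + E[X] - (E[X])^2, \quad E[X] = \sum_{i=1}^{N} (1 - p_i)^n$$
  $$\text{Var}(X) = 2\sum_{i<j} (1 - p_i - p_j)^n + \sum_{i=1}^{N} (1 - p_i)^n - \left(\sum_{i=1}^{N} (1 - p_i)^n\right)^2 = \text{Var}(Y)$$
- ( $\text{Var}(Y) = \text{Var}(N - X) = (-1)^2\,\text{Var}(X) = \text{Var}(X)$ )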
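The Var(Y) expression can be checked the same way. The sketch below assumes a tiny cluster with made-up probabilities and compares the formula with the variance of Y obtained by enumerating every request sequence; all names are hypothetical.

```python
from fractions import Fraction
from itertools import combinations, product
from math import prod

def var_busy_servers(probs, n):
    """Var(Y) = Var(X) = 2*sum_{i<j}(1-p_i-p_j)^n + E[X] - (E[X])^2."""
    pair_sum = sum((1 - pi - pj) ** n for pi, pj in combinations(probs, 2))
    mean_x = sum((1 - p) ** n for p in probs)  # E[X], expected idle servers
    return 2 * pair_sum + mean_x - mean_x ** 2

def var_busy_servers_brute_force(probs, n):
    """E[Y^2] - (E[Y])^2 by enumerating all N^n request sequences."""
    mean = mean_sq = 0
    for assignment in product(range(len(probs)), repeat=n):
        weight = prod(probs[s] for s in assignment)  # probability of this sequence
        y = len(set(assignment))                     # servers with >= 1 request
        mean += weight * y
        mean_sq += weight * y * y
    return mean_sq - mean ** 2

probs = [Fraction(1, 2), Fraction(1, 3), Fraction(1, 6)]  # assumed: N = 3
print(var_busy_servers(probs, 4), var_busy_servers_brute_force(probs, 4))
```

The brute-force side works on Y directly while the formula works on X, so agreement also illustrates Var(Y) = Var(N - X) = Var(X).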
Computer Cluster = Coupon Collecting
- Computer cluster with N servers
- Requests independently go to server i with probability pi
- Let event Ai = server i receives no requests
- X = # of events A_1, A_2, …, A_N that occur
- Y = # servers that receive ≥ 1 request = N – X
- This is really another “Coupon Collector” problem
  - Each server is a “coupon type”
  - Request to server = collecting a coupon of that type
- Hash table version
  - Each server is a bucket in table
  - Request to server = string gets hashed to that bucket
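The hash table version can be illustrated with a small sketch. It assumes a hash that spreads keys roughly uniformly over the buckets (SHA-256 is used here purely as a stand-in), so that each string lands in bucket i with probability about 1/N. The keys, bucket count, and function names are hypothetical.

```python
import hashlib

def bucket_of(key, num_buckets):
    """Map a string to a bucket index using a stable hash (assumed near-uniform)."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_buckets

def occupied_buckets(keys, num_buckets):
    """Y = number of buckets that receive at least one key."""
    return len({bucket_of(k, num_buckets) for k in keys})

def expected_occupied(num_buckets, num_keys):
    """E[Y] = N * (1 - (1 - 1/N)^n) when every bucket has probability 1/N."""
    return num_buckets * (1 - (1 - 1 / num_buckets) ** num_keys)

num_buckets = 100                           # N, assumed table size
keys = [f"key-{i}" for i in range(100)]     # n = 100 made-up strings

print(occupied_buckets(keys, num_buckets))
print(round(expected_occupied(num_buckets, len(keys)), 2))
```

The observed count is one sample of Y, so it will generally differ from E[Y]; `hashlib` is used instead of the built-in `hash` because string hashing in Python varies between runs.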