Martingales and the Method of Bounded Differences
Advanced Algorithms Nanjing University, Fall 2018
(Some) Concentration Inequalities
Question: what is the probability that X deviates from its expectation by more than a given amount?
Example: roll a fair six-sided die
$\mathcal{E}_1$ = the outcome is six
$\mathcal{E}_2$ = the outcome is an even number
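As a hedged illustration of this example, the conditional probability of the first event given the second can be computed by direct enumeration over the six equally likely outcomes. The helper name `cond_prob` is hypothetical and not taken from the slides.

```python
from fractions import Fraction

def cond_prob(event_a, event_b, outcomes):
    """Pr[A | B] under the uniform distribution over `outcomes`."""
    b = [w for w in outcomes if event_b(w)]
    ab = [w for w in b if event_a(w)]
    return Fraction(len(ab), len(b))

die = range(1, 7)
# A: the outcome is six; B: the outcome is an even number.
p = cond_prob(lambda w: w == 6, lambda w: w % 2 == 0, die)
print(p)  # 1/3
```

Exact fractions are used so the result is not blurred by floating-point rounding.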
Example: sample a human being uniformly at random
$X$: height of the chosen human being
$Y$: country of origin of the chosen human being
$\mathbb{E}[X] = ?$
$\mathbb{E}[X \mid Y = \text{"China"}] = ?$
$\mathbb{E}[X \mid Y = \text{"U.S."}] = ?$
Example: throw a fair six-sided die $n$ times
$X_i$: # of times $i$ appears in the $n$ throws
$\mathbb{E}[X_1 \mid X_2 = a] = (n - a)/5$
$\mathbb{E}[X_1 \mid X_2] = (n - X_2)/5$
$\mathbb{E}[X_1 \mid X_2 = a, X_3 = b] = (n - a - b)/4$
$\mathbb{E}[X_1 \mid X_2, X_3] = (n - X_2 - X_3)/4$
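The first of these formulas can be sanity-checked by brute force for a small number of throws. The sketch below enumerates all equally likely throw sequences; the function name `cond_exp_x1_given_x2` and the choice n = 4 are illustrative assumptions.

```python
from fractions import Fraction
from itertools import product

def cond_exp_x1_given_x2(n, a):
    """Exact E[X1 | X2 = a] over all 6**n equally likely throw sequences."""
    total, count = 0, 0
    for throws in product(range(1, 7), repeat=n):
        if throws.count(2) == a:        # condition on X2 = a
            total += throws.count(1)    # accumulate X1
            count += 1
    return Fraction(total, count)

n = 4
for a in range(n + 1):
    # Matches (n - a) / 5: the other n - a throws are uniform over five faces.
    assert cond_exp_x1_given_x2(n, a) == Fraction(n - a, 5)
```

Enumeration is exponential in n, so this is only meant as a check on tiny instances.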
Example: sample a human being uniformly at random
$X$: height of the chosen human being
$Y$: country of origin of the chosen human being
$Z$: gender of the chosen human being
average height of all human beings = weighted average of the country-by-country average heights:
$\mathbb{E}[X] = \mathbb{E}[\mathbb{E}[X \mid Y]]$
average height of all male/female human beings = weighted average of the country-by-country average male/female heights:
$\mathbb{E}[X \mid Z] = \mathbb{E}[\mathbb{E}[X \mid Y, Z] \mid Z]$
$\mathbb{E}[f(X)\,g(X, Y) \mid X = x] = f(x)\,\mathbb{E}[g(X, Y) \mid X = x]$
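The "weighted average of averages" statement can be checked on a toy population. The data below are made up purely for illustration (country labels and heights are hypothetical); the sketch compares the plain average with the country averages weighted by the probability of each country.

```python
from collections import defaultdict
from fractions import Fraction

# Hypothetical toy population: (country, height in cm). Values are made up.
population = [("A", 170), ("A", 180), ("A", 175), ("B", 160), ("B", 166)]

def mean(values):
    values = list(values)
    return Fraction(sum(values), len(values))

# Left-hand side: the plain average height over everyone.
lhs = mean(h for _, h in population)

# Right-hand side: country averages weighted by the fraction of people per country.
by_country = defaultdict(list)
for country, h in population:
    by_country[country].append(h)
rhs = sum(Fraction(len(hs), len(population)) * mean(hs) for hs in by_country.values())

assert lhs == rhs
```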
Generalization to the multivariate case:
πβ1 2π = 1
consider a fair game, with any betting strategy
let $X_i$ be our wealth after $i$ rounds
$\mathbb{E}[X_{i+1} \mid X_0, X_1, \dots, X_i] = X_i$
since the game is fair, conditioned on past history, we expect no change to the current value after one round
a dot starts from the origin; in each step, it moves equiprobably to one of its four neighbors
after $i$ steps, use $X_i$ to denote # of hops to origin (Manhattan distance)
How far is the dot from the origin after $n$ steps? How large is $X_n$?
$X_0, X_1, \dots$ are not necessarily independent
We know $X_0 = 0$, and $|X_i - X_{i-1}| \le 1$
Within $O(\sqrt{n \log n})$ w.h.p.
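A small simulation can make this claim tangible. The sketch below walks the grid and compares the observed Manhattan distances with a threshold of the stated order; the constant 4, the seed, and the sizes are arbitrary assumptions, not values from the slides.

```python
import math
import random

def manhattan_after(n, rng):
    """Manhattan distance from the origin after n uniform steps on the grid."""
    x = y = 0
    for _ in range(n):
        dx, dy = rng.choice([(1, 0), (-1, 0), (0, 1), (0, -1)])
        x += dx
        y += dy
    return abs(x) + abs(y)

rng = random.Random(0)          # fixed seed, chosen arbitrarily for reproducibility
n, trials = 2000, 200
distances = [manhattan_after(n, rng) for _ in range(trials)]
threshold = 4 * math.sqrt(n * math.log(n))   # the constant 4 is an assumption
print(max(distances), "<=", round(threshold))
```

The walk can never be farther than n hops away, but in practice the largest observed distance sits far below n and below the threshold.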
For a sequence of r.v., if in each step:
* on average make no change to current value (martingale)
* no big jump (bounded difference)
Then the final value does not deviate a lot from the initial value.
Use a similar strategy as in proving Chernoff bounds:
(a) Apply generalized Markov's inequality to MGF
(b)* Bound the value of MGF (use Hoeffding's lemma)
(c) Optimize the value of MGF
for $\lambda > 0$, with $|X_i - X_{i-1}| \le c_i$:
$\mathbb{E}[X] = \mathbb{E}[\mathbb{E}[X \mid Y]]$
$\mathbb{E}[\mathbb{E}[f(X)\,g(X, Y) \mid X]] = \mathbb{E}[f(X)\,\mathbb{E}[g(X, Y) \mid X]]$
the resulting bound $\exp\left(-\lambda t + \frac{\lambda^2}{2} \sum_{i=1}^{n} c_i^2\right)$ is minimized when $\lambda = \frac{t}{\sum_{i=1}^{n} c_i^2}$
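The intermediate steps of the proof did not survive in the text, so the following is a sketch of the standard argument rather than the slides' exact derivation. It assumes a martingale $X_0, X_1, \dots$ with $|X_i - X_{i-1}| \le c_i$ and follows steps (a) to (c): Markov's inequality on the MGF, Hoeffding's lemma on each conditional increment, then the optimal choice of $\lambda$.

```latex
\begin{align*}
\Pr[X_n - X_0 \ge t]
  &\le e^{-\lambda t}\,\mathbb{E}\!\left[e^{\lambda (X_n - X_0)}\right]
     && \text{(Markov, } \lambda > 0\text{)} \\
\mathbb{E}\!\left[e^{\lambda (X_n - X_0)}\right]
  &= \mathbb{E}\!\left[ e^{\lambda (X_{n-1} - X_0)}\,
       \mathbb{E}\!\left[e^{\lambda (X_n - X_{n-1})} \mid X_0, \dots, X_{n-1}\right] \right] \\
  &\le e^{\lambda^2 c_n^2 / 2}\,\mathbb{E}\!\left[e^{\lambda (X_{n-1} - X_0)}\right]
     && \text{(Hoeffding's lemma)} \\
  &\le \exp\!\Big(\tfrac{\lambda^2}{2} \sum_{i=1}^{n} c_i^2\Big)
     && \text{(induction on } n\text{)} \\
\Pr[X_n - X_0 \ge t]
  &\le \exp\!\Big(-\lambda t + \tfrac{\lambda^2}{2} \sum_{i=1}^{n} c_i^2\Big)
   = \exp\!\Big(-\frac{t^2}{2 \sum_{i=1}^{n} c_i^2}\Big)
     && \text{at } \lambda = \frac{t}{\sum_{i=1}^{n} c_i^2}
\end{align*}
```

Applying the same argument to $X_0 - X_n$ and taking a union bound gives the two-sided form with an extra factor of 2.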
betting on a fair game
$X_i$: gain/loss of the $i$th bet
$Y_i$: wealth after the $i$th bet, so $Y_0, Y_1, \dots$ is a martingale (since the game is fair)
martingale $X_0, X_1, \dots$:
$\mathbb{E}[X_i \mid X_0, X_1, \dots, X_{i-1}] = X_{i-1}$
generalization: martingale $Y_0, Y_1, \dots$ w.r.t. $X_0, X_1, \dots$:
$Y_i$ is a function of $X_0, X_1, \dots, X_i$, and $\mathbb{E}[Y_i \mid X_0, X_1, \dots, X_{i-1}] = Y_{i-1}$
Azuma's Inequality:
martingale $X_0, X_1, \dots$ with $|X_i - X_{i-1}| \le c_i$, then $\Pr[|X_n - X_0| \ge t] \le \cdots$
generalization: martingale $Y_0, Y_1, \dots$ w.r.t. $X_0, X_1, \dots$ with $|Y_i - Y_{i-1}| \le c_i$, then $\Pr[|Y_n - Y_0| \ge t] \le \cdots$
Doob sequence: information is revealed step by step
$\mathbb{E}[f]$ (no information)
$\mathbb{E}[f \mid X_1]$
$\mathbb{E}[f \mid X_1, X_2]$
$\mathbb{E}[f \mid X_1, X_2, X_3]$
$\mathbb{E}[f \mid X_1, X_2, X_3, X_4] = f(X_1, X_2, X_3, X_4)$ (full information)
each term is randomized by the variables revealed so far and averaged over the remaining ones
$\mathbb{E}[X \mid \vec{Z}] = \mathbb{E}[\mathbb{E}[X \mid \vec{Y}, \vec{Z}] \mid \vec{Z}]$
$I_j = 1$ if edge $e_j \in G$, and $I_j = 0$ otherwise
$Y_i = \mathbb{E}[f(G) \mid I_1, \dots, I_i]$
$Y_0, Y_1, \dots, Y_{\binom{n}{2}}$ is a Doob sequence, called edge exposure martingale
In particular, $Y_0 = \mathbb{E}[f(G)]$, and $Y_{\binom{n}{2}} = f(G)$
$Y_i = \mathbb{E}[f(G) \mid X_1, \dots, X_i]$, where $X_i$ reveals the edges between vertex $i$ and the earlier vertices $1, \dots, i-1$
$Y_0, Y_1, \dots, Y_n$ is a Doob sequence, called vertex exposure martingale
In particular, $Y_0 = \mathbb{E}[f(G)]$, and $Y_n = f(G)$
When $f(G)$ is the chromatic number: a new vertex can always be given a new color! So $|Y_i - Y_{i-1}| \le 1$
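The right-hand side of the tail bound is elided in the text. Assuming the standard two-sided Azuma form $2\exp(-t^2 / (2\sum_i c_i^2))$, the sketch below shows what the bounded difference of 1 per exposed vertex would give. The function name `azuma_tail` and the value n = 10000 are illustrative assumptions.

```python
import math

def azuma_tail(t, c):
    """Standard two-sided Azuma bound: 2 * exp(-t^2 / (2 * sum(c_i^2)))."""
    return 2.0 * math.exp(-t * t / (2.0 * sum(ci * ci for ci in c)))

n = 10_000                      # number of vertices (hypothetical)
c = [1.0] * n                   # difference of at most 1 for every exposure step
for k in (1, 2, 3, 4):
    t = k * math.sqrt(n)        # deviations on the scale of sqrt(n)
    print(f"deviation >= {t:.0f}: probability <= {azuma_tail(t, c):.4g}")
```

Under this assumed form, deviations of a few multiples of $\sqrt{n}$ already have small probability, which is the sense in which the concentration is tight relative to the range of the function.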
martingale $X_0, X_1, \dots$:
$\mathbb{E}[X_i \mid X_0, X_1, \dots, X_{i-1}] = X_{i-1}$
generalization: martingale $Y_0, Y_1, \dots$ w.r.t. $X_0, X_1, \dots$:
$Y_i$ is a function of $X_0, X_1, \dots, X_i$, and $\mathbb{E}[Y_i \mid X_0, X_1, \dots, X_{i-1}] = Y_{i-1}$
special case: Doob martingale $Y_0, Y_1, \dots$:
$Y_i = \mathbb{E}[f(X_0, X_1, \dots, X_n) \mid X_0, X_1, \dots, X_i]$
Azuma's Inequality:
martingale $X_0, X_1, \dots$ with $|X_i - X_{i-1}| \le c_i$, then $\Pr[|X_n - X_0| \ge t] \le \cdots$
generalization: Generalized Azuma's Inequality:
martingale $Y_0, Y_1, \dots$ w.r.t. $X_0, X_1, \dots$ with $|Y_i - Y_{i-1}| \le c_i$, then $\Pr[|Y_n - Y_0| \ge t] \le \cdots$
applied in random graphs: vertex exposure martingale
Sample Application: Tight Concentration of Chromatic number
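$f(X_1, X_2, \dots, X_n)$
$Y_i = \mathbb{E}[f(X_1, \dots, X_n) \mid X_1, \dots, X_i]$
In particular, $Y_0 = \mathbb{E}[f(X_1, \dots, X_n)]$ and $Y_n = f(X_1, \dots, X_n)$
Doob Martingale + Generalized Azuma's Inequality: if the differences $|Y_i - Y_{i-1}|$ are bounded, then the deviation $|Y_n - Y_0|$ is bounded with high probability
bounded averaged differences: $|\mathbb{E}[f(X) \mid X_1, \dots, X_i] - \mathbb{E}[f(X) \mid X_1, \dots, X_{i-1}]| \le c_i$
May be hard to check!
Lipschitz condition (changing only one coordinate $X_i$ changes $f$ by at most $c_i$) + independence imply bounded averaged differences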
martingale $X_0, X_1, \dots$:
$\mathbb{E}[X_i \mid X_0, X_1, \dots, X_{i-1}] = X_{i-1}$
generalization: martingale $Y_0, Y_1, \dots$ w.r.t. $X_0, X_1, \dots$:
$Y_i$ is a function of $X_0, X_1, \dots, X_i$, and $\mathbb{E}[Y_i \mid X_0, X_1, \dots, X_{i-1}] = Y_{i-1}$
special case: Doob martingale $Y_0, Y_1, \dots$:
$Y_i = \mathbb{E}[f(X_0, X_1, \dots, X_n) \mid X_0, X_1, \dots, X_i]$
Azuma's Inequality:
martingale $X_0, X_1, \dots$ with $|X_i - X_{i-1}| \le c_i$, then $\Pr[|X_n - X_0| \ge t] \le \cdots$
generalization: Generalized Azuma's Inequality:
martingale $Y_0, Y_1, \dots$ w.r.t. $X_0, X_1, \dots$ with $|Y_i - Y_{i-1}| \le c_i$, then $\Pr[|Y_n - Y_0| \ge t] \le \cdots$
The Method of Averaged Bounded Differences:
$f(X)$ satisfying $|\mathbb{E}[f(X) \mid X_1, \dots, X_i] - \mathbb{E}[f(X) \mid X_1, \dots, X_{i-1}]| \le c_i$, then $\Pr[|f(X) - \mathbb{E}[f(X)]| \ge t] \le \cdots$
The Method of Bounded Differences (independence + Lipschitz condition):
$X = (X_1, \dots, X_n)$ are independent r.v., $f(X)$ satisfying the Lipschitz condition, then $\Pr[|f(X) - \mathbb{E}[f(X)]| \ge t] \le \cdots$
an alphabet $\Sigma$ with $|\Sigma| = m$, a fixed pattern $\pi \in \Sigma^k$
independently and uniformly generate: $X_1, X_2, \dots, X_n \in \Sigma$
let $Y$ be the number of occurrences of $\pi$ as a substring of $X_1 X_2 \cdots X_n$
$\mathbb{E}[Y] = (n - k + 1)\left(\frac{1}{m}\right)^k$
Deviation?
$Y = f(X_1, X_2, \dots, X_n)$
changing any $X_i$ changes $Y$ by at most $k$
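The Lipschitz claim for pattern matching can be checked empirically. The sketch below counts (possibly overlapping) occurrences of a pattern in a random string, then changes one position at a time and records the largest change in the count. The alphabet, pattern, length, and seed are hypothetical choices made for illustration.

```python
import random

def count_occurrences(text, pattern):
    """Number of (possibly overlapping) occurrences of pattern in text."""
    k = len(pattern)
    return sum(1 for i in range(len(text) - k + 1) if text[i:i + k] == pattern)

rng = random.Random(1)                      # arbitrary fixed seed
alphabet, pattern, n = "ab", "aba", 200     # hypothetical instance: 2 symbols, length-3 pattern
k = len(pattern)
text = [rng.choice(alphabet) for _ in range(n)]
base = count_occurrences("".join(text), pattern)

# Change each single position to every symbol and record the largest change in the count.
worst = 0
for i in range(n):
    for s in alphabet:
        changed = text[:i] + [s] + text[i + 1:]
        worst = max(worst, abs(count_occurrences("".join(changed), pattern) - base))

assert worst <= k                           # Lipschitz condition with constant k
```

A single position belongs to at most k windows of length k, which is why no single change can move the count by more than k.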
throw $m$ balls into $n$ bins, independently and uniformly at random
let $Y$ denote # of empty bins; we are interested in the deviation: $\Pr[|Y - \mathbb{E}[Y]| \ge t] \le\ ?$
the indicators $X_i$ of bin $i$ being empty are not independent!
let $Z_j$ be the bin that ball $j$ landed in (thus the $Z_j$ are independent)
$Y = f(Z_1, Z_2, \dots, Z_m) = |[n] \setminus \{Z_1, Z_2, \dots, Z_m\}|$
notice changing any $Z_j$ changes $Y$ by at most 1 (Lipschitz condition)
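To see the method of bounded differences at work on this example, the sketch below simulates the number of empty bins and compares the observed tail frequency with a bound of the assumed Azuma form $2\exp(-t^2 / (2m))$, where each ball contributes a Lipschitz constant of 1. The sizes, seed, and the exact form of the bound are assumptions; the text leaves the right-hand side open.

```python
import math
import random

def empty_bins(assignments, n):
    """Number of bins in range(n) that received no ball."""
    return n - len(set(assignments))

rng = random.Random(2)                      # arbitrary fixed seed
m, n, trials = 500, 500, 500                # hypothetical numbers of balls, bins, runs
expected = n * (1 - 1 / n) ** m             # expectation by linearity over the bins
samples = [empty_bins([rng.randrange(n) for _ in range(m)], n) for _ in range(trials)]

t = 3 * math.sqrt(m)
bound = 2 * math.exp(-t * t / (2 * m))      # assumed Azuma form with constant 1 per ball
observed = sum(abs(y - expected) >= t for y in samples) / trials
print(f"observed {observed:.4f} <= bound {bound:.4f}")
```

The function is evaluated on the independent ball positions rather than on the dependent bin indicators, which is exactly the change of variables the argument relies on.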