SLIDE 1 MAL - Hypothesis Testing
Rob Franken
- Dept. of Information and Computing Sciences, Utrecht University
- P.O. Box 80.089, 3508 TB Utrecht, The Netherlands
- Web page: http://www.cs.uu.nl/
7 April 2011
SLIDE 2
Paper
The plan for this half of the lecture is to prove the main result of the paper by Foster and Young.
Literature
D.P. Foster and H.P. Young (2003): "Learning, hypothesis testing, and Nash equilibrium", Games and Economic Behavior, Elsevier.
SLIDE 3 Main Theorem
Theorem 1
Suppose that the players adopt hypotheses with finite memory, have σi-smoothed best response functions, employ powerful hypothesis tests with comparable amounts of data, and are flexible in the adoption of new hypotheses. Given any ε > 0, if the σi are small (given ε), if the test tolerances τi are sufficiently fine (given ε and σi), and if the amounts of data collected, si, are sufficiently large (given ε, σi and τ), then:
- 1. The repeated-game strategies are ε-equilibria of the repeated game G∞(u, X) at least 1 − ε of the time.
- 2. All players for whom prediction matters by at least ε are ε-good predictors.
SLIDES 4-7 Introductory Definitions
- Ai: the set of responses for player i with memory m.
- Bi: the set of models player i can hold over the players j ≠ i.
- A^σ: a function from B to A, mapping the models the players believe to their responses.
- B: a function from A to B, mapping the players' current responses to the correct models.
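The σi-smoothed best response A^σ is only required to be a smooth, nearly optimal response function; the paper does not fix a particular form. As an illustration only, one standard choice is the logit (quantal) response, sketched below; the payoff matrix and function names are hypothetical, not taken from the paper.

```python
import math

def smoothed_best_response(payoffs, belief, sigma):
    """Logit smoothing: one standard instance of a sigma-smoothed best
    response A_i^sigma. payoffs[x][y] is player i's payoff for own action x
    against opponent action y; belief is i's model b_i, a probability
    vector over the opponent's actions."""
    expected = [sum(p * q for p, q in zip(row, belief)) for row in payoffs]
    m = max(expected)                                   # stabilize the exponentials
    weights = [math.exp((e - m) / sigma) for e in expected]
    total = sum(weights)
    return [w / total for w in weights]                 # mixed response over own actions

# As sigma shrinks, the response concentrates on the exact best reply;
# as sigma grows, it flattens toward uniform.
payoffs = [[3.0, 0.0],
           [2.0, 2.0]]
belief = [0.5, 0.5]
response = smoothed_best_response(payoffs, belief, sigma=0.01)
```

With this belief the second action has expected payoff 2.0 versus 1.5, so for small σ the response puts almost all mass on it.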
SLIDES 8-11 Fixed Points
- We can easily think of what a fixed-point model or response is; the two even correspond to each other.
- For models: B(A^σ(b)) ?= b.
- For responses: A^σ(B(a)) ?= a.
- With test tolerances, only an approximate fixed point is needed: |B(A^σ(b)) − b| ≤ τ.
- Fixed points are equilibria.
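For intuition, the fixed-point condition b = B(A^σ(b)) can be found by simple iteration in small games. A minimal sketch, assuming a two-player game, logit smoothing, and that the correct model of a player is exactly the response that player plays; all names are hypothetical, not the paper's construction.

```python
import math

def logit_response(payoffs, belief, sigma):
    # sigma-smoothed (logit) best response; payoffs[x][y] = own payoff for
    # action x against opponent action y, belief = distribution over opponent actions.
    expected = [sum(p * q for p, q in zip(row, belief)) for row in payoffs]
    m = max(expected)
    w = [math.exp((e - m) / sigma) for e in expected]
    s = sum(w)
    return [x / s for x in w]

def iterate_to_fixed_point(pay1, pay2, sigma, iters=500):
    # Iterate the composite map b -> B(A^sigma(b)): each player's correct
    # model is the (smoothed) response the other player actually plays.
    b1 = [0.9, 0.1]   # player 1's model of player 2 (arbitrary start)
    b2 = [0.2, 0.8]   # player 2's model of player 1
    for _ in range(iters):
        a1 = logit_response(pay1, b1, sigma)   # A_1^sigma(b_1)
        a2 = logit_response(pay2, b2, sigma)   # A_2^sigma(b_2)
        b1, b2 = a2, a1                        # B: responses -> correct models
    return b1, b2

# Matching pennies: for sigma large enough the map is a contraction, and the
# fixed-point model sits at the mixed equilibrium (1/2, 1/2).
pennies1 = [[1.0, -1.0], [-1.0, 1.0]]
pennies2 = [[-1.0, 1.0], [1.0, -1.0]]
b1, b2 = iterate_to_fixed_point(pennies1, pennies2, sigma=2.0)
```

At the resulting b, applying the map once more moves it by (numerically) nothing: it is a fixed point, hence an equilibrium of the smoothed game.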
SLIDES 12-16 Overview of the Proof
- Suppose the current b is bad for some responsive player i.
- With high probability he will reject his hypothesis.
- Changing to a remote strategy will lead all the other players to reject their own model hypotheses.
- There is a positive chance that this process reaches a single fixed-point model b∗ for everyone except player i, and this is corrected after he tests one more time.
- Once in a fixed-point model, the chance of leaving it is small.
SLIDE 17 Lemma
Lemma
Fix a finite action space X = ∏_{i=1}^{n} Xi. Given any ε > 0 and any finite memory m, there exist functions σ(ε), τ(ε, σ), s(ε, σ, τ) such that if these functions bound the parameters with the same names, then at least 1 − ε of the time t:
- 1. |a^t − A^σ(B(a^t))| ≤ ε.
- 2. U^t_i(a^t_i, Bi) ≥ U^t_i(a′_i, Bi) − ε for every player i and every alternative response a′_i.
- 3. |b^t_i − Bi(A^σ(b^t))| ≤ ε for every player i for whom prediction matters by at least ε.
SLIDES 18-22 Proof of Lemma (1)
- Pick one fixed-point pair a∗ and b∗.
- Choose σi ≤ ε/2 for all i.
- It holds that ∃ 0 < δ < ε/2n such that (∀u, t, i)(∀bi, b′i): |bi − b′i| ≤ δ ⇒ |Ai^σi(bi) − Ai^σi(b′i)| ≤ ε/2n.
- Let di > 0 be the maximum difference between two responses over the entire model space.
- If di > δ the player is responsive, otherwise he is unresponsive.
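Whether a player is responsive can be probed numerically by estimating di over the model space. A minimal sketch; logit smoothing and random sampling of beliefs are assumptions for illustration, not the paper's construction.

```python
import math, random

def logit_response(payoffs, belief, sigma):
    # sigma-smoothed (logit) best response over own actions.
    expected = [sum(p * q for p, q in zip(row, belief)) for row in payoffs]
    m = max(expected)
    w = [math.exp((e - m) / sigma) for e in expected]
    s = sum(w)
    return [x / s for x in w]

def estimate_d_i(payoffs, sigma, samples=300, seed=1):
    # Estimate d_i = the maximum (max-norm) distance between two responses
    # over the entire model space, by sampling random beliefs.
    rng = random.Random(seed)
    k = len(payoffs[0])
    responses = []
    for _ in range(samples):
        raw = [rng.random() for _ in range(k)]
        t = sum(raw)
        responses.append(logit_response(payoffs, [x / t for x in raw], sigma))
    return max(max(abs(a - b) for a, b in zip(r1, r2))
               for r1 in responses for r2 in responses)

# A player with constant payoffs is unresponsive (d_i = 0); a matching-pennies
# player reacts sharply to his model, so d_i comes out large.
flat = [[1.0, 1.0], [1.0, 1.0]]
pennies = [[1.0, -1.0], [-1.0, 1.0]]
d_flat = estimate_d_i(flat, sigma=0.1)
d_pennies = estimate_d_i(pennies, sigma=0.1)
```

Comparing the estimated di against the δ from the previous bullet then classifies the player as responsive or unresponsive.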
SLIDES 23-25 Proof of Lemma (2)
Consider two cases: either all players are unresponsive (Case 1), or at least one player is responsive (Case 2).
Case 1
- Every possible response lies within δ < ε/2n of every other.
- Each player's utility varies by at most ε/2n < ε, so they are ε-close to optimal.
- There are no responsive players, so statement 3 of the lemma holds vacuously.
SLIDES 26-27 Proof of Lemma (3)
Take a τ < δ/(2(n + 1)). A model vector b is good if all bi are good w.r.t. τ, fairly good if it is good for all responsive i, and bad otherwise.
Case 2
The proof consists of two claims:
- 1. If b^t is (at least) fairly good at least 1 − ε of the time, then the three statements of the lemma follow.
- 2. The model vector b^t is fairly good at least 1 − ε of the time.
SLIDES 28-30 Proof of Lemma (4)
Recall: ∃ 0 < δ < ε/2n such that (∀u, t, i)(∀bi, b′i): |bi − b′i| ≤ δ ⇒ |Ai^σi(bi) − Ai^σi(b′i)| ≤ ε/2n.
Claim 1
- 1. From b being fairly good we can deduce that |ai − Ai^σi(Bi(ai))| ≤ ε/2n for the responsive players, while for the unresponsive players the difference is at most δ ≤ ε/2n. Putting this together gives an upper bound of n · ε/2n = ε/2, which yields statement 1 of the lemma.
- 2. Since we transformed the game to one with payoffs between 0 and 1, and statement 1 of the lemma holds, the maximal difference in utility from the fixed point is at most ε/2 (recall σi ≤ ε/2); thus the maximal difference between two models in that range is at most 2 · ε/2 = ε, which yields statement 2.
- 3. Since b is fairly good, statement 3 of the lemma follows directly.
SLIDES 31-32 Claim 2
Reminder: the model vector b^t is fairly good at least 1 − ε of the time.
There exists a c(τ) such that players reject with exponentially small probability if the model is at most c(τ) from the truth, and with probability near 1 if it is at least τ from the truth (a consequence of the requirement of powerful tests). There is a 0 < γ < c(τ)/2 such that (∀i) |bi − b∗i| < γ implies that the play A^σ(b) keeps every model within c(τ) of the truth, so no player rejects.
SLIDES 33-37 Claim 2 (2)
Let us call states in which all models are within γ of b∗ and no player is testing great states. Consider the following process to reach a great state from a bad state:
- 1. Player 1 alone is testing; he rejects his hypothesis and accepts a model reasonably close to b∗1 but still wrong. (At most 2s∗ rounds.)
- 2. Then, one after another, the other players test their hypotheses and each adopts a model within γ of b∗j. (At most (n − 1)s∗ rounds.)
- 3. Player 1 starts a new test and now adopts b∗1. (At most s∗ rounds.)
- 4. No player starts a test for (n + 2)s∗ − T rounds, where T is the length of steps 1-3.
SLIDE 38 Claim 2 (3)
We have n + 1 test phases that should end in rejection; since we know that all models are at least γ wrong, we can choose our test parameters so that the chance of rejection is at least 1/2. Each rejecting player then needs to hit a target of radius λ; let us call the lower bound on this chance f∗(λ). We also require that the tests take place sequentially with no overlap; writing s_max and s_min for the largest and smallest si, the chance of that is at least
- ((1/s_max)(1 − 1/s_max)^((n−1)s_max))^(n+1),
and since si ≥ 2 we can bound this chance from below by
- ((1/s_max) · 4^(−(n−1)s_max/s_min))^(n+1).
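The replacement of the (1 − 1/s)-power by a power of 4 rests on the elementary fact that (1 − 1/s)^s ≥ 1/4 for every integer s ≥ 2, hence (1 − 1/s)^k ≥ 4^(−k/s). A quick numeric check:

```python
# The slide's lower bound uses (1 - 1/s)**s >= 1/4 for all integers s >= 2.
# The sequence (1 - 1/s)**s equals exactly 1/4 at s = 2 and increases
# monotonically toward 1/e, so the bound holds everywhere.
def bound_holds_up_to(max_s):
    return all((1 - 1 / s) ** s >= 0.25 for s in range(2, max_s + 1))

ok = bound_holds_up_to(10_000)
```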
SLIDE 39 Claim 2 (4)
Putting this together with the eventual wait, we get a lower bound on the chance of
- 4^(−N·s_max/s_min) · (f∗(λ)/(2s_max))^(n+1)
for some positive integer N. If s_max is at most twice s_min, this again reduces to a bound of the form
- η = α · s_max^(−(n+1)) · e^(−β·s_max),
where the constants α and β come from the probabilities of making type I and type II errors in the tests. We can bound the chance of leaving a great state within T rounds from above by T·e^(−4r·s_max); so if we choose T to be e^(3r·s_max), this probability stays below e^(−r·s_max).
SLIDE 40 Claim 2 (5)
Now define E to be the event that at least εT of the states in a period of length T are bad. We can divide it into two subevents: E′, in which all the bad states occur between time 0 and some time k, and the remaining cases E′′. For E′ the upper bound is (1 − η)^(k−1) < e^(−η(k−1)); using our earlier bounds, this can be made arbitrarily small when s_max is large enough. If E′′ happens, the process does not stay in great states for T periods, thus P(E′′) ≤ e^(−r·s_max); again, taking s_max large enough makes this sufficiently small. Together, P(E) ≤ P(E′) + P(E′′) < ε.
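The step from (1 − η)^(k−1) to e^(−η(k−1)) uses the standard inequality 1 − x ≤ e^(−x); a quick numeric spot check over a small grid of parameters:

```python
import math

# (1 - eta)**(k - 1) <= exp(-eta * (k - 1)) follows from 1 - x <= exp(-x)
# applied k - 1 times. Spot-check the bound over a grid of eta and k.
def bound_holds(eta, k):
    return (1 - eta) ** (k - 1) <= math.exp(-eta * (k - 1))

checks = [bound_holds(eta, k)
          for eta in (1e-4, 1e-2, 0.5, 0.9)
          for k in (2, 10, 1000)]
all_ok = all(checks)
```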
SLIDES 41-43 Lemma (again)
Lemma
Fix a finite action space X = ∏_{i=1}^{n} Xi. Given any ε > 0 and any finite memory m, there exist functions σ(ε), τ(ε, σ), s(ε, σ, τ) such that if these functions bound the parameters with the same names, then at least 1 − ε of the time t:
- 1. |a^t − A^σ(B(a^t))| ≤ ε.
- 2. U^t_i(a^t_i, Bi) ≥ U^t_i(a′_i, Bi) − ε for every player i and every alternative response a′_i.
- 3. |b^t_i − Bi(A^σ(b^t))| ≤ ε for every player i for whom prediction matters by at least ε.
SLIDES 44-45 Theorem (again)
Theorem 1
Suppose that the players adopt hypotheses with finite memory, have σi-smoothed best response functions, employ powerful hypothesis tests with comparable amounts of data, and are flexible in the adoption of new hypotheses. Given any ε > 0, if the σi are small (given ε), if the test tolerances τi are sufficiently fine (given ε and σi), and if the amounts of data collected, si, are sufficiently large (given ε, σi and τ), then:
- 1. The repeated-game strategies are ε-equilibria of the repeated game G∞(u, X) at least 1 − ε of the time.
- 2. All players for whom prediction matters by at least ε are ε-good predictors.
SLIDE 46
The End
Room for questions