SLIDE 1

MAL - Hypothesis Testing

Rob Franken

Dept. of Information and Computing Sciences, Utrecht University

P.O. Box 80.089, 3508 TB Utrecht, The Netherlands

Web pages: http://www.cs.uu.nl/

7 April 2011

SLIDE 2

Paper

The plan for this half of the lecture is to prove the main result of the paper by Foster and Young.

Literature

D.P. Foster, H.P. Young (2003): “Learning, hypothesis testing, and Nash equilibrium”, Games and Economic Behavior, Elsevier.

SLIDE 3

Main Theorem

Theorem 1

Suppose that the players adopt hypotheses with finite memory, have $\sigma_i$-smoothed best response functions, employ powerful hypothesis tests with comparable amounts of data, and are flexible in the adoption of new hypotheses. Given any $\epsilon > 0$, if the $\sigma_i$ are small (given $\epsilon$), if the test tolerances $\tau_i$ are sufficiently fine (given $\epsilon$ and $\sigma_i$) and if the amounts of data collected, $s_i$, are sufficiently large (given $\epsilon$, $\sigma_i$ and $\tau$), then:

  1. The repeated-game strategies are $\epsilon$-equilibria of the repeated game $G^\infty(\vec u, X)$ at least $1 - \epsilon$ of the time.

  2. All players for whom prediction matters by at least $\epsilon$ are $\epsilon$-good predictors.

SLIDE 7

Introductory Definitions

$A_i$: the set of responses for player $i$ with memory $m$.

$B_i$: the set of models player $i$ can hold over the players $j \ne i$.

$A^{\vec\sigma}(\cdot)$: a function from $B$ to $A$, mapping all players' beliefs to their responses.

$B(\cdot)$: a function from $A$ to $B$, mapping all players' current responses to the correct models.

SLIDE 11

Fixed Points

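◮ We can easily think of what a fixed-point model or a fixed-point response is; they even correspond to each other.

◮ $B(A^{\vec\sigma}(\vec b)) \overset{?}{=} \vec b$

◮ $A^{\vec\sigma}(B(\vec a)) \overset{?}{=} \vec a$

◮ $|b_i - B_i(A^{\vec\sigma}(\vec b))| \overset{?}{\le} \tau$

◮ Fixed points are equilibria.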
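The fixed-point conditions can be checked numerically in a toy setting. The following is a minimal sketch under assumptions that are not taken from the slides: a 2x2 game (matching pennies), models with memory 0 (a model is just the probability that the opponent plays their first action), and a logit rule as the smoothed response. The names `smoothed_response`, `A_sigma`, `B` and `is_good` are hypothetical.

```python
import math

# Hypothetical 2x2 payoffs (matching pennies).
# Rows = own action, columns = opponent's action.
U1 = [[1.0, 0.0], [0.0, 1.0]]   # player 1 wants to match
U2 = [[0.0, 1.0], [1.0, 0.0]]   # player 2 wants to mismatch

def smoothed_response(payoff, belief, sigma):
    """Logit response (an assumed smoothing rule): probability of playing
    action 0, given the belief that the opponent plays action 0 with
    probability `belief`."""
    u0 = payoff[0][0] * belief + payoff[0][1] * (1 - belief)
    u1 = payoff[1][0] * belief + payoff[1][1] * (1 - belief)
    return 1.0 / (1.0 + math.exp((u1 - u0) / sigma))

def A_sigma(b, sigma):
    """Map the model vector b = (b1, b2) to the response vector (a1, a2)."""
    return (smoothed_response(U1, b[0], sigma),
            smoothed_response(U2, b[1], sigma))

def B(a):
    """Map responses to the correct models: the correct model for player 1
    is player 2's actual behaviour, and vice versa."""
    return (a[1], a[0])

def is_good(b, sigma, tau):
    """Check |b_i - B_i(A_sigma(b))| <= tau for every player i."""
    target = B(A_sigma(b, sigma))
    return all(abs(bi - ti) <= tau for bi, ti in zip(b, target))

if __name__ == "__main__":
    print(is_good((0.5, 0.5), sigma=0.1, tau=0.01))  # the mixed equilibrium
    print(is_good((0.9, 0.9), sigma=0.1, tau=0.05))  # a badly wrong model
```

In this toy game the model vector (0.5, 0.5) reproduces itself exactly, so it passes for any tolerance, while (0.9, 0.9) leads to responses far from what the models predict.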

SLIDE 16

Overview of the Proof

◮ Suppose the current model vector $\vec b$ is bad for some responsive player $i$.

◮ With high probability he will reject his hypothesis.

◮ Changing to a remote strategy will lead all other players to reject their own model hypotheses.

◮ There is a positive chance that this leads to a single fixed-point model $\vec b^*$ for everyone except player $i$, but this will be corrected after he tests another time.

◮ Once the process is in a fixed-point model, the chance of leaving it is small.

SLIDE 17

Lemma

Fix a finite action space $X = \prod_{i=1}^{n} X_i$. Given any $\epsilon > 0$ and any finite memory $m$, there exist functions $\sigma(\epsilon)$, $\tau(\epsilon, \sigma)$, $s(\epsilon, \sigma, \tau)$ such that if these functions bound the parameters with the same names, then at least $1 - \epsilon$ of the time $t$:

  1. $|\vec a^t - A^{\vec\sigma}(B(\vec a^t))| \le \epsilon/2$.

  2. $|U_i^t(a_i^t, B_i(\vec a^t)) - \max_{a_i'} U_i^t(a_i', B_i(\vec a^t))| \le \epsilon$ for all $i$.

  3. $|b_i^t - B_i(A^{\vec\sigma}(\vec b^t))| \le \epsilon$ for every player $i$ for whom prediction matters by at least $\epsilon$.

SLIDE 22

Proof of Lemma (1)

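◮ Pick one fixed-point pair $\vec a^*$ and $\vec b^*$.

◮ Choose $\sigma_i \le \epsilon/2$ for all $i$.

◮ It holds that $\exists\, 0 < \delta < \epsilon/2n$: $(\forall\, \vec u, t, i)(\forall\, b_i, b_i')$ $|b_i - b_i'| \le \delta \Rightarrow |A_i^{\sigma_i}(b_i) - A_i^{\sigma_i}(b_i')| \le \epsilon/2n$.

◮ Let $d_i > 0$ be the maximum difference between two responses on the entire model space.

◮ If $d_i > \delta$, player $i$ is responsive; otherwise he is not.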

SLIDE 25

Proof of Lemma (2)

Consider these two cases: all players are unresponsive, or at least one player is responsive.

Case 1

◮ Every possible response lies within $\delta < \epsilon/2n$.

◮ Each player's utility varies by at most $\epsilon/2n < \epsilon$, so they are $\epsilon$-close to optimal.

◮ There are no responsive players.

SLIDE 27

Proof of Lemma (3)

Take a $\tau < \delta / (2(n+1))$. A model vector $\vec b$ is good if all $b_i$ are good w.r.t. $\tau$, fairly good if it is good for all responsive $i$, and bad otherwise.

Case 2

The proof consists of two claims:

  1. If the model vector $\vec b^t$ is (at least) fairly good at least $1 - \epsilon$ of the time, then the three statements of the lemma follow.

  2. The model vector $\vec b^t$ is fairly good at least $1 - \epsilon$ of the time.
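The three-way classification of a model vector can be written down directly. This is a minimal sketch that assumes a model $b_i$ counts as "good w.r.t. $\tau$" when its error $|b_i - B_i(A^{\vec\sigma}(\vec b))|$ is at most $\tau$; the function names `tau_upper_bound` and `classify`, and the representation of errors as a plain list, are hypothetical.

```python
def tau_upper_bound(delta, n):
    """The bound delta / (2 (n + 1)) below which tau is chosen."""
    return delta / (2 * (n + 1))

def classify(errors, responsive, tau):
    """Classify a model vector as 'good', 'fairly good' or 'bad'.

    errors[i]     -- assumed to hold |b_i - B_i(A_sigma(b))| for player i
    responsive[i] -- True if player i is responsive (d_i > delta)
    """
    good = [e <= tau for e in errors]
    if all(good):
        return "good"
    # Only the responsive players need good models for 'fairly good'.
    if all(g for g, r in zip(good, responsive) if r):
        return "fairly good"
    return "bad"

if __name__ == "__main__":
    tau = 0.9 * tau_upper_bound(delta=0.2, n=2)
    print(classify([0.01, 0.5], [True, False], tau))
```

Note that with no responsive players every model vector is at least fairly good under this reading, which matches the way Case 1 is treated separately.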

SLIDE 30

Proof of Lemma (4)

Remember: $\exists\, 0 < \delta < \epsilon/2n$: $(\forall\, \vec u, t, i)(\forall\, b_i, b_i')$ $|b_i - b_i'| \le \delta \Rightarrow |A_i^{\sigma_i}(b_i) - A_i^{\sigma_i}(b_i')| \le \epsilon/2n$

Claim 1

  1. From $\vec b$ being fairly good we can deduce that $|a_i - A_i^{\sigma_i}(B_i(\vec a))| \le \epsilon/2n$ for responsive players, and for unresponsive players it is at most $\delta < \epsilon/2n$. Putting this together we get an upper bound of $n \cdot \epsilon/2n$, that is $\epsilon/2$, which leads to statement 1 of the lemma.

  2. Since we transformed the game to a game with payoffs between 0 and 1, we know, because statement 1 of the lemma holds, that the maximal difference in utility with the fixed point is $\sigma_i \le \epsilon/2$. Thus the maximal difference between two models in that range will be $2 \cdot \epsilon/2 = \epsilon$, which gives statement 2 of the lemma.

  3. Since $\tau < \epsilon$ and $\vec b$ is fairly good, we have statement 3 of the lemma.
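The counting in item 1 can be written out as a chain of inequalities. This is a sketch that assumes, which the slide does not state, that the distance between response vectors is bounded by the sum of the per-player distances:

```latex
\[
\bigl|\vec a - A^{\vec\sigma}(B(\vec a))\bigr|
\;\le\; \sum_{i=1}^{n} \bigl|a_i - A_i^{\sigma_i}(B_i(\vec a))\bigr|
\;\le\; n \cdot \frac{\epsilon}{2n}
\;=\; \frac{\epsilon}{2}.
\]
```

Each of the $n$ terms is at most $\epsilon/2n$, whether the player is responsive or not, which is why the number of players cancels.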

SLIDE 32

Claim 2

Reminder

The model vector $\vec b^t$ is fairly good at least $1 - \epsilon$ of the time.

There exists a $c(\tau)$ such that players reject with exponentially small probability if the model is at most $c(\tau)$ from the truth, and with probability close to 1 if it is at least $\tau$ from the truth. (This follows from the requirement of powerful tests.)

There is a $0 < \gamma < c(\tau)/2$ such that: $(\forall i)\ |b_i - b_i^*| < \gamma \Rightarrow |b_i - B_i(A^{\vec\sigma}(\vec b))| < c(\tau)$
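To make the accept/reject behaviour concrete, here is a minimal simulation of a single test. It rests on assumptions that are not from the slides: the opponent has two actions, the model is a single probability `model_p`, the player collects `s` observations and rejects when the empirical frequency differs from the model by more than a fixed threshold. The function names are hypothetical, and the Hoeffding inequality is used only as a stand-in for the paper's notion of a powerful test.

```python
import math
import random

def rejects(model_p, true_p, s, threshold, rng):
    """Collect s observations of an opponent who plays action 0 with
    probability true_p; reject the model if the empirical frequency of
    action 0 differs from model_p by more than threshold."""
    hits = sum(1 for _ in range(s) if rng.random() < true_p)
    return abs(hits / s - model_p) > threshold

def hoeffding_bound(s, slack):
    """Hoeffding upper bound 2 exp(-2 s slack^2) on the chance that the
    empirical frequency is more than `slack` away from the truth."""
    return 2.0 * math.exp(-2.0 * s * slack ** 2)

def rejection_rate(model_p, true_p, s, threshold, trials=2000, seed=0):
    """Estimate the rejection probability by repeated simulation."""
    rng = random.Random(seed)
    count = sum(rejects(model_p, true_p, s, threshold, rng)
                for _ in range(trials))
    return count / trials

if __name__ == "__main__":
    tau = 0.2
    # A correct model is almost never rejected ...
    print(rejection_rate(0.5, 0.5, s=400, threshold=tau / 2))
    # ... while a model that is tau away from the truth almost always is.
    print(rejection_rate(0.5, 0.5 + tau, s=400, threshold=tau / 2))
    print(hoeffding_bound(400, tau / 2))
```

The false-rejection bound shrinks exponentially in the sample size `s`, which is the qualitative property the argument relies on.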
SLIDE 37

Claim 2 (2)

Let us call states in which all models are within $\gamma$ of $b_j^*$ and no player is testing great states. Consider the following process to reach a great state from a bad state.

  1. Player 1 alone is testing, rejects his hypothesis and accepts a model reasonably close to $b_1^*$ but still wrong. (At most $2s^*$ rounds)

  2. Then, one after another, the other players test their hypotheses and adopt a model within $\gamma$ of $b_j^*$. (At most $(n-1)s^*$ rounds)

  3. Player 1 starts a new test and now adopts $b_1^*$. (At most $s^*$ rounds)

  4. Nobody starts a test for $(n+2)s^* - T$ rounds, where $T$ is the length of steps 1-3.
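Adding up the stated bounds shows why the waiting time in step 4 is never negative. This only restates the numbers given in the steps, with $T$ the length of steps 1-3:

```latex
\[
T \;\le\; 2s^* + (n-1)s^* + s^* \;=\; (n+2)s^*,
\qquad\text{so}\qquad
(n+2)s^* - T \;\ge\; 0 .
\]
```

With the wait included, the whole process therefore takes exactly $(n+2)s^*$ rounds, whatever the actual length of the test phases.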

SLIDE 38

Claim 2 (3)

We have $n+1$ test phases that should end in rejection. Since we know that all models are at least $\gamma$ wrong, we can choose our test parameters so that the chance of rejection is at least 1/2. The newly adopted model then needs to hit a target of radius $\lambda$; let us call the lower bound on this chance $f_*(\lambda)$. We also require that the tests take place sequentially with no overlap. With $s^*$ and $s_*$ the largest and smallest of the $s_i$, the chance of that is at least:

$\left[ (1/s^*)\,(1 - 1/s_*)^{(n-1)s^*} \right]^{n+1}$

Since $s_i \ge 2$ we can bound this chance from below by:

$\left[ (1/s^*)\,(1/4)^{(n-1)s^*/s_*} \right]^{n+1}$
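The factor 4 comes from an elementary estimate. The derivation below is a sketch that assumes $s_*$ denotes the smallest and $s^*$ the largest of the $s_i$, which the slide does not spell out. It uses the fact that $(1 - 1/s)^s$ is increasing in $s$ and equals $1/4$ at $s = 2$:

```latex
\[
\Bigl(1 - \frac{1}{s_*}\Bigr)^{(n-1)s^*}
= \Bigl[\Bigl(1 - \frac{1}{s_*}\Bigr)^{s_*}\Bigr]^{(n-1)s^*/s_*}
\;\ge\; \Bigl(\frac{1}{4}\Bigr)^{(n-1)s^*/s_*}
= 4^{-(n-1)s^*/s_*}
\qquad (s_* \ge 2).
\]
```

When $s^*$ and $s_*$ differ by at most a fixed factor, the exponent $(n-1)s^*/s_*$ is bounded, so this part of the bound does not shrink as the sample sizes grow.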
SLIDE 39

Claim 2 (4)

Putting this and the eventual wait together we get a lower bound on the chance of:

$4^{-N s^*/s_*} \left( f_*(\lambda) / 2s^* \right)^{n+1}$

for some positive integer $N$, which is again reducible (if $s^*$ and $s_*$ differ by no more than a factor 2) to

$\eta = \dfrac{\alpha}{(s^*)^{n+1}\, e^{\beta s^*}}$

where $\alpha$ and $\beta$ are the chances of making a type I and a type II error. We can bound the chance that we leave a great state within $T$ rounds from above by $T e^{-4rs^*}$, so if we choose $T$ to be $e^{3rs^*}$ we keep a probability of $e^{-rs^*}$.
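The choice of $T$ can be checked in one line. The sketch assumes that the bound $T e^{-4rs^*}$ arises from a union bound over $T$ rounds, each with a chance of at most $e^{-4rs^*}$ of leaving the great state; the slide states only the resulting product.

```latex
\[
T\,e^{-4rs^*} \;=\; e^{3rs^*}\,e^{-4rs^*} \;=\; e^{-rs^*} .
\]
```

So the period length $T$ grows exponentially in $s^*$ while the chance of leaving a great state during it still vanishes exponentially.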

SLIDE 40

Claim 2 (5)

Now define $\varepsilon$ to be the event that at least $\epsilon T$ of the states in a period of length $T$ are bad. We can divide it into two sub-events: the one where all the bad states are from time 0 till $k$ ($\varepsilon'$), and the rest ($\varepsilon''$).

For $\varepsilon'$ the upper bound is $(1 - \eta)^{k-1} < e^{-\eta(k-1)}$. Using our earlier bounds we can bound this from above by $e^{-e^{rs^*}}$, which we can make arbitrarily small when $s^*$ is large enough.

If $\varepsilon''$ happens, the process does not stay in great states for $T$ periods, thus $P(\varepsilon'') \le e^{-rs^*}$, and again if we take $s^*$ large enough we can make this sufficiently small.

$P(\varepsilon) \le P(\varepsilon') + P(\varepsilon'') < \epsilon$
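The first inequality is the standard estimate $1 - x < e^{-x}$ for $x > 0$, applied $k - 1$ times. The second step in the sketch below assumes that the earlier bounds are used to guarantee $\eta\,(k-1) \ge e^{rs^*}$; the slide does not state this condition explicitly.

```latex
\[
(1-\eta)^{k-1} \;<\; \bigl(e^{-\eta}\bigr)^{k-1} \;=\; e^{-\eta (k-1)}
\;\le\; e^{-e^{rs^*}}
\qquad\text{whenever } 0 < \eta < 1,\; k \ge 2,\; \eta\,(k-1) \ge e^{rs^*} .
\]
```

The resulting bound is doubly exponential in $s^*$, so it is negligible next to the $e^{-rs^*}$ bound on the second sub-event.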

SLIDE 43

Lemma (again)

Fix a finite action space $X = \prod_{i=1}^{n} X_i$. Given any $\epsilon > 0$ and any finite memory $m$, there exist functions $\sigma(\epsilon)$, $\tau(\epsilon, \sigma)$, $s(\epsilon, \sigma, \tau)$ such that if these functions bound the parameters with the same names, then at least $1 - \epsilon$ of the time $t$:

  1. $|\vec a^t - A^{\vec\sigma}(B(\vec a^t))| \le \epsilon/2$.

  2. $|U_i^t(a_i^t, B_i(\vec a^t)) - \max_{a_i'} U_i^t(a_i', B_i(\vec a^t))| \le \epsilon$ for all $i$.

  3. $|b_i^t - B_i(A^{\vec\sigma}(\vec b^t))| \le \epsilon$ for every player $i$ for whom prediction matters by at least $\epsilon$.

SLIDE 45

Theorem (again)

Theorem 1

Suppose that the players adopt hypotheses with finite memory, have $\sigma_i$-smoothed best response functions, employ powerful hypothesis tests with comparable amounts of data, and are flexible in the adoption of new hypotheses. Given any $\epsilon > 0$, if the $\sigma_i$ are small (given $\epsilon$), if the test tolerances $\tau_i$ are sufficiently fine (given $\epsilon$ and $\sigma_i$) and if the amounts of data collected, $s_i$, are sufficiently large (given $\epsilon$, $\sigma_i$ and $\tau$), then:

  1. The repeated-game strategies are $\epsilon$-equilibria of the repeated game $G^\infty(\vec u, X)$ at least $1 - \epsilon$ of the time.

  2. All players for whom prediction matters by at least $\epsilon$ are $\epsilon$-good predictors.

SLIDE 46

The End

Room for questions