SLIDE 1

Game Theory: Lecture #9

Outline:

  • Zero-sum games
  • Security strategies and values
  • Value
  • Minimax Theorem
SLIDE 2

Individual Optimization

  • Previous focus: Single decision-maker i

– A set of actions for the individual, denoted by $A_i$.
– A set of “other things that could happen in the world,” denoted by $A_{-i}$.
– This induces the set of states of the world $A = A_i \times A_{-i}$.
– The individual’s preferences over states are characterized by a function $U_i : A \to \mathbb{R}$.

  • Terminology:

– $U_i(\cdot)$ is referred to as the “payoff,” “utility,” or “reward” function.
– The individual $i$ is referred to as an “agent,” “player,” “decision-maker,” or “user.”

  • Player $i$ prefers state $a$ to state $a'$ if and only if

$U_i(a) > U_i(a')$. If $U_i(a) = U_i(a')$, player $i$ is “indifferent.”

  • Dominant question: What should the decision-maker do in such scenarios?
  • New focus: What if the “other things” are adversarial?

SLIDE 3

Zero-sum games

  • Setup: Two-player zero-sum games

– Set of players: $N = \{1, 2\}$
– Sets of actions: $A_1$ and $A_2$
– This induces the set of action profiles $A = A_1 \times A_2$
– For each player, preferences over action profiles are characterized by a function $U_i : A \to \mathbb{R}$
– Zero-sum constraint: for any action profile $a \in A$, $U_1(a) + U_2(a) = 0$

  • Matrix form is a convenient representation for two-player strategic games. In each cell, the first entry is Player 1’s payoff and the second entry is Player 2’s payoff.

  • Example: Matching pennies

        H        T
H     1, −1   −1, 1
T    −1, 1     1, −1

  • Example: Rock-paper-scissors

        R        P        S
R     0, 0    −1, 1     1, −1
P     1, −1    0, 0    −1, 1
S    −1, 1     1, −1    0, 0

SLIDE 4

Zero-sum games

  • Note: Zero-sum distinction is only relevant for two-player games. Why?
  • Convention:

– Represent the game by a single matrix of the row player’s payoffs
– View the row player as the “maximizer”
– View the column player as the “minimizer” (rather than a maximizer of negative values)

  • Example: Rock-paper-scissors

       R    P    S
R      0   −1    1
P      1    0   −1
S     −1    1    0

  • Player set: {row, col}
  • Action sets:

$A_{\mathrm{row}} = \{R, P, S\}$
$A_{\mathrm{col}} = \{R, P, S\}$

  • Action profiles:

A = {(R, R), (R, P), (R, S), ..., (S, R), (S, P), (S, S)}

  • Payoff functions:

$U_{\mathrm{row}}(R, R) = 0$ and $U_{\mathrm{col}}(R, R) = 0$
$U_{\mathrm{row}}(R, P) = -1$ and $U_{\mathrm{col}}(R, P) = 1$
...
$U_{\mathrm{row}}(S, S) = 0$ and $U_{\mathrm{col}}(S, S) = 0$

  • Interpretation: row tries to maximize cell number, col tries to minimize cell number
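
The single-matrix convention can be sketched in a few lines of Python. This is only an illustration: the names `ACTIONS`, `M`, `u_row`, and `u_col` are hypothetical, and the column player's payoff is derived from the zero-sum constraint rather than stored.

```python
# Rock-paper-scissors stored as one matrix of row-player payoffs.
ACTIONS = ["R", "P", "S"]
M = [
    [0, -1, 1],   # row plays R
    [1, 0, -1],   # row plays P
    [-1, 1, 0],   # row plays S
]


def u_row(a_row, a_col):
    """Row player's payoff: the matrix entry itself."""
    return M[ACTIONS.index(a_row)][ACTIONS.index(a_col)]


def u_col(a_row, a_col):
    """Column player's payoff, implied by the zero-sum constraint."""
    return -u_row(a_row, a_col)


# All nine action profiles, in the order (R, R), (R, P), ..., (S, S).
profiles = [(r, c) for r in ACTIONS for c in ACTIONS]

# The zero-sum constraint holds at every action profile.
zero_sum = all(u_row(r, c) + u_col(r, c) == 0 for r, c in profiles)
```

Storing only the row player's payoffs halves the bookkeeping and makes the "row maximizes, col minimizes" reading direct.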

SLIDE 5

Worst-case analysis

  • What is a reasonable prediction of behavior in zero-sum games?
  • Worst-case model: “worst for row is best for col”

– One such model that seems reasonable in zero-sum games
– Requires analyzing “what if” scenarios, e.g., if I play T, what would the opponent do?

  • Example:

       L    R
T      3    0
B      1    2

– If row plays T: worst-case outcome is 0
– If row plays B: worst-case outcome is 1
– row’s security strategy: B (worst case is col = L)
– Likewise, col’s security strategy is R (worst case is row = B)

  • Guaranteed levels:

– Let $\underline{v}$ denote the guaranteed payoff for row ($= 1$)
– Let $\overline{v}$ denote the maximum penalty for col ($= 2$)

  • Example:

       L    R
T      3    0
B      2    1

– row: B (worst case is col = R)
– col: R (worst case is row = B)
– $\underline{v} = \overline{v} = 1$

SLIDE 6

Maximin and minimax

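  • Question: How do $\underline{v}$ and $\overline{v}$ compare in general?

– Is it always the case that $\underline{v} \le \overline{v}$?
– When is it the case that $\underline{v} = \overline{v}$?

  • Fact: Computation of $\underline{v}$ and $\overline{v}$ is precisely the same as “maximin” and “minimax” computations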

  • Let $F : X \times Y \to \mathbb{R}$
  • Maximin:

$$\max_{x \in X} \min_{y \in Y} F(x, y)$$

– $x$ commits
– $y$ minimizes (as a function of $x$)

$$\max_{x \in X} F(x, y_{wc}(x)), \quad \text{where } F(x, y_{wc}(x)) = \min_{y \in Y} F(x, y)$$

  • Minimax: (Similar interpretation)

$$\min_{y \in Y} \max_{x \in X} F(x, y)$$

  • Claim: “Largest minimum is no larger than smallest maximum”

$$\max_{x \in X} \min_{y \in Y} F(x, y) \le \min_{y \in Y} \max_{x \in X} F(x, y)$$

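  • Proof: Suppose row $x_3$ attains the maximin and column $y_3$ attains the minimax, and let $\alpha = F(x_3, y_3)$ be the entry where they cross.

        y1                      y2    y3
x1      ·                       ·     min_y max_x F(x, y)
x2      ·                       ·     ·
x3      max_x min_y F(x, y)     ·     α

The maximin value is the smallest entry in row $x_3$, so it is at most $\alpha$; the minimax value is the largest entry in column $y_3$, so it is at least $\alpha$:

$$\max_{x \in X} \min_{y \in Y} F(x, y) \le \alpha \le \min_{y \in Y} \max_{x \in X} F(x, y)$$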
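
For finite sets, the two quantities in the claim can be computed by brute force. The sketch below is a minimal illustration, not part of the lecture: the helper names `maximin` and `minimax` are hypothetical, and the payoff table is the first worst-case example (rows T, B; columns L, R), with the top-right entry taken to be 0.

```python
def maximin(F, X, Y):
    """x commits first; y then picks the worst case for x."""
    return max(min(F(x, y) for y in Y) for x in X)


def minimax(F, X, Y):
    """y commits first; x then picks the best response to y."""
    return min(max(F(x, y) for x in X) for y in Y)


# Payoffs to the maximizer x (assumed example values).
table = {("T", "L"): 3, ("T", "R"): 0, ("B", "L"): 1, ("B", "R"): 2}


def F(x, y):
    return table[(x, y)]


X, Y = ["T", "B"], ["L", "R"]
lower = maximin(F, X, Y)   # largest row minimum
upper = minimax(F, X, Y)   # smallest column maximum
```

Here `lower` is 1 and `upper` is 2, so the inequality is strict for this table.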

SLIDE 7

Security strategies

  • Notation:

– Set of rows: $I$
– Set of columns: $J$
– Game matrix elements: $m_{ij}$

  • Define:

$$\underline{v} = \max_{i \in I} \min_{j \in J} m_{ij}$$

$$\overline{v} = \min_{j \in J} \max_{i \in I} m_{ij}$$

  • From prior result:

$$\underline{v} \le \overline{v}$$

  • Game has a value of $v^*$ if $\underline{v} = \overline{v} = v^*$
  • $i^*$ is a maximizing security strategy (or maximinimizer) if

$\underline{v} \le m_{i^* j}$ for all $j$, i.e., $i^*$ assures a payoff of at least $\underline{v}$

  • $j^*$ is a minimizing security strategy (or minimaximizer) if

$m_{i j^*} \le \overline{v}$ for all $i$, i.e., $j^*$ assures a penalty of at most $\overline{v}$
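
The definitions above translate directly into a search over rows and columns. The following is a rough sketch under the assumption that the game is given as a list of rows of row-player payoffs; `pure_security` is a hypothetical helper name, and the two matrices are the worst-case examples with the missing top-right entry taken to be 0.

```python
def pure_security(M):
    """Return (v_lower, i_star, v_upper, j_star) for pure strategies."""
    n_rows, n_cols = len(M), len(M[0])
    # Worst case for each row: the smallest entry in that row.
    row_worst = [min(M[i]) for i in range(n_rows)]
    # Worst case for each column: the largest entry in that column.
    col_worst = [max(M[i][j] for i in range(n_rows)) for j in range(n_cols)]
    v_lower = max(row_worst)
    v_upper = min(col_worst)
    # Indices of one maximizing and one minimizing security strategy.
    return v_lower, row_worst.index(v_lower), v_upper, col_worst.index(v_upper)


first = pure_security([[3, 0], [1, 2]])    # rows T, B; columns L, R
second = pure_security([[3, 0], [2, 1]])
has_pure_value = second[0] == second[2]
```

In the first game the security levels differ (1 versus 2), so there is no value in pure strategies; in the second they coincide at 1. Indices are zero-based, so index 1 means row B or column R.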

SLIDE 8

Mixed strategies

  • Recall previous example

       H    T
H      1   −1
T     −1    1

– Row player cannot assure a payoff greater than −1 with either H or T
– What if the row player randomizes 50/50? Then the row player can assure an expected payoff of at least 0

  • Recall previous example

       L    R
T      3    0
B      1    2

– Col player cannot assure a penalty of less than 2 with either L or R
– What if the col player randomizes (1/2, 1/2)? Then the col player can assure an expected penalty of at most 1.5

  • How do security levels change when using mixed strategies (i.e., probabilistic strategies) as opposed to pure strategies (i.e., non-probabilistic strategies)?
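
The two randomization claims above can be checked numerically. This is a minimal sketch with hypothetical helper names (`row_guarantee`, `col_guarantee`); it checks only the opponent's pure responses, which is enough because the expected payoff is linear in the opponent's mixed strategy. The second matrix assumes the top-right entry is 0.

```python
from fractions import Fraction


def row_guarantee(M, p):
    """Worst-case expected payoff when the row player mixes with p."""
    n_rows, n_cols = len(M), len(M[0])
    return min(sum(p[i] * M[i][j] for i in range(n_rows)) for j in range(n_cols))


def col_guarantee(M, q):
    """Worst-case expected penalty when the column player mixes with q."""
    n_rows, n_cols = len(M), len(M[0])
    return max(sum(M[i][j] * q[j] for j in range(n_cols)) for i in range(n_rows))


half = Fraction(1, 2)
pennies = [[1, -1], [-1, 1]]   # matching pennies, row-player payoffs
example = [[3, 0], [1, 2]]     # rows T, B; columns L, R

pennies_level = row_guarantee(pennies, [half, half])
example_level = col_guarantee(example, [half, half])
```

Exact fractions avoid floating-point noise: `pennies_level` is 0 and `example_level` is 3/2, matching the figures on the slide.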

SLIDE 9

(Mixed) Security strategies

  • Discussion parallels that of pure actions
  • Notation:

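– Mixed strategy of row player: $p \in \Delta$ (a probability distribution over the rows $I$)
– Mixed strategy of column player: $q \in \Delta$ (a probability distribution over the columns $J$)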

  • Claim: Expected payoff to (maximizing) row player is

$$p^T M q = \sum_{i \in I} \sum_{j \in J} m_{ij} p_i q_j$$

  • Define:

$$\underline{v} = \max_{p \in \Delta} \min_{q \in \Delta} p^T M q$$

$$\overline{v} = \min_{q \in \Delta} \max_{p \in \Delta} p^T M q$$

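  • As before,

$\underline{v} \le \overline{v}$, and the game has a value of $v^*$ if $\underline{v} = \overline{v} = v^*$

  • $p^*$ is a maximizing security strategy (or maximinimizer) if

$\underline{v} \le p^{*T} M q$ for all $q$

  • $q^*$ is a minimizing security strategy (or minimaximizer) if

$p^T M q^* \le \overline{v}$ for all $p$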
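
As a small illustration of the expected-payoff formula, the double sum can be written out directly. The function name `expected_payoff`, the matrix, and the strategies below are assumptions chosen for illustration (the matrix is the T/B versus L/R example with the top-right entry taken to be 0).

```python
from fractions import Fraction


def expected_payoff(p, M, q):
    """Compute p^T M q as the double sum of m_ij * p_i * q_j."""
    return sum(
        M[i][j] * p[i] * q[j]
        for i in range(len(p))
        for j in range(len(q))
    )


M = [[3, 0], [1, 2]]
p = [Fraction(1, 4), Fraction(3, 4)]   # row mixes over T, B
q = [Fraction(1, 2), Fraction(1, 2)]   # column mixes over L, R

payoff = expected_payoff(p, M, q)
# Pure strategies are the special case of a degenerate distribution.
payoff_vs_L = expected_payoff(p, M, [1, 0])
payoff_vs_R = expected_payoff(p, M, [0, 1])
```

With this particular `p`, the row player receives 3/2 against either pure column, and therefore against every mixture `q` as well.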

SLIDE 10

Value

  • Minimax theorem: With mixed strategies,

$$\underline{v} = \overline{v} = v^*$$

  • Proof: Judicious use of “separating hyperplane” theorem
  • Remarks:

– Every zero-sum matrix game has a value over mixed strategies
– Mixed strategies are a reasonable prediction of behavior in zero-sum games
– It is relatively easy to compute security strategies in zero-sum games

SLIDE 11

Example

  • What are the security strategies and value of the following game?

       L    R
T      3    0
B      1    2

  • Suppose ROW plays the strategy $(p, 1 - p)$, i.e., plays T with probability $p$

– ROW’s expected utility if COL plays L: $3p + 1(1 - p) = 2p + 1$
– ROW’s expected utility if COL plays R: $0 \cdot p + 2(1 - p) = 2 - 2p$

  • Plot and inspect:

[Plot: expected utility for ROW (vertical axis) versus p (horizontal axis), with one line for COL playing L and one line for COL playing R]

  • The $p$ that maximizes the minimum payoff is $p = 0.25$, where the two lines cross, with a security level of $\underline{v} = 1.5$.
  • Similar analysis for COL demonstrates that (1/2, 1/2) is a security strategy with a security level of $\overline{v} = 1.5$.
  • Hence, the game has a value $v^* = \underline{v} = \overline{v} = 1.5$.
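
The plot-and-inspect step amounts to finding where the two expected-utility lines cross. For a 2×2 game this can be done in closed form by equalizing the opponent's payoffs. The sketch below is not the lecture's method: `solve_2x2_mixed` is a hypothetical helper, and it assumes the game has no value in pure strategies (so both security strategies are strictly mixed and the denominator is nonzero).

```python
from fractions import Fraction


def solve_2x2_mixed(M):
    """Equalizing strategies for a 2x2 zero-sum game [[a, b], [c, d]].

    Assumes no pure-strategy value exists, so the crossing point lies
    strictly inside (0, 1) and the denominator below is nonzero.
    """
    (a, b), (c, d) = M
    denom = a - b - c + d
    p = Fraction(d - c, denom)               # probability of the top row
    q = Fraction(d - b, denom)               # probability of the left column
    value = Fraction(a * d - b * c, denom)   # common expected payoff
    return p, q, value


# Rows T, B; columns L, R (top-right entry taken to be 0).
p_star, q_star, v_star = solve_2x2_mixed([[3, 0], [1, 2]])
```

For this matrix the result is p = 1/4, q = 1/2, and a value of 3/2, which agrees with the figures read off the plot. Larger games are usually handled with linear programming instead.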
