CSCI 3210: Computational Game Theory

Linear Programming and 2-Player Zero-Sum Games

Mohammad T. Irfan

Email: mirfan@bowdoin.edu Web: www.bowdoin.edu/~mirfan Course: www.bowdoin.edu/~mirfan/CSCI-3210.html

Ref: Wikipedia: https://en.wikipedia.org/wiki/Linear_programming and [AGT] Ch 1

10/5/20

2-player zero-sum game

• Prove that NE exists in two ways

  1. Nash's theorem
     • Doesn't give an algorithm (why?)
  2. Linear programming
     • Gives an algorithm


Example: 2-player zero-sum game

• Penalty kick game (rows: Shooter, columns: Goalkeeper; each cell lists Shooter's payoff, then Goalkeeper's payoff)

                   Left (0.42)    Right (0.58)
  Left (0.38)      0.58, 0.42     0.95, 0.05
  Right (0.62)     0.93, 0.07     0.70, 0.30
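The numbers in parentheses appear consistent with the mixed strategies that make the opponent indifferent between Left and Right. A minimal sketch that checks this, assuming the first payoff in each cell is the Shooter's and that a fully mixed equilibrium exists (the function name `mixed_2x2` is made up for illustration):

```python
from fractions import Fraction as F

def mixed_2x2(row_pay, col_pay):
    """Fully mixed equilibrium of a 2x2 game via indifference conditions.

    Assumes an interior mixed equilibrium exists (nonzero denominators).
    Returns (p, q): probability that row plays its first action and
    probability that column plays its first action.
    """
    (a, b), (c, d) = row_pay   # row player's payoffs
    (e, f), (g, h) = col_pay   # column player's payoffs
    # Row's mix p makes column indifferent between its two actions.
    p = (h - g) / (e - f - g + h)
    # Column's mix q makes row indifferent between its two actions.
    q = (d - b) / (a - b - c + d)
    return p, q

shooter = [[F(58, 100), F(95, 100)],
           [F(93, 100), F(70, 100)]]
keeper = [[F(42, 100), F(5, 100)],
          [F(7, 100), F(30, 100)]]

p, q = mixed_2x2(shooter, keeper)
print(float(p), float(q))
```

Here p = 23/60 (about 0.38) is the Shooter's probability of Left and q = 5/12 (about 0.42) is the Goalkeeper's probability of Left.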


Example: 2-player zero-sum game

• Assumption (wlog): sum of payoffs in each cell is 0
• More than 2 actions?
  • Need an algorithm

Payoff table (each cell: Row player's payoff, Column player's payoff):

         L         R
  U     2, -2    -1, 1
  D    -3, 3      4, -4

Row player's payoffs only:

         L     R
  U      2    -1
  D     -3     4


Linear Programming (LP)

Will come back to game theory later


Applications

• Optimization
  • Production, machine scheduling, employee scheduling, supply chain management, etc.
• Game theory
• In general: optimization


LP

1. Variables (or decision variables)
   • We can choose the values of these variables
   • What's the goal?
   • What range of values can we choose? Integer vs real? Any other restrictions?
2. Objective function (What's the goal?)
   • Minimization or maximization
   • Must be linear in the variables
3. Constraints (What values?)
   • Restrict the values of the decision variables
   • Must be linear in the variables


Example 1: LP formulation & geometric interpretation

• Someone is planning his day-to-day life. Outside of 10 hours of sleep every day, he wants to set aside a few hours for studying and a few hours for connecting with friends.
• He gets 10 units/hr of payoff from studying and 20 units/hr of payoff from connecting with friends.
• He must study at least 6 hours every day. He also feels guilty if he spends more than 6 hours with friends.
• How should he allocate his time optimally?
  • Variables?
  • Objective function?
  • Constraints?


LP formulation

• Variables: x1 = hours of study per day, x2 = hours with friends per day
• Maximize 10 x1 + 20 x2
• Subject to
    x1 >= 6
    x2 <= 6
    x1 + x2 <= 14
    x1, x2 >= 0


[Figure: feasible region in the (x1, x2) plane; origin (0,0), with 6 and 14 marked on each axis.]

Note: x1, x2 >= 0: white region

Feasible region: one of the vertices (black dots) will give the optimal solution
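The vertex claim can be checked by brute force for the daily planner LP. This is a minimal sketch, not a general LP solver: it assumes two variables and a bounded, nonempty feasible region, and the helper names are made up for illustration.

```python
from fractions import Fraction as F
from itertools import combinations

# Each constraint is (a1, a2, b), meaning a1*x1 + a2*x2 <= b.
# ">=" constraints are negated to fit this form.
CONSTRAINTS = [
    (-1, 0, -6),   # x1 >= 6
    (0, 1, 6),     # x2 <= 6
    (1, 1, 14),    # x1 + x2 <= 14
    (-1, 0, 0),    # x1 >= 0
    (0, -1, 0),    # x2 >= 0
]
OBJECTIVE = (10, 20)   # maximize 10 x1 + 20 x2

def vertices(constraints):
    """Feasible intersection points of pairs of constraint boundaries."""
    found = set()
    for (a1, a2, b), (c1, c2, d) in combinations(constraints, 2):
        det = a1 * c2 - a2 * c1
        if det == 0:
            continue  # parallel boundaries do not meet in a single point
        # Cramer's rule for the 2x2 system of boundary equations.
        x1 = F(b * c2 - a2 * d, det)
        x2 = F(a1 * d - b * c1, det)
        if all(e1 * x1 + e2 * x2 <= f for e1, e2, f in constraints):
            found.add((x1, x2))
    return found

def best_vertex(constraints, objective):
    """Return the vertex with the largest objective value, and that value."""
    def value(pt):
        return objective[0] * pt[0] + objective[1] * pt[1]
    best = max(vertices(constraints), key=value)
    return best, value(best)

point, value = best_vertex(CONSTRAINTS, OBJECTIVE)
print(point, value)
```

The feasible vertices are (6,0), (6,6), (8,6), and (14,0); the best one is (8,6) with payoff 200.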


Example 2: infeasible LP

• Want to sleep 10 hours/day, study at least 10 hours/day, and do other activities for at least 5 hours/day. How to allocate time?
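One possible formulation, as a sketch: let $x_1$ be study hours and $x_2$ be hours of other activities (these names are assumptions), with 14 hours left after sleep. The constraints contradict each other, so the feasible region is empty.

```latex
\begin{aligned}
& x_1 \ge 10, \qquad x_2 \ge 5, \qquad x_1 + x_2 \le 14, \qquad x_1, x_2 \ge 0 \\
& x_1 \ge 10 \ \text{and}\ x_2 \ge 5 \;\Longrightarrow\; x_1 + x_2 \ge 15 > 14
\end{aligned}
```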


Example 3: more var. & constraints

• Gets 15 units/hr of payoff for studying up to 3 hours and 10 units/hr of payoff after 3 hours of studying (basically, brain slows down). Also gets 20 units/hr of payoff from connecting with friends.
• Sleep 10 hours/day
• Wants at least 6 hours of study/day
• Wants at most 6 hours of time with friends/day
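One possible formulation, as a sketch: split study time into $x_1$ (hours within the first 3, at 15 units/hr) and $x_2$ (hours beyond the first 3, at 10 units/hr), and let $x_3$ be hours with friends. The variable names are assumptions. Because 15 > 10, an optimal solution uses up $x_1$ before $x_2$, so no extra ordering constraint is needed.

```latex
\begin{aligned}
\text{Maximize}\quad & 15 x_1 + 10 x_2 + 20 x_3 \\
\text{subject to}\quad & x_1 \le 3 \\
& x_1 + x_2 \ge 6 \\
& x_3 \le 6 \\
& x_1 + x_2 + x_3 \le 14 \\
& x_1, x_2, x_3 \ge 0
\end{aligned}
```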


Example 4: unbounded LP

• A tennis player is making a plan for practicing service and volley. She gets a payoff of 10 from every service and 5 from every volley.
• She wants to practice service at least 100 times a day and doesn't want to practice volleys more than 500 times a day. What's her optimal plan?
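One possible formulation, as a sketch: let $s$ be the number of services and $v$ the number of volleys (names are assumptions). Nothing caps $s$ from above, so the objective can be made arbitrarily large.

```latex
\begin{aligned}
\text{Maximize}\quad & 10 s + 5 v \\
\text{subject to}\quad & s \ge 100, \qquad v \le 500, \qquad s, v \ge 0 \\
& (s, v) = (t, 0) \ \text{is feasible for every}\ t \ge 100,\ \text{with objective}\ 10t \to \infty
\end{aligned}
```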


Matrix algebra

• Images from this tutorial: http://www.intmath.com/matrices-determinants/3-matrices.php
• 4x1 matrix (AKA vector)
• 3x3 matrix


Matrix multiplication

• 2x3 matrix multiplied by 3x2 matrix
• Result is a 2x2 matrix
• The inner dimensions (3 and 3) must match


Transpose of matrix

• Transpose operator: superscript T
• $(AB)^T = B^T A^T$

$$A = \begin{bmatrix} 1 & 2 \\ 3 & 4 \\ 5 & 6 \end{bmatrix}, \qquad A^T = \begin{bmatrix} 1 & 3 & 5 \\ 2 & 4 & 6 \end{bmatrix}$$
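A small sketch of matrix product and transpose using plain lists of rows. The matrices below are made-up examples, and `matmul` and `transpose` are illustrative helpers rather than a library API.

```python
def matmul(A, B):
    """Multiply matrices given as lists of rows."""
    if len(A[0]) != len(B):
        raise ValueError("inner dimensions must match")
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))]
            for i in range(len(A))]

def transpose(M):
    """Swap rows and columns."""
    return [list(col) for col in zip(*M)]

A = [[1, 2],
     [3, 4],
     [5, 6]]      # 3x2
B = [[1, 0],
     [2, -1]]     # 2x2

lhs = transpose(matmul(A, B))             # (AB)^T, a 2x3 matrix
rhs = matmul(transpose(B), transpose(A))  # B^T A^T, also 2x3
print(lhs, rhs)
```

Both sides come out as the same 2x3 matrix, while `matmul(A, A)` raises an error because the inner dimensions (2 and 3) do not match.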


Solving LP

• Example 1

  Max 10 x1 + 20 x2
  s.t. x1 >= 6
       x2 <= 6
       x1 + x2 <= 14
       x1, x2 >= 0


[Figure: feasible region in the (x1, x2) plane; origin (0,0), with 6 and 14 marked on each axis.]

Note: x1, x2 >= 0: white region

Feasible region: one of the vertices (black dots) will give the optimal solution


Algorithms for solving LP

• Simplex (Dantzig, 1947)
  • Worst case exponential time
  • Practically fast
• Ellipsoid (Khachiyan, 1979)
  • O(n^4 L) for n variables and L input bits
  • Weakly polynomial: the running time depends on the number of input bits L
• Karmarkar's algorithm (Karmarkar, 1984)
  • O(n^3.5 L) for n variables and L input bits
  • Also weakly polynomial, but a breakthrough for practical reasons
• Open problem: strongly polynomial algorithm?


LP Duality (von Neumann, 1947)

• Interview with Dantzig
  • http://www.personal.psu.edu/ecb5/Courses/M475W/WeeklyReadings/Week%2015/An_Interview_with_George_Dantzig.pdf
• If the "primal" LP is maximization, its "dual" is minimization and vice versa.
• Every variable of the primal LP leads to a constraint in the dual LP and every constraint of the primal LP leads to a variable in the dual LP.
• Dual of dual is primal.


Definition of dual LP

Source: Applied Mathematical Programming book

[Figure: definition of the dual LP, with an example]


Definition of dual LP

Source: Applied Mathematical Programming book

Primal:

$$\text{Maximize } c^T x \quad \text{subject to: } Ax \le b,\ x \ge 0$$

Dual:

$$\text{Minimize } b^T y \quad \text{subject to: } A^T y \ge c,\ y \ge 0$$
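In this standard form, building the dual is purely mechanical: swap the roles of b and c and transpose A. A minimal sketch with made-up data (the function name `dual_of` and the numbers are assumptions for illustration):

```python
def transpose(M):
    """Swap rows and columns of a matrix given as a list of rows."""
    return [list(col) for col in zip(*M)]

def dual_of(c, A, b):
    """Dual data for: maximize c^T x subject to Ax <= b, x >= 0.

    Returns (b, A^T, c), read as:
    minimize b^T y subject to A^T y >= c, y >= 0.
    """
    return list(b), transpose(A), list(c)

# Made-up primal with 2 variables and 3 constraints.
c = [3, 2]
A = [[1, 1],
     [2, 1],
     [1, 0]]
b = [4, 5, 2]

obj, rows, rhs = dual_of(c, A, b)
print("minimize", obj, "subject to", rows, ">=", rhs)
```

The dual has 3 variables (one per primal constraint) and 2 constraints (one per primal variable). Applying the same swap to the dual data returns the original (c, A, b).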


Example 5: LP duality

• How many Bowdoin logs and chocolate cakes should Thorne make to maximize its revenue?
• Objective function. Each log has a satisfaction of 10 (or price of $10), each cake 5.
• Constraints. For both desserts, the chef needs to use an oven, a food processor, and a boiler.

  Equipment        Processing time/log   Processing time/cake   Total available time
  Oven             5 min                 1 min                  85 min
  Food processor   1 min                 10 min                 300 min
  Boiler           4 min                 6 min                  120 min

Derive primal and dual LP


Example 5 (continued)

• Revenue: $10/log, $5/cake

Primal LP:

Dual LP:

  Equipment        Processing time/log   Processing time/cake   Total available time
  Oven             5 min                 1 min                  85 min
  Food processor   1 min                 10 min                 300 min
  Boiler           4 min                 6 min                  120 min
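One way to write the two LPs from the table, as a sketch: let $x_1$ be the number of logs and $x_2$ the number of cakes, and let $y_1, y_2, y_3$ be the dual variables for the oven, food processor, and boiler constraints. The variable names are assumptions.

```latex
\begin{aligned}
\textbf{Primal:}\quad \text{Maximize}\quad & 10 x_1 + 5 x_2 \\
\text{subject to}\quad & 5 x_1 + x_2 \le 85 && \text{(oven)} \\
& x_1 + 10 x_2 \le 300 && \text{(food processor)} \\
& 4 x_1 + 6 x_2 \le 120 && \text{(boiler)} \\
& x_1, x_2 \ge 0 \\[6pt]
\textbf{Dual:}\quad \text{Minimize}\quad & 85 y_1 + 300 y_2 + 120 y_3 \\
\text{subject to}\quad & 5 y_1 + y_2 + 4 y_3 \ge 10 && \text{(log)} \\
& y_1 + 10 y_2 + 6 y_3 \ge 5 && \text{(cake)} \\
& y_1, y_2, y_3 \ge 0
\end{aligned}
```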


Dual: intuition

• Moulton wants to borrow Thorne's equipment for a day for a special event.
• Moulton will pay Thorne $y1/min, $y2/min, and $y3/min for the 3 pieces of equipment, resp., such that:
  1. (Dual objective) Moulton minimizes the total cost of renting
  2. (Dual constraints) Moulton will make sure that Thorne recoups the lost payoff for each piece of dessert through rental income


Daily planner (Example 1)

Primal LP

  Maximize 10 x1 + 20 x2
  Subject to
    x1 >= 6 (or, -x1 <= -6)
    x2 <= 6
    x1 + x2 <= 14
    x1, x2 >= 0

Dual LP? Work out the solutions by hand. What's the dual interpretation?
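For checking hand work, here is one way the dual can be written, using the rewritten constraint $-x_1 \le -6$ and dual variables $y_1, y_2, y_3$ for the three primal constraints (names are assumptions). The candidate solutions below are both feasible and have equal objective values, so by weak duality both are optimal.

```latex
\begin{aligned}
\text{Minimize}\quad & -6 y_1 + 6 y_2 + 14 y_3 \\
\text{subject to}\quad & -y_1 + y_3 \ge 10 \\
& y_2 + y_3 \ge 20 \\
& y_1, y_2, y_3 \ge 0 \\[6pt]
\text{Primal candidate: } & (x_1, x_2) = (8, 6), \quad 10 \cdot 8 + 20 \cdot 6 = 200 \\
\text{Dual candidate: } & (y_1, y_2, y_3) = (0, 10, 10), \quad -6 \cdot 0 + 6 \cdot 10 + 14 \cdot 10 = 200
\end{aligned}
```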


Weak duality theorem

[Figure: Primal LP (Maximize ...) and Dual LP (Minimize ...) placed along an axis of increasing objective function value. Gap?]


Weak duality theorem

• Any feasible solution of the dual LP (minimization) gives an upper bound on the optimal solution of the primal LP (maximization). [That's how we defined dual!]
  • Proof (next slide)
• Any feasible solution of the primal LP (maximization) is a lower bound on the optimal solution of the dual LP (minimization).

[Figure: Primal LP (max) and Dual LP (min) placed along an axis of increasing objective function value. Gap?]


Proof: weak duality theorem

Primal:

$$\text{Maximize } c^T x \quad \text{subject to: } Ax \le b,\ x \ge 0$$

Dual:

$$\text{Minimize } b^T y \quad \text{subject to: } A^T y \ge c,\ y \ge 0$$
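A sketch of the standard argument, for any primal-feasible $x$ and dual-feasible $y$. The first inequality uses $A^T y \ge c$ together with $x \ge 0$; the second uses $Ax \le b$ together with $y \ge 0$.

```latex
c^T x \;\le\; (A^T y)^T x \;=\; y^T A x \;\le\; y^T b \;=\; b^T y
```

So every primal objective value is at most every dual objective value.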


Implications: weak duality theorem

• What will happen if primal (or dual) is unbounded?
  • Primal unbounded ⇒ Dual infeasible
  • Dual unbounded ⇒ Primal infeasible
• Both primal and dual may be infeasible (although not implied by this theorem)

[Figure: Primal LP (max) and Dual LP (min) placed along an axis of increasing objective function value. Gap?]


Strong duality theorem

• If the primal LP has a finite optimal solution, then so does the dual LP. Moreover, these two optimal solutions have the same objective function value.
• In other words, if either the primal or the dual LP has a finite optimal solution, the gap between them is 0.


Complementary slackness

• In case the strong duality theorem holds:
  • primal constraint non-binding (not equal) => corresponding dual variable = 0 at OPT
  • Similar condition holds for dual constr. & primal var.
• The reverse implication may not hold!
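A numerical check of complementary slackness on the daily planner LP (Example 1). This is a sketch: the primal point x = (8, 6) and dual point y = (0, 10, 10) are assumed candidate optima, and the helper `dot` is made up for illustration.

```python
# Daily planner in the form: maximize c^T x subject to Ax <= b, x >= 0.
c = [10, 20]
A = [[-1, 0],   # -x1 <= -6  (i.e., x1 >= 6)
     [0, 1],    #  x2 <= 6
     [1, 1]]    #  x1 + x2 <= 14
b = [-6, 6, 14]

x = [8, 6]        # assumed primal optimum
y = [0, 10, 10]   # assumed dual optimum

def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

# Slack of each primal constraint: b_i - (Ax)_i.
primal_slack = [bi - dot(row, x) for row, bi in zip(A, b)]
# Slack of each dual constraint: (A^T y)_j - c_j.
dual_slack = [dot(col, y) - cj for col, cj in zip(zip(*A), c)]

print(primal_slack, dual_slack, dot(c, x), dot(b, y))
```

Only the first primal constraint is non-binding (slack 2), and its dual variable y1 is 0. Both objective values equal 200.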


2-player zero-sum game

Algorithm via LP duality


Example 6: 2-player zero-sum game

• Assumption (wlog): sum of payoffs in each cell is 0

Payoff table (each cell: Row player's payoff, Column player's payoff):

         L         R
  U     2, -2    -1, 1
  D    -3, 3      4, -4

Matrix A (Row player's payoffs):

         L     R
  U      2    -1
  D     -3     4

Example: (U,L): row gains 2 and col. loses 2


Row player

• How much gain can row player guarantee?
  • Call it $v_r$
  • Wants largest $v_r$ possible
• Row: choose mixed strategy p (vector of prob.) to maximize $v_r$
• Expected gain of row when col. plays j (or expected loss of col. for playing j) = $\sum_i p_i A_{i,j} = (p^T A)_j$
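The expected-gain vector can be computed directly for the 2x2 matrix A of Example 6. A small sketch; the two mixed strategies below are arbitrary choices for illustration, and `expected_gains` is a made-up helper name.

```python
from fractions import Fraction as F

A = [[2, -1],
     [-3, 4]]   # row player's payoffs from Example 6

def expected_gains(p, A):
    """Return the vector p^T A: row's expected gain against each column j."""
    return [sum(p[i] * A[i][j] for i in range(len(A)))
            for j in range(len(A[0]))]

uniform = [F(1, 2), F(1, 2)]     # arbitrary mixed strategy
skewed = [F(7, 10), F(3, 10)]    # another arbitrary mixed strategy

gains_uniform = expected_gains(uniform, A)
gains_skewed = expected_gains(skewed, A)
print(gains_uniform, min(gains_uniform))
print(gains_skewed, min(gains_skewed))
```

The minimum entry is what the row player can guarantee with that strategy: -1/2 for the uniform mix and 1/2 for the (0.7, 0.3) mix.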


Row player's LP

$$
\begin{aligned}
v_r = \max\ & v \\
\text{subject to}\ & \sum_i p_i A_{i,j} \ge v, && \text{for each action } j \text{ of column player} \\
& \sum_i p_i = 1 \\
& p_i \ge 0, && \text{for each action } i \text{ of row player}
\end{aligned}
$$

Row player's thought process: maximize my guaranteed gain v knowing that column player will minimize his loss (in other words, col. player will make sure v <= col. player's loss for any of his actions j).
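For a row player with exactly two actions, this LP can be solved without an LP library. A minimal sketch, assuming p = (t, 1 - t): the guaranteed gain is a concave piecewise-linear function of t, so it is enough to test t = 0, t = 1, and the points where two column lines cross. The function name `solve_row_2xn` is made up for illustration.

```python
from fractions import Fraction as F
from itertools import combinations

def solve_row_2xn(A):
    """Maximin mixed strategy for a row player with exactly 2 actions.

    Returns (v_r, p), where p = (t, 1 - t) maximizes min_j (p^T A)_j.
    """
    top, bottom = A
    n = len(top)

    def guaranteed(t):
        return min(t * top[j] + (1 - t) * bottom[j] for j in range(n))

    candidates = {F(0), F(1)}
    for j, k in combinations(range(n), 2):
        # Solve t*top[j] + (1-t)*bottom[j] = t*top[k] + (1-t)*bottom[k].
        denom = (top[j] - bottom[j]) - (top[k] - bottom[k])
        if denom != 0:
            t = F(bottom[k] - bottom[j], denom)
            if 0 <= t <= 1:
                candidates.add(t)
    t_best = max(candidates, key=guaranteed)
    return guaranteed(t_best), (t_best, 1 - t_best)

A = [[2, -1],
     [-3, 4]]   # row player's payoffs from Example 6
v_r, p = solve_row_2xn(A)
print(v_r, p)
```

For Example 6 this gives p = (7/10, 3/10) with guaranteed gain 1/2.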


Column player

• How little ($v_c$) can col. player pay to row?
• Choose mixed strategy q (vector of probabilities) to minimize $v_c$
• Expected gain of row player for playing i (or exp. loss of col. player when row plays i) = $(Aq)_i = \sum_j A_{i,j} q_j$


Column player's LP

$$
\begin{aligned}
v_c = \min\ & u \\
\text{subject to}\ & \sum_j A_{i,j} q_j \le u, && \text{for each action } i \text{ of row player} \\
& \sum_j q_j = 1 \\
& q_j \ge 0, && \text{for each action } j \text{ of column player}
\end{aligned}
$$

Col. player's thought process: minimize my loss (or row's gain) u knowing that row player will choose to maximize his gain (in other words, u >= row player's gain for playing any action i).

Minimax Theorem

• At an equilibrium, $v_r = v_c$
• Proof:
  1. The two LPs are duals of each other.
  2. Primal LP has a finite optimal solution (it's feasible + bounded).
  3. By the strong duality theorem, $v_r = v_c$.
• Another proof:
  1. Let $v^*$ be row player's payoff at a NE.
  2. $v^* \ge v_r$, because $v_r$ is row player's guaranteed payoff and $v^*$ cannot be lower than that.
  3. By assumption of NE, column player will not give row player more than $v_r$, so $v^* \le v_r$. So, $v_r = v^*$. Similarly, $v_c = v^*$. Therefore, $v_r = v_c$.
• This quantity $v_c$ or $v_r$ is known as the value of the game ($v^*$)
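A quick numerical illustration on Example 6. This is a grid search, not an LP solver; it gives exact answers here only because the optimal probabilities (0.7 for the row player, 0.5 for the column player) happen to lie on the grid. Function names are made up for illustration.

```python
from fractions import Fraction as F

A = [[2, -1],
     [-3, 4]]   # row player's payoffs from Example 6

grid = [F(k, 100) for k in range(101)]   # probabilities 0, 0.01, ..., 1

def row_guarantee(t):
    """Worst-case expected gain of row when playing (t, 1 - t)."""
    return min(t * A[0][j] + (1 - t) * A[1][j] for j in range(2))

def col_guarantee(s):
    """Worst-case expected loss of column when playing (s, 1 - s)."""
    return max(s * A[i][0] + (1 - s) * A[i][1] for i in range(2))

v_r = max(row_guarantee(t) for t in grid)
v_c = min(col_guarantee(s) for s in grid)
print(v_r, v_c)
```

Both quantities come out to 1/2, the value of this game.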
