CMU 15-896 Noncooperative games 4: Stackelberg games



SLIDE 1

CMU 15-896

Noncooperative games 4: Stackelberg games

Teacher: Ariel Procaccia

SLIDE 2

15896 Spring 2016: Lecture 20

A curious game

  • Playing up is a dominant strategy for row player
  • So column player would play left
  • Therefore, (up, left), with payoffs 1,1, is the only Nash equilibrium outcome

             Left   Right
    Up       1,1    3,0
    Down     0,0    2,1
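
The claims on this slide can be checked by brute force. The sketch below is only an illustration: the names `ROWS`, `COLS`, `PAYOFFS` and both helper functions are made up here, and only pure strategies are enumerated. Because up is strictly dominant, no mixed equilibrium can use down either, so the pure check is enough for this game.

```python
# Payoffs from the slide: (row player's payoff, column player's payoff).
ROWS = ["up", "down"]
COLS = ["left", "right"]
PAYOFFS = {
    ("up", "left"): (1, 1), ("up", "right"): (3, 0),
    ("down", "left"): (0, 0), ("down", "right"): (2, 1),
}

def is_strictly_dominant_row(r):
    """True if row r beats every other row against every column."""
    return all(PAYOFFS[(r, c)][0] > PAYOFFS[(r2, c)][0]
               for c in COLS for r2 in ROWS if r2 != r)

def pure_nash_equilibria():
    """Return all cells where neither player gains by deviating alone."""
    equilibria = []
    for r in ROWS:
        for c in COLS:
            row_ok = all(PAYOFFS[(r, c)][0] >= PAYOFFS[(r2, c)][0] for r2 in ROWS)
            col_ok = all(PAYOFFS[(r, c)][1] >= PAYOFFS[(r, c2)][1] for c2 in COLS)
            if row_ok and col_ok:
                equilibria.append((r, c))
    return equilibria

print(is_strictly_dominant_row("up"))  # True
print(pure_nash_equilibria())          # [('up', 'left')]
```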

SLIDE 3

Commitment is good

  • Suppose the game is played as follows:
      • Row player commits to playing a row
      • Column player observes the commitment and chooses column
  • Row player can commit to playing down! Column player then prefers right, so row player gets 2 instead of 1

             Left   Right
    Up       1,1    3,0
    Down     0,0    2,1
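
A minimal sketch of commitment to a pure strategy, using the payoffs from the slide. The function names are hypothetical; the column player is assumed to pick the column that maximizes their own payoff against the committed row.

```python
# Payoffs from the slide: (row player's payoff, column player's payoff).
ROWS = ["up", "down"]
COLS = ["left", "right"]
PAYOFFS = {
    ("up", "left"): (1, 1), ("up", "right"): (3, 0),
    ("down", "left"): (0, 0), ("down", "right"): (2, 1),
}

def follower_best_response(row):
    """Column that maximizes the column player's payoff against a committed row."""
    return max(COLS, key=lambda c: PAYOFFS[(row, c)][1])

def leader_value(row):
    """Row player's payoff after the column player best responds."""
    return PAYOFFS[(row, follower_best_response(row))][0]

for row in ROWS:
    print(row, follower_best_response(row), leader_value(row))
# up left 1
# down right 2

best_commitment = max(ROWS, key=leader_value)
print(best_commitment)  # down
```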

SLIDE 4

Commitment to mixed strategy

  • By committing to a mixed strategy, row player can guarantee a reward of almost 2.5 (2.49 with the probabilities shown)
  • Called a Stackelberg (mixed) strategy

                  Left   Right
    Up   (.49)    1,1    3,0
    Down (.51)    0,0    2,1
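
The arithmetic behind the 2.49 can be sketched as follows. The helper names are made up for this illustration, and the tie-breaking rule at probability 1/2 (the column player picks right, which the row player prefers) is an assumption, not something stated on the slide.

```python
from fractions import Fraction

# Payoffs from the slide: (row player's payoff, column player's payoff).
PAYOFFS = {
    ("up", "left"): (1, 1), ("up", "right"): (3, 0),
    ("down", "left"): (0, 0), ("down", "right"): (2, 1),
}

def expected(p_up, col, player):
    """Expected payoff of `player` (0 = row, 1 = column) when up has probability p_up."""
    return p_up * PAYOFFS[("up", col)][player] + (1 - p_up) * PAYOFFS[("down", col)][player]

def follower_choice(p_up):
    """Column player's best response; ties go to right (assumed tie-breaking)."""
    return "right" if expected(p_up, "right", 1) >= expected(p_up, "left", 1) else "left"

def leader_payoff(p_up):
    """Row player's expected payoff once the column player has responded."""
    return expected(p_up, follower_choice(p_up), 0)

print(leader_payoff(Fraction(49, 100)))  # 249/100
print(leader_payoff(Fraction(1, 2)))     # 5/2, relies on the tie-breaking assumption
print(leader_payoff(Fraction(51, 100)))  # 51/100, the column player switches to left
```

Playing up with probability just below 1/2 keeps the column player strictly on right, which is why the slide uses .49 rather than .5.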

SLIDE 5

Computing Stackelberg

  • Theorem [Conitzer and Sandholm 2006]: In 2-player normal form games, an optimal Stackelberg strategy can be found in poly time
  • Theorem [ditto]: the problem is NP-hard when the number of players is ≥ 3

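SLIDE 6

Tractability: 2 players

  • For each pure follower strategy $t$, we compute via the LP below a strategy $x$ for the leader such that
      • Playing $t$ is a best response for the follower
      • Under this constraint, $x$ is optimal
  • Choose $x^*$ that maximizes leader value

$$
\begin{aligned}
\max\ & \sum_{s \in S} x_s\, u_l(s,t) \\
\text{s.t.}\ & \forall t' \in T:\ \sum_{s \in S} x_s\, u_f(s,t) \ge \sum_{s \in S} x_s\, u_f(s,t') \\
& \sum_{s \in S} x_s = 1 \\
& \forall s \in S:\ x_s \in [0,1]
\end{aligned}
$$

Here $S$ and $T$ are the pure strategies of the leader and the follower, and $u_l$, $u_f$ are their utilities.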
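
The one-LP-per-follower-strategy idea can be sketched without an LP solver in the special case where the leader has exactly two pure strategies, because each LP then has a single variable p (the probability of the first row). This is a simplification chosen for the illustration, not the general method; the function names and the matrix layout `u[s][t]` are assumptions of this sketch.

```python
from fractions import Fraction

def solve_for_follower_action(u_l, u_f, t):
    """One LP for a fixed follower action t, for a leader with exactly two rows.

    x = (p, 1 - p). Each best-response constraint is linear in p, so the
    feasible set is an interval [lo, hi] and the optimum is at an endpoint.
    Returns (leader value, p), or None if t can never be a best response.
    """
    lo, hi = Fraction(0), Fraction(1)
    for t2 in range(len(u_f[0])):
        a = Fraction(u_f[0][t] - u_f[0][t2])  # follower's gain from t over t2 vs row 0
        b = Fraction(u_f[1][t] - u_f[1][t2])  # follower's gain from t over t2 vs row 1
        slope = a - b                         # constraint: b + p * slope >= 0
        if slope > 0:
            lo = max(lo, -b / slope)
        elif slope < 0:
            hi = min(hi, -b / slope)
        elif b < 0:
            return None
    if lo > hi:
        return None
    value = lambda p: p * u_l[0][t] + (1 - p) * u_l[1][t]
    p = max((lo, hi), key=value)
    return value(p), p

def optimal_stackelberg(u_l, u_f):
    """Solve one LP per follower action and keep the best: (value, p, t)."""
    best = None
    for t in range(len(u_f[0])):
        result = solve_for_follower_action(u_l, u_f, t)
        if result is not None and (best is None or result[0] > best[0]):
            best = (result[0], result[1], t)
    return best

# The game from the earlier slides: rows up/down, columns left/right.
u_l = [[1, 3], [0, 2]]
u_f = [[1, 0], [0, 1]]
print(optimal_stackelberg(u_l, u_f))  # (Fraction(5, 2), Fraction(1, 2), 1)
```

The weak inequality in the best-response constraint means the follower is taken to break ties in the leader's favor, which is why the value 5/2 is attained exactly at p = 1/2 here.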

SLIDE 7

Application: security

  • Airport security: deployed at LAX
  • Federal Air Marshals
  • Coast Guard
  • Idea:
      • Defender commits to mixed strategy
      • Attacker observes and best responds

SLIDE 8

Security games

  • Set of targets $T$
  • Set of $m$ security resources $\Omega$ available to the defender (leader)
  • Set of schedules $\Sigma \subseteq 2^T$
  • Resource $\omega$ can be assigned to one of the schedules in $A(\omega) \subseteq \Sigma$
  • Attacker chooses one target to attack

[Figure: resources and targets]

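SLIDE 9

Security games

  • For each target $t$, there are four numbers: $u_d^+(t) \ge u_d^-(t)$, and $u_a^+(t) \le u_a^-(t)$
  • Let $c = (c_1, \dots, c_n)$ be the vector of coverage probabilities
  • The utilities to the defender/attacker under $c$ if target $t$ is attacked are

$$U_d(t,c) = u_d^+(t)\, c_t + u_d^-(t)\,(1 - c_t)$$
$$U_a(t,c) = u_a^+(t)\, c_t + u_a^-(t)\,(1 - c_t)$$

[Figure: resources and targets]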
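
A small sketch of how coverage probabilities turn into utilities, assuming the usual convention that the "+" payoff applies when the attacked target is covered and the "−" payoff when it is not. All numbers and names below are hypothetical and chosen only so the arithmetic is easy to follow.

```python
from fractions import Fraction

# Hypothetical payoffs for two targets (index 0 and 1).
U_D_COVERED = [1, 1]      # defender's payoff if the attacked target is covered
U_D_UNCOVERED = [-5, -2]  # defender's payoff if it is not covered
U_A_COVERED = [-1, -1]    # attacker's payoff if the attacked target is covered
U_A_UNCOVERED = [5, 2]    # attacker's payoff if it is not covered

def defender_utility(t, c):
    """Expected defender utility when target t is attacked under coverage c."""
    return c[t] * U_D_COVERED[t] + (1 - c[t]) * U_D_UNCOVERED[t]

def attacker_utility(t, c):
    """Expected attacker utility when target t is attacked under coverage c."""
    return c[t] * U_A_COVERED[t] + (1 - c[t]) * U_A_UNCOVERED[t]

def attacked_target(c):
    """Target the attacker picks after observing the coverage vector c."""
    return max(range(len(c)), key=lambda t: attacker_utility(t, c))

coverage = [Fraction(3, 4), Fraction(1, 4)]
t_star = attacked_target(coverage)
print(attacker_utility(0, coverage), attacker_utility(1, coverage))  # 1/2 5/4
print(t_star, defender_utility(t_star, coverage))                    # 1 -5/4
```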

SLIDE 10

This is a 2-player Stackelberg game. Can we compute an optimal strategy for the defender in polynomial time?

SLIDE 11

Solving security games

  • Consider the case of $\Sigma = T$, i.e., resources are assigned to individual targets, i.e., schedules have size 1
  • Nevertheless, number of leader strategies is exponential
  • Theorem [Korzhyk et al. 2010]: Optimal leader strategy can be computed in poly time

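SLIDE 12

A compact LP

  • LP formulation similar to previous one
  • Advantage: logarithmic in #leader strategies
  • Problem: do probabilities correspond to strategy?

$$
\begin{aligned}
\max\ & U_d(t^*, c) \\
\text{s.t.}\ & \forall \omega \in \Omega,\ \forall t \in A(\omega):\ 0 \le c_{\omega,t} \le 1 \\
& \forall t \in T:\ c_t = \sum_{\omega \in \Omega:\, t \in A(\omega)} c_{\omega,t} \le 1 \\
& \forall \omega \in \Omega:\ \sum_{t \in A(\omega)} c_{\omega,t} \le 1 \\
& \forall t \in T:\ U_a(t,c) \le U_a(t^*,c)
\end{aligned}
$$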

SLIDE 13

[Figure: example probabilities (0.7, 0.2, 0.1 and 0.3, 0.7) shown alongside assignments with entries 1; the layout is not recoverable.]

SLIDE 14

Fixing the probabilities

  • Theorem [Birkhoff-von Neumann]: Consider an $m \times n$ matrix $M$ with real numbers $a_{ij} \in [0,1]$, such that for each $i$, $\sum_j a_{ij} \le 1$, and for each $j$, $\sum_i a_{ij} \le 1$ ($M$ is kinda doubly stochastic). Then there exist matrices $M^1, \dots, M^q$ and weights $w^1, \dots, w^q$ such that:
      1. $\sum_k w^k = 1$
      2. $\sum_k w^k M^k = M$
      3. For each $k$, $M^k$ is kinda doubly stochastic and its elements are in $\{0,1\}$
  • The probabilities $c_{\omega,t}$ satisfy theorem's conditions
  • By 3, each $M^k$ is a deterministic strategy
  • By 1, we get a mixed strategy
  • By 2, gives right probs
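
The three conditions of the theorem can be checked mechanically for a given decomposition. The sketch below does not compute a decomposition; it only verifies one that was worked out by hand for a hypothetical 2 × 3 matrix (two resources, three targets). The matrix, the weights and the function names are all assumptions of this example.

```python
from fractions import Fraction as F

# Hypothetical marginals c[resource][target]; rows and columns each sum to at most 1.
M = [[F(7, 10), F(2, 10), F(1, 10)],
     [F(0),     F(3, 10), F(7, 10)]]

# Hand-made decomposition: (weight, 0/1 assignment matrix).
DECOMPOSITION = [
    (F(5, 10), [[1, 0, 0], [0, 0, 1]]),
    (F(2, 10), [[1, 0, 0], [0, 1, 0]]),
    (F(2, 10), [[0, 1, 0], [0, 0, 1]]),
    (F(1, 10), [[0, 0, 1], [0, 1, 0]]),
]

def is_kinda_doubly_stochastic(matrix):
    """Entries in [0,1], every row sum and every column sum at most 1."""
    entries_ok = all(0 <= a <= 1 for row in matrix for a in row)
    rows_ok = all(sum(row) <= 1 for row in matrix)
    cols_ok = all(sum(col) <= 1 for col in zip(*matrix))
    return entries_ok and rows_ok and cols_ok

def check_decomposition(matrix, decomposition):
    """Return the truth values of conditions 1, 2 and 3 of the theorem."""
    cond1 = sum(w for w, _ in decomposition) == 1
    mix = [[sum(w * mk[i][j] for w, mk in decomposition)
            for j in range(len(matrix[0]))] for i in range(len(matrix))]
    cond2 = mix == matrix
    cond3 = all(is_kinda_doubly_stochastic(mk)
                and all(a in (0, 1) for row in mk for a in row)
                for _, mk in decomposition)
    return cond1, cond2, cond3

print(is_kinda_doubly_stochastic(M))          # True
print(check_decomposition(M, DECOMPOSITION))  # (True, True, True)
```

Each 0/1 matrix assigns every resource to at most one target and every target to at most one resource, so it is a pure defender strategy; the weights give the mixed strategy that realizes the marginals in `M`.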

SLIDE 15

Generalizing?

  • What about schedules of size 2?
  • Air Marshals domain has such schedules: outgoing+incoming flight (bipartite graph)
  • Previous approach fails
  • Theorem [Korzhyk et al. 2010]: problem is NP-hard

[Figure: example with coverage probabilities 0.5, 0.5, 0.5, 0.5]


SLIDE 17

Criticisms

  • Problematic assumptions:
      1. The attacker exactly observes the defender’s mixed strategy
      2. The defender knows the attacker’s utility function
      3. The attacker behaves in a perfectly rational way
  • We will focus on relaxing assumption #1

SLIDE 18

Limited surveillance

  • Let us compare two worlds:
      1. Status quo: The defender optimizes against an attacker with unlimited observations (i.e., complete knowledge of the defender’s strategy), but the attacker actually has only $k$ observations
      2. Ideal: The defender optimizes against an attacker with $k$ observations, and, miraculously, the attacker indeed has exactly $k$ observations

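SLIDE 19

Limited surveillance

  • Theorem [Blum et al. 2014]: Assume that utilities are normalized to be in […]. For any […], there is a zero-sum security game such that the difference between worlds 1 and 2 is […]
  • Lemma: If […], there exists a family $\mathcal{D} = \{D_1, \dots, D_q\}$ of subsets of a ground set […] such that:
      1. $\forall i$, $|D_i| = […]/2$
      2. Each element of the ground set is in exactly $q/2$ members of $\mathcal{D}$
      3. If $\mathcal{D}' \subset \mathcal{D}$ and $|\mathcal{D}'| \le k$ then $\bigcup \mathcal{D}'$ is not the whole ground set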

SLIDE 20

Proof of theorem

  • […] resources, each can defend any […] targets, […], […] targets
  • For any target $t$, zero-sum utilities with […] and […]
  • Poll: The optimal strategy (in the status quo world) defends each target with probability roughly…?

SLIDE 21

Proof of theorem

  • Next we define a much better strategy against an attacker with $k$ observations
  • […] subset of targets $\{1, \dots, […]\}$
  • Define $D_1, \dots, D_q$ as in the lemma
  • Pure strategy $i$ covers $D_i$; this is valid because $|D_i| = […]/2$ (by property 1)
  • Let $x^*$ be the uniform distribution over $1, \dots, q$
  • By property 2, $x^*$ covers each target in […] with probability ½
  • By property 3, $k$ observations from $x^*$ would show some target in […] never being covered; that target is attacked ∎
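
The proof relies on a family of sets in which every set contains half of the ground set, every element lies in half of the sets, and no k sets cover everything. The sketch below builds one such family and checks those three facts by brute force. The construction (ground set: all k-element subsets of {0, …, 2k−1}; set i: those subsets that contain i) is an assumption of this illustration and is not claimed to be the one used by Blum et al.

```python
from itertools import combinations

def build_family(k):
    """Ground set: all k-subsets of range(2k). Set i: the k-subsets that contain i."""
    ground = [frozenset(s) for s in combinations(range(2 * k), k)]
    family = [frozenset(g for g in ground if i in g) for i in range(2 * k)]
    return ground, family

def check_properties(k):
    """Check the three properties used in the proof for the family above."""
    ground, family = build_family(k)
    q = len(family)
    # 1. Every set contains exactly half of the ground set.
    half_size = all(2 * len(d) == len(ground) for d in family)
    # 2. Every ground element lies in exactly half (q/2) of the sets.
    half_membership = all(2 * sum(g in d for d in family) == q for g in ground)
    # 3. No k sets of the family cover the whole ground set.
    no_small_cover = all(set().union(*sub) != set(ground)
                         for sub in combinations(family, k))
    return half_size, half_membership, no_small_cover

for k in (1, 2, 3):
    print(k, check_properties(k))
# 1 (True, True, True)
# 2 (True, True, True)
# 3 (True, True, True)
```

Property 3 holds here because the k-subset made of the indices that were not picked belongs to none of the k chosen sets, so an attacker who sees k pure strategies always finds a target that was never covered.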

SLIDE 22

Limited surveillance

  • Theorem [Blum et al. 2014]: For any zero-sum security game with $n$ targets, $m$ resources, and a set of schedules with max coverage $d$, and for any $k$ observations, the difference between the two worlds is at most […]