CMU 15-896 Noncooperative games 4: Stackelberg games


  1. CMU 15-896 Noncooperative games 4: Stackelberg games. Teacher: Ariel Procaccia

  2. A curious game
     • Payoffs (row player, column player):

              Left   Right
       Up     1,1    3,0
       Down   0,0    2,1

     • Playing up is a dominant strategy for the row player
     • So the column player would play left
     • Therefore, (1,1) is the only Nash equilibrium outcome

     15896 Spring 2016: Lecture 20
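The reasoning on this slide can be checked mechanically. The following is a minimal sketch, not part of the lecture: the payoffs are the ones on the slide, while the function names `pure_nash_equilibria` and `strictly_dominant_rows` are made up for illustration. It only enumerates pure profiles; because Up is strictly dominant, no mixed equilibrium can put weight on Down either.

```python
# Payoffs from the slide: (row utility, column utility).
GAME = {
    ("Up", "Left"): (1, 1),
    ("Up", "Right"): (3, 0),
    ("Down", "Left"): (0, 0),
    ("Down", "Right"): (2, 1),
}
ROWS = ["Up", "Down"]
COLS = ["Left", "Right"]


def pure_nash_equilibria(game, rows, cols):
    """Return all pure profiles where neither player gains by deviating."""
    equilibria = []
    for r in rows:
        for c in cols:
            row_ok = all(game[(r, c)][0] >= game[(r2, c)][0] for r2 in rows)
            col_ok = all(game[(r, c)][1] >= game[(r, c2)][1] for c2 in cols)
            if row_ok and col_ok:
                equilibria.append((r, c))
    return equilibria


def strictly_dominant_rows(game, rows, cols):
    """Rows that strictly beat every other row against every column."""
    return [
        r for r in rows
        if all(game[(r, c)][0] > game[(r2, c)][0]
               for r2 in rows if r2 != r
               for c in cols)
    ]


print(strictly_dominant_rows(GAME, ROWS, COLS))
print(pure_nash_equilibria(GAME, ROWS, COLS))
```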

  3. Commitment is good
     • Suppose the game is played as follows:
       o Row player commits to playing a row
       o Column player observes the commitment and chooses a column
     • Row player can commit to playing down! The column player then best-responds with right, so the row player gets 2 instead of 1

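  4. Commitment to mixed strategy
     • By committing to a mixed strategy, the row player can guarantee a reward of about 2.5
     • Example: up with probability .49, down with probability .51; the column player then plays right (left with probability 0, right with probability 1), and the row player gets .49·3 + .51·2 = 2.49
     • Called a Stackelberg (mixed) strategy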
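The commitment arithmetic can be reproduced with exact fractions. This is a small sketch under one stated assumption: when the column player is indifferent, ties are broken in the row player's favor (a common convention, not something stated on the slide). The helper name `commit` is hypothetical.

```python
from fractions import Fraction

# Payoffs from the slides; index 0 = Up/Left, index 1 = Down/Right.
U_ROW = [[1, 3], [0, 2]]
U_COL = [[1, 0], [0, 1]]


def commit(p_up):
    """Row commits to Up with probability p_up; column best-responds.

    Returns (column's choice, row's expected utility).
    Assumption: ties for the column player are broken in the row's favor.
    """
    x = [p_up, 1 - p_up]
    candidates = []
    for c in (0, 1):
        col_value = sum(x[r] * U_COL[r][c] for r in (0, 1))
        row_value = sum(x[r] * U_ROW[r][c] for r in (0, 1))
        candidates.append((col_value, row_value, c))
    # Highest column value first, then highest row value on ties.
    _, row_value, c = max(candidates)
    return ("Left", "Right")[c], row_value


print(commit(Fraction(1)))         # pure commitment to Up
print(commit(Fraction(0)))         # pure commitment to Down
print(commit(Fraction(49, 100)))   # the mixed commitment from the slide
```

Committing to Up yields 1, committing to Down yields 2, and the .49/.51 mix yields 249/100; at exactly 1/2 the column player is indifferent, and the favorable tie-break gives 5/2.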

  5. Computing Stackelberg
     • Theorem [Conitzer and Sandholm 2006]: In 2-player normal form games, an optimal Stackelberg strategy can be found in poly time
     • Theorem [ditto]: the problem is NP-hard when the number of players is ≥ 3

  6. Tractability: 2 players
     • For each pure follower strategy $s_2$, we compute via the LP below a strategy $x_1$ for the leader such that
       o Playing $s_2$ is a best response for the follower
       o Under this constraint, $x_1$ is optimal
     • Choose $x_1^*$ that maximizes leader value

     $$
     \begin{aligned}
     \max\ & \sum_{s_1 \in S} x_1(s_1)\, u_1(s_1, s_2) \\
     \text{s.t.}\ & \forall s_2' \in S,\ \sum_{s_1 \in S} x_1(s_1)\, u_2(s_1, s_2) \ge \sum_{s_1 \in S} x_1(s_1)\, u_2(s_1, s_2') \\
     & \sum_{s_1 \in S} x_1(s_1) = 1 \\
     & \forall s_1 \in S,\ x_1(s_1) \in [0,1]
     \end{aligned}
     $$
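The one-LP-per-follower-strategy idea can be sketched without an LP solver in the special case where the leader has only two pure strategies. The leader's mixed strategy is then a single number p, each best-response constraint cuts [0, 1] down to an interval, and a linear objective is maximized at an endpoint. This is an illustrative simplification, not the general method; the name `stackelberg_two_rows` is hypothetical and integer payoffs are assumed.

```python
from fractions import Fraction


def stackelberg_two_rows(u1, u2):
    """Optimal commitment for a leader with two rows and integer payoffs.

    u1, u2: 2 x n payoff matrices for leader and follower.
    Returns (leader value, probability of row 0, follower column).
    """
    n = len(u1[0])
    best = None
    for j in range(n):                      # one small "LP" per follower column
        lo, hi = Fraction(0), Fraction(1)
        feasible = True
        for k in range(n):
            # Follower weakly prefers j to k:
            #   p * d0 + (1 - p) * d1 >= 0, i.e. slope * p + intercept >= 0
            d0 = u2[0][j] - u2[0][k]
            d1 = u2[1][j] - u2[1][k]
            slope, intercept = d0 - d1, d1
            if slope > 0:
                lo = max(lo, Fraction(-intercept, slope))
            elif slope < 0:
                hi = min(hi, Fraction(-intercept, slope))
            elif intercept < 0:
                feasible = False
        if not feasible or lo > hi:
            continue
        for p in (lo, hi):                  # a linear objective peaks at an endpoint
            value = p * u1[0][j] + (1 - p) * u1[1][j]
            if best is None or value > best[0]:
                best = (value, p, j)
    return best


# The 2x2 game from the earlier slides.
print(stackelberg_two_rows([[1, 3], [0, 2]], [[1, 0], [0, 1]]))
```

On the game from the earlier slides this returns value 5/2 at p = 1/2 with the follower playing the second column (right). The weak inequality in the constraint is what lets the leader sit exactly at the follower's indifference point.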

  7. Application: security
     • Airport security: deployed at LAX
     • Federal Air Marshals
     • Coast Guard
     • Idea:
       o Defender commits to mixed strategy
       o Attacker observes and best responds

  8. Security games
     • Set of targets $T$
     • Set of security resources $\Omega$ available to the defender (leader)
     • Set of schedules (subsets of targets)
     • Resource $\omega$ can be assigned to one of the schedules in $A(\omega)$
     • Attacker chooses one target to attack

  9. Security games
     • For each target $t$, there are four numbers: $u_d^+(t) \ge u_d^-(t)$, and $u_a^+(t) \le u_a^-(t)$
     • Let $c = (c_t)_{t \in T}$ be the vector of coverage probabilities
     • The utilities to the defender/attacker under $c$ if target $t$ is attacked are

     $$U_d(t, c) = u_d^+(t)\, c_t + u_d^-(t)\,(1 - c_t)$$
     $$U_a(t, c) = u_a^+(t)\, c_t + u_a^-(t)\,(1 - c_t)$$
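A short sketch of how these utilities are evaluated, assuming the standard security-game model in which each player's utility is the coverage-weighted average of a "covered" and an "uncovered" payoff for the attacked target. The two-target instance and all names (`expected_utility`, `attacked_target`) are hypothetical and not taken from the lecture.

```python
from fractions import Fraction as F


def expected_utility(covered, uncovered, c_t):
    """Coverage-weighted average of the covered and uncovered payoffs."""
    return covered * c_t + uncovered * (1 - c_t)


def attacked_target(c, ua_cov, ua_unc):
    """Index of the target with the highest attacker utility under coverage c."""
    return max(
        range(len(c)),
        key=lambda t: expected_utility(ua_cov[t], ua_unc[t], c[t]),
    )


# Hypothetical two-target, zero-sum instance (numbers are not from the lecture).
UA_COV, UA_UNC = [0, 0], [4, 2]      # attacker: covered payoff <= uncovered payoff
UD_COV, UD_UNC = [0, 0], [-4, -2]    # defender: covered payoff >= uncovered payoff
c = [F(3, 4), F(1, 4)]               # coverage probabilities

t = attacked_target(c, UA_COV, UA_UNC)
defender_value = expected_utility(UD_COV[t], UD_UNC[t], c[t])
print(t, defender_value)
```

Here the heavily covered first target is worth 1 to the attacker and the lightly covered second target is worth 3/2, so the attacker picks the second one even though its uncovered payoff is smaller.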

  10. This is a 2-player Stackelberg game. Can we compute an optimal strategy for the defender in polynomial time?

  11. Solving security games
      • Consider the case where resources are assigned to individual targets, i.e., schedules have size 1
      • Nevertheless, the number of leader strategies is exponential
      • Theorem [Korzhyk et al. 2010]: Optimal leader strategy can be computed in poly time

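  12. A compact LP
      • LP formulation similar to previous one, with $c_{\omega,t}$ the probability that resource $\omega$ covers target $t$ and $t^*$ the attacked target
      • Advantage: size is logarithmic in #leader strategies
      • Problem: do probabilities correspond to strategy?

      $$
      \begin{aligned}
      \max\ & U_d(t^*, c) \\
      \text{s.t.}\ & \forall \omega \in \Omega,\ \forall t \in A(\omega),\ 0 \le c_{\omega,t} \le 1 \\
      & \forall t \in T,\ c_t = \sum_{\omega \in \Omega :\, t \in A(\omega)} c_{\omega,t} \le 1 \\
      & \forall \omega \in \Omega,\ \sum_{t \in A(\omega)} c_{\omega,t} \le 1 \\
      & \forall t \in T,\ U_a(t, c) \le U_a(t^*, c)
      \end{aligned}
      $$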
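The probability constraints of the compact LP are easy to check for a candidate solution. The sketch below assumes every resource may cover every target (so the restriction to allowed schedules disappears) and leaves out the attacker best-response constraint, which needs the utilities. The matrix is the 2 × 3 example that appears on the next slide; the function names are hypothetical.

```python
from fractions import Fraction as F


def coverage(c):
    """c[w][t] = probability that resource w covers target t.

    Returns the marginal coverage c_t of each target.
    """
    return [sum(row[t] for row in c) for t in range(len(c[0]))]


def is_feasible(c):
    """Check the probability constraints of the compact LP.

    Assumption: every resource may cover every target.
    """
    entries_ok = all(0 <= v <= 1 for row in c for v in row)
    resources_ok = all(sum(row) <= 1 for row in c)      # each resource used at most once
    targets_ok = all(v <= 1 for v in coverage(c))       # each target covered at most once
    return entries_ok and resources_ok and targets_ok


C = [
    [F(7, 10), F(2, 10), F(1, 10)],
    [F(0), F(3, 10), F(7, 10)],
]
print(coverage(C), is_feasible(C))
```

The LP has one variable per (resource, target) pair, which is what keeps it small even when the number of pure assignments is large.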

  13. Example (figure): a matrix of probabilities and four 0/1 matrices of the same shape

      $$
      \begin{pmatrix} 0.7 & 0.2 & 0.1 \\ 0 & 0.3 & 0.7 \end{pmatrix}
      \qquad
      \begin{pmatrix} 0 & 0 & 1 \\ 0 & 1 & 0 \end{pmatrix},\
      \begin{pmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix},\
      \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \end{pmatrix},\
      \begin{pmatrix} 1 & 0 & 0 \\ 0 & 0 & 1 \end{pmatrix}
      $$

  14. Fixing the probabilities
      • Theorem [Birkhoff-von Neumann]: Consider an $m \times n$ matrix $M$ with real numbers $a_{ij} \in [0,1]$, such that for each $i$, $\sum_j a_{ij} \le 1$, and for each $j$, $\sum_i a_{ij} \le 1$ ($M$ is kinda doubly stochastic). Then there exist matrices $M^1, \ldots, M^q$ and weights $w^1, \ldots, w^q$ such that:
        1. $\sum_k w^k = 1$
        2. $\sum_k w^k M^k = M$
        3. For each $k$, $M^k$ is kinda doubly stochastic and its elements are in $\{0,1\}$
      • The probabilities $c_{\omega,t}$ satisfy the theorem's conditions
      • By 3, each $M^k$ is a deterministic strategy
      • By 1, we get a mixed strategy
      • By 2, it gives the right probabilities
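The three properties can be verified on the 2 × 3 example from the preceding slide. This is a check of one decomposition, not a decomposition algorithm. The four 0/1 matrices are the ones in the figure; the weights 0.1, 0.2, 0.2, 0.5 are not read from the slide but obtained by solving the entry-wise equations, and the helper names are hypothetical.

```python
from fractions import Fraction as F

M = [
    [F(7, 10), F(2, 10), F(1, 10)],
    [F(0), F(3, 10), F(7, 10)],
]

# Each pure strategy gives the target index assigned to resource 0 and resource 1.
PURE = [(2, 1), (1, 2), (0, 1), (0, 2)]
# Weights derived by solving the entry-wise equations (not read from the slide).
WEIGHTS = [F(1, 10), F(2, 10), F(2, 10), F(5, 10)]


def as_matrix(assignment, n_targets):
    """0/1 matrix with a single 1 per resource row, at its assigned target."""
    return [[1 if t == target else 0 for t in range(n_targets)]
            for target in assignment]


def mix(weights, pure, n_targets):
    """Weighted sum of the 0/1 matrices of the pure strategies."""
    total = [[F(0)] * n_targets for _ in pure[0]]
    for w, assignment in zip(weights, pure):
        for i, row in enumerate(as_matrix(assignment, n_targets)):
            for j, v in enumerate(row):
                total[i][j] += w * v
    return total


print(sum(WEIGHTS) == 1)               # property 1: weights form a distribution
print(mix(WEIGHTS, PURE, 3) == M)      # property 2: the mixture reproduces M
```

Property 3 holds because every pure strategy sends the two resources to distinct targets, so each 0/1 matrix has at most one 1 per row and per column.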

  15. Generalizing?
      • What about schedules of size 2?
      • Air Marshals domain has such schedules: outgoing+incoming flight (bipartite graph)
      • [Figure: four values of 0.5]
      • Previous approach fails
      • Theorem [Korzhyk et al. 2010]: problem is NP-hard

  17. Criticisms
      • Problematic assumptions:
        1. The attacker exactly observes the defender's mixed strategy
        2. The defender knows the attacker's utility function
        3. The attacker behaves in a perfectly rational way
      • We will focus on relaxing assumption #1

  18. Limited surveillance
      • Let us compare two worlds:
        1. Status quo: The defender optimizes against an attacker with unlimited observations (i.e., complete knowledge of the defender's strategy), but the attacker actually has only a limited number of observations
        2. Ideal: The defender optimizes against an attacker with that limited number of observations, and, miraculously, the attacker indeed has exactly that many observations

  19. Limited surveillance
      • Theorem [Blum et al. 2014]: Assume that utilities are normalized to be in [...]. For any [...], there is a zero-sum security game such that the difference between the two worlds is [...]
      • Lemma: If [...], there exists [...] such that:
        1. ∀[...], [...] |[...]|/2
        2. Each [...] ∈ [...] is in exactly [...] members of [...]
        3. If [...] ⊂ [...] and [...] then ⋃ [...]

  20. Proof of theorem
      • [...] resources, each can defend any [...] targets
      • For any target [...], zero-sum utilities with [...] and [...]
      • Poll: The optimal strategy (in the status quo world) defends each target with probability roughly…?

  21. Proof of theorem
      • Next we define a much better strategy against an attacker with a limited number of observations
      • Take a subset of targets {1, …, [...]} ⊆ [...]
      • Define sets of these targets as in the lemma
      • For each such set, a pure strategy covers it; this is valid because [...] (by property 1)
      • Let the mixed strategy be the uniform distribution over these pure strategies
      • By property 2, it covers each target in the subset with probability ½
      • By property 3, the attacker's observations of it would show some target in the subset never being covered; that target is attacked ∎

  22. Limited surveillance
      • Theorem [Blum et al. 2014]: For any zero-sum security game with [...] targets, [...] resources, and a set of schedules with max coverage [...], and for any [...] observations, the difference between the two worlds is at most [...]
