
CSC304 Lecture 7 Game Theory: Security games, Applications to security


  1. CSC304 Lecture 7. Game Theory: Security games, Applications to security. CSC304 - Nisarg Shah

  2. Until now…
     • Simultaneous-move games
     • All players act simultaneously
     • Nash equilibria = stable outcomes
     • Each player is best responding to the strategies of all other players

  3. Sequential Move Games
     • Focus on two players: “leader” and “follower”
     1. Leader commits to a (possibly mixed) strategy $x_1$
        ➢ Cannot change later
     2. Follower learns about $x_1$
        ➢ Follower must believe that the leader’s commitment is credible
     3. Follower chooses the best response $x_2$
        ➢ Can be assumed to be a pure strategy without loss of generality
        ➢ If multiple actions are best responses, break ties in favor of the leader

  4. Sequential Move Games
     • Wait. Does this give us anything new?
       ➢ Can’t I, as player 1, commit to playing $x_1$ in a simultaneous-move game too?
       ➢ Player 2 wouldn’t believe you.
     • Dialogue:
       P1: “I’ll play $x_1$.”
       P2: “No you won’t. I’m playing $x_2$; $x_1$ is not a best response.”
       P1: “Doesn’t matter. I’m committing.”
       P2: “Yeah right.”

  5. That’s unless…
     • You’re as convincing as this guy. [image]

  6. How to represent the game?
     • Extensive form representation
       ➢ Can also represent “information sets”, multiple moves, …
     [Game tree: Player 1 moves at the root, Player 2 moves at each of the two child nodes; leaf payoffs, left to right: (1,1), (3,0), (0,0), (2,1)]

  7. A Curious Case
                 P2: Left   P2: Right
       P1: Up    (1, 1)     (3, 0)
       P1: Down  (0, 0)     (2, 1)
     • Q: What are the Nash equilibria of this game?
     • Q: You are P1. What is your reward in Nash equilibrium?

  8. A Curious Case
                 P2: Left   P2: Right
       P1: Up    (1, 1)     (3, 0)
       P1: Down  (0, 0)     (2, 1)
     • Q: As P1, you want to commit to a pure strategy. Which strategy would you commit to?
     • Q: What would your reward be now?

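  9. Commitment Advantage
                 P2: Left   P2: Right
       P1: Up    (1, 1)     (3, 0)
       P1: Down  (0, 0)     (2, 1)
     • Reward in the unique Nash equilibrium (Up, Left) = 1
       ➢ Up strictly dominates Down for P1, and P2’s best response to Up is Left
     • Reward when committing to Down = 2
       ➢ P2’s best response to Down is Right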

  10. Commitment Advantage
                  P2: Left   P2: Right
        P1: Up    (1, 1)     (3, 0)
        P1: Down  (0, 0)     (2, 1)
      • Higher reward in committing to a mixed strategy
        ➢ P1 commits to: Up w.p. $0.5 - \epsilon$, Down w.p. $0.5 + \epsilon$
        ➢ P2 is still better off playing Right
        ➢ $\mathbb{E}[\text{Reward}]$ to P1 ≈ 2.5
        ➢ Note: If P1 plays both actions with probability exactly 0.5, we assume P2 plays Right (break ties in favor of leader)
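
The rewards quoted on the last few slides can be checked with a small sketch. The dictionary layout and the helper name `commitment_value` are illustrative choices, not part of the lecture; the payoffs and the tie-breaking rule (in favor of the leader) are taken from the slides.

```python
from fractions import Fraction

# Payoffs from the slides: PAYOFF[P1 action][P2 action] = (P1 reward, P2 reward).
PAYOFF = {
    "Up":   {"Left": (1, 1), "Right": (3, 0)},
    "Down": {"Left": (0, 0), "Right": (2, 1)},
}

def commitment_value(p_up):
    """Return (P1's expected reward, P2's reply) when P1 commits to Up w.p. p_up.

    P2 best-responds to the announced mixture; if P2 is indifferent,
    the tie is broken in favor of P1 (the leader).
    """
    mix = {"Up": p_up, "Down": 1 - p_up}
    candidates = []
    for reply in ("Left", "Right"):
        leader = sum(mix[a] * PAYOFF[a][reply][0] for a in mix)
        follower = sum(mix[a] * PAYOFF[a][reply][1] for a in mix)
        # Tuples compare lexicographically: follower reward first, then leader reward.
        candidates.append((follower, leader, reply))
    _, leader, reply = max(candidates)
    return leader, reply

print(commitment_value(Fraction(1)))     # commit to Up
print(commitment_value(Fraction(0)))     # commit to Down
print(commitment_value(Fraction(1, 2)))  # commit to the 50/50 mixture
```

Committing to Up yields reward 1 (P2 replies Left), committing to Down yields 2, and the 50/50 mixture yields 5/2 once the tie is broken toward Right.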

  11. Stackelberg vs Nash
      • Is committing first always better than playing a simultaneous-move game?
      • Yes!
        ➢ If $(x_1^*, x_2^*)$ is a NE, P1 can always commit to $x_1^*$, ensure that P2 will play $x_2^*$, and achieve the reward in the NE
        ➢ P1 may be able to commit to a better strategy than $x_1^*$
      • Applications to security
        ➢ Law enforcement is better off committing to a mixed patrolling strategy, and announcing the strategy publicly!

  12. Stackelberg in Zero-Sum
      • Recall the minimax theorem:
        $$\max_{x_1} \min_{x_2} x_1^T A x_2 = \min_{x_2} \max_{x_1} x_1^T A x_2$$
      • P1 goes first → P1 chooses her minimax strategy
      • P2 goes first → P2 chooses her minimax strategy
      • Minimax Theorem: It doesn’t make a difference!
        ➢ Simultaneous-move, P1 going first, and P2 going first are essentially identical scenarios.
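
As a rough numerical illustration of the minimax theorem, the sketch below compares the two orders of commitment on matching pennies. The game, the grid resolution, and the helper name `p1_reward` are assumptions made for this example; the grid search finds the exact value here only because the optimal probability 1/2 lies on the grid.

```python
from fractions import Fraction

# Matching pennies (illustrative choice): A[i][j] is P1's reward, P2 receives -A[i][j].
A = [[1, -1],
     [-1, 1]]

# Candidate probabilities of playing the first action.
GRID = [Fraction(i, 100) for i in range(101)]

def p1_reward(p, q):
    """Expected reward x1^T A x2 with x1 = (p, 1 - p) and x2 = (q, 1 - q)."""
    x1, x2 = (p, 1 - p), (q, 1 - q)
    return sum(x1[i] * A[i][j] * x2[j] for i in range(2) for j in range(2))

# P1 commits first; P2 then minimizes, and a pure reply (q in {0, 1}) suffices.
maximin = max(min(p1_reward(p, q) for q in (0, 1)) for p in GRID)

# P2 commits first; P1 then maximizes, and a pure reply (p in {0, 1}) suffices.
minimax = min(max(p1_reward(p, q) for p in (0, 1)) for q in GRID)

print(maximin, minimax)  # both are 0 for this game
```

Restricting the second mover to pure replies is safe because the expected reward is linear in the second mover's mixture once the first mover's strategy is fixed.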

  13. Stackelberg in General-Sum
      • 2-player non-zero-sum game with reward matrices $A$ and $B \neq -A$ for the two players
        $$\max_{x_1} \; x_1^T A \, f(x_1) \quad \text{where} \quad f(x_1) = \operatorname{argmax}_{x_2} \; x_1^T B \, x_2$$
      • How do we compute this?

  14. Example
                  P2: Left   P2: Right
        P1: Up    (1, 1)     (3, 0)
        P1: Down  (0, 0)     (2, 1)
      • Let us separately maximize the reward of P1 in 2 cases:
        ➢ Strategies that cause P2 to play Left
        ➢ Strategies that cause P2 to play Right
      • Suppose P1 commits to Up w.p. $p$, Down w.p. $1 - p$

  15. Example
                  P2: Left   P2: Right
        P1: Up    (1, 1)     (3, 0)
        P1: Down  (0, 0)     (2, 1)
      • Strategies that cause P2 to play Left
        $$\max \; p \cdot 1 + (1 - p) \cdot 0 \qquad \text{(reward of P1 assuming P2 plays Left)}$$
        $$\text{s.t.} \;\; p \cdot 1 + (1 - p) \cdot 0 \ge p \cdot 0 + (1 - p) \cdot 1 \qquad \text{(condition that causes P2 to play Left)}$$
        $$p \in [0, 1]$$

  16. Example
                  P2: Left   P2: Right
        P1: Up    (1, 1)     (3, 0)
        P1: Down  (0, 0)     (2, 1)
      • Strategies that cause P2 to play Left
        $$\max \; p \quad \text{s.t.} \;\; p \ge 1 - p, \;\; p \in [0, 1]$$
        Answer = 1

  17. Example
                  P2: Left   P2: Right
        P1: Up    (1, 1)     (3, 0)
        P1: Down  (0, 0)     (2, 1)
      • Strategies that cause P2 to play Right
        $$\max \; p \cdot 3 + (1 - p) \cdot 2$$
        $$\text{s.t.} \;\; p \cdot 1 + (1 - p) \cdot 0 \le p \cdot 0 + (1 - p) \cdot 1$$
        $$p \in [0, 1]$$
        Answer = 2.5
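
The two answers can be worked out by hand. Writing $p$ for the probability that P1 commits to Up, the derivation below simplifies each constraint and maximizes the resulting linear objective over an interval.

```latex
% Case 1: P2 plays Left
\text{constraint: } p \ge 1 - p \iff p \ge \tfrac{1}{2},
\qquad \max_{p \in [1/2,\,1]} \; p = 1 \quad \text{at } p = 1.

% Case 2: P2 plays Right
\text{constraint: } p \le 1 - p \iff p \le \tfrac{1}{2},
\qquad \max_{p \in [0,\,1/2]} \; 3p + 2(1 - p) = \max_{p \in [0,\,1/2]} \; (2 + p) = \tfrac{5}{2}
\quad \text{at } p = \tfrac{1}{2}.

% Best commitment overall
\max\left\{ 1, \tfrac{5}{2} \right\} = \tfrac{5}{2}.
```

At $p = 1/2$ P2 is indifferent between Left and Right, so the value 2.5 relies on the tie being broken in favor of the leader.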

  18. Stackelberg via LPs
      • High-level Idea:
        ➢ For each action $s_2^*$ of P2…
        ➢ Write a linear program with the mixed strategy $x_1$ of P1 as the unknown, which…
        ➢ Maximizes the reward of P1 when P1 plays $x_1$, P2 responds with $s_2^*$…
        ➢ Subject to the constraint that $x_1$ in fact incentivizes P2 to play $s_2^*$

  19. Stackelberg via LPs
      • $S_1, S_2$ = sets of actions of leader and follower
      • $|S_1| = m_1$, $|S_2| = m_2$
      • $x_1(s_1)$ = probability of leader playing $s_1$
      • $\pi_1, \pi_2$ = reward functions for leader and follower
      • One LP for each $s_2^*$, take the maximum over all $m_2$ LPs
      • The LP corresponding to $s_2^*$ optimizes over all $x_1$ for which $s_2^*$ is the best response
        $$\max \; \sum_{s_1 \in S_1} x_1(s_1) \cdot \pi_1(s_1, s_2^*)$$
        subject to
        $$\forall s_2 \in S_2: \;\; \sum_{s_1 \in S_1} x_1(s_1) \cdot \pi_2(s_1, s_2^*) \ge \sum_{s_1 \in S_1} x_1(s_1) \cdot \pi_2(s_1, s_2)$$
        $$\sum_{s_1 \in S_1} x_1(s_1) = 1$$
        $$\forall s_1 \in S_1: \;\; x_1(s_1) \ge 0$$
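
A minimal sketch of the one-LP-per-follower-action idea, specialized to a leader with exactly two actions so that each LP has a single unknown and can be solved without an LP library. The function name and the interval-clipping approach are assumptions made for this illustration; a general implementation would hand each LP to a solver. Rewards are assumed to be integers or `Fraction`s.

```python
from fractions import Fraction

def stackelberg_two_action_leader(leader_u, follower_u):
    """Optimal commitment for a leader with two actions (0 and 1).

    leader_u[i][j] and follower_u[i][j] are the rewards when the leader
    plays i and the follower plays j. Returns (value, p, j), where p is the
    probability of leader action 0 and j is the follower's induced reply,
    or None if no LP is feasible.
    """
    n_follower = len(leader_u[0])
    best = None
    for j in range(n_follower):              # one LP per follower action j
        lo, hi = Fraction(0), Fraction(1)    # feasible interval for p
        feasible = True
        for t in range(n_follower):
            # Incentive constraint "j is at least as good as t":
            #   p * a + (1 - p) * b >= 0,  i.e.  p * (a - b) >= -b
            a = follower_u[0][j] - follower_u[0][t]
            b = follower_u[1][j] - follower_u[1][t]
            slope = a - b
            if slope > 0:
                lo = max(lo, Fraction(-b, slope))
            elif slope < 0:
                hi = min(hi, Fraction(-b, slope))
            elif b < 0:
                feasible = False             # constraint fails for every p
        if not feasible or lo > hi:
            continue
        # A linear objective over an interval is maximized at an endpoint.
        for p in (lo, hi):
            value = p * leader_u[0][j] + (1 - p) * leader_u[1][j]
            if best is None or value > best[0]:
                best = (value, p, j)
    return best

# The example game from the slides: rows are Up, Down; columns are Left, Right.
leader_u = [[1, 3], [0, 2]]
follower_u = [[1, 0], [0, 1]]
value, p_up, reply = stackelberg_two_action_leader(leader_u, follower_u)
print(value, p_up, reply)  # 5/2 1/2 1  (reply 1 is Right)
```

Because the incentive constraints are weak inequalities, a commitment at which the follower is indifferent counts as inducing the reply that is better for the leader, which matches the tie-breaking rule used in the lecture.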

  20. Real-World Applications
      • Security Games
        ➢ Defender (leader) has $k$ identical patrol units
        ➢ Defender wants to defend a set of $n$ targets $T$
        ➢ In a pure strategy, each resource can protect a subset of targets $S \subseteq T$ from a given collection $\mathcal{S}$
        ➢ A target is covered if it is protected by at least one resource
        ➢ Attacker wants to select a target to attack

  21. Real-World Applications
      • Security Games
        ➢ For each target, the defender and the attacker have two utilities: one if the target is covered, one if it is not.
        ➢ Defender commits to a mixed strategy; attacker follows by choosing a target to attack.
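
To make the utility structure concrete, here is a small sketch of the attacker's best response once the defender's mixed strategy has been summarized as a coverage probability per target. The function name, the coverage vector, and all utility numbers are hypothetical and chosen only for illustration; ties are broken in favor of the defender, following the leader-favoring convention from earlier in the lecture.

```python
from fractions import Fraction

def attacker_best_response(coverage, def_cov, def_unc, att_cov, att_unc):
    """Return (target, attacker utility, defender utility) for the attacked target.

    coverage[t] is the probability that target t is covered. Each player has
    one utility if the attacked target is covered and another if it is not.
    """
    best = None
    for t, c in enumerate(coverage):
        attacker = c * att_cov[t] + (1 - c) * att_unc[t]
        defender = c * def_cov[t] + (1 - c) * def_unc[t]
        key = (attacker, defender)  # attacker utility first, defender utility as tie-break
        if best is None or key > best[0]:
            best = (key, t)
    (attacker, defender), target = best
    return target, attacker, defender

# Hypothetical two-target instance.
coverage = [Fraction(3, 4), Fraction(1, 4)]
result = attacker_best_response(
    coverage,
    def_cov=[1, 1], def_unc=[-4, -2],
    att_cov=[-1, -1], att_unc=[4, 2],
)
print(result)  # the attacker picks target 1
```

In this instance the heavily covered target 0 is worth only 1/4 to the attacker in expectation, so the attacker moves to the lightly covered target 1, which is worth 5/4.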

  22. Ah!
      • Q: Because this is a 2-player Stackelberg game, can we just compute the optimal strategy for the defender in polynomial time…?
      • Time is polynomial in the number of pure strategies of the defender
        ➢ In security games, this is $|\mathcal{S}|^k$
        ➢ Exponential in $k$
      • Intricate computational machinery required…
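
The blow-up is easy to see by enumerating pure strategies directly. The sketch below assigns one schedule from the collection to each of the $k$ units, treating the units as ordered so that the count matches the $|\mathcal{S}|^k$ figure on the slide. The schedules and the function name are hypothetical.

```python
from itertools import product

def defender_pure_strategies(schedules, k):
    """Yield (assignment, covered targets) for every way to give each of k units a schedule."""
    for assignment in product(schedules, repeat=k):
        # A target is covered if at least one unit's schedule contains it.
        yield assignment, frozenset().union(*assignment)

# Hypothetical collection of 3 schedules over targets 0..3.
schedules = [frozenset({0, 1}), frozenset({1, 2}), frozenset({2, 3})]

strategies = list(defender_pure_strategies(schedules, k=2))
print(len(strategies))       # 3 ** 2 = 9 pure strategies
print(len(schedules) ** 10)  # with k = 10 units there are already 59049
```

Each of these pure strategies would be one variable in the defender's LP, which is why solving the LPs directly stops being practical as $k$ grows.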

  23. LAX [image]

  24. Real-World Applications
      • Protecting entry points to LAX
      • Scheduling air marshals on flights
        ➢ Must return home
      • Protecting the Staten Island Ferry
        ➢ Continuous-time strategies
      • Fare evasion in LA metro
        ➢ Bathroom breaks!!!
      • Wildlife protection in Ugandan forests
        ➢ Poachers are not fully rational
      • Cyber security
      • …
