CMU 15-896 Noncooperative games 4: Stackelberg games Teacher: Ariel Procaccia

A curious game
• Payoffs (row player, column player):

            Left   Right
    Up      1,1    3,0
    Down    0,0    2,1

• Playing Up is a dominant strategy for the row player
• So the column player would play Left
• Therefore, (1,1) is the only Nash equilibrium outcome
15896 Spring 2016: Lecture 20
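The equilibrium claim can be checked by brute force. The following is a minimal sketch, not part of the lecture: the names `PAYOFFS`, `ROWS`, `COLS`, and `pure_nash_equilibria` are made up for illustration, and the payoffs are the ones from the slide. It only enumerates pure-strategy profiles; since Up is strictly dominant, no mixed equilibrium can differ from the one it finds.

```python
from itertools import product

# Payoffs (row player, column player) from the slide; names are illustrative.
PAYOFFS = {
    ("Up", "Left"): (1, 1), ("Up", "Right"): (3, 0),
    ("Down", "Left"): (0, 0), ("Down", "Right"): (2, 1),
}
ROWS = ("Up", "Down")
COLS = ("Left", "Right")


def pure_nash_equilibria(payoffs, rows, cols):
    """Return all profiles where neither player gains by deviating alone."""
    equilibria = []
    for r, c in product(rows, cols):
        row_ok = all(payoffs[(r, c)][0] >= payoffs[(r2, c)][0] for r2 in rows)
        col_ok = all(payoffs[(r, c)][1] >= payoffs[(r, c2)][1] for c2 in cols)
        if row_ok and col_ok:
            equilibria.append((r, c))
    return equilibria


print(pure_nash_equilibria(PAYOFFS, ROWS, COLS))
```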

Commitment is good
• Suppose the game is played as follows:
  o Row player commits to playing a row
  o Column player observes the commitment and chooses a column
• Row player can commit to playing Down!
• The column player then best-responds with Right, so the row player gets 2 instead of 1

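Commitment to mixed strategy
• By committing to a mixed strategy, the row player can guarantee a reward of about 2.5
• Example: commit to Up with probability 0.49 and Down with probability 0.51; the column player then plays Right with probability 1, and the row player gets 0.49·3 + 0.51·2 = 2.49
• Called a Stackelberg (mixed) strategy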
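The value of a commitment can be computed directly. This is a small sketch under stated assumptions: the function name `commitment_value` and the matrix names are hypothetical, the payoffs are those of the 2x2 game above, and ties in the follower's best response are broken in the leader's favor (an assumption, not something the slide states).

```python
from fractions import Fraction

# Rows: Up, Down. Columns: Left, Right. Names are illustrative.
ROW_PAYOFF = [[1, 3], [0, 2]]  # leader (row player)
COL_PAYOFF = [[1, 0], [0, 1]]  # follower (column player)


def commitment_value(p_up):
    """Leader's expected payoff when committing to Up with probability p_up.

    The follower best-responds to the observed mixture; ties are broken
    in favor of the leader (assumption).
    """
    mix = [p_up, 1 - p_up]
    follower = [sum(mix[i] * COL_PAYOFF[i][j] for i in range(2)) for j in range(2)]
    leader = [sum(mix[i] * ROW_PAYOFF[i][j] for i in range(2)) for j in range(2)]
    best = max(follower)
    return max(leader[j] for j in range(2) if follower[j] == best)


print(commitment_value(Fraction(0)))        # pure commitment to Down
print(commitment_value(Fraction(49, 100)))  # the mixed commitment from the slide
```

Exact fractions are used so that the tie at probability 1/2 is detected exactly rather than lost to floating-point rounding.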

Computing Stackelberg
• Theorem [Conitzer and Sandholm 2006]: In 2-player normal-form games, an optimal Stackelberg strategy can be found in polynomial time
• Theorem [ditto]: The problem is NP-hard when the number of players is ≥ 3

Tractability: 2 players
• For each pure follower strategy $s_2$, we compute via the LP below a strategy $x_1$ for the leader such that
  o Playing $s_2$ is a best response for the follower
  o Under this constraint, $x_1$ is optimal
• Choose the $x_1^*$ that maximizes leader value

$$
\begin{aligned}
\max\ & \sum_{s_1 \in S} x_1(s_1)\, u_1(s_1, s_2) \\
\text{s.t.}\ & \forall s_2' \in S:\ \sum_{s_1 \in S} x_1(s_1)\, u_2(s_1, s_2) \ge \sum_{s_1 \in S} x_1(s_1)\, u_2(s_1, s_2') \\
& \sum_{s_1 \in S} x_1(s_1) = 1 \\
& \forall s_1 \in S:\ x_1(s_1) \in [0,1]
\end{aligned}
$$
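To make the "one LP per follower strategy" idea concrete, here is a hedged sketch for the special case where the leader has exactly two pure strategies. In that case each LP has a single variable (the probability of the first row), its feasible set is an interval, and a linear objective is maximized at an endpoint, so no LP solver is needed. The function name `solve_stackelberg_2rows` is hypothetical, and the weak inequality means follower ties are resolved in the leader's favor.

```python
from fractions import Fraction


def solve_stackelberg_2rows(u1, u2):
    """u1, u2: 2 x n payoff matrices for leader and follower.

    Returns (leader value, probability of row 0, follower action),
    or None if no follower action can be induced.
    """
    n = len(u1[0])
    best = None
    for j in range(n):  # one one-variable LP per follower action j
        lo, hi = Fraction(0), Fraction(1)
        feasible = True
        for k in range(n):
            # Best-response constraint: p*d0 + (1-p)*d1 >= 0, i.e. a*p + b >= 0
            d0 = Fraction(u2[0][j] - u2[0][k])
            d1 = Fraction(u2[1][j] - u2[1][k])
            a, b = d0 - d1, d1
            if a > 0:
                lo = max(lo, -b / a)
            elif a < 0:
                hi = min(hi, -b / a)
            elif b < 0:
                feasible = False
        if not feasible or lo > hi:
            continue
        for p in (lo, hi):  # a linear objective peaks at an interval endpoint
            value = p * u1[0][j] + (1 - p) * u1[1][j]
            if best is None or value > best[0]:
                best = (value, p, j)
    return best


# The 2x2 game from the earlier slides (rows Up/Down, columns Left/Right).
print(solve_stackelberg_2rows([[1, 3], [0, 2]], [[1, 0], [0, 1]]))
```

With more than two leader strategies the same loop applies, but each inner problem becomes a genuine LP and would need a solver.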

Application: security
• Airport security: deployed at LAX
• Federal Air Marshals
• Coast Guard
• Idea:
  o Defender commits to a mixed strategy
  o Attacker observes and best responds

Security games
• Set of targets $T$
• Set of security resources $\Omega$ available to the defender (leader)
• Set of schedules, each a subset of the targets
• Each resource can be assigned to one of the schedules available to it
• Attacker chooses one target to attack

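Security games
• For each target $t$, there are four numbers: $u_d^+(t) \ge u_d^-(t)$, and $u_a^+(t) \le u_a^-(t)$ (defender and attacker payoffs when $t$ is covered, $+$, or uncovered, $-$)
• Let $c = (c_t)_{t \in T}$ be the vector of coverage probabilities
• The utilities to the defender/attacker under $c$ if target $t$ is attacked are

$$U_d(t, c) = u_d^+(t)\, c_t + u_d^-(t)\,(1 - c_t)$$
$$U_a(t, c) = u_a^+(t)\, c_t + u_a^-(t)\,(1 - c_t)$$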
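A short sketch of how these utilities might be evaluated, assuming the usual form in which each expected utility is the coverage-weighted average of the covered and uncovered payoffs. The dictionary keys (`d_cov`, `d_unc`, `a_cov`, `a_unc`), the function names, and the numbers in the example are all made up for illustration.

```python
from fractions import Fraction


def expected_utilities(t, coverage, payoffs):
    """Return (defender utility, attacker utility) if target t is attacked."""
    c = coverage[t]
    p = payoffs[t]
    u_def = c * p["d_cov"] + (1 - c) * p["d_unc"]
    u_att = c * p["a_cov"] + (1 - c) * p["a_unc"]
    return u_def, u_att


def attacker_best_response(coverage, payoffs):
    """Target maximizing the attacker's expected utility (first one on ties)."""
    return max(range(len(coverage)),
               key=lambda t: expected_utilities(t, coverage, payoffs)[1])


# Hypothetical two-target instance; covered payoffs favor the defender.
payoffs = [
    {"d_cov": 1, "d_unc": -4, "a_cov": -1, "a_unc": 4},
    {"d_cov": 0, "d_unc": -1, "a_cov": 0, "a_unc": 1},
]
coverage = [Fraction(3, 4), Fraction(1, 4)]
print(attacker_best_response(coverage, payoffs))
```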

This is a 2-player Stackelberg game. Can we compute an optimal strategy for the defender in polynomial time?

Solving security games
• Consider the case where resources are assigned to individual targets, i.e., schedules have size 1
• Nevertheless, the number of leader strategies is exponential
• Theorem [Korzhyk et al. 2010]: An optimal leader strategy can be computed in polynomial time

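A compact LP
• LP formulation similar to the previous one (one LP for each attacked target $t^*$), where $c_{\omega,t}$ is the probability that resource $\omega$ covers target $t$ and $A(\omega)$ denotes the targets that $\omega$ can cover:

$$
\begin{aligned}
\max\ & U_d(t^*, c) \\
\text{s.t.}\ & \forall \omega \in \Omega,\ \forall t \in A(\omega):\ 0 \le c_{\omega,t} \le 1 \\
& \forall t \in T:\ c_t = \sum_{\omega \in \Omega:\ t \in A(\omega)} c_{\omega,t} \le 1 \\
& \forall \omega \in \Omega:\ \sum_{t \in A(\omega)} c_{\omega,t} \le 1 \\
& \forall t \in T:\ U_a(t, c) \le U_a(t^*, c)
\end{aligned}
$$

• Advantage: the LP size is logarithmic in the number of leader strategies
• Problem: do the probabilities correspond to a strategy?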
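The size gap can be illustrated with a quick count. This sketch assumes distinguishable resources, each able to cover any single target, with no two resources on the same target and every resource assigned; under those assumptions the number of deterministic assignments is $n!/(n-m)!$, while the compact LP has one variable per (resource, target) pair. The function names are hypothetical.

```python
from math import perm


def count_pure_strategies(num_resources, num_targets):
    """Injective assignments of resources to targets (assumption: size-1 schedules)."""
    return perm(num_targets, num_resources)


def count_lp_variables(num_resources, num_targets):
    """One marginal-probability variable per (resource, target) pair."""
    return num_resources * num_targets


for m, n in [(3, 10), (10, 50)]:
    print(m, n, count_pure_strategies(m, n), count_lp_variables(m, n))
```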

Example: marginal coverage probabilities (rows: resources, columns: targets) written as a mixture of deterministic assignments

$$
\begin{pmatrix} 0.7 & 0.2 & 0.1 \\ 0 & 0.3 & 0.7 \end{pmatrix}
= 0.1 \begin{pmatrix} 0 & 0 & 1 \\ 0 & 1 & 0 \end{pmatrix}
+ 0.2 \begin{pmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}
+ 0.2 \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \end{pmatrix}
+ 0.5 \begin{pmatrix} 1 & 0 & 0 \\ 0 & 0 & 1 \end{pmatrix}
$$

Fixing the probabilities
• Theorem [Birkhoff-von Neumann]: Consider an $m \times n$ matrix $M$ with real numbers $M_{ij} \in [0,1]$, such that for each $i$, $\sum_j M_{ij} \le 1$, and for each $j$, $\sum_i M_{ij} \le 1$ ($M$ is kinda doubly stochastic). Then there exist matrices $M^1, \dots, M^q$ and weights $w^1, \dots, w^q$ such that:
  1. $\sum_k w^k = 1$
  2. $\sum_k w^k M^k = M$
  3. For each $k$, $M^k$ is kinda doubly stochastic and its elements are in $\{0,1\}$
• The probabilities $c_{\omega,t}$ satisfy the theorem's conditions
• By 3, each $M^k$ is a deterministic strategy
• By 1, we get a mixed strategy
• By 2, it gives the right probabilities
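One way such a decomposition could be computed is sketched below. It is an illustration, not the construction from the lecture: the rectangular matrix is padded into a square doubly stochastic matrix (a standard trick), and then a perfect matching on the positive entries is repeatedly peeled off. The names `decompose` and `_perfect_matching` are hypothetical. Entries should be given as strings or `Fraction`s so that the arithmetic is exact.

```python
from fractions import Fraction


def _perfect_matching(support):
    """Kuhn's augmenting-path matching; support[i] lists allowed columns of row i."""
    size = len(support)
    match_col = [-1] * size  # match_col[j] = row currently matched to column j

    def try_row(i, seen):
        for j in support[i]:
            if j not in seen:
                seen.add(j)
                if match_col[j] == -1 or try_row(match_col[j], seen):
                    match_col[j] = i
                    return True
        return False

    for i in range(size):
        if not try_row(i, set()):
            raise ValueError("no perfect matching; check row and column sums")
    perm = [0] * size
    for j, i in enumerate(match_col):
        perm[i] = j
    return perm


def decompose(matrix):
    """Map each deterministic assignment (tuple: target or None per row) to its weight."""
    m, n = len(matrix), len(matrix[0])
    M = [[Fraction(x) for x in row] for row in matrix]
    size = m + n
    # Pad to a square doubly stochastic matrix: [[M, row slack], [col slack, M^T]].
    S = [[Fraction(0)] * size for _ in range(size)]
    for i in range(m):
        for j in range(n):
            S[i][j] = M[i][j]
            S[m + j][n + i] = M[i][j]
        S[i][n + i] = 1 - sum(M[i])
    for j in range(n):
        S[m + j][j] = 1 - sum(M[i][j] for i in range(m))
    if any(x < 0 for row in S for x in row):
        raise ValueError("entries must be >= 0 with row and column sums <= 1")
    mixture = {}
    while any(x > 0 for row in S for x in row):
        support = [[j for j in range(size) if S[i][j] > 0] for i in range(size)]
        perm = _perfect_matching(support)
        w = min(S[i][perm[i]] for i in range(size))
        for i in range(size):
            S[i][perm[i]] -= w
        # Keep only the original block: row i covers target perm[i], or nothing.
        assignment = tuple(perm[i] if perm[i] < n else None for i in range(m))
        mixture[assignment] = mixture.get(assignment, 0) + w
    return mixture


example = [["0.7", "0.2", "0.1"], ["0", "0.3", "0.7"]]
mixture = decompose(example)
for assignment, weight in mixture.items():
    print(assignment, weight)
```

The decomposition is generally not unique, so the assignments and weights printed here need not match any particular hand-worked example; what is guaranteed is that the weights sum to 1 and reproduce the marginals.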

Generalizing?
• What about schedules of size 2?
• Air Marshals domain has such schedules: outgoing+incoming flight (bipartite graph)
• Previous approach fails
• Theorem [Korzhyk et al. 2010]: problem is NP-hard

Criticisms
• Problematic assumptions:
  1. The attacker exactly observes the defender's mixed strategy
  2. The defender knows the attacker's utility function
  3. The attacker behaves in a perfectly rational way
• We will focus on relaxing assumption #1

Limited surveillance
• Let us compare two worlds:
  1. Status quo: The defender optimizes against an attacker with unlimited observations (i.e., complete knowledge of the defender's strategy), but the attacker actually has only a limited number of observations
  2. Ideal: The defender optimizes against an attacker with that limited number of observations, and, miraculously, the attacker indeed has exactly that many observations

Limited surveillance
• Theorem [Blum et al. 2014]: Assume that utilities are normalized to be in […]. For any […], there is a zero-sum security game such that the difference between worlds 1 and 2 is […]
• Lemma: If […], there exists a family $\mathcal{D} = \{D_1, \dots, D_r\}$ of subsets of a set $A$ such that, with $k$ the number of observations:
  1. $\forall i,\ |D_i| = |A|/2$
  2. Each $a \in A$ is in exactly $r/2$ members of $\mathcal{D}$
  3. If $\mathcal{D}' \subset \mathcal{D}$ and $|\mathcal{D}'| \le k$ then $\bigcup \mathcal{D}' \ne A$

Proof of theorem
• […] resources, each can defend any […] targets
• […] targets
• For any target, zero-sum utilities with […]
• Poll: The optimal strategy (in the status quo world) defends each target with probability roughly…?

Proof of theorem
• Next we define a much better strategy against an attacker with $k$ observations
• $A \subseteq T$: the subset of targets $\{1, \dots, […]\}$
• Define $\mathcal{D} = \{D_1, \dots, D_r\}$ as in the lemma
• Pure strategy $s_i$ covers $D_i$; this is valid because $|D_i| = |A|/2$ is the number of targets the resources can defend (by property 1)
• Let $s^*$ be the uniform distribution over $s_1, \dots, s_r$
• By property 2, $s^*$ covers each target in $A$ with probability ½
• By property 3, $k$ observations from $s^*$ would show some target in $A$ never being covered; that target is attacked ∎

Limited surveillance
• Theorem [Blum et al. 2014]: For any zero-sum security game with given numbers of targets and resources and a set of schedules with a given max coverage, and for any number of observations, the difference between the two worlds is at most […]
