Strategic Inference with a Single Private Sample



SLIDE 1

Strategic Inference with a Single Private Sample

Erik Miehling, Roy Dong, Cédric Langbort, and Tamer Başar

CDC 2019 December 11, 2019

SLIDE 2

General Setting

  • We are interested in settings of strategic interaction where one or more players have the opportunity to extract information from the environment before a game is played
  • Learning actions are observed, but the outcomes of the learning actions are not (they are private to the learner)
  • Players then play a game based on their subjective beliefs

[Figure: timeline of a learning phase followed by a game]

SLIDE 3

Reconnaissance in Cyber Security

  • Motivating (physical) security example…
  • Consider an attacker visiting multiple locations to determine where to launch an attack. The defender observes where the attacker visited, but does not know what information the attacker obtained
  • The attacker/defender then simultaneously choose which target to attack/defend
  • In the above setting, one player is acting while the other is observing. Since learning actions are observed, one must consider the effect of an agent’s learning decision on the belief of the other agent
  • Fundamental problem in all multi-agent decision environments (known as signaling in the team/game theory literature)

SLIDE 4

Learning in Multi-agent Settings

  • Increasing focus on the intersection of learning and game theory
  • Recent workshop at EC 2019, “Learning in the presence of strategic behavior”
      • Learning from data that is produced by agents who have a vested interest in the outcome or the learning process
      • Learning a model for the strategic behavior of one or more agents by observing their interactions
      • Learning as a model of interactions between agents
      • Interactions between multiple learners


SLIDE 5

Related Work

  • Strategic experimentation:
      • [Bolton & Harris, ’99; Rosenberg et al., ’07; Heidhues et al., ’15]
  • Incentivizing exploration:
      • [Mansour et al., ’16; Slivkins, ’17; Chen et al., ’18]
  • We study a simple setting in which the learning agent receives a single sample of private information from a distribution privately known by another (observing) agent
  • Objective of our work: to understand how the learning agent’s private information influences the observing agent’s inference process

SLIDE 6

The Game Model: Payoffs

  • Two players: attacker (A) and defender (D)
  • Each player has two actions, corresponding to targets
  • The attacker’s payoff for choosing the risky target is stochastic; the attacker receives a reward with probability […]
  • The attacker’s payoff for the safe target is certain; the attacker receives a reward […], where […]
  • The attacker wishes to choose a different target than the defender, whereas the defender wishes to choose the same target as the attacker
  • If the same target is chosen, the attacker incurs a capture cost

[Figure: attacker and defender]

SLIDE 7

The Game Model: Information

  • The defender knows the true value of the distribution parameter, but the attacker only possesses a prior distribution
  • The prior is further assumed to be common knowledge between the attacker and defender
  • Before targets are selected, the attacker receives a single private sample from the uncertain target and forms its posterior
  • The defender does not see this sample and thus does not know the attacker’s posterior

[Figure: attacker, defender, and the attacker’s private sample]
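The belief update on this slide can be sketched in a few lines. This is a minimal illustration under assumptions that are not in the slides: the sample `y` is taken to be a single Bernoulli draw (reward or no reward) with success probability `theta`, and the common prior is taken to be a Beta(a, b) distribution. The names `posterior_mean` and `sample_probability` are hypothetical.

```python
from fractions import Fraction

def posterior_mean(a: int, b: int, y: int) -> Fraction:
    """Attacker's posterior mean of theta under an ASSUMED Beta(a, b) prior
    after observing one ASSUMED Bernoulli(theta) sample y in {0, 1}."""
    if y not in (0, 1):
        raise ValueError("the sample y must be 0 or 1")
    # Conjugacy: Beta(a, b) prior + one Bernoulli sample -> Beta(a + y, b + 1 - y),
    # whose mean is (a + y) / (a + b + 1).
    return Fraction(a + y, a + b + 1)

def sample_probability(theta: Fraction, y: int) -> Fraction:
    """Defender's view: probability that the attacker saw y, given the true theta."""
    return theta if y == 1 else 1 - theta

# Uniform prior Beta(1, 1): a single sample moves the attacker's mean away from 1/2.
mean_after_good = posterior_mean(1, 1, 1)
mean_after_bad = posterior_mean(1, 1, 0)
```

The two functions mirror the information asymmetry: the attacker reasons about `theta` given `y`, while the defender, who knows `theta`, can only reason about which `y` the attacker is likely to have seen.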

SLIDE 8

The Game Model

  • In summary, the (subjective) payoffs for the attacker and costs for the defender are as follows:

[Equations: attacker’s reward; defender’s cost]

SLIDE 9

Best Response Functions

  • The game is a static game of incomplete information, and thus we seek to find Bayesian Nash equilibria
  • Players’ strategies, denoted by […] and […], represent the probability that the player will choose target […] given their type
  • The best response functions of the players are given by […], where […] is the type of the attacker (the private sample) and […] is the type of the defender (the distribution parameter)
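To make the notion of type-dependent best responses concrete, here is a toy sketch restricted to pure strategies. Everything numeric and structural in it is an assumption for illustration, not the model from the slides: a two-point prior over `theta`, a Bernoulli sample, an attacker payoff of `theta * r` (risky) or `s` (safe) minus a capture cost `c` when both players pick the same target, and a defender cost equal to the attacker's expected payoff (a zero-sum simplification).

```python
RISKY, SAFE = 0, 1

def posterior(thetas, prior, y):
    """Bayes update of a discrete prior over theta after one Bernoulli sample y."""
    joint = [p * (t if y == 1 else 1.0 - t) for p, t in zip(prior, thetas)]
    total = sum(joint)
    return [j / total for j in joint]

def attacker_payoff(a, d, theta, r, s, c):
    """ASSUMED payoff: expected reward of the chosen target, minus c if captured."""
    reward = theta * r if a == RISKY else s
    return reward - (c if a == d else 0.0)

def attacker_best_response(y, defender, thetas, prior, r, s, c):
    """Attacker's type is the sample y; `defender` maps theta -> target."""
    post = posterior(thetas, prior, y)
    def value(a):
        return sum(w * attacker_payoff(a, defender[t], t, r, s, c)
                   for w, t in zip(post, thetas))
    return RISKY if value(RISKY) >= value(SAFE) else SAFE

def defender_best_response(theta, attacker, r, s, c):
    """Defender's type is theta; `attacker` maps y -> target.
    ASSUMPTION: the defender minimizes the attacker's expected payoff."""
    def cost(d):
        return sum(p * attacker_payoff(attacker[y], d, theta, r, s, c)
                   for y, p in ((1, theta), (0, 1.0 - theta)))
    return RISKY if cost(RISKY) <= cost(SAFE) else SAFE

# Hypothetical instance: two equally likely parameter values.
THETAS, PRIOR = [0.2, 0.8], [0.5, 0.5]
R, S, C = 10.0, 4.0, 1.0
follow_sample = {1: RISKY, 0: SAFE}      # attacker follows its sample
defend_better = {0.2: SAFE, 0.8: RISKY}  # defender covers the better target

attacker_reply = {y: attacker_best_response(y, defend_better, THETAS, PRIOR, R, S, C)
                  for y in (0, 1)}
defender_reply = {t: defender_best_response(t, follow_sample, R, S, C)
                  for t in THETAS}
```

In this instance each strategy is a best response to the other, so the pair forms a pure-strategy Bayesian Nash equilibrium of the toy game. Different payoff assumptions would change the replies.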

SLIDE 10

Best Response Function: Attacker

[Equations: derivation of the attacker’s best response function]

SLIDE 11

Best Response Function: Defender

[Equations: derivation of the defender’s best response function]

SLIDE 12

Equilibrium characterization

Theorem: The game has at most one pure strategy saddle-point equilibrium, characterized by the following three disjoint regions:

  1) […] for all […] if […]
  2) […] for all […] if […]
  3) […], […], and […] if […]
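A brute-force check can illustrate what "at most one pure strategy equilibrium" means on a small instance. The sketch below does not reproduce the theorem's conditions. It enumerates every pure strategy profile of a toy game built on assumptions that are not from the slides: a two-point prior over `theta`, a Bernoulli sample, attacker payoff `theta * r` or `s` minus capture cost `c` on a match, and a defender who minimizes the attacker's expected payoff.

```python
from itertools import product

RISKY, SAFE = 0, 1
TARGETS = (RISKY, SAFE)

def attacker_payoff(a, d, theta, r, s, c):
    """ASSUMED payoff: expected reward of the chosen target, minus c if captured."""
    reward = theta * r if a == RISKY else s
    return reward - (c if a == d else 0.0)

def posterior(thetas, prior, y):
    """Bayes update of a discrete prior over theta after one Bernoulli sample y."""
    joint = [p * (t if y == 1 else 1.0 - t) for p, t in zip(prior, thetas)]
    total = sum(joint)
    return [j / total for j in joint]

def pure_equilibria(thetas, prior, r, s, c, tol=1e-9):
    """Return all (attacker, defender) pure profiles that are mutual best responses.
    attacker[y] is the target chosen after sample y; defender[i] is the target
    chosen when the true parameter is thetas[i]."""
    found = []
    for a_strat in product(TARGETS, repeat=2):
        for d_strat in product(TARGETS, repeat=len(thetas)):
            def a_value(y, a):
                post = posterior(thetas, prior, y)
                return sum(w * attacker_payoff(a, d, t, r, s, c)
                           for w, t, d in zip(post, thetas, d_strat))
            def d_cost(i, d):
                t = thetas[i]
                return sum(p * attacker_payoff(a_strat[y], d, t, r, s, c)
                           for y, p in ((0, 1.0 - t), (1, t)))
            # No type of either player can strictly gain by switching targets.
            a_ok = all(a_value(y, a_strat[y]) >= a_value(y, alt) - tol
                       for y in (0, 1) for alt in TARGETS)
            d_ok = all(d_cost(i, d_strat[i]) <= d_cost(i, alt) + tol
                       for i in range(len(thetas)) for alt in TARGETS)
            if a_ok and d_ok:
                found.append((a_strat, d_strat))
    return found

# Hypothetical instance: theta is 0.2 or 0.8 with equal probability.
low_cost = pure_equilibria([0.2, 0.8], [0.5, 0.5], r=10.0, s=4.0, c=1.0)
high_cost = pure_equilibria([0.2, 0.8], [0.5, 0.5], r=10.0, s=4.0, c=5.0)
```

With `c = 1.0` the search returns a single profile in which the attacker follows its sample and the defender covers the risky target only when `theta` is high. With `c = 5.0` the list is empty, so in this toy instance no pure strategy equilibrium exists.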

SLIDE 13

Equilibrium discussion: Region 1

  • Throughout the discussion, we assume that […], specifically let […]
  • Region 1: Always risky
  • Both players choose the risky target independent of their private information, that is, […] for all […] if […]

SLIDE 14

Equilibrium discussion: Region 2

  • Region 2: Always safe
  • Both players choose the safe target independent of their private information, that is, […] for all […] if […]

SLIDE 15

Equilibrium discussion: Region 3

  • Region 3: Information dependent
  • The equilibrium strategies of the players depend on their private information
  • Attacker: […], […] (sample dependent)
  • Defender: […] (distribution dependent)


SLIDE 16

Equilibrium discussion

  • First investigate the conditions for […] and […], which are given by the following condition: […]
  • […] is high relative to […], so target […] looks sufficiently desirable
  • The attacker believes it to be very likely that […] (due to a higher […]) and therefore would get caught should it choose […]. The capture cost deters the attacker from choosing […].
  • (i), (ii); (iii) & (iv) are analogous

SLIDE 17

Equilibrium discussion

  • The third equilibrium region is formed as the intersection of two regions
  • The attacker chooses the risky target if it receives a good sample, […]; the defender defends the better target

SLIDE 18

Equilibrium discussion

  • Taking the intersection…

SLIDE 19

Sensitivity Analysis

  • We investigate the equilibrium regions as the capture cost increases
  • As the capture cost increases, the attacker requires a higher level of certainty that the risky target will generate a reward
  • All pure strategy equilibrium regions vanish when […]


SLIDE 20

Concluding Remarks

  • Motivated by cyber security settings, we have introduced a simple asymmetric information game model for describing the influence of the learner’s (the attacker’s) private information on the inference process of an observing agent (the defender)
  • The subsequent game admits at most one pure strategy equilibrium which, depending on the parameters of the game, takes different forms:
      • Two equilibrium regions in which the players ignore their private information; an intermediate region in which the players use their private information
      • In the intermediate region, the attacker always follows its private sample

Thank you!