Announcements HW 3 postponed to this Thursday Project proposal due - - PowerPoint PPT Presentation



slide-1
SLIDE 1

1

Announcements

Ø HW 3 postponed to this Thursday
Ø Project proposal due this Thursday as well

slide-2
SLIDE 2

CS6501: Topics in Learning and Game Theory (Fall 2019)

Bayesian Persuasion

Instructor: Haifeng Xu

slide-3
SLIDE 3

3

Ø Prediction markets and peer prediction study how to elicit information from others

Ø This lecture: when you have information, how to exploit it?

  • Relevant to mechanism design
slide-4
SLIDE 4

4

Outline

Ø Introduction and Bayesian Persuasion
Ø Algorithms for Bayesian Persuasion

slide-5
SLIDE 5

5

Two Primary Ways to Influence Agents’ Behaviors

Ø Design/provide incentives

  • Auctions

You only pay the second highest bid!

slide-6
SLIDE 6

6

Two Primary Ways to Influence Agents’ Behaviors

Ø Design/provide incentives

  • Auctions
  • Discounts/coupons
slide-7
SLIDE 7

7

Two Primary Ways to Influence Agents’ Behaviors

Ø Design/provide incentives

  • Auctions
  • Discounts/coupons
  • Job contract design

Bonus depends on performance, and is up to $1M!

slide-8
SLIDE 8

8

Two Primary Ways to Influence Agents’ Behaviors

Ø Design/provide incentives

  • Auctions
  • Discounts/coupons
  • Job contract design

Design/provide incentives: Mechanism Design

slide-9
SLIDE 9

9

Two Primary Ways to Influence Agents’ Behaviors

Ø Design/provide incentives

  • Auctions
  • Discounts/coupons
  • Job contract design

Ø Influence agents’ beliefs

  • Deception in wars/battles

“All warfare is based on deception. Hence, when we are able to attack, we must seem unable; when using our forces, we must appear inactive…”
- Sun Tzu, The Art of War

Design/provide incentives: Mechanism Design

slide-10
SLIDE 10

10

Two Primary Ways to Influence Agents’ Behaviors

Ø Design/provide incentives

  • Auctions
  • Discounts/coupons
  • Job contract design

Ø Influence agents’ beliefs

  • Deception in wars/battles
  • Strategic information disclosure

Strategic inventory information disclosure

Design/provide incentives: Mechanism Design

slide-12
SLIDE 12

12

Two Primary Ways to Influence Agents’ Behaviors

Ø Design/provide incentives

  • Auctions
  • Discounts/coupons
  • Job contract design

Ø Influence agents’ beliefs

  • Deception in wars/battles
  • Strategic information disclosure
  • News articles, advertising, tweets, etc.

Design/provide incentives: Mechanism Design

slide-13
SLIDE 13

13

Two Primary Ways to Influence Agents’ Behaviors

Ø Design/provide incentives

  • Auctions
  • Discounts/coupons
  • Job contract design

Ø Influence agents’ beliefs

  • Deception in wars/battles
  • Strategic information disclosure
  • News articles, advertising, tweets, …
  • In fact, most information you see is there for a goal

Design/provide incentives: Mechanism Design
Influence agents’ beliefs: Persuasion

slide-14
SLIDE 14

14

Persuasion is the act of exploiting an informational advantage in order to influence the decisions of others

Ø Intrinsic in human activities: advertising, negotiation, politics, security, marketing, financial regulation, …
Ø A large body of research

–– The American Economic Review Vol. 85, No. 2, 1995.

slide-15
SLIDE 15

15

Example: Recommendation Letters

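Ø Advisor vs. recruiter
Ø 1/3 of the advisor’s students are excellent; 2/3 are average
Ø A fresh graduate is randomly drawn from this population
Ø Recruiter

  • Utility $1 + \epsilon$ for hiring an excellent student; $-1$ for an average student
  • Utility 0 for not hiring
  • A priori, only knows the advisor’s student population

Without further information, the recruiter does not hire: $(1 + \epsilon) \times 1/3 - 1 \times 2/3 < 0$, where the left side is the expected utility of hiring and 0 is the utility of not hiring (the inequality holds for $\epsilon < 1$).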

slide-16
SLIDE 16

16

Example: Recommendation Letters

Ø Advisor vs. recruiter
Ø 1/3 of the advisor’s students are excellent; 2/3 are average
Ø A fresh graduate is randomly drawn from this population
Ø Recruiter

  • Utility $1 + \epsilon$ for hiring an excellent student; $-1$ for an average student
  • Utility 0 for not hiring
  • A priori, only knows the advisor’s student population

Ø Advisor

  • Utility 1 if the student is hired, 0 otherwise
  • Knows whether the student is excellent or not

slide-17
SLIDE 17

17

Example: Recommendation Letters

What is the advisor’s optimal “recommendation strategy”?

Ø Attempt 1: always say “excellent” (equivalently, no information)

  • Recruiter ignores the recommendation
  • Advisor expected utility 0

Remark. Advisor commitment: the advisor cannot deviate from the announced strategy, and the recruiter knows this strategy.

slide-18
SLIDE 18

18

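Example: Recommendation Letters

What is the advisor’s optimal “recommendation strategy”?

Ø Attempt 2: honest recommendation (i.e., full information)

  • Recruiter hires exactly when the recommendation is “excellent”, which happens with probability 1/3
  • Advisor expected utility 1/3

[Diagram: an excellent student (prob. 1/3) always gets the recommendation “excellent”; an average student (prob. 2/3) always gets “average”.]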

slide-19
SLIDE 19

19

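Example: Recommendation Letters

What is the advisor’s optimal “recommendation strategy”?

Ø Attempt 3: noisy information → advisor expected utility 2/3

  • An excellent student (prob. 1/3) always gets the recommendation “excellent”
  • An average student (prob. 2/3) gets “excellent” with probability 1/2 and “average” otherwise
  • So “excellent” is sent with total probability 1/3 + 1/3 = 2/3, and $P(\text{excellent} \mid \text{“excellent”}) = 1/2$
  • On seeing “excellent”, the recruiter hires: $(1 + \epsilon - 1)/2 > 0$, i.e., the expected utility of hiring exceeds the utility 0 of not hiring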
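The three attempts can be checked with a few lines of exact arithmetic. The sketch below is only an illustration: the value `EPS = 1/10`, the signal names `"E"`/`"A"`, and the helper `advisor_utility` are assumptions made here, and ties are broken in favor of hiring.

```python
from fractions import Fraction as F

# Prior over the student's type, taken from the example.
PRIOR = {"excellent": F(1, 3), "average": F(2, 3)}
EPS = F(1, 10)  # assumed value of epsilon; the slides leave it unspecified

# Recruiter's utility from hiring each type (not hiring always gives 0).
HIRE_UTILITY = {"excellent": 1 + EPS, "average": F(-1)}


def advisor_utility(scheme):
    """Probability that the student is hired under `scheme`.

    scheme[state][signal] is the probability of sending `signal`
    when the student's type is `state`.
    """
    signals = {s for row in scheme.values() for s in row}
    hired = F(0)
    for signal in signals:
        # Joint probability of (state, signal) for each state.
        joint = {st: PRIOR[st] * scheme[st].get(signal, F(0)) for st in PRIOR}
        # Recruiter hires iff hiring is at least as good as not hiring.
        gain = sum(joint[st] * HIRE_UTILITY[st] for st in PRIOR)
        if sum(joint.values()) > 0 and gain >= 0:
            hired += sum(joint.values())
    return hired


always_excellent = {"excellent": {"E": F(1)}, "average": {"E": F(1)}}
honest = {"excellent": {"E": F(1)}, "average": {"A": F(1)}}
noisy = {"excellent": {"E": F(1)}, "average": {"E": F(1, 2), "A": F(1, 2)}}

print(advisor_utility(always_excellent))  # 0
print(advisor_utility(honest))            # 1/3
print(advisor_utility(noisy))             # 2/3
```

Using `Fraction` keeps the comparison with 0 exact, which matters here because the recruiter is nearly indifferent under the noisy scheme.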

slide-20
SLIDE 20

20

Model of Bayesian Persuasion

Ø Two players: persuader (Sender, she), decision maker (Receiver, he)

  • Previous example: advisor = sender, recruiter = receiver

Ø Receiver looks to take an action $i \in [n] = \{1, 2, \ldots, n\}$

  • Receiver utility $r(i, \theta)$
  • Sender utility $s(i, \theta)$
  • $\theta \in \Theta$ is a random state of nature

Ø Both players know $\theta \sim$ prior dist. $\mu$, but Sender has an informational advantage: she can observe the realization of $\theta$
Ø Sender wants to strategically reveal info about $\theta$ to “persuade” Receiver to take an action she likes

  • Concealing or revealing all info is not necessarily the best

Well… how to reveal partial information?

slide-21
SLIDE 21

21

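Revealing Information via Signaling

Definition: A signaling scheme is a mapping $\pi: \Theta \to \Delta_\Sigma$, where $\Sigma$ is the set of all possible signals. $\pi$ is fully described by $\{\pi(\sigma, \theta)\}_{\theta \in \Theta, \sigma \in \Sigma}$, where $\pi(\sigma, \theta)$ = prob. of sending $\sigma$ when observing $\theta$ (so $\sum_{\sigma \in \Sigma} \pi(\sigma, \theta) = 1$ for any $\theta$).

Note: scheme $\pi$ is always assumed public knowledge, thus known by Receiver

Example

Ø $\Theta = \{\text{Excellent}, \text{Average}\}$, $\Sigma = \{A, B\}$
Ø $\pi(A, \text{Average}) = 1/2$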

slide-22
SLIDE 22

22

Revealing Information via Signaling

What can Receiver infer about $\theta$ after receiving $\sigma$? Bayes updating:

$$\Pr(\theta \mid \sigma) = \frac{\pi(\sigma, \theta) \cdot \mu(\theta)}{\sum_{\theta'} \pi(\sigma, \theta') \cdot \mu(\theta')}$$

In the example: $\Pr(\text{Excellent} \mid A) = 1/2$
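The Bayes update is easy to reproduce numerically. The following sketch assumes a dictionary layout `pi[theta][sigma]` and a helper named `posterior`, both invented here for illustration. It also assumes $\pi(A, \text{Excellent}) = 1$, which is what the stated posterior of 1/2 implies together with $\pi(A, \text{Average}) = 1/2$.

```python
from fractions import Fraction as F


def posterior(pi, mu, sigma):
    """Return Pr(theta | sigma) for every state theta.

    pi[theta][sigma] : probability of sending sigma in state theta
    mu[theta]        : prior probability of state theta
    """
    joint = {theta: pi[theta].get(sigma, F(0)) * mu[theta] for theta in mu}
    pr_sigma = sum(joint.values())  # Pr(sigma), the normalising constant
    if pr_sigma == 0:
        raise ValueError(f"signal {sigma!r} is never sent")
    return {theta: joint[theta] / pr_sigma for theta in mu}


mu = {"Excellent": F(1, 3), "Average": F(2, 3)}
pi = {
    "Excellent": {"A": F(1)},
    "Average": {"A": F(1, 2), "B": F(1, 2)},
}

post_A = posterior(pi, mu, "A")
post_B = posterior(pi, mu, "B")
print(post_A["Excellent"])  # 1/2
print(post_B["Excellent"])  # 0
```

Signal `B` is only ever sent for an average student, so it reveals the state fully, while signal `A` leaves Receiver with a 50/50 posterior.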

slide-23
SLIDE 23

23

Revealing Information via Signaling

Would such noisy information benefit Receiver?

Ø Expected Receiver utility conditioned on $\sigma$:

$$R(\sigma) = \max_{i \in [n]} \left[ \sum_{\theta \in \Theta} r(i, \theta) \cdot \frac{\pi(\sigma, \theta) \cdot \mu(\theta)}{\sum_{\theta'} \pi(\sigma, \theta') \cdot \mu(\theta')} \right]$$

Ø $\Pr(\sigma) = \sum_{\theta'} \pi(\sigma, \theta') \cdot \mu(\theta')$

slide-25
SLIDE 25

25

Revealing Information via Signaling

Would such noisy information benefit Receiver?

Ø Expected Receiver utility conditioned on $\sigma$:

$$R(\sigma) = \max_{i \in [n]} \left[ \sum_{\theta \in \Theta} r(i, \theta) \cdot \frac{\pi(\sigma, \theta) \cdot \mu(\theta)}{\sum_{\theta'} \pi(\sigma, \theta') \cdot \mu(\theta')} \right]$$

Ø $\Pr(\sigma) = \sum_{\theta'} \pi(\sigma, \theta') \cdot \mu(\theta')$

  • $\Pr(\sigma) \cdot R(\sigma) = \max_{i} \sum_{\theta \in \Theta} r(i, \theta) \cdot \pi(\sigma, \theta) \cdot \mu(\theta)$

Ø Expected Receiver utility under $\pi$: $\sum_{\sigma} \max_{i} \sum_{\theta \in \Theta} r(i, \theta) \cdot \pi(\sigma, \theta) \cdot \mu(\theta)$

slide-26
SLIDE 26

26

Revealing Information via Signaling

Fact. Receiver’s expected utility (weakly) increases under any signaling scheme $\pi$.

slide-27
SLIDE 27

27

Revealing Information via Signaling

Fact. Receiver’s expected utility (weakly) increases under any signaling scheme $\pi$.

Proof:

Ø Expected Receiver utility without information: $\max_{i} \sum_{\theta \in \Theta} r(i, \theta) \cdot \mu(\theta)$
Ø Expected Receiver utility under $\pi$: $\sum_{\sigma} \max_{i} \sum_{\theta \in \Theta} r(i, \theta) \cdot \pi(\sigma, \theta) \cdot \mu(\theta)$

slide-28
SLIDE 28

28

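Revealing Information via Signaling

Fact. Receiver’s expected utility (weakly) increases under any signaling scheme $\pi$.

Proof:

Ø Expected Receiver utility without information: $\max_{i} \sum_{\theta \in \Theta} r(i, \theta) \cdot \mu(\theta)$
Ø Expected Receiver utility under $\pi$: $\sum_{\sigma} \max_{i} \sum_{\theta \in \Theta} r(i, \theta) \cdot \pi(\sigma, \theta) \cdot \mu(\theta)$
Ø Let $i^* = \arg\max_{i} \sum_{\theta \in \Theta} r(i, \theta) \cdot \mu(\theta)$. We have

$$
\begin{aligned}
\sum_{\sigma} \max_{i} \sum_{\theta \in \Theta} r(i, \theta) \cdot \pi(\sigma, \theta) \cdot \mu(\theta)
&\ge \sum_{\sigma} \sum_{\theta \in \Theta} r(i^*, \theta) \cdot \pi(\sigma, \theta) \cdot \mu(\theta) \\
&= \sum_{\theta \in \Theta} r(i^*, \theta) \cdot \Big[ \sum_{\sigma} \pi(\sigma, \theta) \Big] \cdot \mu(\theta) \\
&= \sum_{\theta \in \Theta} r(i^*, \theta) \cdot \mu(\theta)
\end{aligned}
$$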

slide-30
SLIDE 30

30

Revealing Information via Signaling

Fact. Receiver’s expected utility (weakly) increases under any signaling scheme $\pi$.

Remarks:

Ø A signaling scheme does (weakly) increase Receiver’s utility
Ø More (even noisy) information always helps a decision maker (DM)

  • Not true if there are multiple DMs (will see examples later)
slide-31
SLIDE 31

31

Revealing Information via Signaling

Fact. Receiver’s expected utility (weakly) increases under any signaling scheme $\pi$.

Remarks:

Ø A signaling scheme does (weakly) increase Receiver’s utility
Ø More (even noisy) information always helps a decision maker (DM)

  • Not true if there are multiple DMs (will see examples later)

Corollary. Receiver’s expected utility is maximized when Sender reveals full info, i.e., directly reveals the realized $\theta$.

Because any other noisy scheme $\pi$ can be improved by further revealing $\theta$ itself

slide-32
SLIDE 32

32

Revealing Information via Signaling

Fact. Receiver’s expected utility (weakly) increases under any signaling scheme $\pi$.

Remarks:

Ø A signaling scheme does (weakly) increase Receiver’s utility
Ø More (even noisy) information always helps a decision maker (DM)

  • Not true if there are multiple DMs (will see examples later)

Corollary. Receiver’s expected utility is maximized when Sender reveals full info, i.e., directly reveals the realized $\theta$.

But this is not Sender’s goal…

Sender Objective: carefully pick $\pi$ to maximize her expected utility
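Both the Fact and the Corollary can be sanity-checked on the recommendation-letter example by evaluating Receiver’s expected utility under $\pi$ for three schemes. This is a sketch under assumptions: `EPS = 1/10`, the signal names, and the helper `receiver_value` are made up here for illustration.

```python
from fractions import Fraction as F

EPS = F(1, 10)  # assumed value of epsilon
mu = {"excellent": F(1, 3), "average": F(2, 3)}
# r[action][state]: Receiver's utility
r = {
    "hire": {"excellent": 1 + EPS, "average": F(-1)},
    "pass": {"excellent": F(0), "average": F(0)},
}


def receiver_value(pi):
    """sum_sigma max_i sum_theta r(i, theta) * pi(sigma, theta) * mu(theta)."""
    signals = {s for row in pi.values() for s in row}
    total = F(0)
    for sigma in signals:
        total += max(
            sum(r[i][th] * pi[th].get(sigma, F(0)) * mu[th] for th in mu)
            for i in r
        )
    return total


# pi[state][signal] = probability of sending `signal` in `state`
no_info = {"excellent": {"x": F(1)}, "average": {"x": F(1)}}
noisy = {"excellent": {"A": F(1)}, "average": {"A": F(1, 2), "B": F(1, 2)}}
full = {"excellent": {"e": F(1)}, "average": {"a": F(1)}}

print(receiver_value(no_info))  # 0
print(receiver_value(noisy))    # 1/30
print(receiver_value(full))     # 11/30
```

With this choice of $\epsilon$ the values are $0 \le \epsilon/3 \le (1+\epsilon)/3$: the noisy scheme helps Receiver only slightly, while full revelation helps the most.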

slide-33
SLIDE 33

33

Outline

Ø Introduction and Bayesian Persuasion
Ø Algorithms for Bayesian Persuasion

slide-34
SLIDE 34

34

Q: What worries you the most when designing $\pi = \{\pi(\sigma, \theta)\}_{\theta \in \Theta, \sigma \in \Sigma}$?

Ø Don’t know what the set of all possible signals $\Sigma$ is...
Ø Like in mechanism design, too many signals to consider in this world

  • Again, you can use “looking 45° up to the sky” as a signal

Ø Key observation: a signal is mathematically nothing but a posterior distribution over $\Theta$

  • Recall the Bayes update: $\Pr(\theta \mid \sigma) = \frac{\pi(\sigma, \theta) \cdot \mu(\theta)}{\sum_{\theta'} \pi(\sigma, \theta') \cdot \mu(\theta')}$

Ø It turns out that $n$ signals suffice

slide-35
SLIDE 35

35

Revelation Principle

Fact. There always exists an optimal signaling scheme that uses at most $n$ (= # Receiver actions) signals, where signal $\sigma_i$ induces optimal Receiver action $i$.

slide-36
SLIDE 36

36

Revelation Principle

Fact. There always exists an optimal signaling scheme that uses at most $n$ (= # Receiver actions) signals, where signal $\sigma_i$ induces optimal Receiver action $i$.

Ø Conditioned on any signal $\sigma$

  • Receiver infers $\Pr(\theta \mid \sigma) = \frac{\pi(\sigma, \theta) \cdot \mu(\theta)}{\sum_{\theta'} \pi(\sigma, \theta') \cdot \mu(\theta')}$
  • Receiver takes optimal action $i^* = \arg\max_{i \in [n]} \sum_{\theta} \Pr(\theta \mid \sigma) \, r(i, \theta)$

slide-37
SLIDE 37

37

Revelation Principle

Fact. There always exists an optimal signaling scheme that uses at most $n$ (= # Receiver actions) signals, where signal $\sigma_i$ induces optimal Receiver action $i$.

Ø Conditioned on any signal $\sigma$

  • Receiver infers $\Pr(\theta \mid \sigma) = \frac{\pi(\sigma, \theta) \cdot \mu(\theta)}{\sum_{\theta'} \pi(\sigma, \theta') \cdot \mu(\theta')}$
  • Receiver takes optimal action $i^* = \arg\max_{i \in [n]} \sum_{\theta} \Pr(\theta \mid \sigma) \, r(i, \theta)$

Ø Now, if signals $\sigma$ and $\sigma'$ result in the same optimal action $i^*$, Sender can instead send a new signal $\sigma_{i^*} = (\sigma, \sigma')$ in both cases

slide-38
SLIDE 38

38

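Revelation Principle

Fact. There always exists an optimal signaling scheme that uses at most $n$ (= # Receiver actions) signals, where signal $\sigma_i$ induces optimal Receiver action $i$.

Ø Conditioned on any signal $\sigma$

  • Receiver infers $\Pr(\theta \mid \sigma) = \frac{\pi(\sigma, \theta) \cdot \mu(\theta)}{\sum_{\theta'} \pi(\sigma, \theta') \cdot \mu(\theta')}$
  • Receiver takes optimal action $i^* = \arg\max_{i \in [n]} \sum_{\theta} \Pr(\theta \mid \sigma) \, r(i, \theta)$

Ø Now, if signals $\sigma$ and $\sigma'$ result in the same optimal action $i^*$, Sender can instead send a new signal $\sigma_{i^*} = (\sigma, \sigma')$ in both cases

  • Claim: $i^*$ is still the optimal action conditioned on $\sigma_{i^*}$

$\Pr(\theta \mid \sigma_{i^*})$ is a convex combination of $\Pr(\theta \mid \sigma)$ and $\Pr(\theta \mid \sigma')$, with weight $p = \Pr(\sigma) / (\Pr(\sigma) + \Pr(\sigma'))$ on the former:

$$
\begin{aligned}
& \sum_{\theta} \Pr(\theta \mid \sigma) \, r(i^*, \theta) \ge \sum_{\theta} \Pr(\theta \mid \sigma) \, r(i, \theta), \quad \forall i \\
& \sum_{\theta} \Pr(\theta \mid \sigma') \, r(i^*, \theta) \ge \sum_{\theta} \Pr(\theta \mid \sigma') \, r(i, \theta), \quad \forall i \\
\Rightarrow\ & \sum_{\theta} \big[ \Pr(\theta \mid \sigma) \, p + \Pr(\theta \mid \sigma') (1 - p) \big] \, r(i^*, \theta) \ge \sum_{\theta} \big[ \Pr(\theta \mid \sigma) \, p + \Pr(\theta \mid \sigma') (1 - p) \big] \, r(i, \theta), \quad \forall i
\end{aligned}
$$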

slide-39
SLIDE 39

39

Revelation Principle

Fact. There always exists an optimal signaling scheme that uses at most $n$ (= # Receiver actions) signals, where signal $\sigma_i$ induces optimal Receiver action $i$.

Ø Conditioned on any signal $\sigma$

  • Receiver infers $\Pr(\theta \mid \sigma) = \frac{\pi(\sigma, \theta) \cdot \mu(\theta)}{\sum_{\theta'} \pi(\sigma, \theta') \cdot \mu(\theta')}$
  • Receiver takes optimal action $i^* = \arg\max_{i \in [n]} \sum_{\theta} \Pr(\theta \mid \sigma) \, r(i, \theta)$

Ø Now, if signals $\sigma$ and $\sigma'$ result in the same optimal action $i^*$, Sender can instead send a new signal $\sigma_{i^*} = (\sigma, \sigma')$ in both cases

  • Claim: $i^*$ is still the optimal action conditioned on $\sigma_{i^*}$
  • Both players’ utilities do not change, since Receiver still takes $i^*$, as Sender wanted

Ø Can merge all signals with optimal Receiver action $i^*$ into a single signal $\sigma_{i^*}$
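The merging step can be illustrated on a small instance. In the sketch below, the three-signal scheme, the value `EPS = 1/10`, and the helpers `best_action` and `merge_by_action` are all hypothetical; the code pools every signal into the action it induces and then confirms that the pooled signal still induces that action.

```python
from fractions import Fraction as F

EPS = F(1, 10)  # assumed value of epsilon
mu = {"excellent": F(1, 3), "average": F(2, 3)}
# r[action][state]; "hire" is listed first so ties go Sender's way.
r = {
    "hire": {"excellent": 1 + EPS, "average": F(-1)},
    "pass": {"excellent": F(0), "average": F(0)},
}


def best_action(weights):
    """Receiver's best response to (unnormalised) posterior weights."""
    return max(r, key=lambda i: sum(r[i][th] * w for th, w in weights.items()))


def merge_by_action(pi):
    """Replace each signal by the action it induces, pooling probabilities."""
    signals = {s for row in pi.values() for s in row}
    direct = {th: {} for th in mu}
    for sigma in signals:
        weights = {th: pi[th].get(sigma, F(0)) * mu[th] for th in mu}
        i = best_action(weights)
        for th in mu:
            direct[th][i] = direct[th].get(i, F(0)) + pi[th].get(sigma, F(0))
    return direct


# Hypothetical three-signal scheme: "strong" and "good" both induce hiring.
pi = {
    "excellent": {"strong": F(1, 2), "good": F(1, 2)},
    "average": {"strong": F(1, 4), "good": F(1, 4), "weak": F(1, 2)},
}
direct = merge_by_action(pi)
print(direct["average"]["hire"])  # 1/2

# The merged "hire" signal still makes hiring a best response.
merged = {th: direct[th]["hire"] * mu[th] for th in mu}
print(best_action(merged))  # hire
```

The merged scheme coincides with the noisy scheme of Attempt 3, so the probability of hiring stays at 2/3 while only two signals are used.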

slide-40
SLIDE 40

40

Revelation Principle

Fact. There always exists an optimal signaling scheme that uses at most $n$ (= # Receiver actions) signals, where signal $\sigma_i$ induces optimal Receiver action $i$.

Ø Each $\sigma_i$ can be viewed as an action recommendation of $i$
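Because signals can be taken to be action recommendations, the recommendation-letter example reduces to choosing two numbers: the probability of recommending “hire” for an excellent student and for an average one. The brute-force grid search below is only a sketch and not the method used in the lecture; `EPS = 1/10` and the grid step `1/20` are assumed values, and a recommendation counts as valid only if Receiver weakly prefers to follow it.

```python
from fractions import Fraction as F
from itertools import product

EPS = F(1, 10)   # assumed value of epsilon
STEP = F(1, 20)  # assumed grid resolution
mu_exc, mu_avg = F(1, 3), F(2, 3)
grid = [k * STEP for k in range(21)]

best = None
for a, b in product(grid, repeat=2):
    # a = Pr(recommend hire | excellent), b = Pr(recommend hire | average)
    # "hire" must be followed: hiring weakly beats passing given "hire".
    hire_ok = a * mu_exc * (1 + EPS) - b * mu_avg >= 0
    # "pass" must be followed: hiring does not beat passing given "pass".
    pass_ok = (1 - a) * mu_exc * (1 + EPS) - (1 - b) * mu_avg <= 0
    if hire_ok and pass_ok:
        value = a * mu_exc + b * mu_avg  # advisor's utility = Pr(hired)
        if best is None or value > best[0]:
            best = (value, a, b)

value, a, b = best
print(value, a, b)  # 7/10 1 11/20
```

For $\epsilon = 1/10$ the search returns $b = 11/20 = (1+\epsilon)/2$, slightly above the 1/2 used in Attempt 3, with advisor utility $(2+\epsilon)/3$; as $\epsilon \to 0$ this approaches the 2/3 obtained there.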

slide-41
SLIDE 41

41

Optimal Persuasion via Linear Program

Ø Input: prior $\mu$, sender payoff $s(i, \theta)$, receiver payoff $r(i, \theta)$
Ø Variables: $\pi(\sigma_i, \theta)$

Objective: Sender expected utility (we know Receiver will take $i$ at signal $\sigma_i$)

slide-42
SLIDE 42

42

Optimal Persuasion via Linear Program

Ø Input: prior $\mu$, sender payoff $s(i, \theta)$, receiver payoff $r(i, \theta)$
Ø Variables: $\pi(\sigma_i, \theta)$

Constraint: $\sigma_i$ indeed incentivizes Receiver best action $i$

slide-43
SLIDE 43

43

Optimal Persuasion via Linear Program

Ø Input: prior $\mu$, sender payoff $s(i, \theta)$, receiver payoff $r(i, \theta)$
Ø Variables: $\pi(\sigma_i, \theta)$

Constraint: $\pi$ is a valid signaling scheme
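The linear program itself did not survive in this transcript; only its three annotations did. A formulation consistent with those annotations and with the notation above would look as follows. This is a reconstruction under that assumption, not a copy of the original slide.

```latex
\begin{aligned}
\max_{\pi} \quad
  & \sum_{i=1}^{n} \sum_{\theta \in \Theta}
    s(i, \theta) \, \pi(\sigma_i, \theta) \, \mu(\theta)
  && \text{(Sender expected utility)} \\
\text{s.t.} \quad
  & \sum_{\theta \in \Theta} r(i, \theta) \, \pi(\sigma_i, \theta) \, \mu(\theta)
    \ \ge\ \sum_{\theta \in \Theta} r(j, \theta) \, \pi(\sigma_i, \theta) \, \mu(\theta)
  && \forall\, i, j \in [n] \\
  & \sum_{i=1}^{n} \pi(\sigma_i, \theta) = 1
  && \forall\, \theta \in \Theta \\
  & \pi(\sigma_i, \theta) \ge 0
  && \forall\, i \in [n],\ \theta \in \Theta
\end{aligned}
```

The first constraint family says that action $i$ is a best response to signal $\sigma_i$ (written with the unnormalised weights $\pi(\sigma_i, \theta)\mu(\theta)$, so it stays linear in $\pi$); the last two say that $\pi$ is a valid signaling scheme. Objective and constraints are all linear in the variables $\pi(\sigma_i, \theta)$.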

slide-44
SLIDE 44

Thank You

Haifeng Xu

University of Virginia hx4ad@virginia.edu