1
Announcements
➢ HW 3 postponed to this Thursday
➢ Project proposal due this Thursday as well
CS6501: Topics in Learning and Game Theory
(Fall 2019)
Bayesian Persuasion
Instructor: Haifeng Xu
3
➢ Prediction markets and peer prediction study how to elicit information from others
➢ This lecture: when you have information, how to exploit it?
4
➢ Introduction and Bayesian Persuasion
➢ Algorithms for Bayesian Persuasion
5
Two Primary Ways to Influence Agents' Behaviors
➢ Design/provide incentives (Mechanism Design)
    "You only pay the second highest bid!"
    "Bonus depends on performance, and is up to $1M!"
➢ Influence agents' beliefs (Persuasion)
    "All warfare is based on deception. Hence, when we are able to attack, we must seem unable; when using our forces, we must appear inactive…"
    Strategic inventory information disclosure
    there for a goal
14
Persuasion is the act of exploiting an informational advantage in order to influence the decisions of others
➢ Intrinsic in human activities: advertising, negotiation, politics, security, marketing, financial regulation, …
➢ A large body of research
The American Economic Review, Vol. 85, No. 2, 1995.
15
➢ Advisor vs. recruiter
➢ 1/3 of the advisor's students are excellent; 2/3 are average
➢ A fresh graduate is randomly drawn from this population
➢ Recruiter: with no further information, the expected utility of hiring is below that of not hiring (0):
    $(1+\epsilon)\times 1/3 - 1\times 2/3 < 0$
➢ Advisor
17
What is the advisor's optimal "recommendation strategy"?
➢ Attempt 1: always say "excellent" (equivalently, no information)
Remark: Advisor commitment: the advisor cannot deviate, and the recruiter knows his strategy
18
➢ Attempt 2: honest recommendation (i.e., full information)
[Figure: the excellent student (prior 1/3) and the average student (prior 2/3) are each reported truthfully to the recruiter]
19
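➢ Attempt 3: noisy information → advisor expected utility 2/3
[Figure: the excellent student (prior 1/3) always receives the favorable signal; the average student (prior 2/3) receives the favorable signal with joint probability 1/3 and the other signal with joint probability 1/3, so P(excellent | favorable signal) = 1/2]
Recruiter, on the favorable signal, compares hiring with not hiring: $((1+\epsilon) - 1)/2 > 0$, so the recruiter hires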
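The three attempts can be compared numerically. The sketch below is illustrative only: it assumes the advisor earns 1 whenever the student is hired, that the recruiter earns 1 + ε from hiring an excellent student, -1 from hiring an average one, and 0 from not hiring, and it fixes ε = 1/10. The function and variable names are hypothetical.

```python
from fractions import Fraction as F

EPS = F(1, 10)  # assumed value of epsilon, with 0 < EPS < 1
PRIOR = {"excellent": F(1, 3), "average": F(2, 3)}
# Assumed recruiter payoff from hiring; not hiring yields 0.
HIRE_PAYOFF = {"excellent": 1 + EPS, "average": F(-1)}


def advisor_utility(scheme):
    """scheme[state][signal] = probability of sending `signal` in `state`.

    Returns the probability that the recruiter hires, assuming the advisor
    earns 1 per hire and the recruiter hires only when hiring has strictly
    positive expected payoff given the signal.
    """
    signals = {sig for row in scheme.values() for sig in row}
    total = F(0)
    for sig in signals:
        # Joint probability of (state, signal) for each state.
        joint = {st: PRIOR[st] * scheme[st].get(sig, F(0)) for st in PRIOR}
        prob_sig = sum(joint.values())
        if prob_sig == 0:
            continue
        # The unnormalised hiring payoff has the same sign as the posterior
        # expectation, so dividing by prob_sig is unnecessary.
        hire_value = sum(joint[st] * HIRE_PAYOFF[st] for st in PRIOR)
        if hire_value > 0:
            total += prob_sig
    return total


always_excellent = {"excellent": {"A": F(1)}, "average": {"A": F(1)}}
honest = {"excellent": {"A": F(1)}, "average": {"B": F(1)}}
noisy = {"excellent": {"A": F(1)}, "average": {"A": F(1, 2), "B": F(1, 2)}}

print(advisor_utility(always_excellent))  # 0
print(advisor_utility(honest))            # 1/3
print(advisor_utility(noisy))             # 2/3
```

Under these assumptions the noisy scheme doubles the advisor's utility relative to honest reporting, which matches the 2/3 stated for Attempt 3.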
20
➢ Two players: persuader (Sender, she), decision maker (Receiver, he)
➢ Receiver looks to take an action $i \in [n] = \{1, 2, \ldots, n\}$
➢ $\theta \in \Theta$ is a random state of nature
➢ Both players know $\theta \sim$ prior dist. $\mu$, but Sender has an informational advantage: she can observe the realization of $\theta$
➢ Sender wants to strategically reveal info about $\theta$ to "persuade" Receiver to take an action she likes
Well… how to reveal partial information?
21
Definition: A signaling scheme is a mapping $\pi: \Theta \to \Delta_\Sigma$, where $\Sigma$ is the set of all possible signals. $\pi$ is fully described by $\{\pi(\sigma,\theta)\}_{\theta\in\Theta,\,\sigma\in\Sigma}$, where $\pi(\sigma,\theta)$ = prob. of sending $\sigma$ when observing $\theta$ (so $\sum_{\sigma\in\Sigma}\pi(\sigma,\theta) = 1$ for any $\theta$)
Example
➢ $\Theta = \{Excellent, Average\}$, $\Sigma = \{A, B\}$
➢ $\pi(A, Average) = 1/2$
Note: scheme $\pi$ is always assumed public knowledge, thus known by Receiver
22
What can Receiver infer about $\theta$ after receiving $\sigma$?
Bayes updating: $\Pr(\theta \mid \sigma) = \frac{\pi(\sigma,\theta)\cdot\mu(\theta)}{\sum_{\theta'}\pi(\sigma,\theta')\cdot\mu(\theta')}$
Example: $\Pr(Excellent \mid A) = 1/2$
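The Bayes update can be checked on the running example. This is a minimal sketch: the helper name `posterior` and the dictionary layout are hypothetical, and the example scheme assumes that an excellent student always receives signal A, which is consistent with Pr(Excellent | A) = 1/2.

```python
from fractions import Fraction as F


def posterior(pi, mu, sigma):
    """Return {theta: Pr(theta | sigma)} by Bayes' rule.

    pi[theta][sigma] is the probability of sending sigma in state theta,
    and mu[theta] is the prior probability of theta.
    """
    joint = {theta: pi[theta].get(sigma, F(0)) * mu[theta] for theta in mu}
    prob_sigma = sum(joint.values())
    if prob_sigma == 0:
        raise ValueError(f"signal {sigma!r} is never sent")
    return {theta: joint[theta] / prob_sigma for theta in mu}


mu = {"Excellent": F(1, 3), "Average": F(2, 3)}
pi = {
    "Excellent": {"A": F(1)},                 # assumed: always send A
    "Average": {"A": F(1, 2), "B": F(1, 2)},  # pi(A, Average) = 1/2
}
print(posterior(pi, mu, "A")["Excellent"])  # 1/2
print(posterior(pi, mu, "B")["Average"])    # 1
```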
23
Would such noisy information benefit Receiver?
➢ Expected Receiver utility conditioned on $\sigma$, where $r(i,\theta)$ is Receiver's payoff:
    $R(\sigma) = \max_{i\in[n]} \left[ \sum_{\theta\in\Theta} r(i,\theta)\cdot\frac{\pi(\sigma,\theta)\cdot\mu(\theta)}{\sum_{\theta'}\pi(\sigma,\theta')\cdot\mu(\theta')} \right]$
➢ $\Pr(\sigma) = \sum_{\theta'} \pi(\sigma,\theta')\cdot\mu(\theta')$
25
➢ Expected Receiver utility under $\pi$: the denominator of $R(\sigma)$ equals $\Pr(\sigma)$, so it cancels:
    $\sum_{\sigma} \Pr(\sigma)\cdot R(\sigma) = \sum_{\sigma} \max_{i} \sum_{\theta\in\Theta} r(i,\theta)\cdot\pi(\sigma,\theta)\cdot\mu(\theta)$
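The expression for Receiver's expected utility under a scheme translates directly into code. The sketch below evaluates it on the advisor example; the payoff table, the value ε = 1/10, and all names are assumptions made for illustration.

```python
from fractions import Fraction as F


def receiver_value(pi, mu, r, actions):
    """Sum over signals of max_i sum_theta r(i, theta) * pi(sigma, theta) * mu(theta)."""
    signals = {s for row in pi.values() for s in row}
    total = F(0)
    for sigma in signals:
        total += max(
            sum(r[(i, th)] * pi[th].get(sigma, F(0)) * mu[th] for th in mu)
            for i in actions
        )
    return total


eps = F(1, 10)  # assumed, 0 < eps < 1
mu = {"Excellent": F(1, 3), "Average": F(2, 3)}
actions = ("hire", "pass")
r = {  # assumed Receiver payoffs r(i, theta)
    ("hire", "Excellent"): 1 + eps, ("hire", "Average"): F(-1),
    ("pass", "Excellent"): F(0), ("pass", "Average"): F(0),
}

no_info = {"Excellent": {"A": F(1)}, "Average": {"A": F(1)}}
noisy = {"Excellent": {"A": F(1)}, "Average": {"A": F(1, 2), "B": F(1, 2)}}
full_info = {"Excellent": {"A": F(1)}, "Average": {"B": F(1)}}

print(receiver_value(no_info, mu, r, actions))    # 0
print(receiver_value(noisy, mu, r, actions))      # 1/30
print(receiver_value(full_info, mu, r, actions))  # 11/30
```

In this instance the Receiver's value rises from no information to noisy information to full information.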
26
Receiver's expected utility (weakly) increases under any signaling scheme $\pi$.
29
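Proof:
➢ Expected Receiver utility without information: $\max_{i} \sum_{\theta\in\Theta} r(i,\theta)\cdot\mu(\theta)$
➢ Expected Receiver utility under $\pi$: $\sum_{\sigma} \max_{i} \sum_{\theta\in\Theta} r(i,\theta)\cdot\pi(\sigma,\theta)\cdot\mu(\theta)$
➢ Let $i^* = \arg\max_{i} \sum_{\theta\in\Theta} r(i,\theta)\cdot\mu(\theta)$; we have
    $\sum_{\sigma} \max_{i} \sum_{\theta\in\Theta} r(i,\theta)\cdot\pi(\sigma,\theta)\cdot\mu(\theta) \ge \sum_{\sigma} \sum_{\theta\in\Theta} r(i^*,\theta)\cdot\pi(\sigma,\theta)\cdot\mu(\theta)$
    $= \sum_{\theta\in\Theta} r(i^*,\theta)\cdot\left[\sum_{\sigma}\pi(\sigma,\theta)\right]\cdot\mu(\theta)$
    $= \sum_{\theta\in\Theta} r(i^*,\theta)\cdot\mu(\theta)$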
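The inequality in the proof can also be sanity-checked on random instances. The sketch below is only a numerical illustration, not a substitute for the argument: it draws random priors, payoffs, and schemes with exact rational arithmetic and confirms that the value under the scheme never falls below the no-information value. All names and the sampling ranges are arbitrary choices.

```python
import random
from fractions import Fraction as F


def random_dist(k, rng):
    """Random probability vector of length k with exact rational entries."""
    weights = [rng.randint(1, 9) for _ in range(k)]
    total = sum(weights)
    return [F(w, total) for w in weights]


def value_under(pi, mu, r):
    """sum_sigma max_i sum_theta r[i][theta] * pi[theta][sigma] * mu[theta]."""
    n_signals = len(pi[0])
    return sum(
        max(
            sum(r[i][t] * pi[t][s] * mu[t] for t in range(len(mu)))
            for i in range(len(r))
        )
        for s in range(n_signals)
    )


def value_without_info(mu, r):
    """max_i sum_theta r[i][theta] * mu[theta]."""
    return max(sum(r[i][t] * mu[t] for t in range(len(mu))) for i in range(len(r)))


def check(trials=200, seed=0):
    """Return True if no sampled instance violates the inequality."""
    rng = random.Random(seed)
    for _ in range(trials):
        n_states, n_actions, n_signals = (rng.randint(2, 4) for _ in range(3))
        mu = random_dist(n_states, rng)
        r = [[F(rng.randint(-5, 5)) for _ in range(n_states)] for _ in range(n_actions)]
        pi = [random_dist(n_signals, rng) for _ in range(n_states)]
        if value_under(pi, mu, r) < value_without_info(mu, r):
            return False
    return True


print(check())  # True
```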
30
Remarks:
➢ A signaling scheme (weakly) increases Receiver's utility
➢ More (even noisy) information always helps a decision maker (DM)
31
➢ Receiver's utility is maximized when Sender reveals full info, i.e., directly reveals the realized $\theta$
    Because any other noisy scheme $\pi$ can be improved by further revealing $\theta$ itself
32
But this is not Sender's goal…
Sender Objective: carefully pick $\pi$ to maximize her expected utility
33
➢ Introduction and Bayesian Persuasion
➢ Algorithms for Bayesian Persuasion
34
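Q: What worries you the most when designing $\pi = \{\pi(\sigma,\theta)\}_{\theta\in\Theta,\,\sigma\in\Sigma}$?
➢ We don't know what the set of all possible signals $\Sigma$ is...
➢ Like in mechanism design, there are too many signals to consider in this world
➢ Key observation: a signal is mathematically nothing but a posterior distribution over $\Theta$:
    $\Pr(\theta \mid \sigma) = \frac{\pi(\sigma,\theta)\cdot\mu(\theta)}{\sum_{\theta'}\pi(\sigma,\theta')\cdot\mu(\theta')}$
➢ It turns out that $n$ signals suffice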
35
There always exists an optimal signaling scheme that uses at most $n$ (= # receiver actions) signals, where signal $\sigma_i$ induces optimal Receiver action $i$
36
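➢ Conditioned on any signal $\sigma$, Receiver infers
    $\Pr(\theta \mid \sigma) = \frac{\pi(\sigma,\theta)\cdot\mu(\theta)}{\sum_{\theta'}\pi(\sigma,\theta')\cdot\mu(\theta')}$
  and takes an action in $\arg\max_{i\in[n]} \sum_{\theta} \Pr(\theta \mid \sigma)\, r(i,\theta)$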
37
➢ Now, if signals $\sigma$ and $\sigma'$ result in the same optimal action $i^*$, Sender can instead send a new signal $\sigma_{i^*} = (\sigma, \sigma')$ in both cases
38
➢ Why Receiver still takes $i^*$ after the merge:
    $\sum_{\theta} \Pr(\theta \mid \sigma)\, r(i^*,\theta) \ge \sum_{\theta} \Pr(\theta \mid \sigma)\, r(i,\theta), \ \forall i$
    $\sum_{\theta} \Pr(\theta \mid \sigma')\, r(i^*,\theta) \ge \sum_{\theta} \Pr(\theta \mid \sigma')\, r(i,\theta), \ \forall i$
    $\Rightarrow \sum_{\theta} \left[\Pr(\theta \mid \sigma)\,p + \Pr(\theta \mid \sigma')(1-p)\right] r(i^*,\theta) \ge \sum_{\theta} \left[\Pr(\theta \mid \sigma)\,p + \Pr(\theta \mid \sigma')(1-p)\right] r(i,\theta), \ \forall i$
➢ $\Pr(\theta \mid \sigma_{i^*})$ is a convex combination of $\Pr(\theta \mid \sigma)$ and $\Pr(\theta \mid \sigma')$, with weight $p = \Pr(\sigma) / (\Pr(\sigma) + \Pr(\sigma'))$
39
➢ So the merged signal $\sigma_{i^*}$ still induces action $i^*$, as wanted
➢ Can merge all signals with optimal receiver action $i^*$ as a single signal $\sigma_{i^*}$
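The merging step can be illustrated on a small scheme in which two distinct signals induce the same action. This is a sketch under assumptions: the payoffs are those of the advisor example with ε = 1/10, the three-signal scheme is made up for illustration, and the helper names are hypothetical.

```python
from fractions import Fraction as F

MU = {"Excellent": F(1, 3), "Average": F(2, 3)}
EPS = F(1, 10)  # assumed, 0 < EPS < 1
R = {  # assumed Receiver payoffs r(i, theta)
    ("hire", "Excellent"): 1 + EPS, ("hire", "Average"): F(-1),
    ("pass", "Excellent"): F(0), ("pass", "Average"): F(0),
}
ACTIONS = ("hire", "pass")


def best_action(pi, sigma):
    """Receiver's best response to sigma (ties broken by order in ACTIONS)."""
    def score(i):
        return sum(R[(i, th)] * pi[th].get(sigma, F(0)) * MU[th] for th in MU)
    return max(ACTIONS, key=score)


def merge_by_action(pi):
    """Pool every signal into one signal named after the action it induces."""
    merged = {th: {} for th in pi}
    signals = {s for row in pi.values() for s in row}
    for sigma in signals:
        i_star = best_action(pi, sigma)
        for th in pi:
            merged[th][i_star] = merged[th].get(i_star, F(0)) + pi[th].get(sigma, F(0))
    return merged


# Three signals; A1 and A2 both induce "hire", B induces "pass".
pi = {
    "Excellent": {"A1": F(1, 2), "A2": F(1, 2)},
    "Average": {"A1": F(1, 4), "A2": F(1, 4), "B": F(1, 2)},
}
direct = merge_by_action(pi)
print(direct["Average"]["hire"])    # 1/2
print(best_action(direct, "hire"))  # hire
print(best_action(direct, "pass"))  # pass
```

After merging, each remaining signal is named after an action and Receiver's best response to it is that same action.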
40
➢ Each $\sigma_i$ can be viewed as an action recommendation of $i$
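Viewing signals as recommendations reduces the advisor example to two numbers: the probability of recommending "hire" for an excellent student and for an average one. The sketch below searches a coarse grid of such schemes for the best one that the recruiter is willing to follow. It assumes the payoffs of the advisor example with ε = 1/10, an advisor payoff of 1 per hire, and ties broken in the advisor's favor; the grid search is a stand-in for illustration, not the method of the lecture.

```python
from fractions import Fraction as F

EPS = F(1, 10)  # assumed, 0 < EPS < 1
GRID = [F(k, 20) for k in range(21)]  # candidate probabilities 0, 1/20, ..., 1


def obedient(p_exc, p_avg):
    """Check that each recommendation is a best response for the recruiter.

    p_exc, p_avg: probability of recommending "hire" for an excellent or
    average student. Hiring pays 1 + EPS or -1; not hiring pays 0 (assumed).
    """
    hire_gain = (1 + EPS) * p_exc * F(1, 3) - p_avg * F(2, 3)
    pass_gain = (1 + EPS) * (1 - p_exc) * F(1, 3) - (1 - p_avg) * F(2, 3)
    return hire_gain >= 0 and pass_gain <= 0


def advisor_value(p_exc, p_avg):
    """Probability of a hire when the recruiter follows recommendations."""
    return p_exc * F(1, 3) + p_avg * F(2, 3)


best = max(
    ((p_e, p_a) for p_e in GRID for p_a in GRID if obedient(p_e, p_a)),
    key=lambda p: advisor_value(*p),
)
print(best[0], best[1], advisor_value(*best))  # 1 11/20 7/10
```

With ε = 1/10 the best grid point recommends every excellent student and 11/20 of average students. The scheme from Attempt 3 (probability 1/2 for average students) is also obedient here and yields 2/3.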
41
➢ Input: prior $\mu$, sender payoff $s(i,\theta)$, receiver payoff $r(i,\theta)$
➢ Variables: $\pi(\sigma_i,\theta)$
➢ Objective: Sender expected utility (we know Receiver will take $i$ at signal $\sigma_i$)
➢ Constraint: $\sigma_i$ indeed incentivizes Receiver's best action $i$
➢ Constraint: $\pi$ is a valid signaling scheme
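A sketch of a linear program matching the three annotations above, written from the stated inputs and variables. The exact form is an assumption reconstructed from those annotations: the objective is Sender's expected utility when Receiver follows each recommendation, the first constraint family makes action $i$ a best response to $\sigma_i$, and the remaining constraints make each $\pi(\cdot,\theta)$ a probability distribution.

```latex
\begin{align*}
\max_{\pi}\quad
  & \sum_{\theta\in\Theta}\sum_{i\in[n]} s(i,\theta)\,\pi(\sigma_i,\theta)\,\mu(\theta) \\
\text{s.t.}\quad
  & \sum_{\theta\in\Theta} r(i,\theta)\,\pi(\sigma_i,\theta)\,\mu(\theta)
    \;\ge\; \sum_{\theta\in\Theta} r(j,\theta)\,\pi(\sigma_i,\theta)\,\mu(\theta),
    && \forall\, i,j\in[n] \\
  & \sum_{i\in[n]} \pi(\sigma_i,\theta) = 1, && \forall\,\theta\in\Theta \\
  & \pi(\sigma_i,\theta) \ge 0, && \forall\, i\in[n],\ \theta\in\Theta
\end{align*}
```

Both the objective and the constraints are linear in the variables $\pi(\sigma_i,\theta)$, because the normalizing factor $\Pr(\sigma_i)$ multiplies both sides of each best-response inequality and can be dropped.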
Haifeng Xu
University of Virginia hx4ad@virginia.edu