Announcement
• Grades for HW2 and the project proposal are released

CS6501: Topics in Learning and Game Theory (Fall 2019)
Learning from Strategically Transformed Samples
Instructor: Haifeng Xu
Some of the slides are provided by Hanrui Zhang
• Introduction
• The Model and Results
Q: Why attend good universities? Why publish and present at top conferences? Why do internships?
• All in all, these are just signals (directly observable) to indicate “excellence” (not directly observable)
• Asymmetric information between employees and employers: the 2001 Nobel Prize in Economics was awarded to research on asymmetric information
• A simple example

[Figure: 𝑀 (hidden types/labels) = {TML, AML}; 𝑇 (samples, unobservable) = {theoretical idea, applied idea}; Σ (signals, observable) = {COLT, NeurIPS, KDD}. Each 𝑚 ∈ 𝑀 is a distribution over samples, with edge probabilities 0.8 and 0.2; a reporting strategy maps each generated sample to a feasible signal.]

Our world is known to be noisy…
• Agent’s problem: how to report signals so as to maximize the probability of being accepted
• Principal’s problem: how to decide acceptance from the reported signals while minimizing mistakes

Answers for this particular instance?
Generally, this is classification with strategically transformed samples
• The example again, now with a third sample, “middle idea”

[Figure: same setup with 𝑇 = {theoretical idea, middle idea, applied idea}; the types’ sample probabilities are now 0.2/0.4/0.4, and reporting strategies again map samples to feasible signals.]

Intuitions
• Agent: try to report as far from the other type as possible
• Principal: examine a set of signals that maximally separates AML from TML
• Introduction
• The Model and Results
• Two distribution types/labels: 𝑚 ∈ {𝑎, 𝑐}
• 𝑎, 𝑐 ∈ Δ(𝑇), where 𝑇 is the set of samples
• A bipartite graph 𝐻 = (𝑇 ∪ Σ, 𝐹) captures the feasible signals for each sample: (𝑡, 𝜏) ∈ 𝐹 iff 𝜏 is a valid signal for 𝑡
• 𝑎, 𝑐, 𝐻 are publicly known; 𝑇, Σ are both discrete
• The realized distribution 𝑚 ∈ {𝑎, 𝑐} generates 𝑈 samples
• A few special cases: a sample may be withheld (reported as an “empty signal”), and the edges of 𝐻 encode the feasible “lies”
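To make these ingredients concrete, here is a minimal sketch in Python using the running COLT/NeurIPS/KDD example; the edge set of 𝐻 and the 0.8/0.2 probabilities are illustrative assumptions, not read off the slides’ figure.

```python
# Hypothetical instance of the model: sample distributions a, c
# and the feasibility graph H (assumed edges, for illustration only).

T = ["theoretical idea", "applied idea"]   # samples (unobservable)
Sigma = ["COLT", "NeurIPS", "KDD"]         # signals (observable)

# H as an adjacency map: feasible signals for each sample.
H = {
    "theoretical idea": {"COLT", "NeurIPS"},
    "applied idea": {"NeurIPS", "KDD"},
}

a = {"theoretical idea": 0.8, "applied idea": 0.2}  # type a (hypothetical)
c = {"theoretical idea": 0.2, "applied idea": 0.8}  # type c (hypothetical)

def is_valid_report(t, tau):
    """(t, tau) is in F iff tau is a feasible signal for sample t."""
    return tau in H[t]

print(is_valid_report("theoretical idea", "COLT"))  # True
print(is_valid_report("theoretical idea", "KDD"))   # False
```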
16
Agent’s reporting strategy 𝜌 transform 𝑈 samples to a set 𝑆 of 𝑈 signals
ØA reporting strategy is a signaling scheme
𝜌 𝜏 𝑡
17
Agent’s reporting strategy 𝜌 transform 𝑈 samples to a set 𝑆 of 𝑈 signals
ØA reporting strategy is a signaling scheme
ØGiven 𝑈 samples, 𝜌 generates 𝑈 signals (possibly randomly) as
an agent report 𝑆 ∈ Σ?
ØA special case is deterministic reporting strategy
𝜌 𝜏 𝑡
Principal’s action 𝑔: Σ^𝑈 → [0,1] maps the agent’s report to an acceptance probability
• Objective: minimize the probability of mistakes (i.e., rejecting 𝑎 or accepting 𝑐)
Agent’s reporting strategy 𝜌 transforms the 𝑈 samples into a set 𝑆 of 𝑈 signals
• Objective: maximize the probability of being accepted

Remarks:
• Timeline: the principal announces 𝑔 first; the agent then best responds
• Type 𝑎’s [𝑐’s] incentive is aligned with [opposite to] the principal’s
• Say 𝑚 ∈ {𝑎, 𝑐} generates 𝑈 = ∞ many samples
• Any reporting strategy 𝜌 then generates a distribution over Σ, denoted 𝜌(𝜏|𝑚) (slight abuse of notation)
• Intuitively, type 𝑎 should make his 𝜌(⋅|𝑎) “far from” the other type’s distribution
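In the 𝑈 = ∞ limit, the induced signal distribution is just the pushforward 𝜌(𝜏|𝑚) = Σ_𝑡 𝑚(𝑡) 𝜌(𝜏|𝑡). A small numpy sketch (all numbers hypothetical):

```python
import numpy as np

# rho[t, s] = probability of reporting signal s given sample t.
# Each row sums to 1, and infeasible (t, s) pairs of H get probability 0.
rho = np.array([
    [0.7, 0.3, 0.0],   # sample 0 may signal 0 or 1
    [0.0, 0.5, 0.5],   # sample 1 may signal 1 or 2
])

m = np.array([0.8, 0.2])  # a type's distribution over the 2 samples

# Induced signal distribution rho(. | m) = sum_t m(t) * rho(t, .)
signal_dist = m @ rho
print(signal_dist)  # [0.56 0.34 0.1 ]
```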
• For discrete distributions 𝑦, 𝑧 supported on Σ, write 𝑦(𝐵) = Pr_{𝜏∼𝑦}(𝜏 ∈ 𝐵) for 𝐵 ⊆ Σ

𝑒_tv(𝑦, 𝑧) = max_{𝐵⊆Σ} [𝑦(𝐵) − 𝑧(𝐵)]
           = Σ_{𝜏: 𝑦(𝜏) > 𝑧(𝜏)} [𝑦(𝜏) − 𝑧(𝜏)]
           = ½ Σ_{𝜏: 𝑦(𝜏) > 𝑧(𝜏)} [𝑦(𝜏) − 𝑧(𝜏)] + ½ Σ_{𝜏: 𝑧(𝜏) > 𝑦(𝜏)} [𝑧(𝜏) − 𝑦(𝜏)]   (these two terms are equal, since 𝑦 and 𝑧 each sum to 1)
           = ½ Σ_𝜏 |𝑦(𝜏) − 𝑧(𝜏)|
           = ½ ‖𝑦 − 𝑧‖₁
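The identity 𝑒_tv(𝑦, 𝑧) = ½‖𝑦 − 𝑧‖₁ is easy to check numerically; a small sketch computes the TV distance both ways (brute-forcing the max over subsets 𝐵) and confirms they agree:

```python
import numpy as np
from itertools import chain, combinations

def tv_half_l1(y, z):
    """e_tv(y, z) as half the L1 distance."""
    return 0.5 * np.abs(np.asarray(y) - np.asarray(z)).sum()

def tv_max_over_sets(y, z):
    """e_tv(y, z) as max_B [y(B) - z(B)], brute-forced over all subsets B."""
    n = len(y)
    subsets = chain.from_iterable(combinations(range(n), k) for k in range(n + 1))
    return max(sum(y[i] - z[i] for i in B) for B in subsets)

y = [0.5, 0.3, 0.2]
z = [0.2, 0.3, 0.5]
print(tv_half_l1(y, z), tv_max_over_sets(y, z))  # 0.3 both ways
```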
How Can 𝑎 Distinguish Himself from 𝑐?
• Type 𝑎 uses reporting strategy 𝜌 (and 𝑐 uses 𝜚)
• Type 𝑎 wants 𝜌(⋅|𝑎) to be far from 𝜚(⋅|𝑐)
• This naturally motivates a zero-sum game between 𝑎 and 𝑐, whose game value is

max_𝜌 min_𝜚 𝑒_tv(𝜌(⋅|𝑎), 𝜚(⋅|𝑐)) = ẽ_tv(𝑎, 𝑐)

→ What about type 𝑐?

Note ẽ_tv(𝑎, 𝑐) ≥ 0… now, what happens if ẽ_tv(𝑎, 𝑐) > 0?
• 𝑎 has a strategy 𝜌∗ such that 𝑒_tv(𝜌∗(⋅|𝑎), 𝜚(⋅|𝑐)) > 0 for any 𝜚
• Using 𝜌∗, 𝑎 can distinguish himself from 𝑐 with constant probability via Θ(1/ẽ_tv(𝑎, 𝑐)²) samples
  (in general, Θ(1/𝜗²) samples suffice to distinguish 𝑦, 𝑧 with 𝑒_tv(𝑦, 𝑧) = 𝜗)
25
How Can Distinguish Himself from 𝑐?
ØSo 𝑒d?S , 𝑐 > 0 is sufficient for distinguishing from 𝑐 ØIt turns out that it is also necessary
Theorem: 1. If 𝑒d?S , 𝑐 = 𝜗 > 0, then there is a policy 𝑔 that makes mistakes with probability 𝜀 when #samples 𝑈 ≥ 2 ln
[ w /𝜗\.
2. If 𝑒d?S , 𝑐 = 0, then no policy 𝑔 can separate from 𝑐 regardless how large is #samples 𝑈.
26
How Can Distinguish Himself from 𝑐?
ØSo 𝑒d?S , 𝑐 > 0 is sufficient for distinguishing from 𝑐 ØIt turns out that it is also necessary
Theorem: 1. If 𝑒d?S , 𝑐 = 𝜗 > 0, then there is a policy 𝑔 that makes mistakes with probability 𝜀 when #samples 𝑈 ≥ 2 ln
[ w /𝜗\.
2. If 𝑒d?S , 𝑐 = 0, then no policy 𝑔 can separate from 𝑐 regardless how large is #samples 𝑈. Remarks:
ØProb of mistake 𝜀 can be made arbitrarily small with more samples ØWe have shown the first part ØSecond part is more difficult to prove, uses an elegant result for matching
theory
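A quick numerical sanity check of part 1, under an assumed test: estimate the empirical frequency of a maximally separating set 𝐵 and threshold it at the midpoint of 𝑦(𝐵) and 𝑧(𝐵); with 𝑈 ≥ 2 ln(1/𝜀)/𝜗² samples, Hoeffding’s inequality bounds the mistake probability by exp(−𝑈𝜗²/2) ≤ 𝜀. The two signal distributions below are hypothetical.

```python
import numpy as np
from math import ceil, log

y = np.array([0.5, 0.3, 0.2])   # signal distribution of type a (hypothetical)
z = np.array([0.2, 0.3, 0.5])   # signal distribution of type c (hypothetical)

B = y > z                        # a maximally separating set
theta = y[B].sum() - z[B].sum()  # = e_tv(y, z)
eps = 0.01
U = ceil(2 * log(1 / eps) / theta**2)  # sample bound from the theorem
mid = (y[B].sum() + z[B].sum()) / 2    # classification threshold

rng = np.random.default_rng(0)
trials = 2000
# Mistake event: samples drawn from y, but empirical mass of B falls
# below the midpoint, so the test would classify the agent as type c.
freqs = rng.binomial(U, y[B].sum(), size=trials) / U
error_rate = (freqs < mid).mean()
print(U, error_rate)  # empirically, error_rate stays well below eps
```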
But… Deciding Whether ẽ_tv(𝑎, 𝑐) > 0 is Hard
• Recall ẽ_tv(𝑎, 𝑐) = max_𝜌 min_𝜚 𝑒_tv(𝜌(⋅|𝑎), 𝜚(⋅|𝑐))

Theorem: it is NP-hard to check whether ẽ_tv(𝑎, 𝑐) = 0 or not.

• Wait… this is a zero-sum game; can’t we solve it in poly time?
Q: What goes wrong?
• We can only solve normal-form zero-sum games in poly time
• There, the utility function is linear in both players’ strategies, which is not the case here
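For contrast, the normal-form case really is easy: the minimax value of a zero-sum game with payoff matrix 𝐴 is a linear program. A sketch using scipy and the standard shift-and-normalize LP formulation:

```python
import numpy as np
from scipy.optimize import linprog

def solve_zero_sum(A):
    """Game value and the row (maximizing) player's optimal mixed strategy
    for the normal-form zero-sum game with payoff matrix A."""
    A = np.asarray(A, dtype=float)
    m, n = A.shape
    shift = A.min()
    B = A - shift + 1.0          # make all payoffs positive, so value > 0
    # LP: min sum(u) s.t. B^T u >= 1, u >= 0;
    # then value = 1/sum(u) and strategy = u/sum(u).
    res = linprog(c=np.ones(m), A_ub=-B.T, b_ub=-np.ones(n),
                  bounds=[(0, None)] * m)
    u = res.x
    value = 1.0 / u.sum() + shift - 1.0
    return value, u / u.sum()

# Matching pennies: value 0, uniform optimal strategy.
value, strategy = solve_zero_sum([[1, -1], [-1, 1]])
print(value, strategy)
```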
Theorem: it is NP-hard to check whether ẽ_tv(𝑎, 𝑐) = 0 or not.
Proof:
• We argue that if we could compute 𝜌∗, then we could check whether ẽ_tv(𝑎, 𝑐) = 0 or not
• Given 𝜌∗, computing ẽ_tv(𝑎, 𝑐) only requires solving min_𝜚 𝑒_tv(𝜌∗(⋅|𝑎), 𝜚(⋅|𝑐)), which is convex in 𝜚
• This is the first example of a reduction in this class

Corollary: it is NP-hard to compute 𝑎’s best strategy 𝜌∗.
• Separability is determined by some “distance” between 𝑎 and 𝑐, one shaped by the agents’ reporting strategies rather than the employer’s policy
• The model can be generalized to many “good” (𝑎ᵢ) and “bad” (𝑐ⱼ) distributions; separability is then governed by min_{𝑖,𝑗} ẽ_tv(𝑎ᵢ, 𝑐ⱼ)
• The agent’s reporting strategy can even be adaptive, i.e., depend on previously reported signals
32
Next Lecture will talk about how to utilize strategic manipulations to induce desirable social outcome
Haifeng Xu
University of Virginia hx4ad@virginia.edu