SLIDE 1

Announcements

Ø HW 3 due next Tuesday
Ø No HW 4

SLIDE 2

CS6501: Topics in Learning and Game Theory (Fall 2019)

Crowdsourcing Information and Peer Prediction

Instructor: Haifeng Xu

SLIDE 3

Outline

Ø Eliciting Information without Verification
Ø Equilibrium Concept and Peer Prediction Mechanism
Ø Bayesian Truth Serum

SLIDES 4-7

Crowdsourcing Information

Ø Recruit AMT workers to label images
  • Cannot check ground truth (too costly)
Ø Peer grading (of, e.g., essays) on MOOCs
  • Don’t know the true scores
Ø Elicit ratings for various entities (e.g., on Yelp or Google)
  • We never find out the true quality/rating
Ø And many other applications…

SLIDE 8

Common Features in These Applications

Ø We (the designer) elicit information from a population
Ø Cannot, or too costly to, know the ground truth
  • The reason for using crowdsourced info elicitation
  • Key difference from prediction markets

ØAgents/experts may misreport

Challenge: cannot verify the report/prediction.
Solution: let multiple agents compete for the same task, and score them against each other (hence the name “peer prediction”).
Where else did we see a similar idea?

SLIDE 9

A Simple and Concrete Example

Ø Elicit Alice’s and Bob’s truthful ratings A, B about UVA dining
  • A, B ∈ {High, Low}
  • There is a common joint belief: P(A, B = [H, H]) = 0.5; P(A, B = [H, L]) = 0.24; P(A, B = [L, H]) = 0.24; P(A, B = [L, L]) = 0.02

Let’s try to understand this distribution…

Ø It is symmetric between Alice and Bob
Ø P(A = H) = 0.5 + 0.24 = 0.74
  • Each expert very likely rates H
Ø P(A = H | B = H) = P(A = H, B = H) / P(B = H) = 0.5 / 0.74 = 25/37
  • Given that one rates H, the other very likely rates H as well
Ø P(A = H | B = L) = P(A = H, B = L) / P(B = L) = 0.24 / 0.26 = 12/13
  • Given that one rates L, the other still very likely rates H
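The arithmetic above can be checked directly. A minimal sketch in Python (the dictionary `P` and the variable names are my own, not from the slides), using exact fractions:

```python
from fractions import Fraction

# Joint belief P(A, B) over signals High ("H") and Low ("L")
P = {("H", "H"): Fraction(50, 100), ("H", "L"): Fraction(24, 100),
     ("L", "H"): Fraction(24, 100), ("L", "L"): Fraction(2, 100)}

p_A_high = sum(p for (a, b), p in P.items() if a == "H")   # P(A = H)
p_B_high = sum(p for (a, b), p in P.items() if b == "H")   # P(B = H)
p_A_high_given_B_high = P[("H", "H")] / p_B_high           # P(A = H | B = H)
p_A_high_given_B_low = P[("H", "L")] / (1 - p_B_high)      # P(A = H | B = L)

print(p_A_high)               # 37/50  (= 0.74)
print(p_A_high_given_B_high)  # 25/37
print(p_A_high_given_B_low)   # 12/13
```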
SLIDES 10-12

A Simple and Concrete Example (continued)

Ø Same setup: P(A = H) = 0.74; P(A = H | B = H) = 25/37; P(A = H | B = L) = 12/13

Q: What are some natural peer comparison and rewarding mechanisms?

Ø One simple idea is to reward agreement
  • Ask Alice and Bob to report their signals Â, B̂ (they may misreport)
  • Award 1 to both if Â = B̂, otherwise award 0

Ø Does this work?
  • If A = H, what should Alice report?
  • If A = L, what should Alice report?

Truthful report is not an equilibrium!
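A small sketch of why rewarding agreement fails, assuming Bob reports truthfully (the function and variable names are illustrative): under this scheme, Alice's expected reward from a report equals the probability that Bob's signal matches it.

```python
from fractions import Fraction

# Joint belief from this example
P = {("H", "H"): Fraction(50, 100), ("H", "L"): Fraction(24, 100),
     ("L", "H"): Fraction(24, 100), ("L", "L"): Fraction(2, 100)}

def p_B_given_A(b, a):
    """Bob's signal distribution conditioned on Alice's true signal."""
    return P[(a, b)] / sum(P[(a, bb)] for bb in ("H", "L"))

# Under "reward agreement" with a truthful Bob, Alice's expected reward
# from reporting r is exactly P(B = r | A).
for true_a in ("H", "L"):
    for r in ("H", "L"):
        print(f"A={true_a}, report {r}: expected reward {p_B_given_A(r, true_a)}")
# With true signal L, reporting H pays 12/13 vs 1/13 for reporting L,
# so truth-telling is not an equilibrium.
```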

SLIDE 13

A Simple and Concrete Example (continued)

Ø Same setup: P(A = H) = 0.74; P(A = H | B = H) = 25/37; P(A = H | B = L) = 12/13

Ø Both players always reporting H (i.e., Â = B̂ = H) is a Nash equilibrium
Ø Why?
  • Under “rewarding agreement”, they both get 1, the maximum possible
  • In fact, both always reporting L is also a NE
SLIDE 14

Outline

Ø Eliciting Information without Verification
Ø Equilibrium Concept and Peer Prediction Mechanism
Ø Bayesian Truth Serum

SLIDE 15

The Model of Peer Prediction

Ø Two experts, Alice and Bob, each holding a signal A ∈ {a₁, ⋯, aₙ} and B ∈ {b₁, ⋯, bₘ} respectively
  • A joint distribution q of (A, B) is publicly known
  • Everything we describe generalizes to more than two experts

Ø We would like to elicit Alice’s and Bob’s true signals
  • We never know what signals they truly have

A seemingly richer but equivalent model

Ø We want to estimate the distribution of a random variable E
Ø The joint prior distribution q of (A, B, E) is publicly known
  • E.g., E is the true quality of our dining, which we never observe
Ø Goal: elicit A, B to refine our estimate of E

SLIDES 16-18

A Subtle Issue

Eliciting signals vs. distributions

Ø In prediction markets, we asked experts to report distributions
Ø Here, we could have done the same thing
  • Alice could report q(E | A), the distribution of E conditioned on her signal A
  • Let’s make a minor assumption: q(E | A) ≠ q(E | A′) for any A ≠ A′
  • Then, reporting signal A is equivalent to reporting the distribution q(E | A)
  • So, w.l.o.g., eliciting signals is equivalent
Ø Drawback: have to assume an accurate and known prior

A seemingly richer but equivalent model

Ø We want to estimate the distribution of a random variable E
Ø The joint prior distribution q of (A, B, E) is publicly known
  • E.g., E is the true quality of our dining, which we never observe
Ø Goal: elicit A, B to refine our estimate of E

SLIDE 19

Info Elicitation Mechanisms and Equilibrium

Ø Recall, we elicit info by asking for Alice’s and Bob’s signals Â, B̂
Ø As before, we will design rewards s_A(Â, B̂) and s_B(Â, B̂)
Ø Alice’s action is a report strategy τ_A(A) ∈ {a₁, ⋯, aₙ} [Bob similar]
  • This is a pure strategy
  • We will not consider mixed strategies here, as we will design s_A and s_B so that there is a good pure equilibrium
  • Truth-telling strategy: τ_A(A) = A, τ_B(B) = B

Ø Then, what outcome is expected to occur? → equilibrium outcome
Ø Generally, it is a Bayesian Nash equilibrium (BNE)
  • For simplicity, we only define the equilibrium for our particular setting

SLIDE 20

Info Elicitation Mechanisms and Equilibrium

Ø Recall, we elicit info by asking for Alice’s and Bob’s signals Â, B̂
Ø As before, we will design rewards s_A(Â, B̂) and s_B(Â, B̂)
Ø Alice’s action is a report strategy τ_A(A) ∈ {a₁, ⋯, aₙ} [Bob similar]

Definition. (τ_A(A), τ_B(B)) is a Bayesian Nash equilibrium if the following holds:

  𝔼_{B|A}[ s_A(τ_A(A), τ_B(B)) ] ≥ 𝔼_{B|A}[ s_A(τ′_A(A), τ_B(B)) ]  ∀A, τ′_A
  𝔼_{A|B}[ s_B(τ_A(A), τ_B(B)) ] ≥ 𝔼_{A|B}[ s_B(τ_A(A), τ′_B(B)) ]  ∀B, τ′_B

We say it is a strict BNE if both “≥” are “>”
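The definition can be made concrete with a brute-force checker for this two-signal setting, here applied to the reward-agreement scheme from the earlier example (a sketch; the function names and report-strategy encoding are my own):

```python
from fractions import Fraction

SIG = ("H", "L")
# Joint belief from the dining example
P = {("H", "H"): Fraction(50, 100), ("H", "L"): Fraction(24, 100),
     ("L", "H"): Fraction(24, 100), ("L", "L"): Fraction(2, 100)}

def is_bne(tau_a, tau_b, s_a, s_b):
    """Check the BNE condition: each prescribed report must maximize the
    player's expected reward conditioned on her own signal."""
    for a in SIG:  # Alice's condition, for every true signal a
        z = sum(P[(a, b)] for b in SIG)
        exp_a = lambda r: sum(P[(a, b)] / z * s_a(r, tau_b[b]) for b in SIG)
        if any(exp_a(r) > exp_a(tau_a[a]) for r in SIG):
            return False
    for b in SIG:  # Bob's condition, for every true signal b
        z = sum(P[(a, b)] for a in SIG)
        exp_b = lambda r: sum(P[(a, b)] / z * s_b(tau_a[a], r) for a in SIG)
        if any(exp_b(r) > exp_b(tau_b[b]) for r in SIG):
            return False
    return True

agree = lambda x, y: 1 if x == y else 0   # reward-agreement scheme
truthful = {"H": "H", "L": "L"}
always_h = {"H": "H", "L": "H"}
print(is_bne(truthful, truthful, agree, agree))  # False
print(is_bne(always_h, always_h, agree, agree))  # True
```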

SLIDE 21

Mechanism for Peer Prediction

Ø Design objective: choose s_A, s_B so that truth-telling is an equilibrium

Any ideas?
Ø Use proper scoring rules, but we don’t know the signal distributions…
Ø Alice’s signal can be used to estimate a distribution of Bob’s signal, and vice versa

SLIDE 22

Mechanism for Peer Prediction

Note: step 2 relies on the prior distribution q

Information Elicitation without Verification
“Parameter”: any strict proper scoring rule S(i; q)
1. Elicit Alice’s signal Â and Bob’s signal B̂
2. Calculate q̄_A = distribution of B conditioned on Â, and similarly q̄_B
3. Award Alice s_A(Â, B̂) = S(B̂; q̄_A) and Bob s_B(Â, B̂) = S(Â; q̄_B)

SLIDE 23

Mechanism for Peer Prediction

(Mechanism as on the previous slide)

Theorem. Truth-telling is a strict BNE in the above game.

Proof: show τ_A(A) = A is a best response to τ_B(B) = B, and vice versa
Ø If Bob reports B truthfully, Alice receives S(B; q̄_A) by reporting Â
Ø With true signal A, what is Alice’s best-response report Â?
  • By strict properness, Alice wants q̄_A to be exactly her true belief of the distribution of B
  • So, Alice should report Â = A.
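The proof can be checked numerically on the dining example. A sketch instantiating the mechanism with the log scoring rule S(i; q) = log q(i) (my choice of strict proper scoring rule; the function names are illustrative):

```python
import math

# Joint belief from the dining example
P = {("H", "H"): 0.50, ("H", "L"): 0.24, ("L", "H"): 0.24, ("L", "L"): 0.02}
SIG = ("H", "L")

def q_bar_A(a_hat):
    """Step 2: distribution of Bob's signal conditioned on Alice's report."""
    z = sum(P[(a_hat, b)] for b in SIG)
    return {b: P[(a_hat, b)] / z for b in SIG}

def S(realized, q):
    """Log scoring rule, a strict proper scoring rule."""
    return math.log(q[realized])

def alice_expected_score(true_a, a_hat):
    """Expected value of S(B; q_bar_A(a_hat)) over B ~ P(B | A = true_a),
    assuming Bob reports truthfully."""
    q = q_bar_A(a_hat)
    z = sum(P[(true_a, b)] for b in SIG)
    return sum(P[(true_a, b)] / z * S(b, q) for b in SIG)

for true_a in SIG:
    other = "L" if true_a == "H" else "H"
    assert alice_expected_score(true_a, true_a) > alice_expected_score(true_a, other)
print("truth-telling is a strict best response for Alice")
```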

SLIDE 24

Remarks

Ø The mechanism is only described for two experts, but it is not difficult to generalize to n experts
  • Can randomly match each expert to a “peer” as reference
Ø The serious issues are the following

Issue 1: there are many other equilibria in the game

Ø Dining rating example with slightly different numbers
  • A common joint belief: P(A, B = [H, H]) = 0.4; P(A, B = [H, L]) = 0.1; P(A, B = [L, H]) = 0.1; P(A, B = [L, L]) = 0.4
Ø Both always reporting H is also an equilibrium
  • If Bob always says H, Alice’s reward is always S(H; q̄_A), whatever the true A
  • Â = H makes q̄_A(H) = P(B = H | Â = H) = 4/5
  • Â = L makes q̄_A(H) = P(B = H | Â = L) = 1/5

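The two posteriors above can be verified directly. A sketch with the modified prior (names are illustrative; the log scoring rule is my choice of instantiation):

```python
from fractions import Fraction

# Modified prior from this slide
P = {("H", "H"): Fraction(4, 10), ("H", "L"): Fraction(1, 10),
     ("L", "H"): Fraction(1, 10), ("L", "L"): Fraction(4, 10)}

def q_bar_A(a_hat):
    """Distribution of Bob's signal conditioned on Alice's report."""
    z = sum(P[(a_hat, b)] for b in ("H", "L"))
    return {b: P[(a_hat, b)] / z for b in ("H", "L")}

print(q_bar_A("H")["H"])  # 4/5
print(q_bar_A("L")["H"])  # 1/5
# With the log scoring rule S(H; q) = log q(H): log(4/5) > log(1/5), so if
# Bob always reports H, Alice's best response is to report H too, whatever
# her true signal is.
```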
SLIDE 25

Remarks

(Generalization remarks as on the previous slide)

Issue 1 (continued): there are many other equilibria in the game

Ø More generally, reporting quantities that are easy to coordinate on likely forms an equilibrium
  • E.g., you are asked to grade essays, but you may all report the length of the essay rather than its true quality (less effort, better correlated)
Ø This is a fundamental issue of peer prediction

Open question: how to design mechanisms where truth-telling is the unique (or the most plausible) equilibrium

SLIDE 26

Remarks

(Generalization remarks as on the previous slide)

Issue 2: the designer has to know the joint distribution of (A, B)
Ø Not very realistic, as the designer usually has little knowledge
Ø But there are remedies for this

SLIDE 27

Outline

Ø Eliciting Information without Verification
Ø Equilibrium Concept and Peer Prediction Mechanism
Ø Bayesian Truth Serum

SLIDES 28-29

Designed for a Special yet Realistic Setting

Ø We, the designer, want to predict the distribution of E
Ø n experts; each expert i has a signal Sᵢ ∼ q(S | E) i.i.d.
  • In this setting, we have to have many experts
  • Assume the experts know q(S | E) but we do not
Ø Objective: elicit the true signals S₁, ⋯, Sₙ

Key design ideas
Ø We cannot compute the posterior distribution conditioned on any single expert’s signal anymore, but we still need it to score him
Ø So, we will elicit both his signal and his posterior belief about others’ signals

SLIDES 30-32

Bayesian Truth Serum [Prelec, Science’04]

The Protocol
1. For each expert i, elicit her signal Ŝᵢ and her prediction q̂ᵢ ∈ Δ^{|S|} of the distribution of any other expert’s signal (agents are i.i.d. a priori)
2. Calculate the (geometric) mean prediction q̄, where log q̄_S = (1/n) Σᵢ log q̂ᵢ(S) for every signal S
3. Set μ̄ to the empirical distribution of the reported signals Ŝᵢ
4. Reward agent i the following (H is any proper scoring rule):
   log( μ̄_{Ŝᵢ} / q̄_{Ŝᵢ} ) + 𝔼_{S∼μ̄} H(S; q̂ᵢ)

Ø The first term scores i’s signal report Ŝᵢ (good if μ̄_{Ŝᵢ} ≥ q̄_{Ŝᵢ})
  • That is, i’s reported type is “surprisingly more common” than its predicted probability
Ø The second term scores i’s prediction q̂ᵢ against the true signal distribution μ̄
  • By properness, i wants q̂ᵢ to be close to μ̄
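The protocol's reward computation can be sketched as follows, instantiating H with the log scoring rule (my choice; the function name and the toy reports at the bottom are illustrative, not from the slides):

```python
import math
from collections import Counter

def bts_rewards(signals, predictions, signal_space):
    """signals[i]: expert i's reported signal; predictions[i]: dict mapping
    each signal to expert i's predicted frequency among the other experts."""
    n = len(signals)
    # Step 2: geometric mean of the reported predictions, per signal
    q_bar = {s: math.exp(sum(math.log(predictions[i][s]) for i in range(n)) / n)
             for s in signal_space}
    # Step 3: empirical distribution of reported signals
    counts = Counter(signals)
    mu_bar = {s: counts[s] / n for s in signal_space}
    rewards = []
    for i in range(n):
        # Information score: is i's report "surprisingly common"?
        info_score = math.log(mu_bar[signals[i]] / q_bar[signals[i]])
        # Prediction score: log scoring of q_hat_i against mu_bar
        pred_score = sum(mu_bar[s] * math.log(predictions[i][s])
                         for s in signal_space if mu_bar[s] > 0)
        rewards.append(info_score + pred_score)
    return rewards

# Toy run with 4 experts over signals {"H", "L"} (illustrative numbers)
sigs = ["H", "H", "H", "L"]
preds = [{"H": 0.7, "L": 0.3}] * 3 + [{"H": 0.4, "L": 0.6}]
print(bts_rewards(sigs, preds, ("H", "L")))
```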

SLIDE 33

Bayesian Truth Serum [Prelec, Science’04]

Theorem. When n → ∞, truthful reporting is a Bayesian Nash equilibrium in the previous protocol.
Ø That is, expert i should report his true signal Sᵢ and his true posterior belief about the other experts’ signals
Ø n → ∞ is needed because in that case μ̄ converges to the exact signal distribution (under truthful signal reports)
  • Several works try to relax this assumption to a sufficiently large n
Ø The proof is a bit intricate (see the Science paper)
Ø Very insightful; in particular, the design of rewarding “surprisingly common” signals was not at all obvious beforehand
Ø The issue of multiple equilibria is still there

SLIDE 34

Thank You

Haifeng Xu

University of Virginia
hx4ad@virginia.edu