slide-1
SLIDE 1

A tutorial on: Dynamic Mechanism Design

Ruggiero Cavallo University of Pennsylvania

Department of Computer and Information Science

July 7, 2009 ACM EC

slide-2
SLIDE 2

The setting

  • Sequence of decisions to be made impacting utility experienced by a group of agents
  • Social planner wants to make optimal choices:
    – agents hold private valuation information
    – knowledge of private information is required to determine the optimal decision at every point in time
    – new private information potentially arrives after each decision

2

slide-3
SLIDE 3

Example: a resource – say a government-owned super-computer – is to be allocated repeatedly for 1-week intervals.

3

Problem: Agents’ goals differ from center’s goals, but agent cooperation is essential.

slide-4
SLIDE 4

The solution framework

  • Dynamic mechanism design: specification of payment schemes such that optimal outcomes are achieved in equilibrium.
  • An extension/generalization of “static” mechanism design.

4

slide-5
SLIDE 5

Does static MD really fall short?

  • Yes. Rare is the decision scenario that is

completely independent of future decisions.

  • E.g., allocating a resource. What future opportunities will there be for procuring the resource? What opportunities for reselling the resource?

5

slide-6
SLIDE 6

Tutorial plan

  • 1. Rudimentary review of static mechanism

design.

  • 2. Dynamic MD basics: modeling of “types” in

dynamic settings, dynamic equilibrium notions.

  • 3. Key solutions so far.
  • 4. Extensions.

6

slide-7
SLIDE 7

Static mechanism design

Just social-welfare maximizing, here

7

slide-8
SLIDE 8
  • Many of the marquee static MD results

have (more complicated) analogs in the dynamic setting.

  • So start with quick review of static case...

8

slide-9
SLIDE 9

9

slide-10
SLIDE 10

10

Mechanism Design

[Diagram: agents send private information to the mechanism; the mechanism returns a decision and payments.]

slide-11
SLIDE 11

11

Mechanism design

  • Specify a decision rule (outcome selection), plus a monetary charge/payment imposed on each agent.
  • Outcome/payments enforced by a center.
  • Criteria for success:
    – social-welfare maximizing (a.k.a. efficient)
    – individually rational (no agent worse off)
    – budget properties

slide-12
SLIDE 12

Solution concepts

  • Strategyproof: reporting true type is always a utility-maximizing strategy, regardless of what other agents do.
  • Ex post incentive compatible: reporting true type is utility-maximizing, whatever the types of other agents, assuming they’re truthful. [Same as strategyproof in a private-values setting.]
  • Bayes-Nash incentive compatible: reporting true type is utility-maximizing, in expectation given the distribution over others’ types, assuming other agents are truthful.

12

slide-13
SLIDE 13

Efficient static mechanisms: the Groves class

  • Choose the outcome that is social-welfare maximizing according to agent reports.
  • Pay each agent the (combined) reported value of all other agents for the chosen outcome... minus some quantity independent of the agent’s report.

[Vickrey, 1961; Clarke, 1971; Groves 1973]

13

slide-14
SLIDE 14

14

[Figure: three agents, each reporting a value of $10.]

slide-15
SLIDE 15

Groves (and nothing else) works

  • The set of Groves mechanisms exactly

corresponds to those that are efficient in dominant strategies.*

[Green & Laffont, 77], strengthened by [Holmstrom, 79]

Our freedom is limited to defining the agent-independent “charge” term

*For sufficiently rich domains (“for all practical purposes”).

15

slide-16
SLIDE 16

Efficient mechanism design boiled down

  • 1. Align incentives – make each agent’s payoff equal to social welfare.
  • 2. Recover funds – have each agent make a payment independent of his behavior.

16

slide-17
SLIDE 17


A little more subtle in dynamic setting...

slide-18
SLIDE 18

The VCG mechanism

  • A Groves mechanism.
  • Defines the “charge” term for each agent i equal to the value other agents could have obtained if i’s interests were ignored.
    – Each agent’s utility equals his contribution to social welfare.
    – Ex post individually rational (if agents have non-negative values for all outcomes).
    – No-deficit... in fact often yields high revenue.

17
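This recipe is easiest to see in a single-item auction. The sketch below is illustrative (the function and variable names are my own, not from the tutorial): it picks the welfare-maximizing outcome and applies the Clarke (VCG) charge term.

```python
def vcg_single_item(bids):
    """bids: dict agent -> reported value for winning the item.
    Returns (winner, payments), where payments[i] is the net transfer to i
    (negative means i pays the center)."""
    winner = max(bids, key=bids.get)
    payments = {}
    for i in bids:
        others = {j: v for j, v in bids.items() if j != i}
        # Clarke charge: welfare others could get if i's interests were ignored.
        best_without_i = max(others.values()) if others else 0.0
        # Welfare others actually get under the chosen outcome.
        others_value = sum(v for j, v in others.items() if j == winner)
        payments[i] = others_value - best_without_i
    return winner, payments
```

With bids {1: 10, 2: 7, 3: 4}, agent 1 wins and pays 7 (transfer −7), while the losers pay nothing.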

slide-19
SLIDE 19

The expected externality (AGV) mechanism

[Arrow, 79; d’Aspremont & Gerard-Varet, 79]

  • Each agent’s payment is the expected social welfare others will get given his report, minus some uninfluenceable quantity.

  • Efficient in Bayes-Nash equilibrium.
  • Ex ante individual rational.
  • Strongly budget-balanced.

18

slide-20
SLIDE 20

Redistribution mechanisms

  • AGV maintains all value with agents, but is only weakly efficient and IR, unlike VCG.
  • Idea: try to return revenue under VCG back to agents – thus improving social welfare – without weakening equilibrium or running a deficit.
  • First reference: [Bailey, 97] for certain allocation settings.

19

slide-21
SLIDE 21

Redistribution mechanism

[Cavallo, 06]

Idea: leverage domain information to obtain “revenue guarantees”.

  • 1. For each agent i, compute the minimum revenue that i could cause to result, given the reports of the other agents (Gi).
  • 2. Run VCG.
  • 3. Give each agent i a payment of Gi/n.

Applicable to any setting (e.g., combinatorial allocation). In single-item allocation, coincides with the [Bailey, 97] mechanism.

20
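In single-item allocation the minimum revenue Gi that agent i can cause is the second-highest bid among the *other* agents: whatever i reports, the VCG price is at least that. A hypothetical sketch under that observation (names are mine, not from the tutorial):

```python
def redistribution_auction(bids):
    """Sketch of the [Cavallo, 06] redistribution mechanism for a
    single-item auction: run VCG (winner pays the second-highest bid),
    then rebate to each agent i a 1/n share of G_i, the minimum revenue
    i could cause to result given the others' reports."""
    n = len(bids)
    winner = max(bids, key=bids.get)
    vcg_payment = sorted(bids.values(), reverse=True)[1]  # second-highest bid
    rebates = {}
    for i in bids:
        others = sorted((v for j, v in bids.items() if j != i), reverse=True)
        g_i = others[1] if len(others) >= 2 else 0.0  # 2nd-highest among others
        rebates[i] = g_i / n
    return winner, vcg_payment, rebates
```

With bids {10, 8, 6} the winner pays 8 while total rebates come to 2 + 2 + 8/3 ≈ 6.67, so revenue is returned to the agents without running a deficit.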

slide-22
SLIDE 22

21

[Chart: % value retained by agents vs. number of agents, for dynamic-VCG and dynamic-RM.]

slide-23
SLIDE 23

Lots of interesting recent work for case of multi-unit auctions

  • [Guo & Conitzer, 07; Moulin, 2007] – worst-case optimality.
  • [Guo & Conitzer, 08] – optimal-in-expectation mechanism.
  • [Hartline & Roughgarden, 2008] – money burning when payments not possible.
  • [de Clippel, Naroditskiy, Greenwald, here].

22

slide-24
SLIDE 24

Much not discussed here, e.g.,

  • Interdependent (“common”) values settings
  • Inefficient mechanism design, concerned with, e.g., revenue maximization, maximizing the minimum utility, etc.

23

slide-25
SLIDE 25

Basics of the dynamic setting

24

slide-26
SLIDE 26

Aspects of the problem

  • At each time period each agent holds some private information (“local state”).
  • At each time period, the center selects an action to execute, which generates value (of varying degree) for agents and yields new local states.
  • The (predicted) effects of taking any given action depend on state.
  • Agents perceive the utility of value x obtained k steps in the future to be γ^k·x, for some 0 < γ ≤ 1.

Key variable: local state.

25

slide-27
SLIDE 27

Local state

  • Encapsulates all information required to determine:
    – a conditional distribution over the value the agent would obtain, for every possible action
    – a conditional distribution over future local states, for every possible action

26

slide-28
SLIDE 28

Assumption

  • Given the action executed by the center, the value obtained and subsequent local state for each agent are independent of other agents’ local states.
  • The dynamic version of private values.

27

slide-29
SLIDE 29

Markov decision processes (MDPs)

  • State space
  • Action space
  • Reward function: when a given action is taken in a given state, what value results?
  • Non-deterministic transition function: when a given action is taken in a given state, what new state results?

28
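These four ingredients are enough to solve for the optimal discounted value of each state by standard value iteration. A generic sketch (not code from the tutorial):

```python
def value_iteration(states, actions, reward, trans, gamma=0.9, tol=1e-9):
    """reward[(s, a)]: immediate value; trans[(s, a)]: dict next_state -> prob.
    Returns the optimal discounted value of each state."""
    V = {s: 0.0 for s in states}
    while True:
        # Bellman backup: best action's immediate reward plus discounted
        # expected value of the resulting state.
        V_new = {
            s: max(
                reward[(s, a)]
                + gamma * sum(p * V[s2] for s2, p in trans[(s, a)].items())
                for a in actions
            )
            for s in states
        }
        if max(abs(V_new[s] - V[s]) for s in states) < tol:
            return V_new
        V = V_new
```

The agent’s optimal policy is then read off by taking, in each state, the action achieving the maximum.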

slide-30
SLIDE 30

29

[Figure: an example MDP; transition probabilities and rewards shown.]

  • Two possible actions (red and blue).
  • Two time periods.
slide-31
SLIDE 31

So what’s a dynamic type?

  • It’s an MDP:
    – transition dynamics between local states
    – a value function for state-action pairs
    – an indicator of the “current” state

30

slide-32
SLIDE 32
  • In the static setting, type is “complete” and reportable. In the dynamic setting, type is gradually revealed to the agent by nature over time.
  • It’s not the multiple time steps alone, it’s the uncertainty.
  • If types are MDPs with no stochastic state transitions, we’re in a static MD setting – just decide the policy at time 0.

31

slide-33
SLIDE 33

A simple (and important) special case: MABs

  • Multi-armed bandit problems: a special case of the general sequential decision-making framework.
  • Captures, e.g., single-item repeated allocation scenarios.

32

slide-34
SLIDE 34

A simple (and important) special case: MABs

  • Each agent’s dynamics can be represented by a Markov chain: no multiplicity of actions.
  • A single action associated with each agent. When an agent’s action is chosen, his state changes; otherwise, it doesn’t.

33
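The defining restriction – only the activated agent’s chain transitions – can be sketched as follows (an illustrative data layout, not from the tutorial):

```python
import random

def mab_step(chains, states, chosen, rng=random):
    """One period of a multi-armed-bandit world: only the chosen agent's
    Markov chain transitions; every other agent's local state is frozen.
    chains[i][s]: dict mapping next state -> probability; states[i]: current state."""
    next_states = dict(states)
    dist = chains[chosen][states[chosen]]
    outcomes, probs = zip(*dist.items())
    next_states[chosen] = rng.choices(outcomes, weights=probs, k=1)[0]
    return next_states
```

For example, if agent 1’s chain deterministically moves A → B, choosing agent 1 advances only his state while agent 2 stays put.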

slide-35
SLIDE 35

Captures, e.g., repeated allocation of a resource

34

[Figure: two agents’ Markov chains, with transition probabilities and per-period values.]

slide-36
SLIDE 36

Captures, e.g., repeated allocation of a resource

35

[Figure: two agents’ Markov chains, with transition probabilities and per-period values.]

Allocate to agent 1, who finds no value.

slide-37
SLIDE 37

36

DMD setup

  • There is a set of actions.
  • Each agent has a type represented by an MDP.
  • In each period agents report types and the center takes an action.
  • A dynamic mechanism specifies two things:
    – a decision policy: a function that maps a joint type to an action
    – a transfer function: a function that maps a joint type to a payment for each agent

slide-38
SLIDE 38

Basics of the dynamic setting: equilibrium concepts

37

slide-39
SLIDE 39

Within-period ex post Nash equilibrium

If all other agents play the equilibrium strategy in the future, no agent can benefit from deviating – regardless of what the joint state is and regardless of what came before.

38

slide-40
SLIDE 40

Within-period ex post incentive compatibility

If all other agents report types truthfully in the future, no agent can benefit from misreporting type – regardless of what the joint type is and regardless of what came before.

39

No incentive to deviate even if agents know everything one can know – without being able to see the future.
slide-41
SLIDE 41

This is the gold standard

  • In a dynamic setting, agents need to make predictions about the future in determining how to maximize utility – and this requires positing some behavior for other agents.
  • Weaker than dominant strategy.
  • But if others’ future types were irrelevant to the agent’s utility, incentives couldn’t possibly be aligned.

40

slide-42
SLIDE 42

Bayes-Nash equilibrium

Given a distribution over other agents’ types, no agent can expect to gain from deviating if others don’t.

Within-period ex post also involves expectation, but the expectation is over uncertain type transitions, not current types.

41

slide-43
SLIDE 43

Mechanism desiderata

  • Efficiency: social-welfare maximizing decisions achieved in equilibrium.
  • Individual rationality: no agent expects to lose from participating.
    – Within-period ex post: at every time-step, for every joint type.
    – Ex ante: from the beginning of the mechanism, for whatever the joint type is then.
  • Budget-balance / no-deficit.

42

slide-44
SLIDE 44

By the way...

  • A dynamic analog of the revelation principle

holds [Myerson, 1986].

  • So we can think only about direct

revelation mechanisms, without loss of generality.

43

slide-45
SLIDE 45

Some solutions so far

44

slide-46
SLIDE 46

A basic efficient dynamic mechanism

  • Dynamic team mechanism

[Athey & Segal, 07]

    – Follows the efficient policy given agent reports.
    – In each period, pays each agent the expected immediate value obtained by other agents given reported types (“Groves payment”).

45

slide-47
SLIDE 47

Dynamic team mechanism

example

[Figure: MDP types for Agent 1 and Agent 2, with states A–Q, transition probabilities, and rewards.]

46

slide-48
SLIDE 48

Dynamic team mechanism

example

  • Optimal policy (γ close to 1):
    – * → blue
    – AJ → red or blue
    – BJ → red or blue
    – CK → red
    – CL → blue

[Figure: MDP types for Agent 1 and Agent 2, with states A–Q.]

47

slide-49
SLIDE 49

Dynamic team mechanism

example

  • T1(*) = 0, T2(*) = 0

[Figure: MDP types for Agent 1 and Agent 2, with states A–Q.]

48

slide-50
SLIDE 50

Dynamic team mechanism

example

  • T1(*) = 0, T2(*) = 0
  • T1(CL) = 100, T2(CL) = 0

[Figure: MDP types for Agent 1 and Agent 2, with states A–Q.]

49

slide-51
SLIDE 51

Dynamic team mechanism

Theorem: The dynamic team mechanism is truthful and efficient in within-period ex post Nash equilibrium.

[Athey & Segal, 07]

50

slide-52
SLIDE 52

Dynamic-Groves mechanism class

  • Follows the efficient policy given agent reports; defines payments such that:
    – Each agent’s expected sum of payments when he follows strategy σ equals the expected value other agents obtain when he follows σ, minus some quantity independent of σ.

51

slide-53
SLIDE 53

Dynamic-Groves mechanism class

52

Theorem: Every dynamic-Groves mechanism is truthful and efficient in within-period ex post Nash equilibrium.

[Cavallo, Parkes, & Singh, 07]

Proof: Each agent obtains social utility (aligns incentives) minus some constant (doesn’t distort).

slide-54
SLIDE 54

Dynamic-Groves: all efficient mechanisms

Theorem: For unrestricted types, the dynamic-Groves class exactly corresponds to the history-independent dynamic mechanisms that are truthful and efficient in within-period ex post Nash equilibrium. [Cavallo, 08]

For within-period ex post efficient (and history-independent) dynamic mechanism design, dynamic-Groves is the only game in town.

53

slide-55
SLIDE 55

Dynamic-Groves: all efficient mechanisms

Theorem: For unrestricted types, the dynamic-Groves class exactly corresponds to the history-independent dynamic mechanisms that are truthful and efficient in within-period ex post Nash equilibrium. [Cavallo, 08]

Generalizes [Green & Laffont, 77] (the Groves class is unique for static settings). Proof idea: if non-Groves, there is always some type for which incentives are sufficiently distorted from efficiency.

54

slide-56
SLIDE 56

Budget & participation

  • Given the characterization theorem, if we demand efficiency in the strongest sense, we know what the possibilities are.
  • Now pick mechanisms in the class with desirable budget/participation properties:
    – the basic “team mechanism” won’t fly – extreme budget imbalance
    – need to recover payments...

55

slide-57
SLIDE 57

Recovering payments:

ex ante charge (EAC)

Charge agents some quantity computed “ex ante” of anything they report.

56

slide-58
SLIDE 58

Recovering payments:

ex ante charge (EAC)

  • At every time-step:
    – Choose the efficient decision given reported types.
    – Make Groves payments.
    – Charge each agent a quantity based only on the reported types of the other agents in the first time-step: (1−γ) times the total value the other agents would obtain, in expectation from the beginning of the mechanism, if the policy optimal for them were chosen. [Cavallo, Parkes, & Singh, 06]

T_i(θ^t) = r_{−i}(θ^t_{−i}, π*(θ^t)) − (1 − γ)·V_{−i}(θ^0_{−i})

57
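The per-period EAC transfer is a one-liner in code; plugging in the example numbers used in the tutorial (γ = 0.9, V_{−1} = 81, V_{−2} = 59) reproduces the charges T1(*) = −8.1 and T2(*) = −5.9. A sketch (the caller is assumed to have precomputed the arguments):

```python
def eac_transfer(r_others, v_others_ex_ante, gamma):
    """Dynamic-EAC per-period transfer to agent i: others' immediate reward
    under the efficient action, minus the ex ante charge
    (1 - gamma) * V_{-i}(theta^0_{-i})."""
    return r_others - (1 - gamma) * v_others_ex_ante
```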

slide-59
SLIDE 59

Dynamic-EAC mechanism

example

  • γ = 0.9
  • V1(*-2) = 50 + γ10 = 59
  • V2(*-1) = γ90 = 81
  • T1(*) = -8.1, T2(*) = -5.9
  • T1(CL) = 100 - 8.1, T2(CL) = -5.9
  • ...

[Figure: MDP types for Agent 1 and Agent 2.]

58

slide-60
SLIDE 60

Recovering payments:

ex ante charge (EAC)

Theorem: The dynamic-EAC mechanism is truthful and efficient in within-period ex post Nash equilibrium, ex ante individual rational, and ex ante no-deficit.

[Cavallo, Parkes, & Singh, 06]

59

slide-61
SLIDE 61

Weak IR and budget- balance properties

  • With the dynamic-EAC scheme, agents will “sign up” at the beginning of the mechanism, but may wish to back out...

  • Same for center.
  • Can we strengthen?

60

slide-62
SLIDE 62

Dynamic-VCG

[Bergemann & Valimaki, 08]

  • At each time-step, pay each agent i the expected value other agents would obtain if i were ignored after one step, minus the value they’d obtain if i were always ignored.
  • Each agent has to pay the amount he inhibits other agents from obtaining value (now and in the future) by his current report.

61

slide-63
SLIDE 63

Dynamic-VCG

[Bergemann & Valimaki, 08]

  • At each time-step, pay each agent i the expected value other agents would obtain if i were ignored after one step, minus the value they’d obtain if i were always ignored.

T_i(θ^t) = r_{−i}(θ^t_{−i}, π*(θ^t)) + γ·E[V_{−i}(τ(θ^t_{−i}, π*(θ^t)))] − V_{−i}(θ^t_{−i})

62
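The dynamic-VCG transfer is likewise a direct transcription of the formula above (a sketch; the caller is assumed to supply the precomputed quantities r_{−i}, E[V_{−i}(next)], and V_{−i}(now)). With γ = 0.9 it reproduces the example values T1(*) = 0, T2(*) = −32, and T2(CL) = −30 from the tutorial:

```python
def dynamic_vcg_transfer(r_others, exp_v_others_next, v_others_now, gamma):
    """Dynamic-VCG per-period transfer to agent i: others' immediate reward
    under the efficient action, plus the discounted expected continuation
    value V_{-i} at the next joint state, minus V_{-i} at the current one."""
    return r_others + gamma * exp_v_others_next - v_others_now
```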

slide-64
SLIDE 64

Dynamic-VCG mechanism

example

  • γ = 0.9
  • T1(*) = γ90 - γ90
  • T2(*) = γ30 - (50 + γ10)
  • T1(CL) = 100 - 100
  • T2(CL) = 0 - 30
  • ...

[Figure: MDP types for Agent 1 and Agent 2.]

63

slide-65
SLIDE 65

Dynamic-VCG

[Bergemann & Valimaki, 08]

  • No payment to any agent in any period is positive:

T_i(θ^t) = r_{−i}(θ^t_{−i}, π*(θ^t)) + γ·E[V_{−i}(τ(θ^t_{−i}, π*(θ^t)))] − V_{−i}(θ^t_{−i})

r_{−i}(θ^t_{−i}, π*(θ^t)) + γ·E[V_{−i}(τ(θ^t_{−i}, π*(θ^t)))] ≤ V_{−i}(θ^t_{−i})

  • Expected future payoff to every agent i, from any joint state, at any time t, is:

V(θ^t) − V_{−i}(θ^t_{−i}) ≥ 0

(NB: assumes no negative values)

64

slide-66
SLIDE 66

Dynamic-VCG

[Bergemann & Valimaki, 08]

Theorem: The dynamic-VCG mechanism is truthful and efficient in within-period ex post Nash equilibrium, within-period ex post individual rational, and ex post no-deficit.

65

slide-67
SLIDE 67

Dynamic-VCG: good social-welfare?

66

[Chart: % value retained by agents vs. number of agents, for dynamic-VCG.]

In a single-item allocation setting, with values normally distributed.

slide-68
SLIDE 68

Dynamic-VCG: good social-welfare?

Theorem: Among all history-independent mechanisms that are efficient in within-period ex post Nash equilibrium and within-period ex post individual rational, dynamic-VCG yields the most expected revenue, for every joint type.

[Cavallo, 08]

67

slide-69
SLIDE 69

Dynamic-VCG: good social-welfare?

  • Since dynamic-VCG can be so bad for the agents, what do we do?
  • Think back to the static setting... better budget balance was achieved by redistribution mechanisms; strong budget-balance by moving to Bayes-Nash equilibrium.

68

slide-70
SLIDE 70

A dynamic redistribution mechanism?

  • Redistribution is much more complicated in the dynamic setting: a redistribution payment computed in later time periods can potentially be influenced via an agent’s reports in earlier periods... in subtle ways.
  • Focus on worlds representable as multi-armed bandits.

69

slide-71
SLIDE 71

70

Dynamic-VCG for MABs reduces to:

  • Determine the optimal agent i to activate.
    – i pays (1−γ) times the expected value other agents would get if i were always ignored.
    – Other agents pay nothing.
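A sketch of this specialized payment rule (illustrative names; `v_others_without` is assumed precomputed from the reported chains). In the two-agent example on the following slides, γ = 0.9 and the winner’s charge is (1−γ)·(10 + γ·10):

```python
def mab_vcg_payments(gamma, winner, v_others_without, agents):
    """Dynamic-VCG specialized to MABs: the activated agent pays (1 - gamma)
    times the expected discounted value the other agents would get if he
    were always ignored; everyone else pays nothing.
    v_others_without: that expected value for the winner."""
    return {i: (-(1 - gamma) * v_others_without if i == winner else 0.0)
            for i in agents}
```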

slide-72
SLIDE 72

71

Dynamic-VCG for MABs:

  • Winner pays (1-γ) times the

expected value other agents would get if he were always ignored.

  • Other agents pay nothing.

[Figure: two agents’ Markov chains.]

slide-73
SLIDE 73

72

Dynamic-VCG for MABs:

  • Winner pays (1-γ) times the

expected value other agents would get if he were always ignored.

  • Other agents pay nothing.

[Figure: two agents’ Markov chains.]

T1 = −(1−γ)·(10 + γ·10)

slide-75
SLIDE 75

74

Dynamic-VCG for MABs:

  • Winner pays (1-γ) times the

expected value other agents would get if he were always ignored.

  • Other agents pay nothing.

3/4 1/4 1/2 1/2 30 30 30 1/2 1/2 20 20 T1 = -(1-γ) (10 + γ10) T2 = -(1-γ) 7.5

slide-77
SLIDE 77

76

Dynamic-VCG for MABs:

  • Winner pays (1-γ) times the

expected value other agents would get if he were always ignored.

  • Other agents pay nothing.

[Figure: two agents’ Markov chains.]

T1 = −(1−γ)·(10 + γ·10)
T2 = −(1−γ)·7.5
T2 = −(1−γ)·7.5

slide-78
SLIDE 78

77

Dynamic-RM for MABs

[Cavallo, 08]

  • Modify dynamic-VCG by adding the following payments to the agents each period:
    – For the agent i receiving the item: (1−γ)/n times the expected total discounted revenue that would result if i were ignored going forward.
    – For every other agent j: 1/n times the expected immediate revenue that would have resulted this period if j were ignored.
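A sketch of the per-period rebate rule (illustrative; both `*_without` dictionaries are assumed precomputed from the reported Markov chains):

```python
def dynamic_rm_rebates(gamma, winner, total_rev_without, immediate_rev_without):
    """Per-period rebates added to dynamic-VCG in dynamic-RM for MABs.
    total_rev_without[i]: expected total discounted revenue if i were
    ignored from now on; immediate_rev_without[j]: expected revenue this
    period if j were ignored."""
    n = len(total_rev_without)
    return {
        i: ((1 - gamma) / n * total_rev_without[i] if i == winner
            else immediate_rev_without[i] / n)
        for i in total_rev_without
    }
```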

slide-79
SLIDE 79

Dynamic-RM for MABs

[Cavallo, 08]

Lemma: Whatever strategy an agent follows, his expected redistribution payments over time equal: a 1/n share of the expected total (over time) revenue that would result if the agent were not present.

(This is the hard part to prove. Once we have, it follows that dynamic-RM is a dynamic-Groves mechanism, and thus efficient.)

78

slide-80
SLIDE 80

Dynamic-RM for MABs

[Cavallo, 08]

Theorem: Dynamic-RM is efficient in within-period ex post Nash equilibrium, within-period ex post IR, and never runs a deficit.

And yields significantly more value for the agents than dynamic-VCG.

Examples with three or more agents are tough to illustrate, so let’s just look at aggregate results:

79

slide-81
SLIDE 81

Value retained: normal distribution

80

[Chart: % value retained by agents vs. number of agents, for dynamic-RM and dynamic-VCG.]

slide-82
SLIDE 82

Value retained: uniform distribution

81

[Chart: % value retained by agents vs. number of agents, for dynamic-RM and dynamic-VCG.]

slide-83
SLIDE 83

82

                          efficiency      IR              budget-balance
team mechanism            w.p. ex post    w.p. ex post    huge deficit
dynamic-EAC               w.p. ex post    ex ante         ex ante no-deficit
dynamic-VCG               w.p. ex post    w.p. ex post    ex post no-deficit
dynamic-RM (MABs only)    w.p. ex post    w.p. ex post    ex post no-deficit, much closer to perfect BB
balanced mechanism        Bayes-Nash      ex ante         perfect

slide-84
SLIDE 84

(Balanced team mechanism presented by Susan Athey)

83

slide-85
SLIDE 85

Extensions

84

slide-86
SLIDE 86

Dynamically changing populations of agents

  • What’s new: agents may – either temporarily or permanently – become “inaccessible”, i.e., unable to communicate with the center or make/receive payments.
  • Generalizes arrival/departure dynamics.

85

slide-87
SLIDE 87

For instance:

  • Imagine selling theater tickets to tourists who plan to see multiple shows over a period of days.
    – New tourists always arriving, others leaving (dynamic population).
    – A tourist may see a show and realize she likes the theater more/less (dynamic types).

86

slide-88
SLIDE 88

Related area: online mechanism design

  • Dynamic population (arrivals and departures), but static types – all private information an agent will ever obtain can be reported in the arrival period.

[Friedman & Parkes, 03] [Parkes & Singh, 03] [Lavi & Nisan, 04] [Porter, 04]

87

slide-89
SLIDE 89

Online-VCG mechanism

[Parkes & Singh, 03]

  • Collects a single payment from each agent in her “arrival period”.
    – Within-period ex post efficient.
    – Ex post individually rational.
    – Ex post no-deficit.

88

slide-90
SLIDE 90

Online-VCG mechanism

[Parkes & Singh, 03]


But only for static types.

slide-91
SLIDE 91

Dynamic populations, dynamic types

[Cavallo, Parkes, & Singh, 07]

  • Unifies dynamic mechanism design and online mechanism design.
  • The new challenges:
    – The optimal policy must consider accessibility/inaccessibility dynamics.
    – Agents may not be available for payment while still exerting influence on the welfare of other agents.

89

slide-92
SLIDE 92

[Figure (a): Agent 1’s type.]

[Figure (b): Agent 2’s type.]

90

slide-93
SLIDE 93
  • Imagine agent 1 accessible at t = 1, and agent 2 inaccessible at t = 1 but very likely to become accessible at t = 2.

[Figure (a): Agent 1’s type.]

[Figure (b): Agent 2’s type.]

90

slide-94
SLIDE 94
  • Imagine agent 1 accessible at t = 1, and agent 2 inaccessible at t = 1 but very likely to become accessible at t = 2.
  • In the “naive” dynamic-VCG mechanism, agent 1 is better off “hiding” to improve social welfare.

[Figure (a): Agent 1’s type.]

[Figure (b): Agent 2’s type.]

90

slide-95
SLIDE 95
  • Imagine agent 1 accessible at t = 1, and agent 2 inaccessible at t = 1 but very likely to become accessible at t = 2.
  • In the “naive” dynamic-VCG mechanism, agent 1 is better off “hiding” to improve social welfare.
  • In a non-naive mechanism that makes dynamic-VCG payments only to accessible agents, agent 2 can benefit by hiding.

[Figure (a): Agent 1’s type.]

[Figure (b): Agent 2’s type.]

90

slide-96
SLIDE 96

A fix

  • For any inaccessible agent, keep a log of the payments dynamic-VCG would impose on the agent; when the agent becomes accessible, execute a “lump sum” payment, appropriately scaled for discounting.
  • Requires that all agents eventually “come back”.

91

slide-97
SLIDE 97

Imagine both agents accessible in all periods. Should agent 2 feign inaccessibility until t = 2?

[Figure (a): Agent 1’s type.]

[Figure (b): Agent 2’s type.]

92

slide-98
SLIDE 98

Imagine both agents accessible in all periods. Should agent 2 feign inaccessibility until t = 2?

[Figure (a): Agent 1’s type.]

[Figure (b): Agent 2’s type.]

T2 = -6 - 2 = -8, same whether he hides at t = 1 or not.

92

slide-99
SLIDE 99

Imagine both agents accessible in all periods. Should agent 2 feign inaccessibility until t = 2?

[Figure (a): Agent 1’s type.]

[Figure (b): Agent 2’s type.]

T2 = -6 - 2 = -8, same whether he hides at t = 1 or not.
(-6: difference in optimal value for agent 1 with and without agent 2 present at t = 1; -2: the same difference at t = 2.)

92

slide-100
SLIDE 100

What if agents don’t always come back?

  • In general, the scheme won’t work.
  • For an arrival/departure model, within-period ex post efficiency is recovered if agent arrivals are independent conditioned on actions chosen.

93

slide-101
SLIDE 101

Randomly arriving agents, revenue maximization

[Gershkov and Moldovanu, 09]

  • Goal is not efficiency, but rather revenue maximization.
  • Agents arrive randomly over time.
  • Set of resources to be allocated before a deadline.

94

slide-102
SLIDE 102

Computation

95

slide-103
SLIDE 103

The big bad secret

  • Computing optimal policies is, in general, very hard... but often necessary.
  • What can we do?
    – Approximations, yielding approximate equilibria (even this is hard)?
    – Identify tractable special cases. Thankfully, MABs are such a case.

96

slide-104
SLIDE 104

Computing optimal policies in MABs [Gittins & Jones, 74]

  • At each period, compute the Gittins index for each agent’s Markov chain.
  • “Activate” (e.g., allocate the resource to) the agent with the highest index.
  • Complexity: Gittins indices are independent, so linear in the number of agents.

97
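One standard way to compute the index of a state in a small finite chain is the “retirement option” (calibration) formulation: the index equals (1−γ)·M*, where M* is the lump-sum retirement value at which stopping and continuing from the state are indifferent. A sketch via bisection over M with an inner value iteration (illustrative, and only practical for small chains):

```python
def gittins_index(state, rewards, trans, gamma=0.9, tol=1e-6):
    """Gittins index of `state` in a finite Markov chain.
    rewards[s]: reward collected when the chain is activated in s;
    trans[s]: dict mapping next state -> probability."""
    states = list(rewards)

    def continue_value(M):
        # Value of the chain when the agent may retire at any time for M.
        V = {s: M for s in states}
        for _ in range(500):  # enough sweeps for gamma = 0.9 to converge
            V = {s: max(M, rewards[s]
                        + gamma * sum(p * V[s2] for s2, p in trans[s].items()))
                 for s in states}
        # Value of playing (not retiring) for one step from `state`.
        return rewards[state] + gamma * sum(p * V[s2] for s2, p in trans[state].items())

    lo, hi = 0.0, max(rewards.values()) / (1 - gamma)
    while hi - lo > tol:
        M = (lo + hi) / 2
        if continue_value(M) > M:  # still worth playing: M* is higher
            lo = M
        else:
            hi = M
    return (1 - gamma) * (lo + hi) / 2
```

Each period the center would then activate the agent whose reported chain has the highest index at its current state.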

slide-105
SLIDE 105

Beyond simple repeated allocation

Coordination of value information acquisition preceding one-time allocation of a single item (“metadeliberation auctions”).

98

slide-106
SLIDE 106

Metadeliberation auction

[Cavallo & Parkes, 08]

  • A resource is to be allocated. Agents have initial valuations for the resource. Valuations can potentially be increased by costly “deliberation” (e.g., researching new ways of using the resource).
  • How to coordinate deliberation/allocation to maximize social welfare?

99

slide-107
SLIDE 107

Metadeliberation auction

[Cavallo & Parkes, 08]

  • Given the optimal policy, the dynamic-VCG mechanism can be applied to deal with incentives.
  • Computing the optimal deliberation/allocation policy is tractable (reduction to a multi-armed bandit problem).
  • Note: even in this one-time allocation scenario, a realistic analysis of the problem reveals the need for a dynamic solution.

100

slide-108
SLIDE 108

Computation

Beyond bandits: heuristics for special cases

101

slide-109
SLIDE 109

Self-correcting dynamic multi-unit auctions

[Constantin & Parkes, here]

  • When computing the optimal policy is infeasible...
  • Propose a heuristic method that is strategyproof, yet achieves social welfare ~90% of optimal.

  • See talk tomorrow for details.

102

slide-110
SLIDE 110

Auctions with online supply

[Babaioff, Blumrosen, & Roth, workshop here]

  • Dynamically arriving items – unknown total quantity.
  • Approximate mechanisms.
  • Nonetheless truthful – possible due to the restricted setting.

103

slide-111
SLIDE 111

Open problems, future directions

104

slide-112
SLIDE 112

Interdependent values

  • Interestingly, the sequential nature of the problem kind of helps here: ex post payments become natural.
    – A version of the team mechanism is still within-period ex post efficient.
    – But no apparent way to extend dynamic-VCG...
    – Can we achieve no-deficit, IR, and efficiency in interdependent settings?

105

slide-113
SLIDE 113

Computation

  • The general case looks hopeless.
  • Continue to identify tractable special cases?
  • Adopt more realistic equilibrium notions?

106

slide-114
SLIDE 114

References

  • [Vickrey, 61] William Vickrey. Counterspeculations, auctions, and competitive sealed tenders. Journal of Finance, 16:8–37, 1961.
  • [Clarke, 71] Edward Clarke. Multipart pricing of public goods. Public Choice, 8:19–33, 1971.
  • [Groves, 73] Theodore Groves. Incentives in teams. Econometrica, 41:617–631, 1973.
  • [Green & Laffont, 77] Jerry Green and Jean-Jacques Laffont. Characterization of satisfactory mechanisms for the revelation of preferences for public goods. Econometrica, 45:427–438, 1977.
  • [Holmstrom, 79] Bengt Holmstrom. Groves’ scheme on restricted domains. Econometrica, 47(5):1137–1144, 1979.

107

slide-115
SLIDE 115
  • [Arrow, 79] Kenneth J. Arrow. The property rights doctrine and demand revelation under incomplete information. In M. Boskin, editor, Economics and Human Welfare. Academic Press, 1979.
  • [d’Aspremont & Gerard-Varet, 79] C. d’Aspremont and L.A. Gerard-Varet. Incentives and incomplete information. Journal of Public Economics, 11:25–45, 1979.
  • [Bailey, 97] Martin J. Bailey. The demand revealing process: To distribute the surplus. Public Choice, 91:107–126, 1997.
  • [Cavallo, 06] Ruggiero Cavallo. Optimal decision-making with minimal waste: Strategyproof redistribution of VCG payments. In Proceedings of the 5th International Conference on Autonomous Agents and Multi-Agent Systems (AAMAS’06), pages 882–889, 2006.
  • [Guo & Conitzer, 07] Mingyu Guo and Vincent Conitzer. Worst-case optimal redistribution of VCG payments. In Proceedings of the 8th ACM Conference on Electronic Commerce (EC-07), San Diego, CA, USA, pages 30–39, 2007.
  • [Moulin, 2007] Hervé Moulin. Efficient, strategy-proof and almost budget-balanced assignment. Unpublished, 2007.

108

slide-116
SLIDE 116
  • [Guo & Conitzer, 08] Mingyu Guo and Vincent Conitzer. Optimal-in-expectation redistribution mechanisms. In Proceedings of the 7th International Conference on Autonomous Agents and Multiagent Systems (AAMAS-08), 2008.
  • [Hartline & Roughgarden, 2008] Jason D. Hartline and Tim Roughgarden. Optimal mechanism design and money burning. In Proceedings of the 40th Annual ACM Symposium on Theory of Computing (STOC’08), 2008.
  • [de Clippel, Naroditskiy, Greenwald, here] Geoffroy de Clippel, Victor Naroditskiy, and Amy Greenwald. Destroy to Save. In Proceedings of the 10th ACM Conference on Electronic Commerce (EC-09), 2009.
  • [Myerson, 1986] Roger Myerson. Multistage games with communication. Econometrica, 54(2):323–358, 1986.
  • [Athey & Segal, 07] Susan Athey and Ilya Segal. An efficient dynamic mechanism. Working paper, http://www.stanford.edu/~isegal/agv.pdf, 2007.
  • [Cavallo, Parkes, & Singh, 06] Ruggiero Cavallo, David C. Parkes, and Satinder Singh. Optimal coordinated planning amongst self-interested agents with private state. In Proceedings of the 22nd Annual Conference on Uncertainty in Artificial Intelligence (UAI’06), 2006.

109

slide-117
SLIDE 117
  • [Cavallo, 08] Ruggiero Cavallo. Efficiency and redistribution in dynamic mechanism design. In Proceedings of the 9th ACM Conference on Electronic Commerce (EC-08), 2008.
  • [Bergemann & Valimaki, 08] Dirk Bergemann and Juuso Valimaki. Efficient dynamic auctions. Cowles Foundation Discussion Paper 1584, http://cowles.econ.yale.edu/P/cd/d15b/d1584.pdf, 2006.
  • [Cavallo, Parkes, & Singh, 07] Ruggiero Cavallo, David C. Parkes, and Satinder Singh. Online mechanisms for persistent, periodically inaccessible self-interested agents. In DIMACS Workshop on the Boundary between Economic Theory and Computer Science, 2007.
  • [Friedman & Parkes, 03] E. Friedman and D. C. Parkes. Pricing WiFi at Starbucks – issues in online mechanism design. In Proc. Fourth ACM Conference on Electronic Commerce (EC’03), pages 240–241, 2003.
  • [Parkes & Singh, 03] David C. Parkes and Satinder Singh. An MDP-based approach to Online Mechanism Design. In Proceedings of the 17th Annual Conf. on Neural Information Processing Systems (NIPS’03), 2003.
  • [Lavi & Nisan, 04] Ron Lavi and Noam Nisan. Competitive analysis of incentive compatible on-line auctions. Theoretical Computer Science, 310:159–180, 2004. Earlier version in ACM EC 2000.

110

slide-118
SLIDE 118
  • [Porter, 04] Ryan Porter. Mechanism design for online real-time scheduling. In Proceedings of the ACM Conference on Electronic Commerce (EC’04), pages 61–70, 2004.
  • [Gershkov and Moldovanu, 09] Alex Gershkov and Benny Moldovanu. Dynamic Revenue Maximization with Heterogeneous Objects: A Mechanism Design Approach. Forthcoming in American Economic Journal: Microeconomics.
  • [Gittins & Jones, 74] J. C. Gittins and D. M. Jones. A dynamic allocation index for the sequential design of experiments. In Progress in Statistics, pages 241–266. J. Gani et al., 1974.
  • [Cavallo & Parkes, 08] Ruggiero Cavallo and David C. Parkes. Efficient metadeliberation auctions. In Proceedings of the 23rd AAAI Conference on Artificial Intelligence (AAAI-08), 2008.
  • [Constantin & Parkes, here] Florin Constantin and David C. Parkes. Self-Correcting Sampling-Based Dynamic Multi-Unit Auctions. In the 10th ACM Electronic Commerce Conference (EC’09), 2009.
  • [Babaioff, Blumrosen, & Roth, workshop here] Moshe Babaioff, Liad Blumrosen, and Aaron Roth. Auctions with Online Supply. In Fifth Workshop on Ad Auctions, 2009.

111