Overcoming Limitations of Game-Theoretic Distributed Control


Overcoming Limitations of Game-Theoretic Distributed Control Jason R. Marden California Institute of Technology (joint work with Adam Wierman) Southern California Network Economics and Game Theory Symposium October 1, 2009 Engineering systems


slide-1
SLIDE 1

Overcoming Limitations of Game-Theoretic Distributed Control

Jason R. Marden California Institute of Technology

(joint work with Adam Wierman)

Southern California Network Economics and Game Theory Symposium October 1, 2009

slide-2
SLIDE 2

Engineering systems

Trend: Transition from centralized to local decision making

Appeal:
  • Local processing (manageable)
  • Reduces communication
  • Robustness

Challenges:
  • Characterization
  • Coordination
  • Efficiency

[Figure: example applications: network coding, vehicle target assignment, sensor coverage (sensing range)]

How should we design distributed engineering systems?

slide-3
SLIDE 3

Engineering systems

Trend: Transition from centralized to local decision making

Features of distributed design:
  • Local decisions
  • Local information
  • Global behavior depends on local decisions

Game Theory

[Figure: example applications: network coding, vehicle target assignment, sensor coverage (sensing range)]

slide-4
SLIDE 4

Game theory

[Diagram: social system (decision makers) → model as “game” → game theory → global behavior]

Descriptive Agenda: Modeling

Metrics:
  • Reasonable description of sociocultural phenomena?
  • Matches available experimental/observational data?

slide-5
SLIDE 5

Game theory

[Diagram: engineering system (decision makers) → model as “game” → game theory / distributed control → desired global behavior; shown next to the social system diagram]

Prescriptive Agenda: Distributed robust optimization

Metrics:
  • Asymptotic global behavior?
  • Communication/Information requirement?
  • Computation requirement?
  • Convergence rates?

Design parameters:
  • Decision makers
  • Objective/Utility functions
  • Decision/Learning rule

slide-6
SLIDE 6

Big picture

Game theory for distributed robust optimization

Part #1: model interactions as game
  • decision makers / players
  • possible choices
  • local objective functions

Part #2: local agent decision rules
  • informational dependencies
  • processing requirements

Goal: Emergent global behavior is desirable

Appeal:
  • available distributed learning algorithms
  • robustness to uncertainties
  • self-interested users?

Challenges:
  • convergence rates?

slide-8
SLIDE 8

Outline

Goal: Establish methodology for designing desirable utility functions

Desirable properties:
  • Existence of (pure) NE
  • Efficiency of NE
  • Locality of information
  • Tractability
  • Budget balance

Outline:
  • Propose framework to study utility design: Distributed welfare games
  • Identify methodologies that guarantee desirable properties
  • Identify fundamental limitations
  • Propose new framework to overcome limitations
slide-9
SLIDE 9

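Game theory

Non-cooperative game:
  • Players: $N = \{1, 2, \ldots, n\}$
  • Actions: $a_i \in A_i$
  • Joint actions: $A = A_1 \times \cdots \times A_n$
  • Utilities (preferences): $U_i : A \to \mathbb{R}$, written $U_i(a) = U_i(a_i, a_{-i})$

(Pure) Nash equilibrium: a joint action $a^*$ such that, for every player $i$,

$U_i(a_i^*, a_{-i}^*) = \max_{a_i \in A_i} U_i(a_i, a_{-i}^*)$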
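
The equilibrium condition above can be checked by brute force on a small finite game. The following is a minimal sketch, not part of the talk: the helper names `is_pure_nash` and `pure_nash_equilibria` and the two-player coordination game are illustrative assumptions.

```python
from itertools import product

def is_pure_nash(action_sets, utilities, joint_action):
    """Return True if no player can strictly gain by deviating alone."""
    for i, actions_i in enumerate(action_sets):
        current = utilities[i](joint_action)
        for alt in actions_i:
            # Replace only player i's action, keep a_{-i} fixed.
            deviated = joint_action[:i] + (alt,) + joint_action[i + 1:]
            if utilities[i](deviated) > current:
                return False
    return True

def pure_nash_equilibria(action_sets, utilities):
    """Enumerate all joint actions in A = A_1 x ... x A_n and keep the equilibria."""
    return [a for a in product(*action_sets)
            if is_pure_nash(action_sets, utilities, a)]

# Hypothetical coordination game: both players earn 1 if they match, else 0.
action_sets = [("x", "y"), ("x", "y")]
payoff = lambda a: 1.0 if a[0] == a[1] else 0.0
utilities = [payoff, payoff]

equilibria = pure_nash_equilibria(action_sets, utilities)
print(equilibria)
```

Enumeration is exponential in the number of players, so this is only useful for illustrating the definition on toy examples.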

slide-10
SLIDE 10

Resource allocation games

Setup:
  • Resources: $R$
  • Players: $N$
  • Actions: $A_i \subseteq 2^R$
  • Welfare: $W^r : 2^N \to \mathbb{R}_+$
  • Global Welfare: $W(a) = \sum_r W^r(a_r)$, where $a_r$ is the player set that chose resource $r$

Game design = Utility design
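
As a concrete reading of the setup above, the sketch below computes $a_r$ and $W(a)$ for a small instance. The resource names, the worth values and the "covered or not" welfare function are assumptions made for illustration only.

```python
def players_at(resource, joint_action):
    """a_r: the set of players whose chosen resource set contains `resource`."""
    return frozenset(i for i, chosen in enumerate(joint_action)
                     if resource in chosen)

def global_welfare(resources, welfare, joint_action):
    """W(a) = sum over resources r of W^r(a_r)."""
    return sum(welfare[r](players_at(r, joint_action)) for r in resources)

# Hypothetical instance: a resource yields its worth once any player selects it.
resources = ["t1", "t2"]
worth = {"t1": 3.0, "t2": 2.0}
welfare = {r: (lambda S, v=worth[r]: v if S else 0.0) for r in resources}

# Players 0 and 1 choose t1, player 2 chooses t2.
a = ({"t1"}, {"t1"}, {"t2"})
total = global_welfare(resources, welfare, a)
print(players_at("t1", a), total)
```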

slide-11
SLIDE 11

Resource allocation games

Framework is common to many application domains:
  • Akella et al., 2002 (Congestion control)
  • Goemans et al., 2004 (Content distribution)
  • Kesselman et al., 2005 (Switching/congestion control)
  • Komali and MacKenzie, 2007 (Topology control in ad-hoc networks)
  • Campos-Nanez et al., 2008 (Power management in sensor networks)

[Figure: example applications: network coding, vehicle target assignment, sensor coverage (sensing range)]

slide-12
SLIDE 12

Example: Vehicle target assignment

  • Resources: Targets
  • Players: Vehicles / Weapons
  • Actions: Possible engagements
  • Welfare: worth, expected damage and loss

Welfare at target $r$ for each set of engaging vehicles: $W^r(1)$, $W^r(2)$, $W^r(3)$, $W^r(1,2)$, $W^r(1,3)$, $W^r(2,3)$, $W^r(1,2,3)$

[Figure: vehicles 1, 2 and 3 with range restrictions and no communication]

  • G. Arslan et al., “Autonomous vehicle-target assignment: a game theoretical formulation,” 2007.

slide-16
SLIDE 16

Example: Vehicle target assignment

  • Resources: Targets
  • Players: Vehicles / Weapons
  • Actions: Possible engagements
  • Welfare: worth, expected damage and loss

Welfare at target $r$ for each set of engaging vehicles: $W^r(1)$, $W^r(2)$, $W^r(3)$, $W^r(1,2)$, $W^r(1,3)$, $W^r(2,3)$, $W^r(1,2,3)$

[Figure: vehicles 1, 2 and 3 with range restrictions and no communication]

Global objective: Maximize sum of welfare (centralized assignment not feasible)

  • G. Arslan et al., “Autonomous vehicle-target assignment: a game theoretical formulation,” 2007.
slide-17
SLIDE 17

Utility design

Goal: Assign each agent a utility such that the resulting game is desirable

  • Existence of NE
  • Efficiency of NE
  • Locality of information
  • Tractability
  • Budget balance

Approach: View like a cost sharing problem

[Diagram: an assignment generates welfare $W^r(\cdot)$ at a resource; a distribution rule distributes that welfare to the players as utilities $U$]

slide-19
SLIDE 19

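Distributed welfare games

Utility structure:

$U_i(a) = \sum_{r \in a_i} f^r(i, a_r) \, W^r(a_r)$

where $f^r$ is the distribution rule, which depends only on local information.

Properties of distribution rule:
1. $f^r(i, a_r) \geq 0$
2. $r \notin a_i \Rightarrow f^r(i, a_r) = 0$
3. $\sum_i f^r(i, a_r) \leq 1$

Budget Balanced: $W(a) = \sum_i U_i(a)$

[Diagram: welfare $W^r(\cdot)$ generated at a resource is split by the distribution rule into player utilities $U$]

Are cost sharing methodologies useful in designing utilities?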

slide-20
SLIDE 20

Equal share

$U_i(a_i, a_{-i}) = \sum_{r \in a_i} \frac{1}{|a_r|} W^r(a_r)$

Comparison criteria: NE exists, Budget Balanced, Complexity
  • Equal share: complexity low

** If the welfare function is anonymous, i.e. $W^r(a_r) = W^r(|a_r|)$, then a NE exists. (Monderer and Shapley, 1996)
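
A minimal sketch of the equal share utility, assuming actions are sets of resources and welfare functions take the player set $a_r$. The function name and the two anonymous welfare functions are hypothetical examples, not taken from the talk.

```python
def equal_share_utility(i, joint_action, welfare):
    """U_i(a) = sum over r in a_i of W^r(a_r) / |a_r|."""
    total = 0.0
    for r in joint_action[i]:
        a_r = frozenset(j for j, chosen in enumerate(joint_action) if r in chosen)
        total += welfare[r](a_r) / len(a_r)
    return total

# Hypothetical anonymous welfare functions: they depend only on |a_r|.
welfare = {
    "r1": lambda S: 4.0 * min(len(S), 2),
    "r2": lambda S: 3.0 if S else 0.0,
}

# Player 0 uses r1, player 1 uses r1 and r2, player 2 uses r2.
a = ({"r1"}, {"r1", "r2"}, {"r2"})
utils = [equal_share_utility(i, a, welfare) for i in range(3)]
print(utils, sum(utils))
```

In this instance the utilities add up to the global welfare (8 from r1 plus 3 from r2), which is the budget balance property for this rule.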

slide-21
SLIDE 21

Marginal contribution

(Wolpert and Tumer, 1999)

$U_i(a_i, a_{-i}) = \sum_{r \in a_i} \left( W^r(a_r) - W^r(a_r \setminus i) \right)$

Comparison criteria: NE exists, Budget Balanced, Complexity
  • Equal share: complexity low
  • Marginal contribution: complexity medium
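
A minimal sketch of the marginal contribution utility under the same conventions (actions are sets of resources, welfare functions take the player set). The single-resource "covered or not" welfare is an assumed example.

```python
def marginal_contribution_utility(i, joint_action, welfare):
    """U_i(a) = sum over r in a_i of W^r(a_r) - W^r(a_r without i)."""
    total = 0.0
    for r in joint_action[i]:
        a_r = frozenset(j for j, chosen in enumerate(joint_action) if r in chosen)
        total += welfare[r](a_r) - welfare[r](a_r - {i})
    return total

# Hypothetical welfare: the resource is worth 4 as soon as one player covers it.
welfare = {"r1": lambda S: 4.0 if S else 0.0}

shared = ({"r1"}, {"r1"})   # both players on r1
alone = ({"r1"}, set())     # only player 0 on r1

utils_shared = [marginal_contribution_utility(i, shared, welfare) for i in range(2)]
utils_alone = [marginal_contribution_utility(i, alone, welfare) for i in range(2)]
print(utils_shared, utils_alone)
```

When both players share the resource, neither changes the welfare by leaving, so both utilities are 0 while the welfare is 4. This small case shows that marginal contribution utilities need not sum to the welfare.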

slide-23
SLIDE 23

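Shapley value

(builds upon Hart and Mas-Colell, 1989)

$U_i(a_i, a_{-i}) = \sum_{r \in a_i} Sh^r(i, a_r)$

$Sh^r(i, N) = \sum_{S \subseteq N : i \in S} \omega_S \left( W^r(S) - W^r(S \setminus i) \right)$

  • The sum runs over all player subsets containing $i$
  • Each term is the marginal contribution of $i$ to that player subset
  • Intractable for large $N$

Comparison criteria: NE exists, Budget Balanced, Complexity
  • Equal share: complexity low
  • Marginal contribution: complexity medium
  • Shapley value: complexity high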
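
The formula above can be evaluated directly by enumerating subsets. The slide does not define the weights $\omega_S$; the sketch below assumes the standard Shapley weights $\omega_S = (|S|-1)!\,(|N|-|S|)!\,/\,|N|!$, and the constant welfare function is a hypothetical example.

```python
from itertools import combinations
from math import factorial

def shapley_value(i, players, w):
    """Sh(i, N) = sum over S containing i of omega_S * (W(S) - W(S without i)).

    Assumes the standard weights omega_S = (|S|-1)! (|N|-|S|)! / |N|!.
    """
    players = frozenset(players)
    n = len(players)
    others = sorted(players - {i})
    value = 0.0
    for k in range(len(others) + 1):
        for rest in combinations(others, k):
            S = frozenset(rest) | {i}
            weight = factorial(len(S) - 1) * factorial(n - len(S)) / factorial(n)
            value += weight * (w(S) - w(S - {i}))
    return value

# Hypothetical welfare: worth 6 whenever at least one player is present.
w = lambda S: 6.0 if S else 0.0
shares = [shapley_value(i, {1, 2, 3}, w) for i in (1, 2, 3)]
print(shares)
```

The loop visits $2^{|N|-1}$ subsets per player, which is the source of the "intractable for large $N$" remark. For this symmetric example each of the three players receives a third of the welfare.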

slide-24
SLIDE 24

Summary

Comparison criteria: NE exists, Budget Balanced, Complexity
  • Equal share: complexity low
  • Marginal contribution: complexity medium
  • Shapley value: complexity high

Tradeoff: Properties vs. Complexity

Is there anything else?
  • No, (weighted) SV is the only rule that guarantees NE + BB in all games. [Chen, Roughgarden & Valiant, 2008]: Network formation games (uniform)
  • Yes, if we restrict attention to special classes of games. [JRM & Wierman, 2008]: Not restricted to SV in some settings

slide-25
SLIDE 25

Efficiency

Can we provide efficiency guarantees for general welfare functions?

  • No. In general a NE can be arbitrarily bad.
  • Yes if welfare is submodular (decreasing marginal welfare)

(independent of number of game specifics)

Price of Anarchy: worst case performance of any NE

$POA = \inf_G \min_{a^{ne} \in G} \frac{W(a^{ne})}{W(a^{opt})}$

Price of Stability: worst case performance of best NE

$POS = \inf_G \max_{a^{ne} \in G} \frac{W(a^{ne})}{W(a^{opt})}$
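
The definitions above take an infimum over a class of games $G$. The sketch below only evaluates the inner quantities for one small game: the worst and best equilibrium welfare relative to the optimum. The two-resource game, the equal share utilities and all names are assumptions chosen for illustration.

```python
from itertools import product

# Hypothetical game: two players each pick one resource; a resource
# contributes its worth to the welfare once at least one player picks it.
RESOURCES = {"r1": 2.0, "r2": 1.0}

def welfare(a):
    return sum(RESOURCES[r] for r in set(a))

def utility(i, a):
    # Equal share of the resource's worth among the players that chose it.
    return RESOURCES[a[i]] / a.count(a[i])

def is_nash(a):
    for i in range(len(a)):
        for alt in RESOURCES:
            deviated = a[:i] + (alt,) + a[i + 1:]
            if utility(i, deviated) > utility(i, a):
                return False
    return True

joint_actions = list(product(RESOURCES, repeat=2))
opt = max(welfare(a) for a in joint_actions)
ne_welfare = [welfare(a) for a in joint_actions if is_nash(a)]

worst_ratio = min(ne_welfare) / opt   # inner term of POA for this game
best_ratio = max(ne_welfare) / opt    # inner term of POS for this game
print(worst_ratio, best_ratio)
```

Here both players crowding onto r1 is an equilibrium with welfare 2, while the optimum of 3 is also an equilibrium, so the two ratios differ.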

slide-26
SLIDE 26

Submodularity

  • Submodularity (decreasing marginal welfare):

    $W(S + s) - W(S) \geq W(S' + s) - W(S')$ for $S \subset S' \subset N$

  • Submodularity can be exploited to improve efficiency

[Figure: sensor coverage (sensing range) and vehicle target assignment examples; credit: Andreas Krause (Caltech)]
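
The inequality above can be verified exhaustively on a small ground set. This is a sketch under assumptions: the helper names and the two test functions are hypothetical, and the check also includes the trivial case $S = S'$.

```python
from itertools import combinations

def subsets(items):
    """Yield every subset of `items` as a frozenset."""
    items = sorted(items)
    for k in range(len(items) + 1):
        for combo in combinations(items, k):
            yield frozenset(combo)

def is_submodular(ground, w, tol=1e-12):
    """Check W(S+s) - W(S) >= W(S'+s) - W(S') for all S in S' and s outside S'."""
    ground = frozenset(ground)
    for s_big in subsets(ground):
        for s_small in subsets(s_big):
            for s in ground - s_big:
                gain_small = w(s_small | {s}) - w(s_small)
                gain_big = w(s_big | {s}) - w(s_big)
                if gain_small < gain_big - tol:
                    return False
    return True

# Hypothetical welfare with diminishing returns in the number of players.
diminishing = lambda S: 1.0 - 0.5 ** len(S)
# Hypothetical welfare with increasing returns, which is not submodular.
increasing = lambda S: float(len(S) ** 2)

print(is_submodular({1, 2, 3}, diminishing), is_submodular({1, 2, 3}, increasing))
```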

slide-27
SLIDE 27

Efficiency of equilibria

  • Submodularity (decreasing marginal welfare):

    $W(S + s) - W(S) \geq W(S' + s) - W(S')$ for $S \subset S' \subset N$

  • Submodularity can be exploited to improve efficiency

Theorem [JRM & Wierman, 2008] [Vetta, 2002]: For any distributed welfare game where
(i) resource specific welfare functions are submodular, and
(ii) utilities are greater than or equal to marginal contribution, $U_i(a_i, a_{-i}) \geq W(a_i, a_{-i}) - W(\emptyset, a_{-i})$,
then if a NE exists, the price of anarchy is at least 1/2, i.e.,

$\frac{W(a^{ne})}{W(a^{opt})} \geq \frac{1}{2}$

slide-30
SLIDE 30

Efficiency

Comparison criteria: NE exists, Budget Balanced, Complexity, POS, POA
  • Marginal contribution: complexity medium, POA 1/2
  • Shapley value: complexity high, POA 1/2

Best known centralized approximation algorithms: $(1 - 1/e) \approx 0.63$

What about price of stability? Is POS = 1?

Fundamental Limitation [JRM & Wierman, 2009]: Existence of NE and budget balance imply POS < 1, and POS = 1/2 for submodular welfare, $W(S + s) - W(S) \geq W(S' + s) - W(S')$.

slide-35
SLIDE 35

Proof

Direction: distribution rule (POS=1) → game

Submodular welfare functions of the form $W^r(a_r) = c$ for all $a_r \neq \emptyset$

[Figure: example game with resource values 1, x, y and x − ε; shares annotated ≥ 1/2]

  • Unique NE: W = 1
  • Optimal: W = 1 + x
  • POS ≤ 2/3

By increasing the number of players we can drive POS to 1/2

slide-36
SLIDE 36

Efficiency

Comparison criteria: NE exists, Budget Balanced, Complexity, POS, POA
  • Marginal contribution: complexity medium, POS 1, POA 1/2
  • Shapley value: complexity high, POS 1/2, POA 1/2

Conflict between budget balance and efficiency

Is it possible to overcome limitations by conditioning utilities on more information?

slide-37
SLIDE 37

Recap

  • Proved: distribution rule → game with POS < 1
  • Possible? game → distribution rule with POS = 1

Ordered protocols

slide-38
SLIDE 38

Ordered Protocol

Players 1, 2 and 3 select resource $r$ in the order 1st, 2nd, 3rd.

Welfare: $W^r(1)$, $W^r(2)$, $W^r(3)$, $W^r(1,2)$, $W^r(1,3)$, $W^r(2,3)$, $W^r(1,2,3)$

Payoffs:
  • 1st: $W^r(1)$
  • 2nd: $W^r(1,2) - W^r(1)$
  • 3rd: $W^r(1,2,3) - W^r(1,2)$

slide-39
SLIDE 39

Ordered Protocol

Payoffs:
  • 1st: $W^r(1)$
  • 2nd: $W^r(1,2) - W^r(1)$
  • 3rd: $W^r(1,2,3) - W^r(1,2)$

Properties:
  • Budget Balanced: the payoffs sum to $W^r(1,2,3)$

slide-40
SLIDE 40

Ordered Protocol

Payoffs:
  • 1st: $W^r(1)$
  • 2nd: $W^r(1,2) - W^r(1)$
  • 3rd: $W^r(1,2,3) - W^r(1,2)$

Properties:
  • Budget Balanced
  • U ≥ Marginal Contribution (for submodular welfare):
      1st: $W^r(1) \geq W^r(1,2,3) - W^r(2,3)$
      2nd: $W^r(1,2) - W^r(1) \geq W^r(1,2,3) - W^r(1,3)$
      3rd: $W^r(1,2,3) - W^r(1,2) = W^r(1,2,3) - W^r(1,2)$

Last player’s utility is equal to its marginal contribution

Can we use ordered protocols to guarantee POS = 1 for a given game?
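
A minimal sketch of the ordered protocol payoffs for one resource, checking the two properties listed above on an example. The function name and the diminishing-returns welfare function are assumptions made for illustration.

```python
def ordered_protocol_payoffs(order, w):
    """Each player receives its marginal gain over the players ahead of it."""
    payoffs = {}
    ahead = frozenset()
    for player in order:
        payoffs[player] = w(ahead | {player}) - w(ahead)
        ahead = ahead | {player}
    return payoffs

# Hypothetical submodular welfare for a single resource.
w = lambda S: 1.0 - 0.5 ** len(S)

order = [1, 2, 3]
payoffs = ordered_protocol_payoffs(order, w)

everyone = frozenset(order)
# Marginal contribution of each player to the full set.
marginal = {i: w(everyone) - w(everyone - {i}) for i in order}

print(payoffs)
print(sum(payoffs.values()), w(everyone))
```

For this example the payoffs telescope to $W^r(1,2,3)$, every payoff is at least the player's marginal contribution, and the last player's payoff equals it.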

slide-50
SLIDE 50

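Efficiency

Direction: game → distribution rule (POS = 1)

Create ordered protocol:
(1) Consider OPT
(2) Specify any order
(3) Extend order by alternatives last
(4) Remaining order anything

[Figure: players marked (1st), (2nd), (3rd) at each resource]

Utility at OPT satisfies

$U_i(a^{opt}) \geq W(a^{opt}) - W(\emptyset, a^{opt}_{-i})$

$U_i(a_i', a^{opt}_{-i}) = W(a_i', a^{opt}_{-i}) - W(\emptyset, a^{opt}_{-i})$

so that

$U_i(a_i', a^{opt}_{-i}) > U_i(a^{opt}) \Rightarrow W(a_i', a^{opt}_{-i}) > W(a^{opt})$

The right-hand side would contradict the optimality of $a^{opt}$, so no player has a profitable deviation (OPT = NE).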

slide-51
SLIDE 51

Recap

  • Proved: distribution rule → game with POS < 1
  • Possible? game → distribution rule with POS = 1: ordered protocols

Do we need to condition the distribution rule on the game?

  • No. Simple adaptive dynamics can find desired distribution rule.

slide-58
SLIDE 58

Priority Based Distribution Rule

(1) Define an auxiliary state for each resource that specifies the order
(2) If a user leaves a resource, all players behind that user move up one spot in the queue
(3) If a user joins a resource, the user enters the last spot in the queue

If OPT is played then it is a NE
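
The queue rules above can be sketched as a small per-resource state object. The class name `ResourceQueue` is hypothetical, and paying each player its marginal gain over the players ahead of it (the ordered protocol payoff) is an assumption about how the queue order is turned into utilities.

```python
class ResourceQueue:
    """Auxiliary state for one resource: the order in which players joined."""

    def __init__(self, w):
        self.w = w          # welfare function over sets of players
        self.order = []     # front of the list is the first spot

    def join(self, player):
        # Rule (3): a joining user enters the last spot in the queue.
        if player not in self.order:
            self.order.append(player)

    def leave(self, player):
        # Rule (2): removing a user moves everyone behind it up one spot.
        if player in self.order:
            self.order.remove(player)

    def payoff(self, player):
        # Assumed payoff: marginal gain over the players ahead in the queue.
        position = self.order.index(player)
        ahead = frozenset(self.order[:position])
        return self.w(ahead | {player}) - self.w(ahead)

# Hypothetical submodular welfare for the resource.
queue = ResourceQueue(lambda S: 1.0 - 0.5 ** len(S))
for p in ("a", "b", "c"):
    queue.join(p)

before = queue.payoff("c")   # "c" is third in line
queue.leave("a")
after = queue.payoff("c")    # "c" has moved up to second
print(before, after, queue.order)
```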

slide-59
SLIDE 59

Summary

Comparison criteria: NE exists, Budget Balanced, Complexity, POS, POA
  • Marginal contribution: complexity medium, POS 1, POA 1/2
  • Shapley value: complexity high, POS 1/2, POA 1/2
  • Priority based: complexity medium, POS 1, POA 1/2

Take Away Points:
(1) Noncooperative game theory has inherent limitations with respect to distributed control
(2) Utilizing noncooperative game theory for distributed control is a design choice, not a requirement
(3) Many of the limitations can be overcome by moving beyond noncooperative games (introducing an auxiliary state variable)

slide-60
SLIDE 60

Summary

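Noncooperative Game
  • Players: $\{1, \ldots, n\}$
  • Actions: $A_i$
  • Utilities: $U_i : A \to \mathbb{R}$

State Based Game
  • Players: $\{1, \ldots, n\}$
  • Actions: $A_i$
  • States: $X$
  • State Transition: $P : X \times A \to \Delta(X)$
  • Utilities: $U_i : A \times X \to \mathbb{R}$

Extra flexibility in design can be utilized to improve performance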

slide-61
SLIDE 61

Conclusions

Thank You!
