slide-1
SLIDE 1

Optimal Decentralized Control of System with Partially Exchangeable Agents

Aditya Mahajan

McGill University

Joint work with Jalal Arabneydi

Allerton Conference on Communication, Control, and Computing
28 Sep, 2016

slide-10
SLIDE 10

Decentralized control with exchangeable agents–(Arabneydi and Mahajan)

Optimal decentralized control: Applications and Theory

Internet of Things
Smart Grids
Sensor Networks
Swarm Robotics

Salient features

Multiple decision makers
Access to different information
Cooperate towards a common objective

Series of positive results in the last 10-15 years:

funnel causality, quadratic invariance, common information approach, and others.

Explicit solutions are rare and typically exist for systems with two or three agents.

slide-11
SLIDE 11

Are there features that are present in the applications but are missing from the theory?

slide-15
SLIDE 15

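System with exchangeable agents

Dynamics $\mathbf{x}_{t+1} = f_t(\mathbf{x}_t, \mathbf{u}_t, \mathbf{w}_t)$ with per-step cost $c_t(\mathbf{x}_t, \mathbf{u}_t)$.

Pair of exchangeable agents

Agents $i$ and $j$ are exchangeable if

$\mathcal{X}^i = \mathcal{X}^j$, $\mathcal{U}^i = \mathcal{U}^j$, $\mathcal{W}^i = \mathcal{W}^j$,
$f_t(\sigma_{ij}\mathbf{x}_t, \sigma_{ij}\mathbf{u}_t, \sigma_{ij}\mathbf{w}_t) = \sigma_{ij}(f_t(\mathbf{x}_t, \mathbf{u}_t, \mathbf{w}_t))$,
$c_t(\sigma_{ij}\mathbf{x}_t, \sigma_{ij}\mathbf{u}_t) = c_t(\mathbf{x}_t, \mathbf{u}_t)$,

where $\sigma_{ij}$ denotes the permutation that swaps the components of agents $i$ and $j$.

Set of exchangeable agents

A set of agents is exchangeable if every pair in that set is exchangeable.

System with partially exchangeable agents

. . . is a multi-agent system where the set of agents can be partitioned into disjoint sets of exchangeable agents.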
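
The pairwise definition above can be checked numerically. The following is a minimal sketch, assuming scalar agents that are coupled through the empirical mean of the states. The functions `f`, `c`, `swap`, and `is_exchangeable`, and all coefficients, are illustrative and do not come from the slides.

```python
import random

def swap(v, i, j):
    """Apply sigma_ij: return a copy of v with components i and j swapped."""
    out = list(v)
    out[i], out[j] = out[j], out[i]
    return out

def f(x, u, w, a=0.9, b=0.5, d=0.2):
    """Hypothetical dynamics: own state and action plus the population mean."""
    mean_x = sum(x) / len(x)
    return [a * xi + b * ui + d * mean_x + wi for xi, ui, wi in zip(x, u, w)]

def c(x, u, q=1.0, r=0.1):
    """Hypothetical per-step cost: a symmetric sum of quadratic terms."""
    return sum(q * xi ** 2 + r * ui ** 2 for xi, ui in zip(x, u))

def is_exchangeable(i, j, n=5, trials=100, tol=1e-9, seed=0):
    """Check both conditions of the definition at randomly drawn points."""
    rng = random.Random(seed)
    for _ in range(trials):
        x = [rng.gauss(0.0, 1.0) for _ in range(n)]
        u = [rng.gauss(0.0, 1.0) for _ in range(n)]
        w = [rng.gauss(0.0, 1.0) for _ in range(n)]
        # Dynamics: permuting the inputs must permute the outputs.
        lhs = f(swap(x, i, j), swap(u, i, j), swap(w, i, j))
        rhs = swap(f(x, u, w), i, j)
        if any(abs(p - q) > tol for p, q in zip(lhs, rhs)):
            return False
        # Cost: permuting the inputs must leave the cost unchanged.
        if abs(c(swap(x, i, j), swap(u, i, j)) - c(x, u)) > tol:
            return False
    return True

print(is_exchangeable(0, 3))
```

A random-point check like this can only refute exchangeability, not prove it; it is meant as a sanity test of a model before the structure is exploited.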

slide-18
SLIDE 18

Notation

$N$: number of heterogeneous agents
$K$: number of sub-populations

For agent $i$ of sub-population $k$

$x^i_t \in \mathbb{R}^{d^k_x}$: state of agent $i$
$u^i_t \in \mathbb{R}^{d^k_u}$: control action of agent $i$

For sub-population $k$

$\mathcal{N}^k$: set of agents in sub-population $k$
$\bar x^k_t = \frac{1}{|\mathcal{N}^k|} \sum_{i \in \mathcal{N}^k} x^i_t$: mean-field of states
$\bar u^k_t = \frac{1}{|\mathcal{N}^k|} \sum_{i \in \mathcal{N}^k} u^i_t$: mean-field of actions

slide-19
SLIDE 19

Notation

For the entire population

$\mathcal{N} = \mathcal{N}^1 \cup \dots \cup \mathcal{N}^K$: set of all agents
$\mathcal{K} = \{1, \dots, K\}$: set of all sub-populations
$\mathbf{x}_t = (x^i_t)_{i \in \mathcal{N}}$: global state of the system
$\mathbf{u}_t = (u^i_t)_{i \in \mathcal{N}}$: joint actions of all agents
$\bar{\mathbf{x}}_t = \operatorname{vec}(\bar x^1_t, \dots, \bar x^K_t)$: global mean-field of states
$\bar{\mathbf{u}}_t = \operatorname{vec}(\bar u^1_t, \dots, \bar u^K_t)$: global mean-field of actions
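
As a small illustration of this notation, the sketch below computes the mean-field of each sub-population and stacks them into the global mean-field. It assumes scalar agent states; the helper names and the example partition are hypothetical.

```python
def mean_field(values, members):
    """Empirical mean of values[i] over the agents i of one sub-population."""
    return sum(values[i] for i in members) / len(members)

def global_mean_field(values, partition):
    """Stack the per-sub-population mean-fields: vec(xbar^1, ..., xbar^K)."""
    return [mean_field(values, members) for members in partition]

# Hypothetical example: N = 5 agents split into K = 2 sub-populations.
partition = [[0, 1, 2], [3, 4]]
x = [1.0, 2.0, 3.0, 10.0, 20.0]
xbar = global_mean_field(x, partition)
print(xbar)  # [2.0, 15.0]
```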

slide-23
SLIDE 23

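Linear quadratic system with partially exchangeable agents

Dynamics

$$\mathbf{x}_{t+1} = A_t \mathbf{x}_t + B_t \mathbf{u}_t + \mathbf{w}_t$$

Cost

$$\sum_{t=1}^{T} \big[ \mathbf{x}_t^\top Q_t \mathbf{x}_t + \mathbf{u}_t^\top R_t \mathbf{u}_t \big]$$

Irrespective of the information structure, such a system is equivalent to a mean-field coupled system.

Agent dynamics in sub-population $k$

$$x^i_{t+1} = A^k_t x^i_t + B^k_t u^i_t + D^k_t \bar{\mathbf{x}}_t + E^k_t \bar{\mathbf{u}}_t + w^i_t$$

Cost

$$\sum_{t=1}^{T} \Big[ \sum_{k \in \mathcal{K}} \sum_{i \in \mathcal{N}^k} \frac{1}{|\mathcal{N}^k|} \big[ (x^i_t)^\top Q^k_t x^i_t + (u^i_t)^\top R^k_t u^i_t \big] + \bar{\mathbf{x}}_t^\top P^x_t \bar{\mathbf{x}}_t + \bar{\mathbf{u}}_t^\top P^u_t \bar{\mathbf{u}}_t \Big]$$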

slide-25
SLIDE 25

There is a long history of mean-field approximations

Mean-field approximation in statistical physics (Weiss 1907; Landau 1937)

"It is a well-known phenomenon in many branches of the exact and physical sciences that very great numbers are often easier to handle than those of medium size. An almost exact theory of a gas, containing about $10^{25}$ freely moving particles, is incomparably easier than that of the solar system, made up of 9 major bodies… This is, of course, due to the excellent possibility of applying the laws of statistics and probabilities in the first case."

von Neumann and Morgenstern, Theory of Games and Economic Behavior (1944), §2.4.2

slide-27
SLIDE 27

There is a long history of mean-field approximations

Mean-field approximation in statistical physics (Weiss 1907; Landau 1937)

. . .

Mean-field approximations in Game Theory (Anonymous games)

Jovanovic and Rosenthal 1988
Bergin and Bernhardt 1995
Weintraub, Benkard, and Van Roy 2008
. . .

Mean-field approximations in Systems and Control (Mean-field games)

Huang, Caines, and Malhamé 2003, . . .
Lasry and Lions 2006, . . .
. . .

slide-28
SLIDE 28

Our results are different

There is no approximation!
Results are applicable to systems with an arbitrary (not necessarily large) number of agents.

slide-31
SLIDE 31

Main idea: What happens if mean-field is observed?

Mean-field sharing information structure

$$I^i_t = \{ x^i_{1:t},\ u^i_{1:t-1},\ \bar{\mathbf{x}}_{1:t} \}$$

Is it a restrictive assumption?

Not really. The mean-field can be shared using a small communication overhead (using consensus algorithms).
We later provide approximate results for when the mean-field is not shared.

Not one of the known tractable information structures

Not partially nested (or stochastically nested)
Not quadratic invariant
Not partial history sharing

slide-36
SLIDE 36

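A surprisingly simple solution . . .

Parallel axis theorem

$$\frac{1}{|\mathcal{N}^k|} \sum_{i \in \mathcal{N}^k} (x^i_t)^\top Q^k_t x^i_t = \frac{1}{|\mathcal{N}^k|} \sum_{i \in \mathcal{N}^k} (\breve x^i_t)^\top Q^k_t \breve x^i_t + (\bar x^k_t)^\top Q^k_t \bar x^k_t,$$

where $\breve x^i_t = x^i_t - \bar x^k_t$.

Decoupled per-step cost

$$\sum_{k \in \mathcal{K}} \sum_{i \in \mathcal{N}^k} \frac{1}{|\mathcal{N}^k|} \big[ (\breve x^i_t)^\top Q^k_t \breve x^i_t \big] + \bar{\mathbf{x}}_t^\top (\bar Q_t + P^x_t) \bar{\mathbf{x}}_t + \text{similar } u\text{-terms}$$

Noise-coupled dynamics

$$\breve x^i_{t+1} = A^k_t \breve x^i_t + B^k_t \breve u^i_t + \breve w^i_t,$$

$$\bar{\mathbf{x}}_{t+1} = \bar A_t \bar{\mathbf{x}}_t + \bar B_t \bar{\mathbf{u}}_t + \bar{\mathbf{w}}_t$$

We still have a non-classical information structure.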
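
The parallel axis identity is what splits the cost into a deviation part and a mean-field part. A minimal numerical check for scalar states is sketched below; the function name and the sample values are illustrative.

```python
def parallel_axis_gap(x, q):
    """Left-hand side minus right-hand side of the parallel axis identity
    for scalar states x^i and a scalar weight q (one sub-population)."""
    n = len(x)
    xbar = sum(x) / n
    lhs = sum(q * xi * xi for xi in x) / n
    rhs = sum(q * (xi - xbar) ** 2 for xi in x) / n + q * xbar * xbar
    return lhs - rhs

# The gap is zero up to floating-point rounding for any choice of states.
print(abs(parallel_axis_gap([1.0, 2.0, 3.0, 10.0], q=2.0)) < 1e-12)
```

The cross term vanishes because the deviations $x^i_t - \bar x^k_t$ sum to zero within a sub-population.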

slide-40
SLIDE 40

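Assume centralized information and use certainty equivalence

Local states

Dynamics: $\breve x^i_{t+1} = A^k_t \breve x^i_t + B^k_t \breve u^i_t + \breve w^i_t$
Cost: $(\breve x^i_t)^\top Q^k_t \breve x^i_t + (\breve u^i_t)^\top R^k_t \breve u^i_t$
Control law: $\breve u^i_t = \breve L^k_t \breve x^i_t$
Gains: $\breve L^k_t = -(\cdots)^{-1} (B^k_t)^\top \breve M^k_{t+1} A^k_t$
Riccati equation: $\breve M^k_{1:T} = \mathrm{DRE}(A^k_{1:T}, B^k_{1:T}, Q^k_{1:T}, R^k_{1:T})$
K equations, one for each sub-population

Mean-field state

Dynamics: $\bar{\mathbf{x}}_{t+1} = \bar A_t \bar{\mathbf{x}}_t + \bar B_t \bar{\mathbf{u}}_t + \bar{\mathbf{w}}_t$
Cost: $\bar{\mathbf{x}}_t^\top (\bar Q_t + P^x_t) \bar{\mathbf{x}}_t + \bar{\mathbf{u}}_t^\top (\bar R_t + P^u_t) \bar{\mathbf{u}}_t$
Control law: $\bar{\mathbf{u}}_t = \bar L_t \bar{\mathbf{x}}_t$
Gains: $\bar L_t = -(\cdots)^{-1} (\bar B_t)^\top \bar M_{t+1} \bar A_t$
Riccati equation: $\bar M_{1:T} = \mathrm{DRE}(\bar A_{1:T}, \bar B_{1:T}, \bar Q_{1:T} + P^x_{1:T}, \bar R_{1:T} + P^u_{1:T})$
1 equation for all mean-fields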
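
The two Riccati recursions can be sketched as follows. The slide elides the inverted term in the gains, so this sketch assumes the textbook finite-horizon LQR recursion, with scalar, time-invariant parameters and a single sub-population (K = 1). Under that assumption, averaging the agent dynamics gives a mean-field system with coefficients `a + d` and `b + e`. The function name `dre` and all numerical values are hypothetical.

```python
def dre(a, b, q, r, T):
    """Backward Riccati recursion for scalar, time-invariant (a, b, q, r).

    Returns lists M and L indexed by t = 1..T (index 0 is unused).
    Assumes the standard LQR form, which the slides do not spell out.
    """
    M = [0.0] * (T + 1)
    L = [0.0] * (T + 1)
    M[T] = q  # terminal weight; L[T] stays 0 because the last action only adds cost
    for t in range(T - 1, 0, -1):
        s = r + b * M[t + 1] * b
        L[t] = -(b * M[t + 1] * a) / s
        M[t] = q + a * M[t + 1] * a - (a * M[t + 1] * b) ** 2 / s
    return M, L

# Hypothetical parameters for one sub-population.
a, b, d, e = 0.9, 0.5, 0.1, 0.0
q, r, p_x, p_u = 1.0, 0.1, 2.0, 0.0
T = 20

# Local (deviation) system: one recursion per sub-population.
M_breve, L_breve = dre(a, b, q, r, T)
# Mean-field system: a single recursion with the augmented weights.
M_bar, L_bar = dre(a + d, b + e, q + p_x, r + p_u, T)
```

The cost of computing the gains depends on the number of sub-populations and the state dimensions, not on the number of agents.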

slide-41
SLIDE 41

$$u^i_t = \breve u^i_t + \bar u^k_t = \breve L^k_t (x^i_t - \bar x^k_t) + \bar L^k_t \bar{\mathbf{x}}_t$$

Optimal centralized solution can be implemented with mean-field sharing information structure.
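
A minimal sketch of how each agent could evaluate this control law, assuming scalar states and a single sub-population so that the global mean-field reduces to one number. The function name and the gain values are hypothetical.

```python
def mean_field_sharing_controls(x, L_breve, L_bar):
    """Each agent combines its own deviation from the mean-field with the
    shared mean-field itself; no other agent's state is needed."""
    xbar = sum(x) / len(x)
    return [L_breve * (xi - xbar) + L_bar * xbar for xi in x]

# Hypothetical gains and states.
x = [1.0, 2.0, 3.0]
u = mean_field_sharing_controls(x, L_breve=-0.5, L_bar=-0.25)
print(u)  # [0.0, -0.5, -1.0]
```

Since the deviations average to zero, the mean of the actions equals `L_bar * xbar`, which matches the mean-field control law.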

slide-42
SLIDE 42

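Solution generalizes to . . .

Major-minor setup

One major agent and a population of minor agents.

Tracking cost function

$$\sum_{k \in \mathcal{K}} \sum_{i \in \mathcal{N}^k} \frac{1}{|\mathcal{N}^k|} \big[ (x^i_t - \mathring x^i_t)^\top Q^k_t (x^i_t - \mathring x^i_t) + (u^i_t)^\top R^k_t u^i_t \big] + (\bar{\mathbf{x}}_t - r_t)^\top P^x_t (\bar{\mathbf{x}}_t - r_t) + \bar{\mathbf{u}}_t^\top P^u_t \bar{\mathbf{u}}_t$$

Systems coupled through weighted mean-field

$$\bar x^k_t = \frac{1}{|\mathcal{N}^k|} \sum_{i \in \mathcal{N}^k} \lambda^i x^i_t, \qquad \bar u^k_t = \frac{1}{|\mathcal{N}^k|} \sum_{i \in \mathcal{N}^k} \lambda^i u^i_t.$$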

slide-43
SLIDE 43

But what if the mean-field is not observed?

slide-45
SLIDE 45

Partial mean-field sharing information structure

Notation

We will compare performance with a system where the mean-field is completely observed. To avoid confusion, use

State: $s^i_t$
Actions: $v^i_t$

and similar notation for the mean-field $\bar s^k_t$, etc.

Set $\mathcal{S}$: mean-field (MF) observed
Set $\mathcal{S}^c$: MF not observed

slide-46
SLIDE 46

Partial mean-field sharing information structure

Estimated mean-field

$$\mathbf{z}_t = (z^1_t, \dots, z^K_t) = \mathbb{E}[\bar{\mathbf{s}}_t \mid \{\bar s^k_t\}_{k \in \mathcal{N}}],$$

where

$$z^k_{t+1} = \begin{cases} \bar s^k_{t+1}, & k \in \mathcal{S} \\ A^k_t z^k_t + (B^k_t \bar L^k_t + D^k_t + E^k_t \bar L_t) \mathbf{z}_t, & k \notin \mathcal{S} \end{cases}$$

Set $\mathcal{S}$: MF observed
Set $\mathcal{S}^c$: MF not observed
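
The estimator update can be sketched as below, assuming a scalar mean-field per sub-population, so that $\bar L_t$ is a K-by-K matrix (a list of rows) and $D^k_t$, $E^k_t$ are rows of length K. The function names, argument layout, and numbers are hypothetical.

```python
def dot(row, vec):
    """Inner product of two equally long lists."""
    return sum(r * v for r, v in zip(row, vec))

def estimate_step(z, s_bar_next, observed, A, B, D, E, L_bar):
    """One step of the estimated mean-field z_t -> z_{t+1}.

    observed   : set of sub-population indices whose mean-field is shared
    s_bar_next : observed mean-fields at t+1 (entries outside `observed` are ignored)
    """
    K = len(z)
    u_bar = [dot(L_bar[k], z) for k in range(K)]  # predicted mean-field actions
    z_next = []
    for k in range(K):
        if k in observed:
            z_next.append(s_bar_next[k])  # copy the observation
        else:
            # A^k z^k + B^k Lbar^k z + D^k z + E^k Lbar z
            z_next.append(A[k] * z[k] + B[k] * u_bar[k]
                          + dot(D[k], z) + dot(E[k], u_bar))
    return z_next

# Hypothetical example with K = 2, where only sub-population 0 is observed.
z_next = estimate_step(
    z=[1.0, 2.0], s_bar_next=[0.7, None], observed={0},
    A=[1.0, 1.0], B=[1.0, 1.0],
    D=[[0.0, 0.0], [0.5, 0.5]], E=[[0.0, 0.0], [0.0, 0.0]],
    L_bar=[[-0.5, 0.0], [0.0, -0.5]],
)
print(z_next)  # [0.7, 2.5]
```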

slide-52
SLIDE 52

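Certainty equivalence controller and its performance

Certainty equivalence controller

$$u^i_t = \breve L^k_t (s^i_t - z^k_t) + \bar L^k_t \mathbf{z}_t$$

Key lemma

Under the certainty equivalence control: $\breve s^i_t = \breve x^i_t$. Thus,

$$\hat J - J^* = \mathbb{E}\Big[ \sum_{t=1}^{T} \big[ \bar{\mathbf{s}}_t^\top \hat Q_t \bar{\mathbf{s}}_t + \bar{\mathbf{v}}_t^\top \hat R_t \bar{\mathbf{v}}_t - \bar{\mathbf{x}}_t^\top \hat Q_t \bar{\mathbf{x}}_t - \bar{\mathbf{u}}_t^\top \hat R_t \bar{\mathbf{u}}_t \big] \Big] = \mathbb{E}\Big[ \sum_{t=1}^{T} \begin{bmatrix} \zeta_t \\ \xi_t \end{bmatrix}^\top \tilde Q_t \begin{bmatrix} \zeta_t \\ \xi_t \end{bmatrix} \Big] \quad \text{(quadratic cost)},$$

where $\zeta^k_t = \bar x^k_t - z^k_t$ and $\xi^k_t = \bar s^k_t - z^k_t$.

Moreover,

$$\begin{bmatrix} \zeta_{t+1} \\ \xi_{t+1} \end{bmatrix} = \tilde A_t \begin{bmatrix} \zeta_t \\ \xi_t \end{bmatrix} + \begin{bmatrix} h \circ \bar{\mathbf{w}}_t \\ h \circ \bar{\mathbf{w}}_t \end{bmatrix} \quad \text{(linear dynamics)}.$$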

slide-56
SLIDE 56

Certainty equivalence controller and its performance

Exact performance

$$\hat J - J^* = \operatorname{Tr}(\tilde X_1 \tilde M_1) + \sum_{t=1}^{T-1} \operatorname{Tr}(\tilde W_t \tilde M_{t+1}), \quad \text{where } \tilde M_{1:T} = \mathrm{DLE}(\tilde A_{1:T}, \tilde Q_{1:T}).$$

Performance bound

Let $n = \min_{k \notin \mathcal{S}} \{|\mathcal{N}^k|\}$. Suppose all noises are independent. Then, there exists a matrix $C$ such that $\tilde X_1 \le C/n$ and $\tilde W_t \le C/n$. Thus,

$$\hat J - J^* \in \mathcal{O}\left(\frac{T}{n}\right).$$

Infinite horizon

Results extend to the infinite-horizon setup under standard assumptions. For both discounted and average cost setups:

$$\hat J - J^* \in \mathcal{O}\left(\frac{1}{n}\right), \quad \text{c.f. } \mathcal{O}\left(\frac{1}{\sqrt{n}}\right) \text{ in mean-field games (MFG).}$$
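
The trace formula and the resulting scaling can be illustrated on a scalar stand-in for the error system. The sketch assumes a zero-mean scalar process $e_{t+1} = a e_t + w_t$ with initial second moment `X1`, noise variance `W`, and per-step cost $q e_t^2$; these names and the scalar simplification are assumptions, not the matrices on the slide.

```python
def dle(a, q, T):
    """Backward Lyapunov recursion M_T = q, M_t = q + a * M_{t+1} * a."""
    M = [0.0] * (T + 1)  # index 0 is unused
    M[T] = q
    for t in range(T - 1, 0, -1):
        M[t] = q + a * M[t + 1] * a
    return M

def gap_via_lyapunov(a, q, X1, W, T):
    """Scalar analogue of Tr(X1 M_1) + sum_{t=1}^{T-1} Tr(W M_{t+1})."""
    M = dle(a, q, T)
    return X1 * M[1] + sum(W * M[t + 1] for t in range(1, T))

def gap_via_second_moments(a, q, X1, W, T):
    """Same quantity, computed by propagating E[e_t^2] forward in time."""
    P, total = X1, 0.0
    for _ in range(T):
        total += q * P
        P = a * P * a + W
    return total

# If X1 and W shrink like C/n, the gap shrinks like 1/n for a fixed horizon.
for n in (10, 20, 40):
    print(n, gap_via_lyapunov(0.9, 1.0, 1.0 / n, 1.0 / n, T=10))
```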

slide-57
SLIDE 57

An example: Demand response with minimum discomfort to users

slide-58
SLIDE 58

Demand response of space heaters

Dynamics of space heater

$$x^i_{t+1} = a(x^i_t - x_{\text{nom}}) + b(u^i_t + u_{\text{nom}}) + w^i_t$$

Objective

$$\mathbb{E}\Big[ \sum_{t=1}^{T} \Big( \frac{1}{n} \sum_{i=1}^{n} \big[ q_t (x^i_t - x^i_{\text{des}})^2 + r_t (u^i_t)^2 \big] + p_t (\bar{\mathbf{x}}_t - \bar{\mathbf{x}}^{\text{ref}}_t)^2 \Big) \Big]$$
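
The heater model and its per-step cost can be written out directly. This is a minimal sketch that follows the dynamics and objective as stated above; the function names and all numerical values are illustrative.

```python
def heater_step(x, u, w, a, b, x_nom, u_nom):
    """One step of the space-heater dynamics, applied to every agent."""
    return [a * (xi - x_nom) + b * (ui + u_nom) + wi
            for xi, ui, wi in zip(x, u, w)]

def per_step_cost(x, u, x_des, x_ref, q, r, p):
    """Average local discomfort and control effort, plus the penalty for
    the mean temperature deviating from the reference x_ref."""
    n = len(x)
    x_bar = sum(x) / n
    local = sum(q * (xi - di) ** 2 + r * ui ** 2
                for xi, ui, di in zip(x, u, x_des)) / n
    return local + p * (x_bar - x_ref) ** 2

# Hypothetical two-heater example with no noise and no control action.
x = [20.0, 22.0]
u = [0.0, 0.0]
print(per_step_cost(x, u, x_des=[20.0, 22.0], x_ref=20.0, q=1.0, r=0.1, p=3.0))  # 3.0
print(heater_step(x, u, [0.0, 0.0], a=0.5, b=1.0, x_nom=20.0, u_nom=1.0))  # [1.0, 2.0]
```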

slide-59
SLIDE 59

Everyone follows the mean-field

slide-60
SLIDE 60

Everyone follows the optimal strategy

slide-66
SLIDE 66

Decentralized control with exchangeable agents–(Arabneydi and Mahajan)

17

Summary

Decentralized control with exchangeable agents–(Arabneydi and Mahajan)

2

System with exchangeable agents

Dynamics 𝐲t+1 = ft(𝐲t, 𝐯t, 𝐱t) with per-step cost ct(𝐲t, 𝐯t).

Pair of exchangeable agents

Agents i and j are exchangeable if 𝒴i = 𝒴j, 𝒱i = 𝒱j, 𝒳i = 𝒳j. ft(σij𝐲t, σij𝐯t, σij𝐱t) = σij(ft(𝐲t, 𝐯t, 𝐱t)) ct(σij𝐲t, σij𝐯t) = ct(𝐲t, 𝐯t).

Set of exchangeable agents

A set of agents is exchangeable if every pairin that set is exchangeable

System with partially exchangeable agents

. . . is a multi-agent system where the set of agents can be partitioned into disjoint sets of exchangeable agents. Decentralized control with exchangeable agents–(Arabneydi and Mahajan)

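Linear quadratic system with partially exchangeable agents

Dynamics: $\mathbf{x}_{t+1} = A_t \mathbf{x}_t + B_t \mathbf{u}_t + \mathbf{w}_t$

Cost: $\sum_{t=1}^{T} \big[\mathbf{x}_t^\top Q_t \mathbf{x}_t + \mathbf{u}_t^\top R_t \mathbf{u}_t\big]$

Irrespective of the information structure, such a system is equivalent to a mean-field coupled system.

Agent dynamics in sub-population k:

$$x^i_{t+1} = A^k_t x^i_t + B^k_t u^i_t + D^k_t \bar{\mathbf{x}}_t + E^k_t \bar{\mathbf{u}}_t + w^i_t$$

Cost:

$$\sum_{t=1}^{T} \Big[ \sum_{k \in \mathcal{K}} \sum_{i \in \mathcal{N}^k} \frac{1}{|\mathcal{N}^k|} \big[ (x^i_t)^\top Q^k_t x^i_t + (u^i_t)^\top R^k_t u^i_t \big] + \bar{\mathbf{x}}_t^\top P^x_t \bar{\mathbf{x}}_t + \bar{\mathbf{u}}_t^\top P^u_t \bar{\mathbf{u}}_t \Big]$$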

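Assume centralized information and use certainty equivalence

Dynamics

Local states: $\breve{x}^i_{t+1} = A^k_t \breve{x}^i_t + B^k_t \breve{u}^i_t + \breve{w}^i_t$
Mean-field state: $\bar{\mathbf{x}}_{t+1} = \bar{A}_t \bar{\mathbf{x}}_t + \bar{B}_t \bar{\mathbf{u}}_t + \bar{\mathbf{w}}_t$

Cost

Local states: $(\breve{x}^i_t)^\top Q^k_t \breve{x}^i_t + (\breve{u}^i_t)^\top R^k_t \breve{u}^i_t$
Mean-field state: $(\bar{\mathbf{x}}_t)^\top (P^x_t + \bar{Q}_t) \bar{\mathbf{x}}_t + (\bar{\mathbf{u}}_t)^\top (P^u_t + \bar{R}_t) \bar{\mathbf{u}}_t$

Control Law

Local states: $\breve{u}^i_t = \breve{L}^k_t \breve{x}^i_t$
Mean-field state: $\bar{\mathbf{u}}_t = \bar{L}_t \bar{\mathbf{x}}_t$

Gains

Local states: $\breve{L}^k_t = -(\cdots)^{-1} (B^k_t)^\top \breve{M}^k_{t+1} A^k_t$
Mean-field state: $\bar{L}_t = -(\cdots)^{-1} (\bar{B}_t)^\top \bar{M}_{t+1} \bar{A}_t$

Riccati Equation

Local states: $\breve{M}^k_{1:T} = \mathrm{DRE}(A^k_{1:T}, B^k_{1:T}, Q^k_{1:T}, R^k_{1:T})$
Mean-field state: $\bar{M}_{1:T} = \mathrm{DRE}(\bar{A}_{1:T}, \bar{B}_{1:T}, \bar{Q}_{1:T} + P^x_{1:T}, \bar{R}_{1:T} + P^u_{1:T})$

K equations, one for each sub-population; 1 equation for all mean-fields.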
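The DRE operator and the gains could be computed with a backward recursion such as the following sketch. The slides elide the term inside $(\cdots)^{-1}$; the code assumes the textbook finite-horizon LQR form $R_t + B_t^\top M_{t+1} B_t$ and a terminal condition $M_T = Q_T$, both of which are assumptions made here rather than details confirmed by the slides. The function name `dre` is likewise only illustrative.

```python
import numpy as np

def dre(A, B, Q, R):
    """Backward Riccati recursion for time-varying matrices (lists of length T).

    Assumed textbook form (not confirmed by the slides):
        M[T-1] = Q[T-1]
        S      = R[t] + B[t]^T M[t+1] B[t]
        L[t]   = -S^{-1} B[t]^T M[t+1] A[t]
        M[t]   = Q[t] + A[t]^T M[t+1] (A[t] + B[t] L[t])
    Returns the cost-to-go matrices M[0..T-1] and the gains L[0..T-2].
    """
    T = len(A)
    M = [None] * T
    L = [None] * (T - 1)
    M[T - 1] = Q[T - 1]
    for t in range(T - 2, -1, -1):
        S = R[t] + B[t].T @ M[t + 1] @ B[t]
        L[t] = -np.linalg.solve(S, B[t].T @ M[t + 1] @ A[t])
        M[t] = Q[t] + A[t].T @ M[t + 1] @ (A[t] + B[t] @ L[t])
    return M, L

# Hypothetical scalar example with horizon T = 3.
T = 3
one = np.eye(1)
M, L = dre([one] * T, [one] * T, [one] * T, [one] * T)
```

Under this reading, the full solution would need K calls with the sub-population matrices $(A^k, B^k, Q^k, R^k)$ and one call with the mean-field matrices $(\bar{A}, \bar{B}, \bar{Q} + P^x, \bar{R} + P^u)$, so the cost of the computation does not grow with the number of agents.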


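Solution generalizes to . . .

Major-minor setup: one major agent and a population of minor agents.

Tracking cost function:

$$\sum_{k \in \mathcal{K}} \sum_{i \in \mathcal{N}^k} \frac{1}{|\mathcal{N}^k|} \big[ (x^i_t - \mathring{x}^i_t)^\top Q^k_t (x^i_t - \mathring{x}^i_t) + (u^i_t)^\top R^k_t u^i_t \big] + (\bar{\mathbf{x}}_t - r_t)^\top P^x_t (\bar{\mathbf{x}}_t - r_t) + \bar{\mathbf{u}}_t^\top P^u_t \bar{\mathbf{u}}_t$$

Systems coupled through weighted mean-field:

$$\bar{x}^k_t = \frac{1}{|\mathcal{N}^k|} \sum_{i \in \mathcal{N}^k} \lambda^i x^i_t, \qquad \bar{u}^k_t = \frac{1}{|\mathcal{N}^k|} \sum_{i \in \mathcal{N}^k} \lambda^i u^i_t.$$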

slide-67
SLIDE 67


Conclusion

Salient Features

The solution complexity depends only on the number of sub-populations, not on the number of agents.
Agents don't need to be aware of the number of agents.
Same performance as centralized information.
Thus, centralized performance can be achieved by simply sharing the mean-field (empirical mean) of the states!

Generalizations

Noisy observation of mean-field
Delay in the observation of mean-field
Controlled Markov processes

arXiv:1609.00056

slide-68
SLIDE 68


Optimal decentralized control: Applications and Theory

Internet of Things
Smart Grids
Sensor Networks
Swarm Robotics

Salient features

Multiple decision makers
Access to different information
Cooperate towards a common objective

Series of positive results in the last 10-15 years:

funnel causality, quadratic invariance, common information approach, and others.

Explicit solutions are rare and typically exist for systems with two or three agents.


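A surprisingly simple solution . . .

Parallel axis theorem:

$$\frac{1}{|\mathcal{N}^k|} \sum_{i \in \mathcal{N}^k} (x^i_t)^\top Q^k_t x^i_t = \frac{1}{|\mathcal{N}^k|} \sum_{i \in \mathcal{N}^k} (\breve{x}^i_t)^\top Q^k_t \breve{x}^i_t + (\bar{x}^k_t)^\top Q^k_t \bar{x}^k_t, \quad \text{where } \breve{x}^i_t = x^i_t - \bar{x}^k_t.$$

Decoupled per-step cost:

$$\sum_{k \in \mathcal{K}} \sum_{i \in \mathcal{N}^k} \frac{1}{|\mathcal{N}^k|} \big[ (\breve{x}^i_t)^\top Q^k_t \breve{x}^i_t \big] + \bar{\mathbf{x}}_t^\top (\bar{Q}_t + P^x_t) \bar{\mathbf{x}}_t + \text{similar u-terms}$$

Noise-coupled dynamics:

$$\breve{x}^i_{t+1} = A^k_t \breve{x}^i_t + B^k_t \breve{u}^i_t + \breve{w}^i_t, \qquad \bar{\mathbf{x}}_{t+1} = \bar{A}_t \bar{\mathbf{x}}_t + \bar{B}_t \bar{\mathbf{u}}_t + \bar{\mathbf{w}}_t$$

We still have a non-classical information structure.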
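The parallel axis identity can be verified numerically for one sub-population. The sketch below uses randomly generated data; the sizes `n` and `d`, the helper `avg_quad`, and the way `Q` is constructed are assumptions chosen only for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
n, d = 50, 3                     # hypothetical sub-population size and state dimension
X = rng.normal(size=(n, d))      # row i holds the state x^i of agent i
G = rng.normal(size=(d, d))
Q = G @ G.T                      # symmetric positive semi-definite weight

x_bar = X.mean(axis=0)           # empirical mean-field of the sub-population
X_breve = X - x_bar              # deviations x^i - x_bar

def avg_quad(Z, Q):
    """Average of z^T Q z over the rows z of Z."""
    return np.einsum("ni,ij,nj->", Z, Q, Z) / Z.shape[0]

lhs = avg_quad(X, Q)                             # average cost of the raw states
rhs = avg_quad(X_breve, Q) + x_bar @ Q @ x_bar   # deviation cost plus mean-field cost
```

The cross terms vanish because the deviations sum to zero within the sub-population, which is why `lhs` and `rhs` agree up to floating-point error.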




Everyone follows the optimal strategy