Distributed demand control in power grids and ODEs for Markov decision processes
Second Conference on the Mathematics of Energy Markets, Wolfgang Pauli Institute, Vienna, 4-9 July 2017
Ana Bušić, Inria Paris, Département
Challenges
Challenges of renewable power generation
Impact of wind and solar on net-load at CAISO
Ramp limitations cause price spikes:
- Price spike due to high net-load ramping need when solar production ramped out
- Negative prices due to high mid-day solar production
[Figure: CAISO load and net-load (GW), renewable generation (GW) and price ($/MWh) over 24 hrs. Curves: Total Load, Total Wind, Total Solar, and Net-load (Total Load, less Wind and Solar). Peak and peak ramp are marked.]
[Figure: GW(t) = wind generation in BPA (GW), Jan 01 to Jan 06, 2015, with ramps marked.]
Challenges
Challenges of renewable power generation
Balancing control loop
- Wind and solar volatility seen as disturbance
- Grid-level measurements: a scalar function of time (ACE), a linear combination of the frequency deviation and the tie-line error (power mismatch between the scheduled and actual power out of the balancing region)
- Compensation Gc designed by a balancing authority
In many cases control loops are based on standard PI (proportional-integral) control design.
[Figure: block diagram of the balancing loop with compensation Gc, actuation H and grid Gp; disturbances enter at the grid; measurements Y(t), input U(t); ∆P delivered = Ha + Hb + · · ·]
Challenges
Challenges of renewable power generation
Increasing needs for ancillary services
[Figure: reference signal from the Balancing Authority over 160 hours (t/hour); block diagram: Balancing Authority, ancillary services, grid (voltage, frequency, phase).]
In the past, provided by the generators - high costs!
Challenges
Tracking Grid Signal with Residential Loads
Tracking objective:
[Figure: reference signal from the Balancing Authority over 160 hours (t/hour); block diagram: Balancing Authority, ancillary services, grid (voltage, frequency, phase).]
Prior work
- Deterministic centralized control: Sanandaji et al. 2014 [HICSS], Biegel et al. 2013 [IEEE TSG]
- Randomized control: Mathieu, Koch, Callaway 2013 [IEEE TPS] (decisions at the BA); Meyn, Barooah, B., Chen, Ehren 2015 [IEEE TAC] (local decisions, restricted load models)
Challenges
Tracking Grid Signal with Residential Loads
Example: 20 pools, 20 kW max load
Each pool consumes 1 kW when operating
12 hour cleaning cycle each 24 hours
[Figure: power deviation (kW) of 20 pools over 160 hours (t/hour): output deviation versus reference, axis from −10 to 10 kW; input ζt, axis from −3 to 3.]
Nearly Perfect Service from Pools
Meyn, Barooah, B., Chen, Ehren 2015 [IEEE TAC]
using an extension/reinterpretation of Todorov 2007 [NIPS] (linearly solvable MDPs)
Challenges
Tracking Grid Signal with Residential Loads
Example: 300,000 pools, 300 MW max load
Each pool consumes 1 kW when operating
12 hour cleaning cycle each 24 hours
[Figure: power deviation (MW) of 300,000 pools over 160 hours (t/hour): output deviation versus reference, axis from −100 to 100 MW; input ζt, axis from −3 to 3.]
Nearly Perfect Service from Pools
What About Other Loads?
Meyn, Barooah, B., Chen, Ehren 2015 [IEEE TAC]
using an extension/reinterpretation of Todorov 2007 [NIPS] (linearly solvable MDPs)
Demand Dispatch
Control Goals and Architecture
Macro control
High-level control layer: BA or a load aggregator.
The balancing challenges are of many different categories and time-scales:
- Automatic Generation Control (AGC); time scales of seconds to 20 minutes.
- Balancing reserves. In the Bonneville Power Administration, the balancing reserves include both AGC and balancing on timescales of many hours. Balancing on a slower time-scale is achieved through real time markets in some other regions of the U.S.
- Contingencies (e.g., a generator outage)
- Peak shaving
- Smoothing ramps from solar or wind generation
Demand Dispatch
Control Goals and Architecture
Local Control: decision rules designed to respect needs of load and grid
Demand Dispatch: Power consumption from loads varies automatically to provide service to the grid, without impacting QoS to the consumer
[Figure: power grid control architecture. Resources (water pump, batteries, coal, gas turbine) connect through blocks labeled BP, C, H and A in the actuator feedback loop, driven by grid measurements (voltage, frequency, phase). Local feedback loop for load i: grid signal ζt, local decision U^i_t, state X^i_t, power deviation Y^i_t.]
- Min. communication: each load monitors its state and a regulation signal from the grid.
- Aggregate must be controllable: randomized policies for finite-state loads.
Mean Field Model
Load Model
Controlled Markovian Dynamics
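[Figure: the BA compares the aggregate power consumption (MW) of Load 1, Load 2, ..., Load N with the reference r (MW) and, through the compensator Gc, broadcasts the signal ζ to all loads.]
Discrete time: the ith load $X^i(t)$ evolves on a finite state space $\mathsf{X}$.
Each load is subject to common controlled Markovian dynamics.
Signal $\zeta = \{\zeta_t\}$ is broadcast to all loads.
Controlled transition matrix $\{P_\zeta : \zeta \in \mathbb{R}\}$:
$$P\{X^i_{t+1} = x' \mid X^i_t = x,\ \zeta_t = \zeta\} = P_\zeta(x, x')$$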
Questions
- How to analyze aggregate of similar loads?
- Local control design?
Aggregate model
Mean Field Model
How to analyze aggregate?
Mean field model
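$N$ loads running independently, each under the command $\zeta$.
Empirical distributions:
$$\mu^N_t(x) = \frac{1}{N} \sum_{i=1}^{N} \mathbb{I}\{X^i(t) = x\}, \qquad x \in \mathsf{X}$$
$U(x)$: power consumption in state $x$,
$$y^N_t = \frac{1}{N} \sum_{i=1}^{N} U(X^i_t) = \sum_{x} \mu^N_t(x)\, U(x)$$
Mean-field model (via Law of Large Numbers for martingales):
$$\mu_{t+1} = \mu_t P_{\zeta_t}, \qquad y_t = \langle \mu_t, U \rangle, \qquad \zeta_t = f_t(y_0, \ldots, y_t) \ \text{by design}$$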
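A minimal sketch of the mean-field approximation, under stated assumptions: the two-state load, the nominal matrix `P0`, the power vector `U`, the sinusoidal command `zeta_seq`, and the exponential-tilting rule used for the controlled kernel are all illustrative choices, not values from the talk. It simulates N independent loads driven by a common signal and compares the empirical average power with the deterministic recursion for the mean-field model.

```python
import numpy as np

def tilt(P0, U, zeta):
    """Illustrative controlled kernel: reweight P0(x, x') by exp(zeta * U(x'))."""
    W = P0 * np.exp(zeta * U)[None, :]
    return W / W.sum(axis=1, keepdims=True)

def simulate(N, P0, U, zeta_seq, rng):
    """Return (y_N, y_mf): empirical and mean-field average power per time step."""
    d = len(U)
    X = np.zeros(N, dtype=int)           # every load starts in state 0
    mu = np.zeros(d)
    mu[0] = 1.0                          # matching mean-field initial condition
    y_N, y_mf = [], []
    for zeta in zeta_seq:
        y_N.append(U[X].mean())          # y^N_t = (1/N) sum_i U(X^i_t)
        y_mf.append(mu @ U)              # y_t = <mu_t, U>
        P = tilt(P0, U, zeta)
        cum = P.cumsum(axis=1)
        u = rng.random(N)
        # inverse-CDF sampling of the next state, row by row
        X = np.minimum((u[:, None] > cum[X]).sum(axis=1), d - 1)
        mu = mu @ P                      # mu_{t+1} = mu_t P_{zeta_t}
    return np.array(y_N), np.array(y_mf)

P0 = np.array([[0.9, 0.1], [0.2, 0.8]])  # hypothetical nominal dynamics
U = np.array([0.0, 1.0])                 # state 1 means "on"
zeta_seq = 0.5 * np.sin(np.arange(50) / 5.0)
y_N, y_mf = simulate(20000, P0, U, zeta_seq, np.random.default_rng(0))
max_gap = np.abs(y_N - y_mf).max()
```

The gap `max_gap` should shrink roughly like the inverse square root of N, which is the sense in which the aggregate is described by the mean-field recursion.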
Local Control Design
Local Control Design
Local Design
Goal: Construct a family of transition matrices {Pζ : ζ ∈ R}
Myopic Design:
$$P^{\text{myop}}_\zeta(x, x') := P_0(x, x') \exp\big(\zeta U(x') - \Lambda_\zeta(x)\big)$$
with $\Lambda_\zeta(x) := \log\Big(\sum_{x'} P_0(x, x') \exp\big(\zeta U(x')\big)\Big)$ the normalizing constant.
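The myopic design can be sketched in a few lines. This is an illustration under assumptions: the three-state nominal matrix `P0` and the power vector `U` are hypothetical, and the function name `myopic_kernel` is introduced here only for the example.

```python
import numpy as np

def myopic_kernel(P0, U, zeta):
    """Return (P_zeta, Lambda_zeta) for the myopic design.

    P_zeta(x, x') = P0(x, x') * exp(zeta * U(x') - Lambda_zeta(x)),
    Lambda_zeta(x) = log sum_{x'} P0(x, x') * exp(zeta * U(x')).
    """
    weights = P0 * np.exp(zeta * U)[None, :]
    Lam = np.log(weights.sum(axis=1))
    return weights * np.exp(-Lam)[:, None], Lam

# Hypothetical nominal model: state 0 is "off", states 1 and 2 are "on".
P0 = np.array([[0.5, 0.5, 0.0],
               [0.1, 0.6, 0.3],
               [0.3, 0.0, 0.7]])
U = np.array([0.0, 1.0, 1.0])

P_pos, Lam_pos = myopic_kernel(P0, U, 1.0)   # zeta > 0 favours high-power states
P_zero, Lam_zero = myopic_kernel(P0, U, 0.0) # zeta = 0 recovers the nominal model
```

Transitions that are impossible under the nominal model stay impossible for every ζ, since the design only reweights the entries of the nominal matrix.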
Exponential family design:
$$P_\zeta(x, x') := P_0(x, x') \exp\big(h_\zeta(x, x') - \Lambda_{h_\zeta}(x)\big)$$
with $h_\zeta(x, x') = \zeta H_0(x, x')$. The choice of $H_0$ will typically correspond to the linearization of a more advanced design around the value $\zeta = 0$ (or some other fixed value of $\zeta$).
Local Control Design
Local Design
Goal: Construct a family of transition matrices {Pζ : ζ ∈ R}
Individual Perspective Design
Consider a finite-time-horizon optimization problem. For a given terminal time $T$, let $p_0$ denote the pmf on strings of length $T$,
$$p_0(x_1, \ldots, x_T) = \prod_{i=0}^{T-1} P_0(x_i, x_{i+1}),$$
where $x_0 \in \mathsf{X}$ is assumed to be given. The scalar $\zeta \in \mathbb{R}$ is interpreted as a weighting parameter in the following definition of total welfare. For any pmf $p$,
$$W_T(p) = \zeta\, E_p\Big[\sum_{t=1}^{T} U(X_t)\Big] - D(p \,\|\, p_0)$$
where the expectation is with respect to $p$, and $D$ denotes relative entropy:
$$D(p \,\|\, p_0) := \sum_{x_1, \ldots, x_T} \log\Big(\frac{p(x_1, \ldots, x_T)}{p_0(x_1, \ldots, x_T)}\Big)\, p(x_1, \ldots, x_T)$$
Local Control Design
Local Design
Goal: Construct a family of transition matrices {Pζ : ζ ∈ R}
It is easy to check that the myopic design is an optimizer for the horizon $T = 1$,
$$P^{\text{myop}}_\zeta(x_0, \cdot) \in \arg\max_{p} W_1(p).$$
The infinite-horizon mean welfare is denoted
$$\eta^*_\zeta = \lim_{T \to \infty} \frac{1}{T} W_T(p^*_T)$$
Explicit construction via eigenvector problem:
$$P_\zeta(x, y) = \frac{1}{\lambda} \frac{v(y)}{v(x)} \hat{P}_\zeta(x, y), \qquad x, y \in \mathsf{X},$$
where $\hat{P}_\zeta v = \lambda v$ and $\hat{P}_\zeta(x, y) = \exp(\zeta U(x)) P_0(x, y)$.
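The eigenvector construction can be sketched numerically as follows. This is an illustration, assuming an irreducible nominal matrix so that the Perron-Frobenius eigenvalue is real and simple with a positive eigenvector; the matrix `P0`, the vector `U`, and the function name `eigen_kernel` are hypothetical.

```python
import numpy as np

def eigen_kernel(P0, U, zeta):
    """Build P_zeta(x,y) = P_hat(x,y) * v(y) / (lam * v(x)) from P_hat v = lam v."""
    P_hat = np.exp(zeta * U)[:, None] * P0   # P_hat(x,y) = exp(zeta U(x)) P0(x,y)
    eigvals, eigvecs = np.linalg.eig(P_hat)
    k = np.argmax(eigvals.real)              # Perron-Frobenius eigenvalue
    lam = eigvals[k].real
    v = eigvecs[:, k].real
    v = v / v.sum()                          # fixes the sign so that v > 0
    P = P_hat * v[None, :] / (lam * v[:, None])
    return P, lam, v

# Hypothetical strictly positive nominal model.
P0 = np.array([[0.5, 0.3, 0.2],
               [0.2, 0.6, 0.2],
               [0.3, 0.3, 0.4]])
U = np.array([0.0, 1.0, 1.0])

P1, lam1, v1 = eigen_kernel(P0, U, 1.0)
P_nom, lam_nom, v_nom = eigen_kernel(P0, U, 0.0)
```

Each row of the result sums to one because the eigenvector equation gives $\sum_y \hat{P}_\zeta(x, y) v(y) = \lambda v(x)$; at ζ = 0 the eigenvalue is 1, the eigenvector is constant, and the nominal matrix is recovered.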
Extension/reinterpretation of [Todorov 2007] + [Kontoyiannis & Meyn 200X]
Local Control Design
Example: pool pumps
How Pools Can Help Regulate The Grid
Needs of a single pool
⊲ Filtration system circulates and cleans: average pool pump uses 1.3 kW and runs 6-12 hours per day, 7 days per week
⊲ Pool owners are oblivious, until they see frogs and algae
⊲ Pool owners do not trust anyone: privacy is a big concern
Single pool dynamics: $\mathsf{X} = \{(m, j) : m \in \{0, 1\},\ j \in \{1, 2, \ldots, I\}\}$.
[Figure: state transition diagram with two chains of states 1, 2, ..., I−1, I, one for On and one for Off.]
Local Control Design
Tracking Grid Signal with Residential Loads
Example: 20 pools, 20 kW max load
Each pool consumes 1 kW when operating
12 hour cleaning cycle each 24 hours
[Figure: power deviation (kW) of 20 pools over 160 hours (t/hour): output deviation versus reference, axis from −10 to 10 kW; input ζt, axis from −3 to 3.]
Nearly Perfect Service from Pools
Meyn et al. 2013 [CDC], Meyn et al. 2015 [IEEE TAC]
Local Control Design
Tracking Grid Signal with Residential Loads
Example: 300,000 pools, 300 MW max load
Each pool consumes 1 kW when operating
12 hour cleaning cycle each 24 hours
[Figure: power deviation (MW) of 300,000 pools over 160 hours (t/hour): output deviation versus reference, axis from −100 to 100 MW; input ζt, axis from −3 to 3.]
Nearly Perfect Service from Pools
Meyn et al. 2013 [CDC], Meyn et al. 2015 [IEEE TAC]
Local Control Design
Range of services provided by pools
Example: 10,000 pools, 10 MW max load
[Figure: reference and power deviation (MW) over 160 hours (t/hour) for a 12 hr/day cycle, together with the input ζ.]
Local Control Design
Local Design
Extending local control design to include exogenous disturbances
State space for a load model: $\mathsf{X} = \mathsf{X}_u \times \mathsf{X}_n$. Components $\mathsf{X}_n$ are not subject to direct control (e.g. impact of the weather on the climate of a building).
Conditional-independence structure of the local transition matrix:
$$P(x, x') = R(x, x'_u)\, Q_0(x, x'_n), \qquad x' = (x'_u, x'_n)$$
$Q_0$ models uncontrolled load dynamics and exogenous disturbances.
Local Control Design
Local Design
Goal: Construct a family of transition matrices {Pζ : ζ ∈ R}
Nominal model
A Markovian model for an individual load, based on its typical behavior.
- Finite state space $\mathsf{X} = \{x^1, \ldots, x^d\}$;
- Transition matrix $P_0$, with unique invariant pmf $\pi_0$.
Common structure for design
The family of transition matrices used for distributed control is of the form:
$$P_\zeta(x, x') := P_0(x, x') \exp\big(h_\zeta(x, x') - \Lambda_{h_\zeta}(x)\big)$$
with $h_\zeta$ continuously differentiable in $\zeta$, and the normalizing constant
$$\Lambda_{h_\zeta}(x) := \log\Big(\sum_{x'} P_0(x, x') \exp\big(h_\zeta(x, x')\big)\Big)$$
Assumption: for all $x \in \mathsf{X}$, $x' = (x'_u, x'_n) \in \mathsf{X}$, $h_\zeta(x, x') = h_\zeta(x, x'_u)$.
Local Control Design
Local Design
Goal: Construct a family of transition matrices {Pζ : ζ ∈ R}
Construction of the family of functions $\{h_\zeta : \zeta \in \mathbb{R}\}$
Step 1: The specification of a function $H$ that takes as input a transition matrix. $H = H(P)$ is a real-valued function on $\mathsf{X} \times \mathsf{X}$.
Step 2: The families $\{P_\zeta\}$ and $\{h_\zeta\}$ are defined by the solution to the ODE:
$$\frac{d}{d\zeta} h_\zeta = H(P_\zeta), \qquad \zeta \in \mathbb{R},$$
in which $P_\zeta$ is determined by $h_\zeta$ through:
$$P_\zeta(x, x') := P_0(x, x') \exp\big(h_\zeta(x, x') - \Lambda_{h_\zeta}(x)\big)$$
The boundary condition: $h_0 \equiv 0$.
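The two steps can be sketched with a forward Euler discretization in ζ. This is only an illustration: the Euler scheme, the step count, the two-state `P0`, and the particular choice of `H` are assumptions. The example uses the constant map H(P)(x, x') = U(x'), for which the ODE solution is h_ζ(x, x') = ζU(x') and the family reduces to the myopic design.

```python
import numpy as np

def kernel_from_h(P0, h):
    """P(x,x') = P0(x,x') exp(h(x,x') - Lambda_h(x)), normalized row by row."""
    W = P0 * np.exp(h)
    return W / W.sum(axis=1, keepdims=True)

def solve_h(P0, H, zeta_end, steps=200):
    """Forward Euler for d/dzeta h = H(P_zeta) with h_0 = 0, from 0 to zeta_end."""
    h = np.zeros_like(P0)
    dz = zeta_end / steps
    for _ in range(steps):
        h = h + dz * H(kernel_from_h(P0, h))
    return h, kernel_from_h(P0, h)

P0 = np.array([[0.9, 0.1], [0.2, 0.8]])   # hypothetical nominal model
U = np.array([0.0, 1.0])

def H_const(P):
    """Illustrative Step 1 choice: H(P)(x, x') = U(x'), independent of P."""
    return np.tile(U, (len(U), 1))

h1, P1 = solve_h(P0, H_const, 1.0)

# Reference: the myopic kernel at zeta = 1, computed directly.
W = P0 * np.exp(1.0 * U)[None, :]
P_myopic = W / W.sum(axis=1, keepdims=True)
```

For a map `H` that genuinely depends on the transition matrix, the same loop applies, but the Euler step introduces a discretization error that a smaller step or a higher-order integrator would reduce.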
Local Control Design
Local Design
Extending local control design to include exogenous disturbances
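For any function $H^\circ : \mathsf{X} \to \mathbb{R}$, one can define
$$H(x, x'_u) = \sum_{x'_n} Q_0(x, x'_n)\, H^\circ(x'_u, x'_n) \qquad (1)$$
Then functions $\{h_\zeta\}$ satisfy
$$h_\zeta(x, x'_u) = \sum_{x'_n} Q_0(x, x'_n)\, h^\circ_\zeta(x'_u, x'_n),$$
for some $h^\circ_\zeta : \mathsf{X} \to \mathbb{R}$. Moreover, these functions solve the $d$-dimensional ODE,
$$\frac{d}{d\zeta} h^\circ_\zeta = H^\circ(P_\zeta), \qquad \zeta \in \mathbb{R},$$
with boundary condition $h^\circ_0 \equiv 0$.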
Local Control Design
Individual Perspective Design
Local welfare function: $W_\zeta(x, P) = \zeta U(x) - D(P \,\|\, P_0)$,
where $D$ denotes relative entropy:
$$D(P \,\|\, P_0) = \sum_{x'} P(x, x') \log\Big(\frac{P(x, x')}{P_0(x, x')}\Big).$$
Markov Decision Process:
$$\limsup_{T \to \infty} \frac{1}{T} \sum_{t=1}^{T} E[W_\zeta(X_t, P)]$$
Average reward optimization equation (AROE):
$$\max_{P} \Big\{ W_\zeta(x, P) + \sum_{x'} P(x, x')\, h^*_\zeta(x') \Big\} = h^*_\zeta(x) + \eta^*_\zeta$$
where $P(x, x') = R(x, x'_u)\, Q_0(x, x'_n)$, $x' = (x'_u, x'_n)$.
Local Control Design
Individual Perspective Design
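ODE method for IPD design:
Family $\{P_\zeta\}$:
$$P_\zeta(x, x') := P_0(x, x') \exp\big(h_\zeta(x, x') - \Lambda_{h_\zeta}(x)\big)$$
Functions $\{h_\zeta\}$:
$$h_\zeta(x, x'_u) = \sum_{x'_n} Q_0(x, x'_n)\, h^\circ_\zeta(x'_u, x'_n),$$
for $h^\circ_\zeta : \mathsf{X} \to \mathbb{R}$ solutions of the $d$-dimensional ODE,
$$\frac{d}{d\zeta} h^\circ_\zeta = H^\circ(P_\zeta), \qquad \zeta \in \mathbb{R},$$
with boundary condition $h^\circ_0 \equiv 0$.
$$H^\circ_\zeta(x) = \frac{d}{d\zeta} h^\circ_\zeta(x) = \sum_{x'} \big[Z_\zeta(x, x') - Z_\zeta(x^\circ, x')\big] U(x'), \qquad x \in \mathsf{X},$$
where $Z_\zeta = [I - P_\zeta + 1 \otimes \pi_\zeta]^{-1} = \sum_{n=0}^{\infty} [P_\zeta - 1 \otimes \pi_\zeta]^n$ is the fundamental matrix, with $\pi_\zeta$ the invariant pmf of $P_\zeta$.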
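The fundamental matrix and the resulting vector field can be computed directly for a small chain. This sketch makes several assumptions: the three-state matrix `P` stands in for some P_ζ, the vector `U` is hypothetical, the reference state x° is taken to be index 0, and the function names are introduced only for this example.

```python
import numpy as np

def invariant_pmf(P):
    """Left eigenvector of P for eigenvalue 1, normalized to sum to one."""
    w, V = np.linalg.eig(P.T)
    pi = V[:, np.argmin(np.abs(w - 1.0))].real
    return pi / pi.sum()

def fundamental_matrix(P):
    """Z = [I - P + 1 (x) pi]^{-1}, where 1 (x) pi has every row equal to pi."""
    d = P.shape[0]
    pi = invariant_pmf(P)
    one_pi = np.outer(np.ones(d), pi)
    return np.linalg.inv(np.eye(d) - P + one_pi), pi

def H_circ(P, U, x_ref=0):
    """H(x) = sum_{x'} [Z(x,x') - Z(x_ref,x')] U(x'); x_ref is an assumed choice."""
    Z, _ = fundamental_matrix(P)
    return (Z - Z[x_ref]) @ U

P = np.array([[0.5, 0.3, 0.2],
              [0.2, 0.6, 0.2],
              [0.3, 0.3, 0.4]])
U = np.array([0.0, 1.0, 1.0])

Z, pi = fundamental_matrix(P)
H = H_circ(P, U)

# Truncated version of the series sum_n (P - 1 (x) pi)^n, for comparison with Z.
D = P - np.outer(np.ones(3), pi)
Z_series = sum(np.linalg.matrix_power(D, n) for n in range(200))
```

Subtracting the row at the reference state pins the value at x° to zero, which removes the additive constant that the relative value function is otherwise free to carry.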
Local Control Design
Example: Thermostatically Controlled Loads
Refrigerators, water heaters, air-conditioning . . .
TCLs are already equipped with primitive “local intelligence” based on a deadband (or hysteresis interval).
The state process for a TCL at time $t$:
$$X(t) = (X_u(t), X_n(t)) = (m(t), \Theta(t)),$$
where $m(t) \in \{0, 1\}$ denotes the power mode (“1” indicating the unit is on), and $\Theta(t)$ the inside temperature of the load.
Exogenous disturbances: ambient temperature, and usage.
Local Control Design
Example: Thermostatically Controlled Loads
The standard ODE model of a water heater is the first-order linear system,
$$\frac{d}{dt}\Theta(t) = -\lambda[\Theta(t) - \Theta_a(t)] + \gamma m(t) - \alpha[\Theta(t) - \Theta_{in}(t)] f(t),$$
- $\Theta(t)$: temperature of the water in the tank
- $\Theta_{in}(t)$: temperature of the cold water entering the tank
- $f(t)$: flow rate of hot water from the WH
- $m(t)$: power mode of the WH (“on” indicated by $m(t) = 1$).
Deterministic deadband control: $\Theta(t) \in [\Theta_-, \Theta_+]$
Nominal model for local control design: based on the specification of two CDFs for the temperature at which the load turns on or turns off.
[Figure: the two CDFs $F^\ominus(\theta)$ and $F^\oplus(\theta)$ on the deadband $[\Theta_-, \Theta_+]$, with the points $\theta^\ominus$, $\theta^\oplus$ and the level ϱ marked.]
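A minimal simulation of the ODE under deterministic deadband control, with no hot-water usage (f ≡ 0). The numerical values are assumptions for illustration: they are picked inside the parameter ranges listed on the system-identification slide of this talk, temperatures are taken in degrees Fahrenheit, and time is assumed to be measured in seconds with a 15 second Euler step.

```python
import numpy as np

def simulate_deadband(hours=48.0, Ts=15.0, lam=1e-5, gamma=2.7e-2,
                      theta_lo=110.0, theta_hi=120.0, theta_a=70.0):
    """Euler simulation of d(Theta)/dt = -lam (Theta - theta_a) + gamma m, f = 0.

    Returns the temperature and power-mode trajectories sampled every Ts seconds.
    """
    n = int(hours * 3600 / Ts)
    theta = np.empty(n)
    mode = np.empty(n, dtype=int)
    th, m = theta_lo, 1                  # start at the lower limit, heater on
    for k in range(n):
        theta[k], mode[k] = th, m
        th = th + Ts * (-lam * (th - theta_a) + gamma * m)
        if th >= theta_hi:               # upper deadband limit reached: switch off
            m = 0
        elif th <= theta_lo:             # lower deadband limit reached: switch on
            m = 1
    return theta, mode

theta, mode = simulate_deadband()
duty_cycle = mode.mean()                 # fraction of samples with the heater on
```

With these assumed values the heater is on for a small fraction of the time when there is no usage, which is the order of magnitude behind the duty cycle quoted later in the talk.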
Local Control Design
Example: Thermostatically Controlled Loads
Discrete-time control. At time instance $k$, if the water heater is on (i.e., $m(k) = 1$), then it turns off with probability
$$p^\ominus(k + 1) = \frac{[F^\ominus(\Theta(k + 1)) - F^\ominus(\Theta(k))]_+}{1 - F^\ominus(\Theta(k))}$$
where $[x]_+ := \max(0, x)$ for $x \in \mathbb{R}$. Similarly, if the load is off, then it turns on with probability
$$p^\oplus(k + 1) = \frac{[F^\oplus(\Theta(k)) - F^\oplus(\Theta(k + 1))]_+}{F^\oplus(\Theta(k))}$$
The nominal behavior of the power mode can be expressed as
$$P\{m(k) = 1 \mid \theta(k - 1), \theta(k), m(k - 1) = 0\} = p^\oplus(k)$$
$$P\{m(k) = 0 \mid \theta(k - 1), \theta(k), m(k - 1) = 1\} = p^\ominus(k)$$
Local Control Design
Example: Thermostatically Controlled Loads
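Myopic design: exponential tilting of these distributions:
$$p^\oplus_\zeta(k) := P\{m(k) = 1 \mid \theta(k - 1), \theta(k), m(k - 1) = 0, \zeta(k - 1) = \zeta\} = \frac{p^\oplus(k) e^\zeta}{p^\oplus(k) e^\zeta + 1 - p^\oplus(k)}$$
$$p^\ominus_\zeta(k) := P\{m(k) = 0 \mid \theta(k - 1), \theta(k), m(k - 1) = 1, \zeta(k - 1) = \zeta\} = \frac{p^\ominus(k)}{p^\ominus(k) + (1 - p^\ominus(k)) e^\zeta}$$
If $p^\oplus_0(k) > 0$, then the probability $p^\oplus_\zeta(k)$ is strictly increasing in $\zeta$, approaching 1 as $\zeta \to \infty$; it approaches 0 as $\zeta \to -\infty$, if $p^\oplus_0(k) < 1$.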
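The two tilted switching probabilities are simple enough to evaluate directly. The function names and the nominal values 0.2 and 0.3 below are hypothetical, chosen only to illustrate the monotonicity in ζ.

```python
import math

def tilt_on(p_on, zeta):
    """Turn-on probability under the myopic design: p e^z / (p e^z + 1 - p)."""
    return p_on * math.exp(zeta) / (p_on * math.exp(zeta) + 1.0 - p_on)

def tilt_off(p_off, zeta):
    """Turn-off probability under the myopic design: p / (p + (1 - p) e^z)."""
    return p_off / (p_off + (1.0 - p_off) * math.exp(zeta))

zetas = [-4.0, -2.0, 0.0, 2.0, 4.0]
on_probs = [tilt_on(0.2, z) for z in zetas]     # increases with zeta
off_probs = [tilt_off(0.3, z) for z in zetas]   # decreases with zeta
```

A positive ζ therefore makes an off load more likely to switch on and an on load less likely to switch off, while ζ = 0 leaves the nominal probabilities unchanged.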
Local Control Design
Example: Thermostatically Controlled Loads
System identification
$$\frac{d}{dt}\Theta(t) = -\lambda[\Theta(t) - \Theta_a(t)] + \gamma m(t) - \alpha[\Theta(t) - \Theta_{in}(t)] f(t),$$
- $\Theta(t)$: temperature of the water in the tank
- $\Theta_{in}(t)$: temperature of the cold water entering the tank
- $f(t)$: flow rate of hot water from the WH
- $m(t)$: power mode of the WH (“on” indicated by $m(t) = 1$).

Temp. Ranges           ODE Pars.                    Loc. Control
Θ+  ∈ [118, 122] F     λ ∈ [8, 12.5] × 10^−6        Ts = 15 sec
Θ−  ∈ [108, 112] F     γ ∈ [2.6, 2.8] × 10^−2       κ = 4
Θa  ∈ [68, 72] F       α ∈ [6.5, 6.7] × 10^−2       ϱ = 0.8
Θin ∈ [68, 72] F       Pon = 4.5 kW                 θ0 = Θ−

Heterogeneous population: 100,000 WHs simulated by uniform sampling of the values in the table.
Usage data from Oak Ridge National Laboratory (35 WHs over 50 days).
Local Control Design
Tracking performance
and the controlled dynamics for an individual load
100,000 water heaters
When on, an individual load consumes 4.5 kW
With no usage, approx. 2% duty cycle, avg. power consumption 10 MW.
[Figure: BPA reference (power deviation r_t, MW) versus t (hrs), and typical load response (temperature in F, load-on periods) in three cases: no regulation (r_t ≡ 0, nominal power consumption), tracking with |r_t| ≤ 40 MW, and tracking with |r_t| ≤ 10 MW.]
Local Control Design
Tracking performance
Potential for contingency reserves and ramping
[Figure: reference and power deviation (MW) versus t (hrs), together with the input ζ, for two sawtooth reference signals.]
Tracking two sawtooth waves with 100,000 water heaters: average power consumption 8 MW
Local Control Design
Tracking performance
and the controlled dynamics for an individual load
Heterogeneous setting: 40,000 loads per experiment; 20 different load types in each case
Lower plots show the on/off state for a typical load
[Figure: open-loop tracking (MW) of BPA balancing reserves (filtered/scaled), stochastic output versus mean-field model. Panels: Refrigerators, Fast Electric Water Heaters, Slow Electric Water Heaters; time axes 24 hrs, 24 hrs, 6 hrs. Lower panels: power state, nominal versus demand dispatch.]
Local Control Design
Unmodeled dynamics
Setting: 0.1% sampling, and
1. Heterogeneous population of loads
2. Load i overrides when QoS is out of bounds
[Figure: closed-loop tracking, output deviation versus reference (MW) against t/hour, with the opt-out percentage, for N = 300,000 and N = 30,000.]
PI control:
$$\zeta_t = k_P e_t + k_I e^I_t, \qquad e_t = r_t - y_t, \qquad e^I_t = \sum_{s=0}^{t} e_s$$
Conclusions and Future Directions
Control Architecture
Frequency Allocation for Demand Dispatch
[Figure: Bode plot of the grid transfer function, magnitude (dB) and phase (deg) versus frequency (rad/s), with a region marked "Uncertainty Here". Load classes placed along the frequency axis: Fans in Commercial Buildings, Residential Water Heaters, Refrigerators, Water Pumping, Pool Pumps, Chiller Tanks.]
Bandwidth centered around its natural cycle
[Figure: tracking BPA regulation signal (MW) with 10,000 pools over 160 hours (t/hour): output deviation versus reference (from Bonneville Power Administration).]
Conclusions and Future Directions
Conclusions
Virtual storage from flexible loads
Approach: creating Virtual Energy Storage through direct control of flexible loads
- helping the grid while respecting user QoS
Challenges:
− Stability properties for IPD and myopic design?
− Information architecture: ζt = f(?). Different needs for communication, state estimation and forecast.
− Capacity estimation (time varying)
− Network constraints
− Resource optimization & learning: integrating VES with traditional generation and batteries.
− Economic issues: contract design, aggregators, markets . . .
Conclusions and Future Directions
Conclusions
Thank You!
Conclusions and Future Directions
References: this talk
- A. Bušić and S. Meyn. Distributed randomized control for demand dispatch. 55th IEEE Conference on Decision and Control (CDC), 2016.
- A. Bušić and S. Meyn. Ordinary Differential Equation Methods For Markov Decision Processes and Application to Kullback-Leibler Control Cost. arXiv:1605.04591v2, Oct 2016.
- S. Meyn, P. Barooah, A. Bušić, Y. Chen, and J. Ehren. Ancillary Service to the Grid Using Intelligent Deferrable Loads. IEEE Trans. Automat. Contr., 60(11):2847-2862, 2015.
- P. Barooah, A. Bušić, and S. Meyn. Spectral Decomposition of Demand-Side Flexibility for Reliable Ancillary Services in a Smart Grid. 48th Annual Hawaii International Conference on System Sciences (HICSS), 2015.
- A. Bušić and S. Meyn. Passive dynamics in mean field control. 53rd IEEE Conf. on Decision and Control (CDC), 2014.
Conclusions and Future Directions
References: related
Demand dispatch:
- Y. Chen, A. Bušić, and S. Meyn. Individual risk in mean-field control models for decentralized control, with application to automated demand response. 53rd IEEE Conf. on Decision and Control (CDC), 2014.
- Y. Chen, A. Bušić, and S. Meyn. State Estimation and Mean Field Control with Application to Demand Dispatch. 54th IEEE Conference on Decision and Control (CDC), 2015.
- J. L. Mathieu. Modeling, Analysis, and Control of Demand Response Resources. PhD thesis, Berkeley, 2012.
- J. L. Mathieu, S. Koch, D. S. Callaway. State Estimation and Control of Electric Loads to Manage Real-Time Energy Imbalance. IEEE Transactions on Power Systems, 28(1):430-440, 2013.
Markov processes:
- I. Kontoyiannis and S. P. Meyn. Spectral theory and limit theorems for geometrically ergodic Markov processes. Ann. Appl. Probab., 13:304–362, 2003.
- I. Kontoyiannis and S. P. Meyn. Large deviations asymptotics and the spectral theory of multiplicatively regular Markov processes. Electron. J. Probab., 10(3):61–123 (electronic), 2005.
- E. Todorov. Linearly-solvable Markov decision problems. In B. Schölkopf, J. Platt, and T. Hoffman, editors, Advances in Neural Information Processing Systems, (19) 1369–1376. MIT Press, Cambridge, MA, 2007.
Conclusions and Future Directions
Mean Field Model
Linearized Dynamics
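Mean-field model:
$$\mu_{t+1} = \mu_t P_{\zeta_t}, \qquad y_t = \langle \mu_t, U \rangle, \qquad \zeta_t = f_t(y_0, \ldots, y_t)$$
Linear state space model:
$$\Phi_{t+1} = A \Phi_t + B \zeta_t, \qquad \gamma_t = C \Phi_t$$
Interpretations: $|\zeta_t|$ is small, and $\pi$ denotes invariant measure for $P_0$.
- $\Phi_t \in \mathbb{R}^{|\mathsf{X}|}$, a column vector with $\Phi_t(x) \approx \mu_t(x) - \pi(x)$, $x \in \mathsf{X}$
- $\gamma_t \approx y_t - y^0$; deviation from nominal steady-state
- $A = P_0^T$, $C = U^T$, and input dynamics linearized:
$$B^T = \frac{d}{d\zeta} \pi P_\zeta \Big|_{\zeta = 0}$$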
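The matrices of the linear state space model can be assembled numerically. This sketch assumes an exponential-tilting controlled kernel, a hypothetical symmetric two-state load, and a central finite difference for the derivative defining B; none of these choices come from the talk.

```python
import numpy as np

def tilt(P0, U, zeta):
    """Illustrative controlled kernel: reweight P0(x, x') by exp(zeta * U(x'))."""
    W = P0 * np.exp(zeta * U)[None, :]
    return W / W.sum(axis=1, keepdims=True)

def linearize(P0, U, pi, eps=1e-6):
    """Return (A, B, C) with A = P0^T, C = U^T and B^T = d/dzeta (pi P_zeta) at 0."""
    A = P0.T
    C = U[None, :]
    dpi = (pi @ tilt(P0, U, eps) - pi @ tilt(P0, U, -eps)) / (2.0 * eps)
    B = dpi[:, None]                     # column vector
    return A, B, C

P0 = np.array([[0.9, 0.1], [0.1, 0.9]])
U = np.array([0.0, 1.0])
pi = np.array([0.5, 0.5])                # invariant pmf of this P0

A, B, C = linearize(P0, U, pi)

# One step of the linear model from Phi_0 = 0 with a small input zeta_0.
zeta0 = 0.05
Phi1 = A @ np.zeros((2, 1)) + B * zeta0
gamma1 = (C @ Phi1).item()               # predicted power deviation after one step
```

For this tilting rule the derivative is also available in closed form, $B(x') = \sum_x \pi(x) P_0(x, x') [U(x') - \sum_y P_0(x, y) U(y)]$, which gives (−0.09, 0.09) for the example and can serve as a check on the finite difference.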