SLIDE 1 Distributed Power Management under Limited Communication Na Li
Harvard University
Rutgers, 08/22/2017
SLIDE 2 Acknowledgment:
Harvard Univ.: Guannan Qu, Chinwendu Enyioha, Vahid Tarokh
KTH: Sindri Magnusson, Carlo Fischione
Caltech: Steven Low
NREL: Changhong Zhao
Univ. of Colorado, Boulder: Lijun Chen
SLIDE 3
A Vision of the Future (IoT)?
All devices are connected and coordinated to ➢ Maximize social welfare ➢ Satisfy operation constraints
SLIDE 4
Distributed Optimization
Devices communicate, compute decisions, & communicate, … until they reach an efficient point (iterative, two-way comm.)
SLIDE 5 Sensing, Communication, Computation
Sources: Gigaom
Intelligent Power Systems
SLIDE 6 Communication Challenges
▪ Lack of reliability ▪ Unacceptable delays ▪ Vulnerable to malicious attacks ▪ Leak privacy ▪ Limited bandwidth (e.g. Power Line Comm.) ▪ High deployment cost ▪ …
How about reducing communication needs?
Packet drop
SLIDE 7 Reduce communication in power management
▪ Extract information from physical measurements (Feedback) ▪ Recover information from local computation
- Load frequency control
- Power allocation in buildings/data centers
- Quantized dual gradient for power allocation
SLIDE 8 This talk: Limited communication in power systems
▪ Recover information from local computation
- Quantized dual gradient for power allocation
▪ Extract information from physical measurements (Feedback)
- Load frequency control
- Power allocation in buildings/data centers
SLIDE 9 Power Systems
Supply = Demand
[Figure: structure of the electric grid. Black: Generation (generating station, transformer). Blue: Transmission (transmission lines at 765, 500, 345, 230, 138 kV; transmission customer at 138 kV or 230 kV). Green: Distribution (substation step-down transformer; subtransmission at 26 kV, 69 kV; primary distribution at 13 kV, 4 kV; secondary distribution at 120 V, 240 V). Source: Graphic courtesy of North American Electric Reliability Corporation (NERC)]
Storage, DR appliances, EV
SLIDE 10
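Optimal Load Control
$$\min_{d,\,\omega,\,P}\ \sum_i \Big( \sum_{l \in L(i)} C_l(d_l) + \tfrac{1}{2} D_i \omega_i^2 \Big)$$
$$\text{s.t.}\quad \underline{d}_l \le d_l \le \overline{d}_l$$
$$P_i^m - \sum_{l \in L(i)} d_l - D_i \omega_i = \sum_{j:\, i \to j} P_{ij} - \sum_{k:\, k \to i} P_{ki}$$
Annotations: $\tfrac{1}{2} D_i \omega_i^2$: freq deviation; $C_l(d_l)$: disutility; equality constraint: power balance at each i
➢ Balance total generation and load ➢ Keep frequency deviation small ➢ Minimize aggregate load disutility
Distributed Optimization (e.g. ADMM) applies. But…
[Figure: bus i with mechanical power $P_i^m$, loads $d_l$ ($l \in L(i)$) and $D_i \omega_i$, and line flow $P_{ij}$ to bus j]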
i: control area /aggregated bus
SLIDE 11 Can loads respond in real time and in closed loop?
▪ Hard to get the real-time disturbance information ▪ Heavily relies on iterative communication
But…
➢ Network physical dynamics help!
[Figure: bus i with mechanical power $P_i^m$, loads $d_l$ and $D_i \omega_i$, and line flow $P_{ij}$ to bus j]
SLIDE 12 Physical dynamics: Swing Dynamics
i: Aggregated bus/control area/balance authority
Variables denote the deviations from their reference (steady state) values
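[Figure: bus i with mechanical power $P_i^m$, loads $d_l$ and $D_i \omega_i$, and line flow $P_{ij}$ to bus j]
$$M_i \dot\omega_i = P_i^m - D_i \omega_i - \sum_{l \in L(i)} d_l - \sum_{j:\, i \to j} P_{ij} + \sum_{k:\, k \to i} P_{ki}$$
$\omega_i$: frequency; $P_i^m$: mechanical power; $M_i$: inertia; $D_i \omega_i$: freq-sensitive load; $d_l$: freq-insensitive loads; $P_{ij}$: power flow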
SLIDE 13 DC approximation of power flow
- Lossless (resistance=0)
- Fixed voltage magnitudes
- Small deviation of angles
[Figure: bus i with voltage $V_i$, mechanical power $P_i^m$, loads $d_l$ and $D_i \omega_i$, and line flow $P_{ij}$ to bus j]
Network dynamics
SLIDE 14 System Model Recap
[Figure: bus i with mechanical power $P_i^m$, loads $d_l$ and $D_i \omega_i$, and line flow $P_{ij}$ to bus j]
Load Control
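$$\min_{d,\,\omega,\,P}\ \sum_i \Big( \sum_{l \in L(i)} C_l(d_l) + \tfrac{1}{2} D_i \omega_i^2 \Big)$$
$$\text{s.t.}\quad \underline{d}_l \le d_l \le \overline{d}_l$$
$$P_i^m - \sum_{l \in L(i)} d_l - D_i \omega_i = \sum_{j:\, i \to j} P_{ij} - \sum_{k:\, k \to i} P_{ki}$$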
?
SLIDE 15 Load frequency control
System Dynamics
$$d_l(t) = \Big[ (C_l')^{-1}\big(\omega_i(t)\big) \Big]_{\underline{d}_l}^{\overline{d}_l} \quad \text{for } l \in L(i)$$
Load Control
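$$\min_{d,\,\omega,\,P}\ \sum_i \Big( \sum_{l \in L(i)} C_l(d_l) + \tfrac{1}{2} D_i \omega_i^2 \Big)$$
$$\text{s.t.}\quad \underline{d}_l \le d_l \le \overline{d}_l$$
$$P_i^m - \sum_{l \in L(i)} d_l - D_i \omega_i = \sum_{j:\, i \to j} P_{ij} - \sum_{k:\, k \to i} P_{ki}$$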
Optimal Load Control
Converge to the optimal solution (Primal-Dual Gradient Flow)
Primal-Dual Gradient Flow: Arrow et al. 1958; Feijer and Paganini 2010; Zhao, Low et al. 2013; You, Chen et al. 2014; Cherukuri, Mallada, Cortes 2015; etc.
Dual Dynamics
SLIDE 16 ➢ Frequency: a locally measurable signal (“price” of imbalance) ➢ Completely decentralized; no explicit communication necessary
Load frequency control
[Figure: feedback loop between load control and frequency]
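The decentralized loop on this slide can be sketched numerically. The following is a minimal illustration, not the simulation setup of the talk: it assumes a hypothetical 2-bus network, one controllable load per bus, a quadratic disutility C_l(d) = d^2 / (2 a_l) (so that the inverse marginal disutility is a_l * omega), forward-Euler integration, and made-up parameter values. Each load reacts only to its own bus frequency; no messages are exchanged.

```python
# Sketch: frequency-based load control on a hypothetical 2-bus network.
# Assumptions (not from the slides): quadratic disutility C_l(d) = d^2 / (2 a_l),
# one controllable load per bus, forward-Euler integration, illustrative values.

M = [1.0, 1.5]           # inertia at each bus
D = [0.5, 0.8]           # frequency-sensitive load coefficients
A = [2.0, 3.0]           # load gains: (C_l')^{-1}(omega) = a_l * omega
D_MIN, D_MAX = -1.0, 1.0 # load deviation bounds
B12 = 5.0                # line parameter b_12, line oriented 1 -> 2
PM = [-1.0, 0.0]         # step loss of generation at bus 1


def clip(x, lo, hi):
    return max(lo, min(hi, x))


def load_control(w):
    # Each load uses only the frequency deviation measured at its own bus.
    return [clip(A[i] * w[i], D_MIN, D_MAX) for i in range(2)]


def simulate(T=20.0, dt=1e-3):
    w = [0.0, 0.0]       # frequency deviations omega_1, omega_2
    p12 = 0.0            # line flow deviation P_12
    for _ in range(int(T / dt)):
        d = load_control(w)
        dw1 = (PM[0] - d[0] - D[0] * w[0] - p12) / M[0]  # swing dynamics, bus 1
        dw2 = (PM[1] - d[1] - D[1] * w[1] + p12) / M[1]  # swing dynamics, bus 2
        dp12 = B12 * (w[0] - w[1])                       # network dynamics
        w = [w[0] + dt * dw1, w[1] + dt * dw2]
        p12 += dt * dp12
    return w, load_control(w), p12


w, d, p12 = simulate()
print("frequency deviations:", w)
print("load deviations:", d)
```

In this sketch both buses settle at a common frequency deviation and total power is rebalanced, even though the size and location of the disturbance are never communicated to the loads.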
SLIDE 17 Simulations
Dynamic simulation of IEEE 68-bus system (New England)
- Power System Toolbox (RPI)
- Detailed generation model
- Exciter model, power system stabilizer model
Sample rate 250 ms
Step increase of loads on bus 1, 7, 27
SLIDE 18 59.964 Hz ERCOT threshold for freq control
Simulations
SLIDE 19
Simulations
SLIDE 20 Recap
[Figure: bus i with mechanical power $P_i^m$, loads $d_l$ and $D_i \omega_i$, and line flow $P_{ij}$ to bus j]
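$$\min_{d,\,\omega,\,P}\ \sum_i \Big( \sum_{l \in L(i)} C_l(d_l) + \tfrac{1}{2} D_i \omega_i^2 \Big)$$
$$\text{s.t.}\quad \underline{d}_l \le d_l \le \overline{d}_l$$
$$P_i^m - \sum_{l \in L(i)} d_l - D_i \omega_i = \sum_{j:\, i \to j} P_{ij} - \sum_{k:\, k \to i} P_{ki}$$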
Network Dynamics Optimization
SLIDE 21
Network Dynamics:
$$\dot x_i = \sum_{j:\, j \sim i} A_{ij} x_j + B_i u_i + C_i w_i$$
Optimization:
$$\min_{x,\,u}\ \sum_i \big( f_i(x_i) + g_i(u_i) \big) \quad \text{with local constraints given by } h_i(x_i, u_i)$$
How to design distributed, closed-loop controller u?
- [Li, Chen, Zhao, 2015]: Economic Automatic Generation Control
- [Zhang, Antonois, Li, 2016]: Sufficient and Necessary Conditions
- [Zhang, Malkawi, Li, 2016]: Thermal Control for HVAC
This Idea Extends to General Systems
SLIDE 22 This talk: limited communication
▪ Extract information from physical measurements (Feedback)
- Load frequency control
- Decentralized voltage control (distribution network)
(Qu, Li, Dahleh, 2014)
- Power allocation in buildings/data center
▪ Recover information from local computation
- Quantized dual gradient for power allocation
SLIDE 23 This talk: limited communication
▪ Extract information from physical measurements (Feedback)
- Load frequency control
- Decentralized voltage control (distribution network)
(Qu, Li, Dahleh, 2014)
- Power allocation in buildings/data center
▪ Recover information from local computation
- Quantized dual gradient for power allocation
SLIDE 24 This talk: limited communication
▪ Extract information from physical measurements (Feedback)
- Load frequency control
- Decentralized voltage control (distribution network)
(Qu, Li, Dahleh, 2014)
- Power allocation in buildings/data center
▪ Recover information from local computation
- Quantized dual gradient for power allocation
SLIDE 25 Power management within buildings
Control center coordinates power consumption of appliances
➢ Maximize utility, minimize cost ➢ Satisfy operation constraints, e.g. power capacity constraints
Control Center
SLIDE 26 Distributed Coordination under Two-way Comm.
Control Center
Step 1: Appliances to center: Power request Step 2: Center to appliances: Coordination signal
Assume perfect, reliable, and ubiquitous communication resources
Iterate
SLIDE 27 Q 1: Is it possible to use only one-way comm.? Q 2: How many bits are needed?
Reduce Communication Needs
Control Center
SLIDE 28
Not just for the buildings/grids
Data Center Multi-core Processor
Communication cost is much higher than computation [Bolsens I., 2002]
SLIDE 29
Power allocation problem
[Figure: control center broadcasts p(t) to User 1, User 2, …, User N; user 1 sends request q1(t)]
SLIDE 30
A distributed algorithm: Dual gradient descent
[Figure: control center broadcasts p(t) to User 1, User 2, …, User N; user 1 sends request q1(t)]
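The two-way iteration on this slide (users send power requests, the center returns a coordination signal) can be sketched as follows. This is only an illustration under stated assumptions: the utilities U_i(q) = w_i * log(1 + q), the capacity Q_MAX, the box bound Q_BOX, and the constant step size GAMMA are hypothetical choices, since the slide's formulas did not survive extraction.

```python
# Sketch: dual gradient descent for power allocation with two-way communication.
# Assumptions (not from the slides): U_i(q) = w_i * log(1 + q), capacity
# constraint sum_i q_i <= Q_MAX, box constraint 0 <= q_i <= Q_BOX, constant step.

Q_MAX = 3.0
Q_BOX = 5.0
GAMMA = 0.1


def best_response(w_i, p):
    """User step: maximize w_i * log(1 + q) - p * q over 0 <= q <= Q_BOX."""
    if p <= 0.0:
        return Q_BOX
    return min(Q_BOX, max(0.0, w_i / p - 1.0))


def dual_gradient(weights, p=1.0, iters=200):
    for _ in range(iters):
        q = [best_response(w, p) for w in weights]  # Step 1: requests q_i(t) to center
        p = max(0.0, p + GAMMA * (sum(q) - Q_MAX))  # Step 2: signal p(t) to users
    return p, [best_response(w, p) for w in weights]


price, requests = dual_gradient([2.0, 3.0, 4.0])
print("price:", price)
print("requests:", requests, "total:", sum(requests))
```

With these illustrative weights the signal settles where the total request equals the capacity. Every iteration costs N uplink messages and one broadcast, which is the communication load the following slides try to reduce.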
SLIDE 31
A distributed algorithm: One-way comm.
Replace the reported requests qi(t) with the true measurement of total power consumption Q(t).
[Figure: control center measures Q(t) and broadcasts p(t) to User 1, User 2, …, User N]
SLIDE 32 It might violate hard physical constraints
What’s the problem here?
Control center p(t)
Theorem: If the step size and initial setting are chosen properly, the constraint will hold all the time.
“Distributed resource allocation using one-way communication”, Magnusson, Enyioha, Li, Fischione, Tarokh, 2016
… User 1 User 2 User N
SLIDE 33 This Talk
▪ Extract information from physical measurements (Feedback) ▪ Recover information from local computation
- Load frequency control
- Power allocation in buildings/data center
- Quantized dual gradient for power allocation
SLIDE 34 This Talk
▪ Extract information from physical measurements (Feedback) ▪ Recover information from local computation
- Load frequency control
- Power allocation in buildings/data center
- Quantized dual gradient for power allocation
SLIDE 35
Control center … User 1 User 2 User N p(t)
Recall: Dual Gradient with One-way Comm.
SLIDE 36
Control center
Further reduce comm.
Just send one bit to indicate the sign
s(k)=0 or 1 … User 1 User 2 User N p(t)
SLIDE 37
Dual Gradient with One-bit One-way Comm.
Control center s(k)
This is quantized (normalized) gradient descent of dual function
… User 1 User 2 User N
Normalized Gradient Descent [Shor 1985]:
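A one-bit, one-way variant can be sketched in the same setting. The sketch below assumes the center measures the total consumption Q(t) directly, broadcasts a single bit s (1 if Q(t) exceeds capacity, 0 otherwise), and that the signal p moves by a diminishing step eps0 / sqrt(t + 1) in the indicated direction. The utilities, capacity, and step rule are hypothetical choices for illustration, not the parameters analyzed in the cited paper.

```python
import math

# Sketch: dual update driven by one broadcast bit per iteration.
# Assumptions (not from the slides): U_i(q) = w_i * log(1 + q), capacity Q_MAX,
# box bound Q_BOX, step size eps0 / sqrt(t + 1).

Q_MAX = 3.0
Q_BOX = 5.0


def best_response(w_i, p):
    """User step: maximize w_i * log(1 + q) - p * q over 0 <= q <= Q_BOX."""
    if p <= 0.0:
        return Q_BOX
    return min(Q_BOX, max(0.0, w_i / p - 1.0))


def one_bit_dual_gradient(weights, p=1.0, iters=2000, eps0=0.3):
    for t in range(iters):
        # The center measures total consumption; users send nothing.
        total = sum(best_response(w, p) for w in weights)
        s = 1 if total > Q_MAX else 0        # the single broadcast bit
        direction = 1.0 if s == 1 else -1.0  # sign of the dual gradient
        step = eps0 / math.sqrt(t + 1)       # diminishing step size
        p = max(0.0, p + direction * step)   # every user can apply this locally
    return p


price = one_bit_dual_gradient([2.0, 3.0, 4.0])
print("price after one-bit updates:", price)
```

Because only the sign is used, the iterate keeps oscillating around the balancing signal with an amplitude on the order of the current step size, so accuracy here is governed by how the step size shrinks.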
SLIDE 38
Quantized (Normalized) Gradient Descent (QGD)
Problem: QGD:
Definition: A quantization is proper (good) if and only if the algorithm converges to the optimal points for any well-behaved f, e.g. a convex smooth function.
SLIDE 39
Quantized (Normalized) Gradient Descent (QGD)
Problem: QGD:
Questions:
A) How to determine whether a quantization is proper?
B) What is the minimal size of a proper quantization?
C) How to choose d(t) and ε(t), given a good quantization?
D) What is the connection between the fineness of the quantization and the convergence of the algorithm?
SLIDE 40 Descent direction
Red: Quantization direction Blue: Gradient direction
SLIDE 41 Proper quantization
“Convergence of limited communications gradient methods”, Magnusson, Enyioha, Li, Fischione, Tarokh, Transactions on Automatic Control, 2017 Red: Quantization direction Blue: Gradient direction
SLIDE 42 Convergence rate
➢ Finer quantization, larger step size is allowed ➢ Finer quantization, faster convergence
One Stopping Criterion:
*More convergence results are available in the paper
SLIDE 43 Quantization size
Shannon,1959
SLIDE 44 Simulation
(3) Infinite bandwidth: normalized gradient (2) Infinite bandwidth: gradient
Message: Should incorporate the info. of gradient magnitude
SLIDE 45 Summary of QGD
Problem: QGD:
A) Proper quantization = θ-cover
B) Minimal size of proper quantization is K+1
C) Pick the quantized direction closest to gradient direction
D) θ plays an important role in the convergence
“Convergence of limited communications gradient methods”, Magnusson, Enyioha, Li, Fischione, Tarokh, Transactions on Automatic Control, 2017
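Points A) to C) can be illustrated with a small sketch. It assumes the update has the form x(t+1) = x(t) - ε(t) d(t), with d(t) taken from a finite set of unit directions. The test function, the three directions spaced 120 degrees apart in R^2 (K + 1 = 3 directions, every vector within 60 degrees of one of them), and the step rule are hypothetical choices made for illustration.

```python
import math

# Sketch: quantized (normalized) gradient descent in R^2 with K + 1 = 3 directions.
# Assumptions (not from the slides): update x <- x - eps(t) * d(t), the test
# function f below, and the step size eps0 / sqrt(t + 1).

DIRECTIONS = [(math.cos(a), math.sin(a))
              for a in (math.pi / 2, 7 * math.pi / 6, 11 * math.pi / 6)]


def grad_f(x):
    """Gradient of the illustrative f(x) = 0.5*(x1 - 1)^2 + 2*(x2 + 0.5)^2."""
    return (x[0] - 1.0, 4.0 * (x[1] + 0.5))


def quantize(g):
    """Index of the direction closest in angle to g (largest inner product)."""
    return max(range(len(DIRECTIONS)),
               key=lambda k: g[0] * DIRECTIONS[k][0] + g[1] * DIRECTIONS[k][1])


def qgd(x=(0.0, 0.0), iters=20000, eps0=0.5):
    for t in range(iters):
        k = quantize(grad_f(x))        # only this index needs to be transmitted
        d = DIRECTIONS[k]
        eps = eps0 / math.sqrt(t + 1)  # diminishing step size
        x = (x[0] - eps * d[0], x[1] - eps * d[1])
    return x


x_final = qgd()
print("final iterate:", x_final, "minimizer: (1.0, -0.5)")
```

Each iteration transmits one of three symbols (fewer than two bits), and the chosen direction always makes an angle of at most 60 degrees with the gradient, so it remains a descent direction in this unconstrained example.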
SLIDE 46
Extension to Constrained Case
Problem: QGD: Can the results for the unconstrained case be extended?
SLIDE 47 θ-cover does not work for the constrained case
Grey: Constraints set; : x(t)
Get stuck at non-optimal points Not necessarily a descent direction
SLIDE 48
Extension to Constrained Case
Problem: QGD:
SLIDE 49
Communication Complexity
Control center s(t) … User 1 User N
SLIDE 50 Communication Complexity
s(t) … User 1 User N
Question: What is the minimal number of bits (in total) needed to achieve an ε-optimal solution? ε-complexity (a min-max definition)
What optimal accuracy can be achieved using b bits (in total)? b-complexity (a min-max definition)
Is there a simple coding scheme that reaches the complexity? Yes.
“Communication Complexity of Distributed Resource Allocation Optimization”, Magnusson, Enyioha, Li, Fischione, Tarokh, submitted, 2017
SLIDE 51 Summary: Limited Communication
▪ Extract information from physical measurements (Feedback) ▪ Recover information from local computation
- Load frequency control
- Power allocation in buildings/data center
- Quantized dual (normalized) gradient for power allocation
Question: How to choose the right algorithms and integrate them together? Tradeoff: Efficiency, Robustness, Communication, Sensing, Computation, Convergence speed
Thank you!
SLIDE 52 Accelerated Distributed Nesterov Gradient Descent
Guannan Qu, Na Li, John A. Paulson School of Engineering and Applied Sciences, Harvard University
Problem Formulation
Local Communication
Communication Graph Connected
Background
Centralized Gradient Methods for minimizing 𝒈: Gradient Descent (GD); Nesterov Gradient Descent (for 𝒈 𝝂-strongly convex, 𝑴-smooth); Nesterov Gradient Descent (for 𝒈 convex, 𝑴-smooth)
Convergence Rates (by 𝒈 type and algo.: GD vs. Nesterov GD, for 𝒈 convex and 𝑴-smooth and for 𝒈 𝝂-strongly convex and 𝑴-smooth)
Nesterov GD brings acceleration! Most distributed gradient methods are based on GD, and the convergence rate is not better than GD.
Can Nesterov momentum be used in Distributed Gradient Methods and accelerate the convergence?
Preliminaries
Proposed Algorithm (𝝂-strongly convex and 𝑴-smooth cost functions)
Initialize:
Proposed Algorithm (convex and 𝑴-smooth cost functions)
Initialize:
Main Results (𝝂-strongly convex and 𝑴-smooth cost functions)
Convergence Rate:
Main Results (convex and 𝑴-smooth cost functions)
Summary of Results
Simulation (for 𝒈 convex and 𝑴-smooth, or 𝒈 𝝂-strongly convex and 𝑴-smooth)
For more detailed results, see Guannan Qu and Na Li, "Accelerated Distributed Nesterov Gradient Descent," arXiv preprint arXiv:1705.07176 (2017).
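The acceleration claim for the centralized methods can be checked on a toy problem. The sketch below compares plain gradient descent with Nesterov gradient descent using constant momentum on an illustrative quadratic with 𝝂 = 1 and 𝑴 = 100. It covers only the centralized background; it does not implement the distributed algorithm proposed on the poster, and the quadratic, the step size 1/𝑴, and the momentum formula are assumptions chosen for illustration.

```python
import math

# Sketch: centralized GD vs. Nesterov GD on a strongly convex quadratic.
# Assumptions (not from the poster): g(x) = 0.5 * sum_i EIGS[i] * x_i^2,
# step size 1/M, momentum (sqrt(kappa) - 1) / (sqrt(kappa) + 1).

NU, M = 1.0, 100.0  # strong convexity and smoothness constants
EIGS = (NU, M)


def g(x):
    return 0.5 * sum(e * xi * xi for e, xi in zip(EIGS, x))


def grad(x):
    return [e * xi for e, xi in zip(EIGS, x)]


def gradient_descent(x, iters):
    for _ in range(iters):
        x = [xi - gi / M for xi, gi in zip(x, grad(x))]
    return x


def nesterov(x, iters):
    kappa = M / NU
    beta = (math.sqrt(kappa) - 1.0) / (math.sqrt(kappa) + 1.0)
    x_prev = list(x)
    for _ in range(iters):
        y = [xi + beta * (xi - xp) for xi, xp in zip(x, x_prev)]  # look-ahead point
        x_prev = x
        x = [yi - gi / M for yi, gi in zip(y, grad(y))]           # gradient step at y
    return x


x0 = [1.0, 1.0]
g_gd = g(gradient_descent(x0, 200))
g_nag = g(nesterov(x0, 200))
print("GD objective after 200 iterations:      ", g_gd)
print("Nesterov objective after 200 iterations:", g_nag)
```

On this ill-conditioned example the momentum variant reaches a far smaller objective value in the same number of gradient evaluations, which is the gap the poster aims to carry over to the distributed setting.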
SLIDE 53
Back up: Reverse and Forward Engineering
SLIDE 54 Power Balance, Stability: Dynamic model
sec min 5 min 60 min day year
primary freq control secondary freq control
Economic efficiency: power flow model
economic dispatch unit commitment
Frequency control
SLIDE 55 Power Balance, Stability: Dynamic model
Control
sec min 5 min 60 min day year
primary freq control secondary freq control
Economic efficiency: power flow model
Optimization economic dispatch unit commitment
Frequency control
- Traditionally this is done at the generation side.
➢ Goal: Balance the grid in an optimal (cost-effective) way
SLIDE 56 Loss of 2 nuclear plants in ERCOT
Kirby 2003 [ORNL/TM-2003/19]
(1 min) (10 min) deadband 59.964Hz
Frequency response
Imagine if there is 50%+ renewable generation
SLIDE 57
Advantages of load-side control
faster (no/low inertia!)
no waste or emission
more resources (large #)
localize disturbance
Idea dates back to 1970s (Schweppe et al (1979, 1980))
SLIDE 58 Hierarchical Control at Different Time-scales
Physical Systems
Real Disturbance Optimization (slow)
Nominal Operating Point
Predicted Disturbance
Control (fast)
─
SLIDE 59 Imagine when we have 33%+ renewable generation …
(1 min) (10 min)
Challenges
Can the grid follow its
faster and more efficiently?
SLIDE 60 Distributed Economically-Efficient Control
sec min 5 min 60 min day year
Distributed Economically-Efficient Control
economic dispatch unit commitment
Control Goals:
Rebalance power Stabilize frequency Restore nominal frequency Re-dispatch power optimally (min cost/disutility)
SLIDE 61
Distributed Economically-Efficient Control
Advantages: For the control: Stable and more economically-efficient For the optimization: Save sensing/communication/computation
Optimal Power Dispatch Automatically solve Physical Systems Real Disturbance Redesigned Control (fast)
SLIDE 62
System Dynamics & Existing Control
Problem setup
Example:
Frequency dynamics, Voltage dynamics Primary/Secondary frequency/Voltage control Inverter dynamics/control (Model limitation: linear approximation)
SLIDE 63
Optimization Problem
Problem setup
Example:
Economic Dispatch, Optimal Load Response.
SLIDE 64 System Dynamics & Existing Control Economically Efficient State
Problem setup
How to (re)design the control u to reach the optimal solution?
- Distributed
- Closed-loop (state-feedback)
Tool: reverse/forward engineering
SLIDE 65
System Dynamics & Existing Control Economically Efficient State
Reverse
Optimization Problem
solve Analogy ?
SLIDE 66
System Dynamics & Existing Control Economically Efficient State
Forward
Optimization Problem
Equivalent
Modified
SLIDE 67 System Dynamics & Modified Control Economically Efficient State
Forward
Optimization Problem
Equivalent
Modified
solve
$$\dot x_i = \sum_{j \in N(i)} A_{ij} x_j + B_i u_i + C_i w_i$$
$$\dot u_i = \sum_{j \in N(i)} D_{ij} x_j + \sum_{j \in N(i)} E_{ij} u_j + F_i w_i + g_i(z_i)$$
$$\dot z_i = f_i\big(x_j, u_j, z_j :\ j \in N(i)\big)$$
SLIDE 68 System Dynamics & Existing Control Economically Efficient State
Sufficient and necessary conditions are available at
[Zhang, Antonois, Li, 2015, 2016 ]
Distributed Economically-Efficient Control
SLIDE 69
Simulation of an IEEE 68-bus system
Application to optimal load control for primary freq. control
SLIDE 70
Application to automatic generation control
Simulation of a 4-bus system
SLIDE 71
Distributed Economically-Efficient Control
Advantages: For the control: Stable and more economically-efficient For the optimization: A large amount of sensing, comm. and comp. is saved
Thank you!
Optimal Power Dispatch Automatically solve Physical Systems Real Disturbance Redesigned Control (fast)
SLIDE 72 System Dynamics
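$$\dot\omega_i(t) = -\frac{1}{M_i} \Big( \sum_{l \in L(i)} d_l(t) + D_i \omega_i(t) - P_i^m + \sum_{j:\, i \to j} P_{ij}(t) - \sum_{k:\, k \to i} P_{ki}(t) \Big)$$
$$\dot P_{ij}(t) = b_{ij} \big( \omega_i(t) - \omega_j(t) \big)$$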
➢ Frequency: a locally measurable signal (“price” of imbalance) ➢ Completely decentralized; no explicit communication necessary
Load frequency control
$$d_l(t) = \Big[ (C_l')^{-1}\big(\omega_i(t)\big) \Big]_{\underline{d}_l}^{\overline{d}_l} \quad \text{for } l \in L(i)$$
Load Control
SLIDE 73
Dual Gradient Algorithms
Step 1: Each appliance i updates the power request q_i(t) & sends it to the control center
Step 2: Control center updates the signal p(t) & sends it to each appliance
Replace this with true measurement of total power consump. Q(t).
SLIDE 74
Normalized Gradient
Problem: Questions: Reduce the communication?
Primal Problem e.g. Network constraints; Multi-resource allocation
Gradient Descent: Normalized Gradient Descent:
SLIDE 75
Quantized (Normalized) Gradient Descent (QGD)
Problem: QGD:
SLIDE 76
Proper Quantization
A quantization is proper if and only if the algorithm converges to the optimal points for any well-behaved f.
SLIDE 77
Quantized (Normalized) Gradient Descent (QGD)
Problem: QGD:
Questions:
A) How to determine whether a quantization is proper?
B) What is the minimal size of a proper quantization?
C) How to choose d(t) and ε(t), given a proper quantization?
D) What is the connection between the fineness of the quantization and the convergence of the algorithm?
SLIDE 78 Descent direction
Red: Quantization direction Blue: Gradient direction
SLIDE 79 Proper quantization
“Convergence of limited communications gradient methods”, Magnusson, Heal, Enyioha, Li, Fischione, Tarokh, ACC submitted
SLIDE 80 Convergence rate
“Convergence of limited communications gradient methods”, Magnusson, Heal, Enyioha, Li, Fischione, Tarokh, ACC submitted
➢ Finer quantization, larger step size is allowed ➢ Finer quantization, faster convergence
Stopping Criterion:
SLIDE 81 “Convergence of limited communications gradient methods”, Magnusson, Heal, Enyioha, Li, Fischione, Tarokh, ACC submitted
Stopping Criterion:
Convergence rate
SLIDE 82 Quantization size
Red: Quantization direction
SLIDE 83 Quantization size
Shannon,1959
SLIDE 84 Simulation
(3) Infinite bandwidth: normalized gradient (2) Infinite bandwidth: gradient
Message: Should incorporate the info. of gradient magnitude
SLIDE 85
Summary of QGD
Problem: QGD:
A) Proper quantization = θ-cover
B) Minimal size of proper quantization is N+1
C) Pick the quantized direction closest to gradient direction
D) θ plays an important role in determining the convergence
SLIDE 86 Proper quantization
“Convergence of limited communications gradient methods”, Magnusson, Heal, Enyioha, Li, Fischione, Tarokh, ACC submitted
Finer quantization, larger step size is allowed
Finer quantization, faster convergence
SLIDE 87
Communication Complexity
Control center s(k) … User 1 User N Control center s(t) … User 1 User N