
SLIDE 1

Distributed Power Management under Limited Communication

Na Li

Harvard University

Rutgers, 08/22/2017

SLIDE 2

Acknowledgment:

Harvard Univ: Guannan Qu, Chinwendu Enyioha, Vahid Tarokh
KTH: Sindri Magnusson, Carlo Fischione
Caltech: Steven Low
NREL: Changhong Zhao
Univ. of Colorado, Boulder: Lijun Chen

SLIDE 3

A Vision of the Future (IoT)?

All devices are connected and coordinated to
➢ Maximize social welfare
➢ Satisfy operation constraints

SLIDE 4

Distributed Optimization

Devices communicate, compute decisions, and communicate again, … until they reach an efficient point (iterative, two-way comm.)

SLIDE 5

Sensing, Communication, Computation

Sources: Gigaom

Intelligent Power Systems

SLIDE 6

Communication Challenges

▪ Lack reliability
▪ Unacceptable delays
▪ Vulnerable to malicious attacks
▪ Leak privacy
▪ Limited bandwidth (e.g. Power Line Comm.)
▪ High deployment cost
▪ …

How about reducing communication needs?

Packet drop

SLIDE 7

Reduce communication in power management

▪ Extract information from physical measurements (Feedback)

Load frequency control
Power allocation in buildings/data centers

▪ Recover information from local computation

Quantized dual gradient for power allocation

SLIDE 8

This talk: Limited communication in power systems

▪ Recover information from local computation

Quantized dual gradient for power allocation

▪ Extract information from physical measurements (Feedback)

Load frequency control
Power allocation in buildings/data centers

SLIDE 9

[Figure: structure of the electric power system. Generation (black): generating station, transformer. Transmission (blue): 765, 500, 345, 230, 138 kV lines; transmission customer at 138 kV or 230 kV; substation step-down transformer. Distribution (green): subtransmission at 26 kV, 69 kV; primary distribution at 13 kV, 4 kV; secondary distribution at 120 V, 240 V. Source: graphic courtesy of North American Electric Reliability Corporation (NERC).]

Power Systems

Supply = Demand

Storage, DR appliances, EV

SLIDE 10

 

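$$
\min \ \sum_i \Big( \sum_{l \in L(i)} C_l(d_l) + \tfrac{1}{2} D_i \omega_i^2 \Big)
$$

$$
\text{over } \ \underline{d}_l \le d_l \le \overline{d}_l, \quad \omega_i, \quad P
$$

$$
\text{s.t. } \ P_i^m - \sum_{l \in L(i)} d_l - D_i \omega_i = \sum_{j:\, i \to j} P_{ij} - \sum_{k:\, k \to i} P_{ki}
$$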

➢ Balance total generation and load
➢ Keep frequency deviation small
➢ Minimize aggregate load disutility

Optimal Load Control

Distributed optimization (e.g. ADMM) applies. But…

Annotations: frequency-deviation disutility (objective); power balance at each i (constraint)

[Figure: bus i with mechanical power P_i^m, loads d_l and D_i ω_i, and line flow P_ij to neighboring bus j]

i: control area /aggregated bus

SLIDE 11

Can loads respond in real time and in closed loop?

▪ Hard to get the real-time disturbance information
▪ Heavily relies on iterative communication

But…

➢ Network physical dynamics help!

[Figure: bus i with mechanical power P_i^m, loads d_l and D_i ω_i, and line flow P_ij to neighboring bus j]

SLIDE 12

Physical dynamics: Swing Dynamics

i: Aggregated bus/control area/balance authority

Variables denote the deviations from their reference (steady state) values

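[Figure: bus i with mechanical power P_i^m, loads d_l and D_i ω_i, and line flow P_ij to neighboring bus j]

$$
\dot{\omega}_i = -\frac{1}{M_i} \Big( \sum_{l \in L(i)} d_l + D_i \omega_i - P_i^m + \sum_{j:\, i \to j} P_{ij} - \sum_{k:\, k \to i} P_{ki} \Big)
$$

ω_i: frequency; P_i^m: mechanical power; M_i: inertia; D_i ω_i: freq-sensitive load; d_l: freq-insensitive loads; P_ij: power flow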

SLIDE 13

DC approximation of power flow

Lossless (resistance=0)
Fixed voltage magnitudes
Small deviation of angles
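The three assumptions above lead to the usual linearized flow model. The block below is a textbook-style sketch rather than a reproduction of the slide's equation; the reactance x_{ij}, the bus angles θ_i and the operating-point angles θ_i^0 are notation assumed here, while b_{ij} and ω_i follow the symbols used in the system dynamics.

```latex
% Assumed notation: x_{ij} = line reactance, \theta_i = angle deviation at bus i,
% \theta_i^0 = angle at the operating point, |V_i| = fixed voltage magnitude.
\begin{align*}
P_{ij}^{\text{total}} &= \frac{|V_i|\,|V_j|}{x_{ij}} \sin\bigl(\theta_i^0 + \theta_i - \theta_j^0 - \theta_j\bigr)
  && \text{(lossless line, fixed voltage magnitudes)} \\
P_{ij} &\approx b_{ij}\,(\theta_i - \theta_j),
  \qquad b_{ij} := \frac{|V_i|\,|V_j|}{x_{ij}} \cos\bigl(\theta_i^0 - \theta_j^0\bigr)
  && \text{(small angle deviations)} \\
\dot{P}_{ij} &= b_{ij}\,(\omega_i - \omega_j)
  && \text{(since } \dot{\theta}_i = \omega_i \text{ in deviation variables)}
\end{align*}
```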

[Figure: bus i (voltage V_i) with mechanical power P_i^m, loads d_l and D_i ω_i, and line flow P_ij to neighboring bus j]

Network dynamics

SLIDE 14

System Model Recap

[Figure: bus i with mechanical power P_i^m, loads d_l and D_i ω_i, and line flow P_ij to neighboring bus j]

Load Control

 

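$$
\min \ \sum_i \Big( \sum_{l \in L(i)} C_l(d_l) + \tfrac{1}{2} D_i \omega_i^2 \Big)
$$

$$
\text{over } \ \underline{d}_l \le d_l \le \overline{d}_l, \quad \omega_i, \quad P
$$

$$
\text{s.t. } \ P_i^m - \sum_{l \in L(i)} d_l - D_i \omega_i = \sum_{j:\, i \to j} P_{ij} - \sum_{k:\, k \to i} P_{ki}
$$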

?

SLIDE 15

Load frequency control

System Dynamics

Load Control

$$
d_l(t) = \Big[ C_l'^{-1}\bigl(\omega_i(t)\bigr) \Big]_{\underline{d}_l}^{\overline{d}_l} \quad \text{for } l \in L(i)
$$

   

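Optimal Load Control

$$
\min \ \sum_i \Big( \sum_{l \in L(i)} C_l(d_l) + \tfrac{1}{2} D_i \omega_i^2 \Big)
$$

$$
\text{over } \ \underline{d}_l \le d_l \le \overline{d}_l, \quad \omega_i, \quad P
$$

$$
\text{s.t. } \ P_i^m - \sum_{l \in L(i)} d_l - D_i \omega_i = \sum_{j:\, i \to j} P_{ij} - \sum_{k:\, k \to i} P_{ki}
$$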

Converge to the optimal solution (Primal-Dual Gradient Flow)

Primal-Dual Gradient Flow: Arrow et al. 1958; Feijer and Paganini 2010; Zhao, Low et al. 2013; You, Chen et al. 2014; Cherukuri, Mallada, Cortes 2015; etc.

Dual Dynamics

SLIDE 16

➢ Frequency: a locally measurable signal (“price” of imbalance) ➢ Completely decentralized; no explicit communication necessary
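A minimal sketch of this decentralized loop is given below. It assumes a two-bus network, forward-Euler integration, and a quadratic disutility C_i(d) = d²/(2 a_i), so that the load rule reduces to clipping a_i·ω_i to its bounds. All parameter values and the function name are hypothetical and are not taken from the talk's simulations.

```python
def simulate_load_frequency_control(t_end=60.0, dt=0.005):
    """Two buses joined by one line (1 -> 2); each load reacts only to local frequency."""
    M = [2.0, 3.0]        # inertia of each bus (hypothetical values)
    D = [1.0, 1.5]        # damping of the frequency-sensitive load
    a = [4.0, 2.0]        # assumed C_i(d) = d^2 / (2 a_i), hence C_i'^{-1}(w) = a_i * w
    b12 = 5.0             # line parameter in dP12/dt = b12 * (w1 - w2)
    Pm = [-1.0, 0.0]      # step loss of generation at bus 1
    d_min, d_max = -1.0, 1.0

    def local_load(i, w_i):
        # Load rule: inverse marginal disutility of the local frequency, clipped to bounds.
        return min(max(a[i] * w_i, d_min), d_max)

    w = [0.0, 0.0]        # frequency deviations
    P12 = 0.0             # line-flow deviation from bus 1 to bus 2
    for _ in range(int(t_end / dt)):
        d = [local_load(i, w[i]) for i in range(2)]   # no messages are exchanged
        dw1 = (Pm[0] - d[0] - D[0] * w[0] - P12) / M[0]
        dw2 = (Pm[1] - d[1] - D[1] * w[1] + P12) / M[1]
        dP12 = b12 * (w[0] - w[1])
        w = [w[0] + dt * dw1, w[1] + dt * dw2]
        P12 += dt * dP12
    d = [local_load(i, w[i]) for i in range(2)]
    return w, d, P12


if __name__ == "__main__":
    w, d, P12 = simulate_load_frequency_control()
    print("frequency deviations:", w)
    print("load adjustments:", d)
```

With these numbers both buses settle at the common frequency deviation Σ P^m / Σ (a_i + D_i), which is the point where total load adjustment plus damping balances the lost generation.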

Load frequency control

[Figure: feedback loop between frequency and load control]

SLIDE 17

Simulations

Dynamic simulation of IEEE 68-bus system (New England)

Power System Toolbox (RPI)
Detailed generation model
Exciter model, power system stabilizer model
Nonzero resistance lines

Sample rate: 250 ms
Step increase of loads on buses 1, 7, 27

SLIDE 18

59.964 Hz ERCOT threshold for freq control

Simulations

SLIDE 19

Simulations

SLIDE 20

Recap

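Network Dynamics

[Figure: bus i with mechanical power P_i^m, loads d_l and D_i ω_i, and line flow P_ij to neighboring bus j]

Optimization

$$
\min \ \sum_i \Big( \sum_{l \in L(i)} C_l(d_l) + \tfrac{1}{2} D_i \omega_i^2 \Big)
$$

$$
\text{over } \ \underline{d}_l \le d_l \le \overline{d}_l, \quad \omega_i, \quad P
$$

$$
\text{s.t. } \ P_i^m - \sum_{l \in L(i)} d_l - D_i \omega_i = \sum_{j:\, i \to j} P_{ij} - \sum_{k:\, k \to i} P_{ki}
$$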

SLIDE 21

   

Network Dynamics:

$$
\dot{x}_i = \sum_{j:\, j \sim i} A_{ij} x_j + B_i u_i + C_i w_i
$$

Optimization:

$$
\min_{x,\,u} \ \sum_i \bigl( f_i(x_i) + g_i(u_i) \bigr) \quad \text{s.t. constraints } h_i(x_i, u_i)
$$

How to design distributed, closed-loop controller u?

[Li, Chen, Zhao, 2015]: Economic Automatic Generation Control
[Zhang, Antonois, Li, 2016]: Sufficient and Necessary Conditions
[Zhang, Malkawi, Li, 2016]: Thermal Control for HVAC

This Idea Extends to General Systems

SLIDE 22

This talk: limited communication

▪ Extract information from physical measurements (Feedback)

Load frequency control
Decentralized voltage control (distribution network) (Qu, Li, Dahleh, 2014)
Power allocation in buildings/data center

▪ Recover information from local computation

Quantized dual gradient for power allocation

SLIDE 23

This talk: limited communication

▪ Extract information from physical measurements (Feedback)

Load frequency control
Decentralized voltage control (distribution network) (Qu, Li, Dahleh, 2014)
Power allocation in buildings/data center

▪ Recover information from local computation

Quantized dual gradient for power allocation

SLIDE 24

This talk: limited communication

▪ Extract information from physical measurements (Feedback)

Load frequency control
Decentralized voltage control (distribution network) (Qu, Li, Dahleh, 2014)
Power allocation in buildings/data center

▪ Recover information from local computation

Quantized dual gradient for power allocation

SLIDE 25

Power management within buildings

Control center coordinates power consumption of appliances

➢ Maximize utility, minimize cost
➢ Satisfy operation constraints, e.g. power capacity constraints

Control Center

SLIDE 26

Distributed Coordination under Two-way Comm.

Control Center

Step 1: Appliances to center: power request
Step 2: Center to appliances: coordination signal

Assume perfect, reliable, and ubiquitous communication resources

Iterate

SLIDE 27

Q1: Is it possible to use only one-way comm.?
Q2: How many bits are needed?

Reduce Communication Needs

Control Center

SLIDE 28

Not just for the buildings/grids

Data Center
Multi-core Processor

Communication cost is much higher than computation [Bolsens I., 2002]

SLIDE 29

Power allocation problem

[Figure: control center broadcasting the signal p(t) to Users 1, 2, …, N; user 1 returns the request q1(t)]

SLIDE 30

A distributed algorithm: Dual gradient descent

[Figure: control center broadcasting the signal p(t) to Users 1, 2, …, N; user 1 returns the request q1(t)]
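The two-way iteration can be sketched as below. The slide's formulas did not survive extraction, so this is an illustration under stated assumptions rather than the talk's exact algorithm: each user i has a hypothetical quadratic utility U_i(q) = a_i·q − ½·b_i·q², the only coupling is a total capacity limit, and the function and parameter names are invented for the example.

```python
def user_response(p, a, b, q_max):
    """A user's best request for price p: maximizes U(q) - p*q over [0, q_max]."""
    return min(max((a - p) / b, 0.0), q_max)


def dual_gradient_descent(a, b, capacity, q_max=10.0, step=0.1, iters=500):
    """Iterate: users answer the price, the center moves the price along the dual gradient."""
    p = 0.0
    for _ in range(iters):
        # Users -> center: power requests q_i(t) computed from the current signal p(t).
        q = [user_response(p, ai, bi, q_max) for ai, bi in zip(a, b)]
        # Center -> users: raise the price if requests exceed capacity, lower it otherwise.
        p = max(0.0, p + step * (sum(q) - capacity))
    q = [user_response(p, ai, bi, q_max) for ai, bi in zip(a, b)]
    return p, q


if __name__ == "__main__":
    price, requests = dual_gradient_descent([4.0, 6.0, 8.0], [1.0, 2.0, 1.0], capacity=6.0)
    print("price:", price)
    print("requests:", requests)
```

The center's update uses only the sum of the requests, not the individual q_i(t), and the intermediate requests may exceed the capacity before the price has risen enough.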

SLIDE 31

A distributed algorithm: One-way comm.

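Replace the requests q_i(t) with a true measurement of the total power consumption Q(t).

[Figure: control center measuring the total power consumption Q(t), in place of receiving q1(t), and broadcasting p(t) to Users 1, 2, …, N]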

SLIDE 32

What’s the problem here?

It might violate the hard physical constraint.

Theorem: If the step size and initial setting are chosen properly, the constraint will hold all the time.

“Distributed resource allocation using one-way communication”, Magnusson, Enyioha, Li, Fischione, Tarokh, 2016
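A one-dimensional sketch of why such a guarantee is plausible is given below. It is not the proof from the cited paper: it assumes a scalar price, an aggregate consumption Q(p) that is nonincreasing and L-Lipschitz in the price, a capacity Q_max that is met exactly at some price p*, a step size α ≤ 1/L, and an initial price p(0) ≥ p*. All of these symbols are notation chosen here.

```latex
% Assumptions (illustrative): Q nonincreasing and L-Lipschitz, Q(p^\star) = Q_{\max},
% 0 < \alpha \le 1/L, and p(0) \ge p^\star.
\begin{align*}
p(t+1) &= p(t) - \alpha\,\bigl(Q_{\max} - Q(p(t))\bigr), \\
0 \;\le\; Q_{\max} - Q(p(t)) &= Q(p^\star) - Q(p(t)) \;\le\; L\,\bigl(p(t) - p^\star\bigr)
  \qquad \text{whenever } p(t) \ge p^\star, \\
p(t+1) &\ge p(t) - \alpha L\,\bigl(p(t) - p^\star\bigr) \;\ge\; p^\star
  \;\;\Longrightarrow\;\; Q(p(t+1)) \le Q(p^\star) = Q_{\max}.
\end{align*}
```

By induction the price never drops below p*, so under these assumptions the measured consumption never exceeds the capacity.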

[Figure: control center broadcasting p(t) to Users 1, 2, …, N]

SLIDE 33

This Talk

▪ Extract information from physical measurements (Feedback)

Load frequency control
Power allocation in buildings/data center

▪ Recover information from local computation

Quantized dual gradient for power allocation

SLIDE 34

This Talk

▪ Extract information from physical measurements (Feedback)

Load frequency control
Power allocation in buildings/data center

▪ Recover information from local computation

Quantized dual gradient for power allocation

SLIDE 35

[Figure: control center broadcasting p(t) to Users 1, 2, …, N]

Recall: Dual Gradient with One-way Comm.

SLIDE 36

Control center

Further reduce comm.

Just send one bit to indicate the sign

s(k) = 0 or 1

[Figure: control center broadcasting the one-bit signal s(k) to Users 1, 2, …, N, who recover p(t) locally]
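A minimal sketch of the one-bit scheme follows. It assumes hypothetical quadratic utilities U_i(q) = a_i·q − ½·b_i·q², a diminishing step size ε(k) = 1/√(k+1) known to everyone, and that every user starts from the same initial price, so each user can rebuild the price from the bit stream alone. None of these choices is stated on the slide.

```python
def one_bit_dual_update(a, b, capacity, q_max=10.0, iters=2000):
    """Price tracking where the center broadcasts only the sign of the capacity violation."""
    p = 0.0  # price copy; every user can track it locally from the received bits
    for k in range(iters):
        # Users consume according to their local copy of the price.
        q = [min(max((ai - p) / bi, 0.0), q_max) for ai, bi in zip(a, b)]
        # Center measures total consumption and sends one bit: 1 = over capacity, 0 = not.
        s = 1 if sum(q) > capacity else 0
        eps = 1.0 / (k + 1) ** 0.5          # assumed diminishing step-size schedule
        direction = 1.0 if s == 1 else -1.0
        p = max(0.0, p + eps * direction)   # same update at the center and at every user
    return p


if __name__ == "__main__":
    print("price after one-bit updates:",
          one_bit_dual_update([4.0, 6.0, 8.0], [1.0, 2.0, 1.0], capacity=6.0))
```

Because the step no longer scales with the size of the violation, the price hovers around the optimum within roughly the current step size, which is why a diminishing schedule is used in this sketch.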

SLIDE 37

Dual Gradient with One-bit One-way Comm.

This is quantized (normalized) gradient descent of the dual function.

[Figure: control center broadcasting the one-bit signal s(k) to Users 1, 2, …, N]

Normalized Gradient Descent [Shor 1985]:
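The update formula did not survive extraction. For reference, the standard textbook forms are sketched below; the step-size symbols γ and ε(t) are notation chosen here and may differ from the slide.

```latex
% Standard forms; \gamma and \epsilon(t) are assumed step-size notation.
\begin{align*}
\text{Gradient descent:} \quad
  & x(t+1) = x(t) - \gamma\,\nabla f(x(t)), \\
\text{Normalized gradient descent:} \quad
  & x(t+1) = x(t) - \epsilon(t)\,\frac{\nabla f(x(t))}{\lVert \nabla f(x(t)) \rVert}.
\end{align*}
% In one dimension the normalized gradient equals the sign of the derivative,
% so a single bit is enough to transmit it.
```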

SLIDE 38

Quantized (Normalized) Gradient Descent (QGD)

Problem: QGD:

Definition: A quantization is proper (good) if and only if the algorithm converges to the optimal points for any well-behaved f, e.g. a convex smooth function.

SLIDE 39

Quantized (Normalized) Gradient Descent (QGD)

Problem:

QGD:

Questions:
A) How to determine whether a quantization is proper?
B) What is the minimal size of a proper quantization?
C) How to choose d(t) and ε(t), given a good quantization?
D) How does the fineness of the quantization relate to the convergence of the algorithm?

SLIDE 40

Descent direction

[Figure legend: red = quantization direction; blue = gradient direction]

SLIDE 41

Proper quantization

[Figure legend: red = quantization direction; blue = gradient direction]

“Convergence of limited communications gradient methods”, Magnusson, Enyioha, Li, Fischione, Tarokh, Transactions on Automatic Control, 2017

SLIDE 42

Convergence rate

➢ Finer quantization, larger step size is allowed
➢ Finer quantization, faster convergence

One Stopping Criterion:

*More convergence results are available in the paper

SLIDE 43

Quantization size

Shannon,1959

SLIDE 44

Simulation

[Figure legend: (2) infinite bandwidth: gradient; (3) infinite bandwidth: normalized gradient]

Message: Should incorporate the info. of gradient magnitude

SLIDE 45

Summary of QGD

Problem:

QGD:

A) Proper quantization = θ-cover
B) Minimal size of proper quantization is K+1
C) Pick the quantized direction closest to the gradient direction
D) θ plays an important role in the convergence

“Convergence of limited communications gradient methods”, Magnusson, Enyioha, Li, Fischione, Tarokh, Transactions on Automatic Control, 2017
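Points B and C can be illustrated in two dimensions (K = 2) with K + 1 = 3 directions. The sketch below assumes the update x(t+1) = x(t) − ε(t)·d(t), with d(t) taken from three unit vectors 120° apart, so some direction is always within 60° of the gradient, and a diminishing step ε(t) = 1/√(t+1). The direction set, the step schedule and the function names are assumptions for this example, not details taken from the cited paper.

```python
import math


def simplex_directions_2d():
    """Three unit vectors 120 degrees apart: K + 1 directions for K = 2."""
    angles = (math.pi / 2, 7 * math.pi / 6, 11 * math.pi / 6)
    return [(math.cos(ang), math.sin(ang)) for ang in angles]


def quantized_gradient_descent(grad, x0, directions, iters=2000):
    """Each step moves against the quantized direction closest to the gradient."""
    x = list(x0)
    for t in range(iters):
        g = grad(x)
        if math.hypot(g[0], g[1]) == 0.0:
            break  # already at a stationary point
        # Closest direction = largest inner product with the gradient;
        # its index is all that would need to be transmitted.
        d = max(directions, key=lambda v: v[0] * g[0] + v[1] * g[1])
        eps = 1.0 / math.sqrt(t + 1)  # assumed diminishing step size
        x = [x[0] - eps * d[0], x[1] - eps * d[1]]
    return x


if __name__ == "__main__":
    # Hypothetical test function f(x) = 0.5 * ((x0 - 1)^2 + (x1 + 2)^2).
    grad_f = lambda x: (x[0] - 1.0, x[1] + 2.0)
    print(quantized_gradient_descent(grad_f, (4.0, 2.0), simplex_directions_2d()))
```

With only three directions, each message needs two bits, and the iterate ends up within a small multiple of the final step size of the minimizer.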

SLIDE 46

Extension to Constrained Case

Problem:

QGD:

Can the results of the unconstrained case be extended?

SLIDE 47

θ-cover does not work for the constrained case

[Figure legend: grey = constraint set; marker = x(t)]

Get stuck at non-optimal points
Not necessarily a descent direction

SLIDE 48

Extension to Constrained Case

Problem: QGD:

SLIDE 49

Communication Complexity

[Figure: control center broadcasting s(t) to Users 1, …, N]

SLIDE 50

Communication Complexity

[Figure: signal s(t) broadcast to Users 1, …, N]

Question: What is the minimal number of bits (in total) needed to achieve an ℇ-optimal solution? ℇ-complexity (a min-max definition)

“Communication Complexity of Distributed Resource Allocation Optimization”, Magnusson, Enyioha, Li, Fischione, Tarokh, submitted, 2017

What optimal accuracy can be achieved using b bits (in total)? b-complexity (a min-max definition)

Is there a simple coding scheme that reaches the complexity? Yes.

SLIDE 51

Summary: Limited Communication

▪ Extract information from physical measurements (Feedback)

Load frequency control
Power allocation in buildings/data center

▪ Recover information from local computation

Quantized dual (normalized) gradient for power allocation

Question: How to choose the right algorithms and integrate them together?

Tradeoff: Efficiency, Robustness, Communication, Sensing, Computation, Convergence speed

Thank you!

SLIDE 52

Accelerated Distributed Nesterov Gradient Descent

Guannan Qu, Na Li, John A. Paulson School of Engineering and Applied Sciences, Harvard University

Problem Formulation

Local Communication
Communication Graph Connected

Background

Centralized Gradient Methods for minimizing 𝒈:
Gradient Descent
Nesterov Gradient Descent (for 𝒈 𝝂-strongly convex, 𝑴-smooth)
Nesterov Gradient Descent (for 𝒈 convex, 𝑴-smooth)

[Table: convergence rates by type of 𝒈 (convex and 𝑴-smooth; 𝝂-strongly convex and 𝑴-smooth) and algorithm (GD; Nesterov GD)]

Nesterov GD brings acceleration! Most distributed gradient methods are based on GD, and their convergence rate is no better than that of GD.

Can Nesterov momentum be used in distributed gradient methods to accelerate convergence?

Preliminaries

Main Results: 𝝂-strongly convex and 𝑴-smooth cost functions

Proposed Algorithm
Initialize:
Convergence Rate:

Main Results: convex and 𝑴-smooth cost functions

Proposed Algorithm
Initialize:

Summary of Results

Convergence Rates (for 𝒈 convex and 𝑴-smooth, or 𝒈 𝝂-strongly convex and 𝑴-smooth)

Simulation

For more detailed results, see Guannan Qu and Na Li, "Accelerated Distributed Nesterov Gradient Descent," arXiv preprint arXiv:1705.07176 (2017).
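The background claim that Nesterov GD accelerates centralized gradient descent can be checked on a small example. The sketch below covers only the centralized, strongly convex case on a hypothetical diagonal quadratic with eigenvalues between ν = 1 and M = 100; it uses the standard constant-momentum form of Nesterov's method and is not the distributed algorithm proposed on the poster.

```python
def objective(eigs, x):
    """g(x) = 0.5 * sum_i eigs[i] * x[i]^2, which is min(eigs)-strongly convex and max(eigs)-smooth."""
    return 0.5 * sum(lam * xi * xi for lam, xi in zip(eigs, x))


def gradient_descent(eigs, x0, iters):
    """Plain gradient descent with step size 1/M."""
    M = max(eigs)
    x = list(x0)
    for _ in range(iters):
        x = [xi - (lam * xi) / M for xi, lam in zip(x, eigs)]
    return x


def nesterov_gradient_descent(eigs, x0, iters):
    """Nesterov's method with constant momentum (sqrt(kappa) - 1) / (sqrt(kappa) + 1)."""
    M, nu = max(eigs), min(eigs)
    root_kappa = (M / nu) ** 0.5
    beta = (root_kappa - 1.0) / (root_kappa + 1.0)
    x_prev, x = list(x0), list(x0)
    for _ in range(iters):
        # Extrapolate with momentum, then take a gradient step from the extrapolated point.
        y = [xi + beta * (xi - xpi) for xi, xpi in zip(x, x_prev)]
        x_prev = x
        x = [yi - (lam * yi) / M for yi, lam in zip(y, eigs)]
    return x


if __name__ == "__main__":
    eigs = [1.0 + 99.0 * i / 19.0 for i in range(20)]   # condition number 100
    x0 = [1.0] * 20
    print("GD:      ", objective(eigs, gradient_descent(eigs, x0, 200)))
    print("Nesterov:", objective(eigs, nesterov_gradient_descent(eigs, x0, 200)))
```

For this problem the per-iteration contraction is about 1 − ν/M for GD and about 1 − √(ν/M) for the Nesterov variant, so the gap widens quickly as the condition number grows.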

SLIDE 53

Back up: Reverse and Forward Engineering

SLIDE 54

Frequency control

[Timeline: sec, min, 5 min, 60 min, day, year]

Power balance, stability (dynamic model): primary freq control, secondary freq control
Economic efficiency (power flow model): economic dispatch, unit commitment

SLIDE 55

Frequency control

[Timeline: sec, min, 5 min, 60 min, day, year]

Control. Power balance, stability (dynamic model): primary freq control, secondary freq control
Optimization. Economic efficiency (power flow model): economic dispatch, unit commitment

Traditionally this is done at the generation side.

➢ Goal: Balance the grid in an optimal (cost-effective) way

SLIDE 56

Loss of 2 nuclear plants in ERCOT

Kirby 2003 [ORNL/TM-2003/19]

[Figure: frequency trace with time marks at 1 min and 10 min; deadband at 59.964 Hz]

Frequency response

Imagine if there is 50%+ renewable generation

SLIDE 57

Advantages of load-side control

▪ faster (no/low inertia!)
▪ no waste or emission
▪ more resources (large #)
▪ localize disturbance

Idea dates back to the 1970s (Schweppe et al., 1979, 1980)

SLIDE 58

Hierarchical Control at Different Time-scales

[Diagram blocks: Physical Systems; Real Disturbance; Optimization (slow); Nominal Operating Point; Predicted Disturbance; Control (fast)]

SLIDE 59

Imagine when we have 33%+ renewable generation …


Challenges

Can the grid follow its own PV/Wind production faster and more efficiently?

SLIDE 60

Distributed Economically-Efficient Control

[Timeline: sec, min, 5 min, 60 min, day, year; Distributed Economically-Efficient Control shown alongside economic dispatch and unit commitment]

Control Goals:

▪ Rebalance power
▪ Stabilize frequency
▪ Restore nominal frequency
▪ Re-dispatch power optimally (min cost/disutility)

SLIDE 61

Distributed Economically-Efficient Control

Advantages:
For the control: stable and more economically efficient
For the optimization: saves sensing/communication/computation

[Diagram blocks: Optimal Power Dispatch; "Automatically solve"; Physical Systems; Real Disturbance; Redesigned Control (fast)]

SLIDE 62

System Dynamics & Existing Control

Problem setup

Example:

Frequency dynamics, voltage dynamics
Primary/secondary frequency/voltage control
Inverter dynamics/control
(Model limitation: linear approximation)

SLIDE 63

Optimization Problem

Problem setup

Example:

Economic Dispatch, Optimal Load Response.

SLIDE 64

System Dynamics & Existing Control

Economically Efficient State

Problem setup

How to (re)design the control u to reach the optimal solution?

Distributed
Closed-loop (state-feedback)

Tool: reverse/forward engineering

SLIDE 65

System Dynamics & Existing Control

Economically Efficient State

Reverse

Optimization Problem

solve Analogy ?

SLIDE 66

System Dynamics & Existing Control

Economically Efficient State

Forward

Optimization Problem

Equivalent

Modified

SLIDE 67

System Dynamics & Modified Control

Economically Efficient State

Forward

Optimization Problem

Equivalent

Modified

solve  

$$
\dot{x}_i = \sum_{j \in N(i)} A_{ij} x_j + B_i u_i + C_i w_i
$$

$$
\dot{u}_i = \sum_{j \in N(i)} D_{ij} x_j + \sum_{j \in N(i)} E_{ij} u_j + F_i w_i + g_i(z_i, z_j)
$$

$$
\dot{z}_i = f_i(x_i, u_i, z_j), \quad j \in N(i)
$$

SLIDE 68

System Dynamics & Existing Control

Economically Efficient State

Sufficient and necessary conditions are available at

[Zhang, Antonois, Li, 2015, 2016 ]

Distributed Economically-Efficient Control

SLIDE 69

Simulation of an IEEE 68-bus system

Application to optimal load control for primary freq. control

SLIDE 70

Application to automatic generation control

Simulation of a 4-bus system

SLIDE 71

Distributed Economically-Efficient Control

Advantages:
For the control: stable and more economically efficient
For the optimization: a large amount of sensing, comm. and comp. is saved

Thank you!

[Diagram blocks: Optimal Power Dispatch; "Automatically solve"; Physical Systems; Real Disturbance; Redesigned Control (fast)]

SLIDE 72

System Dynamics

 

$$
\dot{\omega}_i = -\frac{1}{M_i} \Big( \sum_{l \in L(i)} d_l(t) + D_i \omega_i(t) - P_i^m + \sum_{j:\, i \to j} P_{ij}(t) - \sum_{k:\, k \to i} P_{ki}(t) \Big)
$$

$$
\dot{P}_{ij} = b_{ij} \bigl( \omega_i(t) - \omega_j(t) \bigr)
$$

➢ Frequency: a locally measurable signal (“price” of imbalance) ➢ Completely decentralized; no explicit communication necessary

Load frequency control

Load Control

$$
d_l(t) = \Big[ C_l'^{-1}\bigl(\omega_i(t)\bigr) \Big]_{\underline{d}_l}^{\overline{d}_l} \quad \text{for } l \in L(i)
$$

SLIDE 73

Dual Gradient Algorithms

Step 1: Each appliance i updates the power request q_i(t) and sends it to the control center
Step 2: The control center updates the signal p(t) and sends it to each appliance

Replace the requests q_i(t) with a true measurement of the total power consumption Q(t).

SLIDE 74

Normalized Gradient

Problem: Questions: Reduce the communication?

Primal Problem e.g. Network constraints; Multi-resource allocation

Gradient Descent: Normalized Gradient Descent:

SLIDE 75

Quantized (Normalized) Gradient Descent (QGD)

Problem: QGD:

SLIDE 76

Proper Quantization

A quantization is proper if and only if the algorithm converges to the optimal points for any well-behaved f.

SLIDE 77

Quantized (Normalized) Gradient Descent (QGD)

Problem:

QGD:

Questions:
A) How to determine whether a quantization is proper?
B) What is the minimal size of a proper quantization?
C) How to choose d(t) and ε(t), given a proper quantization?
D) How does the fineness of the quantization relate to the convergence of the algorithm?

SLIDE 78

Descent direction

[Figure legend: red = quantization direction; blue = gradient direction]

SLIDE 79

Proper quantization

“Convergence of limited communications gradient methods”, Magnusson, Heal, Enyioha, Li, Fischione, Tarokh, ACC submitted

SLIDE 80

Convergence rate

“Convergence of limited communications gradient methods”, Magnusson, Heal, Enyioha, Li, Fischione, Tarokh, ACC submitted

➢ Finer quantization, larger step size is allowed
➢ Finer quantization, faster convergence

Stopping Criterion:

SLIDE 81

“Convergence of limited communications gradient methods”, Magnusson, Heal, Enyioha, Li, Fischione, Tarokh, ACC submitted

Stopping Criterion:

Convergence rate

SLIDE 82

Quantization size

Red: Quantization direction

SLIDE 83

Quantization size

Shannon,1959

SLIDE 84

Simulation

[Figure legend: (2) infinite bandwidth: gradient; (3) infinite bandwidth: normalized gradient]

Message: Should incorporate the info. of gradient magnitude

SLIDE 85

Summary of QGD

Problem:

QGD:

A) Proper quantization = θ-cover
B) Minimal size of proper quantization is N+1
C) Pick the quantized direction closest to the gradient direction
D) θ plays an important role in determining the convergence

SLIDE 86

Proper quantization

“Convergence of limited communications gradient methods”, Magnusson, Heal, Enyioha, Li, Fischione, Tarokh, ACC submitted

➢ Finer quantization, larger step size is allowed
➢ Finer quantization, faster convergence

SLIDE 87