A Distributed Approach to Large Scale Security Constrained Unit - - PowerPoint PPT Presentation



SLIDE 1

FERC Technical Conference on Increasing Real-Time and Day-Ahead Market Efficiency through Improved Software June 22-24, 2015 Washington, DC

A Distributed Approach to Large Scale Security Constrained Unit Commitment Problem

Kaan Egilmez Cambridge Energy Solutions

SLIDE 2

About CES

  • Cambridge Energy Solutions is a software company with a mission to develop software tools for participants in deregulated electric power markets.
  • CES-US provides information and tools to assist market participants in analyzing the electricity markets on a locational basis, forecasting and valuing transmission congestion, and understanding the fundamental drivers of short- and long-term prices.
  • CES-US staff are experts on market structures in the US, system operation and related information technology.

SLIDE 3

Presentation overview

  • The convergence of machine virtualization and the maturing of multi-core computing has had a dramatic impact on the ease with which high performance computing techniques can be brought to bear on real world problems.
  • At CES we are actively working on improving the performance of our DAYZER market modeling and simulation software by making use of multi-core parallel programming on individual compute nodes, combined with distribution of the work load across multiple such compute nodes organized into high performance computing clusters.
  • This talk provides an overview of the techniques we are using to accomplish this goal, as well as simulation results of performance improvement on both small and large scale models such as our combined model for PJM and MISO.
  • These techniques, if applied to market operations and planning, would allow many more scenarios to be examined concurrently and/or more detailed individual models to be solved within reasonable time limits, allowing novel solutions to existing concerns regarding the robustness of market results to various kinds of uncertainties.

SLIDE 4

DAYZER

CES has developed DAYZER to assist electric power market participants in analyzing the locational market clearing prices and the associated transmission congestion costs in competitive electricity markets. This tool simulates the operation of the electricity markets by mimicking the dispatch procedures used by the corresponding independent system operators (ISOs), and replicates the calculations made by the ISOs in solving for the security-constrained, least-cost unit commitment and dispatch in the Day-Ahead markets. Models are available for the CAISO, ERCOT, MISO, NEPOOL, NYISO, ONTARIO, PJM, SPP and WECC markets, as well as a combined model for the PJM-MISO region.

SLIDE 5

DAYZER SCUC MILP (MUC) Formulation

Minimize the total cost over 24 hours of:

Generation + Startup/Shutdown + Imports/Exports + Generation Slacks + Spin Reserve Slacks + Non Spin Reserve Slacks + Transmission Overloads + PAR Angle Overloads

Subject to the following constraints for each hour:

  • System energy balance
  • Spin reserves requirement
  • Non spin reserves requirement
  • Unit commitment constraints (capacity, min up/down, start/stop, ramping)
  • Pump storage constraints (efficiency, reservoir)
  • Transmission constraints (line, contingency, interface, PAR, nomogram)
  • PAR angle constraints
  • DC line constraints
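
To make the structure of the objective concrete, here is a minimal sketch of a toy commitment problem solved by brute-force enumeration. It is not the DAYZER formulation: it keeps only generation cost, startup cost and a generation slack penalty, with the system energy balance as the single constraint. All unit data, the load profile and the penalty value are hypothetical.

```python
from itertools import product

# Hypothetical toy data (not from DAYZER): two units, three hours.
UNITS = [
    {"name": "base", "pmax": 100.0, "cost": 20.0, "startup": 500.0},
    {"name": "peak", "pmax": 50.0, "cost": 50.0, "startup": 100.0},
]
LOAD = [60.0, 120.0, 80.0]   # MW in each hour
SLACK_PENALTY = 1000.0       # $/MWh charged on unserved load (generation slack)


def dispatch_cost(on_flags, load):
    """Merit-order dispatch of the committed units for one hour."""
    cost, remaining = 0.0, load
    for unit, on in sorted(zip(UNITS, on_flags), key=lambda p: p[0]["cost"]):
        if on:
            mw = min(unit["pmax"], remaining)
            cost += mw * unit["cost"]
            remaining -= mw
    # Any load left uncovered is priced at the slack penalty.
    return cost + SLACK_PENALTY * remaining


def schedule_cost(schedule):
    """Total cost of a schedule: generation + startup + slack."""
    total, prev = 0.0, (0,) * len(UNITS)   # all units start offline
    for on_flags, load in zip(schedule, LOAD):
        total += dispatch_cost(on_flags, load)
        total += sum(u["startup"] for u, was, now in zip(UNITS, prev, on_flags)
                     if now and not was)
        prev = on_flags
    return total


def solve_by_enumeration():
    """Try every on/off pattern; only viable for tiny illustrative cases."""
    hourly = list(product((0, 1), repeat=len(UNITS)))
    return min(product(hourly, repeat=len(LOAD)), key=schedule_cost)


best = solve_by_enumeration()
print(best, schedule_cost(best))
```

Enumeration grows exponentially with units and hours, which is why a real model of the size described on the next slide needs a MILP solver or a decomposition method.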

SLIDE 6

Examples of DAYZER Model Characteristics

NEPOOL (2014)

  • 8 load zones
  • 1 reserves pool
  • 6 import/export interface units
  • 2 pumped storage units
  • 416 generation units (Nuclear, Hydro, Wind, Solar, CC, ST, GT)
  • 2612 transmission constraints
  • 11 PARs

PJM+MISO combined (2014)

  • 54 load zones + 88 industrial load units
  • 7 reserves pools
  • 39 import/export interface units
  • 8 pumped storage units
  • 1972 generation units (Nuclear, Hydro, Wind, Solar, Battery, CC, ST, GT)
  • 16161 transmission constraints
  • 37 PARs
  • 5 DC Lines

SLIDE 7

MUC Performance for NEPOOL Model

Machine A – 4 cores, E3-1240 V2 CPU @ 3.4 GHz, 32 GB memory, Windows 8 server 64 bit
Machine B – 8 cores, i7-5960X CPU @ 3 GHz (overclocked at 3.87 GHz), 32 GB memory, Windows 8.1 Pro 64 bit

Run time statistics over 365 days in 2014 (seconds / day):

Statistic     4 cores   8 cores
Mean               15        13
99th %tile         73        65
Max               253       251
Min                 4         4

SLIDE 8

MUC Solution Quality for NEPOOL Model

Simulation over 365 days in 2014. Duality gap at final solution (Target = 0.05%):

Duality gap   Days
< 0.01%          3
0.01%           10
0.02%           20
0.03%           48
0.04%          107
0.05%          176
0.21%            1

SLIDE 9

MUC Performance for PJM+MISO Model

Run time statistics over 90 days in Q1 2014 (seconds / day):

Statistic     4 cores   8 cores
Mean             1829      1491
95th %tile       2642      2231
Max              3906      3027
Min               760       579

The difference in run time performance is due to the faster CPU speed on the 8 core machine. A single MUC process cannot take advantage of multiple cores other than in incidental ways due to I/O and the presence of other workloads. These runs were performed with no other non-system tasks running concurrently with MUC.

SLIDE 10

MUC Solution Quality for PJM+MISO Model

[Histogram: number of days by duality gap at final solution (Target = 0.05%), bins from 0.02% to 0.58%, 4 cores versus 8 cores; simulation over 90 days in Q1 2014]

More MILP iterations were able to reach the target duality gap on the faster machine within the allowed maximum run time. The solver termination state (optimal vs. best found) differs for 18 days.

SLIDE 11

Typical MUC run time performance for a large model simulated for one year

[Chart: MUC run time in seconds (axis ticks up to 3500) for each day of the simulated year (days 1 to 365)]

Splitting the simulation into months or quarters and running each segment in parallel is the conventional approach to taking advantage of multi-core machines. It’s clear from the above timing pattern that a finer grained load balancing scheme can produce a much better overall run time performance.
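
The effect of finer grained load balancing can be illustrated with a small simulation. The sketch below compares the finishing time of the conventional split into contiguous blocks of days with a scheme in which each worker takes the next unsolved day as soon as it becomes free. The per-day run times are hypothetical and only mimic a year with a slow stretch; the function names are illustrative and do not describe DAYZER's scheduler.

```python
import heapq


def static_blocks_makespan(day_seconds, n_workers):
    """Conventional split: one contiguous block of days per worker."""
    n = len(day_seconds)
    bounds = [round(i * n / n_workers) for i in range(n_workers + 1)]
    return max(sum(day_seconds[a:b]) for a, b in zip(bounds, bounds[1:]))


def dynamic_makespan(day_seconds, n_workers):
    """Finer grained scheme: the first free worker takes the next day."""
    finish_times = [0.0] * n_workers   # min-heap of worker finish times
    heapq.heapify(finish_times)
    for seconds in day_seconds:
        earliest = heapq.heappop(finish_times)
        heapq.heappush(finish_times, earliest + seconds)
    return max(finish_times)


# Hypothetical run times: 90 slow days followed by 275 fast days.
day_seconds = [3000.0] * 90 + [800.0] * 275
static = static_blocks_makespan(day_seconds, 4)
dynamic = dynamic_makespan(day_seconds, 4)
print(f"static blocks: {static:.0f} s, day-level balancing: {dynamic:.0f} s")
```

With the static split, the worker that receives the slow stretch determines the total run time, while the day-level scheme finishes within one day's solve time of the ideal even split.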

SLIDE 12

Solution Architecture for Distributed And Parallel DAYZER

[Diagram components: Master Workstation; DAYZER Compute Nodes (Multi-core); MS MPI Interconnect over Private Network]

  • Simulation period load balanced across all cores at compute nodes using MPI.
  • Results can be sent to a central database or stored in local partial databases.
  • MPI based query tool allows locally stored results to be aggregated at Master.
  • MUC: each day assigned to a core at a node using single threaded MILP SCUC.
  • PUC: each day assigned to a multi-core node using Parallel SCUC.
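
The flow described in these bullets (days mapped to cores, results kept in local partial stores, then aggregated at the master) can be sketched in plain Python. This is a single-process stand-in, not MPI code: the round-robin assignment, `solve_day` and the in-memory stores are hypothetical simplifications, and the final merge only mimics what an MPI gather would deliver to the master.

```python
from collections import defaultdict


def assign_days(n_days, n_nodes, cores_per_node):
    """Map each day to a (node, core) slot in round-robin order."""
    slots = [(n, c) for n in range(n_nodes) for c in range(cores_per_node)]
    plan = defaultdict(list)
    for day in range(n_days):
        plan[slots[day % len(slots)]].append(day)
    return plan


def solve_day(day):
    """Placeholder for a single-threaded SCUC solve of one day."""
    return {"day": day, "cost": 1000.0 + day}


def run_node_local(plan):
    """Each node keeps the results of its own cores in a local store."""
    local_stores = defaultdict(list)
    for (node, _core), days in plan.items():
        local_stores[node].extend(solve_day(d) for d in days)
    return local_stores


def gather_at_master(local_stores):
    """Stand-in for the query tool: merge partial stores, sorted by day."""
    merged = [row for rows in local_stores.values() for row in rows]
    return sorted(merged, key=lambda row: row["day"])


plan = assign_days(n_days=10, n_nodes=2, cores_per_node=2)
results = gather_at_master(run_node_local(plan))
print([row["day"] for row in results])
```

Round-robin assignment is used here only for brevity; a scheme that hands out days as cores become free would balance uneven solve times better.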

SLIDE 13

DAYZER Parallel SCUC (PUC)

Solves the same problem as MUC but utilizes Lagrangian Relaxation Subgradient Optimization by decomposing the problem across time (hourly dispatch) as well as space (unit commitment). Some of the more distinctive aspects of our implementation are:

  • Target duality gap estimated by solving an initial relaxation problem.
  • Adaptive step size initialization and update heuristics incorporating the target gap estimate as well as a measure of the current over/under commit.
  • Early termination heuristics based on the target gap and step size update history.
  • Unit sub problems modeled and solved as MILP (same as in the global version).
  • Ramping constraints imposed on hourly dispatch using latest UC solutions.
  • A unit (partial) decommitment phase based on semi-global uplift minimization.
  • Coverage of all transmission constraints by adaptively modifying the dispatch LPs.
  • Pump storage optimization handled by updating UC for a fixed PS solution, then relaxing the associated PS constraints and updating their multipliers while UC is kept fixed. We then iterate over multiple cycles of this to achieve convergence.
  • Losses and Contingency Analysis calculations interleaved with UC iterations.
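
As background for the Lagrangian relaxation subgradient approach named on this slide, the following is a textbook-style sketch on a toy problem, not the PUC implementation. It relaxes only the hourly energy balance, solves each unit's subproblem in closed form instead of as a MILP, and uses a plain 1/k step size rather than the adaptive heuristics listed above. The unit data and load values are hypothetical.

```python
def unit_subproblem(unit, prices):
    """Best response of one unit to the hourly multipliers (prices)."""
    schedule, value = [], 0.0
    for lam in prices:
        # Margin if on at full output: revenue minus energy and no-load cost.
        margin = (lam - unit["cost"]) * unit["pmax"] - unit["noload"]
        on = margin > 0.0
        schedule.append(unit["pmax"] if on else 0.0)
        value += -margin if on else 0.0
    return schedule, value


def subgradient(units, load, iterations=200, step0=1.0):
    """Maximize the dual function over nonnegative hourly multipliers."""
    prices = [0.0] * len(load)
    best_bound = float("-inf")
    for k in range(1, iterations + 1):
        results = [unit_subproblem(u, prices) for u in units]
        supply = [sum(s[t] for s, _ in results) for t in range(len(load))]
        # The dual value is a lower bound on the optimal commitment cost.
        dual = sum(l * d for l, d in zip(prices, load)) + sum(v for _, v in results)
        best_bound = max(best_bound, dual)
        # Subgradient = unmet load; raise prices where supply falls short.
        step = step0 / k
        prices = [max(0.0, lam + step * (d - s))
                  for lam, d, s in zip(prices, load, supply)]
    return prices, best_bound


# Hypothetical toy data: two units, three hours.
UNITS = [
    {"cost": 20.0, "pmax": 100.0, "noload": 200.0},
    {"cost": 50.0, "pmax": 50.0, "noload": 100.0},
]
LOAD = [60.0, 120.0, 80.0]
prices, bound = subgradient(UNITS, LOAD)
print(prices, bound)
```

Because each unit responds only to prices, the unit subproblems are independent of one another, which is the property that lets this kind of method spread work across cores. The gap between the dual bound and a feasible schedule's cost is the duality gap that the heuristics on this slide aim to control.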

SLIDE 14

PUC performance on a small scale problem (NEPOOL) with Pump Storage Optimization

Statistics from runs on two different machines over 365 days in 2014

Run time (seconds / day):

                    |---------- 4 cores ----------|  |---------- 8 cores ----------|
Statistic     MIP    1 Cycle   2 Cycles   3 Cycles    1 Cycle   2 Cycles   3 Cycles
Mean           14       11        16         21           7        10         13
99th %tile     82       36        53         60          25        40         45

Fuel cost % gap wrt MUC:

Statistic     1 Cycle   2 Cycles   3 Cycles
Mean            0.59%      0.43%      0.39%
99th %tile      2.92%      2.08%      2.04%
Max             5.71%      2.81%      2.81%
Min            -0.63%     -0.53%     -0.53%

SLIDE 15

Results from same runs without Pump Storage highlight the large impact of these resources

Run time (seconds / day), mean:

        |---------- 4 cores ----------|  |---------- 8 cores ----------|
MUC      1 Cycle   2 Cycles   3 Cycles    1 Cycle   2 Cycles   3 Cycles
 15         10        15         20           7        10         13

Fuel cost % gap wrt MUC:

Statistic     1 Cycle   2 Cycles   3 Cycles
Mean            0.42%      0.38%      0.35%
99th %tile      2.44%      1.89%      1.60%
Min            -0.59%     -0.61%     -0.61%

  • The effective parallelization estimated from these runs is between 88% and 93%, which implies a speed up factor between 6 and 9 at 24 cores.

  • Even without PS optimization PUC solution quality improves with additional cycles.
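
The quoted speed up factors appear consistent with Amdahl's law, which bounds the speed up on N cores when a fraction p of the work can be parallelized. The slides do not name the formula, so treating it as the basis of these numbers is an assumption. The sketch below evaluates it for the parallel fractions mentioned in this talk.

```python
def amdahl_speedup(parallel_fraction, n_cores):
    """Speed up predicted by Amdahl's law for a given parallel fraction."""
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / n_cores)


for p in (0.88, 0.93, 0.98):
    print(f"p = {p:.2f}: speed up at 24 cores = {amdahl_speedup(p, 24):.1f}")
```

The serial fraction dominates quickly: moving from 93% to 98% parallel nearly doubles the attainable speed up at 24 cores.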

SLIDE 16

LMP comparison highlights the improvement gained from additional PUC cycles

[Scatter plot: NEPOOL daily load weighted average LMP (no PS), MUC versus PUC, for 1 cycle and 3 cycles; axis ticks from 50 to 500 on both axes]

RMS error: 1 cycle = 5.33, 2 cycles = 4.41, 3 cycles = 3.82
RMS error (w/PS): 1 cycle = 6.36, 2 cycles = 5.18, 3 cycles = 5.20
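
For readers who want to reproduce this kind of comparison on their own results, here is a minimal sketch of the two quantities involved: a load weighted average LMP and the RMS error between two daily series. The weighting across locations and the sample numbers are assumptions for illustration; the slides do not spell out the exact aggregation used.

```python
import math


def load_weighted_lmp(lmps, loads):
    """Average LMP weighted by the load at each location (or hour)."""
    return sum(p * l for p, l in zip(lmps, loads)) / sum(loads)


def rms_error(reference, candidate):
    """Root mean square difference between two equally long series."""
    squared = [(r - c) ** 2 for r, c in zip(reference, candidate)]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical daily averages from a reference run and an approximate run.
muc_daily = [40.0, 55.0, 120.0]
puc_daily = [43.0, 51.0, 120.0]
print(load_weighted_lmp([30.0, 50.0], [100.0, 300.0]))
print(rms_error(muc_daily, puc_daily))
```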

SLIDE 17

However, congestion pattern convergence may require even more cycles

Daily average of normalized hourly transmission rent =

\[
\frac{1}{24} \sum_{h} \frac{\sum_{c} \mathrm{SP}(h,c)\,\mathrm{Flow}(h,c)}{\sum_{c} \mathrm{Flow}(h,c)}
\]

[Scatter plots: daily average of normalized hourly transmission rent, MUC versus PUC, for 1, 2 and 3 cycles; one panel with axes from 1 to 6 and one with axes from 0.0 to 1.0]

RMS error: 1 cycle = 0.29, 2 cycles = 0.24, 3 cycles = 0.26
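
The congestion metric on this slide can be computed as sketched below. The sketch assumes the formula reads as the average over hours h of the rent summed over constraints c, SP(h,c) times Flow(h,c), divided by the total flow in that hour. That reading is a reconstruction from the slide, and the nested-list layout of the inputs and the sample values are hypothetical.

```python
def normalized_daily_rent(shadow_price, flow):
    """Average over hours of sum_c SP*Flow divided by sum_c Flow.

    shadow_price[h][c] and flow[h][c] are indexed by hour h and
    constraint c; a full day has 24 hourly rows.
    """
    hourly = []
    for sp_h, flow_h in zip(shadow_price, flow):
        total_flow = sum(flow_h)
        rent = sum(sp * f for sp, f in zip(sp_h, flow_h))
        # Hours with no flow contribute zero rather than dividing by zero.
        hourly.append(rent / total_flow if total_flow else 0.0)
    return sum(hourly) / len(hourly)


# Hypothetical two-hour example with two constraints.
sp = [[2.0, 0.0], [0.0, 4.0]]
fl = [[100.0, 300.0], [200.0, 200.0]]
print(normalized_daily_rent(sp, fl))
```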

SLIDE 18

PUC performance on a large scale problem (PJM+MISO combined) with Pump Storage Optimization

Run time (seconds / day):

                    |---------- 4 cores ----------|  |---------- 8 cores ----------|
Statistic     MUC    1 Cycle   2 Cycles   3 Cycles    1 Cycle   2 Cycles   3 Cycles
Mean         1829     1097      1575       2159         594       830       1091
95th %tile   2642     1590      2204       3214         882      1156       1632

Fuel cost % gap wrt MUC:

Statistic     1 Cycle   2 Cycles   3 Cycles
Mean            0.28%      0.31%      0.30%
95th %tile      0.76%      0.86%      0.97%
Max             1.37%      1.19%      1.44%
Min            -1.42%     -1.39%     -1.39%

The above timing results imply an effective parallelization of almost 98% and hence a speed up factor of nearly 16 at 24 cores.

SLIDE 19

LMP comparison shows much smaller impact of additional PUC cycles compared to NEPOOL

[Scatter plot: PJM + MISO daily load weighted average LMP, MUC versus PUC, for 1 cycle and 3 cycles; axis ticks from 20 to 200 on both axes]

RMS error: 1 cycle = 1.84, 2 cycles = 1.92, 3 cycles = 1.84
RMS error for MUC LMP < 100: 1 cycle = 1.84, 2 cycles = 1.87, 3 cycles = 1.75

SLIDE 20

Improvement in the alignment of congestion patterns beyond 2 cycles is not uniform

[Scatter plot: daily average of normalized hourly transmission rent, MUC versus PUC, for 1, 2 and 3 cycles; both axes from 0.0 to 1.4]

Annotations on the plot:

  • Converging but additional cycles are required
  • Converging to similar congestion pattern at a higher gap
  • PUC finds better solution
  • Lower cost solution diverging

RMS error: 1 cycle = 0.0618, 2 cycles = 0.0544, 3 cycles = 0.0554
RMS error for MUC rent < 0.5: 1 cycle = 0.0389, 2 cycles = 0.0337, 3 cycles = 0.0358

SLIDE 21

However, additional cycles may have a benefit depending on how the results are used

[Chart: average load weighted LMP (axis from 20 to 180) as a function of average normalized transmission rent (axis from 0.0 to 1.2), with exponential fits for PUC 1 cycle, PUC 3 cycles and MUC]

The fit with PUC after 3 cycles is very close to the fit with MUC
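
One common way to obtain an exponential fit of this kind is ordinary least squares on the logarithm of the dependent variable. The slides do not say which fitting method was used, so the sketch below is an assumption, and the sample data are synthetic.

```python
import math


def fit_exponential(x, y):
    """Fit y = a * exp(b * x) by least squares on log(y); requires y > 0."""
    n = len(x)
    log_y = [math.log(v) for v in y]
    mean_x = sum(x) / n
    mean_ly = sum(log_y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (ly - mean_ly) for xi, ly in zip(x, log_y))
    b = sxy / sxx
    a = math.exp(mean_ly - b * mean_x)
    return a, b


# Synthetic data generated from a known curve, a = 30 and b = 1.5.
rent = [0.1, 0.3, 0.5, 0.8, 1.1]
lmp = [30.0 * math.exp(1.5 * r) for r in rent]
a, b = fit_exponential(rent, lmp)
print(round(a, 3), round(b, 3))
```

Comparing the fitted (a, b) pairs from two runs gives a compact measure of how closely their price versus congestion relationships agree.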

SLIDE 22

Solutions close together in terms of fuel cost may still differ significantly in prices

[Bubble chart: LMP difference (axis from -6.00 to 6.00) against congestion difference (MUC – PUC) (axis from -0.10 to 0.10). Bubble width indicates % fuel cost gap wrt MUC solution for PUC after 3 cycles.]

SLIDE 23

Uplift solutions are comparable in most cases

[Scatter plot: uplift as a percentage of generation revenue (PS excluded), MUC versus PUC; both axes from 0.00% to 2.50%. Legend: Gap < 0%, Gap ≥ 0.5%, Gap ≥ 1%]

Data not shown:

% Gap     PUC     MUC
 0.73     0.51    5.53
 0.49     2.78    3.21
-1.42     0.47    2.55
 1.00     2.83    2.10

SLIDE 24

Conclusions

  • PUC overall solution quality comes close to that of MUC and is probably acceptable for a range of applications where run time performance is more critical.
  • In addition, a final MUC pass with constraints and initial solution developed via a single cycle PUC can be used to improve both solution quality and run time performance for larger models.
  • The combination of distributed and parallel techniques as proposed here can be used to create a flexible and scalable computing environment that can handle the large workloads required to effectively explore the dynamics of multiple interacting energy markets without having to sacrifice on modeling fidelity.