High-dimensional modeling and forecasting for wind power generation - PowerPoint PPT Presentation

High-dimensional modeling and forecasting for wind power generation. Jakob Messner∗, Pierre Pinson∗, Yongning Zhao†,∗ (∗ Technical University of Denmark, † China Agricultural University; authors in alphabetical order). Contact - email: ppin@elektro.dtu.dk - webpage: www.pierrepinson.com. YEQT Winter School on Energy Systems - 13 December 2017 1 / 46

Outline: Motivations for high-dimension learning and forecasting; General sparsity control for VAR models; Online sparse and adaptive learning for VAR models; Distributed learning; Outlook 2 / 46

1 From single wind farms to entire regions (1000s) 3 / 46

A traditional view on wind power forecasting The wind power forecasting problem is defined for a single location... ... or, if several locations, by considering each of them individually (Note that, for simplicity, we will only look at very short-term forecasting in this talk, i.e., from a few mins to 1-hour ahead) 4 / 46

Wind farms as a network of sensors. Many works showed that forecast quality could be significantly improved by using data at offsite locations (i.e., other wind farms), based on spatio-temporal modelling (and the like). A Danish example: accounting for spatio-temporal effects allows for the correction of aggregated power forecasts for horizons up to 8 hours ahead, with the largest improvements at horizons of 2-5 hours ahead. [Figure: improvement of 1-hour ahead forecast RMSE] 5 / 46

Scaling it up. Ultimately, we would like to predict all wind power generation, also solar and load, at the scale of a continental power system, e.g. the European one. [Figure legend: Coal, Natural Gas, Fuel Oil, Nuclear, Hydro, Lignite, Unknown] RE-Europe dataset, available at zenodo.org, descriptor in Nature, Scientific Data 6 / 46

The big picture... The “grand forecasting challenge”: predict renewable power generation, dynamic uncertainties and space-time dependencies at once for the whole of Europe! Linkage with future electricity markets: monitoring and forecasting of the complete “Energy Weather” over Europe provides all necessary information for coupling of various existing markets (e.g., day-ahead, balancing), and for deciding upon optimal cross-border exchanges 7 / 46

2 A proposal for general sparsity control (not online though) 8 / 46

Sparsity-controlled vector autoregressive (SC-VAR) model
Traditional LASSO-VAR can only provide overall sparse solutions, but does not allow for fine-tuning different aspects of sparsity, e.g.:
- $S_A$: overall number of nonzero coefficients of the VAR, i.e. the LASSO-VAR
- $S_F^i$: number of explanatory wind farms used in the VAR to explain target wind farm $i$
- $S_P^i$: number of past observations of each explanatory wind farm used to explain target wind farm $i$
- $S_N^i$: number of nonzero coefficients used to explain target wind farm $i$
[Figure: panels for k = 1 and k = 2]
These aspects can be used to control the sparse structure of the solution as needed, especially when prior knowledge on the spatio-temporal characteristics of the wind farms is available for sparsity control and is expected to improve the forecasts. 9 / 46
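
To make the four sparsity aspects concrete, the sketch below counts them on a given coefficient array. It is only an illustration: the array layout `alpha[i, j, k]` (target farm i, explanatory farm j, lag k+1), the function name `sparsity_aspects` and the example pattern are assumptions, not part of the slides. The returned values are the counts actually observed in a solution; in the model, $S_A$, $S_F^i$, $S_P^i$ and $S_N^i$ act as upper bounds on these counts.

```python
import numpy as np

def sparsity_aspects(alpha, tol=0.0):
    """Count the sparsity aspects of a VAR coefficient array.

    alpha[i, j, k] is the (assumed) coefficient of lag k+1 of farm j
    in the equation of target farm i; shape (N, N, p).
    """
    nonzero = np.abs(alpha) > tol            # boolean pattern, shape (N, N, p)
    s_a = int(nonzero.sum())                 # overall number of nonzero coefficients
    s_f = nonzero.any(axis=2).sum(axis=1)    # explanatory farms used, per target farm
    s_p = nonzero.sum(axis=2).max(axis=1)    # most lags used from any single farm, per target farm
    s_n = nonzero.sum(axis=(1, 2))           # nonzero coefficients, per target farm
    return s_a, s_f, s_p, s_n

# Hypothetical pattern: 3 farms, 2 lags, farms 1 and 3 explain each other,
# farm 2 is explained by itself only.
used = np.array([[1, 0, 1],
                 [0, 1, 0],
                 [1, 0, 1]])
alpha = used[:, :, None] * np.ones((3, 3, 2))
s_a, s_f, s_p, s_n = sparsity_aspects(alpha)
```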

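Sparsity-controlled vector autoregressive (SC-VAR) model
How to freely control the sparse structure... [E. Carrizosa, et al. 2017]
Introducing binary control variables $\gamma^i_j$ and $\delta^i_{jk}$:
- $\gamma^i_j$ controls whether wind farm $j$ is used to explain target wind farm $i$.
- $\delta^i_{jk}$ controls whether the coefficient $\alpha^i_{jk}$ is zero or not.
Reformulating the VAR estimation as a constrained mixed integer non-linear programming (MINLP) problem.
For example: N = 3 wind farms, VAR(2) with p = 2 lags
$$
\begin{pmatrix} \gamma^1_1 & \gamma^1_2 & \gamma^1_3 \\ \gamma^2_1 & \gamma^2_2 & \gamma^2_3 \\ \gamma^3_1 & \gamma^3_2 & \gamma^3_3 \end{pmatrix}
=
\begin{pmatrix} 1 & 0 & 1 \\ 0 & 1 & 0 \\ 1 & 0 & 1 \end{pmatrix}
\;\Rightarrow\;
A =
\begin{pmatrix}
\alpha^1_{11} & 0 & \alpha^1_{31} & \alpha^1_{12} & 0 & \alpha^1_{32} \\
0 & \alpha^2_{21} & 0 & 0 & \alpha^2_{22} & 0 \\
\alpha^3_{11} & 0 & \alpha^3_{31} & \alpha^3_{12} & 0 & \alpha^3_{32}
\end{pmatrix}
$$
If additionally the control variable $\delta^3_{11} = 0$, then
$$
A =
\begin{pmatrix}
\alpha^1_{11} & 0 & \alpha^1_{31} & \alpha^1_{12} & 0 & \alpha^1_{32} \\
0 & \alpha^2_{21} & 0 & 0 & \alpha^2_{22} & 0 \\
0 & 0 & \alpha^3_{31} & \alpha^3_{12} & 0 & \alpha^3_{32}
\end{pmatrix}
$$
That is:
$$
\delta^i_{jk} = 0 \;\Leftrightarrow\; \alpha^i_{jk} = 0, \qquad
\gamma^i_j = 0 \;\Leftrightarrow\; \sum_{k=1}^{p} \delta^i_{jk} = 0
$$
10 / 46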
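
The 3-farm, 2-lag example can be reproduced numerically. The sketch below builds the zero/nonzero pattern of $A = [A_1 \mid A_2]$ from the binary control variables; the array layout (`gamma[i, j]`, `delta[i, j, k]`, zero-based indices) and the helper name `pattern` are assumptions made for illustration.

```python
import numpy as np

# gamma[i, j] = 1 if farm j may explain target farm i (values from the example)
gamma = np.array([[1, 0, 1],
                  [0, 1, 0],
                  [1, 0, 1]])
p = 2

# Start by allowing every lag of every selected farm: delta[i, j, k] = gamma[i, j]
delta = np.repeat(gamma[:, :, None], p, axis=2)

# Additionally switch off delta^3_{11}: target farm 3, explanatory farm 1, lag 1
delta[2, 0, 0] = 0

def pattern(delta):
    """Nonzero pattern of A = [A_1 | ... | A_p], one block of N columns per lag."""
    n_lags = delta.shape[2]
    return np.concatenate([delta[:, :, k] for k in range(n_lags)], axis=1)

A_pattern = pattern(delta)

# The linking constraint delta^i_{jk} <= gamma^i_j holds by construction
linking_ok = bool(np.all(delta <= gamma[:, :, None]))
```

Each row of `A_pattern` marks which coefficients of the corresponding target farm are free to be nonzero; the third row loses its first entry once $\delta^3_{11}$ is set to zero.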

Sparsity-controlled vector autoregressive (SC-VAR) model
$$
\begin{aligned}
\min_{\alpha,\delta,\gamma}\quad & \sum_{i=1}^{N} \sum_{t=p}^{T} \Big( y_{i,t+1} - \sum_{j=1}^{N} \sum_{k=1}^{p} \alpha^i_{jk}\, y_{j,t-k+1} \Big)^2 \\
\text{subject to}\quad & \delta^i_{jk} \le \gamma^i_j, && \forall k \in K,\; i,j \in I \\
& \sum_{j=1}^{N} \gamma^i_j \le S_F^i, && \forall i \in I \\
& \sum_{k=1}^{p} \gamma^i_j\, \delta^i_{jk} \le S_P^i, && \forall i,j \in I \\
& \sum_{i=1}^{N} \sum_{j=1}^{N} \sum_{k=1}^{p} \delta^i_{jk} \le S_A \\
& \sum_{j=1}^{N} \sum_{k=1}^{p} \delta^i_{jk} \le S_N^i, && \forall i \in I \\
& \big|\alpha^i_{jk}\big| \ge \eta^i_j\, \delta^i_{jk}, && \forall k \in K,\; i,j \in I \\
& \alpha^i_{jk}\,\big(1 - \delta^i_{jk}\big) = 0, && \forall k \in K,\; i,j \in I \\
& \delta^i_{jk},\, \gamma^i_j \in \{0, 1\}, && \forall k \in K,\; i,j \in I
\end{aligned}
$$
with $I = \{1, 2, \cdots, N\}$ and $K = \{1, 2, \cdots, p\}$.
Notations:
- $S_A$ - overall number of nonzero coefficients of the VAR
- $S_F^i$ - number of explanatory wind farms used in the VAR to explain target wind farm $i$
- $S_P^i$ - number of past observations of each explanatory wind farm used to explain target wind farm $i$
- $S_N^i$ - number of nonzero coefficients used to explain target wind farm $i$
- $\eta^i_j$ - a threshold requiring that only coefficients with absolute value greater than or equal to $\eta^i_j$ are effective; otherwise they are set to zero.
11 / 46
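
For intuition on what a cardinality constraint such as $\sum_j \sum_k \delta^i_{jk} \le S_N^i$ does, the sketch below solves a tiny single-farm version by exhaustive enumeration of supports followed by least squares. This is not the MINLP formulation above and not the solver used by the authors; it only covers the $S_N^i$ constraint, scales exponentially, and the names `best_subset_row`, `X`, `y` and `s_n` are hypothetical.

```python
import itertools
import numpy as np

def best_subset_row(X, y, s_n):
    """Least squares for one target farm with at most s_n nonzero coefficients.

    X: (T, N*p) matrix of lagged observations, y: (T,) target series.
    Every support of size <= s_n is tried (only feasible for tiny problems).
    """
    n_feat = X.shape[1]
    best_sse, best_coef = float(y @ y), np.zeros(n_feat)   # empty support
    for size in range(1, s_n + 1):
        for support in itertools.combinations(range(n_feat), size):
            cols = list(support)
            coef, *_ = np.linalg.lstsq(X[:, cols], y, rcond=None)
            resid = y - X[:, cols] @ coef
            sse = float(resid @ resid)
            if sse < best_sse:
                best_sse = sse
                best_coef = np.zeros(n_feat)
                best_coef[cols] = coef
    return best_coef, best_sse

# Synthetic check: only two of six regressors truly matter
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 6))
y = 2.0 * X[:, 1] - 3.0 * X[:, 4]
coef, sse = best_subset_row(X, y, s_n=2)
```

A MINLP solver explores the same combinatorial space with branch-and-bound rather than full enumeration, which is why the number of binary variables matters so much in high dimension.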

Pros and cons of SC-VAR model
Pros
- allows for fully controlling the sparsity from different aspects.
- can be directly solved by off-the-shelf standard MINLP solvers.
Cons
- SC-VAR allows for sparsity control but does not tell how to control it. No information is available for setting so many parameters, which is practically intractable when dealing with high-dimensional wind power forecasting.
- The constraint $\sum_{k=1}^{p} \gamma^i_j \delta^i_{jk} \le S_P^i$ is nonlinear.
- The constraints are redundant: $S_F^i + S_P^i = S_N^i$, $\sum_{i \in I} S_N^i = S_A$.
- The constraint $\sum_i \sum_j \sum_k \delta^i_{jk} \le S_A$ makes the optimization problem non-decomposable, which slows down the computation.
- Too many variables to be optimized: VAR coefficients $\alpha^i_{jk}$, binary control variables $\gamma^i_j$ and $\delta^i_{jk}$.
(Note that, though $|\alpha^i_{jk}| \ge \eta^i_j \delta^i_{jk}$ and $\alpha^i_{jk}(1 - \delta^i_{jk}) = 0$ are also nonlinear, [E. Carrizosa, et al. 2017] provides a linearized reformulation for them.) 12 / 46

Correlation-constrained SC-VAR (CCSC-VAR) model
Incorporate explicit spatial correlation information into the constraints!
$$
\begin{aligned}
\min_{\alpha,\delta}\quad & \sum_{i=1}^{N} \sum_{t=p}^{T} \Big( y_{i,t+1} - \sum_{j=1}^{N} \sum_{k=1}^{p} \alpha^i_{jk}\, y_{j,t-k+1} \Big)^2 \\
\text{subject to}\quad & \delta^i_{jk} \le \lambda^i_j, && \forall k \in K,\; i,j \in I \\
& \sum_{k=1}^{p} \delta^i_{jk} \ge \lambda^i_j, && \forall i,j \in I \\
& \sum_{j=1}^{N} \sum_{k=1}^{p} \delta^i_{jk} \le S_N^i, && \forall i \in I \\
& \big|\alpha^i_{jk}\big| \le M \cdot \delta^i_{jk}, && \forall k \in K,\; i,j \in I \\
& \delta^i_{jk} \in \{0, 1\}, && \forall k \in K,\; i,j \in I
\end{aligned}
$$
where
$$
\lambda^i_j = \begin{cases} 1, & \phi^i_j \ge \tau \\ 0, & \phi^i_j < \tau \end{cases}
\qquad\text{and}\qquad
\big|\alpha^i_{jk}\big| \le M \cdot \delta^i_{jk} \;\Leftrightarrow\;
\begin{cases} -M \le \alpha^i_{jk} \le M, & \delta^i_{jk} = 1 \\ \alpha^i_{jk} = 0, & \delta^i_{jk} = 0 \end{cases}
$$
Notations:
- $\phi^i_j$ is the Pearson correlation between wind farms $i$ and $j$.
- $M$ is a positive constant (generally $M < 2$).
- $\tau$ and $S_N^i$ are used to control sparsity.
Improvements (simpler but better!):
- Fewer parameters need to be tuned while the sparsity-control ability is preserved.
- More capable of characterizing the true inter-dependencies between wind farms.
- Fewer variables to be optimized.
- All constraints are linear.
- The model is decomposable.
13 / 46
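
The indicator $\lambda^i_j$ is a fixed input of the optimization rather than a decision variable, so it can be computed once from the data. The sketch below shows one way to do so with NumPy; the function name `correlation_mask`, the data layout (one column per wind farm) and the synthetic series are assumptions for illustration.

```python
import numpy as np

def correlation_mask(Y, tau):
    """lam[i, j] = 1 if the Pearson correlation between farms i and j is >= tau, else 0.

    Y: (T, N) array with one column of power measurements per wind farm.
    """
    phi = np.corrcoef(Y, rowvar=False)   # (N, N) Pearson correlation matrix
    return (phi >= tau).astype(int)

# Synthetic data: farms 0 and 1 are strongly correlated, farm 2 is independent
rng = np.random.default_rng(1)
a = rng.normal(size=500)
b = rng.normal(size=500)
Y = np.column_stack([a, a + 0.1 * rng.normal(size=500), b])

lam = correlation_mask(Y, tau=0.5)

# Candidate explanatory farms for target farm 0: only these may receive
# nonzero coefficients, since delta^i_{jk} <= lam^i_j
candidates_farm0 = np.flatnonzero(lam[0])
```

Because every remaining constraint involves a single target farm $i$, the estimation can then be run farm by farm over its own candidate set, which is what makes the model decomposable.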

Application and case study
Compared models:
- Local forecasting models: Persistence method, Auto-Regressive (AR) model
- Spatio-temporal models: VAR model, LASSO-VAR model, SC-VAR model, CCSC-VAR model
Performance metrics:
- Root Mean Square Error (RMSE)
- Mean Absolute Error (MAE)
- Sparsity for spatial models
Data:
- 25 wind farms randomly chosen over western Denmark
- 15-minute resolution
- 20,000 data points for each wind farm
14 / 46

Application and case study
Table: The average RMSE and MAE for all 25 wind farms for different forecasting models
Metrics         Persistence  AR       VAR      LASSO-VAR  SC-VAR   CCSC-VAR
Average RMSE    0.34843      0.34465  0.33156  0.33100    0.33080  0.33058
Average MAE     0.22158      0.22718  0.22631  0.22557    0.22490  0.22408
Model sparsity  n/a          n/a      0        0.9248     0.8100   0.7504
From the table and boxplot:
- All of the spatio-temporal models outperform the local models in terms of average RMSE (in terms of average MAE, Persistence remains lowest).
- LASSO-VAR has the highest sparsity but the lowest accuracy among the sparse models.
- The CCSC-VAR model has the lowest sparsity among the sparse models.
- The CCSC-VAR model has the lowest average RMSE for the 25 wind farms.
- The minimum, maximum and average improvements of CCSC-VAR are the highest among these models.
[Figure: boxplot of RMSE improvement over the Persistence method] 15 / 46
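
The metrics in the table can be sketched as follows. The definitions of RMSE and MAE are standard; the sparsity definition (share of zero coefficients, so that a dense VAR scores 0) and the relative-improvement formula are assumptions, since the slides do not spell them out. Function names and the toy series are illustrative only.

```python
import numpy as np

def rmse(y, yhat):
    """Root mean square error."""
    return float(np.sqrt(np.mean((y - yhat) ** 2)))

def mae(y, yhat):
    """Mean absolute error."""
    return float(np.mean(np.abs(y - yhat)))

def sparsity(A, tol=0.0):
    """Share of coefficients that are zero (assumed definition)."""
    return float(np.mean(np.abs(A) <= tol))

def improvement(score_ref, score_model):
    """Relative improvement over a reference score, in percent (assumed definition)."""
    return 100.0 * (score_ref - score_model) / score_ref

# Persistence forecast on a toy series: the forecast for t+1 is the value at t
y = np.array([0.2, 0.4, 0.1, 0.5])
persistence_rmse = rmse(y[1:], y[:-1])
persistence_mae = mae(y[1:], y[:-1])

# Improvement of CCSC-VAR over Persistence using the average RMSE values of the table
ccsc_vs_persistence = improvement(0.34843, 0.33058)
```

With the table values this gives an improvement of roughly 5% in average RMSE. That number is computed from averages over the 25 farms and need not match the per-farm improvements summarized in the boxplot.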

3 Online sparse and adaptive learning for VAR models 16 / 46

(Lasso) vector auto regression
Power output depends on previous outputs at the wind farm itself and other wind farms:
$$ y_n = \sum_{l=1}^{L} A_l\, y_{n-l} + \epsilon_n $$
Minimize
$$ \sum_{n=1}^{T} \Big\| \sum_{l=1}^{L} A_l\, y_{n-l} - y_n \Big\|_2^2 $$
17 / 46
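
The unpenalized criterion is an ordinary least-squares problem once the lagged observations are stacked into a design matrix. The sketch below is a minimal illustration under assumptions: the data layout (`Y` of shape (T, N)), the function names `lagged_design` and `fit_var`, and the simulated VAR(1) are not from the slides, and the sum only runs over the time steps for which all L lags are available.

```python
import numpy as np

def lagged_design(Y, L):
    """Stack lagged observations.

    Y: (T, N) array. Returns X of shape (T-L, N*L), whose row for time n is
    [y_{n-1}, ..., y_{n-L}], and the matching targets Y[L:].
    """
    T = Y.shape[0]
    X = np.hstack([Y[L - l:T - l] for l in range(1, L + 1)])
    return X, Y[L:]

def fit_var(Y, L):
    """Least-squares VAR(L) fit; returns A = [A_1 | ... | A_L] of shape (N, N*L)."""
    X, target = lagged_design(Y, L)
    B, *_ = np.linalg.lstsq(X, target, rcond=None)   # target ~ X @ B, with B = A^T
    return B.T

# Simulated VAR(1) with a hypothetical coefficient matrix
rng = np.random.default_rng(0)
A_true = np.array([[0.5, 0.2],
                   [0.0, 0.3]])
Y = np.zeros((3000, 2))
for n in range(1, Y.shape[0]):
    Y[n] = A_true @ Y[n - 1] + rng.normal(size=2)

A_hat = fit_var(Y, L=1)
```

With N wind farms and L lags this estimates N·N·L coefficients, all of them generally nonzero, which is the motivation for adding a sparsity-inducing penalty.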

(Lasso) vector auto regression
Power output depends on previous outputs at the wind farm itself and other wind farms:
$$ y_n = \sum_{l=1}^{L} A_l\, y_{n-l} + \epsilon_n $$
Minimize
$$ \sum_{n=1}^{T} \Big\| \sum_{l=1}^{L} A_l\, y_{n-l} - y_n \Big\|_2^2 + \lambda \sum_{l=1}^{L} \| A_l \| $$
which yields sparse coefficient matrices $A_l$.
19 / 46
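
One way to minimize the penalized criterion is proximal gradient descent (ISTA), sketched below. This is a batch illustration, not the online algorithm of the talk. It assumes the penalty norm is the elementwise ℓ1 norm, as is usual for the Lasso; the function names, the fixed iteration count and the simulated data are likewise assumptions.

```python
import numpy as np

def soft_threshold(Z, t):
    """Proximal operator of t * ||.||_1 (elementwise shrinkage towards zero)."""
    return np.sign(Z) * np.maximum(np.abs(Z) - t, 0.0)

def lasso_var_ista(Y, L, lam, n_iter=500):
    """Minimize ||X B - target||_F^2 + lam * ||B||_1 by proximal gradient steps.

    Y: (T, N) array. Returns A = B^T = [A_1 | ... | A_L] of shape (N, N*L).
    """
    T = Y.shape[0]
    X = np.hstack([Y[L - l:T - l] for l in range(1, L + 1)])   # lagged regressors
    target = Y[L:]
    B = np.zeros((X.shape[1], Y.shape[1]))
    # Step size 1 / Lipschitz constant of the gradient of the quadratic term
    step = 1.0 / (2.0 * np.linalg.norm(X, 2) ** 2)
    for _ in range(n_iter):
        grad = 2.0 * X.T @ (X @ B - target)
        B = soft_threshold(B - step * grad, step * lam)
    return B.T

# Simulated VAR(1) with a hypothetical sparse coefficient matrix
rng = np.random.default_rng(0)
A_true = np.array([[0.5, 0.0, 0.0],
                   [0.3, 0.4, 0.0],
                   [0.0, 0.0, 0.6]])
Y = np.zeros((2000, 3))
for n in range(1, Y.shape[0]):
    Y[n] = A_true @ Y[n - 1] + rng.normal(scale=0.1, size=3)

A_dense = lasso_var_ista(Y, L=1, lam=0.0)    # no penalty: least-squares solution
A_sparse = lasso_var_ista(Y, L=1, lam=1.0)   # penalized: coefficients shrunk towards zero
```

Larger values of `lam` shrink more coefficients exactly to zero, and a sufficiently large value sets all of them to zero. In practice the penalty is usually tuned by validation.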
