Receive 4 forecasts from different NWP models. One problem is - PowerPoint PPT Presentation

  1. GL Garrad Hassan: Short-term power forecasts for large offshore wind turbine arrays. We require accurate wind (and hence power) forecasts for 4, 24 and 48 hours ahead for trading purposes. We receive 4 forecasts from different NWP models. One problem is resolution: UK NWP models have a grid size of 4 square km, whereas a typical wind farm may cover a few hundred square km. We have no knowledge of the NWP models themselves, but we have (some) on-site measurements to combine with the model outputs to improve the forecast.

  2. NWP versus On-site measurements

  3. Approaches split into two camps:
     Modelling
     • Auto regressive model
     • 4D-Var data assimilation
     • Artificial Neural Networks
     Statistical analysis
     • Linear regression analysis
     • Probabilistic forecasting

  4. Linear regression analysis
     • Bias correction
     • “Optimal NWP” as a linear combination of NWPs:
       $$\mathrm{NWP}_{\mathrm{optimal}} = w_1\,\mathrm{NWP}_1 + w_2\,\mathrm{NWP}_2 + w_3\,\mathrm{NWP}_3 + w_4\,\mathrm{NWP}_4$$
     • Obtain the weights $w_i$ using a least squares fit to the measurements from the first half of 2012. Forecast for the second half.
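
The least squares fit for the combination weights can be sketched as follows. This is a minimal illustration on synthetic stand-in data, not the study's data or code; the function names, the noise levels and the train/test split by array halves are all assumptions made for the example.

```python
import numpy as np

def fit_nwp_weights(nwp_train, obs_train):
    """Least-squares weights w so that nwp_train @ w approximates obs_train."""
    w, *_ = np.linalg.lstsq(nwp_train, obs_train, rcond=None)
    return w

def rmse(pred, obs):
    """Root mean square error between two equal-length arrays."""
    return float(np.sqrt(np.mean((pred - obs) ** 2)))

# Synthetic stand-in data (hypothetical): a "true" wind speed series and
# four noisy forecasts of it, one column per NWP model.
rng = np.random.default_rng(0)
truth = 8.0 + 3.0 * rng.standard_normal(2000)
noise_sd = np.array([1.8, 1.9, 1.8, 2.2])
nwp = truth[:, None] + noise_sd * rng.standard_normal((2000, 4))

# Fit on the first half, evaluate on the second half.
half = len(truth) // 2
w = fit_nwp_weights(nwp[:half], truth[:half])
combined = nwp[half:] @ w

rmse_single = [rmse(nwp[half:, k], truth[half:]) for k in range(4)]
rmse_combined = rmse(combined, truth[half:])
```

Here the fit has no intercept, matching the formula for the optimal NWP; a bias term could be added by appending a column of ones to the forecast matrix.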

  5. Linear regression analysis – errors for second half of 2012 (RMS error)

     | Forecast | NWP 1 | NWP 2 | NWP 3 | NWP 4 | NWP optimal |
     |---|---|---|---|---|---|
     | 48 hour | 1.84 | 1.92 | 1.82 | 2.23 | 1.68 |
     | 30 hour | 1.63 | 1.73 | 1.59 | 1.77 | 1.46 |
     | 4 hour | 1.49 | 1.52 | 1.28 | 1.58 | 1.28 |

  6. Auto regressive model

  7. ANN for SCADA data
     • SCADA data is useful on a very short time scale (<8 hours).
     • Uses smoothed measured wind speed data: hourly averages of 10 min averages.
     • The simulation uses the average wind speeds from the last K hours (K = 1..96).
     • The network is trained on N vectors of length K (N = 1000..10000).
     • It estimates the wind speed at an H hour horizon (H = 1..8).

  8. ANN simulation results
     • The RMSE of the ANN is smaller than that of the average of the estimates from the four NWP models:
       – RMSE of NWP average: 1.5767
       – RMSE of estimation: 1.3938 (about 12% improvement)
       – RMSE of persistence: 2.8818
       – Standard deviation: 2.271

  9. Data Assimilation Observations

  10. Background Estimate $v_c$

  11. Analysis Vector (Optimal Solution) $v_b$, with background estimate $v_c$

  12. Re-run periodically

  13. Resulting Equation
      • 4D-Var Data Assimilation cost function:
        $$K(v_0) = (v_c - v_0)^{\mathsf T} C^{-1} (v_c - v_0) + \sum_{m=0}^{M-1} [z_m - I_m(v_m)]^{\mathsf T} S_m [z_m - I_m(v_m)]$$
      • Optimal solution:
        $$v_b = \arg\min_{v_0} K(v_0)$$

  14. Data for this Problem
      • Aim: estimate the optimal initial condition for the ARMA model, $y_0$.
      • The model is
        $$y_o = \sum_{j=1}^{n} b_j y_{o-j} + 𝜊_{o+j}$$
      • To find the optimal solution from the model, we need the optimal parameters to estimate $y_0$ ($b_j$ and $𝜊_{o+j}$ are fixed by training data). These are $y_{-1}, \dots, y_{-n}$.
      • So let $v_m = (y_{m-1}, y_{m-2}, \dots, y_{m-n})^{\mathsf T}$, which implies $v_0 = (y_{-1}, y_{-2}, \dots, y_{-n})^{\mathsf T}$.
      • Use NWP data to find $v_c$, the a priori information that constrains the solution.
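
For a purely autoregressive model the cost function K(v_0) is quadratic in v_0, so its minimiser can be found by solving a linear system. The sketch below works under several assumptions that are not stated on the slides: the noise term is dropped when propagating the state, each observation z_m is a single measured wind speed compared with the model value b·v_m (this plays the role of I_m), and S_m is a scalar weight. All function names and numbers are hypothetical.

```python
import numpy as np

def companion(b):
    """Matrix A with v_{m+1} = A v_m for v_m = (y_{m-1}, ..., y_{m-n})."""
    n = len(b)
    A = np.zeros((n, n))
    A[0, :] = b                   # new first entry: y_m = sum_j b_j y_{m-j}
    A[1:, :-1] = np.eye(n - 1)    # older values shift down by one place
    return A

def cost(v0, b, v_c, C, z, S):
    """K(v_0): background term plus weighted observation misfits."""
    b = np.asarray(b, dtype=float)
    A = companion(b)
    d = np.asarray(v_c, dtype=float) - v0
    total = d @ np.linalg.inv(C) @ d
    v = np.asarray(v0, dtype=float)
    for z_m, S_m in zip(z, S):
        total += S_m * (z_m - b @ v) ** 2   # assumed I_m(v_m) = b . v_m
        v = A @ v                           # propagate without noise
    return float(total)

def analysis_vector(b, v_c, C, z, S):
    """Minimiser of K(v_0), obtained from the normal equations."""
    b = np.asarray(b, dtype=float)
    A = companion(b)
    C_inv = np.linalg.inv(C)
    lhs = C_inv.copy()
    rhs = C_inv @ np.asarray(v_c, dtype=float)
    g = b.copy()                  # g_m = (A^m)^T b, so I_m(v_m) = g_m . v_0
    for z_m, S_m in zip(z, S):
        lhs += S_m * np.outer(g, g)
        rhs += S_m * z_m * g
        g = A.T @ g
    return np.linalg.solve(lhs, rhs)

# Hypothetical example values (not from the study).
b = [0.6, 0.3]
v_c = np.array([8.0, 7.5])        # background estimate, e.g. from NWP data
C = np.eye(2)
z = [7.0, 7.2, 7.1]               # on-site measurements
S = [4.0, 4.0, 4.0]
v_b = analysis_vector(b, v_c, C, z, S)
```

Re-running this periodically with fresh measurements gives an updated analysis vector each time. A nonlinear model or observation operator would need an iterative minimiser instead of a single linear solve.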

  15. Making Probabilistic Forecasts for wind activity
      • Given point forecasts from 4 models and observations, how can we make probabilistic forecasts of wind speed?
      • First step: if we don't have any specific knowledge of the future, we can naively look at past observations. We call this the 'climatology'.
      • We expect any useful model to do better than climatology.

  16. Our simple approach (due to time constraints) is to create a climatological distribution modelled as a Gaussian distribution from a whole year's data.

  17. We could also use climatology that is month or even date specific (data permitting).

  18. Moving Variance of wind speed over a year

  19. Ignorance Skill Score
      • To compare our models we use the 'ignorance score' (Good, 1952), given by $\mathrm{ign} = -\log_2(p)$, where p is the amount of probability assigned to the true outcome.
      • All ignorance scores are given relative to climatology (a negative score means we are doing better than climatology).
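
As a worked illustration of the score, the sketch below computes the ignorance and its value relative to climatology. It assumes that for a continuous forecast p is the forecast density evaluated at the observed value, and that scores are averaged over forecasts; the function names are hypothetical.

```python
import math

def gaussian_pdf(y, mean, sd):
    """Density of a Gaussian with the given mean and standard deviation at y."""
    return math.exp(-0.5 * ((y - mean) / sd) ** 2) / (sd * math.sqrt(2.0 * math.pi))

def ignorance(p):
    """Ignorance score in bits for probability (density) p at the outcome."""
    return -math.log2(p)

def relative_ignorance(p_model, p_clim):
    """Mean ignorance of the model minus that of climatology.

    Negative values mean the model beats climatology.
    """
    diffs = [ignorance(pm) - ignorance(pc) for pm, pc in zip(p_model, p_clim)]
    return sum(diffs) / len(diffs)
```

For example, a model that places twice as much probability on the outcome as climatology does scores exactly one bit better, i.e. a relative ignorance of -1.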

  20. Blending with climatology
      • It is common to create new models that are a linear combination of a particular model and the climatology.
      • We then have a distribution of the form $P(y) = \alpha P_{\mathrm{mod}}(y) + (1-\alpha) P_{\mathrm{clim}}(y)$, where α is optimised in some way (commonly to minimise the ignorance score).
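
One simple way to choose α is a grid search over [0, 1] that minimises the mean ignorance of the blended forecast on past data. The slides do not say how α was optimised, so the grid search and the names below are assumptions for illustration only.

```python
import numpy as np

def blend(p_mod, p_clim, alpha):
    """Blended density: alpha * model + (1 - alpha) * climatology."""
    return alpha * p_mod + (1.0 - alpha) * p_clim

def best_alpha(p_mod, p_clim, n_grid=101):
    """Grid-search alpha in [0, 1] to minimise the mean ignorance score.

    p_mod and p_clim hold the densities each forecast assigned to the
    observed outcomes; they must be strictly positive somewhere in the blend.
    """
    p_mod = np.asarray(p_mod, dtype=float)
    p_clim = np.asarray(p_clim, dtype=float)
    grid = np.linspace(0.0, 1.0, n_grid)
    scores = [np.mean(-np.log2(blend(p_mod, p_clim, a))) for a in grid]
    return float(grid[int(np.argmin(scores))])

# Hypothetical densities: the model is sharp but occasionally badly wrong.
alpha_hat = best_alpha([0.4, 0.4, 1e-9], [0.1, 0.1, 0.1])
```

The blend protects a sharp model against the large penalty it would otherwise receive on the rare occasions when it assigns almost no probability to the outcome.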

  21. A simple model
      • To create a simple model from the data, we take a Gaussian distribution with a moving mean, averaged over the last 5 observations, and a moving variance taken from the last 30 observations.
      • The relative ignorance for this model is 0.38. This means that our model does worse than climatology.
      • However, when we blend with climatology (α = 0.3), we get a relative ignorance score of -0.02.
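
The moving mean and variance of this simple model can be computed as in the sketch below. The window lengths come from the slide; the function name, the use of the population variance, and starting the forecasts once both windows are full are assumptions.

```python
import numpy as np

def moving_gaussian_params(obs, mean_window=5, var_window=30):
    """Mean and variance of the Gaussian forecast for each time step.

    The forecast for time t uses only observations before t: the mean of
    the last `mean_window` values and the variance of the last `var_window`.
    Returns the first forecast index and the two parameter arrays.
    """
    obs = np.asarray(obs, dtype=float)
    start = max(mean_window, var_window)
    steps = range(start, len(obs))
    means = np.array([obs[t - mean_window:t].mean() for t in steps])
    variances = np.array([obs[t - var_window:t].var() for t in steps])
    return start, means, variances
```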

  22. Kernel Dressing Models
      • We can turn point forecasts into probabilistic forecasts using kernel dressing.
      • We replace each point forecast with a Gaussian distribution (also known as a kernel).
      • The mean of the Gaussian distribution is just the point forecast with a bias correction from past experience.
      • The standard deviation for each model is the mean error found from past experience.
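
A minimal sketch of kernel dressing for a single model follows. The slide does not define "mean error" precisely; here it is taken to be the mean absolute error of the bias-corrected forecasts, which is an assumption, as are the function names.

```python
import numpy as np

def fit_kernel(past_forecasts, past_obs):
    """Estimate the bias and kernel width from past forecast/observation pairs."""
    f = np.asarray(past_forecasts, dtype=float)
    o = np.asarray(past_obs, dtype=float)
    bias = float(np.mean(f - o))
    # Assumed definition: mean absolute error after bias correction.
    sd = float(np.mean(np.abs(f - bias - o)))
    return bias, sd

def dressed_density(y, forecast, bias, sd):
    """Gaussian kernel centred on the bias-corrected point forecast."""
    mean = forecast - bias
    return np.exp(-0.5 * ((y - mean) / sd) ** 2) / (sd * np.sqrt(2.0 * np.pi))

# Hypothetical past data for one NWP model.
bias, sd = fit_kernel([10.0, 12.0, 9.0, 11.0], [9.0, 10.0, 9.0, 10.0])
p = dressed_density(9.0, 10.0, bias, sd)
```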

  23. Kernel Dressing Models
      • This is done for all 4 NWP models.
      • We can compare the models using the relative ignorance score. Each one has been blended with climatology.

      | NWP1 | NWP2 | NWP3 | NWP4 |
      |---|---|---|---|
      | -0.93 | -0.88 | -0.67 | -0.79 |

  24. Kernel Dressing Models
      We can create a new model that is a weighted average of the other 4 models, blended with climatology, i.e.
      $$P(y) = \alpha\,[w_1 P_1(y) + w_2 P_2(y) + w_3 P_3(y) + w_4 P_4(y)] + (1-\alpha)\,P_{\mathrm{clim}}(y)$$

      | w1 | w2 | w3 | w4 | α | Rel. ign |
      |---|---|---|---|---|---|
      | 0.35336 | 0.34057 | 0.15533 | 0.16120 | 0.99 | -0.99 |
      | 0.25 | 0.25 | 0.25 | 0.25 | 0.99 | -0.94 |
      | 0.7 | 0.3 | 0 | 0 | 0.99 | -1.00 |
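
The weighted, climatology-blended density can be evaluated as in this sketch. It assumes Gaussian kernels for the four models and a Gaussian climatology, as elsewhere in the talk; the function and parameter names are hypothetical.

```python
import numpy as np

def gaussian_pdf(y, mean, sd):
    """Gaussian density at y; mean and sd may be scalars or arrays."""
    mean = np.asarray(mean, dtype=float)
    sd = np.asarray(sd, dtype=float)
    return np.exp(-0.5 * ((y - mean) / sd) ** 2) / (sd * np.sqrt(2.0 * np.pi))

def combined_density(y, means, sds, weights, alpha, clim_mean, clim_sd):
    """alpha * sum_k w_k P_k(y) + (1 - alpha) * P_clim(y).

    means, sds: bias-corrected forecasts and kernel widths, one per model.
    weights: model weights w_k, expected to sum to 1.
    """
    kernels = gaussian_pdf(y, means, sds)
    p_models = float(np.dot(weights, kernels))
    p_clim = float(gaussian_pdf(y, clim_mean, clim_sd))
    return alpha * p_models + (1.0 - alpha) * p_clim
```

Halving every entry of `sds` gives the narrower-kernel variant of the same model.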

  25. Kernel Dressing methods
      • We chose our kernel widths with a forecast using an individual model in mind, but given that the 4 models are likely to cover more possible outcomes, we might want to reduce the kernel widths. The results of halving them are shown below.

      | w1 | w2 | w3 | w4 | α | Rel. ign |
      |---|---|---|---|---|---|
      | 0.35336 | 0.34057 | 0.15533 | 0.16120 | 0.94 | -1.16 |
      | 0.25 | 0.25 | 0.25 | 0.25 | 0.99 | -1.14 |
      | 0.7 | 0.3 | 0 | 0 | 0.99 | -0.96 |

  26. Summary of Results

      | Model | Rel. ign |
      |---|---|
      | First Gaussian model | 0.38 |
      | First Gaussian model blended with climatology | -0.02 |
      | Individual models blended with climatology | -0.93 (NWP1) |
      | Weighted model using kernel widths from individual models and blended with climatology | -1.00 |
      | Weighted model using smaller kernel widths and blended with climatology | -1.16 |

  27. Possible future work
      • Extend the work to more realistic distributions rather than Gaussian.
      • Find probabilistic forecasts from ensembles.
      • Find a way of optimising the weightings and kernel widths by minimising the ignorance score.
      • Extend the work to probabilistic forecasts of power.

  28. Artificial Neural Networks (ANN) with Gaussian Radial Basis Functions*
      * "newgrnn" is used in MATLAB
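
A generalised regression network with Gaussian radial basis functions predicts a kernel-weighted average of the training targets. The sketch below shows that idea in NumPy; it is not a port of MATLAB's newgrnn, and the `spread` parameter here is simply the Gaussian standard deviation, which is an assumption and may be scaled differently from MATLAB's.

```python
import numpy as np

def grnn_predict(x_train, y_train, x_query, spread=1.0):
    """Gaussian-kernel weighted average of training targets.

    x_train: array of shape (N, K), e.g. N vectors of the last K hourly speeds.
    y_train: array of shape (N,), e.g. the wind speed H hours later.
    x_query: array of shape (Q, K) of inputs to predict for.
    """
    x_train = np.asarray(x_train, dtype=float)
    y_train = np.asarray(y_train, dtype=float)
    x_query = np.asarray(x_query, dtype=float)
    # Squared Euclidean distance between every query and training vector.
    d2 = ((x_query[:, None, :] - x_train[None, :, :]) ** 2).sum(axis=2)
    # Subtract the row minimum so the largest weight is 1 (avoids underflow).
    d2 = d2 - d2.min(axis=1, keepdims=True)
    weights = np.exp(-d2 / (2.0 * spread ** 2))
    return (weights @ y_train) / weights.sum(axis=1)
```

A small spread makes the prediction follow the nearest training vectors closely, while a large spread pulls it towards the mean of all training targets.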

  29. ANN for SCADA evaluation
      • Evaluation is based on the percentage reduction of the RMSE (Root Mean Square Error) compared to persistence (assuming the wind speed H hours later is the same as it is now).
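
The persistence baseline and the percentage reduction in RMSE can be sketched as below. The function names are hypothetical, and the series is assumed to be sampled hourly so that the horizon is a number of array steps.

```python
import numpy as np

def persistence_rmse(speed, horizon):
    """RMSE of forecasting speed[t + horizon] by speed[t]."""
    speed = np.asarray(speed, dtype=float)
    pred = speed[:-horizon]
    actual = speed[horizon:]
    return float(np.sqrt(np.mean((pred - actual) ** 2)))

def rmse_reduction_percent(rmse_model, rmse_reference):
    """Percentage by which the model's RMSE is below the reference RMSE."""
    return 100.0 * (rmse_reference - rmse_model) / rmse_reference
```

Applied to the figures quoted earlier in the talk, `rmse_reduction_percent(1.3938, 1.5767)` gives roughly 11.6, the "about 12%" improvement of the ANN over the NWP average.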

  30. ANNs for NWP values: training of the ANN
      • Input: 30 hours ahead wind speeds calculated by four Numerical Weather Predictions (NWP1-4)
      • Target: measured wind speeds at the site 30 hours later
      • Evaluation: results are compared to the estimate provided by the average of the four NWPs, and to persistence

  31. Training & Results
      • Limited amount of data for training the network: 365 days of 9 am predictions for the 3 pm wind speed the next day, and the measured wind speeds at that time from SCADA data.
      • The RMS error decreases as the number of training data increases.

  32. Conclusions
      • Neural networks (with Gaussian Radial Basis Functions) used for short term prediction of wind speed based exclusively on SCADA data do not provide a significant improvement compared to persistence (the naive estimator): maximum 5% improvement.
      • Longer term (30 hours) predictions using ANNs based on the four NWP inputs provide good results and a significant improvement compared to averaging the NWPs: at least 12% improvement (probably more if the network is trained on a larger amount of data; 300 was the maximum in this study).
      • Combining the SCADA data and the NWP data to form an input for the ANN would probably provide better results for both long and short term; further investigation is required.
