

  1. Improving Forecasts of Extreme Values by Machine Learning Models Using Occam's Razor
     William W. Hsieh, University of British Columbia (Visiting Scientist at Univ. of Victoria)
     American Meteorological Society Annual Meeting, January 2018, Austin, TX

  2. Introduction
     • Machine learning (ML) methods were developed mostly for discrete data.
     • In the environmental sciences:
       - Data are mostly continuous.
       - Extreme values are important.
       - Are ML methods therefore not suited for extreme values?
     • With continuous data, wait long enough and a new predictor value will lie outside the training range, so the ML model ends up doing extrapolation.
     • Extreme learning machine (ELM): a 1-hidden-layer artificial neural network (ANN) with random weights at the hidden nodes.
       - Ensemble-average the output from 100 runs.
       - 3 choices of activation function at the hidden layer: (a) sigmoidal, (b) Gaussian (RadBas), (c) softplus.
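The ELM described above can be sketched in a few lines of numpy: the input-to-hidden weights are drawn at random and frozen, and only the output weights are fit, by linear least squares. This is a minimal illustrative sketch, not the speaker's code; the function names, hidden-layer size, and the small ensemble size are assumptions (the talk uses 100 runs).

```python
import numpy as np

def elm_fit(X, y, n_hidden=50, activation=np.tanh, rng=None):
    """Fit an ELM: random hidden weights, output weights by least squares.

    Returns a predict(X_new) closure."""
    rng = np.random.default_rng(rng)
    W = rng.normal(size=(X.shape[1], n_hidden))   # random input->hidden weights
    b = rng.normal(size=n_hidden)                 # random hidden biases
    H = activation(X @ W + b)                     # hidden-layer outputs
    beta, *_ = np.linalg.lstsq(H, y, rcond=None)  # output weights by least squares
    return lambda Xn: activation(Xn @ W + b) @ beta

def elm_ensemble(X, y, X_test, n_runs=10, **kw):
    """Ensemble-average the outputs of several ELM runs (the talk uses 100)."""
    preds = [elm_fit(X, y, rng=i, **kw)(X_test) for i in range(n_runs)]
    return np.mean(preds, axis=0)
```

Because only the output layer is fit, each run is a single linear solve, which is why averaging over many random-weight runs is cheap.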

  3. [Figure: ELM fits to synthetic data with true signal y = x + 0.2x² (dashed), training data (x), and a linear regression line, in four panels: (a) sigmoidal, (b) radial basis, (c) softplus activation, with + marking extrapolated values; panel (d) shows the three ELM solutions over an extended domain (x from -100 to 100).]

  4. • Occam's razor: among competing hypotheses, the one with the fewest assumptions should be selected (parsimony).
     • In the extrapolation region, Occam would avoid nonlinear ML models with many parameters; should we instead use a linear model?
     • New idea:
       1) In predictor space, determine which test data points involve extrapolation (based on the Mahalanobis distance to the training dataset).
       2) Use the nonlinear ML solution to perform linear extrapolation.
     • Example: predict Vancouver airport (YVR) precipitation amount (on precipitation days). 3 predictors: SLP, humidity, Z500 (NCEP Reanalysis); 1971-76 training, 1978-2000 testing.
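Step 1 above, flagging the test points that require extrapolation, can be sketched with the Mahalanobis distance from each test point to the training-data distribution (its mean and covariance). The threshold and function name here are illustrative assumptions, not values from the talk.

```python
import numpy as np

def extrapolation_mask(X_train, X_test, threshold=3.0):
    """True where a test point lies outside the training-data cloud,
    judged by Mahalanobis distance to the training mean/covariance."""
    mu = X_train.mean(axis=0)
    cov_inv = np.linalg.inv(np.cov(X_train, rowvar=False))
    d = X_test - mu
    # squared Mahalanobis distance of each test point to the training mean
    m2 = np.einsum('ij,jk,ik->i', d, cov_inv, d)
    return np.sqrt(m2) > threshold
```

Unlike the raw Euclidean distance, the Mahalanobis distance accounts for the scale and correlation of the predictors, so an outlier along a tightly clustered direction is flagged even if it is numerically close to the training mean.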

  5. [Figure: "Extrapolate from nearest neighbour." Scatter of training data, test data, an outlier and the training-data centre in the (x1, x3) plane; the ML model is used to compute the gradient for extrapolating to the outlier.]

  6. [Figure: "Extrapolate from centre of cluster." Same scatter as slide 5; two marked points are used to compute the gradient for the extrapolation.]
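The nearest-neighbour scheme of slide 5 can be sketched as follows: anchor at the training point nearest the outlier, estimate the ML model's directional gradient there by finite differences, and extend the model linearly out to the test point. This is a sketch under assumptions: the anchoring rule, step size `h`, and function name are illustrative, and the talk additionally uses both a fine and a coarse finite-difference step.

```python
import numpy as np

def linear_extrapolate(model, X_train, x_out, h=0.1):
    """Linearly extend `model` from the nearest training point out to x_out.

    `model` maps an (n, d) array to an (n,) array of predictions."""
    # nearest training point serves as the extrapolation anchor
    i = np.argmin(np.linalg.norm(X_train - x_out, axis=1))
    x0 = X_train[i]
    u = x_out - x0
    dist = np.linalg.norm(u)
    if dist == 0.0:
        return model(x0[None, :])[0]
    u = u / dist
    # centred finite-difference estimate of the directional gradient at x0
    g = (model((x0 + h * u)[None, :])[0]
         - model((x0 - h * u)[None, :])[0]) / (2.0 * h)
    # linear extension: value at the anchor plus gradient times distance
    return model(x0[None, :])[0] + g * dist
```

The centre-of-cluster scheme of slide 6 would differ only in the choice of the two points used to form the gradient.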

  7. • Use both extrapolation schemes, each with a fine and a coarse finite-difference estimate of the gradient, giving 4 extrapolation schemes.
       - Take the median of the 4 extrapolation schemes and the original value.
     • Compute the mean absolute error (MAE) and a skill score (SS) relative to the original ML model's MAE.
     • 4 datasets: YVR precipitation, streamflow at the Englishman River (ENG) and Stave River (STA), and sediment concentration at the Fraser River (FRA).
       - Also with the training and testing data reversed (rev).
     • Ran the ELM for 200 trials with different random number sequences.
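The median-combination and scoring steps above can be sketched directly. The skill score here is taken to be the standard MAE skill score, SS = 1 − MAE_new / MAE_ref, so positive SS means the combined forecast beats the original ML model; the function names are illustrative.

```python
import numpy as np

def mae(pred, obs):
    """Mean absolute error."""
    return np.mean(np.abs(pred - obs))

def mae_skill_score(pred_new, pred_ref, obs):
    """SS = 1 - MAE_new / MAE_ref; positive means an improvement
    over the reference (here, the original ML model)."""
    return 1.0 - mae(pred_new, obs) / mae(pred_ref, obs)

def median_combine(scheme_preds, original_pred):
    """Pointwise median over the 4 extrapolation schemes and the
    original ML prediction."""
    return np.median(np.vstack(scheme_preds + [original_pred]), axis=0)
```

The median is a natural combiner here: if one extrapolation scheme goes badly wrong at a given point, it is simply outvoted by the others and the original value.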

  8. [Bar chart: "MAE SS (extrapolated data)" for the sigmoid, radbas and softplus activation functions on ENG, ENG(rev), STA, STA(rev), YVR, YVR(rev) and FRA(rev); SS axis from -0.2 to 1.]

  9. Simple alternative: train an MLR (multiple linear regression) model and use its output for the extrapolation points.
     [Bar chart: "MAE SS (extrapolated data using MLR)" for the sigmoid, radbas and softplus activation functions on ENG, ENG(rev), STA, STA(rev), YVR, YVR(rev) and FRA(rev); SS axis from -0.6 to 1.]

  10. Boxplot of the 21 medians (of SS over 200 trials) for MLR and for the ELM with linear extrapolation, over the extrapolated data.
      [Boxplot: "Medians of skill scores" for MAE SS, RMSE SS and correlation SS, comparing ELM with linear extrapolation against MLR; axis from -0.8 to 0.8.]

  11. Conclusion & future work
      • For extreme values, ML models often perform nonlinear extrapolation.
      • Following Occam, we propose linear extrapolation instead:
        - Use the nonlinear ML solution to extrapolate linearly.
        - Or simply use an MLR model for the extrapolation points.
      • Future improvements:
        - Determining outliers by Mahalanobis distance is not robust; replace it with a more robust method.
        - Some predictors may be discrete variables; the current linear extrapolation schemes will need modification to handle them.
