 
CONTOUR REGRESSION: A distribution-regularized regression framework for climate modeling
Zubin Abraham (a), Pang-Ning Tan (a), Julie A. Winkler (b), Perdinan (b), Shiyuan Zhong (b), Malgorzata Liszewska (c)
(a) Dept of Computer Science, Michigan State University
(b) Dept of Geography, Michigan State University
(c) Centre for Math and Comp Modeling, Univ of Warsaw
Introduction: Climate Modeling
Climate Change Modeling
• There are growing concerns about climate change and how it could affect natural resources and various sectors of the economy and society:
  • Agriculture
  • Health
  • Hydrology
  • Population migration and conflict, etc.
• Climate change impact assessment studies require long-term projections of future climate scenarios. [1]
[1] Julie Winkler et al. Climate Scenario Development and Applications for Local/Regional Climate Change Impact Assessments: An Overview for the Non-Climate Scientist: Part I: Scenario Development Using Downscaling Methods. In Proceedings of Geography Compass’11
Introduction: Climate Modeling
Long-Term Projection of Future Climate
[Figure: Histogram of daily maximum temperature at a weather station in Michigan, with the cold-bias and warm-bias regions marked.]
Introduction: Multiple Linear Regression
Example: Multiple Linear Regression (MLR)
• Regression-based methods that minimize prediction error tend to have large distribution bias.
[Figure: The cumulative distribution function (CDF) of the observation variable.]
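A minimal sketch of why this happens, using synthetic data rather than the climate data from the slides: for an ordinary least-squares fit with an intercept, the in-sample predictions can never vary more than the observations, so the predicted distribution is compressed toward the mean and its tails fall short. The data-generating model, coefficients, and variable names below are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for (predictors, observed response); not the slides' data.
n = 5000
X = rng.normal(size=(n, 3))
y = X @ np.array([1.0, -0.5, 0.25]) + rng.normal(scale=1.0, size=n)

# Ordinary least squares with an intercept column.
A = np.column_stack([np.ones(n), X])
w, *_ = np.linalg.lstsq(A, y, rcond=None)
y_hat = A @ w

rmse = np.sqrt(np.mean((y - y_hat) ** 2))
print("RMSE:", rmse)

# The predictions are less spread out than the observations ...
print("std of observations:", y.std())
print("std of predictions: ", y_hat.std())

# ... so the predicted upper tail falls short of the observed one.
print("observed 95th percentile: ", np.quantile(y, 0.95))
print("predicted 95th percentile:", np.quantile(y_hat, 0.95))
```

The gap between the two standard deviations is one simple way to see the distribution bias that the CDF plot on this slide illustrates.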
Introduction: Quantile Mapping
Example: Quantile Mapping (QM)
• Quantile mapping is a bias correction method: Eq. (1), where x is the RCM/GCM output and y is the observed response variable.
[Figure: CDF of x and y.]
• Bias correction methods minimize bias but have large prediction error.
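Eq. (1) is not reproduced in this transcript. The sketch below assumes the common empirical form of quantile mapping, in which a model value x is sent to the observed value at the same quantile, y_hat = F_y^{-1}(F_x(x)). The function name `quantile_map`, the plotting-position formula for the empirical CDF, and the synthetic data are all assumptions made for illustration, not the authors' implementation.

```python
import numpy as np

def quantile_map(x_train, y_train, x_new):
    """Empirical quantile mapping: y_hat = F_y^{-1}(F_x(x_new))."""
    x_sorted = np.sort(x_train)
    y_sorted = np.sort(y_train)
    # Empirical CDF of x, evaluated at x_new (probabilities in (0, 1)).
    p_grid = (np.arange(1, x_sorted.size + 1) - 0.5) / x_sorted.size
    p = np.interp(x_new, x_sorted, p_grid)
    # Inverse empirical CDF of y, evaluated at those probabilities.
    q_grid = (np.arange(1, y_sorted.size + 1) - 0.5) / y_sorted.size
    return np.interp(p, q_grid, y_sorted)

def rmse(a, b):
    return np.sqrt(np.mean((a - b) ** 2))

# Synthetic example: a cold-biased, noisy "model output" x for observations y.
rng = np.random.default_rng(1)
y_obs = rng.normal(loc=15.0, scale=8.0, size=5000)
x_model = 12.0 + 0.7 * (y_obs - 15.0) + rng.normal(scale=6.0, size=5000)

y_qm = quantile_map(x_model, y_obs, x_model)

# Least-squares line fitted on the same data, for comparison.
slope, intercept = np.polyfit(x_model, y_obs, deg=1)
y_ls = intercept + slope * x_model

print("QM reproduces the observed distribution:",
      np.allclose(np.sort(y_qm), np.sort(y_obs)))
print("RMSE of quantile mapping:", rmse(y_obs, y_qm))
print("RMSE of least squares:   ", rmse(y_obs, y_ls))
```

On this synthetic example the mapped values match the observed distribution, while their pointwise error is larger than that of the least-squares line, which is the trade-off the slide describes.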
Introduction: Climate Modeling
Comparison between QM and MLR
[Figure: Histogram of daily maximum temperature at a weather station in Michigan.]
Contour Regression
Contributions
• We present a framework (contour regression) that maximizes prediction accuracy while minimizing bias in the distribution. [3,4]
• We also present linear, nonlinear, and quantile-regression-based variations of contour regression.
• The framework can incorporate predictor variables from heterogeneous data sources (semi-supervised). [4]
[3] Zubin Abraham et al. Distribution regularized regression framework for climate modeling. SDM’13
[4] Zubin Abraham et al. Contour regression: A distribution-regularized regression framework for climate modeling. In Proceedings of Statistical Analysis and Data Mining’14
Contour Regression: Introduction
Contour Regression (CR)
• General framework for contour regression: Eq. (2), which combines a term that minimizes residual errors with a term that minimizes errors in the CDF.
[Figure: Regression line of y against x.]
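Eq. (2) is not reproduced in this transcript, so the following is only one plausible reading of its two annotated terms. It assumes squared errors, measures the CDF error by comparing the sorted observations with the sorted predictions, and balances the two terms with a weight called `gamma` here. The function name, the weight, and the toy numbers are assumptions for illustration.

```python
import numpy as np

def contour_loss(y, y_hat, gamma=0.5):
    """Weighted sum of a residual term and a distribution (CDF) term.

    gamma = 1 recovers the plain sum of squared residuals;
    gamma = 0 only compares the two empirical distributions.
    """
    residual_term = np.sum((y - y_hat) ** 2)
    # Matching the i-th smallest observation with the i-th smallest
    # prediction compares the two empirical CDFs quantile by quantile.
    cdf_term = np.sum((np.sort(y) - np.sort(y_hat)) ** 2)
    return gamma * residual_term + (1.0 - gamma) * cdf_term

y = np.array([1.0, 2.0, 3.0, 4.0])
mean_pred = np.full(4, 2.5)                 # no spread at all
shuffled = np.array([2.0, 1.0, 4.0, 3.0])   # right distribution, wrong order

print(contour_loss(y, mean_pred, gamma=0.5))  # penalized by both terms
print(contour_loss(y, shuffled, gamma=0.5))   # penalized by residuals only
```

The second candidate has exactly the observed distribution, so only its residual term contributes. The first has no spread, so both terms penalize it.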
Contour Regression: MLCR
Multiple Linear Contour Regression (MLCR)
• Eq. (5), where Eq. (6)
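Eqs. (5) and (6) are not reproduced in this transcript. As a hedged illustration of how a linear model could be fitted under a distribution-regularized squared loss, the sketch below alternates between ranking the current predictions and solving a least-squares problem against a blend of the observations and their rank-matched counterparts. The name `fit_mlcr`, the weight `gamma`, the alternating scheme, and the synthetic data are assumptions and should not be read as the authors' exact algorithm.

```python
import numpy as np

def fit_mlcr(X, y, gamma=0.5, n_iter=20):
    """Alternating heuristic for a linear distribution-regularized fit."""
    A = np.column_stack([np.ones(len(y)), X])    # add an intercept column
    y_sorted = np.sort(y)
    w, *_ = np.linalg.lstsq(A, y, rcond=None)    # start from the OLS solution
    for _ in range(n_iter):
        # Rank of each prediction (0 = smallest) under the current weights.
        ranks = np.argsort(np.argsort(A @ w))
        # Blend each observation with the observed value of the same rank.
        target = gamma * y + (1.0 - gamma) * y_sorted[ranks]
        w_new, *_ = np.linalg.lstsq(A, target, rcond=None)
        if np.allclose(w_new, w):
            break
        w = w_new
    return w

# Synthetic data (an assumption for illustration, not the slides' data).
rng = np.random.default_rng(2)
X = rng.normal(size=(3000, 2))
y = X @ np.array([2.0, -1.0]) + rng.normal(scale=2.0, size=3000)
A = np.column_stack([np.ones(len(y)), X])

pred_ols = A @ fit_mlcr(X, y, gamma=1.0)   # gamma = 1 is plain least squares
pred_cr = A @ fit_mlcr(X, y, gamma=0.5)

def sorted_sq_error(a, b):
    return np.sum((np.sort(a) - np.sort(b)) ** 2)

print("std of observations:", y.std())
print("std of OLS predictions:", pred_ols.std())
print("std of regularized predictions:", pred_cr.std())
print("CDF error, OLS:", sorted_sq_error(y, pred_ols))
print("CDF error, regularized:", sorted_sq_error(y, pred_cr))
```

With the ranks held fixed, each inner step is an ordinary least-squares problem, which is why the update reduces to a single `lstsq` call on the blended target.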
Contour Regression: Non-Linear Setting
Kernel Contour Regression (KCR)
• Ridge regression applied to CR: Eq. (7)
• Kernel contour regression (KCR): Eq. (8), where Eq. (9)
Contour Regression: Conditional Quantiles
Quantile Regression (QR)
[Figure: Regression line of Y against X, with W1 = 7.0 and W0 = -0.4.]
Zubin Abraham et al. Extreme Value Prediction for Zero Inflated Data. PAKDD’12
Contour Regression: Conditional Quantiles
Quantile Regression (QR)
• Eq. (10), where u is the residual (observation minus prediction).
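Eq. (10) is not reproduced in this transcript. The sketch below assumes it is the standard quantile-regression ("pinball") loss, which charges tau * u when the residual u is non-negative and (tau - 1) * u when it is negative. The function name and the synthetic sample are assumptions for illustration.

```python
import numpy as np

def pinball_loss(y, y_hat, tau):
    """Standard quantile-regression loss, summed over samples.

    u = observation - prediction, as on the slide.
    Under-prediction (u >= 0) costs tau * u; over-prediction costs (1 - tau) * |u|.
    """
    u = np.asarray(y) - np.asarray(y_hat)
    return np.sum(np.where(u >= 0, tau * u, (tau - 1.0) * u))

# With tau = 0.9, under-predicting by 1 costs more than over-predicting by 1.
print(pinball_loss([1.0], [0.0], tau=0.9))
print(pinball_loss([0.0], [1.0], tau=0.9))

# The constant that minimizes the loss is a tau-quantile of the sample.
rng = np.random.default_rng(3)
sample = rng.normal(size=1001)
candidates = np.sort(sample)
losses = [pinball_loss(sample, c, tau=0.9) for c in candidates]
best = candidates[int(np.argmin(losses))]
print("minimizer:", best, "sample 0.9-quantile:", np.quantile(sample, 0.9))
```

The asymmetry of the penalty is what pushes the fitted value toward the chosen conditional quantile rather than the conditional mean.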
Contour Regression: Conditional Quantiles
Quantile Contour Regression (QCR)
• Contour regression that uses a QR-based loss function takes the following form: Eq. (11)
• The preceding optimization problem can be converted to the following form: Eq. (12)
• Linear programming is used to solve the resulting optimization problem.
Contour Regression: Multi-Source Data
Contour Regression for Multi-Source Data
• Predictor variables:
• Response variable:
• Geometric quantile mapping: the geometric quantile is the multi-dimensional equivalent of a univariate quantile mapping function. Eq. (13), Eq. (14)
Zubin Abraham et al. Position Preserving Multi-Output Prediction. ECML-PKDD’13
J. I. Marden. Positions and QQ plots. Statistical Science’04
EXPERIMENTAL EVALUATION
Contour Regression (CR): Experimental Evaluation
Experimental Setup
• Predicting surface precipitation, maximum temperature, and minimum temperature at a location using the following predictor variables obtained from regional climate models:
[Table: Predictor variables.]
Julie Winkler et al. Climate Scenario Development using Hybrid Downscaling: An Application to NARCCAP and ENSEMBLES simulations. In Proceedings of AAG’12 *
Data: RCM Data Sources
• Predictor variables are obtained from NCEP-driven regional climate models:
  • WRFG
  • CRCM
  • RCM3
• Observation data are obtained from 14 climate stations in Michigan:
  • Daily data from 1980-1999
  • Training: 1980-1989
  • Testing: 1990-1999
Experimental Evaluation: Introduction
Experimental Evaluation
• Performance of CR when using a least-squares loss function:
  • Comparing residual error and distribution bias of MLR, QM, and MLCR
• Performance of CR when using a QR-based loss function:
  • QCR versus QR
• Performance of CR when using predictor variables from multiple data sources:
  • MLCR (in a semi-supervised setting)
Multiple Linear Contour Regression (MLCR): Experimental Results
MLCR Results (Accuracy)
[Figure: Bar plot of maximum temperature RMSE for the 14 stations in the WRFG dataset.]
Multiple Linear Contour Regression (MLCR): Experimental Results
MLCR Results (Distribution Bias)
[Figure: CDF plots of maximum temperature and precipitation at a station in the WRFG dataset.]
Multiple Linear Contour Regression (MLCR): Experimental Results
Summary for MLCR Results
[Table: Relative performance gain of MLCR over baseline approaches.]
• MLCR had lower distribution bias in 14/14 stations for each dataset.
Multiple Linear Contour Regression (MLCR): Experimental Results
Summary for MLCR Results
[Table: Relative performance gain of MLCR over baseline approaches.]
Quantile Contour Regression (QCR): Experimental Results
Summary of QCR Results
[Figure: CDF plots of minimum temperature and precipitation at a station in the WRFG dataset.]
• QCR had lower distribution bias than QR in 14/14 stations for each dataset.
Quantile Contour Regression (QCR): Experimental Results
Summary of QCR Results
[Table: Percentage of stations at which QCR outperformed QR.]
• QCR had better accuracy than QR in 14/14 stations for each dataset.
Multiple Linear Contour Regression (MLCR): Experimental Results
MLCR Results (Heterogeneous Data)
[Figure: CDF plots of maximum temperature and precipitation at a station in the WRFG dataset.]
Contour Regression (CR): Summary
• We presented a framework for contour regression that maximizes prediction accuracy while minimizing bias in the distribution.
• We showed that the framework can be adapted to model nonlinear relationships and conditional quantiles.
• We showed empirically that the framework outperformed, or was at least on par with, baseline approaches on real-world climate data.
• The framework can incorporate predictor variables from heterogeneous data sources.
Contour Regression (CR): References
• [1] Julie Winkler et al. Climate Scenario Development and Applications for Local/Regional Climate Change Impact Assessments: An Overview for the Non-Climate Scientist: Part I: Scenario Development Using Downscaling Methods. In Proceedings of Geography Compass’11
• [2] Themeßl, M. J., Gobiet, A. and Leuprecht, A. (2011). Empirical-statistical downscaling and error correction of daily precipitation from regional climate models. International Journal of Climatology, 31: 1530-1544.
• [3] Zubin Abraham et al. Distribution regularized regression framework for climate modeling. SDM’13
• [4] Zubin Abraham et al. Contour regression: A distribution-regularized regression framework for climate modeling. In Proceedings of Statistical Analysis and Data Mining’14
• [5] Zubin Abraham et al. Position Preserving Multi-Output Prediction. ECML-PKDD’13
• [6] Zubin Abraham et al. Extreme Value Prediction for Zero Inflated Data. PAKDD’12
• [7] Zubin Abraham et al. An Integrated Framework for Simultaneous Classification and Regression of Time-Series Data. SDM’10
• [8] Julie Winkler et al. Climate Scenario Development using Hybrid Downscaling: An Application to NARCCAP and ENSEMBLES simulations. In Proceedings of AAG’12 *
• [9] J. I. Marden. Positions and QQ plots. Statistical Science’04
• [10] X. He, Y. Yang, and J. Zhang. Bivariate downscaling with asynchronous measurements. Journal of Agricultural, Biological, and Environmental Statistics’12
THANK YOU!
This work is partially supported by NSF grant III-0712987 and a subcontract for NASA award NNX09AL60G. This work is also partly supported by the National Science Foundation Dynamics of Coupled Natural and Human Systems competition program (CNH Award No. 0909378).