Predicting Hourly Ozone Pollution in Dallas‐Fort Worth Area Using Spatio‐Temporal Clustering
Mahdi Ahmadi, Yan Huang, Kuruvilla John
University of North Texas
May 20‐23, 2015, Dallas, Texas, USA
U.S. counties with high ozone concentrations in 2009.
(source: http://www.epa.gov/o3healthtraining/what.html#fig1)
Current standard level: 0.075 ppm
Proposed standard level: 0.060‐0.065 ppm
– Time‐consuming and complex procedure
– High level of knowledge/proficiency
– Accuracy of modelling/prediction
– Difficulty of validation
– High dimensionality of the solution/modeling space
Cassmassi & Bassett, 1993; Dueñas et al., 2002; Feister & Balzer, 1991; Fiore et al., 1998; Katsoulis, 1996; Korsog & Wolff, 1991; Kuntasal & Chang, 1987; Walker, 1985; Zeldin et al., 1990)
2004; Bloomfield et al., 1996; Chen et al., 1998; Davis, Eder, et al., 1998a; Khokhlov et al., 2008; Smith & Shively, 1995)
al., 2005; Pryor et al., 1995; Statheropoulos et al., 1998; Yu & Chang, 2000)
2002; Balaguer Ballester et al., 2002; Chaloulakou et al., 2003; Dutot et al., 2007; Elkamel et al., 2001; Gomez‐Sanchis et al., 2006; Hadjiiski & Hopke, 2000; Karatzas et al., 2008; Ruiz‐Suarez et al., 1995; Spellman, 1999; Wang et al., 2003; Yi & Prybutok, 1996)
2004; Kaburlasos et al., 2007; Sahu & Bakar, 2012; Sahu et al., 2007; Domínguez et al., 2014).
– Reading the ozone and meteorological data archive from the TAMIS database
– Pre‐processing the dataset
– Data mining project
– 8‐hr ozone pattern recognition using k‐means cluster analysis
– Post‐processing temporal cluster analysis to identify
– Hierarchical cluster analysis to find spatial patterns of hourly ozone
– Developing a linear regression model for each season and each zone
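The spatial step can be illustrated with a small agglomerative (hierarchical) clustering sketch. The slides do not state the distance measure or linkage rule, so Euclidean distance and average linkage are assumptions here, and the station names in the usage note are hypothetical.

```python
from typing import Dict, List, Sequence


def euclidean(a: Sequence[float], b: Sequence[float]) -> float:
    """Euclidean distance between two equally long hourly series."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def agglomerate(series: Dict[str, Sequence[float]], n_clusters: int) -> List[List[str]]:
    """Repeatedly merge the two closest clusters until n_clusters remain.

    Assumption: average linkage (mean pairwise distance between members).
    """
    clusters = [[name] for name in sorted(series)]

    def linkage(c1: List[str], c2: List[str]) -> float:
        pairs = [(a, b) for a in c1 for b in c2]
        return sum(euclidean(series[a], series[b]) for a, b in pairs) / len(pairs)

    while len(clusters) > max(n_clusters, 1):
        best = None  # (distance, i, j) of the closest pair found so far
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = linkage(clusters[i], clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters
```

For example, `agglomerate({"A": [1, 2, 3], "B": [1.1, 2.1, 3.1], "C": [10, 11, 12], "D": [10.2, 11.1, 12.3]}, 2)` groups the hypothetical stations A and B into one zone and C and D into another.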
– Any gap in the time series of 4 hours or less was filled by linear interpolation.
– Any day with more than 4 consecutive missing values was removed from the dataset.
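The two missing‐data rules can be sketched as a single pass over one day of hourly values. This is a minimal illustration, not the authors' code: treating each day separately and dropping a day whose gap touches its first or last hour are assumptions, and `fill_short_gaps` is a hypothetical name.

```python
from typing import List, Optional

MAX_GAP = 4  # longest run of missing hours that is interpolated (from the slide)


def fill_short_gaps(day: List[Optional[float]],
                    max_gap: int = MAX_GAP) -> Optional[List[float]]:
    """Return the day with short gaps linearly interpolated, or None to drop it."""
    values = list(day)
    n = len(values)
    i = 0
    while i < n:
        if values[i] is not None:
            i += 1
            continue
        # Find the end of this run of missing values.
        j = i
        while j < n and values[j] is None:
            j += 1
        gap = j - i
        # Assumption: a gap at the start or end of the day has no neighbour
        # on one side, so it cannot be interpolated and the day is dropped.
        if gap > max_gap or i == 0 or j == n:
            return None
        left, right = values[i - 1], values[j]
        step = (right - left) / (gap + 1)
        for k in range(gap):
            values[i + k] = left + step * (k + 1)
        i = j
    return values
```

With `[10.0, None, None, 16.0]` the two missing hours become 12.0 and 14.0; a run of five missing hours makes the function return `None`, which signals that the day should be removed.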
– For the seasonal analysis, it makes more sense to perform cluster analysis on 8‐hr average ozone.
– The 8‐hr average time series were generated using a moving average.
– The original 1‐hr time series were kept for spatial cluster analysis and linear regression.
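A moving average over 8 consecutive hourly values can be sketched as below. The slides do not say how the window is aligned (forward, trailing, or centred), so this version simply returns one average per complete 8‐hour run; the function name is hypothetical.

```python
from typing import List


def moving_average(hourly: List[float], window: int = 8) -> List[float]:
    """Average of every run of `window` consecutive hourly values."""
    if window <= 0:
        raise ValueError("window must be positive")
    if len(hourly) < window:
        return []
    total = sum(hourly[:window])
    out = [total / window]
    # Slide the window one hour at a time: add the new hour, drop the oldest.
    for i in range(window, len(hourly)):
        total += hourly[i] - hourly[i - window]
        out.append(total / window)
    return out
```

A series of 10 hourly values yields 3 averages, one for each complete 8‐hour window.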
Sum of squares within (SSW)
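SSW is the total squared distance between each observation and the centroid of its cluster; it is commonly computed for several values of k to compare candidate cluster counts. The sketch below pairs a basic k‐means loop with an SSW function. The initial centroids are passed in explicitly to keep the example deterministic, which is an assumption of this illustration and not something stated in the slides.

```python
from typing import List, Sequence, Tuple

Point = Sequence[float]


def sq_dist(a: Point, b: Point) -> float:
    """Squared Euclidean distance."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def assign(points: List[Point], centroids: List[List[float]]) -> List[int]:
    """Index of the nearest centroid for every point."""
    return [min(range(len(centroids)), key=lambda c: sq_dist(p, centroids[c]))
            for p in points]


def update(points: List[Point], labels: List[int],
           centroids: List[List[float]]) -> List[List[float]]:
    """Move each centroid to the mean of its members."""
    new = []
    for c, old in enumerate(centroids):
        members = [p for p, lab in zip(points, labels) if lab == c]
        if not members:
            new.append(list(old))  # an empty cluster keeps its centroid
        else:
            dim = len(members[0])
            new.append([sum(m[d] for m in members) / len(members)
                        for d in range(dim)])
    return new


def kmeans(points: List[Point], init_centroids: List[Point],
           max_iter: int = 100) -> Tuple[List[int], List[List[float]]]:
    """Alternate assignment and centroid updates until labels stop changing."""
    centroids = [list(c) for c in init_centroids]
    labels = assign(points, centroids)
    for _ in range(max_iter):
        centroids = update(points, labels, centroids)
        new_labels = assign(points, centroids)
        if new_labels == labels:
            break
        labels = new_labels
    return labels, centroids


def ssw(points: List[Point], labels: List[int],
        centroids: List[List[float]]) -> float:
    """Sum of squared distances from each point to its own centroid."""
    return sum(sq_dist(p, centroids[lab]) for p, lab in zip(points, labels))
```

Running `kmeans` for k = 1, 2, 3, ... and recording `ssw` for each gives the values that an SSW‐versus‐k plot would show.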
Cluster   Season     Months
#1        Low        Jan Feb Nov Dec
#2        ‐          ‐ ‐ ‐ ‐
#3        Moderate   Mar Apr May Oct
#4        High       Jun Jul Aug Sep
Log SR   Log 1   Log SR   Log 1

Cluster                                                                RMSE
#1        0.002797   0.359812    0.021030   0.793794   0.176548   0.827   7.361
#2        0.000206   0.241734   -0.000247   0.871374   0.356038   0.890   5.142
#3        0.002699   9.652480    0.027754   0.687160   0.311890   0.827   7.872
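Fitting one regression per season and zone and reporting its RMSE can be sketched as follows. This is a simplified illustration with a single hypothetical predictor; the slides list several coefficients per cluster, and the actual predictors and transformations are not recoverable from the text.

```python
from collections import defaultdict
from math import sqrt
from typing import Dict, List, Tuple

# (season, zone, predictor value, observed ozone); all fields are illustrative.
Record = Tuple[str, str, float, float]


def fit_line(x: List[float], y: List[float]) -> Tuple[float, float]:
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("predictor has no variance")
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx


def rmse(observed: List[float], predicted: List[float]) -> float:
    """Root mean squared error between observations and predictions."""
    return sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted))
                / len(observed))


def fit_by_group(records: List[Record]) -> Dict[Tuple[str, str], Dict[str, float]]:
    """Fit a separate line for every (season, zone) pair."""
    groups = defaultdict(list)
    for season, zone, x, y in records:
        groups[(season, zone)].append((x, y))
    results = {}
    for key, pairs in groups.items():
        xs = [p[0] for p in pairs]
        ys = [p[1] for p in pairs]
        slope, intercept = fit_line(xs, ys)
        pred = [intercept + slope * v for v in xs]
        results[key] = {"slope": slope, "intercept": intercept,
                        "rmse": rmse(ys, pred)}
    return results
```

Each entry of the returned dictionary corresponds to one row of a per‐cluster results table: the fitted coefficients followed by the error of that model on its own data.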