
Time series of air pollutants

Jiri Neubauer, Jaroslav Michalek, Frantisek Bozek

This contribution describes time series of air pollutants. The series contain values of carbon monoxide (CO), nitrogen oxides (NOx), sulphur dioxide (SO2) and suspended particulate matter (dust). The pollutant concentrations were obtained from a measuring station with automatic emission monitoring in Vyškov, the Czech Republic, for the period 1997-2001. The degree of air pollution was assessed by monitoring the concentrations of pollutants in the ground atmospheric layer. Three methods were used to model the time series:

  1. The SARIMA process
  2. The process of hidden periods
  3. Exponential smoothing

The suitability of each model was verified with the following statistical tests of randomness:

  A) The test based on signs of differences
  B) The test based on turning points
  C) The test based on Kendall's coefficient τ
  D) The test based on Spearman's coefficient ρ
  E) The median test

In addition, the portmanteau test was used to verify the ARMA process. The slides show the individual models, both over the whole data series and in a detailed view, so the suitability of each model can be judged visually. It is also of interest to forecast 7 future values and compare them with the measured data.

The second part is devoted to the problem of exceeding a level u. The problem is demonstrated on the DUST time series, which was identified as an autoregressive process AR(7). The spectral density function was estimated from the expression for the spectral density of an ARMA process, and the moments λ0 and λ2 were then computed from it. The first table compares the theoretical expectation of the random variable Cu, based on the calculated spectral density function, with the values actually measured in the series; the difference between the two is shown in a graph. The second table gives the expectation of the random variable Z(T) (the total time over the level u) together with the measured values (see the figure). The discrepancy between the values calculated from the formulas and the measured values is probably caused by the conditions a) to d) not being fulfilled. Condition c) in particular was not satisfied: the Kolmogorov-Smirnov test, the chi-square test and the A test of normality all reject the hypothesis of normality. A suitable transformation of the series might resolve this.
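The first two randomness tests listed above can be sketched in a few lines. The slides do not give the test statistics, so the version below uses the usual large-sample normal approximations (an assumption, not necessarily the exact form the authors used); the function names and the sample residuals are hypothetical.

```python
import math

def difference_sign_test(y):
    """Return (S, z): count of positive first differences and its z-score."""
    n = len(y)
    s = sum(1 for prev, cur in zip(y, y[1:]) if cur > prev)
    mean = (n - 1) / 2          # expected count for a random series
    var = (n + 1) / 12          # its variance
    return s, (s - mean) / math.sqrt(var)

def turning_point_test(y):
    """Return (P, z): count of local maxima and minima and its z-score."""
    n = len(y)
    p = sum(
        1
        for a, b, c in zip(y, y[1:], y[2:])
        if (b > a and b > c) or (b < a and b < c)
    )
    mean = 2 * (n - 2) / 3
    var = (16 * n - 29) / 90
    return p, (p - mean) / math.sqrt(var)

# Hypothetical model residuals, for illustration only.
residuals = [0.3, -1.2, 0.8, 0.1, -0.5, 1.4, -0.9, 0.2, 0.7, -0.3]
s, z_s = difference_sign_test(residuals)
p, z_p = turning_point_test(residuals)
print(s, round(z_s, 3), p, round(z_p, 3))
```

Under this approximation, randomness of the residuals would be rejected at the 5 % level when the absolute z-score exceeds about 1.96.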

Future work is expected to develop a model that predicts how the concentrations of selected pollutants depend on air temperature, pressure and humidity, wind intensity, inversion and other climatic conditions. We are also considering a study of how the concentrations of the individual pollutants develop in relation to one another, together with an attempt to determine their synergistic or antagonistic effects.

ARIMA

ARMA(p, q)

Let $\{\varepsilon_t\}$ be white noise. Then

$$y_t = \phi_1 y_{t-1} + \phi_2 y_{t-2} + \dots + \phi_p y_{t-p} + \varepsilon_t + \theta_1 \varepsilon_{t-1} + \theta_2 \varepsilon_{t-2} + \dots + \theta_q \varepsilon_{t-q}$$

is a mixed autoregressive moving average process ARMA(p, q). In lag operator form ($L y_t = y_{t-1}$, $L^2 y_t = L(L y_t) = y_{t-2}$):

$$(1 - \phi_1 L - \phi_2 L^2 - \dots - \phi_p L^p)\, y_t = (1 + \theta_1 L + \theta_2 L^2 + \dots + \theta_q L^q)\, \varepsilon_t$$

$$\phi(L)\, y_t = \theta(L)\, \varepsilon_t$$

ARIMA(p, d, q)

This model is suitable for describing nonstationary time series.

1st difference:

$$\Delta y_t = y_t - y_{t-1} = y_t - L y_t = (1 - L)\, y_t$$

2nd difference:

$$\Delta^2 y_t = \Delta(\Delta y_t) = \Delta(y_t - y_{t-1}) = \Delta y_t - \Delta y_{t-1} = y_t - 2 y_{t-1} + y_{t-2} = (1 - 2L + L^2)\, y_t = (1 - L)^2 y_t$$

ARIMA(p, d, q) is defined as

$$\phi(L)\, w_t = \theta(L)\, \varepsilon_t, \quad \text{where } w_t = \Delta^d y_t.$$

We can write

$$\phi(L)(1 - L)^d y_t = \theta(L)\, \varepsilon_t.$$

SARIMA(p, d, q)(P, D, Q) lag s

This model is suitable for describing nonstationary seasonal time series.

The seasonal lag operator:

$$L^s y_t = y_{t-s}, \quad L^{2s} y_t = L^s(L^s y_t) = y_{t-2s}$$

The seasonal difference operator:

$$\Delta_s = 1 - L^s$$

$$\Delta_s y_t = (1 - L^s)\, y_t = y_t - y_{t-s}$$

$$\Delta_s^2 y_t = (1 - L^s)^2 y_t = (1 - 2L^s + L^{2s})\, y_t = y_t - 2 y_{t-s} + y_{t-2s}$$

We can write

$$\Phi(L^s)\, \phi(L)\, (1 - L)^d (1 - L^s)^D y_t = \Theta(L^s)\, \theta(L)\, \varepsilon_t.$$
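The ordinary and seasonal difference operators defined on this slide can be illustrated with a short sketch. The helper name `difference` and the toy series are assumptions made for the example, not part of the original analysis.

```python
def difference(y, lag=1, order=1):
    """Apply (1 - L^lag)^order to the list y; each pass drops `lag` values."""
    out = list(y)
    for _ in range(order):
        out = [out[t] - out[t - lag] for t in range(lag, len(out))]
    return out

# Toy series y_t = t^2, chosen so the differences are easy to check by hand.
y = [t * t for t in range(10)]

first = difference(y)                    # (1 - L) y_t
second = difference(y, order=2)          # (1 - L)^2 y_t
seasonal = difference(y, lag=7)          # (1 - L^7) y_t
both = difference(difference(y, lag=7))  # (1 - L)(1 - L^7) y_t

print(first, second, seasonal, both)
```

The last line corresponds to the differencing part of a SARIMA model with d = 1, D = 1 and s = 7; the ARMA polynomials would then be fitted to the differenced series.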

The order of the model was determined from the properties of the autocorrelation function and the partial autocorrelation function and from the AIC and FPE criteria. The coefficients $\phi_1, \dots, \phi_p$, $\theta_1, \dots, \theta_q$, $\Phi_1, \dots, \Phi_P$, $\Theta_1, \dots, \Theta_Q$ were estimated by maximum likelihood using STATISTICA and MATLAB.
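As an illustration of the order-determination step, the sketch below computes a sample autocorrelation function and derives the partial autocorrelations from it with the Durbin-Levinson recursion. It is not the STATISTICA or MATLAB routine used by the authors; the function names are hypothetical, and the AIC and FPE criteria are not shown.

```python
def sample_acf(y, max_lag):
    """Sample autocorrelations r_0, ..., r_max_lag of the series y."""
    n = len(y)
    mean = sum(y) / n
    dev = [v - mean for v in y]
    c0 = sum(d * d for d in dev)
    return [
        sum(dev[t] * dev[t - k] for t in range(k, n)) / c0
        for k in range(max_lag + 1)
    ]

def pacf_from_acf(acf):
    """Partial autocorrelations for lags 1..len(acf)-1 (Durbin-Levinson).

    acf[0] is expected to be 1.
    """
    pacf = []
    prev = []  # coefficients phi_{k-1,1}, ..., phi_{k-1,k-1}
    for k in range(1, len(acf)):
        num = acf[k] - sum(prev[j] * acf[k - 1 - j] for j in range(k - 1))
        den = 1.0 - sum(prev[j] * acf[j + 1] for j in range(k - 1))
        phi_kk = num / den
        prev = [prev[j] - phi_kk * prev[k - 2 - j] for j in range(k - 1)]
        prev.append(phi_kk)
        pacf.append(phi_kk)
    return pacf

# Theoretical ACF of an AR(1) process with coefficient 0.6:
# the partial autocorrelation cuts off after lag 1.
ar1_acf = [0.6 ** k for k in range(5)]
print(pacf_from_acf(ar1_acf))
```

A partial autocorrelation that cuts off after lag p points to an AR(p) model, while an autocorrelation that cuts off after lag q points to an MA(q) model.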

Model of Hidden Periods

Periodogram

The periodogram $I(\omega)$ of a time series $y_1, \dots, y_n$ is defined as a function of the variable $\omega$:

$$I(\omega) = \frac{1}{4\pi}\left(a^2(\omega) + b^2(\omega)\right), \quad -\pi \le \omega \le \pi,$$

where

$$a(\omega) = \sqrt{\frac{2}{n}} \sum_{t=1}^{n} y_t \cos(\omega t), \qquad b(\omega) = \sqrt{\frac{2}{n}} \sum_{t=1}^{n} y_t \sin(\omega t).$$

The periodogram can be used to find significant periods in the time series $y_1, \dots, y_n$. We try to construct the following model of the series:

$$y_t = \mu + \sum_{j=1}^{p} \left(\alpha_j \cos(\omega_j t) + \beta_j \sin(\omega_j t)\right) + \varepsilon_t, \quad t = 1, \dots, n,$$

where $\mu, \alpha_j, \beta_j$ are unknown parameters that have to be estimated, $\omega_j$ are distinct frequencies ($\omega_j \in (0, \pi)$) and $\varepsilon_t$ is normally distributed white noise ($N(0, \sigma_\varepsilon^2)$).

The significant frequencies can be determined using Fisher's test. The null hypothesis of Fisher's test is $y_t = \varepsilon_t$, which assumes that the series does not contain any significant periods. The alternative hypothesis assumes that the series contains some significant periods. Fisher's test is based on the values of the periodogram computed at the frequencies

$$\omega_j^* = \frac{2\pi j}{n}, \quad j = 1, \dots, m, \quad m = \left\lfloor \frac{n-1}{2} \right\rfloor.$$

High values of the periodogram indicate significant frequencies. The test statistic is

$$W = \frac{\max_{j = 1, \dots, m} I(\omega_j^*)}{\sum_{i=1}^{m} I(\omega_i^*)}.$$

The null hypothesis is rejected if $W > g_F$, where $g_F$ is the critical value of Fisher's test (critical values are tabulated).
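A minimal sketch of the periodogram and of Fisher's statistic W follows, assuming the square-root normalisation of a(ω) and b(ω) (W itself does not depend on that scaling). The function names and the synthetic weekly series are hypothetical, and the critical value g_F still has to be taken from tables.

```python
import math

def periodogram(y, omega):
    """I(omega) = (a^2 + b^2) / (4 pi) for the series y_1, ..., y_n."""
    n = len(y)
    scale = math.sqrt(2 / n)
    a = scale * sum(v * math.cos(omega * t) for t, v in enumerate(y, start=1))
    b = scale * sum(v * math.sin(omega * t) for t, v in enumerate(y, start=1))
    return (a * a + b * b) / (4 * math.pi)

def fisher_statistic(y):
    """Return (W, omega_max): Fisher's statistic and the maximising frequency."""
    n = len(y)
    m = (n - 1) // 2
    freqs = [2 * math.pi * j / n for j in range(1, m + 1)]
    values = [periodogram(y, w) for w in freqs]
    j_max = max(range(m), key=values.__getitem__)
    return values[j_max] / sum(values), freqs[j_max]

# Synthetic series with a period of 7 observations (illustration only).
n = 70
y = [10 + 3 * math.cos(2 * math.pi * t / 7) for t in range(1, n + 1)]
w_stat, omega_hat = fisher_statistic(y)
print(w_stat, 2 * math.pi / omega_hat)  # statistic and the detected period
```

In practice the test is applied repeatedly: once a frequency is found significant, its periodogram value is removed and the test is repeated on the remaining frequencies.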

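The parameters $\mu, \alpha_j, \beta_j$ can be estimated using the linear regression model

$$y_t = A x_t + \varepsilon_t,$$

where

$$x_t = \begin{bmatrix} 1 & \cos(\omega_1 t) & \sin(\omega_1 t) & \cdots & \cos(\omega_M t) & \sin(\omega_M t) \end{bmatrix}^{\mathsf T},$$

$\omega_1, \dots, \omega_M$ are the significant frequencies and

$$A = \begin{bmatrix} \mu & \alpha_1 & \beta_1 & \cdots & \alpha_M & \beta_M \end{bmatrix}.$$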
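The regression step can be sketched with ordinary least squares solved through the normal equations. This is an illustrative implementation using only the standard library, not the software used by the authors; the function names and the noise-free test series are assumptions.

```python
import math

def design_row(t, freqs):
    """Row x_t = [1, cos(w_1 t), sin(w_1 t), ..., cos(w_M t), sin(w_M t)]."""
    row = [1.0]
    for w in freqs:
        row += [math.cos(w * t), math.sin(w * t)]
    return row

def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x

def fit_hidden_periods(y, freqs):
    """Least-squares estimates [mu, alpha_1, beta_1, ..., alpha_M, beta_M]."""
    rows = [design_row(t, freqs) for t in range(1, len(y) + 1)]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * v for r, v in zip(rows, y)) for i in range(k)]
    return solve(xtx, xty)

# Noise-free series with one known frequency (illustration only).
w = 2 * math.pi / 7
y = [10 + 3 * math.cos(w * t) - 2 * math.sin(w * t) for t in range(1, 71)]
mu, alpha1, beta1 = fit_hidden_periods(y, [w])
print(mu, alpha1, beta1)
```

With real data the frequencies passed in would be those found significant by Fisher's test.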

Exponential Smoothing

Brown's Exponential Smoothing

An exponentially weighted moving average weights the observed time series values unequally, with more recent observations weighted more heavily than older ones. This unequal weighting is achieved through smoothing constants that determine how much weight is given to each observation. If $m_{t-1}$ is the moving average calculated for the first $t-1$ points of the series $y_t$, then given the value $y_t$ the new moving average is

$$m_t = \alpha y_t + (1 - \alpha)\, m_{t-1},$$

where $\alpha$ is the smoothing constant ($0 < \alpha < 1$). For a data series $y_1, \dots, y_n$ the forecasts are given by

$$\hat{y}_t = m_{t-1}.$$

The initial value $m_0$ is calculated as the average level in the first quarter of the series.

Holt's Linear Smoothing

This adds a trend component to Brown's exponential method. For a data series $y_t$ the forecasts are given by

$$\hat{y}_t = m_{t-1} + b_{t-1},$$

where

$$m_t = \alpha y_t + (1 - \alpha)(m_{t-1} + b_{t-1})$$

is the level at time t,

$$b_t = \gamma (m_t - m_{t-1}) + (1 - \gamma)\, b_{t-1}$$

is the trend at time t, $\alpha$ is the level smoothing constant ($0 < \alpha < 1$) and $\gamma$ is the trend smoothing constant ($0 < \gamma < 1$). The initial values $m_0$ and $b_0$ are calculated by a linear regression on the first half of the series.
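The Brown and Holt recursions can be written out directly. In this sketch the initial values are passed in by the caller rather than computed from the first quarter or first half of the series, and the function names and sample numbers are hypothetical.

```python
def brown_forecasts(y, alpha, m0):
    """One-step forecasts yhat_t = m_{t-1}; also returns the final level."""
    m = m0
    forecasts = []
    for value in y:
        forecasts.append(m)                   # forecast made before seeing y_t
        m = alpha * value + (1 - alpha) * m   # update the level with y_t
    return forecasts, m

def holt_forecasts(y, alpha, gamma, m0, b0):
    """One-step forecasts yhat_t = m_{t-1} + b_{t-1}; returns final m and b."""
    m, b = m0, b0
    forecasts = []
    for value in y:
        forecasts.append(m + b)
        m_new = alpha * value + (1 - alpha) * (m + b)
        b = gamma * (m_new - m) + (1 - gamma) * b
        m = m_new
    return forecasts, m, b

# Brown on a short series.
brown_f, brown_level = brown_forecasts([10, 12], alpha=0.5, m0=10)

# Holt on an exactly linear series y_t = 5 + 2t: every forecast is exact
# when the initial level and trend match the line.
line = [5 + 2 * t for t in range(1, 6)]
holt_f, holt_level, holt_trend = holt_forecasts(line, 0.3, 0.2, m0=5, b0=2)
print(brown_f, brown_level, holt_f)
```

After the last observation, the next Brown forecast is the final level, and the next Holt forecast is the final level plus the final trend.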

Winters' Seasonal Smoothing (Additive)

This adds an additive seasonal component to Holt's linear method. For a data series $y_t$ the forecasts are given by

$$\hat{y}_t = m_{t-1} + b_{t-1} + c_{t-s},$$

where

$$m_t = \alpha (y_t - c_{t-s}) + (1 - \alpha)(m_{t-1} + b_{t-1})$$

is the level at time t,

$$b_t = \gamma (m_t - m_{t-1}) + (1 - \gamma)\, b_{t-1}$$

is the trend at time t,

$$c_t = \delta (y_t - m_t) + (1 - \delta)\, c_{t-s}$$

is the seasonal component at time t, $\alpha$ is the level smoothing constant ($0 < \alpha < 1$), $\gamma$ is the trend smoothing constant ($0 < \gamma < 1$), $\delta$ is the seasonal smoothing constant ($0 < \delta < 1$) and s is the season period.

Winters' Seasonal Smoothing (Multiplicative)

This adds a multiplicative seasonal component to Holt's linear method. For a data series $y_t$ the forecasts are given by

$$\hat{y}_t = (m_{t-1} + b_{t-1})\, c_{t-s},$$

where

$$m_t = \alpha \frac{y_t}{c_{t-s}} + (1 - \alpha)(m_{t-1} + b_{t-1})$$

is the level at time t,

$$b_t = \gamma (m_t - m_{t-1}) + (1 - \gamma)\, b_{t-1}$$

is the trend at time t,

$$c_t = \delta \frac{y_t}{m_t} + (1 - \delta)\, c_{t-s}$$

is the seasonal component at time t, $\alpha$ is the level smoothing constant ($0 < \alpha < 1$), $\gamma$ is the trend smoothing constant ($0 < \gamma < 1$), $\delta$ is the seasonal smoothing constant ($0 < \delta < 1$) and s is the season period.
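Both seasonal variants share the same loop, so they can be sketched in one function. The slides do not say how the seasonal components are initialised, so the sketch takes the s initial seasonal values as an argument (an assumption); the function name and the synthetic weekly data are hypothetical.

```python
def winters_forecasts(y, s, alpha, gamma, delta, m0, b0, c0,
                      multiplicative=False):
    """One-step forecasts for Winters' additive or multiplicative smoothing.

    c0 holds the s initial seasonal components (times 1-s, ..., 0).
    """
    m, b = m0, b0
    c = list(c0)  # c[i % s] holds c_{t-s} when the i-th value is processed
    forecasts = []
    for i, value in enumerate(y):
        season = c[i % s]
        if multiplicative:
            forecasts.append((m + b) * season)
            m_new = alpha * value / season + (1 - alpha) * (m + b)
            c[i % s] = delta * value / m_new + (1 - delta) * season
        else:
            forecasts.append(m + b + season)
            m_new = alpha * (value - season) + (1 - alpha) * (m + b)
            c[i % s] = delta * (value - m_new) + (1 - delta) * season
        b = gamma * (m_new - m) + (1 - gamma) * b
        m = m_new
    return forecasts

# Synthetic data with a linear trend and a weekly pattern (s = 7).
pattern = [3, -1, 0, 2, -2, -1, -1]
y_add = [100 + t + pattern[(t - 1) % 7] for t in range(1, 29)]
f_add = winters_forecasts(y_add, 7, 0.3, 0.2, 0.1, 100, 1, pattern)

factors = [1.2, 0.9, 1.0, 1.1, 0.8, 1.0, 1.0]
y_mul = [(100 + t) * factors[(t - 1) % 7] for t in range(1, 29)]
f_mul = winters_forecasts(y_mul, 7, 0.3, 0.2, 0.1, 100, 1, factors,
                          multiplicative=True)
print(max(abs(f - v) for f, v in zip(f_add, y_add)))
```

Because the synthetic series follow the model exactly and start from matching initial values, the forecast errors are zero up to rounding; with measured concentrations the smoothing constants would be chosen to minimise the forecast error.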

Problems of Exceeding the Level u

Let $\{X_t\}$ be a random process defined on an interval $\langle 0, T \rangle$. We assume that $\{X_t\}$ has the following properties:

a) $\{X_t\}$ is real, $E X_t^2 < \infty$, $E X_t = 0$;
b) $\{X_t\}$ is continuous in the mean ($E|X_t - X_{t_0}|^2 \to 0$ for $t \to t_0$) and strictly stationary (for any admissible $t_1, \dots, t_n$ and any $h$, $F_{t_1, \dots, t_n}(x_1, \dots, x_n) = F_{t_1 + h, \dots, t_n + h}(x_1, \dots, x_n)$);
c) $\{X_t\}$ is Gaussian (all distribution functions are normal);
d) the realizations of the process $\{X_t\}$ are continuous with probability 1.

Let $R(t)$ be the autocovariance function of the process $\{X_t\}$ and $f(\lambda)$ its spectral density function. Define the moments

$$\lambda_k = \int_{-\infty}^{\infty} \lambda^k f(\lambda)\, d\lambda, \quad k = 0, 1, 2, \dots$$

It is easy to prove that $\lambda_0 = R(0)$, because

$$R(t) = \int_{-\infty}^{\infty} \exp(it\lambda)\, f(\lambda)\, d\lambda.$$

If $\int_{-\infty}^{\infty} |\lambda|\, f(\lambda)\, d\lambda < \infty$, then $\lambda_1 = 0$ (the spectral density function is symmetric about zero).

Let u be a real number and let $G_u$ be the set of continuous functions on $\langle 0, T \rangle$ that are not identically equal to u on any open subinterval of $\langle 0, T \rangle$ and satisfy $g(0) \ne u$, $g(T) \ne u$. A function $g \in G_u$ crosses the level u at a point $t_0 \in (0, T)$ if every neighbourhood of $t_0$ contains points $t_1$ and $t_2$ such that

$$[g(t_1) - u][g(t_2) - u] < 0.$$

The number of such points in $(0, T)$ is $C_u(0, T)$, or $C_u$ for short.

If $\{X_t\}$ fulfils conditions a) to d), then

$$E[C_u(0, T)] = \begin{cases} \dfrac{T}{\pi} \left(\dfrac{\lambda_2}{\lambda_0}\right)^{1/2} \exp\left(-\dfrac{u^2}{2\lambda_0}\right) & \text{for } \lambda_2 < \infty, \\[2ex] \infty & \text{for } \lambda_2 = \infty. \end{cases}$$

Let $\{X_t\}$ fulfil conditions a) to d) and let u be a real number. We define

$$Y(t) = \begin{cases} 1 & \text{when } X(t) \ge u, \\ 0 & \text{when } X(t) < u, \end{cases} \qquad Z(T) = \int_0^T Y(t)\, dt.$$

The variable $Z(T)$ is the total time during which the process $\{X_t\}$ is over the level u. Its expectation is

$$E Z(T) = T \left[1 - \phi\left(\frac{u}{\sigma}\right)\right],$$

where $\sigma^2 = R(0)$ and $\phi(\cdot)$ is the distribution function of $N(0, 1)$.

If we describe the time series by an ARMA(p, q) process

$$y_t = \phi_1 y_{t-1} + \phi_2 y_{t-2} + \dots + \phi_p y_{t-p} + \varepsilon_t + \theta_1 \varepsilon_{t-1} + \theta_2 \varepsilon_{t-2} + \dots + \theta_q \varepsilon_{t-q},$$

then the spectral density function is

$$f(\lambda) = \frac{\sigma^2}{2\pi} \cdot \frac{\left|1 + \theta_1 \exp(-i\lambda) + \theta_2 \exp(-2i\lambda) + \dots + \theta_q \exp(-iq\lambda)\right|^2}{\left|1 - \phi_1 \exp(-i\lambda) - \phi_2 \exp(-2i\lambda) - \dots - \phi_p \exp(-ip\lambda)\right|^2},$$

where $\sigma^2$ here denotes $\mathrm{var}(\varepsilon_t)$, not $R(0)$.
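The spectral density of an ARMA process and its moments can be evaluated numerically as sketched below. The moments are defined as integrals over the whole real line; for a discrete-time ARMA model the sketch integrates over [-π, π], which is an assumption about how the moments were computed. The function names are hypothetical.

```python
import cmath
import math

def arma_spectral_density(lam, phi, theta, sigma2):
    """f(lam) for an ARMA process with AR coefficients phi, MA coefficients theta."""
    z = cmath.exp(-1j * lam)
    ma = 1 + sum(th * z ** (k + 1) for k, th in enumerate(theta))
    ar = 1 - sum(ph * z ** (k + 1) for k, ph in enumerate(phi))
    return sigma2 / (2 * math.pi) * abs(ma) ** 2 / abs(ar) ** 2

def spectral_moment(k, phi, theta, sigma2, steps=4096):
    """Midpoint-rule approximation of the integral of lam^k f(lam) on [-pi, pi]."""
    h = 2 * math.pi / steps
    total = 0.0
    for i in range(steps):
        lam = -math.pi + (i + 0.5) * h
        total += lam ** k * arma_spectral_density(lam, phi, theta, sigma2)
    return total * h

# AR(1) with coefficient 0.5 and unit innovation variance:
# lambda_0 should equal the process variance 1 / (1 - 0.5^2).
lam0 = spectral_moment(0, [0.5], [], 1.0)
lam1 = spectral_moment(1, [0.5], [], 1.0)
lam2 = spectral_moment(2, [0.5], [], 1.0)
print(lam0, lam1, lam2)
```

The same two functions applied to the fitted AR(7) coefficients and innovation variance would give the moments needed in the crossing formula.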

Series of DUST

Transformation: $y_t^{TR} = y_t - c$, where c = 56.57 (linear regression).

The series $y_t^{TR}$: model AR(7)

p(1)     p(2)      p(3)     p(4)     p(5)     p(6)     p(7)
0.5782   -0.0244   0.0035   0.0634   0.0019   0.0496   0.0893

1. The number $C_u$

$\lambda_0 = 739.0761$, $\lambda_2 = 823.2549$

Level u   E[C_u(0, T)]   Measured value
0         613.1          432
10        573.0          363
20        467.7          273
30        333.5          219
40        207.7          161
50        113.0          117
60        53.7           85
70        22.3           59
80        8.1            49
90        2.6            29
100       0.7            27
110       0.2            12
120                      10
130                      10
140                      6
150                      2

2. Total time over the level, Z(T)

Level u   E[Z(T)]   Measured value
0         912.5     707
10        605.6     468
20        421.5     329
30        249.2     222
40        128.8     146
50        60.1      100
60        24.9      73
70        9.2       45
80        3.0       33
90        0.8       23
100       0.2       18
110       0.1       8
120                 6
130                 5
140                 3
150                 1
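The theoretical columns of the two tables can be recomputed from the formulas for E[C_u(0, T)] and E Z(T) using the moments λ0 and λ2 given on this slide. The slides do not state the length T; the sketch assumes T = 1825 (five years of daily values), chosen because it matches the tabulated expectations at u = 0. The function names are hypothetical.

```python
import math

def expected_crossings(u, T, lam0, lam2):
    """E[C_u(0, T)] for finite lambda_2."""
    return T / math.pi * math.sqrt(lam2 / lam0) * math.exp(-u * u / (2 * lam0))

def expected_time_over(u, T, lam0):
    """E Z(T) = T * (1 - Phi(u / sigma)) with sigma^2 = lambda_0 = R(0)."""
    sigma = math.sqrt(lam0)
    cdf = 0.5 * (1 + math.erf(u / (sigma * math.sqrt(2))))
    return T * (1 - cdf)

LAM0 = 739.0761   # lambda_0 from the slide
LAM2 = 823.2549   # lambda_2 from the slide
T = 1825          # assumption: number of daily observations in five years

for u in (0, 20, 50, 100):
    crossings = expected_crossings(u, T, LAM0, LAM2)
    time_over = expected_time_over(u, T, LAM0)
    print(u, round(crossings, 1), round(time_over, 1))
```

Under this assumption the values at u = 0 come out close to the tabulated 613.1 and 912.5, which supports reading the tables as computed for the transformed series with zero mean.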

Figure: E[Cu] (blue) and measured value (red); number of crossings against level u.

Figure: E[Z(T)] (blue) and measured value (red); total time against level u.

Figure: Spectral density of DUST, AR(7), against frequency w.

Figure: Series of CO (blue) and ARIMA(0,1,9) (red), in 10^-9 kg/m^3; whole series and detail for t = 1200 to 1400.

Figure: Series of CO (blue) and model of hidden periods (Fisher) (red), in 10^-9 kg/m^3; whole series and detail for t = 1200 to 1400.

Figure: Series of CO (blue) and exponential smoothing with additive season (7), Alpha = 0.696, Delta = 0.071 (red), in 10^-9 kg/m^3; whole series and detail for t = 1200 to 1400.

Figure: Series of CO (blue) and predicted values (red) for ARIMA(0,1,9), exponential smoothing, and the model of hidden periods (Fisher).

Figure: Series of NOx (blue) and SARIMA(0,1,3)(0,1,1) lag 7, ln(NOx) (red), in 10^-9 kg/m^3; whole series and detail for t = 1200 to 1400.

Figure: Series of NOx (blue) and model of hidden periods (Fisher) (red), in 10^-9 kg/m^3; whole series and detail for t = 1200 to 1400.

Figure: Series of NOx (blue) and exponential smoothing with additive season (7), Alpha = 0.641, Delta = 0.078 (red), in 10^-9 kg/m^3; whole series and detail for t = 1200 to 1400.

Figure: Series of NOx (blue) and predicted values (red) for SARIMA(0,1,3)(0,1,1) lag 7, ln(NOx), the model of hidden periods (Fisher), and exponential smoothing.

Figure: Series of DUST (blue) and SARIMA(1,1,1)(1,0,1) lag 7 (red), in 10^-9 kg/m^3; whole series and detail for t = 1200 to 1400.

Figure: Series of DUST (blue) and model of hidden periods (Fisher) (red), in 10^-9 kg/m^3; whole series and detail for t = 1200 to 1400.

Figure: Series of DUST (blue) and exponential smoothing, Alpha = 0.499 (red), in 10^-9 kg/m^3; whole series and detail for t = 1200 to 1400.

Figure: Series of DUST (blue) and predicted values (red) for SARIMA(1,1,1)(1,0,1) lag 7, the model of hidden periods (Fisher), and exponential smoothing.

Figure: Series of SO2 (blue) and ARIMA(1,1,4), ln(SO2+10) (red), in 10^-9 kg/m^3; whole series and detail for t = 1200 to 1400.

Figure: Series of SO2 (blue) and model of hidden periods (Fisher) (red), in 10^-9 kg/m^3; whole series and detail for t = 1200 to 1400.

Figure: Series of SO2 (blue) and exponential smoothing, Alpha = 0.675 (red), in 10^-9 kg/m^3; whole series and detail for t = 1200 to 1400.

Figure: Series of SO2 (blue) and predicted values (red) for ARIMA(1,1,4), ln(SO2+10), the model of hidden periods (Fisher), and exponential smoothing.