Log-Scaling rainfall data: effects on GPD Bayesian goodness of fit.


SLIDE 1

Motivation | Model Estimation | Model Checking | Conclusions

Log-Scaling rainfall data: effects on GPD Bayesian goodness of fit.

M.I. Ortego, J.J. Egozcue

Departament de Matemàtica Aplicada III, E.T.S. Enginyeria Camins Canals Ports Barcelona (Civil Engineering), Universitat Politècnica de Catalunya

4th Conference on Extreme Value Analysis, Gothenburg, 15-19 August 2005

M.I. Ortego, J.J. Egozcue. Log-scaling rainfall data: effects on GPD GOF

SLIDE 2

Outline

1. Motivation
  • Rainfall data
  • Problems with model adequacy
  • p-values

2. Model Estimation
  • Bayesian Generalized Pareto Estimation (BGPE)
  • Priors and posteriors

3. Model Checking
  • GPD goodness-of-fit
  • Whole model

4. Conclusions

SLIDE 3

Vergel de Racons data.

[Figure: daily precipitation (mm) at Vergel de Racons against time (years), 1964 to 1994; vertical axis from 50 to 350 mm.]

Main goals:

  • Finding a suitable model.
  • Hazard analysis.
  • Occurrence probabilities; return periods.

For reference, see Romero et al. (1998), [4].

SLIDE 4

The model

Scale of the reference variable, precipitation:

  ⋄ It is a positive variable (0 mm of rainfall is not rainfall!).
  ⋄ It has a relative scale: 50 mm of daily rainfall is double 25 mm, but 500 mm and 525 mm of daily rainfall are nearly the same!

A logarithmic scale is needed!
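As a small numeric illustration of the relative-scale argument (a sketch that is not part of the original slides; the function names are hypothetical), the two pairs of rainfall amounts can be compared on the raw scale and on the log10 scale:

```python
import math

def raw_difference(a_mm, b_mm):
    """Absolute difference between two rainfall amounts, in mm."""
    return abs(b_mm - a_mm)

def log_difference(a_mm, b_mm):
    """Difference on the log10 scale, i.e. |log10(b / a)|."""
    return abs(math.log10(b_mm) - math.log10(a_mm))

# The two pairs quoted on the slide.
for a, b in [(25.0, 50.0), (500.0, 525.0)]:
    print(f"{a:.0f} mm vs {b:.0f} mm: "
          f"raw = {raw_difference(a, b):.1f} mm, "
          f"log10 = {log_difference(a, b):.4f}")
```

Both pairs differ by 25 mm on the raw scale, whereas on the log10 scale the first difference is log10(2) ≈ 0.301 and the second is log10(1.05) ≈ 0.021.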

SLIDE 5

The model

  • Occurrence: Cramér-Lundberg model (homogeneous Poisson process with intensity parameter λ).
  • Magnitude: excesses over a threshold described by a Generalized Pareto Distribution (GPD).
  • Bayesian parameter estimation.
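A minimal simulation sketch of this occurrence/magnitude model may help fix ideas. It is not the authors' code: the parameter values are hypothetical, ξ = 0 is not handled, and the return period is computed with the usual relation 1 / (λ · P[excess > x]) for a Poisson process with independent GPD marks, which is assumed here rather than quoted from the slides.

```python
import math
import random

def gpd_cdf(x, xi, beta):
    """GPD distribution function of an excess x (xi != 0)."""
    if x <= 0.0:
        return 0.0
    z = 1.0 + xi * x / beta
    if z <= 0.0:                      # beyond the upper end point (xi < 0)
        return 1.0
    return 1.0 - z ** (-1.0 / xi)

def gpd_sample(xi, beta, rng):
    """Draw one excess by inverting the GPD distribution function."""
    u = rng.random()
    return beta / xi * ((1.0 - u) ** (-xi) - 1.0)

def simulate_events(lam, xi, beta, years, rng):
    """Simulate (time, excess) pairs: Poisson occurrences, GPD magnitudes."""
    events = []
    t = rng.expovariate(lam)          # exponential waiting times
    while t < years:
        events.append((t, gpd_sample(xi, beta, rng)))
        t += rng.expovariate(lam)
    return events

def return_period(x, lam, xi, beta):
    """Mean time (in years) between events whose excess is larger than x."""
    survival = 1.0 - gpd_cdf(x, xi, beta)
    return math.inf if survival == 0.0 else 1.0 / (lam * survival)

# Hypothetical values: 2 events per year, xi = -0.2, beta = 30 mm.
rng = random.Random(0)
events = simulate_events(lam=2.0, xi=-0.2, beta=30.0, years=30.0, rng=rng)
print(len(events), "simulated events in 30 years")
print("return period of a 100 mm excess:", return_period(100.0, 2.0, -0.2, 30.0))
```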

SLIDE 6

Is it a suitable model?

Hazard estimates: at high levels, estimates are highly uncertain because data are scarce. Estimates of typical hazard parameters (e.g. the return period) vary dramatically depending on the selected model:

[Figure, two panels, each showing the distribution of log10(return period) as quantile curves (0.05, 0.1, 0.25, 0.5, 0.75, 0.9, 0.95). Left panel: return period of raw data, horizontal axis in mm (50 to 750). Right panel: return period of log data, horizontal axis from 1.40 to 2.80.]

SLIDE 7

Return period of raw data

[Figure: distribution of log10(return period) against rainfall level in mm (50 to 750), shown as quantile curves (0.05, 0.1, 0.25, 0.5, 0.75, 0.9, 0.95).]

SLIDE 8

Return period of log data

[Figure: distribution of log10(return period) for log-scaled data, horizontal axis from 1.40 to 2.80, shown as quantile curves (0.05, 0.1, 0.25, 0.5, 0.75, 0.9, 0.95).]

SLIDE 9

p-values

  • Whole-model checking (prior + likelihood + GPD): there are several ways of checking it. For reference, see Gelman et al. (1995, 1996), [6].
  • Goodness-of-fit checking: assessing the goodness of fit of GPD(ξ, β). Again, there are several ways of checking it.

SLIDE 10

p-values

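A first approach: the plug-in p-value ($p_{plug}$),

$$p_{plug} = P_{\mathrm{gpd}(\cdot|\hat{\xi},\hat{\beta})}\left[t(X) \ge t(x_{obs})\right],$$

where $\mathrm{gpd}(x|\xi,\beta)$ is replaced by $\mathrm{gpd}(\cdot|\hat{\xi},\hat{\beta})$, and $(\hat{\xi},\hat{\beta})$ is the maximum likelihood estimate of the parameters.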
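One way to approximate the plug-in p-value is by Monte Carlo simulation. The sketch below assumes the estimates (ξ̂, β̂) have already been obtained elsewhere (no maximum likelihood fit is performed here), that ξ̂ ≠ 0, and it uses the sample maximum as an illustrative test statistic t; all names and values are hypothetical.

```python
import random

def gpd_sample(xi, beta, rng):
    """Draw one GPD excess by inverting the distribution function (xi != 0)."""
    u = rng.random()
    return beta / xi * ((1.0 - u) ** (-xi) - 1.0)

def plug_in_p_value(x_obs, xi_hat, beta_hat, statistic, n_rep=2000, seed=0):
    """Monte Carlo estimate of P[t(X) >= t(x_obs)] under GPD(xi_hat, beta_hat)."""
    rng = random.Random(seed)
    t_obs = statistic(x_obs)
    hits = 0
    for _ in range(n_rep):
        # Replicate data set of the same size as the observed one.
        x_rep = [gpd_sample(xi_hat, beta_hat, rng) for _ in x_obs]
        if statistic(x_rep) >= t_obs:
            hits += 1
    return hits / n_rep

# Hypothetical excesses (mm) and hypothetical parameter estimates.
x_obs = [5.0, 12.0, 20.0, 33.0, 47.0, 80.0]
print(plug_in_p_value(x_obs, xi_hat=-0.2, beta_hat=30.0, statistic=max))
```

Because the parameters are fixed at their estimates, the uncertainty about (ξ, β) does not enter the result.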

SLIDE 11

p-values

Bayesian p-values: posterior predictive p-value (ppost)

Guttman (1967) and Rubin (1984)

$$p_{post} = P_{m_{post}(\cdot|x_{obs})}\left[t(X) \ge t(x_{obs})\right],$$

where $m_{post}(x|x_{obs})$ is the posterior predictive distribution,

$$m_{post}(x|x_{obs}) = \int \mathrm{gpd}(x|\xi,\beta)\,\pi(\xi,\beta|x_{obs})\,d(\xi,\beta),$$

and $\pi(\xi,\beta|x_{obs})$ is the posterior density for ξ, β.
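The posterior predictive p-value can be approximated by simulation once draws from the posterior of (ξ, β) are available. The sketch below assumes such draws are given as an equally weighted list (how they are produced is outside its scope), that every ξ is nonzero, and it uses the sample maximum as an illustrative statistic; names and values are hypothetical.

```python
import random

def gpd_sample(xi, beta, rng):
    """Draw one GPD excess by inverting the distribution function (xi != 0)."""
    u = rng.random()
    return beta / xi * ((1.0 - u) ** (-xi) - 1.0)

def posterior_predictive_p_value(x_obs, posterior_draws, statistic,
                                 n_rep=2000, seed=0):
    """Monte Carlo estimate of P[t(X) >= t(x_obs)] under m_post(.|x_obs)."""
    rng = random.Random(seed)
    t_obs = statistic(x_obs)
    hits = 0
    for _ in range(n_rep):
        # Integrate over the posterior: pick parameters, then simulate data.
        xi, beta = rng.choice(posterior_draws)
        x_rep = [gpd_sample(xi, beta, rng) for _ in x_obs]
        if statistic(x_rep) >= t_obs:
            hits += 1
    return hits / n_rep

# Hypothetical posterior draws (xi, beta) and hypothetical excesses (mm).
draws = [(-0.25, 32.0), (-0.20, 30.0), (-0.15, 28.0), (-0.30, 35.0)]
x_obs = [5.0, 12.0, 20.0, 33.0, 47.0, 80.0]
print(posterior_predictive_p_value(x_obs, draws, statistic=max))
```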

SLIDE 12

p-values

Discrepancy p-value ($p_{dis}$)

(Gelman et al., 1995)

The test statistic $t(X)$ is replaced by a discrepancy $t(X,\xi,\beta)$:

$$p_{dis} = P_{m_{dis}(\cdot)}\left[t(X,\xi,\beta) \ge t(x_{obs},\xi,\beta)\right],$$

where $m_{dis}(x,\xi,\beta|x_{obs})$ is

$$m_{dis}(x,\xi,\beta|x_{obs}) = \mathrm{gpd}(x|\xi,\beta)\,\pi(\xi,\beta|x_{obs}),$$

and $\pi(\xi,\beta|x_{obs})$ is the posterior density for ξ, β.
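A simulation sketch of the discrepancy p-value follows. It assumes equally weighted posterior draws of (ξ, β) with ξ ≠ 0 and takes the Kolmogorov-Smirnov distance between a sample and GPD(ξ, β) as the discrepancy; that choice, the names and the numbers are illustrative assumptions.

```python
import random

def gpd_cdf(x, xi, beta):
    """GPD distribution function of an excess x (xi != 0)."""
    if x <= 0.0:
        return 0.0
    z = 1.0 + xi * x / beta
    return 1.0 if z <= 0.0 else 1.0 - z ** (-1.0 / xi)

def gpd_sample(xi, beta, rng):
    """Draw one GPD excess by inverting the distribution function."""
    u = rng.random()
    return beta / xi * ((1.0 - u) ** (-xi) - 1.0)

def ks_distance(sample, xi, beta):
    """Kolmogorov-Smirnov distance between a sample and GPD(xi, beta)."""
    xs = sorted(sample)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs, start=1):
        f = gpd_cdf(x, xi, beta)
        d = max(d, i / n - f, f - (i - 1) / n)
    return d

def discrepancy_p_value(x_obs, posterior_draws, seed=0):
    """Share of draws in which replicated data fit worse than observed data."""
    rng = random.Random(seed)
    hits = 0
    for xi, beta in posterior_draws:
        # Replicated and observed data are compared under the same (xi, beta).
        x_rep = [gpd_sample(xi, beta, rng) for _ in x_obs]
        if ks_distance(x_rep, xi, beta) >= ks_distance(x_obs, xi, beta):
            hits += 1
    return hits / len(posterior_draws)

# Hypothetical posterior draws and hypothetical excesses (mm).
draws = [(-0.25, 32.0), (-0.20, 30.0), (-0.15, 28.0), (-0.30, 35.0)] * 50
x_obs = [5.0, 12.0, 20.0, 33.0, 47.0, 80.0]
print(discrepancy_p_value(x_obs, draws))
```

Unlike a plain test statistic, the discrepancy is re-evaluated for every parameter draw, so the observed value being compared against changes from draw to draw.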

SLIDE 13

p-values: pros and cons

Desirable characteristics:

  • Uniform distribution.
  • Easy to compute.

Other useful characteristics:

  • Known distribution of the statistic used (even asymptotically).
  • Ease of interpretation.

Pros and cons:

  • plug-in p-value: easy to compute; parameter uncertainty is ignored.
  • posterior predictive p-value: easy to compute; not uniform.
  • discrepancy p-value: not uniform.

SLIDE 14

Bayesian GP Estimation (BGPE)

Three parameters are to be estimated in the model: the Poisson rate λ of Poisson(λ), and ξ, β of the magnitude, modelled by GPD(ξ, β):

$$\mathrm{GPD}_X(x|\xi,\beta) = 1 - \left(1 + \frac{\xi}{\beta}\,x\right)^{-1/\xi}$$

A suitable joint prior distribution for λ, ξ, β is set. The prior distributions for λ and for (ξ, β) are independent, so the joint prior factorizes:

$$\pi_{\lambda,\xi,\beta}(\lambda,\xi,\beta) = \pi_{\lambda}(\lambda)\cdot\pi_{\xi,\beta}(\xi,\beta)$$

SLIDE 15

The joint likelihood of the parameters, $L(\lambda,\xi,\beta|x_{obs})$, splits into two terms:

$$L(\lambda,\xi,\beta|x_{obs}) = L(\lambda|x_{obs})\cdot L(\xi,\beta|x_{obs})$$

Finally, the posterior distribution of λ, ξ, β is obtained, up to a normalizing constant:

$$\pi_{\lambda,\xi,\beta}(\lambda,\xi,\beta|x_{obs}) \propto L(\lambda,\xi,\beta|x_{obs})\cdot\pi_{\lambda}(\lambda)\cdot\pi_{\xi,\beta}(\xi,\beta)$$

Attention is focused on the marginal posterior distribution of ξ, β:

$$\pi_{\xi,\beta}(\xi,\beta|x_{obs}) \propto L(\xi,\beta|x_{obs})\cdot\pi_{\xi,\beta}(\xi,\beta)$$

For reference, see Egozcue and Ramis (2001), [1], and Egozcue and Tolosana (2002), [2].
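The marginal posterior of (ξ, β) can be approximated on a grid as likelihood times prior, normalized to sum to one. The sketch below is only an illustration of that idea and is not the BGPE program: the grid, the flat prior on ξ < 0 and the data are hypothetical, and ξ = 0 must be kept off the grid.

```python
import math

def gpd_log_likelihood(x_obs, xi, beta):
    """Log-likelihood of GPD(xi, beta) for excesses x_obs (xi != 0)."""
    total = -len(x_obs) * math.log(beta)
    for x in x_obs:
        z = 1.0 + xi * x / beta
        if z <= 0.0:                  # observation outside the support
            return -math.inf
        total -= (1.0 / xi + 1.0) * math.log(z)
    return total

def grid_posterior(x_obs, xi_grid, beta_grid, log_prior):
    """Return grid points (xi, beta) and normalized posterior weights."""
    points, log_post = [], []
    for xi in xi_grid:
        for beta in beta_grid:
            points.append((xi, beta))
            log_post.append(gpd_log_likelihood(x_obs, xi, beta)
                            + log_prior(xi, beta))
    top = max(log_post)
    if top == -math.inf:
        raise ValueError("posterior is zero on the whole grid")
    # Subtract the maximum before exponentiating to avoid underflow.
    weights = [math.exp(lp - top) for lp in log_post]
    norm = sum(weights)
    return points, [w / norm for w in weights]

def flat_prior_negative_xi(xi, beta):
    """Hypothetical log-prior: flat on xi < 0, zero mass elsewhere."""
    return 0.0 if xi < 0.0 else -math.inf

# Hypothetical excesses (mm) and a deliberately coarse grid.
x_obs = [10.0, 20.0, 100.0]
points, weights = grid_posterior(x_obs, [-0.5, -0.1], [20.0, 40.0],
                                 flat_prior_negative_xi)
for (xi, beta), w in zip(points, weights):
    print(f"xi = {xi:+.1f}, beta = {beta:.0f}: weight = {w:.4f}")
```

Grid points whose upper end point −β/ξ lies below the largest observation receive zero weight, because the likelihood vanishes there.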

SLIDE 16

Prior and posterior distributions: raw data (I)

[Figure, two panels: prior density; posterior density.]

Something is lost!

SLIDE 17

Prior and posterior distributions: raw data (II)

[Figure, two panels: prior density, ξ < 0; posterior density, ξ < 0.]

SLIDE 18

Prior and posterior distributions: log data

[Figure, two panels: prior density; posterior density.]

SLIDE 19

p-values: Our alternative

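First approach:

$$p_p = \sum_i \psi_i\, p_i \quad \text{(predictive KS p-value)},$$

where $p_i = \mathrm{KSGOF}(\xi_i,\beta_i)$ for fixed $(\xi_i,\beta_i)$, and $\psi_i = \pi(\xi_i,\beta_i|x_{obs})$.

Our alternative:

$$p_p = \Phi\left(\frac{\sum_{i=1}^{n} \psi_i\, \Phi^{-1}(p_i)}{\sum_i \psi_i^{\delta}}\right), \quad 1 \le \delta \le 2, \quad \delta \simeq 1,$$

with $p_i = \mathrm{KSGOF}(\xi_i,\beta_i)$ for fixed $(\xi_i,\beta_i)$ and $\psi_i = \pi(\xi_i,\beta_i|x_{obs})$.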
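The two ways of combining the per-parameter p-values on this slide can be sketched as below. The code assumes the weights ψ_i are normalized to sum to one and that the denominator of the alternative is the plain sum of ψ_i^δ; both points are assumptions, and the function names are hypothetical.

```python
from statistics import NormalDist

def normalize(weights):
    """Scale posterior weights psi_i so that they sum to one."""
    total = sum(weights)
    return [w / total for w in weights]

def weighted_mean_p_value(p_values, weights):
    """First approach: posterior-weighted mean of the p-values."""
    psi = normalize(weights)
    return sum(w * p for w, p in zip(psi, p_values))

def probit_combined_p_value(p_values, weights, delta=1.0):
    """Alternative: combine on the probit scale, then map back with Phi.

    Every p-value must lie strictly between 0 and 1, otherwise
    NormalDist.inv_cdf raises statistics.StatisticsError.
    """
    std = NormalDist()                # standard normal: Phi and its inverse
    psi = normalize(weights)
    numerator = sum(w * std.inv_cdf(p) for w, p in zip(psi, p_values))
    denominator = sum(w ** delta for w in psi)
    return std.cdf(numerator / denominator)

# Hypothetical K-S p-values at three grid points with posterior weights.
p_values = [0.04, 0.10, 0.30]
weights = [0.5, 0.3, 0.2]
print(weighted_mean_p_value(p_values, weights))
print(probit_combined_p_value(p_values, weights, delta=1.0))
```

With normalized weights and δ = 1 the denominator equals one, so the alternative reduces to a weighted average of the probit-transformed p-values.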

SLIDE 20

GPD Goodness-of-fit

                               RAW DATA         LOG DATA
K-S posterior predictive GoF   4.984 × 10^-2    0.7028
K-S discrepancy p-value        4.781 × 10^-2    0.7549
Our approach                   2.075 × 10^-2    0.9605

(Other statistics can be used.)

SLIDE 21

Whole-model assessment

Slope discrepancy: estimation of the slope of the expected-excesses regression line,

$$E[X - u \mid X > u;\ \xi,\beta] = \frac{\beta + \xi u}{1 - \xi}, \quad \xi < 1,$$

whose slope in u, ξ/(1 − ξ), yields an estimator of ξ.

Results:

                            RAW DATA         LOG DATA
Slope discrepancy p-value   8.765 × 10^-2    0.4902
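The link between the slope of the expected-excess line and ξ can be checked numerically. The sketch below evaluates the line at a few thresholds, fits a least-squares slope and inverts slope = ξ/(1 − ξ); the thresholds and parameter values are hypothetical, and the thresholds are kept below the upper end point −β/ξ that applies when ξ < 0.

```python
def mean_excess(u, xi, beta):
    """Expected excess E[X - u | X > u] for GPD(xi, beta), valid for xi < 1."""
    if xi >= 1.0:
        raise ValueError("the mean excess is finite only for xi < 1")
    return (beta + xi * u) / (1.0 - xi)

def least_squares_slope(us, values):
    """Ordinary least-squares slope of values against thresholds us."""
    n = len(us)
    mean_u = sum(us) / n
    mean_v = sum(values) / n
    sxy = sum((u - mean_u) * (v - mean_v) for u, v in zip(us, values))
    sxx = sum((u - mean_u) ** 2 for u in us)
    return sxy / sxx

def xi_from_slope(slope):
    """Invert slope = xi / (1 - xi) to recover xi."""
    return slope / (1.0 + slope)

# Hypothetical parameters; thresholds stay below -beta/xi = 150.
xi, beta = -0.2, 30.0
thresholds = [0.0, 10.0, 20.0, 40.0]
excesses = [mean_excess(u, xi, beta) for u in thresholds]
slope = least_squares_slope(thresholds, excesses)
print("slope:", slope, "recovered xi:", xi_from_slope(slope))
```

With real data the values at each threshold would be empirical mean excesses rather than the exact line, so the recovered ξ would only be an estimate.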

SLIDE 22

Prior vs. likelihood: raw data

[Figure, two panels: prior density, ξ < 0; raw data likelihood.]

For raw data, the likelihood lies outside the prior domain!

SLIDE 23

Conclusions

The model fits log-scaled data better (i.e. Weibull domain of attraction (d.a.) prior, GPD). Checked by:

  • predictive goodness of fit;
  • discrepancy Bayesian p-value.

Are Gumbel/Fréchet d.a. admissible models for natural (finite) phenomena?

SLIDE 24

For Further Reading I

[1] Egozcue, J.J. and Ramis, C. Bayesian hazard analysis of heavy precipitation in Eastern Spain. Int. J. Climatol., 21: 1263-1279, 2001.

[2] Egozcue, J.J. and Tolosana-Delgado, R. Program BGPE: Bayesian Generalized Pareto Estimation. Ed. Diaz-Barrero, J.L., ISBN 84-69999125, Barcelona, Spain, 2002.

[3] Egozcue, J.J., Pawlowsky-Glahn, V. and Ortego, M.I. Wave-height hazard analysis in Eastern Coast of Spain. Bayesian approach using generalized Pareto distribution. Advances in Geosciences, 2: 25-30, 2005.

SLIDE 25

For Further Reading II

[4] Romero, R., Guijarro, J.A., Ramis, C. and Alonso, S. A 30-years (1964-93) daily rainfall data base for the Spanish Mediterranean regions: first exploratory study. Int. J. Climatol., 18: 541-560, 1998.

[5] Bayarri, M.J. and Berger, J.O. P-values for composite null models. J. of the Am. Stat. Ass., 95: 1127-1142, 2000.

[6] Gelman, A., Carlin, J.B., Stern, H. and Rubin, D.B. Bayesian data analysis. Wiley, 1995.
