
New approaches for statistical modelling - Jelena Jocković



  1. The Double Pareto Lognormal Distribution (dPlN) · Algebraic structures concerning probability densities · Edgeworth expansions
     New approaches for statistical modelling
     Jelena Jocković
     Advisors: Pepa Ramírez Cobo, Prof. Fernando López Blázquez
     DOC-COURSE IMUS, University of Seville, May 25 2010

  2. Outline
     • The Double Pareto Lognormal Distribution (dPlN)
     • Algebraic structures concerning probability densities
     • Edgeworth expansions

  3. Heavy tailed distributions
     • samples with some extreme values
     • cannot be modelled by the normal distribution
     • applications: insurance, finance, hydrology, internet traffic...
     • models: Pareto, Pareto mixtures, log-normal, ..., dPlN
     dPlN introduced in: Reed, W. and Jorgensen, M. (2004). The Double Pareto Lognormal distribution - a new parametric model for size distributions. Communications in Statistics, Theory and Methods, 33(8):1733-1753.

  4. dPlN: Definition
     • Reed and Jorgensen, 2004.
     • Define $Y = W + Z$, with $W$ and $Z$ independent, where $Z \sim N(\nu, \tau^2)$ and $W \sim f_W(w)$ (skewed Laplace distribution):
       $$f_W(w) = \begin{cases} \dfrac{\alpha\beta}{\alpha+\beta}\, e^{\beta w} & \text{for } w \le 0, \\[4pt] \dfrac{\alpha\beta}{\alpha+\beta}\, e^{-\alpha w} & \text{for } w > 0, \end{cases}$$
       where $\alpha, \beta > 0$.
     • $Y \sim NL(\alpha, \beta, \nu, \tau)$ and $X = \exp(Y) \sim dPlN(\alpha, \beta, \nu, \tau)$.
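The construction on this slide translates directly into a sampler. The sketch below is a minimal illustration, not code from the presentation. It assumes the standard representation of the skewed Laplace variable as $W = E_1/\alpha - E_2/\beta$ with $E_1, E_2$ independent standard exponentials, and the function name `sample_dpln` is hypothetical.

```python
import math
import random


def sample_dpln(alpha, beta, nu, tau, n, seed=0):
    """Draw n values X = exp(Y) with Y = W + Z, as defined on the slide.

    Assumption (not stated on the slide): the skewed Laplace W is generated
    as E1/alpha - E2/beta with E1, E2 independent standard exponentials.
    """
    rng = random.Random(seed)
    draws = []
    for _ in range(n):
        z = rng.gauss(nu, tau)  # Z ~ N(nu, tau^2)
        w = rng.expovariate(1.0) / alpha - rng.expovariate(1.0) / beta
        draws.append(math.exp(z + w))  # X = exp(Y)
    return draws


xs = sample_dpln(alpha=3.0, beta=2.0, nu=0.0, tau=0.5, n=20000)
mean_log = sum(math.log(x) for x in xs) / len(xs)
# Theory: E[log X] = nu + 1/alpha - 1/beta, i.e. -1/6 for these parameters.
print(mean_log)
```

Comparing the sample mean of $\log X$ with $\nu + 1/\alpha - 1/\beta$ is a cheap sanity check that the two exponential components are attached to the right tails.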

  5. CDF
     $$P[X \le x] = \Phi\!\left(\frac{\log x - \nu}{\tau}\right) - \frac{\beta x^{-\alpha}}{\alpha+\beta} \exp\!\left(\alpha\nu + \frac{\alpha^2\tau^2}{2}\right) \Phi\!\left(\frac{\log x - \nu - \alpha\tau^2}{\tau}\right) + \frac{\alpha x^{\beta}}{\alpha+\beta} \exp\!\left(-\beta\nu + \frac{\beta^2\tau^2}{2}\right) \Phi^c\!\left(\frac{\log x - \nu + \beta\tau^2}{\tau}\right)$$
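The CDF on this slide can be evaluated with the standard library alone. The following is a hedged sketch: the names `norm_cdf` and `dpln_cdf` are hypothetical, $\Phi^c = 1 - \Phi$ is assumed to denote the complementary normal CDF, and the term with $x^{\beta}$ is taken with a plus sign, which is the sign for which differentiating the CDF gives a positive density.

```python
import math


def norm_cdf(z):
    """Standard normal CDF Phi(z), computed via the complementary error function."""
    return 0.5 * math.erfc(-z / math.sqrt(2.0))


def dpln_cdf(x, alpha, beta, nu, tau):
    """P[X <= x] for X ~ dPlN(alpha, beta, nu, tau), with x > 0."""
    lx = math.log(x)
    # exp(theta*nu + theta^2*tau^2/2) evaluated at theta = alpha and theta = -beta
    a_up = math.exp(alpha * nu + 0.5 * alpha ** 2 * tau ** 2)
    a_lo = math.exp(-beta * nu + 0.5 * beta ** 2 * tau ** 2)
    upper = beta * x ** (-alpha) * a_up * norm_cdf((lx - nu - alpha * tau ** 2) / tau)
    # Phi^c(z) = Phi(-z); this form avoids cancellation for large z
    lower = alpha * x ** beta * a_lo * norm_cdf(-(lx - nu + beta * tau ** 2) / tau)
    return norm_cdf((lx - nu) / tau) + (lower - upper) / (alpha + beta)


print(dpln_cdf(1.0, 3.0, 2.0, 0.0, 0.5))
```

Very small or very large `x` can still overflow the power terms for extreme parameter values; a log-scale evaluation would be the usual remedy and is left out to keep the sketch short.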

  6. PDF, Moments
     • PDF:
       $$f(x) = \frac{\beta}{\alpha+\beta} f_1(x) + \frac{\alpha}{\alpha+\beta} f_2(x),$$
       $$f_1(x) = \alpha x^{-\alpha-1} \exp\!\left(\alpha\nu + \frac{\alpha^2\tau^2}{2}\right) \Phi\!\left(\frac{\log x - \nu - \alpha\tau^2}{\tau}\right),$$
       $$f_2(x) = \beta x^{\beta-1} \exp\!\left(-\beta\nu + \frac{\beta^2\tau^2}{2}\right) \Phi^c\!\left(\frac{\log x - \nu + \beta\tau^2}{\tau}\right).$$
     • Moments: The MGF does not exist in closed form. However, moments of order $r < \alpha$ can be obtained.
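The restriction $r < \alpha$ can be made explicit from the definition $X = \exp(W + Z)$ with $W$ and $Z$ independent. The derivation below is a sketch based on the standard moment generating functions of the normal and skewed Laplace laws; it is not taken from the slides.

```latex
\begin{aligned}
E[X^r] &= E\!\left[e^{rZ}\right] E\!\left[e^{rW}\right], \\
E\!\left[e^{rZ}\right] &= \exp\!\left(r\nu + \tfrac{1}{2} r^2 \tau^2\right), \\
E\!\left[e^{rW}\right]
  &= \frac{\alpha\beta}{\alpha+\beta}
     \left( \int_{-\infty}^{0} e^{(\beta + r) w}\, dw
          + \int_{0}^{\infty} e^{-(\alpha - r) w}\, dw \right)
   = \frac{\alpha\beta}{(\alpha - r)(\beta + r)},
   \qquad -\beta < r < \alpha, \\
E[X^r] &= \frac{\alpha\beta}{(\alpha - r)(\beta + r)}
          \exp\!\left(r\nu + \tfrac{1}{2} r^2 \tau^2\right),
   \qquad -\beta < r < \alpha .
\end{aligned}
```

The second integral diverges as soon as $r \ge \alpha$, which is why only moments of order below $\alpha$ exist.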

  7. dPlN properties
     • Power law tail behaviour:
       $$f(x) \sim \alpha A(\alpha, \nu, \tau)\, x^{-\alpha-1}, \quad x \to \infty, \qquad f(x) \sim \beta A(-\beta, \nu, \tau)\, x^{\beta-1}, \quad x \to 0,$$
       where $A(\theta, \nu, \tau) = \exp(\theta\nu + \theta^2\tau^2/2)$.
     • Closure under power-law transformations: if $X \sim dPlN(\alpha, \beta, \nu, \tau^2)$ and $a, b > 0$, then
       $$W = aX^b \sim dPlN(\alpha/b,\ \beta/b,\ b\nu + \log a,\ b^2\tau^2).$$
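The closure property follows in a few lines from the definition $X = \exp(Z + W)$. The sketch below writes $W_L$ for the skewed Laplace component, to avoid a clash with the transformed variable $W = aX^b$ on the slide; this renaming is an assumption made only for this derivation.

```latex
\begin{aligned}
\log\!\left(a X^{b}\right) &= \left(\log a + bZ\right) + b\,W_L, \\
\log a + bZ &\sim N\!\left(b\nu + \log a,\ b^{2}\tau^{2}\right), \\
f_{bW_L}(w) &= \frac{1}{b}\, f_{W_L}\!\left(\frac{w}{b}\right)
  = \begin{cases}
      \dfrac{(\alpha/b)(\beta/b)}{\alpha/b + \beta/b}\; e^{(\beta/b)\, w} & w \le 0, \\[6pt]
      \dfrac{(\alpha/b)(\beta/b)}{\alpha/b + \beta/b}\; e^{-(\alpha/b)\, w} & w > 0 .
    \end{cases}
\end{aligned}
```

So $\log(aX^b)$ is again a normal variable plus an independent skewed Laplace variable, now with parameters $(\alpha/b, \beta/b, b\nu + \log a, b^2\tau^2)$.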

  8. New results
     $X, Y \sim dPlN$, $Z, W \sim NL$. What is:
     • $Z + W$, $Z - W$?
     • $X \cdot Y$, $X / Y$?
     • $X + Y$, $X - Y$?

  9. New results
     $X, Y \sim dPlN$, $Z, W \sim NL$. What is:
     • $Z + W$, $Z - W$? obtained
     • $X \cdot Y$, $X / Y$? obtained
     • $X + Y$, $X - Y$? very hard! It requires integrals of the form $\int \exp(ax)\, \varphi(x + b)\, \Phi(x)\, dx$ (!?)

  10. Future work
     • more general formulas for NL, dPlN
     • lack of identifiability of dPlN:
       $$f(x_1, x_2, \ldots, x_n \mid \theta) = f(x_1, x_2, \ldots, x_n \mid \theta'), \quad \theta \ne \theta',$$
       where $\theta = (\alpha, \beta, \nu, \tau)$ and $f$ is the likelihood function. Sometimes, parameters are not estimated well!
     • queueing models

  11. dPlN and queueing systems
     • queueing systems are closely related to heavy tailed modelling (congestion in teletraffic systems, ruin problems in insurance...)
       Cooper, R. (1981). Introduction to Queueing Theory. North Holland, 2nd edition.
     • GI/M/c described in:
       Ausin, M., Lillo, R., and Wiper, M. (2007). Bayesian control of the number of servers in a GI/M/c queueing system. Journal of Statistical Planning and Inference, 137:3043-3057.
     • dPlN/M/1, M/dPlN/1 analyzed in:
       Ramirez, P., Lillo, R., Wilson, S., and Wiper, M. (2010). Bayesian inference for Double Pareto Lognormal queues. To appear in Annals of Applied Statistics.
     • next: dPlN/G/c queueing system! optimizing the number of servers!

  12. Motivation
     • bringing together applied probability and algebra (known applications in analysis of variance, multivariate analysis and stationary processes)
       Some classical references:
       Girardin, V. and Senoussi, R. (2003). Semigroup stationary processes and spectral representation. Bernoulli, 9(5):857-876.
       Grenander, U. (1963). Probabilities on Algebraic Structures. John Wiley, New York.
       Hannan, E. (1965). Group representations and applied probability. J. Appl. Prob., 2:1-68.
     • What is the family of densities $\langle f \rangle$?

  13. Motivation (continued)
     • What is the family of densities $\langle f \rangle$? Most distribution families are not closed under convolution. We have to define new operations!

  14. New results
     Gamma distribution:
     $$V = \{ g(\alpha, \beta) \mid \alpha > 0,\ \beta > 0 \}, \qquad g(\alpha, \beta) = \frac{\beta^{\alpha}}{\Gamma(\alpha)}\, x^{\alpha-1} e^{-\beta x}.$$
     • $\oplus : V \times V \to V$, $\quad g(\alpha, \beta) \oplus g(\alpha_1, \beta_1) = g(\alpha\alpha_1, \beta\beta_1)$
     • $\otimes : \mathbb{R} \times V \to V$, $\quad c \otimes g(\alpha, \beta) = g(\alpha^c, \beta^c)$
     • inner product: $\langle g(\alpha, \beta), g(\alpha_1, \beta_1) \rangle = \log\alpha \cdot \log\alpha_1 + \log\beta \cdot \log\beta_1$
     Then, the structure $(V, \mathbb{R}, \oplus, \otimes, \langle \cdot, \cdot \rangle)$ is a pre-Hilbert space.
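Since each gamma density is determined by its parameter pair, the operations on this slide can be checked numerically on pairs $(\alpha, \beta)$. The sketch below is an illustration under that assumption; the names `oplus`, `otimes` and `inner` are hypothetical, and a density $g(\alpha, \beta)$ is represented simply as a tuple.

```python
import math


def oplus(g, h):
    """g(a, b) (+) g(a1, b1) = g(a * a1, b * b1)."""
    return (g[0] * h[0], g[1] * h[1])


def otimes(c, g):
    """c (x) g(a, b) = g(a**c, b**c)."""
    return (g[0] ** c, g[1] ** c)


def inner(g, h):
    """<g(a, b), g(a1, b1)> = log a * log a1 + log b * log b1."""
    return math.log(g[0]) * math.log(h[0]) + math.log(g[1]) * math.log(h[1])


g, h, k = (2.0, 0.5), (3.0, 4.0), (1.5, 0.25)
c = -1.7
zero = (1.0, 1.0)  # neutral element of (+), i.e. g(1, 1)

# Additivity and homogeneity of the inner product in its first argument.
additive = math.isclose(inner(oplus(g, h), k), inner(g, k) + inner(h, k))
homogeneous = math.isclose(inner(otimes(c, g), k), c * inner(g, k))
# g (+) g(1, 1) = g, and <g, g> > 0 for g different from g(1, 1).
neutral_ok = oplus(g, zero) == g
positive = inner(g, g) > 0.0
print(additive, homogeneous, neutral_ok, positive)
```

The checks pass because the map $(\alpha, \beta) \mapsto (\log\alpha, \log\beta)$ turns $\oplus$ into ordinary vector addition and $\otimes$ into scalar multiplication on $\mathbb{R}^2$.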

  15. New results
     Normal distribution:
     $$V = \{ f(\nu, \tau^2) \mid \nu \in \mathbb{R},\ \tau^2 > 0 \}, \qquad f(\nu, \tau^2) = \frac{1}{\tau\sqrt{2\pi}}\, e^{-\frac{1}{2} \frac{(x-\nu)^2}{\tau^2}}.$$
     • $\oplus : V \times V \to V$, $\quad f(\nu, \tau^2) \oplus f(\nu_1, \tau_1^2) = f(\nu + \nu_1, \tau^2\tau_1^2)$
     • $\otimes : \mathbb{R} \times V \to V$, $\quad c \otimes f(\nu, \tau^2) = f(c\nu, (\tau^2)^c)$
     • inner product: $\langle f(\nu, \tau^2), f(\nu_1, \tau_1^2) \rangle = \nu\nu_1 + \log\tau^2 \cdot \log\tau_1^2$
     Then, the structure $(V, \mathbb{R}, \oplus, \otimes, \langle \cdot, \cdot \rangle)$ is a pre-Hilbert space.
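One way to see why the normal family with these operations is a pre-Hilbert space is to map it onto $\mathbb{R}^2$. The map $T$ below is introduced only for this sketch and does not appear in the slides.

```latex
\begin{aligned}
T &: V \to \mathbb{R}^{2}, \qquad T\!\left(f(\nu, \tau^{2})\right) = \left(\nu,\ \log \tau^{2}\right), \\
T\!\left(f(\nu, \tau^{2}) \oplus f(\nu_{1}, \tau_{1}^{2})\right)
  &= \left(\nu + \nu_{1},\ \log \tau^{2} + \log \tau_{1}^{2}\right)
   = T\!\left(f(\nu, \tau^{2})\right) + T\!\left(f(\nu_{1}, \tau_{1}^{2})\right), \\
T\!\left(c \otimes f(\nu, \tau^{2})\right)
  &= \left(c\nu,\ c \log \tau^{2}\right)
   = c\, T\!\left(f(\nu, \tau^{2})\right), \\
\left\langle f(\nu, \tau^{2}),\ f(\nu_{1}, \tau_{1}^{2}) \right\rangle
  &= T\!\left(f(\nu, \tau^{2})\right) \cdot T\!\left(f(\nu_{1}, \tau_{1}^{2})\right).
\end{aligned}
```

Because $T$ is a bijection that carries $\oplus$, $\otimes$ and the inner product to their Euclidean counterparts, the axioms are inherited from $\mathbb{R}^2$; the neutral element is $f(0, 1)$, the standard normal density.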

  16. Conclusions and future work
     Operations $\oplus$ and $\otimes$ can be applied to:
     • any family of densities defined by two real parameters (at least one positive)
     • moment generating functions, characteristic functions (example: stable distributions)

  17. Motivation
     Central Limit Theorem:
     $$\sup_{x \in \mathbb{R}} | F_n(x) - \Phi(x) | \le \frac{C_0}{n^{1/2}}$$
     Error may be too large! A way to improve it:
     $$\left| F_n(x) - \sum_{j=0}^{k} \frac{A_j(x)}{n^{j/2}} \right| \le \frac{C_k(x)}{n^{(k+1)/2}}, \qquad A_0(x) = \Phi(x)$$

  18. Definition
     $F$ - d.f. to be approximated, $f$ - its c.f., $\{\kappa_r\}$ - its cumulants.
     We want to find the expansion based on a d.f. $\Psi$, with c.f. $\psi$ and cumulants $\{\gamma_r\}$:
     $$f(t) = \exp\!\left( \sum_{r=1}^{+\infty} \frac{(\kappa_r - \gamma_r)(it)^r}{r!} \right) \psi(t) \quad \text{(holds)}$$
     Under certain conditions and after applying the inverse Fourier transform:
     $$F(x) = \exp\!\left( \sum_{r=1}^{+\infty} \frac{(\kappa_r - \gamma_r)(-D_x)^r}{r!} \right) \Psi(x) \quad \text{(Charlier differential series)}$$
     $D_x$ - differential operator with respect to $x$

  19. Edgeworth expansions - definition
     $$F_n(x) = P\!\left[ \frac{\sqrt{n}}{\sigma} \left( \frac{X_1 + X_2 + \cdots + X_n}{n} - \mu \right) \le x \right],$$
     $X_i$ - iid r.v. with mean $\mu$ and variance $\sigma^2$, $\Phi$ - standard normal distribution.
     Collecting terms according to powers of $n$ gives the Edgeworth expansion:
     $$f_n(t) = \left( 1 + \sum_{j=1}^{\infty} \frac{P_j(it)}{n^{j/2}} \right) \exp(-t^2/2), \qquad P_j \text{ - polynomial of degree } 3j,$$
     $$F_n(x) = \Phi(x) + \sum_{j=1}^{\infty} \frac{P_j(-D_x)}{n^{j/2}}\, \Phi(x)$$
     ($P_j$ - Cramér-Edgeworth polynomials)
     The series can be truncated after $k$ terms, with an error of order $n^{-(k+1)/2}$ that becomes arbitrarily small as $n$ grows!
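A small numerical experiment shows what the first correction term buys. The sketch below is an illustration under stated assumptions, not part of the presentation: it takes $X_i \sim \mathrm{Exp}(1)$ (so $\mu = \sigma = 1$ and skewness 2), uses the fact that the sum is then Erlang distributed to get the exact $F_n$, and uses the standard form of the first term, $P_1(-D_x)\Phi(x) = -\frac{\kappa_3}{6\sigma^3}(x^2 - 1)\varphi(x)$. All function names are hypothetical.

```python
import math


def phi(x):
    """Standard normal density."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def Phi(x):
    """Standard normal CDF."""
    return 0.5 * math.erfc(-x / math.sqrt(2.0))


def exact_cdf(x, n):
    """Exact F_n(x) for X_i ~ Exp(1): the sum X_1 + ... + X_n is Erlang(n, 1)."""
    s = n + x * math.sqrt(n)  # threshold for the sum, since mu = sigma = 1
    if s <= 0.0:
        return 0.0
    term, total = 1.0, 1.0  # k = 0 term of sum_{k < n} s^k / k!
    for k in range(1, n):
        term *= s / k
        total += term
    return 1.0 - math.exp(-s) * total


def edgeworth_one_term(x, n, skew=2.0):
    """Phi(x) plus the n^(-1/2) correction; skew = kappa_3 / sigma^3."""
    return Phi(x) - skew * (x * x - 1.0) * phi(x) / (6.0 * math.sqrt(n))


n = 10
grid = [i / 10.0 for i in range(-30, 31)]
err_normal = max(abs(exact_cdf(x, n) - Phi(x)) for x in grid)
err_edgeworth = max(abs(exact_cdf(x, n) - edgeworth_one_term(x, n)) for x in grid)
print(err_normal, err_edgeworth)
```

Even for $n = 10$ the one-term correction reduces the maximum error on the grid substantially, because the skewness of the exponential distribution dominates the deviation from normality at this sample size.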
