New approaches for statistical modelling - Jelena Jocković

The Double Pareto Lognormal Distribution (dPlN) | Algebraic structures concerning probability densities | Edgeworth expansions

New approaches for statistical modelling
Jelena Jocković
Advisors: Pepa Ramírez Cobo, Prof. Fernando López Blázquez
DOC-COURSE IMUS, University of Seville
May 25, 2010

Outline
• The Double Pareto Lognormal Distribution (dPlN)
• Algebraic structures concerning probability densities
• Edgeworth expansions

Heavy tailed distributions
• samples with some extreme values
• cannot be modelled by the normal distribution
• applications: insurance, finance, hydrology, internet traffic, ...
• models: Pareto, Pareto mixtures, log-normal, ..., dPlN

dPlN was introduced in:
Reed, W. and Jorgensen, M. (2004). The Double Pareto Lognormal distribution - a new parametric model for size distributions. Communications in Statistics, Theory and Methods, 33(8):1733-1753.

dPlN: Definition
• Reed and Jorgensen, 2004.
• Define $Y = W + Z$ with $W$ and $Z$ independent, where $Z \sim N(\nu, \tau^2)$ and $W \sim f_W(w)$ (skewed Laplace distribution):
$$f_W(w) = \begin{cases} \dfrac{\alpha\beta}{\alpha+\beta}\, e^{\beta w}, & w \le 0, \\[2mm] \dfrac{\alpha\beta}{\alpha+\beta}\, e^{-\alpha w}, & w > 0, \end{cases} \qquad \alpha, \beta > 0.$$
• Then $Y \sim \mathrm{NL}(\alpha, \beta, \nu, \tau^2)$ and $X = \exp(Y) \sim \mathrm{dPlN}(\alpha, \beta, \nu, \tau^2)$.
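The definition translates directly into a sampler. The sketch below is illustrative only: the function name `sample_dpln` is hypothetical, and it assumes the standard representation of the skewed Laplace variable as a difference of two independent exponentials, $W = E_1/\alpha - E_2/\beta$ with $E_1, E_2 \sim \mathrm{Exp}(1)$, which is not stated on the slide but reproduces the density $f_W$ given there.

```python
import math
import random


def sample_dpln(alpha, beta, nu, tau, rng=random):
    """Draw one dPlN(alpha, beta, nu, tau^2) variate as X = exp(W + Z)."""
    # Skewed Laplace W: difference of two independent exponentials with
    # rates alpha and beta (assumed representation, see lead-in).
    w = rng.expovariate(alpha) - rng.expovariate(beta)
    # Z ~ N(nu, tau^2); random.gauss takes the standard deviation tau.
    z = rng.gauss(nu, tau)
    return math.exp(w + z)


if __name__ == "__main__":
    rng = random.Random(0)
    xs = [sample_dpln(3.0, 2.0, 0.0, 0.5, rng) for _ in range(20000)]
    mean_log = sum(math.log(x) for x in xs) / len(xs)
    # E[log X] = E[Y] = nu + 1/alpha - 1/beta
    print("sample mean of log X:", mean_log)
    print("theoretical value:   ", 0.0 + 1.0 / 3.0 - 1.0 / 2.0)
```

Sampling on the log scale and exponentiating at the end keeps the heavy right tail from dominating intermediate arithmetic.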

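CDF
$$P[X \le x] = \Phi\!\left(\frac{\log x - \nu}{\tau}\right) - \frac{\beta x^{-\alpha}}{\alpha+\beta} \exp\!\left(\alpha\nu + \frac{\alpha^2\tau^2}{2}\right) \Phi\!\left(\frac{\log x - \nu - \alpha\tau^2}{\tau}\right) + \frac{\alpha x^{\beta}}{\alpha+\beta} \exp\!\left(-\beta\nu + \frac{\beta^2\tau^2}{2}\right) \Phi^c\!\left(\frac{\log x - \nu + \beta\tau^2}{\tau}\right)$$
where $\Phi$ is the standard normal distribution function and $\Phi^c = 1 - \Phi$.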
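A minimal sketch of how this CDF could be evaluated with the Python standard library. The names `norm_cdf`, `norm_sf` and `dpln_cdf` are illustrative, not from the slides. The powers of $x$ are folded into the exponentials to limit overflow; this is a convenience of the sketch and does not make it robust for extreme arguments.

```python
import math


def norm_cdf(z):
    """Standard normal distribution function Phi(z)."""
    return 0.5 * math.erfc(-z / math.sqrt(2.0))


def norm_sf(z):
    """Complementary function Phi^c(z) = 1 - Phi(z)."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))


def dpln_cdf(x, alpha, beta, nu, tau):
    """P[X <= x] for X ~ dPlN(alpha, beta, nu, tau^2)."""
    if x <= 0.0:
        return 0.0
    lx = math.log(x)
    t1 = norm_cdf((lx - nu) / tau)
    # x^(-alpha) * exp(alpha*nu + alpha^2 tau^2 / 2), written as one exponential
    t2 = (beta / (alpha + beta)) * math.exp(
        -alpha * lx + alpha * nu + 0.5 * alpha ** 2 * tau ** 2
    ) * norm_cdf((lx - nu - alpha * tau ** 2) / tau)
    # x^beta * exp(-beta*nu + beta^2 tau^2 / 2), written as one exponential
    t3 = (alpha / (alpha + beta)) * math.exp(
        beta * lx - beta * nu + 0.5 * beta ** 2 * tau ** 2
    ) * norm_sf((lx - nu + beta * tau ** 2) / tau)
    return t1 - t2 + t3
```

A quick sanity check: for $\alpha = \beta$ and $\nu = 0$ the distribution of $\log X$ is symmetric about zero, so `dpln_cdf(1.0, a, a, 0.0, tau)` should equal 0.5 up to rounding.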

PDF, Moments
• PDF
$$f(x) = \frac{\beta}{\alpha+\beta} f_1(x) + \frac{\alpha}{\alpha+\beta} f_2(x),$$
$$f_1(x) = \alpha x^{-\alpha-1} \exp\!\left(\alpha\nu + \frac{\alpha^2\tau^2}{2}\right) \Phi\!\left(\frac{\log x - \nu - \alpha\tau^2}{\tau}\right),$$
$$f_2(x) = \beta x^{\beta-1} \exp\!\left(-\beta\nu + \frac{\beta^2\tau^2}{2}\right) \Phi^c\!\left(\frac{\log x - \nu + \beta\tau^2}{\tau}\right).$$
• Moments: The MGF does not exist in closed form. However, moments of order $r < \alpha$ can be obtained.
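As an illustration, the density can be coded directly from $f_1$ and $f_2$. The moment formula used below is not on the slide: it is an assumption-free consequence of the definition $X = \exp(W + Z)$ with $W$ and $Z$ independent, which gives $E[X^r] = \frac{\alpha\beta}{(\alpha - r)(\beta + r)} \exp(r\nu + r^2\tau^2/2)$ for $-\beta < r < \alpha$. All function names are illustrative, and the trapezoidal integrator is only a rough check, not a recommended quadrature.

```python
import math


def norm_cdf(z):
    return 0.5 * math.erfc(-z / math.sqrt(2.0))


def norm_sf(z):
    return 0.5 * math.erfc(z / math.sqrt(2.0))


def dpln_pdf(x, alpha, beta, nu, tau):
    """Density of dPlN(alpha, beta, nu, tau^2) as the mixture of f1 and f2."""
    if x <= 0.0:
        return 0.0
    lx = math.log(x)
    f1 = alpha * math.exp(
        -(alpha + 1.0) * lx + alpha * nu + 0.5 * alpha ** 2 * tau ** 2
    ) * norm_cdf((lx - nu - alpha * tau ** 2) / tau)
    f2 = beta * math.exp(
        (beta - 1.0) * lx - beta * nu + 0.5 * beta ** 2 * tau ** 2
    ) * norm_sf((lx - nu + beta * tau ** 2) / tau)
    return (beta * f1 + alpha * f2) / (alpha + beta)


def dpln_raw_moment(r, alpha, beta, nu, tau):
    """E[X^r], finite only for -beta < r < alpha."""
    if not (-beta < r < alpha):
        raise ValueError("moment of order r exists only for -beta < r < alpha")
    return alpha * beta / ((alpha - r) * (beta + r)) * math.exp(
        r * nu + 0.5 * r * r * tau * tau
    )


def integrate_on_log_scale(g, lo=-20.0, hi=20.0, steps=8000):
    """Trapezoidal rule for the integral of g(x) dx after substituting x = exp(y)."""
    h = (hi - lo) / steps
    total = 0.0
    for i in range(steps + 1):
        x = math.exp(lo + i * h)
        weight = 0.5 if i in (0, steps) else 1.0
        total += weight * g(x) * x  # the extra factor x is dx/dy
    return total * h


if __name__ == "__main__":
    a, b, nu, tau = 3.0, 2.0, 0.0, 0.5
    print("total mass:", integrate_on_log_scale(lambda x: dpln_pdf(x, a, b, nu, tau)))
    print("mean (numeric):", integrate_on_log_scale(lambda x: x * dpln_pdf(x, a, b, nu, tau)))
    print("mean (formula):", dpln_raw_moment(1.0, a, b, nu, tau))
```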

dPlN properties
• Power law tail behaviour:
$$f(x) \sim \frac{\alpha\beta}{\alpha+\beta}\, A(\alpha, \nu, \tau)\, x^{-\alpha-1}, \quad x \to \infty, \qquad f(x) \sim \frac{\alpha\beta}{\alpha+\beta}\, A(-\beta, \nu, \tau)\, x^{\beta-1}, \quad x \to 0,$$
where $A(\theta, \nu, \tau) = \exp(\theta\nu + \theta^2\tau^2/2)$.
• Closure under power-law transformations: if $X \sim \mathrm{dPlN}(\alpha, \beta, \nu, \tau^2)$ and $a, b > 0$, then
$$W = aX^b \sim \mathrm{dPlN}(\alpha/b,\ \beta/b,\ b\nu + \log a,\ b^2\tau^2).$$
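The closure property can be checked in a few lines from the definition $X = \exp(Y)$. In the sketch below the skewed Laplace component is written $W_0$ (a name introduced here only to avoid a clash with the transformed variable $W$):

```latex
\begin{align*}
\log W &= \log a + b \log X = b\,W_0 + (bZ + \log a), \\
bZ + \log a &\sim N\!\left(b\nu + \log a,\; b^2\tau^2\right), \\
f_{bW_0}(w) &= \frac{1}{b}\, f_{W_0}\!\left(\frac{w}{b}\right)
  = \frac{(\alpha/b)(\beta/b)}{\alpha/b + \beta/b}
    \begin{cases}
      e^{(\beta/b)\,w}, & w \le 0, \\
      e^{-(\alpha/b)\,w}, & w > 0.
    \end{cases}
\end{align*}
```

So $bW_0$ is again skewed Laplace with parameters $\alpha/b$ and $\beta/b$, the normal part keeps its form with shifted mean and rescaled variance, and the two remain independent, which gives the stated parameters.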

New results
$X, Y \sim \mathrm{dPlN}$, $Z, W \sim \mathrm{NL}$. What is the distribution of:
• $Z + W$, $Z - W$? Obtained.
• $X \cdot Y$, $X / Y$? Obtained.
• $X + Y$, $X - Y$? Very hard! $\int \exp(ax)\, \varphi(x + b)\, \Phi(x)\, dx$ (!?)

Future work
• more general formulas for NL, dPlN
• lack of identifiability of dPlN:
$$f(x_1, x_2, \dots, x_n \mid \theta) = f(x_1, x_2, \dots, x_n \mid \theta'), \qquad \theta \ne \theta',$$
where $\theta = (\alpha, \beta, \nu, \tau)$ and $f$ is the likelihood function. Sometimes, parameters are not estimated well!
• queueing models

dPlN and queueing systems
• Queueing systems are closely related to heavy tailed modelling (congestion in teletraffic systems, ruin problems in insurance, ...):
Cooper, R. (1981). Introduction to Queueing Theory. North Holland, 2nd edition.
• GI/M/c is described in:
Ausin, M., Lillo, R., and Wiper, M. (2007). Bayesian control of the number of servers in a GI/M/c queueing system. Journal of Statistical Planning and Inference, 137:3043-3057.
• dPlN/M/1 and M/dPlN/1 are analyzed in:
Ramirez, P., Lillo, R., Wilson, S., and Wiper, M. (2010). Bayesian inference for Double Pareto Lognormal queues. To appear in Annals of Applied Statistics.
• Next: the dPlN/G/c queueing system! Optimizing the number of servers!

Motivation
• Bringing together applied probability and algebra (known applications in analysis of variance, multivariate analysis and stationary processes).
Some classical references:
Girardin, V. and Senoussi, R. (2003). Semigroup stationary processes and spectral representation. Bernoulli, 9(5):857-876.
Grenander, U. (1963). Probabilities on Algebraic Structures. John Wiley, New York.
Hannan, E. (1965). Group representations and applied probability. J. Appl. Prob., 2:1-68.
• What is the family of densities $\langle f \rangle$?
Most distribution families are not closed under convolution. We have to define new operations!

New results
Gamma distribution:
$$V = \{\, g(\alpha, \beta) \mid \alpha > 0,\ \beta > 0 \,\}, \qquad g(\alpha, \beta) = \frac{\beta^{\alpha}}{\Gamma(\alpha)}\, x^{\alpha-1} e^{-\beta x}$$
$$\oplus : V \times V \to V, \qquad g(\alpha, \beta) \oplus g(\alpha_1, \beta_1) = g(\alpha\alpha_1,\ \beta\beta_1)$$
$$\otimes : \mathbb{R} \times V \to V, \qquad c \otimes g(\alpha, \beta) = g(\alpha^c,\ \beta^c)$$
Inner product:
$$\langle g(\alpha, \beta),\ g(\alpha_1, \beta_1) \rangle = \log\alpha \cdot \log\alpha_1 + \log\beta \cdot \log\beta_1$$
Then the structure $(V, \mathbb{R}, \oplus, \otimes, \langle \cdot, \cdot \rangle)$ is a pre-Hilbert space.
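The operations on the Gamma family are easy to experiment with numerically. The following sketch is illustrative: the class `GammaDensity` and the helpers `oplus`, `otimes`, `inner` and `ZERO` are hypothetical names, not from the slides. It spot-checks a few vector-space and inner-product axioms for particular parameter values, which is a sanity check rather than a proof.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class GammaDensity:
    alpha: float  # shape, must be > 0
    beta: float   # rate, must be > 0

    def __call__(self, x):
        """Evaluate g(alpha, beta) at x > 0."""
        return (self.beta ** self.alpha / math.gamma(self.alpha)
                * x ** (self.alpha - 1.0) * math.exp(-self.beta * x))


def oplus(g, h):
    """Vector addition: parameters are multiplied."""
    return GammaDensity(g.alpha * h.alpha, g.beta * h.beta)


def otimes(c, g):
    """Scalar multiplication by a real c: parameters are raised to the power c."""
    return GammaDensity(g.alpha ** c, g.beta ** c)


def inner(g, h):
    """Inner product from the slide, in terms of log-parameters."""
    return (math.log(g.alpha) * math.log(h.alpha)
            + math.log(g.beta) * math.log(h.beta))


# Neutral element of oplus: g(1, 1), the Exp(1) density.
ZERO = GammaDensity(1.0, 1.0)

if __name__ == "__main__":
    g, h, k = GammaDensity(2.0, 3.0), GammaDensity(0.5, 4.0), GammaDensity(1.5, 0.7)
    c = -2.5
    # Linearity in the first argument: <c*g + h, k> = c<g, k> + <h, k>
    print(inner(oplus(otimes(c, g), h), k), c * inner(g, k) + inner(h, k))
    # Additive inverse: g + (-1)*g should give back ZERO
    print(oplus(g, otimes(-1.0, g)))
```

The checks work because the map $g(\alpha, \beta) \mapsto (\log\alpha, \log\beta)$ turns $\oplus$ and $\otimes$ into ordinary addition and scalar multiplication on $\mathbb{R}^2$.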

New results
Normal distribution:
$$V = \{\, f(\nu, \tau^2) \mid \nu \in \mathbb{R},\ \tau^2 > 0 \,\}, \qquad f(\nu, \tau^2) = \frac{1}{\sqrt{2\pi}\,\tau}\, e^{-\frac{1}{2}\frac{(x-\nu)^2}{\tau^2}}$$
$$\oplus : V \times V \to V, \qquad f(\nu, \tau^2) \oplus f(\nu_1, \tau_1^2) = f(\nu + \nu_1,\ \tau^2\tau_1^2)$$
$$\otimes : \mathbb{R} \times V \to V, \qquad c \otimes f(\nu, \tau^2) = f(c\nu,\ (\tau^2)^c)$$
Inner product:
$$\langle f(\nu, \tau^2),\ f(\nu_1, \tau_1^2) \rangle = \nu\nu_1 + \log\tau^2 \cdot \log\tau_1^2$$
Then the structure $(V, \mathbb{R}, \oplus, \otimes, \langle \cdot, \cdot \rangle)$ is a pre-Hilbert space.
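One way to see why the axioms hold for the normal family is to write down an explicit coordinate map to $\mathbb{R}^2$. The map $T$ below is introduced here for illustration and does not appear on the slides:

```latex
\begin{align*}
T\big(f(\nu, \tau^2)\big) &= (\nu,\ \log\tau^2) \in \mathbb{R}^2, \\
T\big(f(\nu, \tau^2) \oplus f(\nu_1, \tau_1^2)\big)
  &= (\nu + \nu_1,\ \log\tau^2 + \log\tau_1^2)
   = T\big(f(\nu, \tau^2)\big) + T\big(f(\nu_1, \tau_1^2)\big), \\
T\big(c \otimes f(\nu, \tau^2)\big)
  &= (c\nu,\ c\log\tau^2) = c\, T\big(f(\nu, \tau^2)\big), \\
\langle f(\nu, \tau^2),\ f(\nu_1, \tau_1^2) \rangle
  &= T\big(f(\nu, \tau^2)\big) \cdot T\big(f(\nu_1, \tau_1^2)\big).
\end{align*}
```

Since $T$ is a bijection from $V$ onto $\mathbb{R}^2$, the structure inherits the vector-space and inner-product axioms from Euclidean space. The zero vector is the standard normal density $f(0, 1)$, and the additive inverse of $f(\nu, \tau^2)$ is $f(-\nu, 1/\tau^2)$.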

Conclusions and future work
Operations $\oplus$ and $\otimes$ can be applied to:
• any family of densities defined by two real parameters (at least one positive)
• moment generating functions, characteristic functions (example: stable distributions)

Motivation
Central Limit Theorem:
$$\sup_{x \in \mathbb{R}} \left| F_n(x) - \Phi(x) \right| \le \frac{C_0}{n^{1/2}}$$
Error may be too large! A way to improve it:
$$\left| F_n(x) - \sum_{j=0}^{k} \frac{A_j(x)}{n^{j/2}} \right| \le \frac{C_k(x)}{n^{(k+1)/2}}, \qquad A_0(x) = \Phi(x)$$

Definition
$F$ - d.f. to be approximated, $f$ - its c.f., $\{\kappa_r\}$ - its cumulants.
We want to find the expansion based on a d.f. $\Psi$, with c.f. $\psi$ and cumulants $\{\gamma_r\}$:
$$f(t) = \exp\left\{ \sum_{r=1}^{+\infty} \frac{(\kappa_r - \gamma_r)(it)^r}{r!} \right\} \psi(t) \quad \text{(holds)}$$
Under certain conditions and after applying the inverse Fourier transform:
$$F(x) = \exp\left\{ \sum_{r=1}^{+\infty} \frac{(\kappa_r - \gamma_r)(-D_x)^r}{r!} \right\} \Psi(x) \quad \text{(Charlier differential series)}$$
$D_x$ - differential operator with respect to $x$.
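To make the operator notation concrete, here is a worked special case under two assumptions that are not on the slide: the base distribution is the standard normal, $\Psi = \Phi$ (so $\gamma_1 = 0$, $\gamma_2 = 1$, $\gamma_r = 0$ for $r \ge 3$), and $F$ is standardized ($\kappa_1 = 0$, $\kappa_2 = 1$). The first two terms of the sum then vanish and the exponential can be expanded formally:

```latex
\begin{align*}
F(x) &= \exp\left\{ \sum_{r=3}^{+\infty} \frac{\kappa_r\,(-D_x)^r}{r!} \right\} \Phi(x) \\
     &= \Phi(x) - \frac{\kappa_3}{3!}\, D_x^3 \Phi(x)
        + \frac{\kappa_4}{4!}\, D_x^4 \Phi(x)
        + \frac{\kappa_3^2}{2\,(3!)^2}\, D_x^6 \Phi(x) + \cdots
\end{align*}
```

Each derivative of $\Phi$ is a polynomial in $x$ times the normal density $\varphi(x)$, for example $D_x^3 \Phi(x) = (x^2 - 1)\,\varphi(x)$.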

Edgeworth expansions - definition
$$F_n(x) = P\left( \frac{\frac{X_1 + X_2 + \cdots + X_n}{n} - \mu}{\sigma / \sqrt{n}} \le x \right),$$
where the $X_i$ are iid r.v. with mean $\mu$ and variance $\sigma^2$, and $\Phi$ is the standard normal distribution function.
Collecting terms according to powers of $n$ gives the Edgeworth expansion:
$$f_n(t) = \left( 1 + \sum_{j=1}^{\infty} \frac{P_j(it)}{n^{j/2}} \right) \exp(-t^2/2), \qquad P_j \text{ - polynomial of degree } 3j,$$
$$F_n(x) = \Phi(x) + \sum_{j=1}^{\infty} \frac{P_j(-D_x)}{n^{j/2}}\, \Phi(x)$$
($P_j$ - Cramér-Edgeworth polynomials)
The series converges under suitable conditions and can be truncated with arbitrarily small error!
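A small numerical experiment shows what the first correction term buys. The sketch assumes iid $\mathrm{Exp}(1)$ summands (mean 1, variance 1, skewness 2), chosen only because the exact distribution of their sum is Erlang and can be computed with the standard library. It uses the first Edgeworth term $P_1(-D_x)\Phi(x) = -\frac{\lambda_3}{6}(x^2 - 1)\varphi(x)$, where $\lambda_3$ is the skewness. The function names are illustrative.

```python
import math


def phi(x):
    """Standard normal density."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def Phi(x):
    """Standard normal distribution function."""
    return 0.5 * math.erfc(-x / math.sqrt(2.0))


def exact_cdf_exp_sum(x, n):
    """Exact F_n(x) for iid Exp(1) summands, via the Erlang(n, 1) CDF."""
    s = n + x * math.sqrt(n)  # threshold for the unstandardized sum
    if s <= 0.0:
        return 0.0
    term, acc = 1.0, 1.0
    for k in range(1, n):
        term *= s / k          # term = s^k / k!
        acc += term
    return 1.0 - math.exp(-s) * acc


def edgeworth_one_term(x, n, skewness):
    """Phi(x) plus the n^(-1/2) Edgeworth correction."""
    return Phi(x) - skewness * (x * x - 1.0) * phi(x) / (6.0 * math.sqrt(n))


if __name__ == "__main__":
    n = 10
    for x in (-1.0, 0.0, 1.0, 2.0):
        exact = exact_cdf_exp_sum(x, n)
        print(x, exact, Phi(x), edgeworth_one_term(x, n, 2.0))
```

For $n = 10$ and $x = 0$ the plain normal approximation is off by about 0.04, while the one-term correction agrees with the exact value to roughly four decimal places. At other points, and especially in the far tails, the gain is smaller and the truncated expansion can even become negative.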
