Thèse de Doctorat de l'Université Pierre et Marie Curie - PowerPoint PPT Presentation

Contributions à la Prévision Statistique. Olivier P. Faugeras, Université Pierre et Marie Curie - Paris VI, Laboratoire de Statistique Théorique et Appliquée.


slide-1
SLIDE 1

Thèse de Doctorat de l'Université Pierre et Marie Curie

Contributions à la Prévision Statistique

Olivier P. Faugeras

Université Pierre et Marie Curie - Paris VI
Laboratoire de Statistique Théorique et Appliquée

28/11/2008

Olivier P. Faugeras, Thèse de Doctorat de l'Université Pierre et Marie Curie

slide-2
SLIDE 2

Outline

Part I: Parametric Statistical Prediction for a Stochastic Process.
Observe: $X_0, \dots, X_T$ from a stochastic process $(X_t)$ with law $P_\theta$. Predict: $X_{T+h}$, a future value.

Part II: A nonparametric quantile-copula approach to conditional density estimation. Applications to prediction.
Observe: $(X_i, Y_i)_{i=1,\dots,n}$ independent and identically distributed. Predict: $Y$, given that $X = x$.

slide-3
SLIDE 3

Part I: Parametric Statistical Prediction for a Stochastic Process.

slide-4
SLIDE 4

Outline

1. Introduction: The Statistical Prediction Problem; Prevision vs Regression; Towards asymptotic independence
2. Prediction by temporal separation: Model; Statistical Prediction and assumptions; Results: Consistency of the predictor; Example
3. Limit law of the Predictor: Assumptions; Result: Limit law of the predictor; Conclusions

slide-5
SLIDE 5

The Statistical Prediction Problem (1)

Let $X = \{X_t, t \in \mathbb{Z}\}$ be a real-valued, square-integrable stochastic process with distribution $P_\theta$, $\theta$ a parameter.
Observed data: $(X_0, \dots, X_T) := X_0^T$.
Aim: forecast $Y := g(X_{T+h})$ by a function $f(X_0^T) = \hat{Y}$.
Criterion: $L^2$ error.

Lemma: decomposition of the prediction error
$E_\theta(Y - \hat{Y})^2 = E_\theta\big(Y - E_\theta(Y \mid X_0^T)\big)^2 + E_\theta\big(E_\theta(Y \mid X_0^T) - f(X_0^T)\big)^2$

The prediction error splits into a probabilistic prediction error term and a statistical prediction error term. The error is thus minimised by choosing the conditional expectation as the predictor: $f(X_0^T) = E_\theta(Y \mid X_0^T) := Y^*$.
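The decomposition lemma can be checked by simulation. Below is a minimal sketch on a hypothetical AR(1) process (not the model of the talk), where the conditional expectation $E_\theta(Y \mid X_0^T) = \theta X_T$ is known in closed form and $f$ is an arbitrary other predictor; the cross term vanishes because $E_\theta(Y \mid X_0^T) - f(X_0^T)$ is $X_0^T$-measurable.

```python
import numpy as np

rng = np.random.default_rng(0)
theta, T, n_sims = 0.6, 50, 100_000

# Simulate n_sims paths of the toy AR(1): X_{t+1} = theta * X_t + eps_t.
X = np.zeros((n_sims, T + 1))
for t in range(T):
    X[:, t + 1] = theta * X[:, t] + rng.normal(size=n_sims)

Y = theta * X[:, T] + rng.normal(size=n_sims)  # Y = X_{T+1}, the value to forecast
cond_exp = theta * X[:, T]                     # E_theta(Y | X_0^T), known here
f = 0.8 * cond_exp                             # some other predictor f(X_0^T)

lhs = np.mean((Y - f) ** 2)                    # total prediction error
prob_term = np.mean((Y - cond_exp) ** 2)       # probabilistic prediction error
stat_term = np.mean((cond_exp - f) ** 2)       # statistical prediction error
print(lhs, prob_term + stat_term)              # the two sides agree up to MC error
```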


slide-8
SLIDE 8

The Statistical Prediction Problem (2)

Definition: the Probabilistic predictor
The Bayesian or Probabilistic predictor is defined as the random variable $Y^* := E_\theta(Y \mid X_0^T) := r_\theta(X_0^T)$.

But $\theta$ is unknown, so it must be estimated by $\hat{\theta}_T$ on $X_0^T$.

Definition: the Statistical predictor
We build the plug-in Statistical predictor $\hat{Y} := r_{\hat{\theta}_T}(X_0^T)$.

Two mixed problems on the same data:
1. a probabilistic calculation problem: $X_0^T$ as argument of $r_\theta$;
2. a statistical estimation problem: $X_0^T$ as data to estimate $\theta$ by $\hat{\theta}_T$.
→ behaviour difficult to study.
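The two roles played by the same data can be made concrete on a hypothetical AR(1) with a least-squares estimator (an illustrative choice, not the talk's general setting):

```python
import numpy as np

rng = np.random.default_rng(1)
theta, T = 0.6, 500

# One observed path X_0, ..., X_T of the toy AR(1) process.
X = np.zeros(T + 1)
for t in range(T):
    X[t + 1] = theta * X[t] + rng.normal()

# Statistical estimation problem: the path serves as data for theta_hat
# (least-squares estimator, a standard choice for an AR(1)).
theta_hat = (X[:-1] @ X[1:]) / (X[:-1] @ X[:-1])

# Probabilistic calculation problem: the same path is the argument of
# r_theta(x) = theta * x, with theta_hat plugged in.
Y_hat = theta_hat * X[T]   # plug-in statistical predictor of X_{T+1}
Y_star = theta * X[T]      # probabilistic predictor (theta known)
```

Because $\hat{\theta}_T$ and the argument $X_T$ are built from the same observations, $\hat{Y}$ is a complicated functional of the path, which is precisely what makes its behaviour difficult to study.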


slide-14
SLIDE 14

Prevision versus Regression

Regression
1. Estimation step: on the data $D_n := \{(X_i, Y_i), i = 0, \dots, n\}$, estimate $r(x) = E[Y \mid X = x]$ by $\hat{r}(x, D_n)$.
2. Prediction step: for a new $(X, Y)$, predict $Y$ by $\hat{r}(X, D_n)$.

If $(X, Y)$ were independent of $D_n$, then $E[Y \mid X, D_n] = E[Y \mid X]$ and
$E_\theta[r(X) - \hat{r}(X, D_n)]^2 = \int E\big[(r(X) - \hat{r}(X, D_n))^2 \mid X = x\big]\, dP_X(x) = \int (r(x) - \hat{r}(x, D_n))^2\, dP_X(x)$
→ the prediction error is the same as the MISE regression error.

Prediction
For a Markov process, $(X_i, Y_i) = (X_i, X_{i+1})$ and $(X, Y) = (X_T, X_{T+1})$ ⇒ $D_n$ is not independent of $X$.
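The two-step regression scheme can be sketched with a hypothetical Nadaraya-Watson estimator playing the role of $\hat{r}(x, D_n)$ (illustrative bandwidth, i.i.d. data, so the new input really is independent of $D_n$):

```python
import numpy as np

rng = np.random.default_rng(6)
n = 2_000

# Estimation step: build r_hat(., D_n) on an i.i.d. sample D_n.
Xn = rng.uniform(-1, 1, size=n)
Yn = np.sin(np.pi * Xn) + 0.1 * rng.normal(size=n)  # true r(x) = sin(pi * x)

def r_hat(x, h=0.1):
    """Nadaraya-Watson estimate of r(x) = E[Y | X = x] from D_n."""
    w = np.exp(-0.5 * ((Xn - x) / h) ** 2)   # Gaussian kernel weights
    return (w @ Yn) / w.sum()

# Prediction step: for a new input, independent of D_n, predict Y by r_hat(X).
x_new = 0.25
print(r_hat(x_new))   # close to r(0.25) = sin(pi / 4), about 0.71
```

In the Markov prediction setting the second step breaks down: the pairs $(X_i, X_{i+1})$ that form $D_n$ contain the very point $X_T$ at which one predicts.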


slide-20
SLIDE 20

Towards asymptotic independence

Issue: how to let $X$ be independent of $D_n$?

A solution: temporal separation
Let $\varphi(T) \to \infty$ and $k_T \to \infty$ such that $T - k_T - \varphi(T) \to \infty$. Split the data $(X_0, \dots, X_T)$:
1. estimate $\theta$ on $[0, \varphi(T)]$: $\hat{\theta}_{\varphi(T)}$;
2. predict on $[T - k_T, T]$: $\hat{Y} := r_{\hat{\theta}_{\varphi(T)}}(X_{T-k_T}^T)$,
by using an assumption of asymptotic independence (short memory) on the process.
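A sketch of the split, with rate choices $\varphi(T) = T^{2/3}$ and $k_T = T^{1/4}$ picked purely for illustration (they satisfy the later assumptions H2: $k_T^2/\varphi(T) \to 0$ and $T - k_T - \varphi(T) \to \infty$):

```python
import numpy as np

def temporal_split(X, phi, k):
    """Split X_0, ..., X_T into an estimation block [0, phi] and a
    prediction block [T - k, T], separated by a gap of length T - k - phi."""
    T = len(X) - 1
    assert T - k - phi > 0, "the two blocks must not overlap"
    return X[: phi + 1], X[T - k:]

T = 10_000
phi, k = int(T ** (2 / 3)), int(T ** 0.25)       # illustrative rates
X = np.random.default_rng(2).normal(size=T + 1)  # placeholder path

est_block, pred_block = temporal_split(X, phi, k)
# est_block is the data for theta_hat_{phi(T)}; pred_block is the argument
# of r_{theta_hat}; the growing gap is what yields asymptotic independence.
```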


slide-22
SLIDE 22

Outline

1. Introduction: The Statistical Prediction Problem; Prevision vs Regression; Towards asymptotic independence
2. Prediction by temporal separation: Model; Statistical Prediction and assumptions; Results: Consistency of the predictor; Example
3. Limit law of the Predictor: Assumptions; Result: Limit law of the predictor; Conclusions

slide-23
SLIDE 23

Some notions on α-mixing

Definition: α-mixing coefficients, Rosenblatt [1956]
Let $(\Omega, \mathcal{A}, P)$ be a probability space and $\mathcal{B}, \mathcal{C}$ two sub-sigma-fields of $\mathcal{A}$. The α-mixing coefficient between $\mathcal{B}$ and $\mathcal{C}$ is defined by
$\alpha(\mathcal{B}, \mathcal{C}) = \sup_{B \in \mathcal{B},\, C \in \mathcal{C}} |P(B \cap C) - P(B)P(C)|$
and the α-mixing coefficient of order $k$ of a stochastic process $X = \{X_t, t \in \mathbb{N}\}$ defined on $(\Omega, \mathcal{A}, P)$ as
$\alpha(k) = \sup_{t \in \mathbb{N}} \alpha\big(\sigma(X_s, s \leq t),\, \sigma(X_s, s \geq t + k)\big)$
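The sup over all events is not computable directly, but the quantity inside it can be estimated by Monte Carlo for particular events. The sketch below uses a toy AR(1) (a hypothetical example) and the events $B = \{X_t > 0\}$, $C = \{X_{t+k} > 0\}$, whose dependence gap lower-bounds $\alpha(k)$ and visibly decays with the lag $k$:

```python
import numpy as np

rng = np.random.default_rng(3)
theta, n, t = 0.8, 100_000, 50

# n independent copies of an AR(1) path, long enough to reach lag t + 20.
X = np.zeros((n, t + 21))
for s in range(t + 20):
    X[:, s + 1] = theta * X[:, s] + rng.normal(size=n)

deps = {}
for k in (1, 5, 20):
    B = X[:, t] > 0                  # event on the past sigma-field
    C = X[:, t + k] > 0              # event on the future sigma-field, k apart
    deps[k] = abs(np.mean(B & C) - np.mean(B) * np.mean(C))
print(deps)   # decays in k: the short-memory behaviour alpha-mixing formalises
```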

slide-24
SLIDE 24

Model

Let $X = (X_t, t \in \mathbb{N})$ be a stochastic process. We assume that:
1. $X$ is a second-order, square-integrable, α-mixing process;
2. the regression function $r_\theta(\cdot)$ depends approximately on the last $k_T$ values $(X_{T-i}, i = 0, \dots, k_T)$:
$X^*_{T+1} := E_\theta\big[X_{T+1} \mid X_0^T\big] := \sum_{i=0}^{k_T} r_i(X_{T-i}, \theta) + \eta_{k_T}(X, \theta).$

Assumptions H0 on the process
(i) $\lim_{T \to \infty} E_\theta\big(\eta^2_{k_T}(X, \theta)\big) = 0$;
(ii) for all $i \in \mathbb{N}$, $|r_i(X_{T-i}, \theta_1) - r_i(X_{T-i}, \theta_2)| \leq H_i(X_{T-i})\, \|\theta_1 - \theta_2\|$, for all $\theta_1, \theta_2$;
(iii) there exists $r > 1$ such that $\sup_{i \in \mathbb{N}} \big(E_\theta H_i^{2r}(X_{T-i})\big)^{1/r} < \infty$.

This additive model is an extension of a model studied by Bosq [2007].

slide-25
SLIDE 25

Statistical Prediction and assumptions

We assume we have an estimator $\hat{\theta}_T$ of $\theta$.

Assumptions H1 on the estimator $\hat{\theta}_T$
(i) $\limsup_{T \to \infty} T \cdot E_\theta(\hat{\theta}_T - \theta)^2 < \infty$;
(ii) there exists $q > 1$ such that $\limsup_{T \to \infty} T^q\, E(\hat{\theta}_T - \theta)^{2q} < \infty$.

We build a statistical predictor: $\hat{X}_{T+1} := \sum_{i=0}^{k_T} r_i(X_{T-i}, \hat{\theta}_{\varphi(T)})$.

Assumptions H2 on the coefficients
(i) $k_T^2 / \varphi(T) \to 0$ as $T \to \infty$;
(ii) $T - k_T - \varphi(T) \to \infty$ as $T \to \infty$.


slide-27
SLIDE 27

Consistency of the predictor

Theorem 2.5
Under the assumptions H0, H1, H2, we have
$\limsup_{T \to \infty} E_\theta\big(\hat{X}_{T+1} - X^*_{T+1}\big)^2 = 0$

Tool: Davydov's covariance inequality
Let $X \in L^q(P)$ and $Y \in L^r(P)$. If $q > 1$, $r > 1$ and $\frac{1}{r} + \frac{1}{q} = 1 - \frac{1}{p}$, then
$|\mathrm{Cov}(X, Y)| \leq 2p\, \big(2\alpha(\sigma(X), \sigma(Y))\big)^{1/p}\, \|X\|_q\, \|Y\|_r.$


slide-29
SLIDE 29

Example of process

For a linear, weakly stationary, centered, non-deterministic, invertible process in discrete time, the Wold decomposition writes
$X_T = e_T + \sum_{i=1}^{k_T} \varphi_i(\theta) X_{T-i} + \sum_{i > k_T} \varphi_i(\theta) X_{T-i}$, with $\sum_{i=1}^{\infty} \varphi_i^2(\theta) < \infty$.
Set $\eta_{k_T}(X, \theta) = \sum_{i > k_T + 1} \varphi_i(\theta) X_{T+1-i}$.

Proposition
If $X$ verifies the assumptions
1. for all $i$, $\varphi_i$ is differentiable and $\|\varphi_i'(\cdot)\|_\infty < \infty$;
2. there exists $r > 1$ such that $(X_t)$ has a moment of order $2r$;
3. $X$ is α-mixing and such that $\sum_{i,j} \varphi_{i+1}(\theta)\, \varphi_{j+1}(\theta)\, \alpha^{1/p}(|i - j|) < \infty$,
then $X$ verifies the assumptions of Theorem 2.5.
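As a concrete instance (a hypothetical invertible MA(1), not taken from the talk): $X_t = e_t + \beta e_{t-1}$ with $|\beta| < 1$ has AR($\infty$) coefficients $\varphi_i(\beta) = (-1)^{i+1}\beta^i$, so the truncated predictor $\sum_{i=1}^{k_T} \varphi_i(\beta) X_{T+1-i}$ approaches the probabilistic predictor $\beta e_T$ geometrically fast in $k_T$:

```python
import numpy as np

rng = np.random.default_rng(4)
beta, T, kT = 0.5, 2_000, 20

# Invertible MA(1): X_t = e_t + beta * e_{t-1}; in the arrays below,
# e[t + 1] plays the role of the innovation e_t of X_t.
e = rng.normal(size=T + 2)
X = e[1:] + beta * e[:-1]                 # X_0, ..., X_T

# AR(infinity) coefficients phi_i = (-1)**(i + 1) * beta**i, i = 1..kT.
i = np.arange(1, kT + 1)
phi = ((-1.0) ** (i + 1)) * beta ** i

X_hat = phi @ X[::-1][:kT]                # truncated predictor of X_{T+1}
X_star = beta * e[T + 1]                  # exact probabilistic predictor beta * e_T
print(abs(X_hat - X_star))                # truncation error, of order beta**kT
```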

slide-30
SLIDE 30

Outline

1. Introduction: The Statistical Prediction Problem; Prevision vs Regression; Towards asymptotic independence
2. Prediction by temporal separation: Model; Statistical Prediction and assumptions; Results: Consistency of the predictor; Example
3. Limit law of the Predictor: Assumptions; Result: Limit law of the predictor; Conclusions

slide-31
SLIDE 31

Assumptions for the limit law

Assumptions H'0 on the process
(i) $\theta \mapsto r_i(X_{T-i}, \theta)$ is twice differentiable w.r.t. $\theta$;
(ii) $\sup_i \big\|\partial^2_\theta r_i(X_{T-i}, \cdot)\big\|_\infty = O_P(1)$;
(iii) $\eta_{k_T}(X, \theta) = o_P\big(1/\sqrt{\varphi(T)}\big)$;
(iv) $\sum_{i=0}^{+\infty} \partial_\theta r_i(X_{T-i}; \theta)$ exists and converges a.s. to a vector $V$ as $T \to +\infty$.

Assumption H'1 on the estimator $\hat{\theta}_T$
(i) $\sqrt{T}\,(\hat{\theta}_T - \theta) \xrightarrow{L} N(0, \sigma^2(\theta))$.

Assumptions H'2 on the coefficients
(i) $k_T = o\big(\sqrt{\varphi(T)}\big)$;
(ii) $T - k_T - \varphi(T) \to \infty$ as $T \to \infty$.
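Assumption H'1 holds, for instance, for the least-squares estimator of a toy AR(1) with unit innovation variance, where $\sigma^2(\theta) = 1 - \theta^2$. A quick simulation of $\sqrt{T}(\hat{\theta}_T - \theta)$ (illustrative, not part of the talk):

```python
import numpy as np

rng = np.random.default_rng(7)
theta, T, reps = 0.6, 2_000, 2_000

# reps independent AR(1) paths of length T, simulated column by column.
X = np.zeros((reps, T + 1))
for t in range(T):
    X[:, t + 1] = theta * X[:, t] + rng.normal(size=reps)

# Least-squares estimate of theta on each path, then the CLT rescaling.
theta_hat = np.sum(X[:, :-1] * X[:, 1:], axis=1) / np.sum(X[:, :-1] ** 2, axis=1)
z = np.sqrt(T) * (theta_hat - theta)
print(z.std())   # close to sqrt(1 - theta**2) = 0.8, the CLT standard deviation
```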

slide-32
SLIDE 32

Limit law of the predictor

Theorem 2.10
If the assumptions H'0, H'1, H'2 are verified, then
$\sqrt{\varphi(T)}\,\big(\hat{X}_{T+1} - X^*_{T+1}\big) \xrightarrow{L} \langle U, V \rangle$
where $U$ and $V$ are two independent random variables, $U$ with law $N(0, \sigma^2(\theta))$ and $V$ the limit of $\sum_{i=0}^{+\infty} \partial_\theta r_i(X_{T-i}; \theta)$ as $T \to \infty$.

slide-33
SLIDE 33

Tool

An asymptotic independence lemma
Let $(X'_n)$ and $(X''_n)$ be two sequences of real-valued random variables with laws $P'_n$ and $P''_n$ respectively, defined on the probability space $(\Omega, \mathcal{A}, P)$. Assume that $(X'_n)$ and $(X''_n)$ are asymptotically mixing w.r.t. each other, in the sense that there exists a sequence of coefficients $\alpha(n)$ with $\alpha(n) \to 0$ as $n \to \infty$ such that, for all Borel sets $A$ and $B$ of $\mathbb{R}$,
$\big|P(X'_n \in A, X''_n \in B) - P(X'_n \in A)\,P(X''_n \in B)\big| \leq \alpha(n).$
Then, if
1. $X'_n \xrightarrow{L} X'$ with law $P'$;
2. $X''_n \xrightarrow{L} X''$ with law $P''$;
we have $(X'_n, X''_n) \xrightarrow{L} (X', X'')$, and the law of $(X', X'')$ is $P' \otimes P''$.

slide-34
SLIDE 34

Conclusions

Some limits of the temporal decoupling method
1. heuristically under-efficient: it leaves a gap in the data;
2. the mixing coefficient is a single real number, which reduces the dependence structure of the process to a property of asymptotic independence;
3. practical applications are difficult to undertake.

References
Faugeras, O. (2007). Prévision statistique paramétrique par séparation temporelle. Accepted in Annales de l'ISUP.

slide-35
SLIDE 35

Part II: A nonparametric quantile-copula approach to conditional density estimation.

slide-36
SLIDE 36

Outline

4. Introduction: Why estimating the conditional density? Two classical approaches for estimation; The trouble with ratio-shaped estimators
5. The Quantile-Copula estimator: The quantile transform; The copula representation; A product-shaped estimator
6. Asymptotic results: Consistency and asymptotic normality; Sketch of the proofs
7. Comparison with competitors: Theoretical comparison; Finite sample simulation
8. Application to prediction and discussions: Application to prediction; Discussions
9. Summary and conclusions

slide-37
SLIDE 37

Setup and Motivation

Objective
Observe an i.i.d. sample $((X_i, Y_i);\ i = 1, \dots, n)$ of $(X, Y)$.
Predict the output $Y$ for an input $X$ at location $x$, with minimal assumptions on the law of $(X, Y)$ (nonparametric setup).

Notation
$(X, Y)$: joint c.d.f. $F_{X,Y}$, joint density $f_{X,Y}$; $X$: c.d.f. $F$, density $f$; $Y$: c.d.f. $G$, density $g$.


slide-40
SLIDE 40

Why estimating the conditional density?

What is a good prediction?
1. Classical approach ($L^2$ theory): the conditional mean or regression function $r(x) = E(Y \mid X = x)$;
2. Fully informative approach: the conditional density $f(y \mid x)$.


slide-45
SLIDE 45

Estimating the conditional density - 1

A first density-based approach
$f(y \mid x) = \dfrac{f_{X,Y}(x, y)}{f(x)} \leftarrow \dfrac{\hat{f}_{X,Y}(x, y)}{\hat{f}(x)}$
with $\hat{f}_{X,Y}$, $\hat{f}$ Parzen-Rosenblatt kernel estimators with kernels $K$, $K'$ and bandwidths $h$, $h'$.

The double kernel estimator
$\hat{f}(y \mid x) = \dfrac{\sum_{i=1}^{n} K'_{h'}(X_i - x)\, K_h(Y_i - y)}{\sum_{i=1}^{n} K'_{h'}(X_i - x)}$
→ ratio shaped.
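A minimal sketch of the double kernel estimator, with Gaussian kernels for $K$ and $K'$ and hand-picked bandwidths (purely illustrative choices): on data where $f(y \mid x)$ is $N(2x, 1)$, the estimate at $x = 1$ peaks near $y = 2$.

```python
import numpy as np

def double_kernel_cde(x, y_grid, X, Y, h, hp):
    """Double kernel conditional density estimate f_hat(y | x):
    a ratio of a joint smoother over a marginal smoother."""
    K = lambda u: np.exp(-0.5 * u ** 2) / np.sqrt(2.0 * np.pi)
    w = K((X - x) / hp)                        # K'_{h'}(X_i - x), up to 1/h'
    num = (w @ K((Y[:, None] - y_grid) / h)) / h
    return num / w.sum()                       # the 1/h' factors cancel in the ratio

rng = np.random.default_rng(5)
n = 5_000
X = rng.normal(size=n)
Y = 2.0 * X + rng.normal(size=n)               # true f(y | x) is N(2x, 1)

y_grid = np.linspace(-2.0, 6.0, 81)
f_hat = double_kernel_cde(1.0, y_grid, X, Y, h=0.3, hp=0.3)
# f_hat is a density in y (integrates to roughly 1), but its ratio shape
# is exactly the feature the next slides take issue with.
```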


slide-49
SLIDE 49


Estimating the conditional density - 2

A regression strategy

Fact: E[ 1_{|Y − y| ≤ h} | X = x ] = F(y + h | x) − F(y − h | x) ≈ 2h · f(y|x).

Conditional density estimation problem → a regression framework:

1. Transform the data: Y_i → Y′_i := (2h)^{−1} 1_{|Y_i − y| ≤ h}, or the smoothed version Y_i → Y′_i := K_h(Y_i − y);
2. Perform a nonparametric regression of the Y′_i on the X_i by local averaging methods (Nadaraya-Watson, local polynomial, orthogonal series, ...).

Nadaraya-Watson estimator:
f̂(y|x) = [ Σ_{i=1}^n K′_{h′}(X_i − x) K_h(Y_i − y) ] / [ Σ_{i=1}^n K′_{h′}(X_i − x) ]
→ (same) ratio shape.
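The two routes above land on the same formula. A minimal numerical sketch (with a hypothetical synthetic sample and Gaussian kernels, chosen purely for illustration) checks that the Nadaraya-Watson regression of the transformed responses Y′_i = K_h(Y_i − y) reproduces the double kernel ratio exactly:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 500
X = rng.normal(size=n)
Y = 0.5 * X + rng.normal(size=n)       # illustrative sample, not the thesis's data

def K(t):
    # Gaussian kernel (a bounded-support kernel works the same way)
    return np.exp(-t**2 / 2) / np.sqrt(2 * np.pi)

x0, y0, h, hp = 0.0, 0.3, 0.25, 0.25   # evaluation point and bandwidths h, h'

w = K((X - x0) / hp)                   # weights K'_{h'}(X_i - x)

# Route 1: the double kernel ratio estimator
f_double_kernel = np.sum(w * (K((Y - y0) / h) / h)) / np.sum(w)

# Route 2: transform Y_i -> Y'_i := K_h(Y_i - y), then Nadaraya-Watson regression on the X_i
Y_prime = K((Y - y0) / h) / h
f_nadaraya_watson = np.sum(w * Y_prime) / np.sum(w)

print(f_double_kernel, f_nadaraya_watson)
```

The two numbers agree because the two constructions are algebraically the same ratio, which is exactly the point of the slide.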


slide-52
SLIDE 52


Ratio shaped estimators

Bibliography

1. Double kernel estimator: Rosenblatt [1969], Roussas [1969], Stute [1986], Hyndman, Bashtannyk and Grunwald [1996];
2. Local polynomial: Fan, Yao and Tong [1996], Fan and Yao [2005];
3. Local parametric and constrained local polynomial: Hyndman and Yao [2002]; Rojas, Genovese and Wasserman [2009];
4. Partitioning type estimate: Györfi and Kohler [2007];
5. Projection type estimate: Lacour [2007].


slide-53
SLIDE 53


The trouble with ratio shaped estimators

Drawbacks:
1. the quotient shape of the estimator is tricky to study;
2. explosive behavior when the denominator is small → delicate numerical implementation (trimming);
3. a lower-bound hypothesis on the marginal density is needed: f(x) ≥ c > 0.

How to remedy these problems? → Build on the idea of using synthetic data: find a representation of the data better adapted to the problem.


slide-55
SLIDE 55

Introduction The Quantile-Copula estimator Asymptotic results Comparison with competitors Application to prediction and discussions Summary and conclusions The quantile transform The copula representation A product shaped estimator

Outline

4. Introduction: Why estimating the conditional density? Two classical approaches for estimation. The trouble with ratio shaped estimators.
5. The Quantile-Copula estimator: The quantile transform. The copula representation. A product shaped estimator.
6. Asymptotic results: Consistency and asymptotic normality. Sketch of the proofs.
7. Comparison with competitors: Theoretical comparison. Finite sample simulation.
8. Application to prediction and discussions: Application to prediction. Discussions.
9. Summary and conclusions.


slide-56
SLIDE 56


The quantile transform

What is the “best” transformation of the data in this context?

The quantile transform theorem:
- when F is arbitrary, if U is a uniformly distributed random variable on (0, 1), then X =_d F^{−1}(U);
- whenever F is continuous, the random variable U = F(X) is uniformly distributed on (0, 1).

→ use the invariance property of the quantile transform to construct a pseudo-sample (U_i, V_i) with prescribed uniform marginal distributions:

(X_1, ..., X_n) → (U_1 = F(X_1), ..., U_n = F(X_n))
(Y_1, ..., Y_n) → (V_1 = G(Y_1), ..., V_n = G(Y_n))


slide-59
SLIDE 59


The copula representation

→ leads naturally to the copula function.

Sklar’s theorem [1959]: for any bivariate cumulative distribution function F_{X,Y} on R², with marginal c.d.f.s F of X and G of Y, there exists a function C : [0, 1]² → [0, 1], called the dependence or copula function, such that
F_{X,Y}(x, y) = C(F(x), G(y)), −∞ ≤ x, y ≤ +∞.
If F and G are continuous, this representation is unique. Moreover, the copula function C is itself a c.d.f. on [0, 1]² with uniform marginals.

→ captures the dependence structure of the vector (X, Y), irrespective of the marginals;
→ allows one to deal with the randomness of the dependence structure and the randomness of the marginals separately.


slide-62
SLIDE 62


A product shaped estimator

Assume that the copula function C(u, v) has a density
c(u, v) = ∂²C(u, v) / ∂u∂v,
i.e. c(u, v) is the density of the transformed random vector (U, V) = (F(X), G(Y)).

A product form of the conditional density: differentiating Sklar’s formula,
f_{Y|X}(y|x) = f_{X,Y}(x, y) / f(x) = g(y) c(F(x), G(y)).

A product shaped estimator:
f̂_{Y|X}(y|x) = ĝ_n(y) ĉ_n(F_n(x), G_n(y)).


slide-64
SLIDE 64


Construction of the estimator - 1

→ obtain an estimator of the conditional density by plugging in an estimator of each quantity:

density of Y: g ← kernel estimator ĝ_n(y) := (1 / (n h_n)) Σ_{i=1}^n K_0((y − Y_i) / h_n);

c.d.f.s: F(x) ← F_n(x) := (1/n) Σ_{j=1}^n 1_{X_j ≤ x} and G(y) ← G_n(y) := (1/n) Σ_{j=1}^n 1_{Y_j ≤ y}, the empirical c.d.f.s;

copula density: c(u, v) ← c_n(u, v), a bivariate Parzen-Rosenblatt kernel density (pseudo) estimator

c_n(u, v) := (1 / (n a_n²)) Σ_{i=1}^n K( (u − U_i)/a_n , (v − V_i)/a_n )    (1)

with kernel K(u, v) = K_1(u) K_2(v) and bandwidth a_n.


slide-68
SLIDE 68


Construction of the estimator - 2

But F and G are unknown: the random variables (U_i = F(X_i), V_i = G(Y_i)) are not observable ⇒ c_n is not a true statistic.
→ approximate the pseudo-sample (U_i, V_i), i = 1, ..., n, by its empirical counterpart (F_n(X_i), G_n(Y_i)), i = 1, ..., n.

A genuine estimator of c(u, v):
ĉ_n(u, v) := (1 / (n a_n²)) Σ_{i=1}^n K_1( (u − F_n(X_i))/a_n ) K_2( (v − G_n(Y_i))/a_n ).


slide-73
SLIDE 73


The quantile-copula estimator

Recollecting all the elements, we get:

The quantile-copula estimator
f̂_n(y|x) := ĝ_n(y) ĉ_n(F_n(x), G_n(y)),
that is,
f̂_n(y|x) := [ (1/(n h_n)) Σ_{i=1}^n K_0((y − Y_i)/h_n) ] · [ (1/(n a_n²)) Σ_{i=1}^n K_1((F_n(x) − F_n(X_i))/a_n) K_2((G_n(y) − G_n(Y_i))/a_n) ].
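A minimal numerical sketch of this construction, assuming i.i.d. bivariate Gaussian data and an Epanechnikov kernel for every K (sample, bandwidths and kernel choices are all illustrative, not the thesis's simulation setup):

```python
import numpy as np

def epan(t):
    # Epanechnikov kernel: bounded support and bounded variation, as the assumptions require
    return np.where(np.abs(t) <= 1.0, 0.75 * (1.0 - t**2), 0.0)

def quantile_copula(x, y, X, Y, h_n, a_n):
    """f_n(y|x) := g_n(y) * c_n(Fn(x), Gn(y)), with the empirical c.d.f.s plugged in."""
    n = len(X)
    g_hat = np.mean(epan((y - Y) / h_n)) / h_n              # kernel density of Y at y
    U = np.searchsorted(np.sort(X), X, side="right") / n    # pseudo-sample Fn(X_i)
    V = np.searchsorted(np.sort(Y), Y, side="right") / n    # pseudo-sample Gn(Y_i)
    u = np.sum(X <= x) / n                                  # Fn(x)
    v = np.sum(Y <= y) / n                                  # Gn(y)
    c_hat = np.mean(epan((u - U) / a_n) * epan((v - V) / a_n)) / a_n**2
    return g_hat * c_hat

rng = np.random.default_rng(42)
n, rho = 2000, 0.6
X = rng.normal(size=n)
Y = rho * X + np.sqrt(1 - rho**2) * rng.normal(size=n)

est = quantile_copula(0.0, 0.0, X, Y, h_n=0.3, a_n=0.15)
true = 1.0 / (np.sqrt(2 * np.pi) * np.sqrt(1 - rho**2))     # true f(0|0) for this model
print(est, true)
```

Note the product shape: no denominator appears, so no trimming or lower bound on f(x) is needed.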


slide-75
SLIDE 75

Introduction The Quantile-Copula estimator Asymptotic results Comparison with competitors Application to prediction and discussions Summary and conclusions Consistency and asymptotic normality Sketch of the proofs

Hypotheses

Assumptions on the densities:
(i) the c.d.f.s F of X and G of Y are strictly increasing and differentiable;
(ii) the densities g and c are twice differentiable with continuous bounded second derivatives on their support.

Assumptions on the kernels:
(i) K and K_0 have bounded support and bounded variation;
(ii) 0 ≤ K ≤ C and 0 ≤ K_0 ≤ C for some constant C;
(iii) K and K_0 are second order kernels: m_0(K) = 1, m_1(K) = 0 and m_2(K) < +∞, and the same for K_0;
(iv) K is twice differentiable with bounded second partial derivatives.

→ classical regularity assumptions in the nonparametric literature.


slide-77
SLIDE 77


Asymptotic results - 1

Under the above regularity assumptions, with h_n → 0 and a_n → 0:

Pointwise consistency
- weak consistency: h_n ≃ n^{−1/5} and a_n ≃ n^{−1/6} entail f̂_n(y|x) = f(y|x) + O_P(n^{−1/3});
- strong consistency: h_n ≃ (ln ln n / n)^{1/5} and a_n ≃ (ln ln n / n)^{1/6} entail f̂_n(y|x) = f(y|x) + O_{a.s.}( (ln ln n / n)^{1/3} );
- asymptotic normality: n h_n → ∞, n a_n⁴ → ∞, n a_n⁶ → 0 and √(ln ln n) / (n a_n³) → 0 entail
  √(n a_n²) ( f̂_n(y|x) − f(y|x) ) →_d N( 0, g(y) f(y|x) ||K||²₂ ).


slide-78
SLIDE 78


Asymptotic results - 2

Uniform consistency
Under the above regularity assumptions, with h_n → 0 and a_n → 0, for x in the interior of the support of f and [a, b] included in the interior of the support of g:
- weak consistency: h_n ≃ (ln n / n)^{1/5} and a_n ≃ (ln n / n)^{1/6} entail sup_{y ∈ [a,b]} | f̂_n(y|x) − f(y|x) | = O_P( (ln n / n)^{1/3} );
- strong consistency: the same bandwidth choices entail sup_{y ∈ [a,b]} | f̂_n(y|x) − f(y|x) | = O_{a.s.}( (ln n / n)^{1/3} ).


slide-79
SLIDE 79


Asymptotic Mean square error

Asymptotic bias and variance for the quantile-copula estimator

Bias: E( f̂_n(y|x) ) − f(y|x) = g(y) m_2(K) · ∇²c(F(x), G(y)) a_n² / 2 + o(a_n²),
with m_2(K) = (m_2(K_1), m_2(K_2)) and ∇²c(u, v) = ( ∂²c(u,v)/∂u² , ∂²c(u,v)/∂v² ).

Variance: Var( f̂_n(y|x) ) = g(y) f(y|x) ||K||²₂ / (n a_n²) + o( 1/(n a_n²) ).


slide-80
SLIDE 80


Sketch of the proofs

Decomposition diagram:

ĝ_n(y) ĉ_n(F_n(x), G_n(y))
↓
g(y) ĉ_n(F_n(x), G_n(y)) → g(y) ĉ_n(F(x), G(y)) → g(y) c_n(F(x), G(y))
↓
g(y) c(F(x), G(y))

↓ : consistency results for the kernel density estimators;
→ : two approximation lemmas:
1. ĉ_n evaluated at (F_n(x), G_n(y)) → evaluated at (F(x), G(y));
2. ĉ_n → c_n.

Tools: results for the Kolmogorov-Smirnov statistics ||F − F_n||∞ and ||G − G_n||∞.
→ Heuristic: the rate of convergence of the density estimators is slower than the rate of approximation of the K-S statistic.


slide-82
SLIDE 82

Introduction The Quantile-Copula estimator Asymptotic results Comparison with competitors Application to prediction and discussions Summary and conclusions Theoretical comparison Finite sample simulation

Theoretical asymptotic comparison - 1

Competitor: e.g. the local polynomial estimator
f̂_n^{(LP)}(y|x) := θ̂_0, where θ̂_{xy} := (θ̂_0, θ̂_1, ..., θ̂_r) is the value of θ minimizing
R(θ, x, y) := Σ_{i=1}^n [ K_{h_2}(Y_i − y) − Σ_{j=0}^r θ_j (X_i − x)^j ]² K′_{h_1}(X_i − x).

Comparative bias:
B_{LP} = (h_1² m_2(K′) / 2) ∂²f(y|x)/∂x² + (h_2² m_2(K) / 2) ∂²f(y|x)/∂y² + o(h_1² + h_2²);
B_{QC} = g(y) m_2(K) · ∇²c(F(x), G(y)) a_n² / 2 + o(a_n²).


slide-83
SLIDE 83


Theoretical asymptotic comparison - 2

Asymptotic bias comparison
- All estimators have a bias of the same order ≈ h² ≈ n^{−1/3}.
- Distribution dependent terms: difficult to compare; sometimes fewer unknown terms for the quantile-copula estimator.
- c has compact support: the “classical” kernel method for estimating the copula density induces bias on the boundaries of [0, 1]² → techniques to reduce the bias of the kernel estimator on the edges (boundary kernels, beta kernels, reflection and transformation methods, ...).


slide-86
SLIDE 86


Theoretical asymptotic comparison - 3

Asymptotic variance comparison
Main terms in the asymptotic variance:
Ratio-shaped estimators: Var(LP) := f(y|x)/f(x)
→ explosive variance for small values of the density f(x), e.g. in the tail of the distribution of X.
Quantile-copula estimator: Var(QC) := g(y) f(y|x)
→ does not suffer from the unstable nature of its competitors.
Asymptotic relative efficiency (ratio of variances): Var(QC)/Var(LP) = f(x) g(y)
→ the QC estimator has the lower asymptotic variance over a wide range of (x, y) values.



slide-90
SLIDE 90


Finite sample simulation

Model: a sample of n = 100 i.i.d. pairs (Xi, Yi) from the following model:
X and Y are each marginally distributed as N(0, 1);
X and Y are linked via the Frank copula
C(u, v; θ) = ln[(θ + θ^(u+v) − θ^u − θ^v)/(θ − 1)] / ln θ
with parameter θ = 100.
Practical implementation: beta kernels for the copula estimator, the Epanechnikov kernel otherwise; a simple rule-of-thumb method for the bandwidths.
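The simulation model can be sketched as follows. This is a hypothetical reconstruction, not the talk's actual code: it assumes the e-base Frank parameter α = −ln θ (which makes the usual e-base formula coincide algebraically with the base-θ expression above) and samples V given U = u by inverting the conditional copula; the function names are mine.

```python
import numpy as np
from scipy.stats import norm

def sample_frank(n, alpha, rng):
    """Sample (U, V) from a Frank copula C(u, v) = -(1/alpha) * ln(1 + x*y/a),
    with a = exp(-alpha) - 1, x = exp(-alpha*u) - 1, y = exp(-alpha*v) - 1,
    by inverting the conditional cdf w = C(v | u) = e^{-alpha u} y / (a + x y)."""
    u = rng.uniform(size=n)
    w = rng.uniform(size=n)
    a = np.expm1(-alpha)
    x = np.expm1(-alpha * u)
    y = w * a / (1.0 + x * (1.0 - w))  # closed-form solution of w = C(v | u) for y
    v = -np.log1p(y) / alpha
    return u, v

rng = np.random.default_rng(0)
theta = 100.0
alpha = -np.log(theta)          # assumed mapping to the base-theta form above
u, v = sample_frank(100, alpha, rng)
x_s, y_s = norm.ppf(u), norm.ppf(v)  # impose the N(0, 1) margins
```

The resulting (x_s, y_s) sample has standard normal margins and strong dependence, on which the competing conditional density estimators can then be compared.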


slide-91
SLIDES 91-94

[Figures: finite-sample simulation results comparing the conditional density estimators on the model above.]

slide-95
SLIDE 95


Outline

4 Introduction
Why estimate the conditional density?
Two classical approaches to estimation
The trouble with ratio-shaped estimators

5 The Quantile-Copula estimator
The quantile transform
The copula representation
A product-shaped estimator

6 Asymptotic results
Consistency and asymptotic normality
Sketch of the proofs

7 Comparison with competitors
Theoretical comparison
Finite sample simulation

8 Application to prediction and discussions
Application to prediction
Discussions

9 Summary and conclusions


slide-96
SLIDE 96


Application to prediction - definitions

Point predictors: the conditional mode predictor
Definition of the mode: θ(x) := arg sup_y f(y|x)
→ plug-in predictor: θ̂(x) := arg sup_y f̂n(y|x)
Set predictors: level sets
Predictive set Cα(x) such that P(Y ∈ Cα(x) | X = x) = α
→ level set, or highest density region: Cα(x) := {y : f(y|x) ≥ fα}, with fα the largest value such that the prediction set has coverage probability α.
→ plug-in level set: Cα,n(x) := {y : f̂n(y|x) ≥ f̂α}, where f̂α is an estimate of fα.
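Both plug-in predictors reduce to simple operations once a conditional density estimate is available on a grid. A minimal sketch (the helper name and the toy Gaussian "estimate", standing in for f̂n(·|x), are mine):

```python
import numpy as np

def mode_and_level_set(f_vals, y_grid, alpha=0.9):
    """Plug-in conditional mode and highest-density region from the values
    f_vals of a (conditional) density estimate on an equispaced y_grid."""
    dy = y_grid[1] - y_grid[0]
    mode = y_grid[np.argmax(f_vals)]
    # f_alpha: largest threshold whose level set {f >= f_alpha} has coverage >= alpha
    order = np.argsort(f_vals)[::-1]           # grid points by decreasing density
    coverage = np.cumsum(f_vals[order]) * dy   # Riemann sum of the density mass
    k = np.searchsorted(coverage, alpha)
    f_alpha = f_vals[order[min(k, len(f_vals) - 1)]]
    region = y_grid[f_vals >= f_alpha]
    return mode, region

# toy "estimate": the N(1, 0.5^2) density; its 90% HDR is 1 +/- 1.645 * 0.5
y = np.linspace(-3.0, 5.0, 4001)
f = np.exp(-0.5 * ((y - 1.0) / 0.5) ** 2) / (0.5 * np.sqrt(2.0 * np.pi))
mode, region = mode_and_level_set(f, y, alpha=0.9)
```

For a unimodal density the region is an interval; in general it is a union of intervals, which is exactly what makes the HDR preferable to a quantile interval for multimodal conditional laws.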



slide-101
SLIDE 101


Application to prediction - results

Point predictors: the conditional mode predictor
Under regularity conditions, uniform convergence on a compact set of the conditional density estimator entails that θ̂(x) → θ(x) almost surely.
Set predictors: level sets
Under regularity conditions, uniform convergence on a compact set of the conditional density estimator entails that λ(∆(Cα,n(x), Cα(x))) → 0 almost surely, where ∆(·, ·) denotes the symmetric difference and λ the Lebesgue measure.


slide-102
SLIDE 102


On the efficiency of the empirical margin estimators

Deficiency of the empirical distribution function
The order statistics X(1,n) < · · · < X(n,n) are complete and sufficient for estimating a distribution F with density f
→ Fn is the UMVU estimator of F.
Its smoothed version
F̂(x) = n^(−1) Σ_{i=1}^{n} L((x − Xi)/bn),
where bn is a bandwidth and L(x) = ∫_{−∞}^{x} l(t) dt with l a kernel density, is such that
E(F̂(x) − F(x))^2 − E(Fn(x) − F(x))^2 + (2h/n) F′(x) ∫ t l(t) L(t) dt ≤ h^4 A C^2 + O(h^2/n)
(writing h = bn)
→ Fn is deficient with respect to F̂.
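The smoothed estimator F̂ can be sketched in a few lines; this assumes the Epanechnikov kernel for l and the sign convention L((x − Xi)/bn), which makes F̂ nondecreasing (the function names are mine):

```python
import numpy as np
from scipy.stats import norm

def L_epanechnikov(t):
    """Integrated Epanechnikov kernel L(t) = int_{-inf}^{t} l(s) ds,
    where l(s) = 0.75 * (1 - s**2) on [-1, 1]."""
    t = np.clip(t, -1.0, 1.0)
    return 0.5 + 0.75 * t - 0.25 * t ** 3

def smoothed_cdf(x, data, bn):
    """F_hat(x) = n^-1 * sum_i L((x - X_i)/bn): a smooth, nondecreasing
    competitor to the empirical cdf F_n."""
    return np.mean(L_epanechnikov((x[:, None] - data[None, :]) / bn), axis=1)

rng = np.random.default_rng(1)
data = rng.normal(size=500)          # sample from N(0, 1), so the true F is Phi
x = np.linspace(-3.0, 3.0, 61)
F_hat = smoothed_cdf(x, data, bn=0.3)
F_true = norm.cdf(x)
```

Unlike the step function Fn, F̂ is continuous and differentiable, which is what the doubly smoothed quantile-copula estimator on the next slide exploits.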



slide-104
SLIDE 104


Implication for the quantile copula estimator

The doubly smoothed quantile-copula conditional density estimator
→ replace Fn and Gn by F̂ and Ĝ:
beneficial for small samples;
graphically more appealing: less wiggly behaviour.
Consequence for local averaging
With smooth margin estimators F̂ and Ĝ, a first-order expansion gives
F̂(x) − F̂(Xi) ≈ f̂(Xi)(x − Xi)   (2)
or
F̂(Xi) − F̂(x) ≈ f̂(x)(Xi − x)   (3)



slide-106
SLIDE 106


Connection with the variable bandwidth kernel estimators

Therefore, the copula density part of the estimator writes

ĉn(F̂(x), Ĝ(y)) = (n an bn)^(−1) Σ_{i=1}^{n} K1((F̂(Xi) − F̂(x))/an) K2((Ĝ(Yi) − Ĝ(y))/bn)
 ≈ (n an bn)^(−1) Σ_{i=1}^{n} K1((Xi − x)/(an/f̂(Xi))) K2((Yi − y)/(bn/ĝ(Yi))) with approximation (2), and
 ≈ (n an bn)^(−1) Σ_{i=1}^{n} K1((Xi − x)/(an/f̂(x))) K2((Yi − y)/(bn/ĝ(y))) with approximation (3).


slide-107
SLIDE 107


Connection with the variable bandwidth kernel estimators

→ the copula density estimator with smoothed margin estimates behaves like a kernel estimator with an adaptive local bandwidth:
an/f̂(Xi): a sample-smoothing bandwidth;
an/f̂(x): a balloon-smoothing bandwidth.


slide-108
SLIDE 108


Outline

4 Introduction
Why estimate the conditional density?
Two classical approaches to estimation
The trouble with ratio-shaped estimators

5 The Quantile-Copula estimator
The quantile transform
The copula representation
A product-shaped estimator

6 Asymptotic results
Consistency and asymptotic normality
Sketch of the proofs

7 Comparison with competitors
Theoretical comparison
Finite sample simulation

8 Application to prediction and discussions
Application to prediction
Discussions

9 Summary and conclusions


slide-109
SLIDE 109


Conclusions

Summary
The ratio-type estimator is turned into a product-type one:
→ consistency and limit results were obtained by combining previously known results on (unconditional) density estimation;
→ non-explosive behaviour in the tails of the marginal density: no need for trimming or clipping.


slide-110
SLIDE 110


Conclusions

Some perspectives and work in progress
Bandwidth choices adaptive to the regularity of the model, with efficient kernel estimation of the copula density by boundary-corrected kernels (with A. Leblanc).
Design of application-specific conditional estimators:

estimation in the tail of the marginal distribution, related to extreme value theory, with applications in insurance, risk analysis and environmental sciences;
estimation for censored data, with Kaplan-Meier estimators of the marginals.

Extension to time series by coupling arguments for Markovian models.
Alternative nonparametric estimation methods by wavelets and minimax analysis (with K. Tribouley and E. Masiello).


slide-111
SLIDE 111


Bibliography

Reference

  • O. P. Faugeras. A quantile-copula approach to conditional density estimation. Submitted, accepted upon minor revision, 2008. Available at http://hal.archives-ouvertes.fr/hal-00172589/fr/.

Related work:

  • J. Fan and Q. Yao. Nonlinear time series: Nonparametric and parametric methods. Springer Series in Statistics. Springer-Verlag, New York, second edition, 2005.

  • J. Fan, Q. Yao, and H. Tong. Estimation of conditional densities and sensitivity measures in nonlinear dynamical systems. Biometrika, 83(1):189-206, 1996.

  • L. Györfi and M. Kohler. Nonparametric estimation of conditional distributions. IEEE Trans. Inform. Theory, 53(5):1872-1879, 2007.

  • R. J. Hyndman, D. M. Bashtannyk, and G. K. Grunwald. Estimating and visualizing conditional densities. J. Comput. Graph. Statist., 5(4):315-336, 1996.


slide-112
SLIDE 112


References

  • R. J. Hyndman and Q. Yao. Nonparametric estimation and symmetry tests for conditional density functions. J. Nonparametr. Stat., 14(3):259-278, 2002.

  • C. Lacour. Adaptive estimation of the transition density of a Markov chain. Ann. Inst. H. Poincaré Probab. Statist., 43(5):571-597, 2007.

  • M. Rosenblatt. Conditional probability density and regression estimators. In Multivariate Analysis, II (Proc. Second Internat. Sympos., Dayton, Ohio, 1968), pages 25-31. Academic Press, New York, 1969.

  • M. Sklar. Fonctions de répartition à n dimensions et leurs marges. Publ. Inst. Statist. Univ. Paris, 8:229-231, 1959.

  • W. Stute. On almost sure convergence of conditional empirical distribution functions. Ann. Probab., 14(3):891-901, 1986.


slide-113
SLIDE 113


Thank you!
