

SLIDE 1

Block I: Connections between multivariate and FDA

Beatriz Bueno-Larraz
Real Eyes / Universidad Autónoma de Madrid

IWAFDA 2019

SLIDE 2

"It turns out, in my opinion, that reproducing kernel Hilbert spaces are the natural setting in which to solve problems of statistical inference on time processes."

Parzen (1962)

B. Bueno-Larraz
Block III: Connections between multivariate and FDA IWAFDA 1 / 43

SLIDE 3

1. Introduction to RKHS's
2. Logistic regression
   Berrendero, J. R., Bueno-Larraz, B., Cuevas, A. (2018). On functional logistic regression via RKHS's. arXiv:1812.00721.
3. Mahalanobis distance
   Berrendero, J. R., Bueno-Larraz, B., Cuevas, A. (2018). On Mahalanobis distance in functional settings. arXiv:1803.06550.
4. Binary classification
   Berrendero, J. R., Cuevas, A., Torrecilla, J. L. (2017). On the use of reproducing kernel Hilbert spaces in functional classification. Journal of the American Statistical Association.
   Delaigle, A., Hall, P. (2012). Achieving near perfect classification for functional data. Journal of the Royal Statistical Society: Series B.

SLIDE 4

Our setting

Our sample consists of trajectories x ∈ L²[0, 1], drawn from a second-order process X(s) with

  • mean function m(s) = E[X(s)],
  • covariance function K(s, t) = cov(X(s), X(t)).

Standard Brownian motion: m(s) = 0, K(s, t) = min(s, t).
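As a quick numerical sanity check of the Brownian motion example (my own sketch, not part of the slides; grid size, sample size and seed are arbitrary), one can simulate trajectories and compare the empirical mean and covariance with m(s) = 0 and K(s, t) = min(s, t):

```python
import numpy as np

rng = np.random.default_rng(0)
n_traj, n_grid = 20000, 50
dt = 1.0 / n_grid
t = np.arange(1, n_grid + 1) * dt              # grid in (0, 1]

# Brownian motion as cumulative sums of independent Gaussian increments
X = np.cumsum(rng.normal(0.0, np.sqrt(dt), size=(n_traj, n_grid)), axis=1)

emp_mean = X.mean(axis=0)                      # should be close to 0
emp_cov = np.cov(X, rowvar=False)              # should be close to min(s, t)
theo_cov = np.minimum.outer(t, t)

print(np.abs(emp_mean).max(), np.abs(emp_cov - theo_cov).max())
```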

SLIDE 5

RKHS via the covariance operator

The functional analogue of the covariance matrix is the covariance operator:
x = Σy  →  Kf(t) = ∫₀¹ K(t, s) f(s) ds.

Definition in Peszat and Zabczyk (2007)

The RKHS associated with K is defined as H(K) = {K^{1/2}f, f ∈ L²[0, 1]}.

If λ₁ ≥ λ₂ ≥ … > 0 and {e_j} are the eigenvalues and eigenfunctions of K, then for f ∈ H(K),

\[ \|f\|_K^2 = \|K^{-1/2} f\|_2^2 = \sum_{j=1}^{\infty} \frac{\langle f, e_j \rangle_2^2}{\lambda_j}, \]

since {e_j} is an ONB in L²[0, 1].


SLIDE 7

Finite dimensional case

  Rd                                          | L²[0, 1]
  Matrix Σ                                    | Operator K
  H(Σ) = {Σ^{1/2}a, a ∈ Rd} = {Σa, a ∈ Rd}    | H(K) = {K^{1/2}f, f ∈ L²[0, 1]}
  ‖x‖²_Σ = (Σ^{-1/2}x)′(Σ^{-1/2}x)            | ‖g‖²_K = ‖K^{-1/2}g‖²₂

In the functional case: when x is the trajectory of a Gaussian process, x ∉ H(K) with probability 1 (Lukić and Beder (2001)).

Brownian motion: H(K) = {f | f(0) = 0, f absolutely continuous, f′ ∈ L²[0, 1]}.
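The eigen-expansion of the norm on the previous slide can be checked against the Brownian motion characterization of H(K) above. A numerical sketch (my own; the closed-form eigenpairs of min(s, t) are standard, and f(t) = t is a convenient element of H(K) with ∫₀¹ f′(t)² dt = 1):

```python
import numpy as np

# Brownian covariance K(s,t) = min(s,t) has known eigenpairs:
#   lambda_j = ((j - 1/2) pi)^(-2),   e_j(t) = sqrt(2) sin((j - 1/2) pi t)
J = 500
omega = (np.arange(1, J + 1) - 0.5) * np.pi
lam = omega ** -2.0

n = 20000
h = 1.0 / n
t = np.linspace(0.0, 1.0, n + 1)
f = t                                   # f(t) = t: f(0) = 0, f' = 1, so f lies in H(K)

def trapezoid(y):                       # composite trapezoid rule on the uniform grid
    return h * (0.5 * y[0] + y[1:-1].sum() + 0.5 * y[-1])

# Fourier coefficients <f, e_j>_2 and the series sum_j <f, e_j>^2 / lambda_j
coef = np.array([trapezoid(f * np.sqrt(2.0) * np.sin(w * t)) for w in omega])
norm_K = np.sum(coef ** 2 / lam)

# for Brownian motion ||f||_K^2 = int_0^1 f'(t)^2 dt, which is 1 here
print(norm_K)
```

The truncated series comes out very close to 1, matching the Sobolev-norm description of H(K).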


SLIDE 9

Classical definition of RKHS

A function r : [0, 1] × [0, 1] → R is a reproducing kernel of the Hilbert space H ⊂ L²[0, 1] if and only if it satisfies:

  • 1. r(s, ·) ∈ H, ∀s ∈ [0, 1].
  • 2. Reproducing property: ⟨f, r(s, ·)⟩_H = f(s), ∀f ∈ H, ∀s ∈ [0, 1].

A Hilbert space of real-valued functions with a reproducing kernel is called a reproducing kernel Hilbert space (RKHS).

SLIDE 10

Characterizing reproducing kernels

A reproducing kernel is always positive semidefinite, that is, for all (s₁, …, s_p) ∈ [0, 1]^p, the matrix \((r(s_i, s_j))_{i,j=1}^p\) is positive semidefinite.

Moore–Aronszajn theorem

Every positive semidefinite function is the kernel of a unique RKHS. Every stochastic process has a natural associated RKHS, H(K), whose kernel is the covariance function of the process, K(s, t).
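Positive semidefiniteness is easy to probe numerically. A small sketch (my own; the points are arbitrary) using the Brownian kernel min(s, t): every Gram matrix it generates should have nonnegative eigenvalues.

```python
import numpy as np

rng = np.random.default_rng(1)
s = np.sort(rng.uniform(0.0, 1.0, 30))       # arbitrary points in [0, 1]

R = np.minimum.outer(s, s)                   # Gram matrix (r(s_i, s_j)) for r = min
eig = np.linalg.eigvalsh(R)                  # eigenvalues of a symmetric matrix

print(eig.min())                             # nonnegative (up to rounding) -> PSD
```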

SLIDE 11

RKHS via finite linear combinations

If K(s, t) is the covariance function, we define (H₀(K), ⟨·, ·⟩_K) by

\[ H_0(K) \equiv \Big\{ f : f(s) = \sum_{j=1}^{n} a_j K(s, t_j),\ a_j \in \mathbb{R},\ t_j \in [0, 1],\ n \in \mathbb{N} \Big\}, \]

\[ \langle f, g \rangle_K = \sum_{i,j} a_i b_j K(s_i, t_j), \]

where f(·) = Σᵢ aᵢ K(·, sᵢ) and g(·) = Σⱼ bⱼ K(·, tⱼ).

The RKHS associated with K is defined as the completion of H₀(K).
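For the Brownian kernel the H₀(K) inner product can be cross-checked against the Sobolev norm of H(K) seen earlier: f = Σᵢ aᵢ min(·, sᵢ) is piecewise linear with f′(t) = Σᵢ aᵢ 1{t < sᵢ}, and ∫₀¹ 1{t < sᵢ} 1{t < sⱼ} dt = min(sᵢ, sⱼ). A numerical sketch (coefficients and nodes are my own arbitrary choices):

```python
import numpy as np

# f = sum_i a_i K(., s_i) with Brownian kernel K(s, t) = min(s, t)
a = np.array([1.5, -0.7, 2.0])
s = np.array([0.2, 0.5, 0.9])
G = np.minimum.outer(s, s)

norm_H0 = a @ G @ a                     # <f, f>_K = sum_{i,j} a_i a_j K(s_i, s_j)

# f is piecewise linear with f'(t) = sum_i a_i 1{t < s_i}; the H(K) norm
# should equal the Sobolev norm int_0^1 f'(t)^2 dt
t = np.linspace(0.0, 1.0, 100001)
fprime = (t[:, None] < s[None, :]).astype(float) @ a
norm_sobolev = np.mean(fprime ** 2)     # uniform grid: mean approximates the integral

print(norm_H0, norm_sobolev)            # the two norms agree
```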

SLIDE 12

Loève's isometry

L(X) is the L²-completion of the linear span of {X(s), s ∈ [0, 1]}.

Loève's Isometry

The RKHS H(K) is an isometric copy of L(X), since

\[ \Psi_X\Big( \sum_{i=1}^{p} a_i \big( X(t_i) - m(t_i) \big) \Big) = \sum_{i=1}^{p} a_i K(t_i, \cdot), \quad \forall a_i \in \mathbb{R}, \]

defines a congruence. We identify \(\langle \beta, X \rangle_K \equiv \Psi_X^{-1}(\beta)\).

SLIDE 13

1. Introduction to RKHS's
2. Logistic regression
3. Mahalanobis distance
4. Binary classification

SLIDE 14

Logistic regression problem

Used for problems with categorical response (typically Y ∈ {0, 1}). It is assumed that log(p(x)/(1 − p(x))) is linear in x ∈ Rd, where p(x) = P(Y=1 | x), which leads to:

\[ P(Y=1 \mid x) = \frac{1}{1 + \exp\{-\alpha_0 - \alpha' x\}}, \quad \alpha_0 \in \mathbb{R},\ \alpha \in \mathbb{R}^d. \]

It holds when X₀, X₁ are Gaussian with common covariance matrix.
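The last claim, that homoscedastic Gaussian classes produce an exactly logistic posterior, can be verified directly. A sketch (my own; the means, covariance, prior and test point are arbitrary): compute P(Y=1 | x) once by Bayes' rule with the two Gaussian densities, and once in the logistic form with α = Σ⁻¹(m₁ − m₀).

```python
import numpy as np

def npdf(x, mu, Sigma):
    # multivariate normal density, numpy only
    d = len(mu)
    diff = x - mu
    quad = diff @ np.linalg.solve(Sigma, diff)
    return np.exp(-0.5 * quad) / np.sqrt((2 * np.pi) ** d * np.linalg.det(Sigma))

m0, m1 = np.array([0.0, 0.0]), np.array([1.0, -0.5])
S = np.array([[1.0, 0.3], [0.3, 2.0]])
p = 0.4                                   # prior P(Y = 1)
x = np.array([0.7, 0.2])

# posterior by Bayes' rule
f0, f1 = npdf(x, m0, S), npdf(x, m1, S)
post = p * f1 / (p * f1 + (1 - p) * f0)

# same posterior in logistic form, with alpha derived from the Gaussian parameters
Sinv = np.linalg.inv(S)
alpha = Sinv @ (m1 - m0)
alpha0 = np.log(p / (1 - p)) - 0.5 * (m1 @ Sinv @ m1 - m0 @ Sinv @ m0)
logistic = 1.0 / (1.0 + np.exp(-alpha0 - alpha @ x))

print(post, logistic)                     # the two expressions coincide
```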

SLIDE 15

Goals of this part

  • To analyze the relationship between two functional extensions of the multivariate model.
  • To carefully examine whether ML estimators exist.
  • To see how to circumvent the non-existence issue using multivariate problems.

SLIDE 16

Functional logistic regression

The standard functional extension for x ∈ L²[0, 1] is

\[ P(Y=1 \mid x) = \frac{1}{1 + \exp\{-\beta_0 - \langle \beta, x \rangle_2\}}, \quad \beta_0 \in \mathbb{R},\ \beta \in L^2[0, 1]. \]

The RKHS model would be

\[ P(Y=1 \mid x) = \frac{1}{1 + \exp\{-\beta_0 - \langle \beta, x \rangle_K\}}, \quad \beta_0 \in \mathbb{R},\ \beta \in H(K). \]

SLIDE 17

Relationship with the finite dimensional model (I)

For Σ invertible, writing α ∈ H(Σ) as α = Σa = Σᵢ aᵢ Σᵢ (i = 1, …, d), for Σᵢ the i-th column of Σ, we have

\[ \langle \alpha, x \rangle_K = \big( \Sigma^{-1/2} \Sigma a \big)' \big( \Sigma^{-1/2} x \big) = a' x. \]

Thus, the standard logistic regression model is a particular case of the RKHS one:

\[ P(Y=1 \mid x) = \big( 1 + \exp\{-\alpha_0 - \langle \alpha, x \rangle_K\} \big)^{-1} = \big( 1 + \exp\{-\alpha_0 - a' x\} \big)^{-1}. \]
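The identity ⟨α, x⟩_K = a′x reduces to α′Σ⁻¹x = a′ΣΣ⁻¹x, which is immediate but easy to confirm numerically. A sketch (my own; dimension, seed and vectors are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(2)
d = 5
A = rng.normal(size=(d, d))
Sigma = A @ A.T + np.eye(d)                # invertible covariance matrix
a = rng.normal(size=d)
x = rng.normal(size=d)

alpha = Sigma @ a                          # alpha in H(Sigma)
# <alpha, x>_Sigma = (Sigma^{-1/2} alpha)'(Sigma^{-1/2} x) = alpha' Sigma^{-1} x
inner_K = alpha @ np.linalg.solve(Sigma, x)

print(inner_K, a @ x)                      # both equal a'x
```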

SLIDE 18

Relationship with the finite dimensional model (II)

Whenever the slope function β ∈ H(K) ⊂ L²[0, 1] has the form β(·) = Σⱼ βⱼ K(tⱼ, ·) (j = 1, …, p), the RKHS model reduces to

\[ P(Y=1 \mid x) = \Big( 1 + \exp\Big\{-\beta_0 - \sum_{j=1}^{p} \beta_j x(t_j)\Big\} \Big)^{-1}. \]

SLIDE 19

Conditional Gaussian distributions

Let X₀(s), X₁(s) be Gaussian processes with continuous trajectories, continuous mean functions m₀, m₁ and continuous covariance function K (equal for both classes). Let P₀ and P₁ be the probability measures on C[0, 1] (or L²[0, 1]) induced by the processes X₀, X₁, respectively.

(a) If m₀, m₁ ∈ H(K), then the RKHS model holds with β := m₁ − m₀ and β₀ := (‖m₀‖²_K − ‖m₁‖²_K)/2 − log((1 − p)/p).

(b) If m₁ − m₀ ∈ K(L²) = {K(f) : f ∈ L²[0, 1]}, then the L² model holds.

(c) If m₁ − m₀ ∉ K(L²), the L² model is never recovered, but different situations are possible.


SLIDE 23

Maximum Likelihood estimator

Given samples \((x_i^0, y_i^0)_{i=1}^{n_0}\) and \((x_i^1, y_i^1)_{i=1}^{n_1}\) in X × {0, 1}, the ML function is:

For X = Rd,

\[ L_n(a, a_0) = \frac{1}{n_0}\sum_{i=1}^{n_0} \log\frac{e^{-a_0 - a'x_i^0}}{1 + e^{-a_0 - a'x_i^0}} + \frac{1}{n_1}\sum_{i=1}^{n_1} \log\frac{1}{1 + e^{-a_0 - a'x_i^1}}. \]

For X = L²[0, 1],

\[ L_n(\beta, \beta_0) = \frac{1}{n_0}\sum_{i=1}^{n_0} \log\frac{e^{-\beta_0 - \langle x_i^0, \beta\rangle}}{1 + e^{-\beta_0 - \langle x_i^0, \beta\rangle}} + \frac{1}{n_1}\sum_{i=1}^{n_1} \log\frac{1}{1 + e^{-\beta_0 - \langle x_i^1, \beta\rangle}}. \]

SLIDE 24

Non-existence of the finite MLE

The maximum of Lₙ is not attainable whenever the sample is (quasi-)linearly separable (Albert and Anderson (1984)).
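The separation phenomenon is easy to see in one dimension. A sketch (my own toy sample, not from the slides): with the two classes strictly separated, scaling the slope keeps increasing the log-likelihood towards its supremum 0, so the maximum is never attained.

```python
import numpy as np

def log_lik(a0, a, X0, X1):
    # L_n(a, a0) for the model P(Y=1|x) = 1 / (1 + exp(-a0 - a'x))
    z0, z1 = a0 + X0 @ a, a0 + X1 @ a
    # class-0 term: log( e^{-z}/(1+e^{-z}) ) = -log(1+e^{z}); class-1: -log(1+e^{-z})
    return np.mean(-np.log1p(np.exp(z0))) + np.mean(-np.log1p(np.exp(-z1)))

# a linearly separated 1-d sample: class 0 strictly left of class 1
X0 = np.array([[-2.0], [-1.0], [-0.5]])
X1 = np.array([[0.5], [1.0], [3.0]])

# scaling the separating direction monotonically improves the likelihood
for c in [1.0, 10.0, 100.0]:
    print(c, log_lik(0.0, np.array([c]), X0, X1))   # increasing, approaching 0
```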

SLIDE 25

Sign-choice property

Z(t) = (X₁(t), …, Xₙ(t)), t ∈ [0, 1], satisfies the Sign Choice (SC) property when, for all (s₁, …, sₙ) ∈ {+, −}ⁿ, with probability one there exists t₀ ∈ [0, 1] such that sign(Xᵢ(t₀)) = sᵢ for i = 1, …, n.

SLIDE 26

Non-existence of the functional MLE

Let X(s), s ∈ [0, 1], be an L² stochastic process with E[X(s)] = 0 and let X₁, …, Xₙ be independent copies of X. If Zₙ(s) = (X₁(s), …, Xₙ(s)) fulfills the SC property, then with probability one the MLE does not exist, for any sample size.

The n-dimensional Brownian motion fulfills the SC property (via Blumenthal's 0–1 law).

SLIDE 27

Asymptotic non-existence

Let (xᵢ, yᵢ) be an independent sample satisfying the RKHS model, with X Gaussian, K continuous and Σ_T invertible for any finite set T. Then

\[ \lim_{n \to \infty} P(\text{MLE exists}) = 0. \]

The estimation of β in practice: taking a finite number of variables X(t₁), …, X(t_p) and using Firth's estimator.


SLIDE 33

1. Introduction to RKHS's
2. Logistic regression
3. Mahalanobis distance
4. Binary classification

SLIDE 34

Finite-dimensional distance

Let Σ be the (non-singular) covariance matrix of the d-dimensional random vector X. The squared Mahalanobis distance between x ∈ Rd and m ∈ Rd is

\[ M^2(x, m) = (x - m)' \Sigma^{-1} (x - m). \]
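The finite-dimensional distance is one line of linear algebra. A sketch (my own; matrix and points are arbitrary), using a linear solve rather than an explicit inverse:

```python
import numpy as np

rng = np.random.default_rng(3)
d = 3
A = rng.normal(size=(d, d))
Sigma = A @ A.T + 0.5 * np.eye(d)               # non-singular covariance matrix
x = rng.normal(size=d)
m = rng.normal(size=d)

diff = x - m
M2 = diff @ np.linalg.solve(Sigma, diff)        # (x - m)' Sigma^{-1} (x - m)

print(M2)                                       # squared Mahalanobis distance, > 0
```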

SLIDE 35

Direct functional extension

Its most popular applications are:

  • Supervised classification
  • Outlier detection
  • Multivariate depth measures, with depth function (1 + M(x, m))⁻¹
  • Hypothesis testing, ...

The analogue of the covariance matrix is the operator Kf(t) = ∫₀¹ K(t, s) f(s) ds. The counterpart of Σ⁻¹ would be K⁻¹, but K is not invertible (there is no linear continuous K⁻¹ s.t. K⁻¹K = KK⁻¹ = Id).

SLIDE 36

Spectral decomposition

Given λᵢ, eᵢ the eigenvalues and eigenvectors of Σ,

\[ M^2(x, m) = \sum_{i=1}^{d} \frac{\big( (x - m)' e_i \big)^2}{\lambda_i}. \]

Then the naive functional extension would be

\[ M^2(x, m) = \sum_{i=1}^{\infty} \frac{\langle x - m, e_i \rangle_2^2}{\lambda_i}, \]

for λᵢ, eᵢ the eigenvalues and eigenfunctions of K. Problem: when x is the trajectory of a Gaussian process, this series diverges with probability 1. Existing proposals: Galeano et al. (2015), Ghiglietti et al. (2017).
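The divergence is visible numerically: for a Gaussian trajectory each term ⟨x, eᵢ⟩²₂/λᵢ behaves like an independent χ²₁ variable, so the partial sums grow roughly linearly in the number of terms instead of converging. A sketch (my own; Brownian motion with its closed-form eigenpairs, simulation sizes arbitrary):

```python
import numpy as np

rng = np.random.default_rng(4)
n_traj, n_grid, J = 200, 10000, 50
h = 1.0 / n_grid
t = np.arange(1, n_grid + 1) * h
X = np.cumsum(rng.normal(0.0, np.sqrt(h), size=(n_traj, n_grid)), axis=1)  # Brownian paths

# eigenpairs of K(s,t) = min(s,t)
omega = (np.arange(1, J + 1) - 0.5) * np.pi
lam = omega ** -2.0
E = np.sqrt(2.0) * np.sin(np.outer(t, omega))     # eigenfunctions on the grid

coef = (X @ E) * h                                # <x, e_j>_2 by Riemann sums
terms = coef ** 2 / lam                           # each term is approximately chi^2_1
partial = np.cumsum(terms, axis=1)

# averaged over trajectories, the partial sums grow like J: no convergence
print(partial[:, 9].mean(), partial[:, 49].mean())
```

The two averages come out near 10 and 50, i.e. the series keeps accumulating mass at every frequency.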


SLIDE 38

Goals of this part

  • Highlight the relationship between the Mahalanobis distance and the RKHS norm.
  • Understand the problem with the functional extension and how RKHS's help to circumvent it.
  • Show the common properties between the multivariate and functional metrics.

SLIDE 39

Finite dimensional RKHS

If T = {1, 2, …, d} then K : T × T → R is summarized by the covariance matrix Σᵢⱼ = K(i, j), and then H(Σ) = {x = Σ^{1/2}a, a ∈ Rd}. In this case, for x ∈ H(Σ),

\[ \|x\|_\Sigma^2 = (\Sigma^{-1/2} x)' (\Sigma^{-1/2} x) = x' \Sigma^{-1} x = M^2(x, 0). \]

That is, the squared norm of x in H(Σ) coincides with the squared Mahalanobis distance.
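The identity can be confirmed by forming Σ^{-1/2} explicitly from the spectral decomposition. A sketch (my own; matrix and vector are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(7)
d = 4
A = rng.normal(size=(d, d))
Sigma = A @ A.T + np.eye(d)
x = rng.normal(size=d)

# Sigma^{-1/2} via the spectral decomposition of the symmetric matrix Sigma
w, V = np.linalg.eigh(Sigma)
S_inv_half = V @ np.diag(w ** -0.5) @ V.T

z = S_inv_half @ x                               # "whitened" vector Sigma^{-1/2} x
print(z @ z, x @ np.linalg.solve(Sigma, x))      # both equal ||x||_Sigma^2 = M^2(x, 0)
```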

SLIDE 40

Trouble with the extension to functional data

Then, one would like to define M(x, m) = ‖x − m‖_K. But this is not possible since, when x is the trajectory of a Gaussian process, x ∉ H(K) with probability 1 (Lukić and Beder (2001)).

SLIDE 41

Idea and definition

It seems natural to define the Mahalanobis distance replacing the trajectory x(t) by the "closest" function in H(K). But since H(K) is not closed in L²[0, 1], we need to penalize:

\[ x_\alpha = \arg\min_{f \in H(K)} \big\{ \|x - f\|_2^2 + \alpha \|f\|_K^2 \big\}. \]

Then we define M_α(x, m) = ‖x_α − m_α‖_K.

SLIDE 42

Explicit expression

The minimization problem for x_α has an explicit solution, x_α = (K + αI)⁻¹Kx. Then the distance can be rewritten as

\[ \|x_\alpha - m_\alpha\|_K^2 = \sum_{j=1}^{\infty} \frac{\lambda_j}{(\lambda_j + \alpha)^2} \, \langle x - m, e_j \rangle_2^2, \]

which is a true metric.
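The spectral formula makes M_α straightforward to compute on a grid: discretize the covariance operator as the kernel matrix times the grid spacing, take its eigendecomposition, and sum the damped squared scores. A sketch (my own; Brownian trajectory, α and grid size arbitrary):

```python
import numpy as np

rng = np.random.default_rng(5)
n = 500
h = 1.0 / n
t = np.arange(1, n + 1) * h
x = np.cumsum(rng.normal(0.0, np.sqrt(h), n))     # observed trajectory (Brownian path)
m = np.zeros(n)                                   # mean function of the process
alpha = 0.01

# discretized covariance operator: eigenpairs of the kernel matrix scaled by h
Kmat = np.minimum.outer(t, t)
lam, U = np.linalg.eigh(Kmat * h)
lam = np.clip(lam, 0.0, None)

scores = (U.T @ (x - m)) * np.sqrt(h)             # <x - m, e_j>_2 on the grid
M2_alpha = np.sum(lam / (lam + alpha) ** 2 * scores ** 2)

print(M2_alpha)                                   # finite, unlike the naive series
```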

SLIDE 43

Shared properties with the multivariate case

The Mahalanobis distance is invariant under invertible linear maps: (Ax)′(AΣA′)⁻¹(Ax) = x′Σ⁻¹x. The functional distance M_α is invariant under isometries of L²[0, 1].

Under Gaussianity, the squared Mahalanobis distance in Rd distributes as Σᵢ Yᵢ (i = 1, …, d), for Yᵢ independent χ²₁ variables. The functional Mahalanobis distance for Gaussian processes distributes as

\[ M_\alpha(x, m)^2 \sim \sum_{i=1}^{\infty} \frac{\lambda_i^2}{(\lambda_i + \alpha)^2} \, Y_i. \]


SLIDE 45

1. Introduction to RKHS's
2. Logistic regression
3. Mahalanobis distance
4. Binary classification

SLIDE 46

Statement of the problem

Training sample: (xᵢ, yᵢ), i = 1, …, n, with xᵢ ∈ X and yᵢ ∈ {0, 1}. The aim is to predict the value of y corresponding to a new individual for which only x is observed.

[Figures: "Multivariate" and "Functional" sample illustrations]

SLIDE 47

From functional to multivariate

We can transform functional classification problems into multivariate ones using a finite number of marginal variables → variable selection.

SLIDE 48

Goals of this part

  • To look at both problems from a general framework, to highlight the similarities and differences between them.
  • To give optimal rules for both problems.
  • To provide insight into the near perfect classification phenomenon pointed out by Delaigle and Hall (2012), leveraging the RKHS's.

SLIDE 49

Hájek–Feldman dichotomy

Let P₀ and P₁ be two probability distributions on (Ω, F). They are equivalent (P₀ ∼ P₁) if, ∀A ∈ F, P₀(A) = 0 ↔ P₁(A) = 0. In this case there exists the Radon–Nikodym density dP₁/dP₀(x).

They are mutually singular (P₀ ⊥ P₁) if there exists A ∈ F such that P₀(A) = 0 and P₁(A) = 1.

Hájek–Feldman dichotomy: If P₀ and P₁ are Gaussian, either P₀ ∼ P₁ or P₀ ⊥ P₁.

SLIDE 50

Bayes classifiers

A classifier is a measurable function g : X → {0, 1}, giving the class assigned to x.

Bayes classifier: g*(x) = I_{η(x) > 0.5}, for η(x) = P(Y = 1 | X = x) = E[Y | X = x].

The Bayes error L* is the classification error of the Bayes rule. The Bayes classifier is optimal, in the sense that

\[ P\big( g^*(X) \neq Y \big) \le P\big( g(X) \neq Y \big) \quad (\equiv L^* \le L) \]

for any other classifier g.

SLIDE 51

Optimal rule

When the samples are in Rd, density functions f₀, f₁ are defined and the Bayes classifier can be rewritten as

\[ g^*(x) = \mathbb{I}_{\{ f_1(x)/f_0(x) > (1-p)/p \}}, \]

where p = P(Y=1). But we lack (Lebesgue) density functions in functional spaces.

Theorem 1 of Baíllo et al. (2011): If P₀ ∼ P₁,

\[ g^*(x) = \mathbb{I}_{\{ \frac{dP_1}{dP_0}(x) > (1-p)/p \}}. \]

SLIDE 52

Particular example

Multivariate (Rd): P₀ : X ∼ N_d(0, Σ); P₁ : X ∼ N_d(m, Σ), for m ≠ 0 ∈ Rd.

Functional: P₀ : X(t) = Z(t); P₁ : X(t) = m(t) + Z(t), for t ∈ [0, 1], Z a zero-mean Gaussian process with covariance function K.

SLIDE 53

Bayes rule in this case

In this case P₀ ∼ P₁, which is equivalent to m ∈ H(K) (or H(Σ)), and

\[ \frac{dP_1}{dP_0}(x) = \exp\Big\{ \langle m, x \rangle_K - \frac{1}{2}\|m\|_K^2 \Big\}. \]

In the multivariate case g*(x) = 1 if

\[ m' \Sigma^{-1} x - \frac{1}{2}\, m' \Sigma^{-1} m > \log\frac{1-p}{p}, \]

which is the Fisher classifier (and Fisher = Mahalanobis).
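In the multivariate case the log of the density ratio really is the linear Fisher statistic. A sketch (my own; mean, covariance and test point are arbitrary) comparing the log-ratio computed from the two Gaussian log-densities with the closed form m′Σ⁻¹x − ½ m′Σ⁻¹m:

```python
import numpy as np

rng = np.random.default_rng(6)
d = 3
m = np.array([1.0, -0.5, 0.2])
A = rng.normal(size=(d, d))
S = A @ A.T + np.eye(d)
Sinv = np.linalg.inv(S)
x = rng.normal(size=d)

def log_npdf(x, mu):
    # log-density of N(mu, S), numpy only
    diff = x - mu
    _, logdet = np.linalg.slogdet(S)
    return -0.5 * (d * np.log(2 * np.pi) + logdet + diff @ Sinv @ diff)

# log of dP1/dP0 computed directly from the two Gaussian densities ...
lr_direct = log_npdf(x, m) - log_npdf(x, np.zeros(d))
# ... equals the linear statistic appearing in the Fisher rule
lr_linear = m @ Sinv @ x - 0.5 * m @ Sinv @ m

print(lr_direct, lr_linear)      # the two agree; comparing to log((1-p)/p) gives g*
```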

SLIDE 54

But what if P₀ ⊥ P₁?

P₀ ⊥ P₁ is equivalent to m ∉ H(K) (or m ∉ H(Σ)).

Multivariate (Rd): only extreme cases. Functional: near perfect classification.

SLIDE 55

Near perfect classification

We can get a classification error as small as desired, for instance by approximating m by a sequence of functions in H(K).

SLIDE 56

Bibliography I

Albert, A. and Anderson, J. A. (1984). On the existence of maximum likelihood estimates in logistic regression models. Biometrika, 71(1):1–10.

Baíllo, A., Cuevas, A., and Cuesta-Albertos, J. A. (2011). Supervised classification for a family of Gaussian functional models. Scandinavian Journal of Statistics, 38(3):480–498.

Galeano, P., Joseph, E., and Lillo, R. E. (2015). The Mahalanobis distance for functional data with applications to classification. Technometrics, 57(2):281–291.

Ghiglietti, A., Ieva, F., and Paganoni, A. M. (2017). Statistical inference for stochastic processes: two-sample hypothesis tests. Journal of Statistical Planning and Inference, 180:49–68.

Lukić, M. N. and Beder, J. H. (2001). Stochastic processes with sample paths in reproducing kernel Hilbert spaces. Transactions of the American Mathematical Society, 353(10):3945–3969.

Parzen, E. (1962). Extraction and detection problems and reproducing kernel Hilbert spaces. Journal of the Society for Industrial and Applied Mathematics, Series A: Control, 1:35–62.

Peszat, S. and Zabczyk, J. (2007). Stochastic Partial Differential Equations with Lévy Noise: An Evolution Equation Approach, volume 113. Cambridge University Press.

SLIDE 57

Thank you for your attention

beatriz.bueno.larraz@gmail.com