Supervised Principal Component Regression for Functional Data with - - PowerPoint PPT Presentation

supervised principal component regression for functional
SMART_READER_LITE
LIVE PREVIEW

Supervised Principal Component Regression for Functional Data with - - PowerPoint PPT Presentation

Supervised Principal Component Regression for Functional Data with High Dimensional Predictors Xinyi(Cindy) Zhang University of Toronto xyi.zhang@mail.utoronto.ca July 10, 2018 Xinyi(Cindy) Zhang (University of Toronto) SPCR July 10, 2018 1


slide-1
SLIDE 1

Supervised Principal Component Regression for Functional Data with High Dimensional Predictors

Xinyi(Cindy) Zhang

University of Toronto xyi.zhang@mail.utoronto.ca

July 10, 2018

Xinyi(Cindy) Zhang (University of Toronto) SPCR July 10, 2018 1 / 32

slide-2
SLIDE 2

Joint work with

Xinyi(Cindy) Zhang (University of Toronto) SPCR July 10, 2018 2 / 32

slide-3
SLIDE 3

Overview

1

Motivation

2

Methodology SPCR

3

Theoretical Properties Equivalence Estimation Convergence

4

Numerical Studies Simulation Real Data Application

Xinyi(Cindy) Zhang (University of Toronto) SPCR July 10, 2018 3 / 32

slide-4
SLIDE 4

Motivation

Functional magnetic resonance imaging (fMRI) is a noninvasive technique for studying brain activity.

Image courtesy of the Rebecca Saxe laboratory, MIT news, http://news.mit.edu/2011/brain-language-0301 Xinyi(Cindy) Zhang (University of Toronto) SPCR July 10, 2018 4 / 32

slide-5
SLIDE 5

Motivation

fMRI dataset of each subject contains a time series of 3-D images.

(a) (b)

Xinyi(Cindy) Zhang (University of Toronto) SPCR July 10, 2018 5 / 32

slide-6
SLIDE 6

Motivation

Xinyi(Cindy) Zhang (University of Toronto) SPCR July 10, 2018 6 / 32

slide-7
SLIDE 7

Motivation

Collection of a large dimensional set of clinical/demographic variables.

Xinyi(Cindy) Zhang (University of Toronto) SPCR July 10, 2018 6 / 32

slide-8
SLIDE 8

Motivation

Collection of a large dimensional set of clinical/demographic variables. Association hasn’t been well understood.

Xinyi(Cindy) Zhang (University of Toronto) SPCR July 10, 2018 6 / 32

slide-9
SLIDE 9

Related Methodology

PCA

Principal component analysis (PCA) can be applied to extract a lower-dimensional subspace that captures the most of variation in the covariates.

Xinyi(Cindy) Zhang (University of Toronto) SPCR July 10, 2018 7 / 32

slide-10
SLIDE 10

Related Methodology

Potential problems

But PCA fails to capture any information when the principal subspace extracted from the covariates is orthogonal to the vectors of regression parameters.

Supervised Principal Component Regression

Xinyi(Cindy) Zhang (University of Toronto) SPCR July 10, 2018 8 / 32

slide-11
SLIDE 11

Methodology

Some notations

Xinyi(Cindy) Zhang (University of Toronto) SPCR July 10, 2018 9 / 32

slide-12
SLIDE 12

Methodology

Some notations

Covariance matrix Σx = E(XXT); cross-covarinace matrix Σxy =

  • T E{XY(t)}[E{XY(t)}]Tdt, where T is a compact support.

Xinyi(Cindy) Zhang (University of Toronto) SPCR July 10, 2018 9 / 32

slide-13
SLIDE 13

Methodology

Some notations

Covariance matrix Σx = E(XXT); cross-covarinace matrix Σxy =

  • T E{XY(t)}[E{XY(t)}]Tdt, where T is a compact support.

{(Xi, Yi(t)), i = 1, . . . , n} iid ∼ {X, Y(t)}.

Xinyi(Cindy) Zhang (University of Toronto) SPCR July 10, 2018 9 / 32

slide-14
SLIDE 14

Methodology

Some notations

Covariance matrix Σx = E(XXT); cross-covarinace matrix Σxy =

  • T E{XY(t)}[E{XY(t)}]Tdt, where T is a compact support.

{(Xi, Yi(t)), i = 1, . . . , n} iid ∼ {X, Y(t)}. Empirical estimation Σx = n−1XTX, where X = (X1, . . . , Xn)T ∈ Rn×p.

Xinyi(Cindy) Zhang (University of Toronto) SPCR July 10, 2018 9 / 32

slide-15
SLIDE 15

Methodology

Some notations

Covariance matrix Σx = E(XXT); cross-covarinace matrix Σxy =

  • T E{XY(t)}[E{XY(t)}]Tdt, where T is a compact support.

{(Xi, Yi(t)), i = 1, . . . , n} iid ∼ {X, Y(t)}. Empirical estimation Σx = n−1XTX, where X = (X1, . . . , Xn)T ∈ Rn×p. Empirical estimation Σxy = n−2

T XTY(t)Y(t)TXdt, where

Y(t) = (Y1(t), . . . , Yn(t))T ∈ Rn.

Xinyi(Cindy) Zhang (University of Toronto) SPCR July 10, 2018 9 / 32

slide-16
SLIDE 16

Methodology

Start with p < n. Regressing Y(t) on the projection XTw1, the optimal regression function γ∗(t) is the minimizer of the expected integrated residual sum of squares defined as IRSS =

  • T

E

  • {Y(t) − XTw1γ(t)}T{Y(t) − XTw1γ(t)}
  • dt.

Xinyi(Cindy) Zhang (University of Toronto) SPCR July 10, 2018 10 / 32

slide-17
SLIDE 17

Methodology

= ⇒

γ∗(t) = {wT

1 E(XXT)w1}−1wT 1 E{XY(t)}.

Plugging in γ∗(t) into IRSS yields

IRSS(γ∗) =

  • T

(E{YT(t)Y(t)} − [E{XY(t)}]Tw1{wT

1 E(XXT)w1}−1wT 1 [E{XY(t)}])dt.

Xinyi(Cindy) Zhang (University of Toronto) SPCR July 10, 2018 11 / 32

slide-18
SLIDE 18

Methodology

Among all the possible directions of w1, the one minimizing IRSS(γ∗) satisfies

Xinyi(Cindy) Zhang (University of Toronto) SPCR July 10, 2018 12 / 32

slide-19
SLIDE 19

Methodology

Among all the possible directions of w1, the one minimizing IRSS(γ∗) satisfies

Proposition

If w10 is a minimizer of IRSS(γ∗), then w10 satisfies w10 = arg max

w1

wT

1 Σxyw1

wT

1 Σxw1

. (2.1)

Xinyi(Cindy) Zhang (University of Toronto) SPCR July 10, 2018 12 / 32

slide-20
SLIDE 20

Methodology

Xinyi(Cindy) Zhang (University of Toronto) SPCR July 10, 2018 13 / 32

slide-21
SLIDE 21

Methodology

For c = 0, cw10 is also a maximizer of equation (2.1).

Xinyi(Cindy) Zhang (University of Toronto) SPCR July 10, 2018 13 / 32

slide-22
SLIDE 22

Methodology

For c = 0, cw10 is also a maximizer of equation (2.1). Another constraint wT

1 Σxw1 =1 to adjust the effect of potential

different scales in the predictor space.

Xinyi(Cindy) Zhang (University of Toronto) SPCR July 10, 2018 13 / 32

slide-23
SLIDE 23

Methodology

A sequence of generalized Rayleigh quotient problems (NP hard): w∗

k

= arg maxwk wk

TΣxy wk s.t.

wk

TΣx wk = 1, wk TΣx w∗ j

= 0, where 1 ≤ j < k. Define W∗ = (w∗

1 , . . . , w∗ K )

Equivalent optimization problem: W∗ = max

W∈Rp×K tr(WTΣxy W) s.t.

WTΣx W = IK Convex simultaneous regression problem: Σxy = UUT + Σǫ = K

i=1 λi vi vT i

+ Σǫ, V∗ = argmin

V 1 2 U − Σx V2 F

The Rayleigh quotient problems − → a convex simultaneous regression problem which recovers the same principal space, i.e. V∗V∗T = W∗W∗T under some mild conditions.

Xinyi(Cindy) Zhang (University of Toronto) SPCR July 10, 2018 14 / 32

slide-24
SLIDE 24

Methodology

In reality, the covariance matrices Σx and Σxy are unknown, and the

  • ptimization problem we’re actually solving is
  • V = argmin

V

1 2 U − ΣxV2

F,

and U satisfies Σxy = U UT + Σǫ = B + Σǫ.

Xinyi(Cindy) Zhang (University of Toronto) SPCR July 10, 2018 15 / 32

slide-25
SLIDE 25

Optimization Problem in High Dimensions

When p is relatively large compared with n, or p > n, by adding an ℓ1 penalty to our reformulated problem, one can easily estimate V

  • V = argmin

V

1 2 U − ΣxV2

F + λV1,1

  • ,

where · 1,1 denotes (A·11, A·21, · · · , A·m1)1, for a matrix A ∈ Rn×m.

Xinyi(Cindy) Zhang (University of Toronto) SPCR July 10, 2018 16 / 32

slide-26
SLIDE 26

Algorithm and Tuning Parameter Selection

Xinyi(Cindy) Zhang (University of Toronto) SPCR July 10, 2018 17 / 32

slide-27
SLIDE 27

Algorithm and Tuning Parameter Selection

LASSO.

Xinyi(Cindy) Zhang (University of Toronto) SPCR July 10, 2018 17 / 32

slide-28
SLIDE 28

Algorithm and Tuning Parameter Selection

LASSO. Extended BIC (Chen and Chen, 2008) to select λK for fixed K.

Xinyi(Cindy) Zhang (University of Toronto) SPCR July 10, 2018 17 / 32

slide-29
SLIDE 29

Algorithm and Tuning Parameter Selection

LASSO. Extended BIC (Chen and Chen, 2008) to select λK for fixed K. 5-fold CV to select K.

Xinyi(Cindy) Zhang (University of Toronto) SPCR July 10, 2018 17 / 32

slide-30
SLIDE 30

Theoretical Properties

To make the signal and residual separable with respect to Σxy, we need the separability condition: λmin(Σ−1/2

xy

(UUT)Σ−1/2

xy

) > λmax(Σ−1/2

xy

ΣǫΣ−1/2

xy

).

Theorem (Equivalence)

When p < n, V = span(V∗) can recover W = span(W∗) exactly, that is V = W or equivalently V∗V∗T = W∗W∗T if the separability condition holds.

Xinyi(Cindy) Zhang (University of Toronto) SPCR July 10, 2018 18 / 32

slide-31
SLIDE 31

Theoretical Properties

Theorem (Estimation Error)

Under proper conditions, with probability going to 1, V converges to V∗.

Xinyi(Cindy) Zhang (University of Toronto) SPCR July 10, 2018 19 / 32

slide-32
SLIDE 32

Numerical Results

Simulation I

Y(t) = Xβ(t) + ǫ(t).

Xinyi(Cindy) Zhang (University of Toronto) SPCR July 10, 2018 20 / 32

slide-33
SLIDE 33

Numerical Results

Simulation I

Xi

iid

∼ Np(0, Σ), where Σjj′ = 0.5|j−j′| for 1 ≤ j, j′ ≤ p.

Y(t) = Xβ(t) + ǫ(t).

Xinyi(Cindy) Zhang (University of Toronto) SPCR July 10, 2018 20 / 32

slide-34
SLIDE 34

Numerical Results

Simulation I

Xi

iid

∼ Np(0, Σ), where Σjj′ = 0.5|j−j′| for 1 ≤ j, j′ ≤ p. Compact support T = [0, 1].

Y(t) = Xβ(t) + ǫ(t).

Xinyi(Cindy) Zhang (University of Toronto) SPCR July 10, 2018 20 / 32

slide-35
SLIDE 35

Numerical Results

Simulation I

Xi

iid

∼ Np(0, Σ), where Σjj′ = 0.5|j−j′| for 1 ≤ j, j′ ≤ p. Compact support T = [0, 1]. ǫi(t)

iid

∼ a gaussian process with mean 0 and covariance function K(s, t) = exp{−3(s − t)2} for 0 ≤ s, t ≤ 1.

Y(t) = Xβ(t) + ǫ(t).

Xinyi(Cindy) Zhang (University of Toronto) SPCR July 10, 2018 20 / 32

slide-36
SLIDE 36

Numerical Results

Simulation I

Xi

iid

∼ Np(0, Σ), where Σjj′ = 0.5|j−j′| for 1 ≤ j, j′ ≤ p. Compact support T = [0, 1]. ǫi(t)

iid

∼ a gaussian process with mean 0 and covariance function K(s, t) = exp{−3(s − t)2} for 0 ≤ s, t ≤ 1. β(t) = V∗γ(t), where V∗ = (V1, V2, V3)T ∈ Rp×3; γ(t) = (γ1(t), γ2(t), γ3(t))T.

V1 = (1, 1, 0, 0, · · · , 0, 0, 0, 0)T V2 = (0, 0, 1, 1, 0, 0, · · · , 0, 0)T V3 = (0, 0, 0, 0, · · · , 0, 0, 1, 1)T γ1(t) = 5cos(πt) γ2(t) = 7cos(2πt) γ3(t) = 9cos(3πt) + 5sin2(3πt)

Y(t) = Xβ(t) + ǫ(t).

Xinyi(Cindy) Zhang (University of Toronto) SPCR July 10, 2018 20 / 32

slide-37
SLIDE 37

Numerical Results

Simulation I

Xinyi(Cindy) Zhang (University of Toronto) SPCR July 10, 2018 21 / 32

slide-38
SLIDE 38

Numerical Results

Simulation I

Independent testing dataset (Y∗

i (t), X∗ i ) with size n∗ = 5000.

Xinyi(Cindy) Zhang (University of Toronto) SPCR July 10, 2018 21 / 32

slide-39
SLIDE 39

Numerical Results

Simulation I

Independent testing dataset (Y∗

i (t), X∗ i ) with size n∗ = 5000.

Optimal direction V; least square estimates β

V(t) obtained from

regressing Y(t) on X V.

Xinyi(Cindy) Zhang (University of Toronto) SPCR July 10, 2018 21 / 32

slide-40
SLIDE 40

Numerical Results

Simulation I

(n, p) Method Prediction Error

  • Π − Π2

F

  • K

(50, 200) SPCR 12.2496 (0.2776) 0.0657 (0.0039) 5.1000 (0.0915) UPCR 445.0968 (3.8092) NA NA (100, 200) SPCR 8.5161 (0.1474) 0.0319 (0.0015) 3.9400 (0.0632) UPCR 414.3093 (5.6964) NA NA (500, 200) SPCR 6.0579 (0.0569) 0.0059 (0.0003) 3.0600 (0.0238) UPCR 367.6252 (5.8318) NA NA (100, 1000) SPCR 10.5444 (0.1598) 0.0365 (0.0011) 5.9800 (0.0425) UPCR 481.8148 (0.9669) NA NA (200, 1000) SPCR 7.0769 (0.1084) 0.0135 (0.0006) 4.5100 (0.0771) UPCR 427.6922 (0.7265) NA NA

Xinyi(Cindy) Zhang (University of Toronto) SPCR July 10, 2018 22 / 32

slide-41
SLIDE 41

Numerical Results

Simulation II

In this simulation, we consider V∗ as non-sparse and not residing in the low dimensional space.

Xinyi(Cindy) Zhang (University of Toronto) SPCR July 10, 2018 23 / 32

slide-42
SLIDE 42

Numerical Results

Simulation II

In this simulation, we consider V∗ as non-sparse and not residing in the low dimensional space. Set βj(t) = 5cos{(j+9)

100 πt} for 1 ≤ j ≤ p.

Xinyi(Cindy) Zhang (University of Toronto) SPCR July 10, 2018 23 / 32

slide-43
SLIDE 43

Numerical Results

Simulation II

In this simulation, we consider V∗ as non-sparse and not residing in the low dimensional space. Set βj(t) = 5cos{(j+9)

100 πt} for 1 ≤ j ≤ p.

Other settings same as simulation I.

Xinyi(Cindy) Zhang (University of Toronto) SPCR July 10, 2018 23 / 32

slide-44
SLIDE 44

Numerical Results

Simulation II

(n, p) Method Prediction Error (50, 200) SPCR 315.9588 (3.1784) UPCR 408.2328 (4.0381) (100, 200) SPCR 146.7840 (1.5825) UPCR 349.1845 (5.1075) (100, 1000) SPCR 463.9981 (1.3050) UPCR 492.6754 (4.9588) (200, 1000) SPCR 337.8412 (1.2150) UPCR 380.4472 (2.7625) (500, 200) SPCR 11.8829 (0.0876) UPCR 330.2885 (5.8221)

Xinyi(Cindy) Zhang (University of Toronto) SPCR July 10, 2018 24 / 32

slide-45
SLIDE 45

Numerical Results

Human Connectome Project fMRI data Application

Application to the cortical surface motor task related fMRI data from HCP Dataset:

Xinyi(Cindy) Zhang (University of Toronto) SPCR July 10, 2018 25 / 32

slide-46
SLIDE 46

Numerical Results

Human Connectome Project fMRI data Application

Application to the cortical surface motor task related fMRI data from HCP Dataset:

  • Behavioral and 3T MR imaging data.

Xinyi(Cindy) Zhang (University of Toronto) SPCR July 10, 2018 25 / 32

slide-47
SLIDE 47

Numerical Results

Human Connectome Project fMRI data Application

Application to the cortical surface motor task related fMRI data from HCP Dataset:

  • Behavioral and 3T MR imaging data.
  • 805 subjects having the cortical surface motor task related fMRI

data.

Xinyi(Cindy) Zhang (University of Toronto) SPCR July 10, 2018 25 / 32

slide-48
SLIDE 48

Numerical Results

Human Connectome Project fMRI data Application

Application to the cortical surface motor task related fMRI data from HCP Dataset:

  • Behavioral and 3T MR imaging data.
  • 805 subjects having the cortical surface motor task related fMRI

data.

  • fMRI data recorded on 284 equally space time points.

Xinyi(Cindy) Zhang (University of Toronto) SPCR July 10, 2018 25 / 32

slide-49
SLIDE 49

Numerical Results

Human Connectome Project fMRI data Application

Xinyi(Cindy) Zhang (University of Toronto) SPCR July 10, 2018 26 / 32

slide-50
SLIDE 50

Numerical Results

Human Connectome Project fMRI data Application

Data:

Xinyi(Cindy) Zhang (University of Toronto) SPCR July 10, 2018 27 / 32

slide-51
SLIDE 51

Numerical Results

Human Connectome Project fMRI data Application

Data: Average the blood oxygenation level dependent (BOLD) signals of all pixels in the left caudal anterior cingulate region − → Yi(t).

Xinyi(Cindy) Zhang (University of Toronto) SPCR July 10, 2018 27 / 32

slide-52
SLIDE 52

Numerical Results

Human Connectome Project fMRI data Application

Data: Average the blood oxygenation level dependent (BOLD) signals of all pixels in the left caudal anterior cingulate region − → Yi(t). Covariates X ∈ R280 belongs to {demographic, language, emotion, motor and free surfer brain summary statistics}.

Xinyi(Cindy) Zhang (University of Toronto) SPCR July 10, 2018 27 / 32

slide-53
SLIDE 53

Numerical Results

Human Connectome Project fMRI data Application

Xinyi(Cindy) Zhang (University of Toronto) SPCR July 10, 2018 28 / 32

slide-54
SLIDE 54

Numerical Results

Human Connectome Project fMRI data Application

Comparing with UPCR:

Xinyi(Cindy) Zhang (University of Toronto) SPCR July 10, 2018 29 / 32

slide-55
SLIDE 55

Numerical Results

Human Connectome Project fMRI data Application

Comparing with UPCR: Randomly pick ⌊2n/3⌋ subjects as training data and the remaining as test data.

Xinyi(Cindy) Zhang (University of Toronto) SPCR July 10, 2018 29 / 32

slide-56
SLIDE 56

Numerical Results

Human Connectome Project fMRI data Application

Comparing with UPCR: Randomly pick ⌊2n/3⌋ subjects as training data and the remaining as test data. Repeat 100 times.

Xinyi(Cindy) Zhang (University of Toronto) SPCR July 10, 2018 29 / 32

slide-57
SLIDE 57

Numerical Results

Human Connectome Project fMRI data Application

Comparing with UPCR: Randomly pick ⌊2n/3⌋ subjects as training data and the remaining as test data. Repeat 100 times. Report the average prediction error of 100 repetitions.

SPCR : 239.04(1.10) UPCR : 243.48(1.03)

Xinyi(Cindy) Zhang (University of Toronto) SPCR July 10, 2018 29 / 32

slide-58
SLIDE 58

Numerical Results

Human Connectome Project fMRI data Application

Comparing with UPCR: Randomly pick ⌊2n/3⌋ subjects as training data and the remaining as test data. Repeat 100 times. Report the average prediction error of 100 repetitions.

SPCR : 239.04(1.10) UPCR : 243.48(1.03)

Xinyi(Cindy) Zhang (University of Toronto) SPCR July 10, 2018 29 / 32

slide-59
SLIDE 59

Extensions

Xinyi(Cindy) Zhang (University of Toronto) SPCR July 10, 2018 30 / 32

slide-60
SLIDE 60

Extensions

Statistical inference: testing & confidence regions.

Xinyi(Cindy) Zhang (University of Toronto) SPCR July 10, 2018 30 / 32

slide-61
SLIDE 61

Extensions

Statistical inference: testing & confidence regions. Multiple functional responses.

Xinyi(Cindy) Zhang (University of Toronto) SPCR July 10, 2018 30 / 32

slide-62
SLIDE 62

Extensions

Statistical inference: testing & confidence regions. Multiple functional responses. Nonlinear relationship.

Xinyi(Cindy) Zhang (University of Toronto) SPCR July 10, 2018 30 / 32

slide-63
SLIDE 63

Xinyi(Cindy) Zhang (University of Toronto) SPCR July 10, 2018 31 / 32

slide-64
SLIDE 64

Thank you!

Xinyi(Cindy) Zhang (University of Toronto) SPCR July 10, 2018 32 / 32