slide-1
SLIDE 1

Functional regression analysis using R

Christian Ritz

Statistics Group Faculty of Life Sciences (LIFE) University of Copenhagen, Denmark

Dortmund, August 13 2008

Christian Ritz (Statistics at LIFE) 1 / 15

slide-2
SLIDE 2

Examples

What are functional data?

◮ Activity and disease patterns (e.g. monitoring birds, children or insects over time)
◮ Animal and human growth curves (e.g. weight gain in pigs and dietary studies)
◮ Fluorescence curves (e.g. photosynthesis processes over time; Ritz and Streibig, 2008)
◮ Reproduction histories (e.g. longevity of medflies; Chiou et al., 2003)

slide-4
SLIDE 4

More about fluorescence curves

Experiment:

◮ dark-adapted leaves exposed to light

(only the first seconds of this process are recorded!)

Functional response:

◮ proportion of light not used in the photosynthesis

High throughput measurements:

◮ fast and non-invasive
◮ informative long before visual effects

Curve trajectory changes with species and stress level

slide-5
SLIDE 5

Observed fluorescence curves

[Figure: observed fluorescence curves, three replicates]

slide-6
SLIDE 6

More about functional data

Common features:

◮ repeated measurements on the same subject or unit
◮ basic observation: a smooth function (in practice observed discretely on a grid)

Use of functional data:

◮ classification/clustering
◮ ANOVA- and regression-like models
◮ prediction

Smoothness is exploited in various ways
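As a minimal sketch of what "observed discretely on a grid" means in practice, the R code below stores a set of simulated curves as a matrix with one column per curve. The curve shape, the noise level and all object names are assumptions made for illustration only; none of them come from the talk.

```r
# Toy functional data: N smooth curves observed on a common grid of K points.
set.seed(1)
N <- 6                                  # number of curves (subjects or units)
K <- 50                                 # number of grid points
t_grid <- seq(0, 1, length.out = K)     # common observation grid

# Each curve is a smooth function of t plus small measurement noise.
amp <- runif(N, 0.8, 1.2)               # hypothetical curve-specific amplitudes
Y <- sapply(amp, function(a) a * sin(pi * t_grid) + rnorm(K, sd = 0.02))

# Y is a K x N matrix: rows index grid points, columns index curves.
print(dim(Y))

# The pointwise mean over curves is the simplest summary of the overall trend.
mean_curve <- rowMeans(Y)
```

The matrix layout only works when all curves share the same grid; with curve-specific grids a long data frame (one row per observation) is the more natural representation.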

slide-8
SLIDE 8

Functional regression

How to relate functional responses to scalar explanatory variables?

Available functional regression models:

Semi-parametric approaches:

◮ additive effects models (Ramsay & Silverman, 2005)

(R package fda on CRAN and R-Forge)

◮ multiplicative effects models (Chiou et al., 2003)

(R package fmer soon on CRAN)

◮ . . .

slide-10
SLIDE 10

Functional multiplicative effects models

A little notation:

◮ $y_i : T \to \mathbb{R}$ is a function ($i = 1, \dots, N$)
◮ $T \subseteq \mathbb{R}$ is an interval
◮ observed at points $t_1, \dots, t_K$ ($K$ large)

Multiplicative effects regression model:

$$E(y_i(t) \mid z_i) = \psi(t, z_i)\,\mu(t)$$

Right-hand side:

◮ $\mu$: captures the overall average trend
◮ $\psi$: multiplicative effects; low-degree polynomials in $t$ with coefficients depending on the explanatory variable $z_i$
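Written out, one way to parametrize the multiplicative effect is shown below. The coefficient symbols $\beta_r$ and the degree $p$ are notation introduced here for illustration; the slide only states that $\psi$ is a low-degree polynomial in $t$ whose coefficients depend on $z_i$.

```latex
\begin{align*}
E\bigl(y_i(t) \mid z_i\bigr) &= \psi(t, z_i)\,\mu(t),
  \qquad t \in T,\quad i = 1, \dots, N, \\
\psi(t, z_i) &= \sum_{r=0}^{p} \beta_r(z_i)\, t^{r},
  \qquad p \text{ small, e.g. } p = 2 \text{ for a quadratic } \psi .
\end{align*}
```

With $p = 0$ the model reduces to a curve-specific scaling of the common trend $\mu(t)$.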

slide-11
SLIDE 11

Estimation – in two steps

1. Non-parametric estimation:
   ◮ µ: smoothing based on all curves (R package KernSmooth)
   ◮ coefficients in ψ: obtained using least squares
2. Parametric or semi-parametric estimation for coefficients:
   1. choose GLM (glm()) or quasi-likelihood model
   2. iterative estimation (IWLS + smoothing):
      ⋆ link and/or variance functions (not in GLM case)
      ⋆ parameters in linear predictor
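The two steps can be sketched on simulated data as follows. This is a simplified illustration under stated assumptions, not the fmer implementation: the simulated trend, the linear form of ψ, the fixed bandwidth of 0.05 and the Gaussian GLM in step 2 are all choices made here, and the quasi-likelihood branch with estimated link and variance functions is not covered. Only `locpoly()` from KernSmooth and base R functions are used.

```r
library(KernSmooth)  # locpoly(): local polynomial smoothing

set.seed(42)
N <- 40                                    # number of curves
K <- 60                                    # grid points per curve
t_grid <- seq(0, 1, length.out = K)
z  <- rep(c(-0.5, 0.5), length.out = N)    # hypothetical scalar covariate
mu <- 1 + sin(pi * t_grid)                 # hypothetical overall trend
b0 <- 1 + 0.4 * z                          # true coefficients of a linear psi
b1 <- 0.6 * z

# Simulate y_i(t_k) = psi(t_k, z_i) * mu(t_k) + noise; Y is K x N.
Y <- sapply(seq_len(N), function(i) {
  (b0[i] + b1[i] * t_grid) * mu + rnorm(K, sd = 0.05)
})

# Step 1a: estimate mu by smoothing all curves pooled together.
# The bandwidth is fixed by hand here (an assumption, not a recommendation).
pooled <- locpoly(rep(t_grid, times = N), as.vector(Y),
                  degree = 1, bandwidth = 0.05)
mu_hat <- approx(pooled$x, pooled$y, xout = t_grid, rule = 2)$y

# Step 1b: per-curve least squares for the coefficients in psi, using
# y_i(t) ~ b0_i * mu_hat(t) + b1_i * t * mu_hat(t) (no intercept).
X <- cbind(b0 = mu_hat, b1 = t_grid * mu_hat)
coefs <- t(sapply(seq_len(N), function(i) lm.fit(X, Y[, i])$coefficients))

# Step 2: relate each estimated coefficient to the covariate with a GLM.
step2 <- data.frame(z = z, b0 = coefs[, "b0"], b1 = coefs[, "b1"])
fit_b0 <- glm(b0 ~ z, family = gaussian(), data = step2)
fit_b1 <- glm(b1 ~ z, family = gaussian(), data = step2)

print(coef(fit_b0))
print(coef(fit_b1))
```

In this simulation the covariate is balanced around zero so that ψ averages to one over the curves, which lets the pooled smooth serve as an estimate of µ; with other designs the coefficients are identified only relative to the pooled trend.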

slide-12
SLIDE 12

Using R

library(fmer)
bo.m1 <- fmerm(fluo2 ~ log(time), id2, id0, data = barleyOat, quad = TRUE)

Arguments to fmerm:

◮ fluo2: function values
◮ log(time): grid values
◮ id2: curve id (54 curves in total)
◮ id0: treatment factor
◮ quad: ψ quadratic in t
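The call implies a long data layout with one row per observed point. The sketch below builds a toy data frame with the same column names, purely to illustrate that layout. It does not call fmer and does not reproduce the barleyOat data: the number of grid points, the three treatment levels and the simulated curve shape are all hypothetical.

```r
# Toy data in the long layout implied by the fmerm() call:
# one row per (curve, time point) combination.
set.seed(3)
n_curves  <- 54                                   # matches the 54 curves mentioned
K         <- 30                                   # hypothetical grid size per curve
time_grid <- exp(seq(log(0.01), log(1), length.out = K))  # hypothetical times

toy <- data.frame(
  id2  = factor(rep(seq_len(n_curves), each = K)),           # curve id
  id0  = factor(rep(rep(c("A", "B", "C"), each = n_curves / 3),
                    each = K)),                              # treatment factor
  time = rep(time_grid, times = n_curves)                    # grid values
)

# Simulated response standing in for the function values (not real data).
toy$fluo2 <- 1 - exp(-5 * toy$time) + rnorm(nrow(toy), sd = 0.02)

# Each curve belongs to exactly one treatment level.
one_trt <- tapply(as.character(toy$id0), toy$id2,
                  function(x) length(unique(x)))
print(all(one_trt == 1))
```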

slide-13
SLIDE 13

Model fit components

Estimated overall mean
Estimated regression curves (use plot method)

For each coefficient in ψ:

◮ estimated link and variance functions
◮ estimated parameters

(use summary method)

◮ fitted values and residuals

(use fitted and residuals)

slide-14
SLIDE 14

Fitted fluorescence curve

[Figure: fitted fluorescence curve, produced using the plot method]

slide-15
SLIDE 15

Pros and cons

Advantages:

◮ non-parametric modelling of the form of the curves

(separating the time effect from other effects)

◮ parametric regression models for the differences between curves
◮ graphical model check available (ratioPlot)

Drawbacks:

◮ automatic bandwidth selection needed (used repeatedly)
◮ two-step estimation procedure (some variation lost)
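To illustrate what automatic bandwidth selection can look like, the sketch below uses `dpill()` from KernSmooth, a direct plug-in selector for local linear regression with a Gaussian kernel, and passes the result to `locpoly()`. Whether fmer uses this particular selector is not stated in the talk; the simulated curve and all object names are assumptions.

```r
library(KernSmooth)  # dpill(): plug-in bandwidth, locpoly(): local polynomial fit

set.seed(7)
t_obs <- seq(0, 1, length.out = 200)                  # observation grid
y_obs <- sin(2 * pi * t_obs) + rnorm(200, sd = 0.2)   # hypothetical noisy curve

# Data-driven bandwidth for local linear smoothing.
h <- dpill(t_obs, y_obs)

# Local linear fit using the selected bandwidth.
fit <- locpoly(t_obs, y_obs, degree = 1, bandwidth = h)

print(h)
```

Because the smoothing step is repeated within the iterative estimation, the selector is called many times, so its speed and stability matter as much as its statistical properties.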

slide-17
SLIDE 17

Future R work

Testing on more datasets!!!

Setting up a modular structure for model fitting:

◮ one function per step in estimation procedure
◮ plug-ins for different smoothing methods
◮ choice between bandwidth selection methods
◮ more flexible model specification

Constructing extractors for various fit components

slide-18
SLIDE 18

Future theoretical work

◮ Joint estimation
◮ Extended modelling including the residual process
◮ Model checking diagnostics

slide-19
SLIDE 19

References

Chiou, J. M., Müller, H.-G. and Wang, J. L. (2003). Functional quasi-likelihood regression with smooth random effects. J. R. Statist. Soc. B, 65, 405–423.

Ramsay, J. O. and Silverman, B. W. (2005). Functional Data Analysis (2nd edn). Springer, New York.

Ritz, C. and Streibig, J. C. (2008). Functional regression analysis of fluorescence curves. To appear in Biometrics.
