Computational Methods for Nonlinear Mixed Models Douglas Bates - - PowerPoint PPT Presentation

computational methods for nonlinear mixed models
SMART_READER_LITE
LIVE PREVIEW

Computational Methods for Nonlinear Mixed Models Douglas Bates - - PowerPoint PPT Presentation

Computational Methods for Nonlinear Mixed Models Douglas Bates University of Wisconsin - Madison <Bates@Wisc.edu> Slides for this presentation are available at lme4.R-forge.R-project.org/slides/ Joint Statistical Meetings Washington, DC


slide-1
SLIDE 1

Computational Methods for Nonlinear Mixed Models

Douglas Bates

University of Wisconsin - Madison <Bates@Wisc.edu> Slides for this presentation are available at lme4.R-forge.R-project.org/slides/

Joint Statistical Meetings Washington, DC August 6, 2009

Douglas Bates (U.Wisc) Nonlinear mixed models August 6, 2009 1 / 14

slide-2
SLIDE 2

Outline

1

Introduction

2

Model definition and an example

3

The penalized least squares problem

Douglas Bates (U.Wisc) Nonlinear mixed models August 6, 2009 2 / 14

slide-3
SLIDE 3

Outline

1

Introduction

2

Model definition and an example

3

The penalized least squares problem

Douglas Bates (U.Wisc) Nonlinear mixed models August 6, 2009 2 / 14

slide-4
SLIDE 4

Outline

1

Introduction

2

Model definition and an example

3

The penalized least squares problem

Douglas Bates (U.Wisc) Nonlinear mixed models August 6, 2009 2 / 14

slide-5
SLIDE 5

Outline

1

Introduction

2

Model definition and an example

3

The penalized least squares problem

Douglas Bates (U.Wisc) Nonlinear mixed models August 6, 2009 3 / 14

slide-6
SLIDE 6

Introduction

Population pharmacokinetics data are often modeled using nonlinear mixed-effects models (NLMMs). These are nonlinear because pharmacokinetic parameters - rate constants, clearance rates, etc. - occur nonlinearly in the model function. In statistical terms these are mixed-effects models because they involve both fixed-effects parameters, applying to the entire population or well-defined subsets of the population, and random effects associated with particular experimental or observational units under study. Many algorithms for obtaining parameter estimates, usually the maximum likelihood estimates (MLEs), for such models have been proposed and implemented. Comparing different algorithms is not easy. Even understanding the definition of the model and the proposed algorithm is not easy. We begin by defining the model.

Douglas Bates (U.Wisc) Nonlinear mixed models August 6, 2009 4 / 14

slide-7
SLIDE 7

Outline

1

Introduction

2

Model definition and an example

3

The penalized least squares problem

Douglas Bates (U.Wisc) Nonlinear mixed models August 6, 2009 5 / 14

slide-8
SLIDE 8

Theophylline pharmacokinetics

Time since drug administration (hr) Serum concentration (mg/l)

2 4 6 8 10 5 10 15 20 25

  • 6
  • 7

5 10 15 20 25

  • 8
  • 11

5 10 15 20 25

  • 3
  • 2
  • 4

5 10 15 20 25

  • 9
  • 12

5 10 15 20 25

  • 10
  • 1

5 10 15 20 25 2 4 6 8 10

  • 5

These are serum concentration profiles for 12 volunteers after injestion of an oral dose of Theophylline, as described in Pinheiro and Bates (2000).

Douglas Bates (U.Wisc) Nonlinear mixed models August 6, 2009 6 / 14

slide-9
SLIDE 9

Modeling pharmacokinetic data with a nonlinear model

These are longitudinal repeated measures data. For such data the time pattern of an individual’s response is determined by pharmacokinetic parameters (e.g. rate constants) that

  • ccur nonlinearly in the expression for the expected response.

The form of the nonlinear model is determined by the pharmacokinetic theory, not derived from the data. d · ke · ka · C e−ket − e−kat ka − ke These pharmacokinetic parameters vary over the population. We wish to characterize typical values in the population and the extent of the variation. Thus, we associate random effects with the parameters, ka, ke and C in the nonlinear model.

Douglas Bates (U.Wisc) Nonlinear mixed models August 6, 2009 7 / 14

slide-10
SLIDE 10

Linear and nonlinear mixed-effects models

For both linear and nonlinear mixed-effects models, we consider the n-dimensional response random variable, Y, whose value, y, is

  • bserved, and the q-dimensional, unobserved random effects variable,

B. In the models we will consider B ∼ N (0, Σθ). The variance-covariance matrix Σθ can be huge but it is completely determined by a small number of variance-component parameters, θ. The conditional distribution of the response, Y, is (Y|B = b) ∼ N

  • µY|B, σ2In
  • The conditional mean, µY|B, depends on b and on the fixed-effects

parameters, β, through a linear predictor expression, Zb + Xβ. For a linear mixed model (LMM), µY|B is exactly the linear predictor. For an NLMM the linear predictor determines the parameter values in the nonlinear model function which then determines the mean.

Douglas Bates (U.Wisc) Nonlinear mixed models August 6, 2009 8 / 14

slide-11
SLIDE 11

Transforming to orthogonal random effects

We never really form Σθ; we always work with the relative covariance factor, Λθ, defined so that Σθ = σ2ΛθΛ⊺

θ.

Note that we must allow for Λθ to be less that full rank. We define a q-dimensional “spherical” or “unit” random-effects vector, U, such that U ∼ N

  • 0, σ2Iq
  • , B = Λθ U ⇒ Var(B) = σ2ΛθΛ⊺

θ = Σθ.

Setting Uθ = ZΛθ, the linear predictor expression becomes Zb + Xβ = ZΛθ u + Xβ = Uθ u + Xβ.

Douglas Bates (U.Wisc) Nonlinear mixed models August 6, 2009 9 / 14

slide-12
SLIDE 12

The conditional mode, ˜ uθ,β

Although the probability model is defined from (Y|U = u), we

  • bserve y, not u (or b) so we want to work with the other conditional

distribution, (U|Y = y). The joint distribution of Y and U is Gaussian with density fY,U(y, u) = fY|U(y|u) fU(u) = exp(− 1

2σ2 y − µY|U2)

(2πσ2)n/2 exp(− 1

2σ2 u2)

(2πσ2)q/2 = exp(−

  • y − µY|U2 + u2

/(2σ2)) (2πσ2)(n+q)/2 The mode, ˜ uθ,β, of the conditional distribution (U|Y = y) (also the mean in this case of an LMM) is ˜ uθ,β = arg min

u

  • y − µY|U
  • 2 + u2

Douglas Bates (U.Wisc) Nonlinear mixed models August 6, 2009 10 / 14

slide-13
SLIDE 13

Outline

1

Introduction

2

Model definition and an example

3

The penalized least squares problem

Douglas Bates (U.Wisc) Nonlinear mixed models August 6, 2009 11 / 14

slide-14
SLIDE 14

Minimizing a penalized sum of squared residuals

An expression like

  • y − µY|U
  • 2 + u2 is called a penalized sum of

squared residuals because

  • y − µY|U
  • 2 is a sum of squared residuals

and u2 is a penalty on the size of the vector u. Determining ˜ uθ,β as the minimizer of this expression is a penalized least squares (PLS) problem. For an LMM it is a penalized linear least squares problem that can be solved directly (i.e. without iterating). For an NLMM it is a penalized nonlinear least squares problem. One way to determine the solution in an LMM is to rephrase it as a linear least squares problem for an extended residual vector ˜ uθ,β = arg min

u

  • y − Xβ

Uθ Iq

  • u
  • 2

This is sometimes called a pseudo-data approach because we create the effect of the penalty term, u2, by adding “pseudo-observations” to y and to the predictor.

Douglas Bates (U.Wisc) Nonlinear mixed models August 6, 2009 12 / 14

slide-15
SLIDE 15

The profiled deviance for LMMs

We can see that ˜ uθ,β satisfies

  • U ⊺

θ Uθ + Iq

˜ uθ,β = U ⊺

θ (y − Xβ)

which we solve using the sparse Cholesky decomposition LθL⊺

θ = P

  • U ⊺

θ Uθ + Iq

  • P ⊺

P is a permutation matrix that has practical importance but does not affect the theory. The matrix Lθ is the sparse, lower-triangular factor. Let r2(θ, β) be the minimum penalized residual sum of squares, then ℓ(θ, β, σ|y) = log L(θ, β, σ|y) can be written −2ℓ(θ, β, σ|y) = n log(2πσ2) + r2(θ, β) σ2 + log(|Lθ|2) The conditional estimate of σ2 is

  • σ2(θ, β) = r2(θ, β)

n producing the profiled deviance −2˜ ℓ(θ, β|y) = log(|Lθ|2) + n

  • 1 + log

2πr2(θ, β) n

  • Douglas Bates (U.Wisc)

Nonlinear mixed models August 6, 2009 13 / 14

slide-16
SLIDE 16

Profiling the deviance with respect to β for LMMs

In a LMM the deviance depends on β only through r2(θ, β) we can

  • btain the conditional estimate,

βθ, by extending the PLS problem to r2(θ) = min

u,β

  • y − Xβ − Uθ u2 + u2

with the solution satisfying the equations U ⊺

θ Uθ + Iq

U ⊺

θ X

X⊺Uθ X⊺X ˜ uθ

  • βθ
  • =

U ⊺

θ y

X⊺y.

  • The profiled deviance, which is a function of θ only, is

−2˜ ℓ(θ) = log(|Lθ|2) + n

  • 1 + log

2πr2(θ) n

  • Douglas Bates (U.Wisc)

Nonlinear mixed models August 6, 2009 14 / 14