Optimal Designs for a Modified Exponential Model Juan M. Rodr - - PDF document

optimal designs for a modified exponential model
SMART_READER_LITE
LIVE PREVIEW

Optimal Designs for a Modified Exponential Model Juan M. Rodr - - PDF document

Optimal Designs for a Modified Exponential Model Juan M. Rodr guez-D az Mar a Teresa Santos-Mart n 1 Historical notes From the middle of the 19th century on several equations relating the rate constant k of a chemical


slide-1
SLIDE 1

Optimal Designs for a Modified Exponential Model

Juan M. Rodr´ ıguez-D´ ıaz Mar´ ıa Teresa Santos-Mart´ ın 1 Historical notes

From the middle of the 19th century on several equations relating the rate constant k of a chemical reaction to the temperature T have appeared in literature. All of them were developed experimentally and the most popular ones try to fit linearly ln(k) against T, 1/T or ln(T). The fact that this different plots gave reasonably good linear fits with the same data was due to the narrow temperature ranges usually employed in kinetic studies (see [Laidler (1984)]), but after the controversy on the models maintained during several decades all these equations were gradually dropped out but two of them: the Arrhenius and Modified-Arrhenius ones. The main reason for the rejection of the remaining equations was the lack of theoretical justification for them, while the Arrhenius ones were well explained theoretically. 1.1 The Arrhenius equation The Arrhenius equation is widely accepted as the right tool to describe the influence of temperature on the rates of chemical processes. It was first used by Svante Arrhenius in 1884 in his studies of dissociation of electrolytes, but later on applied to describe the relationship between temperature and the rates of chemical reactions and many other physical processes such as diffusion, thermal and electrical conductivity, viscosity, etc. The integrated form of the equation is ln(k) = A′ − β T , where β = E/R, with E the activation energy and R the gas constant. By taking expo- nentials and making the change T = 1/X it comes to be the exponential model E[k] = Ae−βx + ε (1) where A = eA′ > 0 is the frequency factor, and β > 0. The optimal designs for this model have been studied in [Han and Chaloner (2003)] for independent and normally-distributed errors with constant variance and in [Rodr´ ıguez-Torreblanca and Rodr´ ıguez-D´ ıaz (2007)] for different variance structures. Also optimal and compound designs specifically for the Arrhenius equation as well as a study of the efficiency of some designs used in literature can be found in [Rodr´ ıguez-Arag´

  • n and L´
  • pez-Fidalgo (2005)].

However, for the analysis of more precise rate-temperature data, particularly in studies covering a wide temperature range, it is usual to allow A′ to be temperature-dependent, proportional to ln(T), or equivalently A proportional to X raised to a power m producing E[k] = axme−βx + ε (2) where a > 0 is now temperature independent and β > 0, the so called Modified-Arrhenius (MA) model. Nowadays (see [Laidler (1984)]), the procedure often employed is to use

1

slide-2
SLIDE 2

model (1) for data of lower precision or where the temperature range is limited, and to analyze more precise data by using model (2). This will be the model we will work with in this paper. 1.2 Optimal Design theory In Optimal Experimental Design theory the work with non-linear models like the MA one is far more complex than the case of linear ones, due to the fact that in the first case the best design will depend on the values of the unknown parameters. A non-linear model can be written as Y = η(x, θ) + ǫ, (3) where Y denotes the observation, x ∈ X = (xmin, xmax) is the independent variable, θ = (θ1, θ2, . . . θk) ∈ Θ is the vector of unknown parameters, and η(x, θ) represents a function that is non-linear respect to θ. The ǫ term stands for the random errors, coming from either the experiment itself or due to a bad choice of the model. In the homoscedastic case the errors are assumed to be normally distributed, with zero mean and constant variance. The heteroscedastic model allows a more complex structure for the variance, and the treatment is more difficult. A design is a collection of points of the independent variable, {x1, x2, . . . , xN}, where N is the size of the design. It can be written taking the n different points (called the support points) and for each one the proportion (weight) than it has in the design, that is ξn = x1 x2 . . . xn p1 p2 . . . pn

  • ,

n

  • i=1

pi = 1, xi ∈ X, i = 1, . . . , n where ni = Npi is the frecuency of point xi in the design. From this point of view, an approximate design can be defined as any probability measure in X with finite support. Let p(Y |x, θ) be the probability density function for the random variable Y with model (3). The (i, j) element of the Fisher information matrix for a given x is, I(x, θ) = E −∂2 log p(y|x, θ) ∂θi∂θj

  • .

For n independent samples in x = x1, . . . , xn, the corresponding matrix is I(x, θ) =

n

  • i=1

I(xi, θ). The information matrix for an approximate design ξ is M(ξ, θ) = E(I(˜ x, θ)), where ˜ x is a random vector with probability distribution function ξ. The information matrix becomes the main tool when looking for the optimal design for the experiment. When the function η(x, θ) in model (3) is differentiable with continuous derivative for every parameter θi, the information matrix for a design ξ will be M(ξ, θ) =

n

  • i=1

pi∇η(xi, θ)∇η(xi, θ)T,

2

slide-3
SLIDE 3

where ∇η(x, θ) = ∂η(x, θ) ∂θ1 , ∂η(x, θ) ∂θ2 , . . . , ∂η(x, θ) ∂θk

  • ,

is the gradient vector of η(x, θ). The last expression is for the homoscedastic case. When a heteroscedastic model is assumed, with error variance given by var(Y |x) = σ2/ω(x), being ω(x) a non-negative function called the efficiency function, the information matrix is M(ξ, θ) =

n

  • i=1

piω(xi)∇η(xi, θ)∇η(xi, θ)T. (4) In both cases, when dealing with non-linear models the information matrix depends on the unknown parameters. To solve this problem some kind of additional information is needed, either an initial value for the parameters or a priori distribution for them. In any case, the optimal design will be a function of these initial values or distributions. The inverse

  • f the information matrix is proportional to the covariance matrix of the estimators of the

parameters of the model. Optimal experimental designs typically minimize some convex function of the inverse of the information matrix. If the aim is to estimate the parameters in the model, a reasonable criterion is D-optimality, that focuses in the determinant of the information matrix. A design ξ is D-optimal if maximices this determinant, what is equivalent to minimize the

  • ne of the covariance matrix (the generalized variance of the parameter estimators). A

D-optimal design minimizes the volume of the confidence ellipsoid of the parameters. The General Equivalence Theorem is a useful tool for checking that a design is optimal. In the case of D-optimality, it can be stated as follows: If k is the number of parameters

  • f the model and θ = θ0, then the design ξ∗ is D-optimal if and only if

ω(x)∇η(x, θ0)TM −1(ξ∗, θ0)∇η(x, θ0) ≤ k x ∈ X and the inequality becomes equality in the support points of ξ∗. If we are interested in looking for the best estimator for a linear combination of para- meters then c-optimization is the right criterion function. It is specially used for vectors (1,0,...,0),...,(0,...,0,1), getting optimal designs for the estimation of each parameter. For c ∈ Rk, a design ξ is c-optimal if maximices −cTM(ξ, θ)−1c what is equivalent to minimize V ar(cT ˆ θ). By the General Equivalence Theorem for c-optimality, the design ξ∗ is c-optimal if and only if (∇η(x, θ0)TM −1(ξ∗, θ0)c)2 ≤ cTM −1(ξ∗, θ0)c ∀x ∈ X However, c-optimal designs can be constructed geometrically using Elfving’s Theorem [Elfving (1952)]. The exponential regression model is one of the most used in literature since models the behaviour of laboratory cultives, the time-life of dispositives, etc. The exponential distri- bution describes the time for a continuous process to change state. Optimal experimen- tal designs for exponential regression models have been studied in [Box and Lucas (1953)], [Melas (1978)], [Dette and Sperlich (1994)], [Mukhopadhyay and Haines (1995)], [Dette and Neugebauer [Dette and Neugabauer (1997)], [Han and Chaloner (2003)] and [Rodr´ ıguez-Torreblanca and Ro

3

slide-4
SLIDE 4

this last one working with different efficiency functions and related with some other discrete distributions like the Poisson or the Negative Binomial ones. In the following, optimal designs for the MA model 2 will be computed for different

  • ptimality criteria. Homocedastic and heterocedastic cases with several efficiency functions

will be studied. The designs computed will us allow to check the efficiency of several other designs used in literature.

2 The homocedastic case

In this first approach the error ǫ will be normally distributed with mean 0 and constant variance σ2. Assuming m known, the parameters to be estimated are a and β. The optimal design does not depend on the error variance σ2, thus σ2 = 1 will be assumed without loss

  • f generality.

2.1 D-optimal designs Let us first focus on D-optimality. Given that ∇η(x, θ) =

  • xme−βx

−axm+1e−βx

  • the information matrix (4) for a design ξ is

M(ξ, θ) =

  • X

∇η(x, θ)∇η(x, θ)Tξ(dx) =

  • X

x2me−2βx

  • 1

−ax −ax a2x2

  • ξ(dx)

=

  • X x2me−2βxξ(dx)
  • X ax2m+1e−2βxξ(dx)

  • X ax2m+1e−2βxξ(dx)
  • X a2x2m+2e−2βxξ(dx)
  • (5)

Then |M(ξ, θ)| = a2

  • X

x2me−2βxξ(dx)

  • X

x2m+2e−2βxξ(dx) −

X

x2m+1e−2βxξ(dx) 2

  • ,

and thus the design maximizing the determinant does not depend on a. From now on we will assume a = 1. Since we have two parameters we need at least two points in the design. Let us begin by considering a two-point design, say supported in x1, x2 with xmin ≤ x1 < x2 ≤ xmax. Both points should have the same weight for the design to be D-optimal, and the determinant

  • f the information matrix will be in this case

Det(x1, x2) = 1 4e−2β(x1+x2)x2m

1 x2m 2 (x1 − x2)2 .

For x1 fixed Detx1(x2) = Det(x1, x2) is a continuous function with three stationary points in (0, ∞) given by x2 = x1, and x2 = (1 + m + βx1 ±

  • (1 + m + βx1)2 − 4βmx1)/(2β)

. (6)

4

slide-5
SLIDE 5

The first one is a minimum, and the rest are local maxima, one smaller than x1 (when taking the square root to be negative) and the other one greater than x1 (for a positive square root). Thus, when assuming x1 < x2 the solution is x2 = 1 + m + βx1 +

  • (1 + m + βx1)2 − 4βmx1

2β . (7) Replacing this expression for x2 in the determinant we get a quite complicated function in

  • x1. For m < 0 this function decreases as x1 increases, thus the best choice will be x1 = xmin.

Let us now assume m > 0: after some algebra and using the hypotheses for the model it shows up that the maximum is achieved for x∗

1 = 1 + 2m − √1 + 2m

2β . The second point can now be obtained from (7), but it can be checked that it can also be expressed by x∗

2 = 1 + 2m + √1 + 2m

2β , that appears to be a much more easier and suitable expression. The best design is thus ξ∗ supported in {x∗

1, x∗ 2} when m ≥ 0 and x∗ 1, x∗ 2 ∈ X. If either

m < 0 or x∗

1 < xmin then using (7) we get that the design ξ∗ min with support in {xmin, x∗ min}

being x∗

min = 1 + m + βxmin +

  • (1 + m + βxmin)2 − 4βmxmin

2β will be the best choice; and (again for m ≥ 0) when x∗

2 > xmax then the design ξ∗ max putting

the same weight in the points xmax and x∗

max = 1 + m + βxmax −

  • (1 + m + βxmax)2 − 4βmxmax

2β is the best candidate for D-optimality, now taking the smaller local maximum from (6). If both conditions x∗

1 < xmin and x∗ 2 > xmax are to be verified then the optimal design will be

the one with support points {xmin, xmax}, gξ∗

min,max(x). Table 1 summarizes these results.

Table 1: D-optimal support points for MA model assuming constant variance Special points Conditions on X D-optimal support points (a) x∗

1 = 1+2m−√1+2m 2β

x∗

2 = 1+2m+√1+2m 2β

x∗

1 ≥ xmin

x∗

2 ≤ xmax

{x∗

1, x∗ 2}

(b) x∗

min = B+√ B2−4βmxmin 2β

with B = 1 + m + βxmin

  • r

m < 0 x∗

1 < xmin

{xmin, min{x∗

min, xmax}}

(c) x∗

max = C−√ C2−4βmxmax 2β

with

C = 1 + m + βxmax x∗

2 > xmax

{max{x∗

max, xmin}, xmax}

To finish we need to check the optimality condition for these designs. Using the Equiv- alence theorem, for proving that ξ is D-optimal is enough to check that gξ(x) = 2 − ω(x)∇η(x, θ)TM −1(ξ)∇η(x, θ) ≥ 0 ∀x ∈ [xmin, xmax]

5

slide-6
SLIDE 6

and gξ(x) is 0 in the support points. For each one of the cases both conditions can be checked straightforwardly: gξ∗(x) has two local minima, precisely in the support points, with value 0; while gξ∗

min(x) has an only local minimum for x > xmin in x∗

min and gξ∗

max(x) has its only

local minimum for x < xmax in x∗

max, always giving value 0 for the corresponding support

  • points. Finally, the only stationary point of gξ∗

min,max(x) in [xmin, xmax] is a maximum, and

as in the previous cases the function values 0 in the support points. Remark 2.1. Note that when m = 0 model (2) comes to be the exponential model studied in [Han and Chaloner (2003)]. In this case x∗

1 = 0 and x∗ 2 = 1/β, thus the support points

  • f the D-optimal design will be {0, 1/β} if X = {0, xmax} with xmax ≥ 1/β (case (a)), or

{xmin, min{xmax, xmin + 1/β}} for a general situation (b), that is the solution given in the cited paper. Example 2.1.1. For m = 1 and taking β0 = 1 as initial value of parameter β the “special points” are x∗

1 = 3 −

√ 3 2 = 0.634, x∗

2 = 3 +

√ 3 2 = 2.366, x∗

min = 2.618

and x∗

max = 0.586 ,

the last ones assuming xmin = 1 and xmax = 2 respectively. Figure 1 shows the plotting

  • f function gξ(x) for the design space X being respectively [0, ∞], [1, ∞], [0, 2] and [1, 2],

taking ξ to be the best design for each situation. It can be seen that always the functions takes the value 0 at the support points and it is greater than 0 for every x ∈ X.

Figure 1: Checking D-optimality condition for MA model with constant variance, m = β0 = 1 and different design spaces X

2.2 Study of efficiency Once we have the optimal designs for the MA model it is possible to analyze some of the designs used in the literature for the Arrhenius model from the point of view of efficiency. As it was mentioned before, the MA model is used instead of the Arrhenius one when more precision is needed or the range of variation of the independent variable is wider; but in these cases the samples are usually taken at the same points that those when working with the Arrhenius equation, that is, the same designs are used.

6

slide-7
SLIDE 7

To perform such a comparison among designs we will pay attention to Example 1 in [Rodr´ ıguez-Arag´

  • n and L´
  • pez-Fidalgo (2005)].

In that section of the paper, D-optimal designs are computed for the Arrhenius model applied to the reaction NO+O3 → NO2+O2 in the interval T ∈ [212, 422] using as initial values of the parameters A = 3.0 × 10−12 and B = 1500. The D-optimal design for the Arrhenius model has support points 329.3 and 422, with the same weight in both points. Some other designs are analyzed:

  • Ray-Watson design, obtained from [Ray and Watson (1981)]:

ξRay−Wat =

  • 212

241 273 299 361 422 12/75 9/75 8/75 24/75 12/75 10/75

  • .

These authors took 75 samples using 6 different points. In the following designs this same number of support points will be used.

  • an equally weighted and Uniform distributed design with constant distance d between

consecutive points and symmetric from the center of the design space: ξUni = 212 254 296 338 380 422 1/6 1/6 1/6 1/6 1/6 1/6

  • .
  • an equally weighted and Arithmetical distributed design with the spacing parameter

between consecutive points growing from the center of the design space following an arithmetical progression: ξAri = 212 269.27 307.45 326.54 364.72 422 1/6 1/6 1/6 1/6 1/6 1/6

  • .
  • an equally weighted and Geometrical distributed design with the spacing parameter

between consecutive points growing from the center of the design space following an geometrical progression: ξGeo = 212 295.68 314.81 319.18 338.32 422 1/6 1/6 1/6 1/6 1/6 1/6

  • .
  • an equally weighted and Lineal inverse distributed design: the image of the design

space through the model is uniformly divided and the correspondent points through the inverse regression are taken as support points: ξLinInv = 212 296.9 338.8 370.7 397.8 422 1/6 1/6 1/6 1/6 1/6 1/6

  • .

The efficiency of design ξ respect to D-optimality for the MA model is EffD(ξ) = det M(ξ) det M(ξ∗) 1

2

, where ξ∗ is the D-optimal design for the MA model shown in Table 1. In Figure 2 the efficiencies of the designs described above for the MA model are plotted for different values

  • f m. It can be seen that the best design near m = 0 is logically the D-optimal design

7

slide-8
SLIDE 8

for the Arrhenius model -for m = 0 the MA model is exactly the Arrhenius one-. When moving away from m = 0, the efficiency decreases steadily, being the lost in efficiency much faster for positive m. For m > 3, the best design in our study is the Uniform one, while when m < −7 the cleverest choice appears to be the Linear inverse design. In any case, when the true model is a MA one with m not very close to 0 the only sensible design for D-optimality is the corresponding D-optimal design for the MA model given by Table 1. Otherwise the efficiency of any other design may be almost ridiculous.

Figure 2: Comparative D-efficiency of several designs for the MA model with β0 = 1500, T ∈ [212, 422] against m

3 The heterocedastic case

Now the case with non-constant variance will be studied. Still we will assume independent

  • bservations and m known. The information matrix will be

M(ξ, θ) =

  • X

x2me−2βx

  • 1

−ax −ax a2x2

  • ω(x)ξ(dx)

where ω(x) is the efficiency function. In the following some special values of ω(x) will be examined. 3.1 Variance equal to the mean In many experiments occur that the observed variance is a function of the mean, very often proportional to it. These cases are very related to the Poisson distribution, already stud- ied in [Rodr´ ıguez-Torreblanca and Rodr´ ıguez-D´ ıaz (2007)] for the exponential (Arrhenius)

  • model. When dealing with the MA model it means that

ω(x) = axme−βx .

8

slide-9
SLIDE 9

1 2 3 4 5 1 0.5 0.5 1 1.5 2

c

xmax

  • xmax

gΞmax

x 1 2 3 4 5 1 0.5 0.5 1 1.5 2

b c

xmin xmax

gΞmin,max

  • x

1 2 3 4 5 1 0.5 0.5 1 1.5 2

a

x1

  • x2
  • gΞ x

1 2 3 4 5 1 0.5 0.5 1 1.5 2

b

xmin xmin

  • gΞmin

x

Figure 3: Checking D-optimality condition for MA model with variance equal to the mean, m = β0 = 1 and different design spaces X

In this case, the determinant of the information matrix is Det(x1, x2) = 1 4e−β(x1+x2)xm

1 xm 2 (x1 − x2)2 ,

and the D-optimal designs can be computed by using a similar procedure that the one described in Section 2.1. They are shown in Table 2

Table 2: D-optimal support points for MA model and variance=mean Special points Conditions on X D-optimal support points (a) x∗

1 = 1+m−√1+m β

x∗

2 = 1+m+√1+m β

x∗

1 ≥ xmin

x∗

2 ≤ xmax

{x∗

1, x∗ 2}

(b) x∗

min = B+√ B2−4βmxmin 2β

with B = 2 + m + βxmin

  • r

m < 0 x∗

1 < xmin

{xmin, min{x∗

min, xmax}}

(c) x∗

max = C−√ C2−4βmxmax 2β

with

C = 2 + m + βxmax x∗

2 > xmax

{max{x∗

max, xmin}, xmax}

Example 3.1.1. Like in Example 2.1.1 let us make the computation for m = β0 = 1. The “special points” are in this case x∗

1 = 2 −

√ 2 = 0.586, x∗

2 = 2 −

√ 2 = 3.414, x∗

min = 3.732

and x∗

max = 0.438 ,

the last ones again assuming xmin = 1 and xmax = 2 respectively. Figure 3 shows the plotting

  • f function gξ(x) for the design space X being [0, ∞), [1, ∞), [0, 2] and [1, 2] respectively,

always taking ξ to be the best design for each situation. The functions takes the value 0 at the support points and it is greater than 0 for every x ∈ X.

References

[Atkinson and Donev (1992)] Atkinson, A.C. and Donev, A.N. Optimum Experimental De-

  • signs. Oxford University Press, (1992) New York.

9

slide-10
SLIDE 10

[Box and Lucas (1953)] Box, G.E.P. and Lucas, H.L. Design of experiments in nonlinear

  • ituations. Biometrika 46 (1953) 77–90.

[Dette and Neugebauer (1996)] Dette, H. and Neugebauer, H. Bayesian optimal one point designs for one parameter nonlinear models. Journal of Statistical Planning and Infer- ence 52 (1996) 17–31. [Dette and Neugabauer (1997)] Dette, H. and Neugebauer, H. Bayesian D-optimal designs for exponential regression models. Journal of Statistical Planning and Inference 60 (1997) 331–349. [Dette and Sperlich (1994)] Dette, H. and Sperlich, S. A note on Bayesian D-optimal de- signs for a generalizarion of the exponential growth model. South African Statistical Journal 28 (1994) 103–117. [Elfving (1952)] Elfving, G. Optimum allocation in linear regression theory. Ann. Math.

  • Statist. 23 (1952) 255–262.

[Imhof (2001)] Imhof, L.A. Maximin designs for exponential Growth models and het- eroscedastic polynomial models. The Annals of Statistics bf 29 (2001) 561–576. [Han and Chaloner (2003)] Han, C. and Chaloner, K. D- and c-optimal designs for expo- nential regression models used in viral dynamics and other applications. Journal of Statistical Planning and inference 115 (2003) 585–601. [Laidler (1984)] Laidler, K. J. The Development of the Arrhenius Equation. Journal of Chemical Education, 61 (1984) 494–498. [Mukhopadhyay and Haines (1995)] Mukhopadhyay, S. and Haines, L.M. Bayesian D-

  • ptimal designs for the exponential growth model. Journal of Statistical Planning and

Inference 44 (1995) 385–397. [Melas (1978)] Melas, V.B. Optimal designs for exponential regression . Math. Operations- forsch Stat. 9 (1978) 45–59. [Minkin (2001)] Minkin, S. Experimental design for Clonogenic Assays in Chemotherapy. Journal of the American Statistical Association Vol. 88 422 (1993) 410–420. [Ray and Watson (1981)] Ray G.W. and Watson R.T. Kinetics of the reaction NO+O3 → NO2+O2 from 212 to 422. J. Phys. Chem. 85 (1981) 1673-1676. [Rodr´ ıguez-Arag´

  • n and L´
  • pez-Fidalgo (2005)] Rodr´

ıguez-Arag´

  • n L. J. and L´
  • pez-Fidalgo
  • J. Optimal designs for the Arrhenius equation, Chemometrics and Intelligent Labora-

tory Systems, 77 (2005), 131–138. [Rodr´ ıguez-Torreblanca and Rodr´ ıguez-D´ ıaz (2007)] Rodr´ ıguez-Torreblanca C. and Rodr´ ıguez D´ ıaz J.M. Locally D- and c-optimal designs for Poisson and Negative Binomial regression models. Metrika (In press). [Wang et al (2007)] Wang, Y., Myers, R.H., Smith, E.P., Ye, K. D-optimal designs for Poisson regression models . Journal of Statistical Planning and inference (In press).

10