Custom Functions for Specifying Nonlinear Terms to gnm Heather - - PowerPoint PPT Presentation

custom functions for specifying nonlinear terms to gnm
SMART_READER_LITE
LIVE PREVIEW

Custom Functions for Specifying Nonlinear Terms to gnm Heather - - PowerPoint PPT Presentation

Custom Functions for Specifying Nonlinear Terms to gnm Heather Turner, David Firth and Andy Batchelor Department of Statistics University of Warwick, UK H. Turner, D. Firth and A. Batchelor () Custom nonlin Functions UseR August 2008


slide-1
SLIDE 1

Custom Functions for Specifying Nonlinear Terms to gnm

Heather Turner, David Firth and Andy Batchelor

Department of Statistics University of Warwick, UK

  • H. Turner, D. Firth and A. Batchelor

() Custom “nonlin” Functions UseR August 2008 1 / 17

slide-2
SLIDE 2

Motivating Application: Marriage Data

We wish to investigate the propensity to marry for women living in Ireland based on the Living in Ireland Surveys (1994-2001) For women born between 1950 and 1975 we have

◮ year of (first) marriage ◮ year and month of birth ◮ social class ◮ highest level of education attained ◮ year highest level of education was attained

  • H. Turner, D. Firth and A. Batchelor

() Custom “nonlin” Functions UseR August 2008 2 / 17

slide-3
SLIDE 3

Proposed Model

We wish to model the hazard of marriage occuring at time t h(t) = P(T = t|T ≥ t) using the discrete-time model logit(h(t)) = c + βl log(ageit − αl) + βr log(αr − ageit) + x′

itβ

= η The parameters can be estimated by logistic regression of a marriage indicator on η — a generalized nonlinear model

◮ need custom "nonlin" function to specify “log-excess” terms

in the formula argument to gnm

  • H. Turner, D. Firth and A. Batchelor

() Custom “nonlin” Functions UseR August 2008 3 / 17

slide-4
SLIDE 4

Variables and Predictors

A "nonlin" function creates a list of arguments for the internal function nonlinTerms First define the variables and predictors in the term βl log(ageit − αl) Start to build "nonlin" function as follows

LogExcess <- function(age){ list(predictors = list(beta = ~1, alpha = ~1), variables = list(substitute(age))) } class(LogExcess) <- "nonlin"

  • H. Turner, D. Firth and A. Batchelor

() Custom “nonlin” Functions UseR August 2008 4 / 17

slide-5
SLIDE 5

Term-specific Issues

Want to use same function for both log-excess terms, so add argument

side = "left"

To avoid taking logs of negative values, we let αl = age[min] − 10−5 − exp(α∗

l )

αr = age[max] + 10−5 + exp(α∗

r)

and estimate α∗

l and α∗ r instead. In LogExcess define

constraint <- ifelse(side == "left", min(age) - 1e-5, max(age) + 1e-5)

  • H. Turner, D. Firth and A. Batchelor

() Custom “nonlin” Functions UseR August 2008 5 / 17

slide-6
SLIDE 6

Term Definition

The term argument of nonlinTerms takes labels for the predictors and variables and returns a deparsed expression of the term:

term = function(predLabels, varLabels) { paste(predLabels[1], " * log(", " -"[side == "right"], varLabels[1], " + ", " -"[side == "left"], constraint, " + exp(", predLabels[2], "))") }

So e.g.

> term(c("beta", "alpha*"), "age") [1] "beta * log( age +

  • 14.99999

+ exp( alpha* ))"

  • H. Turner, D. Firth and A. Batchelor

() Custom “nonlin” Functions UseR August 2008 6 / 17

slide-7
SLIDE 7

Parameter Labels

Default parameter labels are taken from the predictor names, here beta and alpha To make parameter labels unique, save call to LogExcess:

call <- sys.call()

and specify call argument to nonlinTerms

call = as.expression(call)

  • H. Turner, D. Firth and A. Batchelor

() Custom “nonlin” Functions UseR August 2008 7 / 17

slide-8
SLIDE 8

Complete Function

LogExcess <- function(age, side = "left"){ call <- sys.call() constraint <- ifelse(side == "left", min(age) - 1e-5, max(age) + 1e-5) list(predictors = list(beta = ~1, alpha = ~1), variables = list(substitute(age)), term = function(predLabels, varLabels) { paste(predLabels[1], " * log(", " -"[side == "right"], varLabels[1], " + ", " -"[side == "left"], constraint, " + exp(", predLabels[2], "))") }, call = as.expression(call)) } class(LogExcess) <- "nonlin"

  • H. Turner, D. Firth and A. Batchelor

() Custom “nonlin” Functions UseR August 2008 8 / 17

slide-9
SLIDE 9

Summary of Baseline Model

Call: gnm(formula = marriages/lives ~ LogExcess(age, side = "left") + LogExcess(age, side = "right"), family = binomial, data = fulldata, weights = lives, start = c(-20, 3, 0, 3, 0)) Deviance Residuals: Min 1Q Median 3Q Max

  • 0.8098
  • 0.4441
  • 0.3224
  • 0.1528

4.0483 Estimate Std. Error z value Pr(>|z|) (Intercept)

  • 118.5395

NA NA NA LogExcess(age, side = "left")beta 3.6928 NA NA NA LogExcess(age, side = "left")alpha

  • 0.1432

NA NA NA LogExcess(age, side = "right")beta 24.8623 NA NA NA LogExcess(age, side = "right")alpha 4.0247 NA NA NA

  • Std. Error is NA where coefficient has been constrained or is unidentified

Residual deviance: 12553 on 31004 degrees of freedom AIC: 12748 Number of iterations: 76

  • H. Turner, D. Firth and A. Batchelor

() Custom “nonlin” Functions UseR August 2008 9 / 17

slide-10
SLIDE 10

Example ‘Recoil’ Plot

10 20 30 40 50 0.00 0.04 0.08 0.12 Age Probability of Marriage

  • H. Turner, D. Firth and A. Batchelor

() Custom “nonlin” Functions UseR August 2008 10 / 17

slide-11
SLIDE 11

Example ‘Recoil’ Plot

10 20 30 40 50 0.00 0.04 0.08 0.12 Age Probability of Marriage 10 20 30 40 50 0.00 0.04 0.08 0.12 Age Probability of Marriage

  • H. Turner, D. Firth and A. Batchelor

() Custom “nonlin” Functions UseR August 2008 10 / 17

slide-12
SLIDE 12

Example ‘Recoil’ Plot

10 20 30 40 50 0.00 0.04 0.08 0.12 Age Probability of Marriage 10 20 30 40 50 0.00 0.04 0.08 0.12 Age Probability of Marriage 10 20 30 40 50 0.00 0.04 0.08 0.12 Age Probability of Marriage

  • H. Turner, D. Firth and A. Batchelor

() Custom “nonlin” Functions UseR August 2008 10 / 17

slide-13
SLIDE 13

Re-parameterization

The problem with aliasing can be overcome by re-parameterizing the baseline model: γ − δ

  • (ν − αl) log
  • ν − αl

ageit − αl

  • + δ
  • (αr − ν) log
  • αr − ν

αr − ageit

  • A new nonlin function, Bell, is needed to specify this term
  • H. Turner, D. Firth and A. Batchelor

() Custom “nonlin” Functions UseR August 2008 11 / 17

slide-14
SLIDE 14

Interpretation of Parameters

The parameters of the new parameterization have a more useful interpretation than before:

Age (years) Probability of Marriage αL ν αR expit(γ)

  • H. Turner, D. Firth and A. Batchelor

() Custom “nonlin” Functions UseR August 2008 12 / 17

slide-15
SLIDE 15

Recoil Plots for Reparameterised Model

  • γ

−2.09 → −1.95 0.00 0.05 0.10 0.15

  • ν

25.39 → → 28

  • δ

0.34 → 0.15

  • αL

−0.14 → → 0.035 10 20 30 40 50

  • αR

4.02 → 20 0.00 0.05 0.10 0.15 10 20 30 40 50

  • Original Model

Perturbed Model Re−fitted Model Age (years) Probability of Marriage

  • H. Turner, D. Firth and A. Batchelor

() Custom “nonlin” Functions UseR August 2008 13 / 17

slide-16
SLIDE 16

Summary of Re-parameterized Model

Call: gnm(formula = marriages/lives ~ Bell(age), family = binomial, data = fulldata, weights = lives, start = c(NA, 26, 0.2, NA, NA)) Deviance Residuals: Min 1Q Median 3Q Max

  • 0.8098
  • 0.4441
  • 0.3224
  • 0.1528

4.0483 Coefficients: Estimate Std. Error z value Pr(>|z|) (Intercept)

  • 2.09446

0.03313 -63.225 < 2e-16 Bell(age)peak 25.39354 0.30405 83.517 < 2e-16 Bell(age)fallOff 0.32917 0.09065 3.631 0.000282 Bell(age)leftAdj

  • 0.14321

0.89350

  • 0.160 0.872663

Bell(age)rightAdj 4.02470 1.73753 2.316 0.020540 (Dispersion parameter for binomial family taken to be 1) Residual deviance: 12553 on 31004 degrees of freedom AIC: 12748 Number of iterations: 14

  • H. Turner, D. Firth and A. Batchelor

() Custom “nonlin” Functions UseR August 2008 14 / 17

slide-17
SLIDE 17

Further Analysis

We can write a simple function to compute the endpoints and their standard errors

> BellEndpoints(bell.mod) [,1] [,2] left 14.17508 0.7742838 right 100.92183 97.2381849

Adding theoretically important covariates produces even higher estimate of right endpoint

◮ use simpler model with infinite right endpoint

Residual analysis suggests location of hazard depends on education level

◮ extend model to allow ν to depend on covariates

  • H. Turner, D. Firth and A. Batchelor

() Custom “nonlin” Functions UseR August 2008 15 / 17

slide-18
SLIDE 18

Final Model

For women born in 1950

15 20 25 30 35 40 45 0.00 0.05 0.10 0.15 0.20 Age (years) Probability of marriage 10 20 30 40 50 0.0 0.2 0.4 0.6 0.8 1.0 Age (years) Proportion never married

Education level (years)

8 11 14 15 16 17

  • H. Turner, D. Firth and A. Batchelor

() Custom “nonlin” Functions UseR August 2008 16 / 17

slide-19
SLIDE 19

References/Acknowledgements

More information about gnm can be found on www.warwick.ac.uk/go/gnm A working paper on the marriage application is available at www.warwick.ac.uk/go/crism/research/2007 The marriage data are from The Economic and Social Research Institute Living in Ireland Survey Microdata File ( c Economic and Social Research Institute). We gratefully acknowledge Carmel Hannan for introducing us to this application and providing background on the data.

  • H. Turner, D. Firth and A. Batchelor

() Custom “nonlin” Functions UseR August 2008 17 / 17