Assessing Proximal and Lagged Moderated Effects in Mobile Health

slide-1
SLIDE 1

Assessing Proximal and Lagged Moderated Effects in Mobile Health

Joint Statistical Meetings Health Policy Statistics Section (Invited) Chicago, IL

August 2, 2016

Audrey Boruvka,1 Daniel Almirall,1 Katie Witkiewitz,2 and Susan A. Murphy1

1University of Michigan and 2University of New Mexico

slide-2
SLIDE 2

Outline

  • 1. Three examples: BASICS Mobile, HeartSteps and Sense2Stop
  • 2. What does data from a micro-randomized trial look like?
  • 3. Proximal and lagged moderated effects
  • 4. Estimating the proximal and lagged moderated effects
  • 5. Simulation experiments
  • 6. A data example using BASICS Mobile

1 / 29

slide-3
SLIDE 3

BASICS-Mobile Example (College Drinking)

PI: Katie Witkiewitz

Smartphone-based intervention to curb heavy drinking and smoking in college students

  • Data Collected: EMA up to 3x/day (morning, afternoon, evening)
  • Intervention Frequency: Up to 2x/day (afternoon, evening)
  • Intervention Content: Mindfulness-based message vs general health information (GHI); binary treatment
  • Intervention Availability: Based on answering an EMA
  • Typical Question: Is the effect of providing a mindfulness-based intervention (vs GHI) on subsequent smoking rate moderated by an increase in the need to self-regulate?

2 / 29

slide-4
SLIDE 4

HeartSteps Example (Physical Activity)

PI: Pedja Klasnja

Wearable activity-tracker + smartphone-based intervention to encourage physical activity

  • Data Collected: "Continuously" + EMA each evening
  • Intervention Frequency: Up to 5x/day (before work, lunch, 2pm, after work, evening)
  • Intervention Content: Delivers vs does not deliver (binary treatment) a contextually relevant activity suggestion via the smartphone
  • Intervention Availability: Not in a vehicle, not exercising, app not "snoozed", phone on
  • Typical Question: Does time of day or busyness influence the effect of suggesting an activity on step count?

3 / 29

slide-5
SLIDE 5

Sense2Stop Example (Smoking Cessation)

PI: Bonnie Spring

Wearable chest-strap + wrist-band + smartphone-based intervention to sense stress and reduce smoking

  • Data Collected: "Continuously" + EMA
  • Intervention Frequency: 3x/day on average, with a 50% chance of happening when stressed and a 50% chance of happening when not stressed
  • Intervention Content: Deliver or not deliver (binary treatment) a prompt via smartphone to use one of 3 stress-management apps
  • Intervention Availability: Not in a vehicle, ≥ 60 min since intervention, ≥ 10 min since EMA, cannot have uncertain stress classification, phone on
  • Typical Question: Will delivering the message be more effective than not delivering the message in times of stress? In times of no stress? Or equally effective in either?

4 / 29

slide-6
SLIDE 6

Data from a Micro-randomized Trial

  • $t$: treatment occasion
  • $X_t$: individual and contextual characteristics at $t$
  • $A_t$: binary treatment at $t$
  • $Y_{t+1}$: continuous response following $t$ and before $t + 1$
  • $H_t$: history through $t$, $(\bar X_t, \bar Y_t, \bar A_{t-1})$

Data in temporal order looks like this:

$$\underbrace{X_1, A_1, Y_2, \ldots, X_t}_{H_t},\; A_t,\; Y_{t+1},\; \ldots,\; X_T, A_T, Y_{T+1}$$

$\rho_t(1 \mid H_t)$ is the known randomization probability $P(A_t = 1 \mid H_t)$ that generates $A_t$.
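To make the layout concrete, here is a minimal Python sketch of one participant's record in temporal order, together with a helper that assembles $H_t$. The names `Occasion`, `simulate_person` and `history`, the randomization rule and the outcome model are all illustrative assumptions, not taken from any of the three trials.

```python
from dataclasses import dataclass
import random


@dataclass
class Occasion:
    t: int         # treatment occasion index (1-based)
    x: float       # X_t: context observed at occasion t
    rho: float     # rho_t(1 | H_t): known randomization probability
    a: int         # A_t: binary treatment drawn with probability rho
    y_next: float  # Y_{t+1}: response after t and before t + 1


def simulate_person(T, seed=0):
    """Generate one person's record X_1, A_1, Y_2, ..., X_T, A_T, Y_{T+1}."""
    rng = random.Random(seed)
    rows, prev_a = [], 0
    for t in range(1, T + 1):
        x = rng.gauss(0.0, 1.0)
        # Hypothetical rule: the probability depends on H_t only through A_{t-1}.
        rho = 0.4 if prev_a == 1 else 0.6
        a = 1 if rng.random() < rho else 0
        # Hypothetical outcome model, used only to fill in Y_{t+1}.
        y_next = 0.5 * x + 0.3 * a + rng.gauss(0.0, 1.0)
        rows.append(Occasion(t, x, rho, a, y_next))
        prev_a = a
    return rows


def history(rows, t):
    """Return H_t = (X_1..X_t, Y_2..Y_t, A_1..A_{t-1}) as three lists."""
    xs = [r.x for r in rows[:t]]
    ys = [r.y_next for r in rows[:t - 1]]
    past_as = [r.a for r in rows[:t - 1]]
    return xs, ys, past_as
```

Storing $\rho_t(1 \mid H_t)$ next to each $A_t$ is the design choice that matters here: the probability is known by design, so it can be logged at randomization time rather than estimated later.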

5 / 29

slide-7
SLIDE 7

Example Data Structure

BASICS Mobile

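[Figure: timeline of BASICS Mobile occasions (..., Morning, Afternoon, Evening, Morning, ...) marking treatments $A_{t-1}$ and $A_t$ at occasions $t - 1$ and $t$, covariates $X_{t-1}$ and $X_t$, and response $Y_{t+1}$.]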

6 / 29

slide-8
SLIDE 8

Proximal moderated effect

$A_t$ on $Y_{t+1}$

  • $Y_{t+1}(\bar a_t)$: response, had the treatments $\bar a_t$ been provided
  • $S_{1t}(\bar a_{t-1})$: vector of candidate moderators from the history through $t$, $H_t$, had the treatments $\bar a_{t-1}$ been provided

The proximal treatment effect is

$$E\left[ Y_{t+1}(\bar A_{t-1}, 1) - Y_{t+1}(\bar A_{t-1}, 0) \mid S_{1t}(\bar A_{t-1}) \right].$$

You can think of $S_{1t}(\bar a_{t-1})$ as a "state" of particular interest.

7 / 29

slide-9
SLIDE 9

Proximal moderated effect

$A_t$ on $Y_{t+1}$

A proximal treatment effect is

$$E\left[ Y_{t+1}(\bar A_{t-1}, 1) - Y_{t+1}(\bar A_{t-1}, 0) \mid S_{1t}(\bar A_{t-1}) \right].$$

  • $S_{1t}(\bar a_{t-1})$ is low-dimensional and pre-selected by the scientist.
  • It can be the "empty set".
  • It can include time trends.
  • The proximal effect is averaged over any variables in $H_t$ not represented in $S_{1t}$.
  • The definition depends on the distribution of (past) treatments in the data.

8 / 29

slide-10
SLIDE 10

Lagged moderated effect

$A_t$ on $Y_{t+2}$

A lagged treatment effect is

$$E\left[ Y_{t+2}(\bar A_{t-1}, 1, A_{t+1}^{a_t=1}) - Y_{t+2}(\bar A_{t-1}, 0, A_{t+1}^{a_t=0}) \mid S_{2t}(\bar A_{t-1}) \right],$$

where $A_{t+1}^{a_t=a} = A_{t+1}(\bar A_{t-1}, a)$.

  • $S_{2t}(\bar a_{t-1})$ is again low-dimensional and pre-selected by the scientist.
  • The delayed effect is averaged over any variables in $H_t$ not represented in $S_{2t}$.
  • The delayed effect is averaged over the future treatment $A_{t+1}^{a_t}$.

Here, lag = 2.

9 / 29

slide-11
SLIDE 11

General case: Lag k treatment effects

$$E\left[ Y_{t+k}(\bar A_{t-1}, 1, A_{t+1}^{a_t=1}, \ldots, A_{t+k-1}^{a_t=1}) \mid S_{kt}(\bar A_{t-1}) \right] - E\left[ Y_{t+k}(\bar A_{t-1}, 0, A_{t+1}^{a_t=0}, \ldots, A_{t+k-1}^{a_t=0}) \mid S_{kt}(\bar A_{t-1}) \right],$$

where $A_{t+1}^{a_t=a}$ denotes $A_{t+1}(\bar A_{t-1}, a)$, $A_{t+2}^{a_t=a}$ denotes $A_{t+2}(\bar A_{t-1}, a, A_{t+1}(\bar A_{t-1}, a))$, and so on.

$S_{kt}(\bar a_{t-1})$ is again low-dimensional and pre-selected by the scientist for examining the lag $k$ effect.

10 / 29

slide-12
SLIDE 12

Identification (Effects in terms of observed data)

Under sequential randomization, consistency and positivity assumptions, the proximal treatment effect is

$$E\left[ Y_{t+1}(\bar A_{t-1}, 1) - Y_{t+1}(\bar A_{t-1}, 0) \mid S_{1t}(\bar A_{t-1}) \right]$$

$$= E\left[ E[Y_{t+1} \mid A_t = 1, H_t] - E[Y_{t+1} \mid A_t = 0, H_t] \mid S_{1t} \right]$$

$$= E\left[ \frac{I(A_t = 1)\, Y_{t+1}}{\rho_t(1 \mid H_t)} - \frac{I(A_t = 0)\, Y_{t+1}}{1 - \rho_t(1 \mid H_t)} \,\middle|\, S_{1t} \right],$$

where $\rho_t(1 \mid H_t) = \Pr(A_t = 1 \mid H_t)$ is the probability used to randomize sequentially. Lagged treatment effects can be identified similarly.
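The last equality can be checked numerically. The following is a minimal Python sketch under an assumed toy data-generating process: a binary moderator `s`, one extra history variable `x` that drives both the randomization probability and the outcome, and a true effect of 0.5 + 0.3·s. All of these choices, and the function names, are hypothetical. The inverse-probability-weighted average within each level of `s` should land close to the true conditional effect.

```python
import math
import random


def expit(z):
    return 1.0 / (1.0 + math.exp(-z))


def simulate(n, seed=0):
    """Draw n independent (S, rho, A, Y) tuples from a hypothetical toy model."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        s = rng.choice([-1, 1])          # candidate moderator S_1t
        x = rng.gauss(0.0, 1.0)          # part of H_t that is not in S_1t
        rho = expit(0.8 * s + 0.5 * x)   # known rho_t(1 | H_t)
        a = 1 if rng.random() < rho else 0
        y = 1.0 + x + a * (0.5 + 0.3 * s) + rng.gauss(0.0, 1.0)
        data.append((s, rho, a, y))
    return data


def ipw_effect(data, s_value):
    """Average of I(A=1)Y/rho - I(A=0)Y/(1-rho) among rows with S = s_value."""
    terms = [a * y / rho - (1 - a) * y / (1.0 - rho)
             for s, rho, a, y in data if s == s_value]
    return sum(terms) / len(terms)
```

With a few hundred thousand draws, `ipw_effect(data, 1)` should be near 0.8 and `ipw_effect(data, -1)` near 0.2 in this toy model, even though `x` is never conditioned on.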

11 / 29

slide-13
SLIDE 13

The Notion of Availability

Not all individuals are available for treatment at all time points (e.g., Wang et al. 2012; Robins 2004). For simplicity, we define this in terms of the observed data, with $I_t = 1$ indicating availability at occasion $t$:

$$E\left[ E[Y_{t+k} \mid A_t = 1, I_t = 1, H_t] \mid I_t = 1, S_{kt} \right] - E\left[ E[Y_{t+k} \mid A_t = 0, I_t = 1, H_t] \mid I_t = 1, S_{kt} \right]$$

$$= E\left[ \frac{1(A_t = 1)\, Y_{t+k}}{\rho_t(1 \mid H_t)} - \frac{1(A_t = 0)\, Y_{t+k}}{1 - \rho_t(1 \mid H_t)} \,\middle|\, I_t = 1, S_{kt} \right].$$

Note that $I_t = 1$ is not a static subpopulation; we expect prior treatment to affect it.

12 / 29

slide-14
SLIDE 14

Modeling assumptions

We consider linear models for each lag $k$ effect:

$$E\left[ E[Y_{t+k} \mid A_t = 1, I_t = 1, H_t] \mid I_t = 1, S_{kt} \right] - E\left[ E[Y_{t+k} \mid A_t = 0, I_t = 1, H_t] \mid I_t = 1, S_{kt} \right] = f_k(t, S_{kt})^\top \beta_k.$$

  • Recall $k = 1$ is the proximal effect.
  • These models do not constrain each other across $k$ (Robins, Rotnitzky and Scharfstein 2000, Theorem 8.6).
  • We assume these treatment effect models are correct.

13 / 29

slide-15
SLIDE 15

Estimation

What would we like in an estimator?

Recall $H_t$ is high-dimensional (especially in mobile health!). Our goal was to develop a

  • Ease of use: familiar and easy-to-use estimation method that allows the scientist to
  • Parsimony: examine proximal or lagged effects of $A_t$ conditional on any $S_{kt}$, a low-dimensional subset of $H_t$,
  • Efficiency: while incorporating working knowledge about the association of $H_t$ and $Y_{t+k}$ for statistical power,
  • Robustness: yet not requiring this working knowledge to be correct, which can be difficult or impossible!

14 / 29

slide-16
SLIDE 16

Weighted & Centered least squares

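Fix $k$. Just think of the $W_t$-weighted regression

$$Y_{t+k} \sim g_{kt}(H_t) + \tilde A_t f_k(t, S_{kt}),$$

where

  • $\tilde A_t = A_t - \tilde p_t(1 \mid S_{kt})$ is the centered treatment,
  • $W_t = \left( \frac{\tilde p_t(1 \mid S_{kt})}{\rho_t(1 \mid H_t)} \right)^{A_t} \left( \frac{1 - \tilde p_t(1 \mid S_{kt})}{1 - \rho_t(1 \mid H_t)} \right)^{1 - A_t}$ is the weight,
  • $g_{kt}(H_t)^\top \alpha_k$ is a working model for $E[W_t Y_{t+k} \mid H_t]$.

Formally, solve for $(\alpha_k, \beta_k)$ in $0 = \mathbb{P}_n U_W(\alpha_k, \beta_k)$, where $\mathbb{P}_n$ is the average over the $n$ participants and

$$U_W = \sum_{t=1}^{T-k+1} \left( Y_{t+k} - g_{kt}(H_t)^\top \alpha_k - \tilde A_t f_k(t, S_{kt})^\top \beta_k \right) W_t \begin{pmatrix} g_{kt}(H_t) \\ \tilde A_t f_k(t, S_{kt}) \end{pmatrix}.$$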
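Because the estimating function is linear in $(\alpha_k, \beta_k)$, the solution is a weighted least-squares fit on the stacked person-occasions. Below is a minimal NumPy sketch of the point estimate only. The function name `wcls` and the argument layout are assumptions made for illustration, the rows are assumed to be already restricted to $t = 1, \ldots, T - k + 1$, and standard errors are not computed.

```python
import numpy as np


def wcls(y, a, rho, p_tilde, g, f):
    """Weighted and centered least squares for one fixed lag k.

    Each row is one person-occasion, stacked over all participants.
      y       (N,)   responses Y_{t+k}
      a       (N,)   binary treatments A_t
      rho     (N,)   randomization probabilities rho_t(1 | H_t)
      p_tilde (N,)   numerator probabilities p~_t(1 | S_kt)
      g       (N, q) working-model features g_kt(H_t)
      f       (N, p) treatment-effect features f_k(t, S_kt)
    Returns (alpha_hat, beta_hat).
    """
    # W_t: ratio of numerator to denominator probability of the observed A_t.
    w = np.where(a == 1, p_tilde / rho, (1.0 - p_tilde) / (1.0 - rho))
    # Centered treatment A~_t = A_t - p~_t(1 | S_kt).
    a_centered = a - p_tilde
    design = np.hstack([g, a_centered[:, None] * f])
    weighted = design * w[:, None]
    # Solve sum_t (Y - D theta) W D = 0, i.e. (D' W D) theta = D' W Y.
    theta = np.linalg.solve(design.T @ weighted, weighted.T @ y)
    q = g.shape[1]
    return theta[:q], theta[q:]
```

As a usage example, passing an intercept-only `g` (a deliberately crude working model) and `f` with columns (1, $S_t$) returns a two-element `beta_hat` for the intercept and moderation slope of the treatment effect.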

15 / 29

slide-17
SLIDE 17

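Weighted & Centered least squares vs Usual GEE

Our proposed weighted and centered estimating function

$$\sum_{t=1}^{T-k+1} \left( Y_{t+k} - g_{kt}(H_t)^\top \alpha_k - \tilde A_t f_k(t, S_{kt})^\top \beta_k \right) W_t \begin{pmatrix} g_{kt}(H_t) \\ \tilde A_t f_k(t, S_{kt}) \end{pmatrix}$$

versus standard, traditional GEE for longitudinal data analyses

$$\sum_{t=1}^{T-k+1} \left( Y_{t+k} - g_{kt}(H_t)^\top \alpha_k - A_t f_k(t, S_{kt})^\top \beta_k \right) \begin{pmatrix} g_{kt}(H_t) \\ A_t f_k(t, S_{kt}) \end{pmatrix},$$

but this requires

$$E[W_t Y_{t+k} \mid H_t] = g_{kt}(H_t)^\top \alpha_k + A_t f_k(t, S_{kt})^\top \beta_k;$$

we don't!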

16 / 29

slide-18
SLIDE 18

Implementation is easy

  • Estimation can be implemented with standard GEE software.
  • Availability? Just replace $W_t$ with $I_t W_t$.
  • Only the independence working correlation structure may be employed. Alternative structures induce bias.
  • Extra code (available in R) is needed for SEs with estimated (i) numerator or (ii) denominator of the weights.
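The weight that would be handed to GEE software can be computed row by row. This is a small Python sketch with a hypothetical helper name, `wcls_weight`; it returns $I_t W_t$, so unavailable occasions receive weight zero and drop out of the estimating equation.

```python
def wcls_weight(a, rho, p_tilde, available=1):
    """Return I_t * W_t for a single person-occasion.

    a         observed binary treatment A_t
    rho       randomization probability rho_t(1 | H_t)
    p_tilde   numerator probability p~_t(1 | S_kt)
    available availability indicator I_t (1 = available, 0 = not)
    """
    if not available:
        # No randomization took place, so the occasion contributes nothing.
        return 0.0
    if not (0.0 < rho < 1.0 and 0.0 < p_tilde < 1.0):
        raise ValueError("probabilities must lie strictly between 0 and 1")
    if a == 1:
        return p_tilde / rho
    return (1.0 - p_tilde) / (1.0 - rho)
```

For example, a treated occasion with `rho = 0.5` and `p_tilde = 0.67` gets weight 0.67 / 0.5 = 1.34.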

17 / 29

slide-19
SLIDE 19

Simulation Experiment

Omitting an underlying moderator variable induces bias in standard GEE but not in our proposed Weighting and Centering Estimator.

18 / 29

slide-20
SLIDE 20

Simulation Experiment with n = T = 30

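$$Y_{t+1} = 0.8 (S_t - 0.5) + (A_t - \rho_t(1 \mid H_t)) (-0.2 + \beta^*_{11} S_t) + \epsilon_{t+1}$$

  • $\epsilon_t \sim N(0, 1)$ with $\mathrm{Corr}(\epsilon_u, \epsilon_t) = 0.5^{|u - t|}$
  • $\rho_t(1 \mid H_t) = \mathrm{expit}(-0.8 A_{t-1} + 0.8 S_t)$ for $S_t \in \{-1, 1\}$
  • $\Pr(S_t = 1 \mid A_{t-1}, H_{t-1}) = 0.5$

Proximal effect conditional on $S_t$:

$$E\left[ E[Y_{t+1} \mid A_t = 1, H_t] - E[Y_{t+1} \mid A_t = 0, H_t] \mid S_t \right] = -0.2 + \beta^*_{11} S_t$$

Marginal effect:

$$E\left[ E[Y_{t+1} \mid A_t = 1, H_t] - E[Y_{t+1} \mid A_t = 0, H_t] \right] = -0.2 + \beta^*_{11} E[S_t] = -0.2$$

We will vary $\beta^*_{11}$ and compare with GEE.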
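The generative model above can be sketched in Python as follows. This is an illustrative reading of the slide, not the authors' simulation code: the initial treatment $A_0 = 0$, the AR(1) construction of the errors (which yields correlation $0.5^{|u-t|}$ with unit variance), the default `beta11 = 0.5` and the function names are all assumptions.

```python
import math
import random


def expit(z):
    return 1.0 / (1.0 + math.exp(-z))


def simulate_person(rng, T=30, beta11=0.5):
    """Simulate one participant's T occasions from the model on this slide."""
    rows = []
    a_prev = 0                   # assumption: A_0 = 0
    eps = rng.gauss(0.0, 1.0)    # stationary start, Var(eps) = 1
    for t in range(1, T + 1):
        s = rng.choice([-1, 1])  # Pr(S_t = 1 | A_{t-1}, H_{t-1}) = 0.5
        rho = expit(-0.8 * a_prev + 0.8 * s)
        a = 1 if rng.random() < rho else 0
        # AR(1) update keeps unit variance and lag-one correlation 0.5.
        eps = 0.5 * eps + math.sqrt(1.0 - 0.25) * rng.gauss(0.0, 1.0)
        y = 0.8 * (s - 0.5) + (a - rho) * (-0.2 + beta11 * s) + eps
        rows.append({"t": t, "s": s, "rho": rho, "a": a, "y": y})
        a_prev = a
    return rows


def simulate_trial(n=30, T=30, beta11=0.5, seed=0):
    """Simulate n independent participants, each with T occasions."""
    rng = random.Random(seed)
    return [simulate_person(rng, T, beta11) for _ in range(n)]
```

Each row keeps `rho` alongside `a`, so the weights for the weighted and centered estimator can be formed directly from the simulated records.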

19 / 29

slide-21
SLIDE 21

Experiment: Omitting an Underlying Moderator

Results

Similar results for varying levels of n, T, marginal proximal effect sizes and type of residual correlation structures in the generative model.

20 / 29

slide-22
SLIDE 22

3 Easy, Take Home Messages

  • We proposed a new estimand, particularly useful to behavioral intervention scientists working in mHealth.
  • Weighting and centering let us target causal effects that are marginal over $H_t$ and are robust to mis-specification of $E[W_t Y_{t+k} \mid H_t]$.
  • Easy implementation using "over the counter" GEE software.

21 / 29

slide-23
SLIDE 23

Recall BASICS Mobile Example

Smartphone-based intervention to curb heavy drinking and smoking in college students

  • Data Collected: EMA up to 3x/day (morning, afternoon, evening)
  • Intervention Frequency: Up to 2x/day (afternoon, evening)
  • Intervention Content: Mindfulness-based message vs general health information (binary)
  • Intervention Availability: Based on answering an EMA
  • Typical Question: Is the effect of providing a mindfulness-based intervention (vs general health information) on subsequent smoking rate moderated by an increase in the need to self-regulate?

22 / 29

slide-24
SLIDE 24

BASICS Mobile Data Example

  • $A_t$: indicator that the user received a mindfulness-based message
  • $Y_{t+1}$: smoking rate reported at the EMA following $A_t$
  • $k$: examined a proximal ($k = 1$) and a delayed ($k = 2$) effect
  • $S_{1t}$: indicator of increased self-regulation from $t - 1$ to $t$
  • $S_{2t}$: the empty set (marginal delayed effect)
  • $p_t(1 \mid I_t = 1, H_t)$: mindfulness treatment more likely if urge is high or past smoking is high
  • $\tilde p_t(1 \mid I_t = 1, S_t)$: estimated using $\mathbb{P}_n \sum_{t=1}^{T} A_t / T = 0.67$

Treatment effect           Estimate   SE     95% CI            p-value
Proximal, ↑ self-reg       −0.05      0.94   (−2.03, 1.93)     0.96
Proximal, no ↑ self-reg    −2.78      1.27   (−5.47, −0.10)    0.04
Delayed                    −0.47      0.60   (−1.74, 0.80)     0.45

23 / 29

slide-25
SLIDE 25

Future Work

  • Apply these methods with HeartSteps and Smoking Cessation Data
  • How best to include random effects?
  • Variable selection (e.g., penalization) for the covariates used in the working model
  • Is there a scientific rationale for sharing parameters across $k$ (proximal and lagged treatment effects)?

24 / 29

slide-26
SLIDE 26

Thank you!

Daniel Almirall, dalmiral@umich.edu Institute for Social Research, University of Michigan

25 / 29

slide-27
SLIDE 27

Extra Slides

Extra Slides Follow.

26 / 29

slide-29
SLIDE 29

Extension of the Structural Nested Mean Model

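Treatment "blip" of $a_t$ versus stochastic $A_t(\bar a_{t-1})$ on $Y_{t+k}$ is

$$\mu_{t,t+k}(H_t(\bar A_{t-1}), \bar A_{t-1}, a_t) = E\left[ Y_{t+k}(\bar A_{t-1}, a_t, A_{t+1}^{a_t=a}, \ldots, A_{t+k-1}^{a_t=a}) \mid H_t(\bar A_{t-1}) \right] - E\left[ Y_{t+k}(\bar A_{t-1}, A_t, A_{t+1}, \ldots, A_{t+k-1}) \mid H_t(\bar A_{t-1}) \right]$$

$$= E\left[ Y_{t+k}(\bar A_t) \mid A_t = a_t, H_t(\bar A_{t-1}) \right] - E\left[ Y_{t+k}(\bar A_t) \mid H_t(\bar A_{t-1}) \right] \quad \text{(by Seq. Ign. \& Cons.)}$$

Our lagged effects are expected contrasts of the "blips":

$$E\left[ \mu_{t,t+k}(H_t(\bar A_{t-1}), \bar A_{t-1}, 1) \mid S_{kt}(\bar A_{t-1}) \right] - E\left[ \mu_{t,t+k}(H_t(\bar A_{t-1}), \bar A_{t-1}, 0) \mid S_{kt}(\bar A_{t-1}) \right]$$

$$= E\left[ Y_{t+k}(\bar A_{t-1}, 1, A_{t+1}^{a_t=1}, \ldots, A_{t+k-1}^{a_t=1}) \mid S_{kt}(\bar A_{t-1}) \right] - E\left[ Y_{t+k}(\bar A_{t-1}, 0, A_{t+1}^{a_t=0}, \ldots, A_{t+k-1}^{a_t=0}) \mid S_{kt}(\bar A_{t-1}) \right].$$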

27 / 29

slide-30
SLIDE 30

Modeling assumptions: Example 1

Recall

$$E\left[ E[Y_{t+k} \mid A_t = 1, I_t = 1, H_t] \mid I_t = 1, S_{kt} \right] - E\left[ E[Y_{t+k} \mid A_t = 0, I_t = 1, H_t] \mid I_t = 1, S_{kt} \right] = f_k(t, S_{kt})^\top \beta_k.$$

Suppose $S_{kt}$ is the null set. Here, the analyst is interested in a marginal effect that could vary over time; for example,

$$E\left[ E[Y_{t+k} \mid A_t = 1, I_t = 1, H_t] \mid I_t = 1 \right] - E\left[ E[Y_{t+k} \mid A_t = 0, I_t = 1, H_t] \mid I_t = 1 \right] = f_k(t)^\top \beta_k,$$

where $f_k(t)$ could be any function of time (e.g., a basis function).

28 / 29

slide-31
SLIDE 31

Modeling assumptions: Example 2

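Recall

$$E\left[ E[Y_{t+k} \mid A_t = 1, I_t = 1, H_t] \mid I_t = 1, S_{kt} \right] - E\left[ E[Y_{t+k} \mid A_t = 0, I_t = 1, H_t] \mid I_t = 1, S_{kt} \right] = f_k(t, S_{kt})^\top \beta_k.$$

Suppose $S_{kt} = \mathrm{Stress}_t$ is binary. Here, an example model is

$$E\left[ E[Y_{t+k} \mid A_t = 1, I_t = 1, H_t] \mid I_t = 1, \mathrm{Stress}_t \right] - E\left[ E[Y_{t+k} \mid A_t = 0, I_t = 1, H_t] \mid I_t = 1, \mathrm{Stress}_t \right] = z_k(t)^\top \beta_{k,1} + \mathrm{Stress}_t\, z_k(t)^\top \beta_{k,2},$$

where $z_k(t)$ is some function of time. Here, $f_k(t, S_{kt})^\top = (z_k(t)^\top, \mathrm{Stress}_t\, z_k(t)^\top)$.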
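As a concrete illustration of how such feature vectors might be built, here is a short Python sketch. The time basis `z_k` (an intercept plus a linear trend rescaled to [0, 1]) is a hypothetical choice, as are the function names and the coefficient values in the usage note; any other function of time could take its place.

```python
def z_k(t, T):
    """Hypothetical time basis: intercept and linear trend scaled to [0, 1]."""
    return [1.0, (t - 1) / (T - 1)]


def f_marginal(t, T):
    """Features when S_kt is the null set: f_k(t) = z_k(t)."""
    return z_k(t, T)


def f_stress(t, T, stress):
    """Features when S_kt = Stress_t: (z_k(t), Stress_t * z_k(t))."""
    z = z_k(t, T)
    return z + [stress * v for v in z]


def effect(features, beta):
    """Modelled treatment effect f' beta for one occasion."""
    return sum(fi * bi for fi, bi in zip(features, beta))
```

With made-up coefficients `beta = [-0.2, 0.0, 0.5, 0.0]`, `effect(f_stress(1, 30, 1), beta)` gives the modelled effect at the first occasion under stress, and setting `stress = 0` drops the last two terms.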

29 / 29