
Power Analysis for Logistic Regression Models Fit to Clustered Data: Choosing the Right Rho

CAPS Methods Core Seminar
Steve Gregorich
May 16, 2014

Abstract

Context
- Power analyses for logistic regression models fit to clustered data

Approach
- estimate the effective sample size ($N_{eff}$: the cluster-adjusted total sample size)
- input $N_{eff}$ into standard power analysis routines for independent observations

Wrinkle
- in the context of logistic regression there are two general approaches to estimating the intra-cluster correlation of y:
  - a phi-type coefficient and
  - a tetrachoric-type coefficient

Resolution
- The phi-type coefficient should be used when calculating $N_{eff}$

I will present background on this topic as well as some simulation results.

Simple random sampling (SRS)
- Fully random selection of participants, e.g., start with a list and select N units at random
- Some key features wrt statistical inference:
  - representativeness: all units have equal probability of selection
  - all sampled units can be considered to be independent of one another
- SRS with replacement versus without replacement

Clustered sampling
- Random sample of m clusters; random sample of n units within each cluster
  - multi-stage area sampling
  - patients within clinics
- Repeated measures
  - random sample of m respondents; n repeated measures are taken
  - repeated measures are clustered within respondents
- Typically, elements within the same cluster are more similar to each other than elements from different clusters
- The n units within a cluster usually do not contain the same amount of information wrt some parameter, $\theta$, as the same number of units in an SRS sample
  …the concept of effective sample size, $N_{eff}$…
- Therefore, it is usually true that

  $$\sigma^2_{clus}(\hat\theta) \neq \sigma^2_{srs}(\hat\theta)$$

Two-stage clustered sampling design

Unless otherwise noted, I assume
- Clustered sampling of m clusters, each with n units: N = m × n
- Normally distributed, unit-standardized x; binary y
  - exchangeable / compound symmetric correlation structure
  - $\rho_y > 0$: intra-cluster correlation of the y (outcome) response
  - $\rho_x = 0$ or $1$: intra-cluster correlation of the x (explanatory variable) response
- Regression of y onto x via
  - a mixed logistic model with random cluster intercepts or
  - a GEE logistic model
- Common effects of x across clusters, i.e., no random slopes for x
- Common between- and within-cluster effects of x

The design effect, deff
- deff can be thought of as a design-attributable multiplicative change in variation that results from choice of a clustered sampling versus an SRS design

  $$\mathit{deff} = \frac{\sigma^2_{clus}(\hat\theta)}{\sigma^2_{srs}(\hat\theta)} \quad\text{and}\quad \hat N_{eff} = \frac{N}{\mathit{deff}},$$

  where
  - $\sigma^2_{clus}(\hat\theta)$ is the estimated parameter variation given a clustered sampling design;
  - $\sigma^2_{srs}(\hat\theta)$ is the estimated parameter variation given an SRS design;
  - N is the common size of the SRS and clustered (N = m × n) samples;
  - $\hat N_{eff}$ is the estimated effective size of the clustered sample wrt information about $\hat\theta$, relative to what would have been obtained with an SRS of size N
- Assumes compound symmetric covariance structure of the response
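The variance ratio behind deff can be checked by simulation. The sketch below is an illustration rather than anything from the talk: it takes the sample mean as the parameter of interest and relies on the textbook result that, under compound symmetry with equal cluster sizes, deff for a mean is $1 + (n - 1)\rho$. The function name `simulate_deff` and all default values (20 clusters of 10 units, variance components 1 and 3, so $\rho = 0.25$) are hypothetical.

```python
import random
import statistics


def simulate_deff(m=20, n=10, var_u=1.0, var_e=3.0, reps=4000, seed=1):
    """Empirical deff for a sample mean: Var(clustered mean) / Var(SRS mean).

    All defaults are illustrative assumptions, not values from the slides.
    """
    rng = random.Random(seed)
    sd_u, sd_e = var_u ** 0.5, var_e ** 0.5
    sd_total = (var_u + var_e) ** 0.5
    clus_means, srs_means = [], []
    for _rep in range(reps):
        # Clustered sample: the n units in cluster j share the random effect u_j.
        total = 0.0
        for _cluster in range(m):
            u_j = rng.gauss(0.0, sd_u)
            total += sum(u_j + rng.gauss(0.0, sd_e) for _unit in range(n))
        clus_means.append(total / (m * n))
        # SRS of the same size N = m * n: independent draws, same total variance.
        srs_draws = [rng.gauss(0.0, sd_total) for _unit in range(m * n)]
        srs_means.append(statistics.fmean(srs_draws))
    return statistics.variance(clus_means) / statistics.variance(srs_means)


if __name__ == "__main__":
    rho = 1.0 / (1.0 + 3.0)
    print("theoretical deff:", 1 + (10 - 1) * rho)
    print("simulated deff:  ", simulate_deff())
```

The simulated ratio fluctuates around the theoretical value from run to run; more replicates tighten the agreement.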

The misspecification effect, meff
- Conceptually similar to deff except that the multiplicative change corresponds to the effect of correctly modeling the clustering of observations versus ignoring the cluster structure

  $$\mathit{meff} = \frac{\sigma^2_{clus}(\hat\theta)}{\sigma^2(\hat\theta \mid \text{clustering ignored})} \quad\text{and}\quad \hat N_{eff} = \frac{N}{\mathit{meff}},$$

  where
  - $\sigma^2_{clus}(\hat\theta)$ is the estimated parameter variation given clustered responses;
  - $\sigma^2(\hat\theta \mid \text{clustering ignored})$ is the estimated parameter variation ignoring clustering of responses;
  - N is the total size of the clustered sample;
  - $\hat N_{eff}$ is the effective size of the clustered sample wrt information about $\hat\theta$, relative to what would have been obtained with an SRS of the same size
- Assumes compound symmetric covariance structure of the response

deff, meff, and the sample size ratio
- A 'context free' label for deff and meff is the sample size ratio, SSR:

  $$SSR = \frac{N}{\hat N_{eff}}$$

- deff, meff, and SSR have equivalent meaning wrt power analysis, but deff and meff are conceptually distinct
  - deff assumes that you are considering SRS versus clustered sampling
  - meff assumes that you have chosen a clustered sampling design and want to make adjustments to an analysis that assumed SRS
- I will use meff for this talk

Estimating meff via the intra-cluster correlation
- Given positive intra-cluster correlation of y, $\rho_y > 0$, the meff estimator depends on $\rho_x$

#1. Level-2 (cluster-level) x variables will have zero within-cluster variation and $\rho_x = 1$
- The intra-cluster correlation of y is the between-cluster share of the total variance:

  $$\rho_y = \frac{\sigma^2_{between}}{\sigma^2_{between} + \sigma^2_{within}}$$

- In this case

  $$\mathit{meff} = \frac{N}{\hat N_{eff}} = 1 + (n - 1)\rho_y$$

- note: when estimating […], assume $\rho_x = 1$
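One way to obtain $\rho_y$ from pilot data is sketched below. It assumes balanced clusters and uses one-way ANOVA (method-of-moments) variance components, which is a common choice but not necessarily the estimator used in the talk. The function names and the toy data are hypothetical.

```python
from statistics import fmean


def anova_icc(clusters):
    """ANOVA estimate of the intra-cluster correlation for balanced clusters.

    `clusters` is a list of equal-length lists of outcome values.
    """
    m = len(clusters)
    n = len(clusters[0]) if m else 0
    if m < 2 or n < 2 or any(len(c) != n for c in clusters):
        raise ValueError("need at least 2 balanced clusters of size >= 2")
    cluster_means = [fmean(c) for c in clusters]
    grand_mean = fmean(cluster_means)
    # Between-cluster and within-cluster mean squares.
    msb = n * sum((cm - grand_mean) ** 2 for cm in cluster_means) / (m - 1)
    msw = sum(
        (y - cm) ** 2 for c, cm in zip(clusters, cluster_means) for y in c
    ) / (m * (n - 1))
    # Truncate a negative between-cluster variance estimate at zero.
    var_between = max((msb - msw) / n, 0.0)
    total = var_between + msw
    if total == 0.0:
        raise ValueError("outcome has no variation")
    return var_between / total


def meff_cluster_level(n, rho_y):
    """meff for a cluster-level x (rho_x = 1): 1 + (n - 1) * rho_y."""
    return 1 + (n - 1) * rho_y


if __name__ == "__main__":
    toy = [[1.0, 2.0], [3.0, 4.0]]  # hypothetical toy data: 2 clusters of 2
    rho = anova_icc(toy)
    print(rho, meff_cluster_level(2, rho))
```

With unbalanced clusters or a binary outcome a different estimator would be needed; this sketch only covers the simplest balanced case.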

Estimating meff via the intra-cluster correlation

#2. Consider a level-1 stochastic x variable with positive within-cluster variation and zero between-cluster variation: $\rho_x = 0$
- As before,

  $$\rho_y = \frac{\sigma^2_{between}}{\sigma^2_{between} + \sigma^2_{within}}$$

- In this case

  $$\mathit{meff} = \frac{N}{\hat N_{eff}} \approx 1 - \rho_y^{\,n/(n-1)}$$

- note: $n/(n-1) \to 1$ as $n \to \infty$

(for level-1 x variables with $0 < \rho_x < 1$, see my March 2010 CAPS Methods Core talk)

Power analysis for clustered sampling designs using meff: Option 1

Option 1. Given a chosen model, power, and alpha level, plus a proposed clustered sample of size N = m × n and a meff estimate,
- estimate $\hat N_{eff} = N / \mathit{meff}$
- use standard power analysis software, plug in $\hat N_{eff}$ (instead of N), and estimate the remaining quantity of interest (power, in the example that follows)

Power analysis for clustered sampling designs using meff: Option 1 Example

Estimate power by simulation
- Simulate data from a CRT with 100 clusters (j) and 30 individuals per cluster (i):

  $$y_{ij} = 0.5 \times \mathit{group}_j + u_j + e_{ij}$$

  where $VAR(u_j) = VAR(e_{ij}) = 1$, $VAR(u_j) + VAR(e_{ij}) = 2$, and

  $$\rho_y = \frac{\sigma^2_u}{\sigma^2_u + \sigma^2_e} = 0.50$$

- Linear mixed model results from analysis of 2000 replicate samples (all relatively unbiased):
  - $\rho_y = 0.501$
  - residual std dev = 1.416 $\approx \sqrt{2}$ (needed later for PASS)
  - $b_{group} = 0.495$ (needed later for PASS)
  - simulated power for group effect: 67.7%
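A minimal simulation sketch of this design, using only the Python standard library. It is not the analysis used in the talk: instead of fitting a linear mixed model to individual-level data, it draws cluster means directly and applies a two-sample t-test to them, which for a balanced design with a cluster-level predictor tests the same group effect. Assumptions made here: 50 clusters per arm, a true group effect of 0.5, and a hard-coded two-sided 5% critical value for 98 degrees of freedom. The function name `simulate_crt_power` is hypothetical.

```python
import random
import statistics


def simulate_crt_power(m=100, n=30, effect=0.5, var_u=1.0, var_e=1.0,
                       reps=2000, t_crit=1.9845, seed=20140516):
    """Share of replicates in which the group effect is significant.

    t_crit is the approximate two-sided 5% critical value of t with
    m - 2 = 98 degrees of freedom; change it if m changes.
    """
    rng = random.Random(seed)
    half = m // 2  # assumed balanced allocation of clusters to arms
    # A cluster mean is u_j plus the average of n level-1 residuals.
    sd_mean = (var_u + var_e / n) ** 0.5
    rejections = 0
    for _rep in range(reps):
        control = [rng.gauss(0.0, sd_mean) for _ in range(half)]
        treated = [rng.gauss(effect, sd_mean) for _ in range(m - half)]
        diff = statistics.fmean(treated) - statistics.fmean(control)
        pooled_var = (
            (len(control) - 1) * statistics.variance(control)
            + (len(treated) - 1) * statistics.variance(treated)
        ) / (m - 2)
        se = (pooled_var * (1 / len(control) + 1 / len(treated))) ** 0.5
        if abs(diff / se) > t_crit:
            rejections += 1
    return rejections / reps


if __name__ == "__main__":
    print("simulated power:", simulate_crt_power())
```

With the defaults the estimate should fall near two-thirds, in the neighborhood of the 67.7% reported from the mixed-model simulation; the exact figure depends on the seed and on how a mixed model computes its degrees of freedom.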

Power analysis for clustered sampling designs using meff: Option 1 Example
- Simulation result: power = 67.7%
- Use the PASS Linear Regression routine to solve for power
  - $\mathit{meff} = 1 + (30 - 1) \times 0.501 = 15.529$
  - $\hat N_{eff} = 100 \times 30 \div 15.529 \approx 193$
  - specify 193 as N in PASS
  - specify H1 slope = 0.495
  - specify Residual Std Dev = 1.416 (residual at level-1 plus level-2)
  - PASS result: power = 67.6%

Summary
- choose meff estimator and estimate meff
- estimate $N_{eff}$
- plug $N_{eff}$ into power analysis software (with other parameters)
- estimate power
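The Option 1 arithmetic can be reproduced approximately without PASS. The sketch below uses a large-sample normal approximation for the test of a regression slope, which is not necessarily the method PASS implements, and it assumes the group indicator is a balanced 0/1 variable with standard deviation 0.5 (an assumption; the slides do not state the value entered for the SD of x). The function name `slope_power` is hypothetical.

```python
from statistics import NormalDist


def slope_power(n_eff, slope, resid_sd, sd_x, alpha=0.05):
    """Approximate two-sided power for a slope test with n_eff independent obs."""
    z = NormalDist()
    se = resid_sd / (sd_x * n_eff ** 0.5)   # large-sample SE of the slope
    z_crit = z.inv_cdf(1 - alpha / 2)
    shift = slope / se
    return z.cdf(shift - z_crit) + z.cdf(-shift - z_crit)


meff = 1 + (30 - 1) * 0.501      # cluster-level x, so rho_x = 1
n_eff = 100 * 30 / meff          # effective sample size
power = slope_power(n_eff, slope=0.495, resid_sd=1.416, sd_x=0.5)

if __name__ == "__main__":
    print(round(meff, 3), round(n_eff), round(power, 3))
```

Under these assumptions the approximation gives a power of roughly 0.68, close to the PASS result of 67.6%; a small gap is expected because the normal approximation ignores the t-distribution correction.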


Power analysis for clustered sampling designs using meff: Option 1 Example
- PASS: power = 67.6%
- Simulation: power = 67.7%

Power analysis for clustered sampling designs using meff: Option 2 example

Option 2. Given a clustered sample design, chosen model, power, and alpha level, plus an effect size estimate and a meff estimate,
- use standard power analysis software to estimate the required sample size assuming independent observations, i.e., $N_{eff}$
- then estimate N: $\hat N = \hat N_{eff} \times \mathit{meff}$

Option 2: Step 1
Start with…
- the group effect (b = 0.495),
- a residual standard deviation of 1.416,
- and power equal to 67.6%
- Use PASS to estimate the required effective sample size, $\hat N_{eff} = 193$
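Step 1 can be approximated in closed form. The sketch below inverts the usual large-sample normal approximation for a slope test; it is a rough stand-in for the PASS routine, not a description of it, and it again assumes a balanced 0/1 group indicator with standard deviation 0.5. The function name `required_n_eff` is hypothetical.

```python
from statistics import NormalDist


def required_n_eff(slope, resid_sd, sd_x, power, alpha=0.05):
    """Independent-observation sample size for a two-sided slope test.

    Uses N = ((z_{1-alpha/2} + z_power) * resid_sd / (slope * sd_x)) ** 2,
    ignoring the negligible rejection probability in the opposite tail.
    """
    z = NormalDist()
    z_sum = z.inv_cdf(1 - alpha / 2) + z.inv_cdf(power)
    return (z_sum * resid_sd / (slope * sd_x)) ** 2


if __name__ == "__main__":
    n_eff = required_n_eff(slope=0.495, resid_sd=1.416, sd_x=0.5, power=0.676)
    print(round(n_eff, 1))
```

The result lands slightly below the 193 reported by PASS, which is the direction expected when a normal approximation replaces a t-based calculation.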

Power analysis for clustered sampling designs using meff: Option 2 example
- Result: $\hat N_{eff} = 193$

Power analysis for clustered sampling designs using meff: Option 2 example

Option 2: Step 2
- Given $\hat N_{eff} = 193$, clusters of size n = 30, and $\rho_y = 0.501$, adjust $\hat N_{eff} = 193$ to obtain the required sample size
- for a CRT, $\rho_x = 1$ and $\mathit{meff} = 1 + (n - 1)\rho_y$
- $\hat N = 193 \times [1 + (30 - 1) \times 0.501] \approx 3000$
- Given clusters of size n = 30, $\hat N = 3000$ suggests that 100 clusters need to be sampled and randomized (i.e., 3000 ÷ 30)

This example used the linear mixed models framework. Now onto the models for clustered data with binary outcomes.
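The Step 2 inflation is simple enough to script. This sketch assumes a cluster-level predictor ($\rho_x = 1$) and equal cluster sizes, and it rounds the number of clusters up to the next whole cluster; the function name `clusters_needed` is hypothetical.

```python
import math


def clusters_needed(n_eff, n, rho_y):
    """Inflate an effective sample size to a clustered design.

    Returns (total N, number of clusters of size n), using
    meff = 1 + (n - 1) * rho_y for a cluster-level x.
    """
    meff = 1 + (n - 1) * rho_y
    total_n = n_eff * meff
    return total_n, math.ceil(total_n / n)


if __name__ == "__main__":
    total_n, m = clusters_needed(n_eff=193, n=30, rho_y=0.501)
    print(round(total_n), m)
```

Rounding up rather than to the nearest cluster is a conservative design choice: it never leaves the study short of the computed sample size.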
