Determining Sample Size for Linear Models in Excel
EERA Round Table Presentation
19 February 2016

Bryan W. Griffin
Georgia Southern University
bwgriffin@georgiasouthern.edu

Sample Size in Excel file download location: http://www.bwgriffin.com/samplesize

Content
1. Purpose
2. Framework
3. Sample Size Worksheet Explanation
   a. Download Source
   b. Worksheet Content
   c. Excel Security Warning: Macros
4. Illustrated Examples of Sample Size Determination
   a. Overview of Use
   b. Independent Samples t-test
   c. Pearson Correlation Coefficient
   d. One-way ANOVA
   e. One-way ANCOVA
   f. Regression
   g. Regression with an Incremental Test of a Predictor Set
5. Illustrated Examples of Power Determination
   a. Overview of Use
   b. Independent Samples t-test
   c. Pearson Correlation Coefficient
   d. One-way ANOVA
   e. One-way ANCOVA
   f. Regression
   g. Regression with an Incremental Test of a Predictor Set

1. Purpose

The purpose of this presentation is to provide an Excel workbook developed to calculate sample sizes for linear models given alpha, power, effect size, and model degrees of freedom. This is the same format for sample size determination used by Cohen (1988) in his popular book Statistical Power Analysis for the Behavioral Sciences. The workbook can also be used to calculate power given alpha, effect size, sample size, and model degrees of freedom.

While other software exists for computing both sample size and power, this Excel workbook has some instructional and research benefits: it is very fast, very easy to use, has the familiar Excel interface, and shows both inputs and results on the same screen, so students can see results without scrolling or changing screens. As an instructional tool, having inputs and outputs on the same screen streamlines the process and shifts the focus from how to use the software to how changes in alpha, power, effect size, and model degrees of freedom affect sample size.

In short, this freely available Excel workbook may prove useful to instructors, students, and researchers who encounter situations in which sample size must be determined.

2. Framework

I regularly teach courses in which students are required to learn how to determine appropriate sample sizes for various designs and statistical models. The importance of sample size considerations has been recognized among academics and professionals engaged in empirical research (e.g., Freiman, Chalmers, Smith, & Kuebler, 1978; Hoenig & Heisey, 2001; Moher, Dulberg, & Wells, 1994; Wilkinson, 1999). While texts for sample size determination have existed for some time (Cohen, 1988; Kraemer & Theimann, 1987; Murphy & Myors, 2003), the widespread use of computers has facilitated power and sample size determination.

Over the past 10 to 15 years, software for power and sample size calculations has appeared. For example, there are several commercial products designed for sample size estimation, such as SamplePower (http://www-03.ibm.com/software/products/en/spss-samplepower), PASS (www.ncss.com), and nQuery Advisor (http://www.statsols.com). There are also freely available online applets written in Java (www.java.com) to help researchers find sample sizes (e.g., www.statpages.org/#Power; www.cs.uiowa.edu/~rlenth/Power; http://danielsoper.com/statcalc3/default.aspx), as well as free software that can be installed on one's computer (e.g., G*Power at http://www.gpower.hhu.de; PS at http://biostat.mc.vanderbilt.edu/wiki/Main/PowerSampleSize).

I found, however, that most of my students have access to and understand Microsoft Excel (https://products.office.com/en-us/excel). To let them work with software they already know, I created a workbook that computes both sample size and power given specific inputs (i.e., alpha, effect size, model degrees of freedom).
These inputs are used with a noncentral F distribution (Patnaik, 1949) to calculate n. Excel does not offer a built-in noncentral F distribution function, so that distribution was added via Visual Basic code. In addition to the sample size calculation using the noncentral F distribution, I also added sample size estimates and power estimates obtained from four noncentral F distribution approximations: Laubscher's cube root (Laubscher, 1960; Severo & Zelen, 1960), Laubscher's square root (Laubscher, 1960), Tiku's 3-moment (Tiku, 1965), and Patnaik's 2-moment (Patnaik, 1949). I included these four for historical interest; until fast computers became widely available, these approximations were used for power and sample size estimates because they are easier to calculate than the noncentral F.

3. Sample Size Workbook Explanation

(a) Download Source

The Excel file can be downloaded from this page: http://www.bwgriffin.com/samplesize
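The noncentral F calculation described in the Framework section can be sketched in plain Python. This is only an illustration, not the workbook's Visual Basic code: it assumes Cohen's convention of noncentrality λ = f²·n and error degrees of freedom df2 = n − df1 − 1, and the function names (`betainc`, `power_noncentral_f`, `find_n`) are hypothetical. The noncentral F distribution is evaluated as a Poisson-weighted sum of central incomplete beta terms.

```python
import math

def betainc(a, b, x):
    """Regularized incomplete beta I_x(a, b) via a continued fraction."""
    if x <= 0.0:
        return 0.0
    if x >= 1.0:
        return 1.0
    # Use the symmetry relation where the continued fraction converges faster.
    if x > (a + 1.0) / (a + b + 2.0):
        return 1.0 - betainc(b, a, 1.0 - x)
    front = math.exp(math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
                     + a * math.log(x) + b * math.log1p(-x)) / a
    tiny = 1e-300
    c = 1.0
    d = 1.0 - (a + b) * x / (a + 1.0)
    d = 1.0 / (d if abs(d) > tiny else tiny)
    h = d
    for m in range(1, 500):
        m2 = 2 * m
        # Even step of the continued fraction.
        aa = m * (b - m) * x / ((a + m2 - 1.0) * (a + m2))
        d = 1.0 + aa * d
        d = 1.0 / (d if abs(d) > tiny else tiny)
        c = 1.0 + aa / c
        c = c if abs(c) > tiny else tiny
        h *= d * c
        # Odd step of the continued fraction.
        aa = -(a + m) * (a + b + m) * x / ((a + m2) * (a + m2 + 1.0))
        d = 1.0 + aa * d
        d = 1.0 / (d if abs(d) > tiny else tiny)
        c = 1.0 + aa / c
        c = c if abs(c) > tiny else tiny
        delta = d * c
        h *= delta
        if abs(delta - 1.0) < 1e-14:
            break
    return front * h

def power_noncentral_f(alpha, df1, n, f2):
    """Power of the F test. Assumes lambda = f2 * n and df2 = n - df1 - 1, f2 > 0."""
    df2 = n - df1 - 1
    a, b = df1 / 2.0, df2 / 2.0
    # Bisection for the critical value on the beta scale y = df1*F / (df1*F + df2).
    lo, hi = 0.0, 1.0
    for _ in range(60):
        mid = (lo + hi) / 2.0
        if betainc(a, b, mid) < 1.0 - alpha:
            lo = mid
        else:
            hi = mid
    y_crit = (lo + hi) / 2.0
    half = f2 * n / 2.0  # lambda / 2
    power, cum, j = 0.0, 0.0, 0
    # Poisson mixture: weight_j * P(central term j exceeds the critical value).
    while cum < 1.0 - 1e-10 and j < 2000:
        w = math.exp(-half + j * math.log(half) - math.lgamma(j + 1))
        cum += w
        power += w * (1.0 - betainc(a + j, b, y_crit))
        j += 1
    return power

def find_n(alpha, target_power, df1, f2, n_max=100000):
    """Smallest total n whose power reaches the target (simple linear search)."""
    for n in range(df1 + 2, n_max + 1):
        if power_noncentral_f(alpha, df1, n, f2) >= target_power:
            return n
    raise ValueError("target power not reached within n_max")

# Example inputs: alpha = .05, power = .80, three groups (df1 = 2), f = .41.
n = find_n(alpha=0.05, target_power=0.80, df1=2, f2=0.41 ** 2)
print(n)
```

The linear search is slow but transparent; a bracketing search would be faster for very small effect sizes. Results can be compared against the workbook's output for the same inputs.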

(b) Workbook Content

The workbook contains three worksheets (see tabbed labels at bottom left of the Excel window). The first is labeled "Sample Size" and should be the worksheet that appears when the workbook is opened. This sheet provides sample size calculations for most linear models with independent data. These include, for example, independent samples t-tests, Pearson's product moment correlation, linear regression models, and ANOVA and ANCOVA models. The second worksheet is "ES Conversion" and provides a tool to convert effect sizes among different metrics (d, r, r², f, f², and η²). The third worksheet is entitled "Power" and is designed to calculate power for a known alpha level, degrees of freedom, and n.

(c) Excel Security Warning: Macros

Once downloaded, open the workbook normally in Excel. The workbook was developed in Microsoft Excel 2003 but also functions in more recent versions of Excel (e.g., 2010, 2016). When opening the workbook, the user is likely to see a security warning that macros exist. The workbook relies on macros, so they must be enabled. The macros perform the necessary noncentral F calculations and sample size calculations, and also control user input.

In Excel 2003 a pop-up window may appear that allows users to enable macros. In more recent versions of Excel, a security warning may appear directly below the ribbon. On the "Security Warning" ribbon, select Options and then select "Enable this content."
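The kind of conversion the "ES Conversion" worksheet performs can be illustrated with the standard relations from Cohen (1988). The worksheet's own formulas are not shown here, so the sketch below is an assumption: it treats d as a two-group effect with equal group sizes (f = d / 2), and the function names are hypothetical.

```python
import math

def from_f2(f2):
    """Express an effect size given as f-squared in the other metrics.

    Assumes two equal-size groups for d (f = d / 2). In a linear model,
    r-squared and eta-squared are both f2 / (1 + f2).
    """
    f = math.sqrt(f2)
    r2 = f2 / (1.0 + f2)
    return {"d": 2.0 * f, "r": math.sqrt(r2), "r2": r2,
            "f": f, "f2": f2, "eta2": r2}

def from_d(d):
    """Cohen's d (two equal-size groups) to all metrics."""
    return from_f2((d / 2.0) ** 2)

def from_r(r):
    """Correlation r to all metrics; the sign of r does not affect f2."""
    return from_f2(r * r / (1.0 - r * r))

print(from_d(0.35))
print(from_r(-0.45))
```

Because every metric is routed through f², adding another input metric only requires one more small wrapper.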

4. Illustrated Examples of Sample Size Determination

(a) Overview of Use

There are five steps to finding sample size:
1. Enter the alpha error rate (often .05 or .01): the probability of a Type 1 error
2. Enter the power level (often .80, .90, or .95): the probability of detecting an effect of the stated size in the sample
3. Enter the model degrees of freedom (df1; e.g., df1 = 1 for a two-group t-test or df1 = 2 for ANOVA with 3 groups)
4. Enter the effect size of interest in the appropriate box (d, r, r², f, f²)
5. Click on the "Find n" button under the appropriate effect size box

The image below shows each step as indicated by an arrow.

A note on model degrees of freedom. The value sought here is the number of parameter estimates required from a linear model perspective. Estimating a two-group t-test via a linear model requires one degree of freedom to estimate the slope, or mean difference. For a Pearson correlation, it is similarly one degree of freedom to estimate the slope in a regression model. If one wishes to compare three groups, the model df would be 2 (two dummy variables required, or df1 = 2 in ANOVA).
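The link between groups and model degrees of freedom can be made concrete with dummy coding. The sketch below is illustrative only (the helper `dummy_code` is hypothetical and not part of the workbook): it reference-codes a grouping variable and counts the resulting columns, which equals df1.

```python
def dummy_code(labels):
    """Reference-code group labels; the first level in sorted order is the reference."""
    levels = sorted(set(labels))
    others = levels[1:]  # one dummy column per non-reference level
    rows = [[1 if label == level else 0 for level in others] for label in labels]
    return others, rows

# Three groups need two dummy variables, so df1 = 2.
columns, rows = dummy_code(["administrator", "faculty", "staff", "faculty"])
model_df = len(columns)
print(columns, model_df)

# Two groups need one dummy variable, so df1 = 1 (the t-test case).
two_group_columns, _ = dummy_code(["female", "male", "male"])
print(len(two_group_columns))
```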

(b) Independent samples t-test

Research Question: Is there a difference in motivation to read between females and males in 6th grade?
Type 1 Error Rate: α = .05
Desired Power Level: Power = .90
Anticipated Effect Size: d = .35
Model Degrees of Freedom: df = 1 for a two-group t-test

Resulting total sample size is n = 346, or 346 / 2 = 173 per group. See the image below for the inputs of this example.
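For readers who want to see how d enters the F-based calculation, the arithmetic for this example can be written out as follows. It assumes Cohen's (1988) conventions for two equal-size groups (f = d / 2, noncentrality λ = f²·n, and error degrees of freedom df2 = n − df1 − 1); the workbook's internal formulas are not shown in this presentation, so treat this as a sketch.

```latex
\begin{align*}
f &= \frac{d}{2} = \frac{0.35}{2} = 0.175, &
f^2 &= 0.175^2 = 0.030625,\\
df_2 &= n - df_1 - 1 = 346 - 1 - 1 = 344, &
\lambda &= f^2 n = 0.030625 \times 346 \approx 10.60.
\end{align*}
```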

(c) Pearson Correlation Coefficient

Research Question: Do mathematics test anxiety and mathematics self-efficacy correlate?
Type 1 Error Rate: α = .01
Desired Power Level: Power = .95
Anticipated Effect Size: r = -.45
Model Degrees of Freedom: df = 1 for a Pearson correlation

Resulting total sample size is n = 74. See the image below for the inputs of this example.

Note: n differs from that provided by Cohen (1988); Cohen's tables of power and n for Pearson r slightly overstate the sample size needed for a given power level (often a difference in sample size of 1 to 3).
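The sign of r does not matter for the sample size, because the calculation depends on r only through r². Under the same assumed conventions as Cohen (1988), with f² = r² / (1 − r²) and λ = f²·n, the quantities behind this example work out as:

```latex
\begin{align*}
r^2 &= (-0.45)^2 = 0.2025, &
f^2 &= \frac{r^2}{1 - r^2} = \frac{0.2025}{0.7975} \approx 0.2539,\\
df_2 &= n - df_1 - 1 = 74 - 1 - 1 = 72, &
\lambda &= f^2 n \approx 0.2539 \times 74 \approx 18.79.
\end{align*}
```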

(d) One-way ANOVA

Research Question: Are there differences in job satisfaction levels among administrators, faculty, and staff?
Type 1 Error Rate: α = .05
Desired Power Level: Power = .80
Anticipated Effect Size: f = .41
Model Degrees of Freedom: df = 2 for ANOVA with 3 groups

Resulting total sample size is n = 61, so about 61 / 3 = 20 to 21 per group. See the image below for the inputs of this example.
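When the total n does not divide evenly among groups, a balanced design needs the per-group size rounded up. A minimal sketch of that allocation step (the helper name `per_group` is hypothetical):

```python
import math

def per_group(total_n, groups):
    """Round the per-group size up so every group has the same n."""
    size = math.ceil(total_n / groups)
    return size, size * groups

# ANOVA example: n = 61 across 3 groups.
size, balanced_total = per_group(61, 3)
print(size, balanced_total)  # 21 63

# t-test example: n = 346 across 2 groups divides evenly.
print(per_group(346, 2))  # (173, 346)
```

Rounding up keeps power at or above the target, at the cost of a slightly larger total sample.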
