STAT 113: TOPIC OUTLINE (FINAL EXAM)
COLIN REIMER DAWSON, FALL 2015

Date: December 11, 2015.

The final exam will cover the following six areas in roughly equal proportion. For example, there might be one multi-part question for each major heading (possibly with some crossover, where it makes sense).

1. Research Design / Describing Samples

Questions in this category will focus on issues of research design (sampling procedures, confounding, etc.) as well as the kinds of descriptive statistics and visualizations that we covered prior to Exam 1, and which are treated in Chapters 1-2 of the textbook. Key concepts are listed in the Exam 1 topic outline, and include ideas related to

(1) sampling and study structure
    • experimental vs. observational studies and what we can glean from them
    • confounding
    • sampling and sources of sampling bias
(2) structure of data
    • identifying cases, variables, and types of variables
(3) descriptive measures
    • central tendency (mean, median)
    • variability (range, IQR, variance, standard deviation)
    • relationships (correlation, regression models)
(4) descriptive visualizations
    • bar plots
    • histograms
    • box-and-whisker plots
    • scatterplots

(5) prediction
    • interpretation and use of linear regression models
    • interpretation of residuals
    • checking residuals to diagnose problems with the model

2. Inference Foundations

Questions in this category will focus on “big picture” issues about statistical inference, and in particular, about confidence intervals and hypothesis tests. This material is covered mainly in Ch. 3-6 of the textbook and on our last exam. Topics include:

(1) Distinguishing populations and samples
    • parameters vs. statistics
    • identifying appropriate populations
    • variability across different random samples that we could have gotten
    • What is a sampling distribution, and why do we care?
(2) Use and meaning of a confidence interval
    • Why do we have uncertainty in our estimates?
    • What does the margin of error tell us, and how is it affected by sample size, confidence level, population variability...
    • What does the confidence level mean?
(3) Use and interpretation of a hypothesis test
    • Why do we need to do tests, as opposed to just looking at our data?
    • What are null and alternative hypotheses statements about?
    • What is the conceptual criterion to say that we have evidence for a research hypothesis?
    • Measuring consistency/inconsistency with a null hypothesis
    • Logic and interpretation of P-values
    • What does it mean for a result to be “statistically significant”?
    • What can we say if we reject H0? What can we say if we fail to reject H0?

    • Kinds of statistical errors (Type I / Type II), what they mean, and what kinds of things affect how likely they are to occur
(4) Logic of bootstrapping and randomization
    • What do we do when we construct a “bootstrap” sample?
    • What assumptions are randomization procedures based on?
    • Difference between sample size and number of samples
    • What are the individual points in bootstrap / randomization distributions?
    • How can we use bootstrap / randomization distributions to construct confidence intervals / compute P-values?
(5) Common structure of most test statistics / confidence intervals
    • Standardized test statistics such as z and t statistics measure the number of ____ away from ____ that the ____ is.
    • Relationship between magnitude of a test statistic and the P-value

3. Inference for Correlation and Regression

Questions in this category will focus on making inferences (confidence intervals, hypothesis tests) about relationships between quantitative variables, in particular correlation and the slope of a regression line; and also on (a) making estimates with a margin of error about the expected/predicted/mean value of a y variable in a cross-section of a population sharing a particular x value, and (b) making predictions with a margin of error about the value of a y variable for a particular case with a particular x value. This is the material from Chapter 9 of the textbook, and from the class slides and handouts from 11/23 and 11/24. Particular topics include:

(1) The difference between a population correlation (ρ) and a sample correlation (r)
    • How sample correlations vary across samples around a population correlation
    • How to simulate random samples assuming no association/correlation
(2) The difference between a population regression line-of-best-fit and a sample regression line-of-best-fit
    • How sample lines vary around a population line

    • How to simulate random samples assuming that x has no power to predict y
(3) t-tests for population correlation and population slope-of-best-fit-line
    • How to compute the test statistic (using the appropriate standard error)
    • Conditions that must be satisfied in order to use a t-distribution
(4) “Coefficient of Determination” (R²) for a regression model
    • Interpretation as a proportion
    • Relationship to correlation

4. Goodness of Fit and Association Tests for Categorical Variables

Questions in this category will focus on hypothesis tests when the response variable is categorical and may have more than two levels (and so we can’t simply do a single-proportion test), and/or when there is also an explanatory variable that may have more than two levels (that is, there are more than two groups). This is the topic of Ch. 7 in the textbook, and the classes from 11/25-11/2. Specific topics include:

(1) The distinction between expected (“long run”) category counts/proportions of a categorical variable and the particular distribution of a single sample
(2) Schemes to construct simulated random samples of one categorical variable assuming particular long-run proportions
(3) Ways to measure “distance” between observed and expected outcomes, including the χ² statistic
    • How are the different parts computed, and what do they represent?
    • What is the role/purpose of the normalization (denominator)?
    • How can we use individual terms in the sum to investigate degree of discrepancy for individual categories?
(4) Finding expected proportions for combinations of two categorical values, assuming the individual variable proportions are fixed
(5) Simulating random samples assuming fixed proportions for each variable separately, and assuming no relationship (constant “conditional distributions”)

(6) The concept of “degrees of freedom” in a set of random proportions
(7) What kinds of χ² values represent discrepancies from H0
(8) How to use a χ² distribution to find a P-value / measure how “unexpected” a sample of counts is
(9) Relationship between a χ² goodness of fit test and a z-test of a single proportion when there are only two categories (i.e., there is a binary response variable)
(10) Relationship between a χ² test of association and a z-test of a difference of proportions when there are two binary variables

5. Comparing Multiple Means

Questions in this category will focus on hypothesis tests when the response variable is quantitative, and the explanatory variable is categorical, but may have more than two levels (i.e., there are more than two groups, and we want to compare the typical outcomes of a quantitative variable). This is the material from Chapter 8 of the textbook, and from the classes on 12/4, 12/7 and 12/8. Specific topics are:

(1) Distinction between different sample means and different population means
(2) Why is it a good idea to do one overall test first, instead of lots of separate tests of pairs?
(3) Scheme to simulate random samples assuming a set of means is equal
(4) Ways to measure “distance among” more than two means
    • Simple things, good for randomization
    • F-statistic, good for theoretical distribution-based test
(5) Idea of dividing variance into within groups vs. between groups
    • Why does it make sense to normalize by within groups variance, when interested in variation across means?
(6) Meaning of components of the ANOVA table
(7) After a significant F-test
    • Confidence intervals of individual means
    • Confidence intervals and tests of differences of pairs
    • Using the pooled MS Within as the common estimate of within-group variance
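As a study aid for the within-groups vs. between-groups idea in Section 5, here is a minimal Python sketch of how the pieces of a one-way ANOVA table combine into an F-statistic. The function name `anova_f` and the three small groups are made up for illustration; they do not come from the course materials, and the sketch stops at the F-statistic (it does not look up a P-value).

```python
from statistics import mean

def anova_f(groups):
    """Return (MS Between, MS Within, F) for a list of numeric groups.

    Hypothetical helper for illustration; assumes each group has at
    least one value and that there are at least two groups.
    """
    k = len(groups)                      # number of groups
    n = sum(len(g) for g in groups)      # total number of cases
    grand_mean = sum(sum(g) for g in groups) / n

    # Between-groups sum of squares: how far each group mean is from
    # the grand mean, weighted by group size.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)

    # Within-groups sum of squares: how far each case is from its own
    # group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)

    ms_between = ss_between / (k - 1)    # df between = k - 1
    ms_within = ss_within / (n - k)      # df within = n - k (pooled variance)
    return ms_between, ms_within, ms_between / ms_within

# Made-up data: three groups with means 5, 7, and 9.
groups = [[4.0, 5.0, 6.0], [6.0, 7.0, 8.0], [8.0, 9.0, 10.0]]
ms_b, ms_w, f = anova_f(groups)
print(f"MS Between = {ms_b:.2f}, MS Within = {ms_w:.2f}, F = {f:.2f}")
# prints "MS Between = 12.00, MS Within = 1.00, F = 12.00"
```

Dividing by MS Within puts the spread among the group means on the scale of the typical spread inside a group, which is why the same differences in means give a larger F when the groups are individually less variable.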

6. Practical Integration

The last type of question will ask you to pull together themes from across the different sections of the course, focusing on some common principles (see also “Inference Foundations”), perhaps asking you to take a research question and describe from start to finish how you would approach it.

(1) Can you identify what parameter(s) make the most sense to focus on? Considerations:
    • Categorical vs. Quantitative Response Variable?
        – Proportion vs. Mean
    • Categorical vs. Quantitative vs. No Explanatory Variable?
        – Differences between/among groups vs. Correlation vs. Inference about a single proportion/mean
    • One, two, or more groups?
        – Inference about a single proportion/mean vs. difference of proportions/means vs. more complex scenarios (chi-square/ANOVA)
(2) Can you interpret the conclusion of a hypothesis test / confidence interval in real world terms? Can you distinguish between scenarios when causal conclusions are / are not justified?
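The considerations in item (1) can be read as a small decision procedure. The Python sketch below is one possible encoding, not an official rule from the course: the function name `suggest_procedure`, its argument names, and the returned labels are all hypothetical, and it deliberately simplifies (for example, it treats a categorical response with no explanatory variable as a single proportion, ignoring the goodness of fit case with more than two categories).

```python
def suggest_procedure(response, explanatory=None, n_groups=1):
    """Map variable types to a family of inference procedures.

    Hypothetical study aid. `response` is "categorical" or
    "quantitative"; `explanatory` is "categorical", "quantitative",
    or None; `n_groups` only matters for a categorical explanatory
    variable.
    """
    if response not in ("categorical", "quantitative"):
        raise ValueError("response must be 'categorical' or 'quantitative'")

    # Categorical response -> proportion; quantitative response -> mean.
    parameter = "proportion" if response == "categorical" else "mean"

    if explanatory is None:
        return f"inference about a single {parameter}"

    if explanatory == "quantitative":
        if response == "quantitative":
            return "inference about correlation / regression slope"
        return "not covered in this outline"

    if explanatory != "categorical":
        raise ValueError("explanatory must be 'categorical', 'quantitative', or None")

    # Categorical explanatory variable: compare groups.
    if n_groups < 2:
        raise ValueError("a categorical explanatory variable needs at least two groups")
    if n_groups == 2:
        return f"difference of {parameter}s"
    return "chi-square test of association" if response == "categorical" else "ANOVA"

# Example: quantitative response compared across three groups.
print(suggest_procedure("quantitative", "categorical", n_groups=3))
```

A lookup like this only narrows down the procedure; whether the conclusion supports a causal claim still depends on the study design (experimental vs. observational), as in item (2).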
