Inference concepts DAAG Chapter 4 Learning objectives Point - PowerPoint PPT Presentation

Inference concepts DAAG Chapter 4

Learning objectives
• Point estimation
• Confidence intervals and hypothesis tests
• Contingency tables
• One-way and two-way comparisons, ANOVA
• Response curves
• Nested structures, pseudoreplication
• Maximum likelihood estimation
• Bayesian estimation

Inference
• Interested in population quantities
  – Parameters $\theta$ (e.g. $\mu$, $\sigma^2$)
• Collect a sample $X$
• Use a sample statistic $\hat{\theta}(X)$ to estimate a population quantity $\theta$
• The sampling distribution of $\hat{\theta}$ implies $p(\hat{\theta} \mid \theta)$
• We use $p(\hat{\theta} \mid \theta)$ for inference about $\theta$, or
• We use $p(\theta \mid \hat{\theta})$ for inference about $\theta$ (Bayesian)

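Point estimation
• What is the population mean $\mu$?
  – A point estimate of $\mu$ is the sample mean $\bar{x}$
• Look to the sampling distribution $p(\bar{x} \mid \mu)$
  – According to the CLT, $\bar{x} \mid \mu \sim N(\mu, \sigma^2/n)$
  – The standard error of the mean (SEM) is thus $\sigma/\sqrt{n}$
  – Can approximate SEM $\approx s/\sqrt{n}$
• The sampling distribution of $t = (\bar{x} - \mu)/\mathrm{SEM}$ is $p(t \mid \mu)$, a $t$ distribution with $n - 1$ degrees of freedom
  – Includes variability from both $\bar{x}$ and $s \approx \sigma$
  – $t$ is the number of SEM units between $\bar{x}$ and $\mu$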
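
As a minimal sketch of the quantities on this slide, the following Python snippet computes the sample mean, the estimated SEM, and the t statistic. The data values and the candidate mean `mu` are hypothetical, chosen only for illustration.

```python
import math
import statistics

# Hypothetical sample; these values are illustrative, not from DAAG.
x = [4.1, 5.3, 4.8, 5.9, 5.0, 4.6, 5.4, 4.9]
mu = 4.5  # assumed candidate value for the population mean

n = len(x)
xbar = statistics.mean(x)   # point estimate of mu
s = statistics.stdev(x)     # sample standard deviation (n - 1 denominator)
sem = s / math.sqrt(n)      # estimated standard error of the mean
t = (xbar - mu) / sem       # number of SEM units between xbar and mu

print(f"xbar = {xbar:.3f}, s = {s:.3f}, SEM = {sem:.3f}, t = {t:.3f}")
```

Because `s` replaces the unknown $\sigma$, `t` would be referred to a $t$ distribution with `n - 1` degrees of freedom rather than to the standard normal.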

Hypothesis tests
• Use $p(\hat{\theta} \mid \theta)$ for inference about $\theta$
• In hypothesis testing,
  – Begin by assuming $\theta = \theta_0$ (null hypothesis)
  – What is the sampling distribution $p(\hat{\theta} \mid \theta_0)$?
  – Imagine we sample from $p(\hat{\theta} \mid \theta_0)$. What values are likely? What values are unlikely?
    • Our answer determines the rejection region of the test
  – Now, collect a sample and compute $\hat{\theta}_{\mathrm{obs}}$
    • Is $\hat{\theta}_{\mathrm{obs}}$ in the rejection region? If so, reject our initial hypothesis that $\theta = \theta_0$

Hypothesis tests
• How to decide what is an unlikely value?
  – Formulate an alternative hypothesis
    • $\theta > \theta_0$ or $\theta < \theta_0$ or $\theta \neq \theta_0$
  – Decide on a Type 1 error rate $\alpha$ (false rejection)
  – $\alpha$, together with the alternative hypothesis, implies a rejection region (“unlikely value”)
• If we don’t want to decide $\alpha$, compute a p-value
  – Smallest $\alpha$ that would result in rejection of the null hypothesis
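
The steps on the two hypothesis-test slides can be sketched for the simplest case, a two-sided z-test for a mean. This sketch assumes $\sigma$ is known; the function name `z_test` and all input numbers are hypothetical.

```python
import math
from statistics import NormalDist

def z_test(xbar, mu0, sigma, n, alpha=0.05):
    """Two-sided z-test of H0: mu = mu0, assuming sigma is known."""
    z = (xbar - mu0) / (sigma / math.sqrt(n))
    std_normal = NormalDist()                    # sampling distribution of z under H0
    z_crit = std_normal.inv_cdf(1 - alpha / 2)   # rejection region is |z| > z_crit
    p_value = 2 * (1 - std_normal.cdf(abs(z)))   # smallest alpha that would reject H0
    return z, z_crit, p_value, abs(z) > z_crit

# Hypothetical numbers for illustration only.
z, z_crit, p_value, reject = z_test(xbar=5.6, mu0=5.0, sigma=1.0, n=16)
print(f"z = {z:.2f}, critical value = {z_crit:.2f}, p = {p_value:.4f}, reject H0: {reject}")
```

Comparing `abs(z)` with `z_crit` and comparing `p_value` with `alpha` are two views of the same decision: the null hypothesis is rejected exactly when the p-value falls below $\alpha$.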

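Confidence intervals
• Consider $p(\hat{\theta} - \theta)$, the sampling distribution of $\hat{\theta} - \theta$
• Given a probability (e.g. 95% or 99%), we can compute an interval for $\hat{\theta} - \theta$ from $p(\hat{\theta} - \theta)$
• For $\mu$, use $\bar{x} - \mu \sim N(0, \sigma^2/n)$ or $(\bar{x} - \mu)/(s/\sqrt{n}) \sim t_{n-1}$
• Results in confidence intervals for $\mu$: $\bar{x} \pm z_{\alpha/2}\,\sigma/\sqrt{n}$ or $\bar{x} \pm t_{\alpha/2,\,n-1}\,s/\sqrt{n}$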
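
A minimal sketch of the normal-based interval $\bar{x} \pm z_{\alpha/2}\,\sigma/\sqrt{n}$, assuming $\sigma$ is known. The helper name `z_interval` and the summary statistics are hypothetical.

```python
import math
from statistics import NormalDist

def z_interval(xbar, sigma, n, level=0.95):
    """Confidence interval xbar +/- z_{alpha/2} * sigma / sqrt(n), sigma assumed known."""
    alpha = 1 - level
    z = NormalDist().inv_cdf(1 - alpha / 2)
    half_width = z * sigma / math.sqrt(n)
    return xbar - half_width, xbar + half_width

# Hypothetical summary statistics for illustration only.
lo95, hi95 = z_interval(xbar=5.0, sigma=1.0, n=25, level=0.95)
lo99, hi99 = z_interval(xbar=5.0, sigma=1.0, n=25, level=0.99)
print(f"95% CI: ({lo95:.3f}, {hi95:.3f})")
print(f"99% CI: ({lo99:.3f}, {hi99:.3f})")
```

The t-based interval has the same shape, with the $t_{n-1}$ quantile in place of $z$ and $s$ in place of $\sigma$. The Python standard library does not provide a t quantile function, so that variant is not shown here.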

A short comment…
• Use hypothesis tests sparingly, and for good reason.
  – Multiple comparisons can result in false alarms
  – Ask directed questions
• Consider alternatives to hypothesis tests
  – They provide little or no information about $\theta$
    • What is the probability of the null hypothesis?
  – Confidence intervals (or Bayesian posterior distributions) provide much more information
• Always report means (point estimates) and standard errors when reporting hypothesis tests

Contingency tables
• Comparing two or more categorical variables
• Common question: are the variables independent? Which categories have more or fewer units than expected?

              Men           Women         Totals
Brown Eyes    42            39            81 (81/174)
Blue Eyes     35            38            73 (73/174)
Other         12            8             20 (20/174)
Totals        89 (89/174)   85 (85/174)   174
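
One common way to address the independence question is a Pearson chi-square test. The sketch below uses the cell counts from the table, recomputes the marginal totals from those cells, and compares observed counts with the counts expected under independence. The choice of this test is an assumption of the sketch; the slide does not name a method.

```python
import math

# Observed counts from the eye colour table: rows are eye colours, columns are (men, women).
observed = {
    "Brown": (42, 39),
    "Blue": (35, 38),
    "Other": (12, 8),
}

row_totals = {k: sum(v) for k, v in observed.items()}
col_totals = [sum(v[j] for v in observed.values()) for j in range(2)]
n = sum(row_totals.values())

# Expected count under independence: row total * column total / n.
chi2 = 0.0
for colour, counts in observed.items():
    for j, obs in enumerate(counts):
        expected = row_totals[colour] * col_totals[j] / n
        chi2 += (obs - expected) ** 2 / expected

df = (len(observed) - 1) * (len(col_totals) - 1)  # (3 - 1) * (2 - 1) = 2
# For df = 2 the chi-square upper tail has the closed form exp(-x / 2).
p_value = math.exp(-chi2 / 2)

print(f"n = {n}, chi2 = {chi2:.3f}, df = {df}, p = {p_value:.3f}")
```

The closed-form tail probability only holds for 2 degrees of freedom; a table with other dimensions would need a general chi-square distribution function.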

One-way comparisons
• Data: tinting
• Experiment: time to discriminate a target for different window tinting levels
[Figure: time (ms, axis from 50 to 200) for each tinting level: no, lo, hi]

One-way ANOVA

    Analysis of Variance Table

    Response: it
               Df Sum Sq Mean Sq F value Pr(>F)
    tint        2   6597  3298.4  2.1769 0.1164
    Residuals 179 271220  1515.2
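
The Mean Sq, F value, and Pr(>F) columns follow from the Df and Sum Sq columns. As a rough check, the sketch below recomputes them for the tint row. The helper `f_test_2df` is hypothetical and relies on a closed form for the F tail probability that is valid only when the effect has 2 degrees of freedom.

```python
def f_test_2df(ss_effect, ss_resid, df_resid):
    """F statistic and p-value for an effect with 2 numerator degrees of freedom."""
    df_effect = 2
    ms_effect = ss_effect / df_effect
    ms_resid = ss_resid / df_resid
    f_value = ms_effect / ms_resid
    # Closed form for the upper tail of F(2, df_resid): (1 + 2 f / df_resid) ** (-df_resid / 2).
    p_value = (1 + 2 * f_value / df_resid) ** (-df_resid / 2)
    return ms_effect, ms_resid, f_value, p_value

# Sums of squares and degrees of freedom copied from the one-way table for tint.
ms_tint, ms_resid, f_value, p_value = f_test_2df(ss_effect=6597, ss_resid=271220, df_resid=179)
print(f"Mean Sq tint = {ms_tint:.1f}, Mean Sq resid = {ms_resid:.1f}")
print(f"F = {f_value:.4f}, p = {p_value:.4f}")
```

Small differences from the printed table in the last digit are expected, because the table shows rounded sums of squares.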

Two-way comparisons
• There are other factors that might influence time to discriminate a target, e.g. age
[Figure: it (axis from 50 to 200) for each tint level (no, lo, hi), in separate panels for Younger and Older]

Two-way ANOVA

    Analysis of Variance Table

    Response: it
               Df Sum Sq Mean Sq F value    Pr(>F)
    tint        2   6597    3298  3.0965   0.04765 *
    agegp       1  81612   81612 76.6164 1.567e-15 ***
    Residuals 178 189607    1065
    ---
    Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Interaction plots
[Figure: mean of it (axis from 40 to 90) against tint (no, lo, hi), with one line per agegp: Older, Younger]

Two-way ANOVA: interaction

    Analysis of Variance Table

    Response: it
                Df Sum Sq Mean Sq F value    Pr(>F)
    tint         2   6597    3298  3.1109   0.04702 *
    agegp        1  81612   81612 76.9729 1.466e-15 ***
    tint:agegp   2   2999    1499  1.4141   0.24590
    Residuals  176 186609    1060
    ---
    Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
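
The tint:agegp row can be read as a comparison of two nested models: the additive model (tint + agegp) and the model that adds the interaction. The sketch below reproduces the interaction F test from the two Residuals lines alone. It is an illustration under that reading, not output from the original analysis, and the variable names are hypothetical.

```python
# Residual lines copied from the two tables for the tinting data:
# additive model (tint + agegp) and interaction model (tint + agegp + tint:agegp).
rss_additive, df_additive = 189607, 178
rss_interaction, df_interaction = 186609, 176

# Extra sum of squares explained by the interaction terms.
ss_extra = rss_additive - rss_interaction
df_extra = df_additive - df_interaction

f_value = (ss_extra / df_extra) / (rss_interaction / df_interaction)
# df_extra is 2 here, so the F upper tail has a closed form.
p_value = (1 + 2 * f_value / df_interaction) ** (-df_interaction / 2)

print(f"extra SS = {ss_extra} on {df_extra} df, F = {f_value:.4f}, p = {p_value:.4f}")
```

The result agrees with the tint:agegp row up to rounding of the printed sums of squares, which is why the interaction is not judged significant at the 0.05 level.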

Response curves
• Sometimes a response should be handled as a regression problem rather than ANOVA
[Figure: distance (axis from 0.6 to 1.2) against angle (axis from 3.0 to 4.5)]
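
When the factor levels are quantitative, a fitted line uses the ordering and spacing of the levels, which ANOVA ignores. A minimal least-squares sketch follows. The (angle, distance) pairs are hypothetical stand-ins, not the data behind the plot.

```python
# Hypothetical (angle, distance) pairs for illustration; not the data behind the plot.
angle = [3.0, 3.5, 4.0, 4.5]
distance = [0.65, 0.80, 0.98, 1.12]

n = len(angle)
mean_x = sum(angle) / n
mean_y = sum(distance) / n

# Least-squares slope and intercept for distance = intercept + slope * angle.
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(angle, distance))
sxx = sum((x - mean_x) ** 2 for x in angle)
slope = sxy / sxx
intercept = mean_y - slope * mean_x

print(f"distance = {intercept:.3f} + {slope:.3f} * angle")
```

A straight line spends one degree of freedom on the slope, whereas treating angle as a factor spends one per level beyond the first.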

Pseudoreplication

Nested structures
• If the scale of your effect doesn’t match the scale of your experimental unit, don’t pretend that it does.
Q: How many experimental units do we have for comparing treatment to control?

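Maximum likelihood estimation
• Likelihood is the probability of data $\mathbf{y}$ given a population, parameterized by $\theta$
• The value of $\theta$ that maximizes the likelihood is the maximum likelihood estimate $\hat{\theta}$.

$$y_i = \mu + \epsilon_i, \quad \epsilon_i \sim N(0, \sigma^2), \quad i = 1, 2, \ldots, n$$

$$L(\mathbf{y}; \mu, \sigma^2) = \prod_{i=1}^{n} \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left(-\frac{(y_i - \mu)^2}{2\sigma^2}\right)$$

$$\ell(\mathbf{y}; \mu, \sigma^2) = -\frac{n}{2}\log(2\pi\sigma^2) - \frac{1}{2\sigma^2}\sum_{i=1}^{n}(y_i - \mu)^2$$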
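
For the normal model on this slide the maximizer has a closed form: $\hat{\mu}$ is the sample mean and $\hat{\sigma}^2$ is the mean squared deviation (divisor $n$). The sketch below evaluates the log-likelihood on hypothetical data and checks that moving away from those estimates lowers it.

```python
import math

def log_likelihood(y, mu, sigma2):
    """Normal log-likelihood: -(n/2) log(2 pi sigma2) - sum((y_i - mu)^2) / (2 sigma2)."""
    n = len(y)
    ss = sum((yi - mu) ** 2 for yi in y)
    return -0.5 * n * math.log(2 * math.pi * sigma2) - ss / (2 * sigma2)

# Hypothetical data for illustration only.
y = [4.1, 5.3, 4.8, 5.9, 5.0, 4.6, 5.4, 4.9]
n = len(y)

# Closed-form maximum likelihood estimates for this model.
mu_hat = sum(y) / n
sigma2_hat = sum((yi - mu_hat) ** 2 for yi in y) / n  # note the divisor n, not n - 1

best = log_likelihood(y, mu_hat, sigma2_hat)
# Moving either parameter away from its estimate lowers the log-likelihood.
for mu, sigma2 in [(mu_hat + 0.1, sigma2_hat),
                   (mu_hat, 1.5 * sigma2_hat),
                   (mu_hat - 0.2, 0.8 * sigma2_hat)]:
    assert log_likelihood(y, mu, sigma2) < best

print(f"mu_hat = {mu_hat:.3f}, sigma2_hat = {sigma2_hat:.3f}, log-likelihood = {best:.3f}")
```

Models without a closed-form solution are usually handled by maximizing the same log-likelihood numerically.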

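Bayesian estimation

$$p(\theta \mid \mathbf{y}) = \frac{p(\mathbf{y} \mid \theta)\, p(\theta)}{p(\mathbf{y})}$$

It is often difficult to get $p(\mathbf{y})$ directly, but $p(\mathbf{y})$ is just a normalizing constant:

$$p(\theta \mid \mathbf{y}) \propto p(\mathbf{y} \mid \theta)\, p(\theta)$$

So use various tricks to generate samples from the unnormalized posterior $p(\mathbf{y} \mid \theta)\, p(\theta)$. The most popular trick is Markov chain Monte Carlo (MCMC).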
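
To illustrate how sampling can avoid the normalizing constant, here is a minimal random-walk Metropolis sketch for the mean of a normal model. Everything specific in it is an assumption made for illustration: the data, the known `sigma`, the $N(0, \tau^2)$ prior, the step size, and the function names. It is one simple MCMC variant, not the method used in any particular package.

```python
import math
import random

# Hypothetical data; sigma (known) and the prior N(0, tau^2) are assumptions for this sketch.
y = [1.8, 2.4, 2.1, 1.5, 2.9, 2.2, 1.9, 2.6, 2.0, 2.3]
sigma, tau = 1.0, 10.0

def log_unnormalized_posterior(mu):
    """log p(y | mu) + log p(mu), dropping constants that do not depend on mu."""
    log_lik = -sum((yi - mu) ** 2 for yi in y) / (2 * sigma ** 2)
    log_prior = -mu ** 2 / (2 * tau ** 2)
    return log_lik + log_prior

def metropolis(n_samples, step=0.5, burn_in=1000, seed=1):
    """Random-walk Metropolis sampler for mu."""
    rng = random.Random(seed)
    mu = 0.0
    current = log_unnormalized_posterior(mu)
    samples = []
    for i in range(n_samples + burn_in):
        proposal = mu + rng.gauss(0.0, step)
        candidate = log_unnormalized_posterior(proposal)
        # Accept with probability min(1, posterior ratio); p(y) cancels in the ratio.
        if rng.random() < math.exp(min(0.0, candidate - current)):
            mu, current = proposal, candidate
        if i >= burn_in:
            samples.append(mu)
    return samples

samples = metropolis(20000)
mcmc_mean = sum(samples) / len(samples)

# Conjugate normal-normal result, available here as a check on the sampler.
precision = len(y) / sigma ** 2 + 1 / tau ** 2
exact_mean = (sum(y) / sigma ** 2) / precision
print(f"MCMC mean = {mcmc_mean:.3f}, exact posterior mean = {exact_mean:.3f}")
```

This model was chosen because the posterior is known exactly, so the sampler can be checked against it. In realistic problems no such closed form exists, and the samples are the only route to the posterior.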
