inference concepts

Inference concepts, DAAG Chapter 4: PowerPoint presentation



  1. Inference concepts DAAG Chapter 4

  2. Learning objectives • Point estimation • Confidence intervals and hypothesis tests • Contingency tables • One-way and two-way comparisons, ANOVA • Response curves • Nested structures, pseudoreplication • Maximum likelihood estimation • Bayesian estimation

  3. Inference • Interested in population quantities – Parameters $\theta$ (e.g. $\mu$, $\sigma^2$) • Collect sample $X$ • Use a sample statistic $\hat{\theta}(X)$ to estimate a population quantity $\theta$ • The sampling distribution of $X$ implies $p(\hat{\theta} \mid \theta)$ • We use $p(\hat{\theta} \mid \theta)$ for inference about $\theta$, or • We use $p(\theta \mid \hat{\theta})$ for inference about $\theta$ (Bayesian)

  4. Point estimation • What is the population mean $\mu$? – A point estimate of $\mu$ is the sample mean $\bar{x}$ • Look to the sampling distribution $p(\bar{x} \mid \mu)$ – According to the CLT, approximately $\bar{x} \mid \mu \sim N(\mu, \sigma^2/n)$ – The standard error of the mean (SEM) is thus $\sigma/\sqrt{n}$ – Can approximate SEM $\approx s/\sqrt{n}$ • The sampling distribution of $t = (\bar{x} - \mu)/\mathrm{SEM}$ is $p(t \mid \mu)$ – Includes variability from $\bar{x}$ and $s \approx \sigma$ – $t$ is the number of SEM units between $\bar{x}$ and $\mu$
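The quantities on this slide can be computed in a few lines. The following is a minimal sketch, assuming a small hypothetical sample and a hypothetical reference value `mu_0`; neither comes from the slides.

```python
import math
import statistics

# Hypothetical sample (illustrative values only, not from the slides).
sample = [4.1, 5.3, 4.8, 5.9, 5.0, 4.6, 5.4, 4.9]
mu_0 = 4.5  # hypothetical value of the population mean to compare against

n = len(sample)
x_bar = statistics.mean(sample)   # point estimate of mu
s = statistics.stdev(sample)      # sample standard deviation (n - 1 denominator)
sem = s / math.sqrt(n)            # estimated standard error of the mean
t_stat = (x_bar - mu_0) / sem     # number of SEM units between x_bar and mu_0

print(f"mean={x_bar:.3f}  s={s:.3f}  SEM={sem:.3f}  t={t_stat:.3f}")
```

Because `sem` uses `s` in place of the unknown $\sigma$, `t_stat` carries variability from both the sample mean and the sample standard deviation.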

  5. Hypothesis tests • Use $p(\hat{\theta} \mid \theta)$ for inference about $\theta$ • In hypothesis testing, – Begin by assuming $\theta = \theta_0$ (null hypothesis) – What is the sampling distribution $p(\hat{\theta} \mid \theta_0)$? – Imagine we sample from $p(\hat{\theta} \mid \theta_0)$. What values are likely? What values are unlikely? • Our answer determines the rejection region of the test – Now, collect a sample and compute $\hat{\theta}_{\mathrm{obs}}$ • Is $\hat{\theta}_{\mathrm{obs}}$ in the rejection region? If so, reject our initial hypothesis that $\theta = \theta_0$

  6. Hypothesis tests • How to decide what is an unlikely value? – Formulate an alternative hypothesis • $\theta > \theta_0$ or $\theta < \theta_0$ or $\theta \neq \theta_0$ – Decide on a Type 1 error rate $\alpha$ (false rejection) – $\alpha$, together with the alternative hypothesis, implies a rejection region (“unlikely value”) • If we don’t want to decide $\alpha$, compute the p-value – Smallest $\alpha$ that would result in rejection of the null hypothesis
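To make the link between $\alpha$, the rejection region, and the p-value concrete, here is a minimal sketch of a two-sided z test. It assumes a known population standard deviation and a normal sampling distribution. The function name `z_test` and all input values are hypothetical, chosen only for illustration.

```python
from statistics import NormalDist

def z_test(x_bar, mu_0, sigma, n, alpha=0.05):
    """Two-sided z test of H0: mu = mu_0, assuming sigma is known."""
    sem = sigma / n ** 0.5
    z = (x_bar - mu_0) / sem
    std_normal = NormalDist()                      # N(0, 1)
    z_crit = std_normal.inv_cdf(1 - alpha / 2)     # rejection region: |z| > z_crit
    p_value = 2 * (1 - std_normal.cdf(abs(z)))     # smallest alpha that rejects
    return z, z_crit, p_value, abs(z) > z_crit

# Hypothetical inputs, not taken from the slides.
z, z_crit, p_value, reject = z_test(x_bar=5.4, mu_0=5.0, sigma=1.0, n=25)
print(f"z={z:.3f}  critical value={z_crit:.3f}  p={p_value:.4f}  reject={reject}")
```

Rejecting when `abs(z) > z_crit` is equivalent to rejecting when `p_value < alpha`, which is why the p-value can be reported without fixing $\alpha$ in advance.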

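  7. Confidence intervals • Consider $p(\hat{\theta} - \theta)$, the sampling distribution of $\hat{\theta} - \theta$ • Given a probability (e.g. 95% or 99%) we can compute an interval for $\hat{\theta} - \theta$ from $p(\hat{\theta} - \theta)$ • For $\mu$, use $\bar{x} - \mu \sim N(0, \sigma^2/n)$ or $(\bar{x} - \mu)/\mathrm{SEM} \sim t_{n-1}$ • Results in confidence intervals for $\mu$: $\bar{x} \pm z_{\alpha/2}\, \sigma/\sqrt{n}$ or $\bar{x} \pm t_{\alpha/2,\, n-1}\, s/\sqrt{n}$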
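The normal-based interval can be sketched as follows. This is an illustration under the assumption that $\sigma$ is known. The function name and input values are hypothetical. The t-based interval has the same shape but needs a quantile of the t distribution, which Python's standard library does not provide, so it is omitted here.

```python
import math
from statistics import NormalDist

def z_confidence_interval(x_bar, sigma, n, level=0.95):
    """Interval x_bar +/- z_{alpha/2} * sigma / sqrt(n), assuming sigma is known."""
    alpha = 1 - level
    z = NormalDist().inv_cdf(1 - alpha / 2)   # upper alpha/2 quantile of N(0, 1)
    half_width = z * sigma / math.sqrt(n)
    return x_bar - half_width, x_bar + half_width

# Hypothetical inputs, not taken from the slides.
lower, upper = z_confidence_interval(x_bar=5.4, sigma=1.0, n=25)
lower99, upper99 = z_confidence_interval(x_bar=5.4, sigma=1.0, n=25, level=0.99)
print(f"95% CI: ({lower:.3f}, {upper:.3f})")
print(f"99% CI: ({lower99:.3f}, {upper99:.3f})")
```

A higher confidence level gives a wider interval for the same data.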

  8. A short comment… • Use hypothesis tests sparingly, and for good reason. – Multiple comparisons can result in false alarms – Ask directed questions • Consider alternatives to hypothesis tests – They provide little or no information about $\theta$ • What is the probability of the null hypothesis? – Confidence intervals (or Bayesian posterior distributions) provide much more information • Always report means (point estimates) and standard errors when reporting hypothesis tests
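The false-alarm risk from multiple comparisons can be quantified with a short calculation. The sketch below assumes the tests are independent and that every null hypothesis is true. The helper name `familywise_error_rate` is hypothetical.

```python
def familywise_error_rate(alpha, m):
    """Probability of at least one false rejection in m independent tests,
    each run at Type 1 error rate alpha, when all null hypotheses are true."""
    return 1 - (1 - alpha) ** m

for m in (1, 5, 20):
    rate = familywise_error_rate(0.05, m)
    print(f"{m:2d} tests at alpha=0.05: P(at least one false alarm) = {rate:.3f}")
```

Under these assumptions the chance of at least one false alarm grows quickly with the number of tests, which is one reason to ask a few directed questions rather than test everything.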

  9. Contingency tables • Comparing two or more categorical variables • Common question: are the variables independent? Which categories have more or fewer units than expected?

                   Men           Women         Totals
      Brown Eyes   42            39            81 (81/174)
      Blue Eyes    35            38            73 (73/174)
      Other        12            8             20 (20/174)
      Totals       89 (89/174)   85 (85/174)   174
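One common way to address the independence question is Pearson's chi-square test, which the slide does not name explicitly; treat the following as a sketch rather than the deck's own analysis. It uses the cell counts from the table and computes row and column totals from them. The p-value shortcut `exp(-x/2)` is valid only for 2 degrees of freedom, which is what a 3 x 2 table gives.

```python
import math

# Observed counts from the slide: rows are eye colour, columns are (men, women).
observed = {
    "Brown Eyes": (42, 39),
    "Blue Eyes": (35, 38),
    "Other": (12, 8),
}

row_totals = {eye: sum(counts) for eye, counts in observed.items()}
col_totals = [sum(counts[j] for counts in observed.values()) for j in range(2)]
grand_total = sum(row_totals.values())

# Expected count under independence: row total * column total / grand total.
chi2 = 0.0
for eye, counts in observed.items():
    for j, obs in enumerate(counts):
        expected = row_totals[eye] * col_totals[j] / grand_total
        chi2 += (obs - expected) ** 2 / expected

df = (len(observed) - 1) * (len(col_totals) - 1)
assert df == 2  # the closed-form tail probability below holds only for df = 2
p_value = math.exp(-chi2 / 2)

print(f"column totals={col_totals}  chi-square={chi2:.3f}  df={df}  p={p_value:.3f}")
```

Cells whose observed count is far from the expected count contribute most to the statistic, which answers the slide's second question about which categories have more or fewer units than expected.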

  10. One-way comparisons • Data: tinting • Experiment: time to discriminate a target for different window tinting levels [Figure: time (ms), axis from 50 to 200, by tinting level (no, lo, hi)]

  11. One way ANOVA

      Analysis of Variance Table

      Response: it
                 Df Sum Sq Mean Sq F value Pr(>F)
      tint        2   6597  3298.4  2.1769 0.1164
      Residuals 179 271220  1515.2
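The F value in the table is the ratio of the between-group mean square to the residual mean square. The sketch below recomputes it from the table's sums of squares and also shows how those sums of squares arise from raw data. The function `one_way_anova` and the three small groups are hypothetical, since the tinting data themselves are not reproduced in the slides.

```python
from statistics import mean

def one_way_anova(groups):
    """Return (F, df_between, df_within) for a list of groups of observations."""
    all_values = [x for g in groups for x in g]
    grand_mean = mean(all_values)
    # Between-group sum of squares: group size times squared deviation of group mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Within-group (residual) sum of squares.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_value = (ss_between / df_between) / (ss_within / df_within)
    return f_value, df_between, df_within

# Hypothetical toy data, not the tinting data.
toy_groups = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]
f_toy, df1, df2 = one_way_anova(toy_groups)
print(f"toy data: F={f_toy:.2f} on {df1} and {df2} df")

# Recompute the F value on the slide from its sums of squares and df.
f_table = (6597 / 2) / (271220 / 179)
print(f"table: F={f_table:.4f}")
```

The p-value in the table is the upper tail of the F distribution with the corresponding degrees of freedom. Computing it needs an F distribution function, which is not in Python's standard library.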

  12. Two-way comparisons • There are other factors that might influence time to discriminate a target, e.g. age [Figure: it (axis from 50 to 200) by tint level (no, lo, hi), in two panels: Younger and Older]

  13. Two way ANOVA

      Analysis of Variance Table

      Response: it
                 Df Sum Sq Mean Sq F value    Pr(>F)
      tint        2   6597    3298  3.0965   0.04765 *
      agegp       1  81612   81612 76.6164 1.567e-15 ***
      Residuals 178 189607    1065
      ---
      Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

  14. Interaction plots [Figure: interaction plot of mean of it (axis from 40 to 90) against tint (no, lo, hi), with one line per agegp (Older, Younger)]

  15. Two-way ANOVA: interaction

      Analysis of Variance Table

      Response: it
                  Df Sum Sq Mean Sq F value    Pr(>F)
      tint         2   6597    3298  3.1109   0.04702 *
      agegp        1  81612   81612 76.9729 1.466e-15 ***
      tint:agegp   2   2999    1499  1.4141   0.24590
      Residuals  176 186609    1060
      ---
      Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

  16. Response curves • Sometimes a response should be handled as a regression problem rather than ANOVA [Figure: distance (axis from 0.6 to 1.2) against angle (axis from 3.0 to 4.5)]

  17. Pseudoreplication

  18. Nested structures • If the scale of your effect doesn’t match the scale of your experimental unit, don’t pretend that it does. Q: How many experimental units do we have for comparing treatment to control?

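  19. Maximum likelihood estimation • Likelihood is the probability of data $y$ given a population, parameterized by $\theta$ • The value of $\theta$ that maximizes the likelihood is the maximum likelihood estimate $\hat{\theta}$ • Example: $y_i = \mu + \varepsilon_i$, $\varepsilon_i \sim N(0, \sigma^2)$, $i = 1, 2, \ldots, n$, with likelihood $L(\mu, \sigma^2) = \prod_{i=1}^{n} \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left(-\frac{(y_i - \mu)^2}{2\sigma^2}\right)$ and log-likelihood $\ell(\mu, \sigma^2) = -\frac{n}{2} \log(2\pi\sigma^2) - \frac{1}{2\sigma^2} \sum_{i=1}^{n} (y_i - \mu)^2$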
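For the normal model on this slide the maximizer has a closed form: the sample mean for $\mu$ and the mean squared deviation (dividing by $n$, not $n - 1$) for $\sigma^2$. The sketch below evaluates the log-likelihood and checks numerically that nearby parameter values do no better. The data values and function names are hypothetical.

```python
import math

def normal_log_likelihood(y, mu, sigma2):
    """Log-likelihood of i.i.d. N(mu, sigma2) observations y."""
    n = len(y)
    sum_sq = sum((yi - mu) ** 2 for yi in y)
    return -0.5 * n * math.log(2 * math.pi * sigma2) - sum_sq / (2 * sigma2)

def normal_mle(y):
    """Closed-form maximum likelihood estimates of mu and sigma2."""
    n = len(y)
    mu_hat = sum(y) / n
    sigma2_hat = sum((yi - mu_hat) ** 2 for yi in y) / n  # divides by n, not n - 1
    return mu_hat, sigma2_hat

# Hypothetical data, not taken from the slides.
y = [4.1, 5.3, 4.8, 5.9, 5.0, 4.6, 5.4, 4.9]
mu_hat, sigma2_hat = normal_mle(y)
best = normal_log_likelihood(y, mu_hat, sigma2_hat)

# Perturb each parameter in turn; the log-likelihood should not increase.
nearby = [
    normal_log_likelihood(y, mu_hat + d_mu, sigma2_hat * scale)
    for d_mu in (-0.1, 0.0, 0.1)
    for scale in (0.9, 1.0, 1.1)
]
print(f"mu_hat={mu_hat:.3f}  sigma2_hat={sigma2_hat:.3f}  max log-lik={best:.3f}")
print("no nearby value is better:", all(v <= best + 1e-12 for v in nearby))
```

For models without a closed form, the same log-likelihood function would be handed to a numerical optimizer instead.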

  20. Bayesian estimation $p(\theta \mid y) = \frac{p(y \mid \theta)\, p(\theta)}{p(y)}$ It is often difficult to get $p(y)$ directly, but $p(y)$ is just a normalizing constant: $p(\theta \mid y) \propto p(y \mid \theta)\, p(\theta)$, so use various tricks to generate samples from $p(y \mid \theta)\, p(\theta)$. The most popular trick is MCMC.
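As an illustration of sampling from an unnormalized posterior, here is a minimal random-walk Metropolis sketch, one simple member of the MCMC family. It assumes a normal likelihood with known standard deviation and a normal prior on the mean. The data, prior settings, step size, and function names are all hypothetical choices for the example.

```python
import math
import random
from statistics import mean

def log_unnormalized_posterior(mu, y, sigma=1.0, prior_mean=0.0, prior_sd=10.0):
    """log p(y | mu) + log p(mu), dropping constants that do not depend on mu."""
    log_lik = -sum((yi - mu) ** 2 for yi in y) / (2 * sigma ** 2)
    log_prior = -((mu - prior_mean) ** 2) / (2 * prior_sd ** 2)
    return log_lik + log_prior

def metropolis(y, n_samples=20000, burn_in=2000, step=0.5, seed=1):
    """Random-walk Metropolis draws from the posterior of mu."""
    rng = random.Random(seed)
    mu = 0.0
    log_p = log_unnormalized_posterior(mu, y)
    draws = []
    for i in range(n_samples + burn_in):
        proposal = mu + rng.gauss(0.0, step)
        log_p_new = log_unnormalized_posterior(proposal, y)
        # Accept with probability min(1, p_new / p_old); p(y) cancels in the ratio.
        if rng.random() < math.exp(min(0.0, log_p_new - log_p)):
            mu, log_p = proposal, log_p_new
        if i >= burn_in:
            draws.append(mu)
    return draws

# Hypothetical data, not taken from the slides.
y = [4.1, 5.3, 4.8, 5.9, 5.0, 4.6, 5.4, 4.9]
draws = metropolis(y)

# For this conjugate setup the exact posterior mean is known, which gives a check.
exact_mean = sum(y) / (len(y) + 1.0 / 10.0 ** 2)
print(f"MCMC posterior mean={mean(draws):.3f}  exact={exact_mean:.3f}")
```

Only the ratio of posterior densities enters the acceptance step, so the normalizing constant $p(y)$ is never needed.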
