Inference for Numerical Data III
Dajiang Liu @ PHS 525, Feb 23rd, 2016

Central Limit Theorem
• The sample mean point estimate $\bar{x}$ is approximately normal: $\bar{x} \sim N(\mu, SE)$, where $SE$ is the standard error of the sample mean
• The approximation works when:
  • The sample size is "large" (a rule of thumb is sample size $n \ge 30$)
  • The distribution is not skewed (i.e., it is symmetric)
  • There are no outliers
• The approximation may not be good if any of the above 3 conditions is not met

Population distribution does not need to be normal
• [Figure: sampling distribution for different sample sizes]
• The sample mean is still approximately normal when the sample size is large enough
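The claim on this slide can be checked with a small simulation. The sketch below is illustrative only: the exponential population, the number of replications, and the seed are assumptions chosen for the demonstration, not values from the lecture.

```r
# Simulate the sampling distribution of the mean from a right-skewed population.
# Assumptions (not from the slides): Exp(1) population, 5000 replications.
set.seed(525)                      # arbitrary seed, for reproducibility only
n_reps <- 5000

sample_means <- function(n) {
  # Each replication draws n observations and records their mean
  replicate(n_reps, mean(rexp(n, rate = 1)))   # Exp(1) has mean 1 and sd 1
}

means_small <- sample_means(5)
means_large <- sample_means(30)

# Spread of the simulated means versus the theoretical value sd / sqrt(n)
c(simulated = sd(means_small), theory = 1 / sqrt(5))
c(simulated = sd(means_large), theory = 1 / sqrt(30))

# Visual check: the n = 30 histogram should look much closer to a bell shape
# hist(means_small, breaks = 40)
# hist(means_large, breaks = 40)
```

With n = 5 the histogram of sample means is still visibly right-skewed; with n = 30 it is much closer to symmetric, and the spread shrinks roughly as 1/sqrt(n).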

4.33 Answer
• (a) The distribution is right-skewed: most values are small, and there are several very large outliers.
• (b) As the sample size gets larger, the distribution of the sample mean behaves more like a normal distribution. Yet there are still heavy upper tails, possibly due to the influence of the outliers with large values.

4.35 Answer
• (1) -> (b)
• (2) -> (a)
• (3) -> (c)
• The key is to examine the standard error: the sample mean from the largest sample has the smallest standard error.

One Sample Means with t-distribution
• The Central Limit Theorem requires large sample sizes
  • In large samples, the sample mean estimate is more likely to be normally distributed
  • In large samples, the sample mean estimate tends to have a smaller standard deviation
Yet:
• In many cases, large samples can be hard to attain
• The t-distribution can be a helpful alternative for small sample inference

The Normality Condition – Modified
• Central limit theorem, modified:
  • The sampling distribution for the mean is nearly normal when the sample observations are independent and come from a nearly normal distribution.
• Important to note:
  • The modified CLT does not put a constraint on the sample size
  • The modified CLT does require that the population distribution is nearly normal
  • The original CLT does not require the population distribution to be normal
  • Even for small sample sizes, the modified CLT holds

Degrees of Freedom (df)
• Degrees of freedom determine the shape of the t-distribution
• The larger the df, the more closely the t-distribution resembles the normal distribution
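One way to see this convergence is to compare critical values. The sketch below uses the base R functions `qt` and `qnorm`; the particular df values are arbitrary choices for illustration.

```r
# 97.5th percentile of the t-distribution for increasing degrees of freedom,
# compared with the corresponding normal critical value.
df <- c(2, 5, 10, 30, 100, 1000)   # illustrative df values (an assumption)
t_crit <- qt(0.975, df = df)
z_crit <- qnorm(0.975)             # about 1.96

data.frame(
  df = df,
  t_crit = round(t_crit, 3),
  excess_over_z = round(t_crit - z_crit, 3)
)
```

The t critical value is always larger than the normal one, which reflects the heavier tails, and the gap shrinks toward zero as df grows.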

[Figure: the tails of the t-distribution are heavier than those of the normal distribution]

Use t-distribution to Obtain Confidence Interval
• Confidence intervals obtained using the t-distribution can be more accurate
• Procedure for obtaining a t-distribution based confidence interval:
  • Obtain the sample mean point estimate $\bar{x}$
  • Obtain the sample standard deviation $s$
  • Obtain the standard error for the sample mean point estimate, $SE = s/\sqrt{n}$
  • The confidence interval is obtained by $\bar{x} - t^*_{df} \times SE \le \mu \le \bar{x} + t^*_{df} \times SE$
  • $t^*_{df}$ is the critical t-value
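The procedure above can be sketched as a short R function. The function name `t_confidence_interval` and the data vector are hypothetical, introduced only for illustration; the sketch assumes the usual one-sample choice df = n - 1 and cross-checks the result against R's built-in `t.test`.

```r
# Hypothetical helper: t-based confidence interval for a population mean.
t_confidence_interval <- function(x, conf_level = 0.95) {
  n <- length(x)
  x_bar <- mean(x)                      # sample mean point estimate
  s <- sd(x)                            # sample standard deviation
  se <- s / sqrt(n)                     # standard error of the mean
  # Critical t-value with df = n - 1 (two-sided interval)
  t_star <- qt(1 - (1 - conf_level) / 2, df = n - 1)
  c(lower = x_bar - t_star * se, upper = x_bar + t_star * se)
}

# Made-up data, not from the lecture
x <- c(4.1, 5.3, 3.8, 4.9, 4.4, 5.0, 3.6)

t_confidence_interval(x)
t.test(x)$conf.int                      # built-in result, should agree
```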

Example: What are the normal- and t-distribution-based 95% confidence intervals?
Answer:
• $SE = s/\sqrt{n} = 2.3/\sqrt{19} = 0.53$
• The normal confidence interval is (3.36, 5.44)
• The t confidence interval is (3.29, 5.51)
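The two intervals can be reproduced in R. This is a sketch under stated assumptions: the summary statistics below (mean 4.4, standard deviation 2.3, n = 19) are inferred from the slide's SE of 0.53 and from the midpoint of the reported intervals, and the SE is rounded to two decimals as the slide appears to do.

```r
# Summary statistics inferred from the slide (assumptions, see lead-in)
x_bar <- 4.4
s <- 2.3
n <- 19

se <- round(s / sqrt(n), 2)             # 0.53, rounded as on the slide

z_star <- qnorm(0.975)                  # normal critical value
t_star <- qt(0.975, df = n - 1)         # t critical value with df = 18

normal_ci <- x_bar + c(-1, 1) * z_star * se
t_ci <- x_bar + c(-1, 1) * t_star * se

round(normal_ci, 2)                     # 3.36 5.44
round(t_ci, 2)                          # 3.29 5.51
```

The t-based interval is wider because the critical value for 18 degrees of freedom is larger than the normal critical value of about 1.96.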

Hypothesis Testing with t-Distribution
• T statistic
  • For a sample of size $n$, estimate the sample mean $\bar{x}$ and standard deviation $s$
  • To test the hypothesis $H_0: \mu = \mu_0$ vs. $H_A: \mu > \mu_0$
  • A t-statistic can be calculated: $T = \dfrac{\bar{x} - \mu_0}{SE}$, where $SE = s/\sqrt{n}$
• The p-value can be assessed by $\Pr(T^* > T)$, where $T^*$ is a random variable with distribution $t_{df}$
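The test described above can be sketched in R as follows. The function name `upper_tail_t_test`, the data, and the null value are hypothetical; the sketch assumes df = n - 1 and compares the hand computation with the built-in `t.test`.

```r
# Hypothetical helper: one-sample t-test of H0: mu = mu0 vs HA: mu > mu0.
upper_tail_t_test <- function(x, mu0) {
  n <- length(x)
  se <- sd(x) / sqrt(n)                 # standard error of the mean
  t_stat <- (mean(x) - mu0) / se        # T = (x_bar - mu0) / SE
  # Upper-tail probability Pr(T* > T) under a t-distribution with n - 1 df
  p_value <- pt(t_stat, df = n - 1, lower.tail = FALSE)
  c(t = t_stat, df = n - 1, p_value = p_value)
}

# Made-up data and null value, not from the lecture
x <- c(4.1, 5.3, 3.8, 4.9, 4.4, 5.0, 3.6)
result <- upper_tail_t_test(x, mu0 = 4)
result

# Built-in equivalent for comparison
reference <- t.test(x, mu = 4, alternative = "greater")
reference
```

For the alternative $\mu < \mu_0$ the lower tail is used instead, and for a two-sided alternative the one-tail probability of |T| is doubled, as in the R answers that follow.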

Answer and R command:
(a). pt(1.91,df=10,lower.tail=FALSE)
     [1] 0.04260244
(b). 2*pt(0.83,df=6,lower.tail=FALSE)
     [1] 0.4383084
(c). pt(-3.45,df=16,lower.tail=TRUE)
     [1] 0.001646786
(d). pt(2.13,df=28,lower.tail=FALSE)
     [1] 0.02104844

Answer 5.19
• (a). $H_0: \mu = 8$ vs. $H_A: \mu < 8$
• (b). $T = \dfrac{7.73 - 8}{0.77/\sqrt{25}} = -\dfrac{0.27}{0.77} \times 5 = -1.75$
• (c). P-value 0.046
• (d). Reject the null hypothesis at $\alpha = 0.05$
• (e). (7.47, 7.99)
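The numbers in this answer can be checked in R. The summary statistics below (mean 7.73, standard deviation 0.77, n = 25) are read from the calculation in part (b). The 90% level used for part (e) is an inference from the reported interval, not something the slide states.

```r
# Summary statistics from part (b)
x_bar <- 7.73
s <- 0.77
n <- 25
mu0 <- 8

se <- s / sqrt(n)                       # 0.154
t_stat <- (x_bar - mu0) / se            # about -1.75

# HA: mu < 8, so the p-value is the lower-tail probability
p_value <- pt(t_stat, df = n - 1)       # about 0.046

# Interval matching part (e), assuming a 90% confidence level
ci_90 <- x_bar + c(-1, 1) * qt(0.95, df = n - 1) * se

round(t_stat, 2)
round(p_value, 3)
round(ci_90, 2)                         # 7.47 7.99
```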
