
Review of Probabilities and Basic Statistics (10-701 Recitation 1, 1/25/2013)



  1. Review of Probabilities and Basic Statistics. Carnegie Mellon University, 10-701 Machine Learning, Spring 2013. TA: Ina Fiterau (4th-year PhD student, MLD). Instructors: Alex Smola, Barnabas Poczos. 10-701 Recitations, 1/25/2013.

  2. Overview: Introduction to Probability Theory; Random Variables, Independent RVs; Properties of Common Distributions; Estimators, Unbiased Estimators, Risk; Conditional Probabilities / Independence; Bayes' Rule and Probabilistic Inference.

  3. Review: the concept of probability. Sample space Ω – the set of all possible outcomes. Event E ⊆ Ω – a subset of the sample space. Probability measure – maps events to the unit interval: "How likely is it that event E will occur?" Kolmogorov axioms: P(E) ≥ 0 for every event E; P(Ω) = 1; and for pairwise disjoint events E_1, E_2, …, P(E_1 ∪ E_2 ∪ ⋯) = Σ_{i=1}^∞ P(E_i).
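
A minimal Python sketch of these axioms on a finite space (the fair six-sided die here is an assumed example, not from the slides):

```python
# A finite probability space as a dict, illustrating the Kolmogorov axioms
# on a fair six-sided die (assumed example).
from fractions import Fraction

omega = {1, 2, 3, 4, 5, 6}
P_outcome = {w: Fraction(1, 6) for w in omega}       # elementary probabilities

def P(event):
    """Probability measure induced on subsets of omega."""
    return sum(P_outcome[w] for w in event)

assert all(P({w}) >= 0 for w in omega)               # non-negativity
assert P(omega) == 1                                  # normalization
E1, E2 = {1, 2}, {5}                                  # pairwise disjoint events
assert P(E1 | E2) == P(E1) + P(E2)                    # additivity over disjoint events
```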

  4. Reasoning with events. Venn diagrams: P(A) = Vol(A) / Vol(Ω). Event union and intersection: P(A ∪ B) = P(A) + P(B) − P(A ∩ B). Properties of event union/intersection – Commutativity: A ∪ B = B ∪ A, A ∩ B = B ∩ A; Associativity: A ∪ (B ∪ C) = (A ∪ B) ∪ C; Distributivity: A ∩ (B ∪ C) = (A ∩ B) ∪ (A ∩ C).
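
A quick sketch checking inclusion-exclusion by direct counting (the die events below are an assumed example):

```python
# Inclusion-exclusion P(A ∪ B) = P(A) + P(B) - P(A ∩ B) on a fair die.
from fractions import Fraction

omega = {1, 2, 3, 4, 5, 6}
P = lambda event: Fraction(len(event), len(omega))    # uniform measure

A = {w for w in omega if w % 2 == 0}    # "even roll"      -> {2, 4, 6}
B = {w for w in omega if w >= 4}        # "at least four"  -> {4, 5, 6}

assert P(A | B) == P(A) + P(B) - P(A & B) == Fraction(2, 3)
```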

  5. Reasoning with events. De Morgan's laws: (A ∪ B)^c = A^c ∩ B^c and (A ∩ B)^c = A^c ∪ B^c. Proof of law #1 by double containment: show (A ∪ B)^c ⊆ A^c ∩ B^c and A^c ∩ B^c ⊆ (A ∪ B)^c.

  6. Reasoning with events. Disjoint (mutually exclusive) events: P(A ∩ B) = 0, so P(A ∪ B) = P(A) + P(B). Examples: A and A^c; the cells S_1, …, S_6 of a partition (as in the Venn diagram). NOT the same as independent events – for instance, successive coin flips are independent but not disjoint.
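
A small sketch of the disjoint-vs-independent distinction on two fair coin flips (an assumed example in the spirit of the slide):

```python
# Disjoint vs. independent events on two fair coin flips.
from fractions import Fraction
from itertools import product

omega = set(product("HT", repeat=2))                  # {('H','H'), ('H','T'), ...}
P = lambda event: Fraction(len(event), len(omega))

first_H  = {w for w in omega if w[0] == "H"}
second_H = {w for w in omega if w[1] == "H"}
first_T  = {w for w in omega if w[0] == "T"}

# Successive flips: independent (P(A ∩ B) = P(A) P(B)) but NOT disjoint.
assert P(first_H & second_H) == P(first_H) * P(second_H) != 0
# An event and its complement: disjoint (P(A ∩ B) = 0) but NOT independent.
assert P(first_H & first_T) == 0 != P(first_H) * P(first_T)
```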

  7. Partitions. A partition S_1, …, S_n: the events cover the sample space (S_1 ∪ ⋯ ∪ S_n = Ω) and are pairwise disjoint (S_i ∩ S_j = ∅ for i ≠ j). Event reconstruction: P(A) = Σ_{i=1}^n P(A ∩ S_i). Boole's inequality: P(∪_{i=1}^∞ A_i) ≤ Σ_{i=1}^∞ P(A_i). Bayes' rule: P(S_i | A) = P(A | S_i) P(S_i) / Σ_{k=1}^n P(A | S_k) P(S_k).
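
A worked sketch of Bayes' rule over a partition; the priors and likelihoods below are hypothetical numbers, not from the slides:

```python
# Bayes' rule over a partition: P(S_i | A) = P(A | S_i) P(S_i) / sum_k P(A | S_k) P(S_k).
priors      = {"S1": 0.7, "S2": 0.2, "S3": 0.1}        # P(S_i), a partition (hypothetical)
likelihoods = {"S1": 0.1, "S2": 0.5, "S3": 0.9}        # P(A | S_i) (hypothetical)

evidence  = sum(likelihoods[s] * priors[s] for s in priors)           # P(A) = 0.26
posterior = {s: likelihoods[s] * priors[s] / evidence for s in priors}

print(posterior)   # e.g. P(S3 | A) = 0.09 / 0.26 ≈ 0.346
```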

  8. Overview: Introduction to Probability Theory; Random Variables, Independent RVs; Properties of Common Distributions; Estimators, Unbiased Estimators, Risk; Conditional Probabilities / Independence; Bayes' Rule and Probabilistic Inference.

  9. Random Variables. A random variable associates a value with the outcome of a random event. Sample space 𝒳: the possible values of the r.v. X. Example (event to random variable): draw 2 numbers between 1 and 4 and let the r.v. X be their sum. The 16 ordered pairs (1,1), (1,2), …, (4,4) are equally likely, and X takes the values 2 through 8. Induced probability function on 𝒳: P(X = 2) = 1/16, P(X = 3) = 2/16, P(X = 4) = 3/16, P(X = 5) = 4/16, P(X = 6) = 3/16, P(X = 7) = 2/16, P(X = 8) = 1/16.
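
A short sketch that reproduces the slide's induced pmf by enumerating the 16 equally likely pairs:

```python
# Induced pmf of X = sum of two draws from {1, 2, 3, 4}, all 16 ordered pairs equally likely.
from collections import Counter
from fractions import Fraction
from itertools import product

outcomes = list(product(range(1, 5), repeat=2))        # the 16 elementary outcomes
counts = Counter(a + b for a, b in outcomes)           # X(E) = a + b

pmf = {x: Fraction(c, len(outcomes)) for x, c in sorted(counts.items())}
print(pmf)   # probabilities 1/16, 2/16, 3/16, 4/16, 3/16, 2/16, 1/16 for x = 2..8 (in lowest terms)
```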

  10. Cumulative Distribution Functions. F_X(x) = P(X ≤ x) for all x. The CDF completely determines the probability distribution of an RV. A function F(x) is a CDF iff: lim_{x→−∞} F(x) = 0 and lim_{x→+∞} F(x) = 1; F(x) is a non-decreasing function of x; and F(x) is right-continuous: for every x_0, lim_{x→x_0, x > x_0} F(x) = F(x_0).

  11. Identically distributed RVs. Two random variables X_1 and X_2 are identically distributed iff for all sets of values B, P(X_1 ∈ B) = P(X_2 ∈ B). So does that mean the variables are equal? NO. Example: toss a coin 3 times and let X_H and X_T be the number of heads and tails respectively. They have the same distribution, but X_H = 3 − X_T, so they are not equal.
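
A quick sketch of the slide's example by enumerating the 8 outcomes of 3 tosses:

```python
# X_H and X_T (heads/tails counts over 3 tosses) are identically distributed
# but not equal as random variables: X_H = 3 - X_T on every outcome.
from collections import Counter
from itertools import product

outcomes = list(product("HT", repeat=3))               # 8 equally likely outcomes
heads = [w.count("H") for w in outcomes]
tails = [w.count("T") for w in outcomes]

assert Counter(heads) == Counter(tails)                # same induced distribution
assert all(h == 3 - t for h, t in zip(heads, tails))   # but the variables differ
```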

  12. Discrete vs. Continuous RVs. Discrete 𝒳: step CDF; probability mass function f_X(x) = P(X = x) for all x. Continuous 𝒳: continuous CDF; probability density function f_X with F_X(x) = ∫_{−∞}^{x} f_X(u) du for all x.

  13. Interval Probabilities. Obtained by integrating the area under the density curve: P(x_1 ≤ X ≤ x_2) = ∫_{x_1}^{x_2} f_X(x) dx. This explains why P(X = x) = 0 for continuous distributions: P(X = x) ≤ lim_{ε→0, ε>0} [F_X(x) − F_X(x − ε)] = 0.
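
A numerical sketch of these two facts; the Exp(1) density and the helper `prob_interval` are assumptions for illustration, not from the slides:

```python
# Interval probability by integrating a density, and the vanishing point mass,
# illustrated with the Exp(1) density (assumed example).
import math

f = lambda x: math.exp(-x)                        # Exp(1) density on [0, inf)
F = lambda x: 1 - math.exp(-x)                    # its CDF

def prob_interval(a, b, n=100_000):
    """Midpoint Riemann-sum approximation of the integral of f from a to b."""
    h = (b - a) / n
    return sum(f(a + (i + 0.5) * h) for i in range(n)) * h

print(prob_interval(1.0, 2.0))                    # ≈ F(2) - F(1) ≈ 0.2325
print(F(1.0) - F(1.0 - 1e-8))                     # ≈ 0: a single point carries no probability
```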

  14. Moments. Expectations: the expected value of a function g of a r.v. X ~ P is E[g(X)] = ∫ g(x) P(x) dx. The n-th moment of a probability distribution is μ_n = ∫ x^n P(x) dx; the mean is μ = μ_1. The n-th central moment is μ'_n = ∫ (x − μ)^n P(x) dx; the variance is σ² = μ'_2.
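
A discrete sketch of these definitions, reusing the sum-of-two-draws pmf from slide 9 (the `moment` helper is just an illustration):

```python
# Mean and variance from moments: mu = mu_1, sigma^2 = mu'_2 = E[(X - mu)^2].
pmf = {2: 1/16, 3: 2/16, 4: 3/16, 5: 4/16, 6: 3/16, 7: 2/16, 8: 1/16}

moment = lambda n: sum((x ** n) * p for x, p in pmf.items())           # E[X^n]
mu  = moment(1)                                                        # first moment (mean)
var = sum(((x - mu) ** 2) * p for x, p in pmf.items())                 # second central moment

print(mu, var, moment(2) - mu ** 2)   # ~5.0, ~2.5, ~2.5 (Var(X) = E[X^2] - (E[X])^2)
```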

  15. Multivariate Distributions. Example: uniformly draw X and Y from the set {1, 2, 3} and let W = X + Y, V = |X − Y|. Joint pmf of (W, V): f(2,0) = 1/9, f(3,1) = 2/9, f(4,0) = 1/9, f(4,2) = 2/9, f(5,1) = 2/9, f(6,0) = 1/9, and 0 elsewhere. Marginals: P(W = 2, …, 6) = 1/9, 2/9, 3/9, 2/9, 1/9 and P(V = 0, 1, 2) = 3/9, 4/9, 2/9. In general, P((X, Y) ∈ A) = Σ_{(x,y)∈A} f(x, y) and the marginal is f_Y(y) = Σ_x f(x, y). For independent RVs: f(x_1, …, x_n) = f_{X_1}(x_1) ⋯ f_{X_n}(x_n).
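
A short sketch that rebuilds the slide's joint table and marginals by counting the 9 equally likely pairs:

```python
# Joint pmf and marginals of W = X + Y, V = |X - Y| for X, Y uniform on {1, 2, 3}.
from collections import Counter
from fractions import Fraction
from itertools import product

pairs = list(product([1, 2, 3], repeat=2))                  # 9 equally likely pairs
joint = Counter((x + y, abs(x - y)) for x, y in pairs)      # counts of (W, V)

joint_pmf = {wv: Fraction(c, 9) for wv, c in sorted(joint.items())}
marginal_W, marginal_V = Counter(), Counter()
for (w, v), c in joint.items():
    marginal_W[w] += c
    marginal_V[v] += c

print(joint_pmf)     # f(2,0)=1/9, f(3,1)=2/9, f(4,0)=1/9, f(4,2)=2/9, f(5,1)=2/9, f(6,0)=1/9
print(marginal_W)    # W = 2..6 with counts 1, 2, 3, 2, 1 out of 9
print(marginal_V)    # V = 0, 1, 2 with counts 3, 4, 2 out of 9
```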

  16. Overview: Introduction to Probability Theory; Random Variables, Independent RVs; Properties of Common Distributions; Estimators, Unbiased Estimators, Risk; Conditional Probabilities / Independence; Bayes' Rule and Probabilistic Inference.

  17. Bernoulli. X = 1 with probability p and X = 0 with probability 1 − p, where 0 ≤ p ≤ 1. Mean and variance: E[X] = 1·p + 0·(1 − p) = p; Var(X) = (1 − p)²·p + (0 − p)²·(1 − p) = p(1 − p). MLE of p: the sample mean. Connections to other distributions: if X_1, …, X_n ~ Bern(p) are independent, then Y = Σ_{i=1}^n X_i is Binomial(n, p); the Geometric distribution is the number of Bernoulli trials needed to get one success.
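
A small simulation sketch (seed and sample size are arbitrary choices) showing that the sample mean recovers p:

```python
# MLE of p for Bernoulli data is the sample mean; p_hat*(1 - p_hat) then
# estimates the variance p(1 - p).
import random

random.seed(0)
p, n = 0.3, 10_000
draws = [1 if random.random() < p else 0 for _ in range(n)]

p_hat = sum(draws) / n             # MLE: sample mean
print(p_hat)                       # close to 0.3
print(p_hat * (1 - p_hat))         # plug-in estimate of Var(X) = p(1 - p) ≈ 0.21
```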

  18. Binomial. P(X = x; n, p) = C(n, x) p^x (1 − p)^{n−x}. Mean and variance: E[X] = Σ_{x=0}^{n} x C(n, x) p^x (1 − p)^{n−x} = … = np; Var(X) = np(1 − p). NOTE: Var(X) = E[X²] − (E[X])². A sum of independent Binomials with the same p is Binomial; conditionals on Binomials are Binomial.
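
A sketch checking the mean and variance formulas directly from the pmf (n = 10, p = 0.3 are arbitrary illustrative values):

```python
# Binomial(n, p): E[X] = np and Var(X) = E[X^2] - (E[X])^2 = np(1 - p),
# computed by summing over the pmf.
from math import comb

n, p = 10, 0.3
pmf = {x: comb(n, x) * p**x * (1 - p)**(n - x) for x in range(n + 1)}

EX  = sum(x * q for x, q in pmf.items())
EX2 = sum(x * x * q for x, q in pmf.items())

print(EX, n * p)                         # ~3.0, 3.0
print(EX2 - EX**2, n * p * (1 - p))      # ~2.1, 2.1
```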

  19. Properties of the Normal Distribution. Operations on normally distributed variables: if X_1, X_2 ~ Norm(0, 1) are independent, then X_1 ± X_2 ~ Norm(0, 2) and X_1 / X_2 ~ Cauchy(0, 1). If X_1 ~ Norm(μ_1, σ_1²), X_2 ~ Norm(μ_2, σ_2²) and X_1 ⊥ X_2, then Z = X_1 + X_2 ~ Norm(μ_1 + μ_2, σ_1² + σ_2²). If (X, Y) is jointly normal with means (μ_x, μ_y) and covariance matrix [[σ_X², ρσ_Xσ_Y], [ρσ_Xσ_Y, σ_Y²]], then X + Y is still normally distributed: the mean is the sum of the means and the variance is σ_{X+Y}² = σ_X² + σ_Y² + 2ρσ_Xσ_Y, where ρ is the correlation.
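
A simulation sketch of the correlated-sum variance formula; it assumes numpy is available, and the particular σ_X, σ_Y, ρ values are arbitrary:

```python
# Var(X + Y) = sigma_X^2 + sigma_Y^2 + 2*rho*sigma_X*sigma_Y for jointly normal X, Y.
import numpy as np

rng = np.random.default_rng(0)
sigma_x, sigma_y, rho = 1.0, 2.0, 0.5
cov = [[sigma_x**2, rho * sigma_x * sigma_y],
       [rho * sigma_x * sigma_y, sigma_y**2]]

xy = rng.multivariate_normal(mean=[0.0, 0.0], cov=cov, size=1_000_000)
s = xy[:, 0] + xy[:, 1]

print(s.var())                                                 # empirical, ≈ 7.0
print(sigma_x**2 + sigma_y**2 + 2 * rho * sigma_x * sigma_y)   # 1 + 4 + 2 = 7.0
```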

  20. Overview: Introduction to Probability Theory; Random Variables, Independent RVs; Properties of Common Distributions; Estimators, Unbiased Estimators, Risk; Conditional Probabilities / Independence; Bayes' Rule and Probabilistic Inference.
