  1. Statistical Preliminaries Stony Brook University CSE545, Fall 2016

  2. Random Variables X : A mapping from Ω to ℝ that describes the question we care about in practice.

  3. Random Variables X : A mapping from Ω to ℝ that describes the question we care about in practice. Example: Ω = 5 coin tosses = {<HHHHH>, <HHHHT>, <HHHTH>, <HHHTT>, …}. We may just care about how many tails we get. Thus, X(<HHHHH>) = 0, X(<HHHTH>) = 1, X(<TTTHT>) = 4, X(<HTTTT>) = 4. X has only 6 possible values: 0, 1, 2, 3, 4, 5. What is the probability that we end up with k = 4 tails? P (X = k) := P ( {ω : X(ω) = k} ), where ω ∊ Ω

  4. Random Variables X : A mapping from Ω to ℝ that describes the question we care about in practice. Example: Ω = 5 coin tosses = {<HHHHH>, <HHHHT>, <HHHTH>, <HHHTT>, …}. We may just care about how many tails we get. Thus, X(<HHHHH>) = 0, X(<HHHTH>) = 1, X(<TTTHT>) = 4, X(<HTTTT>) = 4. X has only 6 possible values: 0, 1, 2, 3, 4, 5. What is the probability that we end up with k = 4 tails? P (X = k) := P ( {ω : X(ω) = k} ), where ω ∊ Ω. X(ω) = 4 for 5 of the 32 outcomes in Ω. Thus, assuming a fair coin, P (X = 4) = 5/32. (X is not a variable, but a function that we end up notating a lot like a variable.)
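
The 5/32 figure can be checked by brute force: enumerate all 2⁵ = 32 outcomes and count those with exactly four tails. A minimal sketch in plain Python (not code from the course):

```python
from itertools import product

# Sample space: all 2**5 = 32 sequences of heads/tails.
omega = list(product("HT", repeat=5))

# X(omega) = number of tails in the sequence.
def X(outcome):
    return outcome.count("T")

# P(X = 4) under a fair coin = (# outcomes with 4 tails) / |Omega|.
favorable = sum(1 for w in omega if X(w) == 4)
print(favorable, len(omega), favorable / len(omega))   # 5 32 0.15625 (= 5/32)
```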

  5. Random Variables X : A mapping from Ω to ℝ that describes the question we care about in practice. X is a discrete random variable if it takes only a countable number of values. Example: Ω = 5 coin tosses = {<HHHHH>, <HHHHT>, <HHHTH>, <HHHTT>, …}. We may just care about how many tails we get. Thus, X(<HHHHH>) = 0, X(<HHHTH>) = 1, X(<TTTHT>) = 4, X(<HTTTT>) = 4. X has only 6 possible values: 0, 1, 2, 3, 4, 5. What is the probability that we end up with k = 4 tails? P (X = k) := P ( {ω : X(ω) = k} ), where ω ∊ Ω. X(ω) = 4 for 5 of the 32 outcomes in Ω. Thus, assuming a fair coin, P (X = 4) = 5/32. (X is not a variable, but a function that we end up notating a lot like a variable.)

  6. Random Variables X : A mapping from Ω to ℝ that describes the question we care about in practice. X is a discrete random variable if it takes only a countable number of values. X is a continuous random variable if it can take on an infinite number of values between any two given values.

  7. Random Variables X : A mapping from Ω to ℝ that describes the question we care about in practice. X is a continuous random variable if it can take on an infinite number of values between any two given values. Example: Ω = inches of snowfall = [0, ∞) ⊆ ℝ; X = amount of inches in a snowstorm, X(ω) = ω. What is the probability we receive (at least) a inches? P (X ≥ a) := P ( {ω : X(ω) ≥ a} ). What is the probability we receive between a and b inches? P (a ≤ X ≤ b) := P ( {ω : a ≤ X(ω) ≤ b} )

  8. Random Variables X : A mapping from Ω to ℝ that describes the question we care about in practice. X is a continuous random variable if it can take on an infinite number of values between any two given values. Example: Ω = inches of snowfall = [0, ∞) ⊆ ℝ; X = amount of inches in a snowstorm, X(ω) = ω. P (X = i) := 0, for all i ∊ Ω (the probability of receiving exactly i inches of snowfall is zero). What is the probability we receive (at least) a inches? P (X ≥ a) := P ( {ω : X(ω) ≥ a} ). What is the probability we receive between a and b inches? P (a ≤ X ≤ b) := P ( {ω : a ≤ X(ω) ≤ b} )

  9. Random Variables, Revisited X : A mapping from Ω to ℝ that describes the question we care about in practice. X is a continuous random variable if it can take on an infinite number of values between any two given values. Example: Ω = inches of snowfall = [0, ∞) ⊆ ℝ; X = amount of inches in a snowstorm, X(ω) = ω. P (X = i) := 0, for all i ∊ Ω (the probability of receiving exactly i inches of snowfall is zero). What is the probability we receive (at least) a inches? P (X ≥ a) := P ( {ω : X(ω) ≥ a} ). What is the probability we receive between a and b inches? P (a ≤ X ≤ b) := P ( {ω : a ≤ X(ω) ≤ b} ). How to model this?

  10. Continuous Random Variables How to model? Discretize them! (group the values into discrete bins)

  11. Continuous Random Variables For the example binning shown on the slide: P (bin=8) = .32, P (bin=12) = .08. But aren’t we throwing away information?
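
To make the binning step concrete, here is a small sketch (numpy assumed; the distribution, bin width, and sample size are made-up choices, so the resulting probabilities will not match the .32 and .08 on the slide): draw samples of a continuous variable, group them into unit-width bins, and estimate P(bin) as a relative frequency.

```python
import numpy as np

rng = np.random.default_rng(0)
samples = rng.normal(loc=10, scale=2, size=10_000)   # e.g., inches of snowfall (made up)

# Discretize: group the continuous values into unit-width bins [0,1), [1,2), ..., [19,20].
edges = np.arange(0, 21)
counts, _ = np.histogram(samples, bins=edges)

# Estimated P(bin) = relative frequency of samples falling in that bin.
p_bin = counts / counts.sum()
print(p_bin[8], p_bin[12])   # rough estimates of P(bin=8) and P(bin=12) for this made-up data
```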

  12. Continuous Random Variables

  13. Continuous Random Variables X is a continuous random variable if it can take on an infinite number of values between any two given values. X is a continuous random variable if there exists a function fₓ such that: fₓ(x) ≥ 0 for all x, ∫ fₓ(x) dx over all of ℝ is 1, and P (a ≤ X ≤ b) = ∫ₐᵇ fₓ(x) dx

  14. Continuous Random Variables X is a continuous random variable if it can take on an infinite number of values between any two given values. X is a continuous random variable if there exists a function fₓ such that: fₓ(x) ≥ 0 for all x, ∫ fₓ(x) dx over all of ℝ is 1, and P (a ≤ X ≤ b) = ∫ₐᵇ fₓ(x) dx. fₓ : “probability density function” (pdf)
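
As a numerical sanity check of the definition (scipy is assumed here; the slides do not prescribe a library), the sketch below integrates a concrete pdf, the standard normal, to recover P(a ≤ X ≤ b) and to confirm the total area is 1.

```python
import numpy as np
from scipy import integrate
from scipy.stats import norm

f = norm(loc=0, scale=1).pdf             # a concrete pdf: the standard normal

# P(a <= X <= b) is the area under f between a and b.
a, b = -1.0, 2.0
area, _ = integrate.quad(f, a, b)
print(area)                               # ~0.8186, matches norm.cdf(b) - norm.cdf(a)

# The total area under any pdf is 1.
total, _ = integrate.quad(f, -np.inf, np.inf)
print(total)                              # ~1.0
```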

  15. Continuous Random Variables

  16. Continuous Random Variables

  17. Continuous Random Variables Common Trap ● fₓ(x) does not yield a probability ○ ∫ₐᵇ fₓ(x) dx does ○ fₓ(x) may be any non-negative real number, not just a value in [0, 1] ■ thus, fₓ(x) may be > 1
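
A short illustration of the trap (scipy assumed; Normal(0, 0.1²) is just a convenient example): the density at the mean is about 4, yet every interval probability computed from it stays in [0, 1].

```python
from scipy.stats import norm

tight = norm(loc=0, scale=0.1)            # Normal(0, 0.1^2)

print(tight.pdf(0))                        # ~3.99  -> a density value, NOT a probability
print(tight.cdf(0.1) - tight.cdf(-0.1))    # ~0.683 -> a probability, always in [0, 1]
```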

  18. Continuous Random Variables A Common Probability Density Function

  19. Continuous Random Variables Common pdfs: Normal(μ, σ²) = fₓ(x) = (1 / (σ √(2π))) · exp( −(x − μ)² / (2σ²) )

  20. Continuous Random Variables Common pdfs: Normal(μ, σ²) = fₓ(x) = (1 / (σ √(2π))) · exp( −(x − μ)² / (2σ²) ). μ : mean (or “center”) = expectation. σ² : variance; σ : standard deviation
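
Since the rendered formula did not survive extraction cleanly, here is a direct translation of that density into code, with a scipy comparison as a sanity check (the particular x, μ, and σ values are arbitrary):

```python
import math
from scipy.stats import norm

def normal_pdf(x, mu, sigma):
    """Normal(mu, sigma^2) density: (1 / (sigma*sqrt(2*pi))) * exp(-(x-mu)^2 / (2*sigma^2))."""
    return math.exp(-((x - mu) ** 2) / (2 * sigma ** 2)) / (sigma * math.sqrt(2 * math.pi))

print(normal_pdf(1.0, mu=0.0, sigma=2.0))   # ~0.1760
print(norm(loc=0.0, scale=2.0).pdf(1.0))     # same value from scipy
```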

  21. Continuous Random Variables Common pdfs: Normal(μ, σ²) = fₓ(x) = (1 / (σ √(2π))) · exp( −(x − μ)² / (2σ²) ). μ : mean (or “center”) = expectation. σ² : variance; σ : standard deviation. (Figure credit: Wikipedia)

  22. Continuous Random Variables Common pdfs: Normal(μ, σ²). X ~ Normal(μ, σ²), examples: ● height ● intelligence/ability ● measurement error ● averages (or sums) of lots of random variables

  23. Continuous Random Variables Common pdfs: Normal(0, 1) (the “standard normal”). How to “standardize” any normal distribution: ● subtract the mean, μ (aka “mean centering”) ● divide by the standard deviation, σ. z = (x − μ) / σ (aka the “z score”). (Credit: MIT OpenCourseWare: Probability and Statistics)
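
A minimal sketch of standardization on made-up data (numpy assumed): after subtracting the mean and dividing by the standard deviation, the z scores have mean ≈ 0 and standard deviation ≈ 1.

```python
import numpy as np

rng = np.random.default_rng(42)
x = rng.normal(loc=170, scale=10, size=1_000)   # e.g., heights ~ Normal(170, 10^2), made up

# z score: subtract the mean, divide by the standard deviation.
z = (x - x.mean()) / x.std()

print(z.mean(), z.std())    # ~0.0 and ~1.0: the standardized values look ~ Normal(0, 1)
```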

  24. Continuous Random Variables Common pdfs: Normal(0, 1). (Credit: MIT OpenCourseWare: Probability and Statistics)

  25. Cumulative Distribution Function For a given random variable X, the cumulative distribution function (CDF), Fₓ: ℝ → [0, 1], is defined by: Fₓ(x) = P (X ≤ x). (Figure: CDFs of the Uniform and Normal distributions)

  26. Cumulative Distribution Function For a given random variable X, the cumulative distribution function (CDF), Fₓ: ℝ → [0, 1], is defined by: Fₓ(x) = P (X ≤ x). Pro: yields a probability! Con: not intuitively interpretable. (Figure: CDFs of the Uniform, Exponential, and Normal distributions)
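
Because the CDF returns probabilities directly, interval questions like the snowfall ones reduce to differences of CDF values. A sketch with scipy, where the Normal(8, 3²) model and the endpoints are assumptions for illustration only:

```python
from scipy.stats import norm

X = norm(loc=8, scale=3)      # pretend snowfall ~ Normal(8, 3^2) inches

a, b = 6, 12
print(1 - X.cdf(a))           # P(X >= a)      = 1 - F_X(a)
print(X.cdf(b) - X.cdf(a))    # P(a <= X <= b) = F_X(b) - F_X(a)
```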

  27. Random Variables, Revisited X : A mapping from Ω to ℝ that describes the question we care about in practice. X is a discrete random variable if it takes only a countable number of values. X is a continuous random variable if it can take on an infinite number of values between any two given values.

  28. Discrete Random Variables For a given random variable X, the cumulative distribution function (CDF), Fₓ: ℝ → [0, 1], is defined by: Fₓ(x) = P (X ≤ x). X is a discrete random variable if it takes only a countable number of values.

  29. Discrete Random Variables For a given random variable X, the cumulative distribution function (CDF), Fₓ: ℝ → [0, 1], is defined by: Fₓ(x) = P (X ≤ x). X is a discrete random variable if it takes only a countable number of values. (Figure: CDF of a Binomial(n, p), which looks like that of a normal)
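
The “(like normal)” remark matches the standard fact that for moderately large n the Binomial(n, p) CDF is close to that of a Normal with mean np and variance np(1 - p). A quick comparison (scipy assumed; n, p, and the evaluation points are my own choices):

```python
from scipy.stats import binom, norm

n, p = 100, 0.5
approx = norm(loc=n * p, scale=(n * p * (1 - p)) ** 0.5)   # matching mean and variance

for k in (40, 50, 60):
    # The two CDFs are close, though not identical (a continuity correction tightens it).
    print(k, binom.cdf(k, n, p), approx.cdf(k))
```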

  30. Discrete Random Variables For a given random variable X, the cumulative distribution function (CDF), Fₓ: ℝ → [0, 1], is defined by: Fₓ(x) = P (X ≤ x). X is a discrete random variable if it takes only a countable number of values. For a given discrete random variable X, the probability mass function (pmf), fₓ: ℝ → [0, 1], is defined by: fₓ(x) = P (X = x). (Figure: Binomial(n, p))

  31. Discrete Random Variables Two Common Discrete Random Variables: ● Binomial(n, p). Example: number of heads after n coin flips (p = probability of heads) ● Bernoulli(p) = Binomial(1, p). Example: one trial of success or failure. (Figure: Binomial(n, p))
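
A pmf sketch tying back to the coin example (scipy assumed): Binomial(5, 0.5) assigns probability 5/32 to exactly four tails, matching the enumeration earlier, and Bernoulli(p) is just the n = 1 case.

```python
from scipy.stats import bernoulli, binom

# Binomial(n=5, p=0.5): number of tails (or heads) in 5 fair coin flips.
print(binom.pmf(4, n=5, p=0.5))     # 0.15625 = 5/32, same as the enumeration above

# Bernoulli(p) = Binomial(1, p): a single success/failure trial.
print(bernoulli.pmf(1, p=0.3))      # P(success) = 0.3
print(binom.pmf(1, n=1, p=0.3))     # identical
```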

  32. Hypothesis Testing Hypothesis -- something one asserts to be true. Classical Approach: H₀ : null hypothesis -- some “default” value; “null” => nothing changes. H₁ : the alternative -- the opposite of the null => a change or a difference

  33. Hypothesis Testing Hypothesis -- something one asserts to be true. Classical Approach: H₀ : null hypothesis -- some “default” value; “null” => nothing changes. H₁ : the alternative -- the opposite of the null => a change or a difference. Goal: use probability to determine if we can “reject the null” (H₀) in favor of H₁: “If the null were true, there would be less than a 5% chance of seeing data this extreme.” Example: hypothesize that a coin is biased. H₀ : the coin is not biased (i.e., flipping it n times results in a Binomial(n, 0.5))

  34. Hypothesis Testing Hypothesis -- something one asserts to be true. Classical Approach: H₀ : null hypothesis -- some “default” value (usually that one’s hypothesis is false). H₁ : the alternative -- usually that one’s hypothesis is true. More formally: let X be a random variable and let R be the range of X. R_reject ⊂ R is the rejection region. If X ∊ R_reject, then we reject the null.
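
A minimal sketch of the biased-coin test under H₀: X ~ Binomial(n, 0.5) (scipy assumed; the sample size, the observed count, and the 5% threshold are illustrative choices, not values from the slides): compute how likely a count at least this far from n/2 would be under the null, and reject H₀ if that probability falls below 0.05.

```python
from scipy.stats import binom

n, p0 = 100, 0.5          # H0: fair coin  ->  X ~ Binomial(100, 0.5)
observed = 62             # number of heads actually seen (illustrative)

# Two-sided p-value: probability, under H0, of a count at least as far from n*p0
# as the observed one (lower tail + upper tail).
dist = abs(observed - n * p0)
p_value = binom.cdf(n * p0 - dist, n, p0) + binom.sf(n * p0 + dist - 1, n, p0)

print(p_value)            # ~0.02 for these numbers
print("reject H0" if p_value < 0.05 else "fail to reject H0")
```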
