Probability Review III Harvard Math Camp - Econometrics Ashesh - PowerPoint PPT Presentation

Probability Review III Harvard Math Camp - Econometrics Ashesh Rambachan Summer 2018

Outline
Useful Univariate Distributions
  Bernoulli distribution
  Binomial distribution
  Uniform distribution
  Normal distribution
  Chi-squared Distribution
Multivariate Normal Distribution
  Definition
  Properties
  Quadratic Forms

Useful Univariate Distributions
We will not review every distribution in math camp, but we will refresh the most useful ones. See the notes for a full review.

Bernoulli distribution
$X$ is a discrete random variable that can take only two values, 0 and 1. We write
$$f_X(x) = p^x (1 - p)^{1 - x}, \quad x \in \{0, 1\}.$$
Note that
$$E[X^k] = p \ \text{for } k \geq 1, \quad V(X) = p(1 - p), \quad M_X(t) = (1 - p) + p e^t.$$
$X$ has a Bernoulli distribution.

Binomial distribution
$X_i$ for $i = 1, \ldots, n$ are i.i.d. Bernoulli random variables with $P(X_i = 1) = p$. Define
$$X = \sum_{i=1}^n X_i.$$
$X$ follows a binomial distribution with parameters $n$ and $p$. It takes values $0, 1, \ldots, n$ and
$$f_X(x) = \binom{n}{x} p^x (1 - p)^{n - x}$$
with $E[X] = np$, $V(X) = np(1 - p)$.
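As a quick sanity check on the binomial formulas, the sketch below evaluates the pmf on the full support $0, 1, \ldots, n$ and compares the implied mean and variance with $np$ and $np(1-p)$. The function name `binomial_pmf` and the values $n = 10$, $p = 0.3$ are illustrative assumptions, not part of the slides.

```python
from math import comb


def binomial_pmf(x: int, n: int, p: float) -> float:
    """P(X = x) for X ~ Binomial(n, p), with x in {0, ..., n}."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)


n, p = 10, 0.3  # illustrative values, not from the slides
support = range(n + 1)  # the support starts at 0, not 1
pmf = [binomial_pmf(x, n, p) for x in support]

total = sum(pmf)
mean = sum(x * f for x, f in zip(support, pmf))
var = sum((x - mean) ** 2 * f for x, f in zip(support, pmf))

# Compare with 1, n*p and n*p*(1-p) respectively.
print(total, mean, var)
```

The probabilities sum to one only when $x = 0$ is included in the support.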

Uniform distribution
$X$ is a continuous random variable with
$$f_X(x) = \frac{1}{b - a}$$
for $x \in [a, b]$ and 0 otherwise. $X$ is uniformly distributed on $[a, b]$ and we write $X \sim U[a, b]$.
$$E[X] = \frac{1}{2}(a + b), \quad V(X) = \frac{1}{12}(b - a)^2.$$

Normal distribution
Suppose $Z$ is continuously distributed with support over $\mathbb{R}$. $Z$ follows a standard normal distribution if
$$f_Z(z) = \frac{1}{\sqrt{2\pi}} e^{-\frac{1}{2} z^2}.$$
Denote it $Z \sim N(0, 1)$, where $E[Z] = 0$, $V(Z) = 1$. $X \sim N(\mu, \sigma^2)$ if
$$f_X(x) = \frac{1}{\sqrt{2\pi\sigma^2}} e^{-\frac{1}{2\sigma^2}(x - \mu)^2}$$
with $E[X] = \mu$, $V(X) = \sigma^2$ and $X = \mu + \sigma Z$, where $Z \sim N(0, 1)$.

Normal distribution
The MGF of a standard normal random variable is incredibly useful. If $Z \sim N(0, 1)$, then
$$M_Z(t) = e^{\frac{1}{2} t^2}.$$
If $X \sim N(\mu, \sigma^2)$, then
$$M_X(t) = e^{\mu t + \frac{1}{2}\sigma^2 t^2}.$$
Why?
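One way to answer the "Why?" is to use the representation $X = \mu + \sigma Z$ with $Z \sim N(0, 1)$ from the previous slide and evaluate the standard normal MGF at $\sigma t$. A short derivation along those lines:

```latex
\begin{align*}
M_X(t) &= E\left[e^{tX}\right] = E\left[e^{t(\mu + \sigma Z)}\right] \\
       &= e^{\mu t}\, E\left[e^{(\sigma t) Z}\right] = e^{\mu t}\, M_Z(\sigma t) \\
       &= e^{\mu t}\, e^{\frac{1}{2}(\sigma t)^2} = e^{\mu t + \frac{1}{2}\sigma^2 t^2}.
\end{align*}
```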

Chi-squared Distribution
Let $Z_i \sim N(0, 1)$ i.i.d. for $i = 1, \ldots, n$. Let
$$X = \sum_{i=1}^n Z_i^2.$$
$X$ is a chi-squared random variable with $n$ degrees of freedom and we write $X \sim \chi^2_n$. Note $E[X] = n$, $V(X) = 2n$.
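The moments $E[X] = n$ and $V(X) = 2n$ can be checked by simulation. The sketch below builds chi-squared draws directly from the definition as sums of squared standard normals. The seed, the choice $n = 5$, and the number of draws are arbitrary assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed so the run is reproducible
n = 5                           # degrees of freedom (illustrative)
draws = 200_000                 # number of simulated values of X

# Each row holds n i.i.d. N(0, 1) draws; summing the squares gives one X.
z = rng.standard_normal((draws, n))
x = (z**2).sum(axis=1)

sample_mean = x.mean()
sample_var = x.var()

# These should be close to n and 2n, up to simulation noise.
print(sample_mean, sample_var)
```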


The i.i.d. case
Let $Z = (Z_1, \ldots, Z_m)'$, where $Z_i \sim N(0, 1)$ i.i.d. The joint density of $Z$ is
$$f_Z(z) = \prod_{i=1}^m \frac{1}{\sqrt{2\pi}} e^{-\frac{1}{2} z_i^2} = (2\pi)^{-m/2} \exp\left(-\frac{1}{2} z'z\right).$$
Moreover, $E[Z] = 0$ and $V(Z) = I_m$. The MGF of $Z$ is
$$M_Z(t) = E\left[e^{t'Z}\right] = \prod_{i=1}^m E\left[e^{t_i Z_i}\right] = e^{\frac{1}{2} t't}.$$
This is a useful reference point as we develop some results about the multivariate normal distribution.

Definition
The $m$-dimensional random vector $X$ follows an $m$-dimensional multivariate normal distribution if and only if $a'X$ is normally distributed for all $a \in \mathbb{R}^m$. We write $X \sim N_m(\mu, \Sigma)$, where $E[X] = \mu$ is the $m$-dimensional mean vector and $V(X) = \Sigma$ is the $m \times m$ covariance matrix.
What is its joint density? We use the following results to get there.

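Density of Multivariate Normal
Result 1: Suppose $X \sim N(\mu, \Sigma)$. Then,
$$M_X(t) = e^{t'\mu + \frac{1}{2} t'\Sigma t}.$$
Proof: By definition, $t'X \sim N(t'\mu, t'\Sigma t)$. Therefore, with $Y \sim N(t'\mu, t'\Sigma t)$,
$$M_X(t) = E\left[e^{t'X}\right] = E\left[e^{Y}\right] = M_Y(1) = e^{t'\mu + \frac{1}{2} t'\Sigma t},$$
where the last step evaluates the univariate normal MGF at 1.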

Density of Multivariate Normal
Result 2: Let $X \sim N_m(\mu, \Sigma)$ and $Y = AX + b$, where $A \in \mathbb{R}^{n \times m}$, $b \in \mathbb{R}^n$. Then,
$$Y \sim N_n(A\mu + b, A\Sigma A').$$
Proof: For $t \in \mathbb{R}^n$,
$$M_Y(t) = E\left[e^{t'Y}\right] = E\left[e^{t'(AX + b)}\right] = e^{t'b}\, E\left[e^{(A't)'X}\right] = e^{t'b}\, e^{(A't)'\mu + \frac{1}{2}(A't)'\Sigma(A't)} = e^{t'(A\mu + b) + \frac{1}{2} t'(A\Sigma A')t},$$
which is the MGF of a $N_n(A\mu + b, A\Sigma A')$ random vector by Result 1.

Density of Multivariate Normal
We are now ready to derive the density of $X \sim N(\mu, \Sigma)$. Suppose $X \sim N(\mu, \Sigma)$ and $\Sigma$ has full rank. Then, the density of $X$ is given by
$$f_X(x) = (2\pi)^{-m/2} |\Sigma|^{-1/2} \exp\left(-\frac{1}{2}(x - \mu)'\Sigma^{-1}(x - \mu)\right).$$

Density of Multivariate Normal: Proof Sketch
$Z$ is an $m$-dimensional random vector of i.i.d. standard normal random variables. We have
$$M_Z(t) = e^{\frac{1}{2} t't},$$
so $Z \sim N_m(0, I_m)$ with
$$f_Z(z) = (2\pi)^{-m/2} e^{-\frac{1}{2} z'z}.$$
Let $X = \mu + \Sigma^{1/2} Z$. Using Result 2, $X \sim N_m(\mu, \Sigma)$. From the multivariate transformation of random variables formula, we get
$$f_X(x) = |\Sigma|^{-1/2} f_Z\left(\Sigma^{-1/2}(x - \mu)\right).$$
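The change-of-variables step in the proof sketch can be checked numerically. The sketch below evaluates the density once through the closed-form expression and once through $|\Sigma|^{-1/2} f_Z(\Sigma^{-1/2}(x - \mu))$, taking $\Sigma^{-1/2}$ to be the symmetric inverse square root obtained from an eigendecomposition. That choice of square root, the function names, and the numbers are assumptions made for illustration.

```python
import numpy as np


def std_normal_density(z):
    """Joint density of m i.i.d. N(0, 1) variables evaluated at z."""
    m = z.shape[0]
    return (2 * np.pi) ** (-m / 2) * np.exp(-0.5 * z @ z)


def mvn_density_direct(x, mu, sigma):
    """Closed-form N(mu, sigma) density; sigma must have full rank."""
    m = x.shape[0]
    dev = x - mu
    quad = dev @ np.linalg.solve(sigma, dev)  # (x - mu)' Sigma^{-1} (x - mu)
    return (2 * np.pi) ** (-m / 2) * np.linalg.det(sigma) ** (-0.5) * np.exp(-0.5 * quad)


def mvn_density_transform(x, mu, sigma):
    """Density via f_X(x) = |Sigma|^{-1/2} f_Z(Sigma^{-1/2} (x - mu))."""
    eigval, eigvec = np.linalg.eigh(sigma)  # sigma is symmetric
    sigma_inv_half = eigvec @ np.diag(eigval ** -0.5) @ eigvec.T
    return np.linalg.det(sigma) ** (-0.5) * std_normal_density(sigma_inv_half @ (x - mu))


# Illustrative inputs, not from the slides.
mu = np.array([1.0, -2.0])
sigma = np.array([[2.0, 0.5], [0.5, 1.0]])
x = np.array([0.3, -1.1])

direct = mvn_density_direct(x, mu, sigma)
via_transform = mvn_density_transform(x, mu, sigma)
print(direct, via_transform)  # the two values should agree
```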

Properties of Multivariate Normal Distribution
Next, we list some useful properties of the multivariate normal distribution. There is no need to memorize them; they are here so that you are familiar with them.
Results are stated without proof.

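Property #1: Concatenating independent multivariate normals
Property #1: If $X_1 \sim N_m(\mu_1, \Sigma_1)$, $X_2 \sim N_n(\mu_2, \Sigma_2)$ and $X_1 \perp X_2$, then
$$X = (X_1', X_2')' \sim N_{m+n}(\mu, \Sigma),$$
where
$$\mu = \begin{pmatrix} \mu_1 \\ \mu_2 \end{pmatrix}, \quad \Sigma = \begin{pmatrix} \Sigma_1 & 0 \\ 0 & \Sigma_2 \end{pmatrix}.$$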

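Property #2: Subvectors are multivariate normals
Property #2: Let $X \sim N_m(\mu, \Sigma)$. Let $X_1$ be a $p$-dimensional sub-vector of $X$ with $p < m$. Write
$$X = \begin{pmatrix} X_1 \\ X_2 \end{pmatrix}$$
and
$$\mu = \begin{pmatrix} \mu_1 \\ \mu_2 \end{pmatrix}, \quad \Sigma = \begin{pmatrix} \Sigma_{11} & \Sigma_{12} \\ \Sigma_{21} & \Sigma_{22} \end{pmatrix}.$$
Then, $X_1 \sim N_p(\mu_1, \Sigma_{11})$.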

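Property #3: $Cov(X_1, X_2) = 0 \iff X_1 \perp X_2$
Property #3: Let $X \sim N_m(\mu, \Sigma)$. Partition $X$ into two sub-vectors. That is, write
$$X = \begin{pmatrix} X_1 \\ X_2 \end{pmatrix}$$
and
$$\mu = \begin{pmatrix} \mu_1 \\ \mu_2 \end{pmatrix}, \quad \Sigma = \begin{pmatrix} \Sigma_{11} & \Sigma_{12} \\ \Sigma_{21} & \Sigma_{22} \end{pmatrix}.$$
Then, $X_1 \perp X_2$ if and only if $\Sigma_{12} = \Sigma_{21}' = 0$.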

Property #4
Property #4: Let $X \sim N_m(\mu, \Sigma)$. If $Y = AX + b$ and $V = CX + d$, where $A, C \in \mathbb{R}^{n \times m}$ and $b, d \in \mathbb{R}^n$, then
$$Cov(Y, V) = A\Sigma C'.$$
Moreover, $Y \perp V$ if and only if $A\Sigma C' = 0$.
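The covariance formula in Property #4 lends itself to a simulation check. The sketch below draws from $N_m(\mu, \Sigma)$, forms $Y = AX + b$ and $V = CX + d$, and compares the sample cross-covariance with $A\Sigma C'$. All matrices, vectors, and the seed are made-up values chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative parameters (m = 3, n = 2); not from the slides.
mu = np.array([0.0, 1.0, -1.0])
sigma = np.array([[2.0, 0.5, 0.0],
                  [0.5, 1.0, 0.3],
                  [0.0, 0.3, 1.5]])
A = np.array([[1.0, 0.0, 2.0],
              [0.0, 1.0, -1.0]])
C = np.array([[1.0, 1.0, 0.0],
              [0.5, 0.0, 1.0]])
b = np.array([1.0, 2.0])
d = np.array([0.0, -1.0])

x = rng.multivariate_normal(mu, sigma, size=200_000)  # each row is one draw of X
y = x @ A.T + b  # row-wise version of Y = A X + b
v = x @ C.T + d  # row-wise version of V = C X + d

# Sample cross-covariance between Y and V.
y_centered = y - y.mean(axis=0)
v_centered = v - v.mean(axis=0)
cov_sample = y_centered.T @ v_centered / (len(x) - 1)

cov_theory = A @ sigma @ C.T
print(cov_sample)
print(cov_theory)  # should match the sample version up to simulation noise
```

Note that the shifts $b$ and $d$ drop out of the covariance, as the formula suggests.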

Property #5: Linear conditional expectations
Property #5: Let $X \sim N_m(\mu, \Sigma)$ with $X = (X_1', X_2')'$, $\mu = (\mu_1', \mu_2')'$ and
$$\Sigma = \begin{pmatrix} \Sigma_{11} & \Sigma_{12} \\ \Sigma_{21} & \Sigma_{22} \end{pmatrix}.$$
Provided that $\Sigma_{22}$ has full rank, the conditional distribution of $X_1$ given $X_2 = x_2$ is
$$X_1 \mid X_2 = x_2 \sim N\left(\mu_1 + \Sigma_{12}\Sigma_{22}^{-1}(x_2 - \mu_2),\ \Sigma_{11} - \Sigma_{12}\Sigma_{22}^{-1}\Sigma_{21}\right).$$
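The conditional mean and covariance in Property #5 translate directly into a few lines of linear algebra. The sketch below is one possible implementation; the function name `conditional_normal`, its argument layout (the first `p` coordinates form $X_1$), and the numbers in the example are assumptions made for illustration.

```python
import numpy as np


def conditional_normal(mu, sigma, p, x2):
    """Mean and covariance of X1 | X2 = x2, where X1 is the first p coordinates.

    Assumes X ~ N(mu, sigma) and that the Sigma_22 block has full rank.
    """
    mu1, mu2 = mu[:p], mu[p:]
    s11, s12 = sigma[:p, :p], sigma[:p, p:]
    s21, s22 = sigma[p:, :p], sigma[p:, p:]
    # Solve linear systems with Sigma_22 rather than forming its inverse.
    cond_mean = mu1 + s12 @ np.linalg.solve(s22, x2 - mu2)
    cond_cov = s11 - s12 @ np.linalg.solve(s22, s21)
    return cond_mean, cond_cov


# Illustrative bivariate example, not from the slides.
mu = np.array([1.0, 2.0])
sigma = np.array([[4.0, 1.2], [1.2, 1.0]])
cond_mean, cond_cov = conditional_normal(mu, sigma, p=1, x2=np.array([3.0]))

# Here the mean is 1 + 1.2 * (3 - 2) = 2.2 and the variance is 4 - 1.2**2 = 2.56.
print(cond_mean, cond_cov)
```

The conditional covariance does not depend on the observed value $x_2$; only the conditional mean shifts with it.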

Property #5: Linear Conditional Expectations
What is the intuition behind this?
$$E[X_1 \mid X_2 = x_2] = \mu_1 + \Sigma_{12}\Sigma_{22}^{-1}(x_2 - \mu_2).$$
In 1-d, it becomes
$$E[X_1 \mid X_2 = x_2] = E[X_1] + \frac{Cov(X_1, X_2)}{V(X_2)}(x_2 - E[X_2]).$$
Next, let's relabel $Y = X_1$, $X = X_2$ and re-arrange:
$$E[Y \mid X = x] = \left(E[Y] - \frac{Cov(Y, X)}{V(X)} E[X]\right) + \frac{Cov(Y, X)}{V(X)}\, x.$$
This is simply the linear regression formula! If $(X, Y)$ are jointly normal, linear regression exactly returns the conditional expectation function.
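The link to linear regression can be illustrated by simulation: fitting a least-squares line to jointly normal draws should approximately recover the intercept $E[Y] - \frac{Cov(Y, X)}{V(X)} E[X]$ and the slope $\frac{Cov(Y, X)}{V(X)}$. The sketch below uses `np.polyfit` for the fit. The means, covariance matrix, and seed are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative joint distribution of (X, Y); not from the slides.
mean = np.array([1.0, 2.0])                # (E[X], E[Y])
cov = np.array([[1.0, 0.8], [0.8, 2.0]])   # V(X) = 1, Cov(X, Y) = 0.8, V(Y) = 2

sample = rng.multivariate_normal(mean, cov, size=200_000)
x, y = sample[:, 0], sample[:, 1]

# Coefficients implied by the conditional expectation formula.
slope_theory = cov[0, 1] / cov[0, 0]
intercept_theory = mean[1] - slope_theory * mean[0]

# Least-squares fit of y on x; polyfit returns the highest-degree coefficient first.
slope_ols, intercept_ols = np.polyfit(x, y, deg=1)

print(slope_theory, intercept_theory)
print(slope_ols, intercept_ols)  # close to the line above, up to sampling noise
```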

Property #6: Quadratic Form of a Multivariate Normal
A quadratic form is a quantity of the form $y'Ay$, where $A$ is a symmetric matrix. Suppose that $Z_i \sim N(0, 1)$ i.i.d. for $i = 1, \ldots, n$. We already know that
$$\sum_{i=1}^n Z_i^2 = Z'Z \sim \chi^2_n.$$
Property #6: If $X \sim N_m(\mu, \Sigma)$ and $\Sigma$ has full rank, then
$$(X - \mu)'\Sigma^{-1}(X - \mu) \sim \chi^2_m.$$
Why? Let $Z = \Sigma^{-1/2}(X - \mu) \sim N_m(0, I_m)$. Then, $Z'Z \sim \chi^2_m$.
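Property #6 can also be checked by simulation with a non-identity covariance matrix: the quadratic form $(X - \mu)'\Sigma^{-1}(X - \mu)$ should have mean close to $m$ and variance close to $2m$, matching the chi-squared moments stated earlier. The parameter values and seed below are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative parameters with m = 3; not from the slides.
mu = np.array([1.0, -2.0, 0.5])
sigma = np.array([[2.0, 0.5, 0.0],
                  [0.5, 1.0, 0.3],
                  [0.0, 0.3, 1.5]])
m = len(mu)

x = rng.multivariate_normal(mu, sigma, size=200_000)  # each row is one draw of X
dev = x - mu

# Row-wise quadratic form (x - mu)' Sigma^{-1} (x - mu), without inverting Sigma.
solved = np.linalg.solve(sigma, dev.T).T
q = np.einsum("ij,ij->i", dev, solved)

# Chi-squared with m degrees of freedom has mean m and variance 2m.
print(q.mean(), q.var())
```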
