Gov 2000: 2. Random Variables and Probability Distributions
Matthew Blackwell
Fall 2016
1. Random Variables
2. Probability Distributions
3. Cumulative Distribution Functions
4. Properties of Distributions
5. Famous distributions
6.
[Bar plot: Obama Presidential Approval (ANES 2016), counts of X = 0 (Don't Approve) vs. X = 1 (Approve)]

▶ Data like these are realizations of a random variable.
▶ What is the true Obama approval rate in the US?
▶ If we knew the true Obama approval rate, what samples are likely?
▶ Recall how we model uncertainty with probability:
▶ ℙ(𝜔) is the probability that a particular outcome 𝜔 will happen.
▶ We don't know which outcome will occur, but we know which outcomes are possible.
▶ Two coin flips: Ω = {HH, HT, TH, TT}
▶ Fair coins, independent: ℙ(HH) = ℙ(H)ℙ(H) = 0.5 × 0.5 = 0.25
Random Variable

A random variable (r.v.) is a function that maps from the sample space of an experiment to the real line, 𝑌 ∶ Ω → ℝ.

▶ Converting outcomes into numbers lets us use math!
▶ The value of the r.v. at a particular outcome 𝜔 is 𝑌(𝜔).
▶ Five coin tosses: one possible outcome is 𝜔 = HTHTT, but this is not a random variable because it's not numeric.
▶ 𝑌(𝜔) = number of heads in the five tosses
▶ 𝑌(HTHTT) = 2
▶ Approval example: Ω = {approve, don't approve}.
▶ A random variable converts this into a number:

𝑌 = 1 if approve, 0 if don't approve

▶ Called a Bernoulli, binary, or dummy random variable.
▶ If Ω = [0, ∞) is already numeric, we can just take 𝑌(𝜔) = 𝜔.
▶ Uncertainty over Ω ⇝ uncertainty over the value of 𝑌.
▶ We'll use probability to formalize this uncertainty.
▶ A probability distribution describes the probabilities of all of the possible values of the r.v.

[Density plot: values near the center of the distribution are more likely, values in the tails less likely]
▶ Independent fair coin flips, so that ℙ(H) = 0.5
▶ Then if 𝑌 = 1 for heads, ℙ(𝑌 = 1) = 0.5
▶ The data generating process (DGP) is the set of assumptions about how the data came to be.
▶ Examples: coin flips, randomly selecting a card from a deck, etc.
▶ These assumptions imply probabilities over outcomes.
▶ Often we'll skip the definition of Ω and directly connect the DGP and a r.v.
[Diagram: outcomes Ω = {TT, HT, TH, HH}, each with probability 1/4, mapped by 𝑌 to the values 0, 1, 2 on ℝ]

𝜔    ℙ({𝜔})   𝑌(𝜔)
TT   1/4      0
HT   1/4      1
TH   1/4      1
HH   1/4      2

𝑦    ℙ(𝑌 = 𝑦)
0    1/4
1    1/2
2    1/4
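A small R sketch of the tables above (variable names are my own): enumerate the two-flip sample space and collapse the outcome probabilities into the p.m.f. of 𝑌.

```r
# Two-flip sample space, outcome probabilities, and Y = number of heads
omega <- c("TT", "HT", "TH", "HH")
prob  <- rep(1/4, 4)            # fair, independent flips
y     <- c(0, 1, 1, 2)          # Y(omega) for each outcome

# p.m.f. of Y: sum the outcome probabilities within each value of Y
pmf <- tapply(prob, y, sum)
pmf
##    0    1    2
## 0.25 0.50 0.25
```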
Discrete Random Variable

A r.v., 𝑌, is discrete if its range (the set of values it can take) is finite (𝑌 ∈ {𝑦₁, …, 𝑦ₖ}) or countably infinite (𝑌 ∈ {𝑦₁, 𝑦₂, …}).

▶ The probability mass function (p.m.f.) gives the probability of each value:

𝑓𝑌(𝑦) = ℙ(𝑌 = 𝑦)

▶ Properties: 0 ≤ 𝑓𝑌(𝑦) ≤ 1 and ∑ₖ 𝑓𝑌(𝑦ₖ) = 1
▶ Probability of 𝑌 falling in a set 𝐴: ℙ(𝑌 ∈ 𝐴) = ∑_{𝑦∈𝐴} 𝑓𝑌(𝑦)
▶ Examples: number of deaths in a conflict, number of parties elected to a legislature.
▶ Flip independent fair coins for each unit
▶ Heads assigned to Control (C), tails to Treatment (T)
▶ 𝑌 = number of treated units:

𝑌 = 0 if (C, C, C)
    1 if (T, C, C), (C, T, C), or (C, C, T)
    2 if (T, T, C), (C, T, T), or (T, C, T)
    3 if (T, T, T)

▶ Independence means each outcome has probability 1/8, e.g.:

ℙ(C, T, C) = ℙ(C)ℙ(T)ℙ(C) = 1/2 ⋅ 1/2 ⋅ 1/2 = 1/8
𝑓𝑌(0) = ℙ(𝑌 = 0) = ℙ(C, C, C) = 1/8
𝑓𝑌(1) = ℙ(𝑌 = 1) = ℙ(T, C, C) + ℙ(C, T, C) + ℙ(C, C, T) = 3/8
𝑓𝑌(2) = ℙ(𝑌 = 2) = ℙ(T, T, C) + ℙ(C, T, T) + ℙ(T, C, T) = 3/8
𝑓𝑌(3) = ℙ(𝑌 = 3) = ℙ(T, T, T) = 1/8

▶ For any other value, the p.m.f. is 0!
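These four values match the Binomial(3, 1/2) p.m.f., which R can evaluate directly; a sketch for checking the arithmetic, not part of the original slides:

```r
# Y = number of treated units out of 3 fair coin flips: Binomial(3, 1/2)
dbinom(0:3, size = 3, prob = 0.5)
## [1] 0.125 0.375 0.375 0.125
```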
[Plot: p.m.f. 𝑓(𝑥) of the number of treated units, 𝑥 ∈ {0, 1, 2, 3}]

▶ What would be an alternative way to assign units to treatment? What is one major problem with it?
▶ What happens when 𝑌 can take any value in some subset of the real line?
▶ Suppose ℙ(𝑌 = 𝑦) = 𝜀 for 𝑦 ∈ (0, 1), where 𝜀 is a very small number.
▶ What's the probability of being between 0 and 1?
▶ There are an infinite number of real numbers between 0 and 1:
0.232879873…, 0.57263048743…, 0.9823612984…, …
▶ Each one has probability 𝜀 ⇝ ℙ(𝑌 ∈ (0, 1)) = ∞ × 𝜀 = ∞
▶ But probabilities can never exceed 1, so each individual point must have probability 0.
Thought experiment: draw a random real value between 0 and 10. What's the probability that we draw a value exactly equal to 𝜋?

3.1415926535 8979323846 2643383279 5028841971 6939937510 …

(The digits of 𝜋 never end, so a draw would have to match every one of them.)
Continuous Random Variable

A r.v., 𝑌, is continuous if there exists a nonnegative function on ℝ, 𝑓𝑌, called the probability density function (p.d.f.), such that for any interval 𝐴:

ℙ(𝑌 ∈ 𝐴) = ∫_𝐴 𝑓𝑌(𝑦) 𝑑𝑦

▶ In particular, ℙ(𝑎 < 𝑌 < 𝑏) = ∫_𝑎^𝑏 𝑓𝑌(𝑦) 𝑑𝑦.
▶ The probability of 𝑌 falling in a region is the area under the p.d.f. over that region.
▶ Any single point has probability zero: ℙ(𝑌 = 𝑦) = ∫_𝑦^𝑦 𝑓𝑌(𝑢) 𝑑𝑢 = 0
▶ Examples: time spent under a parliamentary system, proportion of voters who turned out, governmental budget allocations.
[Plot: p.d.f. 𝑓(𝑥) with the shaded area representing ℙ(0 < 𝑋 < 2)]

▶ Warning: the density is not a probability, 𝑓𝑌(𝑦) ≠ ℙ(𝑌 = 𝑦)
▶ Densities give probabilities only as areas over a particular region.
▶ A summary of the distribution that doesn't depend on discrete vs. continuous:

Cumulative distribution function

The cumulative distribution function (c.d.f.) returns the probability that a variable is at or below a particular value: 𝐹𝑌(𝑦) ≡ ℙ(𝑌 ≤ 𝑦).

▶ It accumulates probability up to 𝑦 rather than at a single point (like 𝑌 = 𝑦) on the real line.
▶ For continuous 𝑌: 𝐹𝑌(𝑦) = ∫_{−∞}^{𝑦} 𝑓𝑌(𝑢) 𝑑𝑢
▶ The c.d.f. is nondecreasing: if 𝑦 ≤ 𝑦′, then 𝐹𝑌(𝑦) ≤ 𝐹𝑌(𝑦′).
▶ Proof: the event 𝑌 ≤ 𝑦′ includes the event 𝑌 ≤ 𝑦, so ℙ(𝑌 ≤ 𝑦′) can't be smaller than ℙ(𝑌 ≤ 𝑦).
▶ The c.d.f. is right-continuous (it is continuous as you approach a point from the right).
▶ For discrete 𝑌, 𝐹𝑌(𝑦) is piecewise constant and staircase-like.
▶ For continuous 𝑌, 𝐹𝑌(𝑦) is continuous.
𝑦    ℙ(𝑌 = 𝑦)
0    1/8
1    3/8
2    3/8
3    1/8

𝐹𝑌(𝑦) = 0    for 𝑦 < 0
        1/8  for 0 ≤ 𝑦 < 1
        1/2  for 1 ≤ 𝑦 < 2
        7/8  for 2 ≤ 𝑦 < 3
        1    for 𝑦 ≥ 3
[Step plot: c.d.f. 𝐹(𝑥) of the number of treated units, jumping at 0, 1, 2, 3]
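Since 𝑌 here is Binomial(3, 1/2), the staircase c.d.f. can be evaluated with pbinom() (a sketch, not from the original slides):

```r
# F_Y(y) just below and at each jump of the staircase
pbinom(c(-0.5, 0, 0.5, 1, 2, 3), size = 3, prob = 0.5)
## [1] 0.000 0.125 0.125 0.500 0.875 1.000
```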
[Left plot: p.d.f. 𝑓(𝑥); right plot: the matching c.d.f. 𝐹(𝑥) rising from 0 to 1]

▶ For a continuous r.v., the c.d.f. is the area under the p.d.f. up to 𝑦:

𝐹𝑌(𝑦) = ∫_{−∞}^{𝑦} 𝑓𝑌(𝑢) 𝑑𝑢

▶ The p.d.f. is the slope (derivative) of the c.d.f. at each value.
▶ Notation: 𝐹𝑌(𝑦⁻) is the value of the c.d.f. just below 𝑦.
▶ For a continuous r.v., 𝐹𝑌(𝑦⁻) = 𝐹𝑌(𝑦).
▶ We can use the c.d.f. to get the probability of any interval or value:
▶ 𝑌 = 1 or 𝑌 = 2, so we need the prob. of 0 < 𝑌 ≤ 2:

ℙ(0 < 𝑌 ≤ 2) = 𝐹𝑌(2) − 𝐹𝑌(0) = 7/8 − 1/8 = 0.75
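A quick R check of this interval probability, again treating 𝑌 as Binomial(3, 1/2) (my sketch):

```r
# P(0 < Y <= 2) = F(2) - F(0)
pbinom(2, size = 3, prob = 0.5) - pbinom(0, size = 3, prob = 0.5)
## [1] 0.75
```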
▶ What is different about these two density curves? How might we summarize this difference?

[Plot: two density curves over 𝑥 with different centers and spreads]
▶ Central tendency: where the distribution is located. We'll focus on the mean/expectation.
▶ Spread: how dispersed the distribution is. We'll focus on the variance/standard deviation.
▶ These are features of the distribution itself, defined before we ever see data on a r.v.
▶ The first summary is the expected value (a/k/a the expectation or mean) of 𝑌.
▶ For a discrete r.v.:

𝔼[𝑌] = ∑ₖ 𝑦ₖ 𝑓𝑌(𝑦ₖ)

▶ Weighted average of the values of the r.v., weighted by the probability of each value occurring.
▶ For a continuous r.v.:

𝔼[𝑌] = ∫_{−∞}^{∞} 𝑦 𝑓𝑌(𝑦) 𝑑𝑦
▶ Expected number of treated units:

𝑦   𝑓𝑌(𝑦)   𝑦 ⋅ 𝑓𝑌(𝑦)
0   1/8     0
1   3/8     3/8
2   3/8     6/8
3   1/8     3/8

𝔼[𝑌] = ∑ₖ 𝑦ₖ 𝑓𝑌(𝑦ₖ) = 0 × 1/8 + 1 × 3/8 + 2 × 3/8 + 3 × 1/8 = 12/8 = 1.5
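In R, this probability-weighted average is one line (a sketch of the computation above):

```r
# E[Y] = sum over values of y * f_Y(y)
y   <- 0:3
pmf <- c(1, 3, 3, 1) / 8
sum(y * pmf)
## [1] 1.5
```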
▶ Example: 𝑓𝑌(𝑦) = 2𝑦 for 0 < 𝑦 < 1 (and 0 otherwise)

𝔼[𝑌] = ∫_{−∞}^{∞} 𝑦 𝑓𝑌(𝑦) 𝑑𝑦
     = ∫₀¹ 𝑦 (2𝑦) 𝑑𝑦
     = ∫₀¹ 2𝑦² 𝑑𝑦
     = (2/3)𝑦³ |₀¹
     = (2/3) ⋅ 1³ − (2/3) ⋅ 0³ = 2/3
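R's integrate() can confirm this numerically (a sketch, not from the slides):

```r
# E[Y] for the density f(y) = 2y on (0, 1), computed numerically
integrate(function(y) y * (2 * y), lower = 0, upper = 1)$value
## [1] 0.6666667
```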
▶ Expectation is linear:

𝔼[𝑌 + 𝑍] = 𝔼[𝑌] + 𝔼[𝑍]
𝔼[𝑎𝑌 + 𝑏] = 𝑎𝔼[𝑌] + 𝑏

▶ If 𝑔(𝑌) is a function of a discrete random variable, then

𝔼[𝑔(𝑌)] = ∑_𝑦 𝑔(𝑦) 𝑓𝑌(𝑦)

▶ 𝔼[𝑔(𝑌)] ≠ 𝑔(𝔼[𝑌]) unless 𝑔(⋅) is a linear function.
▶ 𝔼[𝑌𝑍] ≠ 𝔼[𝑌]𝔼[𝑍] unless 𝑌 and 𝑍 are independent (next week).
▶ The variance measures the spread of a distribution:

𝕍[𝑌] = 𝔼[(𝑌 − 𝔼[𝑌])²]

▶ For a discrete r.v.:

𝕍[𝑌] = ∑ₖ (𝑦ₖ − 𝔼[𝑌])² 𝑓𝑌(𝑦ₖ)

▶ For a continuous r.v.:

𝕍[𝑌] = ∫_{−∞}^{∞} (𝑦 − 𝔼[𝑌])² 𝑓𝑌(𝑦) 𝑑𝑦

▶ Larger deviations (+ or −) ⇝ higher variance
▶ The standard deviation is the square root of the variance: 𝜎𝑌 = √𝕍[𝑌].
▶ Variance of the number of treated units:

𝑦   𝑓𝑌(𝑦)   𝑦 − 𝔼[𝑌]   (𝑦 − 𝔼[𝑌])²
0   1/8     −1.5       2.25
1   3/8     −0.5       0.25
2   3/8     0.5        0.25
3   1/8     1.5        2.25

𝕍[𝑌] = ∑ₖ (𝑦ₖ − 𝔼[𝑌])² 𝑓𝑌(𝑦ₖ)
     = (−1.5)² × 1/8 + (−0.5)² × 3/8 + 0.5² × 3/8 + 1.5² × 1/8
     = 2.25 × 1/8 + 0.25 × 3/8 + 0.25 × 3/8 + 2.25 × 1/8
     = 0.75
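The same weighted sum in R (my sketch):

```r
# V[Y] = squared deviations from E[Y] = 1.5, weighted by the p.m.f.
y   <- 0:3
pmf <- c(1, 3, 3, 1) / 8
sum((y - 1.5)^2 * pmf)
## [1] 0.75
```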
▶ Useful properties of the variance:
▶ 𝕍[𝑌] = 𝔼[𝑌²] − (𝔼[𝑌])²
▶ 𝕍[𝑎𝑌 + 𝑏] = 𝑎²𝕍[𝑌]
▶ 𝕍[𝑌 + 𝑍] = 𝕍[𝑌] + 𝕍[𝑍] only when 𝑌 and 𝑍 are independent (next week).
▶ Back to the example 𝑓𝑌(𝑦) = 2𝑦 for 0 < 𝑦 < 1 (and 0 otherwise):

𝔼[𝑌²] = ∫_{−∞}^{∞} 𝑦² 𝑓𝑌(𝑦) 𝑑𝑦
      = ∫₀¹ 𝑦² (2𝑦) 𝑑𝑦
      = ∫₀¹ 2𝑦³ 𝑑𝑦
      = (2/4)𝑦⁴ |₀¹ = 1/2

▶ So 𝕍[𝑌] = 𝔼[𝑌²] − (𝔼[𝑌])² = 1/2 − (2/3)² = 1/18.
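A numerical check with integrate() under the same density (my sketch):

```r
# E[Y^2] and V[Y] = E[Y^2] - (E[Y])^2 for f(y) = 2y on (0, 1)
EY2 <- integrate(function(y) y^2 * (2 * y), lower = 0, upper = 1)$value
EY  <- integrate(function(y) y * (2 * y),   lower = 0, upper = 1)$value
EY2 - EY^2   # should be 1/18
## [1] 0.05555556
```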
▶ A parametric family is a set of distributions indexed by one or more parameters.
▶ The p.m.f./p.d.f. within the family has the same form, with parameters that might vary across the family.
▶ The parameters determine the shape of the distribution.
▶ Often we assume the data follow a common distribution 𝑓_𝜃(𝑦) within a family of distributions (normal, Poisson, etc.).
▶ Sneak peek: statistics uses the data to estimate the parameters 𝜃: 𝜃̂(𝑌₁, 𝑌₂, …)
▶ 𝑌 is a Bernoulli random variable if it is binary and ℙ(𝑌 = 1) = 𝑝
▶ p.m.f.: 𝑓𝑌(𝑦) = 𝑝^𝑦 (1 − 𝑝)^{1−𝑦}
▶ 𝑌₁, 𝑌₂, …, 𝑌ₙ are each a Bernoulli r.v. indicating Obama approval for each respondent.
▶ 𝑝 is the Obama approval rate in the population.
▶ Sneak peek: how can we learn about 𝑝 from 𝑌₁, 𝑌₂, …, 𝑌ₙ?
𝔼[𝑌] = ∑ₖ 𝑦ₖ 𝑓𝑌(𝑦ₖ) = 0 × 𝑓𝑌(0) + 1 × 𝑓𝑌(1) = 0 × (1 − 𝑝) + 1 × 𝑝 = 𝑝

𝕍[𝑌] = 𝔼[𝑌²] − (𝔼[𝑌])² = 𝑝 − 𝑝² = 𝑝(1 − 𝑝)
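A simulation check of these formulas, with p = 0.3 chosen arbitrarily for illustration (my sketch):

```r
# Sample mean and variance of many Bernoulli(0.3) draws should be close to
# p = 0.3 and p(1 - p) = 0.21
set.seed(1)                      # arbitrary seed for reproducibility
draws <- rbinom(100000, size = 1, prob = 0.3)
mean(draws)
var(draws)
```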
[Plot: p.m.f. of a binomial distribution over 𝑥]

▶ Let 𝑌 be the number of heads in 𝑛 independent coin flips, each with probability 𝑝 of heads.
▶ Then 𝑌 has a binomial distribution, written 𝑌 ∼ Bin(𝑛, 𝑝), which has p.m.f.:

𝑓𝑌(𝑦) = (𝑛 choose 𝑦) 𝑝^𝑦 (1 − 𝑝)^{𝑛−𝑦}, where (𝑛 choose 𝑦) = 𝑛!/(𝑦!(𝑛 − 𝑦)!)

▶ More generally: the number of successes in 𝑛 independent Bernoulli trials, each with success probability 𝑝.
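dbinom() implements this p.m.f.; a sketch checking it against the factorial formula via choose():

```r
# Bin(10, 0.5) p.m.f. from dbinom() matches the (n choose y) formula
y <- 0:10
all.equal(dbinom(y, size = 10, prob = 0.5),
          choose(10, y) * 0.5^y * (1 - 0.5)^(10 - y))
## [1] TRUE
```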
[Plot: p.m.f. of a discrete uniform distribution over 𝑥]

▶ 𝑌 has a discrete uniform distribution if each of 𝑘 values is equally likely:

𝑓𝑌(𝑦) = 1/𝑘 for 𝑦 = 1, …, 𝑘 (and 0 otherwise)

▶ Useful for things like die rolls or simple random sampling.
[Plot: normal p.d.f. centered at 𝜇 with variance 𝜎²]

▶ The normal distribution is extremely useful and ubiquitous in statistics.
▶ 𝔼[𝑌] = 𝜇 and 𝕍[𝑌] = 𝜎² are the parameters of the normal: 𝑌 ∼ 𝑁(𝜇, 𝜎²).
▶ p.d.f.:

𝑓𝑌(𝑦) = (1/(𝜎√(2𝜋))) exp{−(𝑦 − 𝜇)²/(2𝜎²)}

▶ The standard normal distribution is 𝑁(0, 1).
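dnorm() evaluates this density; a quick sketch comparing it to the formula at a few points:

```r
# Normal p.d.f. from dnorm() vs. the formula, for mu = 0, sigma = 1
y <- c(-1, 0, 1)
all.equal(dnorm(y, mean = 0, sd = 1),
          (1 / sqrt(2 * pi)) * exp(-y^2 / 2))
## [1] TRUE
```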
[Plot: standard normal p.d.f. with the area below 0 shaded]

▶ ℙ(𝑌 ≤ 0) for 𝑌 ∼ 𝑁(0, 1), using the c.d.f.:

pnorm(q = 0, mean = 0, sd = 1)
## [1] 0.5
[Plot: standard normal p.d.f. with the area above 0 shaded]

▶ ℙ(𝑌 > 0), using the upper tail:

pnorm(q = 0, mean = 0, sd = 1, lower.tail = FALSE)
## [1] 0.5
[Plot: standard normal p.d.f. with the area between −1 and 0 shaded]

▶ ℙ(−1 < 𝑌 ≤ 0) = 𝐹𝑌(0) − 𝐹𝑌(−1):

pnorm(q = 0, mean = 0, sd = 1) - pnorm(q = -1, mean = 0, sd = 1)
## [1] 0.3413447
[Plot: p.d.f. of the uniform distribution on [0, 1]]

▶ 𝑌 has a (continuous) uniform distribution on [𝑎, 𝑏] if its p.d.f. is:

𝑓𝑌(𝑦) = 1/(𝑏 − 𝑎) for 𝑦 ∈ [𝑎, 𝑏] (and 0 otherwise)

▶ Equal-width intervals within [𝑎, 𝑏] have an equal probability of containing 𝑌.
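A sketch with R's built-in uniform functions, using a = 0 and b = 1:

```r
# For Y ~ U(0, 1): flat density 1/(b - a) = 1, and equal-width intervals
# have equal probability
dunif(0.3, min = 0, max = 1)
## [1] 1
punif(0.4, min = 0, max = 1) - punif(0.2, min = 0, max = 1)   # P(0.2 < Y <= 0.4)
## [1] 0.2
```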
▶ How do we find the mean and variance of a r.v.?
▶ Know the p.m.f./p.d.f.? ⇝ calculate 𝔼[𝑌]/𝕍[𝑌] directly using the definitions.
▶ Often need calculus/summation tricks.
▶ Is the r.v. a function of another r.v. whose mean/variance you do know?
▶ ⇝ use linearity of expectations.
▶ Ex.: 𝔼[𝑍] = 0.2 and 𝑌 = 𝑍 + 1 ⇝ 𝔼[𝑌] = 𝔼[𝑍] + 1 = 1.2
▶ Simulation: draw a large number of realizations of 𝑌 and calculate the mean/variance of those.
▶ Useful when working with the p.d.f./p.m.f. is complicated.
▶ We can simulate draws from common distributions in R with functions like runif() or rnorm().

runif(n = 1, min = 0, max = 1)
## [1] 0.7265663

hold <- runif(n = 1000, min = 0, max = 1)
mean(hold)
## [1] 0.5134936
▶ We can approximate probabilities by the share of simulated draws that land in the event:

ℙ(𝑌 ∈ 𝐴) ≈ (# of draws in 𝐴) / (total # of draws)

sum(hold > 0.7) / length(hold)
## [1] 0.305

mean(hold > 0.7)
## [1] 0.305
▶ Random variables connect probability to data.
▶ Probability distributions formalize the uncertainty in their outcomes.
▶ We summarize distributions with means and variances.
▶ Next week: multiple random variables, independence, and conditional expectation.