Probability & Statistics Intro / Review, NEU 560, Jonathan Pillow



slide-1
SLIDE 1

Probability & Statistics Intro / Review

NEU 560 Jonathan Pillow Lecture 6, part II


slide-2
SLIDE 2

continuous probability distribution

takes values in a continuous space; described by a probability density function (pdf), which is non-negative and integrates to 1
slide-3
SLIDE 3

discrete probability distribution

takes a finite (or countably infinite) number of values; described by a probability mass function (pmf), which is non-negative and sums to 1

slide-4
SLIDE 4

some friendly neighborhood distributions

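Continuous:

  • Gaussian: $P(x; \mu, \sigma) = \frac{1}{\sqrt{2\pi}\,\sigma} \exp\left(-\frac{(x - \mu)^2}{2\sigma^2}\right)$
  • multivariate Gaussian: $P(\mathbf{x}; \mu, \Lambda) = \frac{1}{(2\pi)^{n/2} |\Lambda|^{1/2}} \exp\left(-\frac{1}{2} (\mathbf{x} - \mu)^T \Lambda^{-1} (\mathbf{x} - \mu)\right)$, for $\mathbf{x} \in \mathbb{R}^n$
  • exponential: $P(x; a) = a e^{-ax}$, for $x \ge 0$

Discrete:

  • Bernoulli: coin flipping
  • binomial: $P(k; n, p) = \binom{n}{k} p^k (1 - p)^{n-k}$ (sum of n coin flips)
  • Poisson: $P(k; \lambda) = \frac{\lambda^k}{k!} e^{-\lambda}$ (sum of n coin flips with P(heads) = λ/n, in the limit n→∞)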
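The Poisson limit stated above can be checked numerically. The sketch below is only an illustration: the rate `lam = 3.0`, the range of `k`, and the helper names are assumptions chosen for the example, not part of the lecture.

```python
import math

def binomial_pmf(k, n, p):
    """P(k; n, p): probability of k heads in n coin flips with P(heads) = p."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)

def poisson_pmf(k, lam):
    """P(k; lambda) for a Poisson distribution with rate lam."""
    return lam**k / math.factorial(k) * math.exp(-lam)

lam = 3.0          # illustrative rate (assumption)
ks = range(0, 11)  # compare the two pmfs on k = 0..10

# Largest pointwise gap between binomial(n, lam/n) and Poisson(lam) as n grows.
gaps = []
for n in (10, 100, 10_000):
    gap = max(abs(binomial_pmf(k, n, lam / n) - poisson_pmf(k, lam)) for k in ks)
    gaps.append(gap)
    print(n, gap)
```

The printed gap should shrink as n increases, which is the sense in which the binomial approaches the Poisson.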

slide-5
SLIDE 5

joint density

[Figure: joint density P(x, y); both axes run from −3 to 3]

  • positive: $P(x, y) \ge 0$
  • sums to 1: $\iint P(x, y)\, dx\, dy = 1$

slide-6
SLIDE 6

marginalization (“integration”)

$P(x) = \int P(x, y)\, dy$

[Figure: joint density P(x, y); both axes run from −3 to 3]

slide-7
SLIDE 7

marginalization (“integration”)

[Figure: joint density P(x, y); both axes run from −3 to 3]
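For a discrete joint distribution, marginalization reduces to summing a table along one axis. The following is a minimal sketch; the 2×3 table `joint` is a made-up example, not data from the lecture.

```python
# Hypothetical joint pmf P(x, y): rows index x in {0, 1}, columns index y in {0, 1, 2}.
joint = [
    [0.10, 0.20, 0.10],
    [0.30, 0.05, 0.25],
]

# Marginal P(x): sum over y (across each row).
p_x = [sum(row) for row in joint]

# Marginal P(y): sum over x (down each column).
p_y = [sum(col) for col in zip(*joint)]

print(p_x)  # approximately [0.4, 0.6]
print(p_y)  # approximately [0.4, 0.25, 0.35]
```

For a continuous density the sums become integrals, which on a grid can be approximated the same way after multiplying by the grid spacing.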

slide-8
SLIDE 8

conditionalization (“slicing”)

$P(y \mid x) = \dfrac{P(x, y)}{P(x)}$ (“joint divided by marginal”)

[Figure: joint density P(x, y) and a one-dimensional slice through it; axes run from −3 to 3]

slide-9
SLIDE 9

conditionalization (“slicing”)

(“joint divided by marginal”)

[Figure: joint density P(x, y) and a one-dimensional slice through it; axes run from −3 to 3]

slide-10
SLIDE 10

conditionalization (“slicing”)

[Figure: joint density P(x, y) with a side panel comparing the marginal P(y) and a conditional; axes run from −3 to 3]
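“Slicing” and “joint divided by marginal” can be made concrete on a discrete table: take one row of the joint and renormalize it. This is a sketch under the assumption of a small made-up table; the function name `conditional_y_given_x` is hypothetical.

```python
# Hypothetical joint pmf P(x, y): rows index x, columns index y.
joint = [
    [0.10, 0.20, 0.10],
    [0.30, 0.05, 0.25],
]

def conditional_y_given_x(joint, x_index):
    """Slice the joint at one x, then divide by the marginal P(x)."""
    row = joint[x_index]           # the slice P(x, y) at fixed x
    p_x = sum(row)                 # marginal P(x) at that x
    return [p / p_x for p in row]  # P(y | x)

p_y_given_x0 = conditional_y_given_x(joint, 0)
p_y_given_x1 = conditional_y_given_x(joint, 1)
print(p_y_given_x0)
print(p_y_given_x1)
```

The raw slice does not sum to 1; dividing by the marginal is what turns it into a proper distribution over y.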

slide-11
SLIDE 11

conditional densities

[Figure: joint density P(x, y) and conditional densities obtained by slicing; axes run from −3 to 3]

slide-12
SLIDE 12

conditional densities

[Figure: joint density P(x, y) and conditional densities obtained by slicing; axes run from −3 to 3]

slide-13
SLIDE 13

Bayes’ Rule

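Conditional Densities:

$P(x, y) = P(y \mid x)\, P(x) = P(x \mid y)\, P(y)$

Bayes’ Rule:

$P(x \mid y) = \dfrac{P(y \mid x)\, P(x)}{P(y)}$

  • posterior: $P(x \mid y)$
  • likelihood: $P(y \mid x)$
  • prior: $P(x)$
  • marginal probability of y (“normalizer”): $P(y)$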
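A small discrete example may help connect the four terms. In the sketch below, x is which of two coins was picked and y is the observation “heads”; the prior and likelihood values are illustrative assumptions, and `posterior` is a hypothetical helper name.

```python
def posterior(prior, likelihood):
    """Bayes' rule for a discrete x: returns (P(x | y), P(y))."""
    unnormalized = [l * p for l, p in zip(likelihood, prior)]  # P(y | x) P(x)
    evidence = sum(unnormalized)                               # P(y), the normalizer
    return [u / evidence for u in unnormalized], evidence

prior = [0.5, 0.5]       # P(x): fair coin vs. biased coin (assumed values)
likelihood = [0.5, 0.9]  # P(y = heads | x) for each coin (assumed values)

post, evidence = posterior(prior, likelihood)
print(post)      # approximately [0.357, 0.643]
print(evidence)  # approximately 0.7
```

Observing heads shifts belief toward the biased coin, and the normalizer P(y) is exactly what makes the posterior sum to 1.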

slide-14
SLIDE 14

Terminology question:

  • When do we call $P(y \mid x)$ a likelihood?

A: when considered as a function of x (i.e., with y held fixed)

  • note: as a function of x, it doesn’t integrate to 1.
  • What’s it called as a function of y, for fixed x?

A: conditional distribution or sampling distribution
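The note that a likelihood need not integrate to 1 can be checked with the binomial. In this sketch the coin bias p plays the role of x and the head count k plays the role of y; the values `n = 10`, `k_obs = 7`, `p_fixed = 0.6` and the midpoint-rule grid are assumptions made for illustration.

```python
import math

def binomial_pmf(k, n, p):
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)

n, k_obs, p_fixed = 10, 7, 0.6

# Sampling distribution: fix the parameter p, vary the data k. Sums to 1.
total_over_k = sum(binomial_pmf(k, n, p_fixed) for k in range(n + 1))

# Likelihood: fix the data k, vary the parameter p.
# Midpoint-rule approximation of the integral over p in [0, 1].
grid = 10_000
area_over_p = sum(binomial_pmf(k_obs, n, (i + 0.5) / grid) for i in range(grid)) / grid

print(total_over_k)  # close to 1
print(area_over_p)   # close to 1 / (n + 1), not 1
```

The same function of two arguments is a normalized distribution in one argument and an unnormalized curve in the other.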

slide-15
SLIDE 15

Expectations (“averages”)

Expectation is the weighted average of a function (of a random variable) according to the distribution (of that random variable):

  • continuous (pdf): $E[f(x)] = \int f(x)\, P(x)\, dx$
  • discrete (pmf): $E[f(x)] = \sum_x f(x)\, P(x)$

This corresponds to taking a weighted average of the values f(x), weighted by how probable they are under P(x).

slide-16
SLIDE 16

Expectations (“averages”)

Monte Carlo evaluation of an expectation:

  • 1. draw samples from the distribution: $x^{(i)} \sim P(x)$, for $i = 1$ to $N$
  • 2. average: $E[f(x)] \approx \frac{1}{N} \sum_{i=1}^{N} f(x^{(i)})$
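The two steps can be sketched in a few lines. The choice of f(x) = x² under a standard Gaussian (whose exact expectation is 1), the seed, and the sample count are assumptions for illustration; `monte_carlo_expectation` is a hypothetical helper name.

```python
import random

def monte_carlo_expectation(f, sampler, n_samples):
    """Estimate E[f(x)] by averaging f over samples drawn from P(x)."""
    return sum(f(sampler()) for _ in range(n_samples)) / n_samples

rng = random.Random(0)  # fixed seed so the run is repeatable

# Step 1: samples x ~ N(0, 1). Step 2: average f(x) = x^2 over them.
estimate = monte_carlo_expectation(
    f=lambda x: x**2,
    sampler=lambda: rng.gauss(0.0, 1.0),
    n_samples=100_000,
)
print(estimate)  # should be close to the exact value E[x^2] = 1
```

The error of such an estimate typically shrinks like 1/√N, so more samples buy accuracy only slowly.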

slide-17
SLIDE 17

Expectations (“averages”)

Expectation is the weighted average of a function (of a random variable) according to the distribution (of that random variable).

It’s really just a dot product (between the values f(x) and the probabilities P(x))!

Thus, expectation is a linear function: $E[a f(x) + b g(x)] = a\, E[f(x)] + b\, E[g(x)]$
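For a discrete variable, the dot-product view and the resulting linearity are easy to verify directly. The pmf, the functions `f` and `g`, and the coefficients below are arbitrary choices made for this sketch.

```python
values = [0, 1, 2, 3]
pmf = [0.1, 0.2, 0.3, 0.4]  # illustrative pmf over `values`

def expectation(f, values, pmf):
    """E[f(x)] as a dot product between f(values) and the pmf."""
    return sum(f(v) * p for v, p in zip(values, pmf))

f = lambda v: v**2
g = lambda v: 3 * v + 1
a, b = 2.0, -0.5

# Linearity: E[a f + b g] equals a E[f] + b E[g].
lhs = expectation(lambda v: a * f(v) + b * g(v), values, pmf)
rhs = a * expectation(f, values, pmf) + b * expectation(g, values, pmf)
print(lhs, rhs)  # both approximately 6.5
```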

slide-18
SLIDE 18

Expectations (“averages”)

The two most important expectations (also known as “moments”):

  • Mean: $E[x]$ (average value of the RV)

  • Variance: $E[(x - E[x])^2]$ (average squared distance between x and its mean)

Note: expectations don’t always exist!

e.g., the Cauchy distribution has no mean!
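One way to see what “has no mean” looks like in practice is to compare how sample means behave for Gaussian and Cauchy draws. The sketch below is an illustration under assumed settings (seed, 1000 samples per mean, 200 repeats); the spread measure is a simple interquartile range.

```python
import math
import random

rng = random.Random(0)

def sample_mean(sampler, n):
    return sum(sampler() for _ in range(n)) / n

def interquartile_range(xs):
    s = sorted(xs)
    return s[(3 * len(s)) // 4] - s[len(s) // 4]

def gaussian():
    return rng.gauss(0.0, 1.0)

def cauchy():
    # Inverse-CDF sampling of a standard Cauchy variable.
    return math.tan(math.pi * (rng.random() - 0.5))

n_samples, n_repeats = 1000, 200
gauss_means = [sample_mean(gaussian, n_samples) for _ in range(n_repeats)]
cauchy_means = [sample_mean(cauchy, n_samples) for _ in range(n_repeats)]

gauss_spread = interquartile_range(gauss_means)
cauchy_spread = interquartile_range(cauchy_means)
print(gauss_spread, cauchy_spread)
```

The Gaussian sample means cluster tightly, while the Cauchy sample means stay as spread out as single Cauchy draws, so averaging more samples does not converge to anything.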

slide-19
SLIDE 19

independence

[Figure: joint density P(x, y); both axes run from −3 to 3]

slide-20
SLIDE 20

independence

Definition: x, y are independent iff $P(x, y) = P(x)\, P(y)$

[Figure: joint density P(x, y); both axes run from −3 to 3]

slide-21
SLIDE 21

independence

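Definition: x, y are independent iff $P(x, y) = P(x)\, P(y)$

In linear algebra terms: the joint, evaluated on a grid, is the outer product of the two marginals (a rank-1 matrix)

[Figure: joint density P(x, y); both axes run from −3 to 3]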
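The outer-product statement suggests a direct test for a discrete table: rebuild the joint from its own marginals and compare. This is a sketch with made-up tables; `marginals`, `outer`, and `is_independent` are hypothetical helper names.

```python
def marginals(joint):
    p_x = [sum(row) for row in joint]
    p_y = [sum(col) for col in zip(*joint)]
    return p_x, p_y

def outer(u, v):
    """Outer product of two vectors, as a list of rows."""
    return [[ui * vj for vj in v] for ui in u]

def is_independent(joint, tol=1e-12):
    """True if the joint equals the outer product of its marginals."""
    p_x, p_y = marginals(joint)
    product = outer(p_x, p_y)
    return all(
        abs(joint[i][j] - product[i][j]) <= tol
        for i in range(len(p_x))
        for j in range(len(p_y))
    )

independent_joint = outer([0.4, 0.6], [0.4, 0.25, 0.35])
dependent_joint = [
    [0.10, 0.20, 0.10],
    [0.30, 0.05, 0.25],
]
print(is_independent(independent_joint))  # True
print(is_independent(dependent_joint))    # False
```

Both tables here share the same marginals, which shows that the marginals alone do not determine whether a joint is independent.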

slide-22
SLIDE 22

independence

Definition: x, y are independent iff $P(x, y) = P(x)\, P(y)$

Alternative definition: $P(y \mid x) = P(y)$ for all x

All conditionals are the same!

[Figure: joint density P(x, y) and its conditional slices; axes run from −3 to 3]

slide-23
SLIDE 23

independence

Definition: x, y are independent iff $P(x, y) = P(x)\, P(y)$

Alternative definition: $P(y \mid x) = P(y)$ for all x

All conditionals are the same!

[Figure: joint density P(x, y) and its conditional slices; axes run from −3 to 3]

slide-24
SLIDE 24

Correlation vs. Dependence

  • 1. Correlation: mean of y|x changes systematically with x

[Figures: two joint densities, labeled “positive correlation” and “negative correlation”; axes run from −3 to 3]

slide-25
SLIDE 25

Correlation vs. Dependence

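  • 1. Correlation: mean of y|x changes systematically with x

[Figures: two joint densities, labeled “positive correlation” and “negative correlation”; axes run from −3 to 3]

  • 2. Dependence
  • arises whenever $P(x, y) \neq P(x)\, P(y)$
  • quantified by mutual information: $MI(x, y) = D_{KL}\big(P(x, y) \,\|\, P(x)\, P(y)\big)$, the KL divergence between the joint and the product of its marginals
  • MI = 0 ⇒ independence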
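For a discrete joint table, mutual information can be computed directly as the KL divergence between the joint and the product of its marginals. The sketch below uses natural logarithms (so the result is in nats) and two made-up 2×2 tables.

```python
import math

def mutual_information(joint):
    """MI in nats: sum over P(x, y) * log(P(x, y) / (P(x) P(y)))."""
    p_x = [sum(row) for row in joint]
    p_y = [sum(col) for col in zip(*joint)]
    mi = 0.0
    for i, row in enumerate(joint):
        for j, p_xy in enumerate(row):
            if p_xy > 0.0:  # terms with P(x, y) = 0 contribute nothing
                mi += p_xy * math.log(p_xy / (p_x[i] * p_y[j]))
    return mi

# Product of marginals (0.3, 0.7) and (0.5, 0.5): independent by construction.
independent_joint = [[px * py for py in (0.5, 0.5)] for px in (0.3, 0.7)]
# Mass concentrated on the diagonal: x and y tend to agree.
dependent_joint = [[0.4, 0.1], [0.1, 0.4]]

mi_independent = mutual_information(independent_joint)
mi_dependent = mutual_information(dependent_joint)
print(mi_independent)  # approximately 0
print(mi_dependent)    # approximately 0.193 nats
```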

slide-26
SLIDE 26

Correlation vs. Dependence

Q: Can you draw a distribution that is uncorrelated but dependent?


slide-27
SLIDE 27

Correlation vs. Dependence

Q: Can you draw a distribution that is uncorrelated but dependent?

“Bowtie” dependencies in natural scenes (uncorrelated but dependent):

[Figure: P(filter 2 output | filter 1 output); axes: filter 1 output, filter 2 output. Flower image: [Schwartz & Simoncelli 2001]]
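A simpler answer to the question than the bowtie is a variable and its square. The sketch below assumes x uniform on {−1, 0, 1} with y = x²; this example is an illustration added here, not one taken from the slides.

```python
x_values = [-1, 0, 1]
y_values = [0, 1]

# Joint pmf for x uniform on {-1, 0, 1} and y = x**2.
# Rows index x, columns index y.
joint = [
    [0.0, 1 / 3],  # x = -1 forces y = 1
    [1 / 3, 0.0],  # x = 0 forces y = 0
    [0.0, 1 / 3],  # x = +1 forces y = 1
]

def expectation(f):
    return sum(
        f(x, y) * joint[i][j]
        for i, x in enumerate(x_values)
        for j, y in enumerate(y_values)
    )

# Covariance: E[xy] - E[x] E[y].
covariance = expectation(lambda x, y: x * y) - (
    expectation(lambda x, y: x) * expectation(lambda x, y: y)
)

# Dependence: largest gap between the joint and the product of its marginals.
p_x = [sum(row) for row in joint]
p_y = [sum(col) for col in zip(*joint)]
max_gap = max(
    abs(joint[i][j] - p_x[i] * p_y[j])
    for i in range(len(x_values))
    for j in range(len(y_values))
)
print(covariance)  # 0: uncorrelated
print(max_gap)     # clearly nonzero: dependent
```

Here y is a deterministic function of x, yet the covariance vanishes because the relationship is symmetric around x = 0.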

slide-28
SLIDE 28

Is this distribution independent?

[Figure: joint density P(x, y); both axes run from −3 to 3]

slide-29
SLIDE 29

Is this distribution independent?

[Figure: joint density P(x, y); both axes run from −3 to 3]

slide-30
SLIDE 30

Is this distribution independent?

[Figure: joint density P(x, y) with conditional slices at several values of x; axes run from −3 to 3]

No! Conditionals over y are different for different x!

slide-31
SLIDE 31

FUN FACT:

Independent Gaussian is the only distribution that is both:

  • independent (equal to the product of its marginals)
  • spherically symmetric: $P(U\mathbf{x}) = P(\mathbf{x})$ for any orthogonal matrix $U$

Corollary: circular scatter / contour plot not sufficient to show independence!
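One consequence of this fact can be probed by simulation: rotating a pair of independent Gaussians leaves the coordinates independent, while rotating a pair of independent non-Gaussian variables generally does not. The sketch below rotates by 45° and uses the covariance between squared coordinates as a rough dependence signature. The uniform distribution, the seed, and the sample size are assumptions, and a zero covariance of squares is only a necessary sign of independence, not a proof.

```python
import math
import random

rng = random.Random(0)

def rotate_45(x, y):
    """Apply a 45-degree orthogonal transformation to the pair (x, y)."""
    c = 1.0 / math.sqrt(2.0)
    return c * (x + y), c * (x - y)

def covariance(a, b):
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    return sum((ai - mean_a) * (bi - mean_b) for ai, bi in zip(a, b)) / len(a)

def cov_of_squares_after_rotation(sampler, n):
    """Draw n independent pairs, rotate them, return cov(u^2, v^2)."""
    u2, v2 = [], []
    for _ in range(n):
        u, v = rotate_45(sampler(), sampler())
        u2.append(u * u)
        v2.append(v * v)
    return covariance(u2, v2)

half_width = math.sqrt(3.0)  # uniform on [-sqrt(3), sqrt(3)] has unit variance
gauss_cov = cov_of_squares_after_rotation(lambda: rng.gauss(0.0, 1.0), 100_000)
uniform_cov = cov_of_squares_after_rotation(
    lambda: rng.uniform(-half_width, half_width), 100_000
)
print(gauss_cov)    # near 0: rotated Gaussian coordinates show no such dependence
print(uniform_cov)  # clearly negative: rotated uniform coordinates are dependent
```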

slide-32
SLIDE 32

Summary

  • continuous & discrete distributions
  • marginalization (splatting)
  • conditionalization (slicing)
  • Bayes’ rule (prior, likelihood, posterior)
  • Expectations
  • Independence & Correlation
