Probability Review, CMSC 473/673 UMBC. Some slides adapted from 3SLP, Jason Eisner. (PowerPoint PPT Presentation)





SLIDE 1

Probability Review

CMSC 473/673 UMBC

Some slides adapted from 3SLP, Jason Eisner

SLIDE 2

Probability Prerequisites

  • Basic probability axioms and definitions
  • Joint probability
  • Probabilistic Independence
  • Marginal probability
  • Definition of conditional probability
  • Bayes rule
  • Probability chain rule
  • Expected Value (of a function) of a Random Variable

SLIDE 3

Interpretations of Probability

  • Past performance: 58% of the past 100 flips were heads
  • Hypothetical performance: if I flipped the coin in many parallel universes…
  • Subjective strength of belief: would pay up to 58 cents for a chance to win $1
  • Output of some computable formula? p(heads) vs. q(heads)

SLIDE 4

(Most) Probability Axioms

p(everything) = 1
p(nothing) = p(∅) = 0
p(A) ≤ p(B), when A ⊆ B
p(A ∪ B) = p(A) + p(B), when A ∩ B = ∅

[Venn diagram: overlapping events A and B inside "everything"]

In general, p(A ∪ B) = p(A) + p(B) − p(A ∩ B), so p(A ∪ B) ≠ p(A) + p(B) when A and B overlap

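As a sanity check, the axioms above can be verified numerically on a small sample space; this is a sketch assuming a hypothetical fair six-sided die, not anything from the slides:

```python
# Hypothetical uniform die: p(everything) sums to 1.
p = {outcome: 1 / 6 for outcome in range(1, 7)}

def prob(event):
    """Probability of an event = sum of the probabilities of its outcomes."""
    return sum(p[o] for o in event)

A = {1, 2, 3}
B = {3, 4}

# Inclusion-exclusion: p(A ∪ B) = p(A) + p(B) - p(A ∩ B).
assert abs(prob(A | B) - (prob(A) + prob(B) - prob(A & B))) < 1e-12

# The simple sum rule only holds for disjoint events (A ∩ C = ∅).
C = {5, 6}
assert abs(prob(A | C) - (prob(A) + prob(C))) < 1e-12
```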

SLIDE 5

Examining p(everything) = 1

If p(everything) = 1…

SLIDE 6

Examining p(everything) = 1

If p(everything) = 1… and you can break everything into N unique items x1, x2, …, xN…

SLIDE 7

Examining p(everything) = 1

If p(everything) = 1… and you can break everything into N unique items x1, x2, …, xN… then each pair xj and xk is disjoint (xj ∩ xk = ∅)…

SLIDE 8

Examining p(everything) = 1

If p(everything) = 1… and you can break everything into N unique items x1, x2, …, xN… then each pair xj and xk is disjoint (xj ∩ xk = ∅)… and because everything is the union of x1, x2, …, xN…

SLIDE 9

Examining p(everything) = 1

If p(everything) = 1… and you can break everything into N unique items x1, x2, …, xN… then each pair xj and xk is disjoint (xj ∩ xk = ∅)… and because everything is the union of x1, x2, …, xN…

p(everything) = Σ_{j=1}^{N} p(xj) = 1

SLIDE 10

A Very Important Concept to Remember

The probabilities of all unique (disjoint) items x1, x2, …, xN must sum to 1:

p(everything) = Σ_{j=1}^{N} p(xj) = 1

SLIDE 11

Probabilities and Random Variables

Random variables: variables that represent the possible outcomes of some random “process”

SLIDE 12

Probabilities and Random Variables

Random variables: variables that represent the possible outcomes of some random "process"

Example #1: a (weighted) coin that can come up heads or tails

X is a random variable denoting the possible outcomes: X=HEADS or X=TAILS

SLIDE 13

Distribution Notation

If X is a R.V. and G is a distribution:

  • X ∼ G means X is distributed according to ("sampled from") G

SLIDE 14

Distribution Notation

If X is a R.V. and G is a distribution:

  • X ∼ G means X is distributed according to ("sampled from") G

  • G often has parameters θ = (θ1, θ2, …, θN) that govern its "shape"

  • Formally written as X ∼ G(θ)
SLIDE 15

Distribution Notation

If X is a R.V. and G is a distribution:

  • X ∼ G means X is distributed according to ("sampled from") G

  • G often has parameters θ = (θ1, θ2, …, θN) that govern its "shape"

  • Formally written as X ∼ G(θ)

i.i.d.: if X1, X2, …, XN are all independently sampled from G(θ), they are independent and identically distributed
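The i.i.d. idea can be sketched in code. This hypothetical example draws repeated samples from a weighted coin whose single parameter θ = p(HEADS) = 0.58 echoes the earlier interpretation slide; the function name and seed are illustrative, not from the slides:

```python
import random

def sample_coin(theta, n, seed=0):
    """Draw n i.i.d. flips of a coin with p(HEADS) = theta."""
    rng = random.Random(seed)  # fixed seed for reproducibility
    return ["HEADS" if rng.random() < theta else "TAILS" for _ in range(n)]

flips = sample_coin(theta=0.58, n=10_000)
frequency = flips.count("HEADS") / len(flips)
# Because the flips are i.i.d., the empirical frequency of HEADS
# approaches theta as n grows (law of large numbers).
```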

SLIDE 16

Probability Prerequisites

  • Basic probability axioms and definitions
  • Joint probability
  • Probabilistic Independence
  • Marginal probability
  • Definition of conditional probability
  • Bayes rule
  • Probability chain rule
  • Expected Value (of a function) of a Random Variable

SLIDE 17

Joint Probability

Probability that multiple things “happen together”

[Venn diagram: the overlap of A and B is the joint probability]

SLIDE 18

Joint Probability

Probability that multiple things "happen together": p(x,y), p(x,y,z), p(x,y,w,z)
Symmetric: p(x,y) = p(y,x)

[Venn diagram: the overlap of A and B is the joint probability]

SLIDE 19

Joint Probability

Probability that multiple things "happen together": p(x,y), p(x,y,z), p(x,y,w,z)
Symmetric: p(x,y) = p(y,x)
Form a table based on outcomes: sum across cells = 1

p(x,y)       Y=0    Y=1
X="cat"      .04    .32
X="dog"      .2     .04
X="bird"     .1     .1
X="human"    .1     .1
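The table above can be stored directly as a mapping from (x, y) pairs to probabilities; a minimal sketch confirming the cells sum to 1 (variable names are illustrative):

```python
# The joint table from the slide, keyed by (x, y).
joint = {
    ("cat", 0): 0.04, ("cat", 1): 0.32,
    ("dog", 0): 0.20, ("dog", 1): 0.04,
    ("bird", 0): 0.10, ("bird", 1): 0.10,
    ("human", 0): 0.10, ("human", 1): 0.10,
}

# Summing across all cells must give 1 ("everything sums to 1").
total = sum(joint.values())
```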

SLIDE 20

Joint Probabilities

1 ≥ p(A)

what happens as we add conjuncts?

SLIDE 21

Joint Probabilities

1 ≥ p(A) ≥ p(A, B)

what happens as we add conjuncts?

SLIDE 22

Joint Probabilities

1 ≥ p(A) ≥ p(A, B) ≥ p(A, B, C)

what happens as we add conjuncts?

SLIDE 23

Joint Probabilities

1 ≥ p(A) ≥ p(A, B) ≥ p(A, B, C) ≥ p(A, B, C, D)

what happens as we add conjuncts?

SLIDE 24

Joint Probabilities

1 ≥ p(A) ≥ p(A, B) ≥ p(A, B, C) ≥ p(A, B, C, D) ≥ p(A, B, C, D, E)

what happens as we add conjuncts?
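Adding conjuncts can only shrink the joint probability. A small sketch using three hypothetical fair coin flips (the events and names are illustrative, not from the slides):

```python
import itertools

# Three fair coin flips: 8 equally likely outcomes.
outcomes = list(itertools.product("HT", repeat=3))
p = 1 / len(outcomes)  # uniform: 1/8 each

def prob(predicate):
    """Probability of the set of outcomes satisfying the predicate."""
    return sum(p for o in outcomes if predicate(o))

p_a = prob(lambda o: o[0] == "H")                    # p(A)
p_ab = prob(lambda o: o[0] == "H" and o[1] == "H")   # p(A, B)
p_abc = prob(lambda o: o == ("H", "H", "H"))         # p(A, B, C)

# Each extra conjunct restricts the event further, so the chain shrinks.
assert 1 >= p_a >= p_ab >= p_abc
```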

SLIDE 25

A Note on Notation

p(X INCLUSIVE_OR Y) ≡ p(X ∪ Y)
p(X AND Y) ≡ p(X, Y)
p(X, Y) = p(Y, X), except when order matters (should be obvious from context)

SLIDE 26

Probability Prerequisites

  • Basic probability axioms and definitions
  • Joint probability
  • Probabilistic Independence
  • Marginal probability
  • Definition of conditional probability
  • Bayes rule
  • Probability chain rule
  • Expected Value (of a function) of a Random Variable

SLIDE 27

Probabilistic Independence

Independence: when events can occur and not impact the probability of other events

Formally: p(x,y) = p(x) * p(y). Generalizable to > 2 random variables

Q: Are the results of flipping the same coin twice in succession independent?

SLIDE 28

Probabilistic Independence

Independence: when events can occur and not impact the probability of other events

Formally: p(x,y) = p(x) * p(y). Generalizable to > 2 random variables

Q: Are the results of flipping the same coin twice in succession independent? A: Yes (assuming no weird effects)

SLIDE 29

Probabilistic Independence

Independence: when events can occur and not impact the probability of other events

Formally: p(x,y) = p(x) * p(y). Generalizable to > 2 random variables

[Venn diagram: overlapping events A and B inside "everything"]

Q: Are A and B independent?

SLIDE 30

Probabilistic Independence

Independence: when events can occur and not impact the probability of other events

Formally: p(x,y) = p(x) * p(y). Generalizable to > 2 random variables

[Venn diagram: overlapping events A and B inside "everything"]

Q: Are A and B independent? A: No (work it out from p(A,B) and the axioms)

SLIDE 31

Probabilistic Independence

Independence: when events can occur and not impact the probability of other events

Formally: p(x,y) = p(x) * p(y). Generalizable to > 2 random variables

Q: Are X and Y independent?

p(x,y)       Y=0    Y=1
X="cat"      .04    .32
X="dog"      .2     .04
X="bird"     .1     .1
X="human"    .1     .1

SLIDE 32

Probabilistic Independence

Independence: when events can occur and not impact the probability of other events

Formally: p(x,y) = p(x) * p(y). Generalizable to > 2 random variables

Q: Are X and Y independent?

p(x,y)       Y=0    Y=1
X="cat"      .04    .32
X="dog"      .2     .04
X="bird"     .1     .1
X="human"    .1     .1

A: No (find the marginal probabilities p(x) and p(y))
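The answer can be checked mechanically: compute both marginals from the table and compare p(x) * p(y) against every cell. A sketch (function and variable names are illustrative):

```python
from collections import defaultdict

# The slide's joint table.
joint = {
    ("cat", 0): 0.04, ("cat", 1): 0.32,
    ("dog", 0): 0.20, ("dog", 1): 0.04,
    ("bird", 0): 0.10, ("bird", 1): 0.10,
    ("human", 0): 0.10, ("human", 1): 0.10,
}

p_x = defaultdict(float)  # marginal p(x): sum over y
p_y = defaultdict(float)  # marginal p(y): sum over x
for (x, y), pr in joint.items():
    p_x[x] += pr
    p_y[y] += pr

# Independent iff p(x, y) = p(x) * p(y) holds in every cell.
independent = all(abs(joint[(x, y)] - p_x[x] * p_y[y]) < 1e-9
                  for (x, y) in joint)
# It fails: e.g. p("cat", 1) = .32, but p("cat") * p(Y=1) = .36 * .56 = .2016.
```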

SLIDE 33

Probability Prerequisites

  • Basic probability axioms and definitions
  • Joint probability
  • Probabilistic Independence
  • Marginal probability
  • Definition of conditional probability
  • Bayes rule
  • Probability chain rule
  • Expected Value (of a function) of a Random Variable

SLIDE 34

Marginal(ized) Probability: The Discrete Case

[Figure: event y partitioned into the mutually exclusive pieces x1 & y, x2 & y, x3 & y, x4 & y]

Consider the mutually exclusive ways that different values of x could occur with y

Q: How do we write this in terms of joint probabilities?

SLIDE 35

Marginal(ized) Probability: The Discrete Case

[Figure: event y partitioned into the mutually exclusive pieces x1 & y, x2 & y, x3 & y, x4 & y]

p(y) = Σ_x p(x, y)

Consider the mutually exclusive ways that different values of x could occur with y
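A short sketch of this marginalization applied to the earlier joint table (names are illustrative):

```python
# Marginalize the slide's joint table: p(y) = sum over x of p(x, y).
joint = {
    ("cat", 0): 0.04, ("cat", 1): 0.32,
    ("dog", 0): 0.20, ("dog", 1): 0.04,
    ("bird", 0): 0.10, ("bird", 1): 0.10,
    ("human", 0): 0.10, ("human", 1): 0.10,
}

def marginal_y(y):
    """p(y) = sum over all values of x of the joint p(x, y)."""
    return sum(pr for (_, yi), pr in joint.items() if yi == y)

p_y0 = marginal_y(0)  # .04 + .20 + .10 + .10 = 0.44
p_y1 = marginal_y(1)  # .32 + .04 + .10 + .10 = 0.56
# The resulting marginal distribution itself sums to 1.
```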

SLIDE 36

Probability Prerequisites

  • Basic probability axioms and definitions
  • Joint probability
  • Probabilistic Independence
  • Marginal probability
  • Definition of conditional probability
  • Bayes rule
  • Probability chain rule
  • Expected Value (of a function) of a Random Variable

SLIDE 37

Conditional Probability

p(x | y) = p(x, y) / p(y)

Conditional probabilities are probabilities

SLIDE 38

Conditional Probability

p(x | y) = p(x, y) / p(y), where p(y) = marginal probability of Y

SLIDE 39

Conditional Probability

p(x | y) = p(x, y) / p(y), where p(y) = Σ_x p(X = x, y)

SLIDE 40

Revisiting Marginal Probability: The Discrete Case

[Figure: event y partitioned into the mutually exclusive pieces x1 & y, x2 & y, x3 & y, x4 & y]

p(y) = Σ_x p(x, y) = Σ_x p(x) p(y | x)
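The definition can be applied directly to the earlier joint table; this sketch also confirms that a conditional distribution sums to 1 (names are illustrative):

```python
# Conditional probability from the joint table: p(x | y) = p(x, y) / p(y),
# where p(y) is obtained by marginalizing x out of the joint.
joint = {
    ("cat", 0): 0.04, ("cat", 1): 0.32,
    ("dog", 0): 0.20, ("dog", 1): 0.04,
    ("bird", 0): 0.10, ("bird", 1): 0.10,
    ("human", 0): 0.10, ("human", 1): 0.10,
}

def conditional(x, y):
    p_y = sum(pr for (_, yi), pr in joint.items() if yi == y)
    return joint[(x, y)] / p_y

# "Conditional probabilities are probabilities": they sum to 1 over x.
total = sum(conditional(x, 1) for x in ("cat", "dog", "bird", "human"))
```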

SLIDE 41

Probability Prerequisites

  • Basic probability axioms and definitions
  • Joint probability
  • Probabilistic Independence
  • Marginal probability
  • Definition of conditional probability
  • Bayes rule
  • Probability chain rule
  • Expected Value (of a function) of a Random Variable

SLIDE 42

Deriving Bayes Rule

Start with the conditional probability p(X | Y)

SLIDE 43

Deriving Bayes Rule

p(x | y) = p(x, y) / p(y)

Solve for p(x, y)

SLIDE 44

Deriving Bayes Rule

p(x | y) = p(x, y) / p(y)

Solve for p(x, y): p(x, y) = p(x | y) p(y)

Since p(x, y) = p(y, x), also p(x, y) = p(y | x) p(x); substituting back gives

p(x | y) = p(y | x) * p(x) / p(y)

SLIDE 45

Bayes Rule

p(x | y) = p(y | x) * p(x) / p(y)

p(x | y): posterior probability; p(y | x): likelihood; p(x): prior probability; p(y): marginal likelihood (probability)
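The derivation can be confirmed numerically on the earlier joint table: the posterior computed directly from the definition matches the one computed via Bayes rule. A sketch (names are illustrative):

```python
# Verify Bayes rule on the slide's joint table:
# p(x | y) = p(y | x) * p(x) / p(y).
joint = {
    ("cat", 0): 0.04, ("cat", 1): 0.32,
    ("dog", 0): 0.20, ("dog", 1): 0.04,
    ("bird", 0): 0.10, ("bird", 1): 0.10,
    ("human", 0): 0.10, ("human", 1): 0.10,
}

def marginal_x(x):
    return sum(pr for (xi, _), pr in joint.items() if xi == x)

def marginal_y(y):
    return sum(pr for (_, yi), pr in joint.items() if yi == y)

def cond_x_given_y(x, y):
    return joint[(x, y)] / marginal_y(y)

def cond_y_given_x(y, x):
    return joint[(x, y)] / marginal_x(x)

posterior = cond_x_given_y("dog", 0)  # directly from the definition
via_bayes = cond_y_given_x(0, "dog") * marginal_x("dog") / marginal_y(0)
assert abs(posterior - via_bayes) < 1e-12
```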

SLIDE 46

Probability Prerequisites

  • Basic probability axioms and definitions
  • Joint probability
  • Probabilistic Independence
  • Marginal probability
  • Definition of conditional probability
  • Bayes rule
  • Probability chain rule
  • Expected Value (of a function) of a Random Variable

SLIDE 47

Probability Chain Rule

p(x1, x2) = p(x1) p(x2 | x1)

(rearranging the definition of conditional probability)

SLIDE 48

Probability Chain Rule

p(x1, x2, …, xT) = p(x1) p(x2 | x1) p(x3 | x1, x2) ⋯ p(xT | x1, …, x_{T−1})

SLIDE 49

Probability Chain Rule

p(x1, x2, …, xT) = p(x1) p(x2 | x1) p(x3 | x1, x2) ⋯ p(xT | x1, …, x_{T−1}) = Π_{j=1}^{T} p(xj | x1, …, x_{j−1})

SLIDE 50

Probability Chain Rule

p(x1, x2, …, xT) = p(x1) p(x2 | x1) p(x3 | x1, x2) ⋯ p(xT | x1, …, x_{T−1}) = Π_{j=1}^{T} p(xj | x1, …, x_{j−1})

(repeated application of the definition of conditional probability)
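The chain rule can be verified on a tiny joint distribution; this sketch uses three hypothetical fair coin flips (names and events are illustrative):

```python
import itertools

# Verify: p(x1, x2, x3) = p(x1) p(x2 | x1) p(x3 | x1, x2).
outcomes = list(itertools.product("HT", repeat=3))
joint = {o: 1 / 8 for o in outcomes}  # three independent fair flips

def prob(prefix):
    """Marginal probability that the first len(prefix) flips equal prefix."""
    return sum(pr for o, pr in joint.items() if o[:len(prefix)] == prefix)

seq = ("H", "T", "H")
chain = (prob(seq[:1])                            # p(x1)
         * prob(seq[:2]) / prob(seq[:1])          # p(x2 | x1)
         * prob(seq[:3]) / prob(seq[:2]))         # p(x3 | x1, x2)
assert abs(chain - joint[seq]) < 1e-12
```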

SLIDE 51

Probability Prerequisites

  • Basic probability axioms and definitions
  • Joint probability
  • Probabilistic Independence
  • Marginal probability
  • Definition of conditional probability
  • Bayes rule
  • Probability chain rule
  • Common distributions
  • Expected Value (of a function) of a Random Variable

SLIDE 52

Expected Value of a Random Variable

X ∼ p(⋅)

random variable

SLIDE 53

Expected Value of a Random Variable

X ∼ p(⋅)        E[X] = Σ_x x p(x)

random variable; expected value (the distribution p is implicit)

SLIDE 54

Expected Value: Example

Outcomes: 1, 2, 3, 4, 5, 6 (uniform distribution of the number of cats I have)

E[X] = Σ_x x p(x) = 1/6 * 1 + 1/6 * 2 + 1/6 * 3 + 1/6 * 4 + 1/6 * 5 + 1/6 * 6 = 3.5

SLIDE 55

Expected Value: Example 2

Outcomes: 1, 2, 3, 4, 5, 6 (non-uniform distribution of the number of cats a normal cat person has)

E[X] = Σ_x x p(x) = 1/2 * 1 + 1/10 * 2 + 1/10 * 3 + 1/10 * 4 + 1/10 * 5 + 1/10 * 6 = 2.5

SLIDE 56

Expected Value of a Function of a Random Variable

X ∼ p(⋅)        E[X] = Σ_x x p(x)        E[g(X)] = ???

SLIDE 57

Expected Value of a Function of a Random Variable

X ∼ p(⋅)        E[X] = Σ_x x p(x)        E[g(X)] = Σ_x g(x) p(x)

SLIDE 58

Expected Value of Function: Example

Outcomes: 1, 2, 3, 4, 5, 6 (non-uniform distribution of the number of cats I start with)

What if each cat magically becomes two? g(n) = 2^n

E[g(X)] = Σ_x g(x) p(x)

SLIDE 59

Expected Value of Function: Example

1 2 3 4 5 6

non-uniform distribution of number of cats I start with

1/2 * 21 + 1/10 * 22 + 1/10 * 23 + 1/10 * 24 + 1/10 * 25 + 1/10 * 26 = 13.4 What if each cat magically becomes two? 𝑔 𝑙 = 2𝑙 𝔽 𝑔(𝑌) = ෍

𝑦

𝑔(𝑦) 𝑞 𝑦 = ෍

𝑦

2𝑦𝑞(𝑦)
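Both expectations from these examples can be reproduced in a few lines; the distribution is the one from the slides, variable names are illustrative:

```python
# Cat-count distribution: probability 1/2 on 1 cat, 1/10 each on 2..6 cats.
p = {1: 0.5, 2: 0.1, 3: 0.1, 4: 0.1, 5: 0.1, 6: 0.1}

e_x = sum(x * px for x, px in p.items())        # E[X]       = 2.5
e_gx = sum(2 ** x * px for x, px in p.items())  # E[2**X]    = 13.4

# Note that E[g(X)] != g(E[X]) in general: 2 ** 2.5 is about 5.66, not 13.4.
```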

SLIDE 60

Probability Prerequisites

  • Basic probability axioms and definitions
  • Joint probability
  • Probabilistic Independence
  • Marginal probability
  • Definition of conditional probability
  • Bayes rule
  • Probability chain rule
  • Expected Value (of a function) of a Random Variable