Probability and Inference Dr. Jarad Niemi STAT 544 - Iowa State - PowerPoint PPT Presentation

Probability and Inference Dr. Jarad Niemi STAT 544 - Iowa State University January 23, 2019 Jarad Niemi (STAT544@ISU) Probability and Inference January 23, 2019 1 / 35

Outline
- Quick review of probability
  - Kolmogorov’s axioms
  - Bayes’ Rule
  - Application to Down syndrome screening
- Bayesian statistics
  - Condition on what is known
  - Describe uncertainty using probability
  - Exponential example
- What is probability?
  - Frequency interpretation
  - Personal belief
- Why or why not Bayesian?

Quick review of probability - Set theory - Events
Definition: The set $\Omega$ of all possible outcomes of a particular experiment is called the sample space for the experiment.
Definition: An event is any collection of possible outcomes of an experiment, that is, any subset of $\Omega$ (including $\Omega$ itself).

Quick review of probability - Set theory - Craps
Craps: $\Omega = \{(1,1), (1,2), \ldots, (1,6), (2,1), (2,2), \ldots, (6,6)\}$
- Come-out roll win: the sum of the dice is 7 or 11
- Come-out roll loss: the sum of the dice is 2, 3, or 12
- Come-out roll establishes a point: the sum of the dice is 4, 5, 6, 8, 9, or 10
Events:
- the come-out roll wins
- the come-out roll loses
- the come-out roll establishes a point

Quick review of probability - Set theory - Pairwise disjoint
Definition: Two events $A_1$ and $A_2$ are disjoint (or mutually exclusive) if $A_1$ and $A_2$ cannot both occur simultaneously, i.e. $A_1 \cap A_2 = \emptyset$. The events $A_1, A_2, \ldots$ are pairwise disjoint (or mutually exclusive) if $A_i$ and $A_j$ cannot occur simultaneously for all $i \neq j$, i.e. $A_i \cap A_j = \emptyset$ for all $i \neq j$.
Craps pairwise disjoint examples:
- Win ($A_1$), Loss ($A_2$)
- Win ($A_1$), Loss ($A_2$), Point ($A_3$)
- $A_1 = (1,1), A_2 = (1,2), \ldots, A_6 = (1,6), A_7 = (2,1), \ldots, A_{12} = (2,6), \ldots, A_{36} = (6,6)$

Quick review of probability - Axioms of probability - Kolmogorov’s axioms of probability
Definition: Given a sample space $\Omega$ and event space $E$, a probability is a function $P: E \to \mathbb{R}$ that satisfies
1. $P(A) \geq 0$ for any $A \in E$
2. $P(\Omega) = 1$
3. If $A_1, A_2, \ldots \in E$ are pairwise disjoint, then $P(A_1 \text{ or } A_2 \text{ or } \ldots) = \sum_{i=1}^{\infty} P(A_i)$.

Quick review of probability - Axioms of probability - Craps come-out roll probabilities
The following table provides the probability mass function for the sum of the two dice if we believe the probability of each elementary outcome is equal:
Outcome        2     3     4     5     6     7     8     9     10    11    12    Sum
Combinations   1     2     3     4     5     6     5     4     3     2     1     36
Probability    1/36  2/36  3/36  4/36  5/36  6/36  5/36  4/36  3/36  2/36  1/36  1
Craps probability examples:
- P(Win) = P(7 or 11) = 8/36 = 2/9
- P(Loss) = P(2, 3, or 12) = 4/36 = 1/9
- P(Point) = P(4, 5, 6, 8, 9, or 10) = 24/36 = 6/9
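The come-out roll probabilities can be checked by enumerating the sample space directly. The following is a minimal sketch, assuming two fair six-sided dice; the names `omega`, `pmf`, and `prob` are illustrative and do not come from the slides.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Sample space: all 36 ordered pairs of two six-sided dice.
omega = list(product(range(1, 7), repeat=2))

# Number of elementary outcomes that produce each sum (the "Combinations" row).
combinations = Counter(d1 + d2 for d1, d2 in omega)

# Equal probability for each elementary outcome gives the pmf of the sum.
pmf = {s: Fraction(n, len(omega)) for s, n in sorted(combinations.items())}


def prob(sums):
    """Probability that the sum lies in `sums`; disjoint events add (axiom 3)."""
    return sum(pmf[s] for s in sums)


p_win = prob({7, 11})
p_loss = prob({2, 3, 12})
p_point = prob({4, 5, 6, 8, 9, 10})

print(p_win, p_loss, p_point)  # prints 2/9 1/9 2/3
```

Exact fractions are used instead of floats so the results can be compared with the table without rounding; 2/3 is the reduced form of 6/9.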

Quick review of probability - Axioms of probability - Partition
Definition: A set of events $\{A_1, A_2, \ldots\}$ is a partition of the sample space $\Omega$ if and only if the events in $\{A_1, A_2, \ldots\}$ are pairwise disjoint and $\bigcup_{i=1}^{\infty} A_i = \Omega$.
Craps partition examples:
- Win ($A_1$), Loss ($A_2$), Point ($A_3$)
- $A_1 = (1,1), A_2 = (1,2), \ldots, A_6 = (1,6), A_7 = (2,1), \ldots, A_{12} = (2,6), \ldots, A_{36} = (6,6)$

Quick review of probability - Conditional probability
Definition: If $A$ and $B$ are events in $E$ and $P(B) > 0$, then the conditional probability of $A$ given $B$, written $P(A|B)$, is
$$P(A|B) = \frac{P(A \text{ and } B)}{P(B)}$$
Example (Craps conditional probability):
$$P(7|\text{Win}) = \frac{P(7 \text{ and Win})}{P(\text{Win})} = \frac{P(7)}{P(\text{Win})} = \frac{6/36}{8/36} = \frac{6}{8}$$

Quick review of probability - Conditional probability - Law of Total Probability
Corollary (Law of Total Probability): Let $A_1, A_2, \ldots$ be a partition of $\Omega$ and let $B$ be another event in $\Omega$. The Law of Total Probability states that
$$P(B) = \sum_{i=1}^{\infty} P(B \text{ and } A_i) = \sum_{i=1}^{\infty} P(B|A_i) P(A_i).$$
Example (Craps Win Probability): Let $A_i$ be the event that the sum of two die rolls is $i$. Then
$$P(\text{Win}) = \sum_{i=2}^{12} P(\text{Win and } A_i) = P(7) + P(11) = \frac{6}{36} + \frac{2}{36} = \frac{8}{36} = \frac{2}{9}.$$
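Both the conditional probability definition and the Law of Total Probability can be verified on the craps sample space by counting outcomes. This is a sketch under the same equal-probability assumption as the table; the helpers `P`, `cond`, `win`, and `A` are hypothetical names introduced only for illustration.

```python
from fractions import Fraction
from itertools import product

# Sample space for two six-sided dice, each outcome equally likely.
omega = list(product(range(1, 7), repeat=2))


def P(event):
    """Probability of an event, given as a predicate on outcomes."""
    return Fraction(sum(1 for w in omega if event(w)), len(omega))


def cond(a, b):
    """Conditional probability P(a | b) = P(a and b) / P(b), assuming P(b) > 0."""
    return P(lambda w: a(w) and b(w)) / P(b)


def win(w):
    """Come-out roll win: the sum of the dice is 7 or 11."""
    return sum(w) in (7, 11)


def A(i):
    """Event A_i: the sum of the two dice is i. A_2, ..., A_12 partition omega."""
    return lambda w: sum(w) == i


# P(7 | Win) from the definition of conditional probability.
p_7_given_win = cond(A(7), win)

# Law of Total Probability, first form: sum of P(Win and A_i).
p_win_joint = sum(P(lambda w, i=i: win(w) and A(i)(w)) for i in range(2, 13))

# Law of Total Probability, second form: sum of P(Win | A_i) P(A_i).
p_win_cond = sum(cond(win, A(i)) * P(A(i)) for i in range(2, 13))

print(p_7_given_win, p_win_joint, p_win_cond)  # prints 3/4 2/9 2/9
```

Here 3/4 is the reduced form of 6/8, and both forms of the sum recover P(Win) = 2/9.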

Quick review of probability - Bayes’ Rule
Theorem (Bayes’ Rule): If $A$ and $B$ are events in $E$ with $P(B) > 0$, then Bayes’ Rule states
$$P(A|B) = \frac{P(B|A) P(A)}{P(B)} = \frac{P(B|A) P(A)}{P(B|A) P(A) + P(B|A^c) P(A^c)}$$
Example (Craps Bayes’ Rule):
$$P(7|\text{Win}) = \frac{P(\text{Win}|7) P(7)}{P(\text{Win})} = \frac{1 \cdot P(7)}{P(\text{Win})} = \frac{6/36}{8/36} = \frac{6}{8}$$

Quick review of probability - Application to Down Syndrome screening - Down Syndrome screening
If a pregnant woman has a test for Down syndrome and it is positive, what is the probability that the child will have Down syndrome?
Let $D$ indicate a child with Down syndrome and $D^c$ the opposite. Let $+$ indicate a positive test result and $-$ a negative result.
sensitivity = $P(+|D) = 0.94$
specificity = $P(-|D^c) = 0.77$
prevalence = $P(D) = 1/1000$
$$P(D|+) = \frac{P(+|D) P(D)}{P(+)} = \frac{P(+|D) P(D)}{P(+|D) P(D) + P(+|D^c) P(D^c)} = \frac{0.94 \cdot 0.001}{0.94 \cdot 0.001 + 0.23 \cdot 0.999} \approx 1/250$$
$$P(D|-) \approx 1/10{,}000$$
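The screening calculation is a direct application of Bayes' Rule with the partition {D, D^c}. A minimal sketch follows, using the sensitivity, specificity, and prevalence stated on the slide; the function name `screening_posteriors` is an assumption made for illustration.

```python
def screening_posteriors(sensitivity, specificity, prevalence):
    """Return (P(D | +), P(D | -)) for a binary test via Bayes' Rule."""
    # Law of Total Probability over the partition {D, D^c}.
    p_pos = sensitivity * prevalence + (1 - specificity) * (1 - prevalence)
    p_neg = 1 - p_pos

    # Bayes' Rule for each test result.
    p_d_given_pos = sensitivity * prevalence / p_pos
    p_d_given_neg = (1 - sensitivity) * prevalence / p_neg
    return p_d_given_pos, p_d_given_neg


p_d_pos, p_d_neg = screening_posteriors(
    sensitivity=0.94, specificity=0.77, prevalence=0.001
)
print(f"P(D | +) is about 1 in {1 / p_d_pos:.0f}")
print(f"P(D | -) is about 1 in {1 / p_d_neg:.0f}")
```

With these inputs P(D | +) comes out near 1 in 245, matching the slide's 1/250, and P(D | -) comes out near 1 in 12,800, which is of the same order as the slide's rounded 1/10,000.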

Bayesian statistics - A Bayesian statistician
Let $y$ be the data we will collect from an experiment, $K$ be everything we know for certain about the world (aside from $y$), and $\theta$ be anything we don’t know for certain.
My definition of a Bayesian statistician is an individual who makes decisions based on the probability distribution of those things we don’t know conditional on what we know, i.e. $p(\theta | y, K)$.

Bayesian statistics - Bayesian statistics (with explicit conditioning)
Parameter estimation: $p(\theta | y, M)$, where $M$ is a model with parameter (vector) $\theta$ and $y$ is data assumed to come from model $M$ with true parameter $\theta_0$.
Hypothesis testing/model comparison: $p(M_j | y, \mathcal{M})$, where $\mathcal{M}$ is a set of models with $M_j \in \mathcal{M}$ for $j = 1, 2, \ldots$ and $y$ is data assumed to come from some model $M_0 \in \mathcal{M}$.
Prediction: $p(\tilde{y} | y, M)$, where $\tilde{y}$ is unobserved data and $y$ and $\tilde{y}$ are both assumed to come from $M$. Alternatively, $p(\tilde{y} | y, \mathcal{M})$, where $y$ and $\tilde{y}$ are both assumed to come from some $M_0 \in \mathcal{M}$.

Bayesian statistics - Bayesian statistics (with implicit conditioning)
Parameter estimation: $p(\theta | y)$, where $\theta$ is the unknown parameter (vector) and $y$ is the data.
Hypothesis testing/model comparison: $p(M_j | y)$, where $M_j$ is one of a set of models under consideration and $y$ is data assumed to come from one of those models.
Prediction: $p(\tilde{y} | y)$, where $\tilde{y}$ is unobserved data and $y$ and $\tilde{y}$ are both assumed to come from the same (set of) model(s).

Bayesian statistics - Bayes’ Rule
Bayes’ Rule applied to a partition $\mathcal{P} = \{A_1, A_2, \ldots\}$:
$$P(A_i|B) = \frac{P(B|A_i) P(A_i)}{P(B)} = \frac{P(B|A_i) P(A_i)}{\sum_{j=1}^{\infty} P(B|A_j) P(A_j)}$$
Bayes’ Rule also applies to probability density (or mass) functions, e.g.
$$p(\theta|y) = \frac{p(y|\theta) p(\theta)}{p(y)} = \frac{p(y|\theta) p(\theta)}{\int p(y|\theta) p(\theta) \, d\theta}$$
where the integral plays the role of the sum in the previous statement.

Bayesian statistics - Parameter estimation
Let $y$ be data from some model with unknown parameter $\theta$. Then
$$p(\theta|y) = \frac{p(y|\theta) p(\theta)}{p(y)} = \frac{p(y|\theta) p(\theta)}{\int p(y|\theta) p(\theta) \, d\theta}$$
and we use the following terminology:
Terminology                                           Notation
Posterior                                             $p(\theta|y)$
Prior                                                 $p(\theta)$
Model                                                 $p(y|\theta)$
Prior predictive distribution (marginal likelihood)   $p(y)$
If $\theta$ is discrete (continuous), then $p(\theta)$ and $p(\theta|y)$ are probability mass (density) functions.
If $y$ is discrete (continuous), then $p(y|\theta)$ and $p(y)$ are probability mass (density) functions.

Bayesian statistics - Example: exponential model
Let $Y|\theta \sim Exp(\theta)$, then this defines the likelihood, i.e.
$$p(y|\theta) = \theta e^{-\theta y}.$$
Let’s assume a convenient prior $\theta \sim Ga(a, b)$, then
$$p(\theta) = \frac{b^a}{\Gamma(a)} \theta^{a-1} e^{-b\theta}.$$
The prior predictive distribution is
$$p(y) = \int p(y|\theta) p(\theta) \, d\theta = \frac{b^a}{\Gamma(a)} \frac{\Gamma(a+1)}{(b+y)^{a+1}}.$$
The posterior is
$$p(\theta|y) = \frac{p(y|\theta) p(\theta)}{p(y)} = \frac{(b+y)^{a+1}}{\Gamma(a+1)} \theta^{a+1-1} e^{-(b+y)\theta},$$
thus $\theta|y \sim Ga(a+1, b+y)$.
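The closed forms in the exponential example can be checked numerically. The sketch below uses only the standard library; it takes Ga(a, b) to have shape a and rate b, as in the prior density written on the slide. The values a = 2, b = 3, y = 1.5, and θ = 0.7 are arbitrary illustrative choices, and the simple midpoint rule stands in for a proper quadrature routine.

```python
import math


def gamma_pdf(theta, a, b):
    """Density of Ga(a, b) with shape a and rate b, evaluated at theta > 0."""
    log_pdf = a * math.log(b) - math.lgamma(a) + (a - 1) * math.log(theta) - b * theta
    return math.exp(log_pdf)


def exp_pdf(y, theta):
    """Likelihood p(y | theta) = theta * exp(-theta * y)."""
    return theta * math.exp(-theta * y)


def prior_predictive(y, a, b):
    """Closed form p(y) = b^a * Gamma(a+1) / (Gamma(a) * (b+y)^(a+1))."""
    log_py = (a * math.log(b) + math.lgamma(a + 1)
              - math.lgamma(a) - (a + 1) * math.log(b + y))
    return math.exp(log_py)


def integrate(f, lo, hi, n=100_000):
    """Midpoint-rule approximation of the integral of f over [lo, hi]."""
    h = (hi - lo) / n
    return h * sum(f(lo + (k + 0.5) * h) for k in range(n))


# Hypothetical prior parameters and a single observation.
a, b, y = 2.0, 3.0, 1.5

# Prior predictive: numerical integral of p(y | theta) p(theta) vs closed form.
numeric_py = integrate(lambda t: exp_pdf(y, t) * gamma_pdf(t, a, b), 0.0, 50.0)
closed_py = prior_predictive(y, a, b)

# Posterior at one point: Bayes' Rule vs the Ga(a + 1, b + y) density.
theta = 0.7
post_bayes = exp_pdf(y, theta) * gamma_pdf(theta, a, b) / closed_py
post_conjugate = gamma_pdf(theta, a + 1, b + y)

print(numeric_py, closed_py)
print(post_bayes, post_conjugate)
```

Each printed pair should agree up to numerical error, which is what the conjugate update θ | y ∼ Ga(a + 1, b + y) predicts. The upper integration limit of 50 is an assumption that the integrand is negligible beyond it for these parameter values.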

