Probability — Marc H. Mehlman (marcmehlman@yahoo.com), University of New Haven — PowerPoint PPT Presentation



SLIDE 1

Probability

Marc H. Mehlman

marcmehlman@yahoo.com

University of New Haven

“The theory of probabilities is at bottom nothing but common sense reduced to calculus.” – Laplace, Théorie analytique des probabilités, 1820

“Baseball is 90 percent mental. The other half is physical.” – Yogi Berra

Marc Mehlman (University of New Haven) Probability 1 / 34

SLIDE 2

Table of Contents

1. Probability Models
2. Independence and Conditioning
3. Risks and Odds
4. Counting

SLIDE 3

Probability Models

SLIDE 4

Chance behavior is unpredictable in the short run, but has a regular and predictable pattern in the long run.


The Language of Probability

We call a phenomenon random if individual outcomes are uncertain but there is nonetheless a regular distribution of outcomes in a large number of repetitions.

The probability of any outcome of a chance process is the proportion of times the outcome would occur in a very long series of repetitions.

SLIDE 5

Probability Models

Descriptions of chance behavior contain two parts: a list of possible outcomes and a probability for each outcome.

The sample space S of a chance process is the set of all possible outcomes. An event is an outcome or a set of outcomes of a random phenomenon; that is, an event is a subset of the sample space. A probability model is a description of some chance process that consists of two parts: a sample space S and a probability for each outcome.

SLIDE 6

Definition
  simple event = an occurrence or outcome
  compound event = an event that is not simple.

Example
Ask random Earth people what hour it is.
  sample space = the hours of the day
  1 PM = a simple event
  AM = a compound event

SLIDE 7

Definition
Given a sample space, S, a probability function, P, is a function that assigns to each event, A, a number, P(A), between 0 and 1, inclusive, that corresponds to the probability that event A occurs.

Example
“The probability of surviving the next five years after being diagnosed with pancreatic cancer is 0.07” means that only seven percent of those diagnosed with pancreatic cancer are alive five years later. Here
  S = everyone (in the USA) diagnosed with pancreatic cancer
  A = the event of being alive five years after being diagnosed
  P(A) = the probability of being alive five years after being diagnosed.

Note: P(impossible event) = 0 and P(certain event) = 1.

SLIDE 8

Finite Probability Models

One way to assign probabilities to events is to assign a probability to every individual outcome, then add these probabilities to find the probability of any event. This idea works well when there are only a finite (fixed and limited) number of outcomes. A probability model with a finite sample space is called finite. To assign probabilities in a finite model, list the probabilities of all the individual outcomes. These probabilities must be numbers between 0 and 1 that add to exactly 1. The probability of any event is the sum of the probabilities of the outcomes making up the event.

Definition (Equally Likely Outcome Probability)
Given a probability model, if there are only a finite number of outcomes and each outcome is equally likely, the probability of any event A is

  P(A) = (# outcomes in A) / (# possible outcomes in S).

SLIDE 9

Example: Give a probability model for the chance process of rolling two fair, six-sided dice, one that’s red and one that’s green.

Sample space: 36 outcomes. Since the dice are fair, each outcome is equally likely, so each outcome has probability 1/36.

  P(roll a 10) = P({(6, 4), (5, 5), (4, 6)}) = 3/36 = 1/12.
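The equally-likely-outcome rule above can be checked by brute force. A minimal sketch: enumerate the 36 ordered (red, green) outcomes and count those summing to 10.

```python
from fractions import Fraction
from itertools import product

# Sample space: all 36 ordered (red, green) outcomes of two fair dice.
S = list(product(range(1, 7), repeat=2))

# Event A: the dice sum to 10 — the outcomes (4,6), (5,5), (6,4).
A = [outcome for outcome in S if sum(outcome) == 10]

# Equally likely outcomes: P(A) = (# outcomes in A) / (# outcomes in S).
p = Fraction(len(A), len(S))
print(p)  # 1/12
```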

SLIDE 10

Definition
A trial is a procedure that leads to an outcome. The sample space associated with the trial(s) is the set of possible outcomes. A series of trials is independent if and only if the outcome of one trial gives no clue to what the outcome of any other trial is.

Definition (Relative Frequency (or Empirical) Probability)
Consider the proportion of times an event occurs in a series of independent trials. That proportion approaches the relative frequency probability of the event occurring as the number of independent trials increases. For a large number of trials:

  P(A) ≈ (# times A occurs) / (# times procedure was done)

Definition
Subjective probabilities are probabilities assigned using knowledge of circumstances.

SLIDE 11

Theorem (Law of Large Numbers)
As the number of trials, n, increases, the relative frequency probability of an event tends to approach the actual probability. Thus

  (# times A occurs) / n → P(A) as n → ∞,

where P(A) is the actual probability of A occurring. If the number of trials is large enough, one can be confident that the relative frequency probability is a good estimate of the actual probability.
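The convergence can be watched numerically. A sketch (with a fixed seed for reproducibility, an assumption not in the slides): simulate rolling two dice and track the relative frequency of “sum is 10”, whose actual probability is 1/12 ≈ 0.0833.

```python
import random

random.seed(0)  # fixed seed so the run is reproducible

def relative_frequency(n_trials):
    """Proportion of n_trials in which two simulated dice sum to 10."""
    hits = 0
    for _ in range(n_trials):
        if random.randint(1, 6) + random.randint(1, 6) == 10:
            hits += 1
    return hits / n_trials

# The estimate tends toward 1/12 as n grows.
for n in (100, 10_000, 1_000_000):
    print(n, relative_frequency(n))
```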

SLIDE 12

Probability Rules

1. Any probability is a number between 0 and 1.
2. All possible outcomes together must have probability 1.
3. If two events have no outcomes in common, the probability that one or the other occurs is the sum of their individual probabilities.
4. The probability that an event does not occur is 1 minus the probability that the event does occur.

Rule 1. The probability P(A) of any event A satisfies 0 ≤ P(A) ≤ 1.
Rule 2. If S is the sample space in a probability model, then P(S) = 1.
Rule 3. If A and B are disjoint, P(A or B) = P(A) + P(B). This is the addition rule for disjoint events.
Rule 4. The complement of any event A is the event that A does not occur, written A^C. P(A^C) = 1 − P(A).

SLIDE 13

Probability Rules

Distance-learning courses are rapidly gaining popularity among college students. Randomly select an undergraduate student who is taking distance-learning courses for credit and record the student’s age. Here is the probability model [table not reproduced in this extraction; the four age-group probabilities are 0.57 for 18 to 23 years, then 0.17, 0.14, and 0.12 for the older groups]:

(a) Show that this is a legitimate probability model.
(b) Find the probability that the chosen student is not in the traditional college age group (18 to 23 years).

(a) Each probability is between 0 and 1, and 0.57 + 0.17 + 0.14 + 0.12 = 1.
(b) P(not 18 to 23 years) = 1 − P(18 to 23 years) = 1 − 0.57 = 0.43.
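Rules 1, 2, and 4 can be checked mechanically for this model. A sketch (only the 18-to-23 group is named in the text; the other group labels here are placeholders):

```python
# Age-group model from the distance-learning example.
# Only "18 to 23" is named in the slides; the other keys are placeholder labels.
model = {"18 to 23": 0.57, "group 2": 0.17, "group 3": 0.14, "group 4": 0.12}

assert all(0 <= p <= 1 for p in model.values())  # Rule 1: each P is in [0, 1]
assert abs(sum(model.values()) - 1.0) < 1e-9     # Rule 2: probabilities sum to 1

# Rule 4 (complement): P(not 18 to 23) = 1 - P(18 to 23).
p_not_traditional = 1 - model["18 to 23"]
print(round(p_not_traditional, 2))  # 0.43
```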

SLIDE 14

Venn Diagrams

Sometimes it is helpful to draw a picture to display relations among several events. A picture that shows the sample space S as a rectangular area and events as areas within S is called a Venn diagram.

[Venn diagrams not reproduced: one shows two disjoint events; the other shows two events that are not disjoint, along with the event {A and B} consisting of the outcomes they have in common.]

SLIDE 15

The General Addition Rule

Addition Rule for Unions of Two Events
For any two events A and B:
  P(A or B) = P(A) + P(B) − P(A and B)

Addition Rule for Disjoint Events
If A, B, and C are disjoint in the sense that no two have any outcomes in common, then:
  P(A or B or C) = P(A) + P(B) + P(C)
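The general addition rule can be verified by direct enumeration on the two-dice sample space from the earlier example. A sketch, with the two events chosen here purely for illustration:

```python
from fractions import Fraction
from itertools import product

S = list(product(range(1, 7), repeat=2))  # two fair dice, 36 outcomes

def P(event):
    """Equally likely outcomes: P = (# outcomes satisfying event) / (# in S)."""
    return Fraction(len([s for s in S if event(s)]), len(S))

A = lambda s: s[0] == 6     # red die shows 6
B = lambda s: sum(s) == 10  # the dice sum to 10

lhs = P(lambda s: A(s) or B(s))
rhs = P(A) + P(B) - P(lambda s: A(s) and B(s))
print(lhs, rhs)  # both 2/9
```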

SLIDE 16

Independence and Conditioning

SLIDE 17

Multiplication Rule for Independent Events

If two events A and B do not influence each other, and if knowledge about one does not change the probability of the other, the events are said to be independent of each other.

Multiplication Rule for Independent Events
Two events A and B are independent if knowing that one occurs does not change the probability that the other occurs. If A and B are independent:
  P(A and B) = P(A) × P(B)
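The two dice give a quick check of the multiplication rule, since the red and green results do not influence each other. A sketch:

```python
from fractions import Fraction
from itertools import product

S = list(product(range(1, 7), repeat=2))  # two fair dice

def P(event):
    return Fraction(len([s for s in S if event(s)]), len(S))

p_red6 = P(lambda s: s[0] == 6)                # 1/6
p_green6 = P(lambda s: s[1] == 6)              # 1/6
p_both = P(lambda s: s[0] == 6 and s[1] == 6)  # 1/36

# Independence: P(A and B) = P(A) * P(B).
print(p_both == p_red6 * p_green6)  # True
```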

SLIDE 18

“. . . when you have eliminated the impossible, whatever remains, however improbable, must be the truth.” – Sherlock Holmes in The Sign of the Four


Conditional Probability

The probability we assign to an event can change if we know that some other event has occurred. This idea is the key to many applications of probability. When we are trying to find the probability that one event will happen under the condition that some other event is already known to have occurred, we are trying to determine a conditional probability.

The probability that one event happens given that another event is already known to have happened is called a conditional probability. When P(A) > 0, the probability that event B happens given that event A has happened is found by:

  P(B | A) = P(A and B) / P(A)

SLIDE 19

The General Multiplication Rule

The probability that events A and B both occur can be found using the general multiplication rule:
  P(A and B) = P(A) · P(B | A)
where P(B | A) is the conditional probability that event B occurs given that event A has already occurred.

The definition of conditional probability reminds us that in principle all probabilities, including conditional probabilities, can be found from the assignment of probabilities to events that describe a random phenomenon. The definition of conditional probability then turns into a rule for finding the probability that both of two events occur.

Note: Two events A and B that both have positive probability are independent if:
  P(B | A) = P(B)

SLIDE 20

Bayes’ Theorem

Theorem (Bayes’ Theorem)
The probability of event A occurring given that event B has already occurred is

  P(A|B) = P(A)P(B|A) / [P(A)P(B|A) + P(Ā)P(B|Ā)].

Proof.
  P(B) = P(B and A) + P(B and Ā) = P(A)P(B|A) + P(Ā)P(B|Ā).
Thus
  P(A|B) = P(A and B) / P(B) = P(A)P(B|A) / [P(A)P(B|A) + P(Ā)P(B|Ā)].

SLIDE 21

Example
Consider all patients requiring surgery after sustaining an injury. Let B be the event that the patient requires a blood transfusion and let A be the event that the patient is on coumadin (generic name: warfarin). Assume P(A) = 1/4, P(B|A) = 2/3 and P(B|A^c) = 1/4. What is P(A|B) (what portion of the patients needing blood transfusions are on coumadin)?

Sol: P(A|B) = P(A)P(B|A) / [P(A)P(B|A) + P(A^c)P(B|A^c)] = (1/4)(2/3) / [(1/4)(2/3) + (3/4)(1/4)] = 8/17.

SLIDE 22

Risks and Odds

SLIDE 23

Consider an experiment with a vaccine:

               disease   no disease
  treatment       a          b
  control         c          d

(entries are numbers of patients)

Definition
  p_t = incidence rate in treatment group = P(disease in treatment group) = a/(a + b).
  p_c = incidence rate in control group = P(disease in control group) = c/(c + d).
  absolute risk reduction = |p_t − p_c| = |a/(a + b) − c/(c + d)|.
  risk ratio = p_t / p_c.

SLIDE 24

Example

               disease   no disease
  treatment       50       9,950
  control        100       9,900

(entries are # of patients). Then

  absolute risk reduction = |50/10,000 − 100/10,000| = 50/10,000 = 0.005
  risk ratio = (50/10,000) / (100/10,000) = 1/2.

The absolute risk reduction indicates the vaccine will not result in a huge reduction in the disease. The risk ratio indicates that one has half the chance of getting the disease with the vaccine than without.

SLIDE 25

Question: What if one has 1/100th the chance of getting a disease with a treatment than without? Should you get the treatment?

Fact: People who wear steel helmets outside have 1/100th the death rate from meteorites as those who don’t (helmets are more effective against meteorites than the polio vaccine is against polio).

Conclusion: One also needs to consider

  number needed to treat = 1 / (absolute risk reduction) = 1 / |a/(a+b) − c/(c+d)|
  = # treatments before, statistically speaking, one incident of disease is prevented.

2,458 children must be vaccinated to prevent one case of polio (book) vs 974 million people helmeted to save one meteorite death.

SLIDE 28

Rates

Definition
  number of incidents per k = (a/b) × k
where
  a = number of times something happens
  b = size of population under consideration
  k = multiplier number

Example
Assume 25 million of 250 million Americans get the flu in 2009. Then there is a rate of (25 million / 250 million) × 1,000 = 100 incidents of flu per 1,000 Americans.
SLIDE 29

Example (Mortality Rates)
  crude mortality rate  = (# deaths / # in population) × k
  infant mortality rate = (# deaths before 1 year old / # live births) × k

Example (Fertility Rates)
  crude birth rate = (# live births / # in population) × k
  fertility rate   = (# live births / # women aged 15–44) × k

Example (Morbidity Rates)
  incidence rate  = (# reported (new) cases / # in population) × k
  prevalence rate = (# with disease / # in population) × k

SLIDE 30

Counting

SLIDE 31

Theorem (Counting Rule)
Given a sequence of k events, where event #1 has n1 possible outcomes, . . . , event #k has nk possible outcomes, the number of possible outcomes for doing all k events is n1 n2 · · · nk.

Example
Dressing in the morning, one has 10 shirts to choose from, 7 pants to choose from, and 3 pairs of shoes to choose from, so there are 10 × 7 × 3 = 210 possible shirt/pant/shoe combinations.
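The counting rule can be cross-checked by actually enumerating the outcomes. A sketch for the wardrobe example:

```python
from itertools import product
from math import prod

# Counting rule: multiply the per-event outcome counts.
counts = [10, 7, 3]  # shirts, pants, shoes
print(prod(counts))  # 210

# Cross-check by enumerating every (shirt, pants, shoes) combination.
outfits = list(product(range(10), range(7), range(3)))
print(len(outfits))  # 210
```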

SLIDE 32

Definition (Factorial)
Define 0! = 1 and, if n is a positive integer, define n! = n(n − 1) · · · 3 × 2 × 1.

Example
5! = 5 × 4 × 3 × 2 × 1 = 120.

Theorem (Factorial Rule)
Given n objects, they can be ordered n! ways.

Example
There are 25! ways for 25 students to exit a classroom, one at a time.
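The factorials above are available directly from the standard library. A sketch:

```python
from math import factorial

# Factorial rule: n distinct objects can be ordered n! ways.
print(factorial(5))   # 120
print(factorial(25))  # 15511210043330985984000000 exit orders for 25 students
```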

SLIDE 33

Theorem (Permutations (all objects distinct))
The number of ways of selecting r ordered objects from n distinct objects is nPr = n! / (n − r)!.

Example
25P5 = 25!/20! = 6,375,600 = # ways 5 students out of 25 can leave the classroom, one at a time.

SLIDE 34

Theorem (Permutations (some items the same))
Assume one has n items of k different types and that one cannot tell any two objects of the same type apart. If there are
  n1 = items of type 1
  n2 = items of type 2
  . . .
  nk = items of type k
with n = n1 + n2 + · · · + nk, then the number of distinct orderings of the items is n! / (n1! n2! n3! · · · nk!).

SLIDE 35

Example
Make a train from 3 cattle cars, 5 tankers, 7 flat beds, and 6 passenger coaches. Then there are 21!/(3! 5! 7! 6!) = 19,554,575,040 ways to do this if one cannot tell the difference between cars of the same type. (If one can tell, there are 21! ways.)

SLIDE 36

Theorem (Combinations (order does not matter))
The number of ways to choose r objects from n distinct objects is nCr = n! / ((n − r)! r!).

Example
There are 25C5 = 25!/(20! 5!) = 53,130 ways of making a team of 5 from 25 students.
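Combinations divide out the r! orderings that permutations count separately. A sketch for the team example:

```python
from math import comb, perm, factorial

# nCr = n! / ((n - r)! r!): unordered selections of r from n.
print(comb(25, 5))                  # 53130 teams of 5 from 25 students
print(perm(25, 5) // factorial(5))  # same count: nCr = nPr / r!
```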
