Laboratorio de Ciberseguridad: Probability, Random Processes and Inference

INSTITUTO POLITÉCNICO NACIONAL, CENTRO DE INVESTIGACION EN COMPUTACION. Laboratorio de Ciberseguridad. Probability, Random Processes and Inference. Dr. Ponciano Jorge Escamilla Ambrosio, pescamilla@cic.ipn.mx, http://www.cic.ipn.mx/~pescamilla/


slide-1
SLIDE 1

INSTITUTO POLITÉCNICO NACIONAL CENTRO DE INVESTIGACION EN COMPUTACION

Probability, Random Processes and Inference

  • Dr. Ponciano Jorge Escamilla Ambrosio

pescamilla@cic.ipn.mx http://www.cic.ipn.mx/~pescamilla/

Laboratorio de Ciberseguridad

slide-2
SLIDE 2

CIC

Probability, Random Processes and Inference

❑ Instructor
➢ Dr. Ponciano Jorge Escamilla-Ambrosio
➢ pescamilla@cic.ipn.mx
➢ http://www.cic.ipn.mx/~pescamilla/
❑ Class meetings
➢ Mondays and Wednesdays, 10:00 – 12:00 hrs.
➢ Classroom: Aula A2

slide-3
SLIDE 3

Course web site

❑ Course web site:
➢ http://www.cic.ipn.mx/~pescamilla/academy.html
➢ Reading material, homework exercises, etc.

slide-4
SLIDE 4

Course Objective

The student will learn the fundamentals of probability theory: probabilistic models, discrete and continuous random variables, multiple random variables and limit theorems, as well as an introduction to more advanced topics such as random processes and statistical inference. At the end of the course the student will be able to develop and analyse probabilistic models in a manner that combines intuitive understanding and mathematical precision.

slide-5
SLIDE 5

Course content

1. Probability
1.1. What is Probability?
1.1.1. Statistical Probability
1.1.2. Probability as a Measure of Uncertainty
1.2. Sample Space and Probability
1.2.1. Probabilistic Models
1.2.2. Conditional Probability
1.2.3. Total Probability Theorem and Bayes’ Rule
1.2.4. Independence
1.2.5. Counting
1.2.6. The Probabilistic Method

slide-6
SLIDE 6

Course content

1.3. Discrete Random Variables
1.3.1. Basic Concepts
1.3.2. Probability Mass Functions
1.3.3. Functions of Random Variables
1.3.4. Expectation and Variance
1.3.5. Joint PMFs of Multiple Random Variables
1.3.6. Conditioning
1.3.7. Independence

slide-7
SLIDE 7

Course content

1.4. General Random Variables
1.4.1. Continuous Random Variables and PDFs
1.4.2. Cumulative Distribution Function
1.4.3. Normal Random Variables
1.4.4. Joint PDFs of Multiple Random Variables
1.4.5. Conditioning
1.4.6. The Continuous Bayes’ Rule
1.4.7. The Strong Law of Large Numbers

slide-8
SLIDE 8

Course content

2. Introduction to Random Processes
2.1. Markov Chains
2.1.1. Discrete Time Markov Chains
2.1.2. Classification of States
2.1.3. Steady State Behavior
2.1.4. Absorption Probabilities and Expected Time to Absorption
2.1.5. Continuous Time Markov Chains
2.1.6. Ergodic Theorem for Discrete Markov Chains
2.1.7. Markov Chain Monte Carlo Method
2.1.8. Queuing Theory

slide-9
SLIDE 9

Course content

3. Statistics
3.2. Classical Statistical Inference
3.2.1. Classical Parameter Estimation
3.2.2. Linear Regression
3.2.3. Analysis of Variance and Regression
3.2.4. Binary Hypothesis Testing
3.2.5. Significance Testing

slide-10
SLIDE 10

Course Schedule A-20

https://www.cic.ipn.mx/~pescamilla/Curse_schedule_PRPI-Y-A.html

slide-11
SLIDE 11

Course text books

Joseph Blitzstein, Jessica Hwang. Introduction to probability, CRC Press, 2014.
https://www.crcpress.com/Introduction-to-Probability/Blitzstein-Hwang/9781466575578

Dimitri P. Bertsekas and John N. Tsitsiklis. Introduction to probability, 2nd Edition, Athena Scientific, 2008.
http://athenasc.com/probbook.html

slide-12
SLIDE 12

Course text books

Géza Schay. Introduction to probability with statistical applications, Birkhäuser, Boston, 2007.
http://link.springer.com/book/10.1007/978-0-8176-4591-5

William Feller. An introduction to probability theory and its applications, Vol. 1, 3rd Edition, Wiley, 1968.
http://www.wiley.com/WileyCDA/WileyTitle/productCd-0471257087.html

slide-13
SLIDE 13

Grading

❑ Midterm exam 30%
❑ Final exam 30%
❑ Homework assignments 20%
❑ Final project 20%

slide-14
SLIDE 14

Probability

1. What is Probability?
1.1.1. Statistical Probability
1.1.2. Probability as a Measure of Uncertainty

slide-15
SLIDE 15

What is Probability?

slide-16
SLIDE 16

What is Probability?

❑ The relative is trying to use the concept of probability to discuss an uncertain situation
❑ Luck, Coincidence, Randomness, Uncertainty, Risk, Doubt, Fortune, Chance…
❑ Used in a vague, casual way!
❑ A first approach to defining probability is in terms of frequency of occurrence, as a percentage of successes

slide-17
SLIDE 17

What is Probability?

❑ For example, suppose we toss a coin and observe whether it lands head (H) or tail (T) up. What is the probability of either result? Why?

slide-18
SLIDE 18

What is Probability?

❑ Example: Flip a coin twice

$P(A) = \frac{\#\,\text{favorable outcomes}}{\#\,\text{possible outcomes}}$

slide-19
SLIDE 19

Sample space

❑ Definition 1 (Sample space and event).
➢ The sample space S of an experiment is the set of all possible outcomes of the experiment.
➢ An event A is a subset of the sample space S, and we say that A occurred if the actual outcome is in A.

slide-20
SLIDE 20

Sample space

❑ Example: the experiment of tossing a coin twice.

S = ? A = ?
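The two-toss experiment is small enough to enumerate directly. The sketch below builds S and applies the favorable-over-possible ratio; the event "at least one head" is an assumption chosen only for illustration, since the slide leaves A open.

```python
from fractions import Fraction
from itertools import product

# Sample space for two tosses of a coin: every ordered pair of H/T.
sample_space = {"".join(toss) for toss in product("HT", repeat=2)}

# Hypothetical event chosen for illustration: at least one head.
event_a = {outcome for outcome in sample_space if "H" in outcome}

# Ratio of favorable outcomes to possible outcomes (equally likely case).
prob_a = Fraction(len(event_a), len(sample_space))
print(sorted(sample_space), sorted(event_a), prob_a)
```

The ratio is only meaningful when all four outcomes are equally likely, that is, for a fair coin.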

slide-21
SLIDE 21

What is Probability?

❑ “Probability is logical framework for quantifying uncertainty and randomness” [Blitzstein and Hwang, 2014]
❑ “Probability theory is a branch of mathematics that deals with repetitive events whose occurrence or nonoccurrence is subject to chance variation.” [Schay, 2007]

slide-22
SLIDE 22

What is Probability?

❑ Provides tools for understanding and explaining variation, separating signal from noise, and modeling complex phenomena. (engineer definition)

slide-23
SLIDE 23

What is Probability?

❑ There are situations where the frequency interpretation is not appropriate
❑ Example: A scholar asserts that the Iliad and the Odyssey were composed by the same person, with probability 90%
❑ This probability is based on the scholar’s subjective belief

slide-24
SLIDE 24

What is Probability?

❑ The theory of probability is useful in a broad variety of contexts and applications:
❑ Statistics, Physics, Biology, Computer Science, Meteorology, Gambling, Finance, Political Science, Medicine, Life.
❑ Assignment 1a: Give an example of the application of probability theory in the area of your interest
❑ Assignment 1b: Read math review: http://projects.iq.harvard.edu/files/stat110/files/math_review_handout.pdf

slide-25
SLIDE 25

Probabilistic Model

slide-26
SLIDE 26

Elements of a Probabilistic Model

❑ The sample space S, which is the set of all possible outcomes of an experiment.
❑ The probability law, which assigns to a set A of possible outcomes (also called an event) a nonnegative number P(A) (called the probability of A) that encodes our knowledge or belief about the collective “likelihood” of the elements of A. The probability law must satisfy certain properties.

slide-27
SLIDE 27

Experiments and events

❑ The experiment will produce exactly one out of several possible outcomes.
❑ A subset of the sample space, that is, a collection of possible outcomes, is called an event.
❑ It means that any collection of possible outcomes, including the entire sample space S and its complement, the empty set ∅, may qualify as an event.

Strictly speaking, however, some sets have to be excluded. In particular, when dealing with probabilistic models involving an uncountably infinite sample space, there are certain unusual subsets with which one cannot associate meaningful probabilities.

slide-28
SLIDE 28

Experiments and events

❑ There is no restriction on what constitutes an experiment.
❑ The events to be considered can be described by such statements as “a toss of a given coin results in head,” “a card drawn at random from a regular 52 card deck is an Ace,” or “this book is green.”
❑ Associated with each statement there is a set S of possibilities, or possible outcomes.

slide-29
SLIDE 29

Experiments and events

Examples of experiments and events:

❑ Tossing a Coin. For a coin toss, S may be taken to consist of two possible outcomes, which we may abbreviate as H and T for head and tail. We say that H and T are the members, elements or points of S, and write S = {H, T}.
❑ Tossing two coins but ignoring one of them. In this case S = {HH, HT, TH, TT}. Here, for instance, the event “the first coin shows H” is represented by the set {HH, HT}, that is, this statement is true if we obtain HH or HT and false if we obtain TH or TT.

slide-30
SLIDE 30

Experiments and events

❑ Tossing a Coin Until an H is Obtained. If we toss a coin until an H is obtained, we cannot say in advance how many tosses will be required, and so the natural sample space is S = {H, TH, TTH, TTTH, . . . }, an infinite set. We can, of course, use many other sample spaces as well; for instance, we may be interested only in whether we had to toss the coin more than twice or not, in which case S = {1 or 2, more than 2} is adequate.
❑ Selecting a Number from an Interval. Sometimes, we need an uncountable set for a sample space. For instance, if the experiment consists of choosing a random number between 0 and 1, we may use S = {x : 0 < x < 1}.

slide-31
SLIDE 31

The probability law

❑ Specifies the “likelihood” of any outcome, or of any set of possible outcomes.
❑ Assigns to every event A a number P(A), called the probability of A.

slide-32
SLIDE 32

Probability Space [Schay 2007]

❑ Given a sample space S and a certain collection ℱ of its subsets, called events, an assignment P of a number P(A) to each event A in ℱ is called a probability measure, and P(A) the probability of A, if P has the following properties:

1. P(A) ≥ 0 for every A,
2. P(S) = 1, and
3. P(A1 ∪ A2 ∪ · · · ) = P(A1) + P(A2) + · · · for any finite or countably infinite set of mutually exclusive events A1, A2, …

Then, the sample space S together with ℱ and P is called a probability space.

slide-33
SLIDE 33

Probability Axioms [Bertsekas and Tsitsiklis, 2008]

[Figure: statement of the probability axioms. Surviving text: P(S) = 1.]

slide-34
SLIDE 34

Probability Space [Blitzstein and Hwang, 2015]

❑ Definition 1.6.1 (General definition of probability). A probability space consists of a sample space S and a probability function P which takes an event A ⊆ S as input and returns P(A), a real number between 0 and 1, as output. The function P must satisfy the following axioms:

1. P(∅) = 0, P(S) = 1.
2. If A1, A2, . . . are disjoint events, then:

$P\left(\bigcup_{j=1}^{\infty} A_j\right) = \sum_{j=1}^{\infty} P(A_j)$

(Saying that these events are disjoint means that they are mutually exclusive: Ai ∩ Aj = ∅ for i ≠ j.)

slide-35
SLIDE 35

Properties of probabilities

❑ The Probability of the Empty Set Is 0. In any probability space, P(∅) = 0.
❑ Proof: S and ∅ are mutually exclusive and S ∪ ∅ = S, so

1 = P(S) = P(S ∪ ∅) = P(S) + P(∅) = 1 + P(∅)

and therefore P(∅) = 0.

slide-36
SLIDE 36

Properties of probabilities

❑ The Probability of the Union of Two Events. For any two events A and B,

P(A ∪ B) = P(A) + P(B) − P(A ∩ B)

Proof:

slide-37
SLIDE 37

Properties of probabilities

❑ Probability of Complements. For any event A,

P(A^c) = 1 − P(A)

Proof: A^c ∩ A = ∅ and A^c ∪ A = S by the definition of A^c. Thus, by Axiom 3, P(S) = P(A^c ∪ A) = P(A^c) + P(A). Now, Axiom 2 says that P(S) = 1, and so, comparing these two values of P(S), we obtain P(A^c) + P(A) = 1.

slide-38
SLIDE 38

Properties of probabilities

❑ Probability of Subsets. If A ⊆ B, then P(A) ≤ P(B).

Proof: If A ⊆ B, then we can write B as the union of A and B ∩ A^c, where B ∩ A^c is the part of B not also in A. Since A and B ∩ A^c are disjoint, we can apply the second axiom:

P(B) = P(A ∪ (B ∩ A^c)) = P(A) + P(B ∩ A^c)

Probability is nonnegative, so P(B ∩ A^c) ≥ 0, proving that P(B) ≥ P(A).
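The union, complement and subset properties can be checked numerically on a small model. The following is a minimal sketch, assuming one roll of a fair six-sided die and two events A and B picked only for illustration; it is a sanity check, not a proof.

```python
from fractions import Fraction

# Assumed example: one roll of a fair die, all outcomes equally likely.
S = frozenset(range(1, 7))

def prob(event):
    """Discrete uniform law: (elements of event inside S) / (elements of S)."""
    return Fraction(len(event & S), len(S))

A = frozenset({2, 4, 6})   # hypothetical event: the roll is even
B = frozenset({4, 5, 6})   # hypothetical event: the roll is at least 4

union_ok = prob(A | B) == prob(A) + prob(B) - prob(A & B)
complement_ok = prob(S - A) == 1 - prob(A)
subset_ok = prob(A & B) <= prob(B)     # A ∩ B is a subset of B
print(union_ok, complement_ok, subset_ok)
```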

slide-39
SLIDE 39

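Properties of probabilities

❑ Inclusion-exclusion. For any events A1, . . . , An,

$P\left(\bigcup_{i=1}^{n} A_i\right) = \sum_{i} P(A_i) - \sum_{i<j} P(A_i \cap A_j) + \sum_{i<j<k} P(A_i \cap A_j \cap A_k) - \cdots + (-1)^{n+1} P(A_1 \cap \cdots \cap A_n)$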

slide-40
SLIDE 40

Properties of probabilities

❑ Example:

slide-41
SLIDE 41

Properties of Probability Laws

slide-42
SLIDE 42

Discrete Probability Law

slide-43
SLIDE 43

Discrete Uniform Probability Law

❑ In the special case where the probabilities P(s1), …, P(sn) are all the same, by necessity equal to 1/n in view of the normalization axiom, we obtain:

$P(A) = \frac{\text{number of elements of } A}{n}$

slide-44
SLIDE 44

Discrete Uniform Probability Law

slide-45
SLIDE 45

Discrete Uniform Probability Law

slide-46
SLIDE 46

Counting

❑ The calculation of probabilities often involves counting the number of outcomes in various events.

➢ When the sample space S has a finite number of equally likely outcomes, the discrete uniform probability law applies. Then, the probability of any event A is given by:

$P(A) = \frac{\text{number of elements of } A}{\text{number of elements of } S} = \frac{k}{n}$

➢ When we want to calculate the probability of an event A with a finite number of equally likely outcomes, each of which has an already known probability p, the probability of A is given by:

$P(A) = p \cdot (\text{number of elements of } A)$

slide-47
SLIDE 47

Basic Counting Principle

❑ In how many ways can you dress today if you find:

➢ 4 shirts
➢ 3 ties
➢ 2 jackets

in your closet?

slide-48
SLIDE 48

The Multiplication Principle

❑ Consider a process that consists of r stages. Suppose that:

a) There are n1 possible results at the first stage.
b) For every possible result at the first stage, there are n2 possible results at the second stage.
c) More generally, for any sequence of possible results at the first i − 1 stages, there are ni possible results at the ith stage.

Then, the total number of possible results of the r-stage process is:

$n_1 n_2 \cdots n_r$
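The dressing question (4 shirts, 3 ties, 2 jackets) is a three-stage process, so the principle can be checked by brute force. This is a minimal sketch; the variable names are illustrative.

```python
from itertools import product
from math import prod

# Stage sizes from the dressing question: 4 shirts, 3 ties, 2 jackets.
stage_sizes = [4, 3, 2]

# Enumerate every outfit as a (shirt, tie, jacket) index triple.
outfits = list(product(*(range(n) for n in stage_sizes)))

# The multiplication principle predicts n1 * n2 * ... * nr results.
predicted = prod(stage_sizes)
print(len(outfits), predicted)
```

Enumeration and the product formula agree because every choice at one stage leaves the same number of choices at the next.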

slide-49
SLIDE 49

The Multiplication Principle

slide-50
SLIDE 50

The Multiplication Principle

❑ Example 1. The number of telephone numbers. A local telephone number is a 7-digit sequence, but the first digit has to be different from 0 or 1. How many distinct telephone numbers are there?

slide-51
SLIDE 51

The Multiplication Principle

❑ Example 2. The number of subsets of an n-element set. Consider an n-element set {s1, s2, …, sn}.
❑ How many subsets does it have, including itself and the empty set?
❑ For example, how many subsets does the set {1, 2, 3} have?

slide-52
SLIDE 52

The Multiplication Principle

❑ This is a sequential process where we take in turn each of the n elements and decide whether to include it in the desired subset or not.
❑ Thus we have n steps, and in each step two choices, namely yes or no to the question of whether the element belongs to the desired subset. Therefore the number of subsets is:

$2 \cdot 2 \cdots 2 = 2^n$

❑ What about n = 1?
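The yes/no argument can be checked by listing the subsets of {1, 2, 3} explicitly. A minimal sketch follows; the helper name all_subsets is hypothetical.

```python
from itertools import chain, combinations

def all_subsets(items):
    """Return every subset of items, from the empty set to the full set."""
    items = list(items)
    by_size = (combinations(items, size) for size in range(len(items) + 1))
    return [set(combo) for combo in chain.from_iterable(by_size)]

subsets = all_subsets({1, 2, 3})
single = all_subsets({1})
print(len(subsets), 2 ** 3, len(single), 2 ** 1)
```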

slide-53
SLIDE 53

Number of subsets

❑ Example 3. Drawing three cards. What is the number of ways three cards can be drawn one after the other from a regular 52-card deck without replacement?
❑ What is this number if we replace each card before the next one is drawn?
slide-54
SLIDE 54

Number of subsets

❑ Example 3. Drawing three cards. What is the number of ways three cards can be drawn one after the other from a regular 52-card deck without replacement?

n1 = 52, n2 = 51, n3 = 50 ⇒ 52 · 51 · 50

❑ What is this number if we replace each card before the next one is drawn?

n1 = n2 = n3 = 52 ⇒ $52^3$
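Both counts for the three-card draw follow from the multiplication principle and can be evaluated directly. A short sketch using Python's standard library:

```python
from math import perm

# Without replacement: 52 choices, then 51, then 50.
without_replacement = perm(52, 3)

# With replacement: 52 choices at each of the three draws.
with_replacement = 52 ** 3

print(without_replacement, with_replacement)
```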

slide-55
SLIDE 55

Permutation and Combination

❑ Permutations and combinations involve the selection of k objects out of a collection of n objects.
❑ If the order of selection matters, the selection is called a permutation.
❑ If the order of selection does not matter, the selection is called a combination.

slide-56
SLIDE 56

Permutation

k-permutations

❑ Assume there are n distinct objects, and let k be some positive integer with k ≤ n.
❑ We want to count the number of different ways that we can pick k out of these n objects and arrange them in a sequence, i.e., the number of distinct k-object sequences.

slide-57
SLIDE 57

Permutation

➢ In place 1 we can put n objects, which we can write as n − 1 + 1;
➢ In place 2 we can put n − 1 = n − 2 + 1 objects; and so on.

❑ Thus the kth factor will be n − k + 1, and so, for any two positive integers n and k ≤ n:

$n(n-1)(n-2)\cdots(n-k+1) = P_{n,k}$

❑ In the special case where k = n:

$n(n-1)(n-2)\cdots 3 \cdot 2 \cdot 1 = n!$

❑ The possible sequences are simply called permutations

slide-58
SLIDE 58

Permutation

❑ From the definitions of n!, (n − k)! and $P_{n,k}$ we can obtain the following relation:

$n! = [n(n-1)(n-2)\cdots(n-k+1)]\,[(n-k)(n-k-1)\cdots 2 \cdot 1] = P_{n,k} \cdot (n-k)!$

❑ and so:

$P_{n,k} = \frac{n!}{(n-k)!}$, with 0! = 1.

slide-59
SLIDE 59

Probability calculation

❑ Example 4. Six rolls of a die. Find the probability that:

➢ Six rolls of a (six-sided) die all give different numbers
➢ Assume all outcomes are equally likely

P(all six rolls give different numbers) = ?

$P(A) = \frac{\text{number of elements of } A}{\text{number of elements of } S} = \frac{k}{n}$

$P(A) = p \cdot (\text{number of elements of } A)$

p = probability of each equally likely outcome in A

slide-60
SLIDE 60

Probability calculation

❑ Example 4. Six rolls of a die. Find the probability that:

➢ Six rolls of a (six-sided) die all give different numbers
➢ Assume all outcomes are equally likely

P(all six rolls give different numbers) = ?

$P(A) = \frac{\text{number of elements of } A}{\text{number of elements of } S} = \frac{k}{n} = \frac{P_{6,6}}{\#\,\text{elements in } S} = \frac{6!}{6^6}$

$P(A) = p \cdot (\text{number of elements of } A) = \frac{1}{6^6} \cdot 6!$

p = probability of each equally likely outcome in A
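The count 6! and the resulting probability can be confirmed by enumerating all 6^6 roll sequences. This is a brute-force sketch, practical only because the sample space is small.

```python
from fractions import Fraction
from itertools import product
from math import factorial

# Count the roll sequences in which all six values are different.
favorable = sum(
    1 for rolls in product(range(1, 7), repeat=6) if len(set(rolls)) == 6
)

# Equally likely outcomes: favorable / total.
prob_all_different = Fraction(favorable, 6 ** 6)
print(favorable, factorial(6), prob_all_different, float(prob_all_different))
```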

slide-61
SLIDE 61

Permutation

❑ Example 5. Dealing Three Cards. In how many ways can three cards be dealt from a regular deck of 52 cards?

slide-62
SLIDE 62

Permutation

❑ Example 5. Dealing Three Cards. In how many ways can three cards be dealt from a regular deck of 52 cards?

$P_{52,3} = \frac{52!}{(52-3)!} = 52 \cdot 51 \cdot 50 = 132{,}600$

slide-63
SLIDE 63

Permutation

❑ Example 6. Birthday problem. There are k people in a room. Assume each person’s birthday is equally likely to be any of the 365 days of the year (we exclude February 29), and that people’s birthdays are independent (we assume there are no twins in the room). What is the probability that two or more people in the group have the same birthday?

slide-64
SLIDE 64

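Permutation

❑ Counting the ways to assign birthdays to k people with no match amounts to sampling the 365 days of the year without replacement, so the number of such assignments is:

$365 \cdot 364 \cdot 363 \cdots (365 - k + 1)$ for k ≤ 365

Therefore the probability of no birthday matches in a group of k people is:

$P(\text{no match}) = \frac{365 \cdot 364 \cdots (365 - k + 1)}{365^k}$

and the probability of at least one birthday match is:

$P(\text{at least one match}) = 1 - \frac{365 \cdot 364 \cdots (365 - k + 1)}{365^k}$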

slide-65
SLIDE 65

Permutation

[Figure] Probability that in a room of k people, at least two were born on the same day. This probability first exceeds 0.5 when k = 23.
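The threshold k = 23 can be reproduced numerically. The sketch below assumes 365 equally likely birthdays and independence, as in the example; the function name is illustrative.

```python
from fractions import Fraction

def prob_birthday_match(k, days=365):
    """P(at least two of k people share a birthday), uniform and independent."""
    if k > days:
        return Fraction(1)
    no_match = Fraction(1)
    for i in range(k):
        # The (i+1)-th person must avoid the i birthdays already taken.
        no_match *= Fraction(days - i, days)
    return 1 - no_match

# Smallest group size for which a match is more likely than not.
first_k = next(
    k for k in range(1, 366) if prob_birthday_match(k) > Fraction(1, 2)
)
print(first_k, float(prob_birthday_match(first_k)))
```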

slide-66
SLIDE 66

Combinations

❑ The number of possible unordered selections of k different things out of n different ones is denoted by $C_{n,k}$, and each such selection is called a combination of the given things.
❑ If we select k things out of n without regard to order, then this can be done in $C_{n,k}$ ways.
❑ In each case we have k things, which can be ordered in k! ways.
❑ Thus, by the multiplication principle, the number of ordered selections is $C_{n,k} \cdot k!$
❑ On the other hand, this number is, by definition, $P_{n,k}$. Therefore $C_{n,k} \cdot k! = P_{n,k}$, and so:

$C_{n,k} = \frac{P_{n,k}}{k!} = \frac{n!}{k!\,(n-k)!}$

slide-67
SLIDE 67

Combinations

❑ The quantity on the right-hand side is usually abbreviated as $\binom{n}{k}$, and is called a binomial coefficient.
❑ Thus, for any positive integer n and k = 1, 2, . . . , n:

$C_{n,k} = \binom{n}{k} = \frac{n(n-1)(n-2)\cdots(n-k+1)}{k!} = \frac{n!}{k!\,(n-k)!}$

$n! = [n(n-1)(n-2)\cdots(n-k+1)]\,[(n-k)(n-k-1)\cdots 2 \cdot 1]$
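The relation between ordered and unordered selections can be checked by enumeration. The values n = 5 and k = 3 below are assumptions chosen to keep the lists short.

```python
from itertools import combinations, permutations
from math import comb, factorial, perm

n, k = 5, 3  # illustrative values, not from the slides

# Ordered selections of k out of n objects (k-permutations).
ordered = len(list(permutations(range(n), k)))

# Unordered selections of k out of n objects (combinations).
unordered = len(list(combinations(range(n), k)))

# Each combination can be ordered in k! ways.
print(ordered, unordered, unordered * factorial(k))
```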

slide-68
SLIDE 68

Combinations

slide-69
SLIDE 69

Binomial probabilities

❑ Binomial coefficient $\binom{n}{k}$ → Binomial probabilities
❑ n ≥ 1 independent coin tosses: P(H) = p; P(k heads) = ?
❑ Example: P(HTTTHH) = ?
❑ P(particular sequence) = ?
❑ P(particular k-head sequence) = ?

slide-70
SLIDE 70

Binomial probabilities

❑ Binomial coefficient $\binom{n}{k}$ → Binomial probabilities
❑ n ≥ 1 independent coin tosses: P(H) = p; P(k heads) = ?
❑ Example: P(HTTTHH) = p (1 − p)(1 − p)(1 − p) p p
❑ P(particular sequence) = $p^{\#H} (1-p)^{\#T}$
❑ P(particular k-head sequence) = $p^k (1-p)^{n-k}$
❑ P(k heads) = $p^k (1-p)^{n-k} \cdot (\#\,k\text{-head sequences}) = \binom{n}{k} p^k (1-p)^{n-k}$
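The binomial formula can be evaluated with exact fractions. In the sketch below, p = 1/3 and n = 6 are assumed values used only for illustration.

```python
from fractions import Fraction
from math import comb

def binomial_pmf(k, n, p):
    """P(k heads in n independent tosses) when P(H) = p."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)

p = Fraction(1, 3)  # assumed head probability, for illustration only

# One particular sequence with three heads and three tails, e.g. HTTTHH.
sequence_prob = p ** 3 * (1 - p) ** 3

# All sequences with exactly three heads in six tosses.
three_heads = binomial_pmf(3, 6, p)

# The probabilities of 0, 1, ..., 6 heads must add up to 1.
total = sum(binomial_pmf(k, 6, p) for k in range(7))
print(sequence_prob, three_heads, total)
```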

slide-71
SLIDE 71

Partitions

❑ A combination can be seen as a partition of the set in two: one part contains k elements and the other contains the remaining n − k elements.
❑ Given an n-element set and nonnegative integers n1, n2, …, nr, whose sum is equal to n, consider partitions of the set into r disjoint subsets, with the ith subset containing exactly ni elements.
❑ In how many ways can this be done?

slide-72
SLIDE 72

Partitions

❑ There are $\binom{n}{n_1}$ ways of forming the first subset.
❑ Having formed the first subset, there are n − n1 elements left. We need to choose n2 of them in order to form the second subset, and have $\binom{n-n_1}{n_2}$ choices, and so on.
❑ Thus, using the Counting Principle:

$\binom{n}{n_1}\binom{n-n_1}{n_2}\binom{n-n_1-n_2}{n_3}\cdots\binom{n-n_1-\cdots-n_{r-1}}{n_r}$

slide-73
SLIDE 73

Partitions

❑ As several terms cancel, the result is:

$\frac{n!}{n_1!\,n_2!\cdots n_r!}$

❑ This is called the multinomial coefficient and is usually denoted by:

$\binom{n}{n_1, n_2, \ldots, n_r}$

slide-74
SLIDE 74

Partitions

slide-75
SLIDE 75

Partitions

❑ Example 7. Each person gets an ace. There is a 52-card deck, dealt (fairly) to four players. What is the probability of each player getting an ace?

slide-76
SLIDE 76

Partitions

❑ Example 7. Each person gets an ace. There is a 52-card deck, dealt (fairly) to four players. What is the probability of each player getting an ace?

➢ The size of the sample space is: $\frac{52!}{13!\,13!\,13!\,13!}$
➢ Constructing an outcome with one ace for each person:
  • Number of different ways of distributing the 4 aces to 4 players: 4!
  • Distribution of the remaining 48 cards: $\frac{48!}{12!\,12!\,12!\,12!}$
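Dividing the number of favorable deals by the size of the sample space gives the requested probability. The following sketch carries out that division exactly; the helper multinomial is a hypothetical name for the multinomial coefficient.

```python
from fractions import Fraction
from math import factorial

def multinomial(n, parts):
    """Number of partitions of n items into groups of the given sizes."""
    if sum(parts) != n:
        raise ValueError("group sizes must add up to n")
    result = factorial(n)
    for size in parts:
        result //= factorial(size)  # exact at every step
    return result

# All ways to deal 52 cards into four hands of 13.
sample_space_size = multinomial(52, [13] * 4)

# One ace per player (4! ways), then deal the other 48 cards 12 each.
favorable = factorial(4) * multinomial(48, [12] * 4)

prob_each_gets_ace = Fraction(favorable, sample_space_size)
print(prob_each_gets_ace, float(prob_each_gets_ace))
```

Under these counts the probability comes out to roughly 0.105.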

slide-77
SLIDE 77

Summary of Counting Results
slide-78
SLIDE 78

Conditional Probability

❑ Conditional probability provides us with a way to reason about the outcome of an experiment, based on partial information.
❑ Examples:
❑ A) In an experiment involving two successive rolls of a die, you are told that the sum of the two rolls is 9. How likely is it that the first roll was 6?
❑ B) In a word guessing game, the first letter of the word is a “t”. What is the likelihood that the second letter is an “h”?

slide-79
SLIDE 79

Conditional Probability

❑ C) How likely is it that a person has a certain disease given that a medical test was negative?
❑ D) A spot shows up on a radar screen. How likely is it to correspond to an aircraft?

slide-80
SLIDE 80

Conditional Probability

❑ Given:
➢ An experiment
➢ A corresponding sample space
➢ A probability law
➢ We know that the outcome is within some given event B.
❑ Quantify the likelihood that the outcome also belongs to some other given event A.

slide-81
SLIDE 81

Conditional Probability

❑ Construct a new probability law that takes into account the available knowledge.
➢ A probability law that, for any event A, specifies the conditional probability of A given B, P(A|B).
❑ The conditional probabilities P(A|B) of different events A should satisfy the probability axioms.

slide-82
SLIDE 82

Conditional Probability

❑ Example:
➢ Suppose that all six possible outcomes of a fair die roll are equally likely.
➢ If the outcome is even, then there are only three possible outcomes: 2, 4 and 6.
➢ What is the probability of the outcome being 6 given that the outcome is even?

slide-83
SLIDE 83

Conditional Probability

❑ If all possible outcomes are equally likely:

$P(A|B) = \frac{\text{number of elements of } A \cap B}{\text{number of elements of } B}$

❑ Conditional probability definition:

$P(A|B) = \frac{P(A \cap B)}{P(B)}$

➢ With P(B) > 0.

❑ Out of the total probability of the elements of B, P(A|B) is the fraction that is assigned to possible outcomes that also belong to A.
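For the fair-die example (the outcome is 6, given that it is even), the definition can be applied mechanically. This is a minimal sketch for finite sample spaces with equally likely outcomes; the function names are illustrative.

```python
from fractions import Fraction

def prob(event, sample_space):
    """Discrete uniform law on a finite sample space."""
    return Fraction(len(event & sample_space), len(sample_space))

def cond_prob(a, b, sample_space):
    """P(A|B) = P(A ∩ B) / P(B), defined only when P(B) > 0."""
    p_b = prob(b, sample_space)
    if p_b == 0:
        raise ValueError("P(B) must be positive")
    return prob(a & b, sample_space) / p_b

die = set(range(1, 7))
even = {2, 4, 6}
six = {6}

p_six_given_even = cond_prob(six, even, die)
print(p_six_given_even)
```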

slide-84
SLIDE 84

Conditional Probability

❑ The probability law of conditional probabilities satisfies the three axioms:

1. P(A|B) ≥ 0 for every event A,
2. P(S|B) = 1,
3. P(A1 ∪ A2 ∪ · · · |B) = P(A1|B) + P(A2|B) + · · · for any finite or countably infinite number of mutually exclusive events A1, A2, . . . .

slide-85
SLIDE 85

Conditional Probability

❑ Proofs:

1. In the definition of P(A|B) the numerator is nonnegative by Axiom 1, and the denominator is positive by assumption. Thus, the fraction is nonnegative.
2. Taking A = S in the definition of P(A|B), we get:

$P(S|B) = \frac{P(S \cap B)}{P(B)} = \frac{P(B)}{P(B)} = 1$

slide-86
SLIDE 86

Conditional Probability

3.

slide-87
SLIDE 87

Conditional Probability

Knowledge that event B has occurred implies that the outcome of the experiment is in the set B. In computing P(A|B) we can therefore view the experiment as now having the reduced sample space B. The event A occurs in the reduced sample space if and only if the outcome ζ is in A ∩ B. The equation simply renormalizes the probability of events that occur jointly with B.

slide-88
SLIDE 88

Conditional Probability

Suppose that we learn that B occurred. Upon obtaining this information, we get rid of all the pebbles in B^c because they are incompatible with the knowledge that B has occurred. Then P(A∩B) is the total mass of the pebbles remaining in A. Finally, we renormalize, that is, divide all the masses by a constant so that the new total mass of the remaining pebbles is 1. This is achieved by dividing by P(B), the total mass of the pebbles in B. The updated mass of the outcomes corresponding to event A is the conditional probability P(A|B) = P(A∩B)/P(B).

slide-89
SLIDE 89

Conditional Probability

❑ If we interpret probability as relative frequency:
➢ P(A|B) should be the relative frequency of the event A∩B in experiments where B occurred.
➢ Suppose that the experiment is performed n times, and suppose that event B occurs $n_B$ times, and that event A∩B occurs $n_{A \cap B}$ times. The relative frequency of interest is then:

$\frac{n_{A \cap B}}{n_B} = \frac{n_{A \cap B}/n}{n_B/n} \to \frac{P(A \cap B)}{P(B)}$

➢ where we have implicitly assumed that P(B) > 0.

slide-90
SLIDE 90

Conditional Probability

❑ Example 1. Given the figure below, obtain P(A|B)

slide-91
SLIDE 91

Conditional Probability

❑ Example 2. A ball is selected from an urn containing two black balls, numbered 1 and 2, and two white balls, numbered 3 and 4. The number and color of the ball are noted, so the sample space is {(1,b), (2,b), (3,w), (4,w)}. Assuming that the four outcomes are equally likely, find P(A|B) and P(A|C), where A, B, and C are the following events:

slide-92
SLIDE 92

Conditional Probability

❑ Example 3. From all families with three children, we select one family at random. What is the probability that the children are all boys, if we know that a) the first one is a boy, and b) at least one is a boy? (Assume that each child is a boy or a girl with probability 1/2, independently of each other.)
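Because the eight birth sequences are equally likely under the stated assumptions, both conditional probabilities reduce to counting within the conditioning event. A short enumeration sketch:

```python
from fractions import Fraction
from itertools import product

# All birth orders for three children; each is equally likely.
families = set(product("BG", repeat=3))

all_boys = {f for f in families if f == ("B", "B", "B")}
first_is_boy = {f for f in families if f[0] == "B"}
at_least_one_boy = {f for f in families if "B" in f}

# P(A|B) = |A ∩ B| / |B| for equally likely outcomes.
p_given_first = Fraction(len(all_boys & first_is_boy), len(first_is_boy))
p_given_any = Fraction(len(all_boys & at_least_one_boy), len(at_least_one_boy))
print(p_given_first, p_given_any)
```

The two answers differ because "the first one is a boy" leaves four sequences, while "at least one is a boy" leaves seven.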

slide-93
SLIDE 93

Conditional Probability

❑ Example 4. A card is drawn at random from a deck of 52 cards. What is the probability that it is a King or a 2, given that it is a face card (J, Q, K)?

slide-94
SLIDE 94

Total Probability Theorem and Bayes’ Rule

❑ If we multiply both sides of the definition of P(A|B) by P(B) we obtain:

P(A ∩ B) = P(A|B) P(B)

❑ Similarly, if we multiply both sides of the definition of P(B|A) by P(A) we obtain:

P(B ∩ A) = P(B|A) P(A)

slide-95
SLIDE 95

Total Probability Theorem and Bayes’ Rule

❑ Joint Probability of Two Events. For any events A and B with positive probabilities:

P(A ∩ B) = P(B) P(A|B) = P(A) P(B|A)

❑ Joint Probability of Three Events

P(A∩B∩C) = P(A) P(B|A) P(C|A∩B)
P(A1∩A2∩A3) = P(A1) P(A2|A1) P(A3|A1∩A2)

slide-96
SLIDE 96

Total Probability Theorem and Bayes’ Rule

❑ Applying this rule repeatedly, we can generalise to the intersection of n events.

slide-97
SLIDE 97

Total Probability Theorem and Bayes’ Rule

slide-98
SLIDE 98

Total Probability Theorem

❑ Total Probability Theorem:

slide-99
SLIDE 99

Total Probability Theorem

❑ P(B) = P(A1) P(B|A1) + · · · + P(An) P(B|An)
➢ The Ai partition the sample space; P(B) is equal to the sum of the probabilities P(Ai ∩ B) = P(Ai) P(B|Ai).

The probability that B occurs is a weighted average of its conditional probability under each scenario, where each scenario is weighted according to its (unconditional) probability.

slide-100
SLIDE 100

Total Probability Theorem

slide-101
SLIDE 101

Total Probability Theorem

❑ Example 1. Radar detection. If an aircraft is present in a certain area, a radar detects it and generates an alarm signal with probability 0.99. If an aircraft is not present, the radar generates a (false) alarm with probability 0.10. We assume that an aircraft is present with probability 0.05.
❑ What is the probability of no aircraft presence and a false alarm?
❑ What is the probability of aircraft presence and no detection?
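Both questions are joint probabilities, so each follows from the multiplication rule P(A ∩ B) = P(A) P(B|A) with the numbers given in the example. The sketch below also adds the overall alarm probability via the total probability theorem; the variable names are illustrative.

```python
from fractions import Fraction

p_aircraft = Fraction(5, 100)               # P(aircraft present)
p_alarm_given_aircraft = Fraction(99, 100)  # P(alarm | present)
p_alarm_given_none = Fraction(10, 100)      # P(alarm | not present)

# P(not present and false alarm) = P(not present) * P(alarm | not present)
p_none_and_false_alarm = (1 - p_aircraft) * p_alarm_given_none

# P(present and no detection) = P(present) * P(no alarm | present)
p_aircraft_and_missed = p_aircraft * (1 - p_alarm_given_aircraft)

# Total probability theorem over the partition {present, not present}.
p_alarm = p_aircraft * p_alarm_given_aircraft + p_none_and_false_alarm

print(float(p_none_and_false_alarm), float(p_aircraft_and_missed), float(p_alarm))
```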

slide-102
SLIDE 102

Total Probability Theorem

Sequential representation in a tree diagram

slide-103
SLIDE 103

Total Probability Theorem

Sequential representation in a tree diagram

slide-104
SLIDE 104

CIC

❑ Example 2. Picking Balls from Urns. Suppose we have two urns, with the first one containing 2 white and 6 black balls, and the second one containing 2 white and 2 black balls. We pick an urn at random, and then pick a ball from the chosen urn at random.

❑ What is the probability of picking a white ball?
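The urn choice partitions the sample space, so the total probability theorem applies directly. A small sketch with exact fractions, assuming "at random" means each urn is chosen with probability 1/2 and each ball in the chosen urn is equally likely:

```python
from fractions import Fraction

# Contents of each urn as stated in the example: (white, black).
urns = {"urn 1": (2, 6), "urn 2": (2, 2)}

# Assumption: each urn is equally likely to be picked.
p_urn = Fraction(1, len(urns))

# Total probability theorem: sum over urns of P(urn) * P(colour | urn).
p_white = sum(p_urn * Fraction(w, w + b) for w, b in urns.values())
p_black = sum(p_urn * Fraction(b, w + b) for w, b in urns.values())

print(p_white, p_black)
```

This gives P(white) = (1/2)(2/8) + (1/2)(2/4) = 3/8, and the complementary colour has probability 5/8.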

Total Probability Theorem

slide-105
SLIDE 105

CIC

Total Probability Theorem

Tree diagram

❑ What is the probability of picking a black ball?

slide-106
SLIDE 106

CIC

❑ Dealing Three Cards. From a deck of 52 cards three are drawn without replacement.

❑ What is the probability of the event E of getting two Aces and one King in any order?

❑ Denote the relevant outcomes by A, K and O (for “other”).
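One way to compute P(E) is to list the orderings that make up E (AAK, AKA, KAA), apply the multiplication rule to each, and add the results. The sketch below does this with exact fractions and cross-checks the answer by counting hands; the helper name `sequence_probability` is illustrative.

```python
from fractions import Fraction
from itertools import permutations
from math import comb

def sequence_probability(seq):
    """Probability of drawing the cards in `seq` in that exact order,
    without replacement, via the multiplication rule."""
    remaining = {"A": 4, "K": 4, "O": 44}  # Aces, Kings, other cards
    total = 52
    prob = Fraction(1)
    for card in seq:
        prob *= Fraction(remaining[card], total)
        remaining[card] -= 1
        total -= 1
    return prob

# Distinct orderings of two Aces and one King: AAK, AKA, KAA.
orderings = sorted(set(permutations("AAK")))
p_event = sum(sequence_probability(s) for s in orderings)

# Cross-check by counting unordered hands.
p_by_counting = Fraction(comb(4, 2) * comb(4, 1), comb(52, 3))

print(orderings)
print(p_event, float(p_event))
```

Each of the three orderings has probability (4 · 3 · 4)/(52 · 51 · 50), so P(E) = 3 · 48/132600 = 6/5525.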

Total Probability Theorem

slide-107
SLIDE 107

CIC

Total Probability Theorem

slide-108
SLIDE 108

CIC

Total Probability Theorem

slide-109
SLIDE 109

CIC


Bayes’ Rule

slide-110
SLIDE 110

CIC

❑ To verify Bayes’ rule, by the definition of conditional probability:

P(Ai|B) = P(Ai ∩ B) / P(B) = P(Ai) P(B|Ai) / P(B)

❑ P(B) follows from the total probability theorem:

P(B) = P(A1) P(B|A1) + · · · + P(An) P(B|An)


Bayes’ Rule

slide-111
SLIDE 111

CIC


Bayes’ Rule

slide-112
SLIDE 112

CIC


Bayes’ Rule

slide-113
SLIDE 113

CIC

❑ Example 1. Rare disease. A test for a rare disease is assumed to be correct 95% of the time: if a person has the disease, the test results are positive with probability 0.95, and if the person does not have the disease, the results are negative with probability 0.95. A random person drawn from a certain population has probability 0.001 of having the disease. Given that the person just tested positive, what is the probability of having the disease?

➢ A = {“the person has the disease”}
➢ B = {“the test results are positive”}
➢ P(A|B) = ?
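Bayes’ rule with the two-event partition {A, not A} answers the question. A minimal sketch using only the numbers given in the example (variable names are illustrative):

```python
# Values stated in the example.
p_disease = 0.001            # P(A): prior probability of having the disease
p_pos_given_disease = 0.95   # P(B|A): test is positive when disease is present
p_neg_given_healthy = 0.95   # test is negative when disease is absent

# False-positive rate: P(B | not A).
p_pos_given_healthy = 1 - p_neg_given_healthy

# Total probability theorem for P(B).
p_positive = (p_disease * p_pos_given_disease
              + (1 - p_disease) * p_pos_given_healthy)

# Bayes' rule: P(A|B) = P(A) P(B|A) / P(B).
p_disease_given_positive = p_disease * p_pos_given_disease / p_positive

print(round(p_disease_given_positive, 4))
```

The posterior is 0.00095 / 0.0509, a little under 2%, because false positives among the many healthy people far outnumber true positives among the few sick ones.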


Bayes’ Rule

slide-114
SLIDE 114

CIC


Bayes’ Rule

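For a rare disease we need a much more accurate test: with the numbers above, P(A|B) is only about 0.0187 even though the test is correct 95% of the time. The probability of a false positive result must be of a lower order of magnitude than the fraction of people with the disease.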

slide-115
SLIDE 115

CIC

❑ Example 2. Random coin. You have one fair coin, and one biased coin which lands Heads with probability 3/4. You pick one of the coins at random and flip it three times. It lands Heads all three times. Given this information, what is the probability that the coin you picked is the fair one?
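The calculation can be sketched with exact fractions. The event names follow the slides (F = the fair coin was picked, A = three Heads in a row); the sketch assumes the three flips are independent given the coin.

```python
from fractions import Fraction

p_fair = Fraction(1, 2)          # P(F): each coin equally likely to be picked
p_heads_fair = Fraction(1, 2)    # P(Heads) for the fair coin
p_heads_biased = Fraction(3, 4)  # P(Heads) for the biased coin
n_flips = 3

# Likelihood of three Heads under each coin
# (assumption: flips are independent given the coin).
p_a_given_fair = p_heads_fair ** n_flips      # (1/2)^3 = 1/8
p_a_given_biased = p_heads_biased ** n_flips  # (3/4)^3 = 27/64

# Total probability theorem for P(A), then Bayes' rule for P(F|A).
p_a = p_fair * p_a_given_fair + (1 - p_fair) * p_a_given_biased
p_fair_given_a = p_fair * p_a_given_fair / p_a

print(p_fair_given_a, float(p_fair_given_a))
```

The exact posterior is 8/35.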


Bayes’ Rule

slide-116
SLIDE 116

CIC


Bayes’ Rule

Here F is the event that we picked the fair coin and A is the event that the coin lands Heads three times. Before flipping the coin, we thought we were equally likely to have picked the fair coin as the biased coin: P(F) = P(F^c) = 1/2. Upon observing three Heads, however, it becomes more likely that we’ve chosen the biased coin than the fair coin, so P(F|A) = 8/35, which is only about 0.23.

slide-117
SLIDE 117

CIC

Bayes’ Rule

slide-118
SLIDE 118

CIC

Bayes’ Rule

slide-119
SLIDE 119

CIC

Bayes’ Rule

slide-120
SLIDE 120

CIC

Bayes’ Rule

slide-121
SLIDE 121

CIC

❑ Independence of two events. Events A and B are independent if

P(A ∩ B) = P(A) P(B)

❑ If P(A) > 0 and P(B) > 0, then this is equivalent to P(A|B) = P(A), and also equivalent to P(B|A) = P(B).
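The definition can be tested by brute-force enumeration on a small sample space. The sketch below uses two fair dice, an illustrative setting that is not from the slides; `prob` and `is_independent` are hypothetical helper names.

```python
from fractions import Fraction
from itertools import product

# Sample space: all 36 equally likely outcomes of rolling two fair dice
# (an illustrative choice, not from the slides).
outcomes = list(product(range(1, 7), repeat=2))

def prob(event):
    """P(event) under the uniform law, where event is a predicate."""
    favourable = sum(1 for o in outcomes if event(o))
    return Fraction(favourable, len(outcomes))

def is_independent(a, b):
    """Check the defining equation P(A ∩ B) = P(A) P(B)."""
    return prob(lambda o: a(o) and b(o)) == prob(a) * prob(b)

def first_even(o):
    return o[0] % 2 == 0

def sum_is_7(o):
    return o[0] + o[1] == 7

def sum_is_8(o):
    return o[0] + o[1] == 8

print(is_independent(first_even, sum_is_7))
print(is_independent(first_even, sum_is_8))
```

"First die is even" is independent of "the sum is 7" (1/12 = 1/2 · 1/6) but not of "the sum is 8" (3/36 ≠ 1/2 · 5/36), so independence depends on the specific events and not just on how they are described.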

Independence

slide-122
SLIDE 122

CIC

❑ Two events are independent if we can obtain the probability of their intersection by multiplying their individual probabilities. Alternatively, A and B are independent if learning that B occurred gives us no information that would change our probabilities for A occurring (and vice versa).

❑ Independence is a symmetric relation: if A is independent of B, then B is independent of A.

Independence

slide-123
SLIDE 123

CIC

❑ Independence is completely different from disjointness. If A and B are disjoint, then P(A ∩ B) = 0, so disjoint events can be independent only if P(A) = 0 or P(B) = 0. Knowing that A occurs tells us that B definitely did not occur, so A clearly conveys information about B, meaning the two events are not independent (except if A or B already has zero probability).

Independence

❑ For example, when tossing a coin once, if a head occurs a tail cannot. Therefore P(T | H) = 0. If a head and a tail were independent, then we should have P(T | H) = P(T) = 1/2, which is not the case.

slide-124
SLIDE 124

CIC

❑ If A and B are independent, then A and B^c are independent, A^c and B are independent, and A^c and B^c are independent.

➢ Proof. Let A and B be independent. Then

P(B^c|A) = 1 − P(B|A) = 1 − P(B) = P(B^c)

so A and B^c are independent. Swapping the roles of A and B, we have that A^c and B are independent. Using the fact that A, B independent implies A, B^c independent, with A^c playing the role of A, we also have that A^c and B^c are independent.

Independence

slide-125
SLIDE 125

CIC

❑ Independence of three events. Events A, B, and C are said to be independent if all of the following equations hold:

➢ P(A ∩ B) = P(A)P(B)
➢ P(A ∩ C) = P(A)P(C)
➢ P(B ∩ C) = P(B)P(C)
➢ P(A ∩ B ∩ C) = P(A)P(B)P(C)
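All four equations are needed: the three pairwise conditions do not imply the fourth. A standard counterexample, chosen here for illustration and not taken from the slides, uses two fair coin tosses with A = "first toss is Heads", B = "second toss is Heads", C = "the two tosses differ".

```python
from fractions import Fraction
from itertools import product

# Sample space: two fair coin tosses, four equally likely outcomes.
outcomes = list(product("HT", repeat=2))

def prob(event):
    """P(event) under the uniform law, where event is a predicate."""
    return Fraction(sum(1 for o in outcomes if event(o)), len(outcomes))

def a(o):
    return o[0] == "H"   # first toss is Heads

def b(o):
    return o[1] == "H"   # second toss is Heads

def c(o):
    return o[0] != o[1]  # the two tosses differ

pairwise_ok = all(
    prob(lambda o: x(o) and y(o)) == prob(x) * prob(y)
    for x, y in [(a, b), (a, c), (b, c)]
)
p_triple = prob(lambda o: a(o) and b(o) and c(o))
triple_ok = p_triple == prob(a) * prob(b) * prob(c)

print(pairwise_ok, triple_ok, p_triple)
```

Every pair satisfies the product rule (1/4 = 1/2 · 1/2), but A ∩ B ∩ C is empty, so P(A ∩ B ∩ C) = 0 ≠ 1/8 and the three events are not independent.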

Independence

slide-126
SLIDE 126

CIC

Independence

slide-127
SLIDE 127

CIC

❑ Independence of many events. For n events A1, A2, …, An to be independent, we require:

➢ any pair to satisfy P(Ai ∩ Aj) = P(Ai)P(Aj) (for i ≠ j),
➢ any triplet to satisfy P(Ai ∩ Aj ∩ Ak) = P(Ai)P(Aj)P(Ak) (for i, j, k distinct),
➢ and similarly for all quadruplets, quintuplets, and so on.

❑ For infinitely many events, we say that they are independent if every finite subset of the events is independent.

Independence

slide-128
SLIDE 128

CIC

❑ Given an event C, the events A and B are said to be conditionally independent if:

P(A ∩ B|C) = P(A|C) P(B|C)

Conditional Independence

slide-129
SLIDE 129

CIC

❑ The previous relation states that if C is known to have occurred, the additional knowledge that B also occurred does not change the probability of A.

❑ The independence of two events A and B with respect to the unconditional probability law does not imply conditional independence, and vice versa.
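A small numerical illustration of the "vice versa" direction, under assumed numbers: pick a fair coin or a coin with P(Heads) = 3/4 at random and toss it twice. With A = "first toss is Heads", B = "second toss is Heads" and C = "the fair coin was picked", A and B are conditionally independent given C but not independent unconditionally.

```python
from fractions import Fraction
from itertools import product

# Assumed setup: one of two coins is picked with probability 1/2 each,
# then tossed twice; tosses are independent given the coin.
p_heads = {"fair": Fraction(1, 2), "biased": Fraction(3, 4)}

# Probability of each outcome (coin, toss 1, toss 2).
weights = {}
for coin, t1, t2 in product(p_heads, "HT", "HT"):
    w = Fraction(1, 2)
    for toss in (t1, t2):
        w *= p_heads[coin] if toss == "H" else 1 - p_heads[coin]
    weights[(coin, t1, t2)] = w

def prob(event):
    return sum(w for o, w in weights.items() if event(o))

def cond_prob(event, given):
    return prob(lambda o: event(o) and given(o)) / prob(given)

def a(o):
    return o[1] == "H"     # first toss is Heads

def b(o):
    return o[2] == "H"     # second toss is Heads

def c(o):
    return o[0] == "fair"  # the fair coin was picked

def a_and_b(o):
    return a(o) and b(o)

cond_independent = cond_prob(a_and_b, c) == cond_prob(a, c) * cond_prob(b, c)
independent = prob(a_and_b) == prob(a) * prob(b)

print(cond_independent, independent)
print(prob(a_and_b), prob(a) * prob(b))
```

Unconditionally, P(A ∩ B) = 13/32 while P(A)P(B) = 25/64: a first Head makes the biased coin more likely, which in turn makes a second Head more likely.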

Conditional Independence

slide-130
SLIDE 130

CIC

❑ Example 2. Reliability.

Independence

➢ pi: probability that unit i is “up”
➢ ui: the ith unit is up; u1, u2, …, un are independent
➢ fi: the ith unit is down; since the ui are independent, the fi are also independent
➢ P(system is up) = ?
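The system diagram did not survive in this transcript, so the layout is unknown. As a hedged sketch, the two textbook cases are shown below: a series system (up only if every unit is up) and a parallel system (up if at least one unit is up). The function names and the example values of pi are assumptions for illustration only.

```python
from math import prod

def p_series_up(p_units):
    """Series system: up only if every unit is up.
    By independence of the ui, P(up) = p1 * p2 * ... * pn."""
    return prod(p_units)

def p_parallel_up(p_units):
    """Parallel system: down only if every unit is down.
    By independence of the fi, P(up) = 1 - (1 - p1) * ... * (1 - pn)."""
    return 1 - prod(1 - p for p in p_units)

# Hypothetical unit reliabilities, not taken from the slides.
p_units = [0.9, 0.8, 0.7]

print(p_series_up(p_units))
print(p_parallel_up(p_units))
```

More complex networks can be handled by repeatedly collapsing series and parallel sub-blocks into single equivalent units.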