

SLIDE 1

PROBABILITY DISTRIBUTIONS

Business Statistics

SLIDE 2

▪ Probability distribution functions (discrete)
▪ Characteristics of a discrete distribution
▪ Example: uniform (discrete) distribution
▪ Example: Bernoulli distribution
▪ Example: binomial distribution
▪ Probability density functions (continuous)
▪ Characteristics of a continuous distribution
▪ Example: uniform (continuous) distribution
▪ Example: normal (or Gaussian) distribution
▪ Example: standard normal distribution
▪ Back to the normal distribution
▪ Approximations to distributions
▪ Old exam question
▪ Further study

CONTENTS

Today we want to speed up. We will skip some slides or postpone a few. Prepare well, we want to start the statistical topics as soon as possible.

SLIDE 3

▪ A sample space is called discrete when its elements can be counted
▪ We will code the elements of a discrete sample space $S$ as $1, 2, 3, \ldots, n$ or $0, 1, 2, \ldots, n-1$
▪ Examples
  ▪ die: $x \in \{1,2,3,4,5,6\}$, so $S = \{1,2,3,4,5,6\}$
  ▪ coin: $x \in \{0,1\}$
  ▪ number of broken TV sets: $x \in \{0,1,2,\ldots\}$

PROBABILITY DISTRIBUTION FUNCTIONS (DISCRETE)

SLIDE 4

Distribution function $P(x) = P(X = x)$
▪ the probability that the (discrete) random variable $X$ assumes the value $x$
▪ alternative notation: $P_X(x)$

PROBABILITY DISTRIBUTION FUNCTIONS (DISCRETE)

Note our convention: capital letters ($X$) for random variables, lowercase letters ($x$) for values

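SLIDE 5

Example
▪ die:

$$P(x) = \begin{cases} \frac{1}{6} & \text{if } x = 1 \\ \frac{1}{6} & \text{if } x = 2 \\ \frac{1}{6} & \text{if } x = 3 \\ \frac{1}{6} & \text{if } x = 4 \\ \frac{1}{6} & \text{if } x = 5 \\ \frac{1}{6} & \text{if } x = 6 \\ 0 & \text{otherwise} \end{cases}$$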

PROBABILITY DISTRIBUTION FUNCTIONS (DISCRETE)

SLIDE 6

Example: flipping a coin 3 times
▪ sample space $S = \{HHH, HHT, HTH, THH, \ldots\}$
▪ define the random variable $X$ = number of heads
▪ distribution function

$$P(x) = \begin{cases} \frac{1}{8} & \text{if } x = 0 \\ \frac{3}{8} & \text{if } x = 1 \\ \frac{3}{8} & \text{if } x = 2 \\ \frac{1}{8} & \text{if } x = 3 \\ 0 & \text{otherwise} \end{cases}$$

▪ or: $P_X(0) = \frac{1}{8}$, $P_X(1) = \frac{3}{8}$, $P_X(2) = \frac{3}{8}$, $P_X(3) = \frac{1}{8}$

PROBABILITY DISTRIBUTION FUNCTIONS (DISCRETE)
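The coin-flip distribution can be reproduced by brute-force enumeration of the sample space. This is a minimal sketch, assuming a fair coin; the names `sample_space`, `counts` and `P` are illustrative and not part of the slides.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Sample space S of three flips: 8 equally likely outcomes (fair coin assumed).
sample_space = ["".join(flips) for flips in product("HT", repeat=3)]

# Random variable X = number of heads in an outcome; count outcomes per value.
counts = Counter(outcome.count("H") for outcome in sample_space)


def P(x):
    """Distribution function P(x) = P(X = x).

    Counter returns 0 for unseen keys, which gives the "0 otherwise" branch.
    """
    return Fraction(counts[x], len(sample_space))


for x in range(4):
    print(x, P(x))
```

Using `Fraction` keeps the probabilities exact, so they can be compared directly with the values 1/8 and 3/8 on the slide.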

SLIDE 7

▪ $P(x)$ is a (discrete) probability distribution function (pdf or PDF)
▪ $P(x) = P(X = x)$ expresses the probability that $X = x$
▪ A random variable $X$ that is distributed with pdf $P$ is written as $X \sim P$
▪ Some properties of the pdf:
  ▪ $0 \le P(x) \le 1$
    ▪ a probability is always between 0 and 1
  ▪ $\sum_{x \in S} P(x) = 1$
    ▪ the probabilities of all elementary outcomes add up to 1

PROBABILITY DISTRIBUTION FUNCTIONS (DISCRETE)

SLIDE 8

▪ A pdf may have one or more parameters to denote a collection of different but "similar" pdfs
▪ Example: a regular die with $m$ faces
  ▪ $P(X = x; m) = P_X(x; m) = P(x; m) = \frac{1}{m}$ (for $x = 1, \ldots, m$)
  ▪ $X \sim P(m)$

PROBABILITY DISTRIBUTION FUNCTIONS (DISCRETE)

[Figure: regular dice with $m = 4$, $m = 6$, $m = 8$, $m = 12$ and $m = 20$ faces]

SLIDE 9

In addition to the (discrete) probability distribution function (pdf)
▪ $P(X = x) = P_X(x) = P(x)$

we define the (discrete) cumulative distribution function (cdf or CDF)

$$F(x) = F_X(x) = P(X \le x)$$

and therefore

$$F(x) = \sum_{k=-\infty}^{x} P(X = k) = \sum_{k=-\infty}^{x} P(k)$$

PROBABILITY DISTRIBUTION FUNCTIONS (DISCRETE)

Depending on how we count, you may also start at $k = 0$ or $k = 1$

SLIDE 10

Example
▪ die: $P(X = 2) = \frac{1}{6}$, but $P(X \le 2) = P(X = 1) + P(X = 2) = \frac{1}{3}$
▪ Some properties of the cdf:
  ▪ $F(-\infty) = 0$ and $F(\infty) = 1$
  ▪ monotonically non-decreasing

PROBABILITY DISTRIBUTION FUNCTIONS (DISCRETE)

SLIDE 11

▪ pdf
▪ cdf

PROBABILITY DISTRIBUTION FUNCTIONS (DISCRETE)

SLIDE 12

Expected value of $X$

$$E(X) = \sum_{i=1}^{N} x_i P(X = x_i) = \sum_{i=1}^{N} x_i P(x_i)$$

▪ Example
  ▪ die with $P(1) = P(2) = \cdots = P(6) = \frac{1}{6}$
  ▪ $E(X) = 1 \times \frac{1}{6} + 2 \times \frac{1}{6} + 3 \times \frac{1}{6} + 4 \times \frac{1}{6} + 5 \times \frac{1}{6} + 6 \times \frac{1}{6} = \frac{7}{2} = 3\frac{1}{2}$
▪ Interpretation: mean (average)
  ▪ alternative notation: $\mu$ or $\mu_X$
  ▪ so $E(X) = \mu_X$
▪ Note difference between $\mu$ and the sample mean $\bar{x}$
  ▪ e.g., rolling a specific die $n = 100$ times may return a mean $\bar{x} = 3.72$ or $3.43$
  ▪ while $\mu = 7/2$, always (property of die, property of "population")

CHARACTERISTICS OF A DISCRETE DISTRIBUTION

SLIDE 13

Variance

$$\operatorname{var}(X) = \sum_{i=1}^{N} \left(x_i - E(X)\right)^2 P(x_i)$$

▪ Interpretation: dispersion
  ▪ alternative notation: $\sigma^2$ or $\sigma_X^2$ or $V(X)$
  ▪ so $\operatorname{var}(X) = \sigma_X^2$
▪ Note difference between $\sigma^2$ and the sample variance $s^2$
  ▪ e.g., rolling a specific die 100 times may return a variance $s^2 = 2.86$ or $3.04$
  ▪ while $\sigma^2 = \frac{35}{12}$, always (property of die, property of "population")
▪ And of course: standard deviation $\sigma_X = \sqrt{\operatorname{var}(X)}$

CHARACTERISTICS OF A DISCRETE DISTRIBUTION
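The definitions of expected value and variance can be applied mechanically to the fair die. The following is a small sketch, not part of the slides: the names `P`, `mean` and `variance` are illustrative, and the simulated sample uses an arbitrary seed, so its sample mean and sample variance will differ from the example values quoted above.

```python
import random
import statistics
from fractions import Fraction

# Population: a fair die with P(x) = 1/6 for x = 1..6.
outcomes = range(1, 7)
P = {x: Fraction(1, 6) for x in outcomes}

# E(X) = sum of x_i * P(x_i); var(X) = sum of (x_i - E(X))^2 * P(x_i).
mean = sum(x * P[x] for x in outcomes)
variance = sum((x - mean) ** 2 * P[x] for x in outcomes)
print("population:", mean, variance)

# Sample: 100 simulated rolls (seed chosen arbitrarily for repeatability).
rng = random.Random(1)
rolls = [rng.randint(1, 6) for _ in range(100)]
print("sample:", statistics.mean(rolls), statistics.variance(rolls))
```

The population values are fixed properties of the die, while the sample values change with every new set of rolls.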

SLIDE 14

Transformation rules of random variables $X$ and $Y$
▪ For means:
  ▪ $E(k + X) = k + E(X)$
  ▪ $E(aX) = aE(X)$
  ▪ $E(X + Y) = E(X) + E(Y)$
▪ For variances:
  ▪ $\operatorname{var}(k + X) = \operatorname{var}(X)$
  ▪ $\operatorname{var}(aX) = a^2 \operatorname{var}(X)$
  ▪ if $X$ and $Y$ independent (so if $\operatorname{cov}(X, Y) = 0$):
    ▪ $\operatorname{var}(X + Y) = \operatorname{var}(X) + \operatorname{var}(Y)$
  ▪ if $X$ and $Y$ dependent:
    ▪ $\operatorname{var}(X + Y) = \operatorname{var}(X) + 2\operatorname{cov}(X, Y) + \operatorname{var}(Y)$

CHARACTERISTICS OF A DISCRETE DISTRIBUTION

SLIDE 15

▪ Generalization of fair die:
  ▪ equal probability of integer outcomes from $a$ through $b$
  ▪ conditions: $a, b \in \mathbb{Z}$, $a < b$
  ▪ zero probability elsewhere
  ▪ uniform discrete distribution
▪ pdf:

$$P(x; a, b) = \begin{cases} \frac{1}{b-a+1} & x \in \mathbb{Z} \text{ and } x \in [a, b] \\ 0 & \text{otherwise} \end{cases}$$

▪ Examples:
  ▪ coin: $a = 0$, $b = 1$
  ▪ die: $a = 1$, $b = 6$
▪ Random variable:
  ▪ $X \sim U(a, b)$

EXAMPLE: UNIFORM DISTRIBUTION

SLIDE 16

EXAMPLE: UNIFORM DISTRIBUTION

No need to memorize or even discuss this sheet. Most information is either on the formula sheet or unimportant.

SLIDE 17

▪ Example: choose a random number from 1 through 100 with equal probability and denote it by $X$
  ▪ random variable: $X \sim U(1, 100)$
  ▪ pdf: $P(x) = P(X = x) = \frac{1}{100}$ ($x \in \{1, 2, \ldots, 100\}$)
  ▪ cdf: $F(x) = P(X \le x) = \frac{x}{100}$ ($x \in \{1, 2, \ldots, 100\}$)
  ▪ expected value: $E(X) = 50\frac{1}{2}$
  ▪ variance: $\operatorname{var}(X) = \frac{9999}{12} = 833.25$
▪ Sample ($n = 1000$):
  ▪ values (e.g.): 45, 96, 33, 7, 44, 96, 20, …
  ▪ mean: $\bar{x} = 50.92$ (e.g.)
  ▪ variance: $s_x^2 = 823.25$ (e.g.)

EXAMPLE: UNIFORM DISTRIBUTION
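The population values for $U(1, 100)$ can be verified by direct summation, and a sample of $n = 1000$ can be simulated for comparison. This is an illustrative sketch: the variable names and the seed are assumptions, and the simulated sample statistics will not match the example values 50.92 and 823.25 on the slide.

```python
import random
import statistics
from fractions import Fraction

a, b = 1, 100
values = range(a, b + 1)
p = Fraction(1, b - a + 1)  # P(x) = 1/(b - a + 1) for every integer in [a, b]

# Population characteristics from the definitions of E(X) and var(X).
mean = sum(x * p for x in values)
variance = sum((x - mean) ** 2 * p for x in values)


def F(x):
    """cdf F(x) = P(X <= x) for integer x in [a, b]."""
    return sum(p for k in values if k <= x)


# Simulated sample of size n = 1000 (arbitrary seed).
rng = random.Random(2024)
sample = [rng.randint(a, b) for _ in range(1000)]

print(float(mean), float(variance), F(25))
print(statistics.mean(sample), statistics.variance(sample))
```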

SLIDE 18

Given are two dice, with outcomes $X$ and $Y$.

  • a. Find $E(X + Y)$
  • b. Find $\operatorname{var}(X + Y)$

EXERCISE 1

SLIDE 19

▪ Bernoulli experiment
  ▪ random experiment with 2 discrete outcomes (coin type)
  ▪ head, true, "success", female: $X = 1$
  ▪ tail, false, "fail", male: $X = 0$
  ▪ Bernoulli distribution
▪ Examples:
  ▪ winning a prize in a lottery (buying one ticket)
  ▪ your luggage arrives in time at a destination
▪ Probability of success is parameter $\pi$ (with $0 \le \pi \le 1$)
  ▪ $P(1) = P(X = 1) = \pi$
  ▪ $P(0) = P(X = 0) = 1 - \pi$
▪ Random variable
  ▪ $X \sim \text{Bernoulli}(\pi)$ or $X \sim \text{alt}(\pi)$

EXAMPLE: BERNOULLI DISTRIBUTION

SLIDE 20

▪ Expected value
  ▪ $E(X) = \pi$ (obviously!)
▪ Variance
  ▪ $\operatorname{var}(X) = \pi(1 - \pi)$
  ▪ variance zero when $\pi = 0$ or $\pi = 1$ (obviously!)
  ▪ variance maximal when $\pi = 1 - \pi = \frac{1}{2}$ (obviously!)
▪ pdf:

$$p(x; \pi) = \begin{cases} \pi & \text{if } x = 1 \\ 1 - \pi & \text{if } x = 0 \\ 0 & \text{otherwise} \end{cases}$$

▪ cdf: (not so interesting)

EXAMPLE: BERNOULLI DISTRIBUTION

SLIDE 21

▪ Repeating a Bernoulli experiment $n$ times
  ▪ $X$ is total number of "successes"
  ▪ $P(X = x)$ is probability of $x$ "successes" in sample
  ▪ $X = X_1 + X_2 + \cdots + X_n$
  ▪ where $X_i$ is the outcome of Bernoulli experiment number $i = 1, 2, \ldots, n$
  ▪ $X$ has a binomial distribution

EXAMPLE: BINOMIAL DISTRIBUTION

SLIDE 22

▪ Example
  ▪ flip a coin 10 times: $X$ is number of "heads up"
  ▪ roll 100 dice: $X$ is number of "sixes"
  ▪ produce 1000 TV sets: $X$ is number of broken sets
▪ What is important?
  ▪ the number of repetitions ($n$)
  ▪ the probability of success ($\pi$) per item
  ▪ the constancy of $\pi$
  ▪ the independence of the "experiments"

EXAMPLE: BINOMIAL DISTRIBUTION

SLIDE 23

▪ Expected value
  ▪ $E(X) = n\pi$ (obviously!)
▪ Variance
  ▪ $\operatorname{var}(X) = n\pi(1 - \pi)$
  ▪ minimum (0) when $\pi = 0$ or $\pi = 1$ (obviously!)
  ▪ maximum for given $n$ when $\pi = 1 - \pi = \frac{1}{2}$ (obviously!)
▪ pdf:
  ▪ $p(x; n, \pi) = \frac{n!}{x!(n-x)!} \pi^x (1 - \pi)^{n-x}$ ($x \in \{0, 1, 2, \ldots, n\}$)
▪ cdf:
  ▪ $F(x; n, \pi) = \sum_{k=0}^{x} p(k; n, \pi)$
▪ Random variable:
  ▪ $X \sim \text{bin}(n, \pi)$ or $X \sim \text{binom}(n, \pi)$

EXAMPLE: BINOMIAL DISTRIBUTION

Recall the factorial function: $5! = 5 \times 4 \times 3 \times 2 \times 1$

SLIDE 24

▪ Example:
  ▪ roll 10 dice: what is the distribution of $X$ = number of "sixes"?
▪ What is the probability model?
  ▪ you repeat an experiment 10 times ($n = 10$)
  ▪ with a probability $\pi = \frac{1}{6}$ of success and a probability $1 - \pi = \frac{5}{6}$ of failure per experiment
▪ What is the probability distribution?
  ▪ $X \sim \text{bin}\left(10, \frac{1}{6}\right)$
  ▪ where the random variable $X$ represents the total number of sixes
  ▪ so $X$ is not the outcome of a roll of the die!
▪ $E(X) = 10 \times \frac{1}{6} = 1\frac{2}{3}$
  ▪ so we expect on average $1\frac{2}{3}$ sixes in 10 rolls
▪ $\operatorname{var}(X) = 10 \times \frac{1}{6} \times \frac{5}{6} = \frac{25}{18}$

EXAMPLE: BINOMIAL DISTRIBUTION
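The shortcut formulas $E(X) = n\pi$ and $\operatorname{var}(X) = n\pi(1-\pi)$ can be checked against the general definitions by summing over the binomial pdf. This is a sketch under the assumptions of the example ($n = 10$, $\pi = \frac{1}{6}$); the helper name `binom_pmf` is illustrative.

```python
from fractions import Fraction
from math import comb


def binom_pmf(x, n, pi):
    """Binomial pdf p(x; n, pi) = C(n, x) * pi^x * (1 - pi)^(n - x)."""
    return comb(n, x) * pi**x * (1 - pi) ** (n - x)


n, pi = 10, Fraction(1, 6)
pmf = {x: binom_pmf(x, n, pi) for x in range(n + 1)}

# General definitions of expected value and variance for a discrete pdf.
mean = sum(x * p for x, p in pmf.items())
variance = sum((x - mean) ** 2 * p for x, p in pmf.items())

print(mean, n * pi)                  # both routes give the same expected value
print(variance, n * pi * (1 - pi))   # both routes give the same variance
```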

SLIDE 25

EXAMPLE: BINOMIAL DISTRIBUTION

No need to memorize or even discuss this sheet. Most information is either on the formula sheet or unimportant.

SLIDE 26

▪ Calculating pdf and cdf values
▪ Example: binomial distribution with $n = 8$, $\pi = 0.5$
  ▪ what is $P(3) = P(X = 3)$ (pdf)?
  ▪ what is $F(3) = P(X \le 3)$ (cdf)?
▪ Different methods:
  ▪ using a graphical calculator (not at the exam)
  ▪ using the formula (see next slides)
  ▪ using a table (see next slides)
  ▪ using Excel (see the computer tutorials)
  ▪ using online calculators (figure out for yourself)

EXAMPLE: BINOMIAL DISTRIBUTION

SLIDE 27

▪ pdf using the formula
  ▪ $P(3; 8, 0.5) = \frac{8!}{3!(8-3)!} \, 0.5^3 (1 - 0.5)^{8-3} = 0.2188$
  ▪ or
  ▪ $P(3; 8, 0.5) = \binom{8}{3} 0.5^3 (1 - 0.5)^{8-3} = 0.2188$
  ▪ using the binomial coefficient $\binom{n}{k} = {}_nC_k = \frac{n!}{k!(n-k)!}$

EXAMPLE: BINOMIAL DISTRIBUTION

At the exam, you can just use the tables. Much easier!
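Outside the exam, the formula is easy to evaluate in code. The sketch below computes the same value via the factorial form and via the binomial coefficient; the function name `binom_pmf` is an illustrative choice, not something defined in the course.

```python
from math import comb, factorial


def binom_pmf(x, n, pi):
    """P(x; n, pi) using the binomial coefficient C(n, x)."""
    return comb(n, x) * pi**x * (1 - pi) ** (n - x)


n, x, pi = 8, 3, 0.5

# Factorial form: n! / (x! (n - x)!) * pi^x * (1 - pi)^(n - x)
via_factorials = (
    factorial(n) / (factorial(x) * factorial(n - x)) * pi**x * (1 - pi) ** (n - x)
)
# Binomial-coefficient form.
via_comb = binom_pmf(x, n, pi)

print(via_factorials, via_comb)
```

Both routes evaluate $56 \times 0.5^8 = 0.21875$, which the table rounds to 0.2188.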

SLIDE 28

▪ pdf using the table in Appendix A
  ▪ $P(3; 8, 0.50) = 0.2188$

EXAMPLE: BINOMIAL DISTRIBUTION

SLIDE 29

▪ At the exam: non-cumulative table only
▪ Problem: how to do the cdf?
▪ Use the definition: $F(x) = P(X \le x) = \sum_{k=0}^{x} P(X = k)$
  ▪ $P(X \le 3) = P(X = 0) + P(X = 1) + P(X = 2) + P(X = 3)$
  ▪ use table, four times

EXAMPLE: BINOMIAL DISTRIBUTION

SLIDE 30

▪ Example
  ▪ $F(3; 8, 0.50) = 0.0039 + 0.0313 + 0.1094 + 0.2188 = 0.3634$

EXAMPLE: BINOMIAL DISTRIBUTION

Note that this table gives a pdf, not a cdf
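The "use the table, several times" procedure is just a sum of pdf terms, which can be written as a short function. This is a minimal sketch with illustrative names (`binom_pmf`, `binom_cdf`). Summing the exact terms gives $93/256 \approx 0.3633$, slightly different from 0.3634 because the table entries are rounded to four decimals.

```python
from math import comb


def binom_pmf(k, n, pi):
    """Binomial pdf P(X = k)."""
    return comb(n, k) * pi**k * (1 - pi) ** (n - k)


def binom_cdf(x, n, pi):
    """Binomial cdf F(x) = P(X <= x), summing pdf terms k = 0..x."""
    return sum(binom_pmf(k, n, pi) for k in range(x + 1))


F3 = binom_cdf(3, 8, 0.5)
table_sum = 0.0039 + 0.0313 + 0.1094 + 0.2188  # rounded table entries

# Complement shortcut for a long sum: P(X <= 7) = 1 - P(X = 8).
F7_direct = binom_cdf(7, 8, 0.5)
F7_shortcut = 1 - binom_pmf(8, 8, 0.5)

print(F3, table_sum, F7_direct, F7_shortcut)
```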

SLIDE 31

▪ Note that cdf is $F(x) = P(X \le x)$
  ▪ How to find $P(X < x)$?
    ▪ use $P(X < x) = P(X \le x - 1)$
  ▪ How to find $P(X > x)$?
    ▪ use $P(X > x) = 1 - P(X \le x)$
  ▪ How to find $P(x_1 < X < x_2)$?
    ▪ use $P(x_1 < X < x_2) = P(X < x_2) - P(X \le x_1)$
  ▪ Etc.

EXAMPLE: BINOMIAL DISTRIBUTION

SLIDE 32

▪ Use such rules to efficiently use the (pdf) table ($n = 8$)
  ▪ $P(X \le 7) = P(0) + P(1) + \cdots + P(7)$
▪ Much easier:
  ▪ $P(X \le 7) = 1 - P(8)$

EXAMPLE: BINOMIAL DISTRIBUTION

SLIDE 33

Example:
▪ Context:
  ▪ on average, 20% of the emergency room patients at Greenwood General Hospital lack health insurance
▪ In a random sample of 4 patients, what is the probability that at least 2 will be uninsured?

EXAMPLE: BINOMIAL DISTRIBUTION

SLIDE 34

▪ Binomial model (patient is uninsured or not, $\pi_{\text{uninsured}} = 0.20$)
  ▪ $X$ is number of uninsured patients in sample
  ▪ $P(X \ge 2) = P(X = 2) + P(X = 3) + P(X = 4) = 0.1536 + 0.0256 + 0.0016 = 0.1808$

EXAMPLE: BINOMIAL DISTRIBUTION

Note that this table gives a pdf, not a cdf
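The hospital example can be reproduced in a few lines, both by summing the upper tail directly and by using the complement rule. This is a sketch assuming the binomial model stated on the slide ($n = 4$, $\pi = 0.20$); `binom_pmf` is an illustrative helper.

```python
from math import comb


def binom_pmf(k, n, pi):
    """Binomial pdf P(X = k)."""
    return comb(n, k) * pi**k * (1 - pi) ** (n - k)


n, pi = 4, 0.20

# Direct: P(X >= 2) = P(2) + P(3) + P(4)
direct = sum(binom_pmf(k, n, pi) for k in range(2, n + 1))

# Complement: P(X >= 2) = 1 - P(0) - P(1)
complement = 1 - binom_pmf(0, n, pi) - binom_pmf(1, n, pi)

print(round(direct, 4), round(complement, 4))
```

With only two terms to subtract, the complement route needs one table lookup fewer than the direct sum.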

SLIDE 35

Discrete distributions
▪ probability distribution function (pdf): $P(x) = P(X = x)$
  ▪ probability of obtaining the value $x$

Continuous distributions
▪ the probability of obtaining the value $x$ is 0
▪ define probability density function (pdf): $f(x)$
▪ $P(a \le X \le b) = \int_a^b f(x)\,dx$
  ▪ probability of obtaining a value between $a$ and $b$

PROBABILITY DENSITY FUNCTION (CONTINUOUS)

Compare with the probability distribution function (pdf) $P(X = x)$ for the discrete case.
The red curve is the pdf, $f(x)$.
The integral is the grey area under the pdf.

SLIDE 36

So pdf refers to two distinct but related things:
▪ probability distribution function $P(x)$ (discrete case)
▪ probability density function $f(x)$ (continuous case)

Note also that the dimensions are different
▪ $P$ is a dimensionless probability
▪ example:
  ▪ if $X$ is in kg, the discrete pdf $P(X)$ is dimensionless
  ▪ while the continuous pdf $f(x)$ is in 1/kg

PROBABILITY DENSITY FUNCTION (CONTINUOUS)

Because $\int f(x)\,dx$ should be dimensionless, and $dx$ is in kg

SLIDE 37

In addition to the probability density function ...
▪ $f(x) = f_X(x)$

... we define the cumulative distribution function (cdf or CDF)

$$F(x) = P(X \le x) = \int_{-\infty}^{x} f(y)\,dy$$

Some properties of the cdf:
▪ $F(-\infty) = 0$ and $F(\infty) = 1$
▪ monotonically non-decreasing

PROBABILITY DENSITY FUNCTION (CONTINUOUS)

Compare with $F(x) = P(X \le x) = \sum_{k=-\infty}^{x} P(X = k)$ for the discrete case

[Figure: cdf $F(x)$ plotted against $x$]

SLIDE 38

▪ pdf
▪ cdf

PROBABILITY DENSITY FUNCTION (CONTINUOUS)

$$P(70 \le X \le 75) = \int_{70}^{75} f(x)\,dx$$

$$P(70 \le X \le 75) = F(75) - F(70)$$

SLIDE 39

▪ Expected value

$$E(X) = \int_{-\infty}^{\infty} x f(x)\,dx$$

▪ Example: let $f(x) = 1$ for $x \in [0, 1]$
  ▪ $E(X) = \int_0^1 x\,dx = \left.\frac{1}{2}x^2\right|_0^1 = \frac{1}{2}$
▪ Interpretation: mean (average)
  ▪ alternative notation for $E(X)$: $\mu$ or $\mu_X$

CHARACTERISTICS OF A CONTINUOUS DISTRIBUTION

Compare with $E(X) = \sum_{i=1}^{N} x_i P(x_i)$ for the discrete case

SLIDE 40

▪ Variance

$$\operatorname{var}(X) = \int_{-\infty}^{\infty} \left(x - E(X)\right)^2 f(x)\,dx$$

▪ Interpretation: dispersion
  ▪ alternative notation for $\operatorname{var}(X)$: $\sigma^2$ or $\sigma_X^2$ or $V(X)$

CHARACTERISTICS OF A CONTINUOUS DISTRIBUTION

Compare with $\operatorname{var}(X) = \sum_{i=1}^{N} \left(x_i - E(X)\right)^2 P(x_i)$ for the discrete case

SLIDE 41

▪ Analogy with uniform discrete distribution
  ▪ equal density for all outcomes between $a$ and $b$
  ▪ condition: $a < b$
  ▪ zero probability elsewhere
  ▪ uniform continuous distribution
▪ pdf:

$$f(x; a, b) = \begin{cases} \frac{1}{b-a} & x \in [a, b] \\ 0 & \text{otherwise} \end{cases}$$

▪ or easier: $f(x; a, b) = \frac{1}{b-a}$ ($x \in [a, b]$)
▪ Examples:
  ▪ "standard" uniform deviate: $a = 0$, $b = 1$

EXAMPLE: UNIFORM (CONTINUOUS) DISTRIBUTION

SLIDE 42

Example: let $X$ be exam grade of randomly selected student
▪ assume uniform distribution: $X \sim U(1, 10)$
▪ what is $P(X \ge 6.5)$?

Solution
▪ use $P(X \ge 6.5) = 1 - P(X < 6.5) = 1 - P(X \le 6.5)$
▪ cdf: $P(X \le x) = F(x) = \int_{-\infty}^{x} f(y)\,dy$
▪ uniform continuous with $a = 1$ and $b = 10$
▪ pdf: $f(x) = \frac{1}{9}$ ($x \in [1, 10]$)
▪ cdf: $P(X \le x) = \int_1^x \frac{1}{9}\,dy = \frac{1}{9}(x - 1)$
▪ answer: $P(X \ge 6.5) = 1 - \frac{1}{9}(6.5 - 1) = \frac{3.5}{9} \approx 0.389$
▪ or: area of black rectangle

EXAMPLE: UNIFORM (CONTINUOUS) DISTRIBUTION

For a continuous distribution $P(X < x) = P(X \le x)$ because $P(X = x) = 0$

[Figure: uniform pdf of height $\frac{1}{9}$ on $[1, 10]$; $P(X \ge 6.5)$ is the black area between 6.5 and 10]
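The exam-grade calculation amounts to evaluating the uniform cdf once. A minimal sketch follows; the function name `uniform_cdf` and the clamping outside $[a, b]$ are illustrative additions, not part of the slides.

```python
def uniform_cdf(x, a, b):
    """cdf of the continuous uniform distribution U(a, b)."""
    if x <= a:
        return 0.0
    if x >= b:
        return 1.0
    return (x - a) / (b - a)


a, b = 1, 10

# P(X >= 6.5) = 1 - P(X <= 6.5), since P(X = 6.5) = 0 for a continuous variable.
p_pass = 1 - uniform_cdf(6.5, a, b)

# Same answer as the area of the rectangle: width (10 - 6.5) times height 1/9.
area = (b - 6.5) * (1 / (b - a))

print(round(p_pass, 4), round(area, 4))
```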

SLIDE 43

▪ Expected value
  ▪ $E(X) = \frac{a+b}{2}$
▪ Variance
  ▪ $\operatorname{var}(X) = \frac{(b-a)^2}{12}$ $\left(\int_a^b \left(x - \frac{a+b}{2}\right)^2 \times \frac{1}{b-a}\,dx = \frac{(b-a)^2}{12}\right)$
▪ pdf
  ▪ $f(x) = \frac{1}{b-a}$
▪ cdf
  ▪ $F(x) = \frac{x-a}{b-a}$
▪ Random variable
  ▪ $X \sim U(a, b)$ (other notations are also in use)

EXAMPLE: UNIFORM (CONTINUOUS) DISTRIBUTION

SLIDE 44

▪ pdf
  ▪ $f(x; \mu, \sigma) = \frac{1}{\sigma\sqrt{2\pi}} e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^2}$
▪ cdf
  ▪ $F(x) = \int_{-\infty}^{x} f(y; \mu, \sigma)\,dy = \,???$
▪ Expected value
  ▪ $E(X) = \mu$
▪ Variance
  ▪ $\operatorname{var}(X) = \sigma^2$
▪ Random variable
  ▪ $X \sim N(\mu, \sigma)$ or $X \sim N(\mu, \sigma^2)$

EXAMPLE: NORMAL (OR GAUSSIAN) DISTRIBUTION

In a concrete case indicate the parameter's symbol: $N(12, \sigma = 2)$ or $N(12, \sigma^2 = 4)$

Remember notation $\mu_X$ for expected value and $\sigma_X^2$ for variance. So here $\mu_X = \mu$ and $\sigma_X^2 = \sigma^2$. This is no coincidence!

Here, $\pi = 3.1415\ldots$ (the mathematical constant, not the probability of success)

SLIDE 45

▪ Some characteristics
  ▪ range: $x \in (-\infty, \infty)$
  ▪ pdf has maximum at $x = \mu$
  ▪ pdf is symmetric around $x = \mu$
  ▪ not too interesting for $x < \mu - 3\sigma$ and for $x > \mu + 3\sigma$

EXAMPLE: NORMAL (OR GAUSSIAN) DISTRIBUTION

SLIDE 46

▪ Normal distribution with $\mu = 0$ and $\sigma = 1$
  ▪ so a 0-parameter distribution: standard normal
▪ pdf
  ▪ $f(x) = \frac{1}{\sqrt{2\pi}} e^{-\frac{1}{2}x^2}$
▪ cdf
  ▪ $F(x) = \int_{-\infty}^{x} f(y)\,dy = \,??? = \Phi(x)$
  ▪ with $\Phi(-\infty) = 0$, $\Phi(\infty) = 1$, $\Phi(0) = 0.5$, $\frac{d\Phi}{dx} = f(x)$
▪ Expected value
  ▪ $E(X) = 0$
▪ Variance
  ▪ $\operatorname{var}(X) = 1$
▪ Random variable
  ▪ $X \sim N(0, 1)$, we often write $Z \sim N(0, 1)$

EXAMPLE: STANDARD NORMAL DISTRIBUTION

Remember the trick: if you don't know something, just give it a name

SLIDE 47

▪ Important because any normally distributed variable can be "standardized" to standard normal distribution
▪ Methods for determining the values of $\Phi(x)$:
  ▪ using a graphical calculator (not at the exam)
  ▪ not using a formula (no formula available for $\Phi(x)$)
  ▪ using a table (see next slides)
  ▪ using Excel (see the computer tutorials)
  ▪ using online calculators (figure out for yourself)

EXAMPLE: STANDARD NORMAL DISTRIBUTION

SLIDE 48

▪ Calculating the value of the cdf with a table
  ▪ $P(Z \le 1.36) = \Phi(1.36)$
  ▪ table C-2 (p.768): $P(Z \le 1.36) = 0.9131$

EXAMPLE: STANDARD NORMAL DISTRIBUTION
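Although $\Phi$ has no closed form in elementary functions, software evaluates it numerically. The sketch below shows two routes available in the Python standard library: `statistics.NormalDist` and the error function `math.erf`, using the identity $\Phi(x) = \frac{1}{2}\left(1 + \operatorname{erf}(x/\sqrt{2})\right)$. The helper name `phi` is an illustrative choice.

```python
from math import erf, sqrt
from statistics import NormalDist


def phi(x):
    """Standard normal cdf via the error function."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))


Z = NormalDist()  # mean 0, standard deviation 1 by default

p_lib = Z.cdf(1.36)   # library route
p_erf = phi(1.36)     # error-function route

# The "how to find" rules for other probabilities follow from the cdf:
p_greater = 1 - Z.cdf(1.36)   # P(Z > 1.36)
p_mirror = Z.cdf(-1.36)       # equals P(Z > 1.36) by symmetry

print(round(p_lib, 4), round(p_erf, 4), round(p_greater, 4), round(p_mirror, 4))
```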

SLIDE 49

Note that cdf is $P(Z \le x)$
▪ How to find $P(Z < x)$?
  ▪ use $P(Z \le x)$ (why?)
▪ How to find $P(Z > x)$?
  ▪ use $1 - P(Z \le x)$ (why?)
  ▪ or use $P(Z > x) = P(Z < -x)$ (why?)
▪ How to find $P(Z \ge x)$?
  ▪ is easy now ...
▪ How to find $P(x \le Z \le y)$?
  ▪ use $P(Z \le y) - P(Z \le x)$
▪ Etc.

EXAMPLE: STANDARD NORMAL DISTRIBUTION

Scale for standard normal, but this applies to any continuous distribution

SLIDE 50

▪ Inverse lookup
  ▪ $P(Z \le x) = \Phi(x) = 0.90$
  ▪ table C-2 (p.768): $x \approx 1.28$

EXAMPLE: STANDARD NORMAL DISTRIBUTION

SLIDE 51

No need to know this table by heart... but two values can be convenient to know
▪ $P(Z \le 1.645) = 0.95$: a $z$-value as large as 1.645 or larger occurs only with 5% probability
▪ $P(-1.96 \le Z \le 1.96) = 0.95$: a $z$-value as large as 1.96 or larger, or as small as $-1.96$ or smaller, occurs only with 5% probability
▪ so remember 1.96 and 1.645
▪ (you can always look them up if you forgot or are unsure)

EXAMPLE: STANDARD NORMAL DISTRIBUTION

SLIDE 52

Note: $X \sim N(\mu, \sigma^2) \Leftrightarrow X - \mu \sim N(0, \sigma^2) \Leftrightarrow \frac{X-\mu}{\sigma} \sim N(0, 1)$

▪ Standardization
  ▪ $x \to z = \frac{x-\mu}{\sigma}$ and $X \to Z = \frac{X-\mu}{\sigma}$
▪ If $X \sim N(\mu, \sigma^2)$, how to determine $P(X \le x)$?
  ▪ $P(X \le x) = P(X - \mu \le x - \mu) = P\left(\frac{X-\mu}{\sigma} \le \frac{x-\mu}{\sigma}\right) = P\left(Z \le \frac{x-\mu}{\sigma}\right)$
▪ Example
  ▪ suppose $X \sim N(180, \sigma^2 = 25)$
  ▪ $P(X \le 190) = P\left(Z \le \frac{190-180}{5}\right) = P(Z \le 2) = 0.9772$
  ▪ $P(X \le x) = 0.90 = P\left(Z \le \frac{x-180}{5}\right) \Rightarrow \frac{x-180}{5} = 1.28 \Rightarrow x = 186.4$

BACK TO THE NORMAL DISTRIBUTION

This is our way of doing normalcdf and invnorm if you don't have a graphical calculator!
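The standardization steps can be mirrored in code, with the table lookups replaced by `statistics.NormalDist` from the Python standard library. This is a sketch for the example $X \sim N(180, \sigma^2 = 25)$; the variable names are illustrative.

```python
from statistics import NormalDist

mu, sigma = 180, 5   # sigma^2 = 25, so sigma = 5
Z = NormalDist()     # standard normal N(0, 1)

# Forward lookup: P(X <= 190) = P(Z <= (190 - mu) / sigma)
z = (190 - mu) / sigma
p = Z.cdf(z)

# Inverse lookup: find x with P(X <= x) = 0.90, so z = Phi^{-1}(0.90) and x = mu + sigma * z
z90 = Z.inv_cdf(0.90)
x90 = mu + sigma * z90

# Cross-check without manual standardization.
X = NormalDist(mu, sigma)
p_direct = X.cdf(190)
x90_direct = X.inv_cdf(0.90)

print(round(p, 4), round(z90, 2), round(x90, 1))
```

The unrounded quantile is about 1.2816, so the table value 1.28 and the resulting 186.4 are rounded results.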

SLIDE 53

▪ What is "normal" about the normal distribution?
  ▪ it has quite a weird pdf formula
  ▪ and an even weirder cdf formula
▪ But
  ▪ it is unimodal
  ▪ it is symmetric
  ▪ very often empirical distributions "look" normal
  ▪ a quantity is approximately normal if it is influenced by many additive factors, none of which is dominating
  ▪ several statistics (mean, sum, ...) are normally distributed
▪ You'll learn that soon
  ▪ when we discuss the Central Limit Theorem (CLT)

BACK TO THE NORMAL DISTRIBUTION

SLIDE 54

▪ Scaling
  ▪ If $X \sim N(\mu_X, \sigma_X^2)$ then $aX + b \sim N(a\mu_X + b, a^2\sigma_X^2)$
▪ Additivity
  ▪ If $X \sim N(\mu_X, \sigma_X^2)$ and $Y \sim N(\mu_Y, \sigma_Y^2)$ and $X, Y$ independent, then $X + Y \sim N(\mu_X + \mu_Y, \sigma_X^2 + \sigma_Y^2)$

PROPERTIES OF THE NORMAL DISTRIBUTION

[Figure: pdf of $X$ and pdf of $0.825X + 11$]

SLIDE 55

Sometimes, we can approximate a "difficult" distribution by a "simpler" one
▪ Important case: binomial → normal
  ▪ example 1: flipping a coin ($\pi = 0.50$, $X$ = #heads) very often

APPROXIMATIONS TO DISTRIBUTIONS

SLIDE 56

▪ But also when $\pi \ne 0.50$
  ▪ example 2: flipping a biased coin ($\pi = 0.30$, $X$ = #heads) very often

APPROXIMATIONS TO DISTRIBUTIONS

[Figure: three panels with $n = 10$, $n = 20$ and $n = 40$, each with $\pi = 0.30$]

SLIDE 57

▪ binomial → normal
  ▪ $\text{bin}(n, \pi) \to N(\mu, \sigma^2)$
  ▪ using $\mu = \,???$ and $\sigma^2 = \,???$

We know that when $X \sim \text{bin}(n, \pi)$
▪ $E(X) = n\pi$
▪ $\operatorname{var}(X) = n\pi(1 - \pi)$

So, replace
▪ $\mu = n\pi$
▪ $\sigma^2 = n\pi(1 - \pi)$

So,
▪ $\text{bin}(n, \pi) \to N(n\pi, n\pi(1 - \pi))$
▪ rule: allowed when $n\pi \ge 5$ and $n(1 - \pi) \ge 5$

APPROXIMATIONS TO DISTRIBUTIONS

The book says $\ge 10$ instead of $\ge 5$

SLIDE 58

▪ Example binomial → normal
  ▪ roll a die $n = 900$ times
  ▪ study the occurrence of "sixes" (so $\pi = \frac{1}{6}$)
  ▪ what is the probability of no more than 170 "sixes"?
▪ Exact: $P_{\text{bin}(n=900;\,\pi=1/6)}(X \le 170) = \,?$
▪ Two problems:
  ▪ need to add 171 pdf-terms ($P(X = 0)$ until $P(X = 170)$)
  ▪ 900! gives an ERROR
▪ Approximation: $P_{N(\mu=150;\,\sigma^2=125)}(X \le 170) = P_Z\left(Z \le \frac{170-150}{\sqrt{125}}\right) = \Phi_Z(1.7888) \approx 0.9631$

APPROXIMATIONS TO DISTRIBUTIONS

$900 \times \frac{1}{6} = 150$

$900 \times \frac{1}{6} \times \left(1 - \frac{1}{6}\right) = 125$

SLIDE 59

▪ Now take $X \sim \text{bin}(18, 0.5)$
  ▪ In a "binomial" context $P(X \le 11) = P(X < 12)$
  ▪ But in a "normal" context $P(X \le 11) = P(X < 11)$
▪ So, take care about using integers
▪ Safest: go half-way: $P(X \le 11.5) = P(X < 11.5)$
▪ This is the continuity correction

APPROXIMATIONS TO DISTRIBUTIONS

SLIDE 60

The intuitive notion of the continuity correction
▪ when approximating a discrete distribution by a continuous distribution

APPROXIMATIONS TO DISTRIBUTIONS

$P_{\text{bin}}(X \le 7) \approx P_N\left(X \le 7\frac{1}{2}\right)$

$P_{\text{bin}}(X \ge 7) \approx P_N\left(X \ge 6\frac{1}{2}\right)$

SLIDE 61

Improving previous result
▪ without continuity correction
  ▪ $P_{\text{bin}(n=900;\,\pi=1/6)}(X \le 170) \approx P_{N(\mu=150;\,\sigma^2=125)}(X \le 170) = P_Z\left(Z \le \frac{170-150}{\sqrt{125}}\right) = \Phi_Z(1.788) \approx 0.9631$
▪ with continuity correction
  ▪ $P_{\text{bin}(n=900;\,\pi=1/6)}(X \le 170) \approx P_{N(\mu=150;\,\sigma^2=125)}(X \le 170.5) = P_Z\left(Z \le \frac{170.5-150}{\sqrt{125}}\right) = \Phi_Z(1.833) \approx 0.9664$

APPROXIMATIONS TO DISTRIBUTIONS
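With software, the exact binomial probability is within reach, because Python integers do not overflow on large factorials. The sketch below computes the exact value with integer arithmetic and places both normal approximations next to it. The variable names are illustrative, and the approach (summing all 171 terms as integers over the common denominator $6^{900}$) is one possible choice, not the method of the course.

```python
from math import comb, sqrt
from statistics import NormalDist

n, k_max = 900, 170
mu = n / 6                 # n * pi = 150
sigma = sqrt(n / 6 * 5 / 6)  # sqrt(n * pi * (1 - pi)) = sqrt(125)

# Exact: sum of C(n, k) * 1^k * 5^(n - k) over k = 0..170, divided by 6^n.
numerator = sum(comb(n, k) * 5 ** (n - k) for k in range(k_max + 1))
exact = numerator / 6**n   # true division of big integers gives a float

Z = NormalDist()
plain = Z.cdf((k_max - mu) / sigma)            # without continuity correction
corrected = Z.cdf((k_max + 0.5 - mu) / sigma)  # with continuity correction

print(f"exact     = {exact:.4f}")
print(f"plain     = {plain:.4f}")
print(f"corrected = {corrected:.4f}")
```

The printed approximations can differ in the last digit from the slide values, which come from a table with $z$ rounded to a few decimals.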

SLIDE 62

30 June 2014, Q1d

OLD EXAM QUESTION

SLIDE 63

30 June 2014, Q1f

OLD EXAM QUESTION

SLIDE 64

▪ Doane & Seward 5/E
  ▪ 6.1-6.4, 6.8, 7.1-7.5
▪ Tutorial exercises week 1
  ▪ discrete probability distributions
  ▪ continuous probability distributions
  ▪ expectation and variance

FURTHER STUDY