Categorical data Reasoning by diagrams R.W. Oldford Crossed data - - - PowerPoint PPT Presentation



slide-1
SLIDE 1

Categorical data

Reasoning by diagrams R.W. Oldford

slide-2
SLIDE 2

Crossed data - tables

The main data structure for crossed categorical data is a table.

Each variate has a finite number of values (categories)

city <- c("Kitchener", "Waterloo")
housing <- c("House", "Apartment", "Residence")

All combinations of one value from each variate are possible (crossed) and we have the number of times each combination occurs

# fake data
counts <- rpois(6, lambda = 50)

Arranged in a rectangular array:

vacancy <- matrix(counts, nrow = length(city), ncol = length(housing), byrow = TRUE,
                  dimnames = list(city = city, housing = housing))

And now coerced to be an object of class table

vacancy <- as.table(vacancy)
vacancy
##            housing
## city        House Apartment Residence
##   Kitchener    52        53        46
##   Waterloo     47        64        43

slide-3
SLIDE 3

Crossed data - tables

The table can be a many-way array from crossing many categorical variates

term <- c("Fall", "Winter", "Spring")
# more fake counts
counts <- seq(from = 10, to = 180, by = 10)
vacancy <- array(counts,
                 dim = c(length(city), length(housing), length(term)),
                 dimnames = list(city = city, housing = housing, term = term))
vacancy <- as.table(vacancy)
vacancy
## , , term = Fall
##
##            housing
## city        House Apartment Residence
##   Kitchener    10        30        50
##   Waterloo     20        40        60
##
## , , term = Winter
##
##            housing
## city        House Apartment Residence
##   Kitchener    70        90       110
##   Waterloo     80       100       120
##
## , , term = Spring
##
##            housing
## city        House Apartment Residence
##   Kitchener   130       150       170
##   Waterloo    140       160       180

Note when filling the array, the earlier indices change more quickly than do the later indices.
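The fill order noted above can be checked on a small array. This is a minimal sketch with made-up values (the array `a` and matrix `m` are illustrative names, not from the slides):

```r
# Fill a 2 x 3 x 4 array with 1..24 and see where consecutive values land.
a <- array(1:24, dim = c(2, 3, 4))

a[1, 1, 1]  # 1
a[2, 1, 1]  # 2: the first index moves first
a[1, 2, 1]  # 3: then the second index
a[1, 1, 2]  # 7: the third index moves last, after 2 * 3 = 6 cells

# matrix(..., byrow = TRUE) fills rows first instead (two-way case only)
m <- matrix(1:6, nrow = 2, byrow = TRUE)
m[1, 2]     # 2
```

This is why the three-way `vacancy` table above shows 10 and 20 in the first column of the Fall slab, while the two-way version built with `byrow = TRUE` fills across each row.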

slide-4
SLIDE 4

Crossed data - tables

The order of dimensions can be rearranged - the R function aperm(...)

aperm(vacancy, perm=c(3,2,1))
## , , city = Kitchener
##
##         housing
## term     House Apartment Residence
##   Fall      10        30        50
##   Winter    70        90       110
##   Spring   130       150       170
##
## , , city = Waterloo
##
##         housing
## term     House Apartment Residence
##   Fall      20        40        60
##   Winter    80       100       120
##   Spring   140       160       180
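To see what `perm` does, it may help to check a single cell before and after permuting. A small sketch on a made-up array (`x`, `y` and the labels are illustrative only):

```r
# A labelled 2 x 3 x 4 array with dimensions named A, B, C.
x <- array(1:24, dim = c(2, 3, 4),
           dimnames = list(A = c("a1", "a2"),
                           B = c("b1", "b2", "b3"),
                           C = paste0("c", 1:4)))

# perm = c(3, 2, 1): old dimension 3 becomes the first, old dimension 1 the last.
y <- aperm(x, perm = c(3, 2, 1))

dim(y)               # 4 3 2
names(dimnames(y))   # "C" "B" "A"

# The same cell is reached with the indices in the new order.
y["c4", "b2", "a1"] == x["a1", "b2", "c4"]   # TRUE
```

No counts change; only the order in which the variates index the table.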

slide-5
SLIDE 5

Crossed data - constructing tables from data

Have an existing dataframe with categorical variates

SAheart[1:3,]
##   sbp tobacco  ldl adiposity famhist typea obesity alcohol age chd
## 1 160   12.00 5.73     23.11 Present    49   25.30   97.20  52   1
## 2 144    0.01 4.41     28.61  Absent    55   28.87    2.06  63   1
## 3 118    0.08 3.48     32.28 Present    52   29.14    3.81  46   0

Create the table directly from individual factors (like famhist) or unique values (like chd):

table(SAheart$chd, SAheart$famhist, dnn = c("chd", "famhist"))
##    famhist
## chd Absent Present
##   0    206      96
##   1     64      96

Or, by cross-tabulation (“cross tabs” or xtabs)

xtabs( ~ chd + famhist, data = SAheart) # Note formula
##    famhist
## chd Absent Present
##   0    206      96
##   1     64      96
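The two constructions can be compared on a toy data frame. The sketch below uses a hypothetical six-row data frame `toy` (not SAheart) so that it runs without extra packages:

```r
# Hypothetical toy data: a 0/1 variate and a factor, mimicking chd and famhist.
toy <- data.frame(
  chd     = c(0, 1, 0, 1, 1, 0),
  famhist = factor(c("Absent", "Present", "Absent",
                     "Absent", "Present", "Present"))
)

# Direct construction from the two columns.
t1 <- table(toy$chd, toy$famhist, dnn = c("chd", "famhist"))

# Formula interface: nothing on the left means "count the rows".
t2 <- xtabs(~ chd + famhist, data = toy)

all(t1 == t2)   # TRUE

# xtabs can also rebuild a table from a column of counts.
long <- as.data.frame(t1, responseName = "Freq")
t3 <- xtabs(Freq ~ chd + famhist, data = long)
```

The formula form is convenient when the counts are already stored in long format, one row per combination.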

slide-6
SLIDE 6

Crossed data - working with tables

Consider the three-way table (a 4 x 4 x 2 array) HairEyeColor:

## , , Sex = Male
##
##        Eye
## Hair    Brown Blue Hazel Green
##   Black    32   11    10     3
##   Brown    53   50    25    15
##   Red      10   10     7     7
##   Blond     3   30     5     8
##
## , , Sex = Female
##
##        Eye
## Hair    Brown Blue Hazel Green
##   Black    36    9     5     2
##   Brown    66   34    29    14
##   Red      16    7     7     7
##   Blond     4   64     5     8

The names of its variates (dimnames) are, in order:

names(dimnames(HairEyeColor))
## [1] "Hair" "Eye"  "Sex"

These names, and the category labels within each, are used to create interesting sub-tables or alternative tables.

slide-7
SLIDE 7

Crossed data - working with tables

Selecting slices (conditioning)

HairEyeColor["Black",,]
##        Sex
## Eye     Male Female
##   Brown   32     36
##   Blue    11      9
##   Hazel   10      5
##   Green    3      2

HairEyeColor[,"Green",]
##        Sex
## Hair    Male Female
##   Black    3      2
##   Brown   15     14
##   Red      7      7
##   Blond    8      8

HairEyeColor["Black","Blue",]
##   Male Female
##     11      9

HairEyeColor["Black","Green","Male"]
## [1] 3

slide-8
SLIDE 8

Crossed data - working with tables

Collapsing dimensions (marginalizing, projecting)

# Zero dimensional
margin.table(HairEyeColor)
## [1] 592

# 1 dimensional -- here margin 1 ("Hair") is preserved
margin.table(HairEyeColor, margin=1)
## Hair
## Black Brown   Red Blond
##   108   286    71   127

# 2 dimensional -- here margins 1 and 2 ("Hair", "Eye") are preserved
margin.table(HairEyeColor, margin=c(1,2))
##        Eye
## Hair    Brown Blue Hazel Green
##   Black    68   20    15     5
##   Brown   119   84    54    29
##   Red      26   17    14    14
##   Blond     7   94    10    16

# Note: except for the 0 dimensional case, these are the same as using "apply" with "sum"
apply(HairEyeColor, MARGIN=1, FUN=sum)
## Black Brown   Red Blond
##   108   286    71   127
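The equivalence with `apply` also holds for the two-dimensional margin. A short check (the names `m12_a` and `m12_b` are illustrative):

```r
# Collapse over Sex two ways and compare cell by cell.
m12_a <- margin.table(HairEyeColor, margin = c(1, 2))
m12_b <- apply(HairEyeColor, MARGIN = c(1, 2), FUN = sum)

all(m12_a == m12_b)           # TRUE

# With no margin preserved, margin.table() is just the grand total.
margin.table(HairEyeColor) == sum(HairEyeColor)   # TRUE
```

The main practical difference is the class of the result: `margin.table` returns a table, while `apply` returns a plain matrix or vector.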

slide-9
SLIDE 9

Crossed data - working with tables

Summing along every margin (new variate value Sum for each variate)

# Every margin is summed
addmargins(HairEyeColor)
## , , Sex = Male
##
##        Eye
## Hair    Brown Blue Hazel Green Sum
##   Black    32   11    10     3  56
##   Brown    53   50    25    15 143
##   Red      10   10     7     7  34
##   Blond     3   30     5     8  46
##   Sum      98  101    47    33 279
##
## , , Sex = Female
##
##        Eye
## Hair    Brown Blue Hazel Green Sum
##   Black    36    9     5     2  52
##   Brown    66   34    29    14 143
##   Red      16    7     7     7  37
##   Blond     4   64     5     8  81
##   Sum     122  114    46    31 313
##
## , , Sex = Sum
##
##        Eye
## Hair    Brown Blue Hazel Green Sum
##   Black    68   20    15     5 108
##   Brown   119   84    54    29 286
##   Red      26   17    14    14  71
##   Blond     7   94    10    16 127
##   Sum     220  215    93    64 592

slide-10
SLIDE 10

Crossed data - working with tables

Summing along a single margin

# Just produce marginal sums over dimension 2 ("Eye") values
# for each pair (i, k) of remaining variates "Hair" and "Sex"
addmargins(HairEyeColor, margin=2)
## , , Sex = Male
##
##        Eye
## Hair    Brown Blue Hazel Green Sum
##   Black    32   11    10     3  56
##   Brown    53   50    25    15 143
##   Red      10   10     7     7  34
##   Blond     3   30     5     8  46
##
## , , Sex = Female
##
##        Eye
## Hair    Brown Blue Hazel Green Sum
##   Black    36    9     5     2  52
##   Brown    66   34    29    14 143
##   Red      16    7     7     7  37
##   Blond     4   64     5     8  81

slide-11
SLIDE 11

Crossed data - working with tables

Summing along two margins

# Produce marginal sums over both dimensions 1 and 2 ("Hair" and "Eye")
# for each value of "Sex"
addmargins(HairEyeColor, margin=c(1,2))
## , , Sex = Male
##
##        Eye
## Hair    Brown Blue Hazel Green Sum
##   Black    32   11    10     3  56
##   Brown    53   50    25    15 143
##   Red      10   10     7     7  34
##   Blond     3   30     5     8  46
##   Sum      98  101    47    33 279
##
## , , Sex = Female
##
##        Eye
## Hair    Brown Blue Hazel Green Sum
##   Black    36    9     5     2  52
##   Brown    66   34    29    14 143
##   Red      16    7     7     7  37
##   Blond     4   64     5     8  81
##   Sum     122  114    46    31 313

slide-13
SLIDE 13

Crossed data - working with tables

Proportions (depends on which margin is fixed)

# No margins fixed, just total ... single multinomial
round(prop.table(HairEyeColor), 3)
## , , Sex = Male
##
##        Eye
## Hair    Brown  Blue Hazel Green
##   Black 0.054 0.019 0.017 0.005
##   Brown 0.090 0.084 0.042 0.025
##   Red   0.017 0.017 0.012 0.012
##   Blond 0.005 0.051 0.008 0.014
##
## , , Sex = Female
##
##        Eye
## Hair    Brown  Blue Hazel Green
##   Black 0.061 0.015 0.008 0.003
##   Brown 0.111 0.057 0.049 0.024
##   Red   0.027 0.012 0.012 0.012
##   Blond 0.007 0.108 0.008 0.014

Possible generative model: multinomial. Here counts $n_{ijk}$ have fixed total

$$n = n_{+++} = \sum_{ijk} n_{ijk} = 592.$$

$$\Pr(\text{Data}) = \binom{n}{n_{111}\; n_{211}\; \cdots\; n_{442}} \, p_{111}^{n_{111}} \cdots p_{442}^{n_{442}}$$

with $p_{+++} = \sum_{i=1}^{4} \sum_{j=1}^{4} \sum_{k=1}^{2} p_{ijk} = 1$.
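The single-multinomial probability can be evaluated with base R's `dmultinom`. The sketch below plugs in the observed proportions for the $p_{ijk}$; that choice is an assumption made for illustration only, since the slides do not specify the probabilities:

```r
# The 32 cell counts n_ijk, in R's storage order.
n_cells <- as.vector(HairEyeColor)
n_total <- sum(n_cells)                 # the fixed total, 592

# Illustrative cell probabilities: the observed proportions (an assumption).
p_cells <- n_cells / n_total

# log Pr(Data) under one multinomial with 32 categories.
logprob <- dmultinom(x = n_cells, size = n_total, prob = p_cells, log = TRUE)
logprob
```

Working on the log scale avoids underflow, because the probability of any one particular table with 592 observations is very small.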

slide-15
SLIDE 15

Crossed data - working with tables

Proportions (depends on which margin is fixed)

# One margin (the third here, i.e. Sex) is fixed ... as many multinomials as in
round(prop.table(HairEyeColor, margin=3), 2)
## , , Sex = Male
##
##        Eye
## Hair    Brown Blue Hazel Green
##   Black  0.11 0.04  0.04  0.01
##   Brown  0.19 0.18  0.09  0.05
##   Red    0.04 0.04  0.03  0.03
##   Blond  0.01 0.11  0.02  0.03
##
## , , Sex = Female
##
##        Eye
## Hair    Brown Blue Hazel Green
##   Black  0.12 0.03  0.02  0.01
##   Brown  0.21 0.11  0.09  0.04
##   Red    0.05 0.02  0.02  0.02
##   Blond  0.01 0.20  0.02  0.03

Possible generative model: product multinomial. Fixed sums are

$$(n_{++1}, n_{++2}) = \sum_{ij} (n_{ij1}, n_{ij2}) = (279, 313).$$

$$\Pr(\text{Data}) = \binom{n_{++1}}{n_{111}\; n_{211}\; \cdots\; n_{441}} \, p_{111}^{n_{111}} \cdots p_{441}^{n_{441}} \times \binom{n_{++2}}{n_{112}\; n_{212}\; \cdots\; n_{442}} \, p_{112}^{n_{112}} \cdots p_{442}^{n_{442}}$$

with $p_{++k} = \sum_{i=1}^{4} \sum_{j=1}^{4} p_{ijk} = 1$ for each $k = 1, 2$.
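With Sex fixed, the probability factors into one multinomial per sex, so the log-probabilities add. A sketch, again assuming (for illustration only) that the within-sex observed proportions are used as the $p_{ijk}$:

```r
# Fixed totals n_++1 and n_++2.
fixed_totals <- margin.table(HairEyeColor, margin = 3)

# Within each sex the proportions sum to one.
slab_sums <- apply(prop.table(HairEyeColor, margin = 3), MARGIN = 3, FUN = sum)

# One multinomial log-probability per 4 x 4 slab, then add them up.
logprob_by_sex <- apply(HairEyeColor, MARGIN = 3, FUN = function(slab) {
  counts <- as.vector(slab)
  dmultinom(counts, prob = counts / sum(counts), log = TRUE)
})
logprob_product <- sum(logprob_by_sex)
```

The sum of the two log terms corresponds to the product of the two multinomial factors in the formula above.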

slide-17
SLIDE 17

Crossed data - working with tables

Proportions (depends on which margin is fixed)

# Easier to see with column sums for a two way table
HairEye <- margin.table(HairEyeColor, margin = c(1,2))
# Sum of each table's proportions must be one
round(prop.table(HairEye, margin=2), 2)
##        Eye
## Hair    Brown Blue Hazel Green
##   Black  0.31 0.09  0.16  0.08
##   Brown  0.54 0.39  0.58  0.45
##   Red    0.12 0.08  0.15  0.22
##   Blond  0.03 0.44  0.11  0.25

Possible generative model: Again a product multinomial. But now begin with sums over $k$ (i.e. Sex) so that the relevant counts are $n_{ij+} = \sum_{k=1}^{2} n_{ijk}$. Fixed sums for this table (i.e. HairEye) are

$$n_{+j+} = \sum_{i=1}^{4} n_{ij+} \quad \forall\, j = 1, \ldots, 4.$$

The values are $(n_{+1+}, n_{+2+}, n_{+3+}, n_{+4+}) = (220, 215, 93, 64)$.

$$\Pr(\text{Data}) = \prod_{j=1}^{4} \binom{n_{+j+}}{n_{1j+}\; n_{2j+}\; n_{3j+}\; n_{4j+}} \, p_{1j+}^{n_{1j+}} \cdots p_{4j+}^{n_{4j+}}$$

with $p_{+j+} = \sum_{i=1}^{4} p_{ij+} = 1$ for each $j = 1, 2, 3, 4$. (That is, each column of the proportions table above sums to 1.)
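The column constraint and the fixed totals can be checked directly. In this sketch `col_props` and `logprob_cols` are illustrative names, and using the observed column proportions as the $p_{ij+}$ is an assumption made only so the code has numbers to work with:

```r
HairEye   <- margin.table(HairEyeColor, margin = c(1, 2))
col_props <- prop.table(HairEye, margin = 2)

colSums(HairEye)     # the fixed totals n_+j+
colSums(col_props)   # each column of proportions sums to 1

# One multinomial per eye-colour column; log-probabilities add.
logprob_cols <- sapply(seq_len(ncol(HairEye)), function(j) {
  dmultinom(HairEye[, j], prob = col_props[, j], log = TRUE)
})
sum(logprob_cols)
```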

slide-18
SLIDE 18

Crossed data - tidy tables

On the course website, there is another package called tidytable which provides an implementation of the rules we developed for table analysis. To install the package

  1. Download it (tidytable_0.0-1.tar.gz) from the course website
  2. Place it somewhere in your file system, say in “SomeDirectoryYouPicked”
  3. Then install it in R as follows:

     3.1 EITHER from RStudio’s “Install Packages . . . ” menu:

         ◮ select “Install from:” “Package archive file . . . ”
         ◮ browse to your directory “SomeDirectoryYouPicked”
         ◮ leave the default “Install to Library”
         ◮ select “Install”

     3.2 OR

         ◮ get a terminal (or shell, or console) window
         ◮ change directories (cd) to “SomeDirectoryYouPicked”
         ◮ in that directory type

             R CMD INSTALL tidytable_0.0-1.tar.gz (maybe no .gz)

         ◮ if you have a problem with permissions, you might need to have administrator privileges

slide-19
SLIDE 19

Crossed data - tidy tables

For example, recall the arrangement of HairEyeColor:

HairEyeColor
## , , Sex = Male
##
##        Eye
## Hair    Brown Blue Hazel Green
##   Black    32   11    10     3
##   Brown    53   50    25    15
##   Red      10   10     7     7
##   Blond     3   30     5     8
##
## , , Sex = Female
##
##        Eye
## Hair    Brown Blue Hazel Green
##   Black    36    9     5     2
##   Brown    66   34    29    14
##   Red      16    7     7     7
##   Blond     4   64     5     8

slide-20
SLIDE 20

Crossed data - tidy tables

tidytable rearranges the dimensions of the table (plus other functionality) to give

library(tidytable)
tidytable(HairEyeColor)$table
## , , Eye = Brown
##
##         Hair
## Sex      Brown Blond Black Red
##   Female    66     4    36  16
##   Male      53     3    32  10
##
## , , Eye = Blue
##
##         Hair
## Sex      Brown Blond Black Red
##   Female    34    64     9   7
##   Male      50    30    11  10
##
## , , Eye = Hazel
##
##         Hair
## Sex      Brown Blond Black Red
##   Female    29     5     5   7
##   Male      25     5    10   7
##
## , , Eye = Green
##
##         Hair
## Sex      Brown Blond Black Red
##   Female    14     8     2   7
##   Male      15     8     3   7

This arrangement reveals patterns more easily. E.g. consider whether hair colour and sex are independent after conditioning on eye colour.
slide-21
SLIDE 21

Crossed data - tidy tables

Or, recalling the proportions, compare

proportions <- round(prop.table(HairEye, margin=2), 2)
proportions
##        Eye
## Hair    Brown Blue Hazel Green
##   Black  0.31 0.09  0.16  0.08
##   Brown  0.54 0.39  0.58  0.45
##   Red    0.12 0.08  0.15  0.22
##   Blond  0.03 0.44  0.11  0.25

to

tidyproportions <- tidytable(proportions)
tidyproportions$table
##        Hair
## Eye     Brown Blond Black Red
##   Brown    54     3    31  12
##   Blue     39    44     9   8
##   Hazel    58    11    16  15
##   Green    45    25     8  22

Note that tidyproportions contains more information, such as

tidyproportions$units
## [1] 0.01

slide-22
SLIDE 22

Eikosogram - picture of probability

Eikosograms are modelled directly on conditional probability and so can be used to reveal important patterns in this probability. To see this, let’s begin with an unconditional probability. Suppose we only have a response variate Y which can take only one of two possible values, either Y = y1 or Y = y2. An eikosogram representing this information could look like:

[Figure: eikosogram with vertical axis marked at 1/3 and 1; bands labelled Y = y1 and Y = y2.]

◮ Probability framed within a rectangle.
◮ Frame provides conditions
◮ Horizontal and vertical scales are [0,1]
◮ Horizontal (and vertical) lines only.
◮ Colours are in horizontal bands.
◮ Probability = Area

Actually represents: Pr(Y = y1 | Frame) = 1/3

We drop the condition “Frame”, but it is understood to be present.

slide-23
SLIDE 23

Eikosogram - picture of probability

Suppose we consider tossing two coins simultaneously.

[Figure: eikosogram with vertical axis marked at 1/4, 3/4 and 1; bands labelled Y = {H,H}, Y = {H,T} and Y = {T,T}.]

◮ Two coins tossed simultaneously
◮ Three events, matching outcomes of Coin1, Coin2
◮ Events: {H, H}, {T, T}, {H, T}
◮ Probabilities = areas

We have:
Pr(Y = {H, H} | Frame) = 1/4,
Pr(Y = {H, T} | Frame) = 3/4 − 1/4 = 1/2
and Pr(Y = {T, T} | Frame) = 1 − 3/4 = 1/4
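These three areas can be recovered by enumerating the four equally likely ordered outcomes and merging the two mixed ones. A small sketch, assuming fair and independent coins (the names `tosses`, `event_prob` and `heights` are illustrative):

```r
# All ordered outcomes of two fair coins, each with probability 1/4.
tosses <- expand.grid(coin1 = c("H", "T"), coin2 = c("H", "T"),
                      stringsAsFactors = FALSE)
tosses$prob <- 1 / 4

# Unordered event: sort each pair so that (H,T) and (T,H) coincide.
tosses$event <- apply(tosses[, c("coin1", "coin2")], 1,
                      function(z) paste(sort(z), collapse = ","))

# Area of each band in the eikosogram.
event_prob <- tapply(tosses$prob, tosses$event, sum)

# Cumulative heights at which the bands are drawn.
heights <- cumsum(as.vector(event_prob))
```

The cumulative heights are the axis marks 1/4, 3/4 and 1, and each band's area is the difference between successive marks.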

slide-24
SLIDE 24

Eikosogram - picture of probability

Now imagine instead that we toss one coin, observe the outcome, then toss the next. We could sketch this in an outcome tree. Action proceeds from left to right. This is often a natural way to model a process. Key features:

◮ Single root
◮ Multiple branches at each node
◮ Typically finite, though not necessarily
◮ Interest lies in different subsets of the tree from the root (paths, partial paths, subtrees) called events. There are still only three of interest here.

Typically time runs from left to right. This nicely matches the model for an eikosogram!

slide-25
SLIDE 25

Eikosogram - picture of probability

First toss one coin, observe outcome, then toss the next. Let X take the value of the first coin’s outcome, Y the value of the second’s. Outcome tree is binary with two layers, first for X, second for Y.

[Figure: eikosogram framed by X = heads; vertical axis marked at 1/2 and 1; bands labelled Y = heads and Y = tails.]

◮ Suppose first coin lands heads.
◮ This frame has X = heads
◮ Probabilities = areas

Pr(Y = heads | X = heads, Frame) = 1/2,
Pr(Y = tails | X = heads, Frame) = 1 − 1/2 = 1/2

slide-26
SLIDE 26

Eikosogram - picture of probability

Similarly, the eikosogram for the case when X = tails can be produced.

[Figure: two eikosograms, one framed by X = heads and one by X = tails; in each, the vertical axis is marked at 1/2 and 1 with bands labelled Y = heads and Y = tails.]

These two separate eikosograms can be put together in a common frame to tell the whole story in a single eikosogram.

slide-27
SLIDE 27

Eikosogram - picture of probability

Putting the two separate frames together:

[Figure: eikosogram with X = heads and X = tails on the horizontal axis (split at 1/2) and Y = heads and Y = tails on the vertical axis (split at 1/2).]

◮ Probabilities are still areas.
◮ Marginal of X read off horizontal scale
◮ Conditional of Y | X read off vertical
◮ Note X and Y are independent here (flat)

We have:
Pr(X = heads | Frame) = 1/2,
Pr(Y = heads | X = tails, Frame) = 1/2
Pr(Y = heads & X = tails | Frame) = Area of rectangle = Pr(Y = heads | X = tails, Frame) × Pr(X = tails | Frame)

slide-28
SLIDE 28

Eikosogram - picture of probability

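Note: Comparing the two yields the same results (as it should)

[Figure: two eikosograms side by side. Left, "Two coins at a time": bands Y = {H,H}, Y = {H,T} and Y = {T,T}, vertical axis marked at 1/4, 3/4 and 1. Right, "One after another": X = heads and X = tails on the horizontal axis, Y = heads and Y = tails on the vertical axis, both split at 1/2.]

Left eikosogram shows only joint outcomes, right shows marginal, conditional, and joint.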

slide-29
SLIDE 29

Eikosogram - picture of probability

Consider a different example, one that still has two variates X and Y, each with binary outcomes: X = x with x ∈ {1, 2} and Y = y with y ∈ {1, 2}. But now X and Y have probabilities as given below:

◮ Probabilities are still areas.
◮ Marginal of X read off horizontal scale
◮ Conditional of Y | X read off vertical
◮ Note X and Y are not independent here (not flat)

slide-30
SLIDE 30

Eikosogram - picture of probability

Which variate, where?

Choice of which variate, X or Y, appears on the horizontal and which on the vertical depends on which conditional probabilities are of interest. Below are two views of the same probability distribution. (Check areas!)

(a) Y | X & X    (b) X | Y & Y

slide-31
SLIDE 31

Eikosogram - picture of probability

Rules of probability follow from calculating and equating rectangular areas.

[Figure: two eikosograms of the same distribution, joined by ⇐⇒. In each, the bottom left yellow area is Pr(X = 1, Y = 1).]

Left: Pr(X = 1, Y = 1) = Pr(Y = 1 | X = 1) × Pr(X = 1)
Right: Pr(X = 1, Y = 1) = Pr(X = 1 | Y = 1) × Pr(Y = 1)

Pr(Y | X) × Pr(X) = Pr(X | Y) × Pr(Y) . . . Bayes “theorem”
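The area argument can be checked numerically on any 2 x 2 joint distribution. The probabilities in this sketch are hypothetical values chosen for illustration; they are not the ones drawn on the slides:

```r
# Hypothetical joint probabilities Pr(X = x, Y = y); rows are X, columns are Y.
joint <- matrix(c(0.10, 0.30,
                  0.40, 0.20), nrow = 2, byrow = TRUE,
                dimnames = list(X = c("1", "2"), Y = c("1", "2")))

pX <- rowSums(joint)                          # marginal of X
pY <- colSums(joint)                          # marginal of Y
y_given_x <- prop.table(joint, margin = 1)    # Pr(Y | X): rows sum to 1
x_given_y <- prop.table(joint, margin = 2)    # Pr(X | Y): columns sum to 1

# The same rectangle, measured in the two views.
area_a <- y_given_x["1", "1"] * pX["1"]       # Pr(Y = 1 | X = 1) * Pr(X = 1)
area_b <- x_given_y["1", "1"] * pY["1"]       # Pr(X = 1 | Y = 1) * Pr(Y = 1)

# X and Y are not independent here: the joint differs from the product of margins.
independent <- isTRUE(all.equal(joint, outer(pX, pY), check.attributes = FALSE))
```

Both areas equal the joint probability 0.10, which is the equality behind Bayes' theorem.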

slide-32
SLIDE 32

Monty Hall problem

You are on a TV show called ‘Let’s make a Deal’ and the host, Monty Hall, shows you three doors.

◮ Behind one of these doors is a brand new car!
◮ Behind each of the other two doors is a goat!
◮ You get to choose one of the three doors and take home the prize hidden behind it.

slide-33
SLIDE 33

Monty Hall problem

You are on ‘Let’s make a Deal’ and the host, Monty Hall, shows you three doors.

◮ You choose a door.
◮ Before the prize is revealed, Monty opens one of the two other doors to reveal . . . a goat!
◮ Monty then offers you the opportunity to change your mind and either keep what’s behind the door you have already selected, or whatever’s behind the other unopened door.
◮ Is it better to stay with your original choice? Or switch? Does it matter?

slide-34
SLIDE 34

Monty Hall problem

You select door C, and then Monty opens door B:

◮ Should you switch? Or does it matter?
◮ Reasoning often goes as follows:
    ◮ You always knew that at least one of doors A and B hides a goat. Knowing which one doesn’t change anything.
    ◮ Or, two doors remain. It doesn’t matter which you choose.
◮ Both seem reasonable.

slide-36
SLIDE 36

Monty Hall problem

Let’s frame the possibilities in an outcome tree. Levels are:

  1. Monty places the car behind one of three doors.
  2. You choose a door.
  3. Monty reveals a goat.

Highlighted is the event we have observed. Monty placed the car behind one of A or C, you chose C, Monty reveals goat behind B.

slide-38
SLIDE 38

Monty Hall problem

We want to determine:

Pr(Car is behind door C | Contestant selects door C and Monty reveals a goat behind door B)

Our Frame is that you have already chosen door C.

Consider Monty’s choices:

◮ Open door A, or
◮ Open door B.

slide-39
SLIDE 39

Monty Hall problem - the eikosogram

Frame: You have chosen door C and Monty is about to open a door. Consider Monty’s choices: Open door A, Open door B.

[Figure: eikosogram. Horizontal axis: true location of the car (A, B, C), marked at 1/3, 2/3 and 1. Vertical axis: the door Monty reveals (Door A or Door B).]

What would Monty do, if the car were truly behind door A?

slide-40
SLIDE 40

Monty Hall problem - the eikosogram

Frame: You have chosen door C and Monty is about to open a door. Consider Monty’s choices: Open door A, Open door B.

[Figure: eikosogram. Horizontal axis: true location of the car (A, B, C), marked at 1/3, 2/3 and 1. Vertical axis: the door Monty reveals (Door A or Door B).]

If the car is behind A, then Monty will reveal B with probability 1.

slide-41
SLIDE 41

Monty Hall problem - the eikosogram

Frame: You have chosen door C and Monty is about to open a door. Consider Monty’s choices: Open door A, Open door B.

[Figure: eikosogram. Horizontal axis: true location of the car (A, B, C), marked at 1/3, 2/3 and 1. Vertical axis: the door Monty reveals (Door A or Door B).]

And if the car is behind B, then Monty will reveal B with probability 0.

slide-42
SLIDE 42

Monty Hall problem - the eikosogram

Frame: You have chosen door C and Monty is about to open a door. Consider Monty’s choices: Open door A, Open door B.

[Figure: eikosogram. Horizontal axis: true location of the car (A, B, C), marked at 1/3, 2/3 and 1. Vertical axis: the door Monty reveals (Door A or Door B).]

What if the car is behind C, the door you chose?

slide-44
SLIDE 44

Monty Hall problem - the eikosogram

Frame: You have chosen door C and Monty is about to open a door. Consider Monty’s choices: Open door A, Open door B.

[Figure: eikosogram. Horizontal axis: true location of the car (A, B, C), marked at 1/3, 2/3 and 1. Vertical axis: the door Monty reveals (Door A or Door B).]

What if the car is behind C, the door you chose? Now Monty has a choice.

slide-45
SLIDE 45

Monty Hall problem - the eikosogram

Frame: You have chosen door C and Monty is about to open a door. Consider Monty’s choices: Open door A, Open door B.

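[Figure: eikosogram. Horizontal axis: true location of the car (A, B, C), marked at 1/3, 2/3 and 1. Vertical axis: the door Monty reveals (Door A or Door B), with height p marked in the column for C.]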

If the car is behind C, the door you chose, Monty will choose to reveal B with some probability p of his choosing.

slide-46
SLIDE 46

Monty Hall problem - the eikosogram

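Recall we want

Pr(Car at C | we chose C and Monty reveals B)

This conditional probability can be determined from the eikosogram:

[Figure: eikosogram. Horizontal axis: true location of the car (A, B, C), marked at 1/3, 2/3 and 1. Vertical axis: the door Monty reveals (Door A or Door B), with height p marked in the column for C.]

The conditional probability is p / (p + 1): the area for "car at C and Monty reveals B" (p × 1/3) divided by the total area for "Monty reveals B" (1 × 1/3 + 0 × 1/3 + p × 1/3).

◮ If p = 0, best to switch.
◮ If p = 1, doesn’t matter.
◮ If 0 < p < 1, best to switch.
◮ Overall, best to switch.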
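The area calculation can be written out as a short function. This is a sketch of the reasoning above; the function name `posterior_stay` and its argument `p` are illustrative choices, with `p` standing for the probability that Monty opens door B when the car is behind door C:

```r
# Frame: the contestant has already chosen door C.
posterior_stay <- function(p) {
  prior    <- c(A = 1, B = 1, C = 1) / 3   # car equally likely behind each door
  reveal_B <- c(A = 1, B = 0, C = p)       # Pr(Monty opens B | car location)
  area     <- prior * reveal_B             # joint areas in the "Door B" band
  unname(area["C"] / sum(area))            # Pr(car at C | Monty opens B)
}

posterior_stay(0)     # 0:   switching wins for certain
posterior_stay(0.5)   # 1/3: the usual textbook answer
posterior_stay(1)     # 1/2: staying and switching are equally good
```

For every p in [0, 1] the result is at most 1/2, so switching is never worse than staying.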

slide-47
SLIDE 47

Monty Hall problem - the eikosogram

Seeing association

Pr(Car at C | we chose C and Monty reveals B) = p / (p + 1), determined from the same eikosogram and with the same conclusions as on the previous slide: overall, best to switch.