exploring models

Exploring models: Categorical data
R.W. Oldford

1974 Motor trend magazine data
Recall the R data set called mtcars.

    head(mtcars)
    ##            mpg cyl disp  hp drat    wt  qsec vs am gear carb
    ## Mazda RX4 21.0   6  160 110 3.90 2.620 16.46  0  1    4    4


  2. 1974 Motor trend magazine data
     Could do the same for engine type vs (V-shaped or straight):

         with(mtcars,
              plot(wt, vs, main = "Engine type",
                   col = adjustcolor("black", alpha.f = 0.7),
                   pch = 21, cex = 2, xlab = "weight",
                   ylab = "(0 = V-shaped, 1 = straight)"))
         fit2 <- loess(vs ~ wt, data = mtcars)
         x <- seq(min(mtcars$wt), max(mtcars$wt), length.out = 200)
         y <- predict(fit2, newdata = data.frame(wt = x))
         lines(x, y, col = "steelblue", lwd = 2)

     [Figure: "Engine type". vs (0 = V-shaped, 1 = straight) against weight, with the loess curve in blue.]

     which is a bit weirder. Note also that the curve goes below 0 and above 1 in places.
     What does the blue curve estimate now? Again, this is the mean of the vs variate. Or, given the way vs is recorded, E(vs) = Pr(Engine = straight).
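Because vs is coded 0/1, its average is a proportion, which is why a smoother of vs estimates a probability. The following is a minimal sketch in base R using only the built-in mtcars data; the names p_straight, p_by_count and loess_range are illustrative and not from the slides.

```r
# vs is coded 0 (V-shaped) or 1 (straight), so its mean is a proportion.
p_straight <- mean(mtcars$vs)

# The same quantity computed as a relative frequency.
p_by_count <- sum(mtcars$vs == 1) / nrow(mtcars)
print(c(mean = p_straight, proportion = p_by_count))

# The loess fit is not constrained to [0, 1]; inspect the range of the curve.
fit2 <- loess(vs ~ wt, data = mtcars)
x <- seq(min(mtcars$wt), max(mtcars$wt), length.out = 200)
loess_range <- range(predict(fit2, newdata = data.frame(wt = x)))
print(loess_range)
```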

  10. Generalized linear model (glm)
      We could also use a parametric model, such as a generalized linear model.
      Recall that a generalized linear model models a known function of the mean $\mu$, say $g(\mu)$, called the link function, and we model

          $$g(\mu(x_1, \ldots, x_q)) = \theta_0 + \theta_1 x_1 + \cdots + \theta_p x_p$$

      with everything else as before.
      If $Y \in \{0, 1\}$ is a Bernoulli random variable then

          $$\mu = E(Y) = 0 \times \Pr(Y = 0) + 1 \times \Pr(Y = 1) = \Pr(Y = 1) = p, \text{ say.}$$

      The natural link function for Bernoulli random variables is the logit, where

          $$g(\mu) = \ln\left(\frac{\mu}{1 - \mu}\right) = \ln\left(\frac{p}{1 - p}\right).$$

      This particular response modelling is called logistic regression. Other link functions are also possible. Binomial random variables (sums of Bernoullis) have the same model.
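To make the logit link concrete, here is a small sketch that writes the link and its inverse by hand and compares them with base R's qlogis() and plogis(). The function names logit and inv_logit and the probabilities in p are illustrative choices, not part of the slides.

```r
# Logit link g(mu) = ln(mu / (1 - mu)) and its inverse, written out by hand.
logit <- function(mu) log(mu / (1 - mu))
inv_logit <- function(eta) 1 / (1 + exp(-eta))

p <- c(0.1, 0.25, 0.5, 0.75, 0.9)
eta <- logit(p)

# Base R provides the same pair as qlogis() (logit) and plogis() (inverse).
print(cbind(p = p, logit = eta, qlogis = qlogis(p), back = inv_logit(eta)))
```

The logit maps probabilities in (0, 1) onto the whole real line, so a linear predictor can take any value and still correspond to a valid probability.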

  13. 1974 Motor trend magazine data - glm fit
      A logistic model is fitted to mtcars (here with am as response) as follows:

          fit_am_glm <- glm(am ~ wt, data = mtcars, family = "binomial")

      which has a summary much like lm():

          summary(fit_am_glm)
          ##
          ## Call:
          ## glm(formula = am ~ wt, family = "binomial", data = mtcars)
          ##
          ## Deviance Residuals:
          ##      Min        1Q    Median        3Q       Max
          ## -2.11400  -0.53738  -0.08811   0.26055   2.19931
          ##
          ## Coefficients:
          ##             Estimate Std. Error z value Pr(>|z|)
          ## (Intercept)   12.040      4.510   2.670  0.00759 **
          ## wt            -4.024      1.436  -2.801  0.00509 **
          ## ---
          ## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
          ##
          ## (Dispersion parameter for binomial family taken to be 1)
          ##
          ##     Null deviance: 43.230  on 31  degrees of freedom
          ## Residual deviance: 19.176  on 30  degrees of freedom
          ## AIC: 23.176
          ##
          ## Number of Fisher Scoring iterations: 6
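One way to read the coefficients is to push them through the inverse logit by hand. The sketch below refits the same model and then computes, for an illustrative weight wt0 = 3 (an assumption chosen only for the example), the fitted probability of a manual transmission, the multiplicative change in odds per unit of wt, and the weight at which the fitted probability is one half. The helper odds() and the names wt0, wt_half and odds_ratio are hypothetical.

```r
fit_am_glm <- glm(am ~ wt, data = mtcars, family = "binomial")
theta <- coef(fit_am_glm)

# Linear predictor and fitted probability at an illustrative weight.
wt0 <- 3
eta0 <- theta[["(Intercept)"]] + theta[["wt"]] * wt0
p0 <- plogis(eta0)

# On the odds scale, one extra unit of wt multiplies the odds by exp(theta_wt).
odds <- function(p) p / (1 - p)
p1 <- plogis(theta[["(Intercept)"]] + theta[["wt"]] * (wt0 + 1))
odds_ratio <- odds(p1) / odds(p0)

# Weight at which the linear predictor is zero, i.e. fitted probability 0.5.
wt_half <- -theta[["(Intercept)"]] / theta[["wt"]]

print(c(p_at_wt0 = p0, odds_ratio = odds_ratio, wt_half = wt_half))
```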

  15. 1974 Motor trend magazine data - glm fit
      To display the fitted model, we need predictions as before.

          x <- seq(min(mtcars$wt), max(mtcars$wt), length.out = 200)
          y <- predict(fit_am_glm, newdata = data.frame(wt = x), type = "response")

      Note the argument type = "response" ensures the prediction is on the original scale (namely that of µ, or p).

          with(mtcars,
               plot(wt, am, main = "Transmission",
                    col = adjustcolor("black", alpha.f = 0.7),
                    pch = 21, cex = 2, xlab = "weight",
                    ylab = "(0 = automatic, 1 = manual)"))
          lines(x, y, col = "steelblue", lwd = 2)

      [Figure: "Transmission". am (0 = automatic, 1 = manual) against weight, with the fitted logistic curve in blue.]

      The predicted values are probabilities as a function of wt.
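The role of type = "response" can be checked directly: predictions on the link scale are values of the linear predictor, and applying the inverse logit to them should reproduce the response-scale predictions. A minimal sketch, with eta_hat and p_hat as illustrative names:

```r
fit_am_glm <- glm(am ~ wt, data = mtcars, family = "binomial")
x <- seq(min(mtcars$wt), max(mtcars$wt), length.out = 200)
new_wt <- data.frame(wt = x)

# Link scale: the linear predictor theta_0 + theta_1 * wt (any real value).
eta_hat <- predict(fit_am_glm, newdata = new_wt, type = "link")

# Response scale: fitted probabilities.
p_hat <- predict(fit_am_glm, newdata = new_wt, type = "response")

print(range(eta_hat))
print(range(p_hat))
```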

  16. 1974 Motor trend magazine data - glm fit
      Could have had a more complex model, say a polynomial (on the logit scale):

          fit_am3_glm <- glm(am ~ poly(wt, 3), data = mtcars, family = "binomial")
          x <- seq(min(mtcars$wt), max(mtcars$wt), length.out = 200)
          y2 <- predict(fit_am3_glm, newdata = data.frame(wt = x), type = "response")
          with(mtcars,
               plot(wt, am, main = "Transmission (cubic fit)",
                    col = adjustcolor("black", alpha.f = 0.7),
                    pch = 21, cex = 2, xlab = "weight",
                    ylab = "(0 = automatic, 1 = manual)"))
          lines(x, y2, col = "steelblue", lwd = 2)

      [Figure: "Transmission (cubic fit)". am (0 = automatic, 1 = manual) against weight, with the cubic logistic fit in blue.]

      which is slightly different (higher order terms not significant here).
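The slides do not show how the significance of the higher order terms was judged. One common option, offered here only as a sketch, is an analysis-of-deviance comparison of the two nested fits with anova(); the name cmp is illustrative. The cubic fit may produce warnings about fitted probabilities close to 0 or 1.

```r
fit_am_glm  <- glm(am ~ wt, data = mtcars, family = "binomial")
fit_am3_glm <- glm(am ~ poly(wt, 3), data = mtcars, family = "binomial")

# Likelihood-ratio (chi-squared) comparison of the linear and cubic logit models.
# The second row reports the drop in deviance for the two extra terms.
cmp <- anova(fit_am_glm, fit_am3_glm, test = "Chisq")
print(cmp)
```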

  19. 1974 Motor trend magazine data - glm fit
      Fitting the other binary response vs now using a logistic model:

          fit_vs_glm <- glm(vs ~ wt, data = mtcars, family = "binomial")

      and plotting

      [Figure: "Engine". vs (0 = V-shaped, 1 = straight) against weight, with the fitted logistic curve.]

      which at least conforms to producing probability estimates between 0 and 1.
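The plotting code for this slide did not survive. A plausible reconstruction, assuming it mirrors the earlier Transmission plot (the plot settings and the name y_vs are assumptions), is:

```r
fit_vs_glm <- glm(vs ~ wt, data = mtcars, family = "binomial")

# Fitted probabilities of a straight engine over a grid of weights.
x <- seq(min(mtcars$wt), max(mtcars$wt), length.out = 200)
y_vs <- predict(fit_vs_glm, newdata = data.frame(wt = x), type = "response")

with(mtcars,
     plot(wt, vs, main = "Engine",
          col = adjustcolor("black", alpha.f = 0.7),
          pch = 21, cex = 2, xlab = "weight",
          ylab = "(0 = V-shaped, 1 = straight)"))
lines(x, y_vs, col = "steelblue", lwd = 2)
```

Unlike the loess curve, these fitted values are inverse-logit transforms of a linear predictor, so they cannot leave the interval from 0 to 1.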

  22. 1974 Motor trend magazine data - glm fit with factors
      Note that our binary variates were represented as numeric vectors and not as factors. We can fix that by coercion and adding these as new variates to mtcars as follows:

          mtcars$Transmission <- factor(mtcars$am, labels = c("automatic", "manual"))
          mtcars$Engine <- factor(mtcars$vs, labels = c("V-shaped", "straight"))

      These can be fit via glm() as before:

          fit_trans_glm <- glm(Transmission ~ wt, data = mtcars, family = "binomial")
          fit_engine_glm <- glm(Engine ~ wt, data = mtcars, family = "binomial")

      The first of the two levels denotes "failure", the second (or all others for multiple categories) denotes "success". Here

          levels(mtcars$Transmission)
          ## [1] "automatic" "manual"
          levels(mtcars$Engine)
          ## [1] "V-shaped" "straight"
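The "first level is failure" convention can be checked by comparing fits. In this sketch a copy of the data (cars, a name introduced here so the built-in mtcars is left untouched) is fitted with the numeric response, with the factor response, and with the factor levels reversed (Trans_rev, a hypothetical variable added only for the comparison).

```r
cars <- mtcars
cars$Transmission <- factor(cars$am, labels = c("automatic", "manual"))

fit_num <- glm(am ~ wt, data = cars, family = "binomial")
fit_fac <- glm(Transmission ~ wt, data = cars, family = "binomial")

# Reversing the level order swaps which outcome counts as "success".
cars$Trans_rev <- factor(cars$Transmission, levels = c("manual", "automatic"))
fit_rev <- glm(Trans_rev ~ wt, data = cars, family = "binomial")

print(rbind(numeric = coef(fit_num),
            factor = coef(fit_fac),
            reversed = coef(fit_rev)))
```

The numeric and factor fits should agree, while the reversed fit models Pr(automatic) and so has coefficients of the opposite sign.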

  25. 1974 Motor trend magazine data - glm fit with factors
      Plotting takes a little more care though. We need to add one to the predicted probabilities since the factor levels are plotted at the values 1 and 2.

          with(mtcars,
               plot(wt, Transmission, main = "Transmission as factor",
                    col = adjustcolor("black", alpha.f = 0.7),
                    pch = 21, cex = 2, xlab = "weight"))
          x <- seq(min(mtcars$wt), max(mtcars$wt), length.out = 200)
          y <- predict(fit_trans_glm, newdata = data.frame(wt = x), type = "response")
          lines(x, y + 1, col = "steelblue", lwd = 2)

      [Figure: "Transmission as factor". Transmission (plotted at 1 and 2) against weight, with the shifted logistic curve in blue.]

      Similarly for the factor Engine.
      See help(glm) for more variations.

  26. Interactive exploration
      Don't forget that we have already seen how to explore the relationship between a binary response and a continuous explanatory variate interactively in loon. For example,

          library(loon)
          h_wt <- with(mtcars, l_hist(wt, linkingGroup = "mtcars"))
          h_engine <- with(mtcars, l_hist(Engine, linkingGroup = "mtcars"))
          h_trans <- with(mtcars, l_hist(Transmission, linkingGroup = "mtcars"))

  27. Interactive exploration
      For example, the effect of different wt groups on either Transmission or Engine:

          library(gridExtra)
          wt_grps <- cut(mtcars$wt, 5)
          h_wt["color"] <- wt_grps
          grid.arrange(plot(h_wt, draw = FALSE),
                       plot(h_trans, draw = FALSE),
                       plot(h_engine, draw = FALSE),
                       nrow = 1)

      [Figure: three linked histograms (wt, Transmission, Engine), coloured by wt group.]

  28. Interactive exploration
      Or, condition on Transmission choice:

          library(gridExtra)
          h_wt["color"] <- mtcars$Transmission
          grid.arrange(plot(h_wt, draw = FALSE),
                       plot(h_trans, draw = FALSE),
                       plot(h_engine, draw = FALSE),
                       nrow = 1)

      [Figure: three linked histograms (wt, Transmission, Engine), coloured by Transmission.]

  29. Interactive exploration
      Or, alternatively, condition on Engine choice:

          library(gridExtra)
          h_wt["color"] <- mtcars$Engine
          grid.arrange(plot(h_wt, draw = FALSE),
                       plot(h_trans, draw = FALSE),
                       plot(h_engine, draw = FALSE),
                       nrow = 1)

      [Figure: three linked histograms (wt, Transmission, Engine), coloured by Engine.]

  32. Spine plots
      Another way to display a binary response as a function of a continuous explanatory variate is via spineplot(). It requires the response to be a factor.

          spineplot(Transmission ~ wt, data = mtcars)

      [Figure: spine plot of Transmission (automatic, manual) against binned wt; the vertical axis runs from 0 to 1.]

      Note:
      - each interval has identical range
      - each interval width is proportional to the number of observations, and
      - within each bar, the heights of the coloured segments give the proportion of each level in that interval.
      - Conceptually, this is a plot of Pr(y | x) versus Pr(x).
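The quantities a spine plot draws can be tabulated by hand. This sketch assumes the default binning of spineplot() matches the breaks chosen by hist(), which is my understanding of the default but is stated here as an assumption; the names cars, wt_bin, bar_width and bar_split are illustrative.

```r
cars <- mtcars
cars$Transmission <- factor(cars$am, labels = c("automatic", "manual"))

# Assumption: bin wt with the breaks hist() would choose by default.
breaks <- hist(cars$wt, plot = FALSE)$breaks
wt_bin <- cut(cars$wt, breaks = breaks, include.lowest = TRUE)

counts <- table(wt_bin, cars$Transmission)

# Bar widths: share of observations in each interval, an estimate of Pr(x).
bar_width <- rowSums(counts) / sum(counts)

# Split within each bar: proportion of each level, an estimate of Pr(y | x).
# Empty intervals give NaN here and have zero width in the plot.
bar_split <- prop.table(counts, margin = 1)

print(round(cbind(width = bar_width, bar_split), 2))
```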

  33. Spine plots
      And for the factor Engine:

          spineplot(Engine ~ wt, data = mtcars)

      [Figure: spine plot of Engine (V-shaped, straight) against binned wt; the vertical axis runs from 0 to 1.]

      Again, conceptually, spine plots are plots of Pr(y | x) versus Pr(x).

  35. Eikosograms - the probability picture
      This plot is very similar to the spine plot with the following differences:
      - both response and explanatory variates are categorical
      - there can be as many explanatory variates as the size of the display will support
      - probabilities can be shown on both horizontal and vertical axes
      The display is available from the R package eikosograms, available on CRAN.

          library(eikosograms)
          eikos(Transmission ~ Engine, data = mtcars)

      [Figure: eikosogram of Transmission (automatic, manual) given Engine (V-shaped, straight). Horizontal axis tick at 0.56; vertical axis ticks at 0.67 and 0.5.]

  39. Eikosograms - the probability picture
      The eikosogram shows the joint distribution of X and Y by showing

          $$\Pr(X, Y) = \Pr(Y \mid X) \times \Pr(X)$$

      We could have just as easily written the joint probability as

          $$\Pr(X, Y) = \Pr(X \mid Y) \times \Pr(Y)$$

      These would be two different but equivalent (in the sense that one can be derived from the other) eikosograms.

          e1 <- eikos(Transmission ~ Engine, data = mtcars, draw = FALSE)
          e2 <- eikos(Engine ~ Transmission, data = mtcars, draw = FALSE)

  44. Eikosograms - the probability picture
      The two eikosograms are

          grid.arrange(e1, e2, nrow = 1, widths = c(0.75, 0.825))

      [Figure: two eikosograms side by side. Left: Transmission given Engine (horizontal tick 0.56; vertical ticks 0.67, 0.5). Right: Engine given Transmission (horizontal tick 0.59; vertical ticks 0.63, 0.46).]

      One can be derived from the other. Matching (X, Y) areas in both diagrams:

          area of a rectangle in the left eikosogram = area of the corresponding rectangle in the right eikosogram
          height × width = height × width of the corresponding rectangle
          Pr(Y | X) × Pr(X) = Pr(X | Y) × Pr(Y)

      or Bayes's theorem.
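The area-matching argument can be verified numerically from the counts alone. In this sketch X is Engine (vs) and Y is Transmission (am), matching eikos(Transmission ~ Engine); the object names are illustrative.

```r
# Cross-classified counts: rows are Engine (X), columns are Transmission (Y).
counts <- table(Engine = mtcars$vs, Transmission = mtcars$am)

joint <- prop.table(counts)                   # Pr(X, Y)
pr_x <- rowSums(joint)                        # Pr(X): widths in the left eikosogram
pr_y <- colSums(joint)                        # Pr(Y): widths in the right eikosogram
y_given_x <- prop.table(counts, margin = 1)   # Pr(Y | X): each row sums to 1
x_given_y <- prop.table(counts, margin = 2)   # Pr(X | Y): each column sums to 1

# height x width in each diagram recovers the same joint probabilities.
left  <- sweep(y_given_x, 1, pr_x, "*")
right <- sweep(x_given_y, 2, pr_y, "*")

print(round(joint, 3))
```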

  51. Eikosograms - tables
      A table is a data structure that summarizes (e.g. by counts) cross-classified factors.

          table1 <- table(mtcars$vs, mtcars$am)  # vectors interpretable as factors
          table1
          ##
          ##      0  1
          ##   0 12  6
          ##   1  7  7

          # Or with a formula via cross-tabulation
          table2 <- xtabs(~ Engine + am, data = mtcars)  # Note no response
          table2
          ##           am
          ## Engine      0  1
          ##   V-shaped 12  6
          ##   straight  7  7

      Cross tabulation does a little more:

          table3 <- xtabs(wt ~ Engine + Transmission, data = mtcars)  # response summed
          round(table3 / table2, 2)  # Average weights
          ##           Transmission
          ## Engine     automatic manual
          ##   V-shaped      4.10   2.86
          ##   straight      3.19   2.03

      See help("table"), help("xtabs") and also help(tabulate).
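The "summed response divided by counts" construction can be cross-checked against a direct group mean. This sketch uses tapply() as the independent route; the copy cars and the names avg_xtabs and avg_tapply are illustrative.

```r
cars <- mtcars
cars$Transmission <- factor(cars$am, labels = c("automatic", "manual"))
cars$Engine <- factor(cars$vs, labels = c("V-shaped", "straight"))

# Average weight per cell as (sum of wt) / (number of cars).
counts <- xtabs(~ Engine + Transmission, data = cars)
sums <- xtabs(wt ~ Engine + Transmission, data = cars)
avg_xtabs <- sums / counts

# The same averages computed directly as group means.
avg_tapply <- with(cars, tapply(wt, list(Engine, Transmission), mean))

print(round(avg_tapply, 2))
```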

  54. Tabulating admissions - sex discrimination at grad school?
      A well known table of counts is UCBAdmissions, which records the number of applications and admissions to several large graduate programmes at the University of California (Berkeley) in 1973.
      This is a table data structure of counts cross-classified by three different factors:

          dim(UCBAdmissions)
          ## [1] 2 2 6
          dimnames(UCBAdmissions)
          ## $Admit
          ## [1] "Admitted" "Rejected"
          ##
          ## $Gender
          ## [1] "Male"   "Female"
          ##
          ## $Dept
          ## [1] "A" "B" "C" "D" "E" "F"

      Based on the data in this table, the question was raised of whether sexism played a role in admission to graduate school at Berkeley.
      This question can be explored using eikosograms.

  57. Eikosograms - sex discrimination at grad school?
      First, to look at the admission rate we use

          eikos("Admit", data = UCBAdmissions)

      [Figure: eikosogram of Admit (Admitted, Rejected); vertical tick at 0.39.]

      Comments?

  60. Eikosograms - sex discrimination at grad school?
      But, what if we break this down by the sex of the applicant?

          eikos(Admit ~ Gender, data = UCBAdmissions)

      [Figure: eikosogram of Admit (Admitted, Rejected) given Gender (Male, Female). Horizontal tick at 0.59; vertical ticks at 0.45 and 0.3.]

      Comments?
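The probabilities printed on these eikosograms can be recovered from the table with margin.table() and prop.table(). A minimal sketch using the built-in UCBAdmissions table; the object names are illustrative.

```r
# Overall admission rate: collapse over Gender and Dept.
overall <- margin.table(UCBAdmissions, margin = 1) / sum(UCBAdmissions)

# Admission rate by sex: collapse over Dept, then condition on Gender.
admit_by_gender <- margin.table(UCBAdmissions, margin = c(1, 2))
rate_by_gender <- prop.table(admit_by_gender, margin = 2)

# Share of applicants of each sex (the horizontal split in the eikosogram).
gender_share <- margin.table(UCBAdmissions, margin = 2) / sum(UCBAdmissions)

print(round(overall, 2))
print(round(rate_by_gender, 2))
print(round(gender_share, 2))
```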

  63. Eikosograms - sex discrimination at grad school?
      Recall that there are several different departments:

          eikos("Dept", data = UCBAdmissions)

      [Figure: eikosogram of Dept (A to F); cumulative vertical ticks at 0.21, 0.34, 0.54, 0.71, 0.84.]

      Comments?

  66. Eikosograms - sex discrimination at grad school?
      How do admissions for the two sexes fare across departments?
      Males:

          eikos(Admit ~ Dept, data = UCBAdmissions[, "Male", ])

      [Figure: eikosogram of Admit (Admitted, Rejected) given Dept (A to F), males only. Horizontal ticks: 0.31, 0.51, 0.64, 0.79, 0.86. Vertical ticks: 0.63, 0.62, 0.37, 0.33, 0.28, 0.06.]

      Comments?

  69. Eikosograms - sex discrimination at grad school?
      How do admissions for the two sexes fare across departments?
      Females:

          eikos(Admit ~ Dept, data = UCBAdmissions[, "Female", ])

      [Figure: eikosogram of Admit (Admitted, Rejected) given Dept (A to F), females only. Horizontal ticks: 0.06, 0.07, 0.4, 0.6, 0.81. Vertical ticks: 0.82, 0.68, 0.35, 0.34, 0.24, 0.07.]

      Comments?

  72. Eikosograms - sex discrimination at grad school?
      How about comparing sexes by department?
      Department A:

          eikos(Admit ~ Gender, data = UCBAdmissions[, , "A"])

      [Figure: eikosogram of Admit given Gender (Male, Female), Department A. Horizontal tick at 0.88; vertical ticks at 0.82 and 0.62.]

      Comments?

  75. Eikosograms - sex discrimination at grad school?
      How about comparing sexes by department?
      Department B:

          eikos(Admit ~ Gender, data = UCBAdmissions[, , "B"])

      [Figure: eikosogram of Admit given Gender (Male, Female), Department B. Horizontal tick at 0.96; vertical ticks at 0.68 and 0.63.]

      Comments?

  78. Eikosograms - sex discrimination at grad school?
      How about comparing sexes by department?
      Department C:

          eikos(Admit ~ Gender, data = UCBAdmissions[, , "C"])

      [Figure: eikosogram of Admit given Gender (Male, Female), Department C. Horizontal tick at 0.35; vertical ticks at 0.37 and 0.34.]

      Comments?

  81. Eikosograms - sex discrimination at grad school?
      How about comparing sexes by department?
      Department D:

          eikos(Admit ~ Gender, data = UCBAdmissions[, , "D"])

      [Figure: eikosogram of Admit given Gender (Male, Female), Department D. Horizontal tick at 0.53; vertical ticks at 0.35 and 0.33.]

      Comments?

  84. Eikosograms - sex discrimination at grad school?
      How about comparing sexes by department?
      Department E:

          eikos(Admit ~ Gender, data = UCBAdmissions[, , "E"])

      [Figure: eikosogram of Admit given Gender (Male, Female), Department E. Horizontal tick at 0.33; vertical ticks at 0.28 and 0.24.]

      Comments?
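Rather than drawing one eikosogram per department, the department-by-sex admission rates can be tabulated in one step. This sketch conditions the three-way table on Gender and Dept with prop.table(); the names cond, rates and pooled are illustrative.

```r
# Pr(Admit | Gender, Dept): proportions within each Gender x Dept cell.
cond <- prop.table(UCBAdmissions, margin = c(2, 3))

# Keep the "Admitted" slice: a Gender x Dept matrix of admission rates.
rates <- cond["Admitted", , ]
print(round(rates, 2))

# For comparison, the pooled rates by sex, ignoring department.
pooled <- prop.table(margin.table(UCBAdmissions, margin = c(1, 2)),
                     margin = 2)["Admitted", ]
print(round(pooled, 2))
```

Setting the per-department rows beside the pooled rates makes it easy to see where the within-department comparisons differ from the comparison that ignores department.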
