Visual comparisons
Comparing distributions: Part 1
R.W. Oldford

The Titanic

The data set `Titanic` provides "information on the fate of passengers on the fatal maiden voyage of the ocean liner 'Titanic', summarized according to economic status (class), sex, age and survival."

The Titanic data records the number of passengers in various categories for four different categorical variates:

No.  Variate   Values
1    Class     1st, 2nd, 3rd, Crew
2    Sex       Male, Female
3    Age       Child, Adult
4    Survived  No, Yes
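The structure described above can be checked directly in R. This is a minimal sketch that assumes only the `Titanic` table shipped with base R's datasets package; the name `totalAboard` is introduced here for illustration.

```r
# Titanic is a 4-dimensional contingency table from the datasets package
dim(Titanic)        # one dimension per variate: Class, Sex, Age, Survived
dimnames(Titanic)   # the values each variate can take

# Total number of people counted across all categories
totalAboard <- sum(Titanic)
totalAboard
```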

The Titanic

We might be interested in comparing classes by survival.

    library(knitr)
    # Subtable of survival/not by class
    classTable <- apply(Titanic, MARGIN = c(4, 1), FUN = sum)
    kable(classTable)

       1st  2nd  3rd  Crew
No     122  167  528   673
Yes    203  118  178   212

    # Number in each class
    classTotals <- apply(classTable, MARGIN = 2, FUN = sum)
    # Proportion of each class that survived
    classSurvival <- t(classTable["Yes", ] / classTotals)
    rownames(classSurvival) <- c("Survived")
    kable(classSurvival)

          1st        2nd        3rd        Crew
Survived  0.6246154  0.4140351  0.2521246  0.239548
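The same survival rates can also be obtained with base R's `prop.table()`, which divides each entry by its margin total. This is only an alternative sketch, not the approach used in the slides; the names `classProportions` and `survivalRate` are introduced here for illustration.

```r
# Subtable of survival by class, built as in the slide
classTable <- apply(Titanic, MARGIN = c(4, 1), FUN = sum)

# margin = 2 divides every column (class) by its own column total
classProportions <- prop.table(classTable, margin = 2)

# The "Yes" row holds the proportion of each class that survived
survivalRate <- classProportions["Yes", ]
round(survivalRate, 3)
```

Each column of `classProportions` sums to one, so the "No" row is simply one minus the survival rate.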

The Titanic

Following the rules for tables, a better way to present these numbers is as

    # Rescale and round to two decimals
    newTable <- 100 * round(classSurvival, 2)
    # swap rows and columns
    newTable <- t(newTable)
    # Values are already in the right order, but in general
    # order the values in descending order
    descendingOrder <- order(newTable, decreasing = TRUE)
    newTable <- newTable[descendingOrder, , drop = FALSE]  # Note drop argument
    colnames(newTable) <- c("% survived")
    kable(newTable, caption = "Survival rates on the Titanic by class")

Table 4: Survival rates on the Titanic by class

       % survived
1st    62
2nd    41
3rd    25
Crew   24

How else might we visually compare these sets of numbers?
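The `drop` argument matters because subsetting a one-column matrix otherwise collapses it to a plain vector, losing the column name. A small sketch, using a hand-built matrix `m` (a hypothetical stand-in holding the values from the table above):

```r
# One-column matrix with the percentages from the table
m <- matrix(c(62, 41, 25, 24), ncol = 1,
            dimnames = list(c("1st", "2nd", "3rd", "Crew"), "% survived"))
idx <- order(m, decreasing = TRUE)

collapsed <- m[idx, ]                # default drop = TRUE: becomes a named vector
kept      <- m[idx, , drop = FALSE]  # stays a 4 x 1 matrix with its column name

is.matrix(collapsed)
is.matrix(kept)
```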

The Titanic

As lengths of bars, colour coded (and labelled) by class:

    nvals <- nrow(newTable)
    cols <- rainbow(nvals, alpha = 0.5)
    barplot(newTable, col = cols, horiz = TRUE,
            names.arg = c(""), axes = FALSE,
            xlab = colnames(newTable))
    # Right-hand edge of each stacked segment
    xlocs <- cumsum(newTable)
    # Midpoint of each segment, used to place the labels
    centres <- c(xlocs[1] / 2,
                 xlocs[1:(nvals - 1)] + diff(xlocs) / 2)
    text(centres, 0.75, labels = rownames(newTable))

[Figure: a single horizontal stacked bar with segments labelled 1st, 2nd, 3rd, Crew; x-axis: % survived]

which compares lengths along a common NON-aligned scale.
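The label positions are the midpoints of the stacked segments. The sketch below works through that arithmetic with the percentages from the table (62, 41, 25, 24) and checks it against an equivalent one-line form; the names `segmentLengths`, `rightEdges`, `centresA` and `centresB` are introduced here for illustration.

```r
# Segment lengths taken from the "% survived" table
segmentLengths <- c("1st" = 62, "2nd" = 41, "3rd" = 25, "Crew" = 24)
n <- length(segmentLengths)

# Right-hand edge of each segment along the stacked bar
rightEdges <- cumsum(segmentLengths)

# Midpoints as computed in the slide
centresA <- c(rightEdges[1] / 2,
              rightEdges[1:(n - 1)] + diff(rightEdges) / 2)

# Equivalent form: right edge minus half the segment's own length
centresB <- rightEdges - segmentLengths / 2

unname(centresA)
unname(centresB)
```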

The Titanic

    barplot(newTable, col = cols, horiz = TRUE,
            beside = TRUE,
            names.arg = c(""),
            xlab = colnames(newTable),
            legend.text = rownames(newTable))

[Figure: four horizontal bars drawn side by side for Crew, 3rd, 2nd and 1st, with a legend by class; x-axis: % survived, from 0 to 60]

which compares lengths along a common ALIGNED scale.

The Titanic

Survival and not surviving

    survivalProportions <- classTable
    survivalProportions["Yes", ] <- survivalProportions["Yes", ] / classTotals
    survivalProportions["No", ] <- survivalProportions["No", ] / classTotals
    survivalCols <- adjustcolor(c("black", "grey"), 0.5)
    barplot(survivalProportions, col = survivalCols,
            horiz = TRUE, beside = TRUE,
            xlab = "Proportion of class",
            xlim = c(0, 1))
    legend("bottomright", title = "Survival",
           fill = survivalCols,
           legend = rownames(survivalProportions))

[Figure: horizontal bars for No and Yes drawn side by side within each class (Crew, 3rd, 2nd, 1st), with a legend titled Survival; x-axis: Proportion of class, from 0.0 to 1.0]

The Titanic

Survival and not surviving; frame

    barplot(survivalProportions, col = survivalCols,
            horiz = TRUE, beside = FALSE,
            xlab = "Proportion of class",
            space = 0)

[Figure: one horizontal stacked bar per class (Crew, 3rd, 2nd, 1st) with no gaps between bars; x-axis: Proportion of class, from 0.0 to 1.0]

Both are again along a common but non-aligned scale, but now bars to be compared are closer and we have the positive effect of framing.

Warning – problems with stacked bars

Bars placed side by side are pretty natural in some contexts, for example when the horizontal axis (and bar width) represents time. For example, consider the following "sleep telemetry chart":

[Figure: sleep telemetry chart. Source: www.trixietracker.com/tour/sleep/]

Yellow corresponds to when the baby is awake, blue when they are asleep.

But take care when these bars are stacked on top of each other (as above; or placed side by side if arranged vertically). Look what happens for many, many stacked bars (and many bars in each).

Warning – problems with stacked bars

Take care when placing bars of stacked colours side by side. For example:

[Figure]

Horizontal lines look crooked.

Warning – problems with stacked bars

Warning – problems with stacked bars

Even when the rectangles are the same size, unintended visual effects can be introduced.

[Figure]

All lines are perfectly horizontal! This is called the "cafe wall illusion" after a cafe in Bristol, England.

Aside – The cafe wall illusion

Take care when placing bars of stacked colours side by side or you might induce unintended visual variation.

[Figure: Cafe on St. Michael's Hill in Bristol, England]

The Titanic - Number of passengers by class

    barplot(apply(classTable, MARGIN = 2, FUN = sum),
            col = adjustcolor("steelblue", 0.5),
            xlab = "Class", ylab = "Number of passengers")

[Figure: bar chart; x-axis: Class (1st, 2nd, 3rd, Crew); y-axis: Number of passengers, from 0 to 800]

The Titanic - Number who died in each class

    barplot(classTable["No", ],
            col = survivalCols[1],
            xlab = "Class", ylab = "Number of passengers")

[Figure: bar chart; x-axis: Class (1st, 2nd, 3rd, Crew); y-axis: Number of passengers, from 0 to 600]

The Titanic - Number who survived in each class

    barplot(classTable["Yes", ],
            col = survivalCols[2],
            xlab = "Class", ylab = "Number of passengers")

[Figure: bar chart; x-axis: Class (1st, 2nd, 3rd, Crew); y-axis: Number of passengers, from 0 to 200]

The Titanic - The proportion of deaths in each class

    barplot(classTable,
            col = survivalCols,
            xlab = "Class", ylab = "Number of passengers")

[Figure: stacked bar chart of deaths and survivors; x-axis: Class (1st, 2nd, 3rd, Crew); y-axis: Number of passengers, from 0 to 800]

The Titanic

    savePar <- par(mfrow = c(1, 3))
    barplot(apply(classTable, MARGIN = 2, FUN = sum),
            col = adjustcolor("steelblue", 0.5),
            ylim = c(0, 1000),  # ensure common scale
            xlab = "Class", ylab = "Number of passengers")
    barplot(classTable["No", ],
            col = survivalCols[1],
            ylim = c(0, 1000),  # ensure common scale
            main = "Died",
            xlab = "Class", ylab = "Number of passengers")
    barplot(classTable["Yes", ],
            col = survivalCols[2],
            ylim = c(0, 1000),  # ensure common scale
            main = "Survived",
            xlab = "Class", ylab = "Number of passengers")
    par(savePar)

The Titanic

Comparing counts

[Figure: three bar charts side by side (total, "Died", "Survived"); each has x-axis: Class (1st, 2nd, 3rd, Crew) and y-axis: Number of passengers on a common scale from 0 to 1000]

We can easily compare the number in each class.
Common aligned scales.
Position, length and area redundantly encode the values.
It is easier to compare the "shapes" of the distributions as well.
Again, the "Died" shape looks fairly similar to the total, except perhaps for 1st and 2nd classes.
(Differences are easier to tell in framed versions.)
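The common scale need not be hard-coded as 1000. One option, sketched below, is to derive the limit from the data so that every panel shares it. The names `commonYlim` and `plotCounts` are hypothetical helpers introduced for this illustration, not part of the original slides.

```r
# Counts by survival and class, built as in the earlier slides
classTable <- apply(Titanic, MARGIN = c(4, 1), FUN = sum)
classTotals <- colSums(classTable)

# The tallest bar in any panel is the largest class total;
# pretty() rounds the upper limit up to a convenient axis value
commonYlim <- c(0, max(pretty(c(0, classTotals))))

# Hypothetical helper: one panel drawn on the shared scale
plotCounts <- function(counts, title) {
  barplot(counts, ylim = commonYlim, main = title,
          xlab = "Class", ylab = "Number of passengers")
}
```

Calling `plotCounts(classTotals, "Total")`, `plotCounts(classTable["No", ], "Died")` and `plotCounts(classTable["Yes", ], "Survived")` after `par(mfrow = c(1, 3))` would then draw three panels on the same aligned scale.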

The Titanic

Comparing shapes - no common scale

    savePar <- par(mfrow = c(1, 3))
    barplot(apply(classTable, MARGIN = 2, FUN = sum),
            col = adjustcolor("steelblue", 0.5),
            # NO COMMON SCALE
            main = "Total",
            xlab = "Class", ylab = "Number of passengers")
    barplot(classTable["No", ],
            col = survivalCols[1],
            # NO COMMON SCALE
            main = "Died",
            xlab = "Class", ylab = "Number of passengers")
    barplot(classTable["Yes", ],
            col = survivalCols[2],
            # NO COMMON SCALE
            main = "Survived",
            xlab = "Class", ylab = "Number of passengers")
    par(savePar)

The Titanic

Comparing shapes - no common scale

[Figure: three bar charts side by side titled Total, Died and Survived; each has x-axis: Class (1st, 2nd, 3rd, Crew) and y-axis: Number of passengers, each panel on its own scale]

Different scaling makes it easier to compare the "shapes" of the distributions but harder to compare the actual values.
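One way to make "shape" concrete, offered here only as a sketch and not taken from the slides, is to convert each set of counts to proportions of its own total. The three distributions then sit on a common 0 to 1 scale while the actual counts are set aside. The name `shapes` is introduced for illustration.

```r
# Counts by survival and class, built as in the earlier slides
classTable <- apply(Titanic, MARGIN = c(4, 1), FUN = sum)

# Each row is the share of each class within that group of people
shapes <- rbind(
  Total    = prop.table(colSums(classTable)),
  Died     = prop.table(classTable["No", ]),
  Survived = prop.table(classTable["Yes", ])
)
round(shapes, 2)
```

Each row sums to one. For instance, 1st class makes up about 15% of everyone aboard (325 of 2201) but about 29% of the survivors (203 of 711).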

South African heart disease

Here we will look at a dataset `SAheart` from the package `ElemStatLearn`. It is a sample from a retrospective study of heart disease in males from a high-risk region of the Western Cape, South Africa.

There are 462 cases and 10 variates (see `help(SAheart, package="ElemStatLearn")` for details). For example, `sbp` is the measured systolic blood pressure, which is the blood pressure when the heart pumps, `chd` is 1 if the patient has coronary heart disease, and `famhist` indicates whether or not the patient has a family history of heart disease.

    library(ElemStatLearn)
    kable(head(SAheart))

sbp  tobacco  ldl   adiposity  famhist  typea  obesity  alcohol  age  chd
160  12.00    5.73  23.11      Present  49     25.30    97.20    52   1
144   0.01    4.41  28.61      Absent   55     28.87     2.06    63   1
118   0.08    3.48  32.28      Present  52     29.14     3.81    46   0
170   7.50    6.41  38.03      Present  51     31.99    24.26    58   1
134  13.60    3.50  27.78      Present  60     25.99    57.34    49   1
132   6.20    6.47  36.21      Present  62     30.77    14.14    45   0
