CSSS 569 Visualizing Data and Models
Lab 4: Advanced ggplot2
Kai Ping (Brian) Leung
Department of Political Science, UW
January 30, 2020

Introduction
◮ Recap of what we covered last week
  ◮ Making a scatterplot from scratch in ggplot2 (from Chris's slides)
    1. Decide on dimensions: aspect ratio, axis limits
    2. Add axis labels, plot titles
    3. Choose data markers: points, symbols, text
    4. Scaling & transformation, add ticks if needed
    5. Choose a color palette
    6. Add annotations: labels, arrows, notes
    7. Add best-fit line(s) & confidence intervals
    8. Add extra plots (e.g., rugs) to make a confection
    9. Repeat as small multiples (facet_grid and facet_wrap)
    ◮ Next week we'll implement them using tile
  ◮ Unpack the inner workings of ggplot2
    ◮ data, aes(...), geom(..., inherit.aes = TRUE)
  ◮ Customized theme: theme_cavis.R
  ◮ Exercise to reproduce a graph
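As a reminder of how data, aes(...), and inherit.aes interact, here is a minimal sketch. The data frames toy and notes and their columns are made up for illustration and are not part of the lab materials. The first layer inherits everything from ggplot(); the second supplies its own data and mappings and switches inheritance off.

```r
library(ggplot2)

# Toy data, invented for this sketch (not from the lab)
toy <- data.frame(
  x = 1:5,
  y = c(2, 4, 3, 6, 5),
  group = c("a", "a", "b", "b", "b")
)
notes <- data.frame(px = 3, py = 6, label = "a note")

p <- ggplot(toy, aes(x = x, y = y, color = group)) +
  # Inherits data, x, y, and color from ggplot()
  geom_point() +
  # Uses its own data and mappings; inherit.aes = FALSE stops ggplot2
  # from looking for x, y, and group columns in `notes`
  geom_text(
    data = notes,
    aes(x = px, y = py, label = label),
    inherit.aes = FALSE
  )
```

With inherit.aes left at its default of TRUE, the text layer would also try to use the plot-level mappings, which would likely fail here because notes has no x, y, or group columns.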

Roadmap for today
Today's lab is structured around three exercises:
[Figure: predicted probability of voting for Clinton, Perot, and Bush (one panel each) by ideological self-placement (1 to 7, from very liberal to very conservative), for white and non-white respondents]
[Figure: ropeladder plot of first differences in predicted probabilities of winning the Cy Young award for era, walks, strikeout, innings, and winpct, comparing Model 1 and Model 2; axis from -75% to 50%]
[Figure: heatmap of the incidence of measles in the US by state, 1930 to 2000, in cases per 100,000 people, binned as 0, 0-1, 1-10, 10-100, 100-500, 500-1000, >1000, and NA]

Roadmap for today
1. Last exercise: 1992 Presidential Election
  ◮ Use of scale_{...}
  ◮ Use of facet_grid and facet-specific labels
2. Ropeladder exercise: Cy Young award
  ◮ pivot_longer and pivot_wider
  ◮ Sorting using fct_reorder
  ◮ Use of scale_{...}
3. Heatmap exercise: Measles in US
  ◮ Use of geom_tile and various ways to scale_color/fill_{...}
4. Highlight ggplot2 extension packages (See more here)
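For readers who have not met the reshaping and sorting tools named in the roadmap, here is a small sketch of pivot_longer, fct_reorder, and pivot_wider. The data frame wide and its numbers are hypothetical and are not the lab's Cy Young estimates.

```r
library(tidyr)
library(forcats)

# Hypothetical first differences for two models (numbers are invented)
wide <- data.frame(
  covariate = c("era", "walks", "innings"),
  model1 = c(-0.50, -0.10, 0.20),
  model2 = c(-0.40, -0.05, 0.30)
)

# Wide to long: one row per covariate-model pair
long <- pivot_longer(
  wide,
  cols = c(model1, model2),
  names_to = "model",
  values_to = "fd"
)

# Order the covariates by their mean first difference across models
long$covariate <- fct_reorder(factor(long$covariate), long$fd, .fun = mean)

# Long back to wide: one column per model again
back <- pivot_wider(long, names_from = model, values_from = fd)
```

The long format is what ggplot2 generally expects when model is mapped to an aesthetic such as color, and the reordered factor controls the order in which covariates appear along an axis.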

Last exercise: 1992 Presidential Election
[Figure: predicted probability of voting for Clinton, Perot, and Bush (one panel each) by ideological self-placement (1 to 7, from very liberal to very conservative), for white and non-white respondents]

Last exercise: motivation
◮ There are many ways to do small multiples:
  ◮ plot + facet_grid(nonwhite ~ vote92)
[Figure: predicted probability of voting by ideological self-placement (from very liberal to very conservative) in a 2 x 3 grid, with rows Non-white and White and columns Clinton, Perot, and Bush]
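The facet_grid formula reads rows ~ columns. The sketch below contrasts the two layouts on placeholder data. The column names follow the lab, but the data frame toy and its constant pe value are invented for illustration.

```r
library(ggplot2)

# Placeholder predictions using the lab's column names (values are invented)
toy <- expand.grid(
  rlibcon = 1:7,
  nonwhite = factor(c(0, 1)),
  vote92 = factor(
    c("Clinton", "Perot", "Bush"),
    levels = c("Clinton", "Perot", "Bush")
  )
)
toy$pe <- 0.5

base <- ggplot(toy, aes(x = rlibcon, y = pe)) +
  geom_line()

# Rows by race, columns by candidate: six panels
grid_2d <- base + facet_grid(nonwhite ~ vote92)

# Columns only: three panels, with race mapped to color instead
grid_1d <- base + aes(color = nonwhite) + facet_grid(~ vote92)
```

The one-row layout places both groups in the same panel, which is what makes overlap between them visible.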

Last exercise: motivation
◮ Thoughtful juxtaposition facilitates meaningful comparison and provokes further inquiry
◮ Sometimes, overlap in the data might be the interesting phenomenon...
[Figure: predicted probability of voting for Clinton, Perot, and Bush (one panel each) by ideological self-placement (from very liberal to very conservative), with white and non-white respondents drawn in the same panels]

Last exercise: 1992 Presidential Election
# Prerequisite
# Load packages
library(tidyverse)
library(RColorBrewer)

# Load data
presVoteEV <- read_csv("data/presVoteEV.csv")

# Load theme
source("theme/theme_cavis.R")

# Get nice colors
brewer <- brewer.pal(9, "Set1")
blue <- brewer[2]
orange <- brewer[5]

Last exercise: 1992 Presidential Election
# Factorize variables
presVoteEV <- presVoteEV %>%
  mutate(
    nonwhite = factor(nonwhite),
    vote92 = factor(vote92, levels = c("Clinton", "Perot", "Bush"))
  )
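The explicit levels matter because ggplot2 orders facets and legend keys by factor level, and factor() sorts levels alphabetically unless told otherwise. A small sketch with a made-up vote vector (not the lab data):

```r
# Made-up votes, for illustration only
vote <- c("Perot", "Bush", "Clinton", "Bush")

# Default: levels are sorted alphabetically
default_order <- factor(vote)
levels(default_order)

# Explicit levels fix the order that facets and legends will follow
slide_order <- factor(vote, levels = c("Clinton", "Perot", "Bush"))
levels(slide_order)
```

Only the level order changes; the underlying values are the same in both factors.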

Last exercise: 1992 Presidential Election
p <- ggplot(presVoteEV, aes(x = rlibcon, y = pe, ymin = lower, ymax = upper,
                            color = nonwhite, fill = nonwhite)) +
  facet_grid(~ vote92) +
  geom_line() +
  theme_cavis_hgrid
print(p)
[Figure: three panels (Clinton, Perot, Bush) plotting pe (0.00 to 1.00) against rlibcon, with lines colored by nonwhite (legend keys 0 and 1)]

Last exercise: 1992 Presidential Election
p <- p +
  scale_color_manual(values = c(blue, orange), labels = c("White", "Non-white"))
print(p)
[Figure: three panels (Clinton, Perot, Bush) plotting pe against rlibcon; legend keys now read White and Non-white]

Last exercise: 1992 Presidential Election
p +
  geom_ribbon(alpha = 0.5, show.legend = FALSE) +
  scale_fill_manual(values = c(blue, orange))
[Figure: three panels (Clinton, Perot, Bush) plotting pe against rlibcon, with ribbons added; legend White and Non-white]

Last exercise: 1992 Presidential Election
p +
  geom_ribbon(alpha = 0.5, linetype = 0, show.legend = FALSE) +
  scale_fill_manual(values = c(blue, orange))
[Figure: three panels (Clinton, Perot, Bush) plotting pe against rlibcon, with ribbons; legend White and Non-white]

Last exercise: 1992 Presidential Election
p <- p +
  geom_ribbon(alpha = 0.5, linetype = 0, show.legend = FALSE) +
  scale_fill_manual(values = c(blue, NA))
print(p)
[Figure: three panels (Clinton, Perot, Bush) plotting pe against rlibcon, with ribbons; legend White and Non-white]
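The NA in values is what leaves one group's ribbon unfilled. The sketch below isolates that step on invented data. It also passes na.value = NA explicitly, on the assumption (not stated in the slides) that some ggplot2 versions substitute the scale's na.value for NA entries in values and would otherwise draw that ribbon in grey.

```r
library(ggplot2)

# Invented interval bounds for two groups (not the lab data)
toy <- data.frame(
  x = rep(1:3, times = 2),
  lower = rep(c(0.2, 0.5), each = 3),
  upper = rep(c(0.4, 0.7), each = 3),
  nonwhite = factor(rep(c(0, 1), each = 3))
)

p_fill <- ggplot(toy, aes(x = x, ymin = lower, ymax = upper, fill = nonwhite)) +
  geom_ribbon(alpha = 0.5, linetype = 0) +
  # First level gets a fill; second level gets none.
  # na.value = NA is a defensive assumption about version behavior.
  scale_fill_manual(values = c("steelblue", NA), na.value = NA)

# Inspect the fill that each group actually receives
fills <- layer_data(p_fill)[, c("group", "fill")]
```

Checking layer_data() like this is a quick way to confirm which colors a manual scale assigned without having to read them off the rendered plot.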

Last exercise: 1992 Presidential Election
p +
  geom_line(aes(y = upper)) +
  geom_line(aes(y = lower))
[Figure: three panels (Clinton, Perot, Bush) plotting pe against rlibcon, with lines added at the upper and lower bounds; legend White and Non-white]

Last exercise: 1992 Presidential Election
p <- p +
  geom_line(aes(y = upper, linetype = nonwhite), show.legend = FALSE) +
  geom_line(aes(y = lower, linetype = nonwhite), show.legend = FALSE) +
  scale_linetype_manual(values = c(0, 2))  # 0 = blank; 2 = dashed
print(p)
[Figure: three panels (Clinton, Perot, Bush) plotting pe against rlibcon; legend White and Non-white]

Last exercise: 1992 Presidential Election
p <- p +
  scale_x_continuous(breaks = 1:7) +
  scale_y_continuous(breaks = seq(0, 1, 0.2), limits = c(0, 1), expand = c(0, 0))
print(p)
[Figure: three panels (Clinton, Perot, Bush) plotting pe against rlibcon; y-axis ticks from 0.0 to 1.0 in steps of 0.2, x-axis ticks from 1 to 7; legend White and Non-white]
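To see what each argument in these scale calls does, here is the same pair of scales applied to a made-up line (the toy data frame is an illustration, not the lab data).

```r
library(ggplot2)

# Made-up probabilities on a 7-point scale, for illustration only
toy <- data.frame(rlibcon = 1:7, pe = seq(0.1, 0.7, by = 0.1))

p_axes <- ggplot(toy, aes(x = rlibcon, y = pe)) +
  geom_line() +
  # One tick per point on the 7-point scale
  scale_x_continuous(breaks = 1:7) +
  # Ticks every 0.2, axis fixed to [0, 1], no padding beyond the limits
  scale_y_continuous(
    breaks = seq(0, 1, 0.2),
    limits = c(0, 1),
    expand = c(0, 0)
  )

# Limits recorded on the y scale
y_limits <- layer_scales(p_axes)$y$get_limits()
```

One thing to keep in mind: by default, scale limits turn values outside the range into missing values before drawing, whereas coord_cartesian(ylim = c(0, 1)) zooms without dropping data. For probabilities bounded in [0, 1] the two should give the same picture.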

Last exercise: 1992 Presidential Election
p <- p +
  theme(legend.position = c(0.06, 0.13),
        legend.key.size = unit(0.2, "cm")) +
  labs(y = "Predicted prob. of voting", x = "Ideological self-placement")
print(p)
[Figure: three panels (Clinton, Perot, Bush) with y-axis "Predicted prob. of voting" (0.0 to 1.0) and x-axis "Ideological self-placement" (1 to 7); legend White and Non-white]
