AMOUNTS (MPA 635: Data Visualization, September 25, 2018)

SLIDE 1

AMOUNTS

MPA 635: Data Visualization September 25, 2018

SLIDE 2

PLAN FOR TODAY

More on truth
Amounts
Verbs
Live example

SLIDE 3

MORE ON TRUTH

SLIDE 4

DATA AND WHITE LIES

“I secretly wonder if I'm a righteous dude, is it OK for me to sort of maybe possibly mislead people so they pursue a more righteous policy?”

Anonymous MPA 635 student

SLIDE 5

IS NUDGING OKAY?

SLIDE 6

WHO DEFINES “GOOD”?

SLIDE 7

DON’T MESS WITH DATA

You can push people towards policy outcomes, but don't distort data to do it.

“Lies, damned lies, and statistics”

↑ Don’t perpetuate this ↑

SLIDE 8

AMOUNTS

SLIDE 9

PROBLEMS WITH BAR PLOTS

SLIDE 10

#barbarplots

SLIDE 11

BAR PLOTS AND SUMMARY STATS

SLIDE 12

GENERAL RULES

More data = better
Show actual points
Don’t use bars for summary stats
The end of the bar is often all that matters
Lollipops, points, heatmaps
Counts okay, but there are better solutions
Always start at zero!
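One way to read the rules about showing actual points and preferring lollipops to bars is sketched below with ggplot2. The `scores` data frame and its values are made up for illustration and are not from the lecture; the choice of geoms is an assumption about what such a plot could look like, not the plot shown in class.

```r
library(ggplot2)

# Toy data: made-up values, for illustration only
scores <- data.frame(
  group = rep(c("A", "B", "C"), each = 4),
  value = c(3, 5, 4, 8, 6, 7, 9, 5, 2, 4, 3, 6)
)

# Group means, computed with base R
means <- aggregate(value ~ group, data = scores, FUN = mean)

p <- ggplot(scores, aes(x = group, y = value)) +
  # Show the actual points, jittered sideways so overlaps stay visible
  geom_jitter(width = 0.1, height = 0, alpha = 0.5) +
  # Lollipop: a thin stem from zero up to the mean, with a point at the end
  geom_segment(data = means,
               aes(x = group, xend = group, y = 0, yend = value)) +
  geom_point(data = means, size = 4) +
  # Make sure zero is included on the y-axis
  expand_limits(y = 0)

print(p)
```

The stem carries little ink, so the eye goes to the end point (the mean) and to the raw observations around it.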

SLIDE 13

VERBS

SLIDE 14

MOST COMMON VERBS

filter()

Choose rows based on conditions

select()

Choose (and rename) columns

mutate()

Add column (or change existing column)

group_by()

Make subgroups based on a column

summarize()

Calculate summary statistics for groups
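The five verbs above are usually chained with the pipe. The sketch below strings them together on a small hand-made tibble named `toy` that mimics gapminder's column layout; the numbers are invented for illustration, and `total_pop_million` is a hypothetical column name.

```r
library(dplyr)

# Toy data in gapminder's column layout; the numbers are made up
toy <- tibble(
  country   = c("A", "A", "B", "B", "C", "C"),
  continent = c("Asia", "Asia", "Asia", "Asia", "Europe", "Europe"),
  year      = c(1962, 1967, 1962, 1967, 1962, 1967),
  lifeExp   = c(38, 41, 50, 53, 69, 70),
  pop       = c(2e6, 3e6, 10e6, 12e6, 5e6, 6e6)
)

result <- toy %>%
  filter(year == 1967) %>%                      # keep rows for one year
  select(country, continent, lifeExp, pop) %>%  # keep only the needed columns
  mutate(pop_million = pop / 1000000) %>%       # add a derived column
  group_by(continent) %>%                       # one subgroup per continent
  summarize(avg_lifeexp = mean(lifeExp),        # one summary row per subgroup
            total_pop_million = sum(pop_million),
            num_countries = n())

result
```

Each step receives the data frame produced by the previous one, so the pipeline reads top to bottom in the order the operations happen.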

SLIDE 15

FILTER

gapminder %>% filter(year == 1967)

SLIDE 16

FILTER

gapminder %>% filter(lifeExp < 40)

SLIDE 17

FILTER

gapminder %>% filter(continent == "Asia", lifeExp < 40)
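When filter() gets several conditions separated by commas, a row must satisfy all of them, which is the same as joining them with `&`. For "either condition", use `|`. A small sketch on made-up data (the `toy` tibble is hypothetical, not gapminder):

```r
library(dplyr)

# Toy data; the values are made up for illustration
toy <- tibble(
  country   = c("A", "B", "C", "D"),
  continent = c("Asia", "Asia", "Europe", "Africa"),
  lifeExp   = c(38, 55, 70, 39)
)

# Comma-separated conditions: both must be true
both <- toy %>% filter(continent == "Asia", lifeExp < 40)

# The same filter written with an explicit AND
same <- toy %>% filter(continent == "Asia" & lifeExp < 40)

# OR: rows that meet at least one of the conditions
either <- toy %>% filter(continent == "Asia" | lifeExp < 40)
```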

SLIDE 18

SELECT

gapminder %>% select(country, year, pop)

SLIDE 19

MUTATE

gapminder %>% mutate(something_new = 5)

SLIDE 20

MUTATE

gapminder %>% mutate(pop_million = pop / 1000000)

SLIDE 21

MUTATE

gapminder %>% mutate(lifeExp_binary = ifelse(lifeExp < 40, "Very low", "Not very low"))

SLIDE 22

GROUP_BY + SUMMARIZE

gapminder %>%
  group_by(continent) %>%
  summarize(avg_lifeexp = mean(lifeExp),
            median_lifeexp = median(lifeExp),
            num_countries = n())

SLIDE 23

GROUP_BY + SUMMARIZE

gapminder %>%
  group_by(continent, year) %>%
  summarize(avg_lifeexp = mean(lifeExp),
            median_lifeexp = median(lifeExp),
            num_countries = n())

SLIDE 24

OTHER HELPFUL VERBS

arrange()

Sort a data frame by a column

left_join()

Merge two data frames by column(s)

count()

Shortcut for group_by() %>% summarize(n = n())

gather()

Make a data frame long

spread()

Make a data frame wide
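A rough sketch of these verbs in use, on a small made-up tibble rather than gapminder. The `labels` lookup table and its `label` column are hypothetical. In tidyr 1.0.0 and later, gather() and spread() still work but are superseded by pivot_longer() and pivot_wider().

```r
library(dplyr)
library(tidyr)

# Toy data; the values are made up for illustration
toy <- tibble(
  country   = c("A", "A", "B", "B"),
  continent = c("Asia", "Asia", "Europe", "Europe"),
  year      = c(1962, 1967, 1962, 1967),
  lifeExp   = c(38, 41, 69, 70)
)

# arrange(): sort rows, here from highest to lowest life expectancy
sorted <- toy %>% arrange(desc(lifeExp))

# count() gives the same counts as group_by() + summarize(n = n())
counted <- toy %>% count(continent)
by_hand <- toy %>% group_by(continent) %>% summarize(n = n())

# left_join(): attach columns from a (hypothetical) lookup table
labels <- tibble(continent = c("Asia", "Europe"),
                 label     = c("group 1", "group 2"))
joined <- toy %>% left_join(labels, by = "continent")

# spread(): make the data wide, one column per year
wide <- toy %>% spread(key = year, value = lifeExp)

# gather(): make it long again; year comes back as a character column
long <- wide %>% gather(key = "year", value = "lifeExp", `1962`, `1967`)
```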

SLIDE 25

LIVE EXAMPLE