scatter plots

Scatter plots: Introduction to Data Visualization with ggplot2



  1. Scatter plots
     Introduction to Data Visualization with ggplot2
     Rick Scavetta, Founder, Scavetta Academy

  2. 48 geometries (geom_*)
     abline, area, bar, bin2d, blank, boxplot, col, contour, count, crossbar,
     curve, density, density2d, density_2d, dotplot, errorbar, errorbarh,
     freqpoly, hex, histogram, hline, jitter, label, line, linerange, map,
     path, point, pointrange, polygon, qq, qq_line, quantile, raster, rect,
     ribbon, rug, segment, sf, sf_label, sf_text, smooth, spoke, step, text,
     tile, violin, vline

  3. Common plot types

     Plot type       Possible Geoms
     Scatter plots   points, jitter, abline, smooth, count

  4. Scatter plots

         ggplot(iris, aes(x = Sepal.Length, y = Sepal.Width)) +
           geom_point()

     Each geom can accept specific aesthetic mappings, e.g. geom_point():

     Essential   x, y

  5. Scatter plots

         ggplot(iris, aes(x = Sepal.Length, y = Sepal.Width, col = Species)) +
           geom_point()

     Each geom can accept specific aesthetic mappings, e.g. geom_point():

     Essential   x, y
     Optional    alpha, color, fill, shape, size, stroke

  6. Geom-specific aesthetic mappings

         # These result in the same plot!
         ggplot(iris, aes(x = Sepal.Length, y = Sepal.Width, col = Species)) +
           geom_point()

     Control aesthetic mappings of each layer independently:

         ggplot(iris, aes(x = Sepal.Length, y = Sepal.Width)) +
           geom_point(aes(col = Species))

  7. Raw data and summary statistics

         head(iris, 3) # Raw data

           Species Sepal.Length Sepal.Width Petal.Length Petal.Width
         1  setosa          5.1         3.5          1.4         0.2
         2  setosa          4.9         3.0          1.4         0.2
         3  setosa          4.7         3.2          1.3         0.2

         iris %>%
           group_by(Species) %>%
           summarise_all(mean) -> iris.summary

         iris.summary # Summary statistics

         # A tibble: 3 x 5
           Species    Sepal.Length Sepal.Width Petal.Length Petal.Width
           <fct>             <dbl>       <dbl>        <dbl>       <dbl>
         1 setosa             5.01        3.43         1.46       0.246
         2 versicolor         5.94        2.77         4.26       1.33
         3 virginica          6.59        2.97         5.55       2.03
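
Recent dplyr releases mark `summarise_all()` as superseded. The following is a minimal sketch of the same per-species means using `across()`, assuming dplyr 1.0 or later is installed. The name `iris_summary_across` is illustrative and does not appear in the slides.

```r
library(dplyr)

# Mean of every non-grouping column, per species.
# everything() skips the grouping variable (Species) automatically.
iris_summary_across <- iris %>%
  group_by(Species) %>%
  summarise(across(everything(), mean))

print(iris_summary_across)
```

The result should have one row per species, like `iris.summary` on the slide.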

  8. Adding a layer with different data

         ggplot(iris, aes(x = Sepal.Length, y = Sepal.Width, col = Species)) +
           # Inherits both data and aes from ggplot()
           geom_point() +
           # Different data, but inherited aes
           geom_point(data = iris.summary, shape = 15, size = 5)

  9. Shape attribute values

  10. Example

          ggplot(iris, aes(x = Sepal.Length, y = Sepal.Width, col = Species)) +
            geom_point() +
            geom_point(data = iris.summary, shape = 21, size = 5,
                       fill = "black", stroke = 2)

  11. On-the-fly stats by ggplot2
      See the second course for the stats layer.
      Note: Avoid plotting only the mean without a measure of spread, e.g. the standard deviation.
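
One way to show a mean together with its spread is to let ggplot2 compute both. The sketch below is only an illustration of the note above, not the approach taught in the course: it assumes `stat_summary()` with a user-supplied summary function, and the helper name `mean_sd` is hypothetical.

```r
library(ggplot2)

# Hypothetical helper: returns the mean with a one-standard-deviation
# interval in the y / ymin / ymax columns that a pointrange geom expects.
mean_sd <- function(v) {
  data.frame(y = mean(v), ymin = mean(v) - sd(v), ymax = mean(v) + sd(v))
}

# One point per species at the mean, with a line spanning mean +/- 1 SD.
p_spread <- ggplot(iris, aes(x = Species, y = Sepal.Width)) +
  stat_summary(fun.data = mean_sd, geom = "pointrange")

p_spread
```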

  12. position = "jitter"

          ggplot(iris, aes(x = Sepal.Length, y = Sepal.Width, col = Species)) +
            geom_point(position = "jitter")

  13. geom_jitter()
      A shortcut to geom_point(position = "jitter")

          ggplot(iris, aes(x = Sepal.Length, y = Sepal.Width, col = Species)) +
            geom_jitter()

  14. Don't forget to adjust alpha
      Combine jittering with alpha-blending if necessary

          ggplot(iris, aes(x = Sepal.Length, y = Sepal.Width, col = Species)) +
            geom_jitter(alpha = 0.6)

  15. Hollow circles also help
      shape = 1 is a hollow circle.
      It is not necessary to also use alpha-blending.

          ggplot(iris, aes(x = Sepal.Length, y = Sepal.Width, col = Species)) +
            geom_jitter(shape = 1)

  16. Let's practice!

  17. Histograms

  18. Common plot types

      Plot type       Possible Geoms
      Scatter plots   points, jitter, abline, smooth, count
      Bar plots       histogram, bar, col, errorbar
      Line plots      line, path

  19. Histograms

          ggplot(iris, aes(x = Sepal.Width)) +
            geom_histogram()

      A plot of binned values, i.e. a statistical function.

          `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.

  20. Default of 30 even bins

          ggplot(iris, aes(x = Sepal.Width)) +
            geom_histogram()

      A plot of binned values, i.e. a statistical function.

          # Default bin width:
          diff(range(iris$Sepal.Width))/30
          [1] 0.08

  21. Intuitive and meaningful bin widths

          ggplot(iris, aes(x = Sepal.Width)) +
            geom_histogram(binwidth = 0.1)

      Always set a meaningful bin width for your data.
      No spaces between bars.

  22. Re-position tick marks

          ggplot(iris, aes(x = Sepal.Width)) +
            geom_histogram(binwidth = 0.1, center = 0.05)

      Always set a meaningful bin width for your data.
      No spaces between bars.
      X axis labels are between bars.
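
With `binwidth = 0.1` and `center = 0.05`, each bin spans its center plus or minus 0.05, so the bin edges should land on multiples of 0.1 and the axis labels sit between bars. A rough way to check this is to inspect the computed bins with `layer_data()`; the names `p_hist` and `bins` are illustrative and not from the slides.

```r
library(ggplot2)

p_hist <- ggplot(iris, aes(x = Sepal.Width)) +
  geom_histogram(binwidth = 0.1, center = 0.05)

# layer_data() returns the values computed by the stat, one row per bin.
bins <- layer_data(p_hist)

# Lower edge, upper edge, and number of observations for the first bins.
head(bins[, c("xmin", "xmax", "count")])
```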

  23. Different Species

          ggplot(iris, aes(x = Sepal.Width, fill = Species)) +
            geom_histogram(binwidth = .1, center = 0.05)

  24. Default position is "stack"

          ggplot(iris, aes(x = Sepal.Width, fill = Species)) +
            geom_histogram(binwidth = .1, center = 0.05, position = "stack")

  25. position = "dodge"

          ggplot(iris, aes(x = Sepal.Width, fill = Species)) +
            geom_histogram(binwidth = .1, center = 0.05, position = "dodge")

  26. position = "fill"

          ggplot(iris, aes(x = Sepal.Width, fill = Species)) +
            geom_histogram(binwidth = .1, center = 0.05, position = "fill")

  27. Final Slide

  28. Bar plots

  29. Bar Plots, with a categorical X-axis
      Use geom_bar() or geom_col()

      Geom         Stat         Action
      geom_bar()   "count"      Counts the number of cases at each x position
      geom_col()   "identity"   Plot actual values

      All positions from before are available.
      Two types: absolute counts, distributions.

  30. Bar Plots, with a categorical X-axis
      Use geom_bar() or geom_col()

      Geom         Stat         Action
      geom_bar()   "count"      Counts the number of cases at each x position
      geom_col()   "identity"   Plot actual values

  31. Bar Plots, with a categorical X-axis
      Use geom_bar() or geom_col()

      Geom         Stat         Action
      geom_bar()   "count"      Counts the number of cases at each x position
      geom_col()   "identity"   Plot actual values

      All positions from before are available.
      Two types: absolute counts, distributions.
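
The difference between the two geoms can be sketched with a toy example: `geom_bar()` counts rows itself, while `geom_col()` expects the heights to be supplied. The data frame `diets` below is hypothetical and stands in for any data with one row per case; it is not the dataset used in the slides.

```r
library(ggplot2)

# Hypothetical toy data: one row per animal, with its diet category.
diets <- data.frame(vore = c("carni", "herbi", "herbi", "omni", "herbi", "carni"))

# geom_bar() counts the rows at each x position (stat = "count").
p_bar <- ggplot(diets, aes(x = vore)) +
  geom_bar()

# geom_col() plots values as given (stat = "identity"), so count first.
diet_counts <- as.data.frame(table(vore = diets$vore))
p_col <- ggplot(diet_counts, aes(x = vore, y = Freq)) +
  geom_col()
```

Both plots should show the same bar heights; which one to use depends on whether the data are still raw cases or already summarised.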

  32. Habits of mammals

          str(sleep)

          'data.frame': 76 obs. of 3 variables:
           $ vore : Factor w/ 4 levels "carni","herbi",..: 1 4 2 4 2 2 1 1 2 2 ...
           $ total: num 12.1 17 14.4 14.9 4 14.4 8.7 10.1 3 5.3 ...
           $ rem  : num NA 1.8 2.4 2.3 0.7 2.2 1.4 2.9 NA 0.6 ...

  33. Bar plot

          ggplot(sleep, aes(vore)) +
            geom_bar()

  34. Plotting distributions instead of absolute counts

          # Calculate Descriptive Statistics:
          iris %>%
            select(Species, Sepal.Width) %>%
            gather(key, value, -Species) %>%
            group_by(Species) %>%
            summarise(avg = mean(value),
                      stdev = sd(value)) -> iris_summ_long

          iris_summ_long

          Species      avg   stdev
          setosa       3.43  0.38
          versicolor   2.77  0.31
          virginica    2.97  0.32
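
Because only one variable is summarised here, the reshaping step is not strictly needed, and `gather()` is marked as superseded in current tidyr. A minimal sketch of the same summary with dplyr alone follows; the name `iris_summ_direct` is illustrative and not from the slides.

```r
library(dplyr)

# Mean and standard deviation of Sepal.Width per species,
# computed directly on the original column.
iris_summ_direct <- iris %>%
  group_by(Species) %>%
  summarise(avg = mean(Sepal.Width),
            stdev = sd(Sepal.Width))

print(iris_summ_direct)
```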

  35. Plotting distributions

          ggplot(iris_summ_long, aes(x = Species, y = avg)) +
            geom_col() +
            geom_errorbar(aes(ymin = avg - stdev, ymax = avg + stdev),
                          width = 0.1)

  36. Let's practice!

  37. Line plots

  38. Common plot types

      Plot type       Possible Geoms
      Scatter plots   points, jitter, abline, smooth, count
      Bar plots       histogram, bar, col, errorbar
      Line plots      line, path
