Data visualization and graphics


    library(tidyr)
    newdata <- gather(students, Group, Number, 2:7)

What is this line of code doing?
Answer: combine columns 2 to 7 of dataset students into two columns named Group and Number

How many Groups are there in dataset students?
Answer: 6 groups
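The review question can be tried out on a small example. The sketch below assumes a made-up students data frame (the real dataset from the lecture is not shown here), with one identifier column followed by six group columns; all column names and values are hypothetical. Note that the tidyr documentation marks gather() as superseded by pivot_longer(), although gather() still works.

```r
library(tidyr)

# Hypothetical data: one row per year, one column per group (columns 2 to 7)
students <- data.frame(
  Year   = c(2018, 2019),
  GroupA = c(12, 15),
  GroupB = c(10, 11),
  GroupC = c(14, 9),
  GroupD = c(8, 13),
  GroupE = c(11, 12),
  GroupF = c(9, 10)
)

# Stack columns 2 to 7: the old column names go into Group, the values into Number
newdata <- gather(students, Group, Number, 2:7)

head(newdata)
length(unique(newdata$Group))  # number of groups
```

The result is in long format: each of the two rows is repeated once per group, and the six former column names become the values of Group.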

An introduction to WS 2019/2020

  • Dr. Noémie Becker
  • Dr. Eliza Argyridou

Special thanks to:

  • Dr. Benedikt Holtmann and Dr. Sonja Grath for sharing slides for this lecture


What you should know after day 6

Review: Rearranging and manipulating data
Graphics with base R

  • Histograms
  • Scatterplots
  • Boxplots

Saving plots
Graphics with ggplot2

Graphics with base R

Simple graphics using plotting functions in the graphics package

  • Base R, installed by default
  • Easy and quick to type
  • Wide variety of functions

Function     Description
hist()       Histograms
plot()       Scatterplots, etc.
boxplot()    Box-and-whisker plots
barplot()    Bar and column charts
dotchart()   Cleveland dot plots
contour()    Contour of a surface (2D)
pie()        Circular pie chart
…


Graphics with base R

Creating a histogram with hist()

Example 1:

    hist(Sparrows$Tarsus)

[Figure: histogram titled "Histogram of Sparrows$Tarsus"; x-axis: Sparrows$Tarsus, y-axis: Frequency]

Graphics with base R

Creating a histogram with hist()

Example 2: Alter colour and the number of bins

    hist(Sparrows$Tarsus, col = "grey", breaks = 50)

[Figure: histogram titled "Histogram of Sparrows$Tarsus" with grey bars and narrower bins; x-axis: Sparrows$Tarsus, y-axis: Frequency]

Graphics with base R

Creating a histogram with hist()

Example 3: Density instead of frequency

    hist(Sparrows$Tarsus, col = "grey", breaks = 50, freq = FALSE)

[Figure: histogram titled "Histogram of Sparrows$Tarsus"; x-axis: Sparrows$Tarsus, y-axis: Density]

Graphics with base R

Creating a histogram with hist()

Example 4: Add density curve

    hist(Sparrows$Tarsus, col = "grey", breaks = 50, freq = FALSE)
    lines(density(Sparrows$Tarsus), col = "blue", lwd = 2)

[Figure: histogram titled "Histogram of Sparrows$Tarsus" with a blue density curve; x-axis: Sparrows$Tarsus, y-axis: Density]

Graphics with base R

Creating a histogram with hist()

Example 5: Plot only males

    hist(Sparrows[Sparrows$Sex == "Male", ]$Tarsus, col = "grey", breaks = 50)

[Figure: histogram titled "Histogram of Sparrows[Sparrows$Sex == "Male", ]$Tarsus"; x-axis: Sparrows[Sparrows$Sex == "Male", ]$Tarsus, y-axis: Frequency]
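The histogram examples above can be tried without the Sparrows data. The following is a minimal sketch that assumes a simulated data frame called birds; its name, columns and values are invented for illustration and are not the lecture's data.

```r
# Hypothetical stand-in for the Sparrows data used in the lecture
set.seed(1)
birds <- data.frame(
  Tarsus = c(rnorm(100, mean = 21.5, sd = 0.8),
             rnorm(100, mean = 22.5, sd = 0.8)),
  Sex    = rep(c("Female", "Male"), each = 100)
)

# Density histogram with a kernel density curve drawn on top
h <- hist(birds$Tarsus, col = "grey", breaks = 50, freq = FALSE)
lines(density(birds$Tarsus), col = "blue", lwd = 2)

# With freq = FALSE the bar areas add up to 1
sum(h$density * diff(h$breaks))

# Histogram for one subset only: select the rows first, then the column
males <- birds[birds$Sex == "Male", ]
hist(males$Tarsus, col = "grey", breaks = 50)
```

A single number passed to breaks is treated by hist() as a suggestion, so the number of bins actually drawn can differ slightly from the value given.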


Graphics with base R

Creating a scatterplot with plot()

➔ Relationship between two continuous variables

Example 1:

    plot(Sparrows$Wing, Sparrows$Tarsus)

[Figure: scatterplot; x-axis: Sparrows$Wing, y-axis: Sparrows$Tarsus]

Graphics with base R

Creating a scatterplot with plot()

Example 2: Alter axis limits and shape of symbols

    plot(Sparrows$Wing, Sparrows$Tarsus, xlim = c(50, 70), pch = 15, col = "blue")

[Figure: scatterplot with blue filled squares; x-axis: Sparrows$Wing (50 to 70), y-axis: Sparrows$Tarsus]

YOUR TURN: ?pch

Graphics with base R

Creating a scatterplot with plot()

Example 3: Alter the size of plotting symbols

    plot(Sparrows$Wing, Sparrows$Tarsus, xlim = c(50, 70), cex = 1.5)

[Figure: scatterplot with enlarged symbols; x-axis: Sparrows$Wing (50 to 70), y-axis: Sparrows$Tarsus]
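The scatterplot options shown above (xlim, pch, col, cex) can be combined in one call. The sketch below uses the built-in iris data as a stand-in for Sparrows; the choice of dataset, columns and axis labels is an assumption made only for illustration.

```r
# Built-in iris data as a stand-in for Sparrows (an assumption for illustration)
x <- iris$Sepal.Length
y <- iris$Petal.Length

# Filled squares (pch = 15), blue, 1.5 times the default size, fixed x-axis range
plot(x, y,
     xlim = c(4, 8),
     pch  = 15,
     col  = "blue",
     cex  = 1.5,
     xlab = "Sepal length (cm)",
     ylab = "Petal length (cm)")

# Overview of the plotting symbols 0 to 25; see ?points for their meaning
plot(0:25, rep(1, 26), pch = 0:25, yaxt = "n", xlab = "pch value", ylab = "")
```

The second plot is a quick way to see which number corresponds to which symbol before choosing a value for pch.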

Graphics with base R

Creating line graphs with plot()

Examples:

    plot(pressure$temperature, pressure$pressure)
    plot(pressure$temperature, pressure$pressure, type = "l")

[Figure: two panels showing pressure$pressure against pressure$temperature, first as points, then as a line]

Graphics with base R

Use the type argument to specify the type of plot

Possible types:

"p"   points
"l"   lines
"b"   points connected by lines
"o"   points overlaid by lines
"h"   vertical lines from points to the zero axis
"s"   steps
"n"   nothing, only the axes
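To compare the plot types listed above side by side, one option is to loop over them. This is a minimal sketch using the built-in pressure dataset from the line graph examples; the panel layout and axis labels are choices made here, not part of the lecture.

```r
# Show the same data with each value of the type argument
types <- c("p", "l", "b", "o", "h", "s", "n")

old_par <- par(mfrow = c(2, 4))   # room for seven panels; old settings are saved
for (tp in types) {
  plot(pressure$temperature, pressure$pressure,
       type = tp,
       main = paste0('type = "', tp, '"'),
       xlab = "Temperature", ylab = "Pressure")
}
par(old_par)                      # restore the previous layout
```

The panel for type = "n" shows only the axes, which is useful as an empty canvas for later calls such as points() or lines().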


Graphics with base R

Creating a boxplot with boxplot()

➔ Relationship between continuous and categorical variables

Example 1:

    boxplot(Wing ~ Sex, data = Sparrows)

[Figure: boxplots of Wing for Female and Male]

Graphics with base R

Example 2:

    boxplot(Wing ~ Sex, data = Sparrows,
            xlab = 'Sex',                # Adds label to x-axis
            ylab = 'Wing length (mm)',   # Adds label to y-axis
            col = c("red", "blue"),      # Adds colour
            ylim = c(50, 70),            # Changes axis limits
            main = "Boxplot")            # Adds title

[Figure: boxplot titled "Boxplot"; x-axis: Sex (Female, Male), y-axis: Wing length (mm)]

Graphics with base R

Example 3: Multiple grouping variables

    boxplot(Wing ~ Sex + Species, data = Sparrows,
            xlab = 'Species and Sex',
            ylab = 'Wing length (mm)',
            col = c("red", "blue"),
            ylim = c(50, 70),
            main = "")

[Figure: boxplots for Female.SESP, Male.SESP, Female.SSTS and Male.SSTS; x-axis: Species and Sex, y-axis: Wing length (mm)]
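The formula interface with several grouping variables can be explored on a built-in dataset. The sketch below assumes ToothGrowth as a stand-in for Sparrows, with len as the continuous variable and supp and dose as grouping variables; the axis labels are invented for illustration.

```r
# Built-in ToothGrowth data as a stand-in: len (continuous), supp and dose (groups)
bp <- boxplot(len ~ supp + dose, data = ToothGrowth,
              xlab = "Supplement and dose",
              ylab = "Tooth length",
              col  = c("red", "blue"),   # colours are recycled across the boxes
              main = "")

# boxplot() invisibly returns the statistics it drew
bp$names   # one label per combination of the grouping variables
bp$n       # number of observations in each box
```

Storing the return value is a convenient way to check that the groups were formed as intended before polishing the figure.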


Graphics with base R

Common parameters in graphics

main     title of the plot
xlab     label of x-axis
ylab     label of y-axis
xlim     range/limits of x-axis
ylim     range/limits of y-axis
col      colour of the points, bars, etc.; can be a character string or a hexadecimal colour (e.g. #RRGGBB)
breaks   number of bins
pch      shape of symbol
cex      size of symbols
lty      line type
lwd      line width
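Several of the parameters listed above can be seen together in one short example. This sketch uses the built-in cars dataset; the title, labels, colours and the added regression line are assumptions chosen for illustration.

```r
# Scatterplot of the built-in cars data using several common parameters
plot(cars$speed, cars$dist,
     main = "Stopping distance",          # title
     xlab = "Speed (mph)",                # x-axis label
     ylab = "Distance (ft)",              # y-axis label
     xlim = c(0, 30), ylim = c(0, 130),   # axis ranges
     col  = "#1F78B4",                    # hexadecimal colour (#RRGGBB)
     pch  = 16, cex = 1.2)                # filled circles, slightly enlarged

# Add a dashed, thicker regression line
fit <- lm(dist ~ speed, data = cars)
abline(fit, lty = 2, lwd = 2, col = "red")
```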

Multiple plots on one page

The par() function:

  • Comes with an extensive list of graphical parameters you can change (see ?par)
  • Some options are helpful; others you may never use

To plot multiple charts within the same window, you can use the mfcol or mfrow parameter.

For example, par(mfrow = c(2, 2)) divides the graphic window into four panels (two rows and two columns).

Multiple plots on one page

[Figure: four plots (histogram of Sparrows$Tarsus, scatterplot of Sparrows$Tarsus against Sparrows$Wing, boxplot of wing length by Sex, boxplot of wing length by Species and Sex) arranged in a 2 × 2 grid, shown once for each layout]

par(mfrow = c(2, 2)): panels are filled row by row (1 2 / 3 4)
par(mfcol = c(2, 2)): panels are filled column by column (1 3 / 2 4)
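A 2 × 2 layout with mfrow can be sketched as follows. The example uses the built-in mtcars data in place of Sparrows, which is an assumption for illustration; saving and restoring the old settings is a common habit rather than something the slides prescribe.

```r
# Save the old settings so they can be restored afterwards
old_par <- par(mfrow = c(2, 2))   # 2 rows, 2 columns, filled row by row

hist(mtcars$mpg, col = "grey", main = "Panel 1")
plot(mtcars$wt, mtcars$mpg, main = "Panel 2")
boxplot(mpg ~ am, data = mtcars, main = "Panel 3")
boxplot(mpg ~ am + vs, data = mtcars, main = "Panel 4")

par(old_par)                      # back to one plot per page
```

Replacing mfrow with mfcol in the first line would place Panel 2 below Panel 1 instead of next to it.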

Saving plots

There are several possibilities to save a plot

  • 1. dev.print()

Example:

    plot(x, y, …)   # Make a plot
    # After you are finished with the plot use:
    dev.print(device = pdf, file = "filename.pdf")

Important: When you are done, you have to close the printing device!

    dev.off()   # shuts down current device

Saving plots

  • 2. savePlot()

Example:

    plot(x, y, …)   # Make a plot
    savePlot(filename = "Figure1.pdf", type = "pdf")

Important: This may not work on your system! (It uses the X11 device, available on most Unix systems.)

Saving plots

  • 3. Plot directly into a file

Example:

    # width and height are in inches per default
    pdf("Figure2.pdf", width = 4, height = 4)
    # You can execute multiple graphing commands
    hist(x)
    # The result of each will go into the pdf file
    plot(x, y, …)
    dev.off()

But the file is not printed on screen!

Different devices

Functions to save plots

pdf()          Opens a pdf file as device
postscript()   Opens a postscript file as device
png()          Opens a png file as device
jpeg()         Opens a jpeg file as device
tiff()         Opens a tiff file as device
bmp()          Opens a bmp file as device
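Plotting directly into a file can be tested safely with a temporary file. The sketch below assumes the built-in mtcars data and a path created with tempfile(); both are choices made for illustration.

```r
# Write into a temporary file so nothing in the working directory is touched
outfile <- tempfile(fileext = ".pdf")

pdf(outfile, width = 4, height = 4)   # open the file device (size in inches)
hist(mtcars$mpg, col = "grey")        # first page of the PDF
plot(mtcars$wt, mtcars$mpg)           # second page of the PDF
dev.off()                             # close the device; the file is now complete

file.exists(outfile)
file.size(outfile)                    # size in bytes
```

The other device functions listed above follow the same open, plot, dev.off() pattern, although the bitmap devices such as png() take their width and height in pixels by default.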


Graphics with ggplot2

Why use ggplot2?

  • Many users, a lot of support
  • Check out the ggplot2 documentation at http://docs.ggplot2.org/
  • Very flexible and powerful
  • Sophisticated plots for publication

Graphics with ggplot2

To create a plot you use the ggplot() function

Basic structure:

    ggplot(data,                            # data frame with variables to plot
           aes(xvariable, yvariable)) +     # specifies which variables to plot
      geom_object()                         # specifies the geometric objects

Commonly used geometric objects:

Histogram:     + geom_histogram()
Scatterplot:   + geom_point()
Boxplot:       + geom_boxplot()
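The basic structure can be sketched with data that ships with R. The example below assumes the built-in mtcars data as a stand-in for Sparrows; the object names (cars2, p_hist, p_scatter, p_box) are hypothetical.

```r
library(ggplot2)

# Built-in mtcars data as a stand-in for the lecture's Sparrows data
cars2 <- mtcars
cars2$cyl <- factor(cars2$cyl)   # treat the number of cylinders as a category

# Same structure every time: data + aes() mapping + a geom
p_hist    <- ggplot(cars2, aes(mpg)) + geom_histogram(binwidth = 2)
p_scatter <- ggplot(cars2, aes(wt, mpg)) + geom_point()
p_box     <- ggplot(cars2, aes(cyl, mpg)) + geom_boxplot()

print(p_hist)
print(p_scatter)
print(p_box)
```

Only the aes() mapping and the geom change between the three plots; the data argument stays the same.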

Graphics with ggplot2

Creating a histogram with ggplot()

Example:

    ggplot(Sparrows, aes(Tarsus)) +
      geom_histogram(col = "grey", binwidth = 0.1) +
      xlab("Tarsus length (mm)") +
      ylab("Frequency")

[Figure: histogram; x-axis: Tarsus length (mm), y-axis: Frequency]

Graphics with ggplot2

Creating a scatterplot with ggplot()

Example 1:

    ggplot(Sparrows, aes(x = Wing, y = Tarsus)) +
      geom_point()

[Figure: scatterplot of Tarsus against Wing]

Graphics with ggplot2

Creating a scatterplot with ggplot()

Example 2: Avoid overplotting of symbols

    ggplot(Sparrows, aes(x = Wing, y = Tarsus)) +
      geom_point(position = position_jitter(width = 0.5, height = 0))

[Figure: jittered scatterplot; x-axis: Wing, y-axis: Tarsus]
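Jittering is easiest to see when one variable takes only a few distinct values. This sketch assumes the built-in mtcars data, where cyl has three values; the jitter width of 0.3 is an arbitrary choice for illustration.

```r
library(ggplot2)

set.seed(42)   # jitter is random; a seed makes the plot reproducible

# cyl takes only three values, so many points would sit on top of each other
p <- ggplot(mtcars, aes(x = cyl, y = mpg)) +
  geom_point(position = position_jitter(width = 0.3, height = 0))

print(p)
```

Setting height = 0 shifts the points only horizontally, so the values on the y-axis can still be read off exactly.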

Graphics with ggplot2

Creating a scatterplot with ggplot()

Example 3: Alter colour, shape, and size of symbols

    ggplot(Sparrows, aes(x = Wing, y = Tarsus, colour = Sex, shape = Species)) +
      geom_point(size = 2)

[Figure: scatterplot; x-axis: Wing, y-axis: Tarsus; legends: Species (SESP, SSTS) and Sex (Female, Male)]

Graphics with ggplot2

Creating a boxplot with ggplot()

Example 1:

    ggplot(Sparrows, aes(Sex, Wing, fill = Sex)) +
      geom_boxplot()

[Figure: boxplots of Wing for Female and Male, filled by Sex; x-axis: Sex, y-axis: Wing]

Saving a ggplot

Save a plot from ggplot2 with print()

Example 1: print a ggplot to a file

    # Print the plot to a pdf file
    data("mtcars")
    pdf("myplot.pdf")
    myplot <- ggplot(mtcars, aes(wt, mpg)) + geom_point()
    print(myplot)
    dev.off()

Saving a ggplot

Save a plot from ggplot2 with ggsave()

Example 2: save the last ggplot

    # 1. Create a plot
    # The plot is displayed on the screen
    ggplot(mtcars, aes(wt, mpg)) + geom_point()
    # 2. Save the plot to a pdf
    ggsave("myplot.pdf")
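Instead of relying on the last plot displayed, ggsave() can also be given the plot object and an explicit size. A minimal sketch, assuming a temporary output path and arbitrary dimensions chosen for illustration:

```r
library(ggplot2)

myplot  <- ggplot(mtcars, aes(wt, mpg)) + geom_point()
outfile <- tempfile(fileext = ".pdf")   # temporary path, chosen for illustration

# Name the plot explicitly and fix the size instead of relying on the last plot
ggsave(outfile, plot = myplot, width = 10, height = 8, units = "cm")

file.exists(outfile)
```

ggsave() picks the output format from the file extension, so changing ".pdf" to ".png" is usually enough to switch formats.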

Graphics with ggplot2

Preparing plots for publication

  • Title and axis labels
  • Range of axes
  • Colours
  • Overall appearance (themes)
  • Text size
  • Legend

[Figure: boxplot titled "Sparrow morphology"; x-axis: Sex (Female, Male), y-axis: Wing length (mm)]
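The items in the checklist above map onto a handful of ggplot2 functions. The sketch below is one possible combination, assuming the built-in ToothGrowth data as a stand-in for Sparrows; the title, labels, colours and theme are illustrative choices, not the settings used for the lecture's figure.

```r
library(ggplot2)

# Built-in ToothGrowth data as a stand-in for the Sparrows data
p <- ggplot(ToothGrowth, aes(x = supp, y = len, fill = supp)) +
  geom_boxplot() +
  labs(title = "Tooth growth",                       # title and axis labels
       x = "Supplement", y = "Tooth length") +
  coord_cartesian(ylim = c(0, 40)) +                 # range of the y-axis
  scale_fill_manual(values = c(OJ = "orange", VC = "steelblue")) +  # colours
  theme_classic() +                                  # overall appearance
  theme(text = element_text(size = 14),              # text size
        legend.position = "none")                    # legend not needed here

print(p)
```

The call to theme() comes after theme_classic() because a complete theme added later would overwrite the individual settings.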

Take-home message

  • Look at the help of the plotting functions again and again until you know the basic options by heart

  • You can realize any fancy plot in R but for this you need to ...
  • … practice!

Further reading

  • http://www.cookbook-r.com/
  • http://www.cookbook-r.com/Graphs/
  • http://docs.ggplot2.org/
  • http://r4ds.had.co.nz/