m y d a t a <
- r
e a d . t a b l e ( fj l e = " m y d a t a . t x t " , h e a d e r = T R U E , n a . s t r i n g s = " n " )
What was the sign for missing data in mydata.txt?
Answer: “n”
What is written in the first line of mydata.txt?
Answer: column names
Is the command correct?
Answer: YES!
An introduction to WS 2019/2020
- Dr. Noémie Becker
- Dr. Eliza Argyridou
Special thanks to:
- Dr. Benedikt Holtmann and Dr. SOnja Grath for sharing slides for this lecture
Rearranging and manipulating data
3
What you should know after day 5
Rearranging and manipulating data
- Reshaping data
- Combining data sets
- Making new variables
- Subsetting data
- Summarizing data
We will work with two particular packages:
- t
i d y r
- d
p l y r
What do we have to do before we can work with a package in R? (2 things)
YOUR TURN
4
What you should know after day 5
Rearranging and manipulating data
- Reshaping data
- Combining data sets
- Making new variables
- Subsetting data
- Summarizing data
5
Reshaping data
We will use data on fish abundance.
- Download the file F
i s h _ s u r v e y . c s v from the course page. Set directory, for example: s e t w d ( " ~ / D e s k t
- p
/ D a y _ 5 " )
- Import the sample data into a variable F
i s h _ s u r v e y : F i s h _ s u r v e y <
- r
e a d . c s v ( " F i s h _ s u r v e y . c s v " , h e a d e r = T R U E ) h e a d ( F i s h _ s u r v e y ) 6
Reshaping data
h e a d ( F i s h _ s u r v e y ) To combine the three columns into one column that contains all species you can use the function gather() from the tidyr package:
l i b r a r y ( t i d y r ) F i s h _ s u r v e y _ l
- n
g <
- g
a t h e r ( F i s h _ s u r v e y , S p e c i e s , A b u n d a n c e , 4 : 6 )
Note:
- 3 species (trout, perch, stickleback)
- The numbers are abundance values for
the species at specific sites