Basics of Algorithmics in R



  1. An introduction to Basics of Algorithmics in R
WS 2019/2020
Dr. Noémie Becker
Dr. Eliza Argyridou
Special thanks to: Dr. Sonja Grath for additions to the slides

Which expression(s) equal TRUE? (x equals 5)
a) TRUE | FALSE
b) x > 5
c) FALSE & TRUE
d) x <= 10 | x > 5
Answer: a) and d)

What is the value of y at the end of the loop if it was 0 at the beginning? How many iterations of the loop occurred?

    while (y <= 10) { y <- 2*y + 3 }

Answer: y = 21; the loop ran 3 times.

What you should know after days 7 & 8
Review: Data frames and import your data
Conditional execution in R
● Logic rules
● if(), else, ifelse()
● Example from day 1
Loops
Executing a command from a script
Writing your own functions
How to avoid slow R code

Basics
Syntax:

    myfun <- function(arg1, arg2, …) { commands }

Example:
We want to define a function that takes a DNA sequence as input and gives as output the GC content (proportion of G and C in the sequence).

How can we name our function?
Idea: gc

    ?gc
    # There is already a function gc()

Another idea: gcContent

    ?gcContent
    No documentation for ‘gcContent’ in specified packages and libraries:
    you could try ‘??gcContent’

→ We can name our function gcContent()

Our function gcContent() from Day 1
Version 1

    gcContent <- function(dna, counter = 0) {
      dna <- unlist(strsplit(dna, ""))
      for (i in 1:length(dna)) {
        if (dna[i] == "C" | dna[i] == "G") {
          counter = counter + 1
        }
      }
      return(counter / length(dna))
    }

Does our function work correctly?

    # Test the function with some example data
    gcContent("AACGTGGCTA")
    gcContent("AATATATTAT")
    gcContent(23)
    gcContent(TRUE)
    gcContent("notDNA")
    gcContent("Cool")

YOUR TURN

Dealing with problems
Problems:
● R gives an error message if the input is not a character value
● Our function calculates values even if the input is most likely not a DNA sequence

How could we deal with these problems?
What do we want our function to output in these cases?
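The quiz answers and the behaviour of Version 1 on the test inputs can be checked directly in R. The following is a minimal sketch: the function body is Version 1 as given on the slide, while the names `answers`, `iterations`, `gc_values`, `fails_on_number` and `fails_on_logical` are hypothetical helpers introduced here for illustration.

```r
# Quiz 1: logical expressions, with x equal to 5
x <- 5
answers <- c(a = TRUE | FALSE,
             b = x > 5,
             c = FALSE & TRUE,
             d = x <= 10 | x > 5)
print(answers)

# Quiz 2: run the loop and count its iterations
# (iterations is a hypothetical counter, not part of the slide)
y <- 0
iterations <- 0
while (y <= 10) {
  y <- 2 * y + 3
  iterations <- iterations + 1
}
print(c(y = y, iterations = iterations))

# Version 1 of gcContent() as given on the slide
gcContent <- function(dna, counter = 0) {
  dna <- unlist(strsplit(dna, ""))
  for (i in 1:length(dna)) {
    if (dna[i] == "C" | dna[i] == "G") {
      counter = counter + 1
    }
  }
  return(counter / length(dna))
}

# Character inputs: a value is computed even when the input is not DNA
gc_values <- sapply(c("AACGTGGCTA", "AATATATTAT", "notDNA", "Cool"), gcContent)
print(gc_values)

# Non-character inputs: strsplit() stops with an error, captured here by try()
fails_on_number  <- inherits(try(gcContent(23), silent = TRUE), "try-error")
fails_on_logical <- inherits(try(gcContent(TRUE), silent = TRUE), "try-error")
print(c(fails_on_number, fails_on_logical))
```

The output illustrates both problems named on the slide: "notDNA" and "Cool" silently produce a number, and numeric or logical input fails with an error that comes from strsplit() rather than from gcContent() itself.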

  2. Error and Warning
There are two types of error messages in R:
● Error: Stops execution and returns no value
● Warning message: Continues execution

Example:

    x <- sum("hello")
    Error in sum("hello") : invalid 'type' (character) of argument

    x <- mean("hello")
    Warning message:
    In mean.default("hello") :
      argument is not numeric or logical: returning NA

We can define such messages with the functions stop() and warning()

In our example:
● (Specific) Error when argument is not character
● Warning if character argument is not DNA

Dealing with non-character arguments
Version 2 (with a self-defined error message)

    gcContent <- function(dna, counter = 0) {
      if (!is.character(dna)) {
        stop("The argument must be of type character.")
      }
      dna <- unlist(strsplit(dna, ""))
      for (i in 1:length(dna)) {
        if (dna[i] == "C" | dna[i] == "G") {
          counter = counter + 1
        }
      }
      return(counter / length(dna))
    }

Dealing with input that is not DNA
● We define as 'not DNA' any character different from A, C, G or T.
● If the input contains any other character, we compute the value but throw a warning.

To solve this task, we can use the function grep() as follows:

    grep("[^ACGT]", "AATGAC")
    integer(0)   # length is 0
    grep("[^ACGT]", "NATGAC")
    [1] 1        # length is 1

Dealing with input that is not DNA
Version 3 (with a self-defined warning message)

    gcContent <- function(dna, counter = 0) {
      if (!is.character(dna)) {
        stop("The argument must be of type character.")
      }
      if (length(grep("[^ACGT]", dna)) > 0) {
        warning("The input contains characters other than A, C, G or T - value should not be trusted!")
      }
      dna <- unlist(strsplit(dna, ""))
      for (i in 1:length(dna)) {
        if (dna[i] == "C" | dna[i] == "G") {
          counter = counter + 1
        }
      }
      return(counter / length(dna))
    }

Giving several arguments to a function
R functions can have several arguments.
You can see them listed in the help page for the function.

Example

    ?mean()

A frequent argument in R functions is na.rm. This argument (when set to TRUE) removes NA values from vectors.

    mean(c(1, 2, NA))
    [1] NA
    mean(c(1, 2, NA), na.rm = TRUE)
    [1] 1.5

We now want to give our function another argument to output the AT content instead of the GC content.

YOUR TURN

Giving several arguments to a function
Version 4

    gcContent <- function(dna, counter = 0, AT) {
      if (!is.character(dna)) {
        stop("The argument must be of type character.")
      }
      if (length(grep("[^ACGT]", dna)) > 0) {
        warning("The input contains characters other than A, C, G or T - value should not be trusted!")
      }
      dna <- unlist(strsplit(dna, ""))
      for (i in 1:length(dna)) {
        if (dna[i] == "C" | dna[i] == "G") {
          counter = counter + 1
        }
      }
      if (AT == TRUE) {
        return(1 - counter / length(dna))
      } else {
        return(counter / length(dna))
      }
    }
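To see the self-defined error, the warning and the AT argument in action, the function can be called inside tryCatch(), which hands the signalled condition to a handler instead of printing it. This is a minimal sketch, not the slides' own solution: it repeats Version 4 with one labelled change (AT gets a default of FALSE, an assumption made here so that calls without AT still work), and the names `gc`, `at`, `err_msg` and `warn_msg` are hypothetical.

```r
# Version 4 from the slides, with one labelled change:
# AT = FALSE as a default is an assumption, not part of the slide
gcContent <- function(dna, counter = 0, AT = FALSE) {
  if (!is.character(dna)) {
    stop("The argument must be of type character.")
  }
  if (length(grep("[^ACGT]", dna)) > 0) {
    warning("The input contains characters other than A, C, G or T - value should not be trusted!")
  }
  dna <- unlist(strsplit(dna, ""))
  for (i in 1:length(dna)) {
    if (dna[i] == "C" | dna[i] == "G") {
      counter = counter + 1
    }
  }
  if (AT == TRUE) {
    return(1 - counter / length(dna))
  } else {
    return(counter / length(dna))
  }
}

# Valid DNA: GC content and AT content of the same sequence
gc <- gcContent("AACGTGGCTA")
at <- gcContent("AACGTGGCTA", AT = TRUE)
print(c(GC = gc, AT = at))

# Non-character input: stop() signals an error; the handler returns its message
err_msg <- tryCatch(gcContent(23), error = function(e) conditionMessage(e))
print(err_msg)

# Character input that is not DNA: warning() is signalled; the handler
# returns its message (the computed value is discarded in this case)
warn_msg <- tryCatch(gcContent("Cool"), warning = function(w) conditionMessage(w))
print(warn_msg)
```

Because the AT branch returns 1 minus the GC content, every character that is not C or G is counted as A or T, so the AT value is only meaningful when no warning is raised.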
