Regular expressions: String Manipulation with stringr



SLIDE 1

STRING MANIPULATION WITH STRINGR

Regular expressions

SLIDE 2

Regular expressions

  • A language for describing patterns
  • ^.[\d]+ reads as "the start of the string, followed by any single character, followed by one or more digits"
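As a rough illustration of how the three pieces of this pattern combine, the sketch below uses stringr's str_extract() to show which part of each string is matched. The input strings are made up for the example. Inside an R string the backslash has to be doubled, so \d is typed as "\\d".

```r
library(stringr)

# Hypothetical inputs chosen for illustration
x <- c("R2-D2", "C-3P0", "x123y")

# ^ = start of string, . = any single character, \d+ = one or more digits
pattern <- "^.\\d+"

# str_extract() returns the matched text, or NA when there is no match
matched <- str_extract(x, pattern)
print(matched)
```

"C-3P0" gives NA because its second character is a hyphen, so no digit follows the first character.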

SLIDE 3

Regular expressions as a pattern argument

> str_detect(c("R2-D2", "C-3P0"), pattern = "^.\\d+")
[1] TRUE FALSE

> START %R% ANY_CHAR %R% one_or_more(DGT)
<regex> ^.[\d]+

rebus:               START %R% ANY_CHAR %R% one_or_more(DGT)
Regular expression:  ^.[\d]+
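To check that the two spellings behave the same, a minimal sketch (assuming both the stringr and rebus packages are installed) can run str_detect() once with the raw string and once with the rebus-built pattern. The variable names are only for illustration.

```r
library(stringr)
library(rebus)

droids <- c("R2-D2", "C-3P0")

# The same pattern written two ways
raw_pattern   <- "^.\\d+"                                  # plain regular expression
rebus_pattern <- START %R% ANY_CHAR %R% one_or_more(DGT)   # built from rebus pieces

raw_result   <- str_detect(droids, raw_pattern)
rebus_result <- str_detect(droids, rebus_pattern)

print(raw_result)
print(rebus_result)
```

Both calls should agree, since rebus only assembles the regular expression string that stringr then uses.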

SLIDE 4

Regular expressions as a pattern argument

> str_detect(c("R2-D2", "C-3P0"),
    pattern = START %R% ANY_CHAR %R% one_or_more(DGT))
[1] TRUE FALSE

> str_view(c("R2-D2", "C-3P0"),
    pattern = START %R% ANY_CHAR %R% one_or_more(DGT))

In HTML viewer

SLIDE 5

Let’s practice!

SLIDE 6

More regular expressions

SLIDE 7

Regular expression review

Pattern                              Regular expression   rebus
Start of string                      ^                    START
End of string                        $                    END
Any single character                 .                    ANY_CHAR
Literal dot, caret or dollar sign    \. \^ \$             DOT, CARAT, DOLLAR
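The difference between the special characters and their escaped, literal forms can be sketched with plain regular expression strings. The inputs below are invented for the example, and the backslashes are doubled because they sit inside R strings.

```r
library(stringr)

# Hypothetical inputs chosen for illustration
x <- c("$5.00", "5 dollars", "a.b")

# An unescaped dot matches any single character, so every string matches
any_char <- str_detect(x, ".")

# An escaped dot matches only a literal "."
literal_dot <- str_detect(x, "\\.")

# An escaped dollar sign matches a literal "$"; ^ anchors it to the start
starts_with_dollar <- str_detect(x, "^\\$")

# An unescaped "$" anchors the match to the end of the string
ends_with_b <- str_detect(x, "b$")

print(rbind(any_char, literal_dot, starts_with_dollar, ends_with_b))
```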

SLIDE 8

Alternation

> or("dog", "cat")
<regex> (?:dog|cat)

> str_view(c("kittycat", "doggone"), pattern = or("dog", "cat"))

(dog|cat)
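Since str_view() output only appears in a viewer, a small sketch with str_extract() shows the same alternation as plain console values. The third input, "hamster", is an added example with neither alternative in it.

```r
library(stringr)

# "hamster" is a hypothetical extra input with no match
x <- c("kittycat", "doggone", "hamster")

# (?:dog|cat) matches either "dog" or "cat"
animal <- str_extract(x, "(?:dog|cat)")
print(animal)
```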

SLIDE 9

Character classes

> char_class("Aa")
<regex> [Aa]

> str_view(c("apple", "Aaron"), pattern = char_class("Aa"))

> negated_char_class("Aa")
<regex> [^Aa]

> str_view(c("apple", "Aaron"), pattern = negated_char_class("Aa"))
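What the two classes pick out can be sketched with str_extract() and str_count() on the same inputs, using the regular expressions printed above rather than the rebus helpers.

```r
library(stringr)

x <- c("apple", "Aaron")

# [Aa] matches a single "A" or "a"; str_extract() returns the first match
first_a <- str_extract(x, "[Aa]")

# [^Aa] matches any single character that is NOT "A" or "a"
first_not_a <- str_extract(x, "[^Aa]")

# str_count() counts every match in each string
n_a <- str_count(x, "[Aa]")

print(first_a)
print(first_not_a)
print(n_a)
```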

SLIDE 10

Repetition

> str_view(c("apple", "Aaron"), pattern = one_or_more("Aa"))

Pattern                   Regular expression   rebus
Optional                  ?                    optional()
Zero or more              *                    zero_or_more()
One or more               +                    one_or_more()
Between n and m times     {n,m}                repeated()
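The four kinds of repetition can be compared side by side in a short sketch. The inputs are invented, each pattern is anchored with ^ and $ so the whole string must match, and the bounded repeat is written in regular expression syntax as {2,3}.

```r
library(stringr)

# Hypothetical inputs: "c", then zero to three "a"s, then "t"
x <- c("ct", "cat", "caat", "caaat")

optional_a     <- str_detect(x, "^ca?t$")      # zero or one "a"
zero_or_more_a <- str_detect(x, "^ca*t$")      # any number of "a"s
one_or_more_a  <- str_detect(x, "^ca+t$")      # at least one "a"
two_to_three_a <- str_detect(x, "^ca{2,3}t$")  # between 2 and 3 "a"s

print(rbind(optional_a, zero_or_more_a, one_or_more_a, two_to_three_a))
```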

SLIDE 11

Let’s practice!

SLIDE 12

Shortcuts

SLIDE 13

Ranges in character classes

> DOLLAR %R% char_class("0123456789")
<regex> \$[0123456789]

A digit
> char_class("0-9")
<regex> [0-9]

A lower case letter
> char_class("a-z")
<regex> [a-z]

An upper case letter
> char_class("A-Z")
<regex> [A-Z]
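A brief sketch of the ranges in use, with made-up inputs: a literal dollar sign followed by a digit, and the two letter ranges, which are case sensitive.

```r
library(stringr)

# Hypothetical price strings
prices <- c("$5", "$12.50", "free")

# A literal "$" followed by exactly one digit
one_digit <- str_extract(prices, "\\$[0-9]")

# Adding + takes all the consecutive digits after the "$"
all_digits <- str_extract(prices, "\\$[0-9]+")

# Letter ranges distinguish lower case from upper case
chars <- c("a", "A", "7")
is_lower <- str_detect(chars, "[a-z]")
is_upper <- str_detect(chars, "[A-Z]")

print(one_digit)
print(all_digits)
```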

SLIDE 14

Shortcuts

A digit
> DGT
<regex> \d
> char_class("0-9")
<regex> [0-9]

A word character
> WRD
<regex> \w
> char_class("a-zA-Z0-9_")
<regex> [a-zA-Z0-9_]

A whitespace character
> SPC
<regex> \s
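To see the shortcuts at work, the sketch below counts matches of each one in a single invented string, using the regular expression forms with doubled backslashes.

```r
library(stringr)

# Hypothetical input: 2 digits, 9 word characters, 1 space, 1 hyphen
x <- "R2-D2 rocks"

n_digits <- str_count(x, "\\d")   # digits: "2" and "2"
n_word   <- str_count(x, "\\w")   # letters and digits; the hyphen and space are excluded
n_space  <- str_count(x, "\\s")   # the single space

# For this ASCII-only input, \d and [0-9] count the same characters
n_range <- str_count(x, "[0-9]")

print(c(digits = n_digits, word = n_word, space = n_space))
```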

SLIDE 15

National Electronic Injury Surveillance System (NEISS)

  • neiss package: https://github.com/hadley/neiss
  • Injuries reported in the ER of a random sample of hospitals

19YOM-SHOULDER STRAIN-WAS TACKLED WHILE PLAYING FOOTBALL W/ FRIENDS

19YOM = 19 year old male
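One possible way to pull the age and sex out of a narrative like this is sketched below with str_match(). It assumes the narrative starts with digits followed by "YO" and then "M" or "F"; only the "YOM" form appears on the slide, so the "F" alternative and the fixed position at the start are assumptions.

```r
library(stringr)

narrative <- "19YOM-SHOULDER STRAIN-WAS TACKLED WHILE PLAYING FOOTBALL W/ FRIENDS"

# Assumed format: <age in digits>YO<M or F> at the start of the string.
# The parentheses capture the age and the sex as separate groups.
age_sex <- str_match(narrative, "^(\\d+)YO([MF])")

age <- as.integer(age_sex[, 2])  # first capture group
sex <- age_sex[, 3]              # second capture group

print(age)
print(sex)
```

str_match() returns a matrix whose first column is the whole match and whose remaining columns are the capture groups, which is why the age and sex are read from columns 2 and 3.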

SLIDE 16

Let’s practice!