Regular Expressions Simple matching and searching String: My name - - PowerPoint PPT Presentation


slide-1
SLIDE 1

Regular Expressions

slide-2
SLIDE 2

Simple matching and searching

String: My name is Claus
Regex:  My name is

slide-3
SLIDE 3

Simple matching and searching

String: My name is Claus
Regex:  My name is
Match:  My name is Claus

Matched part of string: "My name is"
Unmatched part of string: " Claus"

slide-4
SLIDE 4

Simple matching and searching

String: My name is Claus
Regex:  My age is

slide-5
SLIDE 5

Simple matching and searching

String: My name is Claus
Regex:  My age is
Match:  Does not match!
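The two cases above can be reproduced in Python with re.search(), which the later slides introduce. This is only a minimal sketch; the loop and the printed wording are illustrative, not part of the original slides.

```python
import re

test_string = "My name is Claus"

# Try one pattern that occurs in the string and one that does not.
for pattern in (r"My name is", r"My age is"):
    match = re.search(pattern, test_string)
    if match:
        print(pattern, "->", "Match:", match.group())
    else:
        print(pattern, "->", "Does not match!")
```

A plain-text pattern with no special symbols matches only if it occurs literally in the string.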

slide-6
SLIDE 6

Commonly used special symbols in Python regular expressions

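Symbol  Meaning
.       matches any character (except a newline, by default)
+       1 or more of the preceding item
*       0 or more of the preceding item
()      capture group
\d      digit
\D      non-digit
\s      whitespace
\S      non-whitespace
\w      alphanumeric character or underscore
\W      any character not matched by \w
^       beginning of string
$       end of string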
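A short sketch of a few of these symbols in action. The test string "Room 42" is made up for this illustration; the calls are standard re module functions.

```python
import re

text = "Room 42"  # hypothetical test string

print(re.search(r"\d+", text).group())        # 42     (one or more digits)
print(repr(re.search(r"\D+", text).group()))  # 'Room ' (one or more non-digits)
print(re.search(r"\w+", text).group())        # Room   (stops at the space)
print(re.search(r"\s", text).start())         # 4      (index of the first whitespace)
print(bool(re.search(r"^Room", text)))        # True   (string begins with Room)
print(bool(re.search(r"^42", text)))          # False  (42 is not at the beginning)
print(bool(re.search(r"42$", text)))          # True   (string ends with 42)
```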

slide-8
SLIDE 8

. matches any character, + means one or more

String: My name is Claus
Regex:  My .+ is

slide-9
SLIDE 9

. matches any character, + means one or more

String: My name is Claus
Regex:  My .+ is
Match:  My name is Claus

slide-10
SLIDE 10

. matches any character, + means one or more

String: My  is Claus    (two spaces between "My" and "is", nothing in between)
Regex:  My .+ is

slide-11
SLIDE 11

. matches any character, + means one or more

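String: My  is Claus    (two spaces between "My" and "is", nothing in between)
Regex:  My .+ is
Match:  Does not match! (.+ needs at least one character between the two spaces)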

slide-12
SLIDE 12

* means zero or more

String: My  is Claus    (two spaces between "My" and "is", nothing in between)
Regex:  My .* is

slide-13
SLIDE 13

* means zero or more

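String: My  is Claus    (two spaces between "My" and "is", nothing in between)
Regex:  My .* is
Match:  My  is Claus    (.* matches zero characters between the two spaces)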
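The difference between + and * can be checked directly. This sketch assumes the second test string contains two spaces between "My" and "is" (that is, "My name is Claus" with the word "name" deleted); the variable names are made up for the illustration.

```python
import re

with_name = "My name is Claus"
without_name = "My  is Claus"  # assumption: two spaces, nothing between them

for text in (with_name, without_name):
    plus = re.search(r"My .+ is", text)   # needs 1 or more characters between the spaces
    star = re.search(r"My .* is", text)   # accepts 0 or more characters between the spaces
    print(repr(text), "| .+ matches:", bool(plus), "| .* matches:", bool(star))
```

Both patterns contain a literal space on each side of the dot, so a string with only a single space between "My" and "is" would match neither of them.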

slide-14
SLIDE 14

Commonly used special symbols in Python regular expressions

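Symbol  Meaning
.       matches any character (except a newline, by default)
+       1 or more of the preceding item
*       0 or more of the preceding item
()      capture group
\d      digit
\D      non-digit
\s      whitespace
\S      non-whitespace
\w      alphanumeric character or underscore
\W      any character not matched by \w
^       beginning of string
$       end of string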

slide-15
SLIDE 15

Capture groups

String: My name is Claus
Regex:  My name is (.+)

slide-16
SLIDE 16

Capture groups

String:  My name is Claus
Regex:   My name is (.+)
Match:   My name is Claus
Group 1: Claus

slide-17
SLIDE 17

Commonly used special symbols in Python regular expressions

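Symbol  Meaning
.       matches any character (except a newline, by default)
+       1 or more of the preceding item
*       0 or more of the preceding item
()      capture group
\d      digit
\D      non-digit
\s      whitespace
\S      non-whitespace
\w      alphanumeric character or underscore
\W      any character not matched by \w
^       beginning of string
$       end of string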

slide-18
SLIDE 18

Capture groups

String:  My name is Claus
Regex:   My name is\s+(\S+)
Match:   My name is Claus
Group 1: Claus
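On "My name is Claus" both capture patterns give the same group. A sketch of where they differ, using a made-up string with extra spaces and a trailing word:

```python
import re

# Hypothetical string: three spaces after "is" and an extra word at the end.
text = "My name is   Claus today"

everything = re.search(r"My name is (.+)", text)     # captures the rest of the line
one_word = re.search(r"My name is\s+(\S+)", text)    # skips whitespace, captures one word

print(repr(everything.group(1)))  # '  Claus today'
print(repr(one_word.group(1)))    # 'Claus'
```

The \s+ sits outside the parentheses, so the whitespace it consumes is not part of the captured group.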

slide-19
SLIDE 19

Try it yourself

http://pythex.org/

slide-20
SLIDE 20

Regular Expressions in Python

slide-21
SLIDE 21

Digression: Raw strings

In [1]: print("line1\nline2")
Out[1]: line1
        line2

How can we print out "line1\nline2"?

In [2]: print("line1\\nline2")  # escape the backslash
Out[2]: line1\nline2

Simpler alternative: use a raw string

In [3]: print(r"line1\nline2")  # r stands for raw
Out[3]: line1\nline2
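Why this matters for regular expressions: a sketch using \b, which is not in the symbol table above. In a regex, \b means "word boundary", but in an ordinary Python string "\b" is a backspace character. The example is an illustration chosen for this note, not taken from the slides.

```python
import re

text = "My name is Claus"

# Raw string: the regex engine receives the two characters \ and b
# and interprets them as a word boundary.
raw = re.search(r"\bname\b", text)

# Ordinary string: Python converts \b into a single backspace character
# before the regex engine sees it, so the pattern looks for
# backspace + "name" + backspace, which is not in the text.
plain = re.search("\bname\b", text)

print(len(r"\b"), len("\b"))  # 2 1
print(raw is not None)        # True
print(plain is not None)      # False
```

Writing every pattern as a raw string avoids having to remember which backslash sequences Python would otherwise translate.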

slide-22
SLIDE 22

The key regex function we will use is re.search()

In [1]: import re  # load regular expression module
        test_string = "My name is Claus"
        match = re.search(r"name", test_string)
        if match:  # did we find a match?
            print("Test string matches.")
            print("Match:", match.group())
Out[1]: Test string matches.
        Match: name
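The match object also records where the match was found. A minimal sketch using match.span() to split the test string into the part before the match, the matched part, and the part after it; the variable names are chosen for this illustration.

```python
import re

test_string = "My name is Claus"
match = re.search(r"name", test_string)

start, end = match.span()          # (3, 7)
before = test_string[:start]       # 'My '
matched = test_string[start:end]   # 'name', same as match.group()
after = test_string[end:]          # ' is Claus'

print(repr(before), repr(matched), repr(after))
```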

slide-23
SLIDE 23

match.group() returns the part of the string that matched

In [1]: import re
        test_string = "My age is secret."
        match = re.search(r"My \S+ is", test_string)
        print(match.group())
Out[1]: My age is

In [2]: test_string = "My mood is good."
        match = re.search(r"My \S+ is", test_string)
        print(match.group())
Out[2]: My mood is
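These examples call match.group() without first checking whether a match was found. When the pattern does not match, re.search() returns None, and calling .group() on None raises an AttributeError. A sketch of that failure case, with a pattern made up so that it does not match:

```python
import re

test_string = "My name is Claus"
match = re.search(r"My \S+ was", test_string)  # "was" does not occur in the string

try:
    print(match.group())
except AttributeError as err:
    # match is None here, so .group() is not available
    print("No match:", err)
```

In real code, testing the result with "if match:" before using it avoids the exception.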

slide-24
SLIDE 24

match.group() also recovers any captured groups

In [1]: import re
        test_string = "My age is secret."
        match = re.search(r"My (\S+) is (\S+)", test_string)
        print("Match:", match.group(0))
        print("Captured group 1:", match.group(1))
        print("Captured group 2:", match.group(2))
Out[1]: Match: My age is secret.
        Captured group 1: age
        Captured group 2: secret.
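All captured groups can also be retrieved at once. A short sketch using match.groups(), plus a variant pattern (an assumption for this illustration, not from the slides) that uses \w+ so the trailing period is not captured:

```python
import re

test_string = "My age is secret."
match = re.search(r"My (\S+) is (\S+)", test_string)

print(match.groups())      # ('age', 'secret.')
print(match.group(1, 2))   # ('age', 'secret.')

# \w matches letters, digits, and underscore only, so the period is left out.
trimmed = re.search(r"My (\w+) is (\w+)", test_string)
print(trimmed.groups())    # ('age', 'secret')
```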