CS 230 - Spring 2020 4-1
CS 230 Introduction to Computers and Computer Systems
Lecture 19: NFAs and Regular Expressions

Non-deterministic Finite Automata (NFA)
The same as a DFA except that:
  NFAs may have several transitions from the same state on the same input, leading to different states
    DFAs can have only one transition per input per state
  NFAs can include an ε (epsilon) transition
    Moves to a new state without consuming input
Often easier to design than the equivalent DFA
More complex to evaluate on an input
An equivalent DFA can always be created from an NFA
All DFAs are also legal NFAs
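
The claim that a DFA can always be created from an NFA is usually shown with the subset construction, in which each DFA state stands for a set of NFA states. The following is a minimal sketch of that idea, not a construction taken from the lecture. The NFA in it, its state names and the use of the empty string as the ε label are all assumptions made for illustration.

```python
EPS = ""  # label used in this sketch for ε transitions (an assumption)

# Hypothetical NFA: state -> {symbol -> set of next states}
NFA = {
    "s": {"a": {"t", "u"}},   # two a transitions from the same state
    "t": {"b": {"f"}},
    "u": {EPS: {"f"}},        # ε transition: move without consuming input
    "f": {},
}
START, ACCEPT = "s", {"f"}


def eps_closure(states):
    """All states reachable from `states` using only ε transitions."""
    stack, seen = list(states), set(states)
    while stack:
        q = stack.pop()
        for r in NFA[q].get(EPS, ()):
            if r not in seen:
                seen.add(r)
                stack.append(r)
    return frozenset(seen)


def nfa_to_dfa(alphabet):
    """Subset construction: each DFA state is a frozenset of NFA states."""
    start = eps_closure({START})
    table, todo = {}, [start]
    while todo:
        current = todo.pop()
        if current in table:
            continue
        table[current] = {}
        for sym in alphabet:
            moved = set()
            for q in current:
                moved |= NFA[q].get(sym, set())
            nxt = eps_closure(moved)
            table[current][sym] = nxt   # exactly one target per symbol
            todo.append(nxt)
    accepting = {S for S in table if S & ACCEPT}
    return start, table, accepting


def dfa_accepts(word, alphabet="ab"):
    """Run the generated DFA on a word over the given alphabet."""
    start, table, accepting = nfa_to_dfa(alphabet)
    state = start
    for ch in word:
        state = table[state][ch]
    return state in accepting


print(dfa_accepts("a"), dfa_accepts("ab"), dfa_accepts("abb"))
```

The empty set of NFA states appears as an ordinary DFA state here; it plays the role of a dead state that can never reach acceptance.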

NFA Example
[Figure: NFA state diagram; transition labels: a, b, ε, a, c]
Start state has two a transitions
Bottom state has an epsilon transition to an accept state

Let's consider the input string: acb
  After reading a we have two choices (the start state has two a transitions), so we create “clones”
  Wait! There’s one more possibility! Add a clone.
  Two clones got stuck and died!
  At the end of the input, at least one clone is in an accept state
  The NFA accepts acb
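
The “clone” walk-through can be sketched in code by tracking the set of states that currently hold a clone. The diagram itself is not reproduced in this text, so the transition table below is only one NFA that is consistent with the description (two a transitions from the start state, an ε transition from the bottom state to the accept state). The state names and the exact placement of the b and c arrows are assumptions.

```python
EPS = "ε"

# One NFA consistent with the description above (assumed layout).
TRANSITIONS = {
    ("start", "a"): {"top", "bottom"},   # start state has two a transitions
    ("bottom", EPS): {"accept"},         # bottom state has an ε transition
    ("bottom", "c"): {"top"},            # assumed placement of the c arrow
    ("top", "b"): {"accept"},            # assumed placement of the b arrow
}
ACCEPTING = {"accept"}


def follow_epsilon(clones):
    """Add a clone for every state reachable through ε transitions."""
    clones = set(clones)
    frontier = list(clones)
    while frontier:
        state = frontier.pop()
        for nxt in TRANSITIONS.get((state, EPS), ()):
            if nxt not in clones:
                clones.add(nxt)
                frontier.append(nxt)
    return clones


def run(word):
    """Return the set of occupied states before and after each input symbol."""
    clones = follow_epsilon({"start"})
    history = [clones]
    for ch in word:
        moved = set()
        for state in clones:
            # A clone with no transition on ch is stuck and dies.
            moved |= TRANSITIONS.get((state, ch), set())
        clones = follow_epsilon(moved)
        history.append(clones)
    return history


def accepts(word):
    """Accept if at least one clone ends in an accept state."""
    return bool(run(word)[-1] & ACCEPTING)


for step, clones in enumerate(run("acb")):
    print(step, sorted(clones))
print("accepts acb:", accepts("acb"))
```

With this assumed layout there are three clones after reading a, two of them die on c, and the survivor reaches the accept state on b.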

NFA Example 2
[Figure: NFA state diagram; transition labels: a, b, ε, b]
Middle state has two b transitions
Start state has an epsilon transition to the accept state

Let's consider the input string: abab
  At the end of the input, at least one clone is in an accept state
  The NFA accepts abab

Try it Yourself
[Figure: NFA state diagram; transition labels: a, ε, b, c, c, b]
Does this NFA accept the string: bcbb
Answer: The NFA rejects bcbb

Try it Yourself
Draw an NFA for the language of [an even number of lowercase c] or [an odd number of lowercase d].
[Figure: solution NFA built up in steps: first two ε transitions, then two d transitions, then two c transitions]

Regular Expression
Regular expressions (also called regexes) are another way to define regular languages
Define a set of strings over an alphabet Σ
  Σ is the set of all legal characters in the language
Constants:
  empty set: Ø
  empty string: ε
  literal character: a ∈ Σ

Basic Regex Operations
Alternation: R|S = R ∪ S
  ∪ is the union operator
  “R or S” (inclusive or: a string that is in both R and S is also in R|S)
Concatenation: RS = { αβ : α in R and β in S }
  “R followed by S”
Kleene star: R* = smallest superset of R containing ε and closed under concatenation
  “Zero or more copies of R”
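
The three operations can be tried out on small finite languages represented as Python sets of strings. This is only an illustrative sketch: the function names are made up here, and because R* is infinite whenever R contains a non-empty string, `star` takes a hypothetical `max_copies` bound and returns a finite approximation.

```python
def alternation(R, S):
    """R|S: every string that is in R, in S, or in both."""
    return R | S


def concatenation(R, S):
    """RS: every string αβ with α in R and β in S."""
    return {alpha + beta for alpha in R for beta in S}


def star(R, max_copies):
    """Finite approximation of R*: zero up to max_copies copies of R."""
    result = {""}   # zero copies gives the empty string ε
    layer = {""}
    for _ in range(max_copies):
        layer = concatenation(layer, R)   # one more copy of R
        result |= layer
    return result


R, S = {"a", "b"}, {"c", "d"}
print(sorted(alternation(R, S)))     # ['a', 'b', 'c', 'd']
print(sorted(concatenation(R, S)))   # ['ac', 'ad', 'bc', 'bd']
print(sorted(star({"a"}, 3)))        # ['', 'a', 'aa', 'aaa']
```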

Basic Regex Examples
a* = { ε, a, aa, aaa, ... }
b|a* = { b, ε, a, aa, aaa, ... }
(0|1)* = binary numbers, plus the empty string
(h|c)at = { hat, cat }
(a|b)(c|d) = { ac, ad, bc, bd }
while = { while }
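
One way to check sets like these is to enumerate every short string over a small alphabet and keep the ones the pattern matches in full. The sketch below uses Python's `re` module for this; the helper name `language`, the choice of alphabets and the length limits are assumptions made for the example. Python's regex syntax agrees with the notation used here for these simple patterns.

```python
import re
from itertools import product


def language(pattern, alphabet, max_len):
    """All strings over `alphabet` of length <= max_len that fully match `pattern`."""
    compiled = re.compile(pattern)
    matches = []
    for n in range(max_len + 1):
        for letters in product(alphabet, repeat=n):
            word = "".join(letters)
            if compiled.fullmatch(word):   # the whole string must match
                matches.append(word)
    return matches


print(language(r"a*", "ab", 3))            # ['', 'a', 'aa', 'aaa']
print(language(r"b|a*", "ab", 3))          # ['', 'a', 'b', 'aa', 'aaa']
print(language(r"(a|b)(c|d)", "abcd", 2))  # ['ac', 'ad', 'bc', 'bd']
print(language(r"(h|c)at", "acht", 3))     # ['cat', 'hat']
```

The empty string `''` in the output corresponds to ε.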

More Regex Operations
a+ = { a, aa, aaa, ... }
a? = { ε, a }
ab+ = { ab, abb, abbb, ... }
[a-dqh] = { a, b, c, d, q, h }
(h|c)?at = { hat, cat, at }
(a|b)+(c|d) = { ac, ad, bc, bd, aac, aad, bbc, bbd, aaac, aaad, bbbc, bbbd, abc, abd, abac, ... }
wh?il[e-g] = { while, whilf, whilg, wile, wilf, wilg }

Regex Precedence
[] > () > * > + > ? > concatenation > |
Regex operators have precedence, which allows us to write regular expressions without overusing parentheses
ab+ = { ab, abb, abbb, abbbb, ... }
c|de = { c, de }
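
The effect of precedence is easiest to see by comparing a pattern with its explicitly parenthesized alternatives. The sketch below does this with Python's `re` module, which follows the same ordering for these operators; the helper name `matches` is made up for the example.

```python
import re


def matches(pattern, word):
    """True if the whole word matches the pattern."""
    return re.fullmatch(pattern, word) is not None


# + binds tighter than concatenation: ab+ means a(b+), not (ab)+
assert matches(r"ab+", "abbb")
assert not matches(r"ab+", "abab")
assert matches(r"(ab)+", "abab")

# Concatenation binds tighter than |: c|de means c|(de), not (c|d)e
assert matches(r"c|de", "c")
assert matches(r"c|de", "de")
assert not matches(r"c|de", "ce")
assert matches(r"(c|d)e", "ce")

print("precedence checks passed")
```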

Try it Yourself
Write a regular expression for the language of all strings beginning with an optional lowercase a, followed by two or more copies of any other lowercase letter, followed by a lowercase a.
Building the answer piece by piece:
  a?
  a?[b-z][b-z]
  a?[b-z][b-z]+
  a?[b-z][b-z]+a
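
The finished answer can be sanity-checked against a few strings. This is a small sketch using Python's `re` module; the test words and the helper name `in_language` are invented for the example.

```python
import re

ANSWER = re.compile(r"a?[b-z][b-z]+a")


def in_language(word):
    """True if the whole word matches a?[b-z][b-z]+a."""
    return ANSWER.fullmatch(word) is not None


print(in_language("abca"))    # True: optional a, then b and c, then a
print(in_language("bca"))     # True: the leading a is optional
print(in_language("bcdefa"))  # True: more than two middle letters is fine
print(in_language("aba"))     # False: only one letter between the a's
print(in_language("bc"))      # False: missing the final a
```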

Regular Expressions and DFA/NFA
Regular expressions define regular languages
NFAs and DFAs define regular languages
Simple conversion rules:
  [Figure: one small state diagram for each of a*, a?, a+ and a|b; transition labels: a, a, a, a, a, b]
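
The rule diagrams are not reproduced in this text, so the tables below are an assumed reading of them: one small machine per rule, with state numbers chosen for this sketch and a missing transition meaning the machine gets stuck and rejects. Each machine is compared against Python's `re` module on every string over {a, b} up to a small length.

```python
import re
from itertools import product

# Each machine: (start state, accepting states, {(state, symbol): next state}).
# A missing entry means the machine is stuck and rejects.
MACHINES = {
    "a*":  (0, {0},    {(0, "a"): 0}),              # loop on a, start accepts
    "a?":  (0, {0, 1}, {(0, "a"): 1}),              # zero or one a
    "a+":  (0, {1},    {(0, "a"): 1, (1, "a"): 1}), # one a, then loop
    "a|b": (0, {1},    {(0, "a"): 1, (0, "b"): 1}), # either symbol once
}


def accepts(machine, word):
    """Run one machine on a word."""
    state, accepting, delta = machine
    for ch in word:
        if (state, ch) not in delta:
            return False
        state = delta[(state, ch)]
    return state in accepting


def agrees_with_regex(pattern, max_len=4):
    """Compare the machine with re.fullmatch on all short strings over {a, b}."""
    for n in range(max_len + 1):
        for letters in product("ab", repeat=n):
            word = "".join(letters)
            expected = re.fullmatch(pattern, word) is not None
            if accepts(MACHINES[pattern], word) != expected:
                return False
    return True


for pattern in MACHINES:
    print(pattern, agrees_with_regex(pattern))
```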

Regex to DFA Example
Regular expression: (ab)?(cd)+
We know we start with an optional ab, so let's use that rule first:
  [Figure: DFA fragment; transition labels: a, b]
Now we add one or more copies of cd. But how do we get the ab first and then the cd?
  [Figure: DFA in progress; transition labels: a, b, d, c, c]
We need to add a c arrow and remove the extra accept state. Now it works!
  [Figure: finished DFA; transition labels: a, b, d, c, c, c]
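
As a rough check of the finished machine, the sketch below writes down one DFA for (ab)?(cd)+ with the same six transition labels (one a, one b, one d, three c) and traces a few inputs through it. The diagram's states are unnamed in this text, so the numbering and the exact wiring are assumptions; a missing transition stands for an implicit dead state.

```python
# Hypothetical state numbering; the original diagram does not name its states.
DELTA = {
    0: {"a": 1, "c": 3},   # the added c arrow lets us skip the optional ab
    1: {"b": 2},
    2: {"c": 3},
    3: {"d": 4},
    4: {"c": 3},           # loop back for another copy of cd
}
START, ACCEPTING = 0, {4}


def trace(word):
    """Return the list of states visited, or None if the DFA gets stuck."""
    state, path = START, [START]
    for ch in word:
        if ch not in DELTA[state]:
            return None
        state = DELTA[state][ch]
        path.append(state)
    return path


def accepts(word):
    """Accept if the run does not get stuck and ends in an accept state."""
    path = trace(word)
    return path is not None and path[-1] in ACCEPTING


print(trace("abcdcd"), accepts("abcdcd"))  # [0, 1, 2, 3, 4, 3, 4] True
print(trace("cd"), accepts("cd"))          # [0, 3, 4] True
print(trace("ab"), accepts("ab"))          # [0, 1, 2] False
print(trace("abab"), accepts("abab"))      # None False
```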

DFA to Regex Example
[Figure: DFA state diagram; transition labels: a, a, b, b, b]
Write out as many accepted strings as you can find until you see a pattern:
  aa, bb, bbb, bbbb
We can see a choice of either two a or at least two b
Regex: aa|bb+ or aa|bbb*
Try it Yourself
r q
Describe the language accepted by the following DFA then write a regular expression that accepts the same language.
q r q q
CS 230 - Spring 2020 4-45
Try it Yourself
r q
Describe the language accepted by the following DFA then write a regular expression that accepts the same language. Strings accepted: r, qr, qqr, qqqr, rq, rqq, rqqq, qrq, qrqq, qqrqq, qqqrqq, qqrqqq, qqqrqqq, ...
q r q q
CS 230 - Spring 2020 4-46
Try it Yourself
r q
Describe the language accepted by the following DFA then write a regular expression that accepts the same language. This DFA accepts the language of strings containing a single lowercase r and any number of lowercase q.
q r q q
CS 230 - Spring 2020 4-47
Try it Yourself
r q
Describe the language accepted by the following DFA then write a regular expression that accepts the same language. This DFA accepts the same language as the regular expression: q*rq*
q r q q
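
The description in words and the regular expression q*rq* can be compared by brute force. The sketch below checks every string over {q, r} up to length 6, an arbitrary limit chosen for the example, and lists any string on which the two disagree.

```python
import re
from itertools import product

PATTERN = re.compile(r"q*rq*")


def exactly_one_r(word):
    """The description in words: a single r and any number of q."""
    return word.count("r") == 1


# Collect every short string on which the regex and the description disagree.
mismatches = [
    word
    for n in range(7)
    for word in map("".join, product("qr", repeat=n))
    if (PATTERN.fullmatch(word) is not None) != exactly_one_r(word)
]
print(mismatches)  # []
```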

Regex Syntax Extensions
Dot (.) matches any single character
  .at matches hat, cat, fat, mat, bat, 7at, Aat, etc.
Escape character (\)
  Used to match actual brackets, dots, plus signs, etc.
  [0-9]+\.[0-9]+ matches fractional numbers
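
The difference between a plain dot and an escaped dot shows up as soon as the input contains something other than a literal period. The sketch below uses Python's `re` module, where the dot matches any single character except a newline by default; the test strings and the helper name `matches` are invented for the example.

```python
import re


def matches(pattern, word):
    """True if the whole word matches the pattern."""
    return re.fullmatch(pattern, word) is not None


# Dot matches any single character, but there must be exactly one.
assert matches(r".at", "cat")
assert matches(r".at", "7at")
assert not matches(r".at", "at")

# Escaped dot matches only a literal period.
assert matches(r"[0-9]+\.[0-9]+", "3.14")
assert not matches(r"[0-9]+\.[0-9]+", "3x14")

# Without the escape, the dot also accepts other characters.
assert matches(r"[0-9]+.[0-9]+", "3x14")

print("dot and escape checks passed")
```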

Real-world Example
Search for all occurrences of a name in a file
Using the Unix command “egrep”
Different spellings for Georg Friedrich Händel:
  Händel
  Haendel
  Handel
  Hendel
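
One pattern that covers all four spellings is H(ä|ae|a|e)ndel. This pattern is a suggestion, not necessarily the one used in the lecture. The sketch below tries it in Python rather than with egrep, on a made-up piece of text, and assumes the ä is stored as a single precomposed character.

```python
import re

# Alternation over the four ways the first vowel is written.
NAME = re.compile(r"H(ä|ae|a|e)ndel")

# Hypothetical sample text, invented for this example.
text = (
    "Händel was born in Halle.\n"
    "Haendel and Handel are common spellings.\n"
    "Some catalogues list Hendel.\n"
    "Haydn is a different composer.\n"
)

found = [m.group(0) for m in NAME.finditer(text)]
print(found)  # ['Händel', 'Haendel', 'Handel', 'Hendel']
```

Words such as Halle and Haydn are skipped because the pattern requires the literal ending ndel after the vowel.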