 
Regular Expressions

Logistics
• Homework
– Homework #1 due today.
– Homework #2
• Exercise 3.1.1 (a,b,c), pg 89
• Exercise 3.1.4 (a,b,c), pg 90
• Exercise 3.2.2 (a – d), pg 106
• Exercise 3.2.4 (a,b,c), pg 106
• Take the NFA-ε in any part of 3.2.4 and convert to a DFA.

Questions
• Any questions before we start?

Languages
• Recall.
– What is a language?
– What is a class of languages?

Languages
• A language is a set of strings.
• A class of languages is nothing more than a set of languages.

String Recognition machine
• Given a string and a definition of a language (set of strings), is the string a member of the language?
[Figure: Input string → Language recognition machine → "YES, string is in Language" or "NO, string is not in Language"]
Regular Languages
• Today we continue looking at our first class of languages: Regular languages
– Means of defining: Regular Expressions
– Machine for accepting: Finite Automata

Specifying Languages
• Recall: how do we specify languages?
– If language is finite, you can list all of its strings.
• L = {a, aa, aba, aca}
– Descriptive:
• L = {x | n_a(x) = n_b(x)}
– Using basic Language operations
• L = {aa, ab}* ∪ {b}{bb}*
• Regular languages are described using this last method

Regular Languages
• A regular language over Σ is a language that can be expressed using only the set operations of
– Union
– Concatenation
– Kleene Star

Kleene Star Operation
• The set of strings that can be obtained by concatenating any number of elements of a language L is called the Kleene Star, L*

$L^* = \bigcup_{i=0}^{\infty} L^i = L^0 \cup L^1 \cup L^2 \cup L^3 \cup L^4 \cup \dots$

• Note that since L* contains L^0, ε is an element of L*

Regular Expressions
• Regular expressions are the mechanism by which regular languages are described:
– Take the “set operation” definition of the language and:
• Replace ∪ with +
• Replace {} with ()
– And you have a regular expression

Regular expressions

Language                        Regular expression
{ε}                             ε
{011}                           011
{0, 1}                          0 + 1
{0, 01}                         0 + 01
{110}*{0, 1}                    (110)*(0 + 1)
{10, 11, 01}*                   (10 + 11 + 01)*
{0, 11}*({11}* ∪ {101, ε})      (0 + 11)*((11)* + 101 + ε)
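The three set operations above can be tried out directly on finite sets of strings. The following is a minimal sketch, not part of the original slides: since L* is infinite whenever L contains a non-empty string, the hypothetical helper `star_up_to` only collects the members of L* up to a chosen length bound. The names `concat` and `star_up_to` are assumptions made for this illustration.

```python
def concat(L1, L2):
    """Concatenation of two languages: every x in L1 followed by every y in L2."""
    return {x + y for x in L1 for y in L2}


def star_up_to(L, max_len):
    """Members of L* whose length is at most max_len (L* itself may be infinite)."""
    result = {""}      # L^0 = {ε}, so ε is always in L*
    frontier = {""}    # strings discovered in the previous round
    while frontier:
        # Append one more element of L to each newly found string.
        frontier = {w for w in concat(frontier, L) if len(w) <= max_len} - result
        result |= frontier
    return result


# L = {aa, ab}* ∪ {b}{bb}*, restricted to strings of length <= 3
left = star_up_to({"aa", "ab"}, 3)
right = {w for w in concat({"b"}, star_up_to({"bb"}, 3)) if len(w) <= 3}
example = left | right   # union is the built-in set operator |
```

Raising the bound shows more of the language; the bound is only there to keep the enumeration finite.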
Regular Expression
• Recursive definition of regular languages / expression over Σ:
1. ∅ is a regular language and its regular expression is ∅
2. {ε} is a regular language and ε is its regular expression
3. For each a ∈ Σ, {a} is a regular language and its regular expression is a
4. If L1 and L2 are regular languages with regular expressions r1 and r2 then
-- L1 ∪ L2 is a regular language with regular expression (r1 + r2)
-- L1L2 is a regular language with regular expression (r1r2)
-- L1* is a regular language with regular expression (r1*)
Only languages obtainable by using rules 1-4 are regular languages.

Regular Expressions
• Some shorthand
– If we apply precedence to the operators, we can relax the fully parenthesized definition:
• Kleene star has highest precedence
• Concatenation has middle precedence
• + has lowest precedence
– Thus
• a + b*c is the same as (a + ((b*)c))
• (a + b)* is not the same as a + b*

Regular Expressions
• More shorthand
– Equating regular expressions.
• Two regular expressions are considered equal if they describe the same language
• 1*1* = 1*
• (a + b)* ≠ a + b*

Regular Expressions
• Important thing to remember
– A regular expression is not a language
– A regular expression is used to describe a language.
– It is incorrect to say that for a language L,
• L = (a + b + c)*
– But it’s okay to say that L is described by
• (a + b + c)*

Regular Expressions
• Even more shorthand
– Sometimes you might see in the book:
• r^n where n indicates the number of concatenations of r (e.g. r^6)
• r^+ to indicate one or more concatenations of r.
– Note that this is only shorthand!
– r^6 and r^+ are not regular expressions.
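One way to make the recursive definition concrete is to give each rule its own constructor and compute the described language by recursion on the expression. The sketch below is an illustration under assumptions, not the book's construction: the class names and the `lang` function are hypothetical, symbols are assumed to be single characters, and languages are only enumerated up to a length bound n. Agreement up to a bound is therefore only evidence that two expressions are equal, while a disagreement proves they are not.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Empty:            # rule 1: ∅
    pass


@dataclass(frozen=True)
class Eps:              # rule 2: ε
    pass


@dataclass(frozen=True)
class Sym:              # rule 3: a single symbol a ∈ Σ
    a: str


@dataclass(frozen=True)
class Union:            # rule 4: (r1 + r2)
    r1: object
    r2: object


@dataclass(frozen=True)
class Concat:           # rule 4: (r1 r2)
    r1: object
    r2: object


@dataclass(frozen=True)
class Star:             # rule 4: (r1*)
    r: object


def lang(r, n):
    """Strings of length <= n in the language described by r."""
    if isinstance(r, Empty):
        return set()
    if isinstance(r, Eps):
        return {""}
    if isinstance(r, Sym):
        return {r.a} if n >= 1 else set()
    if isinstance(r, Union):
        return lang(r.r1, n) | lang(r.r2, n)
    if isinstance(r, Concat):
        return {x + y for x in lang(r.r1, n) for y in lang(r.r2, n)
                if len(x + y) <= n}
    if isinstance(r, Star):
        base = lang(r.r, n)
        result, frontier = {""}, {""}
        while frontier:
            frontier = {x + y for x in frontier for y in base
                        if len(x + y) <= n} - result
            result |= frontier
        return result
    raise TypeError(f"not a regular expression: {r!r}")


a, b, c, one = Sym("a"), Sym("b"), Sym("c"), Sym("1")

# a + b*c written out with full parentheses: (a + ((b*)c))
shorthand = Union(a, Concat(Star(b), c))

# 1*1* and 1* agree on every string up to the bound
same_up_to_4 = lang(Concat(Star(one), Star(one)), 4) == lang(Star(one), 4)

# (a + b)* and a + b* differ: "ab" is only in the first language
differ = lang(Star(Union(a, b)), 2) != lang(Union(a, Star(b)), 2)
```

Keeping the expression (a tree of constructors) separate from `lang(r, n)` (a set of strings) mirrors the point that a regular expression is not a language but describes one.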
Regular Expressions
• Questions?

Examples of Regular Languages
• All finite languages are regular
– Can anyone tell me why?

Examples of Regular Languages
• All finite languages are regular
– A finite language L can be expressed as the union of languages each with one string corresponding to a string in L
– Example:
• L = {a, aa, aba, aca}
• L = {a} ∪ {aa} ∪ {aba} ∪ {aca}
• Regular expression: (a + aa + aba + aca)

Examples of Regular Languages
• L = {x ∈ {0,1}* | |x| is even}
– Any string of even length can be obtained by concatenating strings of length 2.
– Any concatenation of strings of length 2 will have even length
– L = {00, 01, 10, 11}*
– Regular expressions describing L:
• (00 + 01 + 10 + 11)*
• ((0 + 1)(0 + 1))*

Examples of Regular Languages
• L = {x ∈ {0,1}* | x contains an odd number of 0s}
– Express x = yz
– y is a string of the form y = 1^i 0 1^j
– In z, there must be an even number of additional 0s, or z = (0 1^k 0 1^m)*
– x can be described by (1*01*)(01*01*)*

Examples of Regular Languages
• L = {x ∈ {0,1}* | x does not end in 01}
– If x does not end in 01, then either
• |x| < 2 or
• x ends in 00, 10, or 11
– A regular expression that describes L is:
• ε + 0 + 1 + (0 + 1)*(00 + 10 + 11)
– Questions?
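The example expressions above can be sanity-checked by brute force: translate each one into Python's `re` syntax (+ becomes |, ε becomes an empty alternative) and compare it against the set-builder definition on every binary string up to some length. This is a sketch for building confidence, not a proof; the helper names `binary_strings` and `mismatches` and the bound of 8 are assumptions chosen for the illustration.

```python
import re
from itertools import product


def binary_strings(max_len):
    """Yield every string over {0,1} of length 0..max_len."""
    for n in range(max_len + 1):
        for symbols in product("01", repeat=n):
            yield "".join(symbols)


def mismatches(pattern, predicate, max_len=8):
    """Strings on which the regular expression and the definition disagree."""
    rx = re.compile(pattern)
    return [x for x in binary_strings(max_len)
            if bool(rx.fullmatch(x)) != predicate(x)]


CASES = [
    # |x| is even
    (r"(00|01|10|11)*", lambda x: len(x) % 2 == 0),
    (r"((0|1)(0|1))*", lambda x: len(x) % 2 == 0),
    # x contains an odd number of 0s
    (r"(1*01*)(01*01*)*", lambda x: x.count("0") % 2 == 1),
    # x does not end in 01 (the leading empty alternative plays the role of ε)
    (r"|0|1|(0|1)*(00|10|11)", lambda x: not x.endswith("01")),
]

results = [mismatches(pattern, predicate) for pattern, predicate in CASES]
```

An empty list for each case means the expression and the definition agree on all strings up to the bound. `fullmatch` is used because a regular expression describes whole strings, not substrings.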
Useful properties of regular expressions
• Commutative
– L + M = M + L
• Associative
– (L + M) + N = L + (M + N)
– (LM)N = L(MN)
• Identities
– ∅ + L = L + ∅ = L
– εL = Lε = L
– ∅L = L∅ = ∅

Useful properties of regular expressions
• Distributive
– L(M + N) = LM + LN
– (M + N)L = ML + NL
• Idempotent
– L + L = L

Useful properties of regular expressions
• Closures
– (L*)* = L*
– ∅* = ε
– ε* = ε
– L^+ = LL*
– L* = L^+ + ε

Practical uses for regular expressions
• grep
– Global (search for) Regular Expressions and Print
– Finds patterns of characters in a text file.
– grep man foo.txt
– grep -E '[ab]*c[de]?' foo.txt

Practical uses for regular expressions
• How a compiler works
– The Lexical Analyzer (lexer) reads source code and generates a stream of tokens
– What is a token?
• Identifier
• Keyword
• Number
• Operator
• Punctuation

Practical uses for regular expressions
• How a compiler works
[Figure: Source file → lexer → Stream of tokens → parser → Parse Tree → codegen → Object code]
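The idea behind grep fits in a few lines: keep every line of the input in which the pattern matches somewhere. The sketch below uses Python's `re` module rather than grep itself, so its pattern syntax is an assumption that happens to coincide with extended grep syntax for the two patterns on the slide; the function name `grep` and the sample lines are made up for the illustration.

```python
import re


def grep(pattern, lines):
    """Return the lines that contain a match for pattern anywhere in the line."""
    rx = re.compile(pattern)
    return [line for line in lines if rx.search(line)]


sample = ["the man page", "abbac", "xyz", "cd"]

with_man = grep("man", sample)
with_c = grep("[ab]*c[de]?", sample)
```

Note that `[ab]*` and `[de]?` can both match the empty string, so the second pattern selects exactly the lines that contain a c; the surrounding parts only change how much of the line is matched, not which lines are printed.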
Practical uses for regular expressions
• How a compiler works
– Tokens can be described using regular expressions!

Examples of Regular Languages
• L = set of valid C keywords
– This is a finite set
– L can be described by
• if + else + while + do + goto + break + switch + …

Examples of Regular Languages
• L = set of valid C identifiers
– A valid C identifier begins with a letter or _
– A valid C identifier contains letters, digits, and _
– If we let:
• l = {a, b, …, z, A, B, …, Z}
• d = {1, 2, …, 9, 0}
– Then a regular expression for L:
• (l + _)(l + d + _)*

Practical uses for regular expressions
• lex
– Program that will create a lexical analyzer.
– Input: set of valid tokens
– Tokens are given by regular expressions.

Summary
• Regular languages can be expressed using only the set operations of union, concatenation, Kleene Star.
• Regular languages
– Means of describing: Regular Expression
– Machine for accepting: Finite Automata
• Practical uses
– Text search (grep)
– Compilers / Lexical Analysis (lex)
• Questions?
• Break time!
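To see how token descriptions turn into a lexical analyzer, here is a minimal hand-written sketch in Python. It is not lex and does not use lex's input format; the token names, the operator and punctuation sets, and the small keyword list are assumptions chosen for the illustration. The IDENTIFIER pattern is the expression (l + _)(l + d + _)* from the slide written in Python's `re` syntax.

```python
import re

# A small, illustrative subset of C keywords (assumption, not the full list).
KEYWORDS = {"if", "else", "while", "do", "goto", "break", "switch"}

# One regular expression per token class.
TOKEN_SPEC = [
    ("NUMBER", r"[0-9]+"),
    ("IDENTIFIER", r"[A-Za-z_][A-Za-z0-9_]*"),   # (l + _)(l + d + _)*
    ("OPERATOR", r"[+\-*/=<>!]=?"),
    ("PUNCTUATION", r"[(){};,]"),
    ("SKIP", r"\s+"),                            # whitespace, not a token
]

# Union (|) of all token expressions, each in its own named group.
MASTER = re.compile("|".join(f"(?P<{name}>{rx})" for name, rx in TOKEN_SPEC))


def tokenize(source):
    """Turn source text into a list of (kind, text) tokens."""
    tokens = []
    pos = 0
    while pos < len(source):
        m = MASTER.match(source, pos)
        if m is None:
            raise ValueError(f"unexpected character {source[pos]!r} at {pos}")
        kind, text = m.lastgroup, m.group()
        # Keywords look like identifiers, so they are separated afterwards.
        if kind == "IDENTIFIER" and text in KEYWORDS:
            kind = "KEYWORD"
        if kind != "SKIP":
            tokens.append((kind, text))
        pos = m.end()
    return tokens


stream = tokenize("while (x1 <= 10) x1 = x1 + 1;")
```

Matching keywords with the identifier pattern and reclassifying them afterwards is one common design choice; listing each keyword as its own alternative ahead of IDENTIFIER would also work, provided the patterns are ordered carefully.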