Regular Expressions = Regular Languages

  1. Regular Expressions = Regular Languages Mark Greenstreet, CpSc 421, Term 1, 2008/09 17 September 2008 – p.1/18

  2. Lecture Outline (p.2/18)
     - Regular Expressions
     - Equivalence of Regular Expressions and Finite Automata

  3. Regular Madlibs (p.3/18)
     Once upon a [noun], there was a [noun] that [past tense verb] [zero or more adjectives] [plural noun].
     - Let avocado denote the language {avocado}.
     - Let noun = avocado ∪ beach ∪ carrot ∪ caterpillar ∪ pencil ∪ penguin ∪ zombie.
     - Let pluralNoun = noun s.
     - Let verb = add ∪ compile ∪ eat ∪ sing ∪ swim ∪ walk.
     - Let pastVerb = verb ed.
     - Let adjective = beautiful ∪ big ∪ cold ∪ considerable ∪ furry ∪ insipid ∪ yellow.
     - Now, our Madlib™ is
       Once upon a noun, there was a noun, that pastVerb (adjective)∗ pluralNoun.

  4. Regular Madlibs (p.3/18, same definitions as slide 3, first noun chosen)
     Once upon a pencil, there was a noun, that pastVerb (adjective)∗ pluralNoun.

  5. Regular Madlibs (p.3/18, second noun chosen)
     Once upon a pencil, there was a carrot, that pastVerb (adjective)∗ pluralNoun.

  6. Regular Madlibs (p.3/18, past-tense verb chosen)
     Once upon a pencil, there was a carrot, that walked (adjective)∗ pluralNoun.

  7. Regular Madlibs (p.3/18, first adjective chosen)
     Once upon a pencil, there was a carrot, that walked beautiful, (adjective)∗ pluralNoun.

  8. Regular Madlibs (p.3/18, second adjective chosen, star closed)
     Once upon a pencil, there was a carrot, that walked beautiful, considerable pluralNoun.

  9. Regular Madlibs (p.3/18, plural noun chosen)
     Once upon a pencil, there was a carrot, that walked beautiful, considerable penguins.
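The Madlib can be written down almost verbatim as a pattern for Python's `re` module. The following is only a sketch under a few assumptions that are not on the slide: words are separated by single spaces, adjectives are not separated by commas, the noun list uses the singular "penguin", and the names `union`, `MADLIB`, and `is_madlib` are made up for illustration.

```python
import re

# Word lists copied from the slide; each list is a union of singleton languages.
# Assumption: "penguin" is singular here so that noun + "s" pluralizes it.
NOUNS = ["avocado", "beach", "carrot", "caterpillar", "pencil", "penguin", "zombie"]
VERBS = ["add", "compile", "eat", "sing", "swim", "walk"]
ADJECTIVES = ["beautiful", "big", "cold", "considerable", "furry", "insipid", "yellow"]


def union(words):
    """Pattern for the union of the singleton languages {w}."""
    return "(?:" + "|".join(re.escape(w) for w in words) + ")"


noun = union(NOUNS)
plural_noun = noun + "s"          # pluralNoun = noun s
past_verb = union(VERBS) + "ed"   # pastVerb = verb ed
adjective = union(ADJECTIVES)

# Assumption: single spaces between words, so each adjective carries a trailing space.
MADLIB = re.compile(
    f"Once upon a {noun}, there was a {noun}, that {past_verb} "
    f"(?:{adjective} )*{plural_noun}\\."
)


def is_madlib(sentence: str) -> bool:
    """True if the whole sentence is in the Madlib language."""
    return MADLIB.fullmatch(sentence) is not None
```

Because pluralNoun and pastVerb are plain concatenations, the language also contains strings such as "beachs" and "swimed"; the expression describes a set of strings, not English grammar.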

  10. Regular Expressions (p.4/18)
      - A regular expression R denotes a language L(R), defined inductively:

            R          L(R)               where
            ∅          ∅
            ε          {ε}
            c          {c}                c ∈ Σ
            R1 ∪ R2    L(R1) ∪ L(R2)      R1 and R2 are regular expressions
            R1 · R2    L(R1) · L(R2)      R1 and R2 are regular expressions
            R1∗        L(R1)∗             R1 is a regular expression

      - Language union, concatenation, and asteration were defined in the Sept. 10 notes and Sipser p. 44.
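One way to make the inductive definition concrete is to give each of the six cases its own Python class and compute L(R) by recursion on the structure of R. This is a minimal sketch, not part of the lecture: the class names and the `language` function are invented here, and since L(R) can be infinite the function only returns the strings of length at most `max_len`.

```python
from dataclasses import dataclass


class Regex:
    """Base class for the six cases of the inductive definition."""


@dataclass(frozen=True)
class Empty(Regex):      # ∅
    pass


@dataclass(frozen=True)
class Epsilon(Regex):    # ε
    pass


@dataclass(frozen=True)
class Sym(Regex):        # c, with c in Σ
    c: str


@dataclass(frozen=True)
class Union(Regex):      # R1 ∪ R2
    r1: Regex
    r2: Regex


@dataclass(frozen=True)
class Concat(Regex):     # R1 · R2
    r1: Regex
    r2: Regex


@dataclass(frozen=True)
class Star(Regex):       # R1∗
    r1: Regex


def language(r: Regex, max_len: int) -> set[str]:
    """Strings of L(r) with length at most max_len."""
    if isinstance(r, Empty):
        return set()
    if isinstance(r, Epsilon):
        return {""}
    if isinstance(r, Sym):
        return {r.c} if len(r.c) <= max_len else set()
    if isinstance(r, Union):
        return language(r.r1, max_len) | language(r.r2, max_len)
    if isinstance(r, Concat):
        left, right = language(r.r1, max_len), language(r.r2, max_len)
        return {x + y for x in left for y in right if len(x + y) <= max_len}
    if isinstance(r, Star):
        base = language(r.r1, max_len) - {""}   # ε adds nothing new under ∗
        result, frontier = {""}, {""}           # ε is always in L(r)∗
        while frontier:
            frontier = {x + y for x in frontier for y in base
                        if len(x + y) <= max_len} - result
            result |= frontier
        return result
    raise TypeError(f"not a regular expression: {r!r}")
```

For instance, `language(Star(Empty()), 3)` contains only the empty string, matching the remark that ∅∗ = {ε}.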

  11. Regular Expressions Examples (p.5/18)
      Let Σ = {a, b}.
      - a∗b∗: the set of all strings with zero or more a's followed by zero or more b's. For example, the strings ε, a, aaab, bb, and aabbb are in this language. The strings aba and ba are not.
      - (aaa)∗(bb)∗b: the set of all strings consisting of a number of a's that is divisible by three, followed by an odd number of b's. For example, the strings b, aaabbb, and aaaaaaaaaaaabbbbb are in this language, but the strings ε, baaa, and aabbb are not.
      - aΣ∗b: the set of all strings that begin with an a and end with a b. For example, the strings ab, ababab, and abbbaabaaabab are in this language, but the strings a, aba, and babbab are not.
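The membership claims on this slide can be checked mechanically. The sketch below uses Python's `re` module as a stand-in for the slide's notation (an assumption of this example: Σ is written as the character class `[ab]`, and `check_examples` is a hypothetical helper name).

```python
import re

# pattern -> (strings the slide says are in the language, strings it says are not)
EXAMPLES = {
    r"a*b*": (["", "a", "aaab", "bb", "aabbb"], ["aba", "ba"]),
    r"(aaa)*(bb)*b": (["b", "aaabbb", "a" * 12 + "b" * 5], ["", "baaa", "aabbb"]),
    r"a[ab]*b": (["ab", "ababab", "abbbaabaaabab"], ["a", "aba", "babbab"]),
}


def check_examples() -> bool:
    """True if every listed string is classified the way the slide claims."""
    for pattern, (members, non_members) in EXAMPLES.items():
        # fullmatch requires the whole string to be in the language.
        if not all(re.fullmatch(pattern, w) for w in members):
            return False
        if any(re.fullmatch(pattern, w) for w in non_members):
            return False
    return True
```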

  12. A Few More Remarks (p.6/18)
      - We'll write Σ as a regular expression that generates the language of all strings in Σ^1, that is, the strings of length one.
      - From the definition of L∗, we note that ε ∈ L∗ for any language L. In particular, note that ∅∗ = {ε}.
      - Regular expressions and programming languages. The following regular expressions describe various lexical pieces of Java:
        - The keyword class: class.
        - Identifiers: ([A-Z] ∪ [a-z] ∪ _ ∪ $)([A-Z] ∪ [a-z] ∪ _ ∪ $ ∪ [0-9])∗, where [A-Z] denotes all characters from A to Z, and likewise for [a-z] and [0-9].
        - Floating point numbers:
          (([0-9]⁺ . [0-9]∗) ∪ ([0-9]∗ . [0-9]⁺))(ε ∪ (e(+ ∪ − ∪ ε)[0-9]⁺)) ∪ [0-9]⁺ e(+ ∪ − ∪ ε)[0-9]⁺,
          where [0-9]⁺ = [0-9][0-9]∗.
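The two lexical expressions translate fairly directly into Python `re` patterns. This is a sketch of the expressions as written on the slide, not of the full Java language specification, and it assumes the symbol missing from the extracted identifier expression is the underscore; the names `IDENTIFIER` and `FLOAT` are chosen here for illustration.

```python
import re

# Identifiers: a letter, underscore, or dollar sign, then any number of
# letters, underscores, dollar signs, or digits.
IDENTIFIER = re.compile(r"[A-Za-z_$][A-Za-z_$0-9]*")

# Floating point numbers, with D = [0-9]:
#   ((D+ . D*) ∪ (D* . D+)) (ε ∪ e(+ ∪ − ∪ ε)D+)   ∪   D+ e(+ ∪ − ∪ ε)D+
FLOAT = re.compile(
    r"(?:[0-9]+\.[0-9]*|[0-9]*\.[0-9]+)(?:e[+-]?[0-9]+)?"  # has a decimal point
    r"|[0-9]+e[+-]?[0-9]+"                                  # digits plus exponent
)


def is_identifier(text: str) -> bool:
    return IDENTIFIER.fullmatch(text) is not None


def is_float(text: str) -> bool:
    return FLOAT.fullmatch(text) is not None
```

Under this expression a bare integer such as 42 is not a floating point number, because every alternative requires either a decimal point or an exponent.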

  13. RE = DFA = NFA (p.7/18)
      [Diagram: conversions among DFAs, NFAs, and regular expressions]
      - DFAs to NFAs: every DFA is an NFA.
      - NFAs to DFAs: power set construction.
      - Regular expressions to NFAs: show a construction for each case in the definition of regular expression.
      - DFAs to regular expressions: treat edge labels as regular expressions; eliminate states to get a regular expression.
      - We will show that every language described by a regular expression is recognized by an NFA.
      - We will then show that every language recognized by a DFA has a corresponding regular expression.

  14. From REs to NFAs: strategy (p.8/18)
      - Regular expressions are defined inductively (see slide 4).
      - Our proof is by induction on the structure of the regular expression.
      - One case for each way to form a regular expression:
        - The empty language: ∅
        - The empty string: ε
        - A single symbol: c
        - Union of two REs: R1 ∪ R2
        - Concatenation of two REs: R1 · R2
        - Kleene star: R∗

  15. From REs to NFAs (p.9/18)
      - R = ∅: [diagram]
      - R = ε: [diagram]
      - R = c: [diagram: an edge labeled c]
      - R = R1 ∪ R2: [diagram: N1, which recognizes R1, and N2, which recognizes R2, joined by ε-edges]

  16. From REs to NFAs (cont.) (p.10/18)
      - R = R1 · R2: [diagram: N1, which recognizes R1, connected by ε-edges to N2, which recognizes R2]
      - R = R1∗: [diagram: N1, which recognizes R1, with added ε-edges]
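The six cases can be sketched as a recursive function that builds an NFA fragment for each form of regular expression. The slide diagrams did not survive extraction, so the wiring below follows the usual textbook construction and may differ in detail from the lecture's; the tuple encoding of expressions and the names `NFA`, `build`, and `accepts` are assumptions of this example.

```python
from itertools import count

EPS = None  # label used for ε-edges


class NFA:
    def __init__(self):
        self.edges = {}          # (state, label) -> set of successor states
        self._ids = count()

    def new_state(self):
        return next(self._ids)

    def add_edge(self, src, label, dst):
        self.edges.setdefault((src, label), set()).add(dst)


def build(r, nfa):
    """Return (start, accepting_states) of a fragment of nfa recognizing r.

    r is a nested tuple: ("empty",), ("eps",), ("sym", c),
    ("union", r1, r2), ("concat", r1, r2) or ("star", r1).
    """
    kind = r[0]
    start = nfa.new_state()
    if kind == "empty":                  # no accepting state at all
        return start, set()
    if kind == "eps":                    # the start state itself accepts
        return start, {start}
    if kind == "sym":                    # one edge labeled c
        end = nfa.new_state()
        nfa.add_edge(start, r[1], end)
        return start, {end}
    s1, a1 = build(r[1], nfa)
    nfa.add_edge(start, EPS, s1)         # every remaining case enters N1 first
    if kind == "star":                   # accepting start, ε-edges loop back to it
        for a in a1:
            nfa.add_edge(a, EPS, start)
        return start, {start}
    s2, a2 = build(r[2], nfa)
    if kind == "union":                  # a second ε-edge from the new start into N2
        nfa.add_edge(start, EPS, s2)
        return start, a1 | a2
    if kind == "concat":                 # ε-edges from N1's accepts to N2's start
        for a in a1:
            nfa.add_edge(a, EPS, s2)
        return start, a2
    raise ValueError(f"unknown case: {kind}")


def eps_closure(nfa, states):
    """All states reachable from `states` using only ε-edges."""
    stack, seen = list(states), set(states)
    while stack:
        for t in nfa.edges.get((stack.pop(), EPS), ()):
            if t not in seen:
                seen.add(t)
                stack.append(t)
    return seen


def accepts(r, word):
    """Build the NFA for r and simulate it on word."""
    nfa = NFA()
    start, accepting = build(r, nfa)
    current = eps_closure(nfa, {start})
    for ch in word:
        moved = set()
        for s in current:
            moved |= nfa.edges.get((s, ch), set())
        current = eps_closure(nfa, moved)
    return bool(current & accepting)
```

Each case adds only a constant number of states and ε-edges on top of the fragments for its sub-expressions, which is what makes the induction on the structure of R go through.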

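  17. An Example (p.11/18)
      R = (b ∪ c ∪ ab)∗
      - a, b, c: [diagrams: one NFA per symbol]
      - ab: [diagram: the NFAs for a and b joined by an ε-edge]
      - b ∪ c: [diagram: the NFAs for b and c joined by ε-edges]
      - b ∪ c ∪ ab: [diagram: the NFAs for b, c, and ab joined by ε-edges]
      - (b ∪ c ∪ ab)∗: [diagram: the NFA for b ∪ c ∪ ab with added ε-edges]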
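Since the example's diagrams are not recoverable, a quick way to see which strings the final NFA should accept is to test the expression itself. The sketch below uses Python's `re` module rather than the NFA built on the slide, and the sample strings are chosen here, not taken from the lecture.

```python
import re

# R = (b ∪ c ∪ ab)∗ written as a Python pattern.
R = re.compile(r"(?:b|c|ab)*")


def in_language(word: str) -> bool:
    """True if the whole word is in L(R)."""
    return R.fullmatch(word) is not None
```

Every a must be followed immediately by a b, so strings such as a, ba, and aab are rejected while abcb and bcab are accepted.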

  18. From DFAs to REs (p.12/18)
      - Given a DFA, we want to construct a regular expression for the DFA's language.
      - The "hard" part is keeping track of all of the possible paths from the start state to an accepting state, especially because there can be many possible loops.
      - The key observation is that the symbols that label edges in a DFA are simple regular expressions.
      - We'll generalize this idea and allow arbitrary regular expressions on edges.
      - We'll use the flexibility of regular expressions to allow us to eliminate one state from the DFA at a time. We'll modify the REs for the remaining edges to account for the deleted states. Thus, our new DFA will recognize the same language as the original one.
      - By successively deleting states, we'll eventually get to a DFA with a start state, an accept state, and a single edge from the start state to the accept state. The label for this edge is the RE corresponding to the original DFA.
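The state-elimination idea can be sketched in a few dozen lines. This is one possible rendering under stated assumptions, not the lecture's own algorithm: it adds a fresh start state and a fresh accept state before eliminating anything, represents edge labels as Python `re` pattern strings (with `None` standing for ∅), and uses the invented names `union`, `concat`, `star`, and `dfa_to_regex`.

```python
import re
from itertools import product


def union(r, s):
    """r ∪ s, where None stands for ∅."""
    if r is None:
        return s
    if s is None:
        return r
    return r if r == s else f"(?:{r}|{s})"


def concat(r, s):
    """r · s; concatenating with ∅ gives ∅."""
    return None if r is None or s is None else r + s


def star(r):
    """r∗; both ∅∗ and ε∗ are ε (the empty pattern)."""
    return "" if not r else f"(?:{r})*"


def dfa_to_regex(states, delta, start, accepting):
    """Eliminate the DFA's states one at a time.

    delta maps (state, symbol) to the next state. Returns a pattern string,
    or None if the DFA accepts nothing.
    """
    START, ACCEPT = object(), object()       # fresh start and accept states
    edge = {(START, start): ""}              # ε-edge into the old start state
    for q in accepting:
        edge[(q, ACCEPT)] = ""               # ε-edges out of the old accept states
    for (p, c), q in delta.items():          # parallel edges merge into a union
        edge[(p, q)] = union(edge.get((p, q)), re.escape(c))

    for q in states:                         # remove q, rerouting paths through it
        loop = star(edge.get((q, q)))
        ins = [(p, r) for (p, t), r in edge.items() if t == q and p != q]
        outs = [(t, r) for (p, t), r in edge.items() if p == q and t != q]
        for (p, r_in), (t, r_out) in product(ins, outs):
            detour = concat(concat(r_in, loop), r_out)
            edge[(p, t)] = union(edge.get((p, t)), detour)
        edge = {k: v for k, v in edge.items() if q not in k}
    return edge.get((START, ACCEPT))
```

For a two-state DFA that accepts strings with an even number of a's, the result is a long but correct pattern; the expressions produced this way are rarely the shortest ones, since each eliminated state can multiply the size of the remaining labels.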
