Chapter 3: Regular Languages

In this chapter, we study:
• regular expressions and languages;
• five kinds of finite automata;
• algorithms for processing and converting between regular expressions and finite automata; and
• applications of regular expressions and finite automata to hardware design, searching in text files and lexical analysis.

3.1: Regular Expressions and Languages

In this section, we:
• define several operations on languages;
• say what regular expressions are, what they mean, and what regular languages are; and
• begin to show how regular expressions can be processed by Forlan.

Language Operations

If $L_1$ and $L_2$ are languages, then:
• $L_1 \cup L_2$ is a language;
• $L_1 \cap L_2$ is a language;
• $L_1 - L_2$ is a language.

E.g., consider union. If $L_1$ and $L_2$ are languages, then $L_1 \subseteq \Sigma_1^*$ and $L_2 \subseteq \Sigma_2^*$, for some alphabets $\Sigma_1$ and $\Sigma_2$. Thus $\Sigma_1 \cup \Sigma_2$ is an alphabet, and $L_1 \cup L_2 \subseteq (\Sigma_1 \cup \Sigma_2)^*$.

Language Concatenation

The concatenation of languages $L_1$ and $L_2$ ($L_1 @ L_2$) is the language
$\{ x_1 @ x_2 \mid x_1 \in L_1 \text{ and } x_2 \in L_2 \}$.

For example,
$\{01, 10\} @ \{\%, 11\} = \{(01)\%, (10)\%, (01)(11), (10)(11)\} = \{01, 10, 0111, 1011\}$.
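The definition can be tried out on finite languages. The following is a minimal sketch in Python (not Forlan), assuming a finite language is modelled as a Python set of strings and the empty string "" stands for %. The function name concat is chosen here for illustration.

```python
def concat(l1, l2):
    """Return L1 @ L2 for finite languages given as sets of strings."""
    # Every string of the result is some x1 from L1 followed by some x2 from L2.
    return {x1 + x2 for x1 in l1 for x2 in l2}


# The example from the slide: {01, 10} @ {%, 11}, with "" playing the role of %.
result = concat({"01", "10"}, {"", "11"})

# Order by length, then alphabetically, only to make the printout stable.
print(sorted(result, key=lambda w: (len(w), w)))
```

Because the result is a set, strings that can be produced in more than one way appear only once, just as in the mathematical definition.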

Language Concatenation (Cont.)

Concatenation of languages is associative: for all $L_1, L_2, L_3 \in \mathbf{Lan}$,
$(L_1 @ L_2) @ L_3 = L_1 @ (L_2 @ L_3)$.

And, $\{\%\}$ is the identity for concatenation: for all $L \in \mathbf{Lan}$,
$\{\%\} @ L = L @ \{\%\} = L$.

Furthermore, $\emptyset$ is the zero for concatenation: for all $L \in \mathbf{Lan}$,
$\emptyset @ L = L @ \emptyset = \emptyset$.

We often abbreviate $L_1 @ L_2$ to $L_1 L_2$.
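These three laws can be spot-checked on small finite languages. The sketch below is in Python rather than Forlan and assumes languages are Python sets of strings, with "" standing for %. The sample languages and the names EPS and EMPTY are illustrative choices. Such a check is evidence for particular inputs, not a proof.

```python
def concat(l1, l2):
    """Return L1 @ L2 for finite languages given as sets of strings."""
    return {x1 + x2 for x1 in l1 for x2 in l2}


# Arbitrary sample languages chosen for illustration.
L1, L2, L3 = {"a", "ab"}, {"", "b"}, {"c", "bc"}
EPS = {""}     # the language {%}
EMPTY = set()  # the empty language

# Associativity: (L1 @ L2) @ L3 = L1 @ (L2 @ L3).
assoc = concat(concat(L1, L2), L3) == concat(L1, concat(L2, L3))
# Identity: {%} @ L = L @ {%} = L.
identity = concat(EPS, L1) == L1 == concat(L1, EPS)
# Zero: the empty language absorbs everything.
zero = concat(EMPTY, L1) == EMPTY == concat(L1, EMPTY)

print(assoc, identity, zero)
```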

Raising a Language to a Power

We define the language $L^n \in \mathbf{Lan}$ formed by raising a language $L$ to the power $n \in \mathbb{N}$ by recursion on $n$:
$L^0 = \{\%\}$, for all $L \in \mathbf{Lan}$;
$L^{n+1} = L L^n$, for all $L \in \mathbf{Lan}$ and $n \in \mathbb{N}$.

We assign this operation higher precedence than concatenation, so that $L L^n$ means $L(L^n)$ in the above definition.

Raising a Language to a Power (Cont.)

Proposition 3.1.1
For all $L \in \mathbf{Lan}$ and $n, m \in \mathbb{N}$, $L^{n+m} = L^n L^m$.

Proof. An easy mathematical induction on $n$. The language $L$ and the natural number $m$ can be fixed at the beginning of the proof. □

Thus, if $L \in \mathbf{Lan}$ and $n \in \mathbb{N}$, then
$L^{n+1} = L L^n$ (definition), and
$L^{n+1} = L^n L^1 = L^n L$ (Proposition 3.1.1).
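The recursive definition of $L^n$ and Proposition 3.1.1 can be illustrated on a finite language. This is a sketch in Python (not Forlan), assuming languages are Python sets of strings with "" standing for %. The helper names concat and power are chosen here and are not the Forlan functions of the same names. The loop unrolls the recursion $L^{n+1} = L L^n$ starting from $L^0 = \{\%\}$.

```python
def concat(l1, l2):
    """Return L1 @ L2 for finite languages given as sets of strings."""
    return {x1 + x2 for x1 in l1 for x2 in l2}


def power(l, n):
    """Return L^n, following L^0 = {%} and L^(n+1) = L L^n."""
    if n < 0:
        raise ValueError("n must be a natural number")
    result = {""}  # L^0 = {%}
    for _ in range(n):
        result = concat(l, result)  # one step of L^(k+1) = L L^k
    return result


L = {"ab", "cd"}
print(sorted(power(L, 3)))

# Spot-check Proposition 3.1.1 for small n and m: L^(n+m) = L^n L^m.
holds = all(
    power(L, n + m) == concat(power(L, n), power(L, m))
    for n in range(4)
    for m in range(4)
)
print(holds)
```

The check covers only the listed exponents for one sample language. The general statement still rests on the induction in the proof.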

Kleene Closure

The Kleene closure (or just closure) of a language $L$ ($L^*$) is the language
$\bigcup \{ L^n \mid n \in \mathbb{N} \}$.

Thus, for all $w$,
$w \in L^*$
iff $w \in A$, for some $A \in \{ L^n \mid n \in \mathbb{N} \}$
iff $w \in L^n$ for some $n \in \mathbb{N}$.

For example,
$\{a, ba\}^* = \{a, ba\}^0 \cup \{a, ba\}^1 \cup \{a, ba\}^2 \cup \cdots = \{\%\} \cup \{a, ba\} \cup \{aa, aba, baa, baba\} \cup \cdots$
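Since $L^*$ is usually infinite, a program can only enumerate a finite part of it. The sketch below, in Python rather than Forlan, computes $L^0 \cup L^1 \cup \cdots \cup L^k$ for a chosen bound $k$. It assumes languages are Python sets of strings with "" standing for %, and the name closure_up_to is hypothetical.

```python
def concat(l1, l2):
    """Return L1 @ L2 for finite languages given as sets of strings."""
    return {x1 + x2 for x1 in l1 for x2 in l2}


def closure_up_to(l, max_power):
    """Return the union of L^0, ..., L^max_power, a finite part of L*."""
    result = set()
    current = {""}  # L^0 = {%}
    for _ in range(max_power + 1):
        result |= current             # add L^k to the union
        current = concat(l, current)  # move on to L^(k+1) = L L^k
    return result


# The first three terms of the slide's example {a, ba}*.
approx = closure_up_to({"a", "ba"}, 2)
print(sorted(approx, key=lambda w: (len(w), w)))
```

A string $w$ is in $L^*$ exactly when it shows up for some bound, so membership of a particular string can be tested by taking the bound large enough. When $L$ does not contain %, the length of $w$ suffices.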

Precedences of Language Operations

We assign our operations on languages relative precedences as follows:
• Highest: closure ($(\cdot)^*$) and raising to a power ($(\cdot)^n$);
• Intermediate: concatenation (@, or just juxtapositioning);
• Lowest: union ($\cup$), intersection ($\cap$) and difference ($-$).

For example, if $n \in \mathbb{N}$ and $A, B, C \in \mathbf{Lan}$, then $A^* B C^n \cup B$ abbreviates $((A^*) B (C^n)) \cup B$.

Can $((A \cup B)C)^*$ be abbreviated? No: removing either pair of parentheses will change its meaning.

More Operations on Sets of Strings in Forlan

In Section 2.3, we introduced the Forlan module StrSet, which defines various functions for processing finite sets of strings, i.e., finite languages. This module also defines the functions

  val concat : str set * str set -> str set
  val power : str set * int -> str set

which implement our concatenation and exponentiation operations on finite languages.

More Operations in Forlan (Cont.)

Here are some examples of how these functions can be used:

  - val xs = StrSet.fromString "ab, cd";
  val xs = - : str set
  - val ys = StrSet.fromString "uv, wx";
  val ys = - : str set
  - StrSet.output("", StrSet.concat(xs, ys));
  abuv, abwx, cduv, cdwx
  val it = () : unit
  - StrSet.output("", StrSet.power(xs, 0));
  %
  val it = () : unit
  - StrSet.output("", StrSet.power(xs, 1));
  ab, cd
  val it = () : unit
  - StrSet.output("", StrSet.power(xs, 3));
  ababab, ababcd, abcdab, abcdcd, cdabab, cdabcd, cdcdab, cdcdcd
  val it = () : unit
