Regular Expressions
Greg Plaxton
Theory in Programming Practice, Spring 2004
Department of Computer Science
University of Texas at Austin

What is a Regular Expression?
• A regular expression defines a (possibly infinite) set of strings over a given alphabet
• Analogous to an arithmetic expression
  – The symbols of the alphabet are analogous to the numerical constants in an arithmetic expression
  – Instead of arithmetic operators such as addition, multiplication, and exponentiation, the operators are concatenation, union, and closure

Regular Expressions: Syntax
• The symbols ∅ (empty set), ε (empty string), and any symbol of the alphabet are regular expressions
• For any regular expressions p and q, (pq) (concatenation) and (p | q) (union) are regular expressions
• For any regular expression p, p∗ (Kleene closure) is a regular expression

Regular Expressions: Semantics
• The regular expression ∅ corresponds to the empty set of strings
• The regular expression ε corresponds to the set of strings { ε }
• For any symbol a in the alphabet, the regular expression a corresponds to the set of strings { a }
• For any regular expressions p and q with corresponding sets of strings X and Y, the regular expression (pq) (resp., (p | q)) denotes the set of strings { xy | x ∈ X ∧ y ∈ Y } (resp., X ∪ Y)
• For any regular expression p with corresponding set of strings X, the regular expression p∗ denotes the set of strings { x₁x₂⋯xₖ | k ≥ 0 ∧ ⟨∀ i : 1 ≤ i ≤ k : xᵢ ∈ X⟩ }
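
The semantic rules above can be sketched as a small evaluator. This is only an illustration: the nested-tuple encoding of expressions and the function name `language` are assumptions made here, not part of the slides, and because the denoted set may be infinite the sketch returns only the strings of length at most `max_len`.

```python
# Regular expressions as nested tuples (an illustrative encoding):
#   ("empty",)      the empty set ∅
#   ("eps",)        the empty string ε
#   ("sym", "a")    a single alphabet symbol
#   ("cat", p, q)   concatenation (pq)
#   ("alt", p, q)   union (p | q)
#   ("star", p)     Kleene closure p*

def language(expr, max_len):
    """Return the strings of length <= max_len in the set denoted by expr."""
    kind = expr[0]
    if kind == "empty":
        return set()
    if kind == "eps":
        return {""}
    if kind == "sym":
        return {expr[1]} if max_len >= 1 else set()
    if kind == "cat":
        xs = language(expr[1], max_len)
        ys = language(expr[2], max_len)
        return {x + y for x in xs for y in ys if len(x + y) <= max_len}
    if kind == "alt":
        return language(expr[1], max_len) | language(expr[2], max_len)
    if kind == "star":
        # Dropping ε from the base set does not change the closure and
        # guarantees that every round produces strictly longer strings.
        base = language(expr[1], max_len) - {""}
        result, frontier = {""}, {""}
        while frontier:
            frontier = {x + y for x in frontier for y in base
                        if len(x + y) <= max_len} - result
            result |= frontier
        return result
    raise ValueError(f"unknown node: {kind!r}")
```

For instance, encoding a | bc∗d as `("alt", ("sym", "a"), ("cat", ("cat", ("sym", "b"), ("star", ("sym", "c"))), ("sym", "d")))` and calling `language` with `max_len=4` yields the four strings a, bd, bcd, and bccd.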

Regular Expressions: Parenthesization
• When writing a regular expression, we generally try to omit as many parentheses as possible without altering the meaning of the expression
• Where parentheses are omitted, Kleene closure has the highest binding power, then concatenation, then union
  – Parentheses may be omitted whenever this convention yields the intended parenthesization
• Note that concatenation and union are associative
  – These facts often enable us to drop parentheses, e.g., we can write abc instead of ((ab)c)

A Remark on Kleene Closure
• One can think of Kleene closure as follows: p∗ = ε | p | pp | ppp | ...
• The RHS above is not a regular expression because it has an infinite number of terms
  – It is straightforward to prove by induction that every regular expression has a finite length
• The motivation for introducing the Kleene closure operator is to make the above RHS into a regular expression
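
As a rough illustration of the remark above, the following sketch computes the set denoted by the finite truncation ε | p | pp | ... | pᵏ from the set of strings denoted by p. The helper names `concat` and `finite_union` are hypothetical. Every truncation is an ordinary regular expression; p∗ is what remains when no bound k is imposed.

```python
def concat(xs, ys):
    """Set of all strings xy with x in xs and y in ys."""
    return {x + y for x in xs for y in ys}

def finite_union(xs, k):
    """Strings denoted by ε | p | pp | ... | p^k, where xs is the set for p."""
    result = set()
    power = {""}  # p^0 denotes { ε }
    for _ in range(k + 1):
        result |= power
        power = concat(power, xs)  # advance from p^j to p^(j+1)
    return result
```

With `xs = {"ab"}` and `k = 2` the result is the set containing the empty string, ab, and abab; increasing k keeps adding longer strings, which is why no finite union of this shape captures p∗ when p denotes a nonempty string.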

Regular Expressions: Examples
• What is the set of strings corresponding to the regular expression a | bc∗d?
• It is often convenient to introduce identifiers to stand for certain regular expressions and then to use these identifiers as a shorthand for building up more complex regular expressions
  – PosDigit = 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
  – Digit = 0 | PosDigit
  – Natural = 0 | PosDigit Digit∗
• The set of strings over the lowercase English alphabet containing all five vowels in order corresponds to the regular expression (Letter∗) a (Letter∗) e (Letter∗) i (Letter∗) o (Letter∗) u (Letter∗) where Letter = a | b | c | ... | z
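
The examples above can be tried out with Python's `re` module, whose pattern syntax uses the same precedence (closure, then concatenation, then union), so `a|bc*d` is read as a | (b(c∗)d). This is a sketch under a few assumptions: the character classes `[1-9]`, `[0-9]`, and `[a-z]` are used as shorthand for the unions PosDigit, Digit, and Letter, and the helper `matches` is a name introduced here.

```python
import re

A_OR_BCD = re.compile(r"a|bc*d")

# Character classes stand in for the unions spelled out on the slide.
POS_DIGIT = "[1-9]"
DIGIT = "[0-9]"
NATURAL = re.compile(rf"0|{POS_DIGIT}{DIGIT}*")

LETTER = "[a-z]"
VOWELS_IN_ORDER = re.compile(
    rf"{LETTER}*a{LETTER}*e{LETTER}*i{LETTER}*o{LETTER}*u{LETTER}*"
)

def matches(pattern, s):
    """True if the whole string s belongs to the set denoted by pattern."""
    return pattern.fullmatch(s) is not None
```

`fullmatch` is used rather than `search` because a regular expression in the sense of these slides describes entire strings, not substrings. For example, `matches(NATURAL, "10")` holds while `matches(NATURAL, "007")` does not, since Natural allows no leading zeros.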

A More Elaborate Example
• For any binary string x, let f(x) denote the nonnegative integer that x represents in binary (leading zeros are allowed, and the empty string represents 0)
  – Example: If x = 00110, then f(x) = 6
• Problem: Construct a regular expression corresponding to the set of all binary strings x such that f(x) is a multiple of 3
  – We first inductively define the sets B₀, B₁, and B₂ of all binary strings x such that f(x) is congruent to 0, 1, and 2, respectively, modulo 3
  – We then deduce a regular expression for B₀

Inductive Definition of Sets B₀, B₁, and B₂
(0) The empty string belongs to B₀
(1) For any binary string x in B₀, x0 belongs to B₀ and x1 belongs to B₁
(2) For any binary string x in B₁, x0 belongs to B₂ and x1 belongs to B₀
(3) For any binary string x in B₂, x0 belongs to B₁ and x1 belongs to B₂
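
Rules (0) through (3) can be read as a transition table: appending a bit b to x changes f(x) to 2·f(x) + b, so the class of xb modulo 3 depends only on the class of x and on b. The sketch below follows that reading; the names `STEP`, `residue_class`, and `f` are introduced here for illustration.

```python
# STEP[i][b] is the index j such that x in B_i implies xb in B_j,
# read directly off rules (1), (2), and (3).
STEP = {
    0: {"0": 0, "1": 1},
    1: {"0": 2, "1": 0},
    2: {"0": 1, "1": 2},
}

def residue_class(x):
    """Return i such that the binary string x belongs to B_i."""
    i = 0  # rule (0): the empty string belongs to B_0
    for bit in x:
        i = STEP[i][bit]
    return i

def f(x):
    """Nonnegative integer represented by the binary string x (empty string is 0)."""
    return int(x, 2) if x else 0
```

For the slide's example, `residue_class("00110")` is 0, in agreement with f(00110) = 6 being a multiple of 3.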

Characterization of B₂ in Terms of B₁
• By (2) and (3), any binary string in B₂ is either of the form x0 where x belongs to B₁, or is of the form x1 where x belongs to B₂
  – Removing trailing 1s from a string in B₂ therefore leaves a string in B₂, and once no trailing 1 remains the string must end in 0 with the rest in B₁
• It follows that B₂ consists of all binary strings of the form x01∗ where x belongs to B₁

Characterization of B₁ in Terms of B₀
• By (1), (3), and the preceding characterization of B₂, any binary string in B₁ is either of the form x1 where x belongs to B₀, or is of the form x01∗0 where x belongs to B₁
• It follows that B₁ consists of all binary strings of the form x1(01∗0)∗ where x belongs to B₀
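
The two characterizations (B₂ as strings of the form x01∗ with x in B₁, and B₁ as strings of the form x1(01∗0)∗ with x in B₀) can be spot-checked by brute force. The sketch below is a sanity check over short strings, not a proof; the helpers `in_B`, `has_form`, and `check` are hypothetical names, and membership in each set is decided arithmetically from the definition of f.

```python
import re
from itertools import product

def f(x):
    """Nonnegative integer represented by the binary string x (empty string is 0)."""
    return int(x, 2) if x else 0

def in_B(i, x):
    """Membership in B_i, decided directly from the definition."""
    return f(x) % 3 == i

def has_form(x, source, suffix_pattern):
    """True if x = y + s with y in B_source and s matching suffix_pattern entirely."""
    suffix = re.compile(suffix_pattern)
    return any(
        in_B(source, x[:k]) and suffix.fullmatch(x[k:]) is not None
        for k in range(len(x) + 1)
    )

def check(max_len):
    """Compare both characterizations with the definition for all short strings."""
    for n in range(max_len + 1):
        for bits in product("01", repeat=n):
            x = "".join(bits)
            if in_B(2, x) != has_form(x, 1, r"01*"):
                return False
            if in_B(1, x) != has_form(x, 0, r"1(01*0)*"):
                return False
    return True
```

Calling `check(10)` examines every binary string of length at most 10 and should report no disagreement.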

Deducing a Regular Expression for B₀
• By (0), (1), (2), and the preceding characterization of B₁, the set B₀ consists of the empty string, all binary strings of the form x0 where x belongs to B₀, and all binary strings of the form x1(01∗0)∗1 where x belongs to B₀
• It follows that B₀ consists of all binary strings of the form (0 | 1(01∗0)∗1)∗
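
The resulting expression can be compared against ordinary arithmetic. The following sketch transcribes (0 | 1(01∗0)∗1)∗ into Python's `re` syntax and checks it exhaustively on short strings; the function names are assumptions made for this illustration, and the check is evidence rather than a proof.

```python
import re
from itertools import product

# Direct transcription of (0 | 1(01*0)*1)* into Python's pattern syntax.
MULTIPLE_OF_3 = re.compile(r"(0|1(01*0)*1)*")

def is_multiple_of_3(x):
    """True if the whole binary string x matches the expression for B_0."""
    return MULTIPLE_OF_3.fullmatch(x) is not None

def agrees_with_arithmetic(max_len):
    """Check the expression against f(x) mod 3 for every string up to max_len."""
    for n in range(max_len + 1):
        for bits in product("01", repeat=n):
            x = "".join(bits)
            expected = (int(x, 2) if x else 0) % 3 == 0
            if is_multiple_of_3(x) != expected:
                return False
    return True
```

For example, `is_multiple_of_3("00110")` holds (the string represents 6), while `is_multiple_of_3("111")` does not (it represents 7).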

Remark: Alternative View of the Preceding Example
• The binary strings in B₀ may be viewed as being generated by the grammar
    S  → B₀
    B₀ → ε | B₀ 0 | B₁ 1
    B₁ → B₀ 1 | B₂ 0
    B₂ → B₁ 0 | B₂ 1
• As we have seen, the above grammar generates a regular language
• Not all grammars generate regular languages
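
Because every production of this grammar other than the ε rule has the shape "nonterminal followed by one terminal", the strings it generates can be enumerated one length at a time. The sketch below does so; the dictionary encoding `GRAMMAR` and the function `generated` are assumptions of this illustration, and the start rule that sends S to B₀ is handled by starting from `"B0"` directly.

```python
# Productions as (nonterminal, terminal) pairs; (None, "") encodes the ε rule.
GRAMMAR = {
    "B0": [(None, ""), ("B0", "0"), ("B1", "1")],
    "B1": [("B0", "1"), ("B2", "0")],
    "B2": [("B1", "0"), ("B2", "1")],
}

def generated(start, max_len):
    """Return all strings of length <= max_len derivable from start."""
    # level[a] holds the strings of the current length derivable from a.
    level = {a: ({""} if (None, "") in prods else set())
             for a, prods in GRAMMAR.items()}
    result = set(level[start])
    for _ in range(max_len):
        # Each non-ε production appends exactly one terminal on the right.
        level = {a: {y + c for b, c in prods if b is not None for y in level[b]}
                 for a, prods in GRAMMAR.items()}
        result |= level[start]
    return result
```

With `max_len=3`, `generated("B0", 3)` consists of the empty string together with 0, 00, 11, 000, 011, and 110, which are exactly the binary strings of length at most 3 that represent multiples of 3.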
