slide-1
SLIDE 1

Context-free grammars (CFGs)

slide-2
SLIDE 2

Roadmap

Last time

– RegExp == DFA

– JLex: a tool for generating (Java code for) a lexer/scanner

  • Mainly a collection of 〈regexp, action〉 pairs

This time

– CFGs, the underlying abstraction for parsers

Next week

– Java CUP: a tool for generating (Java code for) a parser

  • Mainly a collection of 〈CFG-rule, action〉 pairs

regexp : JLex :: CFG : Java CUP

slide-3
SLIDE 3

RegExps Are Great!

Perfect for tokenizing a language

However, they have some limitations

– Can only define a limited family of languages

  • Cannot use a RegExp to specify all the programming constructs we need

– No notion of structure

Let’s explore both of these issues

slide-4
SLIDE 4

Limitations of RegExps

Cannot handle “matching”

E.g., language of balanced parentheses

L() = { (^n )^n where n > 0 }, i.e. n opening parentheses followed by n closing parentheses

No DFA exists for this language

Intuition: A given FSM only has a fixed, finite amount of memory

– For an FSM, memory = the states

– With a fixed, finite amount of memory, how could an FSM remember how many “(” characters it has seen?
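
To make the intuition concrete, here is a minimal Python sketch (the function name `in_paren_language` is made up for illustration, not taken from the lecture). Recognizing L() is easy once an unbounded counter is available. That counter is exactly the kind of memory a fixed set of states cannot provide.

```python
def in_paren_language(text: str) -> bool:
    """Return True if text is n '(' followed by n ')' for some n > 0."""
    i = 0
    opens = 0
    # Count the leading '(' characters; this counter can grow without bound.
    while i < len(text) and text[i] == "(":
        opens += 1
        i += 1
    closes = 0
    # Count the ')' characters that follow.
    while i < len(text) and text[i] == ")":
        closes += 1
        i += 1
    # Accept only if the whole input was consumed and the counts match.
    return i == len(text) and opens > 0 and opens == closes


print(in_paren_language("((()))"))
print(in_paren_language("(()"))
```

A DFA would need a distinct state for every possible value of `opens`, which is impossible with finitely many states.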

slide-5
SLIDE 5

Theorem: No RegExp/DFA can describe the language L()

Proof by contradiction:

  • Suppose that there exists a DFA A for L() and A has N states

  • A has to accept the string (^N )^N (N opening parentheses followed by N closing ones) with some path q_0 q_1 … q_N … q_2N

  • By the pigeonhole principle some state has to repeat: q_i = q_j for some 0 ≤ i < j ≤ N (the N+1 states q_0 … q_N cannot all be distinct when A has only N states)

  • Therefore the run q_0 q_1 … q_i q_(j+1) … q_N … q_2N is also accepting

  • A accepts the string (^(N-(j-i)) )^N ∉ L(), which is a contradiction!
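
The proof is constructive, so it can be replayed in code. The following Python sketch is an illustration only: the DFA encoding (a dictionary keyed by state and character) and the names `run`, `shorter_accepted_string`, and `DELTA` are assumptions, not part of the lecture. Given any DFA that accepts (^N )^N, it finds the repeated state among q_0 … q_N and returns the shorter string that the DFA must also accept.

```python
def run(delta, start, word):
    """Return the list of states visited while reading word."""
    path = [start]
    for ch in word:
        path.append(delta[(path[-1], ch)])
    return path


def shorter_accepted_string(delta, start, accepting, n_states):
    """Follow the proof: return a string outside L() that the DFA accepts.

    Returns None if the DFA rejects (^N )^N, in which case it is wrong anyway.
    """
    n = n_states
    path = run(delta, start, "(" * n + ")" * n)
    if path[-1] not in accepting:
        return None
    first_seen = {}
    # q_0 ... q_N are N+1 states, so one of them must repeat.
    for j, state in enumerate(path[: n + 1]):
        if state in first_seen:
            i = first_seen[state]
            # Cutting the loop q_i ... q_j drops (j - i) opening parentheses.
            return "(" * (n - (j - i)) + ")" * n
        first_seen[state] = j
    raise ValueError("n_states is smaller than the real number of states")


# Hypothetical 4-state DFA: one or more '(' followed by one or more ')'.
# State 3 is a dead state; state 2 is accepting.
DELTA = {
    (0, "("): 1, (0, ")"): 3,
    (1, "("): 1, (1, ")"): 2,
    (2, "("): 3, (2, ")"): 2,
    (3, "("): 3, (3, ")"): 3,
}
bad = shorter_accepted_string(DELTA, 0, {2}, 4)
print(bad, run(DELTA, 0, bad)[-1] in {2})
```

The example DFA is only a stand-in for "some DFA claimed to recognize L()". Whichever DFA is supplied, the returned string has fewer opening than closing parentheses and is still accepted.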

slide-6
SLIDE 6

Limitations of RegExps: No Structure

Our Enhanced-RegExp scanner can emit a stream of tokens:

X = Y + Z

ID ASSIGN ID PLUS ID

… but this doesn’t really enforce any order of operations

slide-7
SLIDE 7

The Chomsky Hierarchy

LANGUAGE CLASS (listed from most efficient to most powerful):

  • Regular (recognized by an FSM)
  • Context-Free (happy medium?)
  • Context-Sensitive
  • Recursively enumerable (recognized by a Turing machine)

[Photo: Noam Chomsky]

slide-8
SLIDE 8

Context Free Grammars (CFGs)

A set of (recursive) rewriting rules to generate patterns of strings

Can envision a “parse tree” that keeps structure

slide-9
SLIDE 9

CFG: Intuition

S → ‘(’ S ‘)’

A rule that says that you can rewrite S to be an S surrounded by a single set of parentheses

[Figure: the parse tree before applying the rule (a single node S) and after applying the rule (S with the three children ‘(’, S, ‘)’)]

slide-10
SLIDE 10

Context Free Grammars (CFGs)

A CFG is a 4-tuple (N,Σ,P,S)

  • N is a set of non-terminals, e.g., A, B, S, …
  • Σ is the set of terminals
  • P is a set of production rules
  • S∈N is the initial non-terminal symbol (“start symbol”)

slide-11
SLIDE 11

Context Free Grammars (CFGs)

A CFG is a 4-tuple (N,Σ,P,S)

  • N is a set of non-terminals, e.g., A, B, S, … (placeholders / interior nodes in the parse tree)
  • Σ is the set of terminals (tokens from the scanner)
  • P is a set of production rules (rules for deriving strings)
  • S (in N) is the initial non-terminal symbol (if not otherwise specified, use the non-terminal that appears on the LHS of the first production as the start)
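
One way to picture the 4-tuple is as a small data structure. The Python sketch below is an illustration under assumptions: the class name `Grammar`, its field names, and the `is_well_formed` check are invented here, and the example grammar is the parenthesis rule plus an empty (ε) production.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Grammar:
    nonterminals: frozenset   # N
    terminals: frozenset      # Σ
    productions: tuple        # P, as (lhs, rhs) pairs; rhs is a tuple of symbols
    start: str                # S

    def is_well_formed(self) -> bool:
        """Check the side conditions of the 4-tuple definition."""
        symbols = self.nonterminals | self.terminals
        return (
            self.start in self.nonterminals
            and not (self.nonterminals & self.terminals)
            and all(
                lhs in self.nonterminals and all(s in symbols for s in rhs)
                for lhs, rhs in self.productions
            )
        )


# S → ( S ) | ε, where ε is represented by the empty tuple.
PARENS = Grammar(
    nonterminals=frozenset({"S"}),
    terminals=frozenset({"(", ")"}),
    productions=(("S", ("(", "S", ")")), ("S", ())),
    start="S",
)
print(PARENS.is_well_formed())
```

The check mirrors the definition: the start symbol must be a non-terminal, every left-hand side is a single non-terminal, and every right-hand side uses only known symbols.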
slide-12
SLIDE 12

Production Syntax

LHS → RHS

  • LHS: single nonterminal symbol
  • RHS (expression): sequence of terminals and nonterminals

Examples:

S → ‘(’ S ‘)’

S → ε

slide-13
SLIDE 13

Production Shorthand

Nonterm → expression
Nonterm → ε

equivalently:

Nonterm → expression
        | ε

equivalently:

Nonterm → expression | ε

Example:

S → ‘(’ S ‘)’
S → ε

S → ‘(’ S ‘)’
  | ε

S → ‘(’ S ‘)’ | ε

slide-14
SLIDE 14

Derivations

To derive a string:

  • Start by setting “Current Sequence” to the start symbol

  • Repeatedly,

– Find a Nonterminal X in the Current Sequence

– Find a production of the form X→α

– “Apply” the production: create a new “current sequence” in which α replaces X

  • Stop when there are no more non-terminals
  • This process derives a string of terminal symbols
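
The procedure above can be sketched in a few lines of Python. This is an illustration, not the lecture's code: the function `derive`, its parameters, and the choice to always rewrite the leftmost nonterminal are assumptions (the procedure itself allows any nonterminal to be picked).

```python
def derive(start, productions, choices, nonterminals):
    """Apply one production per step to the leftmost nonterminal.

    choices[k] is the index into productions used at step k.
    Returns every "current sequence", from the start symbol to the result.
    """
    current = [start]
    history = [list(current)]
    for choice in choices:
        lhs, rhs = productions[choice]
        # Find a nonterminal X in the current sequence (here: the leftmost one).
        pos = next(
            (i for i, sym in enumerate(current) if sym in nonterminals), None
        )
        if pos is None or current[pos] != lhs:
            raise ValueError(f"production {choice} cannot be applied here")
        # "Apply" the production: alpha replaces X.
        current = current[:pos] + list(rhs) + current[pos + 1:]
        history.append(list(current))
    if any(sym in nonterminals for sym in current):
        raise ValueError("stopped before all nonterminals were rewritten")
    return history


PRODUCTIONS = [("S", ["(", "S", ")"]), ("S", [])]  # S → ( S ) | ε
steps = derive("S", PRODUCTIONS, [0, 0, 1], {"S"})
for seq in steps:
    print(" ".join(seq) or "ε")
```

Each printed line is one "current sequence"; the last one contains only terminals, so it is the derived string.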
slide-15
SLIDE 15

Derivation Syntax

  • We’ll use the symbol “⇒” for “derives”
  • We’ll use the symbol “⇒” with a “+” above it for “derives in one or more steps” (also written as “⇒+”)
  • We’ll use the symbol “⇒” with a “∗” above it for “derives in zero or more steps” (also written as “⇒∗”)

slide-31
SLIDE 31

An Example Grammar

Terminals (for readability, bold and lowercase)

  • begin, end: program boundary
  • semicolon: represents “;”, separates statements
  • assign: represents “=” in an assignment statement
  • id: identifier / variable name
  • plus: represents the “+” operator in an expression

Nonterminals (for readability, italics and UpperCamelCase)

  • Prog: root of the parse tree
  • Stmts: list of statements
  • Stmt: a single statement
  • Expr: a mathematical expression

Productions (define the syntax of legal programs)

Prog → begin Stmts end
Stmts → Stmts semicolon Stmt | Stmt
Stmt → id assign Expr
Expr → id | Expr plus id

slide-45
SLIDE 45

Productions

  • 1. Prog → begin Stmts end
  • 2. Stmts → Stmts semicolon Stmt
  • 3. Stmts → Stmt
  • 4. Stmt → id assign Expr
  • 5. Expr → id
  • 6. Expr → Expr plus id

Derivation Sequence (rule used in each step in parentheses)

Prog
⇒ begin Stmts end (1)
⇒ begin Stmts semicolon Stmt end (2)
⇒ begin Stmt semicolon Stmt end (3)
⇒ begin id assign Expr semicolon Stmt end (4)
⇒ begin id assign Expr semicolon id assign Expr end (4)
⇒ begin id assign id semicolon id assign Expr end (5)
⇒ begin id assign id semicolon id assign Expr plus id end (6)
⇒ begin id assign id semicolon id assign id plus id end (5)

Parse Tree (children indented under their parent; leaves are terminals)

Prog
    begin
    Stmts
        Stmts
            Stmt
                id
                assign
                Expr
                    id
        semicolon
        Stmt
            id
            assign
            Expr
                Expr
                    id
                plus
                id
    end
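
The derivation can be replayed mechanically. The Python sketch below is illustrative: the names `RULES` and `replay` are invented here, and each step rewrites the leftmost occurrence of the chosen rule's left-hand side, which happens to reproduce the order used on this slide (the slide's derivation is not strictly leftmost).

```python
# The six numbered productions from the slide.
RULES = {
    1: ("Prog", ["begin", "Stmts", "end"]),
    2: ("Stmts", ["Stmts", "semicolon", "Stmt"]),
    3: ("Stmts", ["Stmt"]),
    4: ("Stmt", ["id", "assign", "Expr"]),
    5: ("Expr", ["id"]),
    6: ("Expr", ["Expr", "plus", "id"]),
}


def replay(rule_numbers, start="Prog"):
    """Apply the numbered rules in order and return every sentential form."""
    current = [start]
    forms = [" ".join(current)]
    for number in rule_numbers:
        lhs, rhs = RULES[number]
        # Leftmost occurrence of the rule's left-hand side;
        # list.index raises ValueError if the rule cannot be applied.
        pos = current.index(lhs)
        current[pos:pos + 1] = rhs
        forms.append(" ".join(current))
    return forms


forms = replay([1, 2, 3, 4, 4, 5, 6, 5])
for form in forms:
    print(form)
```

The rule sequence 1, 2, 3, 4, 4, 5, 6, 5 is read off the derivation above; the final printed line is the derived program.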

slide-46
SLIDE 46

MAKEFILE

A five-minute introduction

slide-47
SLIDE 47

Makefiles: Motivation

  • Typing the series of commands to generate our code can be tedious

– Multiple steps that depend on each other

– Somewhat complicated commands

– May not need to rebuild everything

  • Makefiles solve these issues

– Record a series of commands in a script-like DSL

– Specify dependency rules and Make generates the results

slide-51
SLIDE 51

Makefiles: Basic Structure

<target>: <dependency list>
(tab) <command to satisfy target>

Each command line must start with a tab character.

Example

Example.class: Example.java IO.class
	javac Example.java

IO.class: IO.java
	javac IO.java

  • Example.class depends on Example.java and IO.class
  • Example.class is generated by javac Example.java

slide-54
SLIDE 54

Makefiles: Dependencies

Example

Example.class: Example.java IO.class
	javac Example.java

IO.class: IO.java
	javac IO.java

Internal dependency graph:

  • Example.class depends on Example.java and IO.class
  • IO.class depends on IO.java

A file is rebuilt if one of its dependencies changes
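
The rebuild rule can be modeled in a few lines of Python. This is a simplified sketch of the idea, not how Make is actually implemented: the function `out_of_date`, the dictionary encoding of the dependency graph, and the integer timestamps are all assumptions made for illustration.

```python
def out_of_date(target, deps, mtime):
    """Return the targets that must be rebuilt, in build order.

    deps maps a target to the files it depends on; mtime maps a file to its
    last-modified time (a missing entry means the file does not exist yet).
    """
    rebuilt = []

    def visit(name):
        """Return True if name has to be (re)built."""
        if name not in deps:
            return False  # a plain source file: nothing to build
        # Bring every dependency up to date first.
        changed = [visit(d) for d in deps[name]]
        stale = (
            name not in mtime
            or any(changed)
            or any(mtime.get(d, 0) > mtime[name] for d in deps[name])
        )
        if stale and name not in rebuilt:
            rebuilt.append(name)
        return stale

    visit(target)
    return rebuilt


DEPS = {
    "Example.class": ["Example.java", "IO.class"],
    "IO.class": ["IO.java"],
}
# IO.java (time 5) is newer than IO.class (time 3).
TIMES = {"Example.java": 1, "IO.java": 5, "IO.class": 3, "Example.class": 4}
print(out_of_date("Example.class", DEPS, TIMES))
```

Because IO.java is newer than IO.class in this example, IO.class is rebuilt first, and that in turn forces Example.class to be rebuilt.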

slide-58
SLIDE 58

Makefiles: Variables

You can thread common configuration values through your makefile

Example

JC = /s/std/bin/javac
# Build for debug
JFLAGS = -g

Example.class: Example.java IO.class
	$(JC) $(JFLAGS) Example.java

IO.class: IO.java
	$(JC) $(JFLAGS) IO.java

slide-59
SLIDE 59

Makefiles: Phony Targets

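  • You can run commands via make

– Write a target with no dependencies (called phony)

– Will cause it to execute the command every time (provided no file with the target’s name exists; in GNU Make, listing the target under .PHONY guarantees this)

Example

.PHONY: clean test

clean:
	rm -f *.class

test:
	java -cp . Test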

slide-62
SLIDE 62

Recap

  • We’ve defined context-free grammars

– More powerful than regular expressions

  • Learned a bit about makefiles
  • Next time: we’ll look at grammars in more detail