CSCI 3136 Principles of Programming Languages: Syntactic Analysis

SLIDE 1

CSCI 3136 Principles of Programming Languages

Syntactic Analysis and Context-Free Grammars - 1

Summer 2013, Faculty of Computer Science, Dalhousie University

SLIDE 2

Constructing a Scanner: Example (1)

Language: strings of 0s and 1s containing an even number of 0s.

[Figures: regular expression and NFA for this language]
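As a minimal sketch of this example, the language can be described both by a regular expression and by a two-state DFA. The expression `(1|01*0)*` is one possible choice and an assumption here, since the slide's own expression and automata are not preserved; the state names `even` and `odd` are likewise hypothetical.

```python
import re
from itertools import product

# One possible regular expression for "even number of 0s" (assumption:
# not necessarily the expression shown on the original slide).
EVEN_ZEROS = re.compile(r"(1|01*0)*")

# Two-state DFA; the state tracks the parity of the 0s read so far.
# State names are hypothetical.
DELTA = {
    ("even", "0"): "odd",
    ("even", "1"): "even",
    ("odd", "0"): "even",
    ("odd", "1"): "odd",
}


def dfa_accepts(s):
    """Run the DFA from the start state 'even' and accept if it ends there."""
    state = "even"
    for ch in s:
        state = DELTA[(state, ch)]
    return state == "even"


def regex_accepts(s):
    """Accept if the whole string matches the regular expression."""
    return EVEN_ZEROS.fullmatch(s) is not None


# Cross-check both recognizers against the definition on all short strings.
for n in range(9):
    for chars in product("01", repeat=n):
        s = "".join(chars)
        assert dfa_accepts(s) == regex_accepts(s) == (s.count("0") % 2 == 0)
```

The brute-force loop is only a sanity check that the two descriptions agree on every string of length at most 8; it is not a proof of equivalence.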

SLIDE 3

Constructing a Scanner: Example (2)

[Figures: DFA and minimized DFA]

SLIDE 5

Extended Example of a Scanner

SLIDE 6

Scanner Implementation

  1. From a finite automaton
     • Case (switch) statements represent the transitions of the DFA
     • Table-based implementation: a table represents the transitions and is interpreted by a driver
  2. Ad hoc
     • Written by hand when high performance is an issue, e.g., in production compilers
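The table-based approach can be sketched as follows. This is an illustrative example under stated assumptions, not the scanner from the course: the token set (identifiers and unsigned integers separated by whitespace), the state names, and the functions `char_class` and `scan` are all hypothetical.

```python
def char_class(ch):
    """Map a character to a class so the table stays small."""
    if ch.isalpha():
        return "letter"
    if ch.isdigit():
        return "digit"
    return "other"


# Transition table of the DFA: (state, character class) -> next state.
# A missing entry means "no transition".
TABLE = {
    ("start", "letter"): "ident",
    ("start", "digit"): "number",
    ("ident", "letter"): "ident",
    ("ident", "digit"): "ident",
    ("number", "digit"): "number",
}

# Accepting states and the token kind each one reports.
ACCEPTING = {"ident": "IDENT", "number": "NUMBER"}


def scan(text):
    """Driver: interpret TABLE repeatedly, emitting one token per run."""
    tokens = []
    pos = 0
    while pos < len(text):
        if text[pos].isspace():
            pos += 1
            continue
        state, start = "start", pos
        # Follow transitions until the DFA gets stuck or input ends.
        while pos < len(text) and (state, char_class(text[pos])) in TABLE:
            state = TABLE[(state, char_class(text[pos]))]
            pos += 1
        if state not in ACCEPTING:
            raise ValueError(f"unexpected character {text[pos]!r} at position {pos}")
        tokens.append((ACCEPTING[state], text[start:pos]))
    return tokens


print(scan("x1 42 y"))
# [('IDENT', 'x1'), ('NUMBER', '42'), ('IDENT', 'y')]
```

In this small DFA every non-start state is accepting, so the point where the automaton gets stuck is also the end of the longest match. A scanner for a real language would usually also need to remember the last accepting state and back up to it.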

SLIDE 7

Phases of Compilation

Character Stream
  → Scanner (lexical analysis)
Token Stream
  → Parser (syntactic analysis)
Parse Tree
  → Semantic Analysis and Intermediate Code Generation
Abstract Syntax Tree or Other Intermediate Form
  → Machine-independent Code Improvement (Optional)
Modified Intermediate Form
  → Target Code Generation
Target Language (e.g., assembly)
  → Machine-specific Code Improvement (Optional)
Modified Target Language

Symbol Table (shared by all phases)

SLIDE 8

  • Context-free languages are generated by context-free grammars and recognized by parsers (PDAs).
  • Regular languages are generated by regular expressions and recognized by scanners (DFAs).

SLIDE 9

Context-Free Grammar (CFG) (motivation)

Set of rules or productions:

P → N
P → A P
S → P V P
A → big | green
N → cheese | Jim
V → ate

Are the following sentences in the language described by the above grammar?

  • big Jim ate green cheese
  • green Jim ate green cheese
  • Jim ate cheese
  • cheese ate Jim

The grammar consists of:

  • non-terminals or variables: V = {P, S, A, N, V}
  • terminals: Σ = {Jim, ate, cheese, big, green}
  • rules or productions: P = {P → N, P → A P, S → P V P, A → big | green, N → cheese | Jim, V → ate}
  • start variable or start non-terminal: S = S

A context-free grammar is a 4-tuple (V, Σ, P, S).
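The membership question for the four sentences can be checked mechanically. The following is a minimal sketch, not a general parsing algorithm: it tries every way of splitting the words among the symbols of a right-hand side, which terminates for this grammar because it has no epsilon rules and no cycles of unit productions. The names `derives`, `matches`, and `in_language` are hypothetical.

```python
from functools import lru_cache

# The example grammar as a 4-tuple (V, Sigma, P, S).
V = {"S", "P", "A", "N", "V"}
SIGMA = {"Jim", "ate", "cheese", "big", "green"}
PRODUCTIONS = {
    "S": [("P", "V", "P")],
    "P": [("N",), ("A", "P")],
    "A": [("big",), ("green",)],
    "N": [("cheese",), ("Jim",)],
    "V": [("ate",)],
}
START = "S"


@lru_cache(maxsize=None)
def derives(symbol, words):
    """True if `symbol` can derive exactly the tuple of words `words`."""
    if symbol in SIGMA:
        return words == (symbol,)
    return any(matches(rhs, words) for rhs in PRODUCTIONS[symbol])


@lru_cache(maxsize=None)
def matches(rhs, words):
    """True if the symbol sequence `rhs` can derive exactly `words`."""
    if not rhs:
        return not words
    head, rest = rhs[0], rhs[1:]
    # Without epsilon rules every symbol derives at least one word,
    # so leave at least one word for each remaining symbol.
    for cut in range(1, len(words) - len(rest) + 1):
        if derives(head, words[:cut]) and matches(rest, words[cut:]):
            return True
    return False


def in_language(sentence):
    return derives(START, tuple(sentence.split()))


for s in ["big Jim ate green cheese", "green Jim ate green cheese",
          "Jim ate cheese", "cheese ate Jim"]:
    print(s, "->", in_language(s))  # prints True for all four sentences
```

All four sentences are accepted, including the odd-sounding ones: the grammar describes only syntactic structure and says nothing about meaning.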

SLIDE 10

Context-Free Grammar (CFG)

A context-free grammar is a 4-tuple (V, Σ, P, S), where

  • V is a finite set of non-terminals or variables,
  • Σ is a finite set of terminals,
  • P is a finite set of rules or productions of the form N → α, where N ∈ V and α ∈ (Σ ∪ V)*,
  • S ∈ V is the start variable.
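The definition translates fairly directly into a data structure that checks its own well-formedness. This is a sketch under assumptions: the class name `Grammar` and its field names are hypothetical, and the requirement that V and Σ be disjoint is the usual convention rather than something stated on the slide.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Grammar:
    variables: frozenset    # V
    terminals: frozenset    # Sigma
    productions: frozenset  # P: pairs (lhs, rhs), rhs a tuple of symbols
    start: str              # S

    def __post_init__(self):
        # Assumption: V and Sigma are disjoint (standard convention).
        if self.variables & self.terminals:
            raise ValueError("V and Sigma must be disjoint")
        if self.start not in self.variables:
            raise ValueError("start variable must be in V")
        symbols = self.variables | self.terminals
        for lhs, rhs in self.productions:
            if lhs not in self.variables:
                raise ValueError(f"left-hand side {lhs!r} is not in V")
            if any(sym not in symbols for sym in rhs):
                raise ValueError(f"right-hand side {rhs!r} uses an unknown symbol")


toy = Grammar(
    variables=frozenset({"S", "P", "A", "N", "V"}),
    terminals=frozenset({"Jim", "ate", "cheese", "big", "green"}),
    productions=frozenset({
        ("S", ("P", "V", "P")),
        ("P", ("N",)), ("P", ("A", "P")),
        ("A", ("big",)), ("A", ("green",)),
        ("N", ("cheese",)), ("N", ("Jim",)),
        ("V", ("ate",)),
    }),
    start="S",
)
```

An empty tuple as a right-hand side is allowed, matching the fact that (Σ ∪ V)* contains the empty string.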

SLIDE 11

Productions (Rules)

  • examples: P → N, P → A P
  • different notation: P → N|A P
  • left-hand side (lhs) and right-hand side (rhs) of a rule: in P → A P, the lhs is P and the rhs is A P
  • rhs may be a mixture of terminals and non-terminals: P → big green N

  • empty rule (epsilon rule, epsilon production): P → ε
  • unit production: P → N, where P and N are non-terminals
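The special kinds of productions named above can be told apart by looking only at the right-hand side. A small sketch, where the function `classify` and the label "ordinary production" are hypothetical names introduced for illustration:

```python
def classify(rhs, variables):
    """Name the kind of a production from its right-hand side (a tuple of symbols)."""
    if len(rhs) == 0:
        return "epsilon production"
    if len(rhs) == 1 and rhs[0] in variables:
        return "unit production"
    return "ordinary production"


variables = {"P", "N", "A"}
print(classify((), variables))                      # epsilon production
print(classify(("N",), variables))                  # unit production
print(classify(("big", "green", "N"), variables))   # ordinary production
```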

SLIDE 12

Generating Sentences, example

S → P V P
P → N
P → A P
A → big | green
N → cheese | Jim
V → ate

  • S ⇒ P V P ⇒ N V P ⇒ N V N ⇒ Jim V N ⇒ Jim ate N ⇒ Jim ate cheese
  • S ⇒ P V P ⇒ A P V P ⇒ big P V P ⇒ big N V P ⇒ big Jim V P ⇒ big Jim ate P ⇒ big Jim ate A P ⇒ big Jim ate green P ⇒ big Jim ate green N ⇒ big Jim ate green cheese
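Each ⇒ step in these derivations replaces exactly one non-terminal by the right-hand side of one of its rules, which can be verified mechanically. The sketch below assumes symbols are separated by spaces; the helper names `is_step` and `is_derivation` are hypothetical.

```python
RULES = {
    "S": [["P", "V", "P"]],
    "P": [["N"], ["A", "P"]],
    "A": [["big"], ["green"]],
    "N": [["cheese"], ["Jim"]],
    "V": [["ate"]],
}


def is_step(before, after):
    """True if `after` results from rewriting one non-terminal of `before`."""
    for i, sym in enumerate(before):
        for rhs in RULES.get(sym, []):
            if before[:i] + rhs + before[i + 1:] == after:
                return True
    return False


def is_derivation(text):
    """Check a derivation written as forms separated by ' ⇒ '."""
    forms = [form.split() for form in text.split(" ⇒ ")]
    return all(is_step(a, b) for a, b in zip(forms, forms[1:]))


first = "S ⇒ P V P ⇒ N V P ⇒ N V N ⇒ Jim V N ⇒ Jim ate N ⇒ Jim ate cheese"
print(is_derivation(first))  # True
```

Note that the first derivation does not always rewrite the leftmost non-terminal (the step N V P ⇒ N V N rewrites the last one), and the check accepts it: any non-terminal may be chosen at each step.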

SLIDE 13

Generating Sentences

A CFG generates sentences using a process of rewriting in the following way:

  • start with S
  • choose a rule S → α (α is a sequence of terminals and/or non-terminals), and replace S with α
  • if α contains a non-terminal X, choose a rule X → β, and replace X with β
  • continue the process until only terminals remain

This process of rewriting is known as derivation. Intermediate strings are called sentential forms.

⇒* notation: S ⇒* Jim ate cheese means that "Jim ate cheese" can be derived from S in zero or more rewriting steps.
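The rewriting process can be sketched as a short program that records every sentential form. Two simplifications are assumptions of this sketch: it always rewrites the leftmost non-terminal (the process above allows any non-terminal), and it picks rules at random. The function name `derive` and its parameters are hypothetical.

```python
import random

RULES = {
    "S": [["P", "V", "P"]],
    "P": [["N"], ["A", "P"]],
    "A": [["big"], ["green"]],
    "N": [["cheese"], ["Jim"]],
    "V": [["ate"]],
}


def derive(start="S", rng=None, max_steps=1000):
    """Return the sentential forms of one random leftmost derivation."""
    rng = rng or random.Random()
    form = [start]
    forms = [list(form)]
    for _ in range(max_steps):
        # Find the leftmost non-terminal, if any.
        idx = next((i for i, sym in enumerate(form) if sym in RULES), None)
        if idx is None:
            return forms  # only terminals remain: a sentence was generated
        rhs = rng.choice(RULES[form[idx]])
        form = form[:idx] + rhs + form[idx + 1:]
        forms.append(list(form))
    raise RuntimeError("derivation did not finish within max_steps")


for form in derive():
    print(" ".join(form))
```

The first line printed is S, the last is a sentence of the language, and every line in between is a sentential form. The `max_steps` guard exists because the recursive rule P → A P could in principle be chosen arbitrarily often.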
