Context-Free Languages Wen-Guey Tzeng Department of Computer - - PowerPoint PPT Presentation



slide-1
SLIDE 1

Context-Free Languages

Wen-Guey Tzeng Department of Computer Science National Chiao Tung University

slide-2
SLIDE 2

Context-Free Grammars

  • A grammar G=(V, T, S, P) is context-free if all productions in P are of the form A → x, where A ∈ V and x ∈ (V ∪ T)*

– The left side is a single variable.

  • Example (λ denotes the empty string):

G = ({S,A,B}, {a,b}, S, {S → aAb | bBa, A → aAb | λ, B → bBb | λ})
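
The definition can be checked mechanically. The following is a minimal sketch, assuming productions are stored as (left side, right side) string pairs and every symbol is a single character. The function name `is_context_free` and the second, non-context-free rule are illustrative and not taken from the slides.

```python
def is_context_free(variables, terminals, productions):
    """Return True if every production has the form A -> x, with A a single
    variable and x a (possibly empty) string over variables and terminals."""
    symbols = variables | terminals
    for left, right in productions:
        if left not in variables:              # left side must be exactly one variable
            return False
        if any(ch not in symbols for ch in right):
            return False
    return True


V = {"S", "A", "B"}
T = {"a", "b"}
# The empty Python string "" stands for an empty right-hand side.
P = [("S", "aAb"), ("S", "bBa"), ("A", "aAb"), ("A", ""), ("B", "bBb"), ("B", "")]

print(is_context_free(V, T, P))                # the example grammar from the slide
print(is_context_free(V, T, [("aA", "ab")]))   # hypothetical rule with a two-symbol left side
```

The second call fails the check because its left side is not a single variable.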

2 2017 Spring

slide-3
SLIDE 3
  • Derivation:


slide-4
SLIDE 4
  • L(G) = {w ∈ T* | S ⇒* w}
  • A language L is context-free if and only if there is a context-free grammar G such that L=L(G).

slide-5
SLIDE 5

Examples

  • G=({S}, {a, b}, S, P), with P={S → aSa | bSb | λ}

– S ⇒ aSa ⇒ aaSaa ⇒ aabSbaa ⇒ aabbaa
– L(G) = {ww^R : w ∈ {a, b}*}
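
The derivation pattern can be reproduced for any w. This is a small sketch, assuming the empty right-hand side is the empty string; the helper name `derive_even_palindrome` is made up for illustration.

```python
def derive_even_palindrome(w):
    """Return the sentential forms of the derivation of w + reversed(w)
    using the productions S -> aSa | bSb | (empty string)."""
    if any(ch not in "ab" for ch in w):
        raise ValueError("w must be a string over {a, b}")
    forms = ["S"]
    left = ""
    for ch in w:
        left += ch                              # apply S -> aSa or S -> bSb
        forms.append(left + "S" + left[::-1])
    forms.append(left + left[::-1])             # finish with S -> empty string
    return forms


print(" => ".join(derive_even_palindrome("aab")))
```

For w = aab the printed chain is the derivation of aabbaa shown on the slide.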


slide-6
SLIDE 6
  • S → abB, A → aaBb | λ, B → bbAa

– L(G) = {ab(bbaa)^n bba(ba)^n : n ≥ 0} ?

slide-7
SLIDE 7

Design cfg’s

  • Give a cfg for L={a^n b^n : n ≥ 0}

slide-8
SLIDE 8
  • Give a cfg for L={a^n b^m : n > m}

slide-9
SLIDE 9
  • Give a cfg for L={a^n b^m : n ≠ m; n, m ≥ 0}

– Idea1:

  • Split L into two cases (not necessarily disjoint): L = L1 ∪ L2, where L1={a^n b^m : n > m} and L2={a^n b^m : n < m}.
  • Then, construct productions for L1 and L2, respectively.

slide-10
SLIDE 10
  • Give a cfg for L={a^n b^m : n ≠ m; n, m ≥ 0}

– Idea2:

  • produce the same number of a’s and b’s, then extra a’s or extra b’s
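
One grammar that follows Idea2 is S → aSb | A | B, A → aA | a, B → bB | b. It is a possible answer, not necessarily the one intended on the slide. The sketch below checks it by enumerating every string of bounded length that the grammar derives; `generate`, `GRAMMAR` and `MAX_LEN` are illustrative names.

```python
def generate(grammar, start, max_len):
    """All terminal strings of length <= max_len derivable from start.

    Assumes no production has an empty right-hand side, so sentential forms
    never shrink and forms longer than max_len can be discarded safely.
    """
    seen = {start}
    frontier = [start]
    words = set()
    while frontier:
        form = frontier.pop()
        # Expand only the leftmost variable (leftmost derivations suffice).
        idx = next((i for i, ch in enumerate(form) if ch in grammar), None)
        if idx is None:
            words.add(form)                     # no variable left: a terminal string
            continue
        for rhs in grammar[form[idx]]:
            new = form[:idx] + rhs + form[idx + 1:]
            if len(new) <= max_len and new not in seen:
                seen.add(new)
                frontier.append(new)
    return words


GRAMMAR = {
    "S": ["aSb", "A", "B"],   # matched pairs first, then switch to the surplus
    "A": ["aA", "a"],         # one or more extra a's
    "B": ["bB", "b"],         # one or more extra b's
}
MAX_LEN = 8
expected = {"a" * n + "b" * m
            for n in range(MAX_LEN + 1) for m in range(MAX_LEN + 1)
            if n != m and n + m <= MAX_LEN}
print(generate(GRAMMAR, "S", MAX_LEN) == expected)
```

Such a bounded comparison is only a sanity check up to length 8, not a proof that the grammar generates L.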


slide-11
SLIDE 11
  • Give a cfg for L={a^n b^m c^k : m = n+k}

– Match ‘a’ and ‘b’, ‘b’ and ‘c’

slide-12
SLIDE 12
  • Give a cfg for L={a^n b^m c^k : m > n+k}

slide-13
SLIDE 13
  • Give a cfg for L={w ∈ {a,b}* : n_a(w) = n_b(w)}

– Find the “recursion”

slide-14
SLIDE 14
slide-15
SLIDE 15
slide-16
SLIDE 16
  • Give a cfg for L={w ∈ {a,b}* : n_a(w) > n_b(w)}

– Find the relation with another language
– Consider starting with ‘a’ and ‘b’, respectively

slide-17
SLIDE 17

Leftmost and rightmost derivation

  • G=({A, B, S}, {a, b}, S, P), where P contains

S → AB, A → aaA, A → λ, B → Bb, B → λ

– L(G)={a^(2n) b^m : n, m ≥ 0}

  • For string aab

– Rightmost derivation: S ⇒ AB ⇒ ABb ⇒ Ab ⇒ aaAb ⇒ aab
– Leftmost derivation: S ⇒ AB ⇒ aaAB ⇒ aaB ⇒ aaBb ⇒ aab

slide-18
SLIDE 18

Derivation (parse) tree

  • A → abABc

slide-19
SLIDE 19
  • S → aAB, A → bBb, B → A | λ

slide-20
SLIDE 20

Some comments

  • A derivation tree does not record the order in which productions are applied
  • Leftmost/rightmost derivations correspond to depth-first traversals of the tree
  • Derivation trees and derivation order are very important to “programming languages” and “compiler design”
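
The first two comments can be illustrated with one tree and two traversal orders. The sketch below assumes a tree encoding chosen for this example: a variable node is a tuple (name, children), a terminal is a plain string, and a node with an empty child list stands for a production with an empty right-hand side. The tree is the one for aab under S → AB, A → aaA, A → λ, B → Bb, B → λ.

```python
def render(form):
    """Write a sentential form (list of nodes) as a string."""
    return "".join(node[0] if isinstance(node, tuple) else node for node in form)


def derivation(tree, leftmost=True):
    """Read a derivation off a tree by always expanding the leftmost
    (or rightmost) variable that is still unexpanded."""
    form = [tree]
    steps = [render(form)]
    while True:
        positions = [i for i, node in enumerate(form) if isinstance(node, tuple)]
        if not positions:
            return steps
        i = positions[0] if leftmost else positions[-1]
        form[i:i + 1] = form[i][1]          # replace the variable by its children
        steps.append(render(form))


TREE = ("S", [
    ("A", ["a", "a", ("A", [])]),           # A -> aaA, then A -> empty string
    ("B", [("B", []), "b"]),                # B -> Bb, then B -> empty string
])

print(" => ".join(derivation(TREE, leftmost=True)))
print(" => ".join(derivation(TREE, leftmost=False)))
```

Both printed derivations come from the same tree, which is why the tree alone carries no information about derivation order.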


slide-21
SLIDE 21

Grammar for C


slide-22
SLIDE 22

main()
{
    int i = 1;
    printf("i starts out life as %d.", i);
    i = add(1, 1);  /* Function call */
    printf(" And becomes %d after function is executed.\n", i);
}

slide-23
SLIDE 23

Parsing and ambiguity

  • Parsing of w ∈ L(G): find a sequence of productions by which w is derived.
  • Questions: given G and w

– Is w ∈ L(G) ? (membership problem)
– Is there an efficient way to determine whether w ∈ L(G) ?
– How is w ∈ L(G) parsed ? (build the parse tree)
– Is the parse unique ?

slide-24
SLIDE 24

Exhaustive search/top down parsing

  • S → SS | aSb | bSa | λ
  • Determine aabb ∈ L(G) ?

– 1st round: (1) S ⇒ SS; (2) S ⇒ aSb; (3) S ⇒ bSa; (4) S ⇒ λ
– 2nd round:

  • From (1), S ⇒ SS ⇒ SSS, S ⇒ SS ⇒ aSbS, S ⇒ SS ⇒ bSaS, S ⇒ SS ⇒ S
  • From (2), S ⇒ aSb ⇒ aSSb, S ⇒ aSb ⇒ aaSbb, S ⇒ aSb ⇒ abSab, S ⇒ aSb ⇒ ab

– 3rd round: …

  • Drawback: inefficiency
  • Other ways ?
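
The round-by-round search can be sketched as follows. Two details are assumptions of this sketch rather than part of the slide: only the leftmost variable is expanded, and forms are pruned when their terminal prefix disagrees with w or they already contain more terminals than w. The round limit `max_rounds` is needed because this grammar has the productions S → SS and S → λ, so the search would otherwise never stop on strings outside L(G).

```python
def exhaustive_parse(w, grammar, start="S", max_rounds=8):
    """Round-by-round search over leftmost derivations.

    Returns the round in which w is first derived, or None if w is not found
    within max_rounds (which by itself does not prove that w is not in L(G)).
    """
    forms = {start}
    for round_no in range(1, max_rounds + 1):
        next_forms = set()
        for form in forms:
            # Every stored form contains at least one variable.
            i = next(k for k, ch in enumerate(form) if ch in grammar)
            for rhs in grammar[form[i]]:
                new = form[:i] + rhs + form[i + 1:]
                if new == w:
                    return round_no
                j = next((k for k, ch in enumerate(new) if ch in grammar), None)
                if j is None:
                    continue                    # some other terminal string: dead end
                terminals = sum(ch not in grammar for ch in new)
                # Keep the form only if it can still lead to w.
                if w.startswith(new[:j]) and terminals <= len(w):
                    next_forms.add(new)
        forms = next_forms
    return None


GRAMMAR = {"S": ["SS", "aSb", "bSa", ""]}       # "" is the empty right-hand side
print(exhaustive_parse("aabb", GRAMMAR))
print(exhaustive_parse("aab", GRAMMAR, max_rounds=6))
```

Even with pruning, the set of forms grows quickly from round to round, which is the inefficiency the slide points out.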


slide-25
SLIDE 25
  • If there are no productions of the form A → λ or A → B, the exhaustive search for w ∈ L(G) can be done in |P| + |P|^2 + … + |P|^(2|w|) = O(|P|^(2|w|+1))

– Consider the leftmost parsing method.
– w can be obtained within 2|w| derivation steps.
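
The 2|w| limit holds because, without productions A → λ or A → B, every derivation step increases either the length of the sentential form or the number of terminals in it, and neither quantity can exceed |w|. The bound itself is a geometric sum; the short derivation below assumes |P| ≥ 2 and that each round multiplies the number of sentential forms by at most |P|.

```latex
\sum_{i=1}^{2|w|} |P|^{i}
  = \frac{|P|^{2|w|+1} - |P|}{|P| - 1}
  \le |P|^{2|w|+1}
  \quad\Longrightarrow\quad
  \sum_{i=1}^{2|w|} |P|^{i} = O\!\left(|P|^{2|w|+1}\right)
```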


slide-26
SLIDE 26
slide-27
SLIDE 27

Bottom up parsing

  • To reduce a string w to the start variable S
  • S → aSb | λ

– w = aabb ⇐ aaSbb ⇐ aSb ⇐ S (each step undoes one production)

  • Efficiency: O(|w|^3)
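
The slide does not name an algorithm for the O(|w|^3) bound. One standard bottom-up method with that running time is the CYK algorithm, which requires a grammar in Chomsky normal form. The sketch below assumes a hand-made normal-form grammar for {a^n b^n : n ≥ 1}, namely S → AB | AC, C → SB, A → a, B → b; it therefore drops the empty string that S → aSb | λ also derives.

```python
def cyk(w, terminal_rules, binary_rules, start="S"):
    """CYK membership test for a grammar in Chomsky normal form."""
    n = len(w)
    if n == 0:
        return False                        # this normal form derives no empty string
    # table[i][j] holds the variables that derive w[i..j] (inclusive)
    table = [[set() for _ in range(n)] for _ in range(n)]
    for i, ch in enumerate(w):
        table[i][i] = {A for A, a in terminal_rules if a == ch}
    for length in range(2, n + 1):          # substring length
        for i in range(n - length + 1):     # start position
            j = i + length - 1
            for k in range(i, j):           # split point
                for A, B, C in binary_rules:
                    if B in table[i][k] and C in table[k + 1][j]:
                        table[i][j].add(A)  # reduce "B C" to A
    return start in table[0][n - 1]


TERMINAL_RULES = [("A", "a"), ("B", "b")]
BINARY_RULES = [("S", "A", "B"), ("S", "A", "C"), ("C", "S", "B")]

print(cyk("aabb", TERMINAL_RULES, BINARY_RULES))
print(cyk("aab", TERMINAL_RULES, BINARY_RULES))
```

The three nested loops over length, start position and split point account for the cubic factor in |w|.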


slide-28
SLIDE 28

Linear-time parsing

  • Simple grammar (s-grammar)

– All productions are of the form A → ax, where a ∈ T, x ∈ (V ∪ T)*
– Any pair (A, a) occurs at most once in P.

  • Example: S → aS | bSS | c

– Parsing for ababccc
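
Because each pair (A, a) selects at most one production, a parser never has to guess. A minimal stack-based sketch, assuming single-character symbols and a rule table keyed by (variable, next input symbol); `s_parse` and `RULES` are illustrative names.

```python
def s_parse(w, rules, start="S"):
    """Parse w with an s-grammar in one left-to-right pass.

    rules maps (A, a) to x for each production A -> a x.
    Returns the list of productions used, or None if w is rejected.
    """
    variables = {A for A, _ in rules}
    stack = [start]                         # symbols still to be matched, top at the end
    used = []
    for ch in w:
        if not stack:
            return None                     # input left over
        top = stack.pop()
        if top in variables:
            x = rules.get((top, ch))
            if x is None:
                return None                 # no production for this pair
            used.append(f"{top} -> {ch}{x}")
            stack.extend(reversed(x))       # leftmost symbol of x ends up on top
        elif top != ch:
            return None                     # terminal mismatch
    return used if not stack else None


RULES = {("S", "a"): "S", ("S", "b"): "SS", ("S", "c"): ""}
print(s_parse("ababccc", RULES))
```

Each input symbol is handled with one table lookup and one stack update, so the number of steps is proportional to |w|.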


slide-29
SLIDE 29

Ambiguous grammars

  • G is ambiguous if some w ∈ L(G) has two or more distinct derivation trees.
  • Example: S → aSb | SS | λ

slide-30
SLIDE 30

Example from programming languages

  • C-like grammar for arithmetic expressions.

G=({E, I}, {a, b, c, +, x, (, )}, E, P), where P contains E → I, E → E+E, E → ExE, E → (E), I → a|b|c

  • w=a+bxc has two derivation trees
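
The number of derivation trees can be counted directly from the productions. The sketch below is written for this particular grammar only; the function name `count_trees` is an illustrative choice, and the terminal x is treated as the multiplication sign, as in the slide.

```python
from functools import lru_cache


def count_trees(w):
    """Number of derivation trees for w from E in the grammar
    E -> I | E+E | ExE | (E),  I -> a | b | c."""

    @lru_cache(maxsize=None)
    def count(i, j):                        # trees for the substring w[i:j]
        if i >= j:
            return 0
        total = 0
        if j - i == 1 and w[i] in "abc":
            total += 1                      # E -> I -> a | b | c
        if w[i] == "(" and w[j - 1] == ")":
            total += count(i + 1, j - 1)    # E -> (E)
        for k in range(i + 1, j - 1):
            if w[k] in "+x":                # E -> E+E or E -> ExE, operator at w[k]
                total += count(i, k) * count(k + 1, j)
        return total

    return count(0, len(w))


print(count_trees("a+bxc"))
print(count_trees("(a+b)xc"))
```

A count greater than one for some string is exactly what makes the grammar ambiguous; parentheses remove the choice of where to split.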


slide-31
SLIDE 31


slide-32
SLIDE 32

Ambiguous languages

  • A cfl L is inherently ambiguous if every cfg G with L(G)=L is ambiguous. Otherwise, it is unambiguous.
  • Note: an unambiguous language may have an ambiguous grammar.
  • Example: L={a^n b^n c^m} ∪ {a^n b^m c^m} is inherently ambiguous.

– Hard to prove.

slide-33
SLIDE 33

CFG and Programming Languages

  • Programming language: syntax + semantics
  • Syntax is defined by a grammar G

– <expression> ::= <term> | <expression> + <term>
  <term> ::= <factor> | <term> * <factor>
– <while_statement> ::= while <expression> <statement>

  • Syntax checking in compilers is done by a parser

– Is a program p grammatically correct ?
– Is p ∈ L(G) ?
– We need efficient parsers.
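
A hand-written parser for the expression rules can be sketched with recursive descent. The slide does not define <factor>, so this sketch assumes a factor is a single letter. The left-recursive rules are implemented as loops, which keeps + and * left-associative.

```python
def parse_expression(text):
    """Parse text with <expression> ::= <term> | <expression> + <term> and
    <term> ::= <factor> | <term> * <factor>; return a nested-tuple parse tree."""
    tokens = [ch for ch in text if not ch.isspace()]
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def factor():                           # assumed: <factor> ::= a single letter
        nonlocal pos
        tok = peek()
        if tok is None or not tok.isalpha():
            raise SyntaxError(f"expected an identifier at position {pos}")
        pos += 1
        return tok

    def term():                             # <term> ::= <factor> { * <factor> }
        nonlocal pos
        node = factor()
        while peek() == "*":
            pos += 1
            node = ("*", node, factor())
        return node

    def expression():                       # <expression> ::= <term> { + <term> }
        nonlocal pos
        node = term()
        while peek() == "+":
            pos += 1
            node = ("+", node, term())
        return node

    tree = expression()
    if pos != len(tokens):
        raise SyntaxError(f"unexpected token {tokens[pos]!r}")
    return tree


print(parse_expression("a + b * c"))
```

Because <term> sits below <expression> in the grammar, b * c is grouped before the addition, and each input has only one parse tree.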


slide-34
SLIDE 34

Restricted CFG Programming Languages

  • Goal:

– Its expressive power is enough.
– It has no ambiguity. For example, “if then if then else” can be read in two ways:

If then “if then else”
If then “if then” else

– There exists an efficient parser.

slide-35
SLIDE 35
  • C -- LR(1)
  • PASCAL -- LL(1)
  • Hierarchy of classes of context-free languages

– LL(1) ⊂ LR(0) ⊂ LR(1)=DCFL ⊂ LR(2) ⊂ … ⊂ CFL

slide-36
SLIDE 36


Syntactic Correctness

  • Lexical analyzer produces a stream of tokens

x = y +2.1 ⇒ <id> <op> <id> <op> <real>

  • Parser (syntactic analyzer) verifies that this token stream is syntactically correct by constructing a valid parse tree for the entire program

– Unique parse tree for each language construct
– Program = collection of parse trees rooted at the top by a special start symbol
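
The lexical step can be sketched with a regular expression. The three token classes and their patterns below are assumptions made for this example (identifiers, single-character operators, reals with a decimal point); a real lexer would have many more.

```python
import re

# One alternative per token class; leading whitespace is skipped.
TOKEN_RE = re.compile(
    r"\s*(?:(?P<real>\d+\.\d+)"
    r"|(?P<id>[A-Za-z_]\w*)"
    r"|(?P<op>[=+\-*/]))"
)


def tokenize(text):
    """Turn source text into a list of token class names such as '<id>'."""
    text = text.rstrip()
    tokens = []
    pos = 0
    while pos < len(text):
        match = TOKEN_RE.match(text, pos)
        if match is None:
            raise ValueError(f"unexpected character at position {pos}")
        tokens.append(f"<{match.lastgroup}>")   # name of the class that matched
        pos = match.end()
    return tokens


print(" ".join(tokenize("x = y +2.1")))
```

The parser then works on this token stream rather than on individual characters.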


slide-37
SLIDE 37


CFG For Floating Point Numbers

::= stands for production rule; <…> are non-terminals; | represents alternatives for the right-hand side of a production rule

Sample parse tree:


slide-38
SLIDE 38


CFG For Balanced Parentheses

Sample derivation: <balanced> ⇒ ( <balanced> ) ⇒ (( <balanced> )) ⇒ (( <empty> )) ⇒ (( ))

Could we describe this language using a regular expression or a DFA? Why?

slide-39
SLIDE 39


CFG For Decimal Numbers (Redux)

Sample top-down leftmost derivation: <num> ⇒ <digit> <num> ⇒ 7 <num> ⇒ 7 <digit> <num> ⇒ 7 8 <num> ⇒ 7 8 <digit> ⇒ 7 8 9

This grammar is right-recursive

slide-40
SLIDE 40


Compiler-compiler

  • A compiler-compiler is a program that generates a compiler from a defined grammar
  • A parser can be built automatically from the BNF description of the language’s CFG
  • Tools: yacc, Bison

slide-41
SLIDE 41


Compiler-compiler

[Diagram: from a programming language grammar G=(V, T, S, P), a compiler (parser + code generator) is produced; the compiler translates a program into execution code, which takes input data and produces a result.]

slide-42
SLIDE 42