Compiler Construction Lecture 8: Syntax Analysis IV (More on LL(1) & Bottom-Up Parsing). Thomas Noll, Lehrstuhl für Informatik 2 (Software Modeling and Verification), noll@cs.rwth-aachen.de


slide-1
SLIDE 1

Compiler Construction

Lecture 8: Syntax Analysis IV (More on LL(1) & Bottom-Up Parsing) Thomas Noll

Lehrstuhl für Informatik 2 (Software Modeling and Verification) noll@cs.rwth-aachen.de http://moves.rwth-aachen.de/teaching/ss-14/cc14/

Summer Semester 2014

slide-2
SLIDE 2

Outline

1. Recap: LL(1) Parsing
2. Transformation to LL(1)
3. The Complexity of LL(1) Parsing
4. Recursive-Descent Parsing
5. Bottom-Up Parsing
6. Nondeterministic Bottom-Up Parsing

Compiler Construction Summer Semester 2014 8.2

slide-3
SLIDE 3

Characterization of LL(1)

Theorem (Characterization of LL(1))

G ∈ LL(1) iff for all pairs of rules A → β | γ ∈ P (where β ≠ γ): la(A → β) ∩ la(A → γ) = ∅.

Proof.

On the board.

Remark: the above theorem generally does not hold if k > 1 (cf. exercises).
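The disjointness criterion can be checked mechanically once the lookahead sets are known. The following is a minimal sketch, not the lecture's implementation: the function name `is_ll1`, the end marker `"$"` (standing in for ε in the lookahead sets), and the hand-computed lookahead sets for the two arithmetic-expression grammars used later in the lecture are all assumptions made for illustration.

```python
from itertools import combinations

EOF = "$"  # hypothetical end-of-input marker standing in for ε in la(...)

def is_ll1(lookahead):
    """Return True iff, for every nonterminal, the lookahead sets of its
    alternatives are pairwise disjoint (the criterion of the theorem)."""
    for alternatives in lookahead.values():
        for la1, la2 in combinations(alternatives, 2):
            if la1 & la2:  # non-empty intersection: two rules compete
                return False
    return True

# Assumed lookahead sets for E → E+T | T, T → T*F | F, F → (E) | a | b.
LA_LEFT_RECURSIVE = {
    "E": [{"(", "a", "b"}, {"(", "a", "b"}],
    "T": [{"(", "a", "b"}, {"(", "a", "b"}],
    "F": [{"("}, {"a"}, {"b"}],
}

# Assumed lookahead sets for E → TE', E' → +TE' | ε, T → FT', T' → *FT' | ε,
# F → (E) | a | b.
LA_RIGHT_RECURSIVE = {
    "E": [{"(", "a", "b"}],
    "E'": [{"+"}, {EOF, ")"}],
    "T": [{"(", "a", "b"}],
    "T'": [{"*"}, {"+", EOF, ")"}],
    "F": [{"("}, {"a"}, {"b"}],
}

print(is_ll1(LA_LEFT_RECURSIVE))   # False
print(is_ll1(LA_RIGHT_RECURSIVE))  # True
```

The sketch only performs the final comparison; computing the lookahead sets themselves (from first and follow sets) is assumed to have been done beforehand.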

slide-4
SLIDE 4

Deterministic Top-Down Parsing

Approach: given G ∈ CFG_Σ,

1. Verify that G ∈ LL(1) by computing the lookahead sets and checking alternatives for disjointness.
2. Start with the nondeterministic top-down parsing automaton NTA(G).
3. Use 1-symbol lookahead to control the choice of expanding productions:

   (aw, Aα, z) ⊢ (aw, βα, zi) if πi = A → β and a ∈ la(πi)
   (ε, Aα, z) ⊢ (ε, βα, zi) if πi = A → β and ε ∈ la(πi)
   [matching steps as before: (aw, aα, z) ⊢ (w, α, z)]

⟹ deterministic top-down parsing automaton DTA(G)

Remarks:

- DTA(G) is actually not a pushdown automaton (a is read but not consumed). But: it can be simulated using the finite control.
- The advantage of using lookahead is twofold:
  - Removal of nondeterminism
  - Earlier detection of syntax errors (in configurations (aw, Aα, z) where a ∉ ⋃_{A→β∈P} la(A → β))
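The expansion and matching steps of DTA(G) can be sketched with an explicit stack. This is an illustrative simulation under stated assumptions: the grammar is the right-recursive expression grammar with the rule numbers 1 to 9 and lookahead sets that appear in the recursive-descent procedures of Example 8.8, `"$"` is a hypothetical end marker playing the role of ε, and the names `RULES`, `LA`, and `dta_parse` are invented here.

```python
EOF = "$"  # hypothetical end marker playing the role of ε in the lookahead

# Rule numbers and lookahead sets as used in Example 8.8 (assumed encoding).
RULES = {
    1: ("E", ["T", "E'"]),
    2: ("E'", ["+", "T", "E'"]),
    3: ("E'", []),
    4: ("T", ["F", "T'"]),
    5: ("T'", ["*", "F", "T'"]),
    6: ("T'", []),
    7: ("F", ["(", "E", ")"]),
    8: ("F", ["a"]),
    9: ("F", ["b"]),
}
LA = {
    1: {"(", "a", "b"}, 2: {"+"}, 3: {EOF, ")"},
    4: {"(", "a", "b"}, 5: {"*"}, 6: {"+", EOF, ")"},
    7: {"("}, 8: {"a"}, 9: {"b"},
}
NONTERMINALS = {lhs for lhs, _ in RULES.values()}

def dta_parse(word, start="E"):
    """Run the deterministic top-down automaton on word.
    Returns the leftmost analysis (list of rule numbers) or None on error."""
    tokens = list(word) + [EOF]
    pos = 0
    stack = [start]  # top of the pushdown is the end of the list
    analysis = []
    while stack:
        top = stack.pop()
        look = tokens[pos]  # read, but not consumed, by expansion steps
        if top in NONTERMINALS:
            # Expansion step: the lookahead selects at most one rule.
            choices = [i for i, (lhs, _) in RULES.items()
                       if lhs == top and look in LA[i]]
            if not choices:
                return None  # early error detection
            rule = choices[0]
            stack.extend(reversed(RULES[rule][1]))
            analysis.append(rule)
        elif top == look:
            pos += 1  # matching step consumes the input symbol
        else:
            return None
    return analysis if tokens[pos] == EOF else None

print(dta_parse("(a)*b"))  # [1, 4, 7, 1, 4, 8, 6, 3, 5, 9, 6, 3]
print(dta_parse("a+"))     # None
```

Because the lookahead sets of competing rules are disjoint, `choices` never holds more than one rule, which is exactly what makes the automaton deterministic.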

slide-8
SLIDE 8

Transformation to LL(1)

Assume that G = ⟨N, Σ, P, S⟩ ∈ CFG_Σ \ LL(1) (i.e., there exist A → β | γ ∈ P such that la(A → β) ∩ la(A → γ) ≠ ∅).

Two heuristics for transforming G into G′ ∈ LL(1):

1. Removal of left recursion
2. Left factorization (used in parser-generating systems such as ANTLR)

Remarks:

- Transformations generally preserve the semantics (= generated language) of CFGs but not the syntactic structure of words (different syntax trees).
- Transformations cannot always yield an LL(1) grammar (since not every context-free language is generated by an LL grammar; details later).

slide-11
SLIDE 11

Left Recursion I

Definition 8.1 (Left recursion)

A grammar G = ⟨N, Σ, P, S⟩ ∈ CFG_Σ is called left recursive if there exist A ∈ N and α ∈ X∗ such that A ⇒+ Aα.

Corollary 8.2

If G ∈ CFG_Σ is left recursive with A ⇒+ Aα, then there exists β ∈ X∗ such that A ⇒+_l Aβ.

Example 8.3

The grammar (cf. Example 5.10)

G_AE : E → E+T | T
       T → T*F | F
       F → (E) | a | b

is left recursive, and in Example 7.4 it was shown that G_AE ∉ LL(1).

slide-15
SLIDE 15

Left Recursion II

Lemma 8.4

If G ∈ CFG_Σ is left recursive, then G ∉ ⋃_{k∈ℕ} LL(k).

Proof.

(for k = 1) Assume that G ∈ LL(1) is left recursive with A ⇒+_l Aβ.

Together with the reducedness of G this implies that S ⇒∗_l vAα ⇒+_l vAβα ⇒+_l vw for some v, w ∈ Σ∗ and α ∈ X∗.

The corresponding computation of DTA(G) (Def. 7.6) starts with (vw, S, ε) ⊢∗ (w, Aα, ...) ⊢+ (w, Aβα, ...). But in the last configuration the behaviour of DTA(G) is determined by the same input (fi(w)) and stack symbol (A). Thus it enters a loop of the form (w, Aα, ...) ⊢+ (w, Aβα, ...) ⊢+ (w, Aββα, ...) ⊢+ ... and will never recognize w. Contradiction.

slide-18
SLIDE 18

Removing Direct Left Recursion

Direct left recursion occurs in productions of the form

A → Aα1 | ... | Aαm | β1 | ... | βn

where αi ≠ ε and βj ≠ A... (i.e., no βj starts with A).

Transformation: replacement by right recursion

A → β1A′ | ... | βnA′
A′ → α1A′ | ... | αmA′ | ε

(with a new A′ ∈ N), which preserves L(G).

Example 8.5

G_AE : E → E+T | T
       T → T*F | F
       F → (E) | a | b

is transformed into

G′_AE : E → TE′
        E′ → +TE′ | ε
        T → FT′
        T′ → *FT′ | ε
        F → (E) | a | b

with G′_AE ∈ LL(1) (see Example 7.5).
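The replacement of direct left recursion by right recursion can be sketched as a small function. This is an illustrative sketch, not the lecture's code: productions are encoded as lists of symbols, the empty list stands for ε, and the fresh nonterminal is formed by appending a prime, which is assumed not to clash with an existing name.

```python
def remove_direct_left_recursion(nt, alternatives):
    """Rewrite nt → nt α1 | ... | nt αm | β1 | ... | βn as
    nt → β1 nt' | ... | βn nt' and nt' → α1 nt' | ... | αm nt' | ε.
    Each alternative is a list of symbols; [] encodes ε.
    Assumes every αi is non-empty, as required on the slide."""
    alphas = [alt[1:] for alt in alternatives if alt and alt[0] == nt]
    betas = [alt for alt in alternatives if not alt or alt[0] != nt]
    if not alphas:
        return {nt: alternatives}  # nothing to do
    new = nt + "'"  # assumed to be a fresh nonterminal
    return {
        nt: [beta + [new] for beta in betas],
        new: [alpha + [new] for alpha in alphas] + [[]],
    }

print(remove_direct_left_recursion("E", [["E", "+", "T"], ["T"]]))
# {'E': [['T', "E'"]], "E'": [['+', 'T', "E'"], []]}
print(remove_direct_left_recursion("T", [["T", "*", "F"], ["F"]]))
# {'T': [['F', "T'"]], "T'": [['*', 'F', "T'"], []]}
```

Applied to the E- and T-productions of G_AE, the function reproduces the corresponding rules of G′_AE from Example 8.5; the F-productions are returned unchanged because they are not left recursive.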

slide-20
SLIDE 20

Removing Indirect Left Recursion

Indirect left recursion occurs in productions of the form (n ≥ 1)

A → A1α1 | ...
A1 → A2α2 | ...
...
An−1 → Anαn | ...
An → Aβ | ...

Transformation: into Greibach Normal Form with productions of the form

A → aB1 ... Bn (where n ∈ ℕ and each Bi ≠ S) or S → ε

(cf. Formale Systeme, Automaten, Prozesse)

slide-23
SLIDE 23

Left Factorization

Applies to productions of the form A → αβ | αγ, which are problematic if α is "at least as long as the lookahead".

Transformation: delaying the decision by left factorization

A → αA′
A′ → β | γ

(with a new A′ ∈ N), which preserves L(G).

Example 8.6

Statement → if Condition then Statement else Statement fi
          | if Condition then Statement fi

is transformed into

Statement → if Condition then Statement S′
S′ → else Statement fi | fi
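Left factorization of two alternatives amounts to splitting off their longest common prefix. The following sketch makes that concrete; the helper names `common_prefix` and `left_factor` are invented here, right-hand sides are lists of symbols, and the fresh nonterminal is named by appending a prime (so it comes out as `Statement'` rather than the S′ used in Example 8.6).

```python
def common_prefix(a, b):
    """Longest common prefix α of two symbol lists."""
    n = 0
    while n < len(a) and n < len(b) and a[n] == b[n]:
        n += 1
    return a[:n]

def left_factor(nt, alt1, alt2):
    """Factor nt → αβ | αγ into nt → α nt' and nt' → β | γ.
    Returns the productions unchanged if there is no common prefix."""
    alpha = common_prefix(alt1, alt2)
    if not alpha:
        return {nt: [alt1, alt2]}
    new = nt + "'"  # assumed to be a fresh nonterminal
    return {
        nt: [alpha + [new]],
        new: [alt1[len(alpha):], alt2[len(alpha):]],
    }

long_form = "if Condition then Statement else Statement fi".split()
short_form = "if Condition then Statement fi".split()
for lhs, alts in left_factor("Statement", long_form, short_form).items():
    print(lhs, "→", " | ".join(" ".join(alt) for alt in alts))
# Statement → if Condition then Statement Statement'
# Statement' → else Statement fi | fi
```

After factoring, the two alternatives of the new nonterminal start with different terminals (else and fi), so one symbol of lookahead is enough to choose between them.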

slide-28
SLIDE 28

The Complexity of LL(1) Parsing I

LL(1) parsing has time (and hence space) complexity O(|w|) (where w ∈ Σ∗ is the input word).

- Here: proof for ε-free grammars (i.e., A → α ∈ P ⟹ α ≠ ε)
- General case: see O. Mayer: Syntaxanalyse, p. 211ff

Lemma 8.7

Let G = ⟨N, Σ, P, S⟩ ∈ LL(1) be ε-free. If (w, S, ε) ⊢^n (ε, ε, z) in DTA(G), then n ≤ (|w| + 1) · (|N| + 1).

slide-32
SLIDE 32

The Complexity of LL(1) Parsing II

Proof.

Let (w, S, ε) ⊢^n (ε, ε, z) in DTA(G). To show: n ≤ (|w| + 1) · (|N| + 1).

1. Clear: the computation involves |w| matching steps.
2. Since G is ε-free, every matching step is preceded (and followed) by k ≥ 0 expansion steps of the form

   (av, A1α1, ...) ⊢ (av, A2α2α1, ...) ⊢ ... ⊢ (av, Akαk ... α1, ...) ⊢ (av, aαk+1 ... α1, ...)

   where Ai → Ai+1αi+1 for each i ∈ [k − 1] and Ak → aαk+1.
3. This implies that Ai ≠ Aj for i ≠ j (by Lemma 8.4, G is not left recursive), and hence k ≤ |N|.
4. Altogether: each of the |w| matching steps is accompanied by at most |N| expansion steps, so n ≤ (|w| + 1) · (|N| + 1).
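The bound of Lemma 8.7 can be observed on a small instance. The sketch below counts the steps of a DTA for the ε-free grammar S → aSb | c, which is a hypothetical example chosen here (it does not appear in the lecture); its lookahead sets {a} and {c} are disjoint, so it is LL(1), and |N| = 1.

```python
# Hypothetical ε-free LL(1) grammar: S → aSb (rule 1) | c (rule 2).
RULES = {1: ("S", ["a", "S", "b"]), 2: ("S", ["c"])}
LA = {1: {"a"}, 2: {"c"}}
NUM_NONTERMINALS = 1  # |N|

def count_steps(word):
    """Return the number n of DTA steps needed to accept word,
    or None if word is rejected."""
    stack, pos, steps = ["S"], 0, 0
    while stack:
        top = stack.pop()
        look = word[pos] if pos < len(word) else None
        if top == "S":
            # Expansion step, selected by the lookahead symbol.
            rule = next((i for i in RULES if look in LA[i]), None)
            if rule is None:
                return None
            stack.extend(reversed(RULES[rule][1]))
        elif top == look:
            pos += 1  # matching step
        else:
            return None
        steps += 1
    return steps if pos == len(word) else None

for n in range(4):
    w = "a" * n + "c" + "b" * n
    bound = (len(w) + 1) * (NUM_NONTERMINALS + 1)
    print(w, count_steps(w), "<=", bound)
# c 2 <= 4
# acb 5 <= 8
# aacbb 8 <= 12
# aaacbbb 11 <= 16
```

For the word aⁿcbⁿ the automaton performs n + 1 expansion steps and 2n + 1 matching steps, so the step count 3n + 2 grows linearly in |w| and stays below the bound 4n + 4.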

slide-37
SLIDE 37

Recursive-Descent Parsing I

Idea: avoid explicit use of the pushdown store (as in DTA(G)) by employing recursive procedures (with implicit runtime stack)

Advantage: simple implementation

Ingredients:

- variable token for the current token
- function next() for invoking the scanner
- procedure print(i) for displaying the leftmost analysis (or errors)

Method: to every A ∈ N we assign a procedure A() which tests token with regard to the lookahead sets of the A-productions, prints the corresponding rule number, and evaluates the corresponding right-hand side as follows:

- for a ∈ Σ: match token; call next()
- for A ∈ N: call A()

slide-38
SLIDE 38

Recursive-Descent Parsing II

Example 8.8 (Arithmetic expressions; cf. Example 8.5)

proc main();
  token := next(); E()

proc E();    (* E → T E′ *)
  if token in {'(','a','b'} then
    print(1); T(); E'()
  else print(error); stop fi

proc E'();   (* E′ → + T E′ | ε *)
  if token = '+' then
    print(2); token := next(); T(); E'()
  elsif token in {EOF, ')'} then
    print(3)
  else print(error); stop fi

proc T();    (* T → F T′ *)
  if token in {'(','a','b'} then
    print(4); F(); T'()
  else print(error); stop fi

proc T'();   (* T′ → * F T′ | ε *)
  if token = '*' then
    print(5); token := next(); F(); T'()
  elsif token in {'+',EOF,')'} then
    print(6)
  else print(error); stop fi

proc F();    (* F → ( E ) | a | b *)
  if token = '(' then
    print(7); token := next(); E();
    if token = ')' then token := next()
    else print(error); stop fi
  elsif token = 'a' then
    print(8); token := next()
  elsif token = 'b' then
    print(9); token := next()
  else print(error); stop fi
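For readers who want to run the procedures of Example 8.8, here is one possible Python rendering. It is a sketch under assumptions rather than a faithful port: single characters serve as tokens, `"$"` is a hypothetical EOF token, rule numbers are collected in a list instead of being printed, errors raise a `ParseError`, and the final check that all input was consumed is an addition that the pseudocode's main() does not perform.

```python
class ParseError(Exception):
    """Raised where the pseudocode would print(error) and stop."""

EOF = "$"  # hypothetical end-of-input token

def parse(word):
    """Return the leftmost analysis of word as a list of rule numbers."""
    tokens = iter(list(word) + [EOF])
    token = next(tokens)
    out = []

    def advance():
        nonlocal token
        token = next(tokens)

    def fail():
        raise ParseError(f"unexpected token {token!r}")

    def E():  # E → T E'
        if token in {"(", "a", "b"}:
            out.append(1)
            T()
            E_()
        else:
            fail()

    def E_():  # E' → + T E' | ε
        if token == "+":
            out.append(2)
            advance()
            T()
            E_()
        elif token in {EOF, ")"}:
            out.append(3)
        else:
            fail()

    def T():  # T → F T'
        if token in {"(", "a", "b"}:
            out.append(4)
            F()
            T_()
        else:
            fail()

    def T_():  # T' → * F T' | ε
        if token == "*":
            out.append(5)
            advance()
            F()
            T_()
        elif token in {"+", EOF, ")"}:
            out.append(6)
        else:
            fail()

    def F():  # F → ( E ) | a | b
        if token == "(":
            out.append(7)
            advance()
            E()
            if token == ")":
                advance()
            else:
                fail()
        elif token == "a":
            out.append(8)
            advance()
        elif token == "b":
            out.append(9)
            advance()
        else:
            fail()

    E()
    if token != EOF:  # added check: reject trailing input such as "a)"
        fail()
    return out

print(parse("(a)*b"))  # [1, 4, 7, 1, 4, 8, 6, 3, 5, 9, 6, 3]
```

The call stack of the Python interpreter takes over the role of the pushdown store, which is the point of the recursive-descent idea.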

slide-49
SLIDE 49

Repetition: Top-Down Parsing

Example 8.9

Grammar for arithmetic expressions:

G_AE : E → E+T | T        (1, 2)
       T → T*F | F        (3, 4)
       F → (E) | a | b    (5, 6, 7)

Leftmost analysis of (a)*b: 2 3 4 5 2 4 6 7

[Figure: syntax tree of (a)*b, built top-down starting from the root E]
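A leftmost analysis can be replayed as a derivation by always rewriting the leftmost nonterminal. The sketch below does this for the rule numbering of Example 8.9; the encoding of sentential forms as strings and the name `leftmost_derivation` are assumptions made for illustration.

```python
# Rules of G_AE with the numbering of Example 8.9.
RULES = {
    1: ("E", "E+T"), 2: ("E", "T"),
    3: ("T", "T*F"), 4: ("T", "F"),
    5: ("F", "(E)"), 6: ("F", "a"), 7: ("F", "b"),
}
NONTERMINALS = "ETF"

def leftmost_derivation(analysis, start="E"):
    """Apply the rules in the given order, each to the leftmost
    nonterminal, and return all sentential forms."""
    form = start
    forms = [form]
    for i in analysis:
        lhs, rhs = RULES[i]
        pos = next((k for k, x in enumerate(form) if x in NONTERMINALS), None)
        if pos is None or form[pos] != lhs:
            raise ValueError(f"rule {i} is not applicable to {form!r}")
        form = form[:pos] + rhs + form[pos + 1:]
        forms.append(form)
    return forms

print(" ⇒ ".join(leftmost_derivation([2, 3, 4, 5, 2, 4, 6, 7])))
# E ⇒ T ⇒ T*F ⇒ F*F ⇒ (E)*F ⇒ (T)*F ⇒ (F)*F ⇒ (a)*F ⇒ (a)*b
```

Each printed sentential form corresponds to one stage of the syntax tree as it grows from the root downwards.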

slide-59
SLIDE 59

Bottom-Up Parsing I

Example 8.10

Grammar for arithmetic expressions:

G_AE : E → E+T | T        (1, 2)
       T → T*F | F        (3, 4)
       F → (E) | a | b    (5, 6, 7)

Reversed rightmost analysis of (a)*b: 6 4 2 5 4 7 3 2

[Figure: syntax tree of (a)*b, built bottom-up starting from the leaves]
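Reading the reversed rightmost analysis backwards yields an ordinary rightmost derivation, and listing that derivation in reverse gives the sequence of reductions a bottom-up parser performs. The following sketch illustrates this for Example 8.10; the string encoding and the name `reduction_sequence` are assumptions.

```python
# Rules of G_AE with the numbering of Example 8.10.
RULES = {
    1: ("E", "E+T"), 2: ("E", "T"),
    3: ("T", "T*F"), 4: ("T", "F"),
    5: ("F", "(E)"), 6: ("F", "a"), 7: ("F", "b"),
}
NONTERMINALS = "ETF"

def reduction_sequence(reversed_analysis, start="E"):
    """Replay the rightmost derivation encoded by the reversed analysis
    and return its sentential forms in bottom-up order (word first)."""
    form = start
    forms = [form]
    for i in reversed(reversed_analysis):
        lhs, rhs = RULES[i]
        # Position of the rightmost nonterminal, or -1 if there is none.
        pos = max((k for k, x in enumerate(form) if x in NONTERMINALS),
                  default=-1)
        if pos < 0 or form[pos] != lhs:
            raise ValueError(f"rule {i} is not applicable to {form!r}")
        form = form[:pos] + rhs + form[pos + 1:]
        forms.append(form)
    return forms[::-1]

print(reduction_sequence([6, 4, 2, 5, 4, 7, 3, 2]))
# ['(a)*b', '(F)*b', '(T)*b', '(E)*b', 'F*b', 'T*b', 'T*F', 'T', 'E']
```

The first reduction replaces a by F (rule 6) and the last one replaces T by E (rule 2), matching the order in which the tree is assembled from the leaves.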

slide-61
SLIDE 61

Bottom-Up Parsing II

Approach:

1. Given G ∈ CFG_Σ, construct a nondeterministic bottom-up parsing automaton (NBA) which accepts L(G) and which additionally computes corresponding (reversed) rightmost analyses
   - input alphabet: Σ
   - pushdown alphabet: X
   - output alphabet: [p] (where p := |P|)
   - state set: omitted
   - transitions:
     - shift: shifting input symbols onto the pushdown
     - reduce: replacing the right-hand side of a production by its left-hand side (= inverse expansion steps)
2. Remove nondeterminism by allowing lookahead on the input: G ∈ LR(k) iff L(G) is recognizable by a deterministic bottom-up parsing automaton with lookahead of k symbols

slide-63
SLIDE 63

Nondeterministic Bottom-Up Automaton I

Definition 8.11 (Nondeterministic bottom-up parsing automaton)

Let G = ⟨N, Σ, P, S⟩ ∈ CFG_Σ. The nondeterministic bottom-up parsing automaton of G, NBA(G), is defined by the following components.

- Input alphabet: Σ
- Pushdown alphabet: X
- Output alphabet: [p]
- Configurations: Σ∗ × X∗ × [p]∗ (top of pushdown to the right)
- Transitions for w ∈ Σ∗, α ∈ X∗, and z ∈ [p]∗:
  - shifting steps: (aw, α, z) ⊢ (w, αa, z) if a ∈ Σ
  - reduction steps: (w, αβ, z) ⊢ (w, αA, zi) if πi = A → β
- Initial configuration for w ∈ Σ∗: (w, ε, ε)
- Final configurations: {ε} × {S} × [p]∗

slide-78
SLIDE 78

Nondeterministic Bottom-Up Automaton II

Example 8.12

Grammar for arithmetic expressions (cf. Example 8.10):

G_AE : E → E+T | T        (1, 2)
       T → T*F | F        (3, 4)
       F → (E) | a | b    (5, 6, 7)

Bottom-up parsing of (a)*b:

    ( (a)*b , ε   , ε        )
  ⊢ ( a)*b  , (   , ε        )
  ⊢ ( )*b   , (a  , ε        )
  ⊢ ( )*b   , (F  , 6        )
  ⊢ ( )*b   , (T  , 64       )
  ⊢ ( )*b   , (E  , 642      )
  ⊢ ( *b    , (E) , 642      )
  ⊢ ( *b    , F   , 6425     )
  ⊢ ( *b    , T   , 64254    )
  ⊢ ( b     , T*  , 64254    )
  ⊢ ( ε     , T*b , 64254    )
  ⊢ ( ε     , T*F , 642547   )
  ⊢ ( ε     , T   , 6425473  )
  ⊢ ( ε     , E   , 64254732 )

slide-80
SLIDE 80

Correctness of NBA(G)

Theorem 8.13 (Correctness of NBA(G))

Let G = ⟨N, Σ, P, S⟩ ∈ CFG_Σ and NBA(G) as before. Then, for every w ∈ Σ∗ and z ∈ [p]∗,

(w, ε, ε) ⊢∗ (ε, S, z) iff ←z is a rightmost analysis of w

(where ←z denotes the reversal of z).

Proof.

Similar to the top-down case (Theorem 6.1).

slide-84
SLIDE 84

Nondeterminism in NBA(G)

Observation: NBA(G) is generally nondeterministic

- Shift or reduce? Example:
    (bw, αa, z) ⊢ (w, αab, z)     (shift)
    (bw, αa, z) ⊢ (bw, αA, zi)    if πi = A → a
- If reduce: which "handle" β? Example:
    (w, αab, z) ⊢ (w, αA, zi)     if πi = A → ab
    (w, αab, z) ⊢ (w, αaB, zj)    if πj = B → b
- If reduce β: which left-hand side A? Example:
    (w, αa, z) ⊢ (w, αA, zi)      if πi = A → a
    (w, αa, z) ⊢ (w, αB, zj)      if πj = B → a
- When to terminate parsing? Example:
    (ε, S, z) is final, but (ε, S, z) ⊢ (ε, A, zi) if πi = A → S
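The four kinds of choice can be made visible by listing all steps applicable in a configuration. The sketch below uses a hypothetical set of productions that simply collects the rules mentioned in the examples above (A → a, B → a, A → ab, B → b, A → S); the rule numbers and the name `moves` are assumptions made for illustration.

```python
# Hypothetical productions, chosen only to trigger each kind of conflict.
RULES = {
    1: ("A", "a"),
    2: ("B", "a"),
    3: ("A", "ab"),
    4: ("B", "b"),
    5: ("A", "S"),
}

def moves(w, stack):
    """List every NBA step applicable to remaining input w and pushdown stack."""
    options = []
    if w:
        options.append(f"shift {w[0]}")
    for i, (lhs, rhs) in RULES.items():
        if stack.endswith(rhs):
            options.append(f"reduce {i}: {lhs} → {rhs}")
    return options

print(moves("b", "a"))   # ['shift b', 'reduce 1: A → a', 'reduce 2: B → a']
print(moves("", "ab"))   # ['reduce 3: A → ab', 'reduce 4: B → b']
print(moves("", "S"))    # ['reduce 5: A → S']
```

The first call shows a shift/reduce choice combined with a choice of left-hand side, the second a choice of handle, and the third a reduction that is still possible in a final configuration.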