SLIDE 1

Computational Linguistics II: Parsing

Overview, Left-Recursion, Bottom-up Parsing

Frank Richter & Jan-Philipp Söhn

fr@sfs.uni-tuebingen.de, jp.soehn@uni-tuebingen.de

Computational Linguistics II: Parsing – p.1

SLIDE 2

The Big Picture

hierarchy   grammar                machine          other
type 3      reg. grammar           DFA              reg. expressions
det. cf.    LR(k) grammar          DPDA
type 2      CFG                    PDA
type 1      CSG                    LBA
type 0      unrestricted grammar   Turing machine

DFA: Deterministic finite state automaton
(D)PDA: (Deterministic) Pushdown automaton
CFG: Context-free grammar
CSG: Context-sensitive grammar
LBA: Linear bounded automaton

SLIDE 5

Problem: Left-Recursion

Grammar: S → Sb | a

Derivation for “abb” (each sentential form is compared against the input abb):

S ⇒ Sb ⇒ Sbb ⇒ Sbbb ⇒ Sbbbb ⇒ Sbbbbb ⇒ . . .

Problem: a top-down parser will never get around to finding a terminal in first position which it can match against the input ⇒ infinite loop
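The infinite loop can be reproduced with a few lines of Python. This is only a sketch: the function names (leftmost_expansions, parse_S, runs_into_infinite_recursion) are made up for illustration, and parse_S is a deliberately naive recursive-descent procedure that tries the alternative S → Sb first.

```python
def leftmost_expansions(start="S", steps=5):
    """Sentential forms produced by always expanding S with S -> Sb."""
    form = start
    forms = [form]
    for _ in range(steps):
        # The leftmost symbol is still S, so no terminal can be matched yet.
        form = form.replace("S", "Sb", 1)
        forms.append(form)
    return forms


def parse_S(text, pos=0):
    """Naive recursive descent for S -> S b | a, left-recursive alternative first."""
    end = parse_S(text, pos)  # S -> S b: recurses before consuming any input
    if end is not None and text[end:end + 1] == "b":
        return end + 1
    return pos + 1 if text[pos:pos + 1] == "a" else None


def runs_into_infinite_recursion(text):
    """True if parse_S exhausts Python's recursion limit on this input."""
    try:
        parse_S(text)
    except RecursionError:
        return True
    return False


print(" => ".join(leftmost_expansions()))   # S => Sb => Sbb => Sbbb => Sbbbb => Sbbbbb
print(runs_into_infinite_recursion("abb"))  # True
```

Python stops the runaway recursion with a RecursionError. A parser without such a limit would simply not terminate.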

SLIDE 10

Left-Recursion

What exactly is left recursion? Two cases:

  • 1. immediate left recursion: A → A α

the lefthand side symbol is the same as the first righthand side symbol

  • 2. indirect left recursion: A ⇒ B α ⇒ . . . ⇒ A β

A expands via intermediate steps into a sentential form that starts with A again.

Only non-terminals in first position on the righthand side are problematic.
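Both cases can be detected mechanically by following first-position non-terminals. The sketch below assumes a grammar stored as a dict from non-terminal to a list of righthand sides, with no empty righthand sides. The function name left_recursive_nonterminals and this representation are illustrative choices, not part of the slides. The second test grammar is the one used for indirect left recursion later in the deck.

```python
def left_recursive_nonterminals(grammar):
    """Map each left-recursive non-terminal to 'immediate' or 'indirect'.

    Assumes no empty righthand sides, so only the first symbol of a rule matters.
    """
    # first[A]: non-terminals occurring in first position of some rule for A
    first = {a: {rhs[0] for rhs in rules if rhs[0] in grammar}
             for a, rules in grammar.items()}
    result = {}
    for a in grammar:
        if a in first[a]:
            result[a] = "immediate"
            continue
        # Follow first-position non-terminals transitively, looking for a way back to a.
        seen, stack = set(), list(first[a])
        while stack:
            b = stack.pop()
            if b not in seen:
                seen.add(b)
                stack.extend(first[b])
        if a in seen:
            result[a] = "indirect"
    return result


IMMEDIATE = {"S": [["S", "b"], ["a"]]}
INDIRECT = {"S": [["A", "B"]], "A": [["C", "B"], ["b"]],
            "C": [["S", "a"]], "B": [["b"]]}

print(left_recursive_nonterminals(IMMEDIATE))  # {'S': 'immediate'}
print(left_recursive_nonterminals(INDIRECT))
```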

SLIDE 16

Eliminating Immediate Left-Recursion

rule again: S → Sb | a

language: a, ab, abb, abbb, . . . ⇒ a(b)*

i.e. one part in front which ends the recursion, and the recursive part at the end

idea: reformulated rules: S → a X | a; X → b X | b

more complicated ex.: S → Sb | Sd | ac | e

language: ac, acb, acbb, acbbb, acbd, acdb, e, eb, ebb, ebd, edb, . . . ⇒ (ac | e) (b | d)*

reformulated: S → acX | eX | ac | e; X → bX | dX | b | d
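One way to gain confidence in the reformulated rules is to enumerate their strings up to a length bound and compare them with the regular expressions a(b)* and (ac | e)(b | d)*. The following is a bounded check rather than a proof. The helper names generate and matching are hypothetical, and terminals are assumed to be single characters.

```python
import itertools
import re


def generate(grammar, start, max_len):
    """All terminal strings of length <= max_len derivable from start.

    Assumes no empty righthand sides, so a form longer than max_len can be dropped.
    """
    results, todo = set(), [(start,)]
    while todo:
        form = todo.pop()
        if len(form) > max_len:
            continue
        idx = next((i for i, s in enumerate(form) if s in grammar), None)
        if idx is None:                      # only terminals left
            results.add("".join(form))
            continue
        for rhs in grammar[form[idx]]:       # expand the leftmost non-terminal
            todo.append(form[:idx] + tuple(rhs) + form[idx + 1:])
    return results


def matching(pattern, alphabet, max_len):
    """All strings over alphabet of length <= max_len that fully match pattern."""
    rx = re.compile(pattern)
    return {"".join(p) for n in range(max_len + 1)
            for p in itertools.product(alphabet, repeat=n)
            if rx.fullmatch("".join(p))}


SIMPLE = {"S": [["a", "X"], ["a"]], "X": [["b", "X"], ["b"]]}
COMPLEX = {"S": [["a", "c", "X"], ["e", "X"], ["a", "c"], ["e"]],
           "X": [["b", "X"], ["d", "X"], ["b"], ["d"]]}

print(generate(SIMPLE, "S", 5) == matching(r"ab*", "ab", 5))                  # True
print(generate(COMPLEX, "S", 5) == matching(r"(ac|e)(b|d)*", "abcde", 5))     # True
```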

SLIDE 21

Eliminating Immediate Left-Recursion

rule form: A → A α1 | . . . | A αn | β1 | . . . | βm

A_head: righthand sides that are not left-recursive: betas
A_head → β1 | . . . | βm

A_tail: righthand sides to be duplicated: alphas
A_tail → α1 | . . . | αn

A_tails: recursion over alphas:
A_tails → A_tail | A_tail A_tails

A: puts it together:
A → A_head | A_head A_tails
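The head/tail scheme translates almost directly into code. The sketch below assumes rules are lists of symbols, that no righthand side is empty, and that there is no rule A → A. The function name eliminate_immediate and the generated names with the suffixes _head, _tail and _tails are illustrative.

```python
def eliminate_immediate(a, rules):
    """Rewrite the rules of non-terminal a using A_head / A_tail / A_tails.

    rules: list of righthand sides (lists of symbols) for a.
    Returns a dict of rules; unchanged if a is not immediately left-recursive.
    """
    alphas = [rhs[1:] for rhs in rules if rhs[0] == a]   # from A -> A alpha
    betas = [rhs for rhs in rules if rhs[0] != a]        # from A -> beta
    if not alphas:
        return {a: rules}
    head, tail, tails = f"{a}_head", f"{a}_tail", f"{a}_tails"
    return {
        a: [[head], [head, tails]],          # A -> A_head | A_head A_tails
        head: betas,                         # A_head -> beta1 | ... | betam
        tails: [[tail], [tail, tails]],      # A_tails -> A_tail | A_tail A_tails
        tail: alphas,                        # A_tail -> alpha1 | ... | alphan
    }


new_rules = eliminate_immediate("S", [["S", "b"], ["S", "d"], ["a", "c"], ["e"]])
for lhs, rhss in new_rules.items():
    print(lhs, "->", " | ".join(" ".join(rhs) for rhs in rhss))
```

For S → Sb | Sd | ac | e this yields S_head → a c | e and S_tail → b | d, which matches the pattern (ac | e)(b | d)*.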

SLIDE 23

Immediate Left-Recursion

left-recursive rule: A → A α1 | . . . | A αn | β1 | . . . | βm

non-left-recursive rules:
A → A_head | A_head A_tails
A_head → β1 | . . . | βm
A_tails → A_tail | A_tail A_tails
A_tail → α1 | . . . | αn

SLIDE 26

Eliminating Indirect Left-Recursion

grammar:
S → A B
A → C B | b
C → S a
B → b

derivation: S ⇒ A B ⇒ C B B ⇒ S a B B ⇒ A B a B B ⇒ C B B a B B ⇒ S a B B a B B ⇒ A B a B B a B B ⇒ . . .

language: bb(abb)*

idea: move left recursion closer, until it is in the same rule

SLIDE 33

Indirect Left-Recursion (2)

steps:

  • 1. enumerate all non-terminals: A1, A2, . . . , An
  • 2. check for A1 whether it is immediately left-recursive; if so, eliminate the left-recursion
  • 3. check for A2 whether it has a rule: A2 → A1 α
  • 4. yes? if A1 → β1 | . . . | βm, replace that rule by: A2 → β1 α | . . . | βm α
  • 5. check for An whether it has rules: An → A1 α . . . An → An α
  • 6. yes? eliminate according to step 4 (for An → An α: immediate left recursion, as in step 2)
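The steps above can be sketched as a short Python function. Several things here are assumptions of the sketch, not of the slides: the grammar has no empty righthand sides and no cycles, the names eliminate_left_recursion and remove_immediate are made up, and A_head is inlined (A → β | β A_tails) so that later substitution steps still see the first symbol of each β directly.

```python
def remove_immediate(a, rules):
    """Immediate case, with A_head inlined: A -> beta | beta A_tails."""
    alphas = [rhs[1:] for rhs in rules if rhs[0] == a]
    betas = [rhs for rhs in rules if rhs[0] != a]
    if not alphas:
        return {a: rules}
    tail, tails = f"{a}_tail", f"{a}_tails"
    return {
        a: betas + [beta + [tails] for beta in betas],
        tails: [[tail], [tail, tails]],
        tail: alphas,
    }


def eliminate_left_recursion(grammar, order):
    """order is the enumeration A1, ..., An of the non-terminals (step 1)."""
    g = {a: [list(rhs) for rhs in rules] for a, rules in grammar.items()}
    for i, ai in enumerate(order):
        for aj in order[:i]:
            # Steps 3-6: replace Ai -> Aj alpha by Ai -> beta alpha for each Aj -> beta.
            new_rules = []
            for rhs in g[ai]:
                if rhs[0] == aj:
                    new_rules.extend(beta + rhs[1:] for beta in g[aj])
                else:
                    new_rules.append(rhs)
            g[ai] = new_rules
        # Step 2 (and the Ai -> Ai alpha case): remove immediate left recursion.
        g.update(remove_immediate(ai, g[ai]))
    return g


GRAMMAR = {"S": [["A", "B"]], "A": [["C", "B"], ["b"]],
           "C": [["S", "a"]], "B": [["b"]]}
result = eliminate_left_recursion(GRAMMAR, ["S", "A", "C", "B"])
for lhs, rhss in result.items():
    print(lhs, "->", " | ".join(" ".join(rhs) for rhs in rhss))
```

With the enumeration S, A, C, B only the rules for C change: C → S a becomes C → A B a, then C → C B B a | b B a, and finally the immediate recursion is removed.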

SLIDE 39

Indirect Left-Recursion – Example

enumerated grammar:
S1 → A B
A2 → C B | b
C3 → S a
B4 → b

1: is S immediately left-recursive? no.
2 → 1: rule A → S α? no.
2 → 2: is A immediately left-recursive? no.
3 → 1: rule C → S α? yes: C3 → S1 a
replace by: C → A B a

SLIDE 43

Indirect Left-Recursion – Example

3 → 2: rule C → A α? yes (new rule!): C3 → A2 B a
replace by: C → C B B a | b B a

3 → 3 (immediate left recursion): rule C → C α? yes (new rule!): C3 → C3 B B a | b B a
replace by:
C → C_head | C_head C_tails
C_head → b B a
C_tails → C_tail | C_tail C_tails
C_tail → B B a
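To see that the transformation pays off, the resulting grammar can be fed to a simple backtracking top-down recognizer, which now terminates. The sketch below is illustrative only: the names FINAL, ends and accepts are made up, and C_tail is taken to be B B a, i.e. the α of the rule C → C B B a. It compares the recognizer with the regular expression bb(abb)* on all strings over {a, b} up to length 8.

```python
import itertools
import re

FINAL = {
    "S": [["A", "B"]],
    "A": [["C", "B"], ["b"]],
    "B": [["b"]],
    "C": [["C_head"], ["C_head", "C_tails"]],
    "C_head": [["b", "B", "a"]],
    "C_tails": [["C_tail"], ["C_tail", "C_tails"]],
    "C_tail": [["B", "B", "a"]],
}


def ends(symbol, text, pos):
    """Yield every position at which a parse of symbol starting at pos can end."""
    if symbol not in FINAL:                       # terminal symbol
        if text.startswith(symbol, pos):
            yield pos + len(symbol)
        return
    for rhs in FINAL[symbol]:
        positions = {pos}
        for sym in rhs:                           # thread positions through the rule
            positions = {e for p in positions for e in ends(sym, text, p)}
        yield from positions


def accepts(text):
    """True if the whole text can be derived from S."""
    return len(text) in set(ends("S", text, 0))


candidates = ["".join(p) for n in range(9) for p in itertools.product("ab", repeat=n)]
agree = all(accepts(s) == bool(re.fullmatch(r"bb(abb)*", s)) for s in candidates)
print(accepts("bb"), accepts("bbabb"), accepts("abb"))   # True True False
print(agree)                                             # True
```

The recursion is bounded because every chain of first-position symbols in this grammar reaches the terminal b without returning to the non-terminal it started from.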