Basic Parsing Algorithms (Grundlegende Parsingalgorithmen): Top-Down & Bottom-Up Parsing



SLIDE 1

Top-Down Parser Bottom-Up Parser

Basic Parsing Algorithms (Grundlegende Parsingalgorithmen)

Top-Down & Bottom-Up Parsing

Kurt Eberle
k.eberle@lingenio.de

(Many slides, parts of slides, and materials are taken from Helmut Schmid's parsing course, WS14, Tübingen, among other sources.)

February 24, 2020

SLIDE 2

Overview

◮ Top-Down Parser
◮ Bottom-Up Parser

SLIDE 4

Classification of Parsing Methods

◮ top-down vs. bottom-up
◮ derivation-oriented vs. table-driven vs. chart-based

SLIDE 5

Top-Down Parser

Idea: Systematically enumerate all left-most derivations until the input string has been derived.

Left-most derivation: the left-most non-terminal is expanded in each step.

Input: a n v a n

SLIDE 6

Top-Down Parser: Example

Input: a n v a n

Grammar:
S → NP VP
NP → a n
VP → v NP NP
VP → v NP
VP → v

Left-most derivation:
S ⇒ NP VP
⇒ a n VP
⇒ a n v NP NP
⇒ a n v a n NP
⇒ a n v a n a n (does not match the input: go 3 steps back to a n VP)
⇒ a n v NP
⇒ a n v a n

Other name: recursive descent parser
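The enumeration of left-most derivations with backtracking can be sketched in a few lines of Python. This is only an illustration under assumptions: the dictionary layout of the grammar and the function name `leftmost_derivations` are not from the slides, and the only pruning applied is a check that the terminal prefix of the sentential form still matches the input.

```python
# Example grammar from the slide; the dict layout is an assumption of this sketch.
GRAMMAR = {
    "S": [("NP", "VP")],
    "NP": [("a", "n")],
    "VP": [("v", "NP", "NP"), ("v", "NP"), ("v",)],
}


def leftmost_derivations(form, target, history=()):
    """Yield each left-most derivation of target as a tuple of sentential forms."""
    history = history + (form,)
    # Position of the left-most non-terminal, or None if the form is all terminals.
    idx = next((i for i, sym in enumerate(form) if sym in GRAMMAR), None)
    if idx is None:
        if form == target:
            yield history
        return  # otherwise a dead end: the caller tries its next rule (backtracking)
    # Give up early if the terminals derived so far disagree with the input.
    if form[:idx] != target[:idx]:
        return
    for rhs in GRAMMAR[form[idx]]:
        expanded = form[:idx] + rhs + form[idx + 1:]
        yield from leftmost_derivations(expanded, target, history)


if __name__ == "__main__":
    sentence = tuple("a n v a n".split())
    for derivation in leftmost_derivations(("S",), sentence):
        print(" => ".join(" ".join(form) for form in derivation))
```

On the input a n v a n the search first tries VP → v NP NP, runs into a n v a n a n, and only then finds the derivation via VP → v NP, which mirrors the backtracking step in the example. The sketch terminates only because the example grammar is not left-recursive.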

SLIDE 7

Top-Down Parser

◮ Non-deterministic regarding
  ◮ the choice of the non-terminal
  ◮ the choice of the rule
◮ Convention for NT selection: left-most derivation
◮ Backtracking in order to try different grammar rules

SLIDE 8

Formal Characterization TD

◮ Configuration:
  Pair (α, r), where α ∈ (V ∪ Σ)∗, r ∈ Σ∗
  pair = < sentential form, remaining string to be recognized >

◮ Start configuration: (S, w)

◮ Configuration transitions:
  ◮ (aα, ar) → (α, r)
    (“consumption” of an expected terminal symbol)
  ◮ (Aβ, r) → (αβ, r) with A → α ∈ P
    (expansion, left-most derivation step)

◮ End configuration: (ε, ε)
  (complete parse found)
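The two transitions (consumption and expansion) translate almost directly into a backtracking recognizer. The following is a minimal sketch, not a reference implementation; the tuple representation of configurations and the name `td_parse` are assumptions made for illustration.

```python
# Example grammar used throughout the slides; the dict layout is an assumption.
GRAMMAR = {
    "S": [("NP", "VP")],
    "NP": [("a", "n")],
    "VP": [("v", "NP", "NP"), ("v", "NP"), ("v",)],
}


def td_parse(alpha, rest, trace=()):
    """Search from configuration (alpha, rest) for the end configuration (ε, ε).

    Returns the list of configurations on the successful path, or None.
    """
    trace = trace + ((alpha, rest),)
    if not alpha and not rest:
        return list(trace)  # end configuration (ε, ε)
    if not alpha:
        return None  # sentential form used up, but input remains
    head, tail = alpha[0], alpha[1:]
    if head in GRAMMAR:
        # Expansion: (Aβ, r) → (αβ, r) for each rule A → α, tried in order.
        for rhs in GRAMMAR[head]:
            result = td_parse(rhs + tail, rest, trace)
            if result is not None:
                return result
        return None  # every rule failed: backtrack
    if rest and rest[0] == head:
        # Consumption: (aα, ar) → (α, r)
        return td_parse(tail, rest[1:], trace)
    return None  # expected terminal does not match the input


if __name__ == "__main__":
    for alpha, rest in td_parse(("S",), tuple("a n v a n a n".split())):
        print(f"({' '.join(alpha) or 'ε'}, {' '.join(rest) or 'ε'})")
```

Only the configurations on the successful path are returned; dead ends explored during backtracking are discarded. As with any top-down search of this kind, a left-recursive grammar would make the recursion run forever.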

SLIDE 9

Configuration Transitions TD

(S, a n v a n a n)
(NP VP, a n v a n a n)
(a n VP, a n v a n a n)
(n VP, n v a n a n)
(VP, v a n a n)
(v NP NP, v a n a n)
(NP NP, a n a n)
(a n NP, a n a n)
(n NP, n a n)
(NP, a n)
(a n, a n)
(n, n)
(ε, ε)

SLIDE 10

Problems of the TD Parser

◮ Left-recursive non-terminals: A ⇒⁺ Aβ
  Danger of an infinite loop

◮ Rule selection:
  “blind” expansion

◮ Inefficiency of backtracking:
  Partial analyses are repeated, which can lead to exponential runtime

◮ Advantage:
  easy to implement
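Whether a grammar contains left-recursive non-terminals can be checked before parsing. The sketch below assumes a grammar without ε-productions, so that only the first symbol of each right-hand side matters; the function name `left_recursive` and the rule NP → NP PP in the second grammar are hypothetical and serve only as illustration.

```python
def left_recursive(grammar):
    """Return the non-terminals A with A =>+ A beta.

    Assumes there are no ε-productions, so only the first symbol of each
    right-hand side can start a derivation.
    """
    # reach[A] = non-terminals that can appear left-most after one or more steps.
    reach = {
        lhs: {rhs[0] for rhs in rules if rhs and rhs[0] in grammar}
        for lhs, rules in grammar.items()
    }
    changed = True
    while changed:  # naive transitive closure
        changed = False
        for lhs in reach:
            new = set().union(*(reach[b] for b in reach[lhs])) - reach[lhs]
            if new:
                reach[lhs] |= new
                changed = True
    return {lhs for lhs in reach if lhs in reach[lhs]}


# Grammar from the slides: no left recursion.
SLIDE_GRAMMAR = {
    "S": [("NP", "VP")],
    "NP": [("a", "n")],
    "VP": [("v", "NP", "NP"), ("v", "NP"), ("v",)],
}

# Hypothetical variant with a left-recursive NP rule.
LEFT_REC_GRAMMAR = {
    "S": [("NP", "VP")],
    "NP": [("NP", "PP"), ("a", "n")],
    "PP": [("p", "NP")],
    "VP": [("v",)],
}

if __name__ == "__main__":
    print(left_recursive(SLIDE_GRAMMAR))
    print(left_recursive(LEFT_REC_GRAMMAR))
```

A top-down parser that always expands the left-most non-terminal would expand NP → NP PP again and again on the second grammar without ever consuming input.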

SLIDE 11

Growth Curves

[Figure: growth curves of x/10, x**2/100, x**3/1000, x**6/1000000 and 2**x/1024 for x from 5 to 50, shown in two panels, one with the y-axis up to 50 and one with the y-axis up to 500000]

2**50/1024 seconds ≈ 35000 years
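The figure of roughly 35000 years is the exponential curve 2**x/1024 evaluated at x = 50 and read as seconds. A quick check, assuming a year of 365.25 days (the helper name `years_for` is made up for this sketch):

```python
SECONDS_PER_YEAR = 365.25 * 24 * 60 * 60  # assumption: Julian year


def years_for(x):
    """Years corresponding to 2**x / 1024 seconds (the exponential curve)."""
    return 2**x / 1024 / SECONDS_PER_YEAR


if __name__ == "__main__":
    print(f"2**50/1024 seconds is about {years_for(50):.0f} years")
    print(f"50**3/1000 seconds is {50**3 / 1000:.0f} seconds")
```

For comparison, the cubic curve x**3/1000 gives 125 seconds at the same input size.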

SLIDE 12

Overview

◮ Top-Down Parser
◮ Bottom-Up Parser

SLIDE 13

Bottom-Up Parser

Idea: Backward application of grammar rules (reductions) produces an inverted right-most derivation.

Input: a n v a n a n

Grammar:
S → NP VP
NP → a n
VP → v NP NP
VP → v NP
VP → v

Left-most reduction:
a n v a n a n
⇐ NP v a n a n
⇐ NP v NP a n
⇐ NP v NP NP
⇐ NP VP
⇐ S

SLIDE 14

Tree

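[Figure: parse tree for a n v a n a n; the numbers 1 to 5 presumably mark the order in which the nodes are built by the reductions]

S (5)
  NP (1)
    a n
  VP (4)
    v
    NP (2)
      a n
    NP (3)
      a n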

SLIDE 15

Formal Characterization BU

◮ Configuration:
  Pair (α, r), with α ∈ (V ∪ Σ)∗, r ∈ Σ∗
  pair = < sentential form, remaining string to be recognized >

◮ Start configuration: (ε, w)

◮ Configuration transitions:
  ◮ (α, ar) → (αa, r)
    (shift action)
  ◮ (βα, r) → (βA, r) with A → α ∈ P
    (reduce action)

◮ End configuration: (S, ε)

⇒ “Shift-Reduce” parser
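The shift and reduce actions can likewise be written as a small backtracking recognizer. This is a sketch under assumptions: reductions are tried before shifts and in the listed rule order (a choice made here, not prescribed by the slides), and the name `bu_parse` and the tuple representation are illustrative.

```python
# Grammar from the slides as (left-hand side, right-hand side) pairs.
RULES = [
    ("S", ("NP", "VP")),
    ("NP", ("a", "n")),
    ("VP", ("v", "NP", "NP")),
    ("VP", ("v", "NP")),
    ("VP", ("v",)),
]


def bu_parse(stack, rest, trace=()):
    """Search from configuration (stack, rest) for the end configuration (S, ε).

    Returns the list of configurations on the successful path, or None.
    """
    trace = trace + ((stack, rest),)
    if stack == ("S",) and not rest:
        return list(trace)  # end configuration (S, ε)
    # Reduce: (βα, r) → (βA, r) for every rule A → α whose right-hand side
    # is a suffix of the stack (no ε rules, so every right-hand side is non-empty).
    for lhs, rhs in RULES:
        if stack[-len(rhs):] == rhs:
            result = bu_parse(stack[:-len(rhs)] + (lhs,), rest, trace)
            if result is not None:
                return result
    # Shift: (α, ar) → (αa, r)
    if rest:
        return bu_parse(stack + (rest[0],), rest[1:], trace)
    return None  # neither action leads to success: backtrack


if __name__ == "__main__":
    for stack, rest in bu_parse((), tuple("a n v a n a n".split())):
        print(f"({' '.join(stack) or 'ε'}, {' '.join(rest) or 'ε'})")
```

With this strategy the parser first reduces v to VP too early, fails, and backtracks several times before it reaches (S, ε). Only the successful path is returned, so the repeated work that makes backtracking inefficient is not visible in the output.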

SLIDE 16

Configuration Transitions BU

(ε, a n v a n a n)
(a, n v a n a n)
(a n, v a n a n)
(NP, v a n a n)
(NP v, a n a n)
(NP v a, n a n)
(NP v a n, a n)
(NP v NP, a n)
(NP v NP a, n)
(NP v NP a n, ε)
(NP v NP NP, ε)
(NP VP, ε)
(S, ε)

SLIDE 17

Problems BU

◮ Rule cycles and ε-productions may result in infinite loops.

◮ Rule selection:
  “blind” shift

◮ Inefficiency of backtracking
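Both sources of infinite loops can be detected by inspecting the grammar. The following sketch lists ε-productions and non-terminals that lie on a cycle of unit rules (A ⇒⁺ A using only rules of the form A → B). It is a simplification: cycles that arise only through ε-productions inside longer rules are not detected. The function name `bu_hazards` and the second grammar are hypothetical.

```python
def bu_hazards(rules):
    """Return (non-terminals with an ε rule, non-terminals on a unit-rule cycle).

    rules is a list of (lhs, rhs) pairs, where rhs is a tuple of symbols.
    """
    nonterminals = {lhs for lhs, _ in rules}
    epsilon = sorted({lhs for lhs, rhs in rules if len(rhs) == 0})
    # unit[A] = non-terminals reachable from A via unit rules only.
    unit = {a: set() for a in nonterminals}
    for lhs, rhs in rules:
        if len(rhs) == 1 and rhs[0] in nonterminals:
            unit[lhs].add(rhs[0])
    changed = True
    while changed:  # naive transitive closure
        changed = False
        for a in unit:
            new = set().union(*(unit[b] for b in unit[a])) - unit[a]
            if new:
                unit[a] |= new
                changed = True
    cyclic = {a for a in unit if a in unit[a]}
    return epsilon, cyclic


# Grammar from the slides: neither problem occurs.
SLIDE_RULES = [
    ("S", ("NP", "VP")),
    ("NP", ("a", "n")),
    ("VP", ("v", "NP", "NP")),
    ("VP", ("v", "NP")),
    ("VP", ("v",)),
]

# Hypothetical grammar with a unit-rule cycle A → B, B → A and an ε rule for C.
PROBLEM_RULES = [("A", ("B",)), ("B", ("A",)), ("B", ("b",)), ("C", ())]

if __name__ == "__main__":
    print(bu_hazards(SLIDE_RULES))
    print(bu_hazards(PROBLEM_RULES))
```

In a naive shift-reduce parser, a unit-rule cycle allows the same stack top to be reduced back and forth indefinitely, and an ε-production can be reduced at any position any number of times.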
