Parsing: Episode I Matthew Might University of Utah matt.might.net ucombinator.org

Administrivia • Project 1: Use the source!

Agenda • What is parsing? • Context-free languages • Context-free grammars • Recursive descent parsing • Properties of grammars

What is parsing? A parser converts a token stream from the lexer into a parse tree.

Example
Source: f x = x
Tokens: ID(f) ID(x) EQUAL ID(x)
Parse tree:
Dec
  FunDef
    ID(f)
    ArgList
      Arg
        ID(x)
    EQUAL
    Expr
      Ref
        ID(x)

Parsing methods • LALR(k) • LR(k) • SLR(k) • LL(k) • Back-tracking search • Nondeterministic recursive descent • Predictive recursive descent • PEG/Packrat • Combinators • Earley

Context-free languages

Context-free languages • Natural choice for describing syntax • Like regular expressions plus recursion

Example • Language of balanced parentheses • The language is context-free • But it is not regular

As formal language • Context-free languages are formal languages • Two operations allowed: catenation, union • Recursive equations are allowed as well

Example $L_B = \{\epsilon\} \cup (\{\texttt{(}\} \cdot L_B \cdot \{\texttt{)}\} \cdot L_B)$.

Problem: Recursion! How do we assign meaning to recursive definitions?

Fixed points!

Fixed points If x = f ( x ), then the point x is a fixed point of the function f .

Fixed points Fix ( f ) = { L : L = f ( L ) } .

Algebra • $x = x^2 - 1$ is a recursive definition of $x$ • If $f(v) = v^2 - 1$, then $x = f(x)$ • Solutions are the fixed points of $f$

[Figure: plot of $f(x) = x^2 - 1$ on axes $x$ and $f(x)$, with the "fixed line" drawn through it; the fixed points of $f$ are where the curve crosses that line.]

Refactoring $L_B = f(L_B)$, where $f(L) = \{\epsilon\} \cup (\{\texttt{(}\} \cdot L \cdot \{\texttt{)}\} \cdot L)$.

Candidates L B ∈ Fix ( f ),

Sensible choices $\mathrm{lfp}(f) = \bigcap_{L \in \mathrm{Fix}(f)} L$ and $\mathrm{gfp}(f) = \bigcup_{L \in \mathrm{Fix}(f)} L$

Greatest fixed point • Includes infinitely long strings! • Example: ()()()()()()() ...

Kleene’s theorem (specialized) If a function $f$ is continuous, then: $\mathrm{lfp}(f) = \bigcup_{n \geq 1}^{\infty} f^n(\emptyset)$

Continuous The function $f$ is continuous if and only if: $f\left(\bigcup_i x_i\right) = \bigcup_i f(x_i)$

Constructive observation $\emptyset \subseteq f(\emptyset) \subseteq f^2(\emptyset) \subseteq f^3(\emptyset) \subseteq \cdots$
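The ascending chain can be computed directly. The following is a minimal sketch, not part of the original slides: it iterates the balanced-parentheses function f from the empty language until nothing new appears. Because the true least fixed point is infinite, the sketch assumes a length bound (the hypothetical parameter `max_len`) and discards longer strings. The names `f` and `lfp_bounded` are illustrative only.

```python
def f(lang, max_len):
    """One application of f(L) = {ε} ∪ ( {(} · L · {)} · L ),
    keeping only strings of length <= max_len (an assumption of this sketch)."""
    result = {""}                       # the {ε} part
    for u in lang:
        for v in lang:
            s = "(" + u + ")" + v       # the {(} · L · {)} · L part
            if len(s) <= max_len:
                result.add(s)
    return result


def lfp_bounded(max_len):
    """Kleene iteration: ∅, f(∅), f(f(∅)), ... until the set stops growing."""
    current = set()                     # start from the empty language ∅
    while True:
        nxt = f(current, max_len)
        if nxt == current:              # fixed point reached (up to the bound)
            return current
        current = nxt


balanced = lfp_bounded(4)
print(sorted(balanced, key=lambda s: (len(s), s)))
# prints ['', '()', '(())', '()()']
```

The bound is safe here because every balanced string is built only from strictly shorter balanced strings, so truncation never removes a string needed to construct one within the bound.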

Excursion

In general, for a set of recursive equations over the languages $L_1, \ldots, L_n$, if
$L_1 = f_1(L_1, \ldots, L_n)$
$L_2 = f_2(L_1, \ldots, L_n)$
$\vdots$
$L_n = f_n(L_1, \ldots, L_n)$,
then these languages are a fixed point of the function $F : \mathcal{P}(A^*)^n \to \mathcal{P}(A^*)^n$:
$F(L_1, \ldots, L_n) = (f_1(L_1, \ldots, L_n), f_2(L_1, \ldots, L_n), \ldots, f_n(L_1, \ldots, L_n))$,
and by default, the least fixed point of this function: $(L_1, \ldots, L_n) = \mathrm{lfp}(F)$.

Context-free grammars

Context-free grammars A context-free grammar is a quadruple $(A, N, R, n_0)$, where: • the set $A$ contains the terminal symbols of the language (its alphabet); and • the set $N$ contains the non-terminal symbols of the language; and • the set $R \subseteq N \times (A \cup N)^*$ contains the substitution rules, each rewriting a non-terminal to a string of terminals and non-terminals; and • the symbol $n_0 \in N$ is the top-level “start” symbol.

Example
$A = \{\texttt{(}, \texttt{)}\}$
$N = \{B\}$
$R \ni B \to \texttt{(}\ B\ \texttt{)}\ B$
$R \ni B \to \epsilon$
$n_0 = B$.

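Recognizing strings If $w\,n\,w' \in L(A, N, R, n_0)$ and $(n \to s_1 \ldots s_m) \in R$, then $w\,s_1 \ldots s_m\,w' \in L(A, N, R, n_0)$:
$\dfrac{w\,n\,w' \in L(A, N, R, n_0) \qquad (n \to s_1 \ldots s_m) \in R}{w\,s_1 \ldots s_m\,w' \in L(A, N, R, n_0)}$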

Example Given $B = n_0$, $(B \to \texttt{(}\ B\ \texttt{)}\ B) \in R$ and $(B \to \epsilon) \in R$:
$B \in L(G_B)$, hence $\texttt{(}\ B\ \texttt{)}\ B \in L(G_B)$, hence $\texttt{()} \in L(G_B)$.
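The derivation above can be mechanized as a search over sentential forms. This is a rough sketch under stated assumptions rather than a practical parser: it always rewrites the leftmost non-terminal, explores forms breadth-first, and prunes any form with more terminals than the target string. The names `RULES` and `derives` are hypothetical, and the pruning argument relies on this grammar's only non-empty rule adding terminals.

```python
from collections import deque

# The balanced-parentheses grammar: B → ( B ) B  and  B → ε.
RULES = {"B": ["(B)B", ""]}


def derives(start, target):
    """Return True if `target` can be derived from the symbol `start`."""
    seen = {start}
    queue = deque([start])
    while queue:
        form = queue.popleft()
        if form == target:
            return True
        # Position of the leftmost non-terminal, if any.
        idx = next((i for i, c in enumerate(form) if c in RULES), None)
        if idx is None:
            continue                     # all terminals, but not the target
        if not target.startswith(form[:idx]):
            continue                     # terminal prefix already disagrees
        for rhs in RULES[form[idx]]:
            new = form[:idx] + rhs + form[idx + 1:]
            terminals = sum(1 for c in new if c not in RULES)
            if terminals <= len(target) and new not in seen:
                seen.add(new)
                queue.append(new)
    return False


print(derives("B", "()"))      # prints True
print(derives("B", ")("))      # prints False
```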

Parse trees • Convenient diagrammatic notation • Demonstrates membership in language • Simultaneously shows structure of string

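Example Parse tree for (): the root $B$ has children $\texttt{(}$, $B$, $\texttt{)}$, $B$, and each child $B$ derives $\epsilon$.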

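Example: Regexes
$A = \{\texttt{(}, \texttt{)}, \texttt{a}, \ldots, \texttt{z}, \texttt{|}, \texttt{*}\}$
$N = \{E, T, F, K\}$
$R \ni E \to T\ \texttt{|}\ E$
$R \ni E \to T$
$R \ni T \to F\ T$
$R \ni T \to F$
$R \ni F \to K\ \texttt{*}$
$R \ni F \to K$
$R \ni K \to \texttt{(}\ E\ \texttt{)}$
$R \ni K \to a$, for every $a \in \{\texttt{a}, \ldots, \texttt{z}\}$
$n_0 = E$.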

Parse tree: (a|b)* (using the regex grammar from the previous example)
E
  T
    F
      K
        (
        E
          T
            F
              K
                a
          |
          E
            T
              F
                K
                  b
        )
      *

Ambiguous grammars A grammar is ambiguous if there is at least one string that has more than one parse tree.

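Example: Ambiguity
$A = \{\texttt{(}, \texttt{)}, \texttt{+}, \texttt{*}\} \cup \mathbb{Z}$
$N = \{E\}$
$R \ni E \to E\ \texttt{+}\ E$
$R \ni E \to E\ \texttt{*}\ E$
$R \ni E \to z$, for every $z \in \mathbb{Z}$
$n_0 = E$.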

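Example: 3 + 4 * 9 has two parse trees. In one, the root is E + E with 3 on the left and the subtree for 4 * 9 on the right; in the other, the root is E * E with the subtree for 3 + 4 on the left and 9 on the right.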
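One way to see the ambiguity concretely is to count parse trees. The sketch below is an illustration, not something from the slides: it counts the trees that the grammar E → E + E, E → E * E, E → z assigns to a token list by trying every operator as the root of the tree. The function name `count_parses` and the token representation (Python ints for integers, strings for operators) are assumptions.

```python
from functools import lru_cache


def count_parses(tokens):
    """Number of distinct parse trees for `tokens` under
    E → E + E | E * E | z (z an integer)."""
    tokens = tuple(tokens)

    @lru_cache(maxsize=None)
    def count(i, j):
        # Number of ways E can derive tokens[i:j].
        if j - i == 1:
            return 1 if isinstance(tokens[i], int) else 0   # E → z
        total = 0
        for k in range(i + 1, j - 1):                       # root operator at k
            if tokens[k] in ("+", "*"):
                total += count(i, k) * count(k + 1, j)
        return total

    return count(0, len(tokens))


print(count_parses([3, "+", 4, "*", 9]))   # prints 2
```

Any count above one for some input demonstrates that the grammar is ambiguous.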

Left-recursion A grammar is left-recursive if a non-terminal symbol can derive a new string with itself in leftmost position.

Example: Left-recursion
S → S , x
S → x

Example: Factoring
S → x , S
S → x

Exercise: Nondeterministic recursive descent

Grammar
X → ( X∗ )
X → num
X → sym
X∗ → X X∗
X∗ → ε.
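One possible approach to the exercise, offered as a sketch rather than the intended solution: each non-terminal becomes a generator that tries every rule and yields each input position where a match could end, so the caller explores all alternatives. The token representation (a list of strings, with digit strings as num and any other non-parenthesis string as sym) and the names `parse_X`, `parse_Xs`, and `recognizes` are assumptions.

```python
def parse_X(tokens, i):
    """Yield every j such that tokens[i:j] can be derived from X."""
    if i >= len(tokens):
        return
    tok = tokens[i]
    # Rule X → ( X* )
    if tok == "(":
        for j in parse_Xs(tokens, i + 1):
            if j < len(tokens) and tokens[j] == ")":
                yield j + 1
    # Rule X → num
    if tok.isdigit():
        yield i + 1
    # Rule X → sym
    if tok not in ("(", ")") and not tok.isdigit():
        yield i + 1


def parse_Xs(tokens, i):
    """Yield every j such that tokens[i:j] can be derived from X*."""
    yield i                                  # Rule X* → ε
    for j in parse_X(tokens, i):             # Rule X* → X X*
        yield from parse_Xs(tokens, j)


def recognizes(tokens):
    """True if the whole token list can be derived from X."""
    return any(j == len(tokens) for j in parse_X(tokens, 0))


print(recognizes(["(", "f", "(", "g", "1", ")", "2", ")"]))   # prints True
print(recognizes(["(", "f"]))                                  # prints False
```

The recursion terminates because every successful match of X consumes at least one token, so X∗ cannot loop without making progress.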

Exercise: Predictive recursive descent

Lexer API • next() : Token • eat(t : TokenType) • peek(k : Int) : TokenType
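A predictive parser for the grammar X → ( X∗ ) | num | sym can be driven by these three operations. The sketch below is only one possible shape: the `Lexer` class is a hypothetical stand-in that mimics the listed API, the token type names (`LPAREN`, `RPAREN`, `NUM`, `SYM`, `EOF`) are invented for illustration, and `peek(k)` is assumed to be 1-based with a default of 1.

```python
import re


class Lexer:
    """Hypothetical lexer exposing next(), eat(t) and peek(k)."""

    def __init__(self, text):
        self.tokens = []
        for lexeme in re.findall(r"[()]|[^\s()]+", text):
            if lexeme == "(":
                kind = "LPAREN"
            elif lexeme == ")":
                kind = "RPAREN"
            elif lexeme.isdigit():
                kind = "NUM"
            else:
                kind = "SYM"
            self.tokens.append((kind, lexeme))
        self.pos = 0

    def peek(self, k=1):
        """Type of the k-th upcoming token (1-based), or "EOF"."""
        idx = self.pos + k - 1
        return self.tokens[idx][0] if idx < len(self.tokens) else "EOF"

    def next(self):
        """Consume and return the next (type, lexeme) pair."""
        if self.pos >= len(self.tokens):
            raise SyntaxError("unexpected end of input")
        tok = self.tokens[self.pos]
        self.pos += 1
        return tok

    def eat(self, kind):
        """Consume the next token, insisting that it has type `kind`."""
        if self.peek() != kind:
            raise SyntaxError(f"expected {kind}, found {self.peek()}")
        return self.next()


def parse_X(lx):
    kind = lx.peek()                  # one token of lookahead picks the rule
    if kind == "LPAREN":              # X → ( X* )
        lx.eat("LPAREN")
        items = parse_Xs(lx)
        lx.eat("RPAREN")
        return items
    if kind == "NUM":                 # X → num
        return int(lx.eat("NUM")[1])
    if kind == "SYM":                 # X → sym
        return lx.eat("SYM")[1]
    raise SyntaxError(f"unexpected {kind}")


def parse_Xs(lx):
    items = []
    while lx.peek() in ("LPAREN", "NUM", "SYM"):   # X* → X X*
        items.append(parse_X(lx))
    return items                                    # X* → ε


def parse(text):
    lx = Lexer(text)
    tree = parse_X(lx)
    if lx.peek() != "EOF":
        raise SyntaxError("trailing input")
    return tree


print(parse("(f (g 1) 2)"))   # prints ['f', ['g', 1], 2]
```

Unlike a nondeterministic parser, this one never backtracks: the type of the next token alone determines which rule applies.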

CFG properties

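Nullability The nullability function, $\delta : (A \cup N) \to \{\{\epsilon\}, \emptyset\}$, returns the set $\{\epsilon\}$ if the provided symbol can derive the empty string, and $\emptyset$ otherwise:
$\delta(a) = \emptyset$ for every terminal $a \in A$
$\delta(n) \supseteq \delta(s_1) \cdot \ldots \cdot \delta(s_m)$ if $(n \to s_1 \ldots s_m) \in R$
$\delta(n) \supseteq \{\epsilon\}$ if $(n \to \epsilon) \in R$.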

Inclusion constraints
$X_1 \supseteq f_1(X_1, \ldots, X_n)$
$\vdots$
$X_n \supseteq f_n(X_1, \ldots, X_n)$,

Solving inclusions
X_i ← ∅ for all i
changed ← true
while (changed)
  changed ← false
  for each i
    X′_i ← f_i(X_1, ..., X_n)
    if (X_i ≠ X′_i)
      X_i ← X′_i
      changed ← true.
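The iteration can be sketched in Python and applied to the nullability constraints. Everything named here is an assumption for illustration: `solve` takes a dictionary of constraint functions, `RULES` encodes the grammar X → ( X∗ ) | num | sym with `Xs` standing for X∗, and languages are represented as frozensets of strings. The sketch also unions each new value into the old one, which keeps the sets growing even if a constraint function is evaluated out of order.

```python
EPS = frozenset({""})      # the language {ε}
EMPTY = frozenset()        # the language ∅


def solve(constraints):
    """Least solution of X_i ⊇ f_i(X_1, ..., X_n), found by iterating from ∅."""
    env = {name: EMPTY for name in constraints}
    changed = True
    while changed:
        changed = False
        for name, fn in constraints.items():
            new = env[name] | fn(env)
            if new != env[name]:
                env[name] = new
                changed = True
    return env


# Grammar from the recursive-descent exercise; "Xs" stands for X*.
RULES = {
    "X": [["(", "Xs", ")"], ["num"], ["sym"]],
    "Xs": [["X", "Xs"], []],
}


def delta_seq(symbols, env):
    """δ(s1) · ... · δ(sm): {ε} iff every symbol is nullable.
    Terminals are absent from env, so they count as ∅."""
    return EPS if all(env.get(s, EMPTY) == EPS for s in symbols) else EMPTY


def constraint_for(nonterminal):
    def fn(env):
        result = EMPTY
        for rhs in RULES[nonterminal]:
            result = result | delta_seq(rhs, env)
        return result
    return fn


delta = solve({n: constraint_for(n) for n in RULES})
print(delta["X"] == EMPTY, delta["Xs"] == EPS)   # prints True True
```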

First sets In context-free grammars, first sets are easily computed with subset-inclusion constraints; for every rule $(n \to s_1 \ldots s_m) \in R$:
$\mathrm{first}(n) \supseteq \bigcup_{i \geq 1}^{m} \delta(s_1 \ldots s_{i-1}) \cdot \mathrm{first}(s_i)$.
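As a sketch of how this constraint might be solved, the code below iterates to a fixed point over the grammar X → ( X∗ ) | num | sym, with `Xs` standing for X∗. The rule encoding and the names `nullable_set` and `first_sets` are assumptions; the first set of a terminal is taken to be the terminal itself, and nullability is stored as a set of nullable non-terminals rather than as the function δ.

```python
RULES = [
    ("X", ["(", "Xs", ")"]),
    ("X", ["num"]),
    ("X", ["sym"]),
    ("Xs", ["X", "Xs"]),
    ("Xs", []),
]
NONTERMINALS = {"X", "Xs"}


def nullable_set(rules):
    """Non-terminals that can derive the empty string."""
    nullable = set()
    changed = True
    while changed:
        changed = False
        for lhs, rhs in rules:
            if lhs not in nullable and all(s in nullable for s in rhs):
                nullable.add(lhs)
                changed = True
    return nullable


def first_sets(rules, nonterminals):
    """first(n) for every non-terminal n, by iterating to a fixed point."""
    nullable = nullable_set(rules)
    first = {n: set() for n in nonterminals}
    changed = True
    while changed:
        changed = False
        for lhs, rhs in rules:
            for s in rhs:
                # first(s_i) contributes only while s_1 ... s_{i-1} are nullable.
                add = first[s] if s in nonterminals else {s}
                if not add <= first[lhs]:
                    first[lhs] |= add
                    changed = True
                if s not in nullable:
                    break
    return first


first = first_sets(RULES, NONTERMINALS)
print(sorted(first["X"]))    # prints ['(', 'num', 'sym']
print(sorted(first["Xs"]))   # prints ['(', 'num', 'sym']
```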

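Follow sets The function $\mathrm{follow} : (A \cup N) \to \mathcal{P}(A)$ maps each symbol to the set of terminals that can appear immediately after it; for every rule $(n \to s_1 \ldots s_m) \in R$:
$\mathrm{follow}(s_i) \supseteq \bigcup_{j \geq i}^{m-1} \delta(s_{i+1} \ldots s_j) \cdot \mathrm{first}(s_{j+1}) \;\cup\; \delta(s_{i+1} \ldots s_m) \cdot \mathrm{follow}(n)$.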
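The follow constraint can be solved with the same fixed-point loop. This sketch makes several assumptions: it uses the grammar X → ( X∗ ) | num | sym with `Xs` standing for X∗, computes follow sets only for non-terminals, adds no end-of-input marker, and solves nullability, first, and follow together in one loop. The function name `analyse` is hypothetical.

```python
RULES = [
    ("X", ["(", "Xs", ")"]),
    ("X", ["num"]),
    ("X", ["sym"]),
    ("Xs", ["X", "Xs"]),
    ("Xs", []),
]
NONTERMINALS = {"X", "Xs"}


def analyse(rules, nonterminals):
    """Return (nullable, first, follow), iterating all three to a fixed point."""
    nullable = set()
    first = {n: set() for n in nonterminals}
    follow = {n: set() for n in nonterminals}

    def first_of(symbol):
        return first[symbol] if symbol in nonterminals else {symbol}

    changed = True
    while changed:
        changed = False
        for lhs, rhs in rules:
            # Nullability: lhs is nullable if every symbol on the right is.
            if lhs not in nullable and all(s in nullable for s in rhs):
                nullable.add(lhs)
                changed = True
            # First: scan the right-hand side while the prefix stays nullable.
            for s in rhs:
                if not first_of(s) <= first[lhs]:
                    first[lhs] |= first_of(s)
                    changed = True
                if s not in nullable:
                    break
            # Follow: for each non-terminal s_i, look at what comes after it.
            for i, s in enumerate(rhs):
                if s not in nonterminals:
                    continue
                rest_nullable = True
                for t in rhs[i + 1:]:
                    if not first_of(t) <= follow[s]:
                        follow[s] |= first_of(t)
                        changed = True
                    if t not in nullable:
                        rest_nullable = False
                        break
                # If everything after s_i is nullable, follow(n) flows in.
                if rest_nullable and not follow[lhs] <= follow[s]:
                    follow[s] |= follow[lhs]
                    changed = True
    return nullable, first, follow


nullable, first, follow = analyse(RULES, NONTERMINALS)
print(sorted(follow["X"]))    # prints ['(', ')', 'num', 'sym']
print(sorted(follow["Xs"]))   # prints [')']
```

Solving the three analyses in one loop is a convenience of this sketch; it works because each set only ever grows and every addition is justified by a constraint.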

CFL trivia • Are regular languages context-free? • Are CFLs closed under complement? • Is the intersection of CFLs context-free? • Does a CFG accept no strings? • Does a CFG accept a finite set? • Does a CFG accept every string? • Is one CFL a subset of another CFL?
