Abstract Syntax Trees & Top-Down Parsing

Review of Parsing
• Given a language L(G), a parser consumes a sequence of tokens s and produces a parse tree
• Issues:
  – How do we recognize that s ∈ L(G)?
  – A parse tree of s describes how s ∈ L(G)
  – Ambiguity: more than one parse tree (possible interpretation) for some string s
  – Error: no parse tree for some string s
  – How do we construct the parse tree?
Compiler Design 1 (2011)

Abstract Syntax Trees
• So far, a parser traces the derivation of a sequence of tokens
• The rest of the compiler needs a structural representation of the program
• Abstract syntax trees
  – Like parse trees but ignore some details
  – Abbreviated as AST

Abstract Syntax Trees (Cont.)
• Consider the grammar
    E → int | ( E ) | E + E
• And the string
    5 + (2 + 3)
• After lexical analysis (a list of tokens)
    int₅ '+' '(' int₂ '+' int₃ ')'
• During parsing we build a parse tree …

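Example of Parse Tree
• Traces the operation of the parser
• Captures the nesting structure
• But too much info
  – Parentheses
  – Single-successor nodes
[Figure: parse tree for 5 + (2 + 3). The root E expands to E + E; the left E derives int₅; the right E derives ( E ), whose inner E expands to E + E deriving int₂ and int₃]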

Example of Abstract Syntax Tree
[Figure: AST for 5 + (2 + 3). A PLUS node with children 5 and a second PLUS node, whose children are 2 and 3]
• Also captures the nesting structure
• But abstracts from the concrete syntax, so it is more compact and easier to use
• An important data structure in a compiler

Semantic Actions
• This is what we'll use to construct ASTs
• Each grammar symbol may have attributes
  – An attribute is a property of a programming language construct
  – For terminal symbols (lexical tokens) attributes can be calculated by the lexer
• Each production may have an action
  – Written as: X → Y₁ … Yₙ { action }
  – That can refer to or compute symbol attributes

Semantic Actions: An Example
• Consider the grammar
    E → int | E + E | ( E )
• For each symbol X define an attribute X.val
  – For terminals, val is the associated lexeme
  – For non-terminals, val is the expression's value (which is computed from values of subexpressions)
• We annotate the grammar with actions:
    E → int        { E.val = int.val }
      | E₁ + E₂    { E.val = E₁.val + E₂.val }
      | ( E₁ )     { E.val = E₁.val }

Semantic Actions: An Example (Cont.)
• String: 5 + (2 + 3)
• Tokens: int₅ '+' '(' int₂ '+' int₃ ')'

    Productions        Equations
    E  → E₁ + E₂       E.val  = E₁.val + E₂.val
    E₁ → int₅          E₁.val = int₅.val = 5
    E₂ → ( E₃ )        E₂.val = E₃.val
    E₃ → E₄ + E₅       E₃.val = E₄.val + E₅.val
    E₄ → int₂          E₄.val = int₂.val = 2
    E₅ → int₃          E₅.val = int₃.val = 3
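
The equations for this example can be sketched as a recursive function over the parse tree. This is only an illustration: the tuple encoding of parse-tree nodes ("int", "plus", "paren") and the function name val are assumptions made here, not part of the slides.

```python
# Hypothetical parse-tree encoding, chosen for this sketch only:
#   ("int", n)         for E -> int
#   ("plus", e1, e2)   for E -> E1 + E2
#   ("paren", e1)      for E -> ( E1 )

def val(node):
    """Compute the synthesized attribute val of a parse-tree node."""
    kind = node[0]
    if kind == "int":      # E -> int      { E.val = int.val }
        return node[1]
    if kind == "plus":     # E -> E1 + E2  { E.val = E1.val + E2.val }
        return val(node[1]) + val(node[2])
    if kind == "paren":    # E -> ( E1 )   { E.val = E1.val }
        return val(node[1])
    raise ValueError(f"unknown production: {kind}")

# Parse tree for 5 + (2 + 3)
tree = ("plus", ("int", 5), ("paren", ("plus", ("int", 2), ("int", 3))))
print(val(tree))  # 10
```

Each branch corresponds to one production and its action, and the recursion evaluates the subexpressions before the expression that contains them.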

Semantic Actions: Dependencies
• Semantic actions specify a system of equations
  – Order of executing the actions is not specified
• Example: E₃.val = E₄.val + E₅.val
  – Must compute E₄.val and E₅.val before E₃.val
  – We say that E₃.val depends on E₄.val and E₅.val
• The parser must find the order of evaluation

Dependency Graph
• Each node labeled with a non-terminal E has one slot for its val attribute
• Note the dependencies
[Figure: parse tree for 5 + (2 + 3) with nodes E, E₁, E₂, E₃, E₄, E₅, each carrying a val slot; the leaves int₅, int₂, int₃ supply the values 5, 2, 3]

Evaluating Attributes
• An attribute must be computed after all its successors in the dependency graph (the attributes it depends on) have been computed
  – In the previous example attributes can be computed bottom-up
• Such an order exists when there are no cycles
  – Cyclically defined attributes are not legal
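
Finding an evaluation order amounts to a topological sort of the dependency graph. The sketch below uses Python's standard graphlib module (Python 3.9+) on the equations of the running example; the dictionary names deps and rules and the function evaluate are assumptions made for illustration.

```python
from graphlib import TopologicalSorter, CycleError

# Each attribute maps to the set of attributes it depends on.
deps = {
    "E.val":  {"E1.val", "E2.val"},
    "E1.val": set(),
    "E2.val": {"E3.val"},
    "E3.val": {"E4.val", "E5.val"},
    "E4.val": set(),
    "E5.val": set(),
}

# One equation per attribute; each reads already-computed values from v.
rules = {
    "E.val":  lambda v: v["E1.val"] + v["E2.val"],
    "E1.val": lambda v: 5,
    "E2.val": lambda v: v["E3.val"],
    "E3.val": lambda v: v["E4.val"] + v["E5.val"],
    "E4.val": lambda v: 2,
    "E5.val": lambda v: 3,
}

def evaluate(deps, rules):
    """Evaluate every attribute after the attributes it depends on.

    Raises CycleError if the attributes are cyclically defined.
    """
    values = {}
    for attr in TopologicalSorter(deps).static_order():
        values[attr] = rules[attr](values)
    return values

values = evaluate(deps, rules)
print(values["E.val"])  # 10
```

A cyclic definition, such as two attributes that depend on each other, makes the sorter raise CycleError, which matches the rule that cyclically defined attributes are not legal.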

Semantic Actions: Notes (Cont.)
• Synthesized attributes
  – Calculated from attributes of descendants in the parse tree
  – E.val is a synthesized attribute
  – Can always be calculated in a bottom-up order
• Grammars with only synthesized attributes are called S-attributed grammars
  – The most frequent kind of grammar

Inherited Attributes
• Another kind of attribute
• Calculated from attributes of the parent node and/or siblings in the parse tree
• Example: a line calculator

A Line Calculator
• Each line contains an expression
    E → int | E + E
• Each line is terminated with the = sign
    L → E = | + E =
• In the second form, the value of the previous line is used as the starting value
• A program is a sequence of lines
    P → ε | P L

Attributes for the Line Calculator
• Each E has a synthesized attribute val
  – Calculated as before
• Each L has a synthesized attribute val
    L → E =      { L.val = E.val }
      | + E =    { L.val = E.val + L.prev }
• We need the value of the previous line
• We use an inherited attribute L.prev

Attributes for the Line Calculator (Cont.)
• Each P has a synthesized attribute val
  – The value of its last line
    P → ε       { P.val = 0 }
      | P₁ L    { P.val = L.val; L.prev = P₁.val }
• Each L has an inherited attribute prev
  – L.prev is inherited from sibling P₁.val
• Example …

Example of Inherited Attributes
• val synthesized
• prev inherited
• All can be computed in depth-first order
[Figure: parse tree for the one-line program + 2 + 3 =, where P → P L, the inner P → ε has val 0, and L → + E₃ = with E₃ → E₄ + E₅, E₄ → int₂, E₅ → int₃]
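
The interplay of the synthesized val and the inherited prev can be sketched as a depth-first evaluation in which prev is passed down as a function argument and val is returned. The tree encoding and the names eval_E, eval_L, and eval_P are assumptions made for this sketch.

```python
# Hypothetical encodings, chosen for this sketch only:
#   E: ("int", n) or ("plus", e1, e2)
#   L: ("E=", e) or ("+E=", e)
#   P: None for ε, or (p1, line) for P -> P1 L

def eval_E(e):
    """Synthesized attribute E.val."""
    if e[0] == "int":
        return e[1]
    return eval_E(e[1]) + eval_E(e[2])

def eval_L(line, prev):
    """Synthesized attribute L.val; prev is the inherited attribute L.prev."""
    form, e = line
    if form == "E=":             # L -> E =    { L.val = E.val }
        return eval_E(e)
    return eval_E(e) + prev      # L -> + E =  { L.val = E.val + L.prev }

def eval_P(p):
    """Synthesized attribute P.val: the value of the last line."""
    if p is None:                # P -> ε      { P.val = 0 }
        return 0
    p1, line = p                 # P -> P1 L   { L.prev = P1.val; P.val = L.val }
    return eval_L(line, prev=eval_P(p1))

# The one-line program "+ 2 + 3 =" and a second line "+ 4 =" appended to it.
one_line = (None, ("+E=", ("plus", ("int", 2), ("int", 3))))
two_lines = (one_line, ("+E=", ("int", 4)))
print(eval_P(one_line))   # 5
print(eval_P(two_lines))  # 9
```

Passing prev as a parameter and returning val mirrors the direction in which each attribute flows: inherited attributes travel down or sideways in the tree, synthesized ones travel up.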

Semantic Actions: Notes (Cont.)
• Semantic actions can be used to build ASTs
• And many other things as well
  – Also used for type checking, code generation, …
• Process is called syntax-directed translation
  – Substantial generalization over CFGs

Constructing an AST
• We first define the AST data type
• Consider an abstract tree type with two constructors:
    mkleaf(n)       = a leaf node labeled n
    mkplus(T₁, T₂)  = a PLUS node with subtrees T₁ and T₂

Constructing an AST (Cont.)
• We define a synthesized attribute ast
  – Values of ast are ASTs
  – We assume that int.lexval is the value of the integer lexeme
  – Computed using semantic actions
    E → int        { E.ast = mkleaf(int.lexval) }
      | E₁ + E₂    { E.ast = mkplus(E₁.ast, E₂.ast) }
      | ( E₁ )     { E.ast = E₁.ast }

AST Construction Example
• Consider the string int₅ '+' '(' int₂ '+' int₃ ')'
• A bottom-up evaluation of the ast attribute:
    E.ast = mkplus(mkleaf(5), mkplus(mkleaf(2), mkleaf(3)))
[Figure: the resulting AST, a PLUS node with children 5 and a second PLUS node, whose children are 2 and 3]
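
A minimal sketch of the ast attribute, assuming the two constructors mkleaf and mkplus are backed by two small node classes. The class names Leaf and Plus and the tuple encoding of the parse tree are assumptions made here for illustration.

```python
from dataclasses import dataclass
from typing import Union

@dataclass(frozen=True)
class Leaf:
    n: int

@dataclass(frozen=True)
class Plus:
    left: "AST"
    right: "AST"

AST = Union[Leaf, Plus]

def mkleaf(n):
    """Build a leaf node labeled n."""
    return Leaf(n)

def mkplus(t1, t2):
    """Build a PLUS node with subtrees t1 and t2."""
    return Plus(t1, t2)

def ast(node):
    """Compute the synthesized attribute ast of a parse-tree node.

    Parse-tree nodes use a hypothetical tuple encoding:
    ("int", n), ("plus", e1, e2), ("paren", e1).
    """
    kind = node[0]
    if kind == "int":      # E -> int      { E.ast = mkleaf(int.lexval) }
        return mkleaf(node[1])
    if kind == "plus":     # E -> E1 + E2  { E.ast = mkplus(E1.ast, E2.ast) }
        return mkplus(ast(node[1]), ast(node[2]))
    if kind == "paren":    # E -> ( E1 )   { E.ast = E1.ast }
        return ast(node[1])
    raise ValueError(f"unknown production: {kind}")

# Parse tree for 5 + (2 + 3)
tree = ("plus", ("int", 5), ("paren", ("plus", ("int", 2), ("int", 3))))
print(ast(tree))
```

The "paren" case returns the inner AST unchanged, which is how the parentheses and single-successor nodes of the parse tree disappear from the result.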

Review of Abstract Syntax Trees
• We can specify language syntax using a CFG
• A parser will answer whether s ∈ L(G)
• … and will build a parse tree
• … which we convert to an AST
• … and pass on to the rest of the compiler
• Next two and a half lectures:
  – How do we answer s ∈ L(G) and build a parse tree?
• After that: from AST to assembly language

Second Half of Lecture 5: Outline
• Implementation of parsers
• Two approaches
  – Top-down
  – Bottom-up
• Today: Top-Down
  – Easier to understand and program manually
• Then: Bottom-Up
  – More powerful and used by most parser generators

Introduction to Top-Down Parsing
• Terminals are seen in order of appearance in the token stream: t₂ t₅ t₆ t₈ t₉
• The parse tree is constructed
  – From the top
  – From left to right
[Figure: parse tree with internal nodes 1, 3, 4, 7 and leaves t₂, t₅, t₆, t₈, t₉]

Recursive Descent Parsing
• Consider the grammar
    E → T + E | T
    T → int | int * T | ( E )
• Token stream is: int₅ * int₂
• Start with top-level non-terminal E
• Try the rules for E in order

Recursive Descent Parsing. Example (Cont.)
    E → T + E | T
    T → ( E ) | int | int * T
    Token stream: int₅ * int₂
• Try E₀ → T₁ + E₂
• Then try a rule for T₁ → ( E₃ )
  – But ( does not match input token int₅
• Try T₁ → int. Token matches.
  – But + after T₁ does not match input token *
• Try T₁ → int * T₂
  – This will match but + after T₁ will be unmatched
• Has exhausted the choices for T₁
  – Backtrack to choice for E₀

Recursive Descent Parsing. Example (Cont.)
    E → T + E | T
    T → ( E ) | int | int * T
    Token stream: int₅ * int₂
• Try E₀ → T₁
• Follow same steps as before for T₁
  – And succeed with T₁ → int₅ * T₂ and T₂ → int₂
  – With the following parse tree
[Figure: parse tree E₀ → T₁, T₁ → int₅ * T₂, T₂ → int₂]
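
The walkthrough above can be sketched as a backtracking recursive descent parser. This is one possible rendering, not the lecture's implementation: it uses Python generators so that each non-terminal can offer every alternative in turn, and it assumes a token list in which Python integers stand for int tokens and strings stand for '+', '*', '(' and ')'. The names parse_E, parse_T, and parse are hypothetical.

```python
def parse_E(tokens, i):
    """Yield (tree, next_index) for each way E can match tokens from index i."""
    # E -> T + E
    for t, j in parse_T(tokens, i):
        if j < len(tokens) and tokens[j] == "+":
            for e, k in parse_E(tokens, j + 1):
                yield ("E", t, "+", e), k
    # E -> T  (reached after the alternatives above are exhausted)
    for t, j in parse_T(tokens, i):
        yield ("E", t), j

def parse_T(tokens, i):
    """Yield (tree, next_index) for each way T can match tokens from index i."""
    if i >= len(tokens):
        return
    # T -> ( E )
    if tokens[i] == "(":
        for e, j in parse_E(tokens, i + 1):
            if j < len(tokens) and tokens[j] == ")":
                yield ("T", "(", e, ")"), j + 1
    if isinstance(tokens[i], int):
        # T -> int
        yield ("T", tokens[i]), i + 1
        # T -> int * T
        if i + 1 < len(tokens) and tokens[i + 1] == "*":
            for t, j in parse_T(tokens, i + 2):
                yield ("T", tokens[i], "*", t), j

def parse(tokens):
    """Return the first parse tree that consumes all tokens, or None."""
    for tree, j in parse_E(tokens, 0):
        if j == len(tokens):
            return tree
    return None

print(parse([5, "*", 2]))  # ('E', ('T', 5, '*', ('T', 2)))
```

On the token stream int₅ * int₂ the parser first tries E → T + E with each alternative for T, finds no '+' in either case, and then falls back to E → T, where T → int * T consumes the whole input.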

Recursive Descent Parsing. Notes.
• Easy to implement by hand
• Somewhat inefficient (due to backtracking)
• But does not always work …