Abstract Syntax, Aslan Askarov (aslan@cs.au.dk), revised from slides by E. Ernst



slide-1
SLIDE 1

Compilation 2014

Abstract Syntax

Aslan Askarov aslan@cs.au.dk
 
 


Revised from slides by E. Ernst

slide-2
SLIDE 2

Abstract syntax

[Figure: compiler pipeline. Labels: High-level source code, Lexing/Parsing, Abstract syntax tree, Pretty printing, Elaboration, Lowering, Optimization, Code generation, Low-level target code.]

slide-3
SLIDE 3

Recall ml-yacc file

…
exp : ID                  ( A.Id (ID) )
    | INT                 ( A.Number (INT) )
    | LPAREN exp RPAREN   ( exp )
    | exp PLUS exp        ( A.Op (A.Plus, exp1, exp2) )
    | exp MINUS exp       ( A.Op (A.Minus, exp1, exp2) )
…

datatype aexp = Id of string
              | Number of int
              | Op of binop * aexp * aexp
and binop = Plus | Minus | Times | Div

Slide labels: semantic actions in parser; .grm file; ML-code

What else can we put as semantic actions? Is it a good idea?

slide-4
SLIDE 4

Semantic actions in parser

  • In principle, it’s possible to write the entire compiler in the parser's semantic actions
      • Not a good idea: error-prone, difficult to maintain
      • Limited in features, e.g. mutually recursive declarations
  • Rather have multiple passes over the program
      • this needs a convenient representation
  • In practice, semantic actions in the parser generate an abstract syntax tree (AST)
      • a tree representation of the program
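
To make the "multiple passes" point concrete, here is a minimal sketch in Standard ML. It reuses the aexp datatype from the ml-yacc slide; the two passes (idsOf and countOps) are hypothetical examples, not part of the course code. Once the parser has built the tree, each pass is an independent function over it.

```sml
(* The aexp datatype from the ml-yacc slide. *)
datatype binop = Plus | Minus | Times | Div
datatype aexp  = Id of string
               | Number of int
               | Op of binop * aexp * aexp

(* Pass 1 (hypothetical): collect the identifiers used in an expression. *)
fun idsOf (Id x)         = [x]
  | idsOf (Number _)     = []
  | idsOf (Op (_, l, r)) = idsOf l @ idsOf r

(* Pass 2 (hypothetical): count the operator nodes. *)
fun countOps (Op (_, l, r)) = 1 + countOps l + countOps r
  | countOps _              = 0

(* The tree the semantic actions would build for "a + (b - 1)". *)
val tree = Op (Plus, Id "a", Op (Minus, Id "b", Number 1))
val ids  = idsOf tree
val n    = countOps tree
```

Neither pass knows anything about tokens or grammar rules, which is the reason for separating parsing from the rest of the compiler.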
slide-5
SLIDE 5

Concrete (surface) syntax vs abstract syntax

  • Concrete syntax corresponds to a grammar
      • requires parsability
      • the grammar may even be seriously disfigured to make it parsable, e.g. T’ → * F T’
      • lots of otherwise unnecessary details
          • parentheses, “then”, “else”, semicolons, associativity, precedence, etc.
      • syntactic sugar: a & b vs if a then b else 0
  • Abstract syntax
      • need not be parsable; designed for later compiler phases
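
The syntactic sugar example can be sketched as follows. The datatype and the helper mkAnd are hypothetical and only illustrate the idea: the abstract syntax has no node for &, so the semantic action for "exp & exp" builds the equivalent conditional directly.

```sml
(* Hypothetical mini-AST; constructor names are illustrative only. *)
datatype exp = Var of string
             | Int of int
             | If of exp * exp * exp

(* What a semantic action for "exp & exp" could build:
   a & b is represented as "if a then b else 0". *)
fun mkAnd (a, b) = If (a, b, Int 0)

val e = mkAnd (Var "a", Var "b")
```

Later phases then only have to handle If, not a separate conjunction construct.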
slide-6
SLIDE 6

Decorated abstract syntax trees

  • Idea: annotate AST with useful information, e.g.:
  • position in the source – for error reporting
  • types – for semantic analysis
slide-7
SLIDE 7

Including aux information in AST

exp : ID                  ( A.Id (ID, IDleft) )
    | INT                 ( A.Number (INT) )
    | LPAREN exp RPAREN   ( exp )
    | exp PLUS exp        ( A.Op (A.Plus, exp1, exp2, exp1left) )
    | exp MINUS exp       ( A.Op (A.Minus, exp1, exp2, exp1left) )
    | exp TIMES exp       ( A.Op (A.Times, exp1, exp2, exp1left) )
    | exp DIV exp         ( A.Op (A.Div, exp1, exp2, exp1left) )

type pos = int
datatype aexp = Id of id * pos
              | Number of int
              | Op of binop * aexp * aexp * pos

Slide label: position information
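
A small sketch of how a later pass might use the stored positions for error reporting. It assumes identifiers are plain strings (the slide uses a type id that is not shown), and the check itself (divByZero) is a hypothetical example rather than something from the course code.

```sml
type pos = int
datatype binop = Plus | Minus | Times | Div
(* Assumption: identifiers are strings; the slide's "id" type is not shown. *)
datatype aexp  = Id of string * pos
               | Number of int
               | Op of binop * aexp * aexp * pos

(* Hypothetical check: report every division by the literal 0,
   using the position stored in the Op node. *)
fun divByZero (Op (Div, l, Number 0, p)) =
      divByZero l @ ["division by zero at position " ^ Int.toString p]
  | divByZero (Op (_, l, r, _)) = divByZero l @ divByZero r
  | divByZero _ = []

(* A tree for "x + y / 0", with made-up positions 0 and 4. *)
val msgs =
  divByZero (Op (Plus, Id ("x", 0), Op (Div, Id ("y", 4), Number 0, 4), 0))
```

Without the pos component the pass could still detect the problem, but it could not tell the user where in the source it occurs.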

slide-8
SLIDE 8

Pretty printing

  • Generating concrete syntax from abstract
  • Why?
      • Formatting
      • Debugging parsers
      • Design criteria for abstract syntax
          • a good AST design will contain all the relevant information to go from abstract syntax to concrete via pretty printing, but no more
  • Implemented as a straightforward tree traversal
  • Library support
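
The "straightforward tree traversal" can be sketched for the small aexp language from the ml-yacc slide. This is a minimal illustration, not the course's pretty printer: it parenthesizes every operator node, which is always correct but not minimal, and it prints numbers with Int.toString, so negative numbers would appear in SML notation.

```sml
datatype binop = Plus | Minus | Times | Div
datatype aexp  = Id of string
               | Number of int
               | Op of binop * aexp * aexp

fun opToString Plus  = "+"
  | opToString Minus = "-"
  | opToString Times = "*"
  | opToString Div   = "/"

(* One clause per constructor; operator nodes are fully parenthesized,
   so precedence and associativity never need to be consulted. *)
fun pp (Id x)            = x
  | pp (Number n)        = Int.toString n
  | pp (Op (oper, l, r)) =
      "(" ^ pp l ^ " " ^ opToString oper ^ " " ^ pp r ^ ")"

val s = pp (Op (Times, Op (Plus, Id "a", Number 1), Id "b"))
```

Parsing the output of pp should give back the same tree, which is what makes a pretty printer useful for debugging a parser.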
slide-9
SLIDE 9

Pretty printing libraries

  • Primitive type: block/document
  • Blocks are stacked vertically and horizontally to form bigger blocks
  • User provides functions from AST nodes to blocks
      • straightforward traversal over trees
  • Layout engine generates an indented string from the outermost block

[Figure: the statement x := a + b drawn as nested blocks]
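
The block idea can be sketched in a few lines. This is a toy model under simple assumptions (a block is just a list of text lines, and there is no line-width handling); the names text, above, beside, indent and render are made up for this sketch and are not the API of any particular pretty-printing library.

```sml
(* A block is a list of text lines. *)
type block = string list

fun text s : block = [s]

(* Stack two blocks vertically. *)
fun above (a : block, b : block) : block = a @ b

(* Shift a whole block n columns to the right. *)
fun indent n (b : block) : block =
  let val pad = implode (List.tabulate (n, fn _ => #" "))
  in map (fn line => pad ^ line) b end

(* Put b to the right of a's last line; the remaining lines of b
   are aligned under the column where b starts. *)
fun beside (a : block, b : block) : block =
  case (List.rev a, b) of
      ([], _) => b
    | (_, []) => a
    | (lastA :: restRev, firstB :: restB) =>
        List.rev restRev @ [lastA ^ firstB] @ indent (String.size lastA) restB

(* The "layout engine": turn the outermost block into one string. *)
fun render (b : block) = String.concatWith "\n" b

val stmt = beside (text "x := ", above (text "a +", text "b"))
val out  = render stmt
```

Here out consists of the line "x := a +" followed by a second line where "b" is indented five columns, so it lines up under "a". Real libraries add line-width-aware choices between horizontal and vertical layout on top of this basic composition.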

slide-10
SLIDE 10

Pretty printing

[Demo in the terminal]

slide-11
SLIDE 11

AST for Tiger

  • Just like before; nice syntax trees, though more complex
  • Representation: note the use of records
  • Semantic issues: mutual dependencies expressed using sublists of declarations
  • Overall robust design
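
One way to picture the "sublists of declarations" point: adjacent function declarations end up in a single FunctionDec node, so a later phase sees a whole mutually recursive group at once. The sketch below is a simplification with hypothetical types (declarations are reduced to their names, and rawdec does not exist in absyn.sml); in the real parser the grouping could equally be done by the grammar itself.

```sml
(* Hypothetical input: one node per declaration, in source order. *)
datatype rawdec = RawFun of string | RawVar of string

(* Simplified output, modelled on FunctionDec carrying a list. *)
datatype dec = FunctionDec of string list | VarDec of string

(* Merge each run of adjacent function declarations into one FunctionDec. *)
fun group [] = []
  | group (RawVar v :: rest) = VarDec v :: group rest
  | group (RawFun f :: rest) =
      (case group rest of
           FunctionDec fs :: ds => FunctionDec (f :: fs) :: ds
         | ds                   => FunctionDec [f] :: ds)

val ds = group [RawFun "f", RawFun "g", RawVar "x", RawFun "h"]
```

In this example f and g land in the same group and may refer to each other, while h, separated from them by the variable declaration, forms its own group.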
slide-12
SLIDE 12

Excerpt from “absyn.sml”

structure Absyn = struct
  datatype var = SimpleVar of S.symbol * pos
               | FieldVar of var * S.symbol * pos
               | SubscriptVar of var * exp * pos

  and exp = VarExp of var
          | NilExp
          | IntExp of int
          | StringExp of string * pos
          | CallExp of calldata
          ...

  and decl = FunctionDec of fundecldata list
           | VarDec of vardecldata
           | TypeDec of tydecldata list
           ...

  and fundecldata = { name: S.symbol
                    , params: fielddata list
                    , result: (S.symbol * pos) option
                    , body: exp
                    , pos: pos}
…

slide-13
SLIDE 13

AST of “test01.tig”

/* an array type and an array variable */
let
  type arrtype = array of int
  var arr1: arrtype := arrtype [10] of 0
in
  arr1
end

AST (edge labels size, init, body written as prefixes):

LetExp
  TypeDec
    'arrtype' ArrayTy 'int'
  VarDec 'arr1: arrtype'
    init: ArrayExp 'arrtype'
      size: 10
      init: 0
  body: SeqExp
    VarExp
      SimpleVar 'arr1'

slide-14
SLIDE 14

AST of “test02.tig”

/* arr1 is valid since
 * expression 0 is
 * int = myint */
let
  type myint = int
  type arrtype = array of myint
  var arr1: arrtype := arrtype [10] of 0
in
  arr1
end

AST (edge labels size, init, body written as prefixes):

LetExp
  TypeDec
    'myint' NameTy 'int'
    'arrtype' ArrayTy 'myint'
  VarDec 'arr1: arrtype'
    init: ArrayExp 'arrtype'
      size: 10
      init: 0
  body: SeqExp
    VarExp
      SimpleVar 'arr1'

slide-15
SLIDE 15

AST of “test07.tig”

/* mutually recursive
 * functions */
let
  function do_nothing1 (a: int, b: string): int =
    (do_nothing2(a+1); 0)
  function do_nothing2 (d: int): string =
    (do_nothing1(d, "str"); " ")
in
  do_nothing1(0, "str2")
end

AST (edge labels par, arg, body written as prefixes):

LetExp
  FunctionDec
    'do_nothing1 : int'
      par: 'a: int'
      par: 'b: string'
      body: SeqExp
        Call 'do_nothing2'
          arg: +
            VarExp
              SimpleVar 'a'
            1
        0
    'do_nothing2: string'
      par: 'd: int'
      body: SeqExp
        Call 'do_nothing1'
          arg: VarExp
            SimpleVar 'd'
          arg: "str"
        " "
  body: SeqExp
    Call 'do_nothing1'
      arg: 0
      arg: "str2"

slide-16
SLIDE 16

AST of “test42.tig”

[Figure: AST of test42.tig. A LetExp with one TypeDec group ('arrtype1', 'rectype1', 'arrtype2', 'rectype2', 'arrtype3'), VarDec nodes for 'arr1', 'arr2', 'arr3: arrtype3', 'rec1' and 'rec2' with ArrayExp or RecordExp initializers, and a SeqExp body of eight AssignExp nodes whose left-hand sides are SubscriptVar and FieldVar nodes.]

/* correct declarations */
let
  type arrtype1 = array of int
  type rectype1 = {name: string, address: string, id: int, age: int}
  type arrtype2 = array of rectype1
  type rectype2 = {name: string, dates: arrtype1}
  type arrtype3 = array of string

  var arr1 := arrtype1 [10] of 0
  var arr2 := arrtype2 [5] of rectype1{name="aname", address="somewhere", id=0, age=0}
  var arr3: arrtype3 := arrtype3 [100] of ""

  var rec1 := rectype1{name="Kapoios", address="Kapou", id=02432, age=44}
  var rec2 := rectype2{name="Allos", dates = arrtype1 [3] of 1900}
in
  arr1[0] := 1;
  arr1[9] := 3;
  arr2[3].name := "kati";
  arr2[1].age := 23;
  arr3[34] := "sfd";

  rec1.name := "sdf";
  rec2.dates[0] := 2323;
  rec2.dates[2] := 2323
end

slide-17
SLIDE 17

Issues to note

  • Structures easily get rather large
  • Not hard to read, except for size
  • Connection to source: AST nodes need stored positions
      • available under ml-yacc: for symbols x, x1, the start positions are xleft, x1left

slide-18
SLIDE 18

Summary

  • Parsing purpose: frontend of the compiler
      • single-pass compilation is possible, but messy and limited in functionality
  • Abstract syntax tree:
      • tree representation of the program
      • produced by the semantic actions of the LR parser
  • Pretty-printing:
      • allows us to reformat the source
  • Traversals
      • important programming idiom
      • we will be using them a lot throughout the compiler