semantic analysis

Semantic Analysis - PowerPoint PPT Presentation



  1. Semantic Analysis

     Outline
     • The role of semantic analysis in a compiler
       – A laundry list of tasks
     • Scope
       – Static vs. Dynamic scoping
       – Implementation: symbol tables
     • Types
       – Static analyses that detect type errors
       – Statically vs. Dynamically typed languages

     Where we are

     The Compiler Front-End
     Lexical analysis: program is lexically well-formed
       – Tokens are legal
         • e.g. identifiers have valid names, no stray characters, etc.
       – Detects inputs with illegal tokens
     Parsing: program is syntactically well-formed
       – Declarations have correct structure, expressions are syntactically valid, etc.
       – Detects inputs with ill-formed syntax
     Semantic analysis:
       – Last “front end” compilation phase
       – Catches all remaining errors

  2. Beyond Syntax Errors
     • What’s wrong with this C code? (Note: it parses correctly)

         foo(int a, char * s){...}
         int bar() {
           int f[3];
           int i, j, k;
           char q, *p;
           float k;
           foo(f[6], 10, j);
           break;
           i->val = 42;
           j = m + k;
           printf("%s,%s.\n",p,q);
           goto label42;
         }

     • Undeclared identifier
     • Multiply declared identifier
     • Index out of bounds
     • Wrong number or types of arguments to function call
     • Incompatible types for operation
     • break statement outside switch/loop
     • goto with no label

     Why Have a Separate Semantic Analysis?
     Parsing cannot catch some errors
     Some language constructs are not context-free
       – Example: Identifier declaration and use
       – An abstract version of the problem is:
           L = { wcw | w ∈ (a + b)* }
       – The 1st w represents the identifier’s declaration; the 2nd w represents a use of the identifier
       – This language is not context-free

     What Does Semantic Analysis Do?
     Performs checks beyond syntax of many kinds ...
     Examples:
       1. All used identifiers are declared
       2. Identifiers declared only once
       3. Types
       4. Procedures and functions defined only once
       5. Procedures and functions used with the right number and type of arguments
     And others . . .
     The requirements depend on the language

     What’s Wrong?
     Example 1
         let string y ← "abc" in y + 42
     Example 2
         let integer y in x + 42

  3. Attributes of an Identifier
     name: character string (obtained from scanner)
     scope: program region in which identifier is valid
     type:
       – integer
       – array:
         • number of dimensions
         • upper and lower bounds for each dimension
         • type of elements
       – function:
         • number and type of parameters (in order)
         • type of returned value
         • size of stack frame

     Semantic Processing: Syntax-Directed Translation
     Basic idea: Associate information with language constructs by attaching attributes to the grammar symbols that represent these constructs
       – Values for attributes are computed using semantic rules associated with grammar productions
       – An attribute can represent anything (reasonable) that we choose; e.g. a string, number, type, etc.
       – A parse tree showing the values of attributes at each node is called an annotated parse tree

     Scope
     • The scope of an identifier (a binding of a name to the entity it names) is the textual part of the program in which the binding is active
     • Scope matches identifier declarations with uses
       – Important static analysis step in most languages

     Scope (Cont.)
     • The scope of an identifier is the portion of a program in which that identifier is accessible
     • The same identifier may refer to different things in different parts of the program
       – Different scopes for same name don’t overlap
     • An identifier may have restricted scope
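The identifier attributes listed on the "Attributes of an Identifier" slide can be pictured as a small record. The following is only a sketch: the class and field names (`IntegerType`, `ArrayType`, `FunctionType`, `Identifier`, `frame_size`) are assumptions made for illustration and do not come from the slides.

```python
from dataclasses import dataclass
from typing import List, Tuple, Union


@dataclass
class IntegerType:
    """The plain integer type; it carries no extra attributes."""


@dataclass
class ArrayType:
    bounds: List[Tuple[int, int]]  # (lower, upper) bound for each dimension
    element: "Type"                # type of the elements

    @property
    def dimensions(self) -> int:
        # The number of dimensions follows from the list of bounds.
        return len(self.bounds)


@dataclass
class FunctionType:
    params: List["Type"]  # parameter types, in order
    returns: "Type"       # type of the returned value
    frame_size: int       # size of the stack frame (unit is an assumption: bytes)


Type = Union[IntegerType, ArrayType, FunctionType]


@dataclass
class Identifier:
    name: str    # character string obtained from the scanner
    scope: str   # program region in which the identifier is valid
    type: Type


# A hypothetical 10 x 5 integer array declared in a region called "main".
grid = Identifier("grid", "main", ArrayType([(0, 9), (0, 4)], IntegerType()))
```

Here `grid.type.dimensions` is derived from the stored bounds, so the two attributes cannot disagree.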

  4. Static vs. Dynamic Scope
     • Most languages have static (lexical) scope
       – Scope depends only on the physical structure of program text, not its run-time behavior
       – The determination of scope is made by the compiler
       – C, Java, ML have static scope; so do most languages
     • A few languages are dynamically scoped
       – Lisp, SNOBOL
       – Lisp has changed to mostly static scoping
       – Scope depends on execution of the program

     Static Scoping Example

         let integer x ← 0 in
         {
           x;
           let integer x ← 1 in
             x;
           x;
         }

     Uses of x refer to closest enclosing definition

     Dynamic Scope
     • A dynamically-scoped variable refers to the closest enclosing binding in the execution of the program
     Example
         g(y) = let integer a ← 42 in f(3);
         f(x) = a;
       – When invoking g(54) the result will be 42

     Static vs. Dynamic Scope

         Program scopes (input, output);
         var a: integer;
         procedure first;
           begin a := 1; end;
         procedure second;
           var a: integer;
           begin first; end;
         begin
           a := 2; second; write(a);
         end.

     With static scope rules, it prints 1
     With dynamic scope rules, it prints 2
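The Pascal-style program on the last slide of this page can be simulated to see where the two answers come from. This is a rough sketch, not how a real compiler or interpreter is built: frames are plain dictionaries, and the helper names `run` and `lookup_frame` are invented for the illustration.

```python
def run(dynamic: bool) -> int:
    """Simulate the 'scopes' program and return the value passed to write(a)."""
    global_env = {"a": None}   # var a: integer; (program level)
    call_stack = [global_env]  # frames that are active at run time

    def lookup_frame(name, static_chain):
        # Dynamic scope: search the active frames, most recent call first.
        # Static scope: search only the lexically enclosing frames.
        frames = reversed(call_stack) if dynamic else static_chain
        for frame in frames:
            if name in frame:
                return frame
        raise NameError(name)

    def first():
        frame = {}             # first declares no locals
        call_stack.append(frame)
        # first is declared at program level: its static chain is [own frame, globals].
        lookup_frame("a", [frame, global_env])["a"] = 1   # a := 1
        call_stack.pop()

    def second():
        call_stack.append({"a": None})  # var a: integer; (local to second)
        first()
        call_stack.pop()

    global_env["a"] = 2        # a := 2
    second()
    return global_env["a"]     # write(a)


print(run(dynamic=False))  # static scope rules
print(run(dynamic=True))   # dynamic scope rules
```

Under static rules the assignment in `first` reaches the program-level `a`, so the result is 1. Under dynamic rules it reaches the `a` local to `second`, which is the most recent active binding, so the program-level `a` keeps the value 2.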

  5. Dynamic Scope (Cont.)
     • With dynamic scope, bindings cannot always be resolved by examining the program because they are dependent on calling sequences
     • Dynamic scope rules are usually encountered in interpreted languages
     • These languages also usually lack static type checking:
       – type determination is not always possible when dynamic rules are in effect

     Scope of Identifiers
     • In most programming languages identifier bindings are introduced by
       – Function declarations (introduce function names)
       – Procedure definitions (introduce procedure names)
       – Identifier declarations (introduce identifiers)
       – Formal parameters (introduce identifiers)

     Scope of Identifiers (Cont.)
     • Not all kinds of identifiers follow the most-closely nested scope rule
     • For example, function declarations
       – often cannot be nested
       – are globally visible throughout the program
     • In other words, a function name can be used before it is defined

     Example: Use Before Definition

         foo (integer x)
         {
           integer y
           y ← bar(x)
           ...
         }
         bar (integer i): integer
         {
           ...
         }

  6. Other Kinds of Scope
     • In O-O languages, method and attribute names have more sophisticated (static) scope rules
     • A method need not be defined in the class in which it is used, but in some parent class
     • Methods may also be redefined (overridden)

     Implementing the Most-Closely Nested Rule
     • Much of semantic analysis can be expressed as a recursive descent of an AST
       – Process an AST node n
       – Process the children of n
       – Finish processing the AST node n
     • When performing semantic analysis on a portion of the AST, we need to know which identifiers are defined

     Implementing Most-Closely Nesting (Cont.)
     • Example:
       – the scope of variable declarations is one subtree
           let integer x ← 42 in E
       – x can be used in subtree E

     Symbol Tables
     Purpose: To hold information about identifiers that is computed at some point and looked up at later times during compilation
     Examples:
       – type of a variable
       – entry point for a function
     Operations: insert, lookup, delete
     Common implementations: linked lists, hash tables
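The recursive descent described on this page (process a node, process its children, finish the node) might look roughly like the sketch below. The tuple-based AST encoding, the node kinds `"let"`, `"var"`, `"add"`, `"num"`, and the function name `check` are all assumptions made for the example. It also assumes the initializer of a `let` is evaluated outside the new scope, which the slides do not state.

```python
def check(node, defined=None, errors=None):
    """Walk a tuple-encoded AST and collect uses of undeclared identifiers."""
    defined = [] if defined is None else defined  # identifiers currently in scope
    errors = [] if errors is None else errors
    kind = node[0]
    if kind == "var":                  # ("var", name)
        if node[1] not in defined:
            errors.append(f"undeclared identifier: {node[1]}")
    elif kind == "add":                # ("add", left, right)
        check(node[1], defined, errors)
        check(node[2], defined, errors)
    elif kind == "let":                # ("let", name, init_or_None, body)
        _, name, init, body = node
        if init is not None:
            check(init, defined, errors)  # assumption: init cannot see the new name
        defined.append(name)              # start processing: name is visible in the body
        check(body, defined, errors)      # process the children
        defined.pop()                     # finish: name leaves scope
    # ("num", value) nodes need no checking
    return errors


# let integer x <- 42 in x        : x is defined inside the body
ok = ("let", "x", ("num", 42), ("var", "x"))
# let integer y in x + 42         : Example 2 from the "What's Wrong?" slide
bad = ("let", "y", None, ("add", ("var", "x"), ("num", 42)))

print(check(ok))
print(check(bad))
```

The list `defined` plays the role of the set of identifiers known for the current portion of the AST; it grows on entry to a `let` body and shrinks again on exit.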

  7. Symbol Tables
     • Assuming static scope, consider again:
         let integer x ← 42 in E
     • Idea:
       – Before processing E, add definition of x to current definitions, overriding any other definition of x
       – After processing E, remove definition of x and, if needed, restore old definition of x
     • A symbol table is a data structure that tracks the current bindings of identifiers

     A Simple Symbol Table Implementation
     • Structure is a stack
     • Operations
         add_symbol(x)     push x and associated info, such as x’s type, on the stack
         find_symbol(x)    search stack, starting from top, for x. Return first x found or NULL if none found
         remove_symbol()   pop the stack
     • Why does this work?

     Limitations
     • The simple symbol table works for variable declarations
       – Symbols added one at a time
       – Declarations are perfectly nested
     • Doesn’t work for
         foo(x: integer, x: float);
     • Other problems?

     A Fancier Symbol Table
     • enter_scope()    start/push a new nested scope
     • find_symbol(x)   finds current x (or null)
     • add_symbol(x)    add a symbol x to the table
     • check_scope(x)   true if x defined in current scope
     • exit_scope()     exits/pops the current scope
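One possible shape for the "fancier" symbol table is a stack of dictionaries, one per scope. The five operation names come from the slide; everything else (the class name `SymbolTable`, the extra `info` argument to `add_symbol`, the choice of dictionaries) is an assumption of this sketch.

```python
class SymbolTable:
    """A stack of scopes; each scope maps an identifier to its associated info."""

    def __init__(self):
        self.scopes = [{}]  # start with a single outermost scope

    def enter_scope(self):
        # Start/push a new nested scope.
        self.scopes.append({})

    def exit_scope(self):
        # Exit/pop the current scope, restoring any outer definitions.
        self.scopes.pop()

    def add_symbol(self, name, info):
        # Add a symbol to the current (innermost) scope.
        self.scopes[-1][name] = info

    def check_scope(self, name):
        # True if name is defined in the current scope.
        return name in self.scopes[-1]

    def find_symbol(self, name):
        # Search from the innermost scope outwards; None if not found.
        for scope in reversed(self.scopes):
            if name in scope:
                return scope[name]
        return None


# Checking the parameter list foo(x: integer, x: float) from the Limitations slide.
table = SymbolTable()
table.enter_scope()
duplicates = []
for name, type_name in [("x", "integer"), ("x", "float")]:
    if table.check_scope(name):
        duplicates.append(name)  # second x collides within the same scope
    else:
        table.add_symbol(name, type_name)
table.exit_scope()
print(duplicates)
```

Because `check_scope` looks only at the innermost scope, it can tell a duplicate declaration in the same parameter list apart from a legitimate redefinition that shadows an outer `x`.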

  8. Function/Procedure Definitions
     • Function names can be used prior to their definition
     • We can’t check that for function names
       – using a symbol table
       – or even in one pass
     • Solution
       – Pass 1: Gather all function/procedure names
       – Pass 2: Do the checking
     • Semantic analysis requires multiple passes
       – Probably more than two

     Types
     • What is a type?
       – This is a subject of some debate
       – The notion varies from language to language
     • Consensus
       – A type is a set of values and
       – A set of operations on those values
     • Type errors arise when operations are performed on values that do not support that operation

     Why Do We Need Type Systems?
     Consider the assembly language fragment
         addi $r1, $r2, $r3
     What are the types of $r1, $r2, $r3 ?

     Types and Operations
     • Certain operations are legal for values of each type
       – It doesn’t make sense to add a function pointer and an integer in C
       – It does make sense to add two integers
       – But both have the same assembly language implementation!
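The two-pass solution for function names can be sketched as follows. The input format (a list of `(name, callees)` pairs) and the function name `check_calls` are made up for the illustration; a real compiler would walk its AST instead.

```python
def check_calls(functions):
    """functions: list of (name, [names of called functions]) in source order."""
    declared = set()
    errors = []

    # Pass 1: gather all function/procedure names.
    for name, _callees in functions:
        if name in declared:
            errors.append(f"function defined twice: {name}")
        declared.add(name)

    # Pass 2: do the checking, now that every name is known.
    for name, callees in functions:
        for callee in callees:
            if callee not in declared:
                errors.append(f"{name} calls undefined function {callee}")

    return errors


# foo calls bar before bar is defined, as in "Example: Use Before Definition".
program = [("foo", ["bar"]), ("bar", [])]
print(check_calls(program))

# A call to a function that is never defined is still reported.
print(check_calls([("foo", ["baz"])]))
```

A single pass that checked calls while collecting names would wrongly reject the first program, since `bar` is not yet known when the body of `foo` is visited.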
