
Compiler Design, Spring 2018: 3.5 Limitations of context-free grammars



  1. Compiler Design, Spring 2018
     3.5 Limitations of context-free grammars
     4.0 Semantic analysis
     Thomas R. Gross, Computer Science Department, ETH Zurich, Switzerland

  2. Context-free grammars
     § Efficient parsers exist for context-free languages
     § Should we look at other language classes?
       § Context-sensitive grammars
       § Unrestricted grammars
     § Grammars are about checking properties

  3. Compiler structure
     § Parser builds parse tree (concrete syntax tree)
       § Checks input for compliance with language spec
     § Parse tree can be turned into an abstract syntax tree (AST)
       § Removes unnecessary detail
       § Most details related to grammar symbols are not critical
     § From parse tree / AST to code generation
       § We did this (in part) in Homework 1
       § (Will do more in later homework)

  4. Extra step: Error detection
     § Parse tree construction
       § Parser finds some kinds of errors (“syntax errors”) but not all
       § Some kinds of errors can be detected only at runtime
     § Efficient parsing algorithms are known for Type-2 (context-free) grammars
     § Not always desirable: finding errors with the parser
     § Limitations of context-free grammars

  5. A useful property: Variables declared
     § Consider a language like Java(Li)
     § The spec requires that all variables used have been declared
       int x; x = x + 1;

  6. Using parsing to check property
     § How could we express this property so that it can be checked by the parser?
       § A parse tree is constructed only for those programs that maintain this property (variables declared before use)
       § Otherwise an error is signaled
     § Can we find a language $L_J$ to model this property?
       § Then we can think about a grammar $G_J$ such that $L(G_J) = L_J$

  7. $L_J$
     § $L_J = \{ \alpha c \alpha \mid \alpha \in \{a, b\}^* \}$
     § Terminals: a, b, c
     § Example words from $L_J$: aacaa, abcab, aabacaaba
     § Not in $L_J$: ca, acb
     § How does $L_J$ relate to our problem?

  8. void fct1() { int x; { x = x + 1 } }
     void fct2() { int x; { x = y + 1 } }
     § Could use $L_J = \{ \alpha c \alpha d \mid \alpha \in \{a, b\}^* \}$

  9. $L_J$
     § $L_J$ allows us to model the following constraint:
       Any variable that appears in the program/function/method has been declared previously
     § Terminal c separates the “body” of a unit from its definition block
     § Useful property to check before code generation

  10. $L_J$
     § $L_J$ allows us to model the following constraint:
       Any variable that appears in the program/function/method has been declared previously
     § Bad news: (Theorem) There exists no context-free grammar G such that $L_J = L(G)$
     § Proof:
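
The slide leaves the proof open. One standard way to argue it, which is not necessarily the argument given in the lecture, uses the pumping lemma for context-free languages. The witness word $z$ and the case split below are choices made for this sketch.

```latex
% Pumping-lemma sketch for L_J = { \alpha c \alpha : \alpha \in \{a,b\}^* }
\begin{proof}[Sketch]
Assume $L_J$ is context-free with pumping length $p$, and take
$z = a^p b^p \, c \, a^p b^p \in L_J$.
The pumping lemma gives $z = uvwxy$ with $|vwx| \le p$, $|vx| \ge 1$,
and $u v^i w x^i y \in L_J$ for all $i \ge 0$. Consider $i = 0$:
\begin{itemize}
  \item If $v$ or $x$ contains $c$, then $uwy$ has no $c$, so $uwy \notin L_J$.
  \item If $vwx$ lies entirely to the left (or entirely to the right) of $c$,
        the two sides of $c$ in $uwy$ have different lengths, so $uwy \notin L_J$.
  \item Otherwise $c$ lies in $w$, $v$ lies in the left $b^p$ block and $x$ in the
        right $a^p$ block, so $uwy = a^p b^{p-j} \, c \, a^{p-k} b^p$ with
        $j + k \ge 1$, and the two sides differ.
\end{itemize}
Every case contradicts the lemma, so $L_J$ is not context-free.
\end{proof}
```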

  11. Another useful property: Matching parameters
     § Consider a language like Java(Li)
     § The spec requires that for all methods/functions, the number of formal parameters (at the place of method definition) matches the number of actual parameters (at the call site)
       int fct (int a, float b, xref c) { … }
       x = fct(a, b, c);

  12. Another useful property
     § How could we express this property so that it can be checked by the parser?
       § A parse tree is constructed only for those programs that maintain this property (actuals and formals match)
       § Otherwise an error is signaled
     § Can we find a language $L_P$ to model this property?
       § Then we can think about a grammar $G_P$ such that $L(G_P) = L_P$

  13. $L_P$
     § $L_P = \{ a^n b^m c^n d^m \}$
       § a, b, c, d: terminals
       § Integers n, m ≥ 1
     § Example words from $L_P$: aabccd, aaabbcccdd, abbbbcdddd
     § Not in $L_P$: aabcd
     § Why would we care about $L_P$?
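
As a minimal sketch of what membership in this language means operationally, the following Python function first checks the regular shape (a run of a's, then b's, c's, d's) and then compares the counts in ordinary code. The name `in_lp` is hypothetical and not from the slides.

```python
import re

# Shape check only: one or more a's, then b's, then c's, then d's.
_SHAPE = re.compile(r"(a+)(b+)(c+)(d+)")

def in_lp(word: str) -> bool:
    """Return True if word has the form a^n b^m c^n d^m with n, m >= 1."""
    m = _SHAPE.fullmatch(word)
    if m is None:
        return False
    a, b, c, d = (len(group) for group in m.groups())
    # The count comparison is the part a context-free grammar cannot express.
    return a == c and b == d
```

With this sketch, `in_lp("aabccd")` holds (n = 2, m = 1), while `in_lp("aabcd")` does not, because the single c does not match the two a's.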

  14. $L_P$
     § $L_P$ allows us to model the following constraint:
       For all methods/functions, the number of formal parameters (at the place of method definition) matches the number of actual parameters (at the call site)
     § Can be extended to deal with matching types
       § Tricky if type conversions are an option
     § Useful property to check before code generation

  15. $L_P$
     § $L_P$ allows us to model the following constraint:
       For all methods/functions, the number of formal parameters (at the place of method definition) matches the number of actual parameters (at the call site)
     § Bad news: (Theorem) There exists no context-free grammar G such that $L_P = L(G)$
     § Proof:

  16. Comments
     § Context-free grammars cannot express all desirable constraints
     § Switching to context-sensitive grammars is not productive
     § Use “unrestricted grammar” instead…
       § Use a program to perform additional checks
       § Complete flexibility
     § Can be (and often is) an additional step in the compiler
       § After parsing
       § Before code generation
     § Recall: Some checks must wait until run time
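
To illustrate "use a program to perform additional checks", here is a minimal sketch of a declared-before-use pass that would run after parsing. It assumes the parser has already flattened a function body into `("decl", name)` and `("use", name)` events; that event format and the name `undeclared_uses` are assumptions for illustration, not part of the course material.

```python
def undeclared_uses(events):
    """Return the names that are used without a preceding declaration."""
    declared = set()
    errors = []
    for kind, name in events:
        if kind == "decl":
            declared.add(name)
        elif kind == "use" and name not in declared:
            errors.append(name)
    return errors

# int x; x = x + 1   (x is written once and read once)
fct1 = [("decl", "x"), ("use", "x"), ("use", "x")]
# int x; x = y + 1   (y was never declared)
fct2 = [("decl", "x"), ("use", "x"), ("use", "y")]
```

Here `undeclared_uses(fct1)` is empty and `undeclared_uses(fct2)` reports `y`, which is the distinction the grammar alone could not make.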

  17. More comments
     § Note: Parsing is also used in (natural) language processing
       § No (complete) (context-free) grammar exists for English, German, …
       § More expensive approaches are needed
     § Ambiguity is part of reality
       § May need to obtain (multiple, all) parse trees
       § “The food is here!” vs. “The food is here?”
     § Interesting topic but not part of this class

  18. 4.0 Semantic analysis
     § Idea: before proceeding to code generation, the compiler checks program properties
       § Early feedback (while source information is still available)
       § Avoid subsequent complications

  19. Semantic analysis
     § Idea: before proceeding to code generation, the compiler checks program properties
     § Also the time to transform the program
       § Often done at the time the parse tree is transformed into the AST
     § Example transformations
       § Type casts
       § Add default parameters to method/function calls
       § Construct initializer

  20. 4.1 Syntax-directed translation
     § Parsing: Control table M decides which production to use
       § So far: Recorded production (as “action”)
       § General: Attach code to production
         § E.g., add node to syntax tree
         § E.g., keep track of definitions
     § As the parser recognizes a word
       § It produces an AST (or other desired data structure)
       § And/or computes a predicate

  21. Attribute grammars
     § Context-free grammar extended with (context-sensitive) information
     § “Attributes”
       § Attached to non-terminals
     § Attributes have values
       § Value assigned during parsing
       § Value evaluated in a conditional statement (see later)

  22. Attribute grammars
     § Types of attributes
       1. Synthesized attributes
          § Value obtained from attributes of children of non-terminal
       2. Inherited attributes
          § Value obtained from attribute of parent of non-terminal
          § Or from attribute(s) of sibling(s) of non-terminal

  23. Example
     § Example (expression evaluation)
       § E → E + T
     § Production: E_0 → E_1 + T
     § Attribute
       § Integer value
       § E_0.Value := E_1.Value + T.Value
     § Note: Use E_1 vs. E_0 to distinguish the two occurrences of E in the production
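
The rule above computes a synthesized attribute: the value of the parent comes from its children. A minimal sketch in Python, assuming a tiny tree representation (`Plus`, `T`) and an extra production E → T for the base case, both of which are hypothetical and chosen only for this illustration:

```python
from dataclasses import dataclass
from typing import Union

@dataclass
class T:
    value: int            # leaf: attribute taken directly from the token

@dataclass
class Plus:               # stands for the production E_0 -> E_1 + T
    e1: "Expr"
    t: T

Expr = Union[Plus, T]

def value(node: Expr) -> int:
    """Evaluate the synthesized attribute Value bottom-up."""
    if isinstance(node, T):
        return node.value                      # assumed base case E -> T
    # E_0.Value := E_1.Value + T.Value
    return value(node.e1) + value(node.t)

tree = Plus(Plus(T(1), T(2)), T(3))            # (1 + 2) + 3
```

Calling `value(tree)` walks the children first and then combines their values, which is the evaluation order a synthesized attribute implies.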

  24. Attributes
     § Consider $L = \{ a^n b^n c^n \}$
       § Terminals: a, b, c
       § n integer ≥ 1
     § L cannot be produced by a context-free grammar
     § We would like to use a context-free grammar (and parser) to recognize L
     § Idea: Use attributes to deal with aspects the parser cannot handle
       § Attribute domain: Integers
       § Result predicate: “true” if $w = a^k b^k c^k$ for some k

  25. Example (cont’d)
     § Consider $G_{19}$
       S → A B C
       A → aA | a
       B → bB | b
       C → cC | c
     § Start symbol is S
     § $L = \{ a^n b^n c^n \} \subset L(G_{19})$

  26. Rules
     § Attach a rule to each production
     § Rules for A productions
       A_0 → a A_1      <A_0>.Na := <A_1>.Na + 1
       A → a            <A>.Na := 1
     § Rules for B, C productions similar
     § Condition for S → A B C
       § <A>.Na == <B>.Nb == <C>.Nc

  27. Grammar: S → A B C,  A → aA | a,  B → bB | b,  C → cC | c

     Productions      Rules
     S → A B C        if and only if <A>.Na == <B>.Nb == <C>.Nc
     A_0 → a A_1      <A_0>.Na := <A_1>.Na + 1
     A → a            <A>.Na := 1
     B_0 → b B_1      <B_0>.Nb := <B_1>.Nb + 1
     B → b            <B>.Nb := 1
     C_0 → c C_1      <C_0>.Nc := <C_1>.Nc + 1
     C → c            <C>.Nc := 1
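
The productions and rules can be sketched as a small recognizer in Python. This is an illustration under assumptions, not the parser used in the course: it parses recursively instead of using a stack, and the function names `parse_run` and `accepts` are hypothetical. The attribute rules appear as comments next to the code that implements them.

```python
def parse_run(word, pos, symbol):
    """Parse X -> symbol X | symbol starting at pos.

    Returns (count, next_pos), or None if word[pos] is not symbol.
    """
    if pos >= len(word) or word[pos] != symbol:
        return None
    rest = parse_run(word, pos + 1, symbol)
    if rest is None:
        return 1, pos + 1            # X -> symbol        : <X>.N := 1
    count, next_pos = rest
    return count + 1, next_pos       # X_0 -> symbol X_1  : <X_0>.N := <X_1>.N + 1

def accepts(word):
    """S -> A B C, accepted only if <A>.Na == <B>.Nb == <C>.Nc."""
    pos, counts = 0, []
    for symbol in "abc":
        result = parse_run(word, pos, symbol)
        if result is None:
            return False             # word is not even in L(G_19)
        count, pos = result
        counts.append(count)
    if pos != len(word):
        return False                 # unconsumed input remains
    na, nb, nc = counts
    return na == nb == nc            # the condition attached to S -> A B C
```

For `aabbcc` the three counts are all 2 and the word is accepted; `aabbc` is in L(G_19) but fails the condition on S → A B C.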

  28. aabbcc

     Stack    Input      Action
     $        aabbcc$    shift
     a$       abbcc$     shift
     aa$      bbcc$      A → a; <A>.Na := 1
     Aa$      bbcc$      A_0 → a A_1; <A_0>.Na := <A_1>.Na + 1 = 2
     A$       bbcc$      shift
     bA$      bcc$       shift
     bbA$     cc$        B → b; <B>.Nb := 1
     BbA$     cc$        B_0 → b B_1; <B_0>.Nb := <B_1>.Nb + 1 = 2
     BA$      cc$

  29. aabbcc

     Stack    Input    Action
     BA$      cc$      shift
     cBA$     c$       shift
     ccBA$    $        C → c; <C>.Nc := 1
     CcBA$    $        C_0 → c C_1; <C_0>.Nc := <C_1>.Nc + 1 = 2
     CBA$     $        S → A B C; Na == Nb == Nc? True
     S$       $        ACCEPT

  30. aabbcc: tree view
     S (condition: true)
       A (Na = 2)
         a
         A (Na = 1)
           a
       B (Nb = 2)
         b
         B (Nb = 1)
           b
       C (Nc = 2)
         c
         C (Nc = 1)
           c

  31. Question
     What type of parser (top-down or bottom-up) did we use to parse w (and to implement the checks)? Why?
     (Hint: Top-of-stack was arbitrarily picked to be on the left, that is, the position of top-of-stack does not convey any information.)

  32. Syntax(-based) analysis
     § Powerful tool
     § Easy to get carried away
     § Once a topic of active research

  33. Semantic analysis
     § Goal: Identify problems early on
       float f; int [] iarray; int j;
       iarray = new int [10];
       iarray [f] = j;
     § Idea: check AST
       § Either report an error
       § Or modify the AST
       int j; float f;
       j = f; // replace with: j = round(f)
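
Both outcomes on this slide (report an error, or modify the AST) can be sketched on a toy AST in Python. The node classes and the functions `check_assign` and `check_index` are hypothetical, and types are plain strings for brevity; a real compiler would use its own AST and type representation.

```python
from dataclasses import dataclass

@dataclass
class Var:
    name: str
    type: str             # e.g. "int", "float", "int[]"

@dataclass
class Call:
    func: str
    arg: object
    type: str             # result type of the call

@dataclass
class Assign:
    target: Var
    value: object

@dataclass
class Index:
    array: Var
    index: object

def check_assign(node: Assign) -> Assign:
    """Modify the AST: j = f with int j, float f becomes j = round(f)."""
    if node.target.type == "int" and node.value.type == "float":
        node.value = Call("round", node.value, "int")
    return node

def check_index(node: Index) -> None:
    """Report an error: iarray[f] with a float index is rejected."""
    if node.index.type != "int":
        raise TypeError(f"array index must be int, got {node.index.type}")
```

Which of the two reactions applies to a given mismatch is a language design decision; the sketch simply mirrors the two examples on the slide.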

  34. 4.2 Symbol table
     § Symbol table: Central repository of information about program symbols
     § Checks must exploit structure of program

  35. Symbol table
     § Many checks require gathering/retrieving information about symbols
       § Function/method names
       § Class names
       § Variable/field names
       § Function/method types
       § Class types
       § Variable/field types
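
A minimal sketch of such a repository in Python, assuming nested scopes are modeled as a stack of dictionaries. The class and method names (`SymbolTable`, `declare`, `lookup`, `enter_scope`, `exit_scope`) are hypothetical and not taken from the course framework.

```python
class SymbolTable:
    """Maps names to (kind, type) pairs, with nested scopes."""

    def __init__(self):
        self.scopes = [{}]                 # innermost scope is last

    def enter_scope(self):
        self.scopes.append({})

    def exit_scope(self):
        self.scopes.pop()

    def declare(self, name, kind, type_):
        scope = self.scopes[-1]
        if name in scope:
            raise KeyError(f"{name} already declared in this scope")
        scope[name] = (kind, type_)

    def lookup(self, name):
        # Search from the innermost scope outwards.
        for scope in reversed(self.scopes):
            if name in scope:
                return scope[name]
        return None                        # caller reports "undeclared"
```

A semantic check would call `declare` at each definition and `lookup` at each use, entering and leaving scopes as it walks the program structure.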
