day 3


  1. Day 3 If you are still using the “default” password that was assigned when your account was created, CHANGE IT NOW! (It can be the same as your email password.)

  2. Day 3 Steps in compiling:
     ● (Optional preprocessing)
     ● Lexical analysis (“scanning”)
     ● Syntactic analysis (“parsing”)
     ● Semantic analysis
     ● Intermediate code generation
     ● Optimization
     ● Code generation
     ● (Optional final optimization)

  3. Lexical Analysis A token is any component of a program that is generally treated as an indivisible piece, e.g., a variable name, an operator such as <=, a punctuation mark such as a semicolon, a string constant, etc. Start with a numbered list of token types:
     0 <unsigned int>
     1 ‘(’
     2 ‘)’
     3 ‘+’
     4 ‘-’
     5 “for”
     6 “while”
     7 <identifier> (not reserved)
     8 ‘;’
     9 ‘=’
     10 “==”
     11 “<=”
     12 “>=”
     13 <string literal>
     14 <
     … etc. …

  4. Lexical Analysis For each token type, give a description. This can be either a literal string (e.g., “<=” or “while” to describe an operator or reserved word), or else a <rule> (e.g., the rule <unsigned int> might stand for “a sequence of one or more digits”; the rule <identifier> might stand for “a letter followed by a sequence of zero or more letters or digits”).
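
One way to make the two kinds of description concrete is to write the literal strings as plain strings and the rules as regular expressions. The sketch below is only an illustration: the choice of Python's `re` module and the names `LITERALS`, `UNSIGNED_INT`, `IDENTIFIER`, and `matches_rule` are assumptions, not part of the slides.

```python
import re

# Literal descriptions: the token is exactly this string.
LITERALS = {"<=", "while"}

# Rule descriptions, written here as regular expressions (an assumption of this sketch).
UNSIGNED_INT = re.compile(r"[0-9]+")              # a sequence of one or more digits
IDENTIFIER = re.compile(r"[A-Za-z][A-Za-z0-9]*")  # a letter, then zero or more letters or digits


def matches_rule(rule, text):
    """Return True if the whole string satisfies the rule."""
    return rule.fullmatch(text) is not None


print(matches_rule(UNSIGNED_INT, "10"))  # digits only
print(matches_rule(IDENTIFIER, "x1"))    # letter followed by a digit
print(matches_rule(IDENTIFIER, "1x"))    # starts with a digit, so not an identifier
```

Note that “while” also satisfies the <identifier> rule, which is why the token table marks identifiers as “not reserved”: a scanner has to check reserved words separately.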

  5. Lexical Analysis Lexical analysis produces a “token stream” in which the program is reduced to a sequence of token types, each with its identifying number and the actual string (in the program) corresponding to it.

  6. Lexical Analysis
     Program:
     // see if 3 occurs
     while x <= 10
     a = x+1
     while (a == 3)
     found = 1
     a = f(x)
     Stream of Tokens:
     6, “while”   7, “x”   11, “<=”   0, “10”
     7, “a”   9, “=”   7, “x”   3, “+”   0, “1”
     6, “while”   1, “(”   7, “a”   10, “==”   0, “3”   2, “)”
     7, “found”   9, “=”   0, “1”
     7, “a”   9, “=”   7, “f”   1, “(”   7, “x”   2, “)”
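
The program-to-token-stream step above can be sketched as a small scanner. This is a minimal illustration, not the course's implementation: the token numbers follow the table on slide 3, but `TOKEN_SPEC`, `RESERVED`, `tokenize`, and the rule that `//` starts a comment are assumptions made for the example.

```python
import re

# (token number, pattern) pairs; two-character operators come before "=" so
# that "==" is not scanned as two separate "=" tokens.
TOKEN_SPEC = [
    (10, re.compile(r"==")), (11, re.compile(r"<=")), (12, re.compile(r">=")),
    (1, re.compile(r"\(")), (2, re.compile(r"\)")),
    (3, re.compile(r"\+")), (4, re.compile(r"-")),
    (8, re.compile(r";")), (9, re.compile(r"=")),
    (0, re.compile(r"[0-9]+")),               # <unsigned int>
    (7, re.compile(r"[A-Za-z][A-Za-z0-9]*")),  # <identifier>
]
RESERVED = {"for": 5, "while": 6}


def tokenize(source):
    """Return the token stream as a list of (number, string) pairs."""
    tokens = []
    pos = 0
    while pos < len(source):
        if source[pos].isspace():
            pos += 1
            continue
        if source.startswith("//", pos):
            # Assumed comment syntax: skip to the end of the line.
            end = source.find("\n", pos)
            pos = len(source) if end == -1 else end
            continue
        for number, pattern in TOKEN_SPEC:
            match = pattern.match(source, pos)
            if match:
                text = match.group()
                if number == 7 and text in RESERVED:
                    number = RESERVED[text]  # reserved word, not an identifier
                tokens.append((number, text))
                pos = match.end()
                break
        else:
            raise ValueError(f"unexpected character {source[pos]!r} at position {pos}")
    return tokens


PROGRAM = """// see if 3 occurs
while x <= 10
a = x+1
while (a == 3)
found = 1
a = f(x)
"""

for number, text in tokenize(PROGRAM):
    print(number, text)
```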

  7. Syntactic Analysis The syntax of a language is described by a “grammar” that specifies the legal combinations of tokens. Grammars are often specified in BNF notation (“Backus Naur Form”):
     <item1> ::= valid replacements for <item1>
     <item2> ::= valid replacements for <item2>
     ... etc. ...

  8. Syntactic Analysis This is a simplified version of example 2.4, page 46 in Scott. Example: an expression can be either a simple variable identifier; an integer; or an expression, followed by an operator, followed by another expression.
     Classic BNF notation:
     <expr> ::= <id> | <int> | <expr> <op> <expr>
     Alternative notations:
     expr → id | int | expr op expr   (the book uses this notation, but as three separate rules)
     expr ::= id | int | expr { op expr }*
     The symbol “|” means “or”. The “{...}*” means “zero or more repetitions of the items in {...}”.
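
The expression rule above can be checked mechanically. A simple left-to-right recognizer cannot follow the left-recursive form directly, so the sketch below uses the equivalent shape `operand { op operand }*`, where an operand is an id or an int. The function name `is_expr` and the use of the strings "id", "int", and "op" as token types are assumptions for illustration.

```python
def is_expr(token_types):
    """Recognize  operand { op operand }*  where operand is "id" or "int".

    token_types is a list of token-type names such as ["id", "op", "int"].
    Returns True only if the whole list forms one expression.
    """
    def is_operand(index):
        return index < len(token_types) and token_types[index] in ("id", "int")

    if not is_operand(0):
        return False
    index = 1
    # Zero or more repetitions of: an operator followed by an operand.
    while index < len(token_types) and token_types[index] == "op":
        if not is_operand(index + 1):
            return False
        index += 2
    return index == len(token_types)


print(is_expr(["id"]))
print(is_expr(["id", "op", "int", "op", "id"]))
print(is_expr(["id", "op"]))
```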

  9. Grammars (“Context-free grammars”)
     ● Collection of VARIABLES (things that can be replaced by other things), also called NON-TERMINALS.
     ● Collection of TERMINALS (“constants”, strings that can’t be replaced)
     ● One special variable called the START SYMBOL.
     ● Collection of RULES, also called PRODUCTIONS.
     variable → rule1 | rule2 | rule3 | …
     (You can also write each rule on a separate line--our book does this)

  10. In-Class Exercise Here is a grammar. A, B, and C are non-terminals; 0, 1, and 2 are terminals. The start symbol is A, and the rules are:
      A → 0A | 1C | 2B | 0
      B → 0B | 1A | 2C | 1
      C → 0C | 1B | 2A | 2

  11. In-Class Exercise
      A → 0A | 1C | 2B | 0
      B → 0B | 1A | 2C | 1
      C → 0C | 1B | 2A | 2
      2011020 can be parsed (done at the board)!

  12. In-Class Exercise
      A → 0A | 1C | 2B | 0
      B → 0B | 1A | 2C | 1
      C → 0C | 1B | 2A | 2
      Can 1112202 be parsed? (Explain at board)
      Can 00102 be parsed? (Explain at board)
      Can 2120 be parsed? (Explain at board)
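
For checking board answers, the exercise grammar can be written down as data and tried mechanically. The sketch below is one possible encoding, not something from the slides: `RULES`, `START`, and `derives` are hypothetical names, and it relies on every production here being a single terminal optionally followed by a single variable.

```python
# Each variable maps to its list of productions (terminal, then optional variable).
RULES = {
    "A": ["0A", "1C", "2B", "0"],
    "B": ["0B", "1A", "2C", "1"],
    "C": ["0C", "1B", "2A", "2"],
}
START = "A"


def derives(variable, text):
    """Return True if `variable` can be rewritten into exactly `text`."""
    for production in RULES[variable]:
        terminal, next_variable = production[0], production[1:]
        if not text.startswith(terminal):
            continue
        if next_variable == "":
            # A rule such as A -> 0 must consume the last remaining character.
            if text == terminal:
                return True
        elif derives(next_variable, text[1:]):
            return True
    return False


for candidate in ["2011020", "1112202", "00102", "2120"]:
    print(candidate, derives(START, candidate))
```

Because each variable has exactly one rule per leading digit that continues the string, the search never needs to backtrack far; tracing the calls by hand reproduces the derivation done at the board.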

  13. Syntactic Analysis The “{...}+” means “one or more repetitions of the items in {...}”.
      prog → { statement }+
      statement → assignment | loop | io
      assignment → id = expression
      loop → while ( expression )
      In this example, “=”, “while”, “(”, and “)” are terminals.
      “A program is one or more statements.”
      “A statement is an assignment, a loop, or an input/output command.”
      “An assignment is an identifier, followed by “=”, followed by an expression.”

  14. Syntactic Analysis The process of verifying that a token stream represents a valid application of the rules is called parsing. Using the BNF rules we can construct a parse tree:
      <prog>
        <statement>
          <assignment>
            <id> = <expr>
              … etc. …
        <prog>
          <statement>
            <assignment>
              … etc. …
          <prog>
            <statement>
              … etc. …
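
Building a parse tree from a token stream can be sketched with one function per rule (recursive descent). The code below is a rough illustration under several assumptions: tokens are plain strings, the `io` alternative is left out because the slides do not define it, `expression` is simplified to `operand { op operand }*`, and all names (`ParseError`, `parse_prog`, and so on) are hypothetical. A token stream that breaks the rules raises `ParseError`, which corresponds to a failed parse.

```python
RESERVED = {"for", "while"}
OPERATORS = {"+", "-", "==", "<=", ">="}


class ParseError(Exception):
    """Raised when the token stream does not follow the grammar."""


def is_id(token):
    return token.isalnum() and token[0].isalpha() and token not in RESERVED


def expect(tokens, pos, literal):
    """Check that tokens[pos] is the given terminal and step past it."""
    if pos >= len(tokens) or tokens[pos] != literal:
        raise ParseError(f"expected {literal!r} at token {pos}")
    return pos + 1


def parse_operand(tokens, pos):
    if pos < len(tokens):
        if tokens[pos].isdigit():
            return ("int", tokens[pos]), pos + 1
        if is_id(tokens[pos]):
            return ("id", tokens[pos]), pos + 1
    raise ParseError(f"expected an identifier or integer at token {pos}")


def parse_expression(tokens, pos):
    # Simplified expression: operand { op operand }*
    operand, pos = parse_operand(tokens, pos)
    parts = [operand]
    while pos < len(tokens) and tokens[pos] in OPERATORS:
        operator = tokens[pos]
        operand, pos = parse_operand(tokens, pos + 1)
        parts += [operator, operand]
    return ("expr", *parts), pos


def parse_assignment(tokens, pos):
    # assignment -> id = expression
    if not is_id(tokens[pos]):
        raise ParseError(f"expected an identifier at token {pos}")
    name = tokens[pos]
    pos = expect(tokens, pos + 1, "=")
    expr, pos = parse_expression(tokens, pos)
    return ("assignment", ("id", name), "=", expr), pos


def parse_loop(tokens, pos):
    # loop -> while ( expression )
    pos = expect(tokens, pos, "while")
    pos = expect(tokens, pos, "(")
    expr, pos = parse_expression(tokens, pos)
    pos = expect(tokens, pos, ")")
    return ("loop", "while", "(", expr, ")"), pos


def parse_statement(tokens, pos):
    # statement -> assignment | loop   (io omitted in this sketch)
    if tokens[pos] == "while":
        node, pos = parse_loop(tokens, pos)
    else:
        node, pos = parse_assignment(tokens, pos)
    return ("statement", node), pos


def parse_prog(tokens):
    # prog -> { statement }+
    pos, statements = 0, []
    while pos < len(tokens):
        node, pos = parse_statement(tokens, pos)
        statements.append(node)
    if not statements:
        raise ParseError("a program needs at least one statement")
    return ("prog", *statements)


print(parse_prog(["a", "=", "x", "+", "1", "while", "(", "a", "==", "3", ")"]))
```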

  15. Sample Parse Tree (portion)

  16. A Failed Parse

  17. Grammar for Java, version 8 Overview of notation used: https://docs.oracle.com/javase/specs/jls/se8/html/jls-2.html The full syntax grammar: https://docs.oracle.com/javase/specs/jls/se8/html/jls-19.html
