CS553 Lecture Scanning and Parsing 2
Scanning and Parsing
Announcements
– Project 1 is 5% of total grade
– Project 2 is 10% of total grade
– Project 3 is 15% of total grade
– Project 4 is 10% of total grade

Today
– Outline of planned topics for course
– Overall structure of a compiler
– Lexical analysis (scanning)
– Syntactic analysis (parsing)
Structure of a Typical Interpreter
[Figure: phases of a typical compiler and interpreter.
Analysis: character stream → lexical analysis → tokens (“words”) → syntactic analysis → AST (“sentences”) → semantic analysis → annotated AST → interpreter.
Synthesis (compiler): IR code generation → IR → optimization → IR → code generation → target language.]
Lexical Analysis (Scanning)
Break character stream into tokens (“words”)
– Tokens, lexemes, and patterns
– Lexical analyzers are usually automatically generated from patterns (regular expressions) (e.g., lex)
Examples

pattern                      lexeme(s)           token
const                        const               const
if                           if                  if
< | <= | = | != | ...        <, <=, =, !=, ...   relation
[a-zA-Z_]+[a-zA-Z0-9_]*      foo, index          identifier
[0-9]+ | [0-9]*\.[0-9]+      3.14159, 570        number
“.*”                         “hi”, “mom”         string

const pi := 3.14159 ⇒ const, identifier(pi), assign, number(3.14159)
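The example patterns above can be tried out with a small hand-written scanner. This is only a sketch, not how lex works internally: it tries the patterns in a fixed order and takes the first match, whereas lex picks the longest match. The function name `tokenize`, the `skip` and `assign` entries, and the `\b` keyword guards are assumptions added for illustration. The decimal point is escaped, and the string pattern uses `[^"]*` instead of `.*` so that one string literal does not swallow the next.

```python
import re

# (token kind, pattern) pairs. Order matters in this sketch: keywords are
# tried before identifiers, and the decimal form of a number before the
# integer form, because the first pattern that matches wins.
TOKEN_SPEC = [
    ("number",     r"[0-9]*\.[0-9]+|[0-9]+"),
    ("string",     r'"[^"]*"'),
    ("const",      r"const\b"),
    ("if",         r"if\b"),
    ("assign",     r":="),
    ("relation",   r"<=|!=|<|="),
    ("identifier", r"[a-zA-Z_][a-zA-Z0-9_]*"),
    ("skip",       r"\s+"),          # whitespace produces no token
]
COMPILED = [(kind, re.compile(pattern)) for kind, pattern in TOKEN_SPEC]


def tokenize(text):
    """Return a list of (token kind, lexeme) pairs for the input text."""
    tokens = []
    pos = 0
    while pos < len(text):
        for kind, regex in COMPILED:
            match = regex.match(text, pos)
            if match:
                if kind != "skip":
                    tokens.append((kind, match.group()))
                pos = match.end()
                break
        else:
            # No pattern matched at this position.
            raise SyntaxError(f"unexpected character {text[pos]!r} at offset {pos}")
    return tokens


print(tokenize("const pi := 3.14159"))
# [('const', 'const'), ('identifier', 'pi'), ('assign', ':='), ('number', '3.14159')]
```

Each pair holds the token kind and the lexeme that matched it, which mirrors the token and lexeme(s) columns of the examples.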
Interaction Between Scanning and Parsing
[Figure: character stream → Lexical analyzer → token → Parser → parse tree or AST. The parser requests tokens by calling lexer.next() and lexer.peek().]
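A minimal sketch of this interaction, under stated assumptions: the slide names only `lexer.next()` and `lexer.peek()`, so the `Lexer` class body, the `eof` token, the helper `expect`, and the tiny grammar (`decl -> 'const' identifier ':=' number`) are all hypothetical and chosen just to show the parser pulling tokens on demand.

```python
import re

# One regex with a named group per token class (illustrative subset).
TOKEN_RE = re.compile(
    r"\s*(?:(?P<number>[0-9]*\.[0-9]+|[0-9]+)"
    r"|(?P<assign>:=)"
    r"|(?P<word>[a-zA-Z_][a-zA-Z0-9_]*))"
)
KEYWORDS = {"const", "if"}


class Lexer:
    """Scans tokens lazily; the parser drives it through peek() and next()."""

    def __init__(self, text):
        self.text = text.rstrip()
        self.pos = 0
        self.buffered = None  # token already scanned by peek(), if any

    def _scan(self):
        if self.pos >= len(self.text):
            return ("eof", "")
        match = TOKEN_RE.match(self.text, self.pos)
        if match is None:
            raise SyntaxError(f"bad input near offset {self.pos}")
        self.pos = match.end()
        kind, lexeme = match.lastgroup, match.group(match.lastgroup)
        if kind == "word":
            kind = lexeme if lexeme in KEYWORDS else "identifier"
        return (kind, lexeme)

    def peek(self):
        """Return the next token without consuming it."""
        if self.buffered is None:
            self.buffered = self._scan()
        return self.buffered

    def next(self):
        """Return the next token and consume it."""
        token = self.peek()
        self.buffered = None
        return token


def expect(lexer, kind):
    """Consume one token of the given kind and return its lexeme."""
    token = lexer.next()
    if token[0] != kind:
        raise SyntaxError(f"expected {kind}, got {token[0]}")
    return token[1]


def parse_const_decl(lexer):
    # Hypothetical rule: decl -> 'const' identifier ':=' number
    expect(lexer, "const")
    name = expect(lexer, "identifier")
    expect(lexer, "assign")
    value = expect(lexer, "number")
    return ("const_decl", name, float(value))


def parse_program(lexer):
    """Parse declarations until end of input; returns a list of AST nodes."""
    decls = []
    while lexer.peek()[0] != "eof":   # peek() decides, next() consumes
        decls.append(parse_const_decl(lexer))
    return decls


print(parse_program(Lexer("const pi := 3.14159 const n := 570")))
# [('const_decl', 'pi', 3.14159), ('const_decl', 'n', 570.0)]
```

Here `peek()` lets the parser decide what to do next without committing, while `next()` consumes the token. The lexer never builds the whole token list up front.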