What is a Compiler?




  1. 8/27/2012

     CS 1622: Introduction to Compiler Design
     Jonathan Misurda
     jmisurda@cs.pitt.edu

     What is a Compiler?
     A compiler translates a source specification into a target specification. Traditionally, we consider compilers that take a source language and produce target (machine) code. However, there can be many different types of targets.

     Source Language      Target Language
     C/C++            →   Machine code
     Java             →   Java Bytecode
     Perl             →   Perl Bytecode
     Java Bytecode    →   Machine code

     Compilers vs. Interpreters
     Compilation: To translate a source program in one language into an executable program in another language and produce results while executing the new program
     • Examples: C, C++, FORTRAN
     Interpretation: To read a source program and produce the results while understanding that program
     • Examples: BASIC, LISP
     Hybrid: Try to use both (such as in Java)
     1. Translate source code to bytecode
     2. Execute by interpretation on a JVM, or by compilation using a JIT

     C Compiler
     • gcc
     C source (.c) → cpp (Preprocessor) → Preprocessed source → cc1 (Compiler) → Object files (.o) → ld (Linker) → Executable

     Java Compiler
     Java source (.java) → javac (Compiler) → Class files (.class)
     Class files (.class) → JVM (Virtual Machine)

     Compilation
     Source → Compiler → Executable
     Data → Executable → Output
     Pros:
     • Fast execution
     • Can exploit machine architecture features
     Cons:
     • Complexity
     • Must be done before execution
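The difference between interpretation and the hybrid approach above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the tuple-based expression tree, the `PUSH`/`ADD`/`MUL` instruction names, and the functions `interpret`, `compile_expr`, and `run` are all hypothetical and do not come from the slides or from any real JVM. Only `+` and `*` are handled.

```python
def interpret(node):
    """Interpretation: produce the result while walking the source form."""
    if isinstance(node, int):
        return node
    op, left, right = node
    a, b = interpret(left), interpret(right)
    return a + b if op == "+" else a * b


def compile_expr(node, code=None):
    """Compilation: translate the tree into stack bytecode; nothing runs yet."""
    code = [] if code is None else code
    if isinstance(node, int):
        code.append(("PUSH", node))
    else:
        op, left, right = node
        compile_expr(left, code)
        compile_expr(right, code)
        code.append(("ADD",) if op == "+" else ("MUL",))
    return code


def run(code):
    """A tiny stack-based virtual machine that executes the bytecode."""
    stack = []
    for instr in code:
        if instr[0] == "PUSH":
            stack.append(instr[1])
        else:
            b, a = stack.pop(), stack.pop()
            stack.append(a + b if instr[0] == "ADD" else a * b)
    return stack.pop()


tree = ("+", 1, ("*", 2, 3))      # hypothetical source form of 1 + 2 * 3
bytecode = compile_expr(tree)     # translate once
result = run(bytecode)            # execute the translated program
```

Here `interpret(tree)` and `run(bytecode)` compute the same value; the hybrid route pays the translation cost once and can then execute the bytecode repeatedly.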

  2. Interpreter
     Source + Data → Interpreter → Output
     Pros:
     • Machine independent
     • Easy to debug
     • Flexible to modify
     Cons:
     • Time overhead
     • Space overhead

     Phases of Compilation
     Source Code → Lexical Analyzer → Token Sequence → Syntax Analyzer → Syntax Tree → Semantic Analyzer → Intermediate Representation → Code Optimizer → Optimized IR → Code Generator → Assembly/Machine Code

     Phases
     Lexical Analysis
     • Recognize token: smallest stand-alone unit of meaningful information
     • Analyze input (strings of characters) from source
     • Scan from left to right
     • Report errors
     Syntax Analysis
     • Group tokens into hierarchical groups
     • Differentiate if-statement, while-statement, ...
     • Report errors
     Semantic Analysis
     • Determine the meaning using the structure
     • Checks are performed to ensure components fit together meaningfully
     • Limited analysis to catch inconsistencies, e.g., type checking
     • Put semantic meaningful items in the structure
     • Produce IR (easier to generate optimized machine code from IRs)

     Phases
     Code optimization
     • Modify program representation so that program:
       • Runs faster
       • Uses less memory
       • Uses less power
       • In general, reduce the consumed resources
     Code generation
     • Produce target code
     • Instruction selection
     • Memory allocation
     • Resource allocation: registers, processors, etc.

     Lexing
     Input: Source program
     Output: Sequence of tokens
     Example:
       if(x > 3) {
         y++;
       }
     becomes
       IF ( ID('x') > NUM('3') ) { ID('y') INCREMENT ; }

     Parsing
     Input: Sequence of tokens
     Output: Abstract Syntax Tree
     Example:
       IF ( ID('x') > NUM('3') ) { ID('y') INCREMENT ; }
     becomes
       if-statement
         cond_expr
           >
             x
             3
         stmt_list
           post-inc
             y
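The lexing and parsing examples above can be reproduced with a small hand-written scanner and recursive-descent parser. This is a minimal sketch, assuming Python: the token kinds follow the slide, but the regular expressions, the function names `lex` and `parse_if`, and the tuple shape of the tree are assumptions made for illustration. The parser accepts only the one statement form shown on the slide.

```python
import re

# Token kinds follow the slide's example; the patterns are assumptions.
TOKEN_SPEC = [
    ("IF",        r"if\b"),
    ("ID",        r"[A-Za-z_]\w*"),
    ("NUM",       r"\d+"),
    ("INCREMENT", r"\+\+"),
    ("PUNCT",     r"[(){};>]"),
    ("SKIP",      r"\s+"),
]
MASTER = re.compile("|".join(f"(?P<{name}>{pat})" for name, pat in TOKEN_SPEC))


def lex(source):
    """Scan left to right and return a list of (kind, text) tokens."""
    tokens, pos = [], 0
    while pos < len(source):
        m = MASTER.match(source, pos)
        if m is None:
            raise SyntaxError(f"unexpected character {source[pos]!r} at {pos}")
        kind = m.lastgroup
        if kind != "SKIP":
            # Punctuation tokens use their own text as the kind, e.g. "(".
            tokens.append((m.group() if kind == "PUNCT" else kind, m.group()))
        pos = m.end()
    return tokens


def parse_if(tokens):
    """Parse: IF ( ID > NUM ) { ID INCREMENT ; } into a nested-tuple tree."""
    pos = 0

    def expect(kind):
        nonlocal pos
        if pos >= len(tokens) or tokens[pos][0] != kind:
            found = tokens[pos] if pos < len(tokens) else "end of input"
            raise SyntaxError(f"expected {kind}, found {found}")
        text = tokens[pos][1]
        pos += 1
        return text

    expect("IF"); expect("(")
    left = expect("ID"); expect(">"); right = expect("NUM")
    expect(")"); expect("{")
    target = expect("ID"); expect("INCREMENT"); expect(";")
    expect("}")
    return ("if-statement",
            ("cond_expr", (">", left, int(right))),
            ("stmt_list", [("post-inc", target)]))


tokens = lex("if(x > 3) { y++; }")
tree = parse_if(tokens)
```

Both functions report errors by raising `SyntaxError`, which mirrors the "Report errors" duty listed for the lexical and syntax analysis phases.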

  3. Code Generation
     Input: Intermediate representation
     Output: Target code
     Example:
       if-statement
         cond_expr
           >
             x
             3
         stmt_list
           post-inc
             y
     becomes
           slti $t1, 3, $s0
           beq $t1, $zero, L1
           addi $s1, $s1, 1
       L1:

     Data Structures for Compilation
     Abstract Syntax Tree
     • Stores the information from the parse and lexing phases
     • Walk the tree to produce IR or target code
     Symbol Table
     • Collect and maintain information about identifiers
       • Attributes: type, address, scope, size
     • Used by most compiler passes and phases
       • Some phases add information:
         • Lexing, parsing, semantic analysis
       • Some phases use information:
         • Semantic analysis, code optimization, code generation
     • Debuggers also can make use of a symbol table
       • gcc -g keeps a version of the symbol table in the object code

     Three-pass Compiler
     Source Code → Front End → IR → Middle → IR → Back End → Machine Code
     (each stage may report an Error)
     Passes: number of times through a program representation
     • 1-pass, 2-pass, multi-pass compilation
     • Language becomes more complex → more passes
     Phases: conceptual and sometimes physical stages
     • Symbol table coordinates information between phases
     • Phases are not completely separate
       • Semantic phase may do things that syntax phase should do
       • Interaction is possible

     Compiler Construction
     Automatic Generators:
     • Lexical Analysis: Lex, Flex, JLex, JFlex
     • Syntax Analysis: Yacc, Bison, JavaCUP, JavaCC
     • Semantic Analysis
     • Code Optimization
     • Code Generation
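Walking the tree with the help of a symbol table, as described above, might look like the following Python sketch. Everything named here is hypothetical: the `Symbol` class, the `gen_if` function, and the nested-tuple tree shape are assumptions for illustration. The register assignments (`x` in `$s0`, `y` in `$s1`) are inferred from the slide's example. Note that this sketch loads the constant into `$t0` and uses `slt`, so its output differs from the slide's `slti` line; that instruction choice is an assumption of the sketch, not the slide's method.

```python
from dataclasses import dataclass


@dataclass
class Symbol:
    """One symbol-table entry; the attribute set is a hypothetical subset."""
    name: str
    type: str       # e.g. "int"
    register: str   # stands in for the "address" attribute


def gen_if(ast, symtab, label="L1"):
    """Walk an if-statement tree and return MIPS-like assembly lines."""
    _, (_, (op, left, right)), (_, stmts) = ast
    if op != ">":
        raise NotImplementedError(op)
    code = [
        f"li $t0, {right}",                         # load the constant
        f"slt $t1, $t0, {symtab[left].register}",   # $t1 = (right < left)
        f"beq $t1, $zero, {label}",                 # skip body when false
    ]
    for kind, name in stmts:
        if kind != "post-inc":
            raise NotImplementedError(kind)
        reg = symtab[name].register                 # symbol-table lookup
        code.append(f"addi {reg}, {reg}, 1")
    code.append(f"{label}:")
    return code


ast = ("if-statement",
       ("cond_expr", (">", "x", 3)),
       ("stmt_list", [("post-inc", "y")]))
symtab = {
    "x": Symbol("x", "int", "$s0"),
    "y": Symbol("y", "int", "$s1"),
}
lines = gen_if(ast, symtab)
```

The code generator never decides where `x` or `y` live; it only reads that information from the symbol table, which is the coordination role the slides assign to it.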
