
Design and Implementation of a TriCore Backend for the LLVM Compiler - PowerPoint PPT Presentation

Design and Implementation of a TriCore Backend for the LLVM Compiler Framework. Studienarbeit, Christoph Erhardt, Friedrich-Alexander-Universität Erlangen-Nürnberg, November 20, 2009.


  1. Design and Implementation of a TriCore Backend for the LLVM Compiler Framework
     Studienarbeit, Christoph Erhardt
     Friedrich-Alexander-Universität Erlangen-Nürnberg
     November 20, 2009
     A TriCore Backend for LLVM (November 20, 2009)

  2. Overview
     - The TriCore Processor Architecture
     - The LLVM Compiler Infrastructure
     - Design and Implementation of the Backend
     - Evaluation & Conclusion

  3. Motivation: What do we need it for?
     - TriCore chips are omnipresent around here:
       - Quadcopter
       - High striker
       - Carolo Cup
       - ...
     - The RTSC (Real-Time Systems Compiler) project:
       - Operating system aware compiler for real-time applications
       - Processes atomic basic blocks
       - Based on LLVM

  4. Motivation: What do we need it for?
     - TriCore chips are omnipresent around here:
       - Quadcopter
       - High striker
       - Carolo Cup
       - ...
     - The RTSC (Real-Time Systems Compiler) project:
       - Operating system aware compiler for real-time applications
       - Processes atomic basic blocks
       - Based on LLVM
     - RTSC should be able to generate TriCore machine code

  5. The TriCore Processor Architecture: Overview
     - Three-in-one architecture:
       - Real-time microcontroller unit
       - DSP
       - Superscalar RISC processor

  6. The TriCore Processor Architecture: Overview
     - Three-in-one architecture:
       - Real-time microcontroller unit
       - DSP
       - Superscalar RISC processor
     - Basic features:
       - Load/store architecture
       - 32-bit data, address, and instruction words
       - Some special 16-bit instruction words for higher code density
       - Little-endian byte order
       - 16 data + 16 address registers

  7. Peculiarities: Some things that TriCore handles in an unusual way
     - Strict distinction between data and address registers:
       - Also reflected in the calling conventions
       - Serious problem for the compiler!
     - Data registers are also used for floating-point operands
     - Special DSP-oriented instructions and addressing modes
     - Task/context model:
       - Automatic context save/restore upon call/return
       - Context save areas (linked lists managed by hardware)
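
The task/context model on the slide above can be pictured with a small software model. This is only a sketch under stated assumptions: it assumes one list of free context save areas and one list of saved contexts, linked through a per-area link word. The type `CsaPool`, its members, and the pool size are hypothetical names for illustration; the slides do not describe the hardware mechanism in this detail.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <stdexcept>

// Software model of a pool of context save areas (CSAs).
// Index 0 is reserved as the "null" link. All names are hypothetical.
struct CsaPool {
    static constexpr std::size_t kCount = 8;    // assumed pool size
    std::array<std::uint32_t, kCount> link{};   // link[i]: next CSA in the same list
    std::uint32_t freeHead = 0;                 // head of the free list
    std::uint32_t prevHead = 0;                 // head of the saved-context list

    CsaPool() {
        // Chain CSAs 1..kCount-1 into the free list; the last one links to 0.
        for (std::uint32_t i = 1; i + 1 < kCount; ++i) link[i] = i + 1;
        link[kCount - 1] = 0;
        freeHead = 1;
    }

    // Models a call: take a CSA off the free list and push it onto the saved list.
    std::uint32_t onCall() {
        if (freeHead == 0) throw std::runtime_error("free context list depleted");
        const std::uint32_t csa = freeHead;
        freeHead = link[csa];
        link[csa] = prevHead;
        prevHead = csa;
        return csa;
    }

    // Models a return: pop the most recent CSA and hand it back to the free list.
    std::uint32_t onReturn() {
        if (prevHead == 0) throw std::runtime_error("no saved context");
        const std::uint32_t csa = prevHead;
        prevHead = link[csa];
        link[csa] = freeHead;
        freeHead = csa;
        return csa;
    }
};
```

In this model, nested calls push areas onto the saved list in call order and returns release them in reverse order, so the compiler does not have to emit explicit save/restore code for the registers covered by the automatic context switch.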

  8. The LLVM Compiler Infrastructure: Overview
     - Open-source compiler infrastructure project started in 2000
     - Main sponsor: Apple Inc.
     - Written in C++

  9. Basic Architecture: The classical three tiers of a compiler
     [Figure: C source → Clang frontend, Fortran source → LLVM-GCC frontend, ... → LLVM assembly/bitcode → Optimizer → LLVM assembly/bitcode → x86 code generator → x86 assembly, SPARC code generator → SPARC assembly, ...]
     - Language-specific frontends
     - Optimizer: generic IR, analysis/transformation passes
     - Several backends for machine code generation

  10. Unique Characteristics: What does LLVM have that others don't?
      - Not merely a compiler, but a compiler infrastructure:
        - Static compilation
        - Just-in-time compilation
      - Strictly modular, library-based architecture:
        - Easily extendible
        - Possibility to incorporate parts of LLVM in other projects
        - BSD-style licence
      - Produces highly optimized machine code in an efficient way:
        - Memory-efficient
        - Time-efficient

  11. Design and Implementation of the Backend: Overview
      - Extensive generic code generation framework:
        - Makes work a lot easier
        - ... but also poses some problems in specific cases
      - Fixed class hierarchy
      - Many target-independent algorithms:
        - Instruction scheduling
        - Register colouring
        - ...
      - Code generation process executed by a series of passes

  12. Code Generation Process
      [Figure: pass pipeline, with the representation handed from one stage to the next]
      LLVM code (SSA form)
      → DAG Lowering → DAGs (not legalized)
      → DAG Legalization → DAGs (legalized)
      → Instruction Selection → DAGs (native instructions)
      → Scheduling → list (SSA form)
      → SSA-Based Optimization → list (SSA form)
      → Register Allocation → list (with physical registers)
      → Post-Allocation Passes → list (with physical registers)
      → Pro-/Epilogue Insertion → list (with resolved stack references)
      → Peephole Optimization → list (with resolved stack references)
      → Assembly Printing → text (assembly code)
      Backend classes named in the figure: TriCoreTargetLowering, TriCoreDAGToDAGISel, TriCoreInstrInfo, TriCoreRegisterInfo, TriCoreLoadStoreOpt, TriCoreVirtInstrResolver, TriCoreAsmPrinter

  13. TableGen: One tool to rule them all...
      - Problem:
        - Backend contains large portions of descriptive data
        - C++ obviously not suitable

  14. TableGen: One tool to rule them all...
      - Problem:
        - Backend contains large portions of descriptive data
        - C++ obviously not suitable
      - TableGen:
        - Language for domain-specific modelling
        - Similar to object-oriented approach:
          - Classes, records (objects), attributes
          - Inheritance
        - Definition files (.td) preprocessed by tblgen tool → auto-generation of C++ code
        - Used for description of:
          - Subtargets, registers
          - Calling conventions
          - Instruction set
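
To make the idea of "descriptive data turned into C++" more concrete, here is a minimal C++17 sketch of what a table derived from instruction records could look like. It is not actual tblgen output: `InstrDesc`, `makeRr2`, `kInstrTable`, and `findInstr` are hypothetical names, and the single entry reuses the `MULrr2` definition (opcode `0x0a`) that appears on the instruction selection slides of this deck.

```cpp
#include <cstdint>
#include <string_view>

// Hypothetical record type; real tblgen-generated tables look different.
struct InstrDesc {
    std::string_view name;       // record name
    std::uint8_t opcode;         // primary opcode
    unsigned numOuts;            // number of result operands
    unsigned numIns;             // number of input operands
    std::string_view asmString;  // assembly template
};

// Plays the role of a TableGen class: every record built through it
// inherits "one result, two inputs" without repeating those attributes.
constexpr InstrDesc makeRr2(std::string_view name, std::uint8_t opcode,
                            std::string_view asmString) {
    return InstrDesc{name, opcode, 1, 2, asmString};
}

// The table itself: one record taken from the slides.
constexpr InstrDesc kInstrTable[] = {
    makeRr2("MULrr2", 0x0a, "mul\t$c, $a, $b"),
};

// Linear lookup by record name; returns nullptr if no record matches.
constexpr const InstrDesc* findInstr(std::string_view name) {
    for (const InstrDesc& d : kInstrTable)
        if (d.name == name) return &d;
    return nullptr;
}
```

The point of the sketch is the split the slide describes: the shared attributes live in one place (the class-like helper), each instruction is a short record, and the consuming C++ code only reads the resulting table.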

  15. SelectionDAG Construction: Largely automated
      - Directed acyclic graph
      - Per basic block
      - Nodes: instructions
      - Edges:
        - Data dependencies
        - Control flow dependencies
      - Example:
            %mul = mul i32 %a, %a
            %mul4 = mul i32 %b, %b
            %add = add nsw i32 %mul4, %mul
            ret i32 %add

  16. SelectionDAG Construction: Largely automated
      - Directed acyclic graph
      - Per basic block
      - Nodes: instructions
      - Edges:
        - Data dependencies
        - Control flow dependencies
      - Example:
            %mul = mul i32 %a, %a
            %mul4 = mul i32 %b, %b
            %add = add nsw i32 %mul4, %mul
            ret i32 %add
      [Figure: SelectionDAG "isel input for euclidSquare:entry": EntryToken; two CopyFromReg nodes reading Register %reg1024 and Register %reg1025; two mul nodes feeding an add; CopyToReg into Register %D2; TriCoreISD::RET_FLAG; GraphRoot]
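
As a rough illustration of the data-dependency part of such a graph, the following sketch builds a tiny DAG for the `euclidSquare` example and evaluates it. It is a simplified model, not LLVM's `SelectionDAG` API: `Op`, `Node`, `Dag`, `evaluate`, and `buildEuclidSquare` are hypothetical names, and control-flow (chain) edges are left out.

```cpp
#include <cstddef>
#include <cstdint>
#include <memory>
#include <utility>
#include <vector>

enum class Op { Input, Mul, Add };

struct Node {
    Op op;
    std::int32_t value = 0;             // used by Input nodes only
    std::vector<const Node*> operands;  // data-dependency edges
};

// Owns all nodes of one basic block.
class Dag {
public:
    const Node* input(std::int32_t v) { return add(Node{Op::Input, v, {}}); }
    const Node* binary(Op op, const Node* l, const Node* r) {
        return add(Node{op, 0, {l, r}});
    }
    std::size_t size() const { return nodes_.size(); }

private:
    const Node* add(Node n) {
        nodes_.push_back(std::make_unique<Node>(std::move(n)));
        return nodes_.back().get();
    }
    std::vector<std::unique_ptr<Node>> nodes_;
};

// Evaluates a node by recursively evaluating its operands.
std::int32_t evaluate(const Node* n) {
    switch (n->op) {
        case Op::Input: return n->value;
        case Op::Mul:   return evaluate(n->operands[0]) * evaluate(n->operands[1]);
        case Op::Add:   return evaluate(n->operands[0]) + evaluate(n->operands[1]);
    }
    return 0;  // not reached for the three opcodes above
}

// Mirrors the example: %add = add (mul %b, %b), (mul %a, %a).
const Node* buildEuclidSquare(Dag& dag, std::int32_t a, std::int32_t b) {
    const Node* na = dag.input(a);
    const Node* nb = dag.input(b);
    const Node* mul = dag.binary(Op::Mul, na, na);
    const Node* mul4 = dag.binary(Op::Mul, nb, nb);
    return dag.binary(Op::Add, mul4, mul);
}
```

Both operands of each `mul` point at the same input node, which is what makes the structure a graph rather than a tree.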

  17. Troubles: The integer vs. pointer problem
      - Problem:
        - TriCore strictly distinguishes between addresses and data integers
        - Have to be put into separate register files → calling conventions!
        - LLVM's backend framework treats pointers just like integers...

  18. Troubles: The integer vs. pointer problem
      - Problem:
        - TriCore strictly distinguishes between addresses and data integers
        - Have to be put into separate register files → calling conventions!
        - LLVM's backend framework treats pointers just like integers...
      - Solution:
        - Annotation of "pointer / no pointer" flag in value type class
        - Promotion of this flag throughout the DAG construction phase (required some hacks...)
        - Case differentiations in all relevant situations
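
The effect of carrying a "pointer / no pointer" flag in the value type can be sketched as follows. This is an illustration only, not the author's implementation: `ValueType`, `RegClass`, `regClassFor`, and `assignArgRegs` are hypothetical names, and the argument registers D4 to D7 and A4 to A7 are an assumption made for the example rather than something stated on the slides.

```cpp
#include <string>
#include <vector>

enum class RegClass { Data, Address };

// Hypothetical value type: a bit width plus the pointer flag from the slide.
struct ValueType {
    unsigned bits;
    bool isPointer;
};

// The case differentiation: pointers go to address registers, all else to data.
constexpr RegClass regClassFor(ValueType vt) {
    return vt.isPointer ? RegClass::Address : RegClass::Data;
}

// Assigns each argument the next free register of its class.
// Assumed convention: D4..D7 for data, A4..A7 for addresses, then the stack.
std::vector<std::string> assignArgRegs(const std::vector<ValueType>& args) {
    unsigned nextData = 4;
    unsigned nextAddr = 4;
    std::vector<std::string> regs;
    for (const ValueType& vt : args) {
        const bool isAddr = regClassFor(vt) == RegClass::Address;
        unsigned& next = isAddr ? nextAddr : nextData;
        if (next > 7) {
            regs.push_back("stack");  // no register left in this class
        } else {
            regs.push_back((isAddr ? "A" : "D") + std::to_string(next++));
        }
    }
    return regs;
}
```

Without the flag, an `i32` and a 32-bit pointer would be indistinguishable at this point and both would land in the same register file, which is exactly the problem the slide describes.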

  19. Instruction Selection: Largely auto-generated
      [Figure: SelectionDAG "isel input for euclidSquare:entry": EntryToken; two CopyFromReg nodes reading Register %reg1024 and Register %reg1025; two mul nodes feeding an add; CopyToReg into Register %D2; TriCoreISD::RET_FLAG; GraphRoot]
      Pattern matching →
          def MULrr2 : Rr2Instr<0x0a, (outs DR:$c), (ins DR:$a, DR:$b),
                                "mul\t$c, $a, $b",
                                [(set DR:$c, (mul DR:$a, DR:$b))]>;

  20. Instruction Selection: Largely auto-generated
      [Figure, left: SelectionDAG "isel input for euclidSquare:entry": EntryToken; two CopyFromReg nodes reading Register %reg1024 and Register %reg1025; two mul nodes feeding an add; CopyToReg into Register %D2; TriCoreISD::RET_FLAG; GraphRoot]
      Pattern matching →
          def MULrr2 : Rr2Instr<0x0a, (outs DR:$c), (ins DR:$a, DR:$b),
                                "mul\t$c, $a, $b",
                                [(set DR:$c, (mul DR:$a, DR:$b))]>;
      [Figure, right: "scheduler input for euclidSquare:entry": same graph after selection, with nodes MULrr2, MADDrrr2 (replacing one mul and the add), CopyToReg into Register %D2, and RETsys]
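
The pattern matching step can be imitated with a small hand-written selector. This is a sketch of the general technique, not the auto-generated matcher: it walks an expression tree bottom-up and folds `add(x, mul(a, b))` into a multiply-add, which reproduces the `MULrr2` / `MADDrrr2` result shown for `euclidSquare`. The types and helper names are hypothetical, `"ADD"` is a placeholder name for a plain add instruction, and only a `mul` in the right operand is folded.

```cpp
#include <memory>
#include <string>
#include <vector>

enum class Op { Reg, Mul, Add };

struct Node {
    Op op;
    std::shared_ptr<Node> lhs, rhs;  // null for Reg nodes
};
using NodePtr = std::shared_ptr<Node>;

NodePtr regNode() { return std::make_shared<Node>(Node{Op::Reg, nullptr, nullptr}); }
NodePtr mulNode(NodePtr a, NodePtr b) { return std::make_shared<Node>(Node{Op::Mul, a, b}); }
NodePtr addNode(NodePtr a, NodePtr b) { return std::make_shared<Node>(Node{Op::Add, a, b}); }

// Post-order selection: operands are selected first, then the node itself.
void selectInstrs(const NodePtr& n, std::vector<std::string>& out) {
    if (n->op == Op::Reg) return;  // already in a register, nothing to emit
    if (n->op == Op::Add && n->rhs->op == Op::Mul) {
        // add(x, mul(a, b)) -> one multiply-add; the mul node is folded away.
        selectInstrs(n->lhs, out);
        selectInstrs(n->rhs->lhs, out);
        selectInstrs(n->rhs->rhs, out);
        out.push_back("MADDrrr2");
        return;
    }
    selectInstrs(n->lhs, out);
    selectInstrs(n->rhs, out);
    out.push_back(n->op == Op::Mul ? "MULrr2" : "ADD");  // "ADD" is a placeholder
}
```

For `add(mul(b, b), mul(a, a))` the selector emits `MULrr2` for the left operand and then a single `MADDrrr2`, so only one of the two multiplications survives as a separate instruction.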

  21. Scheduling & Register Allocation: Target-independent algorithms
      - Scheduling:
        - DAGs → list (SSA form)
        - Target-independent algorithm using data from the instruction description table
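
Turning a DAG into a list amounts to picking an order in which every node comes after the nodes it depends on. The sketch below uses a plain topological sort (Kahn's algorithm) to show that core requirement. It is a stand-in: the actual scheduler also uses data from the instruction description table to choose among the valid orders, which is not modelled here, and `scheduleDag` is a hypothetical name.

```cpp
#include <cstddef>
#include <queue>
#include <vector>

// deps[i] lists the nodes that node i depends on (its operands).
// Returns the nodes in an order where every node follows all of its operands.
std::vector<std::size_t> scheduleDag(
        const std::vector<std::vector<std::size_t>>& deps) {
    const std::size_t n = deps.size();
    std::vector<std::size_t> pending(n, 0);          // unscheduled operands per node
    std::vector<std::vector<std::size_t>> users(n);  // reverse edges
    for (std::size_t i = 0; i < n; ++i) {
        pending[i] = deps[i].size();
        for (std::size_t d : deps[i]) users[d].push_back(i);
    }

    std::queue<std::size_t> ready;  // nodes whose operands are all scheduled
    for (std::size_t i = 0; i < n; ++i)
        if (pending[i] == 0) ready.push(i);

    std::vector<std::size_t> order;
    while (!ready.empty()) {
        const std::size_t cur = ready.front();
        ready.pop();
        order.push_back(cur);
        for (std::size_t u : users[cur])
            if (--pending[u] == 0) ready.push(u);
    }
    return order;  // shorter than n only if the input contains a cycle
}
```

With the `euclidSquare` graph numbered as 0 = `%a`, 1 = `%b`, 2 = `%mul`, 3 = `%mul4`, 4 = `%add`, the dependency lists are `{{}, {}, {0, 0}, {1, 1}, {3, 2}}`, and the add is scheduled only after both multiplications.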
