
High-Level Synthesis: Creating Custom Circuits from High-Level Code



  1. High-Level Synthesis: Creating Custom Circuits from High-Level Code
      Hao Zheng, Comp Sci & Eng, University of South Florida

  2. Existing Design Flow
      ➜ Register-transfer (RT) synthesis
        ➜ Specify RT structure (muxes, registers, etc.)
        ➜ Allows precise specification
        ➜ But time-consuming, difficult, and error-prone
      [Figure: design-flow diagram with labels Synthesizable HDL, RT Synthesis, Technology Mapping, Netlist, Physical Design, Placement, Routing, Bitfile, Processor, FPGA, ASIC]

  3. Existing Design Flow
      [Figure not recoverable. Source: Xilinx, Introduction to FPGA Design with Vivado HLS, 2013]

  4. Forthcoming Design Flow
      [Figure: design-flow diagram with labels C/C++, Java, etc.; High-level Synthesis; Synthesizable HDL; RT Synthesis; Technology Mapping; Netlist; Physical Design; Placement; Routing; Bitfile; Processor; FPGA; ASIC]

  5. Forthcoming Design Flow
      [Figure not recoverable. Source: Xilinx, Introduction to FPGA Design with Vivado HLS, 2013]

  6. HLS Overview
      ➜ Input:
        ➜ High-level languages (e.g., C)
        ➜ Behavioral hardware description languages (e.g., VHDL)
        ➜ State diagrams / logic networks
      ➜ Tools:
        ➜ Parser
        ➜ Library of modules
      ➜ Constraints:
        ➜ Area constraints (e.g., number of modules of a certain type)
        ➜ Delay constraints (e.g., a set of operations must finish within a given number of clock cycles)
      ➜ Output: RTL models
        ➜ Operation scheduling (time) and binding (resource)
        ➜ Control generation and detailed interconnections

  7. High-level Synthesis - Benefits
      ➜ Ratio of C to VHDL developers (10000:1 ?)
      ➜ Easier to specify complex functions
      ➜ Technology/architecture independent designs
      ➜ Manual HW design potentially slower
        ➜ Similar to the assembly code era
        ➜ Programmers used to beat the compiler
        ➜ But that is no longer the case
      ➜ Ease of HW/SW partitioning
        ➜ Enhances overall system efficiency
      ➜ More efficient verification and validation (V&V)
        ➜ High-level code is easier to verify and validate

  8. High-level Synthesis
      ➜ More challenging than SW compilation
        ➜ Compilation maps behavior into assembly instructions
        ➜ Architecture is known to the compiler
      ➜ HLS creates a custom architecture to execute the specified behavior
        ➜ Huge hardware exploration space
        ➜ Best solution may include microprocessors
        ➜ Ideally, should handle any high-level code
          ➜ But not all code is appropriate for hardware

  9. High-level Synthesis: An Example
      ➜ First, consider how to manually convert high-level code into a circuit
          acc = 0;
          for (i=0; i < 128; i++)
              acc += a[i];
      ➜ Steps
        1) Build FSM for controller
        2) Build datapath based on FSM

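  10. A Manual Example
      ➜ Build an FSMD
          acc = 0;
          for (i=0; i < 128; i++)
              acc += a[i];
      [Figure: FSMD state diagram. Initial state "acc=0, i=0"; while i < 128 the states "load a[i]", "acc += a[i]", and "i++" execute in sequence; when i >= 128 the machine asserts "Done <= 1"]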
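The FSMD above can be mimicked in software to see how the controller steps through its states. The following is a minimal sketch, assuming one state per clock cycle and following the state sequence named on the slide; the state names, the function `fsmd_sum128`, and the cycle counter are hypothetical and not part of the original slides.

```c
#include <stdint.h>

/* Hypothetical state names, one per bubble in the slide's state diagram. */
typedef enum { S_INIT, S_TEST, S_LOAD, S_ADD, S_INC, S_DONE } fsmd_state;

/* Software model of the FSMD: each pass through the while loop
 * stands for one clock cycle (an assumption of this sketch). */
int32_t fsmd_sum128(const int32_t a[128], unsigned *cycles)
{
    fsmd_state state = S_INIT;
    int32_t acc = 0, data = 0;   /* datapath registers */
    int i = 0;                   /* loop counter register */
    unsigned n = 0;              /* cycles spent before reaching S_DONE */

    while (state != S_DONE) {
        switch (state) {
        case S_INIT: acc = 0; i = 0;  state = S_TEST; break;
        case S_TEST: state = (i < 128) ? S_LOAD : S_DONE; break;
        case S_LOAD: data = a[i];     state = S_ADD;  break;
        case S_ADD:  acc += data;     state = S_INC;  break;
        case S_INC:  i++;             state = S_TEST; break;
        case S_DONE: break;           /* unreachable inside the loop */
        }
        n++;
    }
    if (cycles)
        *cycles = n;
    return acc;
}
```

The `switch` plays the role of the controller, while `acc`, `data`, and `i` stand in for datapath registers. A real design could merge states (for example, add and increment in the same cycle) to reduce latency.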

  11. A Manual Example – Cont’d
      ➜ Combine controller + datapath
          acc = 0;
          for (i=0; i < 128; i++)
              acc += a[i];
      [Figure: controller plus datapath. Labels: Start, Controller, Done, Memory Read, In from memory, Memory address, &a; registers a[i], acc, addr, i; three 2x1 MUXes; constants 0, 1, 128; three adders (+) and one comparator (<); output acc]

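  12. High-Level Synthesis – Overview
          acc = 0;
          for (i=0; i < 128; i++)
              acc += a[i];
      ➜ High-Level Synthesis turns this code into the circuit built manually on the previous slide
      [Figure: the same controller plus datapath. Labels: Controller, Done, Memory Read, In from memory, Memory address, &a; registers a[i], acc, addr, i; three 2x1 MUXes; constants 0, 1, 128; three adders (+) and one comparator (<); output acc]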

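  13. A Manual Example - Optimization
      ➜ Alternatives
        ➜ Use one adder (plus muxes)
      [Figure: datapath with a single shared adder. Labels: In from memory, Memory address, &a; registers a[i], acc, addr, i; 2x1 MUXes on the register inputs and additional MUXes on the adder inputs; constants 0, 1, 128; one adder (+) and one comparator (<); output acc]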

  14. A Manual Example – Summary
      ➜ Comparison with high-level synthesis
        ➜ Determining when to perform each operation => Scheduling
        ➜ Allocating resources for each operation => Resource allocation
        ➜ Mapping operations to allocated resources => Binding

  15. High-Level Synthesis
      ➜ Input: high-level code
        ➜ Could be C, C++, Java, Perl, Python, SystemC, ImpulseC, etc.
      ➜ Output: custom circuit
        ➜ Usually an RT VHDL/Verilog description, but could be as low-level as a bit file for an FPGA or a gate netlist

  16. Main Steps
      ➜ Input: high-level code
      ➜ Front-end
        ➜ Syntactic Analysis: converts code to an intermediate representation, which allows all following steps to use a language-independent format
        ➜ Optimization of the intermediate representation
      ➜ Back-end
        ➜ Scheduling/Resource Allocation: determines when each operation will execute and which resources are used
        ➜ Binding/Resource Sharing: maps operations onto physical resources
      ➜ Output: cycle-accurate RTL code

  17. Parsing & Syntactic Analysis

  18. Syntactic Analysis
      • Definition: Analysis of code to verify syntactic correctness
        - Converts code into intermediate representation
      • Steps: similar to SW compilation
        1) Lexical analysis (Lexing)
        2) Parsing
        3) Code generation – intermediate representation
      [Figure: High-level Code feeds Lexical Analysis, then Parsing (together labeled Syntactic Analysis), producing the Intermediate Representation]
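The first step, lexing, can be illustrated with a very small tokenizer. This is only a sketch, assuming three token classes and the handful of two-character operators that appear in the slides' example code; the names `tok_kind`, `token`, and `lex` are hypothetical, and a real HLS front end would use a full C lexer.

```c
#include <ctype.h>
#include <stddef.h>

typedef enum { TOK_IDENT, TOK_NUMBER, TOK_PUNCT } tok_kind;

typedef struct {
    tok_kind kind;
    const char *start;  /* points into the source string */
    size_t len;         /* number of characters in the token */
} token;

/* Splits src into tokens and returns how many were written to out
 * (at most max). Whitespace separates tokens and is discarded. */
size_t lex(const char *src, token *out, size_t max)
{
    size_t n = 0;
    const char *p = src;

    while (*p != '\0' && n < max) {
        if (isspace((unsigned char)*p)) { p++; continue; }

        const char *s = p;
        tok_kind kind;
        if (isalpha((unsigned char)*p) || *p == '_') {
            while (isalnum((unsigned char)*p) || *p == '_') p++;
            kind = TOK_IDENT;
        } else if (isdigit((unsigned char)*p)) {
            while (isdigit((unsigned char)*p)) p++;
            kind = TOK_NUMBER;
        } else {
            /* Two-character operators handled here: += ++ <= >= == */
            if ((p[0] == '+' && (p[1] == '=' || p[1] == '+')) ||
                ((p[0] == '<' || p[0] == '>' || p[0] == '=') && p[1] == '='))
                p += 2;
            else
                p += 1;
            kind = TOK_PUNCT;
        }
        out[n].kind = kind;
        out[n].start = s;
        out[n].len = (size_t)(p - s);
        n++;
    }
    return n;
}
```

For the statement `acc += a[i];` this yields seven tokens: `acc`, `+=`, `a`, `[`, `i`, `]`, and `;`. The parser would then consume this token stream to build the intermediate representation.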

  19. Intermediate Representation
      ➜ Parser converts an input program to intermediate representation
      ➜ Why use intermediate representation?
        ➜ Easier to analyze/optimize than source code
        ➜ Theoretically can be used for all languages
          ➜ Makes synthesis back end language independent
      [Figure: C Code, Java, and Perl each pass through their own Syntactic Analysis into a shared Intermediate Representation, which feeds the Back End. Annotation: scheduling, resource allocation, and binding are independent of the source language, and sometimes optimizations are too]

  20. Intermediate Representation
      ➜ Different types
        ➜ Abstract Syntax Tree
        ➜ Control/Data Flow Graph (CDFG)
        ➜ Sequencing Graph
      ➜ We will focus on CDFG
        ➜ Combines control flow graph (CFG) and data flow graph (DFG)
          ➜ CFG ---> controller
          ➜ DFG ---> datapath

  21. Control Flow Graphs (CFGs)
      ➜ Represents control flow dependencies of basic blocks
      ➜ A basic block is a section of code that always executes from beginning to end
        ➜ I.e., no jumps into the middle of the block and no branching except at its end
          acc = 0;
          for (i=0; i < 128; i++)
              acc += a[i];
      [Figure: CFG with nodes "acc=0, i=0", the test "i < 128?", the loop body "acc += a[i]; i++" on the yes branch (looping back to the test), and "Done" on the no branch]
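One way to see the basic blocks in this loop is to rewrite it with explicit labels and jumps, so that each block ends in exactly one branch. The sketch below is illustrative only; the block names B0 to B3 and the function name `sum128_blocks` are assumptions, and the slide's figure may group the nodes slightly differently.

```c
/* The accumulate loop rewritten so each basic block is explicit.
 * Every block is entered only at its top and left only at its bottom. */
int sum128_blocks(const int a[128])
{
    int acc, i;

    /* B0: entry block, falls through to the loop test */
    acc = 0;
    i = 0;

test:   /* B1: loop test, ends in a conditional branch */
    if (!(i < 128))
        goto done;

    /* B2: loop body, ends in an unconditional jump back to the test */
    acc += a[i];
    i++;
    goto test;

done:   /* B3: exit block */
    return acc;
}
```

Each label or branch target starts a new block, and each `goto` or conditional ends one. The edges B0→B1, B1→B2, B1→B3, and B2→B1 are the CFG.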

  22. Control Flow Graphs: Your Turn
      • Find a CFG for the following code.
          i = 0;
          while (i < 10) {
              if (x < 5)
                  y = 2;
              else if (z < 10)
                  y = 6;
              i++;
          }

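  23. Data Flow Graphs
      ➜ Represents data dependencies between operations within a single basic block
          x = a+b;
          y = c*d;
          z = x - y;
      [Figure: DFG with inputs a, b, c, d; a "+" node producing x, a "*" node producing y, and a "-" node consuming x and y to produce z]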
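The DFG for these three statements can be stored as a small node table, and the dependencies then show which operations may run in parallel. The sketch below computes, for each node, the earliest step at which it could execute, assuming every operation takes one step. This as-soon-as-possible levelling is an illustration chosen here, not a method taken from the slides, and the names `dfg_node`, `slide_dfg`, and `dfg_levels` are hypothetical.

```c
#include <stddef.h>

typedef enum { OP_INPUT, OP_ADD, OP_MUL, OP_SUB } dfg_op;

typedef struct {
    dfg_op op;
    int lhs, rhs;   /* indices of operand nodes; -1 for inputs */
} dfg_node;

/* DFG for: x = a+b; y = c*d; z = x - y;
 * Nodes are listed in topological order (operands before users). */
static const dfg_node slide_dfg[] = {
    { OP_INPUT, -1, -1 },  /* 0: a */
    { OP_INPUT, -1, -1 },  /* 1: b */
    { OP_INPUT, -1, -1 },  /* 2: c */
    { OP_INPUT, -1, -1 },  /* 3: d */
    { OP_ADD,    0,  1 },  /* 4: x = a + b */
    { OP_MUL,    2,  3 },  /* 5: y = c * d */
    { OP_SUB,    4,  5 },  /* 6: z = x - y */
};

/* level[i] = earliest step at which node i can execute.
 * Inputs are available at step 0; an operation runs one step
 * after its latest operand. Requires topologically ordered nodes. */
void dfg_levels(const dfg_node *g, size_t n, int *level)
{
    for (size_t i = 0; i < n; i++) {
        if (g[i].op == OP_INPUT) {
            level[i] = 0;
            continue;
        }
        int l = level[g[i].lhs];
        int r = level[g[i].rhs];
        level[i] = 1 + (l > r ? l : r);
    }
}
```

Here the add and the multiply both land in step 1, since neither depends on the other, and the subtract lands in step 2.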

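  24. Control/Data Flow Graph
      ➜ Combines CFG and DFG
        ➜ Maintains DFG for each node of CFG
          acc = 0;
          for (i=0; i < 128; i++)
              acc += a[i];
      [Figure: CFG of the loop with a DFG attached to each node: "acc=0; i=0;" (constants 0 and 0 feeding acc and i), the test "if (i < 128)", the loop body "acc += a[i]; i++" (one adder combining acc with a[i], another combining i with 1), and "Done"]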

  25. Transformation/Optimization

  26. Synthesis Optimizations
      ➜ After creating CDFG, HLS optimizes it with the following goals
        ➜ Reduce area
        ➜ Reduce latency
        ➜ Increase parallelism
        ➜ Reduce power/energy
      ➜ 2 types of optimizations
        ➜ Data flow optimizations
        ➜ Control flow optimizations

  27. Data Flow Optimizations
      ➜ Tree-height reduction
        ➜ Generally made possible by commutativity, associativity, and distributivity
          x = a + b + c + d
      [Figure: expression trees before and after tree-height reduction for x = a + b + c + d, and a second before/after pair combining * and + over a, b, c, d]
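For x = a + b + c + d, the two tree shapes can be written out as explicit temporaries. This sketch uses unsigned integers so that reassociation cannot change the result; for floating-point operands, reordering additions may alter rounding. The function names are hypothetical.

```c
#include <stdint.h>

/* Left-to-right chain: ((a + b) + c) + d.
 * Each add depends on the previous one: three adder delays. */
uint32_t sum_chain(uint32_t a, uint32_t b, uint32_t c, uint32_t d)
{
    uint32_t t1 = a + b;
    uint32_t t2 = t1 + c;
    return t2 + d;
}

/* Balanced tree: (a + b) + (c + d).
 * t1 and t2 are independent and can be computed in parallel:
 * two adder delays, using the same number of additions. */
uint32_t sum_tree(uint32_t a, uint32_t b, uint32_t c, uint32_t d)
{
    uint32_t t1 = a + b;
    uint32_t t2 = c + d;
    return t1 + t2;
}
```

Both versions perform three additions. The balanced form trades nothing in operation count but shortens the critical path, provided two adders are available in the same step.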

  28. Data Flow Optimizations
      ➜ Operator Strength Reduction
        ➜ Replacing an expensive ("strong") operation with a faster one
        ➜ Common example: replacing multiply/divide with shift
          Before (1 multiplication)      After (0 multiplications)
          b[i] = a[i] * 8;               b[i] = a[i] << 3;
          a = b * 5;                     c = b << 2; a = b + c;
          a = b * 13;                    c = b << 2; d = b << 3; a = c + d + b;
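The three rewrites from this slide can be checked directly in C. The sketch below uses unsigned 32-bit operands, for which the shift-and-add forms match multiplication exactly, including on wraparound; the function names are hypothetical.

```c
#include <stdint.h>

/* b * 8 as a single shift. */
uint32_t times8(uint32_t b)
{
    return b << 3;
}

/* b * 5 = 4b + b: one shift and one add. */
uint32_t times5(uint32_t b)
{
    uint32_t c = b << 2;
    return b + c;
}

/* b * 13 = 4b + 8b + b: two shifts and two adds. */
uint32_t times13(uint32_t b)
{
    uint32_t c = b << 2;
    uint32_t d = b << 3;
    return c + d + b;
}
```

In hardware, a shift by a constant is only wiring, so these forms replace a multiplier with at most two adders. Signed operands need more care in C, because left-shifting a negative value is undefined behavior.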

  29. Data Flow Optimizations
      • Constant propagation
        - Statically evaluate expressions with constants
          Before              After
          x = 0;              x = 0;
          y = x * 15;         y = 0;
          z = y + 10;         z = 10;

  30. Data Flow Optimizations
      ➜ Function Specialization
        ➜ Create specialized code for common inputs
          ➜ Treat common inputs as constants
        ➜ If inputs are not known statically, each call to the specialized function must be guarded by an if statement
          Before:
              int f (int x) {
                  int y = x * 15;
                  return y + 10;
              }
              for (I=0; I < 1000; I++)
                  f(0);
              …
          After (treat frequent input as a constant):
              int f (int x) {
                  int y = x * 15;
                  return y + 10;
              }
              int f_opt () {
                  return 10;
              }
              for (I=0; I < 1000; I++)
                  f_opt();
              …
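When the argument is not known at compile time, the guard mentioned above might look like the following. This is a sketch only; the wrapper name `call_f` is hypothetical, and `f_opt` is assumed to be the version of `f` specialized for x == 0, as on the slide.

```c
/* General version from the slide. */
int f(int x)
{
    int y = x * 15;
    return y + 10;
}

/* Version specialized for the frequent input x == 0. */
int f_opt(void)
{
    return 10;
}

/* Guarded call: use the specialized version only when the
 * runtime input matches the value it was specialized for. */
int call_f(int x)
{
    if (x == 0)
        return f_opt();
    return f(x);
}
```

The guard costs a comparison on every call, so specialization pays off only when the specialized input is frequent and the savings outweigh that check.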

  31. Data Flow Optimizations
      ➜ Common sub-expression elimination
        ➜ If an expression appears more than once, repetitions can be replaced
          Before                       After
          a = x + y;                   a = x + y;
          . . .                        . . .
          b = c * 25 + x + y;          b = c * 25 + a;    (x + y already determined)
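The before and after forms can be compared side by side in C. Note that C parses `c * 25 + x + y` as `(c * 25 + x) + y`, so reusing `a` also relies on associativity of addition; the sketch uses unsigned integers, where that regrouping is always safe. The struct and function names are hypothetical.

```c
#include <stdint.h>

typedef struct { uint32_t a, b; } ab_pair;

/* Before: x + y is computed twice (two extra adders or two extra steps). */
ab_pair cse_before(uint32_t x, uint32_t y, uint32_t c)
{
    ab_pair r;
    r.a = x + y;
    r.b = c * 25u + x + y;
    return r;
}

/* After: the second occurrence reuses the value already held in a. */
ab_pair cse_after(uint32_t x, uint32_t y, uint32_t c)
{
    ab_pair r;
    r.a = x + y;
    r.b = c * 25u + r.a;
    return r;
}
```

The rewrite is valid only if neither x nor y is modified between the two statements, which is the condition hidden behind the elided lines on the slide.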
