High-Level Synthesis: Xilinx Vivado HLS

SLIDE 1

High-Level Synthesis

Xilinx Vivado HLS

Hao Zheng Comp Sci & Eng University of South Florida

SLIDE 2

Reading

➜The Zynq Book, chapters 14 and 15
➜Vivado Design Suite Tutorial: High-Level Synthesis

SLIDE 3

Overview

SLIDE 4

SLIDE 5

SLIDE 6

SLIDE 7

Implementation Considerations

➜Resources / area
➜Throughput
➜Clock frequency
➜Latency
➜Power consumption
➜I/O requirements

Controlled by synthesis directives

SLIDE 8

SLIDE 9

Native Types in C/C++

SLIDE 10

Arbitrary Precision – Integer

Bit width N: 1 ≤ N ≤ 1024
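As a rough illustration of what an N-bit integer type means, the sketch below emulates N-bit wrap-around with plain C++ masks. It is not the Vivado HLS implementation: in the tool, these types come from its arbitrary-precision header (ap_int<N> and ap_uint<N> in ap_int.h). The helper names wrap_unsigned and wrap_signed are hypothetical, and the sketch is limited to 64 bits so that it compiles with a standard compiler.

```cpp
#include <cstdint>

// Hypothetical helper: keep only the low N bits, as an N-bit unsigned type would.
template <unsigned N>
constexpr std::uint64_t wrap_unsigned(std::uint64_t v) {
    static_assert(N >= 1 && N <= 64, "this sketch only covers 1..64 bits");
    return v & (~std::uint64_t{0} >> (64 - N));
}

// Hypothetical helper: interpret the low N bits as a two's-complement value.
template <unsigned N>
constexpr std::int64_t wrap_signed(std::int64_t v) {
    const std::uint64_t low = wrap_unsigned<N>(static_cast<std::uint64_t>(v));
    const std::uint64_t sign = std::uint64_t{1} << (N - 1);
    return static_cast<std::int64_t>((low ^ sign) - sign);  // sign-extend bit N-1
}
```

For example, a 4-bit unsigned value wraps 17 to 1, and the 4-bit pattern 1001 reads as -7 when treated as signed.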

SLIDE 11

Typical C/C++ Construct to RTL Mapping

C Constructs → HW Components
Functions → Modules
Arguments → Input/output ports
Operators → Functional units
Scalars → Wires or registers
Arrays → Memories
Control flows → Control logic

SLIDE 12

Function Hierarchy

➜Each function is synthesized to an RTL module
➜Function inlining eliminates hierarchy
➜The function main() cannot be synthesized
  ➜It is used to develop the C testbench

void A() { .. body A .. }
void C() { .. body C .. }
void B() { C(); }
void TOP( ) { A(…); B(…); }

Figure: source code and the resulting RTL hierarchy, in which TOP contains A and B, and B contains C
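The hierarchy above can be sketched as compilable C++. The function names mirror the slide, but the bodies, the int arguments, and the run_testbench helper are made up for illustration. The INLINE pragma is the Vivado HLS directive for dissolving a function into its caller; a standard compiler simply ignores it.

```cpp
// Leaf function. With INLINE, it would not appear as its own RTL module.
static int C(int x) {
#pragma HLS INLINE
    return x * 2;  // illustrative body, not from the slides
}

static int A(int x) { return x + 1; }     // illustrative body
static int B(int x) { return C(x) - 3; }  // B calls C, as on the slide

// Top-level function selected for synthesis.
int TOP(int x) { return A(x) + B(x); }

// Testbench logic. In a real project this would sit in main(),
// which is compiled for C simulation but never synthesized.
int run_testbench() {
    int mismatches = 0;
    for (int x = 0; x < 10; x++) {
        const int expected = (x + 1) + (2 * x - 3);  // reference model
        if (TOP(x) != expected) mismatches++;
    }
    return mismatches;  // 0 means every test vector matched
}
```

Keeping the reference model inside the testbench lets the same check be reused after synthesis, when the C simulation is compared against the generated RTL.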

SLIDE 13

Function Arguments

➜Function arguments become module ports

➜The interface follows a protocol to synchronize data exchange

Figure: module TOP, containing a datapath and an FSM, with data ports in1, in2, out1 and valid signals in1_vld, in2_vld, out1_vld

void TOP(int* in1, int* in2, int* out1) { *out1 = *in1 + *in2; }
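One way the port protocol might be requested is with INTERFACE pragmas inside the function body. The sketch below assumes the ap_vld protocol on all three pointer arguments, which would match the *_vld signals in the figure. The protocol the tool picks by default depends on the argument type and tool version, so treat the pragmas as an assumption rather than what the slide used.

```cpp
// Same function as on the slide, with an explicit (assumed) port protocol.
void TOP(int* in1, int* in2, int* out1) {
#pragma HLS INTERFACE ap_vld port=in1
#pragma HLS INTERFACE ap_vld port=in2
#pragma HLS INTERFACE ap_vld port=out1
    *out1 = *in1 + *in2;  // each argument becomes a data port plus a valid signal
}
```

A standard C++ compiler ignores the pragmas, so the same source still runs unchanged in a C testbench.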

SLIDE 14

Expressions

➜Expressions and operations are synthesized to datapath components

➜Timing constraints influence the degree of registering

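char A, B, C, D;
int P;
P = (A+B)*C+D;

Figure: datapath in which A and B feed an adder, its result and C feed a multiplier, and that product and D feed a second adder producing P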
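The expression can be wrapped in a small function to check its behavior in software. The function name expr is hypothetical. The width comments assume 8-bit char operands and are only an estimate of what a tool could infer; the widths actually generated depend on the tool.

```cpp
// (A + B) * C + D with 8-bit operands and a 32-bit result.
// In C/C++ the char operands are promoted to int before the arithmetic,
// so the intermediate values do not overflow 8 bits.
int expr(char A, char B, char C, char D) {
    int sum = A + B;         // fits in 9 bits for 8-bit inputs
    int product = sum * C;   // fits in 17 bits
    return product + D;      // still fits in 17 bits
}
```

A hardware datapath only needs the narrower widths noted in the comments, which is one reason operand types matter for area.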

SLIDE 15

Arrays

➜An array is typically implemented by a memory block
➜Read & write array -> RAM; constant array -> ROM
➜An array can be partitioned and mapped to multiple RAMs
➜Multiple arrays can be merged and mapped to one RAM
➜An array can be partitioned into individual elements and mapped to registers

void TOP(int)
{
  int A[N];
  for (i = 0; i < N; i++)
    A[i+x] = A[i] + i;
}

Figure: array A[N] (elements N-1, N-2, …) mapped to a RAM with ports DOUT, DIN, ADDR, CE, WE; inside TOP the array is accessed through A_in and A_out
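The array options listed above can be sketched in one small function. The names N, COEFF, and scale_and_offset are hypothetical. ARRAY_PARTITION is the Vivado HLS directive for splitting an array, and "complete" requests one register per element; whether the constant array really ends up as a ROM is the tool's decision.

```cpp
constexpr int N = 8;

// Constant array: a candidate for a ROM.
static const int COEFF[N] = {1, 2, 3, 4, 5, 6, 7, 8};

void scale_and_offset(const int in[N], int out[N]) {
    int buf[N];  // read/write array: a RAM by default
    // Ask for buf to be split into N individual registers instead.
#pragma HLS ARRAY_PARTITION variable=buf complete dim=1
    for (int i = 0; i < N; i++) {
        buf[i] = in[i] * COEFF[i];
    }
    for (int i = 0; i < N; i++) {
        out[i] = buf[i] + i;
    }
}
```

Partitioning removes the one-or-two-ports-per-RAM bottleneck at the cost of more registers and multiplexing.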

SLIDE 16

Loops

➜By default, loops are rolled

➜Each loop iteration corresponds to a “sequence” of states (possibly a DAG)
➜This state sequence will be repeated multiple times based on the loop trip count

void TOP (…) {
  ...
  for (i = 0; i < N; i++)
    b += a[i];
}

Figure: TOP as a state sequence S1, S2, with a load (LD) of a[i] and an adder (+) accumulating into b
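A rolled loop can be modeled in software as a small state machine that replays the same states once per iteration. The sketch below assumes a two-state schedule (a load state, then an add state), loosely following the S1/S2 labels in the figure. The real schedule depends on the tool and the clock constraint, and RolledResult and rolled_sum are hypothetical names.

```cpp
#include <cstddef>

struct RolledResult {
    int sum;             // final value of b
    std::size_t cycles;  // number of states visited
};

// Software model of a rolled accumulation loop: one state per "cycle".
RolledResult rolled_sum(const int* a, std::size_t n) {
    enum State { S1_LOAD, S2_ADD };
    State state = S1_LOAD;
    int b = 0;
    int loaded = 0;
    std::size_t i = 0;
    std::size_t cycles = 0;
    while (i < n) {
        ++cycles;
        if (state == S1_LOAD) {
            loaded = a[i];  // S1: read a[i] from memory
            state = S2_ADD;
        } else {
            b += loaded;    // S2: accumulate, then move to the next iteration
            ++i;
            state = S1_LOAD;
        }
    }
    return {b, cycles};
}
```

With this assumed schedule the cycle count is two states times the trip count, which is why latency grows linearly with N when the loop stays rolled.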

SLIDE 17

Loop Unrolling

➜To expose higher parallelism and achieve shorter latency
➜Pros
  ➜Decrease loop overhead
  ➜Increase parallelism for scheduling
  ➜Facilitate constant propagation and array-to-scalar promotion
➜Cons: increased operation count, which may negatively impact area, power, and timing

for (int i = 0; i < N; i++)
  A[i] = C[i] + D[i];

Fully unrolled:

A[0] = C[0] + D[0];
A[1] = C[1] + D[1];
A[2] = C[2] + D[2];
.....
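Unrolling can be requested with a directive or written by hand; both forms are sketched below. The function names, N = 8, and the unroll factors are assumptions for illustration. UNROLL is the Vivado HLS loop directive; a standard compiler ignores the pragma, so the two functions compute the same result in software.

```cpp
constexpr int N = 8;

// Directive-based: the tool is asked to replicate the loop body 4 times.
void vadd_directive(const int C[N], const int D[N], int A[N]) {
    for (int i = 0; i < N; i++) {
#pragma HLS UNROLL factor=4
        A[i] = C[i] + D[i];
    }
}

// Hand-unrolled by 2: two independent additions per iteration,
// which can be scheduled in parallel if the memories have enough ports.
void vadd_manual2(const int C[N], const int D[N], int A[N]) {
    for (int i = 0; i < N; i += 2) {
        A[i] = C[i] + D[i];
        A[i + 1] = C[i + 1] + D[i + 1];
    }
}
```

The manual version assumes N is even; the directive form leaves any remainder handling to the tool.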

SLIDE 18

Loop Pipelining

➜Loop pipelining is one of the most important optimizations for high-level synthesis
➜Allows a new iteration to begin processing before the previous iteration is complete
➜Key metric: Initiation Interval (II), in number of cycles

for (i = 0; i < N; ++i)
  p[i] = x[i] * y[i];

Figure: pipeline schedule with II = 1; iterations i=0, i=1, i=2, i=3 start in successive cycles, each performing loads (ld) of x[i] and y[i], a multiply (×), and a store (st) to p[i]
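The effect of II on latency can be estimated with the usual pipeline formula: total cycles = depth + II × (N − 1), compared with depth × N for a rolled loop. The sketch below pairs the loop with that estimate. The PIPELINE pragma is the Vivado HLS directive; the helper names, N = 8, and the depth of 3 stages (ld, ×, st, read off the figure) are assumptions.

```cpp
constexpr int N = 8;

// The loop from the slide, with a pipelining request.
void vmul(const int x[N], const int y[N], int p[N]) {
    for (int i = 0; i < N; ++i) {
#pragma HLS PIPELINE II=1
        p[i] = x[i] * y[i];
    }
}

// Estimated cycles when a new iteration starts every `ii` cycles
// and one iteration takes `depth` cycles from first load to store.
constexpr long pipelined_cycles(long n, long ii, long depth) {
    return n > 0 ? depth + ii * (n - 1) : 0;
}

// Estimated cycles when iterations run strictly one after another.
constexpr long rolled_cycles(long n, long depth) {
    return n * depth;
}
```

For the four iterations drawn in the figure, this gives 6 cycles pipelined versus 12 cycles rolled, ignoring loop entry and exit overhead.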

SLIDE 19

Synthesis of Loops – Case Study

By default, Vivado HLS tends to optimize for area, so loops are kept rolled

SLIDE 20

Synthesis of Loops – Case Study

SLIDE 21

Merging Loops

SLIDE 22

Merging Loops

SLIDE 23

Interface Synthesis

SLIDE 24

Port Directions

SLIDE 25

Port Protocols

➜Simple: ap_none, ap_stable, ap_ack
➜Ports with validation: ap_vld, ap_ovld, ap_hs
➜Memory interface: ap_memory, bram
➜ap_fifo
➜ap_bus
➜AXI: axis, s_axilite, m_axi
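As a hedged illustration of how protocols from this list might be attached to ports, the sketch below streams an array through FIFO ports and marks a scalar as stable. The function name stream_scale, N = 16, and the choice of protocols are assumptions; the exact pragma options accepted depend on the tool version.

```cpp
constexpr int N = 16;

// Arrays accessed strictly in order are candidates for FIFO ports.
void stream_scale(const int in[N], int out[N], int gain) {
#pragma HLS INTERFACE ap_fifo port=in
#pragma HLS INTERFACE ap_fifo port=out
#pragma HLS INTERFACE ap_stable port=gain
    for (int i = 0; i < N; i++) {
        out[i] = gain * in[i];  // one read and one write per iteration, in order
    }
}
```

The in-order access pattern is what makes a FIFO protocol plausible here; random access into an array would call for a memory interface instead.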

SLIDE 26

Backup