From Dataflow Specifications to Customised Reconfigurable Datapaths



SLIDE 1

Universidad Politécnica de Madrid (UPM) School of Telecommunications Systems and Engineering (ETSIST) Research Center on Software Technologies and Multimedia Systems (CITSEM)

From Dataflow Specifications to Customised Reconfigurable Datapaths Using HLS: the OpenCL Case for FPGAs

Rubén Salvador

[Kindly hosted by INSA: KDesnos, MPelcat, JFNezan, DMenard, LMorin…]

Dataflow Workshop, Rennes, 12-14 December 2017

SLIDE 2

Rubén Salvador From Dataflow to Customised FPGA Datapaths

Context

SLIDE 3

Context

SLIDE 4

OUTLINE

  • Motivation
  • OpenCL FPGA
  • Dataflow Mapping On Top Of OpenCL FPGA
  • Implementation Steps

SLIDE 5

OUTLINE

  • Motivation
  • OpenCL FPGA
  • Dataflow Mapping On Top Of OpenCL FPGA
  • Implementation Steps

SLIDE 6

Dataflow: a (naive) view from a newcomer

Dataflow

A flow of data…
…moves… between (2) points in space
…is transformed… along its way
…and sinks.

Spatial Computing

datapath hardware
FPGA architecture
point to point comms
On-chip memory
SLIDE 7

Customised FPGA-based datapaths for dataflow graphs

A B C

Dataflow FPGA HLS

System Level Integration (SW) developers love

Wide Community Embrace

SLIDE 8

Conquering Computing Community Embrace

OpenCL: Pros

  • Functionally portable
  • Wide community acceptance
  • Support for HLS

Dataflow: Pros

  • Graph analysis & guarantees
      • Schedulability, deadlocks, FIFO sizing
  • Concurrent execution model
  • Comms interaction

OpenCL: Cons

  • Not dataflow (streaming) friendly
  • Global memory comms
  • Compute accelerator model
  • Data offload (writes/reads)
  • Throughput oriented (vs latency)

Dataflow: Cons

  • Niche domain
  • Most work targets multi/manycore

What can dataflow bring to the OpenCL community?

SLIDE 9

OUTLINE

  • Motivation
  • OpenCL FPGA
  • Dataflow Mapping On Top Of OpenCL FPGA
  • Implementation Steps

SLIDE 10

OpenCL: framework for heterogeneous/parallel computing

Work Group (WG)
Work Item (WI)

Data parallelism
Task parallelism

SIMD
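The work-group / work-item hierarchy above can be illustrated with a small host-side model. This is a minimal sketch in plain Python, not OpenCL code: it assumes a 1-D NDRange with zero global offset and a global size that is a multiple of the local size, and the function name `ndrange_ids` is made up for illustration.

```python
def ndrange_ids(global_size, local_size):
    """Enumerate (group_id, local_id, global_id) for a uniform 1-D NDRange."""
    if global_size % local_size != 0:
        raise ValueError("global_size must be a multiple of local_size")
    ids = []
    for group_id in range(global_size // local_size):   # one entry per work-group
        for local_id in range(local_size):               # work-items inside the group
            global_id = group_id * local_size + local_id
            ids.append((group_id, local_id, global_id))
    return ids


# 8 work-items split into 2 work-groups of 4 work-items each
ids = ndrange_ids(global_size=8, local_size=4)
```

Data parallelism corresponds to every work-item running the same kernel body on its own `global_id`; task parallelism corresponds to different kernels running side by side.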

SLIDE 11

OpenCL FPGA Model

Intel FPGA SDK for OpenCL: Best Practices Guide
https://www.altera.com/en_US/pdfs/literature/hb/opencl-sdk/aocl-best-practices-guide.pdf
https://www.altera.com/products/design-software/embedded-software-developers/opencl/developer-zone.html

SLIDE 12

OUTLINE

  • Motivation
  • OpenCL FPGA
  • Dataflow Mapping On Top Of OpenCL FPGA
  • Implementation Steps

SLIDE 13

Dataflow On Top Of OpenCL: SoC FPGAs

Desired Features & Expected Gains

Hardware acceleration (custom datapath)
Reduced processor (communication) overhead
Reduced memory transactions
Self-timed Execution

SLIDE 14

Dataflow On Top Of OpenCL FPGA

Recent (2017) proposal: add dataflow semantics to OpenCL standard
OpenCL Khronos Group Standard
Tool Expertise & Design Space Exploration
Leverage OpenCL FPGA constructs to generate efficient dataflow
Dataflow-driven “OpenCL” code generation

OpenCL Community Dataflow Community

SLIDE 15

MoCs semantics for OpenCL Pipes

Synchronous Dataflow (SDF)
Bulk Synchronous Parallel (BSP)

MoCs semantics to OpenCL

Proposal for the OpenCL Standard

Kapre, Nachiket, and Hiren Patel. Applying Models of Computation to OpenCL Pipes for FPGA Computing. Proc. 5th IWOCL. ACM, 2017.

a.k.a.: compiler’s job

OpenCL compute model + MoC Comms Schemes
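What SDF semantics buy on top of plain pipes can be shown with the balance equations: for every edge, (firings of producer) × (tokens produced) must equal (firings of consumer) × (tokens consumed). The sketch below solves them for the smallest integer repetition vector. It is a textbook-style illustration, not code from the cited proposal; the graph, rates and function name are assumptions.

```python
from fractions import Fraction
from functools import reduce
from math import gcd


def repetition_vector(actors, edges):
    """edges: list of (src, dst, tokens_produced, tokens_consumed) per firing."""
    rate = {actors[0]: Fraction(1)}
    changed = True
    while changed:                       # propagate relative firing rates along edges
        changed = False
        for src, dst, prod, cons in edges:
            if src in rate and dst not in rate:
                rate[dst] = rate[src] * prod / cons
                changed = True
            elif dst in rate and src not in rate:
                rate[src] = rate[dst] * cons / prod
                changed = True
    if len(rate) != len(actors):
        raise ValueError("graph is not connected")
    for src, dst, prod, cons in edges:   # every balance equation must hold
        if rate[src] * prod != rate[dst] * cons:
            raise ValueError("inconsistent rates: no bounded-memory schedule")
    scale = reduce(lambda a, b: a * b // gcd(a, b),
                   (r.denominator for r in rate.values()), 1)
    counts = {a: int(rate[a] * scale) for a in actors}
    common = reduce(gcd, counts.values())
    return {a: c // common for a, c in counts.items()}


# Hypothetical chain A -> B -> C: A produces 2, B consumes 3; B produces 1, C consumes 2
reps = repetition_vector(["A", "B", "C"], [("A", "B", 2, 3), ("B", "C", 1, 2)])
```

For this chain the result is 3 firings of A, 2 of B and 1 of C per graph iteration, which is the kind of static guarantee a compiler could exploit when sizing pipes.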

SLIDE 16

OpenCL Increasing Streaming Support

Pipes (OpenCL 2.0)

Standard OpenCL Kernel-to-Kernel communication

Channels (Intel FPGA)

Preferred Kernel-to-Kernel communication

Host-Kernel Pipes

… only prototype demo so far

Kang, K., and P. Yiannacouras. Host Pipes: Direct Streaming Interface Between OpenCL Host and Kernel. Proc. 5th IWOCL. ACM, 2017.

Overlap multi-kernel operation
Self-triggered kernels (free run decoupled from host)

SLIDE 17

Kernel Operation Possibilities

Autorun kernels

No host-kernel communication logic
Autostart & Auto-restart
Communicate through channels

Intel FPGA SDK for OpenCL: Best Practices Guide
https://www.altera.com/en_US/pdfs/literature/hb/opencl-sdk/aocl-best-practices-guide.pdf
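The autorun behaviour can be mimicked on a CPU to get a feel for it. In this minimal sketch each "kernel" is a thread that restarts its body on every incoming token and talks only through bounded queues. It is a behavioural model, not SDK code: `autorun`, `STOP` and the two toy kernel bodies are hypothetical, and the `STOP` token exists only so the example terminates (a real autorun kernel never stops).

```python
import queue
import threading

STOP = object()  # modelling convenience only


def autorun(kernel_body, ch_in, ch_out):
    """Start a free-running stage: no host call launches or re-launches it."""
    def loop():
        while True:
            token = ch_in.get()               # blocking channel read
            if token is STOP:
                ch_out.put(STOP)
                return
            ch_out.put(kernel_body(token))    # blocking channel write
    thread = threading.Thread(target=loop, daemon=True)
    thread.start()
    return thread


# Two chained kernels connected by depth-4 channels
ch0, ch1, ch2 = (queue.Queue(maxsize=4) for _ in range(3))
autorun(lambda x: x + 1, ch0, ch1)
autorun(lambda x: x * 2, ch1, ch2)


def feeder():
    """Stands in for an I/O channel or a host-side producer."""
    for value in range(8):
        ch0.put(value)
    ch0.put(STOP)


threading.Thread(target=feeder, daemon=True).start()
results = list(iter(ch2.get, STOP))           # drain until the STOP token
```

Both stages overlap in time, and back-pressure from a full queue is the only synchronisation between them.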

SLIDE 18

Channels

Kernel execution decoupled from host
Blocking/Non-blocking Read/Write API
Synchronization mechanisms
I/O Channels -> Streaming DSP

Intel FPGA SDK for OpenCL: Programming Guide
https://www.altera.com/content/dam/altera-www/global/en_US/pdfs/literature/hb/opencl-sdk/aocl_programming_guide.pdf
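The difference between the blocking and non-blocking calls is easiest to see on a bounded FIFO. The class below is a minimal Python model of those semantics; the method names (`write`, `read`, `write_nb`, `read_nb`) are hypothetical and are not the SDK's API.

```python
import queue


class Channel:
    """Bounded FIFO modelling blocking and non-blocking channel access."""

    def __init__(self, depth):
        self._fifo = queue.Queue(maxsize=depth)

    def write(self, value):
        self._fifo.put(value)            # blocks while the FIFO is full

    def read(self):
        return self._fifo.get()          # blocks while the FIFO is empty

    def write_nb(self, value):
        """Try to write; return True on success, False if the FIFO is full."""
        try:
            self._fifo.put_nowait(value)
            return True
        except queue.Full:
            return False

    def read_nb(self):
        """Try to read; return (value, True) or (None, False) if empty."""
        try:
            return self._fifo.get_nowait(), True
        except queue.Empty:
            return None, False


ch = Channel(depth=2)
accepted = [ch.write_nb(v) for v in (10, 20, 30)]   # third write finds the FIFO full
drained = [ch.read_nb() for _ in range(3)]          # third read finds it empty
```

Blocking accesses give implicit synchronisation between kernels; the non-blocking variants leave the decision to the kernel, at the cost of handling the "no data" case explicitly.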

SLIDE 19

OUTLINE

  • Motivation
  • OpenCL FPGA
  • Dataflow Mapping On Top Of OpenCL FPGA
  • Implementation Steps

SLIDE 20

Mapping (Pi)SDF Dataflow Graphs To OpenCL Model

1.- Actor Firing Rules within OpenCL Kernels
    Check overhead vs. performance
2.- Code Generation from PREESM
2.1.- Actor firing rules - scheduler
    Acceptable?
2.2.- FIFO analysis & Buffer generation
2.3.- Memory Access Optimization

Kernel

K

A B C
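Step 1 (firing rules inside each kernel) can be prototyped in software before touching the FPGA flow. The sketch below is one possible reading, not PREESM output: an actor fires only when every input FIFO holds at least as many tokens as it consumes, and a trivial round-robin loop stands in for self-timed execution. The class, the rates and the three toy actor functions are assumptions, and the loop requires every actor to have at least one input, since a source with no inputs would fire forever.

```python
from collections import deque


class Actor:
    def __init__(self, name, func, inputs, outputs):
        self.name, self.func = name, func
        self.inputs = inputs      # list of (fifo, tokens consumed per firing)
        self.outputs = outputs    # list of (fifo, tokens produced per firing)

    def can_fire(self):
        """Firing rule: enough tokens on every input."""
        return all(len(fifo) >= rate for fifo, rate in self.inputs)

    def fire(self):
        args = [[fifo.popleft() for _ in range(rate)] for fifo, rate in self.inputs]
        results = self.func(*args)
        for (fifo, rate), tokens in zip(self.outputs, results):
            assert len(tokens) == rate, "actor broke its declared production rate"
            fifo.extend(tokens)


def run(actors):
    """Fire any enabled actor, round-robin, until nothing can fire."""
    trace, progress = [], True
    while progress:
        progress = False
        for actor in actors:
            if actor.can_fire():
                actor.fire()
                trace.append(actor.name)
                progress = True
    return trace


src, ab, bc, out = deque([0, 10, 20]), deque(), deque(), deque()
graph = [
    Actor("A", lambda xs: ([xs[0], xs[0] + 1],), [(src, 1)], [(ab, 2)]),
    Actor("B", lambda xs: ([sum(xs)],), [(ab, 3)], [(bc, 1)]),
    Actor("C", lambda xs: ([xs[0] * xs[1]],), [(bc, 2)], [(out, 1)]),
]
trace = run(graph)
```

With three input tokens the trace contains three firings of A, two of B and one of C. The "overhead vs. performance" question on the slide is about what this per-firing check costs once it lives inside a hardware kernel.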

SLIDE 21

Mapping (PiSDF) Dataflow Graphs To OpenCL Model

2.1.- Actor firing rules (scheduler)
    Actor I/O IFs, firing rules, templates

Kernel

K

Enough with channels sync? Borrow from CAPH?

  • Host code:
      • Platform initialization: automatic
      • Job management: automatic
      • Input data & result data
          • "only" necessary for the host/device frontier
          • pointers mapped to device buffers
  • Kernel code:
      • I/O interfaces (firing rules): automatic
      • Functionality: manual (provided by user)
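The split between automatic I/O and manual functionality suggests a template-based generator. The sketch below is a hypothetical illustration of that idea and not the PREESM code generator: Python emits kernel text in which blocking channel reads act as the firing rule, and the compute step is delegated to a user function named `<actor>_body` (an invented convention). The emitted text uses the Intel channel extension and autorun attributes as I understand them from the SDK guides; it has not been compiled here.

```python
def emit_channels(channels, dtype="int"):
    """Emit the channel declarations shared by all generated kernels."""
    header = ["#pragma OPENCL EXTENSION cl_intel_channels : enable"]
    return "\n".join(header + [f"channel {dtype} {name};" for name in channels])


def emit_actor(name, in_ch, out_ch, consume, produce, dtype="int"):
    """Emit one autorun kernel: generated I/O around a user-provided body."""
    return "\n".join([
        "__attribute__((max_global_work_dim(0)))",
        "__attribute__((autorun))",
        f"__kernel void {name}(void) {{",
        f"    {dtype} tok_in[{consume}], tok_out[{produce}];",
        f"    for (int i = 0; i < {consume}; i++)",
        f"        tok_in[i] = read_channel_intel({in_ch});    // waits for a token",
        f"    {name}_body(tok_in, tok_out);                   // manual: user code",
        f"    for (int i = 0; i < {produce}; i++)",
        f"        write_channel_intel({out_ch}, tok_out[i]);  // waits for space",
        "}",
    ])


# Hypothetical actor B of a chain A -> B -> C: consumes 3 tokens, produces 1
source = emit_channels(["ch_ab", "ch_bc"]) + "\n\n" + emit_actor(
    "actor_b", in_ch="ch_ab", out_ch="ch_bc", consume=3, produce=1)
```

Only rates and channel names come from the graph, so the user never writes I/O code. Whether blocking reads alone are a sufficient firing-rule mechanism is exactly the open question raised on the slide.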
SLIDE 22

Mapping (PiSDF) Dataflow Graphs To OpenCL Model

2.1.- Actor firing rules (scheduler)

Kernel

K

2.2.- Buffer generation
    Leverage current PREESM buffer generation
    Pipes vs Channels vs Ad-hoc Buffer
2.3.- Memory Accesses Optimization
    Streaming Dataflow
    Shared/Global Memory
    Local FPGA DDRs (kernel only)
    Different workloads? Out-of-order accesses?

Enough with channels sync? Borrow from CAPH?
Actor I/O IFs, firing rules, templates
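For step 2.2, a first estimate of FIFO depths can be obtained by replaying a schedule and recording the peak occupancy of each edge. The sketch below does only that; it is not PREESM's buffer-generation algorithm, and the graph, rates and schedules are assumed for illustration.

```python
def fifo_depths(edges, schedule):
    """edges: {edge_name: (src, dst, produced, consumed)}; schedule: actor names in
    firing order. Returns the peak number of tokens held on each edge."""
    level = {edge: 0 for edge in edges}
    peak = {edge: 0 for edge in edges}
    for actor in schedule:
        for edge, (src, dst, prod, cons) in edges.items():
            if dst == actor:                       # consume inputs first
                if level[edge] < cons:
                    raise ValueError(f"{actor} fired without enough tokens on {edge}")
                level[edge] -= cons
        for edge, (src, dst, prod, cons) in edges.items():
            if src == actor:                       # then produce outputs
                level[edge] += prod
                peak[edge] = max(peak[edge], level[edge])
    return peak


edges = {"ab": ("A", "B", 2, 3), "bc": ("B", "C", 1, 2)}
interleaved = fifo_depths(edges, ["A", "A", "B", "A", "B", "C"])
batched = fifo_depths(edges, ["A", "A", "A", "B", "B", "C"])
```

The interleaved schedule needs 4 slots on edge `ab`, the batched one needs 6. That dependence on firing order is why FIFO sizing and the choice between pipes, channels and ad-hoc buffers are tied to the scheduler decision in 2.1.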

SLIDE 23

future(future)

3.- Hack the Flow
    Kernel (actor) functionality
    Component library
    Host-Device communications
    Area / Latency / Throughput
    Device Wrapper
    Plug HDL / CAPH
    Compute upper bound?
    Open Run Time?
    Dynamic Reconfiguration?
    Intel Xeon + FPGA
    Graph Reconfiguration
    HPC community
    DSE: Predictability
4.- New devices

SLIDE 24

Thanks for your attention!!

ruben.salvador@upm.es
https://twitter.com/RubenSalvadorP
http://blogs.upm.es/rubensalvador/