
Hardwired networks on chip for FPGAs
Kees Goossens (TUD, NXP), Muhammad Aqeel Wahlah (TUD)



  1. Slide 1: Hardwired networks on chip for FPGAs
     Kees Goossens (TUD, NXP), Muhammad Aqeel Wahlah (TUD)
     Kees Goossens 2009-06-02 Tubs.CITY

     Slide 2: overview
     applications
     network on chip
     FPGA
     key ideas
       – hardwired NOC
       – unified interconnect
       – data coercion / type casting
     dynamic partial reconfiguration
       – multiple applications
       – multiplex sub-applications (“hardware tasks”)
     example
     conclusions

  2. Slide 3: applications
     [figure: task graphs with tasks A1, BA, A2, BAC, C1, C2, C3, T1, T2, T3]
     task / function mapped on IP
       – includes storage / buffering
     application: set of communicating IPs / tasks / ...
       – data, control, code
       – communication via connections
     use case: set of concurrent applications

     Slide 4: network on chip (NOC)
     [figure: NOC built from R and NI blocks connecting IPs that run T1, T2, T3, A1, A2, BA, BAC]
     connects ports on hardware blocks (IP)
       – data, control
     connections: virtual wires
     programmable at run-time
       – set up & destroy connections by programming control registers in the NOC
     styles of communication
       – address-based / memory-mapped
       – streaming
     real-time / quality of service
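
The run-time programmability on the NOC slide (connections are set up and destroyed by writing control registers) can be illustrated with a toy model. This is only a sketch under stated assumptions: the class `Noc`, its methods, and the port names are hypothetical and are not taken from the slides or from any real NOC programming interface.

```python
class Noc:
    """Toy model of a run-time programmable NOC (all names are hypothetical)."""

    def __init__(self):
        # Control registers: connection id -> (source port, destination port).
        self.registers = {}
        self._next_id = 0

    def set_up(self, src, dst):
        """Program a control register to create a virtual wire src -> dst."""
        conn_id = self._next_id
        self._next_id += 1
        self.registers[conn_id] = (src, dst)
        return conn_id

    def destroy(self, conn_id):
        """Clear the control register, removing the virtual wire."""
        del self.registers[conn_id]

    def send(self, conn_id, words):
        """Deliver words over an existing connection; returns (dst, words)."""
        if conn_id not in self.registers:
            raise KeyError(f"connection {conn_id} is not set up")
        _, dst = self.registers[conn_id]
        return dst, list(words)


noc = Noc()
c = noc.set_up("A1.out", "BA.in")   # connect task A1 to task BA
delivered = noc.send(c, [1, 2, 3])  # traffic flows only while the connection exists
noc.destroy(c)                      # afterwards the virtual wire is gone
```

The point of the sketch is that changing who talks to whom is a register write, not a change to the fabric.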

  3. Slide 5: FPGA fabric
     [figure: FPGA with LUT regions, IO, processor, CPU, de/encrypt accelerator, off-chip memory, on-chip memories, ICAP]
     soft IP are configured in
       – configurable elements (LUT)
       – and switch boxes (not shown)
     with a given configuration granularity (frame)
     using the configuration interconnect (ICAP)
     hard IP
       – CPU
       – on-chip memories (BRAM, ...)
       – off-chip memory interfaces
       – decryption IP
       – etc.

     Slide 6: application on FPGA
     [figure: tasks A1, A2, BA, BAC mapped on the FPGA; soft control interconnect and soft data interconnect]
     map application (IPs + interconnect + storage) on soft + hard IP
     traditionally data and control interconnects are separate
     could also use NOC for both

  4. Slide 7: multiple applications on FPGA
     [figure: tasks A1, A2, BA, BAC, T1, T2, T3 mapped on the FPGA; soft control interconnect and soft data interconnect]
     interconnects and IPs of different applications share reconfiguration regions (frames)
       – dynamic reconfiguration is global, not partial
       – applications interfere

     Slide 8: overview (agenda repeated from slide 2)

  5. Slide 9: 1. hardwired interconnect
     [figure: FPGA with CFRs attached to hard interconnect(s); tasks A1, A2, BA, BAC, T1, T2, T3]
     replace soft interconnect(s) by hard interconnect(s)
     interconnect regions of LUTs (CFR)
     ~35 X smaller area
     ~5 X higher speed
       – program, don’t configure
     bit-level (CFR) vs. transaction-level (NOC) reconfigurability
       – memory mapped
       – streaming

     Slide 10: 1. hardwired interconnect
     [figure: FPGA with CFRs attached to hard interconnect(s); tasks C1, C2, C3, BAC, T1, T2, T3]
     dynamic partial reconfiguration
       – no constraints on soft IP placement
     loss of flexibility
       – fewer LUTs

  6. Slide 11: 2. unified interconnect
     [figure: FPGA with CFRs attached to a single hard interconnect; tasks A1, A2, BA, BAC, T1, T2, T3]
     one interconnect (e.g. NOC) for
       – data for functional mode
       – control for programming
       – bitstreams for configuration
     dynamic partitioning of different interconnects

     Slide 12: 3. data coercion
     [figure: bitstream and data paths over the single hard interconnect]
     data = control = bitstream = …
     connect a data port to a configuration port
       – decrypt bitstreams

  7. Slide 13: 3. data coercion
     [figure: bitstreams over the single hard interconnect; labels PH, IP]
     data = control = bitstream
     connect a data port to a configuration port
       – decrypt bitstreams
       – run-time compute / optimise bitstreams
         • JIT, peephole

     Slide 14: 3. data coercion
     [figure: test data over the single hard interconnect; labels TR, TV, DUT]
     data = control = bitstream = test data
     connect a data port to a configuration port
       – decrypt bitstreams
       – run-time compute / optimise bitstreams
     connect a data port to a test port
       – run-time structural test
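
The data coercion idea above (data = control = bitstream = test data) amounts to a transport that does not care what kind of port receives the words. The sketch below is illustrative only: `PortKind`, `Port`, `transfer`, and `xor_decrypt` are hypothetical names, and the XOR step is a stand-in for the de/encrypt accelerator, not real decryption.

```python
from enum import Enum


class PortKind(Enum):
    DATA = "data"
    CONFIG = "config"
    TEST = "test"


def xor_decrypt(words, key):
    """Stand-in for the de/encrypt accelerator (XOR is NOT real decryption)."""
    return [w ^ key for w in words]


class Port:
    """A destination port; its kind is a label that the transport ignores."""

    def __init__(self, name, kind):
        self.name = name
        self.kind = kind
        self.received = []

    def accept(self, words):
        self.received.extend(words)


def transfer(words, dst, stages=()):
    """Move words to dst through optional processing stages.

    The transport never inspects dst.kind: data, bitstreams and test
    vectors are all just words on the same interconnect.
    """
    for stage in stages:
        words = stage(words)
    dst.accept(words)


KEY = 0xFF  # assumed key, for illustration only
encrypted_bitstream = [0xAB ^ KEY, 0xCD ^ KEY]

cfr_config = Port("CFR.config", PortKind.CONFIG)
ip_data = Port("IP.data", PortKind.DATA)
dut_test = Port("DUT.test", PortKind.TEST)

# Bitstream: memory -> decrypt stage -> configuration port.
transfer(encrypted_bitstream, cfr_config, stages=[lambda w: xor_decrypt(w, KEY)])
# Ordinary data and test vectors use the very same transfer function.
transfer([1, 2], ip_data)
transfer([0b1010], dut_test)
```

A run-time bitstream optimiser (the JIT / peephole bullet) would simply be another entry in `stages`.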

  8. Slide 15: overview (agenda repeated from slide 2)

     Slide 16: dynamic partial reconfiguration
     [figure: applications scheduled over time; tasks T1, T2, T3, A1, BA, A2, BAC, C1, C2, C3]
     “hardware operating system” implements run-time scheduling of
     1. multiple concurrent applications
       – independent applications on own virtual platform
         • no communication, no interference
       – activation given by user, environment, etc.

  9. Slide 17: dynamic partial reconfiguration
     [figure: applications scheduled over time; tasks A1, BA, A2, BAC, C1, C2, C3]
     “hardware operating system” implements run-time scheduling of
     1. multiple concurrent applications
     2. parts of single applications (soft IP, “hardware tasks”)
       – multiplex resources of a single application

     Slide 18: dynamic partial reconfiguration
     [figure: as slide 17, with state carried between tasks]
     “hardware operating system” implements run-time scheduling of
     1. multiple concurrent applications
     2. parts of single applications (soft IP, “hardware tasks”)
       – multiplex resources of a single application
       – internal state

  10. Slide 19: dynamic partial reconfiguration
      [figure: system manager and application managers over time; applications T, A, C, BAC]
      1. system manager
        – resource management (CFR, NOC, …)
          • inter-application virtual platforms

      Slide 20: dynamic partial reconfiguration
      [figure: system manager and application manager over time; applications A, C, BAC]
      1. system manager
        – resource management (CFR, NOC, …)
          • inter-application virtual platforms
          • intra-application phases
        – NOC programming
        – soft IP / (sub)-application configuration

  11. Slide 21: dynamic partial reconfiguration
      [figure: system manager and application managers over time; applications T, A, C, BAC]
      1. system manager
      2. application manager
        – application programming

      Slide 22: dynamic partial reconfiguration
      [figure: system manager and application manager over time; tasks A1, BA, A2, BAC, C1, C2, C3; state]
      1. system manager
      2. application manager
        – application programming
        – intra-application persistent data management
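
The division of labour between the two managers can be sketched as follows: the system manager hands out CFRs so that applications do not share them, and each application manager multiplexes its own hardware tasks onto its CFRs while keeping their persistent state. All class names, method names, and the state representation are assumptions made for illustration; the slides do not define an interface.

```python
class SystemManager:
    """Hands out CFRs so that no two applications share one (hypothetical API)."""

    def __init__(self, cfrs):
        self.free = list(cfrs)
        self.owner = {}  # CFR name -> application name

    def allocate(self, app, count):
        if count > len(self.free):
            raise RuntimeError(f"not enough free CFRs for {app}")
        granted = [self.free.pop(0) for _ in range(count)]
        for cfr in granted:
            self.owner[cfr] = app
        return granted

    def release(self, app):
        for cfr in [c for c, a in self.owner.items() if a == app]:
            del self.owner[cfr]
            self.free.append(cfr)


class ApplicationManager:
    """Multiplexes the tasks of one application onto its own CFRs."""

    def __init__(self, app, cfrs):
        self.app = app
        self.loaded = {cfr: None for cfr in cfrs}  # CFR -> task configured now
        self.saved_state = {}  # task -> persistent state kept across swaps

    def swap_in(self, cfr, task, state_of_outgoing=None):
        """Replace the task in cfr; return the incoming task's saved state."""
        outgoing = self.loaded[cfr]  # KeyError if the CFR is not ours
        if outgoing is not None:
            self.saved_state[outgoing] = state_of_outgoing
        self.loaded[cfr] = task
        return self.saved_state.get(task)


system = SystemManager(["CFR0", "CFR1", "CFR2"])
app_c = ApplicationManager("C", system.allocate("C", 1))
app_a = ApplicationManager("A", system.allocate("A", 2))

app_c.swap_in("CFR0", "C1")
app_c.swap_in("CFR0", "C2", state_of_outgoing={"count": 7})  # C1 leaves, state kept
restored = app_c.swap_in("CFR0", "C1")                       # C1 returns with its state
```

Because `app_c` only knows about `CFR0`, it cannot touch the CFRs of application A, which mirrors the "no communication, no interference" property of the virtual platforms.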

  12. Slide 23: overview (agenda repeated from slide 2)

      Slide 24: modelling
      SystemC
        – bit & cycle accurate NOC model
        – behavioural CFR models
        – accurate bitstream structure
        – behavioural hard IP models
      model
        – starting / stopping of applications
          • dynamic, based on user input
        – starting / stopping of sub-applications
          • dynamic, based on flow of data
        – configuration: loading of bitstreams for soft IP; clock & reset
        – programming: of NOC, system & sub-application managers
        – management of persistent state

  13. Slide 25: example
      [figure: FPGA with CFRs on a single hard interconnect; system manager, application manager, tasks A1, A2, BA, BAC]
      system manager
        – program NOC for configuration

      Slide 26: example
      [figure: as slide 25; legend: bitstream, programming, data]
      system manager
        – program NOC for configuration
        – configure: load bitstreams
          • including bitstream syntax, etc.
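
The two steps on the example slides have an ordering: the NOC is programmed for configuration first, and only then are bitstreams loaded over it. A minimal sketch of that ordering, assuming a hypothetical `ConfigPath` class and an assumed frame size in words (neither comes from the slides):

```python
class ConfigPath:
    """Toy model of the two steps on the example slides (names hypothetical)."""

    def __init__(self, frame_words):
        self.frame_words = frame_words  # assumed frame size, in words
        self.programmed = False
        self.frames_loaded = []

    def program_noc(self):
        """Step 1: open a connection to the CFR's configuration port."""
        self.programmed = True

    def configure(self, bitstream):
        """Step 2: stream the bitstream over that connection, frame by frame."""
        if not self.programmed:
            raise RuntimeError("NOC must be programmed before configuration")
        # Crude stand-in for a bitstream syntax check: whole frames only.
        if len(bitstream) % self.frame_words != 0:
            raise ValueError("bitstream is not a whole number of frames")
        for i in range(0, len(bitstream), self.frame_words):
            self.frames_loaded.append(bitstream[i:i + self.frame_words])


path = ConfigPath(frame_words=2)
path.program_noc()
path.configure([0x10, 0x11, 0x20, 0x21])
```

Calling `configure` before `program_noc` raises an error in this sketch, since no connection to the configuration port exists yet.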
