
Hardwired networks on chip for FPGAs

Kees Goossens (TUD, NXP) Muhammad Aqeel Wahlah (TUD)

Kees Goossens, 2009-06-02, Tubs.CITY

overview

applications
network on chip
FPGA
key ideas
– hardwired NOC
– unified interconnect
– data coercion / type casting
dynamic partial reconfiguration
– multiple applications
– multiplex sub-applications (“hardware tasks”)
example
conclusions


applications

[Figure: task graphs with nodes T1, T2, T3; C1, C2, C3; A1, A2; further labels BAC and BA]

task / function mapped on IP
– includes storage / buffering
application: set of communicating IPs / tasks / ...
– data, control, code
– communication via connections
use case: set of concurrent applications
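
The three terms on this slide (task, application, use case) nest naturally, which a small data model can make explicit. The following is a minimal sketch only: the class names, the `kind` field, and the chain topology between tasks are assumptions made for illustration, while the task names are taken from the figure.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Connection:
    """A directed channel between two tasks (hypothetical model)."""
    source: str
    target: str
    kind: str = "data"  # the slide lists data, control, and code


@dataclass
class Application:
    """A set of communicating tasks plus the connections between them."""
    name: str
    tasks: set[str] = field(default_factory=set)
    connections: list[Connection] = field(default_factory=list)

    def connect(self, source: str, target: str, kind: str = "data") -> None:
        # Both endpoints must be tasks of this application.
        if source not in self.tasks or target not in self.tasks:
            raise ValueError(f"unknown task in {source}->{target}")
        self.connections.append(Connection(source, target, kind))


@dataclass
class UseCase:
    """A set of applications that run concurrently."""
    applications: list[Application] = field(default_factory=list)

    def all_tasks(self) -> set[str]:
        return {t for app in self.applications for t in app.tasks}


# Task names follow the figure; the chain topology is an assumption.
app_t = Application("T", {"T1", "T2", "T3"})
app_t.connect("T1", "T2")
app_t.connect("T2", "T3")
app_a = Application("A", {"A1", "A2"})
app_a.connect("A1", "A2")
use_case = UseCase([app_t, app_a])
```

Here `use_case.all_tasks()` collects the tasks of both applications, which is the set of IPs that must be present on the platform while this use case is active.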


network on chip (NOC)

connects ports on hardware blocks (IP)
– data, control
connections: virtual wires
programmable at run-time
– set up & destroy connections by programming control registers in the NOC
styles of communication
– address-based / memory-mapped
– streaming
real-time / quality of service

[Figure: NOC built from routers (R) and network interfaces (NI), with IPs attached to the NIs; tasks T1, T2, T3, A1, A2 mapped on the IPs]
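
Setting up and destroying connections by writing control registers can be sketched as follows. This is a toy model under stated assumptions: the register layout (one slot per connection holding a source and destination port), the class name `NocControl`, and the port names are invented here and do not describe the register map of any real NOC.

```python
class NocControl:
    """Toy model of NOC control registers: one register per connection slot."""

    def __init__(self, num_slots):
        # None marks a free slot; a used slot holds (source port, destination port).
        self.registers = [None] * num_slots

    def set_up(self, src_port, dst_port):
        """Program the first free slot with a new connection; return its index."""
        for slot, value in enumerate(self.registers):
            if value is None:
                self.registers[slot] = (src_port, dst_port)
                return slot
        raise RuntimeError("no free connection slot")

    def destroy(self, slot):
        """Clear the slot so the connection no longer exists."""
        if self.registers[slot] is None:
            raise ValueError(f"slot {slot} is not in use")
        self.registers[slot] = None

    def route(self, src_port):
        """Return every destination currently connected to src_port."""
        return [entry[1] for entry in self.registers
                if entry is not None and entry[0] == src_port]


noc_control = NocControl(num_slots=4)
slot = noc_control.set_up("A1.out", "A2.in")  # hypothetical port names
```

After `noc_control.destroy(slot)` the slot is free again, so a later application can reuse it without any change to the hardware.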


FPGA fabric

[Figure: FPGA fabric with LUT regions, IO, processor (CPU), on-chip memories, off-chip memory, de/encrypt accelerator, and ICAP]

soft IP are configured in
– configurable elements (LUT)
– and switch boxes (not shown)
with a given configuration granularity (frame)
using the configuration interconnect (ICAP)
hard IP
– CPU
– on-chip memories (BRAM, ...)
– off-chip memory interfaces
– decryption IP
– etc.


application on FPGA

[Figure: application (A1, A2, ...) mapped on the FPGA fabric, with a soft data interconnect and a soft control interconnect]

map application (IPs + interconnect + storage) on soft + hard IP
traditionally data and control interconnects are separate
could also use NOC for both


multiple applications on FPGA

[Figure: applications A (A1, A2) and T (T1, T2, T3) mapped on the same FPGA fabric, sharing the soft data interconnect and soft control interconnect]

interconnects and IPs of different applications share reconfiguration regions (frames)
– dynamic reconfiguration is global, not partial
– applications interfere
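
The interference described here can be made concrete with a small check. The sketch assumes that each application's placement is known as a set of configuration-frame indices; the function name and all frame numbers are invented for illustration.

```python
from collections import defaultdict


def shared_frames(placement: dict[str, set[int]]) -> dict[int, set[str]]:
    """Return the frames that are used by more than one application.

    `placement` maps an application name to the frame indices occupied by
    its soft IPs and its soft interconnect (hypothetical numbers).
    """
    users = defaultdict(set)
    for app, frames in placement.items():
        for frame in frames:
            users[frame].add(app)
    return {frame: apps for frame, apps in users.items() if len(apps) > 1}


# Soft interconnect wires cross the fabric, so in this made-up placement
# applications A and T both touch frames 2 and 3.
overlapping = {"A": {0, 1, 2, 3}, "T": {2, 3, 4, 5}}
# A placement in which the two applications occupy disjoint frames.
disjoint = {"A": {0, 1}, "T": {4, 5}}

conflicts = shared_frames(overlapping)
no_conflicts = shared_frames(disjoint)
```

Any frame reported in `conflicts` must be rewritten when either application is reconfigured, which is why the reconfiguration cannot be partial in that case.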




1. hardwired interconnect

[Figure: FPGA with regions of LUTs (CFR), IO, processor (CPU), on-chip memories, off-chip memory, de/encrypt accelerator, and ICAP, connected by hard interconnect(s); soft IPs of applications A, T, and C placed in the CFRs]

replace soft interconnect(s) by hard interconnect(s)
interconnect regions of LUTs (CFR)
~35x smaller area
~5x higher speed
– program, don’t configure
bit-level (CFR) vs. transaction-level (NOC) reconfigurability
– memory mapped
– streaming
dynamic partial reconfiguration
– no constraints on soft IP placement
loss of flexibility
– fewer LUTs


2. unified interconnect

[Figure: same platform, with a single hard interconnect connecting the CFRs, the hard IP, and ICAP]

one interconnect (e.g. NOC) for
– data for functional mode
– control for programming
– bitstreams for configuration
dynamic partitioning of different interconnects

3. data coercion

[Figure: single hard interconnect connecting CFRs, IO, processor (CPU), on-chip memories, off-chip memory, and the de/encrypt accelerator. Successive builds show: bitstream and data passing through the de/encrypt accelerator; a bitstream passing through a block labelled PH IP; test data and data passing between blocks labelled TR, TV, and DUT]

data = control = bitstream = test
connect a data port to a configuration port
– decrypt bitstreams
– run-time compute / optimise bitstreams
  • JIT, peephole
connect a data port to a test port
– run-time structural test
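
The idea that the interconnect does not care whether words are data, control, or bitstream can be illustrated with a short sketch. Everything named here is an assumption: the `Interconnect` class, the port names, and the XOR "decryption", which merely stands in for the de/encrypt accelerator and is not a real cipher.

```python
from typing import Callable

Word = int
Port = Callable[[list[Word]], None]


class Interconnect:
    """Moves words between named ports without interpreting them."""

    def __init__(self) -> None:
        self.ports: dict[str, Port] = {}

    def attach(self, name: str, port: Port) -> None:
        self.ports[name] = port

    def send(self, dst: str, words: list[Word]) -> None:
        # The same transfer serves data, control, bitstream, and test traffic.
        self.ports[dst](words)


KEY = 0x5A  # toy key; XOR is only a placeholder for real decryption

config_memory: list[Word] = []  # what the configuration port receives
noc = Interconnect()
noc.attach("cfr0.config", config_memory.extend)
# The accelerator's data output is connected to a configuration port.
noc.attach("decrypt.in",
           lambda ws: noc.send("cfr0.config", [w ^ KEY for w in ws]))

plain_bitstream = [0x01, 0x02, 0x03]
encrypted = [w ^ KEY for w in plain_bitstream]
noc.send("decrypt.in", encrypted)
```

The encrypted bitstream enters the accelerator as ordinary data and leaves it as configuration input, so no dedicated decrypt-and-configure path is needed. Connecting a data port to a test port would follow the same pattern.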




dynamic partial reconfiguration

“hardware operating system” implements run-time scheduling of
1. multiple concurrent applications
– independent applications on own virtual platform
  • no communication, no interference
– activation given by user, environment, etc.
2. parts of single applications (soft IP, “hardware tasks”)
– multiplex resources of a single application
– internal state

[Figure: task graphs (T1, T2, T3; C1, C2, C3; A1, A2) and timelines showing applications and sub-applications scheduled over time, with state carried along in the second case]
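
Multiplexing several hardware tasks over one region implies saving and restoring their internal state around each reconfiguration. The sketch below assumes a round-robin order and a single integer of state per task; both are simplifications chosen for illustration and are not taken from the slides.

```python
def multiplex(task_names, slots):
    """Round-robin hardware tasks over one region for `slots` time slots.

    Returns a trace of (task name, state after running) pairs.
    """
    saved = {}  # persistent state per task, kept outside the region
    trace = []
    for slot in range(slots):
        name = task_names[slot % len(task_names)]
        region_state = 0                       # reconfiguration wipes the region
        region_state = saved.get(name, region_state)  # restore earlier state
        region_state += 1                      # stand-in for the task running
        saved[name] = region_state             # save before the region is reused
        trace.append((name, region_state))
    return trace


trace = multiplex(["C1", "C2", "C3"], slots=5)
```

Without the `saved` dictionary every task would restart from zero each time it is loaded, which is the problem the “internal state” bullet points at.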


dynamic partial reconfiguration

1. system manager
– resource management (CFR, NOC, …)
  • inter-application virtual platforms
  • intra-application phases
– NOC programming
– soft IP / (sub)-application configuration
2. application manager
– application programming
– intra-application persistent data management

[Figure: timeline showing the system manager and the application managers of the running applications; state is passed between the sub-applications of one application]




modelling

SystemC
– bit & cycle accurate NOC model
– behavioural CFR models
– accurate bitstream structure
– behavioural hard IP models
model
– starting / stopping of applications
  • dynamic, based on user input
– starting / stopping of sub-applications
  • dynamic, based on flow of data
– configuration: loading of bitstreams for soft IP; clock & reset
– programming: of NOC, system & sub-application managers
– management of persistent state


example

[Figure: platform with a single hard interconnect connecting CFRs, IO, processor (CPU), on-chip memories, off-chip memory, and de/encrypt accelerator; the system manager and application manager send bitstream, programming, and data traffic over it to application A (A1, A2)]

system manager
– program NOC for configuration
– configure: load bitstreams
  • including bitstream syntax, etc.
– program NOC for (sub)-application A
– program & start application manager
  • including clocking & reset
application manager
– programs & starts sub-app A
  • soft IP function is modelled by CFR
sub-application A runs
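
The order of operations in this example can be written down as a short script. This is a sketch of the sequence only: the `Platform` class just records what would happen, and the region names `CFR0` and `CFR1` and the mapping of A1 and A2 onto them are assumptions.

```python
class Platform:
    """Records the order of operations; every method is a stand-in."""

    def __init__(self) -> None:
        self.log: list[str] = []

    def program_noc(self, purpose: str) -> None:
        self.log.append(f"program NOC for {purpose}")

    def load_bitstream(self, region: str, soft_ip: str) -> None:
        self.log.append(f"configure {region} with {soft_ip}")

    def start(self, what: str) -> None:
        self.log.append(f"start {what}")


def application_manager(platform: Platform, sub_app: str) -> None:
    # The application manager programs and starts its sub-application.
    platform.start(f"sub-application {sub_app}")


def system_manager(platform: Platform, sub_app: str,
                   soft_ips: dict[str, str]) -> None:
    platform.program_noc("configuration")        # open paths to the CFRs
    for region, soft_ip in soft_ips.items():
        platform.load_bitstream(region, soft_ip)  # configure: load bitstreams
    platform.program_noc(f"sub-application {sub_app}")
    platform.start("application manager")        # includes clocking & reset
    application_manager(platform, sub_app)


platform = Platform()
system_manager(platform, "A", {"CFR0": "A1", "CFR1": "A2"})
```

Reading `platform.log` top to bottom gives the same steps as the slide: the NOC is programmed twice, first to carry bitstreams and then to carry the application's own traffic.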


conclusions

ideas:
– hardwire NOC
– unified interconnects
– data coercion / type casting
very detailed model
many simplifications & restrictions
many open issues
– design flow: soft IP placement, binding, relocation, etc.
– application model:
  • extend use-case model with intra-application dynamism
  • more general notions of persistent state
– implementation: separation of system & application managers
