Reconfigurable Computing: On-line communication strategies



slide-1
SLIDE 1

Reconfigurable Computing

On-line communication strategies
Chapter 7

Prof. Dr.-Ing. Jürgen Teich
Lehrstuhl für Hardware-Software-Co-Design

slide-2
SLIDE 2

On-line connection - Motivation

  • Routing-conscious temporal placement algorithms consider the distance among components during placement.
  • However, they do not consider the implementation of the dynamic connection mechanism required for communication among components.
  • In this section, we investigate existing approaches to the communication problem between components that are dynamically placed on and removed from the device, namely:
      – Bus-based approaches
      – Circuit routing
      – Network-on-Chip (NoC) approaches

slide-3
SLIDE 3

BUS-based communication

slide-4
SLIDE 4

BUS-oriented communication

  • Many components connected at fixed locations
  • One arbiter for BUS management
  • SoC (System on Chip): buses can be used to connect different modules
      – ARM AMBA: Advanced High-performance Bus (AHB), Advanced Peripheral Bus (APB)
      – IBM CoreConnect: Processor Local Bus (PLB), On-chip Peripheral Bus (OPB)
      – Silicore Wishbone

[Figure: modules Mod 1, Mod 2, Mod 3, Mod 4 and the Arbiter]

slide-5
SLIDE 5

BUS-oriented communication

  • Using a standard bus arbiter (Becker)
      – The device is divided into slots
      – Each task must be placed in a slot
      – Each component implements the bus transaction
      – Each component can be a master
      – An arbiter manages the bus assignment

[Figure labels: OS-frame, Decompressor, Control, Bus-Macro, Master Module, Controller, Com, ICAP, Module 0 to Module 4 (each with Mod and Com). Source: ITIV, Uni Karlsruhe (TH)]
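
The slide only states that an arbiter manages the bus assignment among slots that can each act as master. As a minimal sketch, assuming a round-robin policy (the source does not name the arbitration scheme) and a hypothetical class name `RoundRobinArbiter`, the grant decision could look like this:

```python
class RoundRobinArbiter:
    """Grants the shared bus to one requesting slot at a time.

    Round-robin is an assumption made for illustration; the slides do not
    say which arbitration policy is used.
    """

    def __init__(self, num_slots):
        self.num_slots = num_slots
        # Start so that slot 0 is the first one to be considered.
        self.last_granted = num_slots - 1

    def grant(self, requests):
        """Return the slot that becomes bus master, or None if no slot requests.

        requests: set of slot indices whose modules currently want the bus.
        """
        for offset in range(1, self.num_slots + 1):
            slot = (self.last_granted + offset) % self.num_slots
            if slot in requests:
                self.last_granted = slot
                return slot
        return None


arbiter = RoundRobinArbiter(num_slots=4)
first = arbiter.grant({1, 3})   # slot 1 is served first
second = arbiter.grant({1, 3})  # then slot 3, so no requester starves
```

Starting the search just after the last granted slot is what keeps one busy master from locking out the others.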
slide-6
SLIDE 6

Communication via the OS

  • Encapsulating the BUS transaction in a wrapper (Platzner, Walder)
      – Divide the device into slots
      – Each task must be placed in a given slot
      – A slot is enveloped in a wrapper which hides the bus-transaction process
  • Communication takes place through a fixed module called the OS.
      – Each module can send a message by writing to its send buffer
      – The OS copies messages from the send buffers to the receive buffers of modules
      – The receiving modules read their messages from their receive buffers

[Figure: OS-frame, four task slots, Inter Frame Communication Channels (IFCC)]
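
The send-buffer / receive-buffer scheme can be sketched in software as follows. This is only an illustration: the class names `TaskSlot` and `OSFrame`, the `(destination, payload)` message layout and the slot-by-slot copy order are assumptions, not details taken from the slides.

```python
from collections import deque


class TaskSlot:
    """One task slot with the two buffers its wrapper exposes."""

    def __init__(self):
        self.send_buffer = deque()     # entries: (destination slot, payload)
        self.receive_buffer = deque()  # entries: (source slot, payload)


class OSFrame:
    """Fixed module that moves messages between task slots."""

    def __init__(self, num_slots):
        self.slots = [TaskSlot() for _ in range(num_slots)]

    def send(self, src, dst, payload):
        # A module sends by writing to its own send buffer only.
        self.slots[src].send_buffer.append((dst, payload))

    def transfer(self):
        """Copy all pending messages to their receivers; return how many moved."""
        moved = 0
        for src, slot in enumerate(self.slots):
            while slot.send_buffer:
                dst, payload = slot.send_buffer.popleft()
                self.slots[dst].receive_buffer.append((src, payload))
                moved += 1
        return moved

    def receive(self, slot_id):
        """Read the oldest message from a slot's receive buffer, or None."""
        buf = self.slots[slot_id].receive_buffer
        return buf.popleft() if buf else None
```

A message becomes visible to the receiver only after `transfer()` has run, which mirrors the fact that modules never talk to each other directly.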

slide-7
SLIDE 7

Communication via the OS

  • Communication with off-chip modules is also done via the OS

[Figure: OS-frame]

slide-8
SLIDE 8

Circuit switching

slide-9
SLIDE 9

Dynamic Networks – circuit routing

  • Architecture:
      – Set of processing elements (PEs)
      – Communication signals are set up between two PEs using a set of switches on a path from the source to the destination
  • Advantage: direct communication, no need to process packets
  • Drawbacks:
      – Computing a route is expensive and difficult to do on-line
      – Routed lines create a large amount of prohibited area
  • Prohibited area can be overcome by using an extra layer exclusively for circuit routing

[Figure: prohibited area]

slide-10
SLIDE 10

The reconfigurable multiple bus (RMB) approach

  • A set of n processing elements and k segmented buses
  • Crosspoints (switches) are used to set the connection between the segments at run-time

[Figure: PE 1 to PE 5 and the switches]

slide-11
SLIDE 11

The reconfigurable multiple bus (RMB) approach

  • The sender always initiates a communication request and terminates (frees) an established communication path
  • Each communication path is granted until the end of the communication

[Figure: OS-frame, PE 1 to PE 5]
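
A simplified software model of the request/release protocol may help. It assumes the n PEs sit in a row, that each of the n-1 gaps between neighbouring PEs offers k bus lines, and that a crosspoint may pick any free line per segment. The class name `ReconfigurableMultipleBus` and the first-free-line choice are illustrative assumptions, not the published RMB algorithm.

```python
class ReconfigurableMultipleBus:
    """Toy model: n PEs in a row, k bus lines on every segment between them."""

    def __init__(self, num_pes, num_buses):
        self.num_pes = num_pes
        # owner[s][b] is the sender holding line b of the segment between
        # PE s and PE s+1, or None while that line is free.
        self.owner = [[None] * num_buses for _ in range(num_pes - 1)]

    def request(self, sender, receiver):
        """Reserve one line on every segment between sender and receiver.

        Returns the reserved path as a list of (segment, line), or None if
        some segment has no free line left.
        """
        lo, hi = sorted((sender, receiver))
        path = []
        for seg in range(lo, hi):
            free = [b for b, o in enumerate(self.owner[seg]) if o is None]
            if not free:
                return None  # nothing is reserved when the request fails
            path.append((seg, free[0]))
        for seg, line in path:
            self.owner[seg][line] = sender
        return path

    def release(self, sender, path):
        """The sender frees the path it established earlier."""
        for seg, line in path:
            if self.owner[seg][line] == sender:
                self.owner[seg][line] = None
```

With a single bus line, a path from PE 0 to PE 3 blocks every request that needs one of its segments until the sender releases it, which is the "granted until the end of the communication" rule.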

slide-12
SLIDE 12

The reconfigurable multiple bus (RMB) approach

  • On a column-wise reconfigurable device, the RMB provides a modular communication infrastructure
  • All the switches in one column are grouped together
  • The separation of horizontal reconfigurable regions is done via bus macros

[Figure: OS-frame, PE 1 to PE 5, bus macros]

slide-13
SLIDE 13

Algorithms for Reconfiguration

[Figure: tasks T1 to T9 and module M1]

slide-14
SLIDE 14

Algorithms for Reconfiguration

[Figure: tasks T1, T2, T3 and modules M1, M2, M3 on an FPGA connected by the RMB]

slide-15
SLIDE 15

Algorithms for Reconfiguration

[Figure: modules M1, M2, M3 on an FPGA connected by the RMB]

slide-16
SLIDE 16

Algorithms for Reconfiguration

[Figure: modules M1, M2, M3 on an FPGA connected by the RMB]

slide-17
SLIDE 17

Algorithms for Reconfiguration

[Figure: modules M1, M2, M3 on an FPGA connected by the RMB]

slide-18
SLIDE 18

Algorithms for Reconfiguration

[Figure: modules M1, M2, M3 on an FPGA connected by the RMB]

slide-19
SLIDE 19

Algorithms for Reconfiguration

[Figure: three arrangements of modules M1, M2, M3 on an FPGA connected by the RMB]

With σ an arrangement of the n modules and E the set of connections:

  • Minimum Bandwidth (MBW): $\min_{\sigma} \max_{(i,j) \in E} |\sigma(i) - \sigma(j)|$
  • Minimum Cutwidth Linear Arrangement (MCLA): $\min_{\sigma} \max_{k = 1, \dots, n} |\{(i,j) \in E : \sigma(i) \le k \le \sigma(j)\}|$
  • Optimal Linear Arrangement (OLA): $\min_{\sigma} \sum_{(i,j) \in E} |\sigma(i) - \sigma(j)|$
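
To make the three objectives concrete, the sketch below evaluates them for one fixed arrangement. The graph and the arrangement are made-up illustration data. The cut at position k is assumed to count every connection that starts, ends or passes at slot k; the usual textbook cutwidth counts only connections crossing the gap after k, which would mean replacing the second `<=` by `<`.

```python
def bandwidth(sigma, edges):
    """MBW objective: length of the longest connection."""
    return max(abs(sigma[i] - sigma[j]) for i, j in edges)


def cutwidth(sigma, edges):
    """MCLA objective: largest number of connections present at any slot k.

    Assumption: a connection is counted at slot k if k lies between its two
    endpoints, endpoints included.
    """
    n = len(sigma)
    return max(
        sum(1 for i, j in edges
            if min(sigma[i], sigma[j]) <= k <= max(sigma[i], sigma[j]))
        for k in range(1, n + 1)
    )


def total_length(sigma, edges):
    """OLA objective: sum of all connection lengths."""
    return sum(abs(sigma[i] - sigma[j]) for i, j in edges)


# Hypothetical example: four modules placed in slots 1..4.
sigma = {"A": 1, "B": 2, "C": 3, "D": 4}
edges = [("A", "B"), ("A", "C"), ("C", "D")]

mbw = bandwidth(sigma, edges)      # longest connection is A-C
mcla = cutwidth(sigma, edges)
ola = total_length(sigma, edges)
```

The optimization problems then minimize each of these values over all arrangements σ.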

slide-20
SLIDE 20

Algorithms for Reconfiguration

[Figure: modules M1, M2, M3 on an FPGA connected by the RMB]

slide-21
SLIDE 21

Example: Video game Pong

slide-22
SLIDE 22

Video game: Module Relocation

[Figure: modules Racket Position, User Input, Ball Position and Visualization; labelled values 4, 20, 20, 38]

slide-23
SLIDE 23

Video game: Module Relocation

[Figure: modules Racket Position, User Input, Ball Position and Visualization; labelled values 4, 20, 20, 38]

slide-24
SLIDE 24

Video game: Module Relocation

[Figure: modules Racket Position, User Input, Ball Position and Visualization; labelled values 4, 20, 20, 38; CP order: User Input, Racket Position, Ball Position, Visualization]

Task:

  • Place modules such that the least number of bus segments is required

Solution:

  • Integer Linear Program (FPL’06)
slide-25
SLIDE 25

Video game: Module Relocation

[Figure: modules Racket Position, User Input, Ball Position and Visualization; labelled values 4, 20, 20, 38; CP orders shown: User Input, Racket Position, Ball Position, Visualization and User Input, Ball Position, Visualization, Racket Position; 58 parallel segments]

slide-26
SLIDE 26

Video game: Module Relocation

[Figure: CP order User Input, Ball Position, Visualization, Racket Position; length of longest connection is 3; 58 parallel segments]

Task:

  • Place modules such that, for a given maximal number of parallel bus segments, the length of the longest connection is minimized

Solution:

  • Integer Linear Program (FPL’06)
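
The slides solve this task with an Integer Linear Program. As a much simpler stand-in, the sketch below uses exhaustive search over all module orders, which is only practical for a handful of modules. The module names, connection widths and the rule for counting segments at a slot (connections that start, end or pass there) are hypothetical illustration choices, not the Pong data or the FPL’06 model.

```python
from itertools import permutations


def segments_needed(order, widths):
    """Largest total width of connections present at any slot of the order."""
    pos = {m: k for k, m in enumerate(order)}
    return max(
        sum(w for (u, v), w in widths.items()
            if min(pos[u], pos[v]) <= k <= max(pos[u], pos[v]))
        for k in range(len(order))
    )


def longest_connection(order, widths):
    """Distance in slots between the two most distant connected modules."""
    pos = {m: k for k, m in enumerate(order)}
    return max(abs(pos[u] - pos[v]) for u, v in widths)


def best_placement(modules, widths, max_segments):
    """Brute force: shortest longest-connection among orders within budget."""
    feasible = [p for p in permutations(modules)
                if segments_needed(p, widths) <= max_segments]
    if not feasible:
        return None
    return min(feasible, key=lambda p: longest_connection(p, widths))


# Hypothetical modules and connection widths (in bus lines).
modules = ["A", "B", "C", "D"]
widths = {("A", "B"): 4, ("B", "C"): 20, ("C", "D"): 38, ("A", "D"): 20}

placement = best_placement(modules, widths, max_segments=62)
```

Lowering `max_segments` below what any order can satisfy makes the function return `None`, which corresponds to an infeasible ILP.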
slide-27
SLIDE 27

Video game: Module Relocation

[Figure: CP order User Input, Ball Position, Visualization, Racket Position; length of longest connection is 2]

slide-28
SLIDE 28

Video game: Module Relocation

[Figure: CP order Racket Position, User Input, Ball Position, Visualization; length of longest connection is 2]

slide-29
SLIDE 29

Erlangen Slot Machine

slide-30
SLIDE 30

Video game: Erlangen Slot Machine (ESM)

slide-31
SLIDE 31

Implementation

[Figure: CP0, CP2, CP1, CP3 with modules Racket Position, User Input, Display, Ball Position]

slide-32
SLIDE 32

References

[1] S. Fekete, J. van der Veen, M. Majer, J. Teich. Minimizing Communication Costs for Reconfigurable Slot Modules. In Proceedings of the International Conference on Field Programmable Logic and Applications (FPL), Madrid, Spain, August 28-30, 2006.
[2] A. Ahmadinia, C. Bobda, J. Ding, M. Majer, J. Teich, J. van der Veen, S. Fekete. A Practical Approach for Circuit Routing on Dynamic Reconfigurable Devices. In Proceedings of the IEEE International Workshop on Rapid System Prototyping (RSP), Montreal, Canada, pp. 84-90, June 8-10, 2005.
[3] J. Angermeier, D. Göhringer, M. Majer, J. Teich. The Erlangen Slot Machine: A flexible FPGA-platform for partially reconfigurable applications at run-time. Tutorial, 20th International Conference on Architecture of Computing Systems (ARCS 2007), Springer LNCS series, Zurich, Switzerland, March 12-15, 2007.
[4] M. Majer, J. Teich, A. Ahmadinia, C. Bobda. The Erlangen Slot Machine: A Dynamically Reconfigurable FPGA-Based Computer. Journal of VLSI Signal Processing Systems, Springer, vol. 46(2), March 2007.
[5] J. Angermeier, D. Göhringer, M. Majer, J. Teich, S. Fekete, J. van der Veen. The Erlangen Slot Machine - A Platform for Interdisciplinary Research in Reconfigurable Computing. it - Information Technology, issue 3/2007, Oldenbourg, München, 2007.
[6] A. Ahmadinia, C. Bobda, S. Fekete, J. Teich, J. van der Veen. Optimal free-space management and routing-conscious dynamic placement for reconfigurable computing. IEEE Transactions on Computers, vol. 56, no. 3, 2007.

slide-33
SLIDE 33

Packet-based communication

slide-34
SLIDE 34

NoC (Network on Chip) based communication

A Network on Chip consists of

  • A set of processing elements (PEs)
  • A set of network elements, also called routers
  • Each PE is connected to a network element
  • Each PE is assigned the same address as its corresponding network element
  • Communication is packet-based
  • Each packet contains the destination address and some data
  • Routers forward packets in the right direction according to the destination address
  • A router contains little logic. It may have some buffers to store packets in case of high traffic

[Figure: Router, NoC]

slide-35
SLIDE 35

Dynamic Networks

  • Limitations of fixed NoC communication
      – Fixed position for modules
      – Larger modules must be split
      – Packet-based communication inside a component is not efficient
  • Direct communication must be used on a module boundary
  • We seek a network infrastructure which
      – allows modules placed at a given location to use all the resources in their area
      – changes according to the placement of modules on the device
      – lets each component always access other components and pins for communication

slide-36
SLIDE 36

Dynamic Networks – DyNoC (Dynamic NoC)

  • Architecture: like the NoC architecture
      – Set of processing elements
      – Set of network elements implementing routers in their basic configuration
      – Each PE is connected to a network element
      – Direct communication among neighbouring PEs
      – Communication is packet-based
      – Each packet contains the destination address and some data
      – The ratio router size/module size must be kept small

slide-37
SLIDE 37

Dynamic Networks – DyNoC

Dynamics in the NoC

  • Each module is represented as a rectangular box encapsulating a given function
  • All resources (routers and PEs) in the placement area of a module are assigned to the module
      – Therefore, the network logic should be flexible enough to be used as logic in a given module
  • Upon completion, each module restores its routers to their basic configuration
  • Except for one selected router, all the routers in the area of a component are no longer accessible from the network
  • Each placed component accesses the network using the router attached to its North-East (NE) PE
  • The network varies with the temporal placement of modules on the device

slide-38
SLIDE 38

Dynamic Networks – DyNoC – Reachability

  • Module and pin reachability:
      – A module (pin) is reachable iff all the messages sent to this module (pin) can reach their destination.
  • We define the component graph G = (V,E) as follows:
      – V is the set of components and pins
      – An edge (u,v) belongs to E iff a path exists between u and v
  • If G is connected, then all components and pins are reachable
  • This increases the architectural requirements
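
Whether a given component graph is connected can be checked with a breadth-first search. The sketch below treats G as undirected and uses made-up component and pin names; the function name `is_connected` is an illustration choice.

```python
from collections import deque


def is_connected(vertices, edges):
    """Return True if every vertex can be reached from the first one."""
    vertices = list(vertices)
    if not vertices:
        return True
    adjacency = {v: set() for v in vertices}
    for u, v in edges:
        adjacency[u].add(v)
        adjacency[v].add(u)
    seen = {vertices[0]}
    queue = deque([vertices[0]])
    while queue:
        u = queue.popleft()
        for v in adjacency[u]:
            if v not in seen:
                seen.add(v)
                queue.append(v)
    return len(seen) == len(set(vertices))


# Hypothetical placement: two components and one pin.
ok = is_connected(["C1", "C2", "pin0"], [("C1", "C2"), ("C2", "pin0")])
cut_off = is_connected(["C1", "C2", "pin0"], [("C1", "C2")])
```

In the second call `pin0` has no path to any component, so the graph is not connected and the pin is unreachable.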

slide-39
SLIDE 39

Dynamic Networks – DyNoC – Reachability

  • Additional architectural requirements
      – A ring of network elements must be available around the chip
      – The PEs at the chip boundary must be connected to the routers at the chip boundary
      – Each placed component accesses the network through its North-East (NE) PE
      – Only PEs are allowed to be at the boundary of a component

slide-40
SLIDE 40

Dynamic Networks – DyNoC – Reachability

  • Theorem (Bobda et al.): If each component is synthesized in such a way that it is internally surrounded only by processing elements, then each placement on the reconfigurable device results in a strongly connected component graph.
  • Proof: Assume that the corresponding component graph is not strongly connected. Then 1) at least two components abut, or 2) one component abuts the device boundary. Consider, for example, case 1):
      – Either the two components must overlap,
      – or one component uses some routers on its boundary.
    Both possibilities contradict the assumptions.

[Figure: component A surrounded by PEs; positions marked X]

slide-41
SLIDE 41

Dynamic Networks – DyNoC – Reachability

  • Example of a feasible placement

slide-42
SLIDE 42

Dynamic Networks – DyNoC – Routing

  • Routing in a mesh without obstacles
  • The XY-router
      – Fast and efficient
      – Local decisions
      – 5 input and 5 output channels
      – Input FIFO on each channel

The router compares its address to the destination address of a packet:

  • If X-router < X-packet, the packet is sent east
  • If X-router > X-packet, the packet is sent west
  • If X-router = X-packet and Y-router < Y-packet, the packet is sent north
  • If X-router = X-packet and Y-router > Y-packet, the packet is sent south
  • If X-router = X-packet and Y-router = Y-packet, the packet is copied to the local FIFO

[Figure: mesh with x and y axes]
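
The five comparison rules translate directly into code. The sketch below assumes addresses are (x, y) tuples with x growing to the east and y growing to the north, as the rules imply; the function names `xy_route` and `xy_path` are illustration choices.

```python
def xy_route(router, dest):
    """Return the output direction chosen by one XY router for one packet."""
    rx, ry = router
    dx, dy = dest
    if rx < dx:
        return "east"
    if rx > dx:
        return "west"
    if ry < dy:
        return "north"
    if ry > dy:
        return "south"
    return "local"  # packet has arrived: copy it to the local FIFO


def xy_path(src, dest):
    """Follow the per-router decisions through an obstacle-free mesh."""
    step = {"east": (1, 0), "west": (-1, 0), "north": (0, 1), "south": (0, -1)}
    path = [src]
    pos = src
    direction = xy_route(pos, dest)
    while direction != "local":
        sx, sy = step[direction]
        pos = (pos[0] + sx, pos[1] + sy)
        path.append(pos)
        direction = xy_route(pos, dest)
    return path


route = xy_path((0, 0), (2, 1))
```

The packet first travels along x until the column matches and only then along y, so each router decides from its own address and the packet header alone.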

slide-43
SLIDE 43

Dynamic Networks – DyNoC – Routing

  • The dynamic placement of components creates obstacles in the network
  • The routing must be able to recognize obstacles and to surround components.
  • Vertical and horizontal obstacles are treated differently

[Figure: obstacles]

slide-44
SLIDE 44

Dynamic Networks – DyNoC – Routing – S-XY

  • Dealing with obstacles
      – XY-router
      – Additionally: activate signal to neighbours. If the router is available, this control signal is high. Otherwise, it is low
  • Component surrounding strategies are required
  • The S-XY (Surround XY) router operates in three modes
      – N-XY: normal operating mode. The packets are routed according to the XY strategy
      – SH-XY: the router enters this mode when a horizontal obstacle is found
      – SV-XY: the router enters this mode when a vertical obstacle is found
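
How a router might pick its mode from the activate signals can be sketched as follows. This rests on an assumption the slides do not spell out: an obstacle counts as horizontal when the neighbour in the east or west direction the packet wants to take is unavailable, and as vertical when that neighbour lies north or south. The names `xy_direction` and `select_mode` are hypothetical.

```python
def xy_direction(router, dest):
    """Plain XY decision, with x growing east and y growing north."""
    rx, ry = router
    dx, dy = dest
    if rx != dx:
        return "east" if rx < dx else "west"
    if ry != dy:
        return "north" if ry < dy else "south"
    return "local"


def select_mode(router, dest, active):
    """Choose the S-XY operating mode for one packet.

    active maps a direction to the neighbour's activate signal (True = high,
    the neighbouring router is available). Missing entries count as low.
    """
    direction = xy_direction(router, dest)
    if direction == "local" or active.get(direction, False):
        return "N-XY"   # no obstacle in the way: normal XY routing
    if direction in ("east", "west"):
        return "SH-XY"  # assumed: blocked while moving horizontally
    return "SV-XY"      # assumed: blocked while moving vertically


mode = select_mode((0, 0), (3, 0), {"east": False})
```

The sketch only selects the mode; the actual surrounding moves taken in SH-XY and SV-XY are not modelled here.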

slide-45
SLIDE 45

Dynamic Networks – DyNoC – Routing – S-XY

  • The SH-XY mode: surrounding obstacles in the horizontal direction
  • Stamp packets to avoid a “ping-pong” game

[Figure: destination (Dest), obstacle component, Routing Path 1 and Routing Path 2; labelled cases: YDest > YRouter; XDest < XRouter; XDest = XRouter and YDest < YRouter; XDest = XRouter and YDest = YRouter]
slide-46
SLIDE 46

Implementation

  • Virtex II 6000
      – 4x4 DyNoC
      – 7% of FPGA usage
      – Router latency: 2.5 ns
      – 32-bit data bus and 6x4x32-bit FIFO per router

slide-47
SLIDE 47

Dynamic Networks – DyNoC – Routing – S-XY

  • Surrounding obstacles in the vertical direction
  • Place a stamp on packets to avoid a “ping-pong” game

[Figure: obstacle component, destination component, Routing Path 1, Routing Path 2, ping-pong game]

slide-48
SLIDE 48

Dynamic Networks – DyNoC – Routing – S-XY

  • Theorem (Bobda et al.): The S-XY algorithm is deadlock-free, i.e., each packet will reach its destination after a finite number of steps.
  • Proof: Exercise. Prove that
      – Each component is reachable, i.e., a path is always available from source to destination
      – A packet is never blocked in the network (Theorem 1)
      – Since a packet can never be blocked, it can fail to arrive only if it keeps looping around a component. Prove that this will never happen!

slide-49
SLIDE 49

Dynamic Networks – DyNoC – Routing – S-XY

  • The decision to go left/right or up/down is taken arbitrarily
  • In the worst case, the path can be very long
  • To avoid this, consider letting the components guide the router

[Figure: guided routing from source S to destination D around components C1, C2, C3, C4; routers labelled 00 and 01]