
Trade Offs in the Design of a Router with Both Guaranteed and Best-Effort Services for Networks on Chip
E. Rijpkema, K. Goossens, A. Rădulescu, J. Dielissen, J. van Meerbergen, P. Wielage, and E. Waterlander


SLIDE 1

Trade Offs in the Design of a Router with Both Guaranteed and Best-Effort Services for Networks on Chip

E. Rijpkema, K. Goossens, A. Rădulescu, J. Dielissen, J. van Meerbergen, P. Wielage, and E. Waterlander

SLIDE 2

Æthereal Network-on-Chip Philips Research

why Networks-on-Chip

  • problems observed for SoC design
  • deep sub micron: wire cost, timing closure
  • design complexity: increasing # of IP blocks, increasing dynamism
  • key: decouple computation from communication

[Figure: protocol stack (application, presentation, session, transport, network, data link, physical) between application demands and hardware technology, split into network-independent and network-dependent services; IP blocks attached to a network of routers (R)]

SLIDE 3

why Networks-on-Chip

  • problems observed for SoC design
  • deep sub micron
  • design complexity
  • key: decouple computation from communication

[Figure: protocol stack (application, presentation, session, transport, network, data link, physical) between application demands and hardware technology, split into network-independent and network-dependent services; IP blocks attached to a network of routers (R)]

SLIDE 4

outline

  • services
  • combined router architecture
  • guaranteed throughput router architecture
  • best-effort router architecture
  • router prototype
  • conclusions
SLIDE 5

services I

  • we need a network that is
  • predictable
  • cost effective
  • build guarantees on top of guarantees
  • efficient network is efficient at every layer

[Figure: services layer between application demands (requirements, guarantees) and hardware technology (constraints, efficiency)]

SLIDE 6

services II

  • timeless guarantees
  • guaranteed data integrity
  • guaranteed data delivery
  • guaranteed in-order delivery
  • time related guarantees (over bounded time interval)
  • guaranteed throughput
  • guaranteed latency

best-effort service (BE): timeless guarantees
guaranteed throughput service (GT): timeless and time related guarantees

SLIDE 7

guarantees vs. best-effort

  • GT requires dimensioning for guaranteed throughput
  • BE requires dimensioning for average throughput
  • combination is beneficial

[Figure: three plots against time. Guaranteed throughput (level rwc), guaranteed delivery (level ravg), and the combination of guaranteed throughput over a bounded interval with guaranteed delivery (level rwc)]
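The difference between the two dimensioning rules can be sketched in a few lines. This is only an illustration: the function name, the trace format and the numbers are hypothetical, and rwc and ravg are read here as each stream's worst-case and average rate.

```python
def required_capacity(traces, service):
    """Capacity a shared link needs for the given streams.

    traces  : list of per-stream demand samples (hypothetical units per time step)
    service : "GT" dimensions for each stream's worst-case rate,
              "BE" dimensions for each stream's average rate
    """
    if service == "GT":
        return sum(max(t) for t in traces)
    if service == "BE":
        return sum(sum(t) / len(t) for t in traces)
    raise ValueError("service must be 'GT' or 'BE'")


# Two made-up streams with bursty demand.
traces = [[1, 4, 1, 2], [3, 1, 1, 3]]
gt_capacity = required_capacity(traces, "GT")  # 4 + 3
be_capacity = required_capacity(traces, "BE")  # 2.0 + 2.0
print(gt_capacity, be_capacity)
```

In this toy case GT dimensioning reserves 7 units while BE dimensioning needs 4, which is the gap that combining the two service classes tries to exploit.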

SLIDE 8

BE & GT combined architecture

  • conceptually, two disjoint routers
  • a router with GT service class
  • a router with BE service class
  • to obtain an efficient combination, routers must have similar architectures

[Figure: BE router and GT router side by side, joined by priority/arbitration, with a programming connection between them]

SLIDE 9

buffering strategy

  • output queuing: highest cost, highest performance
  • input queuing: lowest cost, lowest performance
  • virtual output queuing: moderate cost, high performance

[Figure: switch (X) with inputs 1 to N and outputs 1 to N, drawn for the three queuing schemes, with a "preferred solution" label]

SLIDE 10

contention

  • links in network are shared resources
  • contention occurs when multiple data items request the same link at the same time

  • GT and BE resolve contention differently
SLIDE 11

guaranteed throughput

  • to guarantee latency or bandwidth over finite interval
  • cannot drop data
  • must bound contention
  • rate-based scheduling
  • has high buffer costs (deep fifos/output queuing)
  • deadline-based scheduling
  • even higher buffer costs (deep priority queues)
  • contention-free routing
  • low buffer costs (shallow fifos)
SLIDE 12

contention-free routing I

  • scheduling packet injection in network to avoid contention
  • in space: disjoint paths, as in pure circuit switching
  • in time: time-division multiplexing, as with a statically scheduled bus
  • in time and space: our solution
SLIDE 13

contention-free routing II

  • divide time in slots
  • a block is the amount of data that fits in a slot
  • a block entering a router in slot n enters the next router in slot n+1
  • matches with input queuing

[Figure: blocks 1 to 3 on a slot time axis; a block crossing switches (X) in consecutive slots n, n+1, n+2, n+3]
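The rule that a block advances one router per slot makes it possible to check a set of connections for contention offline. The sketch below assumes a hypothetical representation (a connection is a list of link names plus an injection slot) and is not the authors' allocation procedure; it only tests whether two connections ever claim the same link in the same slot.

```python
from collections import defaultdict


def find_conflicts(connections, num_slots):
    """Return (link, slot, names) for every link claimed twice in one slot.

    connections : dict name -> (path, start_slot), path being a list of link ids
    num_slots   : length of the repeating slot schedule
    A block injected in slot s uses the k-th link of its path in slot (s + k) mod num_slots.
    """
    usage = defaultdict(list)
    for name, (path, start) in connections.items():
        for hop, link in enumerate(path):
            usage[(link, (start + hop) % num_slots)].append(name)
    return sorted(
        (link, slot, names)
        for (link, slot), names in usage.items()
        if len(names) > 1
    )


# Both connections reach link "l2" one hop after injection.
clash = {"a": (["l1", "l2"], 0), "b": (["l3", "l2"], 0)}
shifted = {"a": (["l1", "l2"], 0), "b": (["l3", "l2"], 1)}
print(find_conflicts(clash, 4))    # "a" and "b" both want l2 in slot 1
print(find_conflicts(shifted, 4))  # injecting "b" one slot later removes the clash
```

Shifting the injection slot of one connection is the time-domain fix; choosing a path that avoids "l2" would be the space-domain fix.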

SLIDE 14

contention-free routing III

  • routers have tables that
  • store contention resolution & routing information
  • allow distributed programming
  • small blocks (small slots): low buffering cost, low latency, throughput guarantee on smaller period

[Figure: per-router slot tables, with slots numbered up to S-1]
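A per-router slot table can be pictured as a small two-dimensional array. The layout below (one row per slot, one column per output, each entry naming the input that may use that output) is an assumption made for illustration; the slides only state that the tables store contention resolution and routing information.

```python
class SlotTable:
    """Hypothetical slot table: table[slot][output] holds the reserved input, or None."""

    def __init__(self, num_slots, num_outputs):
        self.table = [[None] * num_outputs for _ in range(num_slots)]

    def reserve(self, slot, output, input_port):
        """Reserve an output for an input in a slot; refuse if already taken."""
        row = self.table[slot % len(self.table)]
        if row[output] is not None:
            return False
        row[output] = input_port
        return True

    def input_for(self, slot, output):
        """Input port allowed to drive this output in this slot (None if free)."""
        return self.table[slot % len(self.table)][output]


table = SlotTable(num_slots=4, num_outputs=2)
first = table.reserve(slot=1, output=0, input_port=3)   # accepted
second = table.reserve(slot=1, output=0, input_port=2)  # refused, entry is taken
owner = table.input_for(slot=5, output=0)                # slot 5 wraps around to slot 1
print(first, second, owner)
```

Because each router only checks and updates its own table, reservations can be made hop by hop, which is one way to read "distributed programming".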

SLIDE 15

best-effort architecture

  • to ensure high resource utilization
  • statistical multiplexing
  • packet-switching
  • but implement BE service class
  • packet-switching
  • network flow control (routing mode)
  • contention resolution
SLIDE 16

packets and flits

  • packet = header + payload
  • packet might be transmitted in smaller parts called flits
  • flits divide time in iterations and must be scheduled
  • smaller flit size: higher scheduling rate, lower latency, less storage

[Figure: a packet (header H plus payload) split into flits 1 to 4, sent one after another along the time axis]
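Splitting a packet into flits can be sketched as follows. The function name and the word values are made up, and padding of a short last flit is left out for brevity.

```python
def to_flits(header, payload, flit_words):
    """Split header + payload words into flits of at most flit_words words each."""
    words = [header] + list(payload)
    return [words[i:i + flit_words] for i in range(0, len(words), flit_words)]


# One header word and seven payload words, cut into 3-word flits.
flits = to_flits("H", [10, 11, 12, 13, 14, 15, 16], flit_words=3)
print(flits)
```

Only the first flit carries the header; the flits that follow rely on the path it set up.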

SLIDE 17

network flow control (routing mode)

  • store and forward routing
  • first receive whole packet
  • then transmit whole packet
  • virtual cut-through routing
  • send flit immediately
  • if next router can receive entire packet
  • wormhole routing
  • send flit immediately
  • if next router can receive that flit

performance/cost of network flow control, per router:

  • store and forward: latency packet, storage packet
  • virtual cut-through: latency flit, storage packet
  • wormhole: latency flit, storage flit
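The latency difference between the routing modes can be estimated with a common first-order model. The model below is an assumption, not taken from the slides: no contention, one flit forwarded per router per flit time, and virtual cut-through behaving like wormhole when nothing blocks.

```python
def store_and_forward_latency(hops, packet_flits):
    """Each router receives the whole packet before sending it on."""
    return hops * packet_flits


def cut_through_latency(hops, packet_flits):
    """Head flit needs one flit time per hop; the remaining flits follow in a pipeline.

    Applies to both virtual cut-through and wormhole routing in the uncontended case.
    """
    return hops + packet_flits - 1


saf = store_and_forward_latency(hops=4, packet_flits=8)
wh = cut_through_latency(hops=4, packet_flits=8)
print(saf, wh)  # latencies in flit times
```

The two cut-through modes differ in storage rather than in uncontended latency: virtual cut-through needs room for a whole packet at the next router, wormhole only for a flit.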

SLIDE 18

contention resolution

  • queuing at input
  • router has switch to set paths from inputs to outputs
  • bipartite graph matching
  • algorithm must
  • be fair
  • have low complexity (to schedule at flit rate)
  • approximation of maximal matching

[Figure: inputs 1 to 3 and outputs 1 to 3 connected through switch X]
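Contention resolution as bipartite matching can be illustrated with a greedy arbiter. This is a generic sketch, not the algorithm used in the router: inputs are visited round-robin from a rotating start position (a simple fairness measure) and each input takes the first free output it requests.

```python
def greedy_match(requests, num_ports, start=0):
    """Match inputs to outputs so that no output is granted twice.

    requests  : set of (input, output) pairs, one per queued flit destination
    num_ports : number of input ports
    start     : input port visited first; rotate it between calls for fairness
    Returns a dict input -> granted output.
    """
    taken = set()
    grant = {}
    for k in range(num_ports):
        i = (start + k) % num_ports
        for o in sorted(out for (inp, out) in requests if inp == i):
            if o not in taken:
                grant[i] = o
                taken.add(o)
                break
    return grant


requests = {(0, 1), (1, 1), (1, 2), (2, 1)}
round_a = greedy_match(requests, num_ports=3, start=0)
round_b = greedy_match(requests, num_ports=3, start=2)
print(round_a, round_b)
```

Output 1 is requested by all three inputs; rotating the start position changes which of them wins it, while input 1 falls back to output 2 in both rounds.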

SLIDE 19

combining GT and BE

  • links must be shared by GT and BE traffic
  • grain size of interleaving must match
  • block size = flit size
  • smallest value for this is given by implementation
  • minimize scheduler latency L
  • maximize data path speed F
  • flit size = block size = F · L
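The relation flit size = F · L can be checked against the prototype figures quoted in this deck (500 MHz data path, 166 MHz control path, 3-word flits). The sketch assumes that the data path moves one word per cycle and that the scheduler latency L equals one control-path cycle; neither assumption is stated on the slides.

```python
def flit_size_words(data_path_hz, scheduler_latency_s):
    """Words the data path moves while the scheduler makes one decision (F * L)."""
    return data_path_hz * scheduler_latency_s


F = 500e6        # data path speed in words per second (assumed: one word per cycle)
L = 1 / 166e6    # scheduler latency in seconds (assumed: one control-path cycle)
size = flit_size_words(F, L)
print(round(size))  # nearest whole number of words
```

Under these assumptions the product comes out at about three words, which agrees with the 3-word flit size listed for the prototype.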
SLIDE 20

router prototype

  • snapshot of current prototype router:
  • input queuing
  • arity 5
  • 32-bit wide words
  • 8 flits deep BE queues
  • 256 slots
  • 0.25 mm² in CMOS12
  • 500 MHz data path
  • 166 MHz control path
  • flit size is 3 words
  • throughput per link: 500 MHz · 32 bits = 16 Gb/s

[Figure: router prototype, with the control part labelled]
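The throughput figure follows directly from the clock and word width. The second function below goes one step further and is an assumption: if the 256 slots form one repeating schedule on a link, a single reserved slot corresponds to 1/256 of the link throughput (header or other overhead ignored).

```python
def link_throughput_bps(clock_hz, word_bits):
    """Raw link throughput: one word per data-path cycle."""
    return clock_hz * word_bits


def slot_bandwidth_bps(link_bps, num_slots):
    """Bandwidth of one reserved slot, assuming all slots share the link equally."""
    return link_bps / num_slots


link = link_throughput_bps(500e6, 32)
per_slot = slot_bandwidth_bps(link, 256)
print(link / 1e9, "Gb/s per link")
print(per_slot / 1e6, "Mb/s per reserved slot")
```

Under that assumption, guaranteed-throughput bandwidth would be allocated in steps of 62.5 Mb/s.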

SLIDE 21

conclusions

  • for NoCs, guaranteed services are essential
  • demonstrated the useful combination of:
  • BE service class: timeless guarantees
  • GT service class: BE + time related guarantees
  • made trade-offs to come to efficient combined router
  • proved feasibility with router prototype
SLIDE 22

router prototype

  • snapshot of current prototype router:
  • 5 input and 5 output ports (arity 5)
  • 0.25 mm² in CMOS12
  • 500 MHz data path, 166 MHz control path
  • flit size of 3 words of 32 bits
  • 500 MHz × 32 bits = 16 Gb/s throughput per link
  • 256 slots & 5x1 flit fifos for guaranteed-throughput traffic
  • 6x8 flit fifos for best-effort traffic
SLIDE 23

[Figure: only the label "control" is recoverable]