Low Power Flitwise Routing in an Unidirectional Torus with Minimal Buffering - PowerPoint PPT Presentation

Low Power Flitwise Routing in an Unidirectional Torus with Minimal Buffering. Jörg Mische and Theo Ungerer, Systems and Networking, Department of Computer Science, University of Augsburg. NoCArc Workshop, Vancouver, Canada, 2012-12-01.


SLIDE 1

Low Power Flitwise Routing in an Unidirectional Torus with Minimal Buffering

Jörg Mische and Theo Ungerer

Systems and Networking Department of Computer Science University of Augsburg

NoCArc Workshop Vancouver, Canada, 2012-12-01

2012-12-01 Jörg Mische / Low Power Flitwise Routing in an Unidirectional Torus with Minimal Buffering

SLIDE 2

Motivation

On-chip Networks for Embedded Systems

◮ Connect small cores by a NoC
◮ Routers must be small and power-efficient, too

Alternative Router Design

◮ Reduce area and power consumption
◮ Reduce complexity of router
◮ Provide acceptable network throughput

SLIDE 3

Outline

Router Microarchitecture
Routing Algorithm
Evaluation of Throughput and Costs

SLIDE 4

Conventional Router in a Mesh


SLIDE 12

Conventional Router Architecture

mesh

◮ 4+1 input ports, 4+1 output ports
◮ 5x5 crossbar

wormhole routing

◮ input or output buffers
◮ buffers for virtual channels
◮ pipelined logic

SLIDE 13

Unidirectional Torus

comparison with mesh

bidirectional 1× bandwidth
0.5× #ports/links
+ 2× link length
link area
0.5× link capacity

SLIDE 14

Folded Unidirectional Torus


SLIDE 21

Reducing Router Complexity

unidirectional torus

◮ 2+1 input ports, 2+1 output ports
◮ 3x3 crossbar

semi-bufferless routing

◮ 1 buffer
◮ simplified routing logic
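To put the port reduction in numbers, the sketch below counts crossbar crosspoints for the 5x5 and 3x3 cases named on the slides. It is only a first-order illustration: the function name `crosspoints` is hypothetical, and real crossbar area also depends on link width and layout, which the slides do not specify.

```python
def crosspoints(inputs: int, outputs: int) -> int:
    """First-order crossbar size: one crosspoint per input/output pair."""
    return inputs * outputs


# Conventional mesh router: 4+1 input ports, 4+1 output ports
mesh_crosspoints = crosspoints(5, 5)

# Unidirectional torus router: 2+1 input ports, 2+1 output ports
torus_crosspoints = crosspoints(3, 3)

# Relative saving in crosspoints (not a measured area figure)
reduction = 1 - torus_crosspoints / mesh_crosspoints

print(mesh_crosspoints, torus_crosspoints, round(reduction, 2))
```

Under this simple count the 3x3 crossbar needs 9 crosspoints instead of 25.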


SLIDE 41

Paternoster Routing

◮ constantly rotating x-rings
◮ constantly rotating y-rings
◮ enter x-ring if empty slot
◮ x transport
◮ turn to corner buffer
◮ enter y-ring if empty slot
◮ y transport
◮ eject
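The steps listed for Paternoster routing can be traced for a single flit with a short sketch. It assumes a k x k torus, rings that advance by +1 (mod k) per hop, and no contention at all (empty slots, corner buffer and eject port are always available). The function names `paternoster_path` and `hop_count` and the (x, y) coordinate convention are illustrative and not taken from the slides.

```python
def paternoster_path(src, dst, k):
    """Nodes visited by one flit on a k x k unidirectional torus, X first, then Y.

    Assumptions: each ring moves by +1 (mod k) per hop and no conflict occurs.
    """
    x, y = src
    dx, dy = dst
    path = [(x, y)]
    # x transport: ride the x-ring until the destination column is reached
    while x != dx:
        x = (x + 1) % k
        path.append((x, y))
    # turn via the corner buffer, then y transport until the destination row
    while y != dy:
        y = (y + 1) % k
        path.append((x, y))
    return path  # the flit is ejected at the last node of the path


def hop_count(src, dst, k):
    """Hops without conflicts: modular distance in x plus modular distance in y."""
    return (dst[0] - src[0]) % k + (dst[1] - src[1]) % k


print(paternoster_path((0, 0), (2, 1), 4))
print(paternoster_path((3, 0), (1, 0), 4))  # wraps around the x-ring
```

Because the rings only turn in one direction, a destination "behind" the source is reached by wrapping around the ring, which is why the hop count uses modular distances.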

SLIDE 42

Conflicts

Corner buffer full → extra round
Conflict at local eject port → extra round
No free slot → request from predecessor
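The first two conflict rules can be read as a per-hop decision in which a blocked flit simply stays on its ring. The sketch below is one possible reading, not the authors' router logic: the names `Action`, `decide` and `hops_with_extra_rounds` are hypothetical, the injection rule ("request from predecessor") is left out, and an extra round is assumed to cost one full trip around a ring of k nodes.

```python
from enum import Enum


class Action(Enum):
    FORWARD = "stay on the ring"
    TURN = "move to the corner buffer"
    EJECT = "deliver to the local core"


def decide(on_x_ring, at_dest_column, at_dest_row,
           corner_buffer_free, eject_port_free):
    """Decision for one flit at one router (illustrative only)."""
    if on_x_ring:
        if at_dest_column and corner_buffer_free:
            return Action.TURN
        # Still travelling, or corner buffer full: stay on the x-ring.
        # In the second case the flit takes an extra round.
        return Action.FORWARD
    if at_dest_row and eject_port_free:
        return Action.EJECT
    # Still travelling, or eject port busy: stay on the y-ring.
    return Action.FORWARD


def hops_with_extra_rounds(base_hops, extra_rounds, k):
    """Each extra round adds one full ring trip of k hops (assumed ring length k)."""
    return base_hops + extra_rounds * k


print(decide(True, True, False, False, True))   # corner buffer full
print(hops_with_extra_rounds(5, 2, 8))          # 5 hops plus two extra rounds
```

In this reading a conflict never stalls the ring; it only lengthens the path of the affected flit.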

SLIDE 43

Flow Control

Single Flit Packets

◮ Enable large data chunks by preserving flit order
◮ Simplify flow control

Preserving flit order

◮ Fixed route (X-Y)
◮ Preserve order in extra rounds

Overhead

◮ Each flit carries destination → increase link width
◮ In return: save head flit
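The overhead trade-off can be made concrete with a small calculation. The sketch assumes that the destination is encoded as a plain binary node index, that a wormhole head flit carries routing information only, and that no other control bits exist. None of these details are given on the slides, and all function names are hypothetical.

```python
def destination_bits(num_nodes: int) -> int:
    """Bits needed to address one of num_nodes nodes with a binary index."""
    return (num_nodes - 1).bit_length()


def flit_link_width(payload_bits: int, num_nodes: int) -> int:
    """Link width when every flit carries its own destination."""
    return payload_bits + destination_bits(num_nodes)


def wormhole_bits(payload_flits: int, payload_bits: int) -> int:
    """Bits on the link for one packet with an extra head flit (assumed payload-free)."""
    return (payload_flits + 1) * payload_bits


def flitwise_bits(payload_flits: int, payload_bits: int, num_nodes: int) -> int:
    """Bits on the link when each flit is wider but no head flit is sent."""
    return payload_flits * flit_link_width(payload_bits, num_nodes)


# Example: 64 nodes, 32-bit payload per flit
print(flit_link_width(32, 64))                       # 32 payload + 6 destination bits
print(wormhole_bits(4, 32), flitwise_bits(4, 32, 64))
print(wormhole_bits(8, 32), flitwise_bits(8, 32, 64))
```

Under these assumptions the saved head flit outweighs the wider links for short transfers, while long transfers pay more for the per-flit destination.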

SLIDE 44

Evaluation

◮ Main focus: comparability with other low-power router publications
◮ Baseline: conventional router with 2 buffers per input port and 2 virtual channels (VC2x2)
◮ Throughput: synthetic traffic patterns from booksim
◮ Power/area: Orion 2.0 at 65 nm

SLIDE 45

Uniform Random Traffic

[Plot: average latency [cycles] over offered load [flits/cycle/node]; curves for PN1, PN2, PN4, PN8, PN16, PN32 and VC2x2]

SLIDE 46

Tornado Traffic

[Plot: average latency [cycles] over offered load [flits/cycle/node]; curves for PN1, PN2, PN4, PN8, PN16, PN32 and VC2x2]

SLIDE 47

Tornado Traffic Pattern

SLIDE 48

Neighbor Traffic Pattern
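The tornado and neighbor patterns can be sketched as destination functions. The definitions below follow the common textbook form (tornado shifts each coordinate by ceil(k/2) - 1, neighbor by one position). Whether booksim's implementation or the configuration used for these slides matches exactly is an assumption. The `step` parameter and all function names are hypothetical, and how the "SW" and "NE" labels in the results map to ring direction is not stated in the slides.

```python
import math


def tornado_dest(src, k):
    """Textbook tornado pattern: shift every coordinate by ceil(k/2) - 1."""
    shift = math.ceil(k / 2) - 1
    return tuple((c + shift) % k for c in src)


def neighbor_dest(src, k, step=1):
    """Neighbor pattern: shift every coordinate by one position (step = +1 or -1)."""
    return tuple((c + step) % k for c in src)


def ring_hops(src, dst, k):
    """Hops on unidirectional rings that only advance by +1 (mod k)."""
    return sum((d - s) % k for s, d in zip(src, dst))


k = 8
src = (0, 0)
print(ring_hops(src, tornado_dest(src, k), k))        # tornado
print(ring_hops(src, neighbor_dest(src, k, +1), k))   # neighbor in ring direction
print(ring_hops(src, neighbor_dest(src, k, -1), k))   # neighbor against ring direction
```

On unidirectional rings the direction of the neighbor matters: a neighbor in ring direction is one hop away per dimension, while a neighbor against ring direction needs k - 1 hops per dimension.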

SLIDE 49

Neighbor Traffic

[Plot: average latency [cycles] over offered load [flits/cycle/node]; curves for PN1-SW, PN32-SW, PN-NE and VC2x2]

SLIDE 50

Area

[Bar chart: area [mm²], axis from 0.00 to 0.12, split into crossbar, VC allocator, switch allocator and buffers; bars for VC2x8, VC2x4, VC2x2, VC2x1, VC1x8, VC1x4, VC1x2, VC1x1, PN64, PN32, PN16, PN8, PN4, PN2 and PN1]

SLIDE 51

Power vs. Throughput

[Plot: power [mW] and saturation throughput [flits/cycle/node]; series PN(1,...,64), VC1x(1,2,4), VC2x(1,2,4) and VC4x(1,2,4)]

SLIDE 52

Conclusion

Summary

◮ Semi-bufferless routing in an unidirectional torus
◮ Save energy by omitting packets, flow control and virtual channels
◮ Low power for low throughput

Future Work

◮ Real workloads
◮ Power modelling for structural size < 65 nm (DSENT / VHDL)
◮ Guaranteed Service (GS) for real-time applications

SLIDE 53

Thank You
