BWCCA 2010 Fukuoka, Japan November 4-6 2010 Architecture and Design of Efficient 3D Network-on-Chip for Custom Multi-Core SoC Akram Ben Ahmed, Abderazek Ben Abdallah, Kenichi Kuroda The University of Aizu School of Computer Science and


SLIDE 1

Architecture and Design of Efficient 3D Network-on-Chip for Custom Multi-Core SoC

Akram Ben Ahmed, Abderazek Ben Abdallah, Kenichi Kuroda

The University of Aizu, School of Computer Science and Engineering, Adaptive Systems Laboratory, Aizu-Wakamatsu, Japan.

Email: m5141153@u-aizu.ac.jp

BWCCA 2010 Fukuoka, Japan November 4-6 2010

The University of Aizu Adaptive systems lab 1

SLIDE 2

Outline

  • Introduction
  • 2D-OASIS-NoC Overview
  • Minimal Hop Routing Algorithm
  • 3D-OASIS-NoC Architecture
  • Design Results
  • Conclusion

SLIDE 3

Introduction

  • Communication has become an essential part of current Systems-on-Chip (SoC).
  • Networks-on-Chip (NoC) overcome the problems of bus-based systems.
  • NoC features:

– Simple and scalable architecture.
– Connects processors, memories and other custom designs together.
– Switches packets instead of switching wires.

SLIDE 4

2D-OASIS-NoC overview

  • 4x4 mesh topology
  • Wormhole switching
  • Stall-and-Go flow control
  • 76-bit flit

[Figure: switch with a FIFO on each of the WEST, EAST, NORTH and SOUTH ports.]

  • K. Mori, A. Ben Abdallah, K. Kuroda, "Design and Evaluation of a Complexity Effective Network-on-Chip Architecture on FPGA," Proc. of the 19th Intelligent System Symposium (FAN 2009), pp. 318-321, Sep. 2009.

SLIDE 5

2D-OASIS-NoC pipeline stages

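[Figure: pipeline timing over cycles 1 to 10 for a flit crossing switches 03, 13 and 12. Each switch takes three stages: routing calculation (RC), switch allocation (SA) and crossbar traversal (CT).]

Figure labels: XY routing, bidirectional links, 76-bit flit, 5-port switch, 4x4 mesh topology.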
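The three-stage timing can be sketched in a few lines. This is only an illustration under stated assumptions: one clock cycle per stage, no contention, and the function name `head_flit_schedule` is hypothetical rather than taken from the design.

```python
# Pipeline stages named on the slide, in the order a flit passes through them.
STAGES = ("RC", "SA", "CT")

def head_flit_schedule(path):
    """Return (cycle, switch, stage) tuples for a head flit.

    Assumes one cycle per stage and no stalls, so each switch on the
    path costs exactly len(STAGES) cycles.
    """
    schedule = []
    cycle = 1
    for switch in path:
        for stage in STAGES:
            schedule.append((cycle, switch, stage))
            cycle += 1
    return schedule

# Path from the slide: switches 03, 13 and 12.
schedule = head_flit_schedule(["03", "13", "12"])
for cycle, switch, stage in schedule:
    print(f"cycle {cycle}: switch {switch} {stage}")
```

Under these assumptions the head flit finishes its last stage in cycle 9, so every extra hop adds three cycles, which is why reducing the hop count matters for latency.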

SLIDE 6

2D-OASIS-NoC drawbacks

  • 2D-NoC advantages become limited, and 3D-NoC has shown better performance:

– It decreases the number of hops.
– This affects the latency and the throughput.
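To make the hop-count argument concrete, the sketch below compares a 4x4 mesh with a 2x2x4 mesh of the same 16 nodes, the two sizes used later in this deck. It assumes minimal (Manhattan-distance) routing and uniform traffic between all distinct node pairs; the helper name `hop_stats` is hypothetical.

```python
from itertools import product

def hop_stats(dims):
    """Return (max, mean) hop count over all ordered pairs of distinct nodes.

    dims gives the mesh size per dimension, e.g. (4, 4) or (2, 2, 4).
    Hops are counted as Manhattan distance, i.e. minimal routing.
    """
    nodes = list(product(*(range(d) for d in dims)))
    hops = [
        sum(abs(a - b) for a, b in zip(src, dst))
        for src in nodes
        for dst in nodes
        if src != dst
    ]
    return max(hops), sum(hops) / len(hops)

max_2d, mean_2d = hop_stats((4, 4))
max_3d, mean_3d = hop_stats((2, 2, 4))
print(f"4x4 mesh:   max {max_2d} hops, mean {mean_2d:.2f}")
print(f"2x2x4 mesh: max {max_3d} hops, mean {mean_3d:.2f}")
```

With these assumptions the longest path drops from 6 hops to 5 and the mean hop count also decreases, even though both networks have 16 nodes.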
SLIDE 7

Contribution

  • An efficient routing algorithm named the minimal hop routing algorithm (MHRA).
  • 3D architecture, design and preliminary results.

Reduce overall traffic latency through hop minimization.

SLIDE 8

Minimal hop routing algorithm

[Flowchart: Start, then Route (decisions below), then to the switch allocator.]

  • If xadr != xdst: Next_port = EAST when xadr < xdst, otherwise Next_port = WEST.
  • Else if yadr != ydst: Next_port = NORTH when yadr < ydst, otherwise Next_port = SOUTH.
  • Else if zadr != zdst: Next_port = UP when zadr < zdst, otherwise Next_port = DOWN.
  • Else: Next_port = LOCAL.
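The decision order of the flowchart can be sketched as a small function. This is a software illustration, not the authors' Verilog: the function name `mhra_next_port` and the tuple representation are assumptions. The one-hot port codes are the four shown in the worked example of this deck; codes for WEST, SOUTH and DOWN are not given in the source and are deliberately left out.

```python
# One-hot port codes that appear on the slides. The codes for WEST, SOUTH
# and DOWN are not given in the source, so they are not listed here.
PORT_CODES = {
    "LOCAL": 0b0000001,
    "NORTH": 0b0000010,
    "EAST": 0b0000100,
    "UP": 0b0100000,
}

def mhra_next_port(cur, dst):
    """Return the next port name for a flit at node cur heading to dst.

    cur and dst are (x, y, z) tuples. X is resolved first, then Y, then Z;
    when all three addresses match, the flit is delivered to LOCAL.
    """
    (xadr, yadr, zadr), (xdst, ydst, zdst) = cur, dst
    if xadr != xdst:
        return "EAST" if xadr < xdst else "WEST"
    if yadr != ydst:
        return "NORTH" if yadr < ydst else "SOUTH"
    if zadr != zdst:
        return "UP" if zadr < zdst else "DOWN"
    return "LOCAL"

# Destination used in the worked example: xdst=001, ydst=001, zdst=011.
dst = (0b001, 0b001, 0b011)
port = mhra_next_port((0, 0, 0), dst)
print(port, format(PORT_CODES[port], "07b"))
```

Resolving one dimension completely before moving to the next keeps the decision logic to a short chain of comparisons per flit.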

SLIDE 9

Minimal hop routing algorithm

[Figure: routing example on a 2x2x4 mesh topology (top-layer nodes 300, 301, 310, 311), with the input port architecture (module with connections from the previous node, from the switch allocator and to the next node) and the packet format (payload, next port).]

  • Current node: 000; next node: 001
  • Destination node addresses (z y x): 011 001 001
  • Current node addresses (z y x): 000 000 000
  • xaddr = 000 < xdst = 001, so Next_port = EAST (EAST = 0000100)
  • Packet: 1 001 001 011 000….1, next port 0000100

SLIDE 10

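Minimal hop routing algorithm

[Figure: routing example, continued (top-layer nodes 300, 301, 310, 311).]

  • Current node: 001; next node: 011
  • Destination node addresses (z y x): 011 001 001
  • Current node addresses (z y x): 000 000 001
  • xaddr = 001 = xdst = 001 and yaddr = 000 < ydst = 001, so Next_port = NORTH (NORTH = 0000010)
  • Packet: 1 001 001 011 000….1, next port 0000010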

SLIDE 11

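Minimal hop routing algorithm

[Figure: routing example, continued (top-layer nodes 300, 301, 310, 311).]

  • Current node: 011; next node: 111
  • Destination node addresses (z y x): 011 001 001
  • Current node addresses (z y x): 000 001 001
  • xaddr = 001 = xdst = 001, yaddr = 001 = ydst = 001 and zaddr = 000 < zdst = 011, so Next_port = UP (UP = 0100000)
  • Packet: 1 001 001 011 000….1, next port 0100000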

SLIDE 12

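Minimal hop routing algorithm

[Figure: routing example, final step (top-layer nodes 300, 301, 310, 311).]

  • Destination node addresses (z y x): 011 001 001
  • Current node addresses (z y x): 011 001 001
  • xaddr = 001 = xdst = 001, yaddr = 001 = ydst = 001 and zaddr = 011 = zdst = 011, so Next_port = LOCAL (LOCAL = 0000001)
  • Packet: 1 001 001 011 000….1, next port 0000001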
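The worked example on the preceding slides can be replayed end to end. The sketch below is an illustration only: it assumes that EAST, NORTH and UP each increase the x, y and z address by one (consistent with the node labels in the example), and the names `AXIS_PORTS` and `trace_route` are hypothetical.

```python
# Port chosen on each axis: (name when the address must grow, name when it must shrink).
AXIS_PORTS = (("EAST", "WEST"), ("NORTH", "SOUTH"), ("UP", "DOWN"))

def trace_route(src, dst):
    """Follow X-then-Y-then-Z routing from src to dst.

    src and dst are (x, y, z) tuples. Returns a list of
    (node label in Z-Y-X order, port chosen at that node).
    """
    node = list(src)
    steps = []
    while True:
        label = f"{node[2]}{node[1]}{node[0]}"
        for axis, (plus, minus) in enumerate(AXIS_PORTS):
            if node[axis] != dst[axis]:
                step = 1 if node[axis] < dst[axis] else -1
                steps.append((label, plus if step > 0 else minus))
                node[axis] += step  # move one hop along this axis
                break
        else:
            # All three addresses match: deliver to the local port and stop.
            steps.append((label, "LOCAL"))
            return steps

# Source node 000, destination x=1, y=1, z=3 (node 311 in Z-Y-X order).
steps = trace_route((0, 0, 0), (1, 1, 3))
for label, port in steps:
    print(f"node {label}: {port}")
```

The trace visits nodes 000, 001, 011, 111, 211 and 311, that is five hops before the flit is handed to the LOCAL port.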

SLIDE 13

3D-OASIS-NoC architecture: Switch architecture

[Figure: switch (R) with ports to the PE and to SOUTH (S), WEST (W), EAST (E), NORTH (N), DOWN (D) and UP (U).]

SLIDE 14

3D-OASIS-NoC architecture: Switch allocation

  • Flow control: Stall-Go
  • Scheduling: Round-robin

[Figure: switch allocator signals (widths in parentheses): stop-in (7), data-sent (7), sw-req (7), port-req (49), tail-sent (49), grant-out (7), sw-cntrl (49).]
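Round-robin scheduling among the seven ports can be sketched as below. This is a generic rotating-priority arbiter written for illustration; it is not a model of the design's actual arbiter module, it does not model the Stall-Go signals, and the class name `RoundRobinArbiter` is hypothetical.

```python
class RoundRobinArbiter:
    """Grant one requester per call, rotating priority after each grant."""

    def __init__(self, n_ports):
        self.n_ports = n_ports
        # Start so that port 0 has the highest priority on the first call.
        self.last_granted = n_ports - 1

    def grant(self, requests):
        """requests is a list of n_ports booleans; return the granted index or None."""
        for offset in range(1, self.n_ports + 1):
            port = (self.last_granted + offset) % self.n_ports
            if requests[port]:
                self.last_granted = port
                return port
        return None

arbiter = RoundRobinArbiter(7)
# Ports 1 and 4 keep requesting the same output.
requests = [False, True, False, False, True, False, False]
grants = [arbiter.grant(requests) for _ in range(4)]
print(grants)
```

Because the search always starts just after the last winner, two ports that request continuously are served alternately and neither can starve the other.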

SLIDE 15

3D-OASIS-NoC architecture: Crossbar traversal

[Figure: crossbar with inputs from the input ports and from the switch allocator, and outputs to the next node.]
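A crossbar of this kind can be sketched as a set of per-output selections. The sketch assumes that the 49-bit sw-cntrl signal from the switch allocation slide is a 7x7 matrix in which entry [i][j] connects input i to output j; that reading, and the function name `crossbar`, are assumptions and are not stated in the source.

```python
def crossbar(inputs, sw_cntrl):
    """Forward each input flit to the outputs selected by sw_cntrl.

    inputs is a list of flits, one per input port (None when idle).
    sw_cntrl[i][j] is True when input i drives output j (assumed layout).
    """
    n_ports = len(inputs)
    outputs = [None] * n_ports
    for out_port in range(n_ports):
        sources = [i for i in range(n_ports) if sw_cntrl[i][out_port]]
        if len(sources) > 1:
            # The allocator should never grant one output to several inputs.
            raise ValueError(f"output {out_port} driven by inputs {sources}")
        if sources:
            outputs[out_port] = inputs[sources[0]]
    return outputs

N_PORTS = 7
cntrl = [[False] * N_PORTS for _ in range(N_PORTS)]
cntrl[0][2] = True  # input 0 -> output 2
cntrl[3][5] = True  # input 3 -> output 5
flits = [f"flit{i}" for i in range(N_PORTS)]
result = crossbar(flits, cntrl)
print(result)
```

Outputs with no granted input stay idle, which mirrors the idea that the crossbar only moves data and leaves all arbitration to the switch allocator.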

SLIDE 16

Design results: Design methodology

  • Verilog HDL is used.
  • Quartus II
  • Target device: Stratix III
  • ModelSim

Module          # code lines
Define.v        46
Route.v         80
Fifo.v          100
Input_port.v    113
Stop_go.v       56
Matrix_arb.v    111
Sw_alloc.v      109
Mux_out.v       55
Crossbar.v      45
Router.v        69
Network.v       158
Total           942

SLIDE 17

Design results: Configuration parameters

Parameters      2D             3D
Network size    4x4-mesh       2x2x4-mesh
Buffer depth    4              4
Flit size       28 bit         33 bit
Header          12 bit         17 bit
Payload         16 bit         16 bit
Switching       Wormhole       Wormhole
Flow control    Stall-Go       Stall-Go
Scheduling      Round-robin    Round-robin
Routing         X-Y            MHRA
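The 3D flit widths can be cross-checked against the packet shown in the routing example. The sketch below assumes the 17-bit header consists of one leading bit (its meaning is not stated in the source), three 3-bit destination fields and the 7-bit next-port code, laid out in the order drawn on the slide. The field names, the bit ordering and the payload value are hypothetical.

```python
# Assumed 3D flit layout, most significant field first: (name, width in bits).
FIELDS = [
    ("lead", 1),
    ("xdst", 3),
    ("ydst", 3),
    ("zdst", 3),
    ("payload", 16),
    ("next_port", 7),
]
FLIT_BITS = sum(width for _, width in FIELDS)
HEADER_BITS = FLIT_BITS - 16  # everything except the 16-bit payload

def pack_flit(**values):
    """Pack the named fields into one integer, first field in the top bits."""
    word = 0
    for name, width in FIELDS:
        value = values[name]
        if not 0 <= value < (1 << width):
            raise ValueError(f"{name}={value} does not fit in {width} bits")
        word = (word << width) | value
    return word

def unpack_flit(word):
    """Inverse of pack_flit: return a dict of field values."""
    values = {}
    for name, width in reversed(FIELDS):
        values[name] = word & ((1 << width) - 1)
        word >>= width
    return values

# Destination from the example; the payload 0xBEEF is an arbitrary placeholder.
flit = pack_flit(lead=1, xdst=0b001, ydst=0b001, zdst=0b011,
                 payload=0xBEEF, next_port=0b0000100)
print(FLIT_BITS, HEADER_BITS, format(flit, f"0{FLIT_BITS}b"))
```

Under this assumed layout the widths add up to the 33-bit flit and 17-bit header listed in the table, which suggests the extra 5 header bits relative to 2D come from the third address field and the two additional ports.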

SLIDE 18

Design results: Delay Analysis

  • Flit payloads are randomly generated.
  • One single destination node: OASIS-NoC (00) and 3D-OASIS-NoC (000).

2D (destination node: 00)    3D (destination node: 000)    Improvement (%)
Node (Y-X)    Delay          Node (Z-Y-X)    Delay
33            2200           311             1900          13.6
23            2700           211             2100          22.2
13            2600           111             1900          27
03            2300           011             1700          26

22% improvement
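The improvement column can be reproduced from the delay values. The sketch assumes that improvement is the relative delay reduction, (2D delay - 3D delay) / 2D delay, and that the 22% figure is the mean over the four source nodes; both readings are consistent with the numbers in the table.

```python
# (2D node, 2D delay, 3D node, 3D delay), copied from the table.
rows = [
    ("33", 2200, "311", 1900),
    ("23", 2700, "211", 2100),
    ("13", 2600, "111", 1900),
    ("03", 2300, "011", 1700),
]

def improvement_percent(delay_2d, delay_3d):
    """Relative delay reduction of 3D over 2D, in percent."""
    return 100.0 * (delay_2d - delay_3d) / delay_2d

improvements = [improvement_percent(d2, d3) for _, d2, _, d3 in rows]
mean_improvement = sum(improvements) / len(improvements)

for (n2, _, n3, _), value in zip(rows, improvements):
    print(f"{n2} vs {n3}: {value:.1f}%")
print(f"mean: {mean_improvement:.1f}%")
```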

SLIDE 19

Design results: Hardware Complexity

                                              Speed (MHz)
Architecture    Area (ALUTs)    Power (mW)    Balance    Speed     Area
2D              11016           867.97        123.3      126.23    106.18
3D              16812           883.10        113.51     114.23    97.68

Area: 52% increased. Power: 1.74% overhead. Speed: 8.5% decreased.
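The three percentages can be checked against the table values. The sketch assumes that the 8.5% speed figure is the mean decrease over the three speed columns (Balance, Speed, Area), which is an interpretation that happens to match the numbers rather than something the slide states.

```python
# Values copied from the table: area (ALUTs), power (mW), speed (MHz) per column.
design_2d = {"area": 11016, "power": 867.97, "speed": (123.3, 126.23, 106.18)}
design_3d = {"area": 16812, "power": 883.10, "speed": (113.51, 114.23, 97.68)}

def percent_change(old, new):
    """Signed relative change from old to new, in percent."""
    return 100.0 * (new - old) / old

area_increase = percent_change(design_2d["area"], design_3d["area"])
power_overhead = percent_change(design_2d["power"], design_3d["power"])
speed_decreases = [
    -percent_change(s2, s3)
    for s2, s3 in zip(design_2d["speed"], design_3d["speed"])
]
mean_speed_decrease = sum(speed_decreases) / len(speed_decreases)

print(f"area: +{area_increase:.1f}%")
print(f"power: +{power_overhead:.2f}%")
print(f"speed: -{mean_speed_decrease:.1f}% (mean over three columns)")
```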

SLIDE 20

Conclusion

  • Combining 3D integration with Networks-on-Chip offers a good opportunity for large multi-core SoC designs.
  • We present a hardware design for the 3D OASIS Network-on-Chip.
  • 3D-OASIS-NoC achieves about 22% overall delay reduction compared with OASIS-NoC, with only 1.74% power overhead and 52% additional area.

SLIDE 21

Future work

  • Test the design with larger workloads (like a JPEG application).

  • Reduce the routing algorithm complexity.

SLIDE 22

Thank you
