Bounded Arbitration Algorithm for QoS-Supported On-chip Communication


Mohammad Abdullah Al Faruque, Gereon Weiss, Joerg Henkel. Chair for Embedded Systems (CES), University of Karlsruhe (TH), Karlsruhe, Germany


SLIDE 1

Chair for Embedded Systems, University of Karlsruhe (TH)

Bounded Arbitration Algorithm for QoS-Supported On-chip Communication

Mohammad Abdullah Al Faruque, Gereon Weiss, Joerg Henkel
Chair for Embedded Systems (CES), University of Karlsruhe (TH), Karlsruhe, Germany

SLIDE 2

Overview

  • Motivation
  • Quality-of-Service (QoS)
  • Related Works
  • Bounded Arbitration Algorithm
      • QoS Specification
      • Algorithm Description
      • NoC Simulation Environment
  • Evaluation & Results
  • Summary

SLIDE 3

Motivation

Next Generation Handheld Devices

  • The download and the TV continue when an incoming call is accepted
  • Games, sensor nodes, navigation, etc.

Huge Computational Power and Application Concurrency

  • Computational power: MPSoC
  • Varying requirements: implementing a QoS mechanism

[Figure: handheld device running "Download File", "TV – Channel …", and "Phone" while an "Incoming Video Call!" arrives]

SLIDE 4

Motivation

Bus based MPSoC design

  • Computation power is provided by multiple processing units
  • QoS is implemented by different types of buses
  • Problems with bus communication
      • Deep sub-micron problems
      • Scalability issues

[M. Horowitz et al.]

[Figure: bus based MPSoC with μP, DSP, MEM, FPGA, UART, MPEG, and ETH blocks, ARB and B blocks, and three buses labeled hi-perf bus, system bus, and peripheral bus]

Communication centric (NoC based) MPSoC

  • Packet based communication
  • Highly scalable
  • Quality of Service needs to be implemented in the packet
  • Needs to differentiate at transaction level
  • Needs to provide tight guarantee (i.e. guaranteed BW)
  • Resource utilization needs to be kept to a minimum

An efficient QoS mechanism at a minimum cost is necessary

SLIDE 5

Quality of Service

Service Class Specification

Guarantees / Probabilistic on

  • Performance related guarantee
  • Reliability related guarantee

Performance:

  • Max end-to-end latency
  • Min throughput (% bandwidth)
  • Max deviation of latency (jitter)

Reliability:

  • In-order data transmission
  • Correctness of data
  • No loss of data (lossless transmission)
  • Availability

SLIDE 6

Overview

  • Motivation
  • Quality-of-Service (QoS)
  • Related Works
  • Bounded Arbitration Algorithm
  • Evaluation & Results
  • Summary

SLIDE 7

Related Works

Connection Based Approach: e.g. Æthereal

  • K. Goossens et al. Æthereal Network on Chip: Concepts, Architectures, and Implementations, 2005.
  • E. Rijpkema et al. Trade-offs in the design of a router with both guaranteed and best-effort services for networks on chip, 2003.

Service Class Based Approach: e.g. DiffServ, QNoC

  • Evgeny Bolotin et al. QNoC: QoS architecture and design process for network on chip, 2004.
  • M. D. Harmanci et al. Quantitative modeling and comparison of communication schemes to guarantee quality-of-service in networks-on-chip, 2005.
  • N. Kavaldjiev et al. A virtual channel Network-on-Chip for GT and BE traffic, 2006.

SLIDE 8

Related Works

Connection Based

  Advantages
  + TDMA / Guaranteed throughput
  + Contention free routing
  + Buffer reduction

  Disadvantages
  • Connection management
  • Fixed resource reservation
  • Classification at design time
  • Lookup-tables

  Result: Underutilization

Service Class Based

  Advantages
  + High resource utilization
  + Priority aware service

  Disadvantages
  • No connections (relative guarantees)
  • Contention / Starvation
  • Buffer requirements
  • Inflexibility of classification

  Result: No Hard Guarantees

SLIDE 9

Focus of Work

High Utilization & Tight Guarantees

Our Link Arbitration

Possible solution can be:

  • Buffers: Not assigned to a particular class
  • Links: With efficient arbitration

Bounded Arbitration Algorithm (BAA)

Adaptive approach with sharing of resources

SLIDE 10

Overview

  • Motivation
  • Quality-of-Service (QoS)
  • Related Works
  • Bounded Arbitration Algorithm
      • QoS Specification
      • Algorithm Description
      • NoC Simulation Environment
  • Evaluation & Results
  • Summary

SLIDE 11

QoS Specification

Service Class   Latency              Bandwidth Min./Max.   Jitter       Example
Priority H      Low                  …                     …            Signal
Priority X1     …                    Fixed Guarantee       No / Fixed   Video Stream
…               …                    …                     …            …
Priority Xn     …                    Fixed Guarantee       • k          Data Stream
Priority BE     As low as possible   Not fixed             • k          Best Effort

SLIDE 12

Bounded Arbitration Algorithm

[Flowchart, drawn twice. Recoverable labels: Initialize; Slot_assigned = FALSE; decision "!slot_assigned"; decision "k <= VC_delay_jitter" (first chart) or "k <= VC_jitter" (second chart); decision "slot_k < Cp_max & ST_avl"; action "Assign the slots"; Yes / No branches]

First Part

  • Highest priority transactions get through first with lower bound slot allocation

Middle Part

  • The scheduling table is now filled with latency sensitive transactions

Last Part

  • The rest of the best effort traffic is now allowed, up to its upper bound of assigned slots

SLIDE 13

Bounded Arbitration Algorithm

[Figure: three virtual channels (VC), one each for Service Class 3, Service Class 2, and Service Class 1, feed an ARBITER that drives the link; slot sequence shown: 3 3 3 2 2 2 1 3 3 1 1]

Classes      Min / Max BW   Latency   Jitter
Priority 3   30% / 50%      yes       yes
Priority 2   30% / 30%      yes       no
Priority 1   10% / 100%     no        yes
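
The three parts of the algorithm, applied to the three classes above, can be sketched roughly as follows. This is only an interpretation of the slide text, not the authors' implementation: the function name `build_schedule`, the dictionary keys, and the 10-slot table size are assumptions made for illustration, and the jitter condition from the flowchart is not modeled.

```python
def build_schedule(classes, num_slots):
    """Return a list of at most num_slots entries; each entry is the
    priority of the class that owns that slot.

    classes: dicts with 'prio' (int, higher is served first), 'min_pct' and
    'max_pct' (bounds as a percentage of the table) and 'latency' (bool).
    All key names are hypothetical.
    """
    by_prio = sorted(classes, key=lambda c: c["prio"], reverse=True)
    used = {c["prio"]: 0 for c in by_prio}
    table = []

    def grant(cls, pct_key):
        # Hand slots to cls until its bound is reached or the table is full.
        bound = cls[pct_key] * num_slots // 100
        while used[cls["prio"]] < bound and len(table) < num_slots:
            table.append(cls["prio"])
            used[cls["prio"]] += 1

    # First part: every class receives its lower bound, highest priority first.
    for cls in by_prio:
        grant(cls, "min_pct")
    # Middle part: latency sensitive classes may grow up to their upper bound.
    for cls in by_prio:
        if cls["latency"]:
            grant(cls, "max_pct")
    # Last part: the remaining classes take what is left, up to their upper bound.
    for cls in by_prio:
        grant(cls, "max_pct")
    return table


classes = [
    {"prio": 3, "min_pct": 30, "max_pct": 50, "latency": True},
    {"prio": 2, "min_pct": 30, "max_pct": 30, "latency": True},
    {"prio": 1, "min_pct": 10, "max_pct": 100, "latency": False},
]
print(build_schedule(classes, 10))  # [3, 3, 3, 2, 2, 2, 1, 3, 3, 1]
```

In this sketch the lower bounds are placed before any class is allowed to exceed its minimum, so a low-priority class cannot be starved by a higher-priority one that is still below its upper bound.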

SLIDE 14

NoC Simulation Environment

  • SystemC model for simulation
  • VHDL model for synthesis
  • Synthesizable NoC
  • Synthesizable ASIPs
  • Hardware evaluation (ProDesign FPGA board)

[Figure: NoC components and their design constraints: positional router (area & power consumption), input port (area + memory), output port (area and memory requirement), link design (length / area), flow control (area and other constraints), virtual channel (area and buffer depth), error control block (area and number of error bits), crossbar design (area), control-bits arbiter (area + memory), network interface (area)]

SLIDE 15

NoC Simulation Environment

  • Cycle-accurate design
  • Deeply pipelined structure

[Figure: router structure with admission control, buffer resources, and packet control]

  • ID: Input Decoder
  • OA: Output Arbiter
  • VCA: Virtual Channel Arbiter
  • VCS: Virtual Channel Selector
  • VC: Virtual Channel
  • OD: Output Decoder
  • PCA: Physical Channel Arbiter

SLIDE 16

Overview

  • Motivation
  • Quality-of-Service (QoS)
  • Related Works
  • Bounded Arbitration Algorithm
  • Evaluation & Results
  • Summary

SLIDE 17

Case study: MPEG4 Video Decoder

  • Are the guarantees met?
  • Comparison: BAA, RR and FAA
      • BAA: Bounded Arbitration Algorithm
      • RR: Round Robin Arbitration
      • FAA: Fixed Arbitration Algorithm
  • Latency, throughput
  • Waste of bandwidth
  • Granularity of specification

SLIDE 18

Traffic Specifications

Service Class   Min BW     Max BW      Latency Sensitive   Jitter allowed
MPEG4           800 MB/s   2000 MB/s   yes                 yes
Stimulus1       400 MB/s   2000 MB/s   no                  yes
Stimulus2       400 MB/s   2000 MB/s   no                  yes
Stimulus3       400 MB/s   2000 MB/s   no                  yes
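
To relate these minimum bandwidths to a slot-based scheduling table, one can convert each Min BW into a slot count. The sketch below assumes a link bandwidth of 2000 MB/s (taken from the Max BW column) and hypothetical table sizes of 10 and 8 slots; the helper `min_slots` is illustrative and not part of the presentation.

```python
LINK_BW = 2000  # MB/s, assumed equal to the Max BW column

# Minimum bandwidth per service class in MB/s, from the table above.
MIN_BW = {"MPEG4": 800, "Stimulus1": 400, "Stimulus2": 400, "Stimulus3": 400}


def min_slots(min_bw, link_bw, num_slots):
    """Smallest whole number of slots that covers min_bw (integer ceiling)."""
    return -(-min_bw * num_slots // link_bw)


def slots_needed(min_bw_by_class, link_bw, num_slots):
    """Slot count per class needed to honour every minimum bandwidth."""
    return {name: min_slots(bw, link_bw, num_slots)
            for name, bw in min_bw_by_class.items()}


ten = slots_needed(MIN_BW, LINK_BW, 10)
print(ten, sum(ten.values()))      # 4 + 2 + 2 + 2 = 10 slots, fits exactly

eight = slots_needed(MIN_BW, LINK_BW, 8)
print(eight, sum(eight.values()))  # rounding up needs 10 slots, but only 8 exist
```

With the coarser 8-slot table, rounding each minimum up to whole slots asks for more slots than the table has, which gives a small illustration of why the granularity of the specification matters.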

SLIDE 19

MPEG4 Mapping

[Figure: mesh NoC of 20 S nodes, each with an IP block attached through an NI]

  • INPUT (1x1)
  • iQUANT (1x2)
  • iDCT (1x3)
  • PREDICT (2x2)
  • ADD (2x3)
  • IO (2x3)
  • Stimulus1 Producer (1x0), Stimulus1 Consumer (1x4)
  • Stimulus2 Producer (2x0), Stimulus2 Consumer (2x4)
  • Stimulus3 Producer (0x2), Stimulus3 Consumer (3x2)

SLIDE 20

Bandwidth Sharing

Bandwidth arbitration: Round Robin Arbitration

  MPEG4       500 MB/s
  StimulusA   500 MB/s
  StimulusB   500 MB/s
  StimulusC   500 MB/s

Min BW requirement for MPEG4: 800 MB/s

SLIDE 21

Bandwidth Sharing

Bandwidth arbitration: Bounded Arbitration Algorithm

  MPEG4       800 MB/s
  StimulusA   400 MB/s
  StimulusB   400 MB/s
  StimulusC   400 MB/s

Min BW requirement for MPEG4: 800 MB/s
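
The difference between the two bandwidth-sharing slides can be reproduced with a few lines of arithmetic. The sketch below assumes a 2000 MB/s link (the sum of the shares shown) and that StimulusA to StimulusC each require 400 MB/s as a minimum; `round_robin`, `minimum_first`, and `shortfall` are hypothetical helper names, and `minimum_first` is a simplification rather than the BAA itself.

```python
LINK_BW = 2000  # MB/s, assumed: the sum of the shares shown on both slides
MIN_BW = {"MPEG4": 800, "StimulusA": 400, "StimulusB": 400, "StimulusC": 400}


def round_robin(min_bw, link_bw):
    """Round robin ignores the requirements and splits the link evenly."""
    share = link_bw / len(min_bw)
    return {name: share for name in min_bw}


def minimum_first(min_bw, link_bw):
    """Grant every minimum first, then split any leftover evenly."""
    leftover = link_bw - sum(min_bw.values())
    if leftover < 0:
        raise ValueError("minimum guarantees exceed the link bandwidth")
    bonus = leftover / len(min_bw)
    return {name: bw + bonus for name, bw in min_bw.items()}


def shortfall(shares, min_bw):
    """Bandwidth (MB/s) missing for every class that is below its minimum."""
    return {n: min_bw[n] - s for n, s in shares.items() if s < min_bw[n]}


print(shortfall(round_robin(MIN_BW, LINK_BW), MIN_BW))    # {'MPEG4': 300.0}
print(shortfall(minimum_first(MIN_BW, LINK_BW), MIN_BW))  # {}
```

Under the even split, MPEG4 receives 500 MB/s and misses its 800 MB/s minimum by 300 MB/s, whereas granting the minimums first leaves no class below its requirement.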

SLIDE 22

Throughput Comparison

[Figure: average throughput of all connections. x-axis: connections (MPEG4 connections among Inp, iQuant, iDCT, MC, and ADD; Stimulus1-3 as S1, S2, S3); y-axis: throughput (MB/s), ticks from 500 to 2500; series: BAA, RR, FAA]

SLIDE 23

Latency Comparison

[Figure: average latency of all connections. x-axis: connections (MPEG4 connections among Inp, iQuant, iDCT, MC, and ADD; Stimulus1-3 as S1, S2, S3); y-axis: latency (cycles), ticks from 1000 to 4000; series: BAA, RR, FAA]

SLIDE 24

Bandwidth Waste

The fixed arbitration wastes 25% of the reserved BW because of the unpredictability of the injection rate

SLIDE 25

Specification Granularity

There is, on average, a 20% waste of BW, as expected, for a 5% error in the classification

SLIDE 26

Results

  • Guarantees & bounds are met
  • BAA: High resource utilization (avg. 97%)
  • Resource sharing is crucial
  • Importance of specification granularity

SLIDE 27

Summary

  • An efficient QoS mechanism is necessary in NoC
  • Hard guarantee & resource utilization
  • BAA is analyzed, superiority is shown
      • High utilization (up to 100%)
      • Better throughput / latency

Future Work

  • QoS management for self-adaptive systems
  • Efficient admission control algorithm design

SLIDE 28

Thank you for your attention!