Superscalar Design: An Introduction

SLIDE 1

CADSL

Superscalar Design:

An Introduction

Virendra Singh

Associate Professor
Computer Architecture and Dependable Systems Lab
Department of Electrical Engineering
Indian Institute of Technology Bombay

http://www.ee.iitb.ac.in/~viren/ E-mail: viren@ee.iitb.ac.in

EE-739: Processor Design

Lecture 23 (11 March 2013)

SLIDE 2

Superscalar Pipeline Stages

[Figure: pipeline stages with the buffers between them]

Fetch → Instruction Buffer → Decode → Dispatch Buffer → Dispatch → Issuing Buffer → Execute → Completion Buffer → Complete → Store Buffer → Retire

  • Fetch, Decode, Dispatch: in program order
  • Execute: out of order
  • Complete, Retire: in program order

11 Mar 2013 EE-739@IITB 2

SLIDE 3

Superscalar Architecture

  • Wide pipelines for enhanced throughput
  • ILP is not necessarily exploited by widening the pipelines and adding more resources
  • Processor policies for fetching, decoding, and executing instructions have a significant effect on its ability to discover instructions that can be executed concurrently
  • Instruction issue policy limits or enhances performance because it determines the processor's lookahead capability
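
The link between issue policy and lookahead can be illustrated with a toy model. This is a minimal sketch and not part of the lecture: it assumes single-cycle latency and tracks only true (read-after-write) dependences. The names `producers` and `cycles_to_issue`, the register names, and the six-instruction program are hypothetical.

```python
def producers(program):
    """For each instruction, the set of earlier instructions it reads from (RAW only)."""
    last_writer = {}
    deps = []
    for i, (dest, srcs) in enumerate(program):
        deps.append({last_writer[s] for s in srcs if s in last_writer})
        last_writer[dest] = i
    return deps


def cycles_to_issue(program, width, lookahead, in_order):
    """Count cycles needed to issue every instruction (assumed 1-cycle latency)."""
    deps = producers(program)
    done = set()                       # instructions issued in an earlier cycle
    pending = list(range(len(program)))
    cycles = 0
    while pending:
        issued = []
        for i in pending[:lookahead]:  # only the oldest `lookahead` are visible
            if len(issued) == width:
                break
            if deps[i] <= done:        # all producers have finished
                issued.append(i)
            elif in_order:
                break                  # in-order issue stalls at the first blocked one
        done.update(issued)
        pending = [i for i in pending if i not in issued]
        cycles += 1
    return cycles


# Hypothetical program: (destination, [sources]); r0 is written by nobody here.
program = [
    ("r1", ["r0"]),  # i0
    ("r2", ["r1"]),  # i1 depends on i0
    ("r3", ["r2"]),  # i2 depends on i1
    ("r4", ["r0"]),  # i3 independent
    ("r5", ["r0"]),  # i4 independent
    ("r6", ["r0"]),  # i5 independent
]

print(cycles_to_issue(program, width=2, lookahead=2, in_order=True))   # 4
print(cycles_to_issue(program, width=2, lookahead=4, in_order=False))  # 3
```

With the same issue width of 2, the in-order policy stalls behind the dependent chain, while a lookahead of 4 lets independent instructions fill the otherwise idle slot.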

SLIDE 4

Highway

SLIDE 5

Bad Traffic

SLIDE 6

Instruction Flow

  • Challenges:
    – Branches: control dependences
    – Branch target misalignment
    – Instruction cache misses

[Figure: instruction memory row addressed by the PC; 3 instructions fetched]

Objective: Fetch multiple instructions per cycle
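
The "3 instructions fetched" case can be reproduced with a small calculation. This is a sketch under stated assumptions: only one row of instruction memory can be read per cycle, rows hold `row_width` instructions, and the PC is expressed as an instruction index rather than a byte address. The function name `instructions_fetched` is hypothetical.

```python
def instructions_fetched(pc_index: int, row_width: int) -> int:
    """Instructions delivered in one cycle when only one row can be read.

    A fetch that starts in the middle of a row gets only the rest of that row.
    """
    return row_width - (pc_index % row_width)


# A 4-wide row: an aligned PC delivers 4 instructions, a misaligned one fewer.
for pc in range(8, 12):
    print(pc, instructions_fetched(pc, row_width=4))
```

A branch whose target lands on the second slot of a 4-wide row therefore yields 3 useful instructions in that cycle, which is the loss that code alignment tries to avoid.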

SLIDE 7

Instruction Fetch

  • Fetch s instructions from the I-cache
  • The I-cache must be wide enough that each row of the I-cache array can store s instructions and that an entire row can be read in a single access
  • Fetch width = row width
  • Assume the access latency is 1 cycle

SLIDE 8

I-Cache Organization

[Figure: I-cache array with tags and a row decoder (DEC)]

1 cache line = 1 physical row

SLIDE 9

I-Cache Organization

[Figure: I-cache array with tags and a row decoder (DEC)]

1 cache line = 2 physical rows

SLIDE 10

Instruction Flow

  • Challenges:
    – Branches: control dependences
    – Branch target misalignment
    – Instruction cache misses
  • Solutions:
    – Code alignment (static vs. dynamic)
    – Prediction/speculation

Objective: Fetch multiple instructions per cycle

SLIDE 11

Fetch Alignment

SLIDE 12

Instruction Fetch

  • 2-way set-associative I-cache with a line size of 16 instructions (64 bytes)
  • Each row of the I-cache stores 4 associative sets (two per set) of instructions
  • Each line of the I-cache spans four physical rows
  • The physical I-cache array is actually composed of 4 independent sub-arrays
  • One instruction can be accessed from each sub-array per I-cache access

SLIDE 13

RIOS-I Fetch Hardware
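
How four independent sub-arrays help with misaligned fetches can be sketched in a few lines. This is a simplified model, not the actual RIOS-I logic. It assumes that instruction k of a 16-instruction line sits in sub-array k mod 4 at row k div 4, that a fetch cannot cross a cache-line boundary in one cycle, and it ignores tags and associativity. The name `fetch_group` is hypothetical.

```python
SUBARRAYS = 4      # independent sub-arrays, one instruction from each per access
LINE_INSTRS = 16   # instructions per cache line (64 bytes)


def fetch_group(start):
    """Return (sub_array, row_within_line, instr_index) for one fetch cycle.

    `start` is the instruction index of the first instruction to fetch.
    """
    line_base = start - start % LINE_INSTRS
    group = []
    for instr in range(start, start + SUBARRAYS):
        offset = instr - line_base
        if offset >= LINE_INSTRS:
            break  # belongs to the next cache line: not fetched this cycle
        group.append((offset % SUBARRAYS, offset // SUBARRAYS, instr))
    return group


print(fetch_group(5))   # misaligned: sub-array 0 must read the next row
print(fetch_group(14))  # near the end of the line: only 2 instructions
```

Because each sub-array can be given its own row index, a fetch starting at instruction 5 still returns four instructions: sub-arrays 1 to 3 read row 1 while sub-array 0 reads row 2. Only a misaligned fetch in the last row of a line comes up short in this model.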

SLIDE 14

Issues in Decoding

  • Primary tasks
    – Identify individual instructions (!)
    – Determine instruction types
    – Determine dependences between instructions
  • Two important factors
    – Instruction set architecture
    – Pipeline width
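
The dependence-checking task, and why pipeline width matters for it, can be illustrated as follows. This is a minimal sketch that assumes register-to-register instructions written as (destination, [sources]) and considers only read-after-write dependences inside one decode group. The names `raw_dependences` and `comparators`, and the sample group, are hypothetical.

```python
def raw_dependences(group):
    """Pairs (producer, consumer) of read-after-write dependences in a decode group.

    Each source is matched against the nearest earlier instruction writing it.
    """
    deps = []
    for j, (_, srcs) in enumerate(group):
        for src in srcs:
            for i in range(j - 1, -1, -1):   # scan older instructions, nearest first
                if group[i][0] == src:
                    deps.append((i, j))
                    break
    return deps


def comparators(width, srcs_per_instr=2):
    """Register comparisons needed to check every source against every older destination."""
    return srcs_per_instr * width * (width - 1) // 2


group = [
    ("r1", ["r2", "r3"]),  # i0
    ("r4", ["r1", "r5"]),  # i1 reads r1 from i0
    ("r1", ["r6"]),        # i2 overwrites r1
    ("r7", ["r1", "r4"]),  # i3 reads r1 from i2 and r4 from i1
]

print(raw_dependences(group))             # [(0, 1), (2, 3), (1, 3)]
print([comparators(w) for w in (2, 4, 8)])  # [2, 12, 56]
```

The comparison count grows roughly with the square of the decode width, which is one reason wide decoders are expensive.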

SLIDE 15

Pentium Pro Fetch/Decode

SLIDE 16

Thank You