Superscalar Design: An Introduction

SLIDE 1

CADSL

Superscalar Design:

An Introduction

Virendra Singh

Associate Professor
Computer Architecture and Dependable Systems Lab
Department of Electrical Engineering
Indian Institute of Technology Bombay

http://www.ee.iitb.ac.in/~viren/
E-mail: viren@ee.iitb.ac.in

EE-739: Processor Design

Lecture 23 (11 March 2013)

SLIDE 2

Superscalar Pipeline Stages

Fetch -> Instruction Buffer -> Decode -> Dispatch Buffer -> Dispatch -> Issuing Buffer -> Execute -> Completion Buffer -> Complete -> Store Buffer -> Retire

• Fetch, Decode, Dispatch: in program order
• Execute: out of order
• Complete, Retire: in program order

11 Mar 2013 EE-739@IITB 2
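The ordering constraint on this slide (execution may finish out of order, but completion happens in program order) can be illustrated with a minimal Python sketch. The latencies are made up, and the model assumes no dependences and unlimited functional units; it only shows how a completion buffer restores program order.

```python
def simulate(latencies):
    """latencies: execution latencies of the instructions, in program order.

    Assumption (illustrative only): every instruction starts executing in
    cycle 0, so an instruction finishes as soon as its latency has elapsed.
    Returns (finish_order, retire_order) as lists of instruction indices.
    """
    # Out-of-order execution: shorter instructions finish first.
    finish_order = sorted(range(len(latencies)), key=lambda i: (latencies[i], i))

    # In-order completion: the completion buffer releases an instruction only
    # once every older instruction has finished as well.
    finished = set()
    retire_order = []
    next_to_retire = 0
    for i in finish_order:
        finished.add(i)
        while next_to_retire in finished:
            retire_order.append(next_to_retire)
            next_to_retire += 1
    return finish_order, retire_order


finish, retire = simulate([3, 1, 2, 1])
print("finish order:", finish)
print("retire order:", retire)
```

In this run the slow first instruction finishes last, yet nothing retires before it, so the retire order is again the program order.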

SLIDE 3

Superscalar Architecture

• Wide pipelines for enhanced throughput
• ILP is not necessarily exploited by widening the pipelines and adding more resources
• Processor policies towards fetching, decoding, and executing instructions have a significant effect on its ability to discover instructions which can be executed concurrently
• Instruction issue policy limits or enhances performance because it determines the processor's lookahead capability
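To make the point about issue policy and lookahead concrete, here is a small Python sketch that counts the cycles needed to issue a short program under two policies. The program, register names, latencies, and the function `issue_cycles` are all hypothetical; the model assumes register renaming (only true data dependences matter) and is not a description of any particular processor.

```python
def producers(program):
    """For each instruction, list the older instructions whose results it reads."""
    last_writer = {}
    deps = []
    for idx, (dest, sources, _latency) in enumerate(program):
        deps.append([last_writer[r] for r in sources if r in last_writer])
        last_writer[dest] = idx
    return deps


def issue_cycles(program, width, window, in_order):
    """Cycles needed to issue every instruction.

    program:  list of (dest, sources, latency) in program order.
    width:    maximum instructions issued per cycle.
    window:   number of oldest un-issued instructions examined each cycle.
    in_order: if True, scanning stops at the first instruction that is not ready.
    """
    deps = producers(program)
    done_at = {}                          # instruction -> cycle its result is ready
    pending = list(range(len(program)))
    cycle = 0
    while pending:
        issued = []
        for idx in pending[:window]:
            ready = all(p in done_at and done_at[p] <= cycle for p in deps[idx])
            if ready:
                issued.append(idx)
                if len(issued) == width:
                    break
            elif in_order:
                break                     # stall behind the blocked instruction
        for idx in issued:
            done_at[idx] = cycle + program[idx][2]
            pending.remove(idx)
        cycle += 1
    return cycle


# A 3-cycle load feeding an add, followed by two independent instructions.
program = [("r1", [], 3), ("r2", ["r1"], 1), ("r3", [], 1), ("r4", ["r3"], 1)]
print(issue_cycles(program, width=2, window=2, in_order=True))
print(issue_cycles(program, width=2, window=4, in_order=False))
```

With the same width of 2, the in-order policy needs 5 cycles here while the policy that looks past the stalled add needs 4: the extra resources only help when the issue logic can find independent work.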

SLIDE 4

Highway

SLIDE 5

Bad Traffic

SLIDE 6

Instruction Flow

Challenges:
– Branches: control dependences
– Branch target misalignment
– Instruction cache misses

[Figure: the PC indexes the instruction memory; 3 instructions fetched]

• Objective: Fetch multiple instructions per cycle

SLIDE 7

Instruction Fetch

• Fetch s instructions from I-cache
• I-Cache must be wide enough that each row of the I-Cache array can store s instructions and that an entire row can be read in one access
• Fetch width = Row width
• Assume access latency is 1 cycle
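A minimal Python sketch of what "fetch width = row width" implies when the PC does not point at the start of a row. Addresses are counted in instructions rather than bytes, and the function name `fetch_group` is hypothetical.

```python
def fetch_group(pc, s):
    """Instructions delivered by one I-cache access.

    pc: index of the first wanted instruction (instruction-granular).
    s:  instructions per I-cache row (fetch width = row width).
    Returns (row, offset, indices of the useful instructions).
    """
    row = pc // s        # physical row selected by the decoder
    offset = pc % s      # position of pc inside that row
    # Only the instructions from pc to the end of the row are useful.
    return row, offset, list(range(pc, (row + 1) * s))


for pc in (8, 9, 11):
    row, offset, group = fetch_group(pc, s=4)
    print(f"pc={pc}: row {row}, offset {offset}, {len(group)} useful instructions")
```

An aligned PC yields the full s instructions; a PC that lands on the last slot of a row yields only one, even though the whole row was read.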

SLIDE 8

I-Cache Organization

[Figure: I-cache array with row decoder (DEC) and tag fields]

1 cache line = 1 physical row

SLIDE 9

I-Cache Organization

[Figure: I-cache array with row decoder (DEC) and tag fields]

1 cache line = 2 physical rows
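The two organizations (one line per physical row versus one line spread over two rows) differ only in how an instruction's position maps onto the array. The sketch below assumes a direct-mapped array with no tags or ways, and the line size of 8 instructions is a made-up value for illustration.

```python
def locate(instr_index, line_size, rows_per_line):
    """Map an instruction index to (line, physical row, column).

    line_size:     instructions per cache line (hypothetical parameter).
    rows_per_line: physical rows occupied by one cache line.
    """
    row_width = line_size // rows_per_line          # instructions per physical row
    line = instr_index // line_size
    within_line = instr_index % line_size
    row = line * rows_per_line + within_line // row_width
    column = within_line % row_width
    return line, row, column


print(locate(13, line_size=8, rows_per_line=1))
print(locate(13, line_size=8, rows_per_line=2))
```

With two rows per line the physical row is half as wide, so a single row access delivers at most half a line.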

SLIDE 10

Instruction Flow

Challenges:
– Branches: control dependences
– Branch target misalignment
– Instruction cache misses

Solutions:
– Code alignment (static vs. dynamic)
– Prediction/speculation

• Objective: Fetch multiple instructions per cycle
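Static code alignment can be sketched as a layout pass that pads with NOPs so that every branch target starts a fetch row. This is an illustrative Python sketch, not the method of any specific compiler; the function name, the mnemonics, and the row width of 4 are assumptions.

```python
def align_branch_targets(code, targets, s):
    """Pad `code` with NOPs so each branch target starts a fetch row.

    code:    list of instruction mnemonics in layout order.
    targets: set of indices into `code` that are branch targets.
    s:       instructions per I-cache row.
    """
    out = []
    for i, instr in enumerate(code):
        if i in targets:
            # Number of NOPs needed to reach the next multiple of s.
            out.extend(["nop"] * ((-len(out)) % s))
        out.append(instr)
    return out


layout = align_branch_targets(["a", "b", "c", "d", "e", "L: f", "g"], {5}, s=4)
print(layout)
```

The cost is code growth and wasted fetch slots for the NOPs, which is one reason the slide contrasts static alignment with dynamic (hardware) alignment.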

SLIDE 11

Fetch Alignment

SLIDE 12

Instruction Fetch

• 2-way set associative I-Cache with a line size of 16 instructions (64 bytes)
• Each row of the I-Cache stores 4 associative sets (two per set) of instructions
• Each line of I-cache spans four physical rows
• Physical I-cache array is actually composed of 4 independent sub-arrays
• One instruction can be accessed from each sub-array
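Because the four sub-arrays are independent, each one can be given its own row address, which lets a group of four consecutive instructions start anywhere in a line. The Python sketch below is loosely modeled on that idea: it looks at a single line in a single way, omits tag checks and way selection, and assumes that consecutive instructions are interleaved across the sub-arrays.

```python
LINE_SIZE = 16                            # instructions per line (from the slide)
SUBARRAYS = 4                             # independent sub-arrays (from the slide)
ROWS_PER_LINE = LINE_SIZE // SUBARRAYS    # four physical rows per line


def subarray_rows(start):
    """Row each sub-array reads to fetch up to 4 instructions from `start`.

    start: position of the first instruction within the line (0..15).
    Returns (subarray, row, position) tuples; instructions that would fall
    past the end of the line are not fetched in this access.
    """
    accesses = []
    for k in range(SUBARRAYS):
        pos = start + k
        if pos >= LINE_SIZE:
            break                         # group would cross the line boundary
        accesses.append((pos % SUBARRAYS, pos // SUBARRAYS, pos))
    return accesses


print(subarray_rows(6))     # sub-arrays 2 and 3 read row 1; 0 and 1 read row 2
print(subarray_rows(14))    # only two instructions remain in the line
```

Under these assumptions a misaligned start still yields four instructions, and the fetch group is cut short only when it runs into the end of the cache line.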

SLIDE 13

RIOS-I Fetch Hardware

SLIDE 14

Issues in Decoding

Primary tasks:
– Identify individual instructions (!)
– Determine instruction types
– Determine dependences between instructions

Two important factors:
– Instruction set architecture
– Pipeline width
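The link between dependence checking and pipeline width can be sketched in Python: every younger instruction's sources are compared against every older instruction's destination within the decode group. The instruction encoding as (dest, sources) tuples, the register names, and the assumption of two source operands per instruction are illustrative only.

```python
from itertools import combinations


def raw_matches(group):
    """Pairs (older, younger) where the younger reads the older's destination.

    group: list of (dest, sources) in program order.
    Reports every register match the comparators would see; it does not
    filter out matches hidden by an intervening write to the same register.
    """
    matches = []
    for (i, (dest_i, _)), (j, (_, sources_j)) in combinations(enumerate(group), 2):
        if dest_i in sources_j:
            matches.append((i, j))
    return matches


def comparators_needed(width, sources_per_instr=2):
    """Register comparisons needed to check a decode group of `width`."""
    return sources_per_instr * width * (width - 1) // 2


group = [("r1", ["r2", "r3"]), ("r4", ["r1", "r5"]),
         ("r6", ["r4", "r1"]), ("r7", ["r8", "r9"])]
print(raw_matches(group))
for width in (2, 4, 8):
    print(width, comparators_needed(width))
```

The comparator count grows quadratically with the group size (2, 12, and 56 comparisons for widths 2, 4, and 8 under these assumptions), which is why pipeline width is listed as a key factor in decode complexity.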

SLIDE 15

Pentium Pro Fetch/Decode

SLIDE 16

Thank You
