PIPELINING: HAZARDS Mahdi Nazm Bojnordi Assistant Professor School - - PowerPoint PPT Presentation




slide-1
SLIDE 1

PIPELINING: HAZARDS

CS/ECE 6810: Computer Architecture

Mahdi Nazm Bojnordi

Assistant Professor School of Computing University of Utah

slide-2
SLIDE 2

Overview

¨ Announcement

¤ Homework 1 submission deadline: Jan. 30th

¨ This lecture

  ¤ Impacts of pipelining on performance
  ¤ The MIPS five-stage pipeline
  ¤ Pipeline hazards
    ▪ Structural hazards
    ▪ Data hazards

slide-3
SLIDE 3

Pipelining Technique

¨ Improving throughput at the expense of latency
  ¤ Delay: D = T + nδ
  ¤ Throughput: IPS = n/(T + nδ)

[Figure: the same combinational logic split into pipeline stages]
  • One stage, critical path delay = 30: D = ?, IPS = ?
  • Two stages, delay = 15 each: D = ?, IPS = ?
  • Three stages, delay = 10 each: D = ?, IPS = ?

slide-4
SLIDE 4

Pipelining Technique

¨ Improving throughput at the expense of latency
  ¤ Delay: D = T + nδ
  ¤ Throughput: IPS = n/(T + nδ)

[Figure: the same combinational logic split into pipeline stages]
  • One stage, critical path delay = 30: D = 31, IPS = 1/31
  • Two stages, delay = 15 each: D = 32, IPS = 2/32
  • Three stages, delay = 10 each: D = 33, IPS = 3/33
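The numbers on this slide follow from the two formulas if T = 30 and δ = 1. Both values are inferred from the slide's results and are not stated on it. A minimal sketch that reproduces them (the function names are illustrative):

```python
from fractions import Fraction


def pipeline_delay(total_logic_delay, stages, overhead):
    """Latency of one instruction: D = T + n*delta."""
    return total_logic_delay + stages * overhead


def pipeline_throughput(total_logic_delay, stages, overhead):
    """Throughput in the slide's model: IPS = n / (T + n*delta)."""
    return Fraction(stages, total_logic_delay + stages * overhead)


# Assumed values, inferred from the slide's results (not stated on it).
T, DELTA = 30, 1

for n in (1, 2, 3):
    d = pipeline_delay(T, n, DELTA)
    ips = pipeline_throughput(T, n, DELTA)
    # Fraction reduces automatically, so 2/32 is shown in lowest terms.
    print(f"n={n}: D={d}, IPS={ips} (about {float(ips):.4f})")
```

Delay grows by one δ per added stage, while throughput grows almost in proportion to n as long as nδ is small compared with T.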

slide-5
SLIDE 5

Pipelining Latency vs. Throughput

¨ Theoretical delay and throughput models for perfect pipelining

[Plot: relative performance vs. number of pipeline stages, with curves for Delay (D) and Throughput (IPS)]

slide-6
SLIDE 6

Five Stage MIPS Pipeline

slide-7
SLIDE 7

Simple Five Stage Pipeline

¨ A pipelined load-store architecture that processes up to one instruction per cycle

[Figure: five pipeline stages (Inst. Fetch, Inst. Decode, Execute, Memory, Write Back) with the PC, Inst. Memory, Register File, ALU, and Data Memory]

slide-8
SLIDE 8

Instruction Fetch

¨ Read an instruction from memory (I-Cache)
  ¤ Use the program counter (PC) to index into the I-Memory
  ¤ Compute NPC by incrementing current PC
    ▪ What about branches?
¨ Update pipeline registers
  ¤ Write the instruction into the pipeline registers

slide-9
SLIDE 9

Instruction Fetch

[Figure: instruction fetch datapath with the PC, Memory, a +4 adder, NPC, Instruction, Branch Target, and a clocked Pipeline Register]

Why increment by 4? NPC = PC + 4

slide-10
SLIDE 10

Instruction Fetch

[Figure: instruction fetch datapath with the PC, Memory, a +4 adder, NPC, Instruction, Branch Target, and a clocked Pipeline Register; three paths are marked P1, P2, and P3]

Why increment by 4? NPC = PC + 4

Critical Path = Max{P1, P2, P3}
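The fetch step can be sketched in a few lines. This is a simplified model, not the datapath from the figure: `fetch`, `IFIDRegister`, and the word-indexed `imem` list are hypothetical names. The increment is 4 because MIPS instructions are 32 bits (4 bytes) wide and memory is byte-addressed.

```python
from dataclasses import dataclass

INSTRUCTION_BYTES = 4  # 32-bit MIPS instructions in byte-addressed memory


@dataclass
class IFIDRegister:
    """Values latched between fetch and decode (hypothetical name)."""
    instruction: int
    npc: int


def fetch(pc, imem, branch_taken=False, branch_target=0):
    """Model one IF cycle; returns (pipeline register, next PC).

    imem is assumed to be a list of 32-bit words, so the byte address
    in pc is divided by 4 to index it.
    """
    instruction = imem[pc // INSTRUCTION_BYTES]
    npc = pc + INSTRUCTION_BYTES
    # A mux in front of the PC picks the branch target when a branch is taken.
    next_pc = branch_target if branch_taken else npc
    return IFIDRegister(instruction, npc), next_pc


imem = [0x012A4020, 0x8D090004]  # two example instruction words
latched, next_pc = fetch(0, imem)
print(hex(latched.instruction), latched.npc, next_pc)
```

The branch question from the earlier slide shows up here as the `branch_taken` input: the fetch stage cannot pick the next PC until that signal is known.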

slide-11
SLIDE 11

Instruction Decode

¨ Generate control signals from the opcode bits
¨ Read source operands from the register file (RF)
  ¤ Use the register specifiers for indexing RF
    ▪ How many read ports are required?
¨ Update pipeline registers
  ¤ Send the operand and immediate values to next stage
  ¤ Pass control signals and NPC to next stage
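As an illustration of the decode step, the sketch below splits a 32-bit MIPS instruction word into its standard fields and reads both source operands. The names `decode`, `Fields`, and `read_operands` are hypothetical. Because an instruction can name two source registers (rs and rt), this model reads the register file twice per instruction.

```python
from typing import NamedTuple


class Fields(NamedTuple):
    opcode: int
    rs: int
    rt: int
    rd: int
    immediate: int  # sign-extended 16-bit value


def decode(word):
    """Extract the standard MIPS fields from a 32-bit instruction word."""
    immediate = word & 0xFFFF
    if immediate & 0x8000:          # sign-extend the 16-bit immediate
        immediate -= 0x10000
    return Fields(
        opcode=(word >> 26) & 0x3F,  # bits 31..26
        rs=(word >> 21) & 0x1F,      # bits 25..21
        rt=(word >> 16) & 0x1F,      # bits 20..16
        rd=(word >> 11) & 0x1F,      # bits 15..11
        immediate=immediate,
    )


def read_operands(fields, regs):
    """Read both source registers, one per read port."""
    return regs[fields.rs], regs[fields.rt]


regs = list(range(32))               # dummy register contents
f = decode(0x012A4020)               # add $t0, $t1, $t2
print(f, read_operands(f, regs))
```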

slide-12
SLIDE 12

Instruction Decode

[Figure: instruction decode datapath between two Pipeline Registers: decode logic and the Register File; signals NPC, Instruction, ctrl, reg, reg, target]

slide-13
SLIDE 13

Execute Stage

¨ Perform ALU operation
  ¤ Compute the result of ALU
    ▪ Operation type: control signals
    ▪ First operand: contents of a register
    ▪ Second operand: either a register or the immediate value
  ¤ Compute branch target
    ▪ Target = NPC + immediate
¨ Update pipeline registers
  ¤ Control signals, branch target, ALU results, and destination
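The execute step can be sketched as an operand mux, an ALU, and a separate adder for the branch target. The names `execute`, `ALU_OPS`, and `use_immediate` are illustrative, and only a handful of operations are modeled. The target follows the slide's simplified form; the real MIPS encoding also shifts the branch offset left by 2.

```python
import operator

# A few ALU operations, selected by a control signal (illustrative subset).
ALU_OPS = {
    "add": operator.add,
    "sub": operator.sub,
    "and": operator.and_,
    "or": operator.or_,
}

MASK32 = 0xFFFFFFFF  # keep results within 32 bits


def execute(alu_op, use_immediate, reg_a, reg_b, immediate, npc):
    """Model one EX cycle; returns (ALU result, branch target)."""
    # Second operand: either a register value or the immediate.
    second = immediate if use_immediate else reg_b
    result = ALU_OPS[alu_op](reg_a, second) & MASK32
    # Slide's simplified form: Target = NPC + immediate.
    target = (npc + immediate) & MASK32
    return result, target


print(execute("add", True, 10, 99, 4, 8))
```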

slide-14
SLIDE 14

Execute Stage

[Figure: execute datapath between two Pipeline Registers: the ALU and a branch-target adder (+); signals ctrl, NPC, reg, reg, Target, Res]

slide-15
SLIDE 15

Memory Access

¨ Access data memory
  ¤ Load/store address: ALU outcome
  ¤ Control signals determine read or write access
¨ Update pipeline registers
  ¤ ALU results from execute
  ¤ Loaded data from D-Memory
  ¤ Destination register

slide-16
SLIDE 16

Memory Access

[Figure: memory access datapath between two Pipeline Registers: data Memory with addr and data ports; signals ctrl, Target, Res, reg, Dat]

slide-17
SLIDE 17

Register Write Back

¨ Update register file
  ¤ Control signals determine if a register write is needed
  ¤ Only one write port is required
    ▪ Write the ALU result to the destination register, or
    ▪ Write the loaded data into the register file
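The memory-access and write-back steps can be sketched together. The control-signal names (`mem_read`, `mem_write`, `reg_write`, `mem_to_reg`) follow common textbook usage and are assumptions here, as is modeling data memory as a dictionary.

```python
def memory_access(dmem, mem_read, mem_write, address, store_data=None):
    """MEM stage: the ALU result is the address; control picks read or write."""
    loaded = None
    if mem_read:
        loaded = dmem.get(address, 0)   # unwritten locations read as 0 here
    elif mem_write:
        dmem[address] = store_data
    return loaded


def write_back(regs, reg_write, mem_to_reg, dest, alu_result, loaded_data):
    """WB stage: one write port takes the loaded data or the ALU result."""
    if reg_write:
        regs[dest] = loaded_data if mem_to_reg else alu_result


dmem = {100: 7}
regs = [0] * 32

# Load: R1 <- Mem[100]
loaded = memory_access(dmem, mem_read=True, mem_write=False, address=100)
write_back(regs, reg_write=True, mem_to_reg=True, dest=1,
           alu_result=100, loaded_data=loaded)

# ALU instruction: R2 <- 42 (nothing happens in MEM)
write_back(regs, reg_write=True, mem_to_reg=False, dest=2,
           alu_result=42, loaded_data=None)
print(regs[1], regs[2])
```

Only one value is written per instruction, which is why a single write port is enough.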

slide-18
SLIDE 18

Five Stage Pipeline

¨ Ideal pipeline: IPC=1
  ¤ Are there enough resources to keep the pipeline stages busy all the time?

[Figure: five-stage datapath: Inst. Fetch (PC, +4, Mem), Decode (Reg. File), Execute (ALU, +), Memory (Mem), Writeback (Reg. File)]

slide-19
SLIDE 19

Pipeline Hazards

slide-20
SLIDE 20

Pipeline Hazards

¨ Structural hazards: multiple instructions compete for the same resource
¨ Data hazards: a dependent instruction cannot proceed because it needs a value that hasn’t been produced
¨ Control hazards: the next instruction cannot be fetched because the outcome of an earlier branch is unknown

slide-21
SLIDE 21

Structural Hazards

¨ 1. Unified memory for instruction and data

R1 ← Mem[R2]
R7 ← R1+R0
R6 ← R4-R5
R3 ← Mem[R20]

slide-22
SLIDE 22

Structural Hazards

¨ 1. Unified memory for instruction and data

R1 ← Mem[R2]
R7 ← R1+R0
R6 ← R4-R5
R3 ← Mem[R20]

Separate inst. and data memories.
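To see where the unified memory is contended, the sketch below lists the cycles in which two instructions need the single memory at once. It assumes no stalls, one instruction issued per cycle, and the five-stage timing (IF in the issue cycle, MEM three cycles later). The function name and the program encoding are illustrative.

```python
def memory_port_conflicts(program):
    """Find cycles in which a unified, single-ported memory is used twice.

    program: list of (text, accesses_data_memory) pairs.
    Instruction i (0-based) is in IF in cycle i + 1 and in MEM in cycle i + 4.
    """
    uses = {}  # cycle -> list of (instruction index, stage)
    for i, (_, accesses_data_memory) in enumerate(program):
        uses.setdefault(i + 1, []).append((i, "IF"))       # instruction fetch
        if accesses_data_memory:
            uses.setdefault(i + 4, []).append((i, "MEM"))  # load or store
    return {cycle: who for cycle, who in uses.items() if len(who) > 1}


PROGRAM = [
    ("R1 <- Mem[R2]", True),
    ("R7 <- R1+R0", False),
    ("R6 <- R4-R5", False),
    ("R3 <- Mem[R20]", True),
]

for cycle, who in memory_port_conflicts(PROGRAM).items():
    print(f"cycle {cycle}: {who}")
```

For this code the only clash is in cycle 4, where the first load is in MEM while the fourth instruction is being fetched. With separate instruction and data memories the two accesses go to different structures.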

slide-23
SLIDE 23

Structural Hazards

¨ 1. Unified memory for instruction and data
¨ 2. Register file with shared read/write access ports

R1 ← Mem[R2]
R7 ← R1+R0
R6 ← R4-R5
R3 ← Mem[R20]

slide-24
SLIDE 24

Structural Hazards

¨ 1. Unified memory for instruction and data
¨ 2. Register file with shared read/write access ports

R1 ← Mem[R2]
R7 ← R1+R0
R6 ← R4-R5
R3 ← Mem[R20]

Register access in half cycles.

slide-25
SLIDE 25

Data Hazards

¨ True dependence: read-after-write (RAW)
  ¤ Consumer has to wait for producer

R1 ← Mem[R2]
R3 ← R1+R0
R4 ← R1-R3

Loading data from memory.

slide-26
SLIDE 26

Data Hazards

¨ True dependence: read-after-write (RAW)
  ¤ Consumer has to wait for producer

R1 ← Mem[R2]
R3 ← R1+R0
R4 ← R1-R3

Loaded data will be available two cycles later.

slide-27
SLIDE 27

Data Hazards

¨ True dependence: read-after-write (RAW)
  ¤ Consumer has to wait for producer

R1 ← Mem[R2]
R3 ← R1+R0
R4 ← R1-R3

Inserting two bubbles (two "Nothing" slots in the pipeline diagram).

slide-28
SLIDE 28

Data Hazards

¨ True dependence: read-after-write (RAW)
  ¤ Consumer has to wait for producer

R1 ← Mem[R2]
R3 ← R1+R0
R4 ← R1-R3

Inserting single bubble (one "Nothing" slot in the pipeline diagram) + RF bypassing.
Load delay slot. SW vs. HW management?
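The bubble counts can be checked with a small timing model. It rests on assumptions that are not spelled out on the slides: the producer occupies IF, ID, EX, MEM, WB in cycles 1 to 5; the register file is written in the first half of a cycle and read in the second half; and with forwarding, a value can be fed into EX in the cycle after it is produced. The function name is hypothetical.

```python
# Cycle in which the producer occupies each stage.
IF, ID, EX, MEM, WB = 1, 2, 3, 4, 5


def bubbles_needed(producer_is_load, distance=1, forwarding=False):
    """Bubbles required before a consumer issued `distance` slots later.

    Without forwarding, the consumer must read the register file (in ID)
    no earlier than the producer's WB cycle. With forwarding, the consumer's
    EX must come after the stage that produces the value (MEM for a load,
    EX for an ALU instruction).
    """
    if forwarding:
        ready = MEM if producer_is_load else EX
        consumer_ex = EX + distance          # consumer's EX cycle if unstalled
        return max(0, ready + 1 - consumer_ex)
    consumer_id = ID + distance              # consumer's ID cycle if unstalled
    return max(0, WB - consumer_id)


print("load-use, no forwarding:", bubbles_needed(True))
print("load-use, forwarding:   ", bubbles_needed(True, forwarding=True))
print("ALU-use, forwarding:    ", bubbles_needed(False, forwarding=True))
```

Under these assumptions the model gives two bubbles for the load-use pair without forwarding and one with it, matching the two versions of the slide.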

slide-29
SLIDE 29

Data Hazards

¨ True dependence: read-after-write (RAW)
  ¤ Consumer has to wait for producer

R1 ← R2+R3
R3 ← R1+R0
R4 ← R1-R3
R5 ← R1+R0

Using the result of an ALU instruction.

slide-30
SLIDE 30

Data Hazards

¨ True dependence: read-after-write (RAW)
  ¤ Consumer has to wait for producer

R1 ← R2+R3
R3 ← R1+R0
R4 ← R1-R3
R5 ← R1+R0

Using the result of an ALU instruction.
Forwarding ALU result.

slide-31
SLIDE 31

Data Hazards

¨ True dependence: read-after-write (RAW)
¨ Anti dependence: write-after-read (WAR)
  ¤ Write must wait for earlier read

R1 ← R2+R1
R2 ← R8+R9

slide-32
SLIDE 32

Data Hazards

¨ True dependence: read-after-write (RAW)
¨ Anti dependence: write-after-read (WAR)
  ¤ Write must wait for earlier read

R1 ← R2+R1
R2 ← R8+R9

No WAR hazards in 5-stage pipeline!

slide-33
SLIDE 33

Data Hazards

¨ True dependence: read-after-write (RAW)
¨ Anti dependence: write-after-read (WAR)
¨ Output dependence: write-after-write (WAW)
  ¤ Old writes must not overwrite the younger write

R1 ← R2+R3
R1 ← R8+R9

slide-34
SLIDE 34

Data Hazards

¨ True dependence: read-after-write (RAW)
¨ Anti dependence: write-after-read (WAR)
¨ Output dependence: write-after-write (WAW)
  ¤ Old writes must not overwrite the younger write

R1 ← R2+R3
R1 ← R8+R9

No WAW hazards in 5-stage pipeline!

slide-35
SLIDE 35

Data Hazards

¨ Forwarding with additional hardware
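The additional hardware is essentially a comparator and a mux per ALU operand. The sketch below shows one plausible selection rule and is not taken from the slide's figure: `forward_source` and its parameters are hypothetical names, and special handling of a hardwired-zero register is left out.

```python
def forward_source(src_reg, ex_mem_dest, ex_mem_writes,
                   mem_wb_dest, mem_wb_writes):
    """Choose where an EX-stage operand should come from.

    The EX/MEM pipeline register holds the result of the instruction one
    slot ahead, MEM/WB the one two slots ahead. The newer value wins.
    """
    if ex_mem_writes and ex_mem_dest == src_reg:
        return "EX/MEM"
    if mem_wb_writes and mem_wb_dest == src_reg:
        return "MEM/WB"
    return "RF"  # no match: use the value read from the register file


# R1 <- R2+R3 ; R3 <- R1+R0 ; R4 <- R1-R3
# Second instruction in EX: the first one sits in EX/MEM.
print(forward_source(1, ex_mem_dest=1, ex_mem_writes=True,
                     mem_wb_dest=None, mem_wb_writes=False))
# Third instruction in EX: R1 comes from MEM/WB, R3 from EX/MEM.
print(forward_source(1, 3, True, 1, True), forward_source(3, 3, True, 1, True))
```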

slide-36
SLIDE 36

Data Hazards

¨ How to detect and resolve data hazards
  ¤ Show all of the data hazards in the code below

R1 ← Mem[R2]
R2 ← R1+R0
R1 ← R1-R2
Mem[R3] ← R2

slide-37
SLIDE 37

Data Hazards

¨ How to detect and resolve data hazards
  ¤ Show all of the data hazards in the code below

R1 ← Mem[R2]
R2 ← R1+R0
R1 ← R1-R2
Mem[R3] ← R2

[Annotations on the code mark the WAW, WAR, and RAW dependences.]
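One way to enumerate the dependences is to compare every pair of instructions by register name. This is a sketch, not the slide's worked answer: the encoding of each instruction as (destination, sources) is an assumption, and the pairwise check does not account for an intervening write that would hide an older one.

```python
def find_dependences(program):
    """List (kind, earlier index, later index, register) for each pair.

    program: list of (destination register or None, list of source registers).
    """
    found = []
    for later, (dest_later, srcs_later) in enumerate(program):
        for earlier in range(later):
            dest_earlier, srcs_earlier = program[earlier]
            if dest_earlier is not None and dest_earlier in srcs_later:
                found.append(("RAW", earlier, later, dest_earlier))
            if dest_later is not None and dest_later in srcs_earlier:
                found.append(("WAR", earlier, later, dest_later))
            if dest_earlier is not None and dest_earlier == dest_later:
                found.append(("WAW", earlier, later, dest_earlier))
    return found


PROGRAM = [
    ("R1", ["R2"]),          # R1 <- Mem[R2]
    ("R2", ["R1", "R0"]),    # R2 <- R1+R0
    ("R1", ["R1", "R2"]),    # R1 <- R1-R2
    (None, ["R3", "R2"]),    # Mem[R3] <- R2 (a store writes no register)
]

for kind, earlier, later, reg in find_dependences(PROGRAM):
    print(f"{kind}: instruction {earlier + 1} -> {later + 1} on {reg}")
```

As the earlier slides note, only the RAW dependences turn into hazards in the five-stage pipeline; the WAR and WAW pairs are listed for completeness.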