
SLIDE 1

MEMORY SYSTEM

CS/ECE 3810: Computer Organization

Mahdi Nazm Bojnordi

Assistant Professor, School of Computing, University of Utah

SLIDE 2

Overview

- Notes
  - Homework 9 (deadline Apr. 9th)
    - Verify your submitted file before midnight
- This lecture
  - Pipeline hazards
    - Control
  - Memory system
    - Cache

SLIDE 3

Recall: Pipeline Hazards

- Structural hazards: multiple instructions compete for the same resource
- Data hazards: a dependent instruction cannot proceed because it needs a value that hasn’t been produced
- Control hazards: the next instruction cannot be fetched because the outcome of an earlier branch is unknown

SLIDE 4

Control Hazards

- Sample C++ code

for (i = 100; i != 0; i--) {
    sum = sum + i;
}
total = total + sum;

How many branches are in this code?

SLIDE 5

Control Hazards

- Sample C++ code

for (i = 100; i != 0; i--) {
    sum = sum + i;
}
total = total + sum;

- MIPS assembly

      addi $1, $0, 100
for:  beq  $0, $1, next
      add  $2, $2, $1
      addi $1, $1, -1
      j    for
next: add  $3, $3, $2

What are the possible target instructions?
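As a rough check on the branch count, the loop's control flow can be traced in C++. This is a sketch under one assumption: the loop is compiled with its test at the top, giving one conditional branch that exits the loop and one unconditional jump back to it. That is a plausible translation, not the only one. `BranchCounts` and `count_loop_branches` are illustrative names.

```cpp
#include <cassert>

// Dynamic branch counts for a top-tested loop of the form:
//   for:  beq i, 0, next   (conditional, taken only on exit)
//         ...body...
//         j for            (unconditional, always taken)
//   next: ...
struct BranchCounts {
    int conditional_executed;  // how many times the beq runs
    int conditional_taken;     // how many times the beq leaves the loop
    int jumps_executed;        // how many times the j runs
};

BranchCounts count_loop_branches(int start) {
    assert(start >= 0);  // the loop counts down to zero
    BranchCounts c{0, 0, 0};
    int i = start;
    while (true) {
        ++c.conditional_executed;  // beq $0, $1, next
        if (i == 0) {              // branch taken: leave the loop
            ++c.conditional_taken;
            break;
        }
        --i;                       // addi $1, $1, -1 (rest of body elided)
        ++c.jumps_executed;        // j for
    }
    return c;
}
```

With `start = 100`, the conditional branch executes 101 times (taken once) and the jump executes 100 times, so the two static branches account for 201 dynamic branch instructions.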

SLIDE 6

Control Hazards

- Sample C++ code

for (i = 100; i != 0; i--) {
    sum = sum + i;
}
total = total + sum;

- MIPS assembly

      addi $1, $0, 100
for:  beq  $0, $1, next
      add  $2, $2, $1
      addi $1, $1, -1
      j    for
next: add  $3, $3, $2

[Figure: pipeline diagram; the instructions above overlap in the IM, Reg, ALU, DM, and Reg stages]

What happens inside the pipeline?

SLIDE 7

Control Hazards

- The outcome of the branch

SLIDE 8

Handling Control Hazards

1. Introducing stall cycles and delay slots
   - How many cycles/slots?
   - One branch per every six instructions on average!

[Figure: pipeline diagram (IM, Reg, ALU, DM, Reg stages) containing addi $1, $0, 100; beq $0, $1, next; add $2, $2, $1; addi $1, $1, -1; j for; and two "nothing" slots]

2 additional delay slots per 6 cycles!
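The cost of stalling can be estimated with the usual effective-CPI formula. The sketch below assumes a base CPI of 1 and charges a fixed number of stall cycles to every branch; the function name and parameters are illustrative. The inputs come from the slide: one branch per six instructions, two stall cycles per branch.

```cpp
#include <cassert>

// Effective CPI = base CPI + (branches per instruction) * (stall cycles per branch).
double effective_cpi(double base_cpi, double branch_fraction, double stall_cycles) {
    assert(base_cpi > 0.0);
    assert(branch_fraction >= 0.0 && branch_fraction <= 1.0);
    assert(stall_cycles >= 0.0);
    return base_cpi + branch_fraction * stall_cycles;
}
```

For example, `effective_cpi(1.0, 1.0 / 6.0, 2.0)` is about 1.33, roughly a 33% slowdown relative to an ideal pipeline. With one stall cycle per branch it drops to about 1.17.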

SLIDE 9

Handling Control Hazards

1. Introducing stall cycles and delay slots
   - How many cycles/slots?
   - One branch per every six instructions on average!

[Figure: pipeline diagram (IM, Reg, ALU, DM, Reg stages) containing addi $1, $0, 100; beq $0, $1, next; add $2, $2, $1; addi $1, $1, -1; j for; and two "nothing" slots]

1 additional delay slot, but longer path

SLIDE 10

Handling Control Hazards

1. Introducing stall cycles and delay slots
   - How many cycles/slots?
   - One branch per every six instructions on average!

[Figure: pipeline diagram (IM, Reg, ALU, DM, Reg stages) containing addi $1, $0, 100; beq $0, $1, next; add $2, $2, $1; addi $1, $1, -1; j for; add $3, $3, $2; and one "nothing" slot]

Reordering instructions may help

SLIDE 11

Handling Control Hazards

- Strategies for filling up the branch delay slot
  - (a) is the best choice; what about (b) and (c)?

SLIDE 12

Handling Control Hazards

1. Introducing stall cycles and delay slots
   - How many cycles/slots?
   - One branch per every six instructions on average!

[Figure: pipeline diagram (IM, Reg, ALU, DM, Reg stages) containing addi $1, $0, 100; beq $0, $1, next; add $2, $2, $1; addi $1, $1, -1; j for; add $3, $3, $2; and one "nothing" slot]

Jumps and function calls can be resolved in the decode stage.

SLIDE 13

Handling Control Hazards

1. Introducing stall cycles and delay slots
2. Predict the branch outcome
   - Simply assume the branch is taken or not taken
   - Predict the next PC

[Figure: pipeline diagram (IM, Reg, ALU, DM, Reg stages) containing addi $1, $0, 100; beq $0, $1, next; add $2, $2, $1; addi $1, $1, -1; j for; add $3, $3, $2]

May need to cancel the wrong path
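A quick way to compare the two static policies is to replay a branch's outcome history against each one. This is a minimal sketch: the outcome trace is supplied by the caller, and `static_accuracy` is an illustrative helper rather than anything defined in the lecture.

```cpp
#include <cassert>
#include <vector>

// Fraction of outcomes that a fixed (static) prediction gets right.
// outcomes[k] is true if the branch was taken on its k-th execution.
double static_accuracy(const std::vector<bool>& outcomes, bool predict_taken) {
    assert(!outcomes.empty());
    int correct = 0;
    for (bool taken : outcomes) {
        if (taken == predict_taken) ++correct;
    }
    return static_cast<double>(correct) / static_cast<double>(outcomes.size());
}
```

If the sample loop's exit branch sits at the top of the loop (an assumption about the translation), it is not taken 100 times and then taken once, so predicting not-taken is right 100 times out of 101 and predicting taken is right only once.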

SLIDE 14

Handling Control Hazards

- Pipeline without branch predictor

[Figure: branch datapath; labels: IF (br), PC, Reg Read, Compare, Br-target, PC + 4]

SLIDE 15

Handling Control Hazards

- Pipeline with branch predictor

[Figure: branch datapath; labels: IF (br), PC, Reg Read, Compare, Br-target, PC + 4, Branch Predictor]

SLIDE 16

Handling Control Hazards

- The 2-bit branch predictor
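The slide's state diagram did not survive extraction, so the sketch below uses the common saturating-counter formulation of a 2-bit predictor: states 0 and 1 predict not taken, states 2 and 3 predict taken, and each outcome moves the counter one step. The lecture's encoding may differ. The class name and the `count_mispredictions` helper are illustrative.

```cpp
#include <cassert>

// 2-bit saturating counter: states 0,1 predict not taken; 2,3 predict taken.
class TwoBitPredictor {
public:
    explicit TwoBitPredictor(int initial_state = 0) : state_(initial_state) {
        assert(initial_state >= 0 && initial_state <= 3);
    }

    bool predict() const { return state_ >= 2; }

    // Move one step toward the actual outcome, saturating at 0 and 3.
    void update(bool taken) {
        if (taken) {
            if (state_ < 3) ++state_;
        } else {
            if (state_ > 0) --state_;
        }
    }

    int state() const { return state_; }

private:
    int state_;
};

// Mispredictions for a branch that is taken `iterations` times and then
// not taken once (a loop-closing branch), with the loop entered `visits` times.
int count_mispredictions(TwoBitPredictor& p, int iterations, int visits) {
    int wrong = 0;
    for (int v = 0; v < visits; ++v) {
        for (int k = 0; k <= iterations; ++k) {
            bool taken = (k < iterations);  // the last execution exits the loop
            if (p.predict() != taken) ++wrong;
            p.update(taken);
        }
    }
    return wrong;
}
```

Once warmed up, this predictor mispredicts a loop-closing branch only once per loop exit: the single not-taken outcome moves the counter from 3 to 2, which still predicts taken when the loop is entered again.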


Summary of the Pipeline

SLIDE 18

Memory System

- Data and instructions are stored on DRAM chips
  - DRAM has high bit density and low speed
  - A DRAM access may take about 300 processor cycles
- How to bridge the speed gap?

[Figure: processor and memory, with a speed gap of about 300X]

SLIDE 19

Memory Hierarchy

- The basic structure of a memory hierarchy.

Level                           Size     Access time
Registers                       1 KB     1 cycle
L1 data or instruction cache    32 KB    2 cycles
L2 cache                        2 MB     15 cycles
Memory                          1 GB     300 cycles
Disk                            80 GB    10M cycles
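To see why the hierarchy helps, the latencies above can be combined into an average memory access time (AMAT). The access times in the usage note come from the slide; the miss rates are made-up values for illustration only, and `Level` and `average_access_time` are hypothetical names.

```cpp
#include <cassert>
#include <vector>

struct Level {
    double access_cycles;  // time to access this level
    double miss_rate;      // fraction of accesses here that continue to the next level
};

// AMAT = t1 + m1 * (t2 + m2 * (t3 + ...)), evaluated from the last level backwards.
double average_access_time(const std::vector<Level>& levels) {
    assert(!levels.empty());
    double amat = 0.0;
    for (auto it = levels.rbegin(); it != levels.rend(); ++it) {
        assert(it->miss_rate >= 0.0 && it->miss_rate <= 1.0);
        amat = it->access_cycles + it->miss_rate * amat;
    }
    return amat;
}
```

With assumed miss rates of 5% for L1 and 20% for L2, `average_access_time({{2.0, 0.05}, {15.0, 0.20}, {300.0, 0.0}})` evaluates to 2 + 0.05 * (15 + 0.20 * 300) = 5.75 cycles, far below the 300 cycles of going to DRAM on every access.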

SLIDE 20

Memory Hierarchy

- The basic structure of a memory hierarchy.
- Multiple levels of the memory

[Figure: memory levels, from upper level to lower level]

Idea: keep important data closer to the processor.

SLIDE 21

Cache Architecture

- Design principles
  - Temporal locality: if you used some data recently, you will likely use it again
  - Spatial locality: if you used some data recently, you will likely access its neighbors
- Cache terminology
  - Access time
  - Hit vs. miss
  - Miss penalty

[Figure: Processor, Cache, Memory]
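Both kinds of locality show up in an ordinary matrix sum. The sketch below assumes a row-major matrix stored in one contiguous `std::vector`; the function names are illustrative. The two functions return the same result but touch memory in different orders, and how much that matters in practice depends on the cache and the matrix size.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Row-major layout: element (r, c) lives at data[r * cols + c].
long long sum_row_major(const std::vector<int>& data, std::size_t rows, std::size_t cols) {
    assert(data.size() == rows * cols);
    long long sum = 0;  // reused on every iteration: temporal locality
    for (std::size_t r = 0; r < rows; ++r) {
        for (std::size_t c = 0; c < cols; ++c) {
            sum += data[r * cols + c];  // consecutive addresses: spatial locality
        }
    }
    return sum;
}

long long sum_column_major(const std::vector<int>& data, std::size_t rows, std::size_t cols) {
    assert(data.size() == rows * cols);
    long long sum = 0;
    for (std::size_t c = 0; c < cols; ++c) {
        for (std::size_t r = 0; r < rows; ++r) {
            sum += data[r * cols + c];  // stride of `cols` elements: neighbors are skipped
        }
    }
    return sum;
}
```

The row-major walk uses every element of a fetched cache block before moving on, while the column-major walk may fetch a new block for each element when a row is larger than a block.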

SLIDE 22

Direct-Mapped Cache

- Cache address
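In a direct-mapped cache, an address is conventionally split into a tag, an index, and a block offset. The sketch below assumes 32-bit byte addresses and power-of-two sizes; the block size and block count are parameters chosen by the caller, not values given in the lecture, and all names are illustrative.

```cpp
#include <cassert>
#include <cstdint>

struct CacheAddress {
    std::uint32_t tag;     // identifies which memory block occupies the entry
    std::uint32_t index;   // selects the cache entry
    std::uint32_t offset;  // selects the byte within the block
};

// Number of address bits needed to select one of `n` items; n must be a power of two.
unsigned log2_exact(std::uint32_t n) {
    assert(n != 0 && (n & (n - 1)) == 0);
    unsigned bits = 0;
    while (n > 1) {
        n >>= 1;
        ++bits;
    }
    return bits;
}

CacheAddress split_address(std::uint32_t addr, std::uint32_t block_bytes, std::uint32_t num_blocks) {
    unsigned offset_bits = log2_exact(block_bytes);
    unsigned index_bits = log2_exact(num_blocks);
    assert(offset_bits + index_bits < 32);  // leave at least one tag bit
    CacheAddress a;
    a.offset = addr & (block_bytes - 1);
    a.index = (addr >> offset_bits) & (num_blocks - 1);
    a.tag = addr >> (offset_bits + index_bits);
    return a;
}
```

For a 32 KB cache with hypothetical 64-byte blocks (512 entries), the offset takes 6 bits, the index 9 bits, and the tag the remaining 17 bits. `split_address(0x12345678, 64, 512)` then yields tag `0x2468`, index `0x159`, and offset `0x38`.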

SLIDE 23

Direct-Mapped Cache

- Cache lookup
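A lookup in a direct-mapped cache checks exactly one entry: the one selected by the index, which hits only if it is valid and its stored tag matches. The following is a minimal sketch that tracks only valid bits and tags (no data), with a hypothetical class name and caller-chosen sizes.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

class DirectMappedCache {
public:
    DirectMappedCache(std::uint32_t block_bytes, std::uint32_t num_blocks)
        : block_bytes_(block_bytes),
          num_blocks_(num_blocks),
          valid_(num_blocks, false),
          tags_(num_blocks, 0) {
        assert(block_bytes > 0 && num_blocks > 0);
    }

    // Returns true on a hit. On a miss, the requested block replaces
    // whatever previously occupied the selected entry.
    bool access(std::uint32_t addr) {
        std::uint32_t block_number = addr / block_bytes_;
        std::uint32_t index = block_number % num_blocks_;
        std::uint32_t tag = block_number / num_blocks_;
        if (valid_[index] && tags_[index] == tag) {
            return true;
        }
        valid_[index] = true;
        tags_[index] = tag;
        return false;
    }

private:
    std::uint32_t block_bytes_;
    std::uint32_t num_blocks_;
    std::vector<bool> valid_;
    std::vector<std::uint32_t> tags_;
};
```

With 16-byte blocks and 4 entries, addresses 0 and 64 map to the same entry with different tags, so accessing 0, then 64, then 0 again misses all three times, while addresses 0 and 4 share a block and the second access hits.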