
MEMORY SYSTEM - Mahdi Nazm Bojnordi, Assistant Professor, School of Computing, University of Utah



  1. MEMORY SYSTEM
     Mahdi Nazm Bojnordi, Assistant Professor
     School of Computing, University of Utah
     CS/ECE 3810: Computer Organization

  2. Overview
     - Notes
       - Homework 9 (deadline Apr. 9th)
         - Verify your submitted file before midnight
     - This lecture
       - Pipeline hazards
         - Control
       - Memory system
         - Cache

  3. Recall: Pipeline Hazards
     - Structural hazards: multiple instructions compete for the same resource
     - Data hazards: a dependent instruction cannot proceed because it needs a value that hasn’t been produced
     - Control hazards: the next instruction cannot be fetched because the outcome of an earlier branch is unknown

  4. Control Hazards
     - Sample C++ code

         for (i=100; i != 0; i--) {
           sum = sum + i;
         }
         total = total + sum;

     How many branches are in this code?

  5. Control Hazards
     - Sample C++ code

         for (i=100; i != 0; i--) {
           sum = sum + i;
         }
         total = total + sum;

     - Equivalent MIPS assembly

               addi $1, $0, 100
         for:  beq  $0, $1, next
               add  $2, $2, $1
               addi $1, $1, -1
               j    for
         next: add  $3, $3, $2

     What are the possible target instructions?
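
     One way to answer the two questions on these slides is to walk through the assembly and count how often each control-transfer instruction executes. The sketch below is only an illustration: the names `BranchCounts` and `run_loop` are made up here, and it assumes `$2` (the running sum) starts at 0, which the slide does not state.

```cpp
// Dynamic counts for the loop above. All names are illustrative.
struct BranchCounts {
    int beq_executed = 0;  // times the conditional branch at "for" ran
    int beq_taken = 0;     // times that branch actually went to "next"
    int j_executed = 0;    // times the unconditional jump back to "for" ran
    int sum = 0;           // value accumulated in $2 (assumed to start at 0)
};

// Mirrors the MIPS sequence one instruction at a time.
BranchCounts run_loop(int start) {
    BranchCounts c;
    int r1 = start;            // addi $1, $0, start
    while (true) {
        ++c.beq_executed;      // for: beq $0, $1, next
        if (r1 == 0) {
            ++c.beq_taken;     // branch taken: leave the loop
            break;
        }
        c.sum += r1;           // add  $2, $2, $1
        r1 -= 1;               // addi $1, $1, -1
        ++c.j_executed;        // j for
    }
    return c;
}
```

     With `start = 100`, the conditional branch executes 101 times and is taken once, while the jump executes 100 times. Statically there are two control transfers (`beq` and `j`); the `beq` can continue at either `add $2, $2, $1` or `next`, and the `j` always continues at `for`.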

  6. Control Hazards
     - Sample C++ code

         for (i=100; i != 0; i--) {
           sum = sum + i;
         }
         total = total + sum;

     - Pipeline view (one column per cycle)

           addi $1, $0, 100    IM  Reg ALU DM  Reg
     for:  beq  $0, $1, next       IM  Reg ALU DM  Reg
           add  $2, $2, $1             IM  Reg ALU DM  Reg
           addi $1, $1, -1                 IM  Reg ALU DM
           j    for                            IM  Reg ALU
     next: add  $3, $3, $2                         IM  Reg

     What happens inside the pipeline?

  7. Control Hazards
     - The outcome of the branch

  8. Handling Control Hazards
     - 1. Introducing stall cycles and delay slots
       - How many cycles/slots?
       - One branch per six instructions on average!

           addi $1, $0, 100    IM  Reg ALU DM  Reg
     for:  beq  $0, $1, next       IM  Reg ALU DM  Reg
           nothing                     IM  Reg ALU DM  Reg
           nothing                         IM  Reg ALU DM
           add  $2, $2, $1                     IM  Reg ALU
           addi $1, $1, -1                         IM  Reg
           j    for

     2 additional delay slots per 6 cycles!

  9. Handling Control Hazards
     - 1. Introducing stall cycles and delay slots
       - How many cycles/slots?
       - One branch per six instructions on average!

           addi $1, $0, 100    IM  Reg ALU DM  Reg
     for:  beq  $0, $1, next       IM  Reg ALU DM  Reg
           nothing                     IM  Reg ALU DM  Reg
           add  $2, $2, $1                 IM  Reg ALU DM
           addi $1, $1, -1                     IM  Reg ALU
           j    for                                IM  Reg

     1 additional delay slot, but longer path
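
     The cost of the stall cycles can be estimated with a small calculation. The sketch below assumes an otherwise ideal pipeline with a base CPI of 1 (an assumption, not something the slides state) and uses the slides' figure of one branch per six instructions; the function name is hypothetical.

```cpp
// Effective cycles per instruction when every branch adds a fixed
// number of stall cycles (delay slots filled with "nothing").
// base_cpi:        CPI of the pipeline with no control hazards (assumed 1.0 below)
// branch_fraction: fraction of instructions that are branches
// stall_cycles:    stall cycles inserted after each branch
double cpi_with_branch_stalls(double base_cpi, double branch_fraction, int stall_cycles) {
    return base_cpi + branch_fraction * stall_cycles;
}
```

     Under these assumptions, `cpi_with_branch_stalls(1.0, 1.0 / 6.0, 2)` gives about 1.33 and `cpi_with_branch_stalls(1.0, 1.0 / 6.0, 1)` gives about 1.17, which is why removing one stall cycle per branch is worth pursuing even if it lengthens the decode path.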

  10. Handling Control Hazards
     - 1. Introducing stall cycles and delay slots
       - How many cycles/slots?
       - One branch per six instructions on average!

           addi $1, $0, 100    IM  Reg ALU DM  Reg
     for:  beq  $0, $1, next       IM  Reg ALU DM  Reg
           nothing                     IM  Reg ALU DM  Reg
           add  $2, $2, $1                 IM  Reg ALU DM
           j    for                            IM  Reg ALU
           addi $1, $1, -1                         IM  Reg
     next: add  $3, $3, $2

     Reordering instructions may help

  11. Handling Control Hazards
     - Strategies for filling up the branch delay slot
       - (a) is the best choice; what about (b) and (c)?

  12. Handling Control Hazards
     - 1. Introducing stall cycles and delay slots
       - How many cycles/slots?
       - One branch per six instructions on average!

           addi $1, $0, 100    IM  Reg ALU DM  Reg
     for:  beq  $0, $1, next       IM  Reg ALU DM  Reg
           nothing                     IM  Reg ALU DM  Reg
           add  $2, $2, $1                 IM  Reg ALU DM
           j    for                            IM  Reg ALU
           addi $1, $1, -1                         IM  Reg
     next: add  $3, $3, $2

     Jumps and function calls can be resolved in the decode stage.

  13. Handling Control Hazards
     - 1. Introducing stall cycles and delay slots
     - 2. Predict the branch outcome
       - Simply assume the branch is taken or not taken
       - Predict the next PC

           addi $1, $0, 100    IM  Reg ALU DM  Reg
     for:  beq  $0, $1, next       IM  Reg ALU DM  Reg
           add  $2, $2, $1             IM  Reg ALU DM  Reg
           addi $1, $1, -1                 IM  Reg ALU DM
           j    for                            IM  Reg ALU
     next: add  $3, $3, $2                         IM  Reg

     May need to cancel the wrong path
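
     To see what "simply assume taken or not taken" costs for the sample loop, the sketch below counts wrong guesses for the `beq` at label `for` under a fixed prediction. It is an illustration only; the function name and parameters are hypothetical, and it models just the branch outcomes, not the pipeline itself.

```cpp
// Counts how often a fixed guess is wrong for the beq at label "for".
// The loop body runs `iterations` times, so the beq is checked
// iterations + 1 times and is taken only on the last check ($1 == 0).
int static_mispredictions(int iterations, bool predict_taken) {
    int wrong = 0;
    for (int r1 = iterations; r1 >= 0; --r1) {
        bool taken = (r1 == 0);        // beq $0, $1, next
        if (taken != predict_taken) {
            ++wrong;                   // the wrong-path instruction must be cancelled
        }
    }
    return wrong;
}
```

     For the 100-iteration loop, always predicting not taken is wrong once (the loop exit), while always predicting taken is wrong 100 times. The better static choice depends on where the branch sits in the loop.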

  14. Handling Control Hazards
     - Pipeline without branch predictor
     [Figure labels: PC, IF (br), Reg Read, Compare, Br-target, PC + 4]

  15. Handling Control Hazards
     - Pipeline with branch predictor
     [Figure labels: PC, Branch Predictor, IF (br), Reg Read, Compare, Br-target, PC + 4]

  16. Handling Control Hazards
     - The 2-bit branch predictor
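
     The slide's state diagram did not survive extraction, so the sketch below shows the common textbook form of a 2-bit predictor: a saturating counter whose upper two states predict taken. It may differ in detail from the diagram on the original slide, and the class and function names are made up for illustration.

```cpp
// 2-bit saturating counter.
// States 0 and 1 predict "not taken"; states 2 and 3 predict "taken".
class TwoBitPredictor {
public:
    explicit TwoBitPredictor(int initial_state = 0) : state_(initial_state) {}

    bool predict() const { return state_ >= 2; }

    // Move one step toward the observed outcome, saturating at 0 and 3.
    void update(bool taken) {
        if (taken) {
            if (state_ < 3) ++state_;
        } else {
            if (state_ > 0) --state_;
        }
    }

private:
    int state_;
};

// Replays the outcome pattern of the sample loop's beq: not taken
// `iterations` times, then taken once at loop exit. The whole loop is
// executed `runs` times with the predictor state carried over.
int count_mispredictions(TwoBitPredictor& p, int iterations, int runs) {
    int wrong = 0;
    for (int r = 0; r < runs; ++r) {
        for (int i = 0; i <= iterations; ++i) {
            bool taken = (i == iterations);
            if (p.predict() != taken) ++wrong;
            p.update(taken);
        }
    }
    return wrong;
}
```

     Starting from state 0 and running the 100-iteration loop three times, this predictor is wrong only at each loop exit, three times in total. A single taken outcome moves the counter from 0 to 1, which still predicts not taken, so the next run of the loop starts with a correct guess.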

  17. Summary of the Pipeline

  18. Memory System
     - Data and instructions are stored on DRAM chips
       - DRAM has high bit density and low speed
       - A DRAM access may take about 300 processor cycles
     - How to bridge the speed gap?
     [Figure: Processor and Memory, ~300X speed gap]

  19. Memory Hierarchy
     - The basic structure of a memory hierarchy.

     Level                          Capacity   Access time
     Registers                      1 KB       1 cycle
     L1 data or instruction cache   32 KB      2 cycles
     L2 cache                       2 MB       15 cycles
     Memory                         1 GB       300 cycles
     Disk                           80 GB      10M cycles

  20. Memory Hierarchy
     - The basic structure of a memory hierarchy.
     - Multiple levels of the memory
     Idea: keep important data closer to the processor.
     [Figure labels: Upper Level, Lower Level]

  21. Cache Architecture
     - Design principles
       - Temporal locality: if you used some data recently, you will likely use it again
       - Spatial locality: if you used some data recently, you will likely access its neighbors
     - Cache terminology
       - Access time
       - Hit vs. miss
       - Miss penalty
     [Figure labels: Processor, Cache, Memory]
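
     The three terms above are commonly combined into an average memory access time (AMAT): hit time plus miss rate times miss penalty. The slides do not give this formula, so treat the sketch below as a standard supplement. The latencies in the usage note (2, 15, and 300 cycles) come from the hierarchy slide, while the miss rates are invented purely for illustration.

```cpp
// Average memory access time for one cache level, in cycles.
// hit_time:     access time of this level on a hit
// miss_rate:    fraction of accesses that miss in this level (0.0 to 1.0)
// miss_penalty: extra cycles needed to fetch the data from the next level
double amat(double hit_time, double miss_rate, double miss_penalty) {
    return hit_time + miss_rate * miss_penalty;
}
```

     With a 2-cycle L1, a 300-cycle memory, and an assumed 5% miss rate, `amat(2.0, 0.05, 300.0)` is 17 cycles. Adding a 15-cycle L2 with an assumed 20% miss rate nests the call: `amat(2.0, 0.05, amat(15.0, 0.2, 300.0))` is 5.75 cycles.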

  22. Direct-Mapped Cache
     - Cache address

  23. Direct-Mapped Cache
     - Cache lookup
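
     The figures for these two slides were lost, so here is a minimal sketch of how a direct-mapped cache typically splits an address and performs a lookup. The geometry (32-bit addresses, 64 lines, 16-byte blocks) and all names are assumptions chosen for illustration, not values from the lecture. Only tags are tracked, not the cached data.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical geometry: 64 lines of 16 bytes each, 32-bit addresses.
class DirectMappedCache {
public:
    static constexpr std::uint32_t kOffsetBits = 4;  // 16-byte blocks
    static constexpr std::uint32_t kIndexBits = 6;   // 64 lines

    DirectMappedCache() : lines_(1u << kIndexBits) {}

    // Address layout, high to low bits: | tag | index | block offset |
    static std::uint32_t offset_of(std::uint32_t addr) {
        return addr & ((1u << kOffsetBits) - 1);
    }
    static std::uint32_t index_of(std::uint32_t addr) {
        return (addr >> kOffsetBits) & ((1u << kIndexBits) - 1);
    }
    static std::uint32_t tag_of(std::uint32_t addr) {
        return addr >> (kOffsetBits + kIndexBits);
    }

    // Returns true on a hit. On a miss, the block replaces whatever
    // was stored in its (only possible) line.
    bool access(std::uint32_t addr) {
        Line& line = lines_[index_of(addr)];
        std::uint32_t tag = tag_of(addr);
        if (line.valid && line.tag == tag) {
            return true;
        }
        line.valid = true;
        line.tag = tag;
        return false;
    }

private:
    struct Line {
        bool valid = false;
        std::uint32_t tag = 0;
    };
    std::vector<Line> lines_;
};
```

     Because each block maps to exactly one line, two addresses with the same index but different tags evict each other. In this geometry, `0x1234` and `0x1634` both map to line 35, so alternating between them misses every time.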
