
Hazards



  1. Hazards

  2. Today
     • Quiz recap
     • Quiz 2 correction
     • Flash memory
     • Data hazards
     • Watch for an announcement about signing up for a project interview.
     • Remember! There is a midterm next Tuesday that covers everything through this Thursday.
     • Review session: next Monday, 7pm, cse4140

  3. Hazards: Key Points
     • Hazards cause imperfect pipelining
       • They prevent us from achieving CPI = 1
       • They are generally caused by “counter flow” data dependences in the pipeline
     • Three kinds
       • Structural -- contention for hardware resources
       • Data -- a data value is not available when/where it is needed
       • Control -- the next instruction to execute is not known
     • Two ways to deal with hazards
       • Removal -- add hardware and/or complexity to work around the hazard so it does not exist
         • Bypassing/forwarding
         • Speculation
       • Stall -- sacrifice performance to prevent the hazard from occurring
         • Bubbles

  4. Data Dependences
     • A data dependence occurs whenever one instruction needs a value produced by another.
       • Register values (for now)
       • Also memory accesses (more on this later)
     Example instructions:
       sw $t1, 0($t2)
       add $s0, $t0, $t1
       sub $t2, $s0, $t3
       ld $t3, 0($t2)
       add $t3, $s0, $t4
       and $t3, $t2, $t4
       ld $t4, 16($s4)
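The definition on the slide above can be made concrete with a small sketch that finds register dependences: an instruction depends on the most recent earlier instruction that wrote one of its source registers. The tuple encoding and the function name `raw_dependences` are assumptions made for this illustration, and the four register instructions are taken from the slide in an assumed program order.

```python
# Each instruction is (text, destination register or None, source registers).
# This encoding is hypothetical and exists only for the sketch.
PROGRAM = [
    ("add $s0, $t0, $t1", "$s0", ("$t0", "$t1")),
    ("sub $t2, $s0, $t3", "$t2", ("$s0", "$t3")),
    ("add $t3, $s0, $t4", "$t3", ("$s0", "$t4")),
    ("and $t3, $t2, $t4", "$t3", ("$t2", "$t4")),
]


def raw_dependences(program):
    """Return (producer_index, consumer_index, register) for every read
    of a register that an earlier instruction in the list wrote."""
    last_writer = {}  # register -> index of the latest instruction writing it
    deps = []
    for i, (_text, dest, srcs) in enumerate(program):
        for reg in srcs:
            if reg in last_writer:
                deps.append((last_writer[reg], i, reg))
        if dest is not None:
            last_writer[dest] = i
    return deps


DEPS = raw_dependences(PROGRAM)
for producer, consumer, reg in DEPS:
    print(f"{PROGRAM[consumer][0]!r} needs {reg} from {PROGRAM[producer][0]!r}")
```

Memory dependences (the `sw`/`ld` instructions) are left out here, matching the slide's "register values (for now)".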

  5. Dependences in the pipeline
     • In our simple pipeline, these instructions cause a hazard
     [Pipeline diagram over cycles: add $s0, $t0, $t1 followed by sub $t2, $s0, $t3, each passing through Fetch, Decode, EX, Mem, Write back]

  6. How can we fix it?
     • Ideas?

  7. Solution 1: Make the compiler deal with it.
     • Expose hazards to the “big A” architecture
       • A result is available N instructions after the instruction that generates it.
       • In the meantime, the register file has the old value.
       • “Delay slots”
     • What is N?
     • Can it change?
     • What can the compiler do?
     [Pipeline diagram: Fetch, Decode, EX, Mem, Write back]

  8. Compiling for delay slots
     • The compiler must fill the delay slots with other instructions
     • What if it can’t? No-ops
     Rearrange instructions:
       Original                 Rearranged
       add $s0, $t0, $t1        add $s0, $t0, $t1
       sub $t2, $s0, $t3        and $t7, $t5, $t4
       add $t3, $s0, $t4        sub $t2, $s0, $t3
       and $t7, $t5, $t4        add $t3, $s0, $t4
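As a rough illustration of the no-op fallback on the slide above, the sketch below pads a program with `nop`s until every consumer sits far enough behind its producer. The tuple encoding, the `fill_with_nops` name, and the `delay_slots` parameter (the number of instructions that must separate a producer from its consumer) are assumptions for this sketch; the value 1 is chosen only because the slide's rearranged code needs a single filler instruction.

```python
NOP = ("nop", None, ())


def fill_with_nops(program, delay_slots):
    """Insert nops so that any consumer is at least delay_slots + 1
    positions after the instruction producing its source register."""
    out = []
    ready_at = {}  # register -> earliest output index allowed to read it
    for text, dest, srcs in program:
        earliest = max((ready_at.get(reg, 0) for reg in srcs), default=0)
        while len(out) < earliest:
            out.append(NOP)
        out.append((text, dest, srcs))
        if dest is not None:
            # The producer sits at index len(out) - 1.
            ready_at[dest] = len(out) + delay_slots
    return out


ORIGINAL = [
    ("add $s0, $t0, $t1", "$s0", ("$t0", "$t1")),
    ("sub $t2, $s0, $t3", "$t2", ("$s0", "$t3")),
    ("add $t3, $s0, $t4", "$t3", ("$s0", "$t4")),
    ("and $t7, $t5, $t4", "$t7", ("$t5", "$t4")),
]
# The compiler's rearranged order from the slide: the `and` fills the slot.
REARRANGED = [ORIGINAL[0], ORIGINAL[3], ORIGINAL[1], ORIGINAL[2]]

PADDED = [text for text, _, _ in fill_with_nops(ORIGINAL, delay_slots=1)]
SCHEDULED = [text for text, _, _ in fill_with_nops(REARRANGED, delay_slots=1)]
```

Under these assumptions the original order needs one `nop` after the first `add`, while the rearranged order needs none, which is why rearranging is preferred when the compiler can find an independent instruction.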

  9. Solution 2: Stall
     • When you need a value that is not ready, “stall”
       • Suspend the execution of the executing instruction
       • and those that follow.
     • This introduces a pipeline “bubble”
     [Pipeline diagram over cycles: add $s0, $t0, $t1 passes through Fetch, Decode, EX, Mem, Write back; sub $t2, $s0, $t3 is held by a Stall before it continues]

  10. Stalling the pipeline
     • All pipeline stages preceding the stage where the hazard occurs freeze
       • Disable the PC update
       • Disable the pipeline registers
     • This is essentially equivalent to always inserting a nop when a hazard exists
       • Insert nop control bits at the stalled stage (decode in our example)
     • How is this solution still potentially “better” than relying on the compiler?
       • The compiler can still act like there are delay slots to avoid stalls.
       • Implementation details are not exposed in the ISA
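A minimal sketch of the stall behavior described above, for a five-stage pipeline without forwarding. It tracks only the cycle in which each instruction occupies decode. The names and the `stall_gap` value are assumptions: a gap of 3 cycles between a producer's decode and a dependent consumer's decode assumes the register file can be written and then read within the write-back cycle, which gives two bubbles for back-to-back dependent instructions.

```python
def decode_cycles(program, stall_gap=3):
    """Return the cycle in which each instruction is in decode.

    program: list of (destination register or None, source registers).
    stall_gap: assumed minimum distance, in cycles, between a producer's
    decode and the decode of an instruction that reads its result.
    """
    cycles = []
    writer_decode = {}  # register -> decode cycle of its latest writer
    for dest, srcs in program:
        # Without hazards, decode follows the previous decode by one cycle;
        # the first instruction is fetched in cycle 1 and decoded in cycle 2.
        cycle = cycles[-1] + 1 if cycles else 2
        for reg in srcs:
            if reg in writer_decode:
                # Frozen in decode until the value has been written back.
                cycle = max(cycle, writer_decode[reg] + stall_gap)
        cycles.append(cycle)
        if dest is not None:
            writer_decode[dest] = cycle
    return cycles


# add $s0, $t0, $t1 followed by sub $t2, $s0, $t3
DEPENDENT = decode_cycles([("$s0", ("$t0", "$t1")), ("$t2", ("$s0", "$t3"))])
# The same add followed by an instruction that does not read $s0
INDEPENDENT = decode_cycles([("$s0", ("$t0", "$t1")), ("$t7", ("$t5", "$t4"))])
BUBBLES = DEPENDENT[1] - INDEPENDENT[1]
```

Because instructions stay in order, delaying one decode automatically delays every later one, which mirrors freezing the PC and the earlier pipeline registers.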

  11. The Impact of Stalling On Performance
     • ET = I * CPI * CT
     • I and CT are constant
     • What is the impact of stalling on CPI?
     • What do we need to know to figure it out?

  12. The Impact of Stalling On Performance
     • ET = I * CPI * CT
     • I and CT are constant
     • What is the impact of stalling on CPI?
       • Fraction of instructions that stall: 30%
       • Baseline CPI = 1
       • Stall CPI = 1 + 2 = 3
       • New CPI = 0.3*3 + 0.7*1 = 1.6
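The arithmetic on the slide above can be written as a weighted average. The function and parameter names below are assumptions for this sketch; the numbers (30% of instructions stalling, 2 stall cycles, baseline CPI of 1) come from the slide.

```python
def average_cpi(base_cpi, stall_fraction, stall_cycles):
    """Weighted average of the CPI of stalling and non-stalling instructions."""
    stalled_cpi = base_cpi + stall_cycles
    return stall_fraction * stalled_cpi + (1 - stall_fraction) * base_cpi


def execution_time(instruction_count, cpi, cycle_time):
    """ET = I * CPI * CT."""
    return instruction_count * cpi * cycle_time


NEW_CPI = average_cpi(base_cpi=1, stall_fraction=0.3, stall_cycles=2)
# With I and CT held constant, the slowdown equals the CPI ratio.
SLOWDOWN = execution_time(1, NEW_CPI, 1) / execution_time(1, 1, 1)
print(round(NEW_CPI, 2), round(SLOWDOWN, 2))
```

Since I and CT do not change, the program runs roughly 1.6 times longer than the ideal CPI = 1 pipeline.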

  13. Solution 3: Bypassing/Forwarding
     • Data values are computed in _____ and _______ but “publicized” in write back?
     [Pipeline diagram: Fetch, Decode, EX, Mem, Write back, annotated with where inputs are needed, where results are known, and where results are "published" to registers]

  14. Bypassing or Forwarding
     • Take the values, wherever they are
     [Pipeline diagram over cycles: add $s0, $t0, $t1 followed by sub $t2, $s0, $t3, each passing through Fetch, Decode, EX, Mem, Write back]

  15. Forwarding Paths
     [Pipeline diagram over cycles: add $s0, $t0, $t1 followed by three successive sub $t2, $s0, $t3 instructions, each passing through Fetch, Decode, EX, Mem, Write back]

  16. Forwarding in Hardware
     [Datapath diagram: PC, instruction memory, register file, sign extend (16 to 32 bits), shift left 2, ALU, and data memory, separated by the IFetch/Dec, Dec/Exec, Exec/Mem, and Mem/WB pipeline registers]

  17. Forwarding Control
     • The forwarding unit detects instances when the destination and source registers of executing instructions match
     • Set the control lines on the ALU input muxes accordingly
     • Stall if, for some reason, forwarding is not possible.
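The matching rule above can be sketched as a function that picks the source for one ALU input. This is an illustrative model, not the slide's actual control logic: the function name, its parameters, and the returned labels are assumptions, with the labels borrowed from the pipeline register names in the datapath slide. The sketch also assumes the common MIPS convention that `$zero` is never forwarded, and that the newer Exec/Mem result takes priority over the older Mem/WB one.

```python
def forward_select(src_reg, ex_mem_dest, ex_mem_writes, mem_wb_dest, mem_wb_writes):
    """Choose where one ALU input should come from.

    Returns "Exec/Mem" or "Mem/WB" when a pipeline register holds a newer
    value of src_reg, otherwise "RegFile" (the value read during decode).
    """
    if src_reg == "$zero":
        return "RegFile"  # assumed: $zero always reads as 0, never forwarded
    if ex_mem_writes and ex_mem_dest == src_reg:
        return "Exec/Mem"  # most recent result wins
    if mem_wb_writes and mem_wb_dest == src_reg:
        return "Mem/WB"
    return "RegFile"


# sub $t2, $s0, $t3 in execute, directly behind add $s0, $t0, $t1
FIRST_INPUT = forward_select("$s0", "$s0", True, None, False)
SECOND_INPUT = forward_select("$t3", "$s0", True, None, False)
```

In hardware the same comparison would run once per ALU input, and the result would drive that input's mux select lines.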

  18. Forwarding for Loads
     • Load values come from the Mem stage
     [Pipeline diagram over cycles: ld $s0, 0($t0) followed by sub $t2, $s0, $t3; the sub needs $s0 in EX before the load has produced it in Mem]
     • Time travel presents significant implementation challenges

  19. What can we do?
     • Punt to the compiler
       • Complete solution.
       • Same dangers apply as before.
     • Always stall.
     • Forward when possible, stall otherwise
       • Here the compiler still has leverage
       • Code will be faster if the compiler generates code as if there is a delay slot.
       • If the compiler can’t fix it, the hardware will stall
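A small sketch of the "forward when possible, stall otherwise" option: with forwarding in place, the only case modeled here is a load immediately followed by an instruction that reads the loaded register, which costs one bubble. The tuple encoding and the function name are assumptions for illustration.

```python
def load_use_stalls(program):
    """Count bubbles needed when an instruction reads the destination of
    the load directly before it.

    program: list of (opcode, destination register or None, source registers).
    """
    stalls = 0
    prev_op, prev_dest = None, None
    for op, dest, srcs in program:
        if prev_op == "ld" and prev_dest in srcs:
            stalls += 1  # the loaded value is not available until after Mem
        prev_op, prev_dest = op, dest
    return stalls


# ld $s0, 0($t0) directly followed by sub $t2, $s0, $t3
BACK_TO_BACK = load_use_stalls([
    ("ld", "$s0", ("$t0",)),
    ("sub", "$t2", ("$s0", "$t3")),
])
# The compiler places an independent instruction after the load instead
SCHEDULED_AROUND = load_use_stalls([
    ("ld", "$s0", ("$t0",)),
    ("and", "$t7", ("$t5", "$t4")),
    ("sub", "$t2", ("$s0", "$t3")),
])
```

The second case shows the compiler's leverage: treating the load as if it had a delay slot removes the bubble, and the hardware stall remains only as a fallback.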

  20. Performance cost of stalling
     • ET = I * CPI * CT
     • CPI = 1 + %Stall * StallTime
     • %Stall is determined by how aggressive our bypassing is and the quality of our compiler.
     • Stall time is related to pipeline depth. In our case, it is 1 or 2, because our pipeline is shallow.
     • In deeper pipelines, it can be larger.

  21. Hardware Cost of Forwarding
     • In our pipeline, adding forwarding required relatively little hardware.
     • For deeper pipelines it gets much more expensive
       • Roughly: ALUs * pipeline stages you need to forward over
       • Some modern processors have multiple ALUs (4-5)
       • And deeper pipelines (4-5 stages to forward across)
