
Example Task: Doing a load of laundry (Wash, Dry, Fold)



  1. Example
     • Task: doing a load of laundry
       - Wash (W), Dry (D), Fold (F)
       - Each laundry load takes T hours
     • Completing n tasks requires nT hours
     • Timeline: nine WDF loads run back to back, one finishing at each of T, 2T, 3T, 4T, 5T, 6T, 7T, 8T, 9T

  2. Parallel Processing
     • M independent machines
       - Each machine does Wash, Dry, Fold
       - Do M laundry loads concurrently
     • Completing n tasks takes T x ceiling(n/M) hours
     • Requires M units to achieve the speedup
     • Timeline (as drawn): nine WDF loads finish in three rounds, at T, 2T, 3T, three loads per round
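The two completion-time formulas from slides 1 and 2 can be checked with a short sketch. The function names `sequential_time` and `parallel_time` are illustrative and not from the slides; the ceiling is computed with integer arithmetic.

```python
def sequential_time(n: int, T: float) -> float:
    """Time for n loads on one machine: n * T."""
    return n * T


def parallel_time(n: int, M: int, T: float) -> float:
    """Time for n loads on M independent machines: T * ceiling(n / M)."""
    rounds = (n + M - 1) // M  # integer ceiling of n / M
    return T * rounds


# Nine loads with T = 1 hour, as in the slide timelines.
print(sequential_time(9, 1.0))   # 9.0
print(parallel_time(9, 3, 1.0))  # 3.0
print(parallel_time(10, 3, 1.0)) # 4.0 (the tenth load needs a fourth round)
```

The last call shows why the ceiling matters: a partially filled round still costs a full T hours.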

  3. Pipelining
     • Divide each task into component microtasks
     • Each microtask requires unit time (the same for all microtasks)
       - One microtask is performed per stage
     • With p stages per task and no overlap between tasks, n tasks require np time units
     • Timeline (time steps 1-9, no overlap): W1 D1 F1 W2 D2 F2 W3 D3 F3

  4. Pipelining
     • Task i begins immediately after task i-1 completes its first stage
     • p-stage pipeline: task n completes at time step n + p - 1 (vs. n x p sequentially)
     • In steady state, p tasks are concurrently active
     • Timeline (4 tasks, time steps 1-6):
       - Pipeline fill: steps 1-2
       - Steady state: steps 3-4
       - Pipeline flush: steps 5-6
       - Task 1: W1 D1 F1 in steps 1-3
       - Task 2: W2 D2 F2 in steps 2-4
       - Task 3: W3 D3 F3 in steps 3-5
       - Task 4: W4 D4 F4 in steps 4-6
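The fill, steady-state, and flush phases can be reproduced with a small scheduling sketch. It assumes unit-time stages and that task i enters the first stage at time step i, as the slide states; the function name `pipeline_schedule` is hypothetical.

```python
def pipeline_schedule(n, stages=("W", "D", "F")):
    """Map each time step to the microtasks active in it.

    Task i (1-indexed) runs stage s (0-indexed) at time step i + s,
    so the last task finishes at step n + p - 1.
    """
    p = len(stages)
    schedule = {t: [] for t in range(1, n + p)}
    for task in range(1, n + 1):
        for s, name in enumerate(stages):
            schedule[task + s].append(f"{name}{task}")
    return schedule


# Four tasks through a 3-stage pipeline: 4 + 3 - 1 = 6 time steps.
for step, active in pipeline_schedule(4).items():
    print(step, " ".join(active))
```

For four tasks, steps 1-2 have fewer than three microtasks active (fill), steps 3-4 have all three stages busy (steady state), and steps 5-6 drain the pipeline (flush).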

  5. Pipelining
     • Task i begins immediately after task i-1 completes its first stage
     • For a p-stage pipeline: task n completes at time step n + p - 1
     • In steady state, p tasks are concurrently active
     • Latency of a task = p time steps (not changed by pipelining)
     • Pipelining increases task throughput by reducing the time for n tasks from np to n + p - 1
     • Timeline (time steps 1-9): seven overlapped W D F tasks, one starting per time step

  6. Latency and Throughput
     • Latency of a task: time elapsed between the start and finish of the task
       - Assume that W, D, F take 1 hour each
       - In all designs the latency is the same (3 hours)
     • Throughput: number of tasks completed per unit time
       - Non-pipelined design: T = p time units per task
         Throughput = 1/p (1 task completes every p time units)
       - Pipelined design with a p-stage pipeline (unit time per stage): n tasks in n + p - 1 time units
         Throughput = n/(n + p - 1) = 1/(1 + (p - 1)/n), which approaches 1 for n >> p (1 task completes per time unit)
         Speedup = T_non-pipelined / T_pipelined = np/(n + p - 1); for n >> p the speedup approaches p
       - Parallel processing design with M machines: every T = p time units, M tasks complete
         Throughput = M/p and Speedup = M
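A quick numeric check of the pipelined throughput and speedup formulas shows how they approach their limits as n grows. This is a minimal sketch with unit-time stages; the function names are illustrative.

```python
def pipelined_time(n: int, p: int) -> int:
    """Time steps for n tasks in a p-stage pipeline: n + p - 1."""
    return n + p - 1


def throughput(n: int, p: int) -> float:
    """Tasks completed per time step in the pipelined design."""
    return n / pipelined_time(n, p)


def speedup(n: int, p: int) -> float:
    """Non-pipelined time (n * p) divided by pipelined time."""
    return (n * p) / pipelined_time(n, p)


p = 3  # Wash, Dry, Fold
for n in (1, 3, 10, 100, 10_000):
    print(f"n={n:>6}  throughput={throughput(n, p):.4f}  speedup={speedup(n, p):.4f}")
```

With a single task there is no gain (speedup 1), while for large n the throughput tends to 1 task per time step and the speedup tends to p = 3.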

  7. Multi-Cycle Implementation
     [Datapath figure. Recoverable labels: PC, IR, MEM, REG FILE, A, B, ALU, ALUout, MDR, constant 4, STATE MACHINE, DECODER. Control signals: PCWrite, IRWrite, AWrite, BWrite, ALUWrite, MDRWrite, MEMRead, ALUop]

  8. Multi-Cycle Design: State Machine Model
     • State 0, Instruction Fetch: IR = IM[PC]; PC = PC + 4
     • State 1, Instruction Decode: generate control signals
       - A = REG[$rs]
       - B = REG[$rt]
       - ALUout = PC + Shift(SE(offset))
     • Execute states:
       - State 2 (R-R): p = A; q = B; ALUout = p op q
       - State 5 (lw): p = A; q = SE(d); ALUout = p op q
       - State 8 (sw): p = A; q = SE(d); ALUout = p op q
       - State 10 (beq): p = A; q = B; Z = (p .eq. q); if (Z == 1) PC = ALUout
     • Memory and write-back states:
       - State 3 (R-R): REG[$rd] = ALUout
       - State 6 (lw): MDR = DM[ALUout]
       - State 9 (sw): DM[ALUout] = B
       - State 7 (lw): REG[$rt] = MDR
     • Example state sequences:
       - LD (5 cycles): S0 -> S1 -> S5 -> S6 -> S7
       - ADD (4 cycles): S0 -> S1 -> S2 -> S3, then back to S0
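The cycle counts per instruction follow from the length of each path through the state machine. The sketch below encodes those paths as a table. The LD and ADD paths are the ones given on the slide; the `sw` and `beq` paths are inferred from the state numbering and should be read as assumptions, and the sample program is hypothetical.

```python
# State paths per instruction class. "lw" and "add" match the slide's
# LD (5 cycles) and ADD (4 cycles) examples; "sw" and "beq" are inferred
# from the state diagram numbering (assumption, not stated on the slide).
STATE_PATHS = {
    "add": ["S0", "S1", "S2", "S3"],
    "lw":  ["S0", "S1", "S5", "S6", "S7"],
    "sw":  ["S0", "S1", "S8", "S9"],
    "beq": ["S0", "S1", "S10"],
}


def cycles(op: str) -> int:
    """Clock cycles for one instruction: one cycle per state visited."""
    return len(STATE_PATHS[op])


def total_cycles(program) -> int:
    """Total cycles for a sequence of instructions executed one at a time."""
    return sum(cycles(op) for op in program)


program = ["lw", "add", "sw", "beq"]  # hypothetical instruction sequence
for op in program:
    print(op, cycles(op), " -> ".join(STATE_PATHS[op]))
print("total:", total_cycles(program))  # total: 16
```

Every path starts with S0 and S1 (fetch and decode), so the per-instruction cost differs only in the execute, memory, and write-back states each class needs.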
