
IC220 Set #19: Laundry, Co-dependency, and other Hazards of Modern (Architecture) Life
Return to Chapter 4

Midnight Laundry
[Figure: time axis from 6 PM to 2 AM versus task order for laundry loads A, B, C, D]

Smarty Laundry
[Figure: two timelines, each with a time axis from 6 PM to 2 AM versus task order for laundry loads A, B, C, D]

Pipelining
• Improve performance by increasing instruction throughput
[Figure: program execution order (in instructions) versus time for lw $1, 100($0); lw $2, 200($0); lw $3, 300($0). Each instruction passes through instruction fetch, Reg, ALU, data access, Reg. Top panel: successive instructions start 800 ps apart. Bottom panel: successive instructions start 200 ps apart, with 200 ps per stage.]
Ideal speedup is the number of stages in the pipeline. Do we achieve this?
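The timing in the Pipelining figure can be checked with a little arithmetic. The Python sketch below assumes the numbers shown there (800 ps per instruction without pipelining, a 200 ps clock with pipelining) and a 5-stage pipeline; the function names are made up for illustration.

```python
def unpipelined_time_ps(n_instructions, instruction_ps=800):
    """Each instruction runs to completion before the next one starts."""
    return n_instructions * instruction_ps


def pipelined_time_ps(n_instructions, stages=5, clock_ps=200):
    """The first instruction needs `stages` cycles; each later one finishes
    one cycle after its predecessor (no hazards assumed)."""
    if n_instructions == 0:
        return 0
    return (stages + n_instructions - 1) * clock_ps


def speedup(n_instructions):
    return unpipelined_time_ps(n_instructions) / pipelined_time_ps(n_instructions)


if __name__ == "__main__":
    for n in (3, 1_000_000):
        print(n, unpipelined_time_ps(n), pipelined_time_ps(n), round(speedup(n), 3))
```

With these assumed numbers, three lw instructions take 2400 ps unpipelined and 1400 ps pipelined. For long programs the speedup approaches 4 rather than 5, because the 200 ps clock is set by the slowest stage, so five stages take 1000 ps instead of the original 800 ps.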

Basic Idea
IF: Instruction fetch
ID: Instruction decode / register file read
EX: Execute / address calculation
MEM: Memory access
WB: Write back
[Figure: datapath divided into the five stages above, showing the PC, instruction memory, registers, sign extend (16 to 32 bits), shift left 2, adders, ALU, data memory, and multiplexers]

Pipelined Datapath
[Figure: the same datapath with pipeline registers IF/ID, ID/EX, EX/MEM, and MEM/WB between the stages]

Pipeline Diagrams
[Figure: pipeline diagram over clock cycles 1 to 7 (time axis in steps of 200); each instruction passes through IF, ID, EX, MEM, WB]
    add $s0, $s1, $s2
    sub $a1, $s2, $a3
    add $t0, $t1, $t2
Assumptions:
• Reads from memory or register file happen in the 2nd half of the clock cycle
• Writes to memory or register file happen in the 1st half of the clock cycle
What could go wrong?

Problem: Dependencies
• Problem with starting the next instruction before the first is finished
[Figure: pipeline diagram over clock cycles 1 to 8 (time axis in steps of 200); each instruction passes through IF, ID, EX, MEM, WB]
    sub $s0, $s1, $s2
    and $a1, $s0, $a3
    add $t0, $t1, $s0
    or $t2, $s0, $s0
Dependencies that “go backward in time” are ____________________
Will the “or” instruction work properly?

Solution: Forwarding
Use temporary results, don’t wait for them to be written
[Figure: pipeline diagram over clock cycles 1 to 8 (time axis in steps of 200); each instruction passes through IF, ID, EX, MEM, WB]
    sub $s0, $s1, $s2
    and $a1, $s0, $a3
    add $t0, $t1, $s0
    or $t2, $s0, $s0
Where do we need this?
Will this deal with all hazards?

Problem?
[Figure: pipeline diagram over clock cycles 1 to 7 (time axis in steps of 200); each instruction passes through IF, ID, EX, MEM, WB]
    lw $t0, 0($s1)
    sub $a1, $t0, $a3
    add $a2, $t0, $t2
Forwarding not enough…
When an instruction tries to ___________ a register following a ____________ to the same register.

Solution: “Stall” later instruction until result is ready
[Figure: blank pipeline diagram over clock cycles 1 to 7]
    lw $t0, 0($s1)
    sub $a1, $t0, $a3
    add $a2, $t0, $t2
Why does the stall start after ID stage?

Assumptions
• For exercises/exams/everything assume…
  – The MIPS 5-stage pipeline
  – That we have forwarding
…unless told otherwise
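The stall rule can be made concrete with a small Python sketch. It is only a model, under these assumptions: the 5-stage pipeline with forwarding stated above, every source register needed at the start of EX, and one stall cycle whenever an instruction reads the register loaded by the lw immediately before it. The tuple encoding (opcode, destination, sources) and the function names are hypothetical.

```python
def load_use_stalls(program):
    """Return one stall count per instruction.

    With forwarding, the only data hazard that still stalls in this model
    is a lw followed immediately by an instruction that reads the loaded
    register: the data is not available until the end of MEM.
    """
    stalls = []
    for i, (_op, _dest, sources) in enumerate(program):
        prev = program[i - 1] if i > 0 else None
        hazard = prev is not None and prev[0] == "lw" and prev[1] in sources
        stalls.append(1 if hazard else 0)
    return stalls


def total_cycles(program, stages=5):
    """Cycles to finish the whole sequence, counting stall bubbles."""
    if not program:
        return 0
    return stages + len(program) - 1 + sum(load_use_stalls(program))


if __name__ == "__main__":
    # The sequence from the stall slide.
    program = [
        ("lw", "$t0", ("$s1",)),
        ("sub", "$a1", ("$t0", "$a3")),
        ("add", "$a2", ("$t0", "$t2")),
    ]
    print(load_use_stalls(program), total_cycles(program))
```

For the three instructions on the stall slide this model reports a single stall, charged to the sub, and 8 cycles in total. The add also reads $t0, but by the time it reaches EX the loaded value can be forwarded, so it does not stall again.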

Exercise #1 – Pipeline diagrams
• Draw a pipeline stage diagram for the following sequence of instructions. Start at cycle #1. You don’t need fancy pictures – just text for each stage: ID, MEM, etc.
    add $s1, $s3, $s4
    lw $v0, 0($a0)
    sub $t0, $t1, $t2
• What is the total number of cycles needed to complete this sequence?
• What is the ALU doing during cycle #4?
• When does the sub instruction write back its result?
• When does the lw instruction access memory?

Exercise #2 – Data hazards
• Consider this code:
    1. add $s1, $s3, $s4
    2. add $v0, $s1, $s3
    3. sub $t0, $v0, $t2
    4. and $a0, $v0, $s1
1. Draw lines showing all the data dependencies in this code
2. Which of these dependencies do not need forwarding to avoid stalling?

Exercise #3 – Data hazards
• Draw a pipeline diagram for this code. Show stalls where needed.
    1. add $s1, $s3, $s4
    2. lw $v0, 0($s1)
    3. sub $v0, $v0, $s1

Exercise #4 – More Data hazards
HW: 4-81 to 4-82
• Draw a pipeline diagram for this code. Show stalls where needed.
    1. lw $s1, 0($t0)
    2. lw $v0, 0($s1)
    3. sw $v0, 4($s1)
    4. sw $t0, 0($t1)

The Pipeline Paradox
• Pipelining does not ________________ the execution time of any ______________ instruction
• But by _____________________ instruction execution, it can greatly improve performance by ________________ the ________________

Structural Hazards
• Occur when the hardware can’t support the combination of instructions that we want to execute in the same clock cycle
• MIPS instruction set designed to reduce this problem
• But could occur if:

Control Hazards
• What might be a problem with pipelining the following code?
          beq $a0, $a1, Else
          lw $v0, 0($s1)
          sw $v0, 4($s1)
    Else: add $a1, $a2, $a3
• What other kinds of instructions would cause this problem?

Control Hazard Strategy #1: Predict not taken
• What if we are wrong?
• Assume branch target and decision known at end of ID cycle. Show a pipeline diagram for when branch is taken.
          beq $a0, $a1, Else
          lw $v0, 0($s1)
          sw $v0, 4($s1)
    Else: add $a1, $a2, $a3

Control Hazard Strategies
1. Predict not taken
   One cycle penalty when we are wrong – not so bad
   Penalty gets bigger with longer pipelines – bigger problem
2.
3.

Branch Prediction
[Figure: state diagram with two “Predict taken” states and two “Predict not taken” states, connected by transitions labeled “Taken” and “Not taken”]
With more sophistication can get 90-95% accuracy
Good prediction key to enabling more advanced pipelining techniques!
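A four-state predictor like the one in the Branch Prediction figure is often built as a 2-bit saturating counter. The Python sketch below shows that common form; the exact transitions drawn on the slide cannot be recovered from the figure, so the state encoding, class name, and helper function here are assumptions for illustration.

```python
class TwoBitPredictor:
    """2-bit saturating counter: states 0 and 1 predict not taken,
    states 2 and 3 predict taken. A prediction must be wrong twice in a
    row from a saturated state before it flips."""

    def __init__(self, state=3):
        self.state = state

    def predict(self):
        return self.state >= 2

    def update(self, taken):
        if taken:
            self.state = min(3, self.state + 1)
        else:
            self.state = max(0, self.state - 1)


def count_correct(outcomes, predictor):
    """Run the predictor over actual branch outcomes; return hits."""
    correct = 0
    for taken in outcomes:
        if predictor.predict() == taken:
            correct += 1
        predictor.update(taken)
    return correct


if __name__ == "__main__":
    # Hypothetical loop branch: taken 9 times, then falls through; run twice.
    outcomes = ([True] * 9 + [False]) * 2
    print(count_correct(outcomes, TwoBitPredictor()), "of", len(outcomes))
```

In this made-up trace the predictor misses only the loop exit each time through, because a single not-taken outcome moves it from state 3 to state 2, which still predicts taken when the loop is entered again.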

Code Scheduling to Improve Performance
• Can we avoid stalls by rescheduling?
    lw $t0, 0($t1)
    add $t2, $t0, $t2
    lw $t3, 4($t1)
    add $t4, $t3, $t4
• Dynamic Pipeline Scheduling
  – Hardware chooses which instructions to execute next
  – Will execute instructions out of order (e.g., doesn’t wait for a dependency to be resolved, but rather keeps going!)
  – Speculates on branches and keeps the pipeline full (may need to rollback if prediction incorrect)

Dynamic Pipeline Scheduling
• Let hardware choose which instruction to execute next (might execute instructions out of program order)
• Why might hardware do better job than programmer/compiler?
Example #1
    lw $t0, 0($t1)
    add $t2, $t0, $t2
    lw $t3, 4($t1)
    add $t4, $t3, $t4
Example #2
    sw $s0, 0($s3)
    lw $t0, 0($t1)
    add $t2, $t0, $t2
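The effect of rescheduling the four-instruction lw/add sequence (Example #1) can be checked by counting load-use stalls before and after a reorder. This Python sketch assumes the 5-stage pipeline with forwarding, where only a lw followed immediately by a reader of the loaded register costs a stall; the tuple encoding and the particular reordering are illustrative choices, not taken from the slides.

```python
def count_load_use_stalls(program):
    """Count adjacent pairs where a lw is followed by a reader of its result."""
    return sum(
        1
        for prev, cur in zip(program, program[1:])
        if prev[0] == "lw" and prev[1] in cur[2]
    )


# (opcode, destination register, source registers)
original = [
    ("lw", "$t0", ("$t1",)),
    ("add", "$t2", ("$t0", "$t2")),
    ("lw", "$t3", ("$t1",)),
    ("add", "$t4", ("$t3", "$t4")),
]

# Hoist the second lw above the first add. This is legal because the lw
# reads only $t1 and writes $t3, neither of which the first add touches.
rescheduled = [original[0], original[2], original[1], original[3]]

if __name__ == "__main__":
    print("original stalls:   ", count_load_use_stalls(original))
    print("rescheduled stalls:", count_load_use_stalls(rescheduled))
```

Under this model the original order pays two stalls and the reordered version pays none, since each add is now at least two instructions behind the lw that produces its operand.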

Exercise #1
• Can you rewrite this code to eliminate stalls?
    1. lw $s1, 0($t0)
    2. lw $v0, 0($s1)
    3. sw $v0, 4($s1)
    4. add $t0, $t1, $t2

Exercise #2
HW: 4-86 to 4-87
• Show a pipeline diagram for the following code, assuming:
  – The branch is predicted not taken
  – The branch actually is taken
            lw $t1, 0($t0)
            beq $s1, $s2, Label2
            sub $v0, $v1, $v2
    Label2: add $t0, $t1, $t2
