Lecture notes for CS 433 - Chapter 2, part 2, 9/26/18

Branch Prediction Buffer Strategies: Limitations
    May use bit from wrong PC
    Target must be known when branch resolved

Chapter 3: Instruction-Level Parallelism and its Exploitation (Part 3)
    ILP vs. Parallel Computers
    Dynamic Scheduling (Section 3.4, 3.5)
    Dynamic Branch Prediction (Section 3.3, 3.9, and Appendix C)
    Hardware Speculation and Precise Interrupts (Section 3.6)
    Multiple Issue (Section 3.7)
    Static Techniques (Section 3.2, Appendix H)
    Limitations of ILP
    Multithreading (Section 3.11)
    Putting it Together (Mini-projects)

Branch Target Buffer or Cache (Section 3.9)
    Store target PC along with prediction
    Accessed in IF stage
    Next IF stage uses target PC
    No bubbles on correctly predicted taken branch
    Must store tag
    More state
    Can remove not-taken branches?

Branch Target Cache With Target Instruction
    Store target instruction along with prediction
    Send target instruction instead of branch into ID
    Zero cycle branch - branch folding
    Used for unconditional jumps
    E.g., ARM Cortex A-53

Sarita Adve
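The branch target buffer lookup above can be sketched in a few lines. This is only an illustrative model, not the organization of any particular machine: the direct-mapped layout, the 4-byte instruction size, the class name `BranchTargetBuffer`, and the policy of evicting an entry when its branch resolves not taken are all assumptions made for the example.

```python
class BranchTargetBuffer:
    """Toy direct-mapped BTB: each valid entry holds (tag, predicted target PC)."""

    def __init__(self, num_entries=16):
        self.num_entries = num_entries
        self.entries = [None] * num_entries  # None marks an invalid entry

    def _index_and_tag(self, pc):
        word = pc >> 2  # assumption: fixed 4-byte instructions
        return word % self.num_entries, word // self.num_entries

    def predict(self, pc):
        """IF stage: return the PC to fetch next for the instruction at pc."""
        index, tag = self._index_and_tag(pc)
        entry = self.entries[index]
        if entry is not None and entry[0] == tag:
            return entry[1]  # hit: predicted taken, next IF uses the target PC
        return pc + 4        # miss: fall through to the next instruction

    def update(self, pc, taken, target):
        """Branch resolved: store taken branches, drop ones that were not taken."""
        index, tag = self._index_and_tag(pc)
        if taken:
            self.entries[index] = (tag, target)
        elif self.entries[index] is not None and self.entries[index][0] == tag:
            self.entries[index] = None


btb = BranchTargetBuffer(num_entries=16)
btb.update(0x100, taken=True, target=0x200)
next_pc_same_branch = btb.predict(0x100)  # tag matches, so the stored target is used
next_pc_alias = btb.predict(0x140)        # same index, different tag: falls through
```

The tag comparison is what keeps an aliasing PC such as `0x140` from picking up another branch's target, which is the "wrong PC" problem of an untagged prediction buffer.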

Return Address Stack (Section 3.9)
    Hardware stack for addresses for returns
    Call pushes return address on stack
    Return pops the address
    Perfect prediction if stack length ≥ call depth

Speculative Execution
    How far can we go with branch prediction?
        Speculative fetch?
        Speculative issue?
        Speculative execution?
        Speculative write?

Speculative Execution
    Allows instructions after branch to execute before knowing if branch will be taken
    Must be able to undo if the branch was mispredicted
    Often try to combine with dynamic scheduling
    Key insight: Split Write stage into Complete and Commit
        Complete out of order
            No state update
        Commit in order
            State updated (instruction no longer speculative)
    Use reorder buffer

Reorder Buffer
    Overview
        Instructions complete out-of-order
        Reorder buffer reorganizes instructions
        Modify state in-order

    Entry  Busy  Type  Dest  Result  State  Excep
    1      0
    2      1     LD    4             Exec   0       <- head
    3      1     BR                  Exec   0
    4      1     ADD   6     75      Compl  0       <- tail
    5      0
    ...    0
    N      0

    Instruction tag now is reorder buffer entry
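The return address stack described earlier in this part, and its claim of perfect prediction when the stack is at least as deep as the call nesting, can be checked with a small model. This is a sketch under assumptions: the names `ReturnAddressStack` and `count_correct_returns` are made up for the example, and on overflow the model discards the oldest entry, which is one plausible policy rather than something the notes specify.

```python
class ReturnAddressStack:
    """Toy fixed-depth hardware stack of predicted return addresses."""

    def __init__(self, depth):
        self.depth = depth
        self.stack = []

    def on_call(self, return_address):
        if len(self.stack) == self.depth:
            self.stack.pop(0)  # assumed overflow policy: lose the oldest entry
        self.stack.append(return_address)

    def on_return(self):
        """Predicted return target, or None if the stack is empty."""
        return self.stack.pop() if self.stack else None


def count_correct_returns(depth, call_depth):
    """Nest call_depth calls, then unwind; count correctly predicted returns."""
    ras = ReturnAddressStack(depth)
    return_addresses = [0x1000 + 4 * i for i in range(call_depth)]
    for address in return_addresses:
        ras.on_call(address)
    correct = 0
    for address in reversed(return_addresses):
        if ras.on_return() == address:
            correct += 1
    return correct


deep_enough = count_correct_returns(depth=8, call_depth=8)  # every return predicted
too_shallow = count_correct_returns(depth=4, call_depth=8)  # only the innermost 4
```

With a stack shallower than the call depth, the innermost returns are still predicted correctly and only the outer ones, whose addresses were overwritten, are lost.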

Re-order Buffer Pipeline
    Issue:
    Execute:
    Complete:
    Commit:

Precise Interrupts Again
    Precise interrupts hard with dynamic scheduling
    Consider our canonical code fragment:
        LF     F6,34(R2)
        LF     F2,45(R3)
        MULTF  F0,F2,F4
        SUBF   F8,F6,F2
        DIVF   F10,F0,F6
        ADDF   F6,F8,F2
    What happens if DIVF causes an interrupt?
        ADDF has already completed
    Out-of-order completion makes interrupts hard
    But reorder buffer can help!

Reorder Buffer for Precise Interrupts

Re-order Buffer Drawback
    Operands need to be read from reorder buffer or registers
    Alternative: Rename registers
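To see how a reorder buffer helps with the DIVF question above, the sketch below completes the canonical fragment out of order but commits it in order. It is a simplified model, not the pipeline from the slides: the function name `run`, the string placeholders used as results, and the particular completion order (ADDF finishing before MULTF and DIVF) are assumptions chosen for illustration.

```python
from collections import deque

# (label, destination register) for the canonical fragment, in program order.
PROGRAM = [
    ("LF F6", "F6"), ("LF F2", "F2"), ("MULTF", "F0"),
    ("SUBF", "F8"), ("DIVF", "F10"), ("ADDF", "F6"),
]


def run(program, completion_order, faulting):
    """Return (committed labels, architectural registers, trapping label or None)."""
    rob = deque(
        {"name": name, "dest": dest, "done": False, "excep": False, "value": None}
        for name, dest in program
    )
    regs, committed = {}, []
    for name in completion_order:
        # Complete (out of order): record the result in the ROB, no state update.
        entry = next(e for e in rob if e["name"] == name)
        entry["done"] = True
        entry["excep"] = name == faulting
        entry["value"] = f"result of {name}"
        # Commit (in order): only the head of the ROB may update state.
        while rob and rob[0]["done"]:
            head = rob[0]
            if head["excep"]:
                rob.clear()  # squash the faulting instruction and all younger ones
                return committed, regs, head["name"]
            regs[head["dest"]] = head["value"]
            committed.append(head["name"])
            rob.popleft()
    return committed, regs, None


# Assumed completion order: ADDF finishes before the long-latency MULTF and DIVF.
order = ["LF F6", "LF F2", "SUBF", "ADDF", "MULTF", "DIVF"]
committed, regs, trap = run(PROGRAM, order, faulting="DIVF")
```

When DIVF traps, only the four older instructions have committed, so F6 still holds the value loaded by the first LF even though ADDF had already completed. That is the precise state the interrupt handler needs.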

Rename Registers + Reorder Buffer
    Many current machines
    More physical registers than logical registers
    Reorder buffer does not have values
    Read all values from registers
    Rename mechanism
        Rename map stores mapping from logical to physical registers
        (Logical register Rl mapped to physical register Rp)
        On issue, Rl mapped to Rp-new
        On completion, write to Rp-new
        On commit, old mapping of Rl discarded (free Rp-old)
        On misprediction, new mapping of Rl discarded (free Rp-new)
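The four rename-map events above (issue, completion, commit, misprediction) can be sketched as follows. This is a minimal model under assumptions: the class `RenameMap`, its method names, the identity initial mapping, and the FIFO free list are illustrative choices, and the model simply raises an error where real hardware would stall on an empty free list. Completion is not modeled because it only writes a value into Rp-new and does not touch the map.

```python
class RenameMap:
    """Toy rename map from logical registers (Rl) to physical registers (Rp)."""

    def __init__(self, num_logical, num_physical):
        # Assumption: logical register i starts out mapped to physical register i.
        self.map = {rl: rl for rl in range(num_logical)}
        self.free = list(range(num_logical, num_physical))  # unused physical regs

    def issue(self, rl):
        """On issue: map Rl to a fresh Rp-new. Returns (rp_old, rp_new)."""
        if not self.free:
            raise RuntimeError("no free physical register (hardware would stall)")
        rp_new = self.free.pop(0)
        rp_old = self.map[rl]
        self.map[rl] = rp_new
        return rp_old, rp_new

    def commit(self, rp_old):
        """On commit: the old mapping of Rl is dead, so free Rp-old."""
        self.free.append(rp_old)

    def squash(self, rl, rp_old, rp_new):
        """On misprediction: discard the new mapping of Rl and free Rp-new."""
        self.map[rl] = rp_old
        self.free.append(rp_new)


rename = RenameMap(num_logical=4, num_physical=6)
old1, new1 = rename.issue(2)    # logical R2 now lives in a fresh physical register
rename.commit(old1)             # first writer commits: its Rp-old is recycled
old2, new2 = rename.issue(2)    # a speculative second writer of R2
rename.squash(2, old2, new2)    # misprediction: R2 points back at new1
```

If several speculative instructions are squashed together, they would have to be undone youngest first so that each `squash` restores the mapping its own `issue` replaced.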
