
Precise Exceptions and Out-of-Order Execution



  1. Precise Exceptions and Out-of-Order Execution
     Samira Khan
     3/16/17

     Multi-Cycle Execution
     • Not all instructions take the same amount of time for “execution”
     • Idea: Have multiple different functional units that take different number of cycles
       • Can be pipelined or not pipelined
       • Can let independent instructions start execution on a different functional unit before a previous long-latency instruction finishes execution

     The Von Neumann Model/Architecture
     • Also called stored program computer (instructions in memory). Two key properties:
     • Stored program
       • Instructions stored in a linear memory array
       • Memory is unified between instructions and data
       • The interpretation of a stored value depends on the control signals
     • Sequential instruction processing
       • One instruction processed (fetched, executed, and completed) at a time
       • Program counter (instruction pointer) identifies the current instr.
       • Program counter is advanced sequentially except for control transfer instructions

     ISSUES IN PIPELINING: MULTI-CYCLE EXECUTE
     • Instructions can take different number of cycles in EXECUTE stage
       • Integer ADD versus FP Multiply

     FMUL R4 ← R1, R2    F D E E E E E E E E W
     ADD  R3 ← R1, R2      F D E W
                             F D E W
                               F D E W
     FMUL R2 ← R5, R6            F D E E E E E E E E W
     ADD  R4 ← R5, R6              F D E W
                                     F D E W

     • What is wrong with this picture?
       • What if FMUL incurs an exception?
       • Sequential semantics of the ISA NOT preserved!
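
To make the problem in the multi-cycle EXECUTE diagram concrete, the following is a minimal sketch that computes when each named instruction reaches its W stage. It assumes one instruction is fetched per cycle, that FMUL spends 8 cycles in E and ADD spends 1 (as read off the diagram), and that nothing stalls. The names `PROGRAM`, `writeback_cycle`, and `writeback_order` are hypothetical and used only for illustration.

```python
# (instruction text, fetch cycle, cycles spent in EXECUTE).
# Fetch cycles follow the rows of the diagram; unnamed rows are omitted.
PROGRAM = [
    ("FMUL R4 <- R1, R2", 0, 8),
    ("ADD  R3 <- R1, R2", 1, 1),
    ("FMUL R2 <- R5, R6", 4, 8),
    ("ADD  R4 <- R5, R6", 5, 1),
]


def writeback_cycle(fetch_cycle, exec_cycles):
    """F at fetch_cycle, D one cycle later, then exec_cycles of E, then W."""
    return fetch_cycle + 2 + exec_cycles


def writeback_order(program):
    """Return (W cycle, instruction) pairs sorted by the cycle of the W stage."""
    return sorted((writeback_cycle(f, e), text) for text, f, e in program)


for cycle, text in writeback_order(PROGRAM):
    print(f"cycle {cycle:2d}: W  {text}")
```

Under these assumptions both ADDs write their results before the first FMUL does, so if that FMUL raises an exception, younger instructions have already changed the architectural state. The younger ADD also writes R4 before the older FMUL writes R4, so the final value of R4 is not the one sequential execution would produce.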

  2. HANDLING EXCEPTIONS IN PIPELINING
     • Exceptions versus interrupts
     • Cause
       • Exceptions: internal to the running thread
       • Interrupts: external to the running thread
     • When to Handle
       • Exceptions: when detected (and known to be non-speculative)
       • Interrupts: when convenient
         • Except for very high priority ones
           • Power failure
           • Machine check
     • Priority: process (exception), depends (interrupt)
     • Handling Context: process (exception), system (interrupt)

     PRECISE EXCEPTIONS/INTERRUPTS
     • The architectural state should be consistent when the exception/interrupt is ready to be handled
       1. All previous instructions should be completely retired.
       2. No later instruction should be retired.
     Retire = commit = finish execution and update arch. state

     WHY DO WE WANT PRECISE EXCEPTIONS?
     • Aid software debugging
     • Enable (easy) recovery from exceptions, e.g. page faults
     • Enable (easily) restartable processes

     ENSURING PRECISE EXCEPTIONS IN PIPELINING
     • Idea: Make each operation take the same amount of time

     FMUL R3 ← R1, R2    F D E E E E E E E E W
     ADD  R4 ← R1, R2      F D E E E E E E E E W
                             F D E E E E E E E E W
                               F D E E E E E E E E W
                                 F D E E E E E E E E W
                                   F D E E E E E E E E W
                                     F D E E E E E E E E W

     • Downside
       • What about memory operations?
       • Each functional unit takes 500 cycles?
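
The two conditions for a precise exception can be written as a small predicate. This is only a sketch: `is_precise` and its arguments are hypothetical names, `retired` is assumed to be a list of booleans in program order, and `faulting_index` is the position of the instruction that raised the exception.

```python
def is_precise(retired, faulting_index):
    """Check the two precise-state conditions at the moment of handling.

    retired[i] is True if instruction i (in program order) has retired.
    """
    # 1. All previous instructions should be completely retired.
    older_all_retired = all(retired[:faulting_index])
    # 2. No later instruction should be retired.
    no_younger_retired = not any(retired[faulting_index + 1:])
    return older_all_retired and no_younger_retired


# Multi-cycle pipeline without reordering: the FMUL at index 0 faults after
# the younger ADD at index 1 has already written its result.
print(is_precise([False, True, False, False], faulting_index=0))

# In-order retirement: only instructions older than the faulting one retired.
print(is_precise([True, True, False, False], faulting_index=2))
```

The first call reports an imprecise state and the second a precise one.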

  3. SOLUTION: REORDER BUFFER (ROB)
     • Idea: Complete instructions out-of-order, but reorder them before making results visible to architectural state
       • When instruction is decoded it reserves an entry in the ROB
       • When instruction completes, it writes result into ROB entry
       • When instruction oldest in ROB and it has completed, its result moved to reg. file or memory

     [Figure: Instruction Cache, Register File, Func Units, Reorder Buffer. Each reorder buffer entry has the fields V, DEST REG, DEST VAL, COMPLETE; entries are ordered from Oldest to Youngest.]

     REORDER BUFFER: INDEPENDENT OPERATIONS
     CYCLE 5

                        V   DEST REG   DEST VAL   COMPLETE
     Oldest    FMUL     1   R4         --         0
               ADD      1   R3         1000       1
                        1                         0
                        1                         0
               FMUL     1   R2         --         0
     Youngest  ADD      1   R4         --         0

     Pipeline diagram (cycles 0 to 11; R = write result to reorder buffer):

     FMUL                F D E E E E E E E E R W
     ADD                   F D E R W
                             F D E R W
                               F D E R W
     FMUL R2 ← R5, R6            F D E E E E E E E E R W
     ADD  R4 ← R5, R6              F D E R W
                                     F D E R W
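
The three reorder buffer rules (reserve an entry at decode, write the result on completion, move the result to the register file only when the entry is oldest and complete) can be sketched as follows. This is an illustrative model, not the slides' hardware design: the class `ReorderBuffer` and its method names are hypothetical, and only register destinations are modeled.

```python
from collections import deque


class ReorderBuffer:
    """Toy ROB: out-of-order completion, in-order retirement."""

    def __init__(self):
        self.entries = deque()  # oldest entry on the left
        self.regfile = {}       # architectural register file

    def allocate(self, dest_reg):
        """Decode: reserve an entry at the young end of the buffer."""
        entry = {"dest": dest_reg, "value": None, "complete": False}
        self.entries.append(entry)
        return entry

    def complete(self, entry, value):
        """Completion: write the result into the ROB entry, not the regfile."""
        entry["value"] = value
        entry["complete"] = True

    def retire(self):
        """Retire from the old end for as long as the oldest entry is complete."""
        retired = []
        while self.entries and self.entries[0]["complete"]:
            entry = self.entries.popleft()
            self.regfile[entry["dest"]] = entry["value"]
            retired.append(entry["dest"])
        return retired


rob = ReorderBuffer()
fmul = rob.allocate("R4")   # long-latency FMUL, decoded first
add = rob.allocate("R3")    # short ADD, decoded second
rob.complete(add, 1000)     # ADD finishes early (value as on the slide)
print(rob.retire())         # nothing retires: the FMUL is still oldest
rob.complete(fmul, 101)
print(rob.retire())         # now both retire, in program order
```

The ADD's result waits in its entry until the older FMUL has completed, which is what keeps the architectural register file consistent with program order.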

  4. REORDER BUFFER: INDEPENDENT OPERATIONS
     CYCLE 11

                        V   DEST REG   DEST VAL   COMPLETE
     Oldest    FMUL     1   R4         101        0
               ADD      1   R3         1000       1
                        1                         0
                        1                         0
               FMUL     1   R2         --         0
     Youngest  ADD      1   R4         --         0

     CYCLE 12, RETIRE OLDEST
     The remaining entries are unchanged; the oldest entry goes through these states:

                        V   DEST REG   DEST VAL   COMPLETE
     Oldest    FMUL     1   R4         101        1
     Oldest    FMUL     0   R4         101        1
     Oldest             0

     Pipeline diagram (cycles 0 to 11), as in cycle 5:

     FMUL                F D E E E E E E E E R W
     ADD                   F D E R W
                             F D E R W
                               F D E R W
     FMUL R2 ← R5, R6            F D E E E E E E E E R W
     ADD  R4 ← R5, R6              F D E R W
                                     F D E R W

     What if a later operation needs a value in the reorder buffer?
     Read reorder buffer in parallel with the register file. How?

  5. REORDER BUFFER: HOW TO ACCESS?
     • A register value can be in the register file, reorder buffer, (or bypass paths)

     [Figure: Instruction Cache, Register File, Func Units, Reorder Buffer, bypass path. The reorder buffer is a Content Addressable Memory (searched with register ID).]

     Search for Register Value

     Register file:
     REG    VAL   V
     R1     1     1
     R2           0
     R3           0
     R4           0
     R5     5     1
     R6     6     1
     R7     8     1
     R8     8     1
     R9     9     1
     R10    10    1
     R11    11    0

     Reorder buffer:
                        V   DEST REG   DEST VAL   COMPLETE
     Oldest             0
               ADD      1   R3         1000       1
                        1                         0
                        1                         0
                        1   R2         --         0
     Youngest  ADD      1   R4         --         0

     SIMPLIFYING REORDER BUFFER ACCESS
     • Idea: Use indirection
     • Access register file first
       • If register not valid, register file stores the ID of the reorder buffer entry that contains (or will contain) the value of the register
       • Mapping of the register to a ROB entry
     • Access reorder buffer next
     • What is in a reorder buffer entry?
       V | DestRegID | DestRegVal | StoreAddr | StoreData | BranchTarget | PC/IP | Control/valid bits
     • Can it be simplified further?

     Search for Register Value

     Register file (reorder buffer contents as above):
     REG    VAL/TAG   V
     R1     1         1
     R2     5         0
     R3     2         0
     R4     6         0
     R5     5         1
     R6     6         1
     R7     8         1
     R8     8         1
     R9     9         1
     R10    10        1
     R11    11        1
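
The indirection scheme can be sketched as a two-step lookup: read the register file, and follow the tag into the reorder buffer only when the valid bit is clear. The values below are taken from the slide's tables, but the numbering of ROB entries from 1 (oldest slot) to 6 is an assumption inferred from the tags, and `read_operand`, `REGFILE`, and `ROB` are hypothetical names.

```python
# Register file: reg -> (value or ROB tag, valid bit).
REGFILE = {
    "R1": (1, True),
    "R2": (5, False),   # tag 5: value will come from ROB entry 5
    "R3": (2, False),   # tag 2
    "R4": (6, False),   # tag 6
    "R5": (5, True),
}

# Reorder buffer indexed by entry ID (assumed numbering, oldest slot = 1).
ROB = {
    2: {"dest": "R3", "value": 1000, "complete": True},
    5: {"dest": "R2", "value": None, "complete": False},
    6: {"dest": "R4", "value": None, "complete": False},
}


def read_operand(reg, regfile, rob):
    """Return the register's value, or None if it has not been produced yet."""
    value_or_tag, valid = regfile[reg]
    if valid:
        return value_or_tag          # architectural value is up to date
    entry = rob[value_or_tag]        # indirection: the tag names a ROB entry
    if entry["complete"]:
        return entry["value"]        # completed but not yet retired
    return None                      # producer is still executing


print(read_operand("R1", REGFILE, ROB))  # valid in the register file
print(read_operand("R3", REGFILE, ROB))  # forwarded from ROB entry 2
print(read_operand("R2", REGFILE, ROB))  # not ready yet
```

Compared with a content-addressable search by register ID, this lookup is a direct index into the buffer, at the cost of storing a tag per register.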

  6. Reorder Buffer in Intel Pentium III
     [Figure]
     Boggs et al., “The Microarchitecture of the Pentium 4 Processor,” Intel Technology Journal, 2001.

     REORDER BUFFER PROS AND CONS
     • Pro
       • Conceptually simple for supporting precise exceptions
     • Con
       • Reorder buffer needs to be accessed to get the results that are yet to be written to the register file
       • CAM or indirection → increased latency and complexity

     In-Order Pipeline with Reorder Buffer
     • Decode (D): Access regfile/ROB, allocate entry in ROB, check if instruction can execute, if so dispatch instruction
     • Execute (E): Instructions can complete out-of-order
     • Completion (R): Write result to reorder buffer
     • Retirement/Commit (W): Check for exceptions; if none, write result to architectural register file or memory; else, flush pipeline and start from exception handler
     • In-order dispatch/execution, out-of-order completion, in-order retirement

                Integer add   E
     F  D       Integer mul   E E E E                R  W
                FP mul        E E E E E E E E
                Load/store    E ... E

     Out-of-Order Execution
     (Dynamic Instruction Scheduling)
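
The Retirement/Commit step described above (check for exceptions, write the result if there is none, otherwise flush) can be sketched as below. This is a simplified model under stated assumptions: the ROB is a plain list with the oldest entry first, only register results are modeled, and the function `retire` and the entry fields `name`, `dest`, `value`, `complete`, `exception` are hypothetical.

```python
def retire(rob, regfile):
    """Retire completed entries in order; flush on the first exception.

    Returns the name of the excepting instruction, or None if none was hit.
    """
    while rob and rob[0]["complete"]:
        head = rob.pop(0)
        if head["exception"]:
            rob.clear()              # flush every younger instruction
            return head["name"]      # the handler would start from here
        regfile[head["dest"]] = head["value"]
    return None


regfile = {}
rob = [
    {"name": "ADD R3", "dest": "R3", "value": 1000,
     "complete": True, "exception": False},
    {"name": "FMUL R4", "dest": "R4", "value": None,
     "complete": True, "exception": True},
    {"name": "ADD R4", "dest": "R4", "value": 11,
     "complete": True, "exception": False},
]

faulting = retire(rob, regfile)
print(faulting, regfile, rob)
```

In this run the older ADD updates R3, the FMUL is reported as the excepting instruction, and the younger ADD is discarded even though it had already completed, so the register file holds a precise state when the handler begins.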
