Lecture 11: Modern Superscalar Processor Models

Generic Superscalar Models, Issue Queue-based Pipeline, Multiple-Issue Design

Generic Superscalar Processor Models

Issue queue based:
  Fetch -> Rename -> Wakeup/select (schedule) -> Regfile -> FU, FU, D-cache with bypass (execute) -> Commit

Reservation based (already studied):
  Fetch -> Rename -> Reg/ROB -> Wakeup/select (schedule) -> FU, FU, D-cache with bypass (execute) -> Commit

Revised from Palacharla, PhD thesis, 1998

Issue Queue Based Pipeline

Fetch -> Rename -> Issue -> Reg-read -> Execute -> Writeback/Commit

Core structure: register mapping table

  • Rename: translate architectural registers into physical registers
  • Issue: send instruction out to register read and then execution
  • Commit: process mis-prediction/exception, update register renaming

Why study? Used in Alpha 21264, MIPS R10000, Intel P4

Compare Reservation Station and Issue Queue

  • Pipeline stage sequence
    1. RS: IF -> REN -> REG/ROB -> SCHD -> …
    2. IQ: IF -> REN -> SCHD -> REG -> …

  • Mapping table vs. status table
    1. RS: Status table chooses architectural register or ROB
    2. IQ: Always renames to a physical register

  • Register file
    1. RS: Architectural register file stores architectural states
    2. IQ: Physical register file; no architectural register file! Mapping table determines architectural states

Compare Reservation Station and Issue Queue

  • Reservation station vs. issue queue entry
    1. RS: busy, fu, op, Qj, Qk, Vj, Vk
    2. IQ: busy, fu, op, Pj, Pk, ReadyJ, ReadyK

  • ROB
    1. RS: Stores register values
    2. IQ: No register contents

Pros and cons of IQ:

  • No copying between ROB and register file
  • Efficient use of registers
  • Bad: Complex mapping table design
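
The IQ entry fields listed above (busy, fu, op, Pj, Pk, ReadyJ, ReadyK) can be sketched as a small Python model of wakeup and select. This is only an illustration: the names `IQEntry`, `wakeup` and `select` are hypothetical, the `pd` field is added so the example can tell entries apart, and the oldest-first select policy is an assumption rather than something the slides specify.

```python
from dataclasses import dataclass

@dataclass
class IQEntry:
    """One issue-queue entry: it holds physical register tags, never values."""
    op: str
    fu: str
    pd: int          # destination tag (assumed extra field, not in the slide's list)
    pj: int          # first source physical register
    pk: int          # second source physical register (0 = no register source here)
    ready_j: bool
    ready_k: bool
    busy: bool = True

def wakeup(queue, completed_tag):
    """Broadcast a completed destination tag; mark matching sources ready."""
    for e in queue:
        if e.busy:
            if e.pj == completed_tag:
                e.ready_j = True
            if e.pk == completed_tag:
                e.ready_k = True

def select(queue):
    """Pick the first (oldest) busy entry whose two sources are both ready."""
    for e in queue:
        if e.busy and e.ready_j and e.ready_k:
            e.busy = False       # the entry leaves the queue when it issues
            return e
    return None

queue = [
    IQEntry("ADD", "alu", pd=33, pj=32, pk=0, ready_j=False, ready_k=True),
    IQEntry("ADD", "alu", pd=34, pj=1, pk=0, ready_j=True, ready_k=True),
]
first = select(queue)    # the younger ADD issues first: P32 is not ready yet
wakeup(queue, 32)        # the load that writes P32 completes
second = select(queue)   # now the older ADD can issue
```

Because the entries carry only tags and ready bits, operand values are read from the physical register file after select, which matches the IQ stage order IF -> REN -> SCHD -> REG.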

Register Mapping Table

Records the mapping from virtual (architectural) registers to physical registers
Mapping is stored in RAM or CAM memories

Arch reg (virtual)    Phy reg
R1                 => P3
R2                 => P10
R3                 => P6
R4                 => P8
R5                 => P12

Register Renaming Examples

Loop: LW  R2, 0(R1)
      ADD R2, R2, 1
      SW  R2, 0(R1)
      ADD R1, R1, 4
      BNE R2, R3, Loop

LW returns 100, R1=4000

Renamed dynamic instructions:
      …
      BNE P2, P3, Loop
      LW  P32, 0(P1)
      ADD P33, P32, 1
      SW  P33, 0(P1)
      ADD P34, P1, 4
      BNE P33, P3, Loop
      …

Assume at first BNE.rename, R1-R31 mapped to P1-P31, P32-P127 are free
First BNE may be predicted either correctly or not
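
The renaming of one loop iteration can be reproduced with a minimal Python sketch. It assumes the starting state given above (R1-R31 mapped to P1-P31, P32-P127 free); the function name `rename` and the tuple encoding of instructions are hypothetical, and immediates and the branch target are left out because they are not renamed.

```python
def rename(instrs, mapping, free_list):
    """Rename (op, dest, sources) tuples; dest is None for SW and BNE."""
    renamed = []
    for op, dest, srcs in instrs:
        # Sources read the current mapping before the destination is renamed.
        phys_srcs = [mapping[s] for s in srcs]
        phys_dest = None
        if dest is not None:
            phys_dest = free_list.pop(0)   # allocate the next free physical register
            mapping[dest] = phys_dest
        renamed.append((op, phys_dest, phys_srcs))
    return renamed

mapping = {f"R{i}": f"P{i}" for i in range(1, 32)}   # R1-R31 => P1-P31
free_list = [f"P{i}" for i in range(32, 128)]        # P32-P127 are free

loop_body = [
    ("LW",  "R2", ["R1"]),
    ("ADD", "R2", ["R2"]),
    ("SW",  None, ["R2", "R1"]),
    ("ADD", "R1", ["R1"]),
    ("BNE", None, ["R2", "R3"]),
]
renamed = rename(loop_body, mapping, free_list)
```

After the call, `mapping["R2"]` is `"P33"` and `mapping["R1"]` is `"P34"`. Instructions without a register destination (SW, BNE) leave the mapping unchanged.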

Register Mapping Status

Mapping status after each instruction is renamed (R1-R5 shown):

                 R1     R2     R3    R4    R5
  Before LW      P1     P2     P3    P4    P5
  After LW       P1     P32    P3    P4    P5
  After ADD R2   P1     P33    P3    P4    P5
  After SW       P1     P33    P3    P4    P5
  After ADD R1   P34    P33    P3    P4    P5
  After BNE      No change

Physical register contents at commit (possible sequence):

  P1=4000 P2=200 … P32=100 P33=101 P34=4004
  (one snapshot in the figure shows P33=?)

Commit and Rollback

[Figure: the sequence of mapping statuses in rename order, with the commit point and the rename point marked on it]

  R1 => P1   R2 => P2   R3 => P3  R4 => P4  R5 => P5
  R1 => P1   R2 => P32  R3 => P3  R4 => P4  R5 => P5
  R1 => P1   R2 => P33  R3 => P3  R4 => P4  R5 => P5
  R1 => P1   R2 => P33  R3 => P3  R4 => P4  R5 => P5
  R1 => P34  R2 => P33  R3 => P3  R4 => P4  R5 => P5

  P1=4000 P2=200 … P32=100 P33=? P34=4004

Commit successful: make the next mapping status the committed mapping status; free the previous physical register
Mis-prediction/exception: flush pipeline, flush the following mappings
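
The commit and flush rules can be sketched in Python as follows. This is a simplified model under stated assumptions: `pending` is a hypothetical list of uncommitted renames in program order, and the flush is triggered by hand rather than by a modelled branch or exception.

```python
def commit_oldest(pending, committed_map, free_list):
    """Commit the oldest renamed instruction and free the register it replaced."""
    arch, new_phys = pending.pop(0)
    old_phys = committed_map[arch]
    committed_map[arch] = new_phys     # next mapping status becomes the committed one
    free_list.append(old_phys)         # the previous physical register is freed

def flush(pending, committed_map, free_list):
    """On mis-prediction/exception: drop every uncommitted mapping."""
    for _, new_phys in pending:
        free_list.append(new_phys)     # speculative registers return to the free list
    pending.clear()
    return dict(committed_map)         # renaming restarts from the committed mapping

committed_map = {"R1": "P1", "R2": "P2", "R3": "P3"}
free_list = ["P35", "P36"]
# Uncommitted renames in program order: LW R2, ADD R2, ADD R1.
pending = [("R2", "P32"), ("R2", "P33"), ("R1", "P34")]

commit_oldest(pending, committed_map, free_list)        # LW commits: R2 => P32, P2 freed
current_map = flush(pending, committed_map, free_list)  # the rest is squashed
```

Freeing the old register only at commit is what keeps a flush cheap: the committed mapping never points at a register that a squashed instruction could have overwritten.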

Program Execution Correctness

Only committed instructions write to register and memory

  • Yes, from programmer's viewpoint -- only committed instructions' register output becomes visible

Maintain correct data flow: a child instruction always uses the values from its parents

  • Yes, in renamed form, and not affected by speculative execution

Register/memory receives the value of last write

  • Yes, from programmer's viewpoint -- architectural mapping status is updated in program order

Note memory correctness is not affected

Mapping Table Design – MIPS R10000

RAM-based structure:

  • Automatic, parallel saving of the mapping on branches at rename
  • On mis-prediction: restore the previous mapping immediately, flush pipeline, restart fetch at the alternative PC
  • On commit of branch instruction: make the corresponding mapping the committed one
  • Stall if branch stack is full

[Figure: branch stack holding the committed mapping and the mappings after Br1, Br2, Br3 and Br4, each branch entry paired with its alternative PC (PC1-PC4); the mapping tables hold the current mapping and the committed mapping]
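
A minimal Python sketch of the branch-stack scheme follows. The class and method names are hypothetical, and the default depth of four simply follows the four branch entries in the figure; real hardware copies the table in parallel, which a dictionary copy only approximates.

```python
class BranchStackRenamer:
    """RAM-style map table with a bounded stack of per-branch checkpoints."""

    def __init__(self, mapping, depth=4):
        self.current = dict(mapping)
        self.committed = dict(mapping)
        self.depth = depth
        self.stack = []                  # entries: (saved mapping, alternative PC)

    def rename_dest(self, arch, phys):
        self.current[arch] = phys

    def rename_branch(self, alt_pc):
        """Save the whole mapping at rename; return False to signal a stall."""
        if len(self.stack) == self.depth:
            return False
        self.stack.append((dict(self.current), alt_pc))
        return True

    def mispredict(self, index):
        """Restore the checkpoint of branch `index`; drop it and all younger ones."""
        saved, alt_pc = self.stack[index]
        self.current = dict(saved)
        del self.stack[index:]
        return alt_pc                    # fetch restarts at the alternative PC

    def commit_branch(self):
        """The oldest branch commits: its saved mapping becomes the committed one."""
        saved, _ = self.stack.pop(0)
        self.committed = saved

r = BranchStackRenamer({"R1": "P1", "R2": "P2"})
r.rename_branch(alt_pc=0x400)     # Br1
r.rename_dest("R2", "P32")
r.rename_branch(alt_pc=0x480)     # Br2
r.rename_dest("R1", "P34")
restart_pc = r.mispredict(0)      # Br1 was mis-predicted
```

After the mis-prediction the current mapping is back to R1 => P1, R2 => P2 in a single step, and both checkpoints are gone because Br2 is younger than Br1.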

Mapping Table Design – MIPS R10000

How about precise exception?

  • Cannot preserve every mapping status for every instruction
  • Solution: record the change of mapping in ROB
  • ROB: contains destination architectural register, renamed physical register, old renamed physical register
  • On exception: roll back the mapping instruction by instruction, four instructions per cycle
  • Slow performance, but how frequent is exception?

Note branch mis-prediction has fast recovery
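
The ROB-based rollback can be sketched in Python as below. The function name `rollback` and the tuple layout are hypothetical; the sketch undoes one entry per loop iteration and does not model the four-per-cycle timing.

```python
def rollback(rob, mapping, free_list, faulting_index):
    """Undo renames from the youngest ROB entry back to the faulting one."""
    while len(rob) > faulting_index:
        arch, new_phys, old_phys = rob.pop()   # youngest entry first
        if arch is not None:
            mapping[arch] = old_phys           # restore the previous mapping
            free_list.append(new_phys)         # the squashed register is free again

mapping = {"R1": "P34", "R2": "P33", "R3": "P3"}
free_list = []
# ROB entries: (dest arch reg, renamed phys reg, old renamed phys reg)
rob = [
    ("R2", "P32", "P2"),    # LW
    ("R2", "P33", "P32"),   # ADD
    (None, None, None),     # SW has no register destination
    ("R1", "P34", "P1"),    # ADD
]
rollback(rob, mapping, free_list, faulting_index=1)   # exception on the first ADD
```

Walking from youngest to oldest matters when the same architectural register was renamed more than once: each step restores the mapping that was current just before that instruction.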

Mapping Table Design – Alpha 21264

CAM structure

  • Associative searching on architecture register index, output physical register index (through an encoder)
  • One column represents one mapping, allocated to each instruction with register output at rename
  • One pair of valid bits changes per destination renaming
  • Fast recovery even on exceptions

[Figure: CAM array with one entry per physical register (p0, p1, p2, …, pk), each storing an Arch. Reg #, next to columns of valid bits labeled current mapping and committed mapping; lookup condition: match and valid]
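
A rough Python model of a CAM-style map table is sketched below. It is an illustration under assumptions, not the 21264 circuit: the class `CamMapTable` is hypothetical, a single valid-bit vector stands in for one mapping column, and `restore` assumes that registers allocated after the snapshot have not been reused.

```python
class CamMapTable:
    """One CAM entry per physical register: an arch register number plus a valid bit."""

    def __init__(self, num_phys, initial):
        self.arch = [None] * num_phys
        self.valid = [False] * num_phys
        for arch_reg, phys in initial.items():
            self.arch[phys] = arch_reg
            self.valid[phys] = True

    def lookup(self, arch_reg):
        """Associative search: the entry that matches and is valid gives the index."""
        for phys in range(len(self.arch)):
            if self.valid[phys] and self.arch[phys] == arch_reg:
                return phys
        raise KeyError(arch_reg)

    def rename_dest(self, arch_reg, new_phys):
        """One pair of valid-bit changes: clear the old entry, set the new one."""
        self.valid[self.lookup(arch_reg)] = False
        self.arch[new_phys] = arch_reg
        self.valid[new_phys] = True

    def snapshot(self):
        return list(self.valid)

    def restore(self, saved_valid):
        """Recovery swaps in saved valid bits; the arch fields stay in place."""
        self.valid = list(saved_valid)

cam = CamMapTable(8, {1: 1, 2: 2, 3: 3})   # R1 => p1, R2 => p2, R3 => p3
saved = cam.snapshot()
cam.rename_dest(2, 5)                      # R2 => p5
renamed_to = cam.lookup(2)
cam.restore(saved)
restored_to = cam.lookup(2)
```

Since a rename only flips two valid bits, saving and restoring a mapping means copying one bit per physical register, which is why recovery stays fast even for exceptions.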

Multiple Issue Pipelines

Each pipeline stage accepts k instructions: a k-issue processor

  • Alpha 21264: 4-issue
  • MIPS R10000: 4-issue
  • Intel P4: 3-issue

Memory structure must have multiple ports proportional to issue width!
What if k instructions at rename have dependence among them?
Need dependence check logic!

Dependence Check Logic

Any change to the first renaming? What is the change to the second one? Third and fourth ones?

[Figure: four instructions (Rs0 Rt0 Rd0 through Rs3 Rt3 Rd3) looked up in the mapping table in parallel, producing source physical registers for each instruction and destination physical registers Pd0-Pd3; no dependence check yet]
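
One way to answer the questions above is sketched in Python below: every source is first looked up in the table as it stood before the group, then overridden if an older instruction in the same group writes that register. The function name `rename_group` is hypothetical, and the sketch assumes every instruction has exactly two register sources and one register destination.

```python
def rename_group(group, mapping, free_list):
    """Rename a group of (rs, rt, rd) instructions that reach rename together."""
    old_map = dict(mapping)                  # table contents at the start of the cycle
    new_dests = [free_list.pop(0) for _ in group]
    renamed = []
    for i, (rs, rt, rd) in enumerate(group):
        ps, pt = old_map[rs], old_map[rt]    # parallel table lookup, no check yet
        for j in range(i):                   # dependence check against older members
            if group[j][2] == rs:
                ps = new_dests[j]            # the youngest older writer wins
            if group[j][2] == rt:
                pt = new_dests[j]
        renamed.append((ps, pt, new_dests[i]))
    for (_, _, rd), pd in zip(group, new_dests):
        mapping[rd] = pd                     # later writers overwrite earlier ones
    return renamed

mapping = {"R1": "P1", "R2": "P2", "R3": "P3", "R4": "P4"}
free_list = ["P32", "P33", "P34", "P35"]
group = [
    ("R1", "R2", "R3"),   # first:  no older instruction, table lookup is final
    ("R3", "R1", "R4"),   # second: Rs depends on the first instruction
    ("R4", "R3", "R3"),   # third:  depends on the second and the first
    ("R3", "R4", "R1"),   # fourth: Rs must see the third's R3, not the first's
]
renamed = rename_group(group, mapping, free_list)
```

The first renaming never changes. The second needs one comparison per source, the third two, and the fourth three, so the number of comparators grows roughly with the square of the issue width.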