What about branches?
- Branch outcomes are not known until EXE
- What are our options?
Control Hazards
Today
- Quiz
- Control Hazards
- Midterm review
- Return your papers
Key Points: Control Hazards
- Control hazards occur when we ...
    add $s1, $s3, $s2
    sub $s6, $s5, $s2
    beq $s6, $s7, somewhere
    and $s2, $s3, $s1

[Figure: five-stage pipeline: Fetch, Decode, EX, Mem, Write back]
    if (Instruction is branch) {
        if ($s1 != $s2) {
            PC = PC + offset;
        } else {
            PC = PC + 4;
        }
    } else {
        PC = PC + 4;
    }
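A minimal C sketch of the next-PC rule above, assuming a bne-style branch (taken when the two register values differ). The name `next_pc` and its parameters are hypothetical. Like the slide, it adds the offset directly to the PC; real MIPS hardware computes PC + 4 + (offset << 2).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper mirroring the slide's simplified rule.
   rs and rt stand for the two register values being compared. */
uint32_t next_pc(uint32_t pc, bool is_branch,
                 int32_t rs, int32_t rt, int32_t offset)
{
    assert(pc % 4 == 0);                /* instructions are word aligned */
    if (is_branch && rs != rt) {
        return pc + (uint32_t)offset;   /* taken: redirect fetch */
    }
    return pc + 4;                      /* not a branch, or not taken */
}
```

The point to notice is that `rs != rt` needs the register values, which is why the outcome is not available at fetch time.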
    add $s0, $t0, $t1
    sub $t2, $s0, $t3

[Figure: pipeline diagram over cycles; each instruction passes through Fetch, Decode, EX, Mem, Write back]
    add $s0, $t0, $t1
    bne $t2, $s0, somewhere
    add $t2, $s4, $t1
    ...
    somewhere:
    sub $t2, $s0, $t3

[Figure: pipeline diagram over cycles for the Taken case, with the Branch Delay marked]
    add $s0, $t0, $t1
    bne $t2, $s0, somewhere
    sub $t2, $s0, $t3

[Figure: pipeline diagram over cycles; the sub behind the bne is marked Stall]
    add $s0, $t0, $t1
    bne $t2, $s0, somewhere
    bne $t2, $s4, else
    ...
    else:
    sub $t2, $s0, $t3

[Figure: pipeline diagram over cycles showing the Taken and Not-taken paths; the instruction fetched down the wrong path is marked Squash]
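One way to put a number on the cost of squashing, as a rough sketch: if every mispredicted branch wastes a fixed number of cycles, the average CPI grows by (fraction of instructions that are branches) x (fraction of those mispredicted) x (penalty in cycles). The function name and the example values are assumptions for illustration, not figures from the slides.

```c
#include <assert.h>

/* Hypothetical helper: average CPI once misprediction penalties are added. */
double effective_cpi(double base_cpi, double branch_fraction,
                     double mispredict_rate, double penalty_cycles)
{
    assert(branch_fraction >= 0.0 && branch_fraction <= 1.0);
    assert(mispredict_rate >= 0.0 && mispredict_rate <= 1.0);
    assert(penalty_cycles >= 0.0);
    /* Only the mispredicted branches pay the squash penalty. */
    return base_cpi + branch_fraction * mispredict_rate * penalty_cycles;
}
```

For example, with a base CPI of 1, 25% branches, half of them mispredicted, and a 2-cycle penalty, `effective_cpi(1.0, 0.25, 0.5, 2.0)` gives 1.25.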
[Figure: pipelined datapath. Labels: PC; 4; Add; Instruction Memory (Read Address); Register File (Read Addr 1, Read Addr 2, Write Addr, Write Data, Read Data 1, Read Data 2); Sign Extend (16 to 32); Shift left 2; Add; ALU; Data Memory (Address, Write Data, Read Data); pipeline registers IFetch/Dec, Dec/Exec, Exec/Mem, Mem/WB]
every branch?
identify a branch (in our case, this is less than 1)
branch?
branch?
[Figure: grade histogram. x-axis: letter grade (F, D-, D+, C, B-, B+, A); y-axis: # of students (1 to 7)]
branch behavior.
taken branch. All 10 are pretty predictable.
it’s always the same.
It can differentiate between branches. Bad behavior by one won’t mess up others... mostly.
Infinite! Bigger is better, but don’t mess with the cycle time.
Accuracy is still not great.
    for(i = 0; i < 10; i++) {
        for(j = 0; j < 4; j++) {
        }
    }

iteration  actual     prediction  new prediction
1          taken      not taken   taken
2          taken      taken       taken
3          taken      taken       taken
4          not taken  taken       not taken
1          taken      not taken   taken
2          taken      taken       taken
3          taken      taken       taken
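The trace above can be reproduced with a short simulation. This is a sketch of a 1-bit (last-outcome) predictor, assuming it starts out predicting not taken and, as in the table, that each pass through the inner loop produces `inner - 1` taken outcomes followed by one not-taken outcome. The function name is hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Count mispredictions of a 1-bit predictor on the inner-loop branch. */
int one_bit_mispredicts(int outer, int inner)
{
    assert(outer > 0 && inner > 0);
    bool prediction = false;            /* assumed initial guess: not taken */
    int misses = 0;
    for (int i = 0; i < outer; i++) {
        for (int j = 1; j <= inner; j++) {
            bool actual = (j < inner);  /* the last iteration exits the loop */
            if (prediction != actual) {
                misses++;
            }
            prediction = actual;        /* remember only the last outcome */
        }
    }
    return misses;
}
```

With `outer = 10` and `inner = 4` this counts 20 mispredictions out of 40 branches: the predictor is wrong twice per pass, once on entering the loop and once on leaving it.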
State  Meaning             Prediction
00     strongly not taken  not taken
01     weakly not taken    not taken
10     weakly taken        taken
11     strongly taken      taken
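The state table can be written as two small functions. This sketch assumes the usual saturating-counter transitions (one step toward the actual outcome, stopping at 00 and 11); the slides list the states but not every transition, so treat the update rule as an assumption. Function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* States follow the table: 0 = strongly not taken ... 3 = strongly taken. */
bool predict_taken(unsigned state)
{
    assert(state <= 3);
    return state >= 2;                  /* 10 and 11 predict taken */
}

/* Move one step toward the actual outcome, saturating at 0 and 3. */
unsigned next_state(unsigned state, bool taken)
{
    assert(state <= 3);
    if (taken) {
        return state < 3 ? state + 1 : 3;
    }
    return state > 0 ? state - 1 : 0;
}
```

A single surprise moves a strong state to the matching weak state without flipping the prediction, which is the property the 1-bit scheme lacks.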
    for(i = 0; i < 10; i++) {
        for(j = 0; j < 4; j++) {
        }
    }

iteration  actual     state           prediction  new state
1          taken      weakly taken    taken       strongly taken
2          taken      strongly taken  taken       strongly taken
3          taken      strongly taken  taken       strongly taken
4          not taken  strongly taken  taken       weakly taken
1          taken      weakly taken    taken       strongly taken
2          taken      strongly taken  taken       strongly taken
3          taken      strongly taken  taken       strongly taken
dynamic branch prediction.
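To see what the 2-bit scheme buys on this loop, here is a sketch that counts its mispredictions on the inner-loop branch. It assumes saturating-counter updates and the same outcome pattern as the table (`inner - 1` taken, then one not taken, per pass); the function name and the `state` parameter (the starting counter value, 0 to 3) are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Count mispredictions of one 2-bit counter on the inner-loop branch. */
int two_bit_mispredicts(int outer, int inner, unsigned state)
{
    assert(outer > 0 && inner > 0 && state <= 3);
    int misses = 0;
    for (int i = 0; i < outer; i++) {
        for (int j = 1; j <= inner; j++) {
            bool actual = (j < inner);      /* the last iteration exits */
            bool predicted = (state >= 2);  /* 10 and 11 predict taken */
            if (predicted != actual) {
                misses++;
            }
            /* Step toward the actual outcome, saturating at 0 and 3. */
            if (actual) {
                if (state < 3) state++;
            } else {
                if (state > 0) state--;
            }
        }
    }
    return misses;
}
```

Starting from weakly taken (state 2), ten passes of four iterations give 10 mispredictions, one per loop exit, compared with two per pass for a predictor that only remembers the last outcome.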
    for(i = 0; i < 10; i++) {
        for(j = 0; j < 4; j++) {
        }
    }

iteration  actual
1          taken
2          taken
3          taken
4          not taken
1          taken
2          taken
3          taken
4          not taken
1          taken
2          taken
3          taken
4          not taken
iteration  actual     branch history  steady-state prediction
1          taken      11111
2          taken      11111
3          taken      11111
4          not taken  11111
-          taken      11110           taken
1          taken      11101           taken
2          taken      11011           taken
3          taken      10111           taken
4          not taken  01111           not taken

(The last five rows repeat on each later pass.)
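The history table can be checked with a small simulation. This sketch assumes one shared 5-bit history register (the row without an iteration number is read here as the outer-loop branch), a table of 2-bit saturating counters indexed by that history, an initial history of 11111, and counters that start at weakly taken. All names and initial values are assumptions for illustration.

```c
#include <assert.h>
#include <stdbool.h>

#define HISTORY_BITS 5
#define TABLE_SIZE (1u << HISTORY_BITS)

static unsigned history;                    /* last 5 outcomes, 1 = taken */
static unsigned char counters[TABLE_SIZE];  /* one 2-bit counter per history */

/* Predict with the counter selected by the history, then train both. */
static bool predict_and_update(bool actual)
{
    bool predicted = counters[history] >= 2;
    if (actual) {
        if (counters[history] < 3) counters[history]++;
    } else {
        if (counters[history] > 0) counters[history]--;
    }
    history = ((history << 1) | (actual ? 1u : 0u)) & (TABLE_SIZE - 1);
    return predicted == actual;
}

/* Count mispredictions over the nested loop, inner and outer branches. */
int history_mispredicts(int outer, int inner)
{
    assert(outer > 0 && inner > 0);
    history = TABLE_SIZE - 1;               /* start at 11111 */
    for (unsigned k = 0; k < TABLE_SIZE; k++) {
        counters[k] = 2;                    /* weakly taken */
    }
    int misses = 0;
    for (int i = 1; i <= outer; i++) {
        for (int j = 1; j <= inner; j++) {
            if (!predict_and_update(j < inner)) misses++;  /* inner branch */
        }
        if (!predict_and_update(i < outer)) misses++;      /* outer branch */
    }
    return misses;
}
```

Under these assumptions `history_mispredicts(10, 4)` counts 3 mispredictions in 50 branches: two while the counter for history 01111 trains, and one at the final exit of the outer loop. Once trained, the inner-loop exit is predicted correctly because it is the only branch seen with history 01111.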
the per-PC predictor if the loop executes 4 iterations and we keep 4 history bits?
With more iterations, the benefit of history decreases, so a shorter history is ok.
Pretty good, as long as the history is not too long.
low-order bits of the PC.
indexed by the history for that branch.
polluted.
[Figure: block diagram with PC, Table of history registers, Predictor table 0, Predictor table 1, and Prediction]
keep track which predictor is most often correct.
predictor.
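One common way to "keep track which predictor is most often correct" is a small chooser counter. The sketch below assumes two component predictors, A and B, and a 2-bit chooser where 0 and 1 favor A and 2 and 3 favor B; the encoding and function names are hypothetical, not taken from the slides.

```c
#include <assert.h>
#include <stdbool.h>

/* Pick the prediction of whichever component the chooser currently favors. */
bool choose(unsigned chooser, bool pred_a, bool pred_b)
{
    assert(chooser <= 3);
    return chooser >= 2 ? pred_b : pred_a;
}

/* Nudge the chooser toward the component that was right when they disagree. */
unsigned train_chooser(unsigned chooser, bool pred_a, bool pred_b, bool actual)
{
    assert(chooser <= 3);
    bool a_ok = (pred_a == actual);
    bool b_ok = (pred_b == actual);
    if (b_ok && !a_ok && chooser < 3) return chooser + 1;
    if (a_ok && !b_ok && chooser > 0) return chooser - 1;
    return chooser;     /* both right, both wrong, or already saturated */
}
```

The chooser only moves when exactly one component is correct, since an outcome both got right (or both got wrong) says nothing about which to trust.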
the jump.
the BTB.
destination.
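A BTB can be pictured as a small table from branch PC to destination. This is a sketch only, assuming a direct-mapped table indexed by low-order PC bits with the full PC kept as a tag; the size, struct layout, and function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define BTB_ENTRIES 16u     /* hypothetical size; must be a power of two */

struct btb_entry {
    bool valid;
    uint32_t tag;           /* PC of the branch or jump */
    uint32_t target;        /* its destination */
};

static struct btb_entry btb[BTB_ENTRIES];

/* Index with the low-order bits of the word-aligned PC. */
static unsigned btb_index(uint32_t pc)
{
    return (pc >> 2) & (BTB_ENTRIES - 1);
}

/* Record the destination once the branch or jump has resolved. */
void btb_insert(uint32_t pc, uint32_t target)
{
    assert(pc % 4 == 0);
    struct btb_entry *e = &btb[btb_index(pc)];
    e->valid = true;
    e->tag = pc;
    e->target = target;
}

/* At fetch: return true and the stored destination on a hit. */
bool btb_lookup(uint32_t pc, uint32_t *target)
{
    assert(target != 0);
    const struct btb_entry *e = &btb[btb_index(pc)];
    if (e->valid && e->tag == pc) {
        *target = e->target;
        return true;
    }
    return false;
}
```

The tag check matters because two different PCs can map to the same entry; without it, a lookup could return another instruction's destination.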