PMPM: Prediction by Combining Multiple Partial Matches
Hongliang Gao, Huiyang Zhou
Computer Science Department, University of Central Florida
Prediction by Partial Matching (PPM)
- Originally proposed for data compression by Cleary and Witten.
- Introduced to branch prediction by Chen et al.
- For branch prediction:
– Each static branch has a set of Markov predictors from order 0 to order m.
– The “longest match” policy: use the m immediately preceding history bits to search for a pattern in the highest-order Markov predictor; if no pattern matches, fall back to the next lower order.
- Assumptions of the PPM algorithm
– Longer history provides a more accurate context (true).
– A prediction counter associated with a more accurate context will provide higher prediction accuracy (false).
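The longest-match policy described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the class name `PPMBranch`, the signed saturating counters, the default "taken" prediction when nothing matches, and the rule of training every order are all assumptions made for the example.

```python
class PPMBranch:
    """Hypothetical per-branch PPM state: one Markov table per order 0..m."""

    def __init__(self, m):
        self.m = m
        # tables[k] maps the last k history bits (as a tuple) to a signed counter
        self.tables = [dict() for _ in range(m + 1)]

    def predict(self, history):
        """Longest match: search order m first, then fall back to lower orders."""
        for order in range(self.m, -1, -1):
            if order > len(history):
                continue                      # not enough history for this order
            pattern = tuple(history[-order:]) if order else ()
            ctr = self.tables[order].get(pattern)
            if ctr is not None:
                return ctr >= 0, order        # (taken?, order that matched)
        return True, None                     # assumed default: predict taken

    def update(self, history, taken, limit=3):
        """Assumed training rule: train every order whose context is available."""
        for order in range(min(self.m, len(history)) + 1):
            pattern = tuple(history[-order:]) if order else ()
            ctr = self.tables[order].get(pattern, 0)
            ctr = min(limit, ctr + 1) if taken else max(-limit, ctr - 1)
            self.tables[order][pattern] = ctr


branch = PPMBranch(m=2)
history = []
for outcome in [1, 0, 1, 0, 1, 0]:            # an alternating branch
    branch.update(history, bool(outcome))
    history.append(outcome)
print(branch.predict(history))                # (True, 2)
```

After the alternating training sequence, the order-2 context (1, 0) has only ever been followed by a taken outcome, so the highest-order table answers the query.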
The “longest match” policy is not optimal
- The confidence-based PPM
– Use the longest confident (ctr ≠ 0) match.
– Misprediction rate (MPKI) reductions vs. PPM.
– Max H = 40: [Figure: per-benchmark MPKI reduction (y-axis: -1% to 4%) for gzip, gcc, crafty, compress, raytrace, javac, mtrt, eon, gap, bzip2, vpr, mcf, parser, jess, db, mpegaudio, jack, perlbmk, vortex, twolf, and AVG.]
– Max H = 0 to 40: [Figure: MPKI reduction (y-axis: -0.5% to 1.5%) vs. Max History Length (5 to 40).]
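The difference between the two policies can be illustrated with a small Python sketch. It assumes signed counters where 0 is the weakest state, a list of matched counters ordered from the longest history to the shortest (`None` marks a miss), and a fallback to the plain longest match when no counter is confident; the function names are hypothetical.

```python
def longest_match(ctrs):
    """PPM policy: use the first hit (longest history), whatever its value."""
    for ctr in ctrs:
        if ctr is not None:
            return ctr >= 0
    return True  # assumed default when nothing matches


def longest_confident_match(ctrs):
    """Confidence-based PPM: use the first hit whose counter is non-zero."""
    for ctr in ctrs:
        if ctr is not None and ctr != 0:
            return ctr > 0
    return longest_match(ctrs)  # assumed fallback: no confident match exists


# Longest history misses, the next one hits with a zero (unconfident) counter.
matched = [None, 0, -3, 2]
print(longest_match(matched))            # True
print(longest_confident_match(matched))  # False
```

Here the longest hit carries no confidence, so the confidence-based policy skips it and follows the next counter, which strongly indicates not-taken.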
Introduction
- Key observation on PPM
– The “longest match” policy is not optimal for branch prediction.
- Our contributions
– A novel algorithm: Prediction by combining Multiple Partial Matches (PMPM).
– A PMPM-based idealistic branch predictor.
– A PMPM-based realistic branch predictor.
Prediction by combining Multiple Partial Matches
- Different branches favor different history lengths
– Using a longer history than necessary:
- Uncorrelated history bits act as noise and distribute useful information across more prediction counters.
- Long histories repeat less frequently, so their counters capture only the most recent behavior.
– Especially harmful for “not-correlated / random-like” branches.
- Solution
– Combining multiple matches
- Why? If a branch depends on a specific 10-bit history, one ctr is enough; if we use 15-bit histories, the same behavior may need 32 ctrs. Combining multiple counters includes more history behaviors.
- How: summation, which integrates both direction AND confidence.
- Which: the several longest confident matches, i.e., those with non-zero prediction counters.
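The figure of 32 counters follows from the 5 extra history bits: 2^(15-10) = 32. The short Python sketch below enumerates the 15-bit contexts that share one 10-bit suffix. The suffix value and the function name are arbitrary choices for illustration, and 32 is an upper bound that is reached only if every combination of the older bits actually occurs.

```python
from itertools import product


def contexts_sharing_suffix(suffix, total_len):
    """All total_len-bit histories whose most recent bits equal `suffix`."""
    extra = total_len - len(suffix)
    return [older + suffix for older in product((0, 1), repeat=extra)]


suffix = (1, 0, 1, 1, 0, 0, 1, 0, 1, 1)      # an arbitrary 10-bit history
contexts = contexts_sharing_suffix(suffix, 15)
print(len(contexts))                          # 32
```

Each of these 32 contexts owns a separate counter in a 15-bit table, so the training events that one 10-bit counter would see are split among them.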
Prediction accuracy of PMPM
- Configuration
– Combine the L longest confident matches.
– Maximum global history length: 40.
– Prediction = $\left(\sum_{i=1}^{L} Ctr_i \ge 0\right)$, with $Ctr_i \ne 0$
- Prediction accuracy
[Figure: average MPKI (y-axis: 3.6 to 3.95) vs. L (1 to 41) for PMPM-L and PPM. Annotations: "Combine all" and "Minimum MPKI".]
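The PMPM-L configuration can be sketched in Python as follows. This is a hedged illustration under assumptions: counters are signed, the input list is ordered from the longest matching history to the shortest with `None` for a miss, and a non-negative sum is read as "taken". The function name `pmpm_predict` is hypothetical.

```python
def pmpm_predict(ctrs, L):
    """Sum the L longest confident (non-zero) matches; taken if the sum is >= 0."""
    confident = [c for c in ctrs if c is not None and c != 0]
    total = sum(confident[:L])
    return total >= 0, total


matched = [-1, None, 0, 3, 2, -1]   # longest history first
print(pmpm_predict(matched, L=1))   # (False, -1)
print(pmpm_predict(matched, L=3))   # (True, 4)
```

With L = 1 the rule degenerates to the confidence-based longest match; with L = 3 the weak vote of the longest history is outweighed by two stronger counters from shorter histories, which is how the summation integrates direction and confidence.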
The idealistic PMPM predictor
[Figure: block diagram of the idealistic PMPM predictor.]
- Input labels: br, tag, LHR, GHR, path history.
- Tables:
– meta ctr & bias ctr
– Short-GHR table (Len 1-32): tag, LRU, ctr, ubit, bf
– Long-GHR table (geometrical lengths): tag, LRU, ctr, ubit, bf
– LHR table (Len 1-32): tag, LRU, ctr, ubit, bf
- Global prediction: ∑ of the M (=7) longest matched useful counters.
- Local prediction: ∑ of the N (=16) longest matched useful counters.
- The bias ctr and meta ctr also feed the final prediction.
Prediction accuracy of the idealistic PMPM predictor
[Figure: per-benchmark MPKI (y-axis: 2 to 14) of PPM and iPMPM for gzip, vpr, gcc, mcf, crafty, parser, compress, jess, raytrace, db, javac, mpegaudio, mtrt, jack, eon, perlbmk, gap, vortex, bzip2, twolf, and AVG.]
- PPM: Same predictor structure, but using the “longest match” prediction policy.
- Average MPKI
– PPM: 3.330
– PMPM: 2.824 (15.2% lower than PPM)
– The chart is also annotated with -20.9%.
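The 15.2% figure is consistent with the two averages on this slide, as the following arithmetic check shows. The helper name `mpki_reduction` is introduced here for illustration only.

```python
def mpki_reduction(baseline, improved):
    """Relative MPKI reduction of `improved` over `baseline`, in percent."""
    return (baseline - improved) / baseline * 100.0


print(f"{mpki_reduction(3.330, 2.824):.1f}%")   # 15.2%
```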
The TAGE predictor
[Figure: block diagram of the TAGE predictor.]
- Bimodal table, indexed by pc.
- Tagged tables gtable6 (shortest), gtable5, …, gtable0 (longest): tag, ctr, ubit. Index: pc, ghr, path; tag: pc, ghr.
- Select one matched counter (the longest match OR the 2nd longest match) as the final prediction.
The realistic PMPM predictor
[Figure: block diagram of the realistic PMPM predictor.]
- Bimodal table, indexed by pc.
- Tagged tables gtable6 (shortest), gtable5, …, gtable0 (longest): tag, ctr, ubit. Index: pc, ghr, path; tag: pc, ghr.
- ltable: tag, ctr, ubit. Index: pc, lhr; tag: pc.
- Counter selection logic followed by ∑ takes the place of TAGE's "select one matched counter (the longest match OR the 2nd longest match)" and produces the final prediction.
- gtable ctrs: 4 groups: (gtable6, gtable5), (gtable4, gtable3), (gtable2, gtable1), (gtable0).
- Total (max): a 2-bit bimodal ctr, a 5-bit ltable ctr, and four 5-bit gtable ctrs.
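One way to read the counter selection logic is sketched below in Python. Several points are assumptions, not statements from the slide: each group contributes at most one counter, taken from its longest-history table that hits; all counters are treated as signed values; a non-negative sum means "taken"; and usefulness or confidence filtering is omitted. The names `GROUPS`, `clamp`, and `realistic_pmpm_predict` are hypothetical.

```python
GROUPS = [(6, 5), (4, 3), (2, 1), (0,)]   # gtable indices, grouped as on the slide


def clamp(value, bits):
    """Saturate a signed value to the range of a `bits`-wide counter."""
    lo, hi = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    return max(lo, min(hi, value))


def realistic_pmpm_predict(bimodal_ctr, ltable_ctr, gtable_ctrs):
    """gtable_ctrs[i] is the matched counter of gtable i, or None on a tag miss.

    Assumption: within a group, the table with the longer history wins
    (a lower gtable index means a longer history).
    """
    selected = [clamp(bimodal_ctr, 2)]            # 2-bit bimodal counter
    if ltable_ctr is not None:
        selected.append(clamp(ltable_ctr, 5))     # 5-bit ltable counter
    for group in GROUPS:
        for idx in sorted(group):                 # longest history first
            if gtable_ctrs[idx] is not None:
                selected.append(clamp(gtable_ctrs[idx], 5))
                break
    return sum(selected) >= 0, selected


hits = [None, -4, 3, None, None, 2, 7]            # gtable0 .. gtable6
print(realistic_pmpm_predict(1, None, hits))      # (False, [1, 2, -4])
```

Under these assumptions at most six counters are summed, which matches the stated maximum of one bimodal, one ltable, and four gtable counters.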
Ahead pipelining
[Figure: timeline of blocks A, B, C, D over Cycle1 to Cycle4. A 3-block ahead prediction is initiated at A; the prediction of D is available when D is reached.]
- Stage 1: Indexes.
- Stage 2: 1. Tags. 2. Read 4 adjacent entries.
- Stage 3: 1. Calculate 4 potential predictions. 2. Use information of B and C to select out one prediction.
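The late-selection step can be illustrated with a small Python sketch. It assumes that the "information of B and C" amounts to one bit from each block, that the 4 adjacent entries form an aligned group in a table of signed counters, and that tag checks are omitted; the function name `ahead_predict` and its parameters are hypothetical.

```python
def ahead_predict(table, stale_index, bit_b, bit_c):
    """Pick one of 4 adjacent counters using one late bit each from B and C."""
    base = (stale_index % (len(table) // 4)) * 4            # aligned group of 4
    candidates = [table[base + k] >= 0 for k in range(4)]   # 4 potential predictions
    late = (bit_b << 1) | bit_c                             # info from B and C
    return candidates[late]


table = [1, -1, -2, 3, -1, -1, 2, -3]     # two groups of 4 signed counters
print(ahead_predict(table, 1, 1, 0))      # True
print(ahead_predict(table, 0, 0, 1))      # False
```

The index is computed from information that is already available at block A, so the table read can start early; only the final 4-to-1 selection waits for B and C.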
Compared to the TAGE predictors
- Configuration
– 32kB, same global history series (5 - 131), similar structures.
– Compared to the TAGE predictor:
- PMPM-G (GH only): 2-bit larger ctrs, 2-bit smaller tags.
– Compared to the PMPM-G predictor:
- PMPM-GL (GH and LH): one ltable, smaller bimodal table, smaller tags for 3 gtables.
- Average MPKI:
– TAGE: 3.666
– PMPM-G: 3.597 (higher aliasing, gcc +7.3%)
– PMPM-GL: 3.441
The realistic PMPM predictor for CBP2
- Submitted configuration
– Save some storage for miscellaneous registers, counters, etc.
– Empirically tuned inputs, tag widths, etc.
- Several optimizations
– Shared hysteresis bits in the bimodal table (proposed in the EV8 predictor).
– Detect traces with high branch footprints and reset ubits periodically (borrowed from the TAGE predictor).
– Limit ubit updates when all predictions from the gtables are the same.
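The first optimization, shared hysteresis bits, can be sketched in Python as follows. Each 2-bit bimodal counter is split into a private prediction bit and a hysteresis bit shared by neighbouring entries. The sharing ratio of 4, the weakly-taken initial state, and the class name `SharedHysteresisBimodal` are assumptions for illustration, not details from the slides.

```python
class SharedHysteresisBimodal:
    """2-bit counters split into a private prediction bit and a shared hysteresis bit."""

    def __init__(self, entries, share=4):
        self.share = share                        # assumed: 4 entries per hysteresis bit
        self.pred = [1] * entries                 # 1 = predict taken
        self.hyst = [0] * (entries // share)      # low bit of the 2-bit counter

    def predict(self, index):
        return self.pred[index] == 1

    def update(self, index, taken):
        h = index // self.share
        ctr = (self.pred[index] << 1) | self.hyst[h]   # rebuild the 2-bit counter
        ctr = min(3, ctr + 1) if taken else max(0, ctr - 1)
        self.pred[index] = ctr >> 1
        self.hyst[h] = ctr & 1


table = SharedHysteresisBimodal(entries=8)
table.update(0, True)      # entry 0 becomes strongly taken; shared bit is set
table.update(1, False)     # entry 1 sees the shared bit and only weakens
table.update(1, False)     # entry 1 now flips to not-taken
print(table.predict(0), table.predict(1))   # True False
```

Sharing saves storage at the cost of interference: entry 1 needed two not-taken outcomes to flip because entry 0 had already strengthened the shared bit.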
The realistic PMPM predictor for CBP2 - accuracy
- Observations:
– High accuracy: 3.416 MPKI
– The local history is still important for some benchmarks (e.g., raytrace, mtrt, and vortex), although we already use a very long (203) global history.
MPKI per trace:
Trace      CBP2-GL   CBP2-G   | Trace      CBP2-GL   CBP2-G
gzip         9.712   10.346   | vpr          8.945    9.063
gcc          3.690    3.637   | mcf         10.092   10.033
crafty       2.581    2.565   | parser       5.215    5.244
compress     5.537    5.819   | jess         0.393    0.433
raytrace     0.542    0.963   | db           2.319    2.380
javac        1.107    1.159   | mpegaudio    1.102    1.159
mtrt         0.657    1.009   | jack         0.688    0.763
eon          0.276    0.359   | perlbmk      0.314    0.484
gap          1.431    1.745   | vortex       0.137    0.331
bzip2        0.037    0.042   | twolf       13.551   13.616
Average: PMPM-CBP2-GL: 3.416, PMPM-CBP2-G: 3.557
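The reported averages can be checked against the per-trace numbers with a few lines of Python. The values are copied from the table; the arithmetic mean over the 20 traces reproduces 3.416 for CBP2-GL and, within rounding of the last digit, 3.557 for CBP2-G.

```python
results = {  # trace: (CBP2-GL MPKI, CBP2-G MPKI)
    "gzip": (9.712, 10.346),    "vpr": (8.945, 9.063),
    "gcc": (3.690, 3.637),      "mcf": (10.092, 10.033),
    "crafty": (2.581, 2.565),   "parser": (5.215, 5.244),
    "compress": (5.537, 5.819), "jess": (0.393, 0.433),
    "raytrace": (0.542, 0.963), "db": (2.319, 2.380),
    "javac": (1.107, 1.159),    "mpegaudio": (1.102, 1.159),
    "mtrt": (0.657, 1.009),     "jack": (0.688, 0.763),
    "eon": (0.276, 0.359),      "perlbmk": (0.314, 0.484),
    "gap": (1.431, 1.745),      "vortex": (0.137, 0.331),
    "bzip2": (0.037, 0.042),    "twolf": (13.551, 13.616),
}

avg_gl = sum(gl for gl, _ in results.values()) / len(results)
avg_g = sum(g for _, g in results.values()) / len(results)
print(f"CBP2-GL average: {avg_gl:.3f}")   # CBP2-GL average: 3.416
print(f"CBP2-G average:  {avg_g:.4f}")
```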
The realistic PMPM predictor for CBP2 - ahead pipelining
[Figure: average MPKI (y-axis: 3.2 to 3.8) of PMPM-CBP2-GL and PMPM-CBP2-G for 1-block, 2-block, 3-block, and 4-block ahead pipelining. Annotations: 0.15 MPKI, 4.4%; 0.11 MPKI, 3.0%.]
Summary
- Key observation on PPM
– The “longest match” policy is not optimal for branch prediction.
- Solution
– Prediction by combining Multiple Partial Matches (PMPM)
- PMPM-based predictor designs