
PMPM: Prediction by Combining Multiple Partial Matches
Hongliang Gao, Huiyang Zhou
Computer Science Department, University of Central Florida

Prediction by Partial Matching (PPM)
• Originally proposed for data compression by Cleary and Witten; introduced to branch prediction by Chen et al.
• For branch prediction:
  – Each static branch has a set of Markov predictors from order 0 to order m.
  – The “longest match” policy: use the m immediately preceding history bits to search for a pattern in the highest-order Markov predictor; if the pattern is not found, fall back to the next lower order.
• Assumptions of the PPM algorithm
  – Longer history provides a more accurate context (true).
  – A prediction counter associated with a more accurate context will provide higher prediction accuracy (false).
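The “longest match” policy can be sketched as follows. This is a minimal illustration, not the authors' implementation: the per-order dictionaries, the signed-counter convention (counter >= 0 means taken), and the default prediction when nothing matches are all assumptions made for the example.

```python
from typing import Dict, List, Tuple

Pattern = Tuple[int, ...]


def ppm_predict(tables: List[Dict[Pattern, int]], history: List[int]) -> bool:
    """Longest-match PPM for one static branch (illustrative sketch).

    tables[k] is the order-k Markov predictor: it maps the k most recent
    branch outcomes to a signed counter (assumed: >= 0 means "taken").
    """
    m = len(tables) - 1
    # Try the highest order first, then fall back to lower orders.
    for order in range(min(m, len(history)), -1, -1):
        pattern = tuple(history[-order:]) if order else ()
        ctr = tables[order].get(pattern)
        if ctr is not None:
            return ctr >= 0
    return True  # assumed default when no order matches


if __name__ == "__main__":
    tables = [{(): -1}, {(1,): 2}, {(0, 1): -3}]
    print(ppm_predict(tables, [1, 0, 1]))  # order-2 match decides
    print(ppm_predict(tables, [1, 1, 1]))  # falls back to order 1
```

Only the single longest matching counter decides the outcome here, which is exactly the property the following slides question.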

The “longest match” policy is not optimal
• The confidence-based PPM
  – Use the longest confident (ctr ≠ 0) match.
  – Misprediction rate (MPKI) reductions vs. PPM:
[Figure: MPKI reduction over PPM at Max H = 40 for each of the 20 benchmark traces (gzip, vpr, gcc, mcf, crafty, parser, compress, jess, raytrace, db, javac, mpegaudio, mtrt, jack, eon, perlbmk, gap, vortex, bzip2, twolf) and AVG; y-axis from -1% to 4%.]
[Figure: MPKI reduction over PPM vs. maximum history length (Max H = 0 to 40); y-axis from -0.5% to 1.5%.]
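A hedged sketch of the confidence-based variant: the only change from plain PPM is that matches whose counter is zero are skipped. The list-of-counters interface and the `default` fallback are assumptions for illustration.

```python
from typing import Iterable


def confident_ppm_predict(matched_ctrs: Iterable[int], default: bool = True) -> bool:
    """Use the longest *confident* match (illustrative sketch).

    matched_ctrs: signed counters of all matching entries, ordered from the
    longest history to the shortest. A counter of 0 is treated as not confident.
    """
    for ctr in matched_ctrs:
        if ctr != 0:
            return ctr > 0
    return default  # assumed fallback when no match is confident


if __name__ == "__main__":
    # The longest match has ctr == 0, so the next one (-2) decides.
    print(confident_ppm_predict([0, -2, 3]))
```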

Introduction
• Key observation on PPM
  – The “longest match” policy is not optimal for branch prediction.
• Our contributions
  – A novel algorithm: Prediction by combining Multiple Partial Matches (PMPM).
  – A PMPM-based idealistic branch predictor.
  – A PMPM-based realistic branch predictor.

Prediction by combining Multiple Partial Matches
• Different branches favor different history lengths.
  – Using a longer history than necessary:
    • Uncorrelated history information -> noise -> useful information is distributed across more prediction counters. Example: for a specific 10-bit history, one ctr is enough; with 15-bit histories, up to 32 ctrs may be needed.
    • Long histories repeat less frequently -> the counters capture only the most recent behavior.
  – Especially harmful for “not-correlated / random-like” branches.
• Solution: combining multiple matches
  – Why: combining multiple counters includes more history behaviors.
  – How: summation -> integrates both direction AND confidence.
  – Which: several longest confident matches with non-zero prediction counters.

Prediction accuracy of PMPM
• Configuration
  – Combine the L longest confident matches.
  – Maximum global history length: 40.
  – Prediction = $\left( \sum_{i=1}^{L} \mathrm{Ctr}_i \ge 0 \right),\ \mathrm{Ctr}_i \ne 0$
• Prediction accuracy
[Figure: average MPKI (y-axis, 3.6 to 3.95) vs. L (x-axis, 1 to 41) for PMPM-L, with PPM shown for reference; annotations mark the minimum MPKI and the “combine all” point.]
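The prediction rule on this slide (sum the L longest confident counters and predict taken when the sum is non-negative) can be sketched as below. The list interface is an assumption; counters are taken to be signed, ordered from the longest matching history to the shortest.

```python
from typing import List


def pmpm_predict(matched_ctrs: List[int], L: int) -> bool:
    """PMPM-L sketch: sum the L longest confident (non-zero) counters.

    Returns True ("taken") when the sum is >= 0, as in the slide's formula.
    """
    confident = [c for c in matched_ctrs if c != 0][:L]
    return sum(confident) >= 0


if __name__ == "__main__":
    ctrs = [1, -3, -2, 0, 4]       # longest match first
    print(pmpm_predict(ctrs, 1))   # L = 1 behaves like confidence-based PPM
    print(pmpm_predict(ctrs, 3))   # weakly-taken longest match is outvoted
```

With L = 1 the rule reduces to the confidence-based longest match; larger L lets strongly biased shorter-history counters outweigh a weak longest match, because the sum carries both direction and confidence.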

The idealistic PMPM predictor
[Figure: block diagram of the idealistic PMPM predictor.]
• Inputs: GHR, path history, br tag, LHR.
• Tables: meta ctr & bias ctr; Short-GHR table (Len 1-32); Long-GHR table (geometrical lengths); LHR table (Len 1-32).
• Entry fields in the Short-GHR, Long-GHR and LHR tables: tag, LRU, ctr, ubit, bf.
• Selection: the M (=7) longest matched useful counters and the N (=16) longest matched useful counters, plus the bias ctr.
• Two summations (∑) produce the global prediction and the local prediction; together with the meta ctr they yield the final prediction.

Prediction accuracy of the idealistic PMPM predictor
[Figure: MPKI (y-axis, 0 to 14) of PPM and iPMPM for gzip, vpr, gcc, mcf, crafty, parser, compress, jess, raytrace, db, javac, mpegaudio, mtrt, jack, eon, perlbmk, gap, vortex, bzip2, twolf and AVG; chart annotations: -20.9% and -15.2%.]
• PPM: same predictor structure, but using the “longest match” prediction policy.
• Average MPKI
  – PPM: 3.330
  – PMPM: 2.824

The TAGE predictor
[Figure: block diagram of the TAGE predictor.]
• Tables: a bimodal table (index: pc) and tagged tables gtable6 (shortest) … gtable5 … gtable0 (longest) (index: pc, ghr, path; tag: pc, ghr).
• Entry fields in each gtable: tag, ctr, ubit.
• Select one matched counter (the longest match OR the 2nd longest match) -> final prediction.

The realistic PMPM predictor
[Figure: block diagram of the realistic PMPM predictor.]
• Tables: bimodal table (index: pc); ltable (index: pc, lhr; tag: pc); gtable6 (shortest) … gtable5 … gtable0 (longest) (index: pc, ghr, path; tag: pc, ghr).
• Entry fields in the ltable and each gtable: tag, ctr, ubit.
• gtable ctrs: 4 groups: (gtable6, gtable5), (gtable4, gtable3), (gtable2, gtable1), (gtable0).
• Total (max): a 2-bit bimodal ctr, a 5-bit ltable ctr, four 5-bit gtable ctrs.
• Diagram labels: “select one matched counter (the longest match OR the 2nd longest match)”, “counter selection logic”, ∑, final prediction.
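One way to read the grouping above is sketched here. The slide does not spell out the counter selection logic, so the rule used below (take at most one counter per group, preferring the table with the longer history, then add the bimodal and ltable counters) is an assumption made for illustration, as are the function name and the signed-counter encoding.

```python
from typing import Dict, Optional

# gtable indices per group; per the slide, gtable0 has the longest history.
GROUPS = [(6, 5), (4, 3), (2, 1), (0,)]


def realistic_pmpm_predict(
    bimodal_ctr: int,
    ltable_ctr: Optional[int],
    gtable_ctrs: Dict[int, int],
) -> bool:
    """Sum up to six counters: bimodal, ltable, and one per gtable group.

    gtable_ctrs maps the index of each gtable that hit to its signed counter;
    ltable_ctr is None on an ltable miss. Selection rule is hypothetical.
    """
    total = bimodal_ctr
    if ltable_ctr is not None:
        total += ltable_ctr
    for group in GROUPS:
        for idx in sorted(group):  # smaller index = longer history
            if idx in gtable_ctrs:
                total += gtable_ctrs[idx]
                break  # at most one counter per group
    return total >= 0


if __name__ == "__main__":
    print(realistic_pmpm_predict(1, -4, {6: 3, 5: -2, 0: 1}))
```

Capping the sum at one counter per group keeps the adder small, which matches the slide's bound of one 2-bit, one 5-bit ltable and four 5-bit gtable counters.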

Ahead pipelining
[Figure: pipeline timing for blocks A, B, C, D over Cycle1 to Cycle4. A 3-block ahead prediction is initiated at A; the prediction of D is available at Cycle4.]
• Cycle1: Indexes.
• Cycle2:
  1. Tags.
  2. Read 4 adjacent entries.
• Cycle3:
  1. Calculate 4 potential predictions.
  2. Use information of B and C to select out one prediction.
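The late-selection step can be sketched as follows. This is only an illustration under stated assumptions: a single table stands in for the whole predictor, the index is computed from block A alone, and one selection bit is taken from each of B and C (the slide does not say how the information of B and C is encoded).

```python
from typing import List


def read_adjacent(table: List[int], index_from_a: int) -> List[int]:
    """Read the 4 adjacent entries that contain the index computed at block A."""
    base = index_from_a & ~0b11  # align to a group of 4 entries
    return table[base:base + 4]


def ahead_predict(table: List[int], index_from_a: int, bit_b: int, bit_c: int) -> bool:
    """Compute 4 potential predictions, then pick one using bits from B and C."""
    candidates = [ctr >= 0 for ctr in read_adjacent(table, index_from_a)]
    return candidates[(bit_b << 1) | bit_c]


if __name__ == "__main__":
    table = [3, -1, 0, -5, 2, 2, -2, -2]  # length assumed to be a multiple of 4
    print(ahead_predict(table, 5, bit_b=1, bit_c=0))
```

Because the table read starts before B and C are known, only the final 4-to-1 selection depends on them, which is what lets the prediction for D be started three blocks early.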

Compared to the TAGE predictors
• Configuration
  – 32kB, same global history series (5 - 131), similar structures.
  – Compared to the TAGE predictor:
    • PMPM-G (GH only): 2-bit larger ctrs, 2-bit smaller tags.
  – Compared to the PMPM-G predictor:
    • PMPM-GL (GH and LH): one ltable, smaller bimodal table, smaller tags for 3 gtables.
• Average MPKI:
  – TAGE: 3.666
  – PMPM-G: 3.597 (higher aliasing, gcc +7.3%)
  – PMPM-GL: 3.441

The realistic PMPM predictor for CBP2
• Submitted configuration
  – Save some storage for miscellaneous registers, counters, etc.
  – Empirically tuned inputs, tag widths, etc.
• Several optimizations
  – Shared hysteresis bits in the bimodal table (proposed in the EV8 predictor).
  – Detect traces with high branch footprints and reset ubits periodically (borrowed from the TAGE predictor).
  – Limit ubit updates if all predictions from the gtables are the same.

The realistic PMPM predictor for CBP2 - accuracy
• Observations:
  – High accuracy: 3.416 MPKI
  – The local history is still important for some benchmarks (e.g., raytrace, mtrt and vortex) although we already use a very long (203) global history.

Trace      CBP2-GL  CBP2-G    Trace      CBP2-GL  CBP2-G
gzip         9.712  10.346    vpr          8.945   9.063
gcc          3.690   3.637    mcf         10.092  10.033
crafty       2.581   2.565    parser       5.215   5.244
compress     5.537   5.819    jess         0.393   0.433
raytrace     0.542   0.963    db           2.319   2.380
javac        1.107   1.159    mpegaudio    1.102   1.159
mtrt         0.657   1.009    jack         0.688   0.763
eon          0.276   0.359    perlbmk      0.314   0.484
gap          1.431   1.745    vortex       0.137   0.331
bzip2        0.037   0.042    twolf       13.551  13.616

Average MPKI: PMPM-CBP2-GL: 3.416; PMPM-CBP2-G: 3.557
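As a quick cross-check of the table, the reported averages and the per-trace effect of adding local history can be recomputed from the listed values. The numbers are copied from the slide; the function names are illustrative.

```python
# (CBP2-GL, CBP2-G) MPKI per trace, copied from the slide's table.
MPKI = {
    "gzip": (9.712, 10.346), "vpr": (8.945, 9.063),
    "gcc": (3.690, 3.637), "mcf": (10.092, 10.033),
    "crafty": (2.581, 2.565), "parser": (5.215, 5.244),
    "compress": (5.537, 5.819), "jess": (0.393, 0.433),
    "raytrace": (0.542, 0.963), "db": (2.319, 2.380),
    "javac": (1.107, 1.159), "mpegaudio": (1.102, 1.159),
    "mtrt": (0.657, 1.009), "jack": (0.688, 0.763),
    "eon": (0.276, 0.359), "perlbmk": (0.314, 0.484),
    "gap": (1.431, 1.745), "vortex": (0.137, 0.331),
    "bzip2": (0.037, 0.042), "twolf": (13.551, 13.616),
}


def averages(table):
    """Arithmetic-mean MPKI for the GL and G configurations."""
    n = len(table)
    avg_gl = sum(gl for gl, _ in table.values()) / n
    avg_g = sum(g for _, g in table.values()) / n
    return avg_gl, avg_g


def local_history_gain(table):
    """MPKI saved by adding local history (positive = GL is better)."""
    return {trace: g - gl for trace, (gl, g) in table.items()}


if __name__ == "__main__":
    avg_gl, avg_g = averages(MPKI)
    print(f"GL: {avg_gl:.3f}  G: {avg_g:.3f}")
    gain = local_history_gain(MPKI)
    print("GL worse than G on:", sorted(t for t, d in gain.items() if d < 0))
```

The recomputed means agree with the slide's averages to within rounding, and the table also shows that local history slightly hurts a few traces (gcc, crafty, mcf) while helping the rest.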

The realistic PMPM predictor for CBP2 - ahead pipelining
[Figure: average MPKI (y-axis, 3.2 to 3.8) of PMPM-CBP2-GL and PMPM-CBP2-G for 1-block, 2-block, 3-block and 4-block ahead pipelining; chart annotations: “0.11 MPKI, 3.0%” and “0.15 MPKI, 4.4%”.]

Summary
• Key observation on PPM
  – The “longest match” policy is not optimal for branch prediction.
• Solution
  – Prediction by combining Multiple Partial Matches (PMPM)
• PMPM-based predictor designs
  – Idealistic predictor: 2.824 MPKI.
  – Realistic predictor: 3.416 MPKI.

Thank you and Questions?
