
Lecture 9: Branch Prediction I



  1. Lecture 9: Branch Prediction I
     Prediction analysis, 1-bit predictor, 2-bit predictor, branch history table, branch target buffer

     Reducing Branch Penalty
     Branch penalty in dynamically scheduled processors: wasted cycles due to pipeline flushing on mis-predicted branches.
     Reduce branch penalty:
       1. Predict branch/jump instructions AND branch direction (taken or not taken)
       2. Predict branch/jump target address (for taken branches)
       3. Speculatively execute instructions along the predicted path

     Prediction and Prediction Output
     Prediction is made for EVERY instruction.
       - The only ACCURATE input is the current PC
       - If pre-decoded, inst type is available
       - Prediction is made on ALL types of instructions
     Prediction output is the next PC value (which is either current PC + 4 or a branch target).
     Three guesses are made: (1) if the next inst is a branch/jump at all; (2) if the “branch” would be taken; (3) what is the target PC of the “taken branch”.
     [Figure: PC, predictors, IM, pred_PC, INST, and feedback from the “execution” part]

     Mis-Prediction Cases
     For predicted taken branches (fetch_pc != pc + 4), mis-predicted if the inst
       - is not a branch/jump instruction; or
       - target address was predicted wrong; or
       - is a branch but not taken
     For predicted not taken branches (fetch_pc == pc + 4), mis-predicted if the inst
       - is a jump instruction; or
       - is a branch instruction, AND the branch is taken

     Branch (direction) Prediction
     Predict branch direction: taken or not taken (T/NT)
           BNE R1, R2, L1     (taken: go to L1; not taken: fall through)
           …
       L1: …
     Static prediction: compilers decide the direction
     Dynamic prediction: hardware decides the direction using dynamic information
       1. 1-bit Branch-Prediction Buffer
       2. 2-bit Branch-Prediction Buffer
       3. Correlating Branch Prediction Buffer
       4. Tournament Branch Predictor
       5. and more …

     Mis-prediction Detections and Feedbacks
     Detections:
       - At commit (most cases)
       - At the end of decoding
           - The inst must be non-speculative
     Feedbacks:
       - From commit stage
       - From decoding
       - Or from WB if speculative feedback is allowed
     [Figure: pipeline stages FETCH, RENAME, SCHEDULE, REG, EXE, WB, COMMIT, with the predictors alongside FETCH]
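
The mis-prediction cases listed on this page can be read as a small decision procedure. The following is a minimal C sketch of that reading, not the lecture's own implementation; the type `resolved_inst_t`, the enum `inst_kind_t`, and the function `is_mispredicted` are hypothetical names introduced only for illustration, and a 4-byte instruction size is assumed (as in the slide's `pc + 4`).

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical instruction classes, for illustration only. */
typedef enum { INST_OTHER, INST_BRANCH, INST_JUMP } inst_kind_t;

/* What is known once the instruction has been resolved (assumed layout). */
typedef struct {
    uint32_t    pc;        /* address of the instruction                 */
    uint32_t    fetch_pc;  /* next PC chosen by the predictor            */
    inst_kind_t kind;      /* actual instruction type                    */
    bool        taken;     /* actual direction (branches only)           */
    uint32_t    target;    /* actual target of a taken branch or a jump  */
} resolved_inst_t;

/* Returns true when the prediction made at fetch was wrong,
 * following the two case lists on the slide. */
bool is_mispredicted(const resolved_inst_t *r)
{
    bool predicted_taken = (r->fetch_pc != r->pc + 4);

    if (predicted_taken) {
        if (r->kind == INST_OTHER)
            return true;                 /* not a branch/jump at all    */
        if (r->kind == INST_BRANCH && !r->taken)
            return true;                 /* a branch, but not taken     */
        return r->fetch_pc != r->target; /* wrong target address        */
    }

    /* Predicted not taken: fetch_pc == pc + 4. */
    if (r->kind == INST_JUMP)
        return true;                     /* jumps always redirect       */
    return r->kind == INST_BRANCH && r->taken;
}
```

In this sketch the check only needs the actual type, direction, and target, which is why detection has to wait until decode (for the type) or a later stage (for direction and target).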

  2. Predictor for a Single Branch
     General form:
       1. Access (address: PC)
       2. Predict (state; output T/NT)
       3. Feedback (T/NT)
     1-bit prediction: state 1 predicts taken, state 0 predicts not taken; a taken (T) outcome moves to state 1, a not-taken (NT) outcome moves to state 0.

     Branch History Table of 1-bit Predictor
       - BHT is also called Branch Prediction Buffer in the textbook
       - Can use only one 1-bit predictor, but accuracy is low
       - BHT: use a table of simple predictors, indexed by bits from PC (a K-bit branch address selects one of 2^k predictors)
       - Similar to a direct mapped cache
       - More entries, more cost, but fewer conflicts, higher accuracy
       - BHT can contain complex predictors
     [Figure: K-bit branch address indexing a table of 2^k 1-bit predictors, with Prediction output and Feedback input]

     1-bit BHT Weakness
     Example: in a loop, 1-bit BHT will cause 2 mispredictions.
     Consider a loop of 9 iterations before exit:
       for (…){
         for (i=0; i<9; i++)
           a[i] = a[i] * 2.0;
       }
       - End of loop case, when it exits instead of looping as before
       - First time through loop on next time through code, when it predicts exit instead of looping
       - Only 80% accuracy even if the loop branch is taken 90% of the time

     2-bit Saturating Counter
     Solution: a 2-bit scheme that changes the prediction only after two mispredictions (Figure 3.7, p. 249).
       - States 11 and 10: predict taken
       - States 01 and 00: predict not taken
       - Transitions are driven by the T/NT outcome
     Figure colors: blue = stop, not taken; gray = go, taken.
     Adds hysteresis to the decision making process.

     Correlating Branch Predictor
     Idea: taken/not taken of recently executed branches is related to the behavior of the next branch (as well as the history of that branch's behavior).
       - Then the behavior of recent branches selects between, say, 2 predictions of the next branch, updating just that prediction
       - (1,1) predictor: 1-bit global, 1-bit local
     [Figure: branch address (4 bits) selecting among 1-bit-per-branch local predictors; a 1-bit global branch history (0 = not taken) selects the Prediction]

     Correlating Branches
     Code example showing the potential:
       if (d==0)
         d=1;
       if (d==1)
         …
     Assembly code:
           BNEZ   R1, L1
           DADDIU R1,R0,#1
       L1: DADDIU R3,R1,#-1
           BNEZ   R3, L2
           …
       L2: …
     Observation: if BNEZ1 is not taken, then BNEZ2 is not taken (d is set to 1, so R3 = 0).
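
The 1-bit weakness and the 2-bit fix on this page can be checked with a short simulation. The sketch below is illustrative only and makes several assumptions that are not in the slides: the loop branch produces 9 taken outcomes followed by one not-taken outcome per pass (matching "loop 90% of the time"), both predictors start out predicting taken, and the 2-bit scheme is a plain saturating counter (states 0 to 3, predict taken at 2 or 3). All function names are hypothetical.

```c
#include <stdbool.h>

/* 1-bit predictor: the state is simply the last outcome (1 = taken). */
static bool predict_1bit(int state) { return state == 1; }
static int  update_1bit(bool taken) { return taken ? 1 : 0; }

/* 2-bit saturating counter: states 0..3, predict taken for 2 and 3. */
static bool predict_2bit(int state) { return state >= 2; }
static int  update_2bit(int state, bool taken)
{
    if (taken)
        return state < 3 ? state + 1 : 3;
    return state > 0 ? state - 1 : 0;
}

/* Count mispredictions of one loop branch over `passes` executions of
 * the loop. Each pass produces `taken_per_pass` taken outcomes followed
 * by a single not-taken outcome (the loop exit).
 * `bits` selects the scheme: 1 for the 1-bit predictor, else 2-bit. */
int count_mispredictions(int bits, int passes, int taken_per_pass)
{
    int state  = (bits == 1) ? 1 : 3;   /* assumed: start predicting taken */
    int misses = 0;

    for (int p = 0; p < passes; p++) {
        for (int i = 0; i <= taken_per_pass; i++) {
            bool taken = (i < taken_per_pass);
            bool pred  = (bits == 1) ? predict_1bit(state)
                                     : predict_2bit(state);
            if (pred != taken)
                misses++;
            state = (bits == 1) ? update_1bit(taken)
                                : update_2bit(state, taken);
        }
    }
    return misses;
}
```

Under these assumptions, `count_mispredictions(1, 10, 9)` returns 19 and `count_mispredictions(2, 10, 9)` returns 10 out of 100 branch executions: after the first pass the 1-bit predictor misses twice per pass (80% accuracy), while the 2-bit counter misses only on the loop exit.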

  3. Correlating Branch Predictor
     General form: (m, n) predictor
       - m bits for global history, n bits for local history
       - Records correlation between m+1 branches
       - Simple implementation: global history can be stored in a shift register
       - Example: (2,2) predictor, 2-bit global, 2-bit local
     [Figure: branch address (4 bits) selecting among 2-bit-per-branch local predictors; a 2-bit global branch history (01 = not taken then taken) selects the Prediction]

     Accuracy of Different Schemes (Figure 3.15, p. 206)
     [Figure: bar chart of Frequency of Mispredictions (0% to 20%) for nasa7, matrix300, tomcatv, doducd, spice, fpppp, gcc, espresso, eqntott, li. Series: 4,096 entries, 2 bits per entry; unlimited entries, 2 bits per entry; 1,024 entries (2,2).]

     Branch Target Buffer
     Branch Target Buffer (BTB): the address of the branch is used as an index to get the prediction AND the branch address (if taken).
       - Note: must check for branch match now, since can’t use wrong branch address
     Example: BTB combined with BHT
     [Figure: the PC of the instruction in FETCH is compared (=?) against the Branch PC field of each entry, which also holds a Predicted PC and extra prediction state bits.
       Yes: instruction is a branch; use the predicted PC as the next PC.
       No: branch not predicted; proceed normally (Next PC = PC+4).]

     Estimate Branch Penalty
     EX: BHT correct rate is 95%, BTB hit rate is 95%.
     Average miss penalty is 6 cycles.
     How much is the branch penalty?
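
The (2,2) predictor described on this page can be sketched as a table indexed by branch-address bits concatenated with a global history shift register. This is a minimal illustration under stated assumptions, not the lecture's implementation: the names `corr_predictor_t`, `corr_predict`, and `corr_update` are hypothetical, instructions are assumed to be 4 bytes (so the low two PC bits are dropped), and the n = 2 bits are assumed to be saturating counters.

```c
#include <stdbool.h>
#include <stdint.h>

#define ADDR_BITS 4                       /* branch address bits, as in the figure */
#define HIST_BITS 2                       /* m = 2 bits of global history          */
#define ENTRIES   (1u << (ADDR_BITS + HIST_BITS))

typedef struct {
    uint8_t counter[ENTRIES];  /* n = 2-bit counters, values 0..3            */
    uint8_t ghr;               /* global history shift register,
                                  newest outcome in bit 0 (1 = taken)        */
} corr_predictor_t;

/* Select one counter: address bits pick the row, history picks the column. */
static unsigned corr_index(const corr_predictor_t *p, uint32_t pc)
{
    unsigned addr = (pc >> 2) & ((1u << ADDR_BITS) - 1);
    return (addr << HIST_BITS) | p->ghr;
}

/* Predict taken when the selected counter is in its upper half. */
bool corr_predict(const corr_predictor_t *p, uint32_t pc)
{
    return p->counter[corr_index(p, pc)] >= 2;
}

/* Update only the counter that made the prediction, then shift the
 * actual outcome into the global history. */
void corr_update(corr_predictor_t *p, uint32_t pc, bool taken)
{
    uint8_t *c = &p->counter[corr_index(p, pc)];
    if (taken) {
        if (*c < 3) (*c)++;
    } else {
        if (*c > 0) (*c)--;
    }
    p->ghr = (uint8_t)(((p->ghr << 1) | (taken ? 1u : 0u))
                       & ((1u << HIST_BITS) - 1));
}
```

A zero-initialized `corr_predictor_t` starts with every counter at "strongly not taken" and an all-not-taken history. With the newest outcome in bit 0, a history value of binary 01 means "not taken, then taken", which matches the figure's annotation.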
