SLIDE 1

Carnegie Mellon

School of Computer Science


Architecture

Data Speculation

Adam Wierman Daniel Neill

Lipasti and Shen. Exceeding the dataflow limit, 1996. Sodani and Sohi. Understanding the differences between value prediction and instruction reuse, 1998.

SLIDE 2

A Taxonomy of Speculation

Speculative Execution
  – Control Speculation: Branch Direction, Branch Target
  – Data Speculation: Data Location, Data Value

Question: What makes speculation possible? What can we speculate on?

SLIDE 3

Value Locality

Question: Where does value locality occur? That is, how often does the same value result from the same instruction twice in a row?

Single-cycle Arithmetic (e.g. addq $1 $2): Somewhat
Single-cycle Logical (e.g. bis $1 $2): Yes
Multi-cycle Arithmetic (e.g. mulq $1 $2): No
Register Move (e.g. cmov $1 $2): Yes
Integer Load (e.g. ldq $1 8($2)): Yes
Store with base register update: No
FP Load: Yes
FP Multiply: Somewhat
FP Add: Somewhat
FP Move: Yes
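Value locality can be made concrete as the fraction of dynamic instances of a static instruction whose result matches the result of the previous instance at the same PC (history depth 1). A minimal sketch, assuming a hypothetical trace of (PC, result) pairs:

```python
def value_locality(trace):
    """trace: iterable of (pc, result) pairs for dynamic instructions.
    Returns the fraction of instances whose result matches the previous
    result produced at the same PC (history depth 1)."""
    last = {}            # pc -> last result seen at that PC
    hits = total = 0
    for pc, result in trace:
        if pc in last:
            total += 1
            if last[pc] == result:
                hits += 1
        last[pc] = result
    return hits / total if total else 0.0

# A reloaded constant (PC 0x40) shows locality; a loop counter (PC 0x44) does not:
trace = [(0x40, 7), (0x40, 7), (0x40, 7), (0x44, 1), (0x44, 2), (0x44, 3)]
```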

SLIDE 4

Value Locality

Question: Why is speculation useful?

addq $1 $2 $3
addq $3 $1 $4
addq $3 $2 $5

The second and third instructions depend on the first's result in $3, so they would normally serialize. Speculating on $3 lets all three run in parallel on a superscalar machine.

SLIDE 5

Exploiting Value Locality

Value Prediction (VP): "predict the results of instructions based on previously seen results"

Instruction Reuse (IR): "recognize that a computation chain has been previously performed and therefore need not be performed again"

SLIDE 6

Exploiting Value Locality

Value Prediction (VP): Fetch -> Decode -> Issue -> Execute -> Commit, with the value predicted at fetch and verified before commit; if mispredicted, dependent instructions reissue.

Instruction Reuse (IR): Fetch -> Decode -> Issue -> Execute -> Commit, with a check for previous use at fetch and a check that the arguments are the same at decode; if reused, the execute stage is skipped.

SLIDE 7

Value Prediction

(Lipasti & Shen, 1996)

SLIDE 8

Value Prediction

  • Speculative prediction of register values
    – Values predicted during fetch and dispatch, forwarded to dependent instructions.
    – Dependent instructions can be issued and executed immediately.
    – Before committing a dependent instruction, we must verify the predictions. If wrong: dependent instructions must restart with the correct values.

Pipeline diagram: Fetch -> Decode -> Issue -> Execute -> Commit, with Predict Value at fetch and Verify before commit; if mispredicted, reissue.

SLIDE 9

Overview

Diagram: the PC indexes two structures in parallel: the Classification Table (CT), which tracks prediction history and answers "Should I predict?", and the Value Prediction Table (VPT), which tracks value history and supplies the predicted value.

SLIDE 10

How to predict values?

Value Prediction Table (VPT)

  – Cache indexed by instruction address (PC)
  – Each entry maps to one or more 64-bit values
  – Values replaced (LRU) when an instruction is first encountered or when a prediction is incorrect
  – 32 KB cache: 4K 8-byte entries
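The VPT described above can be sketched as a direct-mapped cache. This is a simplification (the slide allows multiple values per entry with LRU replacement, and the index function here is a hypothetical choice), not the paper's exact design:

```python
MASK64 = (1 << 64) - 1  # results are 64-bit values

class VPT:
    """Sketch of a value prediction table: direct-mapped, indexed by PC,
    one 64-bit value per entry (the real table may hold several)."""
    def __init__(self, entries=4096):
        self.entries = entries
        self.table = [None] * entries        # each slot: (pc, value) or None

    def _index(self, pc):
        return (pc >> 2) % self.entries      # word-aligned PCs assumed

    def predict(self, pc):
        slot = self.table[self._index(pc)]
        if slot is not None and slot[0] == pc:
            return slot[1]
        return None                          # no prediction available

    def update(self, pc, value):
        # Install on first encounter or after an incorrect prediction.
        self.table[self._index(pc)] = (pc, value & MASK64)
```

A caller would consult predict() during fetch/dispatch and call update() at writeback.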

SLIDE 11

Estimating prediction accuracy

Classification Table (CT)

  – Cache indexed by instruction address (PC)
  – Each entry maps to a 2-bit saturating counter, incremented when the prediction is correct and decremented when it is wrong:
      0,1 = don't use prediction
      2 = use prediction
      3 = use prediction, and don't replace the VPT value if wrong
  – 1K entries are sufficient
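The 2-bit policy above can be sketched directly (table sizing and indexing are simplified: direct-mapped, no tags):

```python
class CT:
    """Sketch of the classification table: one 2-bit saturating counter
    per entry, trained on prediction outcomes."""
    def __init__(self, entries=1024):
        self.entries = entries
        self.ctr = [0] * entries

    def _index(self, pc):
        return (pc >> 2) % self.entries

    def use_prediction(self, pc):
        return self.ctr[self._index(pc)] >= 2    # states 2 and 3 predict

    def keep_value_on_miss(self, pc):
        return self.ctr[self._index(pc)] == 3    # state 3: keep the VPT value even if wrong

    def train(self, pc, correct):
        # Saturating increment on a correct prediction, decrement on a wrong one.
        i = self._index(pc)
        if correct:
            self.ctr[i] = min(3, self.ctr[i] + 1)
        else:
            self.ctr[i] = max(0, self.ctr[i] - 1)
```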

SLIDE 12

Verifying predictions

  • The predicted instruction executes normally.
  • A dependent instruction cannot commit until the predicted instruction has finished executing.
  • The computed result is compared to the prediction; if they match, dependent instructions can commit.
  • If not, dependent instructions must reissue and execute with the computed value. Miss penalty = 1 cycle later than no prediction.
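The verify step can be sketched as follows. The dependent-tracking and the reissue callback are hypothetical stand-ins for what the issue/commit logic would do in hardware:

```python
def verify(predicted, computed, dependents, reissue):
    """Compare the computed result to the prediction. On a match the
    dependents may commit; on a mismatch each dependent is reissued
    with the correct value (the 1-cycle miss penalty noted above)."""
    if predicted == computed:
        return True                # dependents may commit
    for insn in dependents:
        reissue(insn, computed)    # re-execute with the correct value
    return False
```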

SLIDE 13

Results

  • A realistic configuration on a simulated (current and near-future) PowerPC gave 4.5-6.8% speedups.
    – 3-4x more speedup than devoting the extra space to cache.
  • Speedups vary between benchmarks (grep: 60%).
  • Potential speedups of up to 70% for idealized configurations.
    – Can exceed the dataflow limit (on an idealized machine).

SLIDE 14

Instruction Reuse

(Sodani & Sohi, 1998)

SLIDE 15

Instruction Reuse

  • Obtain the results of instructions from their previous executions.
    – If the previous results are still valid, don't execute the instruction again; just commit the results!
  • Non-speculative, early verification:
    – Previous results are read in parallel with fetch.
    – The reuse test runs in parallel with decode.
    – Execute only if the reuse test fails.

SLIDE 16

How to reuse instructions?

  • Reuse buffer
    – Cache indexed by instruction address (PC)
    – Stores the result of the instruction along with the information needed to establish reusability:
        operand register names
        pointer chain of dependent instructions
    – Assume 4K entries (each entry takes 4x as much space as a VPT entry, so comparable to a 16K-entry VPT)
    – 4-way set-associative
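A sketch of the reuse-buffer lookup, using a simplified value-based reuse test; the paper's scheme tracks operand register names and dependence chains instead, and the associativity above is omitted here:

```python
class ReuseBuffer:
    """Sketch: maps PC -> (operand values at last execution, result).
    try_reuse() is the reuse test; record() fills the buffer after a
    normal execution."""
    def __init__(self):
        self.table = {}

    def try_reuse(self, pc, operands):
        entry = self.table.get(pc)
        if entry is not None and entry[0] == tuple(operands):
            return entry[1]        # reusable: commit this result, skip execute
        return None                # reuse test failed: execute normally

    def record(self, pc, operands, result):
        self.table[pc] = (tuple(operands), result)
```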

SLIDE 17

Reuse Scheme

  • Dependent chain of results (each entry points to the previous instruction in its chain)
    – An entry is reusable only if the entries on which it depends have been reused (can't reuse out of order).
    – Start of chain: reusable if its "valid" bit is set; invalidated when its operand registers are overwritten.
    – Special handling of loads and stores.
  • An instruction will not be reused if:
    – its inputs are not ready for the reuse test (decode stage), or
    – its operand registers differ.

SLIDE 18

Results

  • Attempts to evaluate "realistic" and "comparable" schemes for VP and IR on a simulated MIPS architecture.
    – Are these really realistic? They assume an oracle or a parallel reuse test.
  • Net performance: VP is better on some benchmarks, IR on others; speedups are typically 5-10%.
  • More interesting question: can the two schemes be combined?
  • Claim: 84-97% of redundant instructions are reusable.

SLIDE 19

Comparing VP and IR

Value Prediction (VP): "predict the results of instructions based on previously seen results"

Instruction Reuse (IR): "recognize that a computation chain has been previously performed and therefore need not be performed again"

SLIDE 20

Comparing VP and IR

Which captures more redundancy?

IR can't predict when:
1. Inputs aren't ready
2. The same result follows from different inputs
3. VP makes a lucky guess
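Case 2 can be seen in a toy simulation: an instruction whose result saturates (here min(x, 10)) keeps producing the same value from different inputs, so a last-value predictor hits while a reuse test that requires identical inputs misses. This is illustrative only, and assumes a value-based reuse test:

```python
def simulate(inputs):
    """Count hits for a last-value predictor (VP) vs. a simplified,
    input-matching reuse test (IR) on the instruction r = min(x, 10)."""
    vp_hits = ir_hits = 0
    last_result = None
    last_input = None
    for x in inputs:
        r = min(x, 10)
        if last_result == r:
            vp_hits += 1     # VP: same result as the previous instance
        if last_input == x:
            ir_hits += 1     # IR: same input as the previous instance
        last_result, last_input = r, x
    return vp_hits, ir_hits

# Distinct inputs all >= 10: the result is always 10.
# simulate([11, 12, 13, 14]) -> (3, 0)
```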

SLIDE 21

Comparing VP and IR

Which handles misprediction better?

IR is non-speculative, so it never mispredicts.

SLIDE 22

Comparing VP and IR

Which integrates better with branches?

IR:
1. Mispredicted branches are detected earlier.
2. Instructions from mispredicted branch paths can be reused.

VP:
1. Causes more misprediction.

SLIDE 23

Comparing VP and IR

Which is better under resource contention?

IR might not even need to execute the instruction.

SLIDE 24

Comparing VP and IR

Which is better for execution latency?

VP causes some instructions to be executed twice (when values are mispredicted); IR executes each instruction once or not at all.

SLIDE 25

Possible class project: can we get the best of both techniques?
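One illustrative way to combine the two schemes (a sketch of the idea only, not a proposal from either paper): try the non-speculative reuse test first, and fall back to value prediction when it fails, so reuse supplies guaranteed results and prediction covers the remaining redundancy. The tables here are plain dicts standing in for the hardware structures:

```python
def lookup(pc, operands, reuse_table, last_value):
    """Hypothetical combined front end.
    reuse_table: pc -> (operands at last execution, result)
    last_value:  pc -> last result (a last-value predictor)
    Returns an action tag plus the value to use, if any."""
    entry = reuse_table.get(pc)
    if entry is not None and entry[0] == operands:
        return ("reuse", entry[1])            # non-speculative: commit, skip execute
    if pc in last_value:
        return ("predict", last_value[pc])    # speculative: must verify before commit
    return ("execute", None)                  # no help: execute normally
```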

SLIDE 27

Notes

  • Value prediction can handle these cases, and thus captures more redundancy.
  • But IR has several advantages…
    – Skips the execute phase when reusing an instruction.
    – Early, non-speculative test; never "mispredicts".