Superscalar Design: An Introduction

  1. Superscalar Design: An Introduction
     Virendra Singh, Associate Professor
     Computer Architecture and Dependable Systems Lab (CADSL)
     Department of Electrical Engineering, Indian Institute of Technology Bombay
     http://www.ee.iitb.ac.in/~viren/
     E-mail: viren@ee.iitb.ac.in
     EE-739: Processor Design, Lecture 24 (12 March 2013)

  2. Superscalar Pipeline Stages
     • In program order: Fetch -> Instruction Buffer -> Decode -> Dispatch Buffer -> Dispatch
     • Out of order: Issuing Buffer -> Execute -> Completion Buffer
     • In program order: Complete -> Store Buffer -> Retire
     14 Mar 2013 EE-739@IITB CADSL

  3. Superscalar Architecture
     • Wide pipelines to exploit ILP
     • ILP is not necessarily exploited by widening the pipelines and adding more resources
     • Processor policies for fetching, decoding, and executing instructions have a significant effect on the processor's ability to discover instructions that can be executed concurrently
     • Instruction issue policy limits or enhances performance because it determines the processor's lookahead capability
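
The effect of issue policy on lookahead can be illustrated with a toy model. The sketch below is not from the lecture: the `Instr` type, the function names, the unit latency, and the assumption that only true (RAW) dependences matter are all hypothetical simplifications chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Instr:
    dest: str    # register written (hypothetical encoding)
    srcs: tuple  # registers read


def true_dependences(program):
    """For each instruction, indices of the earlier instructions it reads from (RAW)."""
    last_writer = {}
    deps = []
    for i, ins in enumerate(program):
        deps.append({last_writer[s] for s in ins.srcs if s in last_writer})
        last_writer[ins.dest] = i
    return deps


def cycles_to_issue(program, width, lookahead, in_order):
    """Cycles needed to issue every instruction, assuming unit latency.

    Each cycle the issue logic examines the first `lookahead` un-issued
    instructions and issues up to `width` ready ones. With in_order=True
    it stops at the first instruction that cannot issue.
    """
    deps = true_dependences(program)
    done_at = {}  # index -> first cycle in which the result is usable
    pending = list(range(len(program)))
    cycle = 0
    while pending:
        cycle += 1
        issued = []
        for i in pending[:lookahead]:
            ready = all(d in done_at and done_at[d] <= cycle for d in deps[i])
            if ready and len(issued) < width:
                issued.append(i)
            elif in_order:
                break
        for i in issued:
            done_at[i] = cycle + 1
            pending.remove(i)
    return cycle


PROGRAM = [
    Instr("r1", ("r0",)),  # i0
    Instr("r2", ("r1",)),  # i1 depends on i0
    Instr("r3", ("r0",)),  # i2 independent of i0 and i1
    Instr("r4", ("r3",)),  # i3 depends on i2
]

print(cycles_to_issue(PROGRAM, width=2, lookahead=4, in_order=True))
print(cycles_to_issue(PROGRAM, width=2, lookahead=4, in_order=False))
```

With the same width of 2, the in-order policy needs 3 cycles for this program while the out-of-order policy needs 2, because looking past the stalled i1 lets i2 issue alongside i0. Widening the pipeline alone would not have helped the in-order case.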

  4. Issues in Decoding
     • Primary tasks
       – Identify individual instructions (!)
       – Determine instruction types
       – Determine dependences between instructions
     • Two important factors
       – Instruction set architecture
       – Pipeline width
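
As a rough illustration of the "determine dependences" task and why pipeline width matters, the sketch below classifies register dependences between every pair of instructions in one decode group. The `(dest, srcs)` tuple encoding and the function names are assumptions made for this example, not something the slides define.

```python
def pair_dependences(earlier, later):
    """Register dependences that `later` has on `earlier`.

    Each instruction is a hypothetical (dest, srcs) tuple.
    """
    e_dest, e_srcs = earlier
    l_dest, l_srcs = later
    kinds = set()
    if e_dest in l_srcs:
        kinds.add("RAW")  # later reads what earlier writes
    if l_dest in e_srcs:
        kinds.add("WAR")  # later overwrites what earlier reads
    if l_dest == e_dest:
        kinds.add("WAW")  # both write the same register
    return kinds


def group_dependences(group):
    """Check every ordered pair in a decode group: n*(n-1)/2 comparisons."""
    found = {}
    for j in range(len(group)):
        for i in range(j):
            kinds = pair_dependences(group[i], group[j])
            if kinds:
                found[(i, j)] = kinds
    return found


GROUP = [
    ("r1", ("r2", "r3")),  # i0
    ("r4", ("r1", "r5")),  # i1 reads r1 written by i0
    ("r2", ("r6",)),       # i2 overwrites r2 read by i0
    ("r1", ("r7",)),       # i3 rewrites r1
]

print(group_dependences(GROUP))
```

A group of n instructions needs n(n-1)/2 pairwise checks, so the comparison logic grows quadratically with decode width; this is one way pipeline width complicates decoding.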

  5. Pentium Pro Fetch/Decode

  6. Predecoding in the AMD K5

  7. Instruction Dispatching
     • Diversified pipeline
     • Instructions of different types are executed by different functional units (FUs) in different pipelines
     • Distributed control
     • Operands are fetched from the register file (RF)
     • Operands may not be available
     • Reservation station
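
The role of the reservation station when operands are not yet available can be sketched as follows. This is a minimal model under stated assumptions: the class, its method names, the `pending` map from register to producer tag, and the tag strings are all hypothetical and do not describe any particular processor.

```python
class ReservationStation:
    """Toy reservation station: holds instructions until operands arrive."""

    def __init__(self):
        self.entries = []

    def dispatch(self, op, srcs, regfile, pending):
        """Read operands from the register file at dispatch time.

        regfile: register -> value
        pending: register -> tag of the in-flight instruction that will write it
        """
        operands = []
        for reg in srcs:
            if reg in pending:
                # Value not available yet: remember which tag to wait for.
                operands.append({"tag": pending[reg], "value": None})
            else:
                operands.append({"tag": None, "value": regfile[reg]})
        self.entries.append({"op": op, "operands": operands})

    def broadcast(self, tag, value):
        """A functional unit finished: wake up operands waiting on `tag`."""
        for entry in self.entries:
            for operand in entry["operands"]:
                if operand["tag"] == tag:
                    operand["tag"], operand["value"] = None, value

    def issue_ready(self):
        """Remove and return entries whose operands are all available."""
        ready, waiting = [], []
        for entry in self.entries:
            if all(o["tag"] is None for o in entry["operands"]):
                ready.append(entry)
            else:
                waiting.append(entry)
        self.entries = waiting
        return [(e["op"], [o["value"] for o in e["operands"]]) for e in ready]


rs = ReservationStation()
regfile = {"r1": 10, "r2": 20}
pending = {"r3": "T7"}  # r3 is still being computed by the instruction tagged T7

rs.dispatch("add", ["r1", "r2"], regfile, pending)
rs.dispatch("mul", ["r3", "r1"], regfile, pending)

first = rs.issue_ready()   # only the add has both operands
rs.broadcast("T7", 5)      # the producer of r3 completes
second = rs.issue_ready()  # now the mul can issue
print(first, second)
```

The `add` issues immediately, while the `mul` waits in the station until the tag it is watching is broadcast. This decouples dispatch from operand availability, which is what allows later independent instructions to proceed.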

  8. Instruction Dispatch and Issue
     • Parallel pipeline
       – Centralized instruction fetch
       – Centralized instruction decode
     • Diversified pipeline
       – Distributed instruction execution

  9. Necessity of Instruction Dispatch

  10. Centralized Reservation Station

  11. Distributed Reservation Station

  12. Issues in Instruction Execution
      • Current trends
        – More parallelism -> bypassing very challenging
        – Deeper pipelines
        – More diversity
      • Functional unit types
        – Integer
        – Floating point
        – Load/store -> most difficult to make parallel
        – Branch
        – Specialized units (media)
      • Very wide datapaths (256 bits/register or more)

  13. Bypass Networks
      [Figure: processor block diagram with I-Cache, PC, fetch queue, branch scan/predict, decode, reorder buffer, issue queues, execution units (FX1, FX2, LD1, LD2, FP1, FP2, CR, BR), store queue (StQ), and D-Cache]
      • O(n^2) interconnect from/to FU inputs and outputs
      • Associative tag-match to find operands
      • Solutions (hurt IPC, help cycle time)
        – Use RF only (IBM Power4) with no bypass network
        – Decompose into clusters (Alpha 21264)
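
The O(n^2) growth and the benefit of clustering can be made concrete with a small count. The sketch below assumes, purely for illustration, that every functional unit has one result output and two operand inputs, and that a clustered design forwards only within a cluster; inter-cluster forwarding is ignored. The function names are hypothetical.

```python
def full_bypass_paths(num_fus, inputs_per_fu=2):
    """Forwarding paths when every FU output can feed every FU input."""
    return num_fus * num_fus * inputs_per_fu


def clustered_bypass_paths(num_fus, clusters, inputs_per_fu=2):
    """Forwarding paths when full bypass exists only inside each cluster.

    Simplifying assumption: paths between clusters are not counted.
    """
    if num_fus % clusters != 0:
        raise ValueError("num_fus must divide evenly into clusters")
    per_cluster = num_fus // clusters
    return clusters * full_bypass_paths(per_cluster, inputs_per_fu)


for n in (2, 4, 8):
    print(n, full_bypass_paths(n))

print(clustered_bypass_paths(8, clusters=2))
```

Under these assumptions, doubling the number of functional units quadruples the path count (8, 32, 128 for 2, 4, 8 units), while splitting 8 units into 2 clusters halves it to 64. The cost of clustering is that values crossing clusters arrive later, which is the IPC penalty the slide refers to.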
