
Pipelining: what Seymour Cray taught the laundry industry



  1. Pipelining: what Seymour Cray taught the laundry industry
     How to correctly pipeline circuits…
     [Cartoon: “I’ve got 3 months’ worth of laundry to do tonight…” / “Funny, considering that he’s only got one outfit…”]
     • Acknowledgement: The following slides were provided by Prof. Ward in September 2004.
     • Reformatting of the PowerPoint and addition of two more slides done September 2007 by Jens Sparsø.
     • Slides are used in DTU course 02154 Digital Systems Engineering (fall 2008).
     • Due to my (Joachim Rodrigues) position at DTU, I have taken the liberty of using the slides in EITF35.
     02340 Lecture 3 / Acknowledgement: Slides from MIT course 6.004 provided by Prof. Ward, September 2004

     Forget EITF35… let’s solve a “Real Problem”
     Everyone knows that the real reason that MIT students put off doing laundry so long is not because they procrastinate, are lazy, or even have better things to do. The fact is, doing one load at a time is not smart.
     INPUT: dirty laundry
     OUTPUT: 6 more weeks

     One load at a time
     Step 1: Device: Washer. Function: Fill, Agitate, Spin. Washer PD = 30 mins
     Step 2: Device: Dryer. Function: Heat, Spin. Dryer PD = 60 mins
     Total = Washer PD + Dryer PD = 90 mins

  2. Doing N loads of laundry
     Here’s how they do laundry at Harvard, the “combinational” way. (Of course, this is just an urban legend. No one at Harvard actually does laundry. The butlers all arrive on Wednesday morning, pick up the dirty laundry and return it all pressed and starched in time for afternoon tea.)
     Step 1, Step 2, Step 3, Step 4, …
     Total = N * (Washer PD + Dryer PD) = N*90 mins

     Doing N Loads… the MIT way
     MIT students “pipeline” the laundry process. That’s why we wait!
     Step 1, Step 2, Step 3, …
     Total = N * Max(Washer PD, Dryer PD) = N*60 mins
     Actually, it’s more like N*60 + 30 if we account for the startup transient correctly. When doing pipeline analysis, we’re mostly interested in the “steady state” where we assume we have an infinite supply of inputs.

     Some definitions
     Latency: The delay from when an input is established until the output associated with that input becomes valid.
       (Harvard Laundry = 90 mins)
       (MIT Laundry = 120 mins, assuming that the wash is started as soon as possible and waits (wet) in the washer until the dryer is available.)
     Throughput: The rate at which inputs or outputs are processed.
       (Harvard Laundry = 1/90 outputs/min)
       (MIT Laundry = 1/60 outputs/min)

     Okay, back to circuits…
     For combinational logic: latency = t_PD, throughput = 1/t_PD.
     [Circuit diagram: X feeds F and G; F(X) and G(X) feed H, which outputs P(X).]
     We can’t get the answer faster, but are we making effective use of our hardware at all times?
     [Timing diagram: X, F(X), G(X), P(X).] F & G are “idle”, just holding their outputs stable while H performs its computation.
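The laundry numbers above can be checked with a small simulation. This is only a sketch under the slides' stated assumptions (washer 30 mins, dryer 60 mins, a washed load waits wet in the washer until the dryer is free); the function names `harvard_total` and `mit_schedule` are made up for this illustration.

```python
def harvard_total(n, washer=30, dryer=60):
    """Total minutes when each load is washed and dried before the next starts."""
    return n * (washer + dryer)


def mit_schedule(n, washer=30, dryer=60):
    """Return (wash_start, dry_start, dry_end) in minutes for each of n loads.

    Assumption from the slides: a load is washed as soon as the washer is
    empty, then waits (wet) in the washer until the dryer is available.
    """
    schedule = []
    washer_free = 0  # time at which the washer can accept the next load
    dryer_free = 0   # time at which the dryer can accept the next load
    for _ in range(n):
        wash_start = washer_free
        wash_done = wash_start + washer
        dry_start = max(wash_done, dryer_free)
        dry_end = dry_start + dryer
        washer_free = dry_start  # the washer is emptied only when drying begins
        dryer_free = dry_end
        schedule.append((wash_start, dry_start, dry_end))
    return schedule


if __name__ == "__main__":
    n = 4
    loads = mit_schedule(n)
    last_wash_start, _, last_dry_end = loads[-1]
    print("Harvard total (N*90):", harvard_total(n))
    print("MIT total (N*60 + 30):", last_dry_end)
    print("MIT steady-state latency:", last_dry_end - last_wash_start)
```

With these defaults the first pipelined load still takes 90 mins; every later load takes 120 mins from wash start to dry end, while a finished load comes out every 60 mins.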

  3. Pipelined Circuits
     Use registers to hold H’s input stable!
     [Circuit diagram: X feeds F (15) and G (20), which feed H (25) through pipeline registers; H outputs P(X).]
     Now F & G can be working on input X_{i+1} while H is performing its computation on X_i. We’ve created a 2-stage pipeline: if we have a valid input X during clock cycle j, P(X) is valid during clock j+2.
     Suppose F, G, H have propagation delays of 15, 20, 25 ns and we are using ideal zero-delay registers:

                          latency      throughput
     unpipelined          45           1/45
     2-stage pipelined    50 (worse)   1/25 (better)

     Pipeline diagrams

     Clock cycle   i      i+1      i+2         i+3         …
     Input         X_i    X_{i+1}  X_{i+2}     X_{i+3}
     F Reg                F(X_i)   F(X_{i+1})  F(X_{i+2})
     G Reg                G(X_i)   G(X_{i+1})  G(X_{i+2})
     H Reg                         H(X_i)      H(X_{i+1})  H(X_{i+2})

     The rows are the pipeline stages. The results associated with a particular set of input data move diagonally through the diagram, progressing through one pipeline stage each clock cycle.

     Pipeline diagrams (alternative view) [Slide added by J. Sparsø]
     The columns are clock cycles i, i+1, i+2, i+3, …; the rows are inputs, each starting one clock cycle later than the row above:
     X_i:      F(X_i), G(X_i), then H(X_i)
     X_{i+1}:  F(X_{i+1}), G(X_{i+1}), then H(X_{i+1})
     X_{i+2}:  F(X_{i+2}), G(X_{i+2}), then H(X_{i+2})
     …
     • Each row shows the processing of a particular set of input data. (In a processor, the processing of an instruction. You’ll see plenty…)

     Pipeline Conventions
     DEFINITION: a K-Stage Pipeline (“K-pipeline”) is an acyclic circuit having exactly K registers on every path from an input to an output. A COMBINATIONAL CIRCUIT is thus a 0-stage pipeline.
     CONVENTION: Every pipeline stage, hence every K-Stage pipeline, has a register on its OUTPUT (not on its input).
     ALWAYS: The CLOCK common to all registers must have a period sufficient to cover propagation over combinational paths PLUS (input) register t_PD PLUS (output) register t_SETUP.
     The LATENCY of a K-pipeline is K times the period of the clock common to all registers.
     The THROUGHPUT of a K-pipeline is the frequency of the clock.
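The clock, latency and throughput rules for a K-pipeline can be worked through on the F, G, H example (15, 20, 25 ns, ideal zero-delay registers). The sketch below is illustrative only: `pipeline_timing` is a hypothetical helper, delays are assumed to be whole nanoseconds, and each stage is described by its worst-case combinational delay.

```python
from fractions import Fraction


def pipeline_timing(stage_delays, t_reg_pd=0, t_setup=0):
    """Return (clock_period, latency, throughput) for a K-pipeline.

    stage_delays: worst-case combinational delay of each stage, in integer ns.
    t_reg_pd, t_setup: register propagation delay and setup time, in integer ns
    (0 models the ideal zero-delay registers used in the example).
    """
    k = len(stage_delays)
    if k == 0:
        raise ValueError("a 0-stage pipeline is combinational and has no clock")
    # The clock must cover the slowest stage plus register t_PD and t_SETUP.
    period = max(stage_delays) + t_reg_pd + t_setup
    latency = k * period               # K times the clock period
    throughput = Fraction(1, period)   # results per ns = clock frequency
    return period, latency, throughput


if __name__ == "__main__":
    f, g, h = 15, 20, 25  # propagation delays in ns

    # Unpipelined: latency is the longest combinational path, throughput 1/t_PD.
    t_pd = max(f, g) + h
    print("unpipelined:", t_pd, Fraction(1, t_pd))

    # 2-stage pipeline: stage 1 is F and G in parallel, stage 2 is H.
    period, latency, throughput = pipeline_timing([max(f, g), h])
    print("2-stage pipelined:", latency, throughput)
```

For the example this reproduces the table: latency rises from 45 ns to 50 ns because both stages are clocked at the slower stage's 25 ns, while throughput improves from 1/45 to 1/25 results per ns.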
