Intelligent RAM (IRAM) Richard Fromm, David Patterson, Krste - PowerPoint PPT Presentation

Intelligent RAM (IRAM) Richard Fromm, David Patterson, Krste Asanovic, Aaron Brown, Jason Golbus, Ben Gribstad, Kimberly Keeton, Christoforos Kozyrakis, David Martin, Stylianos Perissakis, Randi Thomas, Noah Treuhaft, Katherine Yelick, Tom Anderson, John Wawrzynek rfromm@cs.berkeley.edu http://iram.cs.berkeley.edu/ EECS, University of California Berkeley, CA 94720-1776 USA 1 Richard Fromm, IRAM tutorial, ASP-DAC ‘98, February 10, 1998

IRAM Vision Statement L Proc o f Microprocessor & DRAM $ $ g a on a single chip: L2$ I/O I/O i b Bus G on-chip memory latency Bus c 5-10X, bandwidth 50-100X D R A M G improve energy efficiency 2X-4X (no off-chip bus) I/O I/O G serial I/O 5-10X vs. buses Proc D G smaller board area/volume f Bus R G adjustable memory size/width a A b M D R A M 2 Richard Fromm, IRAM tutorial, ASP-DAC ‘98, February 10, 1998

Outline G Today’s Situation: Microprocessor & DRAM G IRAM Opportunities G Initial Explorations G Energy Efficiency G Directions for New Architectures G Vector Processing G Serial I/O G IRAM Potential, Challenges, & Industrial Impact 3 Richard Fromm, IRAM tutorial, ASP-DAC ‘98, February 10, 1998

Processor-DRAM Gap (latency) Relative Performance µProc 1000 “Moore’s Law” 60%/yr. Processor-Memory 100 Performance Gap: (grows 50% / year) 10 DRAM 7%/yr. 1 1980 1981 1982 1983 1984 1985 1986 1987 1988 1989 1990 1991 1992 1993 1994 1995 1996 1997 1998 1999 2000 Time 4 Richard Fromm, IRAM tutorial, ASP-DAC ‘98, February 10, 1998

Processor-Memory Performance Gap “Tax” Processor % Area % Transistors (~cost) (~power) G Alpha 21164 37% 77% G StrongArm SA110 61% 94% G Pentium Pro 64% 88% G 2 dies per package: Proc/I$/D$ + L2$ G Caches have no inherent value, only try to close performance gap 5 Richard Fromm, IRAM tutorial, ASP-DAC ‘98, February 10, 1998

Today’s Situation: Microprocessor G Rely on caches to bridge gap G Microprocessor-DRAM performance gap G time of a full cache miss in instructions executed 1st Alpha (7000): 340 ns/5.0 ns = 68 clks x 2 or 136 ns 2nd Alpha (8400): 266 ns/3.3 ns = 80 clks x 4 or 320 ns 3rd Alpha (t.b.d.): 180 ns/1.7 ns =108 clks x 6 or 648 ns G X latency x 3X clock rate x 3X Instr/clock ⇒ - 5X 1 2 G Power limits performance (battery, cooling) G Shrinking number of desktop ISAs? G No more PA-RISC; questionable future for MIPS and Alpha G Future dominated by IA-64? 6 Richard Fromm, IRAM tutorial, ASP-DAC ‘98, February 10, 1998

Today’s Situation: DRAM DRAM Revenue per Quarter $20,000 $16B $15,000 (Miillions) $10,000 $7B $5,000 $0 4 4 4 4 5 5 5 5 6 6 6 6 7 9 9 9 9 9 9 9 9 9 9 9 9 9 Q Q Q Q Q Q Q Q Q Q Q Q Q 1 2 3 4 1 2 3 4 1 2 3 4 1 G Intel: 30%/year since 1987; 1/3 income profit 7 Richard Fromm, IRAM tutorial, ASP-DAC ‘98, February 10, 1998

Today’s Situation: DRAM G Commodity, second source industry ⇒ high volume, low profit, conservative G Little organization innovation (vs. processors) in 20 years: page mode, EDO, Synch DRAM G DRAM industry at a crossroads: G Fewer DRAMs per computer over time G Growth bits/chip DRAM: 50%-60%/yr G Nathan Myrvold (Microsoft): mature software growth (33%/yr for NT), growth MB/$ of DRAM (25%-30%/yr) G Starting to question buying larger DRAMs? 8 Richard Fromm, IRAM tutorial, ASP-DAC ‘98, February 10, 1998

Fewer DRAMs/System over Time DRAM Generation (from Pete ‘86 ‘89 ‘92 ‘96 ‘99 ‘02 MacWilliams, 1 Mb 4 Mb 16 Mb 64 Mb 256 Mb 1 Gb Intel) 32 8 Memory per 4 MB Minimum Memory Size DRAM growth 16 4 8 MB @ 60% / year 8 2 16 MB 4 1 32 MB Memory per 8 2 64 MB System growth 4 1 @ 25%-30% / year 128 MB 8 2 256 MB 9 Richard Fromm, IRAM tutorial, ASP-DAC ‘98, February 10, 1998

Multiple Motivations for IRAM G Some apps: energy, board area, memory size G Gap means performance challenge is memory G DRAM companies at crossroads? G Dramatic price drop since January 1996 G Dwindling interest in future DRAM? G Too much memory per chip? G Alternatives to IRAM: fix capacity but shrink DRAM die, packaging breakthrough, more out-of-order CPU, ... 10 Richard Fromm, IRAM tutorial, ASP-DAC ‘98, February 10, 1998

DRAM Density G Density of DRAM (in DRAM process) is much higher than SRAM (in logic process) G Pseudo-3-dimensional trench or stacked capacitors give very small DRAM cell sizes StrongARM 64 Mbit DRAM Ratio 0.35 µ m logic 0.40 µ m DRAM Process Transistors/cell 6 1 6:1 Cell size ( µ m 2 ) 26.41 1.62 16:1 ( λ 2 ) 216 10.1 21:1 Density (Kbits/ µ m 2 ) 10.1 390 1:39 (Kbits/ λ 2 ) 1.23 62.3 1:51 11 Richard Fromm, IRAM tutorial, ASP-DAC ‘98, February 10, 1998

Potential IRAM Latency: 5 - 10X G No parallel DRAMs, memory controller, bus to turn around, SIMM module, pins… G New focus: Latency oriented DRAM? G Dominant delay = RC of the word lines G Keep wire length short & block sizes small? G 10-30 ns for 64b-256b IRAM “RAS/CAS”? G AlphaStation 600: 180 ns=128b, 270 ns=512b Next generation (21264): 180 ns for 512b? 12 Richard Fromm, IRAM tutorial, ASP-DAC ‘98, February 10, 1998

Potential IRAM Bandwidth: 50-100X G 1024 1Mbit modules(1Gb), each 256b wide G 20% @ 20 ns RAS/CAS = 320 GBytes/sec G If cross bar switch delivers 1/3 to 2/3 of BW of 20% of modules ⇒ 100 - 200 GBytes/sec G FYI: AlphaServer 8400 = 1.2 GBytes/sec G 75 MHz, 256-bit memory bus, 4 banks 13 Richard Fromm, IRAM tutorial, ASP-DAC ‘98, February 10, 1998

Potential Energy Efficiency: 2X-4X G Case study of StrongARM memory hierarchy vs. IRAM memory hierarchy (more later...) G cell size advantages ⇒ much larger cache ⇒ fewer off-chip references ⇒ up to 2X-4X energy efficiency for memory G less energy per bit access for DRAM 14 Richard Fromm, IRAM tutorial, ASP-DAC ‘98, February 10, 1998

Potential Innovation in Standard DRAM Interfaces G Optimizations when chip is a system vs. chip is a memory component G Lower power via on-demand memory module activation? G Improve yield with variable refresh rate? G “Map out” bad memory modules to improve yield? G Reduce test cases/testing time during manufacturing? G IRAM advantages even greater if innovate inside DRAM memory interface? (ongoing work...) 15 Richard Fromm, IRAM tutorial, ASP-DAC ‘98, February 10, 1998

“Vanilla” Approach to IRAM G Estimate performance of IRAM implementations of conventional architectures G Multiple studies: G “Intelligent RAM (IRAM): Chips that remember and compute”, 1997 Int’l. Solid-State Circuits Conf. , Feb. 1997. G “Evaluation of Existing Architectures in IRAM Systems”, Workshop on Mixing Logic and DRAM, 24th Int’l. Symp. on Computer Architecture , June 1997. G “The Energy Efficiency of IRAM Architectures”, 24th Int’l. Symp. on Computer Architecture , June 1997. 16 Richard Fromm, IRAM tutorial, ASP-DAC ‘98, February 10, 1998

“Vanilla” IRAM - Performance Conclusions G IRAM systems with existing architectures provide only moderate performance benefits G High bandwidth / low latency used to speed up memory accesses but not computation G Reason: existing architectures developed under the assumption of a low bandwidth memory system G Need something better than “build a bigger cache” G Important to investigate alternative architectures that better utilize high bandwidth and low latency of IRAM 17 Richard Fromm, IRAM tutorial, ASP-DAC ‘98, February 10, 1998

IRAM Energy Advantages G IRAM reduces the frequency of accesses to lower levels of the memory hierarchy, which require more energy G IRAM reduces energy to access various levels of the memory hierarchy G Consequently, IRAM reduces the average energy per instruction: Energy per memory access = AE L1 + ( MR L1 × AE L2 + ( MR L2 × AE off-chip )) where AE = access energy and MR = miss rate 18 Richard Fromm, IRAM tutorial, ASP-DAC ‘98, February 10, 1998

Energy to Access Memory by Level of Memory Hierarchy G For 1 access, measured in nJoules: Conventional IRAM on-chip L1$(SRAM) 0.5 0.5 on-chip L2$(SRAM vs. DRAM) 2.4 1.6 L1 to Memory (off- vs. on-chip) 98.5 4.6 L2 to Memory (off-chip) 316.0 (n.a.) G Based on Digital StrongARM, 0.35 µm technology G Calculated energy efficiency (nanoJoules per instruction) G See “The Energy Efficiency of IRAM Architectures,” 24th Int’l. Symp. on Computer Architecture , June 1997 19 Richard Fromm, IRAM tutorial, ASP-DAC ‘98, February 10, 1998

IRAM Energy Efficiency Conclusions G IRAM memory hierarchy consumes as little as 29% (Small) or 22% (Large) of corresponding conventional models G In worst case, IRAM energy consumption is comparable to conventional: 116% (Small), 76% (Large) G Total energy of IRAM CPU and memory as little as 40% of conventional, assuming StrongARM as CPU core G Benefits depend on how memory-intensive the application is 20 Richard Fromm, IRAM tutorial, ASP-DAC ‘98, February 10, 1998

A More Revolutionary Approach G “...wires are not keeping pace with scaling of other features. … In fact, for CMOS processes below 0.25 micron ... an unacceptably small percentage of the die will be reachable during a single clock cycle .” G “Architectures that require long-distance, rapid interaction will not scale well ...” G “Will Physical Scalability Sabotage Performance Gains?” Matzke, IEEE Computer (9/97) 21 Richard Fromm, IRAM tutorial, ASP-DAC ‘98, February 10, 1998

Intelligent RAM (IRAM) Richard Fromm, David Patterson, Krste - PowerPoint PPT Presentation

Intelligent RAM (IRAM) Richard Fromm, David Patterson, Krste Asanovic, Aaron Brown, Jason Golbus, Ben Gribstad, Kimberly Keeton, Christoforos Kozyrakis, David Martin, Stylianos Perissakis, Randi Thomas, Noah Treuhaft, Katherine Yelick, Tom

CACHING BEYOND RAM CACHING BEYOND RAM memcached.org/blog @dormando WHY RAM? WHY RAM?

An Introduction to Intelligent RAM (IRAM) David Patterson, Krste Asanovic, Aaron Brown, Ben

An Introduction to Intelligent RAM (IRAM) David Patterson, Krste Asanovic, Aaron Brown, Ben

Intelligent RAM (IRAM): Chips that remember and compute David Patterson, Thomas Anderson, Krste

Cross Ram Support Set Ram accessories 1 Cross Ram Support Set Set composition The Cross

Vector IRAM: ISA and Micro-architecture Christoforos E. Kozyrakis Computer Science Division

EMS RAM PUMPS EMS RAM PUMPS INDUSTRIES LTD INDUSTRIES LTD Press ENTER to continue EMS

Random Access Memory (RAM) Key features RAM is traditionally packaged as a chip.

Chicag cago o Bandits dits Affili liate te Program ram Junior r Affiliate and Tra vel

The Spiritual Meaning of Ramayana www.naradakush.nl SRI RAM Sri Ram is the embodiment of the

RAM POWERPOINT PRESENTATION 2020 SLIDE 1 Rotarians Against Malaria (RAM) was begun in 1995 and

Adaptive Garbled RAM from Adaptive Garbled RAM from Laconic Oblivious Transfer Sanjam Garg

Memory Address Map CS RD RAM 0 0 RD WR 1K x 8 WR DB(0..7) 1 AB Decoder AB(10..11) 2

Intelligent Management Solutions For Keys and Items Traka - iFOB iFOB = intelligent FOB The

Intelligent Computer Mathematics Intelligent Computing? OR Franz Lichtenberger Mathematics

Low Energy Spin Transfer T orque RAM (STT -RAM / SPRAM) Zach Foresta April 23, 2009 Overview

Distribution build / delivery Distribution build / delivery styles, one style to rule them

Emerging Technology Trends Chris Green Accelerator Controls Modernization Workshop Friday

Rocket Your Success Lesson 2: Packaging The Art of Packaging Proper Packaging gives

Packaging Mathematical Structures F. Garillot 1 , G. Gonthier 2 , A. Mahboubi 3 , L. Rideau 4 1 :

Software and Computing R&D Adam Lyon (Associate Division Head of Systems for Scientific

Worker Node Software Management: the VO perspective Mark Santcroos Dennis van Dok Introduction

Mesosphere and Percona Server for MongoDB Peter Schwaller, Senior Director Server Eng. (Percona)

CM30174 + CM50206 Introduction to Intelligent Agents Marina De Vos, Julian Padget Introduction /

Intelligent RAM (IRAM) Richard Fromm, David Patterson, Krste - PowerPoint PPT Presentation

Intelligent RAM (IRAM) Richard Fromm, David Patterson, Krste Asanovic, Aaron Brown, Jason Golbus, Ben Gribstad, Kimberly Keeton, Christoforos Kozyrakis, David Martin, Stylianos Perissakis, Randi Thomas, Noah Treuhaft, Katherine Yelick, Tom

CACHING BEYOND RAM CACHING BEYOND RAM memcached.org/blog @dormando WHY RAM? WHY RAM?

An Introduction to Intelligent RAM (IRAM) David Patterson, Krste Asanovic, Aaron Brown, Ben

An Introduction to Intelligent RAM (IRAM) David Patterson, Krste Asanovic, Aaron Brown, Ben

Intelligent RAM (IRAM): Chips that remember and compute David Patterson, Thomas Anderson, Krste

Cross Ram Support Set Ram accessories 1 Cross Ram Support Set Set composition The Cross

Vector IRAM: ISA and Micro-architecture Christoforos E. Kozyrakis Computer Science Division

EMS RAM PUMPS EMS RAM PUMPS INDUSTRIES LTD INDUSTRIES LTD Press ENTER to continue EMS

Random Access Memory (RAM) Key features RAM is traditionally packaged as a chip.

Chicag cago o Bandits dits Affili liate te Program ram Junior r Affiliate and Tra vel

The Spiritual Meaning of Ramayana www.naradakush.nl SRI RAM Sri Ram is the embodiment of the

RAM POWERPOINT PRESENTATION 2020 SLIDE 1 Rotarians Against Malaria (RAM) was begun in 1995 and

Adaptive Garbled RAM from Adaptive Garbled RAM from Laconic Oblivious Transfer Sanjam Garg

Memory Address Map CS RD RAM 0 0 RD WR 1K x 8 WR DB(0..7) 1 AB Decoder AB(10..11) 2

Intelligent Management Solutions For Keys and Items Traka - iFOB iFOB = intelligent FOB The

Intelligent Computer Mathematics Intelligent Computing? OR Franz Lichtenberger Mathematics

Low Energy Spin Transfer T orque RAM (STT -RAM / SPRAM) Zach Foresta April 23, 2009 Overview

Distribution build / delivery Distribution build / delivery styles, one style to rule them

Emerging Technology Trends Chris Green Accelerator Controls Modernization Workshop Friday

Rocket Your Success Lesson 2: Packaging The Art of Packaging Proper Packaging gives

Packaging Mathematical Structures F. Garillot 1 , G. Gonthier 2 , A. Mahboubi 3 , L. Rideau 4 1 :

Software and Computing R&amp;D Adam Lyon (Associate Division Head of Systems for Scientific

Worker Node Software Management: the VO perspective Mark Santcroos Dennis van Dok Introduction

Mesosphere and Percona Server for MongoDB Peter Schwaller, Senior Director Server Eng. (Percona)

CM30174 + CM50206 Introduction to Intelligent Agents Marina De Vos, Julian Padget Introduction /

Software and Computing R&D Adam Lyon (Associate Division Head of Systems for Scientific