REVISITING THE MEMORY HIERARCHY FOR TOMORROW COMPUTING SYSTEMS



SLIDE 1

Leti Devices Workshop | Elisa Vianello | December 4, 2016

REVISITING THE MEMORY HIERARCHY FOR TOMORROW COMPUTING SYSTEMS

SLIDE 2

OUTLINE

  • Ever Increasing Need for More Memory
  • Rethinking the Memory Hierarchy with New NV Memory Technologies
  • Rethinking the System Architecture?

SLIDE 3

EVER INCREASING NEED FOR MORE MEMORY

[Figure: stored data in billions of GB, analog vs. digital. Labels: IoT, Mobile, Server]

  • 2002: turning point between analog and digital formats
  • The amount of data being created is growing 40% a year into the next decade
  • By 2020 the digital universe is expected to contain nearly as many digital bits as there are stars in the universe…

Source: M. Hilbert et al., Science, Vol. 332, 2011
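
The 40% yearly growth quoted above compounds quickly. The short Python sketch below works out what that rate implies, assuming the rate stays constant over the period. The function names are illustrative and do not come from the talk.

```python
import math

GROWTH_RATE = 0.40  # 40% per year, the figure quoted on the slide


def growth_factor(years: float, rate: float = GROWTH_RATE) -> float:
    """Multiplicative growth after `years` of compounding at `rate` per year."""
    return (1.0 + rate) ** years


def doubling_time(rate: float = GROWTH_RATE) -> float:
    """Years needed for the data volume to double at a constant yearly rate."""
    return math.log(2.0) / math.log(1.0 + rate)


if __name__ == "__main__":
    print(f"doubling time: {doubling_time():.2f} years")
    print(f"growth over 10 years: x{growth_factor(10):.1f}")
```

Under this constant-rate assumption the data volume doubles roughly every two years and grows by close to a factor of 29 over a decade.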

SLIDE 4

EVER INCREASING NEED FOR MORE MEMORY

[Figure labels: logic, memory]

  • Core count doubling ~ every 2 years
  • DRAM capacity doubling ~ every 3 years
  • Compute costs dropping faster than memory costs
  • The growth in data produced is outpacing the improvements in the density and cost of storage technologies

New memory technologies and a rethinking of system architecture, focused on data storage and management, are needed!

Sources: Lim et al., ISCA 2009; IBM Deep Computing
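
The two doubling periods above imply a shrinking amount of DRAM per core. The sketch below makes that explicit, assuming both trends are pure exponentials and both quantities are normalized to 1 at year 0. The function name is hypothetical.

```python
CORE_DOUBLING_YEARS = 2.0  # core count doubling period, from the slide
DRAM_DOUBLING_YEARS = 3.0  # DRAM capacity doubling period, from the slide


def relative_capacity_per_core(years: float) -> float:
    """DRAM capacity per core after `years`, relative to year 0.

    Assumes pure exponential trends, both normalized to 1 at year 0.
    """
    cores = 2.0 ** (years / CORE_DOUBLING_YEARS)
    dram = 2.0 ** (years / DRAM_DOUBLING_YEARS)
    return dram / cores


if __name__ == "__main__":
    for y in (0, 6, 12):
        print(f"year {y:2d}: {relative_capacity_per_core(y):.2f}x memory per core")
```

With these periods the ratio scales as 2^(-t/6), so the memory available per core halves about every six years.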

SLIDE 5

CONVENTIONAL TYPES OF MEMORIES AND TECHNOLOGIES

[Figure: memory hierarchy from processor to storage. Levels: registers (register files), caches (L1-L4 SRAM), main memory (DRAM), storage (NAND Flash SSD, magnetic HDD). Other labels: MIM CAP, 1T, 1R. Annotations: cost/bit and speed, memory volume, latency gap.]

  • Data moves along all levels of the storage hierarchy before and after being processed at the processor
  • Can new NVMs (RRAM, PCM, MRAM…) help to reduce the latency gap and limit data movements across the memory hierarchy?

SLIDE 6

RRAM MEMORIES: PROGRAMMING WINDOW VS. ENDURANCE

[Figure: programming window Roff/Ron [#] vs. endurance (cycle number Nc [#]) on logarithmic axes for CBRAM, OxRAM, PCRAM and MRAM devices. Material labels: GST-Sb, Ta2O5/TaOx, GST-N, GST-O, Cu-GST/CuO/SiO2/BE, WOx, CNT, TaON, TaN/GeTe/Cu, Ti/TaSiO/Ru, Cu/PSE/Ru, Cu/Ta2O5/Pt, Ag/GeS2/HfO2, Golden GaSbGe]

  • CBRAM: largest Roff/Ron for chalcogenide-based and/or bilayer stacks; best endurance for oxide-based stacks
  • OxRAM: largest Roff/Ron for non-polar devices; best endurance for bipolar devices
  • PCRAM: best endurance for GST-based devices
  • MRAM: outlier

Universal memory does not exist!

See paper 4.5 on Monday afternoon on RRAM Endurance, Retention and Window Margin Trade-off.

  • E. Vianello IEDM 2014
  • L. Perniola IMW 2016

SLIDE 7

STORAGE CLASS MEMORIES

[Figure: memory hierarchy (CPU register files, L1-L4 SRAM, DRAM, NAND Flash SSD, magnetic HDD) with an added PCM/RRAM storage class memory (SCM) level. Label: computing]

Latency of access:

  • Registers: <0.001 µs
  • Cache: <0.01 µs
  • DRAM: <0.03 µs
  • SCM (PCM/RRAM): 0.05 µs-100 µs?
  • NAND Flash: >100 µs
  • HDD: >10^3 µs

Processing at the SCM level: from data compression to query processing.

Requirements:

  • Fast access speed (approaching DRAM)
  • Nonvolatile (retains data at power off)
  • High endurance (program/erase cycles)
  • Solid state (no moving parts)
  • Low cost/bit (approaching HDD)
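
To put the latency figures above in perspective, the sketch below expresses the gap that SCM is meant to fill in decades (orders of magnitude). Assigning 0.03 µs to DRAM and 100 µs to NAND Flash is an assumption based on the order of the hierarchy in the figure. The helper name is illustrative.

```python
import math

# Latency bounds in microseconds, as read off the slide. Mapping 0.03 us to
# DRAM and 100 us to NAND Flash is an assumption based on the hierarchy order.
DRAM_LATENCY_US = 0.03
NAND_LATENCY_US = 100.0
SCM_RANGE_US = (0.05, 100.0)  # the slide marks this range with a "?"


def decades(low_us: float, high_us: float) -> float:
    """Number of orders of magnitude between two latencies."""
    return math.log10(high_us / low_us)


if __name__ == "__main__":
    gap = decades(DRAM_LATENCY_US, NAND_LATENCY_US)
    scm = decades(*SCM_RANGE_US)
    print(f"DRAM to NAND gap: {gap:.2f} decades")
    print(f"SCM target range spans: {scm:.2f} decades")
```

Under that reading, the gap between main memory and solid-state storage is about three and a half orders of magnitude, and the SCM target range covers most of it.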

SLIDE 8

« TRANSFORMATIONAL » ABILITY OF PCM

[Figure: SET and RESET defect density vs. current [µA] on a 12 Mb 90 nm test chip]

[Figure: SET and RESET resistance [Ω] vs. cycles]

Annotations: 2h bake at 230°C; RESET 30 ns, SET 800 ns

  • H. Y. Cheng et al., IEDM 2012
  • G. Navarro et al., IEDM 2013
  • H. Y. Cheng et al., JAP 2014
  • P. Zuliani et al., SSE 2015
  • V. Sousa et al., VLSI 2015

SLIDE 9

HYBRID MAIN MEMORY

[Figure: memory hierarchy (CPU register files, L1-L4 SRAM, DRAM, PCM/RRAM SCM, NAND Flash SSD, magnetic HDD) with a hybrid RRAM/PCM main memory]

High capacity working memory

Hybrid memory: DRAM as a cache to RRAM or PCM, to achieve the best of multiple technologies:

  • non-volatile
  • high-density
  • low-cost
  • fast
  • durable
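
A first-order way to reason about a DRAM cache in front of RRAM or PCM is the average access latency as a function of the DRAM hit rate. The sketch below is a simplified model, not the architecture from the talk: it assumes every access first checks DRAM and a miss additionally pays the NVM latency. The latencies in the usage example are hypothetical placeholders.

```python
def effective_latency_ns(hit_rate: float, dram_ns: float, nvm_ns: float) -> float:
    """Average access latency of a DRAM cache in front of an NVM.

    Simplified model (an assumption): every access checks DRAM first,
    and a miss additionally pays the full NVM latency.
    """
    if not 0.0 <= hit_rate <= 1.0:
        raise ValueError("hit_rate must be between 0 and 1")
    return dram_ns + (1.0 - hit_rate) * nvm_ns


if __name__ == "__main__":
    # Hypothetical latencies: 30 ns DRAM, 300 ns NVM.
    for h in (0.5, 0.9, 0.99):
        print(f"hit rate {h:.2f}: {effective_latency_ns(h, 30.0, 300.0):.1f} ns")
```

In this model the hybrid memory approaches DRAM latency only when the DRAM hit rate is high, which is why the DRAM portion is used as a cache rather than as a separate address range.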

SLIDE 10

Ti/HfO2-BASED OxRAM

  • <100 ns programming time at 1 V with up to 100M cycles
  • LRS and HRS on a 4 kbit array (MAD Leti OxRAM)
  • 28nm CMOS

See paper 4.7 on Monday afternoon on Filament-based RRAM.

  • E. Vianello IEDM 2014
  • A. Benoist IRPS 2014 ST/Leti
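
The endurance and programming-time figures above can be turned into rough lifetime estimates. The sketch below assumes a single cell written at a fixed rate with no wear leveling; the write rates are hypothetical, and 100 ns is used as an upper bound for the programming time since the slide states "<100 ns".

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600

ENDURANCE_CYCLES = 1e8    # "up to 100M cycles", from the slide
PROGRAM_TIME_S = 100e-9   # upper bound; the slide states "<100 ns"


def lifetime_years(endurance_cycles: float, writes_per_second: float) -> float:
    """Years until one cell reaches its endurance limit at a fixed write rate."""
    return endurance_cycles / writes_per_second / SECONDS_PER_YEAR


def seconds_to_wear_out(endurance_cycles: float, program_time_s: float) -> float:
    """Time to exhaust one cell when it is written back-to-back."""
    return endurance_cycles * program_time_s


if __name__ == "__main__":
    print(f"1 write/s: {lifetime_years(ENDURANCE_CYCLES, 1.0):.2f} years")
    print(f"back-to-back writes: "
          f"{seconds_to_wear_out(ENDURANCE_CYCLES, PROGRAM_TIME_S):.0f} s")
```

A cell rewritten once per second lasts a few years, while a cell hammered continuously at the programming speed wears out in seconds, so the achievable lifetime depends strongly on where in the hierarchy the memory is placed.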

SLIDE 11

L2-L3 CACHE

[Figure: memory hierarchy with labels magnetic HDD, NAND Flash SSD, PCM/RRAM SCM, DRAM, hybrid PCM/MRAM, L2-L3 cache, CPU]

  • Replacing SRAM (L2-L3) with STT-MRAM: STT-MRAM is the best candidate (fast and high endurance)
  • Reduction of stand-by power consumption
  • 4 Mbit STT-MRAM, read 3.3 ns @ 1.25 V, 65 nm CMOS [Noguchi, Toshiba, ISSCC 2016]

SLIDE 12

REPLACING SRAM WITH STT-MRAM

MAD Leti pSTT-MRAM, 4 kbit array

  • STT-MRAM has demonstrated fast writing speed and high endurance; however, retention is still to be fully characterized
  • Different retention extraction methods are compared
  • A trade-off exists between thermal stability and programming speed

See paper 27.3 on Wednesday morning on Data Retention Extraction Methodology pSTT-MRAM.
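
The link between thermal stability and retention is often approximated with the Néel-Arrhenius relation, retention time = tau0 * exp(Delta), where Delta is the thermal stability factor. The sketch below uses that textbook single-bit model with an assumed attempt time tau0 of 1 ns. It is not the extraction methodology compared in the paper.

```python
import math

SECONDS_PER_YEAR = 365.25 * 24 * 3600
ATTEMPT_TIME_S = 1e-9  # tau0, a commonly assumed value (an assumption here)


def retention_years(delta: float, tau0_s: float = ATTEMPT_TIME_S) -> float:
    """Mean single-bit retention time for a thermal stability factor `delta`."""
    return tau0_s * math.exp(delta) / SECONDS_PER_YEAR


def delta_for_retention(years: float, tau0_s: float = ATTEMPT_TIME_S) -> float:
    """Thermal stability factor needed for a given mean retention time."""
    return math.log(years * SECONDS_PER_YEAR / tau0_s)


if __name__ == "__main__":
    print(f"Delta for 10 years: {delta_for_retention(10):.1f}")
    print(f"Delta for 1 day: {delta_for_retention(1 / 365.25):.1f}")
```

Because retention depends exponentially on Delta, relaxing the retention target from years to days lowers the required Delta only moderately, which is the kind of margin that can be traded against programming speed in a cache.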

SLIDE 13

RETHINKING THE SYSTEM ARCHITECTURE?

Neuromorphic systems

  • Detect and predict patterns in complex data
  • Visual or auditory data analysis

RRAM has been promoted by Leti (and many others!) to emulate synaptic plasticity: the ability of synapses to strengthen or weaken over time.

  • PCM: GST, GeTe, GST/HfO2
  • CBRAM: GeS2/Ag
  • OxRAM: HfO2/Ti (planar + VRAM)

SLIDE 14

RRAM TO IMPLEMENT ON-LINE LEARNING

In nature, synapses have a volatile component. Is it useful for the learning process?

[Figure: conductance [S] vs. time [s], RRAM data compared with biological data]

  • Decoding of neural signals: highly noisy neurological data
  • ANN with RRAM-based synapses: the synaptic change has a volatile component
  • The volatile component improves detection in highly noisy input data

See paper 16.6 on Tuesday morning on Short and Long-term Synaptic Plasticity Using OxRAM.
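
One simple way to picture a synapse with a volatile component is a conductance made of a persistent part plus a part that decays after each programming event. The sketch below is an illustrative toy model, not the device model from the talk: the exponential decay, the parameter names and the values in the usage example are all assumptions.

```python
import math


def conductance(t_s: float, g_long_s: float, g_short_s: float, tau_s: float) -> float:
    """Synaptic conductance `t_s` seconds after a programming pulse.

    Toy model (an assumption): a non-volatile part `g_long_s` plus a
    volatile part `g_short_s` that decays exponentially with time
    constant `tau_s`.
    """
    return g_long_s + g_short_s * math.exp(-t_s / tau_s)


if __name__ == "__main__":
    # Hypothetical values: 10 uS persistent, 5 uS volatile, 2 s decay constant.
    for t in (0.0, 1.0, 5.0, 20.0):
        g = conductance(t, 1e-5, 5e-6, 2.0)
        print(f"t = {t:4.1f} s: G = {g * 1e6:.2f} uS")
```

In such a model, isolated events leave only a small lasting change, while the decaying part fades away; that intuition is consistent with the slide's statement that the volatile component helps with highly noisy input data.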

SLIDE 15

CONCLUSION

The data explosion is leading to a corresponding growth in data-centric applications (capture, classify, archive…).

The adoption of new NVMs enables a rethinking of system architecture based on data storage and management.

However, universal memory does not exist: different NVM technologies have to be introduced in the storage hierarchy.

Challenges:

  • Design matched to the specific memory technology features
  • Non-volatility does not come for free:
      • static power vs. active power
      • memory window vs. endurance
      • memory window vs. data retention

SLIDE 16

CONCLUSION

Thanks to the new NVM technologies (PCM, RRAM, MRAM), persistent memory is getting closer to the compute center, which avoids wasting energy on data movement.

SLIDE 17

Leti, technology research institute Commissariat à l’énergie atomique et aux énergies alternatives Minatec Campus | 17 rue des Martyrs | 38054 Grenoble Cedex | France www.leti.fr

SLIDE 18

EMBEDDED SYSTEMS: TOWARD DISTRIBUTED MEMORY

[Figure labels: MCU (stand-by), Sensor, Radio, MCU (active). Source: Renesas ASP-DAC 2014]

MCU power challenge: need for an ultra-low stand-by power design solution.

  • NV flip-flop [N. Jovanović ISCAS 2016]
  • NVM merges with SRAM and logic: SRAM and registers can be shut down thanks to distributed NVM

[Figure labels: eFlash (NVM), logic, nvSRAM, NV logic, SRAM]

28nm FDSOI + HfO2-based OxRAM

  • thermal stability for smart card & automotive
  • easy integration with advanced CMOS
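
The stand-by power challenge can be illustrated with a duty-cycle model of an MCU that is active only a small fraction of the time. The sketch below is a generic first-order model; the power figures in the usage example are hypothetical and are not taken from the Renesas data cited above.

```python
def average_power_uw(duty_cycle: float, active_uw: float, standby_uw: float) -> float:
    """Average power of an MCU that is active for `duty_cycle` of the time."""
    if not 0.0 <= duty_cycle <= 1.0:
        raise ValueError("duty_cycle must be between 0 and 1")
    return duty_cycle * active_uw + (1.0 - duty_cycle) * standby_uw


def standby_share(duty_cycle: float, active_uw: float, standby_uw: float) -> float:
    """Fraction of the average power that is spent in stand-by."""
    total = average_power_uw(duty_cycle, active_uw, standby_uw)
    return (1.0 - duty_cycle) * standby_uw / total


if __name__ == "__main__":
    # Hypothetical figures: 10 mW active, 10 uW stand-by, active 0.1% of the time.
    avg = average_power_uw(0.001, 10_000.0, 10.0)
    share = standby_share(0.001, 10_000.0, 10.0)
    print(f"average power: {avg:.2f} uW, stand-by share: {share:.0%}")
```

With a low duty cycle, stand-by power can account for a large share of the total even when it is orders of magnitude below active power, which is the motivation for shutting down SRAM and registers and restoring their state from distributed NVM.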

SLIDE 19

X. Dong et al., "A circuit-architecture co-optimization framework for evaluating emerging memory hierarchies", ISPASS 2013