SLIDE 1

Write Reference Characteristics and SCM-based Memory Management

Hyokyung Bahn
NVRAMOS 2011, April 19, 2011
EWHA WOMANS UNIVERSITY

SLIDE 2

Storage Class Memory (SCM)

  • SCM Characteristics
    – Nonvolatile, byte-addressable
    – e.g., PCM (Phase Change Memory), FeRAM, STT-RAM (MRAM)
  • SCM Perspectives
    – Widely deployed in data centers by 2012
    – Promising replacement for HDD by 2020
      • No more than 3-5x the cost of HDD (< $1/GB in 2012)
      • < 1 usec access time
      • > 10^5 read ops. per second
      • > 100 MB/sec
      • 10x lower power than HDD

(IBM Almaden Research Center, USENIX FAST Tutorial, 2009)

SLIDE 3

Why does DRAM main memory need to change?

  • Multi-core systems, more concurrency, and larger working sets: an enormous need for increased memory
    – e.g., 4GB addressable by 32-bit processors vs. 16EB by 64-bit processors (1E = 10^18)
  • Density (cost/bit): scaling DRAM to smaller technology nodes is a challenge
  • Power consumption: main memory accounts for 40% of total system energy

[Figure: power consumption in watts (200-1200) of CPU, DRAM memory, motherboard, disk, fan, and NIC. Source: Intel Labs, 2008]

SLIDE 4

Phase Change Memory (PCM)

                             DRAM (DDR3, 1.35V)   PCM (High-Speed PCM '10)
  Non-Volatile               No                   Yes
  Density                    1X                   2X ~ 4X
  Power     Read (J/GB)      0.7                  1
  (Energy)  Write (J/GB)     1.1                  6
            Static (mW/GB)   100                  1

SLIDE 5

PCM Challenges

                             DRAM (DDR3, 1.35V)   PCM (High-Speed PCM '10)
  Non-Volatile               No                   Yes
  Density                    1X                   2X ~ 4X
  Power     Read (J/GB)      0.7                  1
            Write (J/GB)     1.1                  6
            Idle (mW/GB)     100                  1
  Latency   Read             1X                   1X ~ 2X
            Write            1X                   7X ~ 8X
  Endurance **               10^15                10^7 ~ 10^8

** SRAM 10^15, STT-RAM 10^15, FeRAM 10^12, SLC Flash 10^5, MLC Flash 10^4

SLIDE 6

Memory & Storage Architectures

  • L1 I-cache / D-cache: SRAM
  • L2 cache: SRAM or STT-RAM
  • Main memory: DRAM or PCM
  • Secondary storage: HDD or Flash SSD
  • STT-RAM, PCM, Flash SSD: write is slower than read

SLIDE 7

Estimating Future Writes

  • 1. Find a good estimator for future write references
    – Issue i. Should read and write history be considered together, or write history alone?
    – Issue ii. Which is better: temporal locality or frequency-based estimation?
  • 2. Store pages likely to be re-written on DRAM.
  • 3. Compare the resulting estimators (sketched in code after this list):
    – Temporal locality, write history only: rank by write recency
    – Temporal locality, total (read + write) history: rank by (read + write) recency
    – Frequency, write history only: rank by write frequency
    – Frequency, total (read + write) history: rank by (read + write) frequency
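As a concrete illustration, here is a minimal sketch of the four estimators, assuming a trace of (page, op) tuples with op in {"r", "w"}; all names and the toy trace are illustrative, not from the slides:

```python
# Hypothetical sketch of the four estimators: pages ranked highest are
# the ones predicted most likely to be re-written, hence kept in DRAM.
from collections import Counter

def rank_by_recency(trace, writes_only):
    """Temporal locality: most recently referenced page ranks first."""
    last_seen = {}
    for t, (page, op) in enumerate(trace):
        if op == "w" or not writes_only:
            last_seen[page] = t
    return sorted(last_seen, key=last_seen.get, reverse=True)

def rank_by_frequency(trace, writes_only):
    """Frequency: most often referenced page ranks first."""
    counts = Counter(page for page, op in trace
                     if op == "w" or not writes_only)
    return [page for page, _ in counts.most_common()]

trace = [("A", "r"), ("B", "w"), ("A", "w"), ("C", "r"),
         ("B", "w"), ("A", "r"), ("C", "w")]
print(rank_by_recency(trace, writes_only=True))     # ['C', 'B', 'A']
print(rank_by_recency(trace, writes_only=False))    # ['C', 'A', 'B']
print(rank_by_frequency(trace, writes_only=True))   # ['B', 'A', 'C']
print(rank_by_frequency(trace, writes_only=False))  # ['A', 'B', 'C']
```
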
SLIDE 8

Virtual Memory Traces Used

  Workload     Contents               Footprint (KB)  Data reads : writes  Total accesses  Instr. read  Data read   Data write
  xmms         MP3 player             8,052           1 : 7.79             1,169,310       65,413       125,653     978,244
  gqview       Image viewer           7,428           1 : 2.01             611,142         93,653       172,044     345,445
  shotwell     Photo management S/W   88,228          1 : 1.04             15,090,070      528,549      7,124,101   7,437,420
  gnuplot      Graphing utility       21,132          1 : 1.10             220,240         47,551       82,110      90,579
  firefox      Web browser            101,520         1.88 : 1             12,648,471      2,392,952    6,690,045   3,565,474
  freecell     Game                   10,084          5.26 : 1             490,700         114,750      315,906     60,044
  gedit        Word processor         14,460          7.16 : 1             1,736,440       652,154      951,450     132,836
  kghostview   PDF file viewer        17,388          10.26 : 1            1,548,820       373,260      1,062,008   103,552

SLIDE 9

Temporal Locality

  • Using both read & write history estimates future writes better within the top 10 rankings.
  • Beyond the top rankings, using write history alone may be a better estimator of future writes.
  • Overall, both estimators show similar results.

SLIDE 10

Temporal Locality

  • Temporal locality for relatively write-intensive workloads is rather irregular (ranking inversion).
  • Temporal locality alone may not be sufficient to estimate the likelihood of future writes.

SLIDE 11

Why is the temporal locality of writes irregular?

  • Maybe due to the write-back operation of cache memory
    – Page references observed at the VM layer contain only cache-missed ones
    – In case of a read:
      • cache-missed requests are directly propagated to the VM layer
      → even though temporal locality becomes weaker, it is not damaged seriously
    – In case of a write:
      • cache-missed requests are not propagated directly to the VM layer, but just written to the cache memory
      • requests are delivered to the VM layer only after being evicted from cache memory
      • the time a write request arrives ≠ the time the request is delivered to the VM layer

[Figure: a cache-missed read request A goes straight to main memory, while a write request A is absorbed by the cache and reaches main memory later as the eviction of page B]
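A toy write-back cache (an assumed sketch, not the slide's exact model) makes the timing gap concrete: reads that miss reach memory immediately, while a write reaches memory only when its page is evicted:

```python
from collections import OrderedDict

class WriteBackCache:
    def __init__(self, capacity):
        self.capacity = capacity
        self.lines = OrderedDict()       # page -> dirty flag, in LRU order

    def access(self, time, page, op, memory_log):
        if page in self.lines:           # hit: nothing reaches memory
            self.lines.move_to_end(page)
            self.lines[page] = self.lines[page] or (op == "w")
            return
        if op == "r":                    # read miss propagates immediately
            memory_log.append((time, page, "read"))
        if len(self.lines) >= self.capacity:
            victim, dirty = self.lines.popitem(last=False)
            if dirty:                    # the write reaches memory only now
                memory_log.append((time, victim, "write-back"))
        self.lines[page] = (op == "w")

log = []
cache = WriteBackCache(capacity=2)
for t, (page, op) in enumerate([("A", "w"), ("B", "r"), ("C", "r")]):
    cache.access(t, page, op, log)
print(log)   # A's write at t=0 reaches memory only at t=2, as a write-back
```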

SLIDE 12

Frequency

  • Write frequency alone is more effective than frequency counted over both reads and writes.

SLIDE 13

Frequency

  • Write frequency alone is more effective than frequency counted over both reads and writes.

SLIDE 14

Temporal Locality vs. Frequency

  • Frequency is more effective than temporal locality in most cases.
  • However, at least the most recent reference history must be considered.

SLIDE 15

Temporal Locality vs. Frequency

  • Frequency is more effective than temporal locality in most cases.
  • However, at least the most recent reference history must be considered.

SLIDE 16

Memory Architecture

The write latency & endurance problems of PCM → use a small amount of DRAM along with PCM. Two architectures are possible:

DRAM as a last-level cache in front of PCM main memory
  • DRAM cache miss → PCM access
  • The DRAM cache is hidden from the OS
  • H/W implementation: fully associative placement is difficult
  • Collisions may degrade space efficiency

Hybrid main memory (DRAM and PCM in a single physical address space)
  • Address translation through the page table
  • DRAM can be managed by the OS
  • Fully associative placement is possible
  • Limited reference information (e.g., a reference bit)

SLIDE 17

Comparison of Cache Replacement Problems in Each Layer

                              Cache Memory             Virtual Memory System   File I/O Buffer Cache
  Who manages hits?           H/W                      H/W                     OS
  Who manages misses?         H/W                      OS                      OS
  Representative algorithms   Random / LRU             CLOCK                   LRU
  Replacement manager         H/W                      OS                      OS
  How to implement?           H/W implementation       S/W implementation      S/W implementation
                              (logical timestamp or    supported by H/W
                              bit shifting for each    (reference bit)
                              reference in a set)

[Figure: an LRU list with MRU/LRU positions, and a CLOCK hand sweeping reference bits (R:0 / R:1) to select a victim]
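For reference, a minimal sketch of the CLOCK policy named in the table (illustrative code; in this simplified variant, newly loaded pages start with R=0):

```python
# CLOCK: a circular buffer of frames with one reference bit per page.
# On a fault the hand gives pages with R=1 a second chance (clearing
# the bit) and evicts the first page it finds with R=0.
class Clock:
    def __init__(self, nframes):
        self.frames = [None] * nframes
        self.refbit = [0] * nframes
        self.hand = 0

    def access(self, page):
        if page in self.frames:                  # hit: set reference bit
            self.refbit[self.frames.index(page)] = 1
            return None
        while self.refbit[self.hand]:            # second chance
            self.refbit[self.hand] = 0
            self.hand = (self.hand + 1) % len(self.frames)
        victim = self.frames[self.hand]          # R=0: evict
        self.frames[self.hand] = page
        self.hand = (self.hand + 1) % len(self.frames)
        return victim

clock = Clock(3)
for p in ["A", "B", "C", "A", "D"]:
    if (victim := clock.access(p)) is not None:
        print(f"fault on {p}: evicted {victim}")  # fault on D: evicted B
```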

SLIDE 18

CLOCK-DWF (CLOCK with Dirty bits and Write Frequency)

  • Allocate read-intensive pages to PCM, write-intensive pages to DRAM.
  • Read path: (1) the CPU reads page A; (2) on a read page fault, the page is fetched from storage (HDD/Flash) into PCM via the page table; (3) when PCM is full, the CLOCK algorithm selects a PCM victim.

SLIDE 19

CLOCK-DWF (CLOCK with Dirty bits and Write Frequency)

  • Allocate read-intensive pages to PCM, write-intensive pages to DRAM.
  • Write path: (1) the CPU writes page A; (2) a write operation on a PCM-resident page raises (2)' an intentional write page fault (a minor fault), and the page migrates to DRAM; (3) when DRAM is full, CLOCK-DWF selects a DRAM victim to demote to PCM; (4) when PCM is full, CLOCK selects a PCM victim.
  • CLOCK-DWF generates an intentional page fault (minor fault) on writes to PCM-resident pages.
  • DRAM: dirty pages only; PCM: clean & dirty pages.
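The write path can be sketched as follows (a simplified model with illustrative names; plain list order stands in for the CLOCK hands):

```python
# A write to a PCM-resident page raises a minor fault and migrates the
# page to DRAM; a full DRAM demotes a victim to PCM; a full PCM evicts.
def handle_write(page, dram, pcm, dram_capacity, pcm_capacity):
    if page in dram:
        return                               # write hits DRAM directly
    if page in pcm:                          # (2)' intentional minor fault
        pcm.remove(page)
    if len(dram) >= dram_capacity:           # (3) DRAM full: demote victim
        victim = dram.pop(0)                 # stand-in for CLOCK-DWF choice
        if len(pcm) >= pcm_capacity:         # (4) PCM full: evict to storage
            pcm.pop(0)                       # stand-in for CLOCK choice
        pcm.append(victim)
    dram.append(page)                        # dirty page now lives in DRAM

dram, pcm = ["X"], ["A", "B"]
handle_write("A", dram, pcm, dram_capacity=1, pcm_capacity=2)
print(dram, pcm)   # ['A'] ['B', 'X']: A migrated to DRAM, X demoted to PCM
```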

SLIDE 20

CLOCK-DWF (CLOCK with Dirty bits and Write Frequency)

  • Temporal locality is captured by the dirty bit within the present rotation.
  • Frequency is captured by the frequency count and the overlooked-rotation count across rotations (1 rotation, 2 rotations, ..., n rotations).
  • The frequency count does not indicate the real write frequency but the reset count of the dirty bit; this filters out correlated references.

[Figure: pages A, B, C tracked across successive CLOCK rotations with their dirty bits, frequency counts, and overlooked-rotation counts]

SLIDE 21

CLOCK-DWF (CLOCK with Dirty bits and Write Frequency)

  • Each page in DRAM has a dirty bit, a frequency count, and an overlooked-rotation count.
    – Dirty bit: set to 1 when a write operation occurs; reset to 0 by CLOCK-DWF
    – Frequency count: incremented when the dirty bit becomes zero
    – Overlooked-rotation count: keeps track of how many times the page has been overlooked

Victim selection:

    if dirty_bit(page) is 0
        if frequency(page) > Threshold and overlooked_rotation(page) < Expiration
            overlooked_rotation(page)++;
        else
            set dirty_bit(page) to 1 and evict it
        end if
    else  /* dirty_bit(page) is 1 */
        dirty_bit(page) = 0;
        frequency(page)++;
        overlooked_rotation(page) = 0;
    end if
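A runnable rendering of this victim-selection loop (the Threshold and Expiration values are illustrative; the Page fields mirror the per-page state above):

```python
from dataclasses import dataclass

THRESHOLD = 2    # frequency needed to count as "hot" (illustrative)
EXPIRATION = 3   # rotations a hot page may be overlooked (illustrative)

@dataclass
class Page:
    name: str
    dirty: bool = False
    frequency: int = 0
    overlooked: int = 0

def select_victim(pages, hand):
    """Advance the CLOCK hand over DRAM pages until a victim is chosen."""
    while True:
        page = pages[hand]
        if not page.dirty:
            if page.frequency > THRESHOLD and page.overlooked < EXPIRATION:
                page.overlooked += 1           # hot page: overlook it
            else:
                page.dirty = True              # mirrors "set dirty_bit to 1
                return page, (hand + 1) % len(pages)   # and evict it"
        else:                                  # dirty: one more rotation
            page.dirty = False
            page.frequency += 1
            page.overlooked = 0
        hand = (hand + 1) % len(pages)

pages = [Page("A", dirty=True), Page("B", frequency=5), Page("C")]
victim, hand = select_victim(pages, 0)
print(victim.name)   # "C": A was dirty, B is hot (5 > 2), so C is evicted
```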

SLIDE 22

Parameter Setting

hot_page_threshold
  • Determines the number of writes required for a page to be considered a hot page.
  • Updated as a running average over the write-frequency counts of pages p:

    hot_page_threshold ← { hot_page_threshold × (SIZE_DRAM − 1) + frequency(p) } / SIZE_DRAM
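In code this is a simple running average; the assumption here (not stated explicitly on the slide) is that the update is applied with the frequency count of each page p as it leaves DRAM:

```python
SIZE_DRAM = 1024   # number of DRAM page frames (illustrative)

def update_threshold(threshold, freq_p):
    # hot_page_threshold <- {threshold*(SIZE_DRAM-1) + frequency(p)} / SIZE_DRAM
    return (threshold * (SIZE_DRAM - 1) + freq_p) / SIZE_DRAM

threshold = 4.0
for freq in [2, 3, 8, 1]:          # frequency counts of successive pages p
    threshold = update_threshold(threshold, freq)
print(round(threshold, 4))         # drifts slowly toward recent counts
```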

long-term frequency period
  • Number of rotations for which a hot page can be overlooked despite not being re-written.
  • When the memory size becomes large:
    – the optimal value becomes small
    – performance is less sensitive to it

[Figure: normalized PCM write count (0.70-1.00) vs. long-term frequency period (1-32) for the gqview and xmms traces, with DRAM-size series from 5% to 80%]

SLIDE 23

Experimental Setup

Baseline configuration
  • Page size: 4KB
  • Processor: 4 cores, each running at 2.66GHz
  • L1 I-cache & D-cache: 32KB, 64-byte lines, 8-way set associative
  • L2 cache: 6MB, 64-byte lines, 24-way set associative
  • Main memory: 4GB, 8 ranks of 8 banks each
  • Hard disk drive: 5ms average access time

                         DRAM              PCM
  Read / Write Latency   50 / 50 ns        50 or 100 / 350 ns
  Read / Write Energy    0.1 / 0.1 nJ/bit  0.2 / 1.0 nJ/bit
  Static Power           1 W/GB            0.1 W/GB
  Endurance              N/A               10^7

SLIDE 24

CLOCK-DWF vs. CLOCK

PCM write count
  • x-axis: DRAM size as a percentage of the maximum write memory usage of the workload
  • y-axis: PCM writes of CLOCK-DWF normalized to those of CLOCK

[Figure: normalized PCM write counts of CLOCK-DWF vs. CLOCK over DRAM sizes of 20-100% for (a) gqview, (b) gnuplot, (c) xmms, (d) shotwell, (e) firefox, (f) freecell, (g) gedit, (h) kghostview]

SLIDE 25

CLOCK-DWF vs. DRAM Cache

PCM write count
  • DRAM cache: 16-way set associative, LRU
  • x-axis: DRAM size relative to the total memory footprint
  • y-axis: PCM writes normalized to those of the DRAM cache

[Figure: normalized PCM write counts of CLOCK-DWF vs. the DRAM cache over DRAM sizes of 20-100% for (a) gqview, (b) gnuplot, (c) xmms, (d) shotwell, (e) firefox, (f) freecell, (g) gedit, (h) kghostview]

SLIDE 26

PCM Lifetime

  • The 8 workloads are executed sequentially and repeatedly until the write limit of PCM is reached.
  • DRAM cache → CLOCK-DWF at 30% memory size: lifetime extended from 4.7 to 6.7 years.
  • CLOCK → CLOCK-DWF at 40~80% memory size: lifetime extended by 5.8%.

[Figure: normalized PCM lifetime of the DRAM cache, CLOCK-DWF, and CLOCK over memory sizes of 1-100%]

SLIDE 27

CLOCK-DWF vs. Conventional System

Average memory access time
  • x-axis: DRAM size of CLOCK-DWF
  • y-axis: performance normalized to the conventional (DRAM-only) system

Performance degradation
  • Case (a), read_access_time_PCM = read_access_time_DRAM: smaller than 10%
  • Case (b), read_access_time_PCM = 2 × read_access_time_DRAM: read-intensive 74.6%, write-intensive 31.8%

[Figure: average memory access time normalized to the conventional system vs. DRAM size (1-100%) for the eight workloads, under cases (a) and (b)]

SLIDE 28

CLOCK-DWF vs. Conventional System

Total elapsed time
  • CLOCK-DWF: DRAM:PCM = 1:9
  • Conventional system: DRAM only
  • x-axis: memory size (%)
  • y-axis: performance normalized to the conventional system

Performance degradation
  • Less than 8%, due to the large page fault overhead.

[Figure: total elapsed time normalized to the conventional system vs. memory size (1-100%) for the eight workloads, under (a) read_access_time_PCM = read_access_time_DRAM and (b) read_access_time_PCM = 2 × read_access_time_DRAM]

SLIDE 29

CLOCK-DWF vs. Conventional System

Power consumption

                         DRAM              PCM
  Read / Write Energy    0.1 / 0.1 nJ/bit  0.2 / 1.0 nJ/bit
  Static Power           1 W/GB            0.1 W/GB

  • Power savings become larger as the memory size increases.
  • Static power accounts for a large portion of total power.

[Figure: power consumption normalized to the conventional system vs. memory size (1-100%)]

SLIDE 30

Summary

                            CLOCK-DWF             CLOCK                 DRAM Cache
  Memory architecture       DRAM + PCM            DRAM + PCM            DRAM cache,
                            main memory           main memory           PCM main memory
  DRAM usage                write                 write                 read / write
  DRAM replacement policy   CLOCK-DWF             CLOCK                 LRU (16-way
                            (fully associative)   (fully associative)   set associative)
  Temporal locality         O                     O                     O
  Frequency                 O                     X                     X
  Write counts on PCM       0.65~0.24             0.76~0.57             1

SLIDE 31

Access Information

If you want to cite this material, please use the contact information below.

  • http://home.ewha.ac.kr/~bahn
  • bahn@ewha.ac.kr