Amnesic Cache Management for Non-Volatile Memory

Dongwoo Kang, Seungjae Baek, Jongmoo Choi (Dankook University, South Korea) {kangdw, baeksj, chiojm}@dankook.ac.kr
Donghee Lee (University of Seoul, South Korea) dhl_express@uos.ac.kr
Sam H. Noh (Hongik University, South Korea) samhnoh@hongik.ac.kr
Onur Mutlu (Carnegie Mellon University, USA) onur@cmu.edu

Outline : Introduction & Motivation; Design; Evaluation; Conclusion

Introduction : Volatility
Non-Volatile Memory: PCM (Phase Change Memory), STT-RAM (Spin Transfer Torque RAM), RRAM (Resistive RAM), Fe-RAM (Ferroelectric Random Access Memory).
Byte addressability and non-volatility: usable as RAM, storage, file cache, and CPU cache.
Limited retention capability; relaxation write.
[Figure: retention spectrum from less retentive (volatile) to more retentive (non-volatile), listing DRAM, NVM (STT-RAM, PCM, ...), SSD & Flash, and hard disk, with retention labels of 64 ms, 1 year, and 10¹⁵ seconds]

Introduction : Phase Change Memory
States of PCM (Phase Change Memory)
Target band: a region of resistances that corresponds to valid bits.
Write scheme: PCM adopts an iterative write scheme.
Resistance drift: the resistance of a PCM cell tends to increase with time. When the resistance drifts up to the boundary of the next region, the state can be read incorrectly, leading to data loss.
[Figure: cell distribution versus resistance for states '11', '10', '01', and '00', showing the target band, the margin, and the resistance drift]

Introduction : Tradeoff
Tradeoff between retention capability and write speed.
Narrowing the target bands requires more precise control over the iterative write mechanism and demands a smaller ∆R, which increases write latency.
Higher retention therefore comes at the cost of higher write latency.
A 1.7x write speedup can be obtained by reducing the retention capability of PCM from 10⁷ to 10⁴ seconds [Liu et al.].
How can these characteristics of PCM be exploited?
[Figure: cell distributions for states '11', '10', '01', and '00' with wide and narrow target bands; write speedup (normalized performance, axis from 0 to 2.2) versus non-volatility from 10⁷ down to 10² seconds (source: Liu et al.)]

Motivation : What about NVM cache?
NVM cache: employing an NVM cache provides performance improvements. Data is fetched from, and evicted to, the storage system.
Retention capability for the cache: 10⁷ seconds is the retention capability recommended by JEDEC.
However, data will be evicted from the NVM cache, so retention only has to be ensured while the data is in the cache.
How much retention capability does an NVM cache require?
[Figure: an application, data with long retention capability, the NVM cache, and storage; data is fetched from storage, spends time in the cache, and is evicted back to storage]

Motivation : Caching time
Caching time on the NVM cache, measured with the LRU scheme: T_Caching = T_Evict − T_First.
For 75% of the data, the caching time is less than 10⁵ seconds.
The cache does not need to ensure a retention capability of 10⁷ seconds.
[Figure: box plots (quartiles and median) of caching time in seconds (log scale, 1 to 1e+06) versus cache size (128MB, 256MB, 512MB, 1GB, 2GB, 4GB) for the hm_0 and proj_3 traces]
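
The caching-time measurement on this slide can be sketched roughly as follows. This is only an illustration under stated assumptions: the trace is taken to be a time-ordered list of (timestamp, block) pairs, the cache capacity is counted in blocks rather than bytes, and T_First is read as the time a block entered the cache. The function name `lru_caching_times` is hypothetical and does not come from the slides.

```python
from collections import OrderedDict

def lru_caching_times(trace, capacity):
    """Return T_Caching = T_Evict - T_First for every block evicted by LRU.

    trace: iterable of (timestamp_sec, block_id) in time order (assumed format).
    capacity: cache size in blocks (assumption; the slides use bytes).
    """
    cache = OrderedDict()  # block_id -> T_First (time the block entered the cache)
    caching_times = []
    for now, block in trace:
        if block in cache:
            cache.move_to_end(block)  # hit: mark as most recently used
            continue
        if len(cache) >= capacity:
            _, t_first = cache.popitem(last=False)  # evict the least recently used
            caching_times.append(now - t_first)
        cache[block] = now  # miss: insert and remember T_First
    return caching_times
```

Sorting the returned list and reading off its quartiles gives the kind of summary plotted in the figure; blocks still resident at the end of the trace are not counted in this sketch.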

Motivation : Reference interval
90% of data are re-referenced within an interval of 10⁵ seconds.
Retention relaxation can enhance write performance.
However, when data is re-referenced after its retention time has expired, the access misses, which reduces the hit ratio and triggers extra accesses to retrieve the data from storage.
[Figure: percentage of reference intervals falling in the ~10², ~10³, ~10⁴, ~10⁵, and ~10⁶ second ranges (0% to 100%) for workloads usr_0, stg_0, src2_0, hm_0, mds_0, prn_0, prn_1, and proj_3]
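
A reference-interval distribution like the one on this slide could be computed along these lines. The sketch assumes the same hypothetical trace format of time-ordered (timestamp, block) pairs; the helper names are illustrative and not taken from the slides.

```python
def reference_intervals(trace):
    """Gaps (in seconds) between consecutive references to the same block."""
    last_seen = {}
    intervals = []
    for now, block in trace:
        if block in last_seen:
            intervals.append(now - last_seen[block])
        last_seen[block] = now
    return intervals

def fraction_within(intervals, limit_sec):
    """Share of re-references that arrive within limit_sec of the previous one."""
    if not intervals:
        return 0.0
    return sum(1 for gap in intervals if gap <= limit_sec) / len(intervals)
```

Under this reading, `1 - fraction_within(intervals, r)` is the share of re-references that would find their data already expired if every write used a retention of `r` seconds.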

Outline : Introduction & Motivation; Design (REF, SACM, AACM); Evaluation; Conclusion

Design : REF
REF (REFresh-based cache management scheme)
REF is similar to the LRU scheme, with a Free state and a Used state.
It enhances write speed by relaxing the retention capability from 10⁷ to 10⁴ seconds, which reduces write latency by a factor of 1.7.
It refreshes data whose retention time is about to expire.
Issue: refresh operations.
[Figure: state diagram over the Free and Used states with transitions labeled 'Relaxation write', 'Refresh with Relaxation write', and 'Evict']
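
To get a feel for why refresh operations are listed as the issue, the number of refreshes a single datum needs can be estimated as below. This is a back-of-the-envelope sketch, assuming a refresh is issued each time the retention time is about to run out and that nothing else rewrites the datum in between; the function name `refresh_count` is hypothetical.

```python
import math

def refresh_count(residency_sec, retention_sec):
    """Refresh writes needed to keep a datum valid for residency_sec.

    Assumes one refresh at each multiple of retention_sec that falls
    strictly before the datum leaves the cache.
    """
    if residency_sec <= 0:
        return 0
    return max(0, math.ceil(residency_sec / retention_sec) - 1)
```

For example, a datum that stays cached for 10⁵ seconds with a 10⁴ second retention would be refreshed 9 times under these assumptions, and each refresh is an extra write.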

Design : SACM
Simple Amnesic Cache Management
Free state to Tentative state: on the initial write into the cache, the datum is written with the relaxed write (10⁴ seconds).
Tentative state to Confirmed state: if the datum is referenced again within the retention time, it is rewritten with a retention capability of 10⁷ seconds.
Confirmed state to Free state: if the datum is not referenced again and the retention time expires.
Issue: additional writes.
[Figure: state diagram over the Free, Tentative, and Confirmed states with transitions labeled 'Relaxation write', 'Cache hit & Default write', and 'Expired']
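
The three SACM states can be sketched as a small per-entry state machine. This is a minimal illustration, not the authors' implementation: it tracks a single datum, ignores capacity-driven eviction, and assumes that a hit in the Confirmed state does not rewrite the datum (the slide does not say). The class and method names are hypothetical.

```python
from enum import Enum

RELAXED_RETENTION = 10**4  # seconds, relaxed write (from the slide)
DEFAULT_RETENTION = 10**7  # seconds, default write (from the slide)

class State(Enum):
    FREE = "free"
    TENTATIVE = "tentative"
    CONFIRMED = "confirmed"

class SacmEntry:
    """One cache entry following the Free/Tentative/Confirmed states."""

    def __init__(self):
        self.state = State.FREE
        self.expires_at = None

    def _expire_if_due(self, now):
        # Retention ran out: the datum is forgotten and the entry becomes free.
        if self.state is not State.FREE and now >= self.expires_at:
            self.state = State.FREE
            self.expires_at = None

    def access(self, now):
        """Reference the datum at time `now`; return True on a cache hit."""
        self._expire_if_due(now)
        if self.state is State.FREE:
            # Miss: the initial write uses the relaxed (fast) write.
            self.state = State.TENTATIVE
            self.expires_at = now + RELAXED_RETENTION
            return False
        if self.state is State.TENTATIVE:
            # Re-referenced in time: rewrite with the default retention.
            self.state = State.CONFIRMED
            self.expires_at = now + DEFAULT_RETENTION
        return True  # assumption: a hit in Confirmed leaves the datum as is
```

The rewrite on the Tentative to Confirmed transition is the additional write that the slide names as the issue with SACM.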

Design : AACM (1/2)
Adaptive Amnesic Cache Management
Key idea: estimate the next reference of each datum and write it adaptively.
Estimation: by the IRG (inter-reference gap) model.
Adaptive write: ensure an appropriate retention capability for each datum.
Ghost buffer
Issue: adaptive write, estimation.
[Figure: state diagram over the Free, Tentative, and Confirmed states with transitions labeled 'Relaxation write', 'Ghost hit & Adaptive write based on IRG', 'Cache hit & Adaptive write', and 'Expired']

Design : AACM (2/2)
Estimation of IRG
A first-order Markov chain is used to estimate the IRG.
Coarse-grained levels: 10², 10³, 10⁴, 10⁵, 10⁶, 10⁷ seconds.
Estimation accuracy is higher than 90%.
Memory overhead is 144 bytes per datum.
[Figure: estimation accuracy (0% to 100%) for workloads usr_0, stg_0, src2_0, hm_0, mds_0, prn_0, prn_1, homes, webmail, and wm+online]
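
A first-order Markov estimator over the six coarse-grained levels might look like the sketch below. It is an assumption-laden illustration rather than the scheme from the slides: each gap is mapped to the smallest level that covers it, transitions between consecutive levels are counted per datum, and the most frequent successor of the latest level is taken as the prediction. The names `to_level` and `IrgPredictor` are hypothetical.

```python
from collections import defaultdict

LEVELS = [10**2, 10**3, 10**4, 10**5, 10**6, 10**7]  # seconds, from the slide

def to_level(gap_sec):
    """Index of the smallest level that covers the gap (capped at the largest)."""
    for i, bound in enumerate(LEVELS):
        if gap_sec <= bound:
            return i
    return len(LEVELS) - 1

class IrgPredictor:
    """Per-datum first-order Markov chain over inter-reference gap levels."""

    def __init__(self):
        # counts[prev_level][next_level] = number of observed transitions
        self.counts = defaultdict(lambda: [0] * len(LEVELS))
        self.prev_level = None

    def observe(self, gap_sec):
        """Record a newly observed inter-reference gap."""
        level = to_level(gap_sec)
        if self.prev_level is not None:
            self.counts[self.prev_level][level] += 1
        self.prev_level = level

    def predict_retention(self):
        """Retention (sec) to request for the next write, or None without history."""
        if self.prev_level is None or self.prev_level not in self.counts:
            return None
        row = self.counts[self.prev_level]
        return LEVELS[row.index(max(row))]
```

A 6 × 6 table of 4-byte counters would occupy 144 bytes, which happens to match the per-datum overhead quoted on the slide, although the slides do not state the actual layout.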

Outline : Introduction & Motivation; Design; Evaluation; Conclusion

Evaluation : Environment
Simulator: time-accurate in-house simulator; storage simulator and trace replayer.
Traces: MSR-Cambridge traces (7 days), FIU traces (21 days), Websearch3 trace (3.1 days).
Simulator parameters:

RETENTION (sec)   SPEEDUP
10⁷               1X
10⁶               1.2X
10⁵               1.5X
10⁴               1.7X
10³               1.9X
10²               2.1X

                  SSD        PCM
READ LATENCY      16 us      50 us
WRITE LATENCY     91.2 us    900 us
READ ENERGY       81.9 nJ    14.25 uJ
WRITE ENERGY      4.73 uJ    256 uJ
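
The retention and speedup pairs among the simulator parameters can be turned into a small lookup for experimenting with relaxed writes. The sketch assumes that a speedup of k simply divides the default write latency by k, which the slides do not spell out; the base latency is supplied by the caller, and the function name is hypothetical.

```python
# Retention (seconds) -> write speedup, as listed in the simulator parameters.
SPEEDUP_BY_RETENTION = {
    10**7: 1.0,
    10**6: 1.2,
    10**5: 1.5,
    10**4: 1.7,
    10**3: 1.9,
    10**2: 2.1,
}

def relaxed_write_latency(base_latency_us, retention_sec):
    """Write latency for a given retention level.

    Assumes latency = base latency / speedup; base_latency_us is the
    latency of a default (10^7 second) write.
    """
    return base_latency_us / SPEEDUP_BY_RETENTION[retention_sec]
```

With a hypothetical base latency of 900 us, for instance, `relaxed_write_latency(900.0, 10**4)` evaluates 900 / 1.7.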

Evaluation : Hit ratio
Cache size is set to 25% of the working set of each workload; for the hm_0 trace, whose working set is 7.8GB, this gives a cache of 1.95GB.
Hit ratios are comparable to LRU, slightly higher or lower depending on the workload.
[Figure: hit ratio of LRU, REF, SACM, and AACM]

Evaluation : Latency
Latency (normalized to that of LRU)
REF reduces latency the most, by as much as 48% (36% on average).
SACM reduces it by as much as 7% (4% on average).
AACM reduces it by up to 40% (30% on average).
[Figure: normalized latency of LRU, REF, SACM, and AACM]

Evaluation : Latency with refresh
Latency (normalized to that of LRU)
With refresh operations included, the normalized latency of REF rises to as much as 6X.
[Figure: normalized latency with refresh of LRU, REF, SACM, and AACM]

Evaluation : Latency with refresh (without REF)
Latency (normalized to that of LRU)
With refresh operations included, the normalized latency of REF rises to as much as 6X.
SACM and AACM still perform better than LRU, though the margin has dwindled.
SACM decreases latency by 5% on average.
AACM decreases latency by 15% on average.
[Figure: normalized latency of LRU, SACM, and AACM]

Evaluation : Endurance
REF harms endurance because of its refresh operations.
[Figure: write counts of LRU, REF, SACM, and AACM]

Evaluation : Endurance (without REF)
REF harms endurance because of its refresh operations.
SACM shows write counts similar to LRU.
AACM incurs roughly 1% more writes than LRU (4% at maximum).
Considering the MLC PCM endurance (10⁵) and the total amount of writes (wm+online), the lifetime is estimated to be around 26 years.
[Figure: write counts of LRU, SACM, and AACM]

Evaluation : Energy consumption
Energy = N_read × E_read + N_write × E_write
The energy consumption of REF is 9 times higher than that of LRU (refresh overhead).
SACM reduces energy consumption by 11% on average.
AACM reduces energy consumption by 37% on average (and by as much as 49%).
AACM also saves an average of 13% of energy across the whole storage system, because of retention relaxation and fewer accesses to the SSD.
[Figure: energy consumption of LRU, REF, SACM, and AACM]
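
The energy model on this slide is a direct weighted sum, and a worked version is shown here for clarity. The operation counts and per-operation energies are inputs chosen by the caller; the helper `normalized_to` is a hypothetical convenience for comparing a scheme against a baseline such as LRU.

```python
def cache_energy(n_read, n_write, e_read, e_write):
    """Energy = N_read * E_read + N_write * E_write (units follow the inputs)."""
    return n_read * e_read + n_write * e_write

def normalized_to(baseline_energy, energy):
    """Energy of a scheme relative to a baseline; values below 1.0 are savings."""
    return energy / baseline_energy
```

A scheme whose energy normalizes to 0.63 against LRU, for instance, would correspond to a 37% saving.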

Evaluation : Hit ratio with various cache sizes
Hit ratio and latency with various cache sizes
AACM performs better than LRU when the cache size is small.
When the cache size becomes larger, both schemes show comparable performance, since LRU also keeps most of the cacheable data.
[Figure: hit ratio (50% to 100%) of LRU and AACM on hm_0, mds_0, prn_0, stg_0, usr_0, and webmail at cache sizes of 25%, 50%, and 80%]
