Improving the Performance and Endurance of Encrypted Non-volatile Main Memory through Deduplicating Writes

  1. Improving the Performance and Endurance of Encrypted Non-volatile Main Memory through Deduplicating Writes
     Pengfei Zuo, Yu Hua, Ming Zhao*, Wen Zhou, Yuncheng Guo
     Huazhong University of Science and Technology (HUST), China
     *Arizona State University (ASU), USA

  2. Non-volatile Memory (NVM)
     • Non-volatile memory is expected to replace or complement DRAM in the memory hierarchy
     • Non-volatility, low power, high density, large capacity

                          PCM          ReRAM         DRAM
     Read (ns)            20-70        20-50         10
     Write (ns)           150-220      70-140        10
     Non-volatility       √            √             ×
     Standby power        ~0           ~0            High
     Endurance            10^7~10^9    10^8~10^12    10^15
     Density (Gb/cm^2)    13.5         24.5          9.1

     C. Xu et al. “Overcoming the Challenges of Crossbar Resistive Memory Architectures”, HPCA, 2015.
     K. Suzuki and S. Swanson. “A Survey of Trends in Non-Volatile Memory Technologies: 2000-2014”, IMW 2015.

  3. Endurance and Security in Non-volatile Memory
     • NVM typically has limited endurance
       – 10^7~10^9 writes for PCM, 10^8~10^12 writes for ReRAM
       – Writes have much higher latency than reads
       – Write reduction matters for NVM
     • NVM is vulnerable to stolen-DIMM attacks
       – NVM still retains data after the system powers down
       – An attacker can directly read data from the stolen NVM
       – Memory encryption matters for NVM

  4. Encryption Increases Bit Flips to NVM
     • Diffusion property of encryption
       – Changing one bit of the original data modifies about half of the bits in the encrypted data
     • Without encryption: new data 10000000…000000000000 overwrites old data 00000000…000000000000 in NVM: 1 of 512 bits modified
     • With encryption: new data 10101100…000100101001 overwrites old data 01011010…000010110100 in NVM: 256 of 512 bits modified
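
The effect on slide 4 can be reproduced with a small sketch. It assumes counter-mode encryption with a per-line counter (as on the later Metadata Colocation slide) and uses a keyed SHAKE-256 hash as a stand-in for the AES-generated pad, so it is not real AES. The key, line address, and all function names are hypothetical.

```python
import hashlib

LINE_BYTES = 64  # 512-bit memory line, as in the slide's example


def one_time_pad(key: bytes, line_addr: int, counter: int) -> bytes:
    """Stand-in for the AES-based pad: a keyed hash, NOT real AES (assumption)."""
    seed = key + line_addr.to_bytes(8, "little") + counter.to_bytes(8, "little")
    return hashlib.shake_256(seed).digest(LINE_BYTES)


def encrypt(plain: bytes, key: bytes, line_addr: int, counter: int) -> bytes:
    """XOR the plaintext with a pad derived from (key, address, counter)."""
    pad = one_time_pad(key, line_addr, counter)
    return bytes(p ^ k for p, k in zip(plain, pad))


def bit_flips(old: bytes, new: bytes) -> int:
    """Number of bit positions that differ between two equal-length lines."""
    return sum(bin(a ^ b).count("1") for a, b in zip(old, new))


key = b"hypothetical-key"
old_plain = bytes(LINE_BYTES)                      # 000...0
new_plain = bytes([0x80]) + bytes(LINE_BYTES - 1)  # 100...0 (one bit changed)

plain_flips = bit_flips(old_plain, new_plain)

# The counter is incremented on every write, so the pad changes completely.
old_cipher = encrypt(old_plain, key, line_addr=0x40, counter=1)
new_cipher = encrypt(new_plain, key, line_addr=0x40, counter=2)
cipher_flips = bit_flips(old_cipher, new_cipher)

print(plain_flips, "of 512 bits flipped without encryption")
print(cipher_flips, "of 512 bits flipped with encryption")
```

With any pad that behaves like random bits, the second count should land near half of the line, which is why bit-level write reduction stops helping once data is encrypted.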

  5. Encryption Increases Bit Flips to NVM
     [Figure: bit flips with and without encryption, annotated "4X". Source: Young et al. “DEUCE: Write-efficient encryption for non-volatile memories”, in Proc. of ASPLOS, 2015.]
     • Encryption renders existing bit-level write reduction techniques ineffective

  6. Observation and Motivation
     [Figure panels: PARSEC 2.1, SPEC CPU2006]
     • A large number of entire-line duplicates exist in real-world applications

  7. DeWrite
     • Lightweight cache-line-level deduplication for NVMM
       – Employ lightweight hashing
       – Leverage NVM read/write asymmetry
       – Eliminate a write at the cost of a read
     • Efficient synergy between deduplication and encryption
       – Opportunistic parallelism
       – Metadata storage co-location
     [Figure: Hardware Architecture. Labels: Last Level Cache; Memory Controller with Dedup Logic, Metadata Cache and AES-ctr (OTP); non-duplicate data: direct encryption; metadata: CME; Encrypted NVMM with data and metadata storage.]

  8. Prediction-based Parallelism
     • The direct way: a write request → detect duplication → is it a duplicate?
       – Yes: cancel the write
       – No: encrypt data → write to NVM
     • Inefficient for non-duplicate writes
       – Serial execution latency

  9. Prediction-based Parallelism
     • The direct way: a write request → detect duplication → is it a duplicate?
       – Yes: cancel the write
       – No: encrypt data → write to NVM
       – Inefficient for non-duplicate writes: serial execution latency
     • The parallel way: a write request → detect duplication and encrypt data at the same time → is it a duplicate?
       – No: write to NVM
       – Yes: discard the ciphertext
       – Inefficient for duplicate writes: unnecessary encryption

  11. Prediction-based Parallelism
      • Prediction selects between the direct way and the parallel way for each write request
        – Predicted duplicate: the direct way (detect duplication first, cancel the write if it is a duplicate)
        – Predicted non-duplicate: the parallel way (detect duplication and encrypt data at the same time)
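
The trade-off between the two ways can be sketched as a toy latency model. The 91 ns detection bound and the 150 ns write latency are taken from other slides in this deck; the 40 ns encryption latency is a made-up placeholder, and the function names are hypothetical. Each function returns the latency and the number of wasted encryptions.

```python
DETECT_NS = 91    # worst-case duplicate detection (Lightweight Deduplication slide)
ENCRYPT_NS = 40   # hypothetical encryption latency, NOT from the slides
WRITE_NS = 150    # lower bound of PCM write latency (NVM comparison table)


def direct_way(is_duplicate: bool) -> tuple[int, int]:
    """Detect first; encrypt and write only if the line is not a duplicate."""
    if is_duplicate:
        return DETECT_NS, 0
    return DETECT_NS + ENCRYPT_NS + WRITE_NS, 0


def parallel_way(is_duplicate: bool) -> tuple[int, int]:
    """Detect and encrypt concurrently; a duplicate wastes one encryption."""
    overlapped = max(DETECT_NS, ENCRYPT_NS)
    if is_duplicate:
        return overlapped, 1
    return overlapped + WRITE_NS, 0


def predicted_way(predicted_duplicate: bool, is_duplicate: bool) -> tuple[int, int]:
    """Use the direct way when a duplicate is predicted, else the parallel way."""
    if predicted_duplicate:
        return direct_way(is_duplicate)
    return parallel_way(is_duplicate)


for predicted in (True, False):
    for actual in (True, False):
        latency, wasted = predicted_way(predicted, actual)
        print(f"predicted={predicted} actual={actual}: {latency} ns, wasted encryptions={wasted}")
```

In this model a correct prediction gets the best of both ways, while a misprediction costs either serial latency (predicted duplicate, actually not) or one discarded ciphertext (predicted non-duplicate, actually duplicate).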

  12-24. Prediction-based Parallelism
      • How can we know beforehand whether a cache line is a duplicate?
      • Observation: the duplication state of most memory writes is the same as that of the previous write
      • A prediction scheme: a history window
      [Figure, animated over slides 12-24: writes A, B, C, D travel from the CPU to memory; the history window records the duplication state of each write (1: duplicate, 0: non-duplicate) and is used to predict the state of the next write. Accuracy annotations: 92.1%, 93.6%.]

  25. Prediction-based Parallelism
      • Prediction accuracy shown for the scheme: 92.1%, 93.6%
      • Why can we achieve such a high prediction accuracy?
      • Rationale: the size of duplicate (non-duplicate) data is usually much larger than a cache line
        – E.g., a page (4KB) is duplicate or non-duplicate: 100% accuracy
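
A history-window predictor of this kind can be sketched as follows. The slides do not say how the window contents are turned into a prediction, so the majority vote, the 3-entry window, the 64-byte cache line, and the synthetic trace are all assumptions made for illustration.

```python
from collections import deque


class HistoryWindowPredictor:
    """Records recent duplication states (1: duplicate, 0: non-duplicate)."""

    def __init__(self, window_size: int = 3):
        self.window = deque(maxlen=window_size)

    def predict(self) -> bool:
        """Predict 'duplicate' when most recorded states are 1 (assumed policy)."""
        if not self.window:
            return False  # no history yet: assume non-duplicate
        return 2 * sum(self.window) > len(self.window)

    def record(self, was_duplicate: bool) -> None:
        """Push the observed state; the oldest entry falls out of the window."""
        self.window.append(1 if was_duplicate else 0)


def accuracy(states, window_size: int = 3) -> float:
    """Fraction of writes whose duplication state was predicted correctly."""
    predictor = HistoryWindowPredictor(window_size)
    correct = 0
    for actual in states:
        if predictor.predict() == actual:
            correct += 1
        predictor.record(actual)
    return correct / len(states)


# A 4 KB page holds 64 cache lines of 64 bytes; model runs of whole pages.
LINES_PER_PAGE = 4096 // 64
trace = ([True] * LINES_PER_PAGE + [False] * LINES_PER_PAGE) * 4
print(f"accuracy on synthetic trace: {accuracy(trace):.3f}")
```

On a trace made of page-sized runs, the only mispredictions occur at the boundaries between runs, which matches the rationale on the slide: the longer the runs relative to a cache line, the higher the accuracy.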

  26. Lightweight Deduplication for NVMM
      • Traditional deduplication: SHA1/MD5 → match?
        – Y: duplicate
        – N: non-duplicate
        – Hash computation latency: >300ns, comparable to (≈) the NVM write latency

  27. Lightweight Deduplication for NVMM
      • DeWrite: the latency is 91ns at most
        – CRC-32 (15ns) → match? N: non-duplicate
        – Y: read data and compare (75ns + 1ns) → match? N: non-duplicate; Y: duplicate
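
The two-step check (cheap CRC-32 first, then read and compare to rule out collisions) can be sketched in software. This is only an illustration: the class and field names are hypothetical, physical lines are append-only, and reference counting and reclamation of unused lines are left out.

```python
import zlib
from collections import defaultdict


class DedupStore:
    """Toy model of line-level deduplication with CRC-32 plus read-and-compare."""

    def __init__(self):
        self.lines = []                  # physical lines (append-only in this sketch)
        self.by_crc = defaultdict(list)  # CRC-32 value -> physical line indices
        self.mapping = {}                # logical address -> physical line index
        self.nvm_writes = 0              # number of lines actually written

    def write(self, addr: int, data: bytes) -> bool:
        """Return True if the write was eliminated as a duplicate."""
        crc = zlib.crc32(data)
        for idx in self.by_crc[crc]:
            # CRC-32 can collide, so read the candidate line and compare it.
            if self.lines[idx] == data:
                self.mapping[addr] = idx
                return True
        self.lines.append(data)
        self.nvm_writes += 1
        idx = len(self.lines) - 1
        self.by_crc[crc].append(idx)
        self.mapping[addr] = idx
        return False

    def read(self, addr: int) -> bytes:
        """Follow the address mapping to the physical line."""
        return self.lines[self.mapping[addr]]


store = DedupStore()
line_a = b"A" * 64
print(store.write(0, line_a))    # first copy is written
print(store.write(64, line_a))   # second copy is detected as a duplicate
print(store.nvm_writes)
```

The read-and-compare step is what lets a weak 32-bit checksum replace a cryptographic hash: a collision costs one extra read rather than a correctness bug, which fits the read/write asymmetry of NVM.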

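  28. Metadata Colocation
      • Encryption metadata: per-line counter
      [Figure: counter-mode encryption. AES-ctr takes the key, LineAddr and the counter and produces a one-time pad (OTP). Encryption: plaintext + OTP → ciphertext. Decryption: ciphertext + OTP → plaintext, where + is XOR.]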
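
The role of the per-line counter can be sketched as a small memory model. A keyed SHAKE-256 hash stands in for AES here, so this is not real AES-ctr; the class name, the key, and the choice to bump the counter on every write are assumptions for illustration.

```python
import hashlib

LINE_BYTES = 64  # one 512-bit memory line


def make_pad(key: bytes, line_addr: int, counter: int) -> bytes:
    """Stand-in for the AES-generated one-time pad (keyed hash, NOT real AES)."""
    seed = key + line_addr.to_bytes(8, "little") + counter.to_bytes(8, "little")
    return hashlib.shake_256(seed).digest(LINE_BYTES)


def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


class EncryptedMemory:
    """Stores only ciphertext plus one counter per line (the encryption metadata)."""

    def __init__(self, key: bytes):
        self.key = key
        self.counters = {}  # line address -> per-line counter
        self.cells = {}     # line address -> ciphertext

    def write(self, line_addr: int, plaintext: bytes) -> None:
        # Bump the counter so the same pad is never reused for this line.
        counter = self.counters.get(line_addr, 0) + 1
        self.counters[line_addr] = counter
        self.cells[line_addr] = xor(plaintext, make_pad(self.key, line_addr, counter))

    def read(self, line_addr: int) -> bytes:
        # The pad depends only on key, address and counter, so it can be
        # regenerated without knowing the data.
        pad = make_pad(self.key, line_addr, self.counters[line_addr])
        return xor(self.cells[line_addr], pad)


mem = EncryptedMemory(b"hypothetical-key")
data = bytes(range(LINE_BYTES))
mem.write(0x40, data)
print(mem.read(0x40) == data)
```

Because decryption needs the counter for every line, the counters are metadata that must be stored alongside the data.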

  29. Metadata Colocation
      • Encryption metadata: per-line counter
      • Deduplication metadata: address mapping, reverted hash
