Encrypted Non-volatile Main Memory Systems
Yu Hua Huazhong University of Science and Technology https://csyhua.github.io/
Non-volatile Memory (NVM)
Non-volatile memory is expected to replace or complement DRAM in the memory hierarchy
√ Non-volatility, low power, high density, large capacity
× Limited write cell endurance
                  PCM         ReRAM        DRAM
Read (ns)         20-70       10-50        10
Write (ns)        150-220     30-100       10
Non-volatility    √           √            ×
Standby power     ~0          ~0           High
Endurance         10^7~10^9   10^8~10^12   10^15
Density (Gb/cm²)  13.5        24.5         9.1
NVM Security
Traditional DRAM: volatile
– If a DRAM DIMM is removed from a computer, its data is quickly lost
NVM: non-volatile
– If an NVM DIMM is removed, the data is retained
– An attacker can directly read the data
Two general attacks on NVM
– Stolen NVMM
– Bus snooping
Countermeasures: direct encryption (AES) and counter-mode encryption (OTP)
Encryption Increases Bit Writes to NVM
– Changing one bit of the original data modifies, on average, half of the bits in the encrypted data
Old data: 00000000…0000000000 → Encryption → 01011010…0010110100
New data: 10000000…0000000000 → Encryption → 10101100…0100101001
(1 of 512 bits modified in the plaintext; 256 of 512 bits modified in the ciphertext)
Memory encryption causes 50% bit flips on each write
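The ~50% figure follows from the avalanche effect of block ciphers. A minimal sketch, using SHA-256 from the Python standard library as a stand-in for the AES engine (an assumption; the systems above use AES):

```python
import hashlib

def encrypt_line(line: bytes) -> bytes:
    # Stand-in "encryption" of a 512-bit (64B) memory line: two SHA-256
    # digests emulate the avalanche behavior of a real block cipher.
    return hashlib.sha256(b"0" + line).digest() + hashlib.sha256(b"1" + line).digest()

def diff_bits(a: bytes, b: bytes) -> int:
    # Number of bit positions in which a and b differ.
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))

old = bytes(64)                        # all-zero 512-bit line
new = bytes([0x80]) + bytes(63)        # the same line with one bit flipped
print(diff_bits(old, new))             # 1 bit differs in the plaintext
flips = diff_bits(encrypt_line(old), encrypt_line(new))
print(flips)                           # roughly 256 of 512 bits differ
```

Because each encrypted write flips about half the cells, eliminating a duplicate write saves far more cell wear than the same write would cost on an unencrypted NVM.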
Observation
[Figure: duplication of memory writes measured across SPEC CPU2006 and PARSEC 2.1 benchmarks]
Motivation
– Improve secure NVM endurance
– Improve system performance
– Reduce duplicate write requests
Challenges
Lightweight deduplication without decreasing system performance
– Existing memory deduplication is performed out of line
– Existing in-line deduplication incurs high latency
Synergizing deduplication and encryption while delivering good performance
– Both are executed serially in the critical path of memory writes
– Both produce metadata overheads
DeWrite
Lightweight deduplication leveraging asymmetric NVM reads and writes
– Eliminate a write at the cost of a read latency
– Write latency is much higher than read latency (3~8×)
Efficient synergization of deduplication and encryption via parallelism and metadata colocation
– Opportunistically perform deduplication and encryption in parallel
– Co-locate their metadata storage to save space
Hardware Architecture
[Figure: the last-level cache feeds the memory controller, which contains a metadata cache, an AES-ctr engine, and the dedup logic; metadata is protected with direct encryption and kept in a metadata storage area, while non-duplicate data is protected with counter-mode encryption (CME/OTP) and written to the encrypted NVMM]
Memory Encryption for Security
Counter-mode encryption
– Hide the decryption latency
– Generate a One Time Pad (OTP) using a per-line counter
[Figure: in traditional encryption, decryption begins only after the memory access returns the ciphertext; in counter-mode encryption, the AES-ctr engine computes the OTP from the line address, the per-line counter, and the key in parallel with the memory access, reducing latency. Both encryption and decryption are an XOR with the OTP: Ciphertext = Plaintext ⊕ OTP.]
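Counter-mode encryption can be sketched as follows; SHA-256 stands in for the AES engine, and the key, address, and field widths are illustrative assumptions:

```python
import hashlib

def make_otp(key: bytes, line_addr: int, counter: int, length: int = 64) -> bytes:
    # The OTP depends only on (key, line address, counter), so it can be
    # computed while the memory access is still in flight.
    pad = b""
    block = 0
    while len(pad) < length:
        seed = (key + line_addr.to_bytes(8, "little")
                    + counter.to_bytes(8, "little") + block.to_bytes(4, "little"))
        pad += hashlib.sha256(seed).digest()
        block += 1
    return pad[:length]

def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

key = b"\x2a" * 16
plaintext = b"64-byte memory line".ljust(64, b".")
pad = make_otp(key, line_addr=0x1F40, counter=7)
ciphertext = xor(plaintext, pad)           # encryption
assert xor(ciphertext, pad) == plaintext   # decryption is the same XOR
```

Incrementing the counter on every write keeps the pad unique per write, which is why the counter itself must be tracked per line (and, as the SecPM part shows, must be persisted consistently).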
Prediction-based Parallelism
The direct (serial) way: a write request → detect duplication → if duplicate, cancel the write; otherwise, encrypt the data and write it to NVM
– Inefficient for applications where most lines are non-duplicate
The parallel way: a write request → detect duplication and encrypt the data in parallel → if duplicate, discard the ciphertext; otherwise, write it to NVM
– Inefficient for applications where most lines are duplicate
Prediction-based Parallelism
Predict that the duplication states of new writes are the same as those of their previous ones
– Rationale: the size of duplicate (non-duplicate) data is usually larger than a cache line
– Exploit the duplication states of the most recent memory writes: a history window records one bit per write (1: duplicate, 0: non-duplicate), and a new write is predicted to follow the recent history (after a run of duplicates, B is predicted duplicate and the serial path is used; after a run of non-duplicates, B is predicted non-duplicate and the parallel path is used)
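The predictor can be sketched as below; the window size and the majority vote are illustrative assumptions, since the slides specify only that a new write follows the recent duplication history:

```python
from collections import deque

class DupPredictor:
    def __init__(self, window: int = 4):
        # 1: duplicate, 0: non-duplicate, for the most recent writes.
        self.history = deque(maxlen=window)

    def predict_duplicate(self) -> bool:
        # Majority vote over the history window; an empty window defaults
        # to non-duplicate (take the parallel dedup+encrypt path).
        return bool(self.history) and 2 * sum(self.history) >= len(self.history)

    def record(self, was_duplicate: bool) -> None:
        self.history.append(int(was_duplicate))

p = DupPredictor()
for _ in range(4):
    p.record(True)                 # a run of duplicate writes
print(p.predict_duplicate())       # → True: use the serial (dedup-first) path
for _ in range(4):
    p.record(False)                # the run ends
print(p.predict_duplicate())       # → False: use the parallel path
```

A misprediction only costs wasted work (an unnecessary encryption, or a serialized dedup check), never correctness, which is what makes this opportunistic parallelism safe.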
Light-weight Deduplication for NVMM
Use a light-weight hash function instead of the cryptographic hash
– On a hash match, read the stored line and compare data byte by byte to rule out collisions (tQ: hash query time)
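A sketch of the dedup check; CRC32 is an illustrative choice of light-weight hash (the slides do not name one), and the byte-by-byte compare guards against hash collisions:

```python
import zlib

class DedupTable:
    def __init__(self):
        self.lines_by_hash = {}            # light-weight hash -> stored lines

    def check_and_insert(self, line: bytes) -> bool:
        """Return True if the line is a duplicate (the write can be
        eliminated), False if it is new and has been recorded."""
        h = zlib.crc32(line)               # tQ: light-weight hash query
        for stored in self.lines_by_hash.get(h, []):
            if stored == line:             # byte-by-byte verification
                return True
        self.lines_by_hash.setdefault(h, []).append(line)
        return False

table = DedupTable()
line = b"\x42" * 64
print(table.check_and_insert(line))    # → False: first write, stored
print(table.check_and_insert(line))    # → True: duplicate write eliminated
```

The byte-by-byte verification is what allows the weak hash: the hash only narrows candidates, and the extra read is cheap because NVM reads are 3~8× faster than writes.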
Evaluation
Benchmarks
– 12 benchmarks from SPEC CPU2006: single-threaded
– 8 benchmarks from PARSEC 2.1: multi-threaded
NVM Endurance
DeWrite reduces 54% writes to secure NVM on average
Write Speedup
DeWrite speeds up NVM writes by 4.2X on average
Read Speedup
DeWrite speeds up NVM reads by 3.1X on average
Persistence Issue
– Modern processors and caches usually reorder memory writes
– Volatile caches cause partial updates
[Figure: volatile CPU caches sit above the non-volatile NVM, connected by a 64-bit bus]
Consistency Guarantee for Persistence
Durable transaction: a commonly used solution
– NV-Heaps (ASPLOS’11), Mnemosyne (ASPLOS’11), DCT (ASPLOS’16), DudeTM (ASPLOS’17), NVML (Intel)
– Enable a group of memory updates to be performed in an atomic manner
Enforce write ordering
– Cache line flush and memory barrier instructions
Avoid partial update
– Logging
TX_BEGIN
  do some computation;
  // Prepare stage: backing up the data in log
  write undo log; flush log; memory_barrier();
  // Mutate stage: updating the data in place
  write data; flush data; memory_barrier();
  // Commit stage: invalidating the log
  log->valid = false; flush log->valid; memory_barrier();
TX_END
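The undo-logging transaction can be sketched in runnable form; flushes and barriers are modeled as no-ops on ordinary memory, and all names are illustrative:

```python
class UndoLogTx:
    def __init__(self, memory: dict):
        self.memory = memory
        self.log = {}          # address -> old value (the undo log)
        self.valid = False

    def write(self, addr, value):
        if addr not in self.log:
            self.log[addr] = self.memory.get(addr)   # Prepare: back up old data
            # flush log; memory_barrier()
            self.valid = True
        self.memory[addr] = value                    # Mutate: update in place
        # flush data; memory_barrier()

    def commit(self):
        self.valid = False                           # Commit: invalidate the log
        self.log.clear()

    def recover(self):
        if self.valid:                               # crash before commit:
            for addr, old in self.log.items():       # roll back from the undo log
                if old is None:
                    self.memory.pop(addr, None)
                else:
                    self.memory[addr] = old
            self.valid = False

mem = {"x": 1}
tx = UndoLogTx(mem)
tx.write("x", 2)
tx.recover()               # simulate a crash before commit
print(mem["x"])            # → 1: the old value is restored
```

The write ordering enforced by the flushes and barriers is exactly what guarantees the log is durable before the in-place update, so recovery always finds a usable old value.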
The Gap between Persistence and Security
Ensuring both security and persistence is hard
– Simply combining existing persistence schemes with memory encryption is inefficient
– Each write in the secure NVM has to persist two items: the data and its counter
Crash inconsistency
– Cache-line flush instructions cannot operate on the counter cache
– Memory barrier instructions fail to ensure the ordering of counter writes
Performance degradation
– Doubled write requests
SecPM: a Secure and Persistent Memory System
Performs only slight modifications to the memory controller and is transparent to programmers
– Programs running on an un-encrypted NVM can be directly executed on a secure NVM with SecPM
Consistency guarantee
– A counter cache write-through (CWT) scheme
Performance improvement
– A locality-aware counter write reduction (CWR) scheme
Asynchronous DRAM refresh (ADR): cache lines reaching the write queue can be considered durable.
[Figure: the last-level cache feeds the memory controller, which contains an AES-ctr engine, a counter cache, and the write queue; the plaintext is XORed with the OTP to produce the ciphertext, and counters and encrypted data are written to the encrypted NVM]
Counter Cache Write-through (CWT) Scheme
CWT ensures the crash consistency of both data and counter
– Append the counter of the data into the write queue while encrypting the data
– Ensure the counter is durable before the data flush completes
[Figure: on a flush Flu(A), the memory controller reads the counter Ac, increments it, encrypts A, appends Ac and then A to the write queue, acknowledges the flush, and returns]
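The CWT ordering can be sketched as follows (names are illustrative): the counter line is appended to the write queue before the data line, so by ADR it is durable before the flush of A completes:

```python
counters = {"A": 6}                   # per-line counters (illustrative)
write_queue = []                      # entries here are durable by ADR

def flush_line(line: str, data: str) -> None:
    counters[line] += 1                                # Read(Ac), Ac++
    ciphertext = f"Enc({data}#{counters[line]})"       # Enc(A) with the new counter
    write_queue.append(("ctr", line, counters[line]))  # App(Ac): counter first
    write_queue.append(("data", line, ciphertext))     # App(A): then the data
    # Ack(A)/Ret(A): the flush completes only after both entries are queued.

flush_line("A", "payload")
print([kind for kind, _, _ in write_queue])   # → ['ctr', 'data']
```

Writing the counter through to the queue on every data flush is what makes an unmodified `flush + barrier` program crash-consistent on encrypted NVM, at the cost of the extra counter writes that CWR then removes.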
Durable Transaction in SecPM
Stage     Log content   Log counter   Data content   Data counter   Recoverable?
Prepare   Wrong         Wrong         Correct        Correct        Yes
Mutate    Correct       Correct      Wrong          Wrong          Yes
Commit    Correct       Correct      Correct        Correct        Yes

At least one of the log and the data is correct in whichever stage a system failure occurs
The system can be recoverable in a consistent state in SecPM
Counter Write Reduction (CWR) Scheme
Leverage the spatial locality of counter storage and of log and data writes
– The spatial locality of counter storage: the counters of 64 contiguous memory lines are packed into one 64B counter line, which holds a shared 64-bit major counter M and 64 per-line 7-bit minor counters m1 … m64; each line is encrypted with M concatenated with its minor counter
– The spatial locality of log and data writes: the log and data writes of a transaction target contiguous memory lines, so their counters fall into a few counter lines
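The layout works out exactly because 64 + 64 × 7 = 512 bits = 64B. A minimal sketch of the per-line encryption counter (field widths from the slide; names are illustrative):

```python
MAJOR_BITS, MINOR_BITS, LINES_PER_COUNTER_LINE = 64, 7, 64
# One 64B counter line: a shared major counter plus 64 per-line minor counters.
assert MAJOR_BITS + LINES_PER_COUNTER_LINE * MINOR_BITS == 512

def line_counter(major: int, minors: list, i: int) -> int:
    # The encryption counter of memory line i is the shared major counter
    # concatenated with that line's own minor counter.
    return (major << MINOR_BITS) | minors[i]

minors = [0] * LINES_PER_COUNTER_LINE
minors[3] = 5
print(line_counter(major=2, minors=minors, i=3))   # → 261 (2 << 7 | 5)
```

Because 64 neighboring lines share one counter line, a burst of writes with spatial locality keeps dirtying the same counter line, which is the redundancy CWR exploits.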
Counter Write Reduction (CWR) Scheme
An illustration of the write queue when writing a log
– The counters Ac, Bc, Cc, and Dc are written into the same memory line
– The later counter cache lines contain the updated contents of the earlier ones
[Figure: the write queue (each cell is a cache line to be written into NVM) holds the log contents A, B, C, D interleaved with their counter lines Ac, Bc, Cc, Dc; Ac carries m1', Bc carries m1' m2', Cc carries m1' m2' m3', and Dc carries m1' m2' m3' m4', each alongside the unchanged minor counters and the major counter M]
Counter Write Reduction (CWR) Scheme
When a new cache line arrives, remove the existing cache line with the same physical address from the write queue
– Without causing any loss of data: the newer counter line already contains all updates of the removed one
– Use a flag to distinguish whether a cache line is from the CPU caches (1) or from the counter cache (0)
[Figure: as Bc, Cc, and Dc arrive, the stale counter lines Ac, Bc, and Cc are removed in turn; with CWR the queue holds A, B, C, D and only the final counter line Dc, whereas without CWR it also holds Ac, Bc, and Cc]
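The coalescing step can be sketched as below (the queue representation and flag handling are illustrative):

```python
class CWRQueue:
    """Sketch of CWR coalescing: a newly arriving counter-cache line removes
    the stale queued line with the same physical address; data lines from the
    CPU caches (flag 1) are never coalesced."""
    def __init__(self):
        self.entries = []                  # (addr, from_cpu_flag, payload)

    def append(self, addr: int, from_cpu: bool, payload: str) -> None:
        if not from_cpu:                   # counter line (flag 0)
            self.entries = [e for e in self.entries
                            if e[0] != addr or e[1]]   # drop the stale counter line
        self.entries.append((addr, from_cpu, payload))

q = CWRQueue()
q.append(0x10, True, "A")                  # data lines come from the CPU caches
q.append(0x99, False, "Ac: m1'")           # all counters share counter line 0x99
q.append(0x11, True, "B")
q.append(0x99, False, "Bc: m1' m2'")       # removes the stale Ac line
q.append(0x12, True, "C")
q.append(0x99, False, "Cc: m1' m2' m3'")   # removes the stale Bc line
print([p for _, _, p in q.entries])
# → ['A', 'B', 'C', "Cc: m1' m2' m3'"]
```

Coalescing is safe only for counter-cache lines, because each newer counter line is a superset of the removed one's updates; CPU-cache lines carry distinct data and must all be written.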
Performance Evaluation
CPU and caches: x86-64 CPU at 2 GHz; 32KB L1 data & instruction caches; 2MB L2 cache; 8MB shared L3 cache
Memory: PCM, 16GB capacity; read/write latency 150/450 ns; encryption/decryption latency 40 ns; counter cache 1MB with 10 ns latency
Storage benchmarks
– A hash table based key-value store – A B-tree based key-value store
The Number of NVM Write Requests
[Figure: NVM write requests of the hash table based KV store and the B-tree based KV store when the request size is 256B, 1KB, and 4KB, respectively]
Transaction Throughput
[Figure: transaction throughput of the hash table based KV store and the B-tree based KV store; SecPM improves throughput by up to ∼2.1× by reducing counter writes and the latency overhead of data encryption]
Conclusion
Both security and persistence of NVM are important
DeWrite is a line-level write reduction technique to enhance the endurance & performance
– Lightweight deduplication leveraging read/write asymmetry – Efficient synergization of deduplication and encryption via parallelism and metadata colocation
SecPM bridges the gap between security and persistence
– Guarantee consistency via a counter cache write-through (CWT) scheme – Improve performance via a locality-aware counter write reduction (CWR) scheme