Logging in Persistent Memory: to Cache, or Not to Cache?

  1. Logging in Persistent Memory: to Cache, or Not to Cache? Mengjie Li, Matheus Ogleari, Jishen Zhao

  2. Persistent Memory
     • Memory (CPU load/store): DRAM; not persistent
     • Persistent memory (CPU load/store): NVRAM such as STT-RAM, PCM, ReRAM, NVDIMM, battery-backed DRAM, etc.; persistent
     • Storage (fopen, fread, fwrite, …): disk/flash; persistent
     These nonvolatile devices retain data in a consistent state in case of power loss.

  3. Logging in Persistent Memory
     Update persistent memory with transactions:
         Tx_begin
           do some reads
           do some computation
           Rlog(addr(C), new_val(C))
           memory_barrier
           write C
         Tx_commit
     [Figure: cores with private L1 caches share an LLC above NVRAM. NVRAM holds a data structure reachable from Root (nodes A, B, C, D) and a log. The log entry Log_C' is written, followed by the barrier, before C is overwritten with C'. A timeline shows the corresponding micro-ops: the log write, then store C'1, store C'2, ...]
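
The ordering on this slide (log the new value, fence, then write the data in place) can be sketched in plain C. This is only an illustration under stated assumptions: the names `log_entry_t`, `tx_log`, `tx_begin`, `tx_write`, and `tx_commit` are hypothetical, the log lives in ordinary memory rather than NVRAM, and `atomic_thread_fence` stands in for the slide's `memory_barrier`.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical redo-log entry: the address being updated and its new value. */
typedef struct {
    int *addr;
    int  new_val;
} log_entry_t;

/* Hypothetical log region; on a real system this would be placed in NVRAM. */
#define LOG_CAPACITY 16
static log_entry_t tx_log[LOG_CAPACITY];
static size_t      tx_log_len = 0;

/* Tx_begin: start the transaction with an empty log. */
static void tx_begin(void) { tx_log_len = 0; }

/* Log (addr, new_val) first, issue a barrier, then update the data in place. */
static void tx_write(int *addr, int new_val) {
    assert(tx_log_len < LOG_CAPACITY);
    tx_log[tx_log_len].addr    = addr;
    tx_log[tx_log_len].new_val = new_val;
    tx_log_len++;
    /* Stand-in for memory_barrier: keeps the log stores ahead of the data store. */
    atomic_thread_fence(memory_order_seq_cst);
    *addr = new_val;
}

/* Tx_commit: report how many updates were logged and truncate the log. */
static size_t tx_commit(void) {
    atomic_thread_fence(memory_order_seq_cst);
    size_t n = tx_log_len;
    tx_log_len = 0;
    return n;
}
```

A caller would wrap each update as `tx_begin(); tx_write(&c, new_value); tx_commit();`. A real persistent-memory implementation would also need to make the log durable before the data write, which this sketch does not attempt.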

  4. To cache, or not to cache? That is the question. [Mengjie Li+, MEMSYS 2017]

  5. Experimental Setup
     • Desktop: Dell OptiPlex 7040 Tower
       o CPU: 4-core 3.4 GHz Intel Core i7
       o Cache: 8 MB last-level cache
     • Measurement tools: perf and rdtsc
     • Micro-benchmarks: each is run 20 times and the average performance is reported, excluding initialization time
       o Various working set sizes
       o Various transaction sizes and write intensities
       o Various data structures: hashtable, rbtree, array, …
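
The measurement method described here (run 20 times, average, exclude initialization) might look roughly like the harness below. It is a sketch, not the authors' harness: `read_ticks`, `average_ticks`, and the `demo_*` workload are hypothetical names, `__rdtsc()` is taken from `<x86intrin.h>` as shipped with GCC and Clang, and non-x86 builds fall back to the standard `clock()`, which counts in different units.

```c
#include <assert.h>
#include <stdint.h>
#include <time.h>
#if defined(__x86_64__) || defined(__i386__)
#include <x86intrin.h>   /* __rdtsc */
#endif

/* Read a tick counter: the time-stamp counter on x86, clock() elsewhere. */
static uint64_t read_ticks(void) {
#if defined(__x86_64__) || defined(__i386__)
    return __rdtsc();
#else
    return (uint64_t)clock();
#endif
}

typedef void (*bench_fn)(void);

/* Run `body` `runs` times and return the mean tick count per run.
 * `init` (optional) is called before each run and is not timed. */
static double average_ticks(bench_fn init, bench_fn body, int runs) {
    assert(runs > 0);
    uint64_t total = 0;
    for (int r = 0; r < runs; ++r) {
        if (init) init();                 /* initialization: outside the timed region */
        uint64_t start = read_ticks();
        body();
        total += read_ticks() - start;
    }
    return (double)total / runs;
}

/* Hypothetical workload used only to exercise the harness. */
static int          demo_init_calls = 0;
static volatile int demo_sink = 0;
static void demo_init(void) { ++demo_init_calls; }
static void demo_body(void) { for (int i = 0; i < 1000; ++i) demo_sink += i; }
```

Calling `average_ticks(demo_init, demo_body, 20)` mirrors the 20-run average on the slide. Pinning the thread to one core would make the TSC readings more trustworthy, which this sketch leaves out.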

  6. Microbenchmark Example

         // Initialization
         Create an array of strings

         // Uncacheable log
         for (i = 0; i < array_size; ++i) {
             value = random_string;
             key = i;
             // Log updates
             // Intrinsic functions to invoke movnti
             _mm_stream_si32(&log[2 * i], key);
             asm volatile ("sfence");
             _mm_stream_si32(&log[2 * i + 1], value);
             asm volatile ("sfence");
             array[i] = value;
         }

         // Cacheable log
         for (i = 0; i < array_size; ++i) {
             value = random_string;
             key = i;
             // Log updates
             log[2 * i] = key;
             log[2 * i + 1] = value;
             asm volatile ("sfence");
             array[i] = value;
         }

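  7. Issue with Cacheable Log: Cache Pollution
     [Figure: two cores, each with L1 instruction and data caches, share a last-level cache connected over the memory bus to DRAM and NVM. Log entries occupy the L1 data caches and the last-level cache on their way to the log in NVM.]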

  8. LLC Miss Rate and Execution Time
     [Figure: chart comparing the uncacheable and cacheable logs. Axes: LLC miss rate (50% to 90%) and execution time (million cycles, 0.0 to 1.4).]

  9. What about the performance of an uncacheable log?

  10. How do we make the log uncacheable?
      Example: x86 processors provide uncacheable (non-temporal) write instructions (movnti, movntq, etc.). These instructions can be invoked through:
      • Inline assembly (__asm__())
      • Intrinsic functions (_mm_stream_si32)

  11. Write-Combining Buffer (WCB)
      [Figure: each core has a WCB of 4-6 cache lines next to its L1 cache. Uncacheable log writes are collected in the WCB and sent over the memory bus to the log in NVM.]

  12. Issues with Uncacheable Log
      • Existing uncacheable write schemes are sub-optimal:
        o Partial writes in the WCB
        o Overhead of uncacheable write instructions
        o Limited WCB size

  13. Partial Writes in WCB
      • Full write: 64B leaves the WCB for memory in 1 bus clock
      • Partial write: fewer than 64B leaves the WCB for memory, also in 1 bus clock
      Partial writes are inefficient because they underutilize the memory bus bandwidth.
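
The difference between the two patterns can be sketched as follows, assuming a 64-byte write-combining line and a 4-byte `int`. The helper names (`nt_store_i32`, `store_fence`, `log_partial`, `log_full`) are hypothetical; `_mm_stream_si32` and `_mm_sfence` are the SSE2 intrinsics from `<emmintrin.h>`, with a plain-store fallback on targets without SSE2. The sketch only counts fences; it does not measure bus utilization.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#if defined(__SSE2__)
#include <emmintrin.h>   /* _mm_stream_si32, _mm_sfence */
#endif

#define LINE_BYTES    64
#define INTS_PER_LINE (LINE_BYTES / sizeof(int))   /* 16 when int is 4 bytes */

/* Non-temporal 4-byte store (movnti); plain store where SSE2 is unavailable. */
static void nt_store_i32(int *p, int v) {
#if defined(__SSE2__)
    _mm_stream_si32(p, v);
#else
    *p = v;
#endif
}

/* Store fence (sfence); a no-op in the non-SSE2 fallback. */
static void store_fence(void) {
#if defined(__SSE2__)
    _mm_sfence();
#endif
}

/* Partial-write pattern: fence after every 4-byte store, so a buffer may be
 * drained while it holds far less than a full line. Returns the fence count. */
static size_t log_partial(int *log, const int *src, size_t n_ints) {
    size_t fences = 0;
    for (size_t i = 0; i < n_ints; ++i) {
        nt_store_i32(&log[i], src[i]);
        store_fence();
        ++fences;
    }
    return fences;
}

/* Full-write pattern: fill one whole 64-byte line, then fence once per line. */
static size_t log_full(int *log, const int *src, size_t n_ints) {
    assert(n_ints % INTS_PER_LINE == 0);
    assert((uintptr_t)log % LINE_BYTES == 0);   /* log must start on a line boundary */
    size_t fences = 0;
    for (size_t i = 0; i < n_ints; i += INTS_PER_LINE) {
        for (size_t j = 0; j < INTS_PER_LINE; ++j)
            nt_store_i32(&log[i + j], src[i + j]);
        store_fence();
        ++fences;
    }
    return fences;
}
```

For 32 integers, `log_partial` issues 32 fences while `log_full` issues 2. Whether batching like this is acceptable depends on the ordering a transaction needs between its log entries and its data writes.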

  14. Execution Time vs. Transaction Size: Partial Writes
      • Partial writes: string size 4B, 2097152 iterations, 8MB total data; 1.28E09 cycles
      • Full writes: string size 64B, 131072 iterations, 8MB total data; 1.15E08 cycles
      [Figure: bar chart of normalized execution time (0% to 100%) for the uncacheable and cacheable logs in each case.]

  15. Overhead of Uncacheable Write Instructions

         // Uncacheable log
         for (i = 0; i < array_size; ++i) {
             value = random_string;
             key = i;
             // Log updates
             // Intrinsic functions to invoke movnti
             _mm_stream_si32(&log[2 * i], key);
             asm volatile ("sfence");
             _mm_stream_si32(&log[2 * i + 1], value);
             asm volatile ("sfence");
             array[i] = value;
         }

         // Cacheable log
         for (i = 0; i < array_size; ++i) {
             value = random_string;
             key = i;
             // Log updates
             log[2 * i] = key;
             log[2 * i + 1] = value;
             asm volatile ("sfence");
             array[i] = value;
         }

  16. Overhead of Uncacheable Write Instructions
      Type casting adds further overhead when the data being written is not an integer:

          void _mm_stream_si32(int *p, int a);
          asm ("movnti %1, %0" : "=m" (*p) : "r" (v));   // int *p, int v;
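
As an illustration of that conversion cost, the sketch below writes a byte string through the `int`-only `_mm_stream_si32` interface by repacking it 4 bytes at a time. `nt_copy_bytes` and its helpers are hypothetical names, the length is assumed to be a multiple of `sizeof(int)`, and targets without SSE2 fall back to plain stores.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#if defined(__SSE2__)
#include <emmintrin.h>   /* _mm_stream_si32, _mm_sfence */
#endif

/* Non-temporal 4-byte store (movnti); plain store where SSE2 is unavailable. */
static void nt_store_i32(int *p, int v) {
#if defined(__SSE2__)
    _mm_stream_si32(p, v);
#else
    *p = v;
#endif
}

/* Store fence (sfence); a no-op in the non-SSE2 fallback. */
static void store_fence(void) {
#if defined(__SSE2__)
    _mm_sfence();
#endif
}

/* Copy `len` bytes to `dst` with non-temporal stores. Because the intrinsic
 * accepts only int, every 4 bytes are first repacked into an int; this
 * repacking is the extra work for non-integer data. */
static void nt_copy_bytes(void *dst, const void *src, size_t len) {
    assert(len % sizeof(int) == 0);
    assert((uintptr_t)dst % sizeof(int) == 0);   /* dst must be int-aligned */
    int *d = (int *)dst;
    const unsigned char *s = (const unsigned char *)src;
    for (size_t i = 0; i < len / sizeof(int); ++i) {
        int word;
        memcpy(&word, s + i * sizeof(int), sizeof word);   /* bytes -> int */
        nt_store_i32(&d[i], word);
    }
    store_fence();   /* one fence after the whole string */
}
```

`memcpy` is used for the repacking instead of a pointer cast on the source so the code stays within C's aliasing rules; compilers typically reduce it to a single 4-byte load.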

  17. Issues with Limited WCB Size
      [Figure: log updates among transactions issued by the program pass through the WCB and over the bus to NVRAM.]

  18. Inefficiencies of Uncacheable Log

          String size (bytes)   Iterations
          4                     2097152
          8                     1048576
          16                    524288
          32                    262144
          64                    131072
          128                   65536
          256                   32768

      [Figure: execution time (billion cycles, 0.0 to 3.5) of the uncacheable and cacheable logs, and the speedup (1.0 to 1.6), plotted against string size (4 to 256 bytes). Annotations on the chart: "Partial writes and sfence" and "WCB size limit".]
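
The iteration counts listed for this slide are consistent with holding the total data at the 8MB stated on slide 14 and dividing by the string size. A small check, assuming 8MB means 8 × 1024 × 1024 bytes and using the hypothetical name `iterations_for`:

```c
#include <assert.h>
#include <stddef.h>

/* Total data written per experiment: 8 MB, taken as 8 * 1024 * 1024 bytes. */
#define TOTAL_BYTES ((size_t)8 * 1024 * 1024)

/* Iterations needed to write TOTAL_BYTES using strings of `string_size` bytes. */
static size_t iterations_for(size_t string_size) {
    assert(string_size > 0 && TOTAL_BYTES % string_size == 0);
    return TOTAL_BYTES / string_size;
}
```

For example, `iterations_for(4)` gives 2097152 and `iterations_for(256)` gives 32768, matching the first and last rows.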

  19. Summary
      • Tradeoff between cacheable and uncacheable logs
        o Issue with cacheable log: cache contamination
        o Issues with uncacheable log: sub-optimal design in
          • Uncacheable write instructions and programming interface
          • Hardware components, e.g., the write-combining buffer design and the way it is used
      • More results
        o Sensitivity study on read/write ratio in transactions
        o Sensitivity study on transaction size
        o Other data structures: hash table, rbtree, b+tree, etc.

  20. Logging in Persistent Memory: to Cache, or Not to Cache? Mengjie Li, Matheus Ogleari, Jishen Zhao
