SLIDE 1
Beomseok Nam
UNIST (Ulsan National Institute of Science and Technology)
SLIDE 2
- Non-Volatile Memory (NVM): non-volatile, low-latency persistent memory

                     NAND        STT-MRAM   PCM        DRAM
Non-volatile         yes         yes        yes        no
Read (ns)            2.5 x 10^4  5 - 30     20 - 70    10
Write (ns)           2 x 10^5    10 - 100   150 - 220  10
Byte-addressable     no          yes        yes        yes
Density (Gbit/cm2)   185.8       0.36       13.5       9.1

- Source: K. Suzuki and S. Swanson. "A Survey of Trends in Non-Volatile Memory Technologies: 2000-2014", IMW 2015
SLIDE 3
- When granularity of atomicity = page
[Figure: 4K pages are written with write() and made durable with fsync()]
SLIDE 4
- When granularity of atomicity = page
[Figure: 4K pages written with write(), then fsync()]
- When granularity of atomicity = cache line
[Figure: store A, store B, clflush, clflush; memory-level parallelism between A and B]
SLIDE 5
- When granularity of atomicity = page
[Figure: 4K pages written with write(), then fsync()]
- When granularity of atomicity = cache line
[Figure: store A, store B, clflush, clflush; B can be stored first.]
SLIDE 6
- When granularity of atomicity = page
[Figure: 4K pages written with write(), then fsync()]
- When granularity of atomicity = cache line
[Figure: each cache line of a 4K page takes a store, mfence, clflush sequence]
- The legacy block I/O interface requires unnecessarily many barriers and clflushes.
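The ordering problem on the preceding slides (B can become durable before A unless a flush and a barrier separate the two stores) can be illustrated with a small simulation. This is only a sketch under a toy model, not real hardware behavior: the function name `crash_states` and the step tuples are hypothetical, a dirty line is assumed to be written back at any moment, and clflush is assumed to complete immediately, so mfence is a no-op in the model.

```python
from itertools import combinations

def crash_states(program):
    """Return every set of lines that may be durable if a crash hits at any point.

    program is a list of steps: ("store", line), ("clflush", line) or ("mfence",).
    Toy model (an assumption): a dirty line may be written back by the hardware
    at any time, and clflush completes immediately, so mfence is a no-op here.
    """
    durable, dirty, states = set(), set(), set()

    def record():
        # Any subset of the dirty lines may already have reached memory.
        for r in range(len(dirty) + 1):
            for subset in combinations(sorted(dirty), r):
                states.add(frozenset(durable | set(subset)))

    for step in program:
        record()  # crash just before this step
        if step[0] == "store":
            dirty.add(step[1])
        elif step[0] == "clflush" and step[1] in dirty:
            dirty.remove(step[1])
            durable.add(step[1])
    record()  # crash after the last step
    return states

unordered = [("store", "A"), ("store", "B"), ("clflush", "A"), ("clflush", "B")]
ordered = [("store", "A"), ("clflush", "A"), ("mfence",),
           ("store", "B"), ("clflush", "B")]

print(frozenset({"B"}) in crash_states(unordered))  # True: B alone can be durable
print(frozenset({"B"}) in crash_states(ordered))    # False: A is durable before B is stored
```

In the unordered program a crash can leave B in memory without A; flushing A before storing B rules that state out, at the cost of one barrier per ordered store.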
SLIDE 7
- fsync() vs. a group of mfence and clflush instructions
- Faster than flash memory, but there’s room for improvement.
- Transactions need to be aware of the byte-addressability of NVM.
- ACID: Atomicity (all or nothing), Consistency, Isolation, Durability
SLIDE 8
SLIDE 9
[Figure: the record NORTH]
- Single copy: minimize redundant write operations
SLIDE 10
[Figure: DB file on block device storage; buffer cache in persistent memory (PM)]
- Query: update(EAST)
- A system crash may result in inconsistent data.
[Figure: the record NORTH, partially overwritten in place, reads EARTH]
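The NORTH / EARTH example can be reproduced with a few lines of code. This is a minimal sketch: the helper `overwrite_in_place` and its `crash_after` parameter are hypothetical names, and a crash is modeled simply as stopping the byte-by-byte overwrite early.

```python
from typing import Optional

def overwrite_in_place(buf: bytearray, new: bytes,
                       crash_after: Optional[int] = None) -> None:
    """Overwrite buf byte by byte; stop after crash_after bytes to model a crash."""
    for i, byte in enumerate(new):
        if crash_after is not None and i >= crash_after:
            return  # simulated crash: the remaining bytes are never written
        buf[i] = byte

record = bytearray(b"NORTH")
overwrite_in_place(record, b"EAST", crash_after=2)
print(record.decode())  # EARTH
```

Only "EA" reached the record before the simulated crash, so the persistent copy is neither the old value nor the new one.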
SLIDE 11
[Figure: slotted page of 1024 bytes. The slot header holds the metadata: number of records (2) and the record offset array (1000, 900). The record content area holds Key = 50 and Key = 30; the rest is free space.]
- Logical view: 50, 30
- Invisible: free space
SLIDE 12
[Figure: the same page after Key = 40 is written at offset 800. The slot header changes from 2 records (offsets 1000, 900) to 3 records, with offset 800 added to the record offset array.]
- Logical view: 50, 30, 40
- Invisible: free space
SLIDE 13
[Figure: Key = 40 has been written at offset 800, but the slot header still records 2 records (offsets 1000, 900).]
- Logical view: 50, 30
- Invisible: free space, including Key = 40, which the slot header does not yet reference
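The two-step insert shown on these slides (write the record into free space first, then update the slot header) can be sketched as follows. This is an illustrative model only: the class `SlottedPage` and its methods are hypothetical, the mapping of keys to offsets (50 at 1000, 30 at 900, 40 at 800) is assumed from the figures, and the header update is simply modeled as one atomic assignment.

```python
class SlottedPage:
    """Toy slotted page: a record is visible only if the slot header references it."""

    def __init__(self):
        self.content = {}  # record content area: offset -> key
        self.header = ()   # slot header: offsets of the visible records

    def write_record(self, offset, key):
        # Step 1: write into free space. The header is untouched,
        # so the record stays invisible.
        self.content[offset] = key

    def commit(self, new_header):
        # Step 2: replace the slot header (modeled as a single atomic update).
        self.header = tuple(new_header)

    def logical_view(self):
        return [self.content[offset] for offset in self.header]

page = SlottedPage()
page.write_record(1000, 50)
page.write_record(900, 30)
page.commit([1000, 900])

page.write_record(800, 40)          # record written, header not yet updated
view_before = page.logical_view()   # a crash here leaves the old view intact
page.commit([1000, 900, 800])
view_after = page.logical_view()

print(view_before)  # [50, 30]
print(view_after)   # [50, 30, 40]
```

Because readers reach records only through the header, a crash between the two steps leaves the page in its previous consistent state.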
SLIDE 14
Dirty Record of Slotted Page
Slot Header
3
SLIDE 15
[Figure: Page A holds 3 records (Key = 10, Key = 30, Key = 20); Page B holds 2 records (Key = 50, Key = 40). Each page has a slot header with its record count and offset array, plus free space.]
SLIDE 16
[Figure: Key = 20 is written into the free space of Page B; the slot headers of Page A (3 records) and Page B (2 records) are unchanged.]
① Writing the record (still invisible)
SLIDE 17
[Figure: the slot header of Page A now records 2 records; the slot header of Page B still records 2. Key = 20 is invisible in both pages.]
② Updating the slot header of Page A
SLIDE 18
[Figure: Pages A and B with their dirty records and record counts (2, 3) at commit]
SLIDE 19
[Figure: Pages A and B with their dirty records and record counts (2, 3) at commit]
SLIDE 20
[Figure: dirty slotted pages A and B, their slot headers (record counts 2 and 3), the commit, and recovery]
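A transaction that touches two pages, as in the Page A / Page B example, can be sketched with a slot-header log. This is a simplified model, not the authors' implementation: the names `Page`, `run_transaction`, `recover` and `make_pages`, the log entry layout, and the offsets are all hypothetical. New records go into free space, the new slot headers go into the log, and a commit mark decides whether recovery applies them.

```python
class Page:
    def __init__(self, content, header):
        self.content = dict(content)  # offset -> key
        self.header = tuple(header)   # offsets of the visible records

    def view(self):
        return [self.content[offset] for offset in self.header]

def run_transaction(pages, log, new_records, new_headers, crash_before_commit=False):
    for name, offset, key in new_records:      # 1. write records (invisible)
        pages[name].content[offset] = key
    for name, header in new_headers.items():   # 2. log the new slot headers
        log.append(("header", name, tuple(header)))
    if crash_before_commit:
        return                                 # simulated crash
    log.append(("commit",))                    # 3. commit mark

def recover(pages, log):
    # Apply logged headers only if the commit mark made it to the log.
    if ("commit",) in log:
        for entry in log:
            if entry[0] == "header":
                pages[entry[1]].header = entry[2]
    log.clear()

def make_pages():
    return {"A": Page({1000: 10, 900: 30, 800: 20}, [1000, 900, 800]),
            "B": Page({1000: 50, 900: 40}, [1000, 900])}

move = dict(new_records=[("B", 800, 20)],
            new_headers={"A": [1000, 900], "B": [1000, 900, 800]})

pages, log = make_pages(), []
run_transaction(pages, log, crash_before_commit=True, **move)
recover(pages, log)
aborted = (pages["A"].view(), pages["B"].view())

pages, log = make_pages(), []
run_transaction(pages, log, **move)
recover(pages, log)
committed = (pages["A"].view(), pages["B"].view())

print(aborted)    # ([10, 30, 20], [50, 40])
print(committed)  # ([10, 30], [50, 40, 20])
```

Only slot headers are logged; the record itself is written once, into the page where it finally lives.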
SLIDE 21
SLIDE 22
                       NVWAL                 FASH                 FAST
Single page update     Differential logging  Slot-header logging  In-place commit
Multiple page update   Differential logging  Slot-header logging  Slot-header logging
Buffer cache           In DRAM               In PM                In PM
Log                    In PM                 In PM                In PM

[Figure: hybrid memory architecture vs. PM-only architecture; components shown: DB file, WAL file, volatile buffer cache, persistent buffer cache]
SLIDE 23
[Figure: performance comparison; annotated speedups 2.1x and 2.6x]
SLIDE 24
- FAST and FASH consistently outperform NVWAL
- FAST and FASH do not duplicate write operations for records
- NVWAL generates large log frames for large records
- FASH calls more clflush instructions for small record sizes, because with smaller records a slotted page can hold more records (and so has a larger slot header)
- FAST calls about 3 clflush instructions when the record is smaller than 64 bytes
- The slot-header size of FAST must be less than 64 bytes.
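One way to read the "about 3 clflush" figure is to count the 64-byte cache lines an insert touches. The sketch below is an assumption about the accounting, not a breakdown taken from the slides: the helpers `lines_touched` and `flushes_for_insert` are hypothetical, and the slot header is assumed to fit in a single cache line.

```python
CACHE_LINE = 64  # bytes per cache line on x86

def lines_touched(offset, size):
    """Number of cache lines covered by the byte range [offset, offset + size)."""
    first = offset // CACHE_LINE
    last = (offset + size - 1) // CACHE_LINE
    return last - first + 1

def flushes_for_insert(offset, size, header_lines=1):
    """Flushes for one insert: the record's lines plus the slot header's lines.

    header_lines=1 assumes a slot header smaller than 64 bytes.
    """
    return lines_touched(offset, size) + header_lines

print(lines_touched(832, 48))       # 1: the record sits inside one line
print(lines_touched(800, 48))       # 2: the record straddles a line boundary
print(flushes_for_insert(800, 48))  # 3
```

Under this accounting a record smaller than 64 bytes covers one or two lines, so an insert needs two or three flushes once the header line is included.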
SLIDE 25
- The failure-atomic slotted paging scheme eliminates the need for redundant copies by integrating logging into database buffer caching.
- PM-only memory systems can perform faster than hybrid memory systems that consist of both PM and DRAM.
- Even with a small PM, we can significantly reduce I/O traffic via slot-header journaling.
SLIDE 26