Non-Volatile Memory (NVM): NAND, STT-MRAM, PCM, DRAM



SLIDE 1

Beomseok Nam

UNIST (Ulsan National Institute of Science and Technology)

SLIDE 2
  • Non-Volatile Memory (NVM)

Non-volatile, low-latency persistent memory

                      NAND          STT-MRAM    PCM          DRAM
Non-volatility        o             o           o            x
Read (ns)             2.5 × 10^4    5 - 30      20 - 70      10
Write (ns)            2 × 10^5      10 - 100    150 - 220    10
Byte-addressable      x             o           o            o
Density (Gbit/cm2)    185.8         0.36        13.5         9.1

  • K. Suzuki and S. Swanson. “A Survey of Trends in Non-Volatile Memory Technologies: 2000-2014”, IMW 2015
SLIDE 3
  • When granularity of atomicity = page

[Figure: 4K pages written with write() and made durable with fsync().]
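As a rough illustration of the page-granularity interface, the sketch below rewrites a whole 4K page with a positional write and then calls fsync(), even when only a few bytes change. The helper names (`write_page_durably`, `read_page`, `demo`) and the file name are hypothetical, and the sketch does not claim that a real file system makes the page write atomic.

```python
import os
import tempfile

PAGE_SIZE = 4096  # the 4K page from the slide


def write_page_durably(fd: int, page_no: int, payload: bytes) -> None:
    """Overwrite one whole page, then ask the OS to push it to the device."""
    if len(payload) > PAGE_SIZE:
        raise ValueError("payload does not fit in one page")
    page = payload.ljust(PAGE_SIZE, b"\0")    # the unit of I/O is the full page
    os.pwrite(fd, page, page_no * PAGE_SIZE)  # write(): data lands in the OS page cache
    os.fsync(fd)                              # fsync(): flush the file to storage


def read_page(fd: int, page_no: int) -> bytes:
    return os.pread(fd, PAGE_SIZE, page_no * PAGE_SIZE)


def demo():
    with tempfile.TemporaryDirectory() as d:
        fd = os.open(os.path.join(d, "db.file"), os.O_RDWR | os.O_CREAT, 0o600)
        try:
            write_page_durably(fd, 0, b"NORTH")  # 5 useful bytes, 4096 bytes written
            write_page_durably(fd, 2, b"EAST")
            return os.fstat(fd).st_size, read_page(fd, 2)[:4]
        finally:
            os.close(fd)
```

Even a five-byte record costs a full-page write plus an fsync() here, which is the overhead the cache-line-granularity slides contrast against.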

SLIDE 4

  • When granularity of atomicity = page

[Figure: 4K pages; write() followed by fsync().]

  • When granularity of atomicity = cache line

[Figure: 4K pages; store A, store B, clflush, clflush. Label: memory-level parallelism between A and B.]

SLIDE 5

  • When granularity of atomicity = page

[Figure: 4K pages; write() followed by fsync().]

  • When granularity of atomicity = cache line

[Figure: 4K pages; store A, store B, clflush, clflush.]

B can be stored first.

SLIDE 6

  • When granularity of atomicity = page

[Figure: 4K pages; write() followed by fsync().]

  • When granularity of atomicity = cache line

[Figure: a 4K page divided into cache lines; each cache line is written with store, mfence, clflush.]

The legacy block I/O interface requires unnecessarily many barriers and clflushes.
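The ordering problem on these slides can be sketched with a toy model of a write-back cache in front of persistent memory. Everything here is an assumption made for illustration: `ToyPM`, its methods, and the `evicted` parameter are hypothetical names, and Python cannot issue real clflush or mfence instructions. The model runs in program order, so mfence has nothing to do in it; on real hardware it would order the store and the flush.

```python
LINE = 64  # cache-line size in bytes (typical for x86)


class ToyPM:
    """Stores land in a volatile cache; only flushed or evicted lines persist."""

    def __init__(self, size: int) -> None:
        assert size % LINE == 0, "toy model wants whole cache lines"
        self.persistent = bytearray(size)  # survives a crash
        self.dirty = {}                    # line number -> volatile copy of the line

    def store(self, addr: int, data: bytes) -> None:
        for i, byte in enumerate(data):
            line, off = divmod(addr + i, LINE)
            if line not in self.dirty:
                start = line * LINE
                self.dirty[line] = bytearray(self.persistent[start:start + LINE])
            self.dirty[line][off] = byte

    def clflush(self, addr: int) -> None:
        line = addr // LINE
        if line in self.dirty:
            self.persistent[line * LINE:(line + 1) * LINE] = self.dirty.pop(line)

    def crash(self, evicted=()) -> None:
        """Power failure; `evicted` lists dirty lines the hardware wrote back early."""
        for line in evicted:
            self.clflush(line * LINE)
        self.dirty.clear()  # everything still in the cache is lost


def unordered_update():
    pm = ToyPM(128)
    pm.store(0, b"A")       # store A (cache line 0)
    pm.store(64, b"B")      # store B (cache line 1)
    pm.crash(evicted=[1])   # crash before either clflush; line 1 was evicted first
    return bytes(pm.persistent[0:1]), bytes(pm.persistent[64:65])


def ordered_update():
    pm = ToyPM(128)
    pm.store(0, b"A")
    pm.clflush(0)           # A is persistent before B is stored
    pm.store(64, b"B")
    pm.crash()              # crash before B is flushed
    return bytes(pm.persistent[0:1]), bytes(pm.persistent[64:65])
```

In `unordered_update` the crash leaves B persistent without A, which is the "B can be stored first" case; in `ordered_update` A is always persistent by the time B can reach memory.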

SLIDE 7
  • fsync() vs. a group of mfence and clflush instructions
  • Faster than flash memory, but there’s room for improvement.
  • Transactions need to be aware of the byte-addressability of NVM
  • Atomicity: all or nothing
  • Consistency: only valid data
  • Isolation: no interference
  • Durability: data is recoverable

SLIDE 8
SLIDE 9

[Figure: the record NORTH, shown twice.]

Single copy: minimize redundant write operations

SLIDE 10

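[Figure: a DB file on block device storage, with a persistent (PM) buffer cache above it. The query update(EAST) overwrites the record NORTH in place; the record is shown as NORTH, NORTH, and then EARTH.]

A system crash may result in inconsistent data: if only part of the overwrite reaches persistent memory, the record is left as EARTH, which is neither the old value nor the new one.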
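The torn update can be reproduced with a few lines of byte manipulation. This is only a sketch: `apply_partial_update` is a hypothetical helper, and the choice of two persisted bytes is an assumption that happens to yield the EARTH value shown on the slide.

```python
def apply_partial_update(record: bytes, new_value: bytes, bytes_persisted: int) -> bytes:
    """Return the record as found after a crash that persisted only a prefix of the overwrite."""
    buf = bytearray(record)
    n = min(bytes_persisted, len(new_value), len(buf))
    buf[:n] = new_value[:n]  # only the first n bytes of the new value made it
    return bytes(buf)


# Overwriting NORTH with EAST, crash after two bytes:
torn = apply_partial_update(b"NORTH", b"EAST", 2)  # b"EARTH"
```

With zero bytes persisted the old record survives intact; any count in between produces a mix of old and new bytes.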

SLIDE 11

[Figure: a slotted page with positions running from 1 to 1024. Slot header: metadata (number of records = 2) and the record offset array (1000, 900). Record content area: Key = 50 and Key = 30. The free space between the header and the records is labelled "Invisible".]

Logical view of this page: 50, 30

SLIDE 12

[Figure: the same slotted page after Key = 40 is written at offset 800. Slot header: number of records changes from 2 to 3, and the record offset array gains the entry 800. Record content area: Key = 50, Key = 30, Key = 40. The remaining free space is labelled "Invisible".]

Logical view of this page: 50, 30, 40

SLIDE 13

[Figure: the slotted page with Key = 40 written at offset 800 but the slot header still showing 2 records (offsets 1000, 900). The newly written record lies in the region labelled "Invisible".]

Logical view of this page: 50, 30
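The three slotted-page slides can be read as a two-step insert: first copy the record into free space, then update the slot header. The sketch below follows that reading under simplifying assumptions: `SlottedPage` and its methods are hypothetical, the header is kept outside the byte array, and record offsets will not match the 1000/900/800 values in the figures.

```python
class SlottedPage:
    """Toy slotted page: records grow down from the end; the header lists visible records."""

    def __init__(self, size: int = 1024) -> None:
        self.data = bytearray(size)  # record content area at the high end
        self.offsets = []            # slot header: (start, length) per visible record
        self.free_end = size         # free space ends here; derived from the header

    def write_record(self, record: bytes) -> int:
        """Step 1: copy the record into free space. The header is untouched."""
        start = self.free_end - len(record)
        if start < 0:
            raise ValueError("page full")
        self.data[start:self.free_end] = record
        return start

    def publish(self, start: int, length: int) -> None:
        """Step 2: update the slot header; only now is the record visible."""
        self.offsets.append((start, length))
        self.free_end = start

    def logical_view(self):
        return [bytes(self.data[s:s + n]) for s, n in self.offsets]


def demo():
    page = SlottedPage()
    for key in (b"50", b"30"):
        page.publish(page.write_record(key), len(key))
    start = page.write_record(b"40")       # written, but still invisible
    before = page.logical_view()
    page.publish(start, 2)                 # header update makes it visible
    return before, page.logical_view()
```

A crash between the two steps leaves the header unchanged, so readers still see the old logical view and the half-inserted record is simply ignored.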

SLIDE 14

Dirty Record of Slotted Page

Slot Header

3

SLIDE 15

[Figure: two slotted pages. Page A: 3 records (Key = 10, Key = 30, Key = 20), record offsets 1000, 900, 800, and free space. Page B: 2 records (Key = 50, Key = 40), record offsets 1000, 900, and free space.]

SLIDE 16

① Writing the record (invisible)

[Figure: pages A and B as before (A: 3 records, B: 2 records), with an additional copy of Key = 20 written outside the visible records, apparently into the free space of Page B. Neither slot header has changed.]

SLIDE 17

② Updating the slot header

[Figure: the slot header of Page A now shows 2 records (offsets 1000, 900); the slot header of Page B still shows 2 records. Key = 20 is labelled invisible in both pages.]

SLIDE 18

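[Figure: commit of a transaction spanning pages A and B. Labels: dirty record, commit. Page contents shown as (3 | 20, 10, 30) and (2 | 20, 50, 40); slot-header values 2 and 3.]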

SLIDE 19

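[Figure: commit of a transaction spanning pages A and B, continued. Labels: dirty record, commit. Page contents shown as (3 | 20, 10, 30) and (2 | 20, 50, 40), then 10, 30 and 20, 50, 40; slot-header values 2 and 3.]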

SLIDE 20

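[Figure: dirty slotted page A and dirty slotted page B, followed by a commit mark and the logged slot header of Page A and slot header of Page B (values 2 and 3).]

Recovery

[Figure: pages A and B restored from the logged slot headers.]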
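One way to read the multi-page slides is as slot-header logging: dirty records are written invisibly, the new slot headers are logged, a commit mark is set, and recovery reapplies the logged headers. The sketch below follows that reading and is not taken from the authors' implementation; `ToyStore`, its methods, the `stop_after` crash hook, and the scenario of moving Key = 20 from Page A to Page B are all assumptions.

```python
class ToyStore:
    """Pages plus a slot-header log, all imagined to live in persistent memory."""

    def __init__(self, pages) -> None:
        # page id -> records physically written, and header = keys visible to readers
        self.pages = {pid: {"records": set(keys), "header": list(keys)}
                      for pid, keys in pages.items()}
        self.log = {}           # page id -> new slot header
        self.committed = False  # commit mark

    def write_record(self, pid, key) -> None:
        self.pages[pid]["records"].add(key)  # invisible: header unchanged

    def log_header(self, pid, new_header) -> None:
        self.log[pid] = list(new_header)

    def commit(self) -> None:
        self.committed = True                # the single commit point

    def checkpoint(self, stop_after=None) -> None:
        """Copy logged headers into pages; `stop_after` simulates a crash partway."""
        for i, (pid, header) in enumerate(self.log.items()):
            if stop_after is not None and i >= stop_after:
                return                       # simulated crash
            self.pages[pid]["header"] = list(header)
        self.log.clear()
        self.committed = False

    def recover(self) -> None:
        if self.committed:
            self.checkpoint()                # redo: reapplying a header is idempotent
        else:
            self.log.clear()                 # discard: dirty records stay invisible

    def view(self, pid):
        return list(self.pages[pid]["header"])


def move_key_20(crash_point: str):
    store = ToyStore({"A": [10, 30, 20], "B": [50, 40]})
    store.write_record("B", 20)              # step 1: write the record invisibly
    store.log_header("A", [10, 30])          # step 2: log both new slot headers
    store.log_header("B", [50, 40, 20])
    if crash_point == "after_commit":
        store.commit()
        store.checkpoint(stop_after=1)       # only Page A's header was applied
    store.recover()
    return store.view("A"), store.view("B")
```

A crash before the commit mark leaves both pages in their old state, while a crash after it is repaired by replaying both headers, so Key = 20 is never visible in both pages or in neither after recovery.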

SLIDE 21
SLIDE 22

                        NVWAL                   FASH                   FAST
Single page update      Differential logging    Slot-header logging    In-place commit
Multiple page update    Differential logging    Slot-header logging    Slot-header logging
Buffer cache            In DRAM                 In PM                  In PM
Log                     In PM                   In PM                  In PM
Architecture            Hybrid memory           PM-only                PM-only

[Figure labels: DB File, WAL File, Volatile Buffer Cache, Persistent Buffer Cache.]

SLIDE 23

[Figure: performance comparison chart; annotated speedups of 2.1x and 2.6x.]

SLIDE 24
  • FAST and FASH consistently outperform NVWAL
  • FAST and FASH do not duplicate write operations for records
  • NVWAL generates large log frames for large records
  • FASH calls more clflush instructions for small record sizes
  • The reason is that with smaller records, a slotted page can hold more records, so its record offset array (and thus its slot header) is larger
  • FAST calls about 3 clflush instructions when the record is smaller than 64 bytes
  • The slot-header size of FAST must be less than 64 bytes.
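The clflush counts above come down to how many cache lines a byte range touches. The helper below is a small worked example under the assumption of 64-byte cache lines; `cache_lines_touched` is a hypothetical name, and the addresses are made up for illustration.

```python
LINE = 64  # assumed cache-line size in bytes


def cache_lines_touched(addr: int, size: int) -> int:
    """Number of cache lines (and thus clflush calls) covering [addr, addr + size)."""
    if size <= 0:
        return 0
    first = addr // LINE
    last = (addr + size - 1) // LINE
    return last - first + 1


aligned_header = cache_lines_touched(0, 48)    # 1: fits in a single line
straddling = cache_lines_touched(60, 8)        # 2: bytes 60..67 cross a line boundary
large_header = cache_lines_touched(0, 200)     # 4: a long offset array needs more flushes
```

A slot header that is smaller than 64 bytes and does not straddle a line boundary needs a single flush, which is presumably why the slide bounds its size for FAST.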
SLIDE 25
  • “Failure-atomic slotted paging scheme” eliminates the necessity of redundant copies by integrating logging into database buffer caching.
  • PM-only memory systems can perform faster than hybrid memory systems that consist of both PM and DRAM
  • Even with a small PM, we can significantly reduce IO traffic via Slot-Header Journaling.

SLIDE 26