Unified Address Translation for Memory Mapped SSDs with FlashMap

SLIDE 1

Unified Address Translation for Memory Mapped SSDs with FlashMap

Jian Huang, Anirudh Badam†, Moinuddin K. Qureshi, Karsten Schwan

SLIDE 3

Bridging the DRAM-Disk Gap

DRAM: High Performance, Small Capacity (Application Memory Component)
SSD: Good Performance, Good Capacity
Disk: Low Performance, Large Capacity (Application Storage Component)

SLIDE 10

Flash: Slow Memory or Fast Disk?

Flash behaves more like memory than disk

No Seek Latency + Internal Parallelism + High IOPS

Use Flash as Memory [Badam et al., NSDI’11]

DRAM: High Performance, Small Capacity (Application Memory Component)
SSD: Good Performance, Good Capacity
Disk: Low Performance, Large Capacity (Application Storage Component)

SLIDE 16

Memory Mapped SSDs

Stack: Application, Virtual Memory, File, Filesystem, SSD
Interface: mmap(), munmap(), msync()
Result: Extended Memory

YAY! MORE MEM!!!

Minimal Code Modifications + Data Durability
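The three calls named on this slide can be exercised from Python's standard `mmap` module. This is a minimal sketch, not FlashMap itself: the file name `extended_memory.bin` and the four-page size are assumptions chosen for illustration, and the file lives in a temporary directory rather than on an SSD.

```python
import mmap
import os
import tempfile

PAGE = mmap.PAGESIZE  # page size reported by the host

# Hypothetical backing file; in the slide's setting it would live on the SSD.
path = os.path.join(tempfile.mkdtemp(), "extended_memory.bin")
with open(path, "wb") as f:
    f.truncate(4 * PAGE)  # reserve four pages of file-backed memory

with open(path, "r+b") as f:
    region = mmap.mmap(f.fileno(), 4 * PAGE)  # mmap(): map the file
    region[0:5] = b"hello"                    # an ordinary memory store
    region.flush()                            # msync(): push changes to the file
    region.close()                            # munmap(): drop the mapping

# Re-read through the regular file interface to confirm the data persisted.
with open(path, "rb") as f:
    persisted = f.read(5)
print(persisted)
```

The application code changes only where it obtains its buffer, which is the "minimal code modifications" point; the `flush()` call is where durability comes from.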

SLIDE 20

No Free Lunch: Software Overhead

Layers: Application, Virtual Memory System, File System, Flash Translation Layer, Flash

Virtual Memory System (Page Table & Memory Manager): Virtual Address to Physical Address/File Offset, on page fault
File System (File Index): File Offset to Logical Block Address
Flash Translation Layer (FTL): Logical Block Address to Physical Block Address
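The chain of lookups on this slide can be sketched as three dictionaries consulted in sequence. Everything here is a toy assumption (the 4 KB page size, the table contents, and the names `page_table`, `file_index`, `ftl_map`); it only illustrates that one access crosses three independent mapping tables, and that those tables could in principle be collapsed into one.

```python
PAGE_SIZE = 4096  # assumed page/block size in bytes

# Hypothetical toy tables, one per layer of the stack.
page_table = {0x7F00_0000_0000 // PAGE_SIZE: 3}  # virtual page -> file page
file_index = {3: 120}                            # file page -> logical block
ftl_map = {120: 9051}                            # logical block -> physical block


def translate(vaddr: int) -> tuple[int, int]:
    """Walk all three layers; return (physical block, offset in block)."""
    vpage, offset = divmod(vaddr, PAGE_SIZE)
    file_page = page_table[vpage]  # 1. virtual memory system
    lba = file_index[file_page]    # 2. file system
    pba = ftl_map[lba]             # 3. flash translation layer
    return pba, offset


# Collapsing the three tables into one gives a single-lookup mapping.
unified = {vp: ftl_map[file_index[fp]] for vp, fp in page_table.items()}

print(translate(0x7F00_0000_0000 + 42))
```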

SLIDE 21

Software Overhead Quantified

Layers: Application, Virtual Memory System, File System, Flash Translation Layer, Flash

3 address translations + 2 boundary checks + 2 permission checks
= Latency: 15–20 microseconds
+ Increased Metadata Overhead

SLIDE 23

FlashMap: Unified Address Translation

Application
Unified Address Translation (replaces Virtual Memory System, File System, and Flash Translation Layer)
Flash

Reduced Latency: only 1 address translation + 1 permission check + 1 boundary check
Reduced Storage: only 1 mapping table

SLIDE 28

Combining Page Table and File System

Components: Process A, VM Region, Page Table, File System, File, Flash Translation Layer, Flash
Virtual address fields: PGD, PUD, PMD, PTE, Offset

Process-specific, private
Only for mapped file
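The PGD, PUD, PMD, PTE, Offset fields follow the Linux naming for a four-level page table. As a minimal sketch, assuming x86-64 with 4 KB pages (9 index bits per level and a 12-bit offset, 48 bits in total), a virtual address splits as follows. The function name `split_vaddr` and the sample address are illustrative only.

```python
def split_vaddr(vaddr: int) -> dict[str, int]:
    """Split a 48-bit virtual address into 4-level page-table indices.

    Assumes x86-64 with 4 KB pages: 9 index bits per level, 12 offset bits.
    """
    return {
        "pgd": (vaddr >> 39) & 0x1FF,
        "pud": (vaddr >> 30) & 0x1FF,
        "pmd": (vaddr >> 21) & 0x1FF,
        "pte": (vaddr >> 12) & 0x1FF,
        "offset": vaddr & 0xFFF,
    }


parts = split_vaddr(0x7F12_3456_7ABC)
print(parts)
```

Each index selects one of 512 entries in the table at that level; only the last level (PTE) names the actual page.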

SLIDE 35

Preserving File System Permissions

Process A and Process B each map the same file (Mapped File) through their own VM Region.

Shared Page Table: one mapping is READ_ONLY, the other READ_WRITE
Permission Conflict!

Only share the leaf-level page table pages!

Process A: PGD, PUD, PMD, PTE, Offset
Process B: PGD, PUD, PMD, PTE, Offset
Upper levels: Private. Leaf-level page table pages: Shared.
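One way to picture the leaf-only sharing is a toy model in which each process keeps a private upper-level entry that carries its own write permission and points at a common leaf table. The slides do not spell out the mechanism, so this is an assumption-laden sketch: the classes `LeafTable` and `UpperEntry`, and the choice of which process is read-only, are hypothetical. It relies on the x86-style rule that a write is allowed only if every level of the walk permits it.

```python
from dataclasses import dataclass, field


@dataclass
class LeafTable:
    """Leaf-level page table page: one set of PTEs shared by all mappers."""
    ptes: dict[int, int] = field(default_factory=dict)  # index -> page number


@dataclass
class UpperEntry:
    """Private upper-level entry pointing at a (possibly shared) leaf."""
    leaf: LeafTable
    writable: bool  # per-process permission kept above the leaf level


def can_write(entry: UpperEntry, index: int) -> bool:
    """Allow a write only if the page is mapped and this process's
    private upper-level entry grants write access."""
    return index in entry.leaf.ptes and entry.writable


shared_leaf = LeafTable({0: 42})
proc_a = UpperEntry(shared_leaf, writable=False)  # READ_ONLY mapping (assumed)
proc_b = UpperEntry(shared_leaf, writable=True)   # READ_WRITE mapping (assumed)

print(can_write(proc_a, 0), can_write(proc_b, 0))
```

Both processes see the same translation, yet each keeps its own permission, so the conflict on the slide does not arise.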

SLIDE 36

Page Table in FlashMap

Before Mapping a File
Process’s Private Virtual Memory + File Backed Memory

Process: Page Directory
Private Virtual Memory Regions: Private Leaf-Level Page Table Pages
Virtual Memory Regions Backed by File: Shared Leaf-Level Page Table Pages

SLIDE 37

Page Table in FlashMap

After Mapping a File
Process’s Private Virtual Memory + File Backed Memory

Process: Page Directory
Private Virtual Memory Regions: Private Leaf-Level Page Table Pages
Virtual Memory Regions Backed by File: Shared Leaf-Level Page Table Pages (only for mapped file)

SLIDE 40

Preserving Memory Protection

Components: Process A, Process B, File System, Mapped File, Shared Leaf-level Page Table Pages, Flash Translation Layer, Flash
Process A: PGD, PUD, PMD, PTE, Offset
Process B: PGD, PUD, PMD, PTE, Offset

What if I require custom memory protection for a single page?

Private Leaf-level Page Table

SLIDE 43

Combining FTL and Shared Page Table

Components: Process A, Process B, Mapped File, Shared Leaf-level Page Table Pages, Flash Translation Layer, Flash
Process A: PGD, PUD, PMD, PTE, Offset
Process B: PGD, PUD, PMD, PTE, Offset

Flash Translation Layer before: Mapping Table, GC, ECC, Wear Leveling
Flash Translation Layer after: GC, ECC, Wear Leveling
The Mapping Table moves into the shared leaf-level page table pages: Overloaded PTE …
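"Overloaded PTE" suggests that one entry holds either a DRAM location or a flash location, depending on where the page currently lives. The slides do not give the bit layout, so the following is a hypothetical encoding, assuming a present bit in bit 0 and the page number stored above bit 12; the names `make_pte` and `decode_pte` are illustrative.

```python
PRESENT = 1 << 0   # assumed flag bit: set when the page is cached in DRAM
ADDR_SHIFT = 12    # assumed: page number is stored above the flag bits


def make_pte(page_number: int, in_dram: bool) -> int:
    """Pack a DRAM frame number or a flash page number into one entry."""
    return (page_number << ADDR_SHIFT) | (PRESENT if in_dram else 0)


def decode_pte(pte: int) -> tuple[str, int]:
    """Return where the page lives and its page number there."""
    location = "dram" if pte & PRESENT else "flash"
    return location, pte >> ADDR_SHIFT


print(decode_pte(make_pte(9051, in_dram=False)))
print(decode_pte(make_pte(7, in_dram=True)))
```

With an entry like this, the page table itself can answer the question the FTL mapping table used to answer, which is why the separate table can go away.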

SLIDE 53

Putting It All Together

Components: Process A, Mapped File, Shared Leaf-level Page Table Pages, DRAM, Flash, FlashMap
Virtual address fields: PGD, PUD, PMD, PTE, Offset

Read: page fault, then update PTE
Write: DRAM hit, or DRAM miss followed by update PTE
GC …
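The read, write, and GC steps on this slide can be walked through with a toy model. This is a sketch under stated assumptions, not the FlashMap implementation: the class `ToyFlashMap`, its `backing` dictionary (which remembers a page's flash location while the page is in DRAM), and the frame allocator are all hypothetical.

```python
class ToyFlashMap:
    """Toy model: each PTE is ("dram", frame) or ("flash", page)."""

    def __init__(self, flash_pages: dict[int, int]):
        # virtual page -> flash page; nothing is cached in DRAM yet
        self.pte = {vp: ("flash", fp) for vp, fp in flash_pages.items()}
        self.backing = dict(flash_pages)  # hypothetical record of flash homes
        self.next_frame = 0
        self.faults = 0

    def access(self, vpage: int) -> str:
        """Read or write: a DRAM hit needs no work, a miss faults the page in."""
        where, _ = self.pte[vpage]
        if where == "dram":
            return "hit"
        self.faults += 1                             # page fault
        self.pte[vpage] = ("dram", self.next_frame)  # update PTE
        self.next_frame += 1
        return "miss"

    def gc_move(self, vpage: int, new_flash_page: int) -> None:
        """Garbage collection relocated the page on flash."""
        self.backing[vpage] = new_flash_page
        if self.pte[vpage][0] == "flash":
            self.pte[vpage] = ("flash", new_flash_page)  # update PTE


fm = ToyFlashMap({0: 100, 1: 101})
first = fm.access(0)   # not in DRAM yet
second = fm.access(0)  # now cached
fm.gc_move(1, 205)     # page 1 moved on flash while uncached
print(first, second, fm.pte[1])
```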

SLIDE 57

FlashMap: Implementation in Real System

Linux Memory Manager
File System (EXT4)
Mapped File … File Index
SSD Emulator
RAMDisk
Real SSD

SLIDE 58

Experimental Setup

Baseline: unmodified Linux: mmap + EXT4 + FTL with page-level mapping
FTL+FS★: mmap + combined FTL & file system
FlashMap: unified address translation

Hardware: Intel Xeon processors + 64 GB DRAM + 2 TB SSD

★ similar to Nameless Writes [Zhang et al., FAST’12] and DFS [Josephson et al., FAST’10]

SLIDE 59

Real Application Workloads

NoSQL Store + YCSB
SQL Database: Shore-MT + TPCC, TPCB, TATP
Graph Analytics + PageRank

SLIDE 60

Metadata Size for 2 TB SSD

[Bar chart: Metadata Size (GB) for Baseline, FTL+FS, FlashMap; annotation: 50%]
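A back-of-envelope calculation shows why the number of mapping tables matters at this capacity. The 4 KB mapping granularity and 8-byte entry size are assumptions, not figures from the slides, and the result is only an order-of-magnitude estimate rather than a reproduction of the chart.

```python
SSD_BYTES = 2 * 2**40   # 2 TB SSD from the slide, taken as 2 TiB
PAGE_BYTES = 4 * 2**10  # assumption: 4 KB mapping granularity
ENTRY_BYTES = 8         # assumption: 8-byte mapping entry

entries = SSD_BYTES // PAGE_BYTES
table_gib = entries * ENTRY_BYTES / 2**30  # size of one full mapping table

for name, tables in [("three separate tables", 3),
                     ("two tables", 2),
                     ("one unified table", 1)]:
    print(f"{name}: {tables * table_gib:.0f} GiB")
```

Under these assumptions each page-granularity table costs about 4 GiB, so every table removed frees a noticeable share of a 64 GB DRAM budget.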

SLIDE 63

Benefits from Reduced Mapping Overhead

[Plot: Speedup vs. SSD size : DRAM size (16, 32, 64, 128, 256) for Baseline, FTL+FS, FlashMap]

FlashMap: 1.7x performance improvement over FTL+FS

[Plot: DRAM Hit Rate (%) vs. SSD size : DRAM size (16, 32, 64, 128, 256)]

Reducing the mapping overhead improves the DRAM caching efficiency

SLIDE 64

Latency Reduction

[Plot: Latency (us) vs. Device latency (us) for Baseline, FTL+FS, FlashMap]

Benefit (up to 53% latency reduction) mainly comes from the combination of page table and file system

SLIDE 65

Benefits from Reduced Latency

[Plot: Throughput (K TPS) vs. Device Latency (us) for Baseline, FTL+FS, FlashMap]

FlashMap: 1.8x more TPS than baseline and FTL+FS

SLIDE 66

Conclusion

Application
Unified Address Translation
Flash as Memory

Reduced Storage: 3.3x performance improvement for data-intensive applications
Reduced Latency: 53% latency reduction for high-end SSDs, 1.8x more TPS for latency-sensitive applications, e.g., database systems

SLIDE 67

Thanks! Q&A

Jian Huang (jian.huang@gatech.edu), Anirudh Badam†, Moinuddin K. Qureshi, Karsten Schwan