1
Ziggurat: A Tiered File System for Non-Volatile Main Memories and Disks
Shengan Zheng†, Morteza Hoseinzadeh§, Steven Swanson§
† Shanghai Jiao Tong University § University of California, San Diego
2
[Figure labels: DRAM + Flash NVDIMM; 3D-XPoint NVDIMM]
3
[Figure: comparison of storage devices: Hard Disk Drive, SATA SSD, NVMe SSD, Optane SSD, NVMM, DRAM. Axis ticks: 100MB/s, 1GB/s, 10GB/s and 0.01, 0.1, 1, 10.]
10
[Figure: two write patterns on a file with pages 1 to 7, each with its own file log and file data.
Pattern 1: write(0,2); fsync(); write(2,2); fsync(); write(4,2); fsync();
Pattern 2: write(0,2); write(2,2); write(4,2); fsync();
Write entries appended to each file log, in order: 0,2  2,2  4,2]
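The two call sequences on this slide (one with an fsync() after every write, one with a single trailing fsync()) differ only in how much data accumulates between fsync() calls. A minimal sketch of that distinction follows. It assumes the second argument of write() is a length in blocks. The function names and the threshold of 4 blocks are hypothetical, picked only so that the two sequences land on different sides. The slides do not state the rule or a threshold.

```python
def blocks_between_fsyncs(ops):
    """Return how many blocks were written before each fsync in `ops`.

    `ops` is a list of ("write", offset, length) and ("fsync",) tuples.
    """
    counts, pending = [], 0
    for op in ops:
        if op[0] == "write":
            pending += op[2]          # accumulate the length of this write
        elif op[0] == "fsync":
            counts.append(pending)    # close the current interval
            pending = 0
    return counts


def looks_synchronous(ops, threshold):
    """Hypothetical rule: every fsync interval holds at most `threshold` blocks."""
    counts = blocks_between_fsyncs(ops)
    return bool(counts) and max(counts) <= threshold


# The two sequences from the slide.
pattern_1 = [("write", 0, 2), ("fsync",), ("write", 2, 2), ("fsync",),
             ("write", 4, 2), ("fsync",)]
pattern_2 = [("write", 0, 2), ("write", 2, 2), ("write", 4, 2), ("fsync",)]

print(blocks_between_fsyncs(pattern_1), looks_synchronous(pattern_1, threshold=4))
print(blocks_between_fsyncs(pattern_2), looks_synchronous(pattern_2, threshold=4))
```

Both sequences append the same three write entries, so the log contents alone do not separate them. Only the fsync() spacing does.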
21
[Figure: file with pages 1 to 7. Write entries already in the file log: 0,4,3  4,4,1  5,1,0  6,1,0
Three writes are then applied, each appending a write entry whose third field is first shown as "?" and then filled in:
write(0,4); appends 0,4,? which becomes 0,4,4
write(6,1); appends 6,1,? which becomes 6,1,0
write(4,4); appends 4,4,? which becomes 4,4,0]
31
File data: pages 1 to 7. Write entries in the file log: 0,2,2  2,2,4  4,2,6

$$\frac{2 \ast 2 + 4 \ast 2 + 6 \ast 2}{2 + 2 + 2}$$

After a further write entry 6,2,8 is appended:

$$\frac{4 \ast 6 + 8 \ast 2}{6 + 2}$$
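The two fractions on this slide can be checked with a few lines of Python. The sketch reads each write entry as (offset, length, time) and treats the fraction as a length-weighted mean of the third field. That reading is inferred from the arithmetic on the slide, and the function names are hypothetical.

```python
from fractions import Fraction


def average_mtime(entries):
    """Length-weighted mean of the third field over (offset, length, mtime) entries."""
    total_len = sum(length for _, length, _ in entries)
    weighted = sum(mtime * length for _, length, mtime in entries)
    return Fraction(weighted, total_len)


def update_average(old_avg, old_len, new_len, new_mtime):
    """Fold one new entry into an existing average without rescanning the log."""
    return (Fraction(old_avg) * old_len + new_mtime * new_len) / (old_len + new_len)


first_three = [(0, 2, 2), (2, 2, 4), (4, 2, 6)]
avg = average_mtime(first_three)            # (2*2 + 4*2 + 6*2) / (2 + 2 + 2)
print(avg)                                  # 4
print(update_average(avg, 6, 2, 8))         # (4*6 + 8*2) / (6 + 2) = 5
```

The second fraction reuses the previous average (4) and the previous total length (6), so the update needs only the new entry, not the whole log.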
32
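[Figure: per-CPU lists (CPU 0, CPU 1) running from Cold to Hot, labelled amtime. For the example file (pages 1 to 7, write entries 0,2,2  2,2,4  4,2,6  6,2,8), the four write entries are labelled Cold, Cold, Hot, Hot.]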
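The Cold, Cold, Hot, Hot labelling on this slide is consistent with comparing each write entry against the file's length-weighted average time. The sketch below assumes exactly that rule (an entry is cold when its third field is below the average). The slide does not spell the rule out, and the function name is hypothetical.

```python
def split_cold_hot(entries):
    """Split (offset, length, mtime) entries around their length-weighted mean mtime."""
    total_len = sum(length for _, length, _ in entries)
    amtime = sum(mtime * length for _, length, mtime in entries) / total_len
    # Assumed rule: entries older than the average are cold, the rest are hot.
    cold = [e for e in entries if e[2] < amtime]
    hot = [e for e in entries if e[2] >= amtime]
    return amtime, cold, hot


entries = [(0, 2, 2), (2, 2, 4), (4, 2, 6), (6, 2, 8)]
amtime, cold, hot = split_cold_hot(entries)
print(amtime)   # 5.0
print(cold)     # [(0, 2, 2), (2, 2, 4)]
print(hot)      # [(4, 2, 6), (6, 2, 8)]
```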
33
[Figure: migration of one write entry, built up over Step 1 to Step 5.
Inode log (Head to Tail): Chmod, Write 0-8K, Write 0-4K, Write 8-16K, plus a newly appended Write 8-16K entry.
Pages: File Page 1, File Page 2, File Page 3, File Page 1, File Page 4.
Migrated copies: File Page 3’, File Page 4’.
Regions: Inode, Inode log, NVMM, Disk.
Legend: page state (Stale, Live); entry type (Inode update, Old write entry, New write entry).]
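The slide labels the five steps but the extracted text does not say what each one does. The sketch below is one plausible reading, not the authors' implementation. It assumes the steps are: reserve space on disk, copy the pages, append a new write entry pointing at the copies, advance the log tail, and free the NVMM pages. All class, field, and function names are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class WriteEntry:
    start_kb: int
    end_kb: int
    tier: str            # "nvmm" or "disk"
    page_ids: list


@dataclass
class Inode:
    log: list = field(default_factory=list)
    tail: int = 0        # entries before this index count as committed


def migrate_entry(inode, index, nvmm, disk):
    """Move the pages of inode.log[index] from the nvmm dict to the disk dict."""
    old = inode.log[index]
    # Step 1 (assumed): reserve consecutive page slots on disk.
    first = len(disk)
    new_ids = list(range(first, first + len(old.page_ids)))
    # Step 2 (assumed): copy page contents from NVMM to disk.
    for new_id, old_id in zip(new_ids, old.page_ids):
        disk[new_id] = nvmm[old_id]
    # Step 3 (assumed): append a new write entry that points at the disk copies.
    inode.log.append(WriteEntry(old.start_kb, old.end_kb, "disk", new_ids))
    # Step 4 (assumed): publish the new entry by advancing the log tail.
    inode.tail = len(inode.log)
    # Step 5 (assumed): release the now-stale NVMM pages.
    for old_id in old.page_ids:
        del nvmm[old_id]


nvmm = {3: b"page 3", 4: b"page 4"}
disk = {}
inode = Inode(log=[WriteEntry(8, 16, "nvmm", [3, 4])], tail=1)
migrate_entry(inode, 0, nvmm, disk)
print(disk, nvmm, inode.tail)
```

In this ordering the old pages are freed only after the tail moves, so a reader following the committed part of the log always finds a valid copy.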
39
[Figure: migration of several write entries at once, built up over Step 1 to Step 5.
Inode log (Head to Tail): Chmod, Write 0-8K, Write 4-8K, Write 8-16K, Write 12-16K, plus a newly appended Write 0-16K entry.
Pages: File Page 1, File Page 2, File Page 3, File Page 2, File Page 4, File Page 4.
Migrated copies: File Page 1’, File Page 2’, File Page 3’, File Page 4’.
Regions: Inode, Inode log, NVMM, Disk.
Legend: page state (Stale, Live); entry type (Inode update, Old write entry, New write entry).]
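On this slide four overlapping write entries (0-8K, 4-8K, 8-16K, 12-16K) end up represented by a single new Write 0-16K entry. The sketch below shows how the live pages of such a log can be merged into contiguous extents. It assumes 4 KB pages and that a later entry overrides an earlier one for every page it covers. The names are hypothetical and the Chmod entry is left out because it carries no data.

```python
PAGE_KB = 4  # assumed page size


def live_owner_per_page(log):
    """Map each page index to the position of the latest entry covering it."""
    owner = {}
    for pos, (start_kb, end_kb) in enumerate(log):
        for page in range(start_kb // PAGE_KB, end_kb // PAGE_KB):
            owner[page] = pos          # later entries overwrite earlier ones
    return owner


def coalesce(log):
    """Merge all live pages into contiguous (start_kb, end_kb) extents."""
    extents = []
    for page in sorted(live_owner_per_page(log)):
        start, end = page * PAGE_KB, (page + 1) * PAGE_KB
        if extents and extents[-1][1] == start:
            extents[-1] = (extents[-1][0], end)   # extend the previous extent
        else:
            extents.append((start, end))
    return extents


log = [(0, 8), (4, 8), (8, 16), (12, 16)]
print(live_owner_per_page(log))   # {0: 0, 1: 1, 2: 2, 3: 3}
print(coalesce(log))              # [(0, 16)]
```

Each of the four pages is owned by a different entry, yet together they form one contiguous range, which is why a single 0-16K entry can replace all four.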
45
[Figure: file operations across DRAM, NVMM, and DISK on a file with pages 1 to 23. Operation sequence shown:
write(0,8); write(8,8); fsync(); append(16,1); append(17,1); ... append(23,1); ...... migrate(); ...... read(16,8); mmap(0,8);]
59
[Figure panels: Fileserver, Varmail]
60
[Figure panels: Random insert, Sequential insert]
61
[Figure panels: PERSIST, WAL]