HMVFS: A Hybrid Memory Versioning File System
Shengan Zheng, Linpeng Huang, Hao Liu, Linzhu Wu, Jin Zha
Department of Computer Science and Engineering, Shanghai Jiao Tong University
Outline: Introduction, Design, Implementation
[Figure: HMVFS layout across DRAM and NVM. Labels: Block Information Table (BIT), Node Address Tree Cache (NAT Cache), Segment Information Table (SIT), Segment Information Table Journal, Checkpoint Information Tree (CIT), Node Address Tree (NAT), Main Area (SFST), Auxiliary Information, Node Blocks, Data Blocks, Checkpoint Blocks (CP), Superblock. Access patterns: random writes, sequential writes.]
[Figure: File index structure. The inode block holds metadata, direct pointers or inline data, and single-, double- and triple-indirect pointers, which lead through indirect nodes and direct nodes to data blocks. Updated blocks: data block, direct node, indirect node, inode.]
[Figure: The same file index structure. Updated blocks: direct node, data block.]
Node Address Table
Node-ID   Address
...       ...
n-1       0x38
n         0x42 -> 0x73
n+1       0x24
...       ...
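The indirection provided by the Node Address Table can be sketched as follows. This is a minimal illustration, not HMVFS code: the class `NodeAddressTable`, its methods, and the choice of n = 7 are hypothetical, and reading 0x73 as the new address of node n is an interpretation of the figure. A parent refers to its children by node ID, so rewriting a child at a new address changes one table entry and leaves the parent block untouched.

```python
class NodeAddressTable:
    """Hypothetical sketch: maps a node ID to the node block's current address."""

    def __init__(self):
        self._addr = {}

    def set(self, node_id, address):
        self._addr[node_id] = address

    def lookup(self, node_id):
        return self._addr[node_id]


# Node IDs n-1, n, n+1 from the figure; n = 7 is an arbitrary illustrative value.
n = 7
nat = NodeAddressTable()
nat.set(n - 1, 0x38)
nat.set(n, 0x42)
nat.set(n + 1, 0x24)

# A parent node stores child node IDs, never raw block addresses.
parent_children = [n - 1, n, n + 1]

# Out-of-place update: node n is rewritten at a new address
# (assumed to be the 0x73 shown in the figure).
nat.set(n, 0x73)

# The parent's content is unchanged, yet it now resolves to the new block.
resolved = [nat.lookup(child) for child in parent_children]
```

Without the table, the parent would have to be rewritten to record the child's new address, and so would its own parent, up to the inode.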
Node Address Table with Version
Node-ID   Version1   Version2   Version3
...       ...        ...        ...
n-1       0x14       0x38       0x38
n         0x42       0x42       0x73
n+1       0x24       0x24       0x24
...       ...        ...        ...
Table space-efficiently
blocks.
[Figure: Node Address Tree. NAT root, NAT internal and NAT leaf blocks index the node blocks (inode, indirect node, direct node).]
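One way to picture a tree-shaped, versioned node address mapping is a fixed-height tree in which an update copies only the path from the root to the affected leaf. The sketch below is an assumption-laden illustration rather than the HMVFS implementation: `FANOUT`, `HEIGHT`, `empty`, `lookup` and `update` are hypothetical names, the tree lives in Python lists instead of NVM blocks, and n = 9 is arbitrary. The addresses are the ones from the versioned table (0x14, 0x38, 0x42, 0x73, 0x24).

```python
FANOUT = 4   # entries per NAT block; an illustrative value, not HMVFS's
HEIGHT = 3   # levels in the tree (root, internal, leaf); also illustrative


def empty(level=HEIGHT):
    """Build an empty subtree; leaves hold addresses, other levels hold children."""
    if level == 1:
        return [None] * FANOUT
    return [empty(level - 1) for _ in range(FANOUT)]


def lookup(root, node_id):
    """Walk from one version's root down to the address of node_id."""
    node = root
    for level in range(HEIGHT, 0, -1):
        index = (node_id // FANOUT ** (level - 1)) % FANOUT
        node = node[index]
    return node


def update(root, node_id, address, level=HEIGHT):
    """Return a new root; only the blocks on the root-to-leaf path are copied."""
    copy = list(root)
    index = (node_id // FANOUT ** (level - 1)) % FANOUT
    if level == 1:
        copy[index] = address
    else:
        copy[index] = update(root[index], node_id, address, level - 1)
    return copy


n = 9  # arbitrary node ID below FANOUT ** HEIGHT
v1 = update(update(update(empty(), n - 1, 0x14), n, 0x42), n + 1, 0x24)
v2 = update(v1, n - 1, 0x38)   # Version2: node n-1 moved
v3 = update(v2, n, 0x73)       # Version3: node n moved
```

Each version keeps its own root, so `lookup(v1, n)` still returns 0x42 after Version3 exists, while subtrees that were not on the updated path (for example `v1[1]` and `v2[1]`) are the same object in both versions.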
[Figure: Reference counts of tree blocks, written as block,count, for the original and new snapshots.]
P,1 A,1 B,1 C,1 D,1 E,1 F,1
P,1 A,1 B,1 C,2 D,1 E,2 F,1 Q,1 D',1 F',1
P,0 A,1 B,1 C,1 D,0 E,1 F,0 Q,1 D',1 F',1
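The three sets of counts can be reproduced with a small reference-counting sketch. The tree shape used here, P -> (C, D), C -> (A, B), D -> (E, F), is an assumption chosen because it is consistent with the counts in the figure; the slide text does not state it. `Block`, `take_root`, `release` and `counts` are hypothetical names for illustration.

```python
class Block:
    """Hypothetical tree block carrying a reference count."""

    def __init__(self, name, children=()):
        self.name = name
        self.children = list(children)
        self.count = 0
        for child in self.children:
            child.count += 1  # each parent pointer adds one reference


def take_root(block):
    """A snapshot holds one reference to its root block."""
    block.count += 1
    return block


def release(block):
    """Drop one reference; when it reaches zero, release the children too."""
    block.count -= 1
    if block.count == 0:
        for child in block.children:
            release(child)


def counts(*blocks):
    return {b.name: b.count for b in blocks}


# Original snapshot (assumed shape, see the note above).
A, B, E, F = Block("A"), Block("B"), Block("E"), Block("F")
C = Block("C", [A, B])
D = Block("D", [E, F])
P = take_root(Block("P", [C, D]))
original = counts(P, A, B, C, D, E, F)

# New snapshot: F is rewritten as F', so D and the root are copied as D' and Q.
# The unchanged blocks C and E are shared, which raises their counts to 2.
F2 = Block("F'")
D2 = Block("D'", [E, F2])
Q = take_root(Block("Q", [C, D2]))
after_new = counts(P, A, B, C, D, E, F, Q, D2, F2)

# Deleting the original snapshot frees P, D and F; shared blocks drop back to 1.
release(P)
after_delete = counts(P, A, B, C, D, E, F, Q, D2, F2)
```

Under that assumed shape, `original`, `after_new` and `after_delete` match the three rows of counts: creating the new snapshot touches only the counts of the blocks directly below the copied path, and deleting the old snapshot stops descending as soon as it reaches a block that is still shared.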
log-structured writes
[Figure: Checkpoint. CP blocks for the original snapshot and the new snapshot, each leading through the Node Address Tree (NAT root, NAT internal, NAT leaf) to node blocks (inode, indirect node, direct node) and data blocks.]
[Figure: A NAT block and its node blocks (Node Block 1, Node Block 2) across Version 1 to Version 4, placed in Segment A and Segment B.]
written in the main area
Type of the block           Type of the parent   Node ID
Checkpoint                  N/A                  N/A
NAT internal, NAT leaf      NAT internal         Index code in NAT
Inode, Indirect, Direct     NAT leaf             Node ID
Data                        Inode or direct      Node ID of parent node
number
count
Node Address Tree
the validity of the new snapshot
undo or redo depend on the validity
[Figure: Checkpoint. CP blocks for the original snapshot and the new snapshot, with the Node Address Tree, node blocks and data blocks of each.]
[Figure: Superblock and four checkpoints, each with its own NAT root. Reference counts, written as block,count:]
P,0 A,1 B,1 C,1 D,0 E,1 F,0 Q,1 D',1 F',1
P,1 A,1 B,1 C,2 D,1 E,2 F,1 Q,1 D',1 F',1
[Figure: Transaction performance (sec) and snapshotting efficiency (sec^-1) versus percentage of reads (0% to 100%). Transaction performance compares HMVFS, BTRFS, NILFS2, EXT4 and PMFS; snapshotting efficiency compares HMVFS, BTRFS and NILFS2.]
[Figure: Throughput performance and snapshotting efficiency (sec^-1) versus number of files (2k, 4k, 8k, 16k). Throughput performance compares HMVFS, BTRFS, NILFS2, EXT4 and PMFS; snapshotting efficiency compares HMVFS, BTRFS and NILFS2.]
[Figure: Throughput performance and snapshotting efficiency (sec^-1) versus directory depth (0.7, 1.2, 1.4, 2.1). Throughput performance compares HMVFS, BTRFS, NILFS2, EXT4 and PMFS; snapshotting efficiency compares HMVFS, BTRFS and NILFS2.]
in-memory file systems using snapshotting.
updated at byte granularity
handles the write amplification problem and the block sharing problem well
addressability of NVM to automatically take frequent snapshots
performance while providing a strong consistency guarantee and having little impact on foreground operations