SHRD: Improving Spatial Locality in Flash
Storage Accesses by Sequentializing in Host and Randomizing in Device
Hyukjoong Kim¹, Dongkun Shin¹, Yun Ho Jeong² and Kyung Ho Kim²
¹Sungkyunkwan University, ²Samsung Electronics
Presented at FAST’17
Random writes are still slow on SSDs
[Figure: bandwidth (MB/s) of 4KB random writes versus 512KB sequential writes on eMMC 5.0 (Odroid-XU3), UFS (Galaxy S6), SATA SSD (Intel 525), SATA SSD (Samsung 850 Pro), and NVMe SSD (Intel 750)]
[Figure: valid (V) and invalid (I) pages in blocks written by random writes (RW) and in blocks written by sequential writes (SW)]
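[Figure: demand-loaded mapping in the FTL. The NAND flash chips hold map blocks (made of map pages) and data blocks (pages with data and OOB areas). The DRAM map cache loads and unloads map pages and tracks them in an LRU list. LPN labels shown: 1023, 4096, 5119, 7168, 8191. Example request: write LPN 768.]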
Cleaning requires host-to-device data transfer operations
[Figure: the filesystem (FS) copies data between LPN 32 and LPN 128 by issuing read and write commands to storage, leaving invalid pages behind]
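[Figure: copying at the filesystem versus remapping at storage. Copying: the FS moves data with read and write commands. Remapping: the FS sends a remap request for LPNs 32 and 128, and storage updates its logical-to-physical mapping table (example logical entries 32, 33, 128; physical page 72) instead of transferring data.]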
[Figure: layered stack, top to bottom: legacy applications, legacy filesystems, append-logging device driver (D/D), storage]
[Figure: sequentializing in the host. Multiple small random writes to LPNs 32, 765, 128, and 854 in the normal area (FS visible) are logged as a single large sequential write starting at tLPN 1024 in the log area (FS invisible). The host redirection table records 32 -> 1024, 765 -> 1025, 128 -> 1026, 854 -> 1027. The device mapping table maps tLPNs 1024 to 1027 to PPNs 368 to 371 in NAND flash.]
tLPN: temporal LPN
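The host-side step in the figure above can be sketched in a few lines of Python. This is only an illustrative model, not the driver's actual code: the class name `Sequentializer`, its methods, and the read-redirection helper are hypothetical, and the starting tLPN of 1024 is taken from the figure.

```python
from dataclasses import dataclass, field


@dataclass
class Sequentializer:
    """Hypothetical host-side model: logs random writes at consecutive tLPNs."""
    next_tlpn: int = 1024                            # first tLPN of the log area (from the figure)
    redirection: dict = field(default_factory=dict)  # oLPN -> tLPN

    def twrite(self, olpns):
        """Assign consecutive tLPNs to a batch of random oLPNs.

        Returns (start_tlpn, length) of the single sequential write
        that replaces the batch of small random writes.
        """
        start = self.next_tlpn
        for olpn in olpns:
            self.redirection[olpn] = self.next_tlpn
            self.next_tlpn += 1
        return start, len(olpns)

    def read_target(self, lpn):
        """Reads of a redirected LPN go to its tLPN; others are unchanged."""
        return self.redirection.get(lpn, lpn)


seq = Sequentializer()
print(seq.twrite([32, 765, 128, 854]))  # (1024, 4)
print(seq.redirection)                  # {32: 1024, 765: 1025, 128: 1026, 854: 1027}
```

The redirection table is kept as a plain dictionary here for clarity; a real driver would also need to bound its size and reclaim log space, which this sketch omits.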
[Figure, after logging: the host redirection table holds LPN -> tLPN entries (32 -> 1024, 765 -> 1025, 128 -> 1026, 854 -> 1027), and the device mapping table holds tLPN -> PPN entries (1024 -> 368, 1025 -> 369, 1026 -> 370, 1027 -> 371)]
[Figure: randomizing in the device. After remap, the device mapping table entries for tLPNs 1024 to 1027 are replaced by entries for the original LPNs with the same PPNs: 32 -> 368, 765 -> 369, 128 -> 370, 854 -> 371.]
tLPN: temporal LPN
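The mapping change in the figure above can be modeled as a dictionary update. The sketch below is a simplified assumption of what the device does on remap: the function name `remap`, the dictionary representation of the mapping table, and the returned list of invalidated PPNs are all hypothetical and not taken from the slides.

```python
def remap(mapping_table, pairs):
    """Move each PPN from its tLPN entry to the original LPN entry.

    mapping_table: dict of LPN -> PPN (hypothetical device mapping table).
    pairs: iterable of (tLPN, oLPN), following the remap (tLPN, oLPN) order.
    Returns the PPNs that held older data for the oLPNs and are now invalid.
    No page data is copied; only mapping entries change.
    """
    invalidated = []
    for tlpn, olpn in pairs:
        ppn = mapping_table.pop(tlpn)      # release the temporal entry
        old_ppn = mapping_table.get(olpn)  # an older copy, if one exists
        if old_ppn is not None:
            invalidated.append(old_ppn)
        mapping_table[olpn] = ppn          # oLPN now points at the logged page
    return invalidated


table = {1024: 368, 1025: 369, 1026: 370, 1027: 371}
remap(table, [(1024, 32), (1025, 765), (1026, 128), (1027, 854)])
print(table)  # {32: 368, 765: 369, 128: 370, 854: 371}
```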
[Figure: remapping order. Redirection table: 32 -> 1024, 765 -> 1025, 128 -> 1026, 854 -> 1027, 37 -> 1028, 134 -> 1029, 774 -> 1030, 900 -> 1031. Remapping in time-ordered (tLPN) sequence takes 8 map loads. Remapping in LPN order (32, 37, 128, 134, 765, 774, 854, 900) takes 5 map loads.]
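The effect of the remapping sequence on map loads can be reproduced with a small simulation. This is a sketch under stated assumptions: the function `count_map_loads` is hypothetical, and the map page size of 100 entries and the one-page LRU map cache are illustrative values chosen so that the counts match the figure. They are not parameters given in the slides.

```python
from collections import OrderedDict


def count_map_loads(lpns, entries_per_map_page=100, cache_pages=1):
    """Count map page loads for a sequence of LPN mapping updates.

    Assumes an LRU map cache holding `cache_pages` map pages, where each
    map page covers `entries_per_map_page` consecutive LPNs.
    """
    cache = OrderedDict()  # map page number -> True, ordered by recency
    loads = 0
    for lpn in lpns:
        page = lpn // entries_per_map_page
        if page in cache:
            cache.move_to_end(page)        # hit: refresh recency
        else:
            loads += 1                     # miss: load the map page from flash
            cache[page] = True
            if len(cache) > cache_pages:
                cache.popitem(last=False)  # evict the least recently used page
    return loads


time_ordered = [32, 765, 128, 854, 37, 134, 774, 900]
print(count_map_loads(time_ordered))          # 8
print(count_map_loads(sorted(time_ordered)))  # 5
```

Sorting by LPN groups updates that fall in the same map page, so each page is loaded once instead of being evicted and reloaded.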
[Figure: map miss ratio and utilization of parallel units versus reordering window size (NONE, 2MB, 4MB, 8MB, 16MB, 32MB, 64MB) for the stg_0 and proj_0 traces. Annotations: reduce map miss, improve parallelism.]
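[Figure: SHRD architecture. The device driver sits between the file system/applications and the SSD (SHRD-FTL) and contains a sequentializer, a redirection table, and a randomizer. The SSD contains RWLB blocks, data blocks, map blocks, and a map cache. Labeled operations: write(), read(), twrite(), remap(), map insert, map reclaim.]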
Device Driver (between the file system/applications and the SSD)
▪ Sequentializer: gathers random write requests and logs them sequentially into a temporal location
▪ Redirection Table: maintains the mapping between temporal addresses and original addresses
▪ Randomizer: sends remap commands to the storage device and reclaims temporal locations
SSD (SHRD-FTL)
[Figure: SSD internals: RWLB blocks, data blocks, map blocks, map cache]
Commands: twrite (oLPN, tLPN), remap (tLPN, oLPN)
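[Figure: twrite operation across the I/O scheduler, SHRD driver, SATA interface, and SSD device. Random writes a, b, c, d to LPNs 32, 765, 128, 854 in the normal area are assigned tLPNs 1024 to 1027 in the log area (redirection table: 32 -> 1024, 128 -> 1026, 765 -> 1025, 854 -> 1027). The driver sends a twrite header carrying the mapping and then the twrite data, each as a SATA write command (OOB) followed by a completion. In NAND, data a at tLPN 1024 is stored with LPN 32 in the spare area. rand_ptr and seq_ptr mark positions in the log area.]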
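[Figure: remap operation across the I/O scheduler, SHRD driver, SATA interface, and SSD device. The valid log area holds a, b, c, d with redirection table entries 32 -> 1024, 128 -> 1026, 765 -> 1025, 854 -> 1027. The driver sends a remap as a SATA write command (OOB), and the SSD changes its mapping table. rand_ptr and seq_ptr mark positions in the log area.]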
[Figure: an RWLB block after twrite, before remap, and after remap. Each page has a data area and an OOB area; example values shown: 1, 85, 1023, 72.]
[Figure: fio random write test (32GB space, 4KB write). Two plots compare DFTL and SHRD across map cache sizes (128KB, 256KB, 512KB, 1MB, 2MB, 4MB, 8MB, fully loaded): bandwidth (MB/s) and map loads per page I/O. Annotation: SHRD is 30x faster than DFTL with a tiny map cache.]
[Figure: throughput normalized to DFTL for tpcc, YCSB, postmark, fileserver, and varmail. Workload groups: small random write dominant workloads, sequential write workload, read/flush dominant workload.]
[Figure: bandwidth (MB/s) over time (s) for F2FS (SSR) w/ DFTL and EXT4 w/ DFTL, with regions labeled low utilization and high utilization]
[Figure: bandwidth (MB/s) over time (s) for F2FS (SSR) w/ DFTL, F2FS (SSR) w/ SHRD, EXT4 w/ DFTL, and EXT4 w/ SHRD, with regions labeled low utilization and high utilization. Annotations: ext4 improvement, F2FS improvement.]
[Figure: the same bandwidth-over-time comparison of F2FS (SSR) and EXT4 with DFTL and SHRD, alongside sequential read and random read bandwidth (MB/s) for EXT4 w/ SHRD and F2FS (SSR) w/ SHRD]
[Figure: fio mixed workload (32GB area, 4KB random read/write mixed). Read and write bandwidth (MB/s) for DFTL and SHRD. Annotation: remap period.]
[Figure: access patterns without SHRD and with SHRD. The SHRD panel marks sequentialized writes and remap commands.]