

SLIDE 1

SHRD: Improving Spatial Locality in Flash Storage Accesses by Sequentializing in Host and Randomizing in Device

Hyukjoong Kim, Dongkun Shin (Sungkyunkwan University); Yun Ho Jeong, Kyung Ho Kim (Samsung Electronics)

Presented at FAST’17


SLIDE 2

Random write is still slow on SSDs

[Chart: write bandwidth (MB/s) for random writes (4 KB) vs. sequential writes (512 KB) on eMMC 5.0 (Odroid-XU3), UFS (Galaxy S6), SATA SSD (Intel 525), SATA SSD (Samsung 850 Pro), and NVMe SSD (Intel 750)]

SLIDE 3

Why is RW slower than SW?

  • 1. Request Handling Overhead
  • 2. Garbage Collection Overhead
  • 3. Mapping Table Handling Overhead

SLIDE 4

Why is RW slower than SW?

  • 1. Request Handling Overhead
    ▪ Sequential write → few, large requests
    ▪ Random write → many, small requests
  • Existing mitigations for per-request overhead
    ▪ Packed command (e.g., eMMC)
    ▪ Interrupt coalescing (e.g., NVMe, SATA NCQ)
    ▪ Vectored I/O (e.g., OpenChannel SSD [FAST’17])

SLIDE 5

Why is RW slower than SW?

  • 2. Garbage Collection Overhead
    ▪ Hot/cold separation → stores hot and cold data in different blocks
    ▪ Incremental GC / background GC → can hide GC latency
  • RW generates hot/cold-mixed blocks
    ▪ Dispersed invalid pages → high GC overhead

[Diagram: blocks written by RW have valid and invalid pages interleaved, whereas blocks written by SW have invalid pages clustered together]

SLIDE 6

Why is RW slower than SW?

  • 3. Mapping Table Handling Overhead
  • Page-level mapping FTL shows good performance on RW
    ▪ Requires a large DRAM to maintain the fine-grained mapping table
    ▪ 4 bytes per 4 KB page → 8 GB DRAM for 8 TB storage
  • Demand-loading FTL (DFTL [ASPLOS’08])
    ▪ Uses a small map cache with on-demand map loading
    ▪ Random writes invoke frequent map loading/unloading

SLIDE 7

Demand-loading FTL (DFTL)

  • The map caching scheme can show good performance by exploiting temporal & spatial locality
    ▪ Page-level map load/unload
      ✓ One map page contains multiple contiguous mapping entries
  • Vulnerable to random workloads
    ▪ Low temporal & spatial locality → high map miss rate → high map loading overhead

[Diagram: DFTL keeps a small map cache in DRAM, backed by map pages stored in NAND map blocks; map pages are loaded and unloaded through an LRU list, and each data page's OOB area holds its LPN. Example: a write to LPN 768 loads the map page covering LPNs 0–1023]
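To make the map-cache behavior concrete, below is a minimal user-space sketch in C of demand-loaded, LRU-evicted map pages. It is an illustration only, not the prototype firmware; the flash I/O helpers are stubs and all sizes are example values.

#include <stdint.h>
#include <string.h>

#define ENTRIES_PER_MAP_PAGE 1024   /* 4 KB map page / 4-byte mapping entry */
#define MAP_CACHE_SLOTS      8      /* deliberately tiny cache */

struct map_page {
    uint32_t vpn;                          /* map-page number = LPN / 1024 */
    uint32_t ppn[ENTRIES_PER_MAP_PAGE];    /* cached LPN -> PPN entries */
    int      valid;
    int      dirty;
    uint64_t last_use;                     /* timestamp for LRU eviction */
};

static struct map_page cache[MAP_CACHE_SLOTS];
static uint64_t clock_tick;
static uint64_t map_loads;                 /* number of map pages read from flash */

/* Stubs standing in for flash I/O on map blocks. */
static void flash_read_map_page(uint32_t vpn, uint32_t *buf)
{
    (void)vpn;
    memset(buf, 0, sizeof(uint32_t) * ENTRIES_PER_MAP_PAGE);
}

static void flash_write_map_page(uint32_t vpn, const uint32_t *buf)
{
    (void)vpn;
    (void)buf;                             /* write-back of a dirty map page */
}

/* Return the cached map page covering lpn, loading it from flash on a miss. */
static struct map_page *get_map_page(uint32_t lpn)
{
    uint32_t vpn = lpn / ENTRIES_PER_MAP_PAGE;
    struct map_page *victim = &cache[0];

    for (int i = 0; i < MAP_CACHE_SLOTS; i++) {
        if (cache[i].valid && cache[i].vpn == vpn) {
            cache[i].last_use = ++clock_tick;
            return &cache[i];              /* hit: no flash access */
        }
        if (!cache[i].valid || cache[i].last_use < victim->last_use)
            victim = &cache[i];            /* remember the LRU (or free) slot */
    }

    if (victim->valid && victim->dirty)    /* unload: write back a dirty map page */
        flash_write_map_page(victim->vpn, victim->ppn);

    flash_read_map_page(vpn, victim->ppn); /* load: one flash read per miss */
    victim->vpn = vpn;
    victim->valid = 1;
    victim->dirty = 0;
    victim->last_use = ++clock_tick;
    map_loads++;
    return victim;
}

/* Translate an LPN to a PPN through the map cache. */
uint32_t dftl_lookup(uint32_t lpn)
{
    return get_map_page(lpn)->ppn[lpn % ENTRIES_PER_MAP_PAGE];
}

With sequential LPNs, successive lookups hit the same cached map page; with random LPNs, most lookups touch a different map page, so map_loads approaches one flash read per 4 KB I/O, which is the overhead the following slides target.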

SLIDE 8

Previous Solution: LFS

  • Generates only sequential writes
    ▪ Out-of-place, append-only write scheme
  • Problems
    ▪ Reclaiming log space (cleaning overhead)
      • The filesystem must copy valid pages → host-to-device data transfers
    ▪ Large metadata, wandering-tree problem
    ▪ Fragmented read operations

[Diagram: append logging; to clean the log, valid pages (e.g., LPN 32 and 128) must be read back and rewritten by the filesystem, i.e., cleaning requires host-to-device data transfers]

SLIDE 9

Can we remove copy overhead?

  • The SSD maintains a page-level mapping table
  • Address remapping
    ▪ The logical address of written data can be changed by modifying the mapping table
    ▪ AnViL [FAST’15], SHARE [SIGMOD’16]
  • Log space can be reclaimed with address remapping

[Diagram: copying at the filesystem requires reading LPN 32 and 128 and writing them back, whereas remapping at the storage device only sends a remap request that updates the logical-to-physical mapping table]
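As a minimal sketch of the remapping idea (assuming a flat in-DRAM page-level mapping table indexed by LPN; the actual interfaces of AnViL and SHARE differ), changing a logical address is just a table update:

#include <stdint.h>

#define NUM_LPNS    (1u << 20)     /* example: 4 GB of 4 KB pages */
#define INVALID_PPN UINT32_MAX

static uint32_t mapping[NUM_LPNS]; /* page-level mapping table: mapping[LPN] = PPN */

/* Give the data currently addressed by src_lpn the new logical address
 * dst_lpn.  Only the mapping table changes; no flash page is copied,
 * unlike filesystem-level log cleaning. */
void remap_one(uint32_t src_lpn, uint32_t dst_lpn)
{
    mapping[dst_lpn] = mapping[src_lpn];
    mapping[src_lpn] = INVALID_PPN;
}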

SLIDE 10

Which layer? File System or Block Layer

  • Our solution: append logging at the block layer
    ▪ Append-log writes to a log area temporarily
    ▪ Remap them to their original locations
    ▪ Can utilize legacy filesystems (e.g., EXT4)
  • Simpler metadata management
  • Faster sequential read performance

[Diagram: legacy applications and legacy filesystems run unmodified on top of an append-logging device driver and the storage device]

SLIDE 11

SHRD (Sequentializing in Host, Randomizing in Device)

  • Sequentializing in Host
    ▪ The host OS writes random requests sequentially to a log area
  • Randomizing in Device
    ▪ The SSD modifies its mapping table to change the logical addresses back

[Diagram: the logical address space is split into a reserved log area and a normal area; (1) sequentializing directs writes into the log area, (2) randomizing remaps them to the normal area]

SLIDE 12

SHRD Example: write

[Diagram: multiple small random writes to oLPNs 32, 128, 765, and 854 are logged as one large sequential write to tLPNs 1024–1027 in the FS-invisible log area; the host redirection table records each oLPN → tLPN pair, and the device mapping table maps tLPNs 1024–1027 to PPNs 368–371. LPN: original LPN, tLPN: temporal LPN]
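Below is a minimal host-side sketch (an illustration, not the actual kernel module) of this sequentializing step: each random write is assigned the next tLPN in the log area and its oLPN → tLPN pair is recorded in the redirection table before one large sequential twrite is issued. The names and sizes are assumptions.

#include <stdint.h>

#define LOG_AREA_START 1024u     /* first tLPN of the FS-invisible log area (example value) */
#define LOG_AREA_PAGES 16384u    /* 64 MB log area of 4 KB pages */

struct redir_entry {
    uint32_t olpn;               /* original LPN requested by the filesystem */
    uint32_t tlpn;               /* temporal LPN assigned in the log area */
};

static struct redir_entry redir_table[LOG_AREA_PAGES];
static uint32_t seq_ptr;         /* next free log-area slot */

/* Sequentialize one 4 KB random write: allocate the next tLPN and remember
 * oLPN -> tLPN.  The caller batches the data of consecutive tLPNs into a
 * single large sequential twrite (e.g., oLPN 32 ends up at tLPN 1024). */
static uint32_t sequentialize(uint32_t olpn)
{
    uint32_t slot = seq_ptr++ % LOG_AREA_PAGES;
    uint32_t tlpn = LOG_AREA_START + slot;

    redir_table[slot].olpn = olpn;
    redir_table[slot].tlpn = tlpn;
    return tlpn;
}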

SLIDE 13

SHRD Example: read redirection

[Diagram: a read of LPN 32 is redirected to tLPN 1024 by the host redirection table, since the data is still in the log area; the device mapping table then translates tLPN 1024 to PPN 368. LPN: original LPN, tLPN: temporal LPN]
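Continuing the previous sketch (same hypothetical redir_table and LOG_AREA_PAGES), read redirection in the driver amounts to a table lookup before the read is sent to the device; a linear search is shown only for brevity.

/* If olpn is still logged, return its temporal LPN; otherwise read the
 * original address (entries with tlpn == 0 are unused slots). */
static uint32_t redirect_read(uint32_t olpn)
{
    for (uint32_t i = 0; i < LOG_AREA_PAGES; i++) {
        if (redir_table[i].tlpn != 0 && redir_table[i].olpn == olpn)
            return redir_table[i].tlpn;   /* e.g., a read of LPN 32 goes to tLPN 1024 */
    }
    return olpn;
}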

SLIDE 14

SHRD Example: remap

[Diagram: a remap command for tLPNs 1024–1027 rewrites the device mapping table so that PPNs 368–371 are now indexed by the original LPNs 32, 765, 128, and 854; the host redirection entries and the log-area addresses can then be reclaimed. LPN: original LPN, tLPN: temporal LPN]

SLIDE 15

Can we really reduce map loading overhead?

  • Remap modifies the mapping entries of the sequentialized pages
    ▪ A time-ordered access scheme inherits the original random pattern → low spatial locality
  • oLPN-ordered map access
    ▪ The mapping table is oLPN-indexed
    ▪ Sorting the remap entries by oLPN increases spatial locality (see the sketch after the diagram below)

[Diagram: for the same remapping sequence, processing the entries in time order touches the map pages in random order and needs 8 map loads, whereas processing them in oLPN order needs only 5 map loads]
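A small sketch of that ordering step: before building a remap command, the driver can sort the batched redirection entries by oLPN (reusing the hypothetical redir_entry type from the earlier sketch), so consecutive entries tend to fall into the same map page. A kernel driver would use the kernel's own sort helper; qsort is shown for simplicity.

#include <stdlib.h>

/* Order redirection entries by original LPN. */
static int cmp_olpn(const void *a, const void *b)
{
    const struct redir_entry *x = a, *y = b;

    if (x->olpn < y->olpn)
        return -1;
    return x->olpn > y->olpn;
}

/* Sort a batch of m entries by oLPN before issuing remap(oLPN[m], tLPN[m], m). */
static void sort_remap_batch(struct redir_entry *batch, size_t m)
{
    qsort(batch, m, sizeof(batch[0]), cmp_olpn);
}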

SLIDE 16

The effect of request reordering

[Charts: map miss ratio and utilization of the parallel units vs. reordering window size (none to 64 MB) for the stg_0 and proj_0 traces; a larger reordering window reduces map misses and improves parallelism]

SLIDE 17

SHRD (Sequentializing in Host, Randomizing in Device)

[Architecture diagram: in the device driver, the Sequentializer, Redirection Table, and Randomizer sit between the file system/applications and the SSD (SHRD-FTL with RWLB blocks, data blocks, map blocks, and a map cache); write() and read() from above become twrite(), remap(), and read() to the device, with map insert and map reclaim operations on the redirection table]

SLIDE 18

SHRD (Sequentializing in Host, Randomizing in Device)


  • Sequentializer
    ▪ Gathers random write requests and logs them sequentially to a temporal location
  • Redirection table
    ▪ Maintains the mapping between temporal addresses and original addresses
  • Randomizer
    ▪ Sends remap commands to the storage device and reclaims the temporal locations

SLIDE 19

SHRD (Sequentializing in Host, Randomizing in Device)

  • SHRD-FTL
    ▪ Receives twrite and remap commands from the host OS
  • twrite
    ▪ A write command carrying two addresses: the temporal and the original address
    ▪ The data must be stored in separate physical blocks, called RWLB blocks
  • remap
    ▪ Restores data written at a temporal location to its original address
    ▪ Changes the mapping table entry from the temporal address to the original address
    ▪ The corresponding RWLB blocks are then converted into data blocks

[Diagram: the SSD (SHRD-FTL) contains RWLB blocks, data blocks, map blocks, and a map cache; the host issues twrite(oLPN, tLPN) and remap(tLPN, oLPN)]

SLIDE 20

Special commands: twrite & remap

  • twrite(oLPN[n], tLPN_start, n, data)
    ▪ Write command that carries both addresses, (tLPN, oLPN)
    ▪ oLPN is stored in the OOB area of the physical page
      • Used for power-off recovery and GC
    ▪ Packed command carrying multiple RW requests
  • remap(oLPN[m], tLPN[m], m)
    ▪ m = number of remapping entries per remap command
      • oLPN-sorted entries → improved spatial locality
    ▪ Changes the mapping table from tLPN to oLPN
      • tLPN : PPN → oLPN : PPN

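To make the device side of remap concrete, here is a minimal FTL-style sketch assuming a flat in-device page-level mapping table; it illustrates the mechanism only and is not the modified Samsung 843 firmware.

#include <stdint.h>
#include <stddef.h>

#define NUM_LPNS    (1u << 22)   /* example: 16 GB of 4 KB pages */
#define INVALID_PPN UINT32_MAX

static uint32_t l2p[NUM_LPNS];   /* page-level mapping table: l2p[LPN] = PPN */

/* remap(oLPN[m], tLPN[m], m): move each PPN from its temporal LPN to its
 * original LPN.  The host sends the entries sorted by oLPN, so consecutive
 * updates touch neighbouring map pages and need few map-cache loads. */
void ftl_remap(const uint32_t *olpn, const uint32_t *tlpn, size_t m)
{
    for (size_t i = 0; i < m; i++) {
        l2p[olpn[i]] = l2p[tlpn[i]];  /* tLPN : PPN  ->  oLPN : PPN */
        l2p[tlpn[i]] = INVALID_PPN;   /* the log-area address becomes free */
    }
    /* Once every page of an RWLB block has been remapped, the block can be
     * treated as a normal data block. */
}

With the mapping from the earlier example, this would move PPN 368 from l2p[1024] to l2p[32], and likewise for tLPNs 1025–1027.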

SLIDE 21

Command Sequence: Sequentializing in Host

[Sequence diagram: the SHRD driver first sends a twrite header carrying the oLPN → tLPN mapping (32 → 1024, 128 → 1026, 765 → 1025, 854 → 1027) and then the twrite data for tLPNs 1024–1027, both as SATA write commands; the SSD stores each oLPN in the spare (OOB) area of the NAND page and returns completions. rand_ptr and seq_ptr mark the randomized and sequentialized ends of the log area]

SLIDE 22

Command Sequence: Randomizing in Device

[Sequence diagram: the SHRD driver issues a remap command (delivered as a SATA write command) carrying the redirection entries; the SSD changes its mapping table accordingly, the remapped part of the log area becomes reclaimable, and rand_ptr advances past it]

SLIDE 23

GC & Power-Off Recovery (POR)

  • Reverse map in the out-of-band (OOB) area
    ▪ The SSD stores the corresponding LPN in the OOB area of each physical page
    ▪ The reverse map is used for GC & recovery
      • GC: update the mapping table entries of the valid pages in the victim block
      • Recovery: rebuild the mapping table entries of the active blocks

[Diagram: physical page layout — the data area plus an OOB area holding the LPN and ECC]

SLIDE 24

GC & Power-Off Recovery (POR)

  • SHRD stores the oLPN in the OOB area of RWLB pages
    ▪ RWLB blocks must be excluded from GC victim selection
      • until all data stored in those blocks has been remapped
    ▪ Non-remapped data is auto-remapped at POR
      • by scanning the OOB areas of the RWLB blocks (see the sketch below)

[Diagram: an RWLB block whose pages carry oLPNs (e.g., 1, 85, 1023, 72) in their OOB areas; pages written by twrite before the remap are auto-remapped by the POR OOB scan, while pages already remapped have logically moved into data blocks and can become GC victims]
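A minimal sketch of that POR pass, assuming hypothetical helpers (declarations only, not real firmware APIs) for iterating non-remapped RWLB pages and reading the oLPN from a page's OOB area:

#include <stdint.h>
#include <stdbool.h>

struct rwlb_page { uint32_t ppn; };

/* Assumed helpers. */
bool next_unremapped_rwlb_page(struct rwlb_page *pg); /* pages written by twrite but not yet remapped */
uint32_t read_oob_lpn(uint32_t ppn);                  /* oLPN that twrite stored in the page's OOB area */
void map_update(uint32_t lpn, uint32_t ppn);          /* install LPN -> PPN in the mapping table */

/* Power-off recovery: auto-remap any logged data whose remap command was
 * lost, by scanning the OOB areas of the RWLB blocks. */
void por_auto_remap(void)
{
    struct rwlb_page pg;

    while (next_unremapped_rwlb_page(&pg))
        map_update(read_oob_lpn(pg.ppn), pg.ppn);
}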

SLIDE 25

Implementation

  • The SHRD device driver is implemented in Linux kernel 3.17.4
    ▪ An additional kernel module at the SCSI device-driver layer
    ▪ Host redirection table: about 1 MB for a 64 MB log area
  • Prototype SSD device
    ▪ Modified firmware of a commercial SATA3 SSD (Samsung 843)
    ▪ Both DFTL and SHRD-FTL are implemented
    ▪ Map cache size is configurable

SLIDE 26

RW Performance According to Map Cache Size

  • Better performance than DFTL
    ▪ By reducing the map loading/unloading overhead
  • SHRD shows steady performance regardless of the cache size

[Charts: random write bandwidth (MB/s) and map loads per page I/O vs. map cache size (128 KB up to a fully loaded mapping table) for DFTL and SHRD; SHRD is about 30x faster than DFTL at the smallest cache sizes. Workload: fio random write over a 32 GB space with 4 KB writes]

SLIDE 27

Performance on Real Benchmarks

  • Better performance on all workloads
  • Small gains on sequential-write or read-dominant workloads
    ▪ Still better than DFTL

[Chart: throughput normalized to DFTL for tpcc, YCSB, postmark, fileserver, and varmail, grouped into small-random-write-dominant, sequential-write, and read/flush-dominant workloads; the CMT caches about 5% of the entire workload space]

SLIDE 28

SHRD Gains on EXT4 vs. F2FS

  • EXT4 shows poor performance on random writes
  • F2FS performance decreases due to SSR at high utilization

[Chart: bandwidth (MB/s) over time (s) for F2FS (SSR) w/ DFTL and EXT4 w/ DFTL; the run moves from low utilization to high utilization]

SLIDE 29

SHRD Gains on EXT4 vs. F2FS

  • SHRD improves both EXT4 and F2FS
    ▪ SHRD improves the bandwidth of aged F2FS
    ▪ With SHRD, EXT4 reaches performance similar to F2FS

[Chart: bandwidth (MB/s) over time (s) for F2FS (SSR) and EXT4, each with DFTL and with SHRD, from low to high utilization; both the EXT4 and the F2FS improvements are visible]

SLIDE 30

SHRD Gains on EXT4 vs. F2FS

  • Sequential read performance of EXT4 is much better
    ▪ The out-of-place scheme of F2FS scatters the data blocks of a file

[Chart: sequential and random read bandwidth (MB/s) for EXT4 w/ SHRD and F2FS (SSR) w/ SHRD]

SLIDE 31

Latency Comparison

  • Remap commands can delay read operations
    ▪ Several remapping entries are batched into a single remap command
    ▪ The number of remapping entries per command can control the maximum latency of the following I/O operations

[Chart: read and write bandwidth (MB/s) for DFTL and SHRD under an fio mixed workload (32 GB area, 4 KB random reads and writes), with the remap period marked]

SLIDE 32

Visualizing the Address Access Pattern: postmark

[Plots: logical address accesses over time for postmark without SHRD and with SHRD; with SHRD, the sequentialized writes and the remap command accesses are clearly visible]

SLIDE 33

Conclusion

  • SHRD is an address-reshaping technique
    ▪ Transforms RW into SW at the block device driver
    ▪ Restores the original addresses without copy operations
    ▪ Solves the POR / GC issues of address remapping
  • SHRD achieves up to 30x better performance with a small map cache
    ▪ Drastically reduces the DRAM requirement

SLIDE 34

Thank you.
