FlashBlox: Achieving Both Performance Isolation and Uniform Lifetime for Virtualized SSDs
Jian Huang † Anirudh Badam Laura Caulfield Suman Nath Sudipta Sengupta Bikash Sharma Moinuddin K. Qureshi †
†
2
Performance Improvement: 100x lower latency, 5,000x higher throughput
Increased Parallelism: dozens of parallel chips
Became Commodity: less than $0.3/GB
3
[Figure: SSD architecture with a Flash Translation Layer above multiple channels, each connecting multiple chips]
4
[Figure: Write and Read requests pass through the Flash Translation Layer to multiple channels, each connecting multiple chips]
5
[Figure: Flash Translation Layer above multiple channels, each connecting multiple chips]
6
[Figure: SSD hierarchy in which each channel connects multiple chips and each chip contains multiple planes]
Channel-Level Parallelism
Chip-Level Parallelism
Plane-Level Parallelism
Plane-level parallelism is constrained because each chip contains only one address buffer.
Different levels of parallelism provide different isolation guarantees.
7
[Figure: virtual SSDs carved out of the channel/chip/plane hierarchy at different levels]
Virtual SSD (Channel Level): High isolation
Virtual SSD (Chip Level): Medium isolation
Virtual SSD (Plane Level): Low isolation, software-based
8
[Figure: vSSD (Channel), vSSD (Chip), and vSSD (Software) mapped onto the channel/chip/plane hierarchy]
Cloud services shown: Azure DocumentDB, Azure SQL Database, Amazon DynamoDB
Attributes listed: Throughput, Single Partition Size, Price
9
[Figure: Flash Translation Layer above multiple channels of chips, with a write-intensive workload highlighted]
[Figure: chart of the average rate at which flash blocks are erased; axis: Average #Blocks Erased/sec, 0.5 to 4.5]
Flash blocks wear out at different rates under different workloads.
10
[Figure: three apps mapped onto flash chips, with the labels SSD Lifetime and Performance Isolation]
11
[Figure: used erase cycles for Channels 1 to 4]
Adjusting the wear imbalance at a coarser time granularity can achieve near-ideal SSD lifetime.
Highlighted in the figure: the channel that has incurred the maximum wearout, and the channel that has the minimum rate of wearout.
Channel migration takes 15 minutes and happens once per 19 days. Overall performance drops for only 0.04% of the time.
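The slide singles out two channels: the one with the maximum wearout and the one with the minimum rate of wearout. A minimal sketch of picking such a pair, assuming these two channels are the ones to be exchanged during a migration. The names `ChannelStats` and `pick_swap_pair` and the sample numbers are hypothetical and do not come from the talk.

```python
from dataclasses import dataclass


@dataclass
class ChannelStats:
    """Per-channel wear counters (hypothetical structure for illustration)."""
    channel_id: int
    used_erase_cycles: float  # total wear accumulated so far
    erase_rate: float         # recent erases per unit time


def pick_swap_pair(channels):
    """Return (most worn channel id, id of the channel wearing out slowest)."""
    most_worn = max(channels, key=lambda c: c.used_erase_cycles)
    # Exclude the most worn channel so a channel is never paired with itself.
    others = [c for c in channels if c.channel_id != most_worn.channel_id]
    slowest = min(others, key=lambda c: c.erase_rate)
    return most_worn.channel_id, slowest.channel_id


# Made-up sample data for four channels.
sample = [
    ChannelStats(1, used_erase_cycles=900, erase_rate=4.0),
    ChannelStats(2, used_erase_cycles=300, erase_rate=0.5),
    ChannelStats(3, used_erase_cycles=500, erase_rate=1.5),
    ChannelStats(4, used_erase_cycles=400, erase_rate=2.0),
]
pair = pick_swap_pair(sample)
print(pair)
```

Moving the hottest workload onto the channel that is currently wearing out slowest is one plausible way to let the most worn channel cool down while the others catch up.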
12
[Figure: used erase cycles for Channels 1 to 4 as one app is moved from channel to channel, adding M erase cycles to each channel it visits]
Imbalance = MaxWear / AvgWear
First round of cycling: 4, 2, 4/3, 1
Second round of cycling: 8/5, 4/3, 8/7, 1
How many times should we swap within the SSD lifetime?
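The imbalance values on this slide (4, 2, 4/3, 1, then 8/5, 4/3, 8/7, 1) can be reproduced with a small simulation. The sketch below assumes a single write-intensive app that adds M erase cycles to whichever channel it occupies and moves to the next channel after each period. The function name `imbalance_trace` is hypothetical.

```python
from fractions import Fraction


def imbalance_trace(n_channels, rounds, m=1):
    """Track MaxWear / AvgWear as one app cycles through all channels."""
    wear = [Fraction(0)] * n_channels
    trace = []
    for _ in range(rounds):
        for ch in range(n_channels):
            wear[ch] += m                    # the app spends one period on this channel
            avg = sum(wear) / n_channels     # average wear across all channels
            trace.append(max(wear) / avg)    # Imbalance = MaxWear / AvgWear
    return trace


trace = imbalance_trace(n_channels=4, rounds=2)
print([str(v) for v in trace])
# ['4', '2', '4/3', '1', '8/5', '4/3', '8/7', '1']
```

The imbalance returns to 1 at the end of every round, and its peak at the start of each round shrinks as rounds accumulate, which is what motivates bounding the number of rounds needed.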
13
Assume there are N channels and the wear imbalance target is 1 + x. After K rounds of cycling:
Wear Imbalance = (MK + M) / (MK + M/N) = (K + 1) / (K + 1/N) ≤ 1 + x
Here MK + M is the maximum wearout and MK + M/N is the average wearout. Solving for K:
K ≥ (N - 1 - x) / (Nx)
If N = 16 and x = 0.1, then K ≥ 9.3, which means that after about NK = 149 swaps we can guarantee that the wear imbalance is bounded by 1.1.
For an SSD with a 5-year lifetime, swapping once per 12 days guarantees that the channels are well balanced in the worst case.
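The bound can be checked numerically. The sketch below evaluates K ≥ (N - 1 - x) / (Nx) with exact fractions for N = 16 and x = 0.1, then converts the resulting number of swaps into a swap interval over a 5-year lifetime. Evaluated exactly, the bound is K ≥ 9.3125, which corresponds to about 149 swaps. The helper names are hypothetical, and a 365-day year is an assumption.

```python
import math
from fractions import Fraction


def min_rounds(n_channels, x):
    """Smallest K satisfying (K + 1) / (K + 1/N) <= 1 + x."""
    n = Fraction(n_channels)
    x = Fraction(x)
    return (n - 1 - x) / (n * x)


def imbalance_after(k, n_channels):
    """Worst-case imbalance at the start of the round following K full rounds."""
    k = Fraction(k)
    return (k + 1) / (k + Fraction(1, n_channels))


k = min_rounds(16, Fraction(1, 10))   # Fraction(149, 16), i.e. 9.3125
swaps = math.ceil(16 * k)             # N * K swaps over the lifetime
interval_days = 5 * 365 / swaps       # days between swaps for a 5-year lifetime

print(float(k), swaps, round(interval_days, 2))
```

With these inputs the interval comes out slightly above 12 days, consistent with the once-per-12-days figure on the slide.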
14
[Figure: used erase cycles for Channels 1 to 4 with several apps wearing out channels at different rates; labels: M, M/2, M/3]
Using erase rate as the trigger condition for swapping
15
[Figure: used erase cycles for Channels 1 to 4, with the chips of one channel highlighted]
Chips will be swapped along with the channel migration
Intra-chip wear leveling mechanisms
16
[Figure: FlashBlox architecture]
Apps, each on its own Virtual SSD
Flash Resource Manager: Isolation, Bandwidth & Capacity Requirement (Virtual SSD to Parallel Chips Mappings)
Pay-As-You-Go Model in Cloud
Channel-Level Wear Leveling: Inter Channel Swapping
Chip-Level Wear Leveling: Intra Channel Swapping
Other FTL Algorithms
Channels
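The "Virtual SSD to Parallel Chips Mappings" idea can be illustrated with a toy allocator. This is a minimal sketch under stated assumptions, not the FlashBlox implementation. The class and method names are hypothetical, the geometry of 16 channels with 4 chips each is an assumption, and bandwidth and capacity requirements as well as software-isolated vSSDs are left out.

```python
from enum import Enum


class Isolation(Enum):
    CHANNEL = "channel"  # the vSSD owns whole channels
    CHIP = "chip"        # the vSSD owns individual chips on shared channels


class FlashResourceManager:
    """Toy allocator mapping vSSD names to (channel, chip) pairs."""

    def __init__(self, n_channels=16, chips_per_channel=4):
        self.n_channels = n_channels
        self.chips_per_channel = chips_per_channel
        self.free = {(ch, chip) for ch in range(n_channels)
                     for chip in range(chips_per_channel)}
        self.mappings = {}

    def free_channels(self):
        """Channels on which every chip is still unallocated."""
        return [ch for ch in range(self.n_channels)
                if all((ch, chip) in self.free
                       for chip in range(self.chips_per_channel))]

    def allocate(self, vssd, level, count):
        """Reserve `count` channels or chips for the vSSD named `vssd`."""
        if level is Isolation.CHANNEL:
            channels = self.free_channels()[:count]
            if len(channels) < count:
                raise RuntimeError("not enough free channels")
            chips = [(ch, chip) for ch in channels
                     for chip in range(self.chips_per_channel)]
        else:
            # Take chips from the highest-numbered channels first so that
            # low-numbered channels stay whole for channel-level vSSDs.
            chips = sorted(self.free, reverse=True)[:count]
            if len(chips) < count:
                raise RuntimeError("not enough free chips")
        self.free -= set(chips)
        self.mappings[vssd] = chips
        return chips


mgr = FlashResourceManager()
db = mgr.allocate("db", Isolation.CHANNEL, 2)    # two dedicated channels
cache = mgr.allocate("cache", Isolation.CHIP, 3)  # three chips on a shared channel
print(db, cache)
```

Keeping the mapping in one place is also what would let a wear-leveling layer swap channels underneath a vSSD without the application noticing, since only the table entries change.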
17
SSD configuration: 16 channels, 4 chips, 4 planes, 16 KB page size
Workloads: Yahoo! Cloud Serving Benchmark, Bing Search / Index / PageRank, Transactional Database, Azure Storage
18
[Figure: Yahoo! Cloud Serving Benchmark (YCSB); y-axis: 99th Percentile Latency (microseconds), 100 to 700; x-axis: workload pairs A+A, A+B, A+C, A+D, A+E, A+F; series: App1-Software Isolation, App1-FlashBlox, App2-Software Isolation, App2-FlashBlox]
Workloads:
A: Session store recording recent actions
B: Photo tagging
C: User profile cache
D: User status update
E: Threaded conversations
F: User database
Tail latency reduction: 2.6x, average latency reduction: 1.4x
19
[Figure: Bing Search's Performance During Channel Migration; y-axis: Latency (milliseconds), 0.1 to 0.6; x-axis: Time (Seconds); series: Without Migration, With Migration; annotation: 34%]
Channel migration takes 15 minutes and happens once per 19 days. Overall performance drops for only 0.04% of the time.
20
Swap once per 19 days
21
Jian Huang† jian.huang@gatech.edu
Anirudh Badam Laura Caulfield Suman Nath Sudipta Sengupta Bikash Sharma Moinuddin K. Qureshi †
†