Re-think Data Management Software Design Upon the Arrival of Storage Hardware with Built-in Transparent Compression
07/2020
The Rise of Computational Storage

Computing is shifting from homogeneous to heterogeneous, with domain-specific compute in every part of the system:
- Compute: FPGA/GPU/TPU accelerators, driven by the end of Moore's Law
- Networking: SmartNICs, with links moving from 10 to 100-400Gb/s
- Storage: computational storage, driven by fast and big data growth
The Rise of Computational Storage: from Processor-Driven to Data-Driven

[Diagram: a processor-driven architecture (CPU + DRAM with a single FPGA card and conventional SSDs behind flash controllers) versus a data-driven architecture (CPU + DRAM with multiple Computational Storage Drives, each pairing an FPGA with flash).]

Processor-driven:
- CPU & memory I/O bottlenecks
- Limited FPGAs; specific sockets required
- Massive data movement
- No compute parallelism

Data-driven:
- Balanced compute & storage I/O
- Multiple FPGAs, easily plugged in via storage
- Minimized data movement
- Maximum compute parallelism

CSD: Computational Storage Drive
Computational Storage: A Simple Idea

The end of Moore's Law pushes computing toward heterogeneous architectures. The low-hanging fruits: FPGA/GPU/TPU accelerators, SmartNICs, and computational storage.

Computational Storage Drive (CSD) with data-path transparent compression:
- Hardware: flash controller, NAND flash, and an FPGA performing in-line per-4KB zlib compression and decompression
- Software: a standard driver; compression is transparent to the host
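The effect of per-4KB, in-line compression can be emulated in software; a minimal Python sketch, with the standard zlib module standing in for the drive's hardware engine (the block contents here are made up for illustration):

```python
import zlib

# Illustrative sketch (not the drive's firmware): emulate per-4KB-block
# zlib compression as applied independently to each 4 KB sector on the
# data path.
BLOCK = 4096

def compress_block(data):
    assert len(data) == BLOCK
    return zlib.compress(data)

def decompress_block(blob):
    out = zlib.decompress(blob)
    assert len(out) == BLOCK
    return out

# A block of repetitive (compressible) data shrinks substantially.
compressible = (b"user-record:" + b"\x00" * 20) * 128  # exactly 4096 bytes
packed = compress_block(compressible)
print(len(packed) < BLOCK)  # True: far fewer flash bytes needed
assert decompress_block(packed) == compressible
```

Because each 4KB block is compressed independently, random reads never require decompressing more than one sector.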
ScaleFlux Computational Storage Drive

- Complete, validated solution: pre-programmed FPGA, hardware, software, and firmware
- No FPGA knowledge or coding required
- Field upgradeable
- Standard U.2 & AIC form factors

[Diagram: a conventional SSD uses multiple discrete components for compute (CPU, FPGA) and SSD functions (flash controller + flash); in the CSD, a single FPGA combines the compute and SSD functions.]
CSD 2000: Data Path Compression/Decompression

[Charts: FIO 4K, 8K, and 16K random R/W IOPS (k), CSD 2000 vs. Vendor-A NVMe, across mixes from 100% reads to 100% writes; 2.5:1 compressible data, 8 jobs, queue depth 32, steady state after preconditioning.]

- 4K random: 170% of Vendor-A IOPS at 70/30 R/W, 230% at 100% writes
- 8K random: 200% at 70/30 R/W, 220% at 100% writes
- 16K random: 220% at 70/30 R/W, 220% at 100% writes
- The R/W performance advantage increases with the write fraction of the mix and with larger block sizes
Open a Door for System-level Innovations

Data-path transparent compression decouples logical storage space utilization efficiency from physical storage space utilization efficiency: the OS and applications can purposely waste logical storage space to gain performance benefits.

- It is unnecessary to completely fill each 4KB sector with user data; zero padding compresses away.
- The drive exposes an expanded LBA space (e.g., 32TB) over the NAND flash (e.g., 4TB), so it is unnecessary to use all the LBAs.
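A quick sketch of why wasting logical space is cheap under transparent compression, again using Python's zlib as a stand-in for the drive's engine (the 1KB payload size is an arbitrary assumption):

```python
import random
import zlib

SECTOR = 4096
rng = random.Random(42)  # deterministic, roughly incompressible payload

# A sector holding only 1 KB of user data, zero-padded to a full 4 KB.
payload = bytes(rng.randrange(256) for _ in range(1024))
sector = payload.ljust(SECTOR, b"\x00")

c_padded = len(zlib.compress(sector))
# The padded sector costs roughly the payload size on flash, not 4 KB:
# the zero padding compresses to a handful of bytes.
print(c_padded < SECTOR // 2)  # True
```

The same reasoning applies to unused LBAs: logical capacity that is never written consumes essentially no physical flash.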
PostgreSQL

PostgreSQL's fillfactor (FF) reserves free space in each 8KB heap page for future updates: a lower FF improves update performance but inflates the table's storage footprint. With data-path compression, the zero-filled reserved space costs almost no physical flash, so the performance benefit comes nearly for free.

[Chart: normalized performance and physical storage usage, commodity NVMe vs. SFX NVMe, at FF=100 and FF=50; shown values include 600GB and 1,200GB logical footprints, roughly 300GB of physical usage, and normalized performance of 100% vs. 150%.]
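The fillfactor trade-off reduces to simple arithmetic; a hypothetical sketch (the 600GB figure is illustrative, not a measurement from this benchmark):

```python
# Hypothetical arithmetic, not measured data: with fillfactor FF, each
# 8 KB heap page holds FF% user data, and the remaining (100-FF)% is
# zero-filled free space reserved for future (HOT) updates.
def logical_size_gb(data_gb, fillfactor):
    """Logical table size when data_gb of tuples are laid out at the given fillfactor."""
    return data_gb * 100.0 / fillfactor

data_gb = 600.0  # assumed tuple data volume
print(logical_size_gb(data_gb, 100))  # 600.0  GB logical at FF=100
print(logical_size_gb(data_gb, 50))   # 1200.0 GB logical at FF=50

# Under transparent compression, the zero-filled half of every FF=50
# page compresses away, so physical flash usage stays near the FF=100
# level while update performance improves.
```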
PostgreSQL (Sysbench-TPCC)

[Charts: normalized TPS (60%-150%), Vendor-A vs. CSD 2000, at FF=100 and FF=75, for two dataset sizes (740GB and 1.4TB at FF=100).]

740GB dataset:

Fillfactor  Drive     Logical size (GB)  Physical size (GB)  Comp. ratio
100         Vendor-A  740                740                 1.00
100         CSD 2000  740                178                 4.12
75          Vendor-A  905                905                 1.00
75          CSD 2000  905                189                 4.75

1.4TB dataset:

Fillfactor  Drive     Logical size (GB)  Physical size (GB)  Comp. ratio
100         Vendor-A  1,433              1,433               1.00
100         CSD 2000  1,433              342                 4.19
75          Vendor-A  1,762              1,762               1.00
75          CSD 2000  1,762              365                 4.82
Table-less Hash-based Key-Value Store

[Diagram: a conventional KV store hashes keys into an in-memory hash table (f: K → T) whose entries point into the LBA space L, where KV pairs are tightly packed into 4KB sectors; the table-less design hashes keys directly to LBAs (f: K → L), leaving KV pairs loosely packed with unoccupied space in each sector.]

The KV store purposely under-utilizes the logical storage space to eliminate the hash table, without sacrificing physical storage utilization: the unoccupied space compresses away.
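A toy sketch of the table-less idea (Python, with a dict standing in for the block device; `key_to_lba`, the record layout, and the LBA-space size are all illustrative assumptions, and collision handling is elided):

```python
import hashlib

SECTOR = 4096       # one KV pair per 4 KB sector (loosely packed)
NUM_LBAS = 1 << 20  # assumed size of the expanded LBA space

def key_to_lba(key):
    # Hash function f: K -> L maps a key directly to an LBA,
    # replacing the in-memory hash table of a conventional design.
    digest = hashlib.sha256(key).digest()
    return int.from_bytes(digest[:8], "big") % NUM_LBAS

device = {}  # toy stand-in for the block device; on a CSD, unwritten
             # and zero-padded sectors cost almost no physical flash

def put(key, value):
    # Record layout (illustrative): 2-byte key length, 2-byte value
    # length, key, value, zero-padded to a full sector.
    record = (len(key).to_bytes(2, "big") + len(value).to_bytes(2, "big")
              + key + value)
    assert len(record) <= SECTOR
    device[key_to_lba(key)] = record.ljust(SECTOR, b"\x00")

def get(key):
    sector = device.get(key_to_lba(key))
    if sector is None:
        return None
    klen = int.from_bytes(sector[:2], "big")
    vlen = int.from_bytes(sector[2:4], "big")
    if sector[4:4 + klen] != key:
        return None  # collision; a real design needs e.g. open addressing
    return sector[4 + klen:4 + klen + vlen]

put(b"user:42", b"alice")
print(get(b"user:42"))  # b'alice'
```

Each lookup is a single hash plus one sector read, with no index to maintain in memory or on flash.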
Table-less Hash-based Key-Value Store

- Simple code base & high operational concurrency
- Very small memory footprint
- No frequent background operations (e.g., GC and compaction) ⇒ low and consistent CPU usage

Compared with RocksDB:
- >2x ops/s improvement
- >2x lower average CPU usage

We will open-source this work and are looking for collaborators to grow the community together!
Summary

Storage hardware with built-in transparent compression decouples logical storage space utilization efficiency from physical storage space utilization efficiency, opening the door to system-level innovations.

www.scaleflux.com
tong.zhang@scaleflux.com