Object-based SSD (OSSD): Our Practice and Experience
Jaesoo Lee (jaesu.lee@samsung.com)
Flash Solution Team, Memory Division, Samsung Electronics Co.
Outline
- Part 1. OSD and Object-based SSD
  - Introduction
  - Our Practice and Experience
- Part 2. Application-Aware Storage
  - Introduction
  - Future Directions
- Summary and Conclusion
HDD has been the main storage for decades
- Its structure is relatively simple: the physical location of a logical block can be easily derived

  LBA = (c · N_heads + h) · N_sectors + (s − 1)

- The storage stack has remained static
Narrow block interfaces (ATA, SCSI)
- No information flows across the interface except block reads and writes
- File systems make HDD-specific assumptions:
  - Sequential read is much faster than random read
  - No write amplification, wear-out, background activity, …

What if the underlying device changes?
[Excerpted from Block Management in SSD, USENIX 2009]
Host-side optimization is no longer feasible
- SSD internals are confidential, subject to change, and different among SSDs:
  - Striping method
  - Buffer cache management policy
  - Logical-to-physical mapping
  - Garbage collection policy
  - Wear-leveling policy
  - Bad block management
  - # of channels, # of ways, # of planes per chip, # of blocks per plane, # of pages per block
  - Block size, page size
Preliminary results with an in-house SSD-aware file system
- Up to 1700% improvement for random writes
- 13% degradation in sequential write performance

[Figure: normalized throughput vs. I/O size (4KB–8192KB), SSD-aware FS vs. ext3]
OSD: an optional command set defined for SCSI devices
- Provides an object-based interface instead of the traditional block-based interface
- In OSD, an object is a flexibly-sized data container with:
  - A unique object ID
  - A set of attributes
OSD manages space for objects
- Informed cleaning (utilizing delete information)
- Stripe-aligned accesses
- Logical-to-physical mapping

OSD supports object attributes
- Wear-leveling using cold-data information
- Priority assigned to objects (i.e., QoS)

OSD handles low-level operations
- Block management in the SSD

[Source: Block Management in SSD, USENIX 2009]
Linux provides a ready-to-use OSD prototype
- OSC-OSD: iSCSI OSD target
- Open-iSCSI: iSCSI transport

[Figure: Linux storage stack — vfs over exofs, ext3, udf; SCSI core (scsi_mod.ko) with bsg, sd, sr, st; iSCSI and SCSI initiators over SCSI/SATA transports]
OSD is the most promising for mobile storage
- Storage and system vendors are inherently decoupled
- QoS provisioning is essential
- e.g., managed flash memory applications including eMMC, UFD, miniSD, SD, …

Our primary target is a Linux-based mobile platform
Part 1. OSD and Object-based SSD: Our Practice and Experience
OSSD prototyped in a Linux-based host
- Layered driver: OSSD ULD / MLD / LLD, derived from open-osd and the Linux kernel
- MLD: object mapping, space management, …
- A new type of SSD was developed (called rawSSD) exposing a flash command interface:
  - Get rawSSD information
  - Erase all blocks / erase a block
  - Read a flash page
  - Program a flash page

[Figure: Host (Linux) with exofs over OSSD ULD/MLD/LLD, connected via a SATA link to the rawSSD (flash command interface over SCSI/SATA, flash management)]
Overall architecture (OSSD MLD)
- Index (RB-tree) keyed by {PID, OID} [127:0]
- Allocator, GC, and wear-leveling driven by OSD commands
- Object descriptors, metadata, and user data laid out on the flash media through the OSSD LLD (READ / PROGRAM / ERASE)
Object descriptor
- Contains metadata including extents and attributes; cached in memory (LRU)
- Descriptor header (1 page, 16KB): object ID, PPN, object information
- Block table: PPNs for object data; an extent supports 9MB, an indirect table supports 4GB
- Attributes: attribute data of the object, usually containing the i-node and length, managed using the original code
Supports page-wise mapping
- Based on page extents

Association policies of an update block
- No separation (called unified)
- Separation of index and user data, fully associative (called index/data)
- Object associative (called per object)
Garbage collection
- Victim selection: the block with the highest number of invalid pages; a bitmap tracks invalid-page information
- Background GC supported
OSSD LLD
- Provides block + rawSSD-specific interfaces on top of the SCSI subsystem
- Sanity checker: detects page overwrites and out-of-order page writes
- Provides backend flexibility via a backend dispatcher:
  - RAM backend
  - Loop backend
  - rawSSD backend, via the SCSI-ATA Translation Layer (SATL)
Host system
- Quad Q9650 3GHz, 4GB RAM
- Linux kernel 2.6.33
- 4GB partition for OSSD
- Workload: write threads and a read thread

[Figure: host stack — VFS / exofs / OSSD driver over SCSI/SATA to the rawSSD (flash command interface, flash management)]
Effect of incorporating space management in the OSSD
Effect of having knowledge of the contexts of data blocks and requests
Scenario
- Write 800MB with 4 threads; delete 2 files; write 800MB with 2 threads
- Shows the effect of fragmentation

                                        Unified     Index/Data   Index/Per object
  Write 800MB, 4 threads                3m48.470s   3m52.263s    3m49.769s
  After delete, write 800MB, 2 threads  6m34.535s   2m59.239s    1m59.375s
  Erase count                           689         599          414
  Valid copies                          34887       23605
Scenario
- 200MB read over a total of 2.4GB written (e.g., 300MB × 8)
Part 2. Application-Aware Storage: Introduction
Capacity of HDDs grows exponentially
- The ratio of capacity to interface bandwidth (C/B ratio) also grows

[Figure: capacity (MB, log scale) and C/B ratio (sec) vs. year, 1990–2010, for Maxtor 7000 (IDE), Quantum Fireball ST (UltraATA/33), IBM Deskstar 16GP (UltraATA/33), Seagate Barracuda IV (UltraATA/100), Seagate Barracuda (SATA/300)]
With a small database proxy inside the device, the amount of data transferred to the host can be reduced
- scan, aggregation, join, sorting, …

Other promising areas
- Data mining, search indexing, image processing, anti-virus, …
Application-awareness can be easily achieved in an OSSD
- Integrated object (i.e., file) management
- Fluent attribute mechanism

Furthermore, the SSD is no longer dumb
- 4 Cortex-R4 CPUs, SATA 6Gbps, 512MB RAM, AXI bus matrix, 16 flash memory controllers, …
What about moving some of the application's work to the storage device?
- The issues are the models for programming, execution, and deployment
Part 2. Application-Aware Storage: Our Practice and Experience
PostgreSQL
- Free and open-source object-relational database
- We developed an OSSD plug-in for it; offloaded operations are aggregation and selection
- Otherwise, a query is processed in a similar way to the original path

[Figure: query parser / planner / optimizer / executor with the OSSD plug-in; the plug-in calls through ioctl, VFS, exofs, and the OSD driver in the kernel to the DB proxy in the DB-aware OSSD]
Aggregation
- Count, Sum, Average
- Example: SELECT count(*) FROM emp WHERE age < 30;
- The device scans the table and aggregates in place, returning only the result

  Name  Age  Salary
  Mc    29   3000
  Kim   26   1200
  Na    34   4000
  Kang  28   1400
  Lim   25   400
  …     …    …
Selection, Projection
- Filtering (not yet)
- Example: SELECT * FROM emp WHERE age > 30;
- The device scans the table and returns only the matching rows:

  Name  Age  Salary
  Na    34   4000
Offloaded query template:

  SELECT AGGREGATE(target_col) FROM t1 WHERE cond_col COND value;

Currently,
- Running on the existing exofs file system
- The host (PostgreSQL on Linux) translates the inode to an object ID; the OSSD MLD performs the sequential read, processing, and aggregate function (through exofs, OSSD ULD/LLD, over SCSI/SATA)
Experimental setup
- Hardware: AMD Athlon64 X2 7750 2.7GHz, 3GB RAM (PC6400), WD3200AAKS HDD (SATA2/7200rpm/16MB)
- OS and drivers: Linux kernel 2.6.33 (Fedora Core 13); target: OSC-OSD; file system and driver: Open-OSD exofs and drivers
- Database: PostgreSQL 8.4.4

Results will be available soon!
Part 2. Application-Aware Storage: Future Directions
Native support of OSD commands in the SSD via OSD command tunneling
- Virtually any communication protocol can be carried over SATA
- JVM and deployment middleware; storage programming framework

[Figure: Linux host (exofs, OSSD ULD, OSSD LLD over SCSI/SATA) tunneling the OSD command interface over the SATA link to the OSSD (OSSD command handler, object layer, FTL, SATA buffer manager)]
2-phased protocol
- 1st phase (command transfer): the OSD command is transferred as the payload of a SATA vendor command (register FIS)
- 2nd phase (optional, data transfer): H2D or D2H data
The ATA/SCSI protocol is too limited for application-specific extensions
- We extended SATA with a simple bi-directional messaging command
  - Based on reserved LBAs and a notification mechanism
  - Serves as the link layer for other advanced protocols, e.g., TCP/IP

[Figure: host application and device-side application agent exchanging bi-directional messages over the ATA/SCSI transport and SATA/SAS link, alongside the FTL in the HDD/SSD]
What is i2SSD?
- An object-based SSD capable of extending its functionality via dynamic application deployment
- i.e., an "Open Storage Architecture for Enabling Storage-Aware Application Deployment"
- The benefit is significantly saved time and energy: computation happens near the data
Summary and Conclusion
OSD fits well for SSDs
- Free from variations in SSD internals
- Easy application-specific extension
- Advanced features (e.g., QoS) made possible

We have developed a proof-of-concept OSSD, and the results look promising

Standardization
- SCSI/SATA extensions for OSD command tunneling
- SCSI/SATA extensions for a bi-directional communication command set

i2SSD framework and programming interfaces

Call for participation from the Linux community
- OSSD bring-up issues in Linux
- End-to-end storage QoS support in Linux