Become a SSD expert in minutes!
Ryan Smith
ryan.smith@ssi.samsung.com
408-205-8889
What is a SSD?
- SSD = Solid State Drive
- RAM-based SSDs introduced in the 1970s
- Flash-based version introduced in the 1990s
- Today, it typically uses NAND Flash
- 2012 is a big year for SSDs
- Don’t complicate it: it’s just a really fast drive!
[Chart: # of SSDs sold (millions), 2010 to 2012, by segment: PC, Server, Storage. Source: Samsung]
Why an SSD?
- Three things that dictate the speed of your PC/Server:
- CPU, DRAM, and HDD
- Everything is speeding up... except the HDD
Processor:
- Multi-core
- Higher bandwidth
Memory:
- Larger footprint
- Higher bandwidth
Storage:
- Minor throughput improvements
- Currently solved by adding spindles (more drives)
[Chart: Performance over Time. Closing the gap with Solid State Storage]
Why an SSD?
- Lower response times (latency)
- Higher IOPS and Throughput
- Lower Power
- No RVI (rotational vibration interference) issues, more reliable
[Chart: Random Performance (IOPS) for Read, Write, and 70:30 mix, SM825 vs. 15K RPM HDD. Data labels: 11K, 23K, 43K; x100, x60, x30. Test environment: Intel SR2600UR Server / IOMeter2008]
[Chart: Power Consumption (Watt), Active and Idle, SM825 vs. 15K RPM HDD. Data labels: 8.5, 12.6, 3.2, 1.1; -87%, -75%. Test environment: Intel SR2600UR Server / IOMeter2008 / 4KB RND R70:W30]
Source: Samsung
So what’s there to know about an SSD?
SSD Key Characteristics
- SSD Components
- NAND Characteristics
- P/E Cycles
- WAF
- TBW
- SMART
- Host Interface
- Sustained vs. Peak Performance
- Benchmarking
SSD Influencers
- TRIM
- Over-provisioning
- Changing Workload
SSD Key Characteristics
SSD Components
- Host/NAND Controller
- Firmware
- NAND Flash
- DRAM
- Capacitors (optional)
[Diagram: Host Interface, Controller, Firmware, DRAM, NAND Flash]
All components work closely together
SSD image source: AnandTech
NAND Characteristics
- Types of NAND
- TLC
- MLC
- E-MLC
- SLC
- Geometry / Lithography
- 4xnm, 3xnm, 2xnm
- Smaller = lower cost
- NAND Hierarchy
- Pages: Smallest unit that can be read/written (e.g., 8KB)
- Erase block: Groups of pages (e.g., 64 pages @ 8KB = 512KB)
- TLC: 500-1K P/E Cycles, 1 year retention
- MLC: 3-5K P/E Cycles, 1 year retention
- E-MLC: 10-30K P/E Cycles, 3 month retention
- SLC: 90-100K P/E Cycles, 3 months to 1 year retention
[Diagram: NAND types arranged along a PC to Enterprise axis]
P/E Cycles
- As geometries shrink, error correction must get better
- It’s like a car warranty!
- 3 years or 50,000 miles
- 3 years or 3,000 P/E Cycles
- Not a useful characteristic by itself: endurance also depends on capacity and WAF
[Chart: ECC Requirements by geometry (3xnm, 2xnm, 2ynm); 3,000 P/E Cycles]
Program / Erase Cycles
The # of times a given NAND cell can be programmed & erased
Write Amplification Factor (WAF)
- WAF 1 means 1MB from host writes 1MB to NAND
- WAF 5 means 1MB from host writes 5MB to NAND
Write Amplification Factor
Bytes written to NAND versus bytes written from PC/Server
WAF = (Bytes written to NAND) / (Bytes written from Host)
- Factors that can affect WAF:
- Controller: Flash Translation Layer (FTL), Wear Leveling, Over-provisioning, Garbage Collection
- Host Application: Write Profile (Ran vs. Seq), Free user space / TRIM
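The WAF definition above can be written as a one-line calculation. This is a minimal sketch; the function name and the MB constant are illustrative and not from the slides.

```python
def write_amplification(nand_bytes_written, host_bytes_written):
    """WAF = bytes written to NAND / bytes written from the host."""
    if host_bytes_written <= 0:
        raise ValueError("host_bytes_written must be positive")
    return nand_bytes_written / host_bytes_written


MB = 1024 * 1024
print(write_amplification(1 * MB, 1 * MB))  # 1.0 -> WAF 1
print(write_amplification(5 * MB, 1 * MB))  # 5.0 -> WAF 5
```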
Write Amplification (WAF) Example
- The example below illustrates a WAF of 6
[Diagram: Host updates LBA 0 from A to Z. Over time, the SSD Flash block holds A B C D E F, the Cache holds Z B C D E F, and the Flash block is rewritten as Z B C D E F]
- Host wants to update LBA 0
- No more free pages
- Need to erase entire block
- Read existing data to Cache
- Erase block
- Write modified page and old pages back to Flash
- 4KB from Host, 24KB to NAND
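The read, erase, and rewrite sequence in this example can be sketched as a toy model. It assumes a block of six 4KB pages (inferred from the 4KB and 24KB figures on the slide); the function and variable names are hypothetical, and real controllers normally avoid this worst case by redirecting writes to free pages elsewhere.

```python
PAGE_SIZE_KB = 4  # assumption: 4KB pages, matching "4KB from Host, 24KB to NAND"


def update_full_block(block, page_index, new_data):
    """Update one page of a block that has no free pages left.

    Returns (new_block, host_kb, nand_kb).
    """
    cache = list(block)           # read existing data to cache
    cache[page_index] = new_data  # apply the host's update in cache
    erased = [None] * len(block)  # erase the entire block
    for i, data in enumerate(cache):
        erased[i] = data          # write modified page and old pages back
    host_kb = PAGE_SIZE_KB                # the host wrote a single page
    nand_kb = PAGE_SIZE_KB * len(erased)  # the NAND rewrote every page
    return erased, host_kb, nand_kb


block, host_kb, nand_kb = update_full_block(list("ABCDEF"), 0, "Z")
print(block)              # ['Z', 'B', 'C', 'D', 'E', 'F']
print(nand_kb / host_kb)  # 6.0
```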
TBW
TeraBytes Written
# of terabytes you can write to the drive over its useful life
TBW = ((Capacity GB / 1000) x PE Cycles) / WAF
Examples:
- ((128GB / 1000) * 3000) / 5 = 76.8 TBW
- ((128GB / 1000) * 3000) / 2.5 = 153.6 TBW
- ((256GB / 1000) * 3000) / 5 = 153.6 TBW
- ((128GB / 1000) * 30000) / 5 = 768 TBW
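The TBW formula and the four examples on this slide can be checked with a few lines of code. This is a sketch only; the function name is illustrative.

```python
def tbw(capacity_gb, pe_cycles, waf):
    """TBW = ((capacity in GB / 1000) x P/E cycles) / WAF."""
    return (capacity_gb / 1000) * pe_cycles / waf


# The four examples from the slide, rounded to one decimal place.
print(round(tbw(128, 3000, 5), 1))    # 76.8
print(round(tbw(128, 3000, 2.5), 1))  # 153.6
print(round(tbw(256, 3000, 5), 1))    # 153.6
print(round(tbw(128, 30000, 5), 1))   # 768.0
```

Halving the WAF has the same effect on TBW as doubling the capacity, which is why the second and third examples match.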
SMART
ID   Attribute Name
5    Reallocated Sector Count
9    Power-on Hours
12   Power-on Count
177  Wear Leveling Count
179  Used Reserved Block Count
180  Unused Reserved Block Count
181  Program Fail Count
182  Erase Fail Count
187  Uncorrectable Error Count
195  ECC Error Count
199  CRC Error Count
241  Total LBA Written
- Look at health and various statistics
- Allows for predictable maintenance windows
- Calculate WAF, TBW
- Host GB written = [ID241] / 2 / 1024 / 1024 (512-byte LBAs: dividing by 2 gives KB, then MB, then GB)
- NAND GB written = [ID177] * Capacity GB
- WAF = NAND GB / Host GB
- Expected Life (yrs) = Warranty PE * ([ID9]/24/365) / [ID177]
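The SMART calculations on this slide can be sketched as follows. The sketch assumes 512-byte LBAs and that the raw value of attribute 177 is the average number of P/E cycles consumed, as the slide's formulas imply; attribute meanings vary by vendor. All function names and sample readings are hypothetical.

```python
def host_gb_written(total_lba_written, lba_bytes=512):
    """Host GB written from ID241, assuming 512-byte LBAs."""
    return total_lba_written * lba_bytes / 1024**3


def nand_gb_written(wear_leveling_count, capacity_gb):
    """NAND GB written = ID177 (average P/E cycles used) x capacity."""
    return wear_leveling_count * capacity_gb


def expected_life_years(warranty_pe, power_on_hours, wear_leveling_count):
    """Rated P/E cycles x years in service so far / P/E cycles used so far."""
    years_in_service = power_on_hours / 24 / 365
    return warranty_pe * years_in_service / wear_leveling_count


# Hypothetical readings for a 128GB drive rated at 3,000 P/E cycles.
id241, id177, id9 = 41_943_040_000, 300, 8760

host_gb = host_gb_written(id241)      # 20000.0
nand_gb = nand_gb_written(id177, 128)  # 38400
print(nand_gb / host_gb)               # WAF = 1.92
print(expected_life_years(3000, id9, id177))  # 10.0 years
```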
Host Interface
- This is how you communicate with the SSD
- So many choices...
- SATA
- SAS
- PCIe (NVMe, SCSIe, SATAe, Proprietary)
- Which is right for you?
- PC: SATA, PCIe
- Server: SATA, SAS, PCIe
- External Storage: SATA + SAS bridge, SAS
Sustained vs. Peak Performance
- There can be significant differences in sustained vs. peak
- Run enterprise benchmark (e.g., SNIA RTP 2.0)
- Or even better, run your own workload (or simulated)
[Chart: Ran. Performance @ 4KB (IOPS), NCQ=16, for R/W 100/0, 65/35, and 0/100. Drives: PM830 128GB Value SSD, SM825 200GB Mainstream SSD, Vendor “X” 160GB Value SSD]
[Chart: Seq. Performance @ 1MB (MB/s), NCQ=16, for R/W 100/0, 65/35, and 0/100. Drives: PM830 128GB, SM825 200GB, Vendor “X” 160GB]
Source: Samsung / SNIA RTP2.0 Benchmark
Chart callouts: 99% below Peak; 94% below Peak; 95% below Peak; 95% below Peak; Over 10,000 IOPS!
- Samsung PM830 vs Vendor “X”: 11x Sustained Random Writes
- Samsung PM830 vs Vendor “X”: 2x Sustained Sequential Writes
There is a BIG difference between “Value” and “Mainstream/Enterprise” SSDs when you have any degree of writes in your workload
Benchmarking
- Run a synthetic or actual workload & take measurements
Benchmark: URL
- SNIA RTP 2.0: http://www.snia.org/tech_activities/standards/curr_standards/pts
- Iometer: http://sourceforge.net/projects/iometer/
- ATTO Disk: http://www.attotech.com/products/product.php?sku=Disk_Benchmark
- CrystalDiskMark: http://crystalmark.info/software/CrystalDiskMark/index-e.html
- HD Tune Pro: http://www.hdtune.com/
- AS SSD (SSD): http://alex-is.de/PHP/fusion/downloads.php?download_id=9
- Anvil (SSD): http://thessdreview.com/latest-buzz/anvil-storage-utilities-releases-new-storage-and-ssd-benchmark/
- Scripts: Have multiple “dd” running with best guess workload, capturing timing/speeds
- Real Workload: Capture trace during real workload and playback (ioapps, blktrace/btreplay)
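The "Scripts" approach (running writers with a best-guess workload and capturing timing/speeds) could look roughly like the sketch below. It is a Python stand-in for a single "dd" process, not a replacement for the tools listed; the function names, block sizes, and byte counts are assumptions. Short runs like this measure peak rather than sustained behavior, and the figure includes file system overhead.

```python
import os
import tempfile
import time


def timed_write(path, block_size, block_count):
    """Write block_count blocks of block_size bytes sequentially; return MB/s."""
    payload = os.urandom(block_size)
    start = time.perf_counter()
    with open(path, "wb") as f:
        for _ in range(block_count):
            f.write(payload)
        f.flush()
        os.fsync(f.fileno())  # ask the OS to push the data to the drive
    elapsed = time.perf_counter() - start
    return (block_size * block_count) / (1024 * 1024) / elapsed


def run_benchmark(directory, block_sizes=(4096, 1024 * 1024),
                  total_bytes=8 * 1024 * 1024):
    """Time the same amount of data at each block size; return {size: MB/s}."""
    results = {}
    for block_size in block_sizes:
        fd, path = tempfile.mkstemp(dir=directory)
        os.close(fd)
        try:
            results[block_size] = timed_write(
                path, block_size, total_bytes // block_size)
        finally:
            os.remove(path)  # clean up the scratch file
    return results


if __name__ == "__main__":
    for size, mbps in run_benchmark(tempfile.gettempdir()).items():
        print(f"{size} byte blocks: {mbps:.1f} MB/s")
```

To approximate several "dd" processes, the same function can be launched from multiple processes against the drive under test, with a much larger total_bytes.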
SSD Reviewers
- Good SSD review sites are available
SSD Influencers
TRIM
- Helps the SSD know which blocks aren’t used
- Widely supported standard: Windows, Mac OS X, Linux, hdparm
- Better sustained performance and extends TBW
- Without TRIM, the SSD only knows a block isn’t used once the same LBA is written to
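[Diagram, “No TRIM needed”: over time the host replaces “Hi” with “Bye” at LBA 0, and the SSD sees LBA 0 rewritten]
[Diagram, “TRIM makes SSD aware”: over time the host removes “Hi” from LBA 0 and writes “Bye” to LBA 1; the SSD is unsure (“?”) about LBA 0 until TRIM is sent]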
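The difference TRIM makes can be illustrated with a toy model of what the SSD believes is still valid. This is a sketch with hypothetical class and function names; a real FTL tracks physical pages, not just a set of LBAs.

```python
class ToySSD:
    """Toy model: tracks which LBAs the SSD still believes hold valid data."""

    def __init__(self):
        self.valid_lbas = set()

    def write(self, lba):
        # Rewriting an LBA replaces its old data, so the set does not grow.
        self.valid_lbas.add(lba)

    def trim(self, lba):
        # TRIM tells the SSD the host no longer uses this LBA.
        self.valid_lbas.discard(lba)


def delete_then_write_elsewhere(ssd, use_trim):
    ssd.write(0)     # host writes "Hi" to LBA 0
    if use_trim:
        ssd.trim(0)  # host deletes "Hi" and sends TRIM for LBA 0
    ssd.write(1)     # host writes "Bye" to LBA 1
    return sorted(ssd.valid_lbas)


print(delete_then_write_elsewhere(ToySSD(), use_trim=False))  # [0, 1]
print(delete_then_write_elsewhere(ToySSD(), use_trim=True))   # [1]
```

Without TRIM the SSD still treats LBA 0 as valid data and keeps preserving it, which costs free space and extra NAND writes.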
Over-Provisioning
- Helps a few things:
- Improves Write Performance
- Reduces WAF, Increases TBW
[Diagram: 128GB of NAND split into a 100GB User Area and 28GB Reserved (28% O/P)]
128GB Base-2 to Base-10 conversion: 137,438,953,472 to 128,000,000,000 (6.9%)

Sample 128GB SSD              120GB    100GB
Over-Provisioning             7%       28%
Random Read (8K) IOPS         80K      80K
Random Write (8K) IOPS        1,800    6,300   (3.5x)
Sequential Read (64K) MB/s    500      500
Sequential Write (64K) MB/s   400      400
4KB Random WAF                5        1.35    (-73%)
4KB Random TBW                15       45      (3x)
These performance numbers are fictitious but do represent the actual benefits seen during tests
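The over-provisioning percentages on this slide can be reproduced with simple arithmetic. The sketch assumes O/P is expressed relative to the user area, which matches the 7% and 28% figures, while the 6.9% base-2 to base-10 figure is relative to the raw base-2 capacity; the function name is illustrative.

```python
def over_provisioning_pct(raw_gb, user_gb):
    """Reserved space as a percentage of the user area (assumed convention)."""
    return (raw_gb - user_gb) / user_gb * 100


print(round(over_provisioning_pct(128, 120)))  # 7
print(round(over_provisioning_pct(128, 100)))  # 28

# Built-in reserve from the base-2 to base-10 conversion of "128GB".
raw_bytes = 128 * 2**30          # 137,438,953,472
advertised_bytes = 128 * 10**9   # 128,000,000,000
print(round((raw_bytes - advertised_bytes) / raw_bytes * 100, 1))  # 6.9
```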
Change Write Workload
- Write sequentially instead of randomly to reduce WAF
- If you have control of the I/O to the disk, this will pay off
- Align your writes with the page boundaries (e.g., 8KB)
- If alignment is too hard to implement, just increase your IO size
[Diagram: Change block alignment, Host vs. SSD with 8K pages. An 8K write starting at LBA 0 or LBA 16 is page-aligned (only 1 page needed); an 8K write starting at LBA 8 needs 2 pages]
MLC 512GB SSD: Random 60 TBW, Sequential 1250 TBW (20x)
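The effect of alignment and IO size on the number of NAND pages touched can be worked out directly. This sketch assumes 512-byte LBAs and 8KB pages, as in the slide's example; the function name is hypothetical.

```python
def pages_touched(start_lba, length_bytes, page_size=8192, lba_size=512):
    """Number of NAND pages a write overlaps, given its start LBA and length."""
    start_byte = start_lba * lba_size
    end_byte = start_byte + length_bytes - 1
    return end_byte // page_size - start_byte // page_size + 1


print(pages_touched(0, 8192))   # 1: aligned at LBA 0
print(pages_touched(16, 8192))  # 1: aligned at LBA 16 (16 x 512 = 8KB)
print(pages_touched(8, 8192))   # 2: starts mid-page, straddles two pages

# A larger misaligned IO still pays one extra page, but it is a smaller share.
print(pages_touched(8, 65536))  # 9 pages for 8 pages' worth of data
```

The misaligned 8KB write touches twice the pages it needs, while the misaligned 64KB write touches 9 instead of 8, which is why increasing IO size helps when alignment is hard to implement.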
Applications of SSDs
HDD Replacement
- Replace boot drive or main storage
- Fastest and easiest way to experience SSDs
[Diagram: SSD replacing HDD in a Server and in Storage]
Caching Appliance
- Read and/or Write Cache
- Sits between servers and storage, typically in a SAN
- Used to speed up legacy or slower storage
[Diagram: Servers, Cache (SSD), Storage (HDD)]
Tiered Storage
- An external storage device (NAS, SAN)
- Only puts “hot” or “critical” data on SSD
- Most of the storage is still on HDD
[Diagram: Servers connected to Storage containing SSD and HDD]
All Flash Storage
- External storage based on 100% SSD/Flash
- Typically uses MLC and de-duplication/compression to achieve better pricing
- Designers of these systems are Flash experts