Monthly Webinar Series: Understanding storage performance for hyperconverged infrastructure
Luke Pruen, Technical Services Director

Virtual SANs made simple

Introducing StorMagic
Enabling HCI
Pre-configured, certified and supported by major vendors
Large and Small
Large and small deployments, from enterprises with 1000s of sites to SMEs with a single site
Global footprint
In 72 countries, customers depend on StorMagic for server and storage infrastructure
Partner Network
Wherever you are, StorMagic has resellers, integrators, and server partners to meet your needs
30+ verticals
Including retail, financial services, healthcare, government, education, energy, professional services, pharma, and manufacturing
What is Hyperconverged Infrastructure?
“Tightly coupled compute, network and storage hardware that dispenses with the need for a regular storage area network (SAN).”
Magic Quadrant for Integrated Systems: Published October 10th 2016
There’s a lot of choice out there
The hyperconverged market
- Market is maturing with many options now available
- Be careful of pursuing a “one size fits all” approach
Customers
- Few customers understand their requirements
- Often blindly deploy over-spec’d solutions
Our take
- Customers need to be able to measure their needs more accurately
- Real world data often provides a surprising insight
Hyperconverged Architectures: Kernel based
- Storage “software” is within the hypervisor
- Pools local server storage where the hypervisor is installed
- Presents storage over a proprietary mechanism
- Claims to be more efficient and able to deliver higher performance
  - More efficient because it's where the hypervisor runs
  - Fewer "hops" to the storage
[Diagram: three servers, each with SSDs and storage software embedded in the hypervisor, pooled into shared storage]
Hyperconverged Architectures: VSA based
- A virtual storage appliance resides on the hypervisor
- Host storage assigned to the local VSA
- Storage is generally presented as iSCSI or NFS
- Claims to be more flexible than kernel based models
  - Hypervisor agnostic
  - More storage flexibility
  - Easier to troubleshoot storage issues vs hypervisor issues
[Diagram: three servers, each with SSDs assigned to a local VSA, pooled into shared storage]
StorMagic SvSAN: Overview
“SvSAN turns the internal disk, SSD and memory of 2 or more servers into highly available shared storage”
StorMagic SvSAN: Benefits
Availability
- Data & operations protected
- No single point of failure
- Local and stretched cluster capable
- Split-brain risk eliminated

Robust — any site, any network
- Proven at the IT edge and in the datacenter, from the harshest to the most controlled environments
- Supports mission-critical applications
- Tolerates poor, unreliable networks

Enterprise-class management
- Integrates with standard tools
- Automated deployment and recovery scripts
- Designed for use by any IT professional

Cost-Effective — lightest footprint, lowest cost
- No more physical SANs: converge compute and storage, utilize the power of commodity servers, eliminate storage networking components
- Lowest CAPEX: start with only 2 servers (existing or new), significantly less CPU and memory, one lightweight quorum for all clusters
- Lowest OPEX: reduced power, cooling and spares; lower costs with centralized management; eliminate planned and unplanned downtime

Flexible — today's needs, future proofed
- Performance and scale: leverage any CPU and storage type, Active/Active synchronous mirroring, scale-up performance with a 2-node cluster
- Build-your-own hyperconverged: eliminate appliance overprovisioning, configure to precise IOPS & capacity, auto-tier disk, SSD and memory
- Flexibility and growth: hyperconverged or storage-only, hypervisor and server agnostic, non-disruptive upgrades
Optimizing Storage: Not all storage is equal
Magnetic drives provide poor random performance
- SATA 7.2k rpm: 75–100 IOPS
- SAS 10k/15k rpm: 140–210 IOPS
- Lower cost per GB, higher cost per IOPS

Flash and SSDs have good random performance
- SSD/Flash: 8.6K to 10 million IOPS
- Lower cost per IOPS
- Higher cost per GB compared to magnetic

Memory has even better performance
- Orders of magnitude faster than Flash/SSD
- Much higher cost per GB compared to SSD/Flash
- Volatile and typically low in capacity
*https://en.wikipedia.org/wiki/IOPS *https://en.wikipedia.org/wiki/RAM_drive
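The cost trade-off above can be made concrete with a little arithmetic. The prices and specs below are hypothetical round numbers chosen only to illustrate the pattern, not vendor quotes:

```python
# Illustrative cost comparison. All prices/capacities/IOPS figures here
# are hypothetical examples, not real product data.
drives = {
    "SATA 7.2k HDD": {"price_usd": 100, "capacity_gb": 4000, "iops": 90},
    "SAS 15k HDD":   {"price_usd": 250, "capacity_gb": 1200, "iops": 200},
    "SATA SSD":      {"price_usd": 200, "capacity_gb": 1000, "iops": 90000},
}

for name, d in drives.items():
    cost_per_gb = d["price_usd"] / d["capacity_gb"]
    cost_per_iops = d["price_usd"] / d["iops"]
    print(f"{name:14s}  ${cost_per_gb:.3f}/GB  ${cost_per_iops:.4f}/IOPS")
```

Even with rough numbers, the conclusion is stable: magnetic wins on cost per GB, flash wins on cost per IOPS by orders of magnitude.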
Optimizing Storage: The importance of caching
Virtualized environments suffer from the ‘I/O blender’ effect
- Multiple Virtual Machines sharing a set of disks
- Resulting in predominantly random I/O
- Magnetic drives provide poor random performance
- SSD & Flash storage ideal for workloads but expensive
Working sets of data
- Driven by workloads which are ever changing
- Refers to the amount of data most frequently accessed
- Always related to a time period
- Working set sizes evolve as workloads change
Caching
- Combat the I/O blender effect without the expense of all Flash or SSD
- Working sets of data can be identified and elevated to cache
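The working-set idea above can be sketched in a few lines. Assuming a trace of (timestamp, block) accesses (the trace here is synthetic), the working set of a time window is just the set of distinct blocks touched within it:

```python
from collections import defaultdict

# Synthetic trace of (timestamp_seconds, block_address) accesses.
trace = [
    (0, 10), (1, 11), (2, 10), (61, 10), (62, 99), (121, 10),
]

WINDOW = 60  # working sets are always relative to a time period

# Group accesses into windows; each window's working set is the
# set of unique blocks it touched.
windows = defaultdict(set)
for ts, block in trace:
    windows[ts // WINDOW].add(block)

for w in sorted(windows):
    print(f"window {w}: working set = {len(windows[w])} blocks")
# window 0: 2 blocks, window 1: 2 blocks, window 2: 1 block
```

A real analysis would use LBA ranges from an I/O trace rather than single block addresses, but the measurement is the same: unique data touched per period.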
Optimizing Storage: SSD/Flash caching
SSD/Flash Caching
- Significantly improves overall I/O performance
- Reduces the number of I/Os going directly to disk
- Dynamic cache sizing based on the read/write ratio
Write operations
- Data is written as variable sized extents
- Extents are merged and coalesced in the background
- Data in cache is flushed to hard disk regularly in small bursts
Read operations
- SvSAN algorithm identifies and promotes data, based on access patterns
- Frequently accessed data blocks are elevated to SSD/Flash
- Least frequently accessed blocks are aged out
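The write path described above — merging and coalescing extents before flushing to disk — can be sketched as a simple interval merge. This is an illustrative sketch of the technique, not SvSAN's actual implementation:

```python
def coalesce(extents):
    """Merge overlapping or adjacent (start, end) byte extents.

    Fewer, larger writes reach the disk than were issued to the cache.
    """
    merged = []
    for start, end in sorted(extents):
        if merged and start <= merged[-1][1]:   # touches previous extent
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged

# Four dirty extents in cache flush to disk as two writes.
dirty = [(0, 4096), (4096, 8192), (16384, 20480), (2048, 6144)]
print(coalesce(dirty))  # -> [(0, 8192), (16384, 20480)]
```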
Optimizing Storage: Cold & Hot data
Intelligent read caching algorithm
- All read I/Os are monitored and analyzed
- Most frequently used data – “Hot” data
- Cache tiers are populated based on access frequency
Tiering
- RAM: Most frequently accessed data
- SSD/Flash: Next most frequently accessed data
- HDD: Infrequently accessed data – “Cold” data
Sizing
- Assign cache sizes to meet requirements
- Grow caches as working sets change
- Use any combination of Memory, SSD/Flash and Disk
Play to the strengths
- Play to the strengths of all mediums
- Memory: highest IOPS
- SSD/Flash: strong random performance
- Magnetic drives: low price per GB
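A minimal sketch of frequency-based tier placement, assuming per-block access counts have already been collected. The policy (fill RAM with the hottest blocks, SSD next, the rest on HDD) and the numbers are illustrative, not SvSAN's actual algorithm:

```python
def place_tiers(access_counts, ram_blocks, ssd_blocks):
    """Place the hottest blocks in RAM, the next hottest on SSD,
    and leave the cold remainder on HDD. Capacities are in blocks."""
    ranked = sorted(access_counts, key=access_counts.get, reverse=True)
    ram = set(ranked[:ram_blocks])
    ssd = set(ranked[ram_blocks:ram_blocks + ssd_blocks])
    hdd = set(ranked[ram_blocks + ssd_blocks:])
    return ram, ssd, hdd

# Hypothetical access counts: a few hot blocks dominate.
counts = {"a": 900, "b": 500, "c": 40, "d": 8, "e": 2}
ram, ssd, hdd = place_tiers(counts, ram_blocks=1, ssd_blocks=2)

total = sum(counts.values())
hit_ram = sum(counts[b] for b in ram) / total
print(f"{hit_ram:.0%} of accesses served from RAM")  # 62%
```

Because access frequency is heavily skewed, even one RAM-resident block absorbs most of the I/O — which is why a small cache can make a big difference.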
Industry performance numbers
Lab produced
- Numbers produced under strict conditions representing peak IOPS
- Random workloads focus on small block sizes to produce BIG numbers
- Sequential workloads focus on large block sizes to show BIG throughput
- Set unrealistic expectations
Example
- All Read: 4 KiB, 100% random read
- Mixed Read/Write: 4 KiB, 70/30 random read/write
- Sequential Read: 256 KiB
- Sequential Write: 256 KiB
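Why small blocks headline IOPS while large blocks headline throughput falls directly out of throughput = IOPS × block size. The device figures below are hypothetical round numbers:

```python
# Throughput = IOPS x block size. Hypothetical figures, chosen only to
# show how block size shapes the headline number.
KiB = 1024
workloads = [
    ("4 KiB random read",   100_000,   4 * KiB),
    ("256 KiB sequential",    2_000, 256 * KiB),
]
for name, iops, block_size in workloads:
    mbps = iops * block_size / 1_000_000
    print(f"{name:20s} {iops:>7,} IOPS -> ~{mbps:,.0f} MB/s")
```

The random workload posts a huge IOPS number but modest throughput; the sequential workload posts tiny IOPS but a big MB/s figure. Neither says much about a mixed, real-world workload.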
The real world
- Multiple VMs running numerous mixed workloads
- AD, DNS, DHCP: low IOPS requirement
- Database, email and application servers: higher IOPS requirement
- Generally sharing the same storage subsystem
*SQL Server I/O block size reference table

| Operation                  | I/O Block Size    |
| Transaction log write      | 512 bytes – 60 KB |
| Checkpoint/Lazywrite       | 8 KB – 1 MB       |
| Read-Ahead Scans           | 128 KB – 512 KB   |
| Bulk Loads                 | 256 KB            |
| Backup/Restore             | 1 MB              |
| ColumnStore Read-Ahead     | 8 MB              |
| File Initialization        | 8 MB              |
| In-Memory OLTP Checkpoint  | 1 MB              |
How much performance is enough?
What do you need?
- Understand and document your storage requirements
- IOPS and latency requirements of the current environment
- Lifecycle of solution?
How do you choose?
- Don’t base your decision on a 4KiB 100% random read workload
- When evaluating use realistic workloads
- Does the management and functionality meet your needs?
What matters
- Meets your current performance & capacity requirements
- Meets your future performance & capacity requirements
- Meets your deployment, management and availability requirements
Customer data analysis
Real life data
- Real customer data collected and analysed
- Exact data patterns simulated and replayed
- Accurate performance expectations under their workloads
Results
- Average customer would benefit from caching/tiering
- Up to 70% of I/O being satisfied from read cache
- A small amount of cache makes a big difference
Conclusion
- Few customers have performed this exercise
- Over-provisioned hardware is common
- Significant cost savings were identified
Customer examples
- UK - Oil & Gas
- US - On-demand consumer service
- US - National retailer
Customer data analysis: Oil and Gas (UK)
|                    | Read  | Write |
| Read/Write Ratio   | 53%   | 47%   |
| Average Per Day    | 93 GB | 84 GB |
| Average Block Size | 61 KB | 24 KB |
| Average IOPS       | 18    | 41    |
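As a sanity check, the average IOPS figures follow from the daily volume and average block size (assuming decimal GB and KB):

```python
def avg_iops(gb_per_day, block_kb):
    """Average IOPS ~= bytes per day / average block size / seconds per day."""
    return gb_per_day * 1e9 / (block_kb * 1e3) / 86400

print(round(avg_iops(93, 61)))  # reads  -> 18
print(round(avg_iops(84, 24)))  # writes -> 41
```

This matches the collected averages, and shows how modest the steady-state I/O demand really is for a workload moving ~180 GB a day.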
Workloads
- Back office apps
- Back up service
Challenge
- Customer was looking to understand current workloads
- No definitive indication of current storage performance requirements
- Concerned about growing number of disk failures
StorMagic analysis
- Enabled I/O meta-data collection over a period of time
  - Distribution of I/O sizes
  - Throughput and IOPS
  - Locality of access
[Chart: locality of access — number of accesses (log scale) across the 5.7 TB address space, read vs write]
[Chart: throughput — IOPS over 24 hours by time of day (UTC), read vs write]
Customer data analysis: Oil and Gas (UK)
Estimates
- SSD & memory: 70% of I/O satisfied from read cache when using 2GB memory and 200GB SSD
- Memory only: 56% of I/O satisfied from read cache when using 2GB of memory

Testing
- Replayed the exact workload collected from the live environment
- No best-guess synthetic workload, but the exact patterns from data collection

Conclusion
- Current environment is sufficient for today's workloads
- Allocate a small amount of memory and SSD for optimal caching
- Using caching would increase disk MTBF
[Chart: read data serviced by tiers — 17% / 23% / 60%]
[Chart: estimated % of reads serviced by tiers — Memory 30%, SSD 14%, Disk 56%]
[Chart: actual data read from tiers — Memory 28.79 GB, SSD 13.52 GB, Disk 53.94 GB]
Customer data analysis: On-demand consumer service (US)
|                    | Read  | Write |
| Read/Write Ratio   | 40%   | 60%   |
| Average Per Day    | 9 GB  | 13 GB |
| Average Block Size | 30 KB | 11 KB |
| Average IOPS       | 5     | 15    |
Workloads
- Network monitoring for on-demand service
- Back office apps
Challenge
- Customer was evaluating Hyperconverged solutions
- Was considering an all-flash configuration
StorMagic analysis
- Enabled I/O meta-data collection over a period of time in a live POC
  - Distribution of I/O sizes
  - Throughput and IOPS
  - Locality of access
[Chart: throughput — IOPS by time of day (UTC), read vs write]
[Chart: locality of access — number of accesses (thousands) across the ~1 TB address space, read vs write]
Estimates
- SSD & memory: 94% of I/O satisfied from read cache when using 2GB memory and 120GB SSD
- Memory only: 94% of I/O satisfied from read cache when using 2GB of memory

Testing
- Replayed the exact workload collected from the live environment
- No best-guess synthetic workload, but the exact patterns from data collection

Conclusion
- Environment sufficient for workloads
- Allocate a small amount of memory to satisfy almost all reads
[Chart: read data serviced by tiers — 6% / 94%]
[Chart: data read from tiers — 0.37 GB / 67.85 GB (legend: Memory, SSD, Disk)]
Customer data analysis: National retailer (US)
|                    | Read   | Write  |
| Read/Write %       | 77%    | 23%    |
| Average Per Day    | 991 GB | 294 GB |
| Average Block Size | 58 KB  | 54 KB  |
| Average IOPS       | 212    | 138    |
Workloads
- Point of Sale
- 78 in-store applications
- Back up service
Challenge
- Customer is looking at a hardware refresh across store locations
- How to size for current environment and future growth?
StorMagic analysis
- Enabled I/O meta-data collection over a period of time
  - Distribution of I/O sizes
  - Throughput and IOPS
  - Locality of access
[Chart: throughput — IOPS by time of day (UTC) over three days, read vs write]
[Chart: locality of access — number of accesses (log scale, thousands) across the 2.6 TB address space, read vs write]
Customer data analysis: National retailer (US)
Estimates
- SSD and memory: as high as 83% of I/O satisfied from read cache when using 8GB memory and 200GB SSD
- Memory only: as high as 60% of I/O satisfied from read cache when using 8GB of memory

Testing
- No SSD used in testing; focused only on memory read caching
- Replayed the exact workload collected from the live environment
- No best-guess synthetic workload, but the exact patterns from data collection
Conclusion
- Use SATA and memory caching vs SAS disks
- Reduced drive count but doubled performance and capacity
- Reduced power and cooling
- Fewer disks means an increase in Mean Time Between Failures (MTBF)
[Chart: read data serviced by tiers — 17% / 23% / 60% (legend: Memory, SSD, Disk)]
What did these customers learn?
- Clarity into their environment workloads
- They could spend less on hardware
- How caching technologies will benefit their workloads
- Identifying their immediate requirements
- Know the performance limits of environment
- How the environment can scale to meet performance growth
Summary
- Understand your requirements
- Create viable success criteria
- Only compare solutions that make sense
- Be wary of lab produced numbers!
- Storage performance can be affected by other factors
- Simulate workloads you plan to run
Q&A and Next Steps
SvSAN Product Information
Product Options

| SvSAN license           | 2, 6, 12 and unlimited TBs   |
| License entitlement     | 2 mirrored servers           |
| Maintenance and support | Platinum: 24x7 / Gold: 9x5   |
For further information, please contact: sales@stormagic.com

Further Reading:
- An overview of SvSAN: http://stormagic.com/svsan/
- SvSAN v6 Data Sheet: http://stormagic.com/svsan-data-sheet/
- SvSAN v6 White Paper: http://stormagic.com/svsan-6/

Download your free trial of SvSAN