SLIDE 1

Alleviating I/O Interference via Caching and Rate-Controlled Prefetching without Degrading Migration Performance

Morgan Stuart, Tao Lu, Xubin He

Storage Technology & Architecture Research Lab

Electrical and Computer Engineering Dept., Virginia Commonwealth University

Parallel Data Storage Workshop November 16, 2014

SLIDE 2

Summary

  • 1. Virtualization and Migration
  • 2. Migration-Induced Storage I/O Interference
  • 3. Storage Migration Offloading
      ○ a. Rate-controlled storage read
      ○ b. Caching the migrating VM’s accesses
      ○ c. Prefetching bulk data

SLIDE 3

Migration Overview

  • Virtual Machine (VM) adoption is huge
      ○ Flexibility for enterprise datacenters, HPC, and cloud
  • Live migration is a key enabler
      ○ Move a running VM without shutting it down
      ○ Federate and increase manageability

SLIDE 4

Migration Data

  • Early migration required shared storage (Clark et al., NSDI’05)
      ○ Source and destination could both access the virtual disk
      ○ Only the memory and state required transfer
  • Full migrations are now possible (Bradford et al., VEE’07)
      ○ The virtual disk must be moved as well
      ○ Much more data: the average cloud vDisk is ~60 GB (Birke et al., FAST’14)

SLIDE 5

Progress in Migration Research

  • Focus on migration performance
      ○ Reduce migration latency: the time between migration start and completion
      ○ Reduce migration downtime: the length of stop-and-copy
  • General strategy
      ○ Reduce the amount of data to transfer (Pierre et al., Euro-Par’11; Al-Kiswany et al., HPDC’11; Koto et al., APSYS’12)
      ○ Avoid retransmissions (Zheng et al., VEE’11)

SLIDE 6

Progress in Migration Research

Shared demand for a resource can create interference

[Figure: a Host whose Hardware is shared by several VMs (each with VM State and VM Data) and by applications running directly on the host]

SLIDE 7

Understanding Interference

  • Fundamentally similar to any other interference
      ○ VMs contend for a resource... and the hypervisor can’t quite deliver
      ○ Leads to VM performance degradation
  • Recent work has targeted VM interference
      ○ Primarily at the application level
      ○ Chiang and Huang (SC’11), Mars et al. (MICRO-44), Nathuji (EuroSys’10)

SLIDE 8

Migration Interference

  • Migration causes undeniable interference
  • Some work has addressed network, memory, and CPU
      ○ Xu, Liu, et al. (Transactions on Computers, 2013)
  • Storage is often the performance bottleneck
  • How does storage migration impact storage performance?

SLIDE 9

Migration Interference

  • Tests on KVM-QEMU
  • Two VMs located on the same source host
      ○ Virtual disks both placed on RAID-6 (8 disks) over NFS
      ○ Migration traffic and the NFS mount use separate networks
  • 1st VM runs an IO benchmark
      ○ fio: random R/W across a 1 GB file @ 2 MB/s
  • 2nd VM is idle
  • 2nd VM is migrated to the destination host
      ○ Its virtual disk is stored on a local drive at the destination
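
The benchmark in the 1st VM could be approximated with a fio command line like the one assembled below. This is only a sketch: the slides do not give the job definition, so the job name, the target path, and the use of `--rate` to express "@ 2MB/s" are assumptions.

```python
def fio_command(target="/tmp/fio-testfile", size="1g", rate="2m"):
    """Build a fio command line for a rate-limited random read/write job."""
    return [
        "fio",
        "--name=interference-probe",  # hypothetical job name
        f"--filename={target}",       # hypothetical path inside the 1st VM
        "--rw=randrw",                # mixed random reads and writes
        f"--size={size}",             # 1 GB working file, as on the slide
        f"--rate={rate}",             # bandwidth cap; assumed meaning of "@ 2MB/s"
    ]


print(" ".join(fio_command()))
```

Block size, I/O engine, and runtime are left at fio's defaults here because the slides do not state them.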

SLIDE 10

Migration Interference

[Figure: test setup. Labels: Storage Host (Disk 1, Disk 2), Source Host (Hypervisor, VM 1, VM 2, NIC1, NIC2), Destination Host (Hypervisor, NIC1, NIC2, Disk 2)]

  • Migration bandwidth is configurable
  • ...which directly adjusts the migration’s disk utilization
  • Therefore, we can use the adjustable migration bandwidth to experiment with the storage utilization intensity of a full migration
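
To see why the bandwidth setting matters, here is a back-of-the-envelope sketch of how long a single pass over the virtual disk takes at a fixed migration bandwidth. The ~60 GB disk size comes from the earlier slide; the bandwidth values are illustrative assumptions, and re-sending dirtied data is ignored.

```python
def disk_transfer_seconds(disk_gib, bandwidth_mib_s):
    """Time to stream a virtual disk once at a fixed migration bandwidth."""
    if bandwidth_mib_s <= 0:
        raise ValueError("bandwidth must be positive")
    return disk_gib * 1024 / bandwidth_mib_s


# Illustrative bandwidth settings (assumed, not taken from the slides).
for bw in (8, 32, 128):
    minutes = disk_transfer_seconds(60, bw) / 60
    print(f"{bw:>4} MiB/s -> {minutes:.0f} min")
```

A lower setting reduces disk pressure but stretches the migration from minutes to hours, which is the tension the rest of the talk addresses.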

SLIDE 11

Migration Interference

Default Bandwidth Setting

SLIDE 12

Storage Migration Offloading (SMO)

  • SMO design goals
      ○ 1. Maintain negligible interference throughout migration
      ○ 2. Don’t reduce the migration’s chance for convergence
      ○ 3. Avoid sacrificing the migration’s performance

SLIDE 13

Migration Interference

Settings where interference level may be acceptable

SLIDE 14

SMO: Rate-Controlled Read

  • Rate-controlled migration
      ○ Monitor perceived utilization/interference on disk
      ○ Adjust the migration read rate to avoid over-utilizing the disk
      ○ Reduce interference
  • Problems still exist, just not as bad…
      ○ Improved latency vs. a simple low static migration rate…
      ○ Could periods of low utilization be leveraged better?
      ○ Convergence and stop-and-copy could still suffer
      ○ Migration can still fail
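
The monitor-and-adjust loop can be illustrated with a simple feedback rule. The sketch below uses additive increase and multiplicative decrease, which is an assumed stand-in rather than the controller used in SMO; the target utilization, step, and limits are hypothetical parameters.

```python
def next_read_rate(rate, utilization, target=0.7, step=4.0, backoff=0.5,
                   floor=1.0, ceiling=128.0):
    """Return the migration read rate (MB/s) for the next interval.

    utilization is the observed busy fraction of the shared disk (0.0 to 1.0).
    """
    if utilization > target:
        rate *= backoff   # disk is contended: back off multiplicatively
    else:
        rate += step      # disk has headroom: probe upward additively
    return max(floor, min(ceiling, rate))


rate = 32.0
for util in (0.40, 0.55, 0.90, 0.60):
    rate = next_read_rate(rate, util)
    print(f"utilization={util:.2f} -> read rate {rate:.1f} MB/s")
```

Even with such a rule, the read rate collapses whenever the co-located workload is busy, which is why convergence can still suffer.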

SLIDE 15

SMO: Caching

[Figure: the test setup with a Cache added. Labels: Storage Host, Source Host (Hypervisor, VM 1, VM 2, NIC1, NIC2), Disk 1, Disk 2, Destination Host (Hypervisor, NIC1, NIC2), Cache]

  • 1. Start migration
  • 2. Cache storage accesses for the VM being migrated
  • 3. The hot data in the migration cache can be left for the end
      ○ a. High-bandwidth reads on the cache will not cause cross-VM interference

Combine a storage cache with rate-controlled migration to eliminate the need for the hypervisor’s redundant accesses

SLIDE 16

SMO: Caching

[Figure: Cache and Virtual Disk. The migrating VM’s IO is serviced by the cache; misses and write-backs pass between the cache and the virtual disk.]

For migration of IO-heavy VMs, this...

  • Decreases shared storage utilization
      ○ Allowing an increase in the rate-controlled read
  • Provides a low-interference store from which to get dirtied data
      ○ Data that makes it hard to converge

Can it be improved?

  • Does not help if the migrating VM has low IO
  • Workloads are unlikely to access the entire disk
      ○ Some data will have to be read on behalf of the migration

SLIDE 17

SMO: Prefetch Data into the Cache

  • Caching alone is probably not enough
  • Employ prefetching to buffer data
      ○ Get non-migrated data into the buffer whenever possible
  • Prefetching should not cause extra interference
      ○ Use excess disk bandwidth identified by the rate controller
  • Prefetched data can serve IO requests

Decouple sending data over the network from reading data out of storage

SLIDE 18

SMO: Transfer Rates

[Figure: data flows among the Virtual Disk, the Buffer, and the destination, labeled (1) to (5)]

  • (1) Data to destination
  • (2) Network rate limit (configurable rate)
  • (3) Rate-controlled primary storage read
  • (4) Cached accesses
  • (5) Send left over to the buffer

Maintain migration rate with the buffer
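
One way to read the flows above is as a per-interval bandwidth split: primary-storage reads beyond the network limit are parked in the buffer, and the buffer fills the link when primary reads fall short. The function below is a sketch of that reading; its name, its parameters, and the assumption that all rates are in MB/s are illustrative and not taken from the slides.

```python
def split_rates(network_limit, primary_read, buffer_read_available):
    """Split one interval's bandwidth (MB/s) between destination and buffer.

    Returns (sent_from_primary, sent_from_buffer, stored_in_buffer).
    """
    sent_from_primary = min(primary_read, network_limit)
    # Primary reads beyond the network limit are parked in the buffer.
    stored_in_buffer = primary_read - sent_from_primary
    # If primary storage cannot fill the link, draw the rest from the buffer.
    shortfall = network_limit - sent_from_primary
    sent_from_buffer = min(shortfall, buffer_read_available)
    return sent_from_primary, sent_from_buffer, stored_in_buffer


print(split_rates(32, 48, 100))  # idle disk: surplus goes to the buffer
print(split_rates(32, 10, 100))  # busy disk: buffer keeps the link full
```

In the second call the destination still receives the full network rate even though the primary read was throttled, which is the sense in which the buffer maintains the migration rate.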

SLIDE 19

SMO: Analysis

D0=32MB/s

SLIDE 20

SMO: Analysis

SLIDE 21

SMO: Analysis

SLIDE 22

SMO: Analysis

SLIDE 23

SMO: Dynamic Caching Policy

  • Basic assumptions
      ○ Non-volatile, fully associative, with the metadata necessary to achieve consistency
      ○ Migrated data is usable, but considered free
  • Create interplay with rate-controlled prefetching
      ○ Since cache policy dictates the primary IO levels

SLIDE 24

Storage Migration Offloading: Buffer States

  • Two properties (4 combinations):
      ○ Space available: Under Capacity or At Capacity
      ○ Status of migration data: Partially Offloaded or Fully Offloaded
  • Partially Offloaded
      ○ Non-migrated data still on primary storage
  • Fully Offloaded
      ○ All remaining non-migrated data resides in the buffer

SLIDE 25

Partially Offloaded & Under Capacity

[Figure: Primary Storage, Virtual Disk, and the Buffer’s cache lines]

State: buffer with space remaining, with non-migrated data on primary storage

Goal: drive primary utilization down while populating the buffer

  • Writes are issued as write-back
  • Misses are brought into the buffer

SLIDE 26

Storage Migration Offloading: Buffer States

  • Partially Offloaded & Under Capacity
      ○ Drive primary utilization down while populating the buffer
  • Partially Offloaded & At Capacity
      ○ Allow primary utilization to rise to normal levels, decrease buffer utilization
  • Fully Offloaded & Under Capacity
      ○ Capture dirty data, decrease buffer utilization
  • Fully Offloaded & At Capacity
      ○ Allow primary utilization to rise to normal levels, decrease buffer utilization
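
The four states and their goals can be written down as a small lookup. This is a sketch only: the counters passed to `classify` (buffer usage, buffer capacity, and the amount of non-migrated data left on primary storage) are hypothetical inputs, since the slides leave the required tracking data to future work.

```python
from enum import Enum


class BufferState(Enum):
    PARTIAL_UNDER = "Partially Offloaded & Under Capacity"
    PARTIAL_AT = "Partially Offloaded & At Capacity"
    FULL_UNDER = "Fully Offloaded & Under Capacity"
    FULL_AT = "Fully Offloaded & At Capacity"


# Goals as stated on the slide, keyed by state.
GOALS = {
    BufferState.PARTIAL_UNDER: "drive primary utilization down while populating the buffer",
    BufferState.PARTIAL_AT: "allow primary utilization to rise, decrease buffer utilization",
    BufferState.FULL_UNDER: "capture dirty data, decrease buffer utilization",
    BufferState.FULL_AT: "allow primary utilization to rise, decrease buffer utilization",
}


def classify(buffer_used, buffer_capacity, unmigrated_on_primary):
    """Map the two properties (space, offload status) to one of four states."""
    at_capacity = buffer_used >= buffer_capacity
    fully_offloaded = unmigrated_on_primary == 0
    if fully_offloaded:
        return BufferState.FULL_AT if at_capacity else BufferState.FULL_UNDER
    return BufferState.PARTIAL_AT if at_capacity else BufferState.PARTIAL_UNDER


state = classify(buffer_used=12, buffer_capacity=16, unmigrated_on_primary=30)
print(state.value, "->", GOALS[state])
```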

SLIDE 27

Conclusion

  • Storage migration interference impacts IO
      ○ Can easily degrade basic IO by over 90%
  • Any full migration will require a full read of the vDisk
      ○ Even with deduplication, compression, cold-data first, etc.
  • Simple scheme: Storage Migration Offloading
      ○ Use secondary storage for buffering & caching
      ○ Offload data as quickly as possible
      ○ Leverage the workload’s IO through caching
      ○ Take advantage of extra disk bandwidth when possible

SLIDE 28

SMO: Future Work

  • Subtleties of caching
      ○ State transitions, required tracking data, etc.
  • Caching + prefetching analysis
      ○ Benefit of interplay needs exploration
  • Potential for a migration staging phase?
      ○ Cache and offload prior to migration

SLIDE 29

Thank You!

Questions?

Acknowledgements

  • Anonymous reviewers of PDSW 2014
  • U.S. National Science Foundation (NSF), grants CCF-1102624 and CNS-1218960

SLIDE 30

Backup

SLIDE 31

Buffer Device

  • Considerations
      ○ Non-volatile storage keeps the preliminary design simple, though RAM or leveraging the page cache is enticing
      ○ Size can be small (16 GB to 32 GB), assuming migration of a ~60 GB disk, so it stays cheap
      ○ SSD or high-performing HDD preferred: it should be able to maintain expected performance while prefetching and migrating

SLIDE 32

Cache Consistency

  • Considerations
      ○ All data is expected to move to another store
      ○ Avoid writing through to storage, since the data is not required to be there
  • Must correlate buffer data to a specific VM
      ○ Rebuilding in the event of failure
      ○ Requires several bits to uniquely identify the VM
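
As a rough illustration of the per-line metadata, the sketch below tags each buffer line with a VM identifier and computes how many bits such a tag needs. The `LineMeta` fields and both helper functions are hypothetical; the slides only say that several bits are required.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LineMeta:
    """Hypothetical metadata kept with each buffer line."""
    vm_id: int   # which VM the line belongs to
    block: int   # block address on that VM's virtual disk
    dirty: bool  # True if the buffer holds the only up-to-date copy


def vm_tag_bits(max_vms):
    """Bits needed to give each of max_vms VMs a distinct tag."""
    if max_vms < 1:
        raise ValueError("need at least one VM")
    return max(1, (max_vms - 1).bit_length())


def lines_for_vm(metadata, vm_id):
    """After a failure, recover the lines that belong to one VM."""
    return [m for m in metadata if m.vm_id == vm_id]


meta = [LineMeta(0, 10, True), LineMeta(1, 10, False), LineMeta(0, 11, False)]
print(vm_tag_bits(256), [m.block for m in lines_for_vm(meta, 0)])
```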

SLIDE 33

Storage Migration Offloading: Caching

  • Recognize redundant reads to storage
      ○ The running VM will likely read/write
      ○ The migration must read the entire vDisk
  • If the hypervisor reads X on behalf of the VM, the migration process should not have to read X again
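
The saving from avoiding redundant reads can be shown with a toy counter. This is a simplified sketch with assumed names: it tracks blocks rather than bytes and ignores cache capacity and re-dirtied data.

```python
class MigrationReadTracker:
    """Count primary-storage reads issued purely on behalf of the migration."""

    def __init__(self, total_blocks):
        self.total_blocks = total_blocks
        self.cached = set()

    def on_vm_access(self, block):
        # The hypervisor touched this block for the VM; keep a copy cached.
        self.cached.add(block)

    def migration_primary_reads(self):
        # Only blocks the VM never touched must be read again for migration.
        return sum(1 for b in range(self.total_blocks) if b not in self.cached)


tracker = MigrationReadTracker(total_blocks=10)
for block in (2, 3, 3, 7):
    tracker.on_vm_access(block)
print(tracker.migration_primary_reads())
```

With three distinct blocks already cached, the migration needs seven primary reads instead of ten.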

SLIDE 34

Partially Offloaded & At Capacity

[Figure: Primary Storage and Virtual Disk]

State: buffer full, but non-migrated data remains on primary storage

Goal: allow primary utilization to rise to normal levels, decrease buffer utilization

  • Write hits are write-through
  • Reads bypass the buffer
  • Write misses bypass the buffer

Need to make space in the buffer → need to migrate data from the buffer

SLIDE 35

Fully Offloaded & Under Capacity

[Figure: Primary Storage and Virtual Disk]

State: buffer has space remaining, and all non-migrated data is on the buffer

Goal: capture dirty data, decrease buffer utilization

  • Writes are write-through
  • Reads bypass the buffer

SLIDE 36

Fully Offloaded & At Capacity

[Figure: Primary Storage and Virtual Disk]

State: buffer full, and all non-migrated data is on the buffer

Goal: allow primary utilization to rise to normal levels, decrease buffer utilization

  • Write hits are write-through
  • Reads bypass the buffer
  • Write misses bypass the buffer

Need to make space in the buffer → need to migrate data from the buffer
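
Taken together, the per-state rules on the four buffer-state slides amount to a routing table for the migrating VM's IO. The function below is a literal reading of the slide text, not a confirmed implementation; the state names and the return convention (a tuple of devices touched) are assumptions made for illustration.

```python
STATES = ("partial_under", "partial_at", "full_under", "full_at")


def route(state, op, hit):
    """Return the devices one I/O touches, in order: 'buffer' and/or 'primary'."""
    if state not in STATES or op not in ("read", "write"):
        raise ValueError("unknown state or operation")
    if state == "partial_under":
        if op == "write":
            return ("buffer",)            # write-back: primary is updated later
        # A read hit is served by the buffer; a miss is fetched and kept.
        return ("buffer",) if hit else ("primary", "buffer")
    if op == "read":
        return ("primary",)               # reads bypass the buffer
    if state == "full_under":
        return ("buffer", "primary")      # every write is write-through
    # At capacity: only write hits touch the buffer; write misses bypass it.
    return ("buffer", "primary") if hit else ("primary",)


for state in STATES:
    print(state, route(state, "write", hit=True), route(state, "read", hit=False))
```

Only the first state adds new lines to the buffer; the other three stop populating it so that migrating data out can shrink buffer utilization.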