Chungbuk National University
Reducing Journaling Harm on Virtualized I/O systems
Eunji Lee, Hyokyung Bahn, Minseong Jeong, Sunghwan Kim, Jesung Yeon, Seunghoon Yoo, Sam H. Noh, Kang G. Shin
ACM/USENIX SYSTOR ’16, June 6‐8
Virtual machines are widely used in modern computer systems
They provide flexibility, scalability, and energy savings, but are accompanied by inefficiencies
Despite the “bare‐metal” alternative, 80% of cloud servers rely on virtualization
http://www.infoq.com/news/2012/10/Survey-Virtualization-Cloud
Challenge: reducing journaling harm without compromising on transparency
The guest’s journaling has harmful effects on I/O performance
We analyze the effectiveness of journaling and reduce the harm it causes
The proposed strategy is implemented in QEMU‐KVM and improves I/O performance
Both guest and host have their own file systems
Guest’s I/O goes through the host cache, which delivers better performance
[Figure: a read of block “A” is served through the guest’s buffer cache, backed by the host’s buffer cache holding A, B, C]
What about high‐speed storage such as SSDs? The additional memory copy is more painful
Performance comparison of using host cache (Writeback)
Benefit of the host cache: using it delivers 2.7x and 1.7x better performance
A bit on journaling
Data is written to its permanent file location only if logging to the journal succeeds
Case of ext4:
[Figure: dirty data in the page cache is committed (every 5 s) to the journal area and later checkpointed to its file system location]
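The commit/checkpoint flow above can be sketched as a toy model; this is an illustrative simplification of a write-ahead journal, not ext4's actual jbd2 code, and all names are ours:

```python
# Toy model of journaling: dirty data is first committed to a journal;
# only afterwards is it checkpointed to its permanent location.
class Journal:
    def __init__(self):
        self.page_cache = {}   # dirty blocks awaiting commit
        self.journal = []      # committed transactions (the log)
        self.filesystem = {}   # permanent file locations

    def write(self, block, data):
        self.page_cache[block] = data

    def commit(self):
        # Runs periodically (every 5 s by default in ext4):
        # only a logged transaction is durable.
        if self.page_cache:
            self.journal.append(dict(self.page_cache))
            self.page_cache.clear()

    def checkpoint(self):
        # Copy logged data to its permanent location, then free the log.
        for txn in self.journal:
            self.filesystem.update(txn)
        self.journal.clear()
```

Note that every committed block is written twice: once to the journal area and once at checkpoint, which is exactly the traffic examined next.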
No locality: journal blocks are not accessed again unless the system crashes, so they never hit in the host cache
[Figure: journal blocks (J) pass through the guest’s and host’s buffer caches without ever being re-referenced]
Synchronous writes: a FLUSH command comes right after each journal commit
No buffering effect due to the immediate synchronization
[Figure: ext4 ordered-mode timeline; associated data blocks (D) and journal blocks (J, commit block C) are each followed by a FLUSH; flushing journal data can be skipped]
Large footprint: journal writes are completely sequential over a large journal area, polluting the cache
[Figure: journal blocks (J) fill the host’s buffer cache and evict useful data]
Journal traffic accounts for 19% on average and up to 47% of the total write traffic
The footprint of journal accesses accounts for 45.2% of the total on average
86% of total sync operations are associated with journal writes
Our approach: filter out journal traffic from the host cache
Two challenges: deciding whether or not to cache a given block, and doing so while still providing transparency
[Figure: in the hypervisor, journal data (J) from the guest OS is written directly to storage, while file data (F) goes through the host cache]
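In code, the filtering decision might look like the following sketch; the function and data structures are hypothetical illustrations, not the actual QEMU-KVM patch:

```python
# Hypothetical sketch of pollution-defensive caching: writes that fall
# inside a known journal range bypass the host page cache.
def handle_guest_write(block, data, journal_ranges, host_cache, storage):
    is_journal = any(lo <= block < hi for lo, hi in journal_ranges)
    if is_journal:
        storage[block] = data       # write journal data directly to storage
    else:
        host_cache[block] = data    # buffer regular data as usual
    return is_journal
```

The hard part, addressed next, is populating `journal_ranges`: the host must learn which blocks are journal traffic without breaking the virtualization boundary.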
Journal traffic detection
Explicit knowledge: the host is informed of the location of the guest’s journal area
Implicit detection: the host predicts which accesses are journal traffic over a prediction period
The posix_fadvise system call enables a user process to advise the kernel about its expected access pattern
We implement the POSIX_FADV_NOREUSE flag, which Linux had left as a no-op
On receiving the hint, the host operating system commences pollution‐defensive caching
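From user space, the hint can be issued as follows (shown via Python's binding of the real POSIX call on a Unix system; the temporary file and sizes are illustrative):

```python
import os
import tempfile

# Write a page of data, then advise the kernel that this range will
# not be reused, so the host need not keep it cached.
fd, path = tempfile.mkstemp()
try:
    os.write(fd, b"\0" * 4096)
    os.posix_fadvise(fd, 0, 4096, os.POSIX_FADV_NOREUSE)
finally:
    os.close(fd)
    os.unlink(path)
```

A guest file system would issue the same advice on its journal file, letting the host skip caching those blocks.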
Experimental setup
[Figure: guest and host I/O stacks (file system read/write functions, buffer cache layer, block I/O layer, device driver). Pollution-defensive caching modifies the interface in the host’s I/O path and adds a selective caching routine that separates journal data from regular data]
PDC provides 8‐32% higher IOPS than the original caching scheme
PDC improves synchronous writes and compaction by 33% and 18%, respectively
No significant difference in the hit ratio between the two schemes
Journaling periodically writes small updates that are consecutive to previous accesses, so it has little effect on evicting likely‐to‐be‐accessed data from the host cache
We analyze the journaling effect in fully virtualized systems and uncover that journal traffic deteriorates cache effectiveness
We propose a new caching policy: pollution‐defensive caching
The policy is implemented in Linux 4.14 and QEMU‐KVM and improves I/O performance by 3‐32% in file and key‐value store benchmarks
32nd International Conference on Massive Storage Systems and Technology (MSST), May 2016
Eunji Lee, Chungbuk National University
Julie Kim, Ewha University
Hyokyung Bahn, Ewha University
Sam H. Noh, UNIST
Write amplification: an undesirable phenomenon associated with flash memory in which the number of writes to storage is higher than the number of writes issued by the host
It is a key aspect limiting the stable performance and endurance of SSDs
[Figure: SSD performance fluctuates over time. Source: Radian Memory Systems]
Garbage collection (GC) is performed to recycle used blocks: valid pages in a victim block are copied out into a free block before the victim is erased
[Figure: a host write of B’ F’ G’ H’ triggers GC that also copies valid pages A C D E from the victim, for 4 block writes in total]
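Numerically, the write amplification factor (WAF) is the ratio of pages actually written to flash to pages the host requested. A minimal illustration using the slide's example (the function name is ours):

```python
# WAF = (host writes + GC copy-out writes) / host writes, counted in pages.
def write_amplification(host_pages, gc_copied_pages):
    host_writes = len(host_pages)
    return (host_writes + len(gc_copied_pages)) / host_writes

# Slide example: the host writes B' F' G' H', and GC additionally
# copies valid pages A C D E out of the victim block.
waf = write_amplification(["B'", "F'", "G'", "H'"], ["A", "C", "D", "E"])
# waf == 2.0: twice as many pages hit flash as the host requested
```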
We analyze WAF with respect to workload characteristics, generating two synthetic workloads (random and sequential updates) using SSDsim
[Figure: WAF versus SSD capacity (4 GB to 64 GB); random updates incur up to 6x writes, while sequential updates show no write amplification]
Random updates disperse the valid pages across blocks
A real workload is a mixture of random and sequential writes, so we observe WAF while varying the level of randomness
[Figure: WAF from 1.0 to 1.4 across mixes from 100% random through r:s ratios of 9:1 down to 1:9 to 100% sequential]
Even a low level of randomness amplifies writes, and real workloads are likely to fall between the two extremes
Non‐volatile memory is becoming increasingly viable as a leading semiconductor technology
Fast, scalable, and persistent memory is being realized
Goal: reduce WAF by taking advantage of non‐volatility
Using NVM as a storage cache is a promising option
[Figure: a volatile cache versus a non‐volatile cache in front of flash storage]
In traditional systems, all valid pages in a victim block are copied out during GC
4 block writes!
[Figure: with a volatile cache, GC must copy valid pages A C D E even though A, C, and D are also present in the cache]
CDM skips the copying of valid pages in GC if the data is also held in the non‐volatile cache
Only one block write!
[Figure: with a non‐volatile cache holding A, C, B’, and D, GC skips copying A, C, and D and relocates only E]
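The skip rule can be stated compactly; a hedged sketch (our names, not the paper's code) using the slide's example:

```python
# During GC, a valid page need not be copied out if a clean copy
# already resides in the non-volatile cache.
def pages_to_copy(valid_pages, nv_cache):
    return [p for p in valid_pages if p not in nv_cache]

# Slide example: the victim block holds valid pages A C D E, and the
# NV cache holds A, C, B', and D, so only E must be relocated.
copied = pages_to_copy(["A", "C", "D", "E"], {"A", "C", "B'", "D"})
# copied == ["E"]
```

Skipped pages can later be rewritten from the NV cache, which is safe precisely because the cache survives power loss.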
Finite state diagram: a page whose data is cached in the non‐volatile cache enters a new state
Such a page can be erased when its data would otherwise need to be copied into another block; in all other respects it behaves the same as the valid state
Implementation of CDM raises several issues depending on where the non‐volatile cache is placed
[Figure: feasible architectures, with the NV cache either on the host side in front of flash storage, or inside the storage device beneath a DRAM-based host page cache]
Issue 1. Consistency: updating data in the cache touches the final copy, so a crash during the update results in inconsistent data
[Figure: a crash while updating C to C’ in the cache corrupts the only copy]
Issue 1. Consistency: the solution is associated with the specific file system, since data consistency is managed in the file system layer with its own mechanisms
File systems should be redesigned considering the way cached data is updated
Finite state diagram: introduce additional states for pages being updated
Update data with a copy‐on‐write scheme
Issue 2. Communication overhead: events in the cache and in storage should be notified to each other, along with additional information about the original data
Designing a new interface is no longer a big deal; however, frequent communication for this additional information is costly
A more deployable architecture: the NV cache inside the storage device
No consistency issue: the host page cache remains volatile, and data there is already allowed to become inconsistent if the system crashes during updates
No serious communication overhead: communicating within the device is much cheaper and easier than synchronizing storage with a host cache
Development can be achieved by modifying the device alone
[Figure: host’s page cache (DRAM) over flash storage containing an internal NV cache]
Trace‐driven simulations with SSDsim
We implement the in‐storage NV‐cache module and modify the simulator accordingly
We compare with an NVM‐basic model
CDM reduces WAF by 2.1‐17.6% and 4.3‐38.2% in the JEDEC workloads
Response time is improved by 9.7% and 20.3% on average
CDM reduces the standard deviation of response time by 31% and relieves the performance fluctuation