SLIDE 1

Duy Le (Dan) - The College of William and Mary
Hai Huang - IBM T. J. Watson Research Center
Haining Wang - The College of William and Mary

SLIDE 2

Virtualization

[Diagram: virtualization stack. Virtual machines running varied workloads (games, videos, programming, web, file, database, and mail servers), each with its own file system, layered over shared storage and disks]

Related work:
  • [Tang-ATC'11] Storage space allocation and dirty-block tracking optimizations
  • [Boutcher-HotStorage'09] Different I/O scheduler combinations
  • [Jujjuri-LinuxSym'10] VirtFS: file system pass-through

Performance implications of file systems?

SLIDE 3

• "Selected file systems are based on workloads"
  • Only true in physical systems
• File systems for a guest virtual machine depend on
  • Workloads
  • File systems deployed at the host level
• Investigation needed!
• Guest file systems: Ext2, Ext3, Ext4, ReiserFS, XFS, and JFS
• Host file systems: Ext2, Ext3, Ext4, ReiserFS, XFS, and JFS

SLIDE 4

• Which combination gives the best performance?
• Best and worst guest/host file system combinations?
• Guest and host file system dependency
  • Varied I/Os and their interactions
  • File-based disk images vs. physical disks

SLIDE 5

• Experimentations
• Throughput analysis
  • Macro level
  • Micro level
• Findings and Advice

SLIDE 6

• Experimentations
• Throughput analysis
  • Macro level
  • Micro level
• Findings and Advice

SLIDE 7

[Diagram: testbed architecture. Guest virtual disks (vdc1-vdc6), each formatted with one guest file system (Ext2, Ext3, Ext4, ReiserFS, XFS, JFS), are exposed over VirtIO and backed by host partitions (sdb1-sdb7) formatted with the host file systems, plus one raw partition used directly as a block device (BD).]

Testbed:
• Guest: Qemu 0.9.1, 512 MB RAM, Linux 2.6.32
• Host: Pentium D 3.4 GHz, 2 GB RAM, Linux 2.6.32 + Qemu-KVM 0.12.3
• Disk: 1 TB, SATA 6 Gb/s, 64 MB cache
• Sizes: host partitions 60 x 10^6 blocks; sparse disk images 9 x 10^6 blocks; raw partition as block device (BD) 60 x 10^6 blocks

SLIDE 8

• Filebench workloads
  • File server, web server, database server, and mail server
• I/O performance: throughput and latency
  • Different abstractions considered
    • Via block device (BD)
    • Via nested file systems
  • Relative performance variation, with BD as the baseline (a sketch of this calculation follows)
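The relative-performance numbers in the following slides are throughput ratios against the BD baseline. A minimal sketch of that calculation in Python, using placeholder throughput values rather than measured results:

```python
# Sketch: express nested-file-system throughput relative to the block-device
# (BD) baseline, as a percentage. All numbers below are illustrative
# placeholders, not measurements from the study.

bd_baseline_mbps = 25.0  # hypothetical BD throughput for one workload

nested_mbps = {  # hypothetical (guest FS, host FS) combinations
    ("Ext3", "Ext4"): 22.1,
    ("Ext3", "ReiserFS"): 18.7,
    ("JFS", "XFS"): 24.3,
}

for (guest, host), mbps in nested_mbps.items():
    relative = 100.0 * mbps / bd_baseline_mbps
    print(f"{guest} on {host}: {mbps:.1f} MB/s ({relative:.0f}% of BD)")
```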

SLIDE 9

[Chart: relative performance of each guest file system to the BD baseline, grouped by host file system]

SLIDE 10

[Charts: results for the ReiserFS guest file system across host file systems]

SLIDE 11

[Charts: throughput (MB/s) vs. BD and relative performance (%) for the ReiserFS and Ext4 guest file systems across host file systems (Ext2, Ext3, Ext4, ReiserFS, XFS, JFS)]

SLIDE 12

• Guest file system → host file systems
  • Varied performance
• Host file system → guest file systems
  • Impacted differently
• Right and wrong combinations
  • Bidirectional dependency
• I/Os behave differently
  • Writes are more critical than reads (mail server)

SLIDE 13

[Charts: throughput (MB/s) vs. BD and relative performance (%) for the Ext2 and Ext3 guest file systems across host file systems]

SLIDE 14

• Guest file system → host file systems
  • Varied performance
• Host file system → guest file systems
  • Impacted differently
• Right and wrong combinations
  • Bidirectional dependency (mail server)
• I/Os behave differently
  • Writes are more critical than reads (mail server)

SLIDE 15

[Charts: throughput (MB/s) vs. BD and relative performance (%) for the Ext2 guest file system across host file systems]

SLIDE 16

• Guest file system → host file systems
  • Varied performance
• Host file system → guest file systems
  • Impacted differently
• Right and wrong combinations
  • Bidirectional dependency
• I/Os behave differently
  • Writes are more critical than reads

SLIDE 17

[Charts: throughput (MB/s) vs. BD and relative performance (%) across host file systems; annotation: more writes, relative performance lower than 100%]

SLIDE 18

• Guest file system → host file systems
  • Varied performance
• Host file system → guest file systems
  • Impacted differently
• Right and wrong combinations
  • Bidirectional dependency
• I/Os behave differently
  • WRITES are more critical than READS

SLIDE 19

• Guest file system → host file systems
  • Varied performance
• Host file system → guest file systems
  • Impacted differently
• Right and wrong combinations
  • Bidirectional dependency
• I/Os behave differently
  • WRITES are more critical than READS
• Latency is sensitive to nested file systems

SLIDE 20

• Experimentations
• Throughput analysis
  • Macro level
  • Micro level
• Findings and Advice

SLIDE 21

• Same testbed
• Primitive I/Os
  • Reads or writes
  • Random or sequential
• FIO benchmark (an illustrative invocation follows the parameter table)


Parameters:
  Total I/O size: 5 GB
  I/O parallelism: 255
  Block size: 8 KB
  I/O pattern: Random / Sequential
  I/O mode: Native async I/O
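For reference, a hedged sketch of an fio invocation matching the parameters above; the job name and target file are hypothetical, and --direct=1 is an assumption not listed in the table:

```python
import subprocess

# Sketch of an fio run matching the micro-benchmark parameters above.
# The job name and target file are hypothetical; --direct=1 is an
# assumption (it does not appear in the parameter table).
def run_fio(random: bool, write: bool, target: str = "/mnt/guestfs/fio.dat") -> None:
    rw = ("rand" if random else "") + ("write" if write else "read")
    cmd = [
        "fio",
        "--name=nested-fs-test",
        f"--filename={target}",
        "--size=5g",          # total I/O size: 5 GB
        "--bs=8k",            # block size: 8 KB
        "--iodepth=255",      # I/O parallelism: 255
        f"--rw={rw}",         # read/write, random/sequential
        "--ioengine=libaio",  # native async I/O on Linux
        "--direct=1",
    ]
    subprocess.run(cmd, check=True)

# Example: sequential reads, then random writes.
# run_fio(random=False, write=False)
# run_fio(random=True, write=True)
```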

SLIDE 22

[Charts: random and sequential throughput (MB/s) vs. BD and relative performance (%) for the Ext3 guest file system across host file systems]

SLIDE 23


• Read-dominated workloads
  • Performance unaffected by nested file systems
• Write-dominated workloads
  • Performance heavily affected by nested file systems
SLIDE 24

[Charts: random and sequential throughput (MB/s) vs. BD and relative performance (%) across host file systems]

SLIDE 25


• Read-dominated workloads
  • Performance unaffected by nested file systems
• Write-dominated workloads
  • Performance heavily affected by nested file systems
SLIDE 26


• Read-dominated workloads
  • Performance unaffected by nested file systems
• Write-dominated workloads
  • Performance heavily affected by nested file systems
• Sequential reads: Ext3/JFS vs. Ext3/BD
• Sequential writes:
  • Ext3/ReiserFS vs. JFS/ReiserFS (same host file system)
  • JFS/ReiserFS vs. JFS/XFS (same guest file system)
• I/O analysis using blktrace (an illustrative trace-capture sketch follows)
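A minimal sketch of the kind of host-side trace capture this analysis relies on; blktrace and blkparse are the standard Linux block-layer tracing tools, while the device path, output names, and duration here are hypothetical:

```python
import subprocess

# Sketch: capture host-side block I/O events with blktrace while a guest
# workload runs, then turn them into a readable event log with blkparse.
# Device path, output names, and duration are hypothetical.
DEVICE = "/dev/sdb5"  # host partition backing the guest disk image

# Record events for 60 seconds (-w sets the run time in seconds).
subprocess.run(["blktrace", "-d", DEVICE, "-o", "trace", "-w", "60"], check=True)

# Decode the per-CPU binary traces: queue, merge, dispatch, and completion
# events with sector offsets, which is what the following slides examine.
with open("trace.txt", "w") as log:
    subprocess.run(["blkparse", "-i", "trace"], stdout=log, check=True)
```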

[Charts: sequential-read throughput (MB/s) vs. BD and relative performance (%) for the Ext3 and JFS guest file systems across host file systems; highlighted case: sequential reads, Ext3/JFS vs. Ext3/BD]

SLIDE 27

• Findings:
  • Readahead at the hypervisor when nesting file systems
  • Long idle times for queuing

SLIDE 28

• Different guests (Ext3, JFS), same host (ReiserFS)
  • I/O scheduler and block allocation scheme

SLIDE 29

[Charts: block-level traces for the Ext3 and JFS guests on a ReiserFS host. Annotations: low I/O volume for journaling, well-merged requests, long waiting in the queue; Ext3 causes multiple back merges, JFS coalesces multiple log entries]

SLIDE 30

• Different guests (Ext3, JFS), same host (ReiserFS)
  • I/O scheduler and block allocation scheme
  • Findings:
    • I/O schedulers are NOT effective for ALL nested file systems
    • The I/O scheduler's effectiveness depends on the block allocation scheme

SLIDE 31

• Different guests (Ext3, JFS), same host (ReiserFS)
• Same guest (JFS), different hosts (ReiserFS, XFS)
  • Block allocation schemes

SLIDE 32

[Charts: CDF of disk I/Os vs. normalized seek distance for ReiserFS and XFS hosts. Annotations: fairly similar distributions, longer waiting in the queue, long-distance seeks, repeated logging]
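A small sketch of how a seek-distance CDF like the one above can be built from a sequence of request start sectors (for example, pulled from a blkparse log); the sample offsets are placeholders, not traced data:

```python
# Sketch: CDF of normalized seek distances from consecutive request start
# sectors (e.g. extracted from a blkparse log). The offsets below are
# placeholders, not traced data.
sectors = [1024, 1032, 980000, 980016, 2048, 2056, 990000]  # hypothetical

span = max(sectors) - min(sectors)  # normalization factor
seeks = sorted(abs(b - a) / span for a, b in zip(sectors, sectors[1:]))

# CDF: fraction of seeks with normalized distance <= d
for i, d in enumerate(seeks, start=1):
    print(f"{d:.4f}\t{i / len(seeks):.2f}")
```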

SLIDE 33

• Different guests (Ext3, JFS), same host (ReiserFS)
• Same guest (JFS), different hosts (ReiserFS, XFS)
  • Block allocation schemes
  • Findings:
    • Effectiveness of the guest file system's block allocation is NOT guaranteed
    • Journal logging on disk images lowers performance

SLIDE 34

• Experimentations
• Throughput analysis
  • Macro level
  • Micro level
• Findings and Advice

SLIDE 35

• Advice 1 – Read-dominated workloads
  • Minimal impact on I/O throughput
  • Sequential reads: nesting can even improve performance
• Advice 2 – Write-dominated workloads
  • Nested file systems should be avoided
    • One more pass-through layer
    • Extra metadata operations
  • Journaling degrades performance

SLIDE 36

• Advice 3 – I/O-sensitive workloads
  • I/O latency is increased by 10-30%
• Advice 4 – Data allocation scheme
  • Data and metadata I/Os of nested file systems are not differentiated at the host
  • A pass-through host file system is even better!
• Advice 5 – Tuning file system parameters
  • "Non-smart" disk
  • noatime and nodiratime (an example remount follows)
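One way to act on Advice 5, sketched below; the mount point is hypothetical and the command needs root privileges:

```python
import subprocess

# Sketch for Advice 5: remount a file system with access-time updates
# disabled, so reads stop generating extra metadata writes that travel
# through the nested stack. The mount point is hypothetical.
subprocess.run(
    ["mount", "-o", "remount,noatime,nodiratime", "/mnt/guest-data"],
    check=True,
)
```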

SLIDE 37

SLIDE 38

Device   Blocks (x10^6)   Speed (MB/s)   Type
sdb2     60.00            127.64         Ext2
sdb3     60.00            127.71         Ext3
sdb4     60.00            126.16         Ext4
sdb5     60.00            125.86         ReiserFS
sdb6     60.00            123.47         XFS
sdb7     60.00            122.23         JFS
sdb8     60.00            121.35         Block Device
