vPFS+: Managing I/O Performance for Diverse HPC Applications
Ming Zhao, Arizona State University
Yiqi Xu, VMware
http://visa.lab.asu.edu
Virtualized Infrastructures, Systems, & Applications
Background: HPC I/O Management
Increasingly diverse HPC applications on shared storage
- Different I/O rates, sizes, and data/metadata intensities

Lack of I/O QoS differentiation
- Parallel file systems treat all I/Os equally
[Figure: diverse applications (APP1..APPn) on compute nodes share storage nodes through a generic parallel file system; the mismatch is between their diverse I/O demands and the file system's undifferentiated service]
Background: vPFS
Proxy-based interposition of application data requests
- Transparent to applications, supports different setups

Proportional I/O bandwidth scheduling using SFQ(D)
- Work conserving, strong fairness
[Figure: vPFS architecture: an SFQ(D) proxy interposed between HPC applications 1..n and the parallel file system]
Achieved vs. target bandwidth ratios (App1 throughput in MB/s):

Target ratio   Write vs. Read   Write vs. Random R/W
2:1            1.99:1           2.02:1
4:1            3.97:1           3.95:1
8:1            8.10:1           8.01:1
16:1           16.21:1          16.01:1
32:1           32.73:1          31.34:1
Limitations?
Limitations

Lack of isolation between large and small workloads

SFQ(D): start-time fair queueing with I/O depth D
- Start times capture each flow's service usage
- Dispatch requests in the increasing order of their start times
- D captures the available I/O parallelism
- Allows up to D outstanding requests (a minimal sketch follows)

[Figure: SFQ(D) with D=10, in theory vs. in practice: because D counts requests rather than their cost, the 10 outstanding requests can consume the service of 14 smallest-size I/Os, causing interference between flows f1..fn]
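A minimal sketch of SFQ(D) as described above; the class, the issue callback, and the method names are illustrative, not the vPFS code. Note how every request charges exactly one depth unit regardless of its size, which is the limitation this slide points out:

    import heapq

    class SFQD:
        """Start-time fair queueing with I/O depth D (illustrative sketch)."""

        def __init__(self, depth, issue):
            self.depth = depth        # max number of outstanding requests
            self.issue = issue        # callback that sends a request to the server
            self.outstanding = 0
            self.virtual_time = 0.0   # start tag of the last dispatched request
            self.last_finish = {}     # finish tag of each flow's previous request
            self.queue = []           # min-heap ordered by (start_tag, seq)
            self.seq = 0

        def submit(self, flow, request, size, weight):
            # Start tag = max(virtual time, the flow's previous finish tag);
            # the finish tag advances by cost/weight, so a flow's tags track
            # the service it has used.
            start = max(self.virtual_time, self.last_finish.get(flow, 0.0))
            self.last_finish[flow] = start + size / weight
            heapq.heappush(self.queue, (start, self.seq, request))
            self.seq += 1
            self._dispatch()

        def complete(self):
            # Called when the server finishes a request; frees one depth unit.
            self.outstanding -= 1
            self._dispatch()

        def _dispatch(self):
            # Dispatch in increasing start-tag order, up to D outstanding.
            while self.queue and self.outstanding < self.depth:
                start, _, request = heapq.heappop(self.queue)
                self.virtual_time = start
                self.outstanding += 1   # one unit per request, regardless of size
                self.issue(request)

With two backlogged flows at weights 2:1, their start tags advance so that service accrues in a 2:1 ratio, which is the proportional sharing vPFS targets in the ratio table above.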
Limitations

Lack of metadata I/O scheduling

Many HPC applications are metadata intensive
- Metadata I/O performance is important
Solution: vPFS+

SFQ(D)+
- A new scheduler to support diverse I/O sizes

Metadata I/O management
- An extension to support distributed scheduling of metadata requests

PVFS2-based real prototype

Comprehensive experimental evaluation
SFQ(D)+: Variable-Cost I/O Depth Allocation
Allocate the limited I/O depth D to outstanding requests based on their sizes
- Consider D as the number of available I/O slots
- Each slot represents the cost of the smallest I/Os
- Each outstanding request occupies one or multiple slots based on its size
- Stop dispatching when D is used up (see the sketch below)

Effectively protects small I/O workloads
- Low-rate I/Os wait less for large outstanding I/Os to complete
- Small I/Os are less affected by large I/Os after being dispatched
[Figure: SFQ(D)+ with D=10: slot-based accounting keeps the depth actually used at 10 even with mixed request sizes]
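A sketch of the slot accounting, assuming a request is charged ceil(size / smallest-I/O size) slots; the cost function and the MIN_IO_SIZE value are illustrative assumptions, not the paper's exact cost model:

    import math

    MIN_IO_SIZE = 4096   # assumed smallest I/O size; one slot = one such I/O

    def slot_cost(size):
        # A request occupies one slot per smallest-I/O's worth of data.
        return max(1, math.ceil(size / MIN_IO_SIZE))

    def dispatch(queue, depth, slots_used, issue):
        """Dispatch from a list of (start_tag, request, size) already
        sorted by start tag; slots_used counts the slots held by
        outstanding requests. Stops as soon as the D slots are used up."""
        while queue:
            start_tag, request, size = queue[0]
            cost = slot_cost(size)
            if slots_used + cost > depth:
                break                 # not enough free slots: stop dispatching
            queue.pop(0)
            slots_used += cost
            issue(request)
        return slots_used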
SFQ(D)+: I/O Backfilling
Large I/Os at the head of the queue have to wait till there are enough slots
- This wastes the currently available slots

Backfilling promotes small I/Os to utilize the available slots
- Similar to the backfilling of small jobs in batch scheduling (see the sketch below)

[Figure: with D=10 and a large request blocked at the head, one slot is wasted (only 9 used); backfilling a smaller I/O that arrived later (t0 < t1) fills it (10 used)]
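Continuing the previous sketch (reusing its slot_cost), a hypothetical backfilling variant: when the head request does not fit in the remaining slots, later, smaller requests that do fit are promoted instead of leaving the slots idle:

    def dispatch_with_backfill(queue, depth, slots_used, issue):
        """Like dispatch() above, but scans past a blocked head request
        to promote any smaller request that fits the remaining slots."""
        i = 0
        while i < len(queue) and slots_used < depth:
            start_tag, request, size = queue[i]
            cost = slot_cost(size)
            if slots_used + cost <= depth:
                del queue[i]          # dispatch this request; do not advance i
                slots_used += cost
                issue(request)
            else:
                i += 1                # head is blocked: try to backfill a later request
        return slots_used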
Metadata I/O Scheduling
Extends the scheduling to both data and metadata requests
- Apply SFQ(D)+ to schedule metadata I/Os on each server
- Treat metadata I/Os as small I/Os

Achieves total-metadata-service fair sharing across distributed metadata servers
- Coordinate scheduling across the distributed metadata servers
- Each scheduler adjusts its scheduling of local metadata requests based on the global metadata service distribution (see the sketch below)
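One way such an adjustment could work; the proportional weight-scaling rule below is an assumption for illustration, not the mechanism specified by vPFS+:

    def adjusted_weight(app, base_weights, global_service):
        """Scale an app's local weight by how far its share of the total
        metadata service (summed over all metadata servers) deviates
        from its target share, so fairness holds for the total service
        rather than per server."""
        total = sum(global_service.values())
        if total == 0:
            return base_weights[app]
        target = base_weights[app] / sum(base_weights.values())
        actual = global_service[app] / total
        # Apps below their target share get a local boost; apps above it
        # are throttled. The SFQ(D)+ finish tags then advance accordingly.
        return base_weights[app] * (target / max(actual, 1e-9))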
Evaluation
Testbed
- vPFS+ implemented for PVFS2
- 8 clients & 8 servers, 1 gigabit switch

Workloads
- IOR: intensive checkpointing I/Os
- multi-md-test: intensive metadata I/Os
- BTIO: scientific application benchmark
- WRF: real-world scientific application
BTIO vs. IOR
BTIO: Class C (4MB-16MB I/Os), Class A (320B I/Os)

vPFS+ substantially reduces BTIO slowdown

[Chart: BTIO vs. IOR throughput; annotations: "Slows down IOR by 99%" and "Slows down IOR by 56%"]
WRF vs. IOR
WRF: a large number of small I/Os and intensive metadata requests

vPFS+ achieves 80% and 281% better performance for WRF than Native and vPFS, respectively
Metadata I/O Scheduling
multi-md-test: mktestdir, create, write, readdir, read, close, rm, rmtestdir

vPFS+ achieves nearly perfect fairness despite dynamic metadata demands from two metadata-intensive apps
Conclusions
I/O diversity is becoming a top concern
- Different types of requests (POSIX vs. MPI-IO, data vs. metadata)
- Different I/O rates and sizes

vPFS+ manages I/O performance for diverse apps
- SFQ(D)+ recognizes the variable cost of different I/Os and keeps it under control
- Distributed metadata scheduling supports metadata-intensive applications
Future Work
Implement SFQ(D)+ directly in the data/metadata servers
- Proxy-based scheduling may incur extra latency
- But its impact on throughput is small (< 1%)

Evaluate vPFS+ in larger and more diverse environments
- Performance isolation is even more important on larger systems with more diverse workloads
- Faster storage does not eliminate the need for performance isolation: the gap between processor and I/O performance is still increasing
Acknowledgement

National Science Foundation
- CNS-1629888, CNS-1619653, CNS-1562837, CMMI-1610282, IIS-1633381