Analysis and Approximation of Optimal Co-Scheduling on Chip Multiprocessors
Yunlian Jiang, Xipeng Shen (The College of William & Mary, USA); Jie Chen (DoE Jefferson Lab, USA); Rahul Tripathi (University of South Florida, USA)
Cache Sharing on CMP
Shorten inter-thread communication
Flexible usage of cache
The College of William and Mary
[Figure: two CPUs attached to one shared cache]
Sharing can degrade performance, impair fairness, and hurt performance isolation
Degradation is affected by co-runner
[Figure: performance degradation range (min, median, max)]
Job Co-Scheduling
Assign jobs to chips so as to minimize contention
[Figure: jobs P1, P3, P4, P2 placed on two chips, each with its own shared cache (Shared cache 1, Shared cache 2)]
[Figure: the placement P1, P3, P4, P2 across Shared cache 1 and Shared cache 2, annotated with "Resource Waste" and "Resource Contention"]
[Figure: an alternative placement P1, P3, P2, P4 across Shared cache 1 and Shared cache 2]
The Goal of this Work
Related work
Snavely et al. [ASPLOS '00]
Goal of this work
Find the optimal schedule on a CMP system
Benefits
Evaluate the quality of current schedules
Apply in real systems
Contributions
Polynomial optimal solution on dual-core systems
NP-completeness proof on K-core (K>2) systems
Polynomial approximation algorithms on K-core (K>2) systems
Problem Formulation
M jobs, N-core processors
Assignment
$$\mathrm{Deg}_i = \frac{\mathrm{cCPI}_i - \mathrm{sCPI}_i}{\mathrm{sCPI}_i}$$
Goal: minimize $\sum_i \mathrm{Deg}_i$
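A small worked example of the objective may help. The sketch below assumes that sCPI_i is job i's cycles per instruction when it runs alone and cCPI_i is its CPI in a co-run; the slide does not spell these out. The function names and the CPI values are made up for illustration.

```python
def degradation(c_cpi, s_cpi):
    """Deg_i = (cCPI_i - sCPI_i) / sCPI_i for one job."""
    if s_cpi <= 0:
        raise ValueError("sCPI must be positive")
    return (c_cpi - s_cpi) / s_cpi


def total_degradation(c_cpis, s_cpis):
    """Objective from the slide: the sum of Deg_i over all jobs."""
    if len(c_cpis) != len(s_cpis):
        raise ValueError("need one cCPI and one sCPI per job")
    return sum(degradation(c, s) for c, s in zip(c_cpis, s_cpis))


# Made-up CPI values for two jobs (not measurements from the talk).
s_cpis = [1.0, 2.0]  # assumed: CPI when each job runs alone
c_cpis = [1.5, 2.5]  # assumed: CPI of each job in a co-run
print(total_degradation(c_cpis, s_cpis))  # 0.5 + 0.25 = 0.75
```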
Dual-Core System
Polynomial Solution
Minimum-weight perfect matching [Edmonds, 1965]
Matching
A matching M in a graph G is a set of edges, no two of which share a vertex.
A perfect matching is a matching that covers every vertex of the graph.
Dual-Core System
Minimum-weight perfect matching
In an edge-weighted graph, a perfect matching whose edge weights sum to the minimum
Dual-Core System
Job → vertex
Co-run degradation → edge weight
Optimal schedule → minimum-weight perfect matching
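The mapping above can be illustrated on a tiny instance. This is a sketch only: it finds the minimum-weight perfect matching by exhaustive search with memoization, which takes exponential time, and it is not Edmonds' polynomial algorithm cited in the talk. The weight matrix is made up; `weight[i][j]` is assumed to hold the summed degradation of jobs i and j when they share a dual-core chip.

```python
from functools import lru_cache


def min_weight_perfect_matching(weight):
    """Return (total_weight, pairs) for a symmetric weight matrix.

    Exhaustive search over bitmasks; suitable only for small inputs.
    """
    n = len(weight)
    if n % 2:
        raise ValueError("a perfect matching needs an even number of jobs")

    @lru_cache(maxsize=None)
    def best(mask):
        # mask is a bitset of jobs that are still unmatched.
        if mask == 0:
            return 0.0, ()
        i = (mask & -mask).bit_length() - 1  # lowest unmatched job
        rest = mask & ~(1 << i)
        result = None
        for j in range(i + 1, n):
            if (rest >> j) & 1:
                sub_total, sub_pairs = best(rest & ~(1 << j))
                total = weight[i][j] + sub_total
                if result is None or total < result[0]:
                    result = (total, ((i, j),) + sub_pairs)
        return result

    return best((1 << n) - 1)


# Made-up co-run degradations for four jobs (illustration only).
weight = [
    [0, 5, 1, 4],
    [5, 0, 3, 1],
    [1, 3, 0, 6],
    [4, 1, 6, 0],
]
total, pairs = min_weight_perfect_matching(weight)
print(total, pairs)  # 2.0 ((0, 2), (1, 3))
```

Each pair in the result is one dual-core chip: jobs 0 and 2 share a chip, as do jobs 1 and 3.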
NP-Completeness Proof
Membership in NP
Given a schedule, its total degradation can be computed in polynomial time
Reduction
Known NP-complete problem → job co-scheduling
Multidimensional Assignment Problem (MAP)
[Figure: a MAP instance; each assignment has a weight; goal: minimize total weight]
[Figure: job co-scheduling on CMP; weight of an assignment = sum of degradations in the assignment; goal: minimize total weight]
[Figure: reduction from MAP to job co-scheduling; weights in the two problems are set equal, with some weights set to ∞]
Approximation Algorithms
Hierarchical Perfect Matching
Greedy
Hierarchical Perfect Matching
Dual-core system optimal solution
N-core → N/2-core → dual-core
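One way to read the hierarchy above is as repeated pairwise matching: match jobs into pairs, then match pairs into groups of four, and so on until each group fills a chip. The sketch below follows that reading and is an assumption, not the authors' exact procedure. `group_cost` is a hypothetical callback returning the total degradation of a set of jobs sharing a chip, `toy_cost` is a made-up cost model, and the matching helper is an exhaustive search rather than Edmonds' algorithm.

```python
from functools import lru_cache
from itertools import combinations


def min_weight_perfect_matching(weight):
    """Exhaustive minimum-weight perfect matching (small inputs only)."""
    n = len(weight)
    if n % 2:
        raise ValueError("need an even number of nodes")

    @lru_cache(maxsize=None)
    def best(mask):
        if mask == 0:
            return 0.0, ()
        i = (mask & -mask).bit_length() - 1  # lowest unmatched node
        options = []
        for j in range(i + 1, n):
            if (mask >> j) & 1:
                total, pairs = best(mask & ~(1 << i) & ~(1 << j))
                options.append((weight[i][j] + total, ((i, j),) + pairs))
        return min(options)

    return best((1 << n) - 1)


def hierarchical_matching(num_jobs, cores_per_chip, group_cost):
    """Merge groups pairwise until each group has cores_per_chip jobs."""
    if cores_per_chip < 1 or cores_per_chip & (cores_per_chip - 1):
        raise ValueError("cores_per_chip must be a power of two")
    if num_jobs % cores_per_chip:
        raise ValueError("num_jobs must be a multiple of cores_per_chip")
    groups = [(job,) for job in range(num_jobs)]
    size = 1
    while size < cores_per_chip:
        n = len(groups)
        # Edge weight between two groups: cost of running them together.
        weight = [
            [0.0 if a == b else group_cost(groups[a] + groups[b])
             for b in range(n)]
            for a in range(n)
        ]
        _, pairs = min_weight_perfect_matching(weight)
        groups = [groups[a] + groups[b] for a, b in pairs]
        size *= 2
    return groups


def toy_cost(group):
    """Made-up cost model: jobs with distant indices interfere more."""
    return sum(abs(a - b) for a, b in combinations(group, 2))


print(hierarchical_matching(8, 4, toy_cost))
# [(0, 1, 2, 3), (4, 5, 6, 7)]
```

Each level only commits to the best pairing at that level, so the final grouping is an approximation and need not be the optimal K-core schedule.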
Greedy Algorithm
Basic idea
Schedule the least “polite” job first
“Politeness” of a job
Sum of degradations of all the assignments that contain this job
Impact of a job on others
Greedy Algorithm
I. Sort unassigned jobs by politeness
II. Pick the least polite job J and schedule it in the assignment with the minimum degradation
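The two steps can be sketched as follows. Several points here are assumptions, since the slide text is incomplete: a larger degradation sum is taken to mean a less polite job, the sums are computed once up front over all size-k groups, and the chosen job is grouped with the unassigned co-runners that give the smallest group degradation. `group_cost` is a hypothetical callback and the weights are made up.

```python
from itertools import combinations


def greedy_schedule(num_jobs, k, group_cost):
    """Greedy co-scheduling sketch for k jobs per chip."""
    if k < 1 or num_jobs % k:
        raise ValueError("num_jobs must be a positive multiple of k")
    # impoliteness[j]: sum of degradations of all size-k groups with j.
    impoliteness = [0.0] * num_jobs
    for group in combinations(range(num_jobs), k):
        cost = group_cost(group)
        for job in group:
            impoliteness[job] += cost

    unassigned = set(range(num_jobs))
    schedule = []
    while unassigned:
        # Least polite job first; ties go to the smaller job index.
        job = max(unassigned, key=lambda x: (impoliteness[x], -x))
        others = sorted(unassigned - {job})
        # Co-runners that minimize the degradation of this job's group.
        rest = min(combinations(others, k - 1),
                   key=lambda r: group_cost((job,) + r))
        group = tuple(sorted((job,) + rest))
        schedule.append(group)
        unassigned -= set(group)
    return schedule


# Made-up pairwise co-run degradations (illustration only).
W = [
    [0, 5, 1, 4],
    [5, 0, 3, 1],
    [1, 3, 0, 6],
    [4, 1, 6, 0],
]


def pair_sum_cost(group):
    """Toy cost model: sum of pairwise weights inside the group."""
    return sum(W[a][b] for a, b in combinations(group, 2))


print(greedy_schedule(4, 2, pair_sum_cost))  # [(1, 3), (0, 2)]
```

In this instance job 3 has the largest sum (11), so it is placed first, next to job 1, its cheapest co-runner.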
Local Optimization
Main Scheme
for i ← 1 to K-1
    for j ← i+1 to K
        Local-Optimization(i, j)
K: number of assignments
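The main scheme translates directly into a double loop. What Local-Optimization(i, j) does is not stated on the slide; the sketch below assumes it pools the jobs of assignments i and j and keeps the split with the lowest combined degradation. `group_cost`, the weights, and the function names are hypothetical, and indices are 0-based here rather than 1-based.

```python
from itertools import combinations


def local_optimization(schedule, i, j, group_cost):
    """Assumed behavior: re-split the jobs of assignments i and j."""
    pool = schedule[i] + schedule[j]
    size = len(schedule[i])
    best_pair = (schedule[i], schedule[j])
    best_cost = group_cost(schedule[i]) + group_cost(schedule[j])
    for left in combinations(pool, size):
        right = tuple(job for job in pool if job not in left)
        cost = group_cost(left) + group_cost(right)
        if cost < best_cost:
            best_pair, best_cost = (left, right), cost
    schedule[i], schedule[j] = best_pair


def optimize(schedule, group_cost):
    """Main scheme: apply local optimization to every pair (i, j)."""
    schedule = [tuple(group) for group in schedule]
    k = len(schedule)  # K: number of assignments
    for i in range(k - 1):
        for j in range(i + 1, k):
            local_optimization(schedule, i, j, group_cost)
    return schedule


# Made-up pairwise co-run degradations (illustration only).
W = [
    [0, 5, 1, 4],
    [5, 0, 3, 1],
    [1, 3, 0, 6],
    [4, 1, 6, 0],
]


def pair_sum_cost(group):
    """Toy cost model: sum of pairwise weights inside the group."""
    return sum(W[a][b] for a, b in combinations(group, 2))


print(optimize([(0, 1), (2, 3)], pair_sum_cost))  # [(0, 2), (1, 3)]
```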
Performance Evaluation
Machine
AMD Opteron 4-core processors
Benchmarks
15 SPEC CPU2000, 1 Stream
Metrics
Performance degradation
Scheduling time
Fairness
Performance Degradation
[Figure: performance degradation per benchmark for OPT, Greedy, Hierarchical, and Random schedules]
[Figure: performance degradation per benchmark (ammp, art, applu, bzip, crafty, equake, facerec, gap, mcf, parser, stream, swim, twolf, vpr, and the average) for OPT, Greedy-Opt, Hierarchical-Opt, and Random schedules]
Scheduling Time
[Figure: running time (s) versus number of jobs (16 to 144) for Greedy, Greedy-opt, Hierarchical, and Hierarchical-opt]
Fairness
[Figure: unfairness factor for OPT, Greedy, Hierarchical, and Random schedules]
Unfairness Factor
Coefficient of Variation of normalized degradation
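The coefficient of variation is the standard deviation divided by the mean. The sketch below takes the normalized degradations as given, because the slide does not say how they are normalized, and it uses the population standard deviation, which is also an assumption. The input values are made up.

```python
from statistics import mean, pstdev


def unfairness_factor(normalized_degradations):
    """Coefficient of variation: population std dev over the mean."""
    values = list(normalized_degradations)
    if not values:
        raise ValueError("need at least one value")
    average = mean(values)
    if average == 0:
        raise ValueError("coefficient of variation is undefined for mean 0")
    return pstdev(values) / average


# Made-up normalized degradations (illustration only).
print(unfairness_factor([1.0, 3.0]))       # mean 2, std dev 1 -> 0.5
print(unfairness_factor([0.2, 0.2, 0.2]))  # identical values -> 0.0
```

A value of 0 means every job is degraded equally; larger values mean the degradation is spread less evenly.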
Conclusion
Job co-scheduling on CMP is crucial
Performance varies across schedules
Dual-core system
Polynomially solvable
K-core (K>2) system
NP-complete problem
Heuristics: Hierarchical, Greedy
Acknowledgement
Weizhen Mao, William and Mary
Cliff Stein, Columbia University
William Cook, Georgia Tech
National Science Foundation
IBM CAS Fellowship