Analysis and Approximation of Optimal Co-Scheduling on Chip - PowerPoint PPT Presentation


  1. Analysis and Approximation of Optimal Co-Scheduling on Chip Multiprocessors
     Yunlian Jiang and Xipeng Shen, The College of William & Mary, USA
     Jie Chen, DoE Jefferson Lab, USA
     Rahul Tripathi, University of South Florida, USA

  2. Cache Sharing on CMP
     - Shortens inter-thread communication
     - Allows flexible usage of cache
     - Can degrade performance
     - Can impair fairness
     - Can hurt performance isolation
     - (Figure: two CPUs attached to one shared cache)
     The College of William and Mary

  3. Degradation is affected by co-runner
     - (Figure: performance degradation range, showing min, median, and max; degradation axis from 0 to 200)

  4. Job Co-Scheduling
     - Assign jobs to chips so as to minimize contention
     - (Figure labels: P1, P2, Shared cache 1; P3, P4, Shared cache 2)

  5. Job Co-Scheduling
     - Assign jobs to chips so as to minimize contention
     - (Figure labels: Resource Waste, P1, P2, Shared cache 1; Resource Contention, P3, P4, Shared cache 2)

  6. Job Co-Scheduling
     - Assign jobs to chips so as to minimize contention
     - (Figure labels: P1, P2, Shared cache 1; P3, P4, Shared cache 2)

  7. Job Co-Scheduling
     - Assign jobs to chips so as to minimize contention
     - (Figure labels: P1, P3, Shared cache 1; P2, P4, Shared cache 2)

  8. The Goal of this Work
     - Related work
       - Snavely et al. [ASPLOS '00]
     - Goal of this work
       - Find the optimal schedule on a CMP system
     - Benefits
       - Evaluate the quality of current schedules
       - Apply in real systems

  9. Contributions
     - Polynomial-time optimal solution on dual-core systems
     - NP-completeness proof on K-core (K>2) systems
     - Polynomial-time approximation algorithms on K-core (K>2) systems

  10. Contributions
      - Polynomial-time optimal solution on dual-core systems
      - NP-completeness proof on K-core (K>2) systems
      - Polynomial-time approximation algorithms on K-core (K>2) systems

  11. Problem Formulation
      - M jobs
      - N-core processors

  12. Problem Formulation
      - M jobs
      - N-core processors

  13. Problem Formulation
      - Assignment
      - Degradation: $\mathrm{Deg}_i = \dfrac{\mathrm{cCPI}_i - \mathrm{sCPI}_i}{\mathrm{sCPI}_i}$
      - Goal: minimize $\sum_i \mathrm{Deg}_i$
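A minimal sketch of the objective on this slide. It reads cCPI as the cycles-per-instruction of a job while co-running and sCPI as its CPI when running alone; the slide does not spell out the abbreviations, so that reading is an assumption, and the CPI numbers below are made up for illustration.

```python
def degradation(c_cpi: float, s_cpi: float) -> float:
    """Relative slowdown of one job: (cCPI - sCPI) / sCPI."""
    if s_cpi <= 0:
        raise ValueError("sCPI must be positive")
    return (c_cpi - s_cpi) / s_cpi


def total_degradation(c_cpis, s_cpis) -> float:
    """Objective to minimize: the sum of Deg_i over all jobs."""
    return sum(degradation(c, s) for c, s in zip(c_cpis, s_cpis))


# Made-up CPI values for three jobs (not measurements from the talk).
s_cpis = [1.0, 2.0, 4.0]   # assumed stand-alone CPI per job
c_cpis = [1.5, 2.0, 5.0]   # assumed co-run CPI per job under some schedule
print(total_degradation(c_cpis, s_cpis))
```

Comparing two candidate schedules then amounts to comparing their `total_degradation` values.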

  14. Dual-Core System
      - Polynomial solution
        - Minimum-weight perfect matching [Edmonds, 1965]
      - Matching
        - A matching M in a graph G is a set of edges with no common vertex.
        - A perfect matching is a matching that matches all vertices of the graph.

  15. Dual-Core System
      - Minimum-weight perfect matching
        - Defined on an edge-weighted graph
        - The sum of the weights of the edges in the matching is minimum

  16. Dual-Core System
      - Job → Node
      - Co-run degradation → Edge weight
      - Optimal schedule → Minimum-weight perfect matching
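The mapping above can be sketched in a few lines. The talk cites Edmonds' polynomial-time algorithm; the code below is not that algorithm but a plain memoized exhaustive search, swapped in only because it is short and self-contained, and it is practical for a handful of jobs at most. The weight matrix is hypothetical: entry `weight[i][j]` is assumed to hold the summed degradation of jobs `i` and `j` when they share a chip.

```python
from functools import lru_cache


def min_weight_perfect_matching(weight):
    """Return (total_weight, pairs) for a symmetric weight matrix.

    Memoized exhaustive search over bitmasks of unmatched jobs;
    a stand-in for Edmonds' algorithm, exponential in the job count.
    """
    n = len(weight)
    if n % 2:
        raise ValueError("a perfect matching needs an even number of jobs")

    @lru_cache(maxsize=None)
    def best(mask):
        if mask == 0:
            return 0.0, ()
        i = (mask & -mask).bit_length() - 1      # lowest-numbered unmatched job
        rest = mask & ~(1 << i)
        result = None
        for j in range(i + 1, n):
            if rest & (1 << j):                  # job j is still unmatched
                sub_cost, sub_pairs = best(rest & ~(1 << j))
                total = weight[i][j] + sub_cost
                if result is None or total < result[0]:
                    result = (total, ((i, j),) + sub_pairs)
        return result

    return best((1 << n) - 1)


# Hypothetical co-run degradations for four jobs (symmetric, zero diagonal).
w = [
    [0, 9, 2, 4],
    [9, 0, 5, 1],
    [2, 5, 0, 8],
    [4, 1, 8, 0],
]
cost, pairs = min_weight_perfect_matching(w)
print(cost, pairs)
```

Each returned pair is one dual-core chip; for realistic job counts a polynomial-time matching implementation would replace the search.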

  17. Contributions
      - Polynomial-time optimal solution on dual-core systems
      - NP-completeness proof on K-core (K>2) systems
      - Polynomial-time approximation algorithms on K-core (K>2) systems

  18. NP-Completeness Proof
      - Membership in NP
        - Given a schedule, its total degradation can be computed in polynomial time
      - Reduction
        - NP-complete problem → Job co-scheduling
        - NP-complete problem used: Multidimensional Assignment Problem (MAP)

  19. NP-Completeness Proof
      - MAP

  20. NP-Completeness Proof
      - MAP
      - (Figure label: Weight)

  21. NP-Completeness Proof
      - MAP
      - (Figure labels: Weight; Minimize Total Weight)

  22. NP-Completeness Proof
      - Job Co-Scheduling on CMP

  23. NP-Completeness Proof
      - Job Co-Scheduling on CMP
      - (Figure labels: Sum of Degradations in the Assignment; Weight; Minimize Total Weight)

  24. NP-Completeness Proof
      - MAP → Job Co-Scheduling

  25. NP-Completeness Proof
      - MAP → Job Co-Scheduling
      - (Figure labels: Weight = Weight)

  26. NP-Completeness Proof
      - MAP → Job Co-Scheduling
      - (Figure labels: Weight = Weight; Weight = ∞)

  27. Contributions
      - Polynomial-time optimal solution on dual-core systems
      - NP-completeness proof on K-core (K>2) systems
      - Polynomial-time approximation algorithms on K-core (K>2) systems

  28. Approximation Algorithms
      - Hierarchical Perfect Matching
      - Greedy

  29. Hierarchical Perfect Matching
      - Dual-core system optimal solution
      - (Figure labels: N Core, N/2 Core, Dual Core)

  30. Hierarchical Perfect Matching

  31. Hierarchical Perfect Matching

  32. Hierarchical Perfect Matching

  33. Hierarchical Perfect Matching

  34. Hierarchical Perfect Matching
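The slides show the hierarchical scheme only as figures, so the following is one plausible reading rather than the authors' procedure: match jobs into pairs, then treat each pair as a node and match pairs into groups of four, and so on until each group fills a chip. The names `match_pairs`, `hierarchical_schedule`, `group_cost`, and the toy "pressure" cost model are all hypothetical, and the sketch assumes that a degradation value is available for partially filled groups and that the core count is a power of two. The matching step is again an exhaustive search standing in for a polynomial-time matching algorithm.

```python
def match_pairs(items, cost):
    """Exhaustive minimum-weight perfect matching (small inputs only)."""
    if not items:
        return 0.0, []
    first, rest = items[0], items[1:]
    best = None
    for k, partner in enumerate(rest):
        sub_cost, sub_pairs = match_pairs(rest[:k] + rest[k + 1:], cost)
        total = cost(first, partner) + sub_cost
        if best is None or total < best[0]:
            best = (total, [(first, partner)] + sub_pairs)
    return best


def hierarchical_schedule(jobs, cores, group_cost):
    """Double the group size level by level until groups fill a chip."""
    jobs = list(jobs)
    if cores < 1 or cores & (cores - 1):
        raise ValueError("this sketch assumes cores is a power of two")
    if len(jobs) % cores:
        raise ValueError("number of jobs must be a multiple of cores")

    groups = [frozenset([j]) for j in jobs]
    size = 1
    while size < cores:
        # Edge weight between two groups: cost of running them together.
        _, pairs = match_pairs(groups, lambda a, b: group_cost(a | b))
        groups = [a | b for a, b in pairs]
        size *= 2
    return groups


# Toy cost model (an assumption): each job has a cache "pressure", and a
# group's degradation grows with the square of its summed pressure.
pressure = [8, 7, 6, 5, 4, 3, 2, 1]


def toy_cost(group):
    return sum(pressure[j] for j in group) ** 2


print(hierarchical_schedule(range(8), 4, toy_cost))
```

Under the toy model the first level pairs heavy jobs with light ones, and the second level merges those pairs into two balanced quad-core groups.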

  35. Greedy Algorithm
      - Basic idea
        - Schedule the least “polite” job first
      - “Politeness” of a job
        - Based on the sum of degradations of all the assignments that contain this job
        - Reflects the impact of a job on others

  36. Greedy Algorithm
      I. Sort unassigned jobs based on politeness
      II. Pick the least polite job J to schedule
      III. Add the assignment that contains J and has the least degradation to the schedule
      IV. Update the unassigned job list
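The four steps above can be sketched as follows. Several details are assumptions rather than facts from the slides: a larger summed degradation is taken to mean a less polite job, the sums are computed once up front instead of being refreshed each round, and `group_cost` is a hypothetical callback returning the degradation of one assignment. The sketch enumerates every possible assignment, so it only suits small inputs.

```python
from itertools import combinations


def greedy_schedule(jobs, cores, group_cost):
    """Greedy co-scheduling sketch: place the least polite job first."""
    jobs = list(jobs)
    if len(jobs) % cores:
        raise ValueError("number of jobs must be a multiple of cores")

    # Degradation of every possible assignment (a group of `cores` jobs).
    cost = {frozenset(g): group_cost(frozenset(g))
            for g in combinations(jobs, cores)}

    # Summed degradation of all assignments that contain each job.
    # Assumption: a larger sum means a less polite job.
    burden = {j: sum(c for g, c in cost.items() if j in g) for j in jobs}

    unassigned = list(jobs)
    schedule = []
    while unassigned:
        # Steps I and II: pick the least polite unassigned job J.
        j = max(unassigned, key=burden.get)
        # Step III: cheapest assignment containing J among unassigned jobs.
        free = set(unassigned)
        candidates = [g for g in cost if j in g and g <= free]
        best = min(candidates, key=cost.get)
        schedule.append(best)
        # Step IV: update the unassigned job list.
        unassigned = [x for x in unassigned if x not in best]
    return schedule


# Toy cost model (an assumption): squared sum of per-job cache "pressure".
pressure = [4, 3, 2, 1]


def toy_cost(group):
    return sum(pressure[j] for j in group) ** 2


print(greedy_schedule(range(4), 2, toy_cost))
```

In the toy run the heaviest job is placed first and paired with the lightest one, leaving the two mid-weight jobs to share the other chip.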

  37. Local Optimization
      - Main scheme (K: number of assignments)
          for i ← 1 to K-1
            for j ← i+1 to K
              Local-Optimization(i, j)
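The slide gives only the outer loops, not what `Local-Optimization(i, j)` does. The sketch below fills that gap with an assumption: the jobs of assignments `i` and `j` are pooled and re-split in whichever way gives the lowest combined degradation. The function names and the `group_cost` callback are hypothetical, and indices are 0-based here where the slide counts from 1.

```python
from itertools import combinations


def local_optimization(schedule, i, j, group_cost):
    """Assumed behavior: find the best re-split of the jobs in groups i and j."""
    pool = schedule[i] | schedule[j]
    size = len(schedule[i])
    best = (group_cost(schedule[i]) + group_cost(schedule[j]),
            schedule[i], schedule[j])
    for part in combinations(sorted(pool), size):
        a = frozenset(part)
        b = pool - a
        total = group_cost(a) + group_cost(b)
        if total < best[0]:          # keep the current split on ties
            best = (total, a, b)
    schedule[i], schedule[j] = best[1], best[2]


def refine(schedule, group_cost):
    """Main scheme from the slide: visit every pair of assignments once."""
    schedule = [frozenset(g) for g in schedule]
    k = len(schedule)                # K: number of assignments
    for i in range(k - 1):           # for i <- 1 to K-1
        for j in range(i + 1, k):    # for j <- i+1 to K
            local_optimization(schedule, i, j, group_cost)
    return schedule


# Toy cost model (an assumption): squared sum of per-job cache "pressure".
pressure = [4, 3, 2, 1]


def toy_cost(group):
    return sum(pressure[j] for j in group) ** 2


print(refine([{0, 1}, {2, 3}], toy_cost))
```

A pass like this can be run on the output of either heuristic, which is presumably what the "-Opt" variants in the evaluation slides refer to; that link is an inference, not something the slides state.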

  38. Performance Evaluation
      - Machine
        - AMD Opteron 4-core processors
      - Benchmarks
        - 15 SPEC CPU2000, 1 Stream
      - Metrics
        - Performance degradation
        - Scheduling time
        - Fairness

  39. Performance Degradation
      - (Figure: performance degradation (%), axis from 0 to 70, per benchmark for OPT, Greedy, Hierarchical, and Random)

  40. Performance Degradation
      - (Figure: performance degradation (%), axis from 0 to 70, for OPT, Greedy-Opt, Hierarchical-Opt, and Random; benchmarks shown: ammp, applu, art, bzip, crafty, equake, facerec, gap, mcf, parser, stream, swim, twolf, vpr, and Average)

  41. Scheduling Time
      - (Figure: running time in seconds, axis from 0 to 20, versus number of jobs from 16 to 144, for Greedy, Greedy-opt, Hierarchical, and Hierarchical-opt)

  42. Fairness
      - Unfairness factor: coefficient of variation of normalized degradation
      - (Figure: unfairness factor, axis from 0 to 0.25, for OPT, Greedy-opt, Greedy, Hierarchical-opt, Hierarchical, and Random)
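A small sketch of the unfairness factor as defined on this slide, taking the coefficient of variation to be the standard deviation divided by the mean. The slide does not say how degradations are normalized or whether the population or sample standard deviation is meant; the code takes already-normalized values as input and uses the population form, both of which are assumptions.

```python
from statistics import mean, pstdev


def unfairness_factor(normalized_degradations):
    """Coefficient of variation: population std. deviation over the mean."""
    values = list(normalized_degradations)
    avg = mean(values)
    if avg == 0:
        raise ValueError("coefficient of variation is undefined for zero mean")
    return pstdev(values) / avg


# Made-up normalized degradations for the jobs in two schedules.
print(unfairness_factor([0.2, 0.2, 0.2]))   # every job slowed equally
print(unfairness_factor([1.0, 3.0]))        # slowdowns spread unevenly
```

A value near zero means all jobs suffer similar slowdowns; larger values mean the slowdown falls unevenly across jobs.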

  43. Conclusion
      - Job co-scheduling on CMP is crucial
        - Performance varies across schedules
      - Dual-core system
        - Solvable in polynomial time
      - K-core (K>2) system
        - NP-complete problem
      - Heuristics
        - Hierarchical
        - Greedy

  44. Acknowledgement
      - Weizhen Mao, William and Mary
      - Cliff Stein, Columbia University
      - William Cook, Georgia Tech
      - National Science Foundation
      - IBM CAS Fellowship

  45. Thanks! Questions?
