
A Moldable Online Scheduling Algorithm and Its Application to Parallel Short Sequence Mapping

Erik Saule, Doruk Bozdağ, Umit V. Catalyurek. Department of Biomedical Informatics, The Ohio State University. {esaule,bozdagd,umit}@bmi.osu.edu


  1. A Moldable Online Scheduling Algorithm and Its Application to Parallel Short Sequence Mapping. Erik Saule, Doruk Bozdağ, Umit V. Catalyurek. Department of Biomedical Informatics, The Ohio State University. {esaule,bozdagd,umit}@bmi.osu.edu. JSSPP 2010. Supported by the U.S. DOE SciDAC Institute, the U.S. National Science Foundation and the Ohio Supercomputing Center. Ohio State University, Biomedical Informatics Moldable Task Scheduling Erik Saule :: 1 / 26 HPC Lab http://bmi.osu.edu/hpc

  2. Motivation. Sequencing: next-generation sequencing instruments (SOLiD, Solexa, 454) can sequence up to 1 billion bases a day. Hundreds of millions of 35-50 base reads. Mapping: map reads to a reference genome efficiently (human genome: 3 Gb). Need a large parallel computer. Pooling resources will decrease cost. We study the job scheduling problem.

  3. Parallel Short Sequence Mapping [Bozdağ et al., IPDPS 09]. Three partitioning dimensions: $P(m_g, m_r, m_s) = c_{gs}\frac{G}{m_g} + c_g\frac{G}{m_g m_s} + c_{rs}\frac{R}{m_r} + \left(c_r + c_c\frac{G}{m_g m_s}\right)\frac{R}{m_r m_s}$. Partitioning on $m$ processors means finding the minimum of $P(m_g, m_r, m_s)$ such that $m_g m_r m_s \le m$.
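The partitioning step can be illustrated by brute force. The sketch below is not the authors' code: it assumes the cost model reads P = c_gs·G/m_g + c_g·G/(m_g·m_s) + c_rs·R/m_r + (c_r + c_c·G/(m_g·m_s))·R/(m_r·m_s), and the function names and every coefficient value are hypothetical placeholders chosen only so the example runs.

```python
def mapping_time(mg, mr, ms, G, R, c_gs, c_g, c_rs, c_r, c_c):
    """Assumed runtime model P(m_g, m_r, m_s); see the lead-in for the formula."""
    return (c_gs * G / mg
            + c_g * G / (mg * ms)
            + c_rs * R / mr
            + (c_r + c_c * G / (mg * ms)) * R / (mr * ms))


def best_partition(m, **params):
    """Enumerate every (m_g, m_r, m_s) with m_g * m_r * m_s <= m and keep the fastest."""
    best = None
    for mg in range(1, m + 1):
        for mr in range(1, m // mg + 1):
            for ms in range(1, m // (mg * mr) + 1):
                t = mapping_time(mg, mr, ms, **params)
                if best is None or t < best[0]:
                    best = (t, (mg, mr, ms))
    return best


# Hypothetical sizes and coefficients, not measured values.
params = dict(G=3e9, R=1e8, c_gs=1e-8, c_g=5e-8, c_rs=1e-7, c_r=1e-6, c_c=1e-15)
runtime, (mg, mr, ms) = best_partition(32, **params)
print(f"m=32: m_g={mg}, m_r={mr}, m_s={ms}, predicted time {runtime:.2f}")
```

Exhaustive enumeration is affordable here because the number of triples with product at most m is small for realistic machine sizes.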

  4. Outline of the Talk. 1 Introduction. 2 A Moldable Scheduling Problem. 3 Deadline Based Online Scheduler (DBOS). 4 Experiments. 5 Conclusion.

  5. Parallel Short Sequence Mapping. The important facts: can adapt to different numbers of processors; good runtime prediction function; no super-linear speedup; non-convex speedup function (steps); no preemption. [Figure: speedup plot; y-axis "speedup" from 0 to 22, x-axis from 0 to 30.]

  6. Moldable Scheduling. Instance: $m$ processors and $n$ tasks. Task $i$ arrives at $r_i$. The execution of task $i$ on $j$ processors takes $p_{i,j}$ time units. Solution: task $i$ is executed on $\pi_i$ processors, starts at $\sigma_i$, and finishes at $C_i = \sigma_i + p_{i,\pi_i}$. [Figure: example schedule.]

  7. Objective Function. Flow time: the flow time is the time a task spends in the system, $F_i = C_i - r_i$. It does not take task size into account. Optimizing the maximum flow time is unfair to small tasks. Optimizing the average flow time may starve large tasks. Stretch [Bender et al., SODA 98]: the stretch is the flow time normalized by the processing time of the task. In the moldable tasks context, we define it as $s_i = \frac{C_i - r_i}{p_{i,1}}$. It provides better fairness between tasks. Optimizing the maximum stretch avoids starvation.
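As a small worked example of the two objectives, the sketch below computes flow time and stretch for two tasks run back to back on one processor. The class name, field names and task sizes are illustrative assumptions, not taken from the talk.

```python
from dataclasses import dataclass


@dataclass
class ScheduledTask:
    release: float   # r_i
    start: float     # sigma_i
    procs: int       # pi_i
    times: dict      # times[j] = p_{i,j}, processing time on j processors

    @property
    def completion(self):
        # C_i = sigma_i + p_{i, pi_i}
        return self.start + self.times[self.procs]

    @property
    def flow(self):
        # F_i = C_i - r_i
        return self.completion - self.release

    @property
    def stretch(self):
        # s_i = (C_i - r_i) / p_{i,1}
        return self.flow / self.times[1]


# Both tasks are released at time 0; the large one runs first on one processor.
large = ScheduledTask(release=0.0, start=0.0, procs=1, times={1: 100.0})
small = ScheduledTask(release=0.0, start=100.0, procs=1, times={1: 1.0})

for name, task in (("large", large), ("small", small)):
    print(name, "flow:", task.flow, "stretch:", task.stretch)
```

The two flow times are almost equal (100 and 101), while the stretches differ by two orders of magnitude (1 and 101), which is why the maximum flow time hides how badly the small task is treated.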


  9. Online maximum stretch cannot be approximated. Adversary technique on one processor: a large task enters the system. If it is scheduled immediately, a small task is sent; it suffers a large delay (and an unbounded stretch). If the large task is scheduled later, a small task is sent accordingly; it again suffers a large delay (and an unbounded stretch). On several processors: there are similar techniques, but they are more complicated and thus less likely to appear in practice. The key point: if all processors are busy, a small task entering the system will have a large stretch.
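One way to read the one-processor adversary argument is sketched numerically below. This is an illustration under stated assumptions, not the proof from the talk: the function name, the task lengths, and the choice of releasing the small task a fixed `gap` after the large task starts are all hypothetical. The offline value is only an upper bound on the optimal maximum stretch, obtained from two hand-picked schedules.

```python
def adversary_gap(large, small, start, gap):
    """One processor, no preemption.

    The large task (length `large`) is released at time 0 and the online
    scheduler starts it at `start`. The adversary then releases a small
    task (length `small`) at `start + gap`, with 0 < gap < large.
    Returns (online max stretch, upper bound on offline max stretch).
    """
    release = start + gap
    # Online: the small task has to wait for the large task to finish.
    online_small = (start + large + small - release) / small
    online_large = (start + large) / large
    online = max(online_small, online_large)
    # Offline option A: large task at time 0, small task after it.
    small_start = max(large, release)
    option_a = max(1.0, (small_start + small - release) / small)
    # Offline option B: small task at its release, large task right after.
    option_b = max(1.0, (release + small + large) / large)
    return online, min(option_a, option_b)


for start in (0.0, 50.0, 500.0):
    online, offline = adversary_gap(large=100.0, small=0.01, start=start, gap=0.01)
    print(f"start={start}: online >= {online:.1f}, offline <= {offline:.4f}")
```

Whatever start time the online scheduler picks, the small task's stretch grows like `large / small`, while a scheduler that knew the release time keeps the maximum stretch close to a small constant; shrinking `small` makes the ratio arbitrarily large.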


  16. Principle of the Deadline Based Online Scheduler (DBOS). All tasks running concurrently should get the same stretch to maximize efficiency. Use the optimal maximum stretch as an instantaneous measure of the load. Aim at a schedule that is more efficient than the one achieving the optimal instantaneous maximum stretch, in order to deal with still-to-arrive tasks.

  17. The DBOS Algorithm. Targeting a maximum stretch $S$: task $i$ must complete before the deadline $D_i = r_i + p_{i,1} S$. Moldable Earliest Deadline First (MEDF): considers tasks in deadline order; allocates to each task the minimum number of processors that lets it complete before its deadline; schedules the task as soon as possible without moving any other task. DBOS($\rho$): estimate the best maximum stretch $S^*$ using a binary search, where the deadline problem is solved by MEDF; then build a schedule of good efficiency with stretch $\rho S^*$. $\rho$ is the online parameter.
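The steps on this slide can be sketched as follows. This is a minimal reading of the slide, not the authors' implementation: it assumes all processors are idle at time `now` (already-running tasks are ignored), that MEDF feasibility is monotone in S so a binary search applies, and that a tolerance `tol` is acceptable. The function names and the task encoding are hypothetical.

```python
def used(placed, x):
    """Processors busy at time x, given placed = [(start, end, procs), ...]."""
    return sum(q for s, e, q in placed if s <= x < e)


def earliest_start(placed, m, procs, duration, ready):
    """Earliest t >= ready at which `procs` processors are free for `duration`."""
    candidates = sorted({ready} | {e for _, e, _ in placed if e > ready})
    for t in candidates:
        points = [t] + [s for s, _, _ in placed if t < s < t + duration]
        if all(used(placed, x) + procs <= m for x in points):
            return t
    raise ValueError("no feasible start; is procs <= m?")


def medf(tasks, m, S, now=0.0):
    """tasks[i] = (r_i, times) with times[j-1] = p_{i,j}.
    Returns {i: (start, procs)} or None if some deadline cannot be met."""
    placed, schedule = [], {}
    deadline = lambda i: tasks[i][0] + tasks[i][1][0] * S  # D_i = r_i + p_{i,1} S
    for i in sorted(range(len(tasks)), key=deadline):
        r, times = tasks[i]
        for j in range(1, m + 1):  # fewest processors first
            start = earliest_start(placed, m, j, times[j - 1], max(now, r))
            if start + times[j - 1] <= deadline(i) + 1e-9:
                placed.append((start, start + times[j - 1], j))
                schedule[i] = (start, j)
                break
        else:
            return None
    return schedule


def dbos(tasks, m, rho, now=0.0, tol=1e-3):
    """Binary-search S*, then schedule with MEDF for the relaxed target rho * S*."""
    lo, hi = 0.0, 1.0
    while medf(tasks, m, hi, now) is None:  # grow until feasible
        hi *= 2
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if medf(tasks, m, mid, now) is None:
            lo = mid
        else:
            hi = mid
    # May be None if MEDF feasibility is not monotone on this instance.
    return hi, medf(tasks, m, rho * hi, now)


# One task with imperfect speedup on m = 4 processors (made-up times).
tasks = [(0.0, [8.0, 4.0, 3.0, 2.5])]
for rho in (1.0, 2.0):
    s_star, schedule = dbos(tasks, 4, rho)
    print(f"rho={rho}: S* ~ {s_star:.4f}, (start, procs) = {schedule[0]}")
```

In this toy instance, targeting the best stretch forces the task onto all four processors, while relaxing the target by ρ = 2 lets MEDF use only two, which leaves room for tasks that have not arrived yet.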

