Avoiding Register Overflow in the Bakery Algorithm: The Bakery++ Algorithm
  1. Avoiding Register Overflow in the Bakery Algorithm: The Bakery++ Algorithm
Amirhossein Sayyadabdi and Mohsen Sharifi. SRMPDS '20, Edmonton, AB, Canada.
The Bakery algorithm is the first true solution to the mutual exclusion problem, but it suffers from register overflow. Bakery++ is a slightly modified version of Bakery that avoids overflows without introducing new variables or redefining Bakery's operations or functions. Bakery++ is quite simple. It is specified formally in the PlusCal language and verified correct using the TLC model checker.
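To make the overflow problem concrete, here is a minimal Python sketch of the ticket-taking ("doorway") step of the classic Bakery algorithm. The 8-bit register bound is a hypothetical illustration, and the sketch deliberately never resets tickets so the growth is easy to see; it is not the Bakery++ fix, which the paper gives in PlusCal.

```python
# Sketch of Lamport's Bakery "doorway" step, illustrating why tickets
# can overflow a bounded register. Illustration only: this is NOT the
# Bakery++ algorithm, whose fix is specified in the paper's PlusCal model.

REGISTER_MAX = 255  # hypothetical 8-bit ticket register

def take_ticket(number, i):
    """Classic Bakery doorway: pick a ticket larger than every current one."""
    number[i] = 1 + max(number)
    return number[i]

# Two processes repeatedly taking tickets. For simplicity tickets are never
# reset here; in real Bakery the maximum still grows whenever doorway phases
# overlap with nonzero tickets, so a bounded register must eventually overflow.
number = [0, 0]
for round_ in range(300):
    take_ticket(number, round_ % 2)

assert max(number) > REGISTER_MAX  # unbounded growth exceeds the register
```

Bakery++ avoids this growth without adding variables or changing Bakery's operations; the mechanism itself is in the paper's PlusCal specification.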

  2. Communication-aware Job Scheduling using SLURM
Priya Mishra, Tushar Agrawal, Preeti Malakar. Indian Institute of Technology Kanpur.
Motivation: the performance of communication-intensive jobs is affected by network contention, node spread, and job interference.
Objective: develop node-allocation algorithms that consider a job's behaviour during resource allocation, to improve the performance of communication-intensive jobs.
Methods:
• Greedy Allocation: nodes are allocated on switches with a lower communication ratio (lower contention and more free nodes).
• Balanced Allocation: nodes are allocated in powers of two to minimize inter-switch communication.
• Adaptive Allocation: selects the more optimal node-allocation algorithm (greedy or balanced) based on their cost of communication.
Results:
• The proposed algorithms reduce execution times by 9% on average and wait times by 31% across three job logs.
• Balanced and adaptive always perform better than default and greedy.
• The proposed algorithms always perform better than the default for the same cluster state (individual runs).
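The three allocation strategies can be sketched as follows. The switch data, field names, power-of-two chunking details, and the switch-count stand-in for communication cost are all illustrative assumptions, not the paper's SLURM implementation.

```python
# Illustrative sketch of greedy / balanced / adaptive node allocation.
# Data structures and the cost proxy are assumptions for illustration.

def _largest_pow2(n):
    """Largest power of two <= n (n >= 1)."""
    p = 1
    while p * 2 <= n:
        p *= 2
    return p

def greedy_allocate(switches, nodes_needed):
    """Greedy: prefer switches with a lower communication ratio."""
    alloc, remaining = {}, nodes_needed
    for sw in sorted(switches, key=lambda s: s["comm_ratio"]):
        take = min(sw["free"], remaining)
        if take:
            alloc[sw["id"]] = take
            remaining -= take
        if remaining == 0:
            break
    return alloc if remaining == 0 else None

def balanced_allocate(switches, nodes_needed):
    """Balanced: take power-of-two chunks to limit inter-switch traffic."""
    alloc, remaining = {}, nodes_needed
    for sw in sorted(switches, key=lambda s: -s["free"]):
        if remaining == 0:
            break
        if sw["free"] == 0:
            continue
        take = _largest_pow2(min(sw["free"], remaining))
        alloc[sw["id"]] = take
        remaining -= take
    return alloc if remaining == 0 else None

def adaptive_allocate(switches, nodes_needed):
    """Adaptive: pick the cheaper allocation under a communication-cost
    estimate (here, the number of switches spanned, purely as a stand-in)."""
    candidates = [a for a in (greedy_allocate(switches, nodes_needed),
                              balanced_allocate(switches, nodes_needed))
                  if a is not None]
    return min(candidates, key=len) if candidates else None

switches = [{"id": "s1", "free": 5, "comm_ratio": 0.8},
            {"id": "s2", "free": 4, "comm_ratio": 0.2}]
g = greedy_allocate(switches, 6)    # {"s2": 4, "s1": 2}
b = balanced_allocate(switches, 6)  # {"s1": 4, "s2": 2}
```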

  3. Characterizing the Cost-Accuracy Performance of Cloud Applications
Sunimal Rathnayake, *Lavanya Ramapantulu, Yong Meng Teo. National University of Singapore, *Nanyang Technological University. SRMPDS Workshop @ ICPP 2020.
Motivation: some cloud applications (e.g. machine learning) are scalable and can produce results of different accuracy, with resource demand varying with accuracy; cloud resources offer a large resource pool with pay-for-use charging. This creates an opportunity for trading off accuracy for time and cost.
Approach: a two-stage approach — measurements for characterization, then a model and optimization for determining cost, time, and configuration.
Contributions:
• A measurement-driven model and analysis of cost-accuracy "sweet spots".
• Cost-accuracy and time-accuracy Pareto-optimal configurations.
• Metrics for cost-accuracy and time-accuracy performance.
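The notion of Pareto-optimal configurations can be sketched as follows. The configurations and numbers are made up for illustration; the paper derives its fronts from a measurement-driven model.

```python
# Minimal sketch of extracting a cost-accuracy Pareto front from measured
# configurations. The data below is illustrative, not from the paper.

def pareto_front(configs):
    """Keep configurations that no other configuration dominates
    (dominating = cost no higher, accuracy no lower, strictly better in one)."""
    front = []
    for c in configs:
        dominated = any(
            o["cost"] <= c["cost"] and o["accuracy"] >= c["accuracy"]
            and (o["cost"] < c["cost"] or o["accuracy"] > c["accuracy"])
            for o in configs
        )
        if not dominated:
            front.append(c)
    return front

measured = [
    {"vm": "small",  "cost": 1.0, "accuracy": 0.80},
    {"vm": "medium", "cost": 2.0, "accuracy": 0.90},
    {"vm": "large",  "cost": 4.0, "accuracy": 0.89},  # dominated by "medium"
]
front = pareto_front(measured)  # "small" and "medium" survive
```

A "sweet spot" in the paper's sense would be a point chosen from such a front, trading a small accuracy loss for a large cost or time saving.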

  4. Scheduling Task-parallel Applications in Dynamically Asymmetric Environments (SRMPDS 2020)
Jing Chen, Pirah Noor Soomro, Mustafa Abduljabbar, Madhavan Manivannan, Miquel Pericàs.
Motivation:
• Applications sharing resources suffer from interference.
• Runtime scheduling techniques coupled with application knowledge can be used to mitigate interference.
Goal: predict the performance of future tasks given a set of resources; leverage task moldability and knowledge of task criticality to adapt to interference. The scheduler aims to minimize resource usage, execution time, and overcommitting of resources.
Method: an online performance model, the Performance Trace Table (PTT), is used to predict task performance.
• Entries are elastic execution places (leader core, resource width); one PTT per task type.
• Execution-time records are updated dynamically during execution.
• Aware of interference activity; requires only little information; independent of the platform; low overhead.
Results: throughput (tasks/s) of the RWS, RWSM-C, FA, FAM-C, DA, DAM-C, and DAM-P schedulers compared for DAG parallelism 2–6, under interference from a co-running application and from DVFS. [Figures: throughput plots omitted.]
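The PTT idea described above can be sketched as a small table per task type, indexed by elastic execution place and updated online. The class name, the exponential-moving-average update rule, and the numbers are assumptions for illustration, not the poster's exact design.

```python
# Sketch of a Performance Trace Table (PTT): one table per task type,
# indexed by (leader core, resource width), with execution-time records
# updated online. The EMA update rule is an assumption, not the paper's.

class PTT:
    def __init__(self, alpha=0.5):
        self.alpha = alpha   # smoothing factor for online updates
        self.records = {}    # (leader_core, width) -> predicted exec time

    def update(self, leader_core, width, measured_time):
        """Fold a new measurement into the record for this execution place."""
        key = (leader_core, width)
        old = self.records.get(key)
        self.records[key] = (measured_time if old is None
                             else self.alpha * measured_time
                                  + (1 - self.alpha) * old)

    def predict(self, leader_core, width):
        return self.records.get((leader_core, width))

    def best_place(self):
        """Execution place with the lowest predicted execution time."""
        return min(self.records, key=self.records.get)

# One PTT for a hypothetical task type; measurements are made up.
ptt = PTT()
ptt.update(0, 2, 10.0)   # leader core 0, width 2
ptt.update(0, 4, 6.0)    # leader core 0, width 4
ptt.update(0, 4, 8.0)    # interference slows width-4 runs down
```

Because the records adapt during execution, a slowdown caused by a co-running application shifts the predictions, which is how such a scheduler can react to interference.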

  5. Network and Load-Aware Resource Manager for MPI Programs
Ashish Kumar, Naman Jain, Preeti Malakar. Department of Computer Science and Engineering, Indian Institute of Technology, Kanpur.
Problem: node allocation in a shared cluster for parallel jobs, to maximize performance considering both compute and network load on the cluster.
Challenges:
• Non-exclusive access to nodes in a shared cluster.
• Variation in load/utilization across time and across nodes (Figure: variation in (a) network bandwidth and (b) CPU load across nodes).
• The topology does not capture the current state of the network.
• Contention and congestion in the network due to existing jobs.
• Varying computation and communication requirements of different programs.
Problem Formulation:
Model: represent the cluster as a graph with compute nodes as vertices and network links as edges.
Objective: find a sub-graph satisfying the user's demands such that the overall load of the sub-graph is minimized.
Network Load: a measure of the load on a point-to-point network link; considers bandwidth and latency, and captures the topology automatically:
  NL(u,v) = w_lt * LT(u,v) + w_bw * BW(u,v)
Compute Load: a measure of the overall load on a node, over static (core count, clock speed) and dynamic (CPU load, available memory) attributes:
  CL_v = Σ_{a ∈ attributes} w_a * val_{v,a}
For a candidate sub-graph G_v with vertices V_v and edges E_v:
  Compute Load: C_{G_v} = Σ_{u ∈ V_v} CL_u
  Network Load: N_{G_v} = Σ_{(x,y) ∈ E_v} NL(x,y)
  Total Load = α * C_{G_v} + β * N_{G_v}
Core Components (Figure: allocator workflow):
Resource Monitor: a distributed monitoring system for the cluster; uses light-weight daemons to periodically update live hosts, node statistics, and network status.
Allocator: finds candidate sub-graphs, calculates the total load of each, and allocates nodes based on the user request; considers node attributes and network dynamics using the data collected by the resource monitor, and picks the best sub-graph according to total load.
Candidate Selection Algorithm:
• Start with a particular node v.
• Calculate the addition load for every node w.r.t. the start node: A_v(u) = α * CL(u) + β * NL(v,u).
• Keep adding nodes to the sub-graph in increasing order of addition load until the request is satisfied.
Results:
Table: performance gain using our allocation method.
  Algorithm  | Avg. gain | Max. gain
  Random     | 49.9%     | 87.8%
  Sequential | 43.1%     | 84.5%
  Load Aware | 32.4%     | 87.7%
Observations:
• Our algorithm performs better than random, sequential, and load-aware allocation on average.
• Load-aware performed better than sequential for a small number of nodes, but worse for a large number of nodes.
Conclusions and Future Work:
• Our algorithm reduces run-times by more than 38% over random, sequential, and load-aware allocations.
• Future work: formalization of weight estimation, and extension to large-scale systems spanning multiple clusters.
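The candidate-selection step above can be sketched directly from the formula A_v(u) = α·CL(u) + β·NL(v,u). The node names, load values, and weights below are illustrative assumptions, not measurements from the paper's cluster.

```python
# Sketch of the candidate-selection step: starting from node v, rank the
# other nodes by addition load A_v(u) = alpha*CL(u) + beta*NL(v, u) and
# add them until the requested number of nodes is reached.
# All data and weights below are made up for illustration.

def candidate_subgraph(v, nodes, CL, NL, request, alpha=1.0, beta=1.0):
    """Return `request` nodes: the start node v plus the nodes with the
    smallest addition load relative to v."""
    others = [u for u in nodes if u != v]
    others.sort(key=lambda u: alpha * CL[u] + beta * NL[(v, u)])
    return [v] + others[:request - 1]

CL = {"a": 0.1, "b": 0.5, "c": 0.2, "d": 0.9}          # per-node compute load
NL = {("a", "b"): 0.3, ("a", "c"): 0.1, ("a", "d"): 0.2}  # link load from "a"

# Request 3 nodes starting from "a":
#   A_a(b) = 0.5 + 0.3 = 0.8,  A_a(c) = 0.2 + 0.1 = 0.3,  A_a(d) = 0.9 + 0.2 = 1.1
subgraph = candidate_subgraph("a", ["a", "b", "c", "d"], CL, NL, 3)
```

In the full allocator, one such candidate sub-graph would be built per start node, and the one with the lowest total load α·C_{G_v} + β·N_{G_v} would be allocated.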
