Online Load Balancing with Learned Weights

Online Load Balancing with Learned Weights • Benjamin Moseley, Tepper School of Business, Carnegie Mellon University, and Relational-AI • Joint work with: Silvio Lattanzi (Google), Thomas Lavastida (CMU), and Sergei Vassilvitskii (Google)

Data Center Scheduling • Client Server Scheduling • Processed on m machines in the restricted assignment setting (more generally, unrelated machines) • Jobs arrive over time in the online-list model • Assign jobs to the machines to minimize makespan

Load Balancing under Restricted Assignment • m machines • n jobs • Online list: a job must be immediately assigned before the next job arrives • N(j): feasible machines for job j • p(j): size of job j (complexity essentially the same if unit sized ) • Minimize the maximum load • Optimal load is T

Online Competitive Analysis Model • c-competitive: $\mathrm{ALG}(I) / \mathrm{OPT}(I) \le c$ • Worst case relative performance on each input I • Problem well understood: • A lower bound of Ω(log m) on any online algorithm • Greedy is an O(log m)-competitive algorithm [Azar, Naor, and Rom 1995]
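A minimal sketch of the greedy rule in the online-list model, added here for illustration: each arriving job goes to its currently least-loaded feasible machine. The function name `greedy_assign` and the `(size, feasible machines)` job encoding are assumptions of this sketch, not notation from the talk.

```python
from typing import List, Sequence, Tuple


def greedy_assign(num_machines: int,
                  jobs: Sequence[Tuple[float, Sequence[int]]]
                  ) -> Tuple[List[int], List[float]]:
    """Assign each job, in arrival order, to its least-loaded feasible machine.

    Each job is a pair (p(j), N(j)): its size and its feasible machines.
    Ties are broken by the order of machines in N(j) (an assumption).
    """
    loads = [0.0] * num_machines
    assignment = []
    for size, feasible in jobs:
        # The decision uses only the loads seen so far, never future jobs.
        target = min(feasible, key=lambda i: loads[i])
        loads[target] += size
        assignment.append(target)
    return assignment, loads


# Hypothetical instance: 3 machines, 4 unit-size jobs.
jobs = [(1.0, [0, 1]), (1.0, [0, 1]), (1.0, [0]), (1.0, [1, 2])]
assignment, loads = greedy_assign(3, jobs)
makespan = max(loads)
```

On this small instance greedy ends with loads of 2, 1, and 1; the Ω(log m) lower bound concerns adversarially chosen inputs, which this example does not try to reproduce.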

Beyond Worst Case • Reasonable assumption: • Access to job traces • Desire a model to assist in assigning future jobs based on the past. • Predict the future based on the past. • What should be predicted? • How can it be predicted?

Learning and Online Algorithms • Combining learning and optimization • Caching [Lykouris and Vassilvitskii 2018] • Ski Rental [Purohit et al 2018] • Non-clairvoyant scheduling [Purohit et al 2018]

Building a Model • Guiding principles • Computable based on prior job traces • Predictions should be reasonably sized • Should be robust to error or inconsequential changes to the input • Focus on quantity to predict • Independent of learning algorithm used to construct the prediction • Focus on the worst case with access to the prediction • Goal: beat log(m) when error is small • Competitive ratio should depend on the error

What to Predict? • Load of the machines in the optimal solution? • Perhaps we can identify the contentious machines? [Figure: bar chart of the loads of Machines 1 to 4 in the optimal solution, y-axis from 0 to 80, makespan marked]

What to Predict? • Load of the machines in the optimal solution? • Perhaps we can identify the contentious machines? • No: a new instance padded with dummy jobs has the same loads [Figure: bar chart of the loads of Machines 1 to 4 in the optimal solution, y-axis from 0 to 80]

What to Predict? • Number of jobs that can be assigned to a machine? • Perhaps machines that can be assigned more jobs are more contentious?

What to Predict? • Number of jobs that can be assigned to a machine • Consider adding the following gadget to any instance • New jobs can be assigned to old machines, skewing 'degrees' adversarially • New jobs each have, say, a private machine [Figure: gadget with new jobs, their private machines, and an old machine]

What to Predict? • Distribution on job types • Is this the best predictive model? • $2^m$ job types possible • Need to predict a lot of information in some cases • Perhaps not the right model if information is sparse

What to Predict? • Predict dual variables • Known to be useful for matching in the random order model [Devanur and Hayes, Vee et al.] • Read a portion of the input • Compute the duals • Prove a primal assignment can be (approximately) constructed from the duals online • Use duals to make assignments on remaining input

What to Predict? • Predict dual variables for makespan scheduling • Can derive primal based on dual • Sensitive to small error (e.g. changing a variable by a factor of $1/n^{1/2}$ has the potential to drastically change the schedule)

What to Predict? • Idea: Capture contentiousness of a machine • Seems like the most important quantity besides types of jobs

Machine Weights • Predict a weight for each machine • Single number (compact) • Lower weight means more restrictive machine • Higher weight means less restrictive machine • Framework: • Predict machine weights • Use them to construct fractional assignments • Round to an integral solution online

Results on Predictions • Existence of weights • Theorem 1: Let T be the optimal max load. For any ε > 0, there exist machine weights and a rule to convert the weights to fractional assignments such that the resulting fractional max load is at most (1 + ε)T. • Theorem 2: Given predictions of the machine weights with maximum relative error η > 1, there exists an online algorithm yielding fractional assignments for which the fractional max load is bounded by O(T min{log η, log m}).

Results on Rounding • Theorem 3: There exists an online algorithm that takes as input fractional assignments and outputs integer assignments for which the maximum load is bounded by $O((\log\log m)^3 \, T')$, where T' is the maximum fractional load of the input. The algorithm is randomized and succeeds with probability at least $1 - 1/m^c$. • Corollary: There exists an $O(\min\{(\log\log m)^3 \log \eta, \log m\})$-competitive algorithm for restricted assignment in the online algorithms with learning setting. • Theorem 4: Any randomized online rounding algorithm has worst case load at least $\Omega(T' \log\log m)$.

Existence of Good Weights • Each machine i has a weight $w_i$ • Job j is assigned to machine i fractionally as follows: $x_{i,j} = \dfrac{w_i}{\sum_{i' \in N(j)} w_{i'}}$
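As an illustration of the rule on this slide, the sketch below splits one job across its feasible machines in proportion to their weights. The function name `fractional_assignment` and the example weights are assumptions made for this sketch.

```python
from typing import Dict, Sequence


def fractional_assignment(weights: Sequence[float],
                          feasible: Sequence[int]) -> Dict[int, float]:
    """Return x[i] = w_i / sum of w_{i'} over i' in N(j), for each i in N(j)."""
    total = sum(weights[i] for i in feasible)
    return {i: weights[i] / total for i in feasible}


# Hypothetical weights for three machines; the job may only use machines 0 and 1.
x = fractional_assignment([1.0, 3.0, 4.0], [0, 1])
```

Machine 2 has the largest weight but receives nothing because it is outside N(j); the job is split 1/4 and 3/4 between machines 0 and 1.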

Existence of Good Weights • There exist weights that satisfy the following for all machines i: $\sum_j x_{i,j} \le (1 + \varepsilon) T$ • Existence builds from [Agrawal, Zadimoghaddam, Mirrokni 2018] • Used for approximate maximum matching

Finding the Weights • Algorithm sketch for computing weights given an instance • Initialize all weights to be the same • While there is an overloaded machine • For each machine i • Current load of machine i: $L_i = \sum_j x_{i,j} = \sum_j \dfrac{w_i}{\sum_{i' \in N(j)} w_{i'}}$ • If $L_i \ge (1 + \varepsilon) T$ • Divide $w_i$ by $(1 + \varepsilon)$
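The sketch on this slide can be written out roughly as follows for unit-size jobs. Several details are assumptions of this illustration rather than statements from the talk: loads are recomputed once per round and all overloaded machines are updated together, the optimal load T is passed in as `target`, and a `max_rounds` cap guards against non-termination.

```python
from typing import List, Sequence


def machine_loads(weights: Sequence[float],
                  neighborhoods: Sequence[Sequence[int]]) -> List[float]:
    """L_i = sum over jobs j of w_i / (sum of w_{i'} over i' in N(j))."""
    loads = [0.0] * len(weights)
    for feasible in neighborhoods:
        total = sum(weights[i] for i in feasible)
        for i in feasible:
            loads[i] += weights[i] / total
    return loads


def compute_weights(num_machines: int,
                    neighborhoods: Sequence[Sequence[int]],
                    target: float,
                    eps: float,
                    max_rounds: int = 1000) -> List[float]:
    """Shrink the weights of overloaded machines until none is overloaded."""
    weights = [1.0] * num_machines          # initialize all weights equal
    threshold = (1.0 + eps) * target        # overloaded means L_i >= (1 + eps) T
    for _ in range(max_rounds):             # cap is an assumption of this sketch
        loads = machine_loads(weights, neighborhoods)
        overloaded = [i for i, load in enumerate(loads) if load >= threshold]
        if not overloaded:
            return weights
        for i in overloaded:
            weights[i] /= (1.0 + eps)
    raise RuntimeError("weights did not converge within max_rounds")


# Hypothetical instance: 2 machines, 4 unit jobs, optimal max load T = 2.
neighborhoods = [[0], [0, 1], [0, 1], [0, 1]]
weights = compute_weights(2, neighborhoods, target=2.0, eps=0.1)
loads = machine_loads(weights, neighborhoods)
```

With equal weights machine 0 starts at fractional load 2.5; its weight is divided by 1.1 until its load drops below 2.2, while machine 1 keeps its initial weight.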

Accounting for Error in the Predicted Weight • Say we are given a prediction $\hat{w}$ • Let the error be the maximum ratio $\eta = \max_i \hat{w}_i / w_i$ • If a machine is overloaded, run an iteration of the weight computation algorithm online • Converges in log η steps • If the load is greater than a log m factor off then revert to another online algorithm (e.g. greedy) • Get a fractional makespan at most O(T min{log η, log m})
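A small worked example of the error measure on this slide, as a hedged sketch: it computes η from predicted and reference weights and the min{log η, log m} factor from the makespan bound. The function names are made up for illustration, the logarithm base (2) is an assumption, and the constant hidden by O(·) is ignored.

```python
import math
from typing import Sequence


def prediction_error(predicted: Sequence[float],
                     reference: Sequence[float]) -> float:
    """eta = max over machines i of predicted_i / reference_i."""
    return max(p / w for p, w in zip(predicted, reference))


def guarantee_factor(eta: float, num_machines: int) -> float:
    """min{log eta, log m}, the factor multiplying T in the fractional bound.

    Base-2 logarithms are an assumption; the O(.) constant is dropped.
    """
    return min(math.log2(eta), math.log2(num_machines))


# Hypothetical numbers: one prediction is off by a factor of 4, m = 16 machines.
eta = prediction_error([2.0, 1.0, 4.0], [1.0, 1.0, 1.0])
factor = guarantee_factor(eta, 16)
```

Here η = 4, so the factor is log 4 = 2 rather than log 16 = 4; once η reaches m, the fallback bound log m takes over.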

Setup for Rounding Algorithm • Jobs arrive online • When job j arrives it reveals $x_{i,j}$ for all machines i • Assign each job immediately when it arrives • Compare maximum load to the maximum fractional load seen so far

Rounding Algorithm • Possible approaches • Prior LP rounding techniques • Techniques are too sophisticated to be used online, e.g. [Lenstra, Shmoys, Tardos 1990] needs a basic solution, BFS on support graph, … • Deterministic rounding • We show a lower bound of Ω(log m) • Vanilla randomized rounding • Easy to construct instances where a machine is overloaded by Ω(log m)

Rounding Algorithm • Use randomized rounding with deterministic assignments • Assign jobs to machines using the distribution defined by the fractional assignment • If a job picks a machine with load more than c T log log m • c is some constant • The job fails • Let F be the set of failed jobs • Assign failed jobs using greedy (i.e. assign to the least loaded feasible machine)
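The steps on this slide might be sketched as below for unit-size jobs. This is an illustration under stated assumptions, not the authors' implementation: the names `round_online` and `log_log` are hypothetical, log log m uses base 2 and is clamped to at least 1, a job's feasible machines are taken to be the support of its fractional assignment, and the greedy fallback tracks only the load contributed by failed jobs.

```python
import math
import random
from typing import Dict, List, Sequence, Tuple


def log_log(m: int) -> float:
    """log2(log2(m)), clamped to at least 1 (an assumption for small m)."""
    return max(1.0, math.log2(max(2.0, math.log2(max(2, m)))))


def round_online(num_machines: int,
                 fractional_jobs: Sequence[Dict[int, float]],
                 frac_load: float,
                 c: float = 2.0,
                 seed: int = 0) -> Tuple[List[int], List[int]]:
    """Round each job on arrival; return (assignment, indices of failed jobs)."""
    rng = random.Random(seed)
    threshold = c * frac_load * log_log(num_machines)
    sampled_load = [0] * num_machines   # jobs placed by random sampling
    greedy_load = [0] * num_machines    # failed jobs placed by greedy
    assignment: List[int] = []
    failed: List[int] = []
    for j, x in enumerate(fractional_jobs):
        machines = list(x)
        # Sample a machine from the distribution given by x[i] = x_{i,j}.
        pick = rng.choices(machines, weights=[x[i] for i in machines])[0]
        if sampled_load[pick] > threshold:
            # The job fails: fall back to the least-loaded machine in its support.
            failed.append(j)
            pick = min(machines, key=lambda i: greedy_load[i])
            greedy_load[pick] += 1
        else:
            sampled_load[pick] += 1
        assignment.append(pick)
    return assignment, failed


# Hypothetical input: 3 machines, maximum fractional load 1.5.
jobs = [{0: 0.5, 1: 0.5}, {1: 1.0}, {0: 0.25, 2: 0.75}]
assignment, failed = round_online(3, jobs, frac_load=1.5)
```

Every job is placed the moment it arrives, as the online-list model requires; only the bookkeeping differs between sampled and failed jobs.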

Analysis of the Rounding Algorithm • Assume jobs (machines) have at most log m machines (jobs) in the support of their fractional assignment. • Most interesting case • Only care about failed jobs (others have small makespan) • Consider conceptually creating a graph G • Nodes are failed jobs • Two jobs are connected if they share the same machine
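To make the conceptual graph G concrete, here is a short sketch that groups failed jobs into connected components with a union-find structure. The function name and the `job -> machines in its support` input format are assumptions for illustration; the talk uses G only in the analysis.

```python
from typing import Dict, List, Sequence


def failed_job_components(supports: Dict[int, Sequence[int]]) -> List[List[int]]:
    """Connected components of G: nodes are failed jobs, and two jobs are
    adjacent when some machine appears in both of their supports."""
    parent = {job: job for job in supports}

    def find(job: int) -> int:
        # Follow parent pointers to the root, halving the path on the way.
        while parent[job] != job:
            parent[job] = parent[parent[job]]
            job = parent[job]
        return job

    first_job_on_machine: Dict[int, int] = {}
    for job, machines in supports.items():
        for machine in machines:
            if machine in first_job_on_machine:
                # Sharing a machine puts both jobs in the same component.
                parent[find(job)] = find(first_job_on_machine[machine])
            else:
                first_job_on_machine[machine] = job

    groups: Dict[int, List[int]] = {}
    for job in supports:
        groups.setdefault(find(job), []).append(job)
    return sorted(sorted(group) for group in groups.values())


# Hypothetical failed jobs and the machines in their supports.
components = failed_job_components({0: [0, 1], 1: [1, 2], 2: [5], 3: [2, 3]})
```

Jobs 0 and 3 share no machine directly but land in the same component through job 1, while job 2 is isolated.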

Greedy on Failed Jobs • Prove components have polylogarithmic size, say O(log m), with high probability • Greedy is an O(log m')-approximation for an instance with m' machines • Each component is a separate instance with m' = polylog m machines • Greedy gives an O(log m') = O(log log m) approximation to the fractional load

Future Work • How to combine learning with optimization • Can predictions be used to discover improved algorithms? • Theoretical model characterizing good predictions? • Does there exist a generic algorithm for using data?

Thank you! Questions?
