SLIDE 1

Introduction Related Work Main Results Discussion References

Balanced Allocation with Random Walk Based Sampling

Dengwang Tang

Electrical and Computer Engineering Department, University of Michigan, Ann Arbor

Joint work with Vijay Subramanian

October 9, 2018

Dengwang Tang Balanced Allocation with Random Walk

SLIDE 2

Balls in Bins Model

- m balls are placed into n bins sequentially according to some policy.
- The dispatching policy is usually random or partially random.
- The maximum load (i.e. the number of balls in the fullest bin) is usually the quantity of interest.
- Applications:
  - Resource allocation
  - Distributed hash tables
  - Load balancing in cloud computing

SLIDE 3

Example: Random Allocation

Random Allocation Policy:

- Each ball is inserted into a bin chosen uniformly at random.
- The choice of bin for each ball is independent.

Theorem. Let m = n. The maximum load under the random allocation policy is $(1 + o(1)) \frac{\log n}{\log\log n}$ with high probability as n → ∞.
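The random allocation policy can be sketched as a short simulation. This is only an illustration, not code from the talk: the function name `max_load_random_allocation` and the `seed` parameter are assumptions made for this example.

```python
import random

def max_load_random_allocation(n, m=None, seed=0):
    """Throw m balls into n bins uniformly at random (hypothetical helper).

    Returns the load of the fullest bin. By default m = n, as in the theorem.
    """
    m = n if m is None else m
    rng = random.Random(seed)          # seeded so that runs are repeatable
    loads = [0] * n
    for _ in range(m):
        loads[rng.randrange(n)] += 1   # each ball picks its bin independently
    return max(loads)

if __name__ == "__main__":
    for n in (10**3, 10**4, 10**5):
        print(n, max_load_random_allocation(n))
```

Running it for growing n gives a rough feel for how slowly the maximum load grows; a single run is of course no substitute for the theorem.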

SLIDE 4

Example: Power of d choices

Power-of-d-choices policy [Azar et al. 1999]:

- For each ball, sample d bins uniformly and independently.
- Place the ball into the least loaded bin among the d sampled bins.
- Ties are broken arbitrarily.
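A minimal sketch of the three steps above, assuming bins are sampled with replacement and ties go to the first sampled candidate (the policy allows any tie-breaking rule). The name `max_load_power_of_d` and the `seed` parameter are made up for this example.

```python
import random

def max_load_power_of_d(n, d=2, m=None, seed=0):
    """Place m balls into n bins with the power-of-d-choices rule (hypothetical helper)."""
    m = n if m is None else m
    rng = random.Random(seed)
    loads = [0] * n
    for _ in range(m):
        # Sample d candidate bins uniformly and independently.
        candidates = [rng.randrange(n) for _ in range(d)]
        # Pick the least loaded candidate; min() keeps the first one on ties.
        target = min(candidates, key=loads.__getitem__)
        loads[target] += 1
    return max(loads)

if __name__ == "__main__":
    n = 10**5
    for d in (1, 2, 3):
        print(d, max_load_power_of_d(n, d=d))
```

With d = 1 the function reduces to plain random allocation, so printing the results for d = 1, 2, 3 side by side is a quick way to see the effect of the extra choices.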

SLIDE 5

Example: Power of d choices

Theorem. Let m = n. The maximum load under the power-of-d-choices allocation policy is $\frac{\log\log n}{\log d} + \Theta(1)$ with high probability as n → ∞.

Much better than random allocation!

SLIDE 6

Motivation

- At each time, uniform random bins are chosen.
- Is there any other process that returns a uniform random bin?
- Random walk on k-regular graphs.
  - Stationary distribution: uniform on all vertices.
- What about sampling bins using d independent random walks?

SLIDE 7

Motivation

- Randomness (or entropy) is a precious resource in computing.
- How many times per ball do you need to throw a die in order to implement power-of-d-choices with n bins?
- Answer: $d \log_6 n$
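The count $d \log_6 n$ is an entropy count: one throw of a fair six-sided die carries $\log_2 6$ bits, and one uniform bin among n carries $\log_2 n$ bits, so each of the d choices costs $\log_2 n / \log_2 6 = \log_6 n$ throws. A small sketch of the arithmetic (the function name is an assumption for this example):

```python
import math

def dice_throws_per_ball(n: int, d: int) -> float:
    """Entropy-based number of fair-die throws needed for d uniform bins out of n."""
    return d * math.log(n) / math.log(6)

if __name__ == "__main__":
    # With n = 6**4 = 1296 bins, each choice costs about 4 throws,
    # so d = 2 choices cost about 8 throws per ball.
    print(dice_throws_per_ball(6**4, 2))
```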

……

SLIDE 8

Motivation

- In the computer science literature, random walks on certain graphs are used to derandomize randomized algorithms.
- Can we “derandomize” the power-of-d-choices policy?

SLIDE 9

Alternative ways of sampling

- Alon et al. (2007): d = 1 bin is sampled through a non-backtracking random walk of length n on a high-girth expander graph.
- Vöcking (2003): d bins are sampled from d disjoint groups of bins respectively.
- Kenthapadi and Panigrahy (2006): d = 2 bins are sampled from a random edge of a high-degree regular graph.
- Pourmiri (2016): d bins are sampled by a random walk starting from a random vertex each time.
- Godfrey (2008): a random set of $d = \log^{\Theta(1)} n$ bins is chosen each time, where the random set satisfies some “balancedness” property.

SLIDE 11

Alon et al.’s Model

Alon et al. (2007) investigated the maximum load after inserting n balls into n bins based on the location of a non-backtracking random walker.
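A sketch of this allocation scheme on a small graph. The Petersen graph (3-regular, 10 vertices) is used purely as an assumed example; it is far too small to be a meaningful high-girth expander. All function names are hypothetical, and placing the first ball at a uniformly random start vertex is also an assumption.

```python
import random

def petersen_graph():
    """Adjacency lists of the Petersen graph (assumed example, 3-regular)."""
    adj = {v: [] for v in range(10)}
    def add(u, v):
        adj[u].append(v)
        adj[v].append(u)
    for i in range(5):
        add(i, (i + 1) % 5)            # outer 5-cycle
        add(i, i + 5)                  # spoke
        add(5 + i, 5 + (i + 2) % 5)    # inner pentagram
    return adj

def nbrw_step(adj, current, previous, rng):
    """One non-backtracking step: a uniform neighbor other than `previous`."""
    return rng.choice([v for v in adj[current] if v != previous])

def allocate_along_walk(adj, num_balls, seed=0):
    """Drop one ball at each position of a single non-backtracking walker."""
    rng = random.Random(seed)
    loads = {v: 0 for v in adj}
    previous, current = None, rng.choice(sorted(adj))   # uniform start
    for _ in range(num_balls):
        loads[current] += 1
        current, previous = nbrw_step(adj, current, previous, rng), current
    return loads

if __name__ == "__main__":
    print(allocate_along_walk(petersen_graph(), 10))
```

On the first step there is no previous vertex, so the walker chooses among all k neighbors; afterwards it chooses among the k − 1 neighbors that do not undo the last move.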

SLIDE 12

Alon et al.’s Model

[Figure, six animation frames: a graph with vertices labeled 1 to 10, shown together with bins labeled 1 to 10.]

SLIDE 18

Alon et al.’s Model

Theorem (Alon et al.). If the graph G is an expander graph whose girth is greater than $10 \log_{k-1} \log n$, then the maximum load after inserting n balls into n bins is $(1 + o(1)) \frac{\log n}{\log\log n}$ w.h.p. as n → ∞.

- Same performance as random allocation.
- But the randomness used is reduced!

SLIDE 19

Motivation

Alon et al. raised the following open question:

... Let $W_1$ and $W_2$ denote two non-backtracking random walks on an expander of high girth, and suppose that in each step we are given a choice between the two current locations of $W_1$ and $W_2$, and pick the least loaded one. Does the maximal load decrease from $\Theta\left(\frac{\log n}{\log\log n}\right)$ to $\Theta(\log\log n)$ in this setting as well? ...

We answered this question!

SLIDE 20

Main Results

We analyzed two models for allocating n balls into n bins. For both models:

- Bins are associated with the vertices of a k-regular graph $G_n$.
- $G_n$ is fixed throughout the process.
- $W_1[j], W_2[j], \dots, W_d[j]$ are the candidate bins for the j-th ball.
- The j-th ball is allocated to the least loaded bin among $W_1[j], W_2[j], \dots, W_d[j]$.

We consider the case where k and d are fixed and n → ∞.

SLIDE 21

Model I

Model 1. $W_1[j], W_2[j], \dots, W_d[j]$ are independent non-backtracking random walks, except when

- either the random walkers’ paths “intersect”,
- or a reset has not occurred for T steps.

Then $W_1[j], W_2[j], \dots, W_d[j]$ are reset to independent uniform random positions.

Assumption 1. The graph $G_n$ is such that Pr(randomly initialized random walkers intersect before T) < 0.1.
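Model 1 can be sketched as a simulation under some stated assumptions. The slides do not spell out what “intersect” means or exactly when the reset check happens; here a proposed step counts as an intersection if any walker would land on a vertex visited by any walker since the last reset, or if two walkers would land on the same vertex, and the check is made after each ball. The cycle graph and all names are assumptions for this example.

```python
import random

def cycle_graph(n):
    """Adjacency lists of the n-cycle (2-regular), used as an assumed example."""
    return {v: [(v - 1) % n, (v + 1) % n] for v in range(n)}

def model1_max_load(adj, num_balls, d=2, T=4, seed=0):
    """Simulate Model 1 (one interpretation). Returns (max load, number of resets)."""
    rng = random.Random(seed)
    vertices = sorted(adj)
    loads = {v: 0 for v in vertices}

    def reset():
        positions = [rng.choice(vertices) for _ in range(d)]
        # positions, previous positions, visited set, steps since reset
        return positions, [None] * d, set(positions), 0

    positions, previous, visited, age = reset()
    resets = 0
    for _ in range(num_balls):
        # Allocate the ball to the least loaded candidate bin.
        target = min(positions, key=loads.__getitem__)
        loads[target] += 1
        # Propose one non-backtracking step per walker.
        proposal = [rng.choice([v for v in adj[p] if v != q])
                    for p, q in zip(positions, previous)]
        age += 1
        intersect = len(set(proposal)) < d or any(v in visited for v in proposal)
        if intersect or age >= T:
            positions, previous, visited, age = reset()
            resets += 1
        else:
            previous, positions = positions, proposal
            visited.update(proposal)
    return max(loads.values()), resets

if __name__ == "__main__":
    n = 10**4
    print(model1_max_load(cycle_graph(n), n, d=2, T=int(n ** (1 / 3))))
```

Under this interpretation a candidate bin has never been visited since the last reset, so its load is still the load it had at that reset.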

SLIDE 22

Model I

[Figure, nine animation frames: two walkers on a 3-regular graph with vertices labeled 1 to 10, shown together with bins labeled 1 to 10. The walkers’ paths intersect (“intersected!”), after which the walkers are reset (“Reset!”).]

n = 10 bins, d = 2 walkers/choices, G = 3-regular graph, T = 4

SLIDE 31

Model I

Theorem (T., Subramanian). In Model 1, for any $T = o(\sqrt{n})$, if Assumption 1 holds, then the maximum load does not exceed $\frac{\log\log n}{\log d} + \Theta(1)$ with high probability as n → ∞.

- Same performance as power-of-d-choices!
- But less randomness is used!

SLIDE 32

Model I

Comparing the randomness used per reset period:

- For power-of-d: (length of reset period) $\cdot\ d \log_6 n$.
- For Model 1: $d \log_6(kn)$ + (length of reset period − 1) $\cdot\ d \log_6(k - 1)$.

If G is a cycle graph and $T = \lfloor n^{1/3} \rfloor$, then the number of dice throws per ball is ≈ $d\, n^{-1/3} \log_6(2n) \to 0$.
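The two counts can be evaluated numerically. The sketch below simply codes the two formulas from this slide; the function names are assumptions, and the reset period is taken to be exactly T balls, ignoring earlier resets caused by intersections. For a cycle graph k = 2, so $\log_6(k - 1) = 0$ and only the reset itself costs randomness.

```python
import math

def log6(x):
    return math.log(x) / math.log(6)

def dice_power_of_d(period, d, n):
    """Dice throws used by power-of-d over `period` balls."""
    return period * d * log6(n)

def dice_model1(period, d, n, k):
    """Dice throws used by Model 1 over one reset period of `period` balls."""
    return d * log6(k * n) + (period - 1) * d * log6(k - 1)

if __name__ == "__main__":
    n, d, k, T = 10**6, 2, 2, 100   # cycle graph, T close to n**(1/3)
    print("power-of-d, per ball:", dice_power_of_d(T, d, n) / T)
    print("Model 1,    per ball:", dice_model1(T, d, n, k) / T)
```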

SLIDE 33

Key Insights

- Sufficient exploration of bins
  - Resets are frequent.
- Decorrelating the load vector and the random walker positions
  - The load of a sampled bin equals its load at the last reset.
  - The distribution of the random walker positions is always independent uniform.
  - Relate the distribution of the load of a sampled bin to the empirical distribution of bin loads at the last reset time.

SLIDE 34

Model II

Model 2 (Alon et al.’s Model). $W_1[j], W_2[j], \dots, W_d[j]$ are independent non-backtracking random walks on G.

Assumption 2.

- $\mathrm{girth}(G_n) = \Omega(\log_{k-1} n)$
- $(G_n)_n$ is a sequence of expander graphs.

SLIDE 35

Model II

Definition (Alon et al.). Let $(G_n)_n$ be a sequence of k-regular graphs with $|G_n| = n$. Let $k = \lambda_{1,n} \ge \lambda_{2,n} \ge \cdots \ge \lambda_{n,n}$ be the eigenvalues of $G_n$. Define $\lambda_n := \max\{\lambda_{2,n}, |\lambda_{n,n}|\}$. $(G_n)_n$ is said to be a sequence of expander graphs if $\limsup_{n \to \infty} \lambda_n < k$.

Explicit constructions of graphs that satisfy Assumption 2 exist, e.g. Lubotzky, Phillips, and Sarnak’s Ramanujan graphs (1987).

SLIDE 36

Model II

Theorem (T., Subramanian). In Model 2, if Assumption 2 holds, then the maximum load does not exceed $\frac{\log\log n}{\log d} + \Theta(1)$ with high probability as n → ∞.

SLIDE 37

Key Insights

- Sufficient exploration of bins:
  - Non-backtracking random walks on expanders have O(log n) mixing time [Alon et al. 2007].
  - The mixing effect replaces resets.
- Decorrelating the queue length vector and the random walker locations:
  - In a high-girth graph, “intersections” are unlikely to happen within one mixing interval.
  - The number of visits to a vertex within one mixing interval is bounded by Θ(1) w.p. 1.

SLIDE 38

Discussion

- In load balancing applications, does the graph G cost too much memory?
  - No! If G is a graph with structure, we can compute the neighbors rather than storing them.
  - For example, the LPS graph is a Cayley graph of PSL(2, q). One can find the neighbors of a given vertex by multiplying 2 × 2 matrices!
- Model 1 could cost memory through storing the paths of the random walkers.
  - But Model 2 eliminates this cost through a stronger assumption on the graph.
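The idea of computing neighbors instead of storing them can be shown on a much simpler structured graph than the LPS graph. The sketch below uses a circulant graph (a Cayley graph of the cyclic group), chosen here only because its neighbor rule fits in one line; it has short cycles and is not claimed to satisfy Assumption 2. The function name and the offsets are assumptions.

```python
def circulant_neighbors(v, n, offsets=(1, 5)):
    """Neighbors of vertex v in the circulant graph on n vertices.

    The neighbors are v + s and v - s (mod n) for each offset s, so no
    adjacency list has to be kept in memory.
    """
    plus = {(v + s) % n for s in offsets}
    minus = {(v - s) % n for s in offsets}
    return sorted(plus | minus)

if __name__ == "__main__":
    print(circulant_neighbors(0, 20))   # [1, 5, 15, 19]
```

For the LPS graph the same pattern would apply with group multiplication by a fixed set of 2 × 2 matrices in place of addition modulo n.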

SLIDE 39

Discussion

- Why non-backtracking random walks? Because simple random walks don’t perform well!

Theorem (T., Subramanian). For Model 2, if we replace the non-backtracking random walks with simple random walks, then the maximum load is greater than Θ(log n) w.h.p. as n → ∞.

- Proof idea: short-term revisits are frequent.
- It’s even worse than random allocation!
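The “short-term revisits” in the proof idea can be observed directly. The sketch below counts how often a walker steps straight back to the vertex it just left. It uses the hypercube of dimension `dim` as an assumed k-regular example graph (neighbors are obtained by flipping one bit); the function name is hypothetical. A simple random walk steps back with probability 1/k at every step, while a non-backtracking walk never does.

```python
import random

def immediate_returns(steps, dim=4, non_backtracking=False, seed=0):
    """Count steps on which the walker returns to the vertex it just left."""
    rng = random.Random(seed)
    current, previous = 0, None
    returns = 0
    for _ in range(steps):
        # Hypercube neighbors: flip each of the `dim` bits in turn.
        neighbors = [current ^ (1 << i) for i in range(dim)]
        if non_backtracking and previous is not None:
            neighbors.remove(previous)
        nxt = rng.choice(neighbors)
        if nxt == previous:
            returns += 1
        current, previous = nxt, current
    return returns

if __name__ == "__main__":
    print("simple walk:          ", immediate_returns(10_000))
    print("non-backtracking walk:", immediate_returns(10_000, non_backtracking=True))
```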

SLIDE 40

Discussion

- What if we have a weaker girth assumption?

Theorem (T., Subramanian). If $G_n$ is a graph such that each vertex is contained in a cycle of length g = g(n), then the maximum load yielded by Model 2 is larger than $\Theta\left(\frac{\log n}{g}\right)$ almost surely.

- Proof idea: short-term revisits are frequent.
- If $g = o\left(\frac{\log n}{\log\log n}\right)$, Model 2 could perform worse than power-of-d!

SLIDE 41

Discussion

Theorem (T., Subramanian). If $(G_n)_n$ is a sequence of expander graphs with $\mathrm{girth}(G_n) = \Omega\left(\frac{\log n}{\log\log n}\right)$, then the maximum load yielded by Model 2 is larger than $\Theta(\log\log n)$ with high probability.

Not necessarily the same performance as power-of-d, but close!

SLIDE 42

Conclusion

Random walk based sampling can work for balls-in-bins. However, we need to...

- ensure that short-term revisits are rare, e.g. non-backtracking random walks on high-girth graphs, resets at intersections;
- ensure sufficient exploration of bins, e.g. expander graphs, resets to uniform random positions.

SLIDE 43

Future Work

- Queuing system setting: low traffic and high traffic
- Lower bound on the maximum load for Model 1 and Model 2
- Apply random walk based methods to other randomized algorithms

SLIDE 44

Acknowledgments

- Harsha Honappa
- Yang Xiao
- National Science Foundation

SLIDE 45

References

Y. Azar, A. Z. Broder, A. R. Karlin, and E. Upfal (1999). Balanced Allocations. SIAM Journal on Computing, Vol. 29, No. 1, 180-200.

N. Alon, I. Benjamini, E. Lubetzky, and S. Sodin (2007). Non-backtracking random walks mix faster. Communications in Contemporary Mathematics, Vol. 9, No. 4, 585-603.

SLIDE 46

Thanks
