Load Balancing in Distributed Computing Over Wireless LAN: Effects of Network Delay (PowerPoint presentation)


SLIDE 1

UNM

Load Balancing in Distributed Computing Over Wireless LAN: Effects of Network Delay

  • S. Dhakal, M.M. Hayat, M. Elyas, J. Ghanem, C.T. Abdallah

Department of Electrical and Computer Engineering University of New Mexico Albuquerque, NM 87131-0001, USA

SLIDE 2

The Load Balancing Group at UNM

This work is supported by the National Science Foundation under Information Technology Research (ITR) grants No. ANI-0312611 and ANI-0312182.

SLIDE 3

Goals

  • Determine the statistics of, and a model for, load-transfer delay in a wireless network in the context of distributed computing.
  • Determine the analytical solution to optimal load balancing (LB) in a two-node distributed system.
  • Understand the interplay between the intensity of LB and delay, and their effects on the workload completion time.
  • Verify LB performance experimentally as well as through Monte Carlo simulation.

SLIDE 4

Overview

  • Description of the LB policy.
  • Network delay characterization.
  • Regeneration-based queuing model for one-shot LB.
  • Results: analytical, experimental, and Monte Carlo simulation.
  • Conclusion and references.

SLIDE 5

Schematic Description of the Distributed System

[Figure: nodes N1, N2, N3 connected by communication channels; new tasks arrive at each node and served tasks depart.]

In this work, we consider the initial-value problem.

SLIDE 6

Description of the LB Policy

  • Nodes exchange information about their load states (with a communication delay).
  • At the LB instant, viz. at t = t_b, node l compares its queue length Q_l(t_b) with the system average load and computes its excess load.
  • A fraction K ∈ [0, 1] of the excess load, where K is the LB intensity (gain), is transferred to the other nodes (with a load-transport delay), the share sent to node k being p_{kl}.
  • Load partitioning: ∑_{k ≠ l} p_{kl} = 1.

SLIDE 7

LB Policy

Mathematically, at t = t_b, the load dispatched from node l to node k is

L_{kl}(t) = K p_{kl} [ Q_l(t) − (1/n) ∑_{j=1}^{n} Q_j(t − η_{lj}) ] · u( Q_l(t) − (1/n) ∑_{j=1}^{n} Q_j(t − η_{lj}) ),  k ≠ l,

and L_{kl}(t) = 0 otherwise, where η_{lj} is the communication delay from node j to node l, u(·) is the unit-step function, K ∈ [0, 1], and p_{ll} = 0. The partition fractions are

p_{kl} = (1/(n − 2)) [ 1 − Q_k(t − η_{lk}) / ∑_{i ≠ l} Q_i(t − η_{li}) ],  k ≠ l,

and p_{kl} = 0 otherwise; for n = 2, p_{kl} = 1 for all k ≠ l.

SLIDE 8

One-shot Load Balancing

  • Each node sends its queue-length information to the other nodes at time t = 0.
  • All nodes execute load balancing at a common instant t_b with a common gain K.

Remarks:

  • t_b should be large enough that each node is informed of the initial load state, but small enough that little time is wasted waiting for communication.
  • K should be large enough to tackle the variability in processor speed, but small enough that the transfer of load does not take too long.

[Figure: timeline for nodes N1 and N2 with initial queues Q1(0) and Q2(0); the communication delay precedes the balancing instant t_b, the transfer delay follows it, node 2 may sit idle until the transferred load arrives, and TC1, TC2 mark the completion times.]
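The tradeoff above can be explored with a Monte Carlo sketch of a single one-shot LB realization. This is a simplified illustration under assumptions not stated on the slide: node 1 is the overloaded node, both nodes are informed by t_b, service times are exponential, and the load transfer delay is a sum of per-task exponential delays. All function names are illustrative.

```python
import random

def drain_time(q, rate, rng):
    """Time to serve q tasks, each with an exponential service time."""
    return sum(rng.expovariate(rate) for _ in range(q))

def queue_at(t, q0, rate, rng):
    """Tasks remaining at time t, starting from q0 tasks at time 0."""
    elapsed, left = 0.0, q0
    while left > 0:
        s = rng.expovariate(rate)
        if elapsed + s > t:
            break
        elapsed, left = elapsed + s, left - 1
    return left

def one_shot_completion(m, n, rate1, rate2, mean_task_delay, K, tb, rng):
    """Completion time of one realization of one-shot LB (node 1 overloaded)."""
    q1, q2 = queue_at(tb, m, rate1, rng), queue_at(tb, n, rate2, rng)
    excess = max(0.0, q1 - (q1 + q2) / 2.0)
    L = int(round(K * excess))                      # tasks sent 1 -> 2 at t_b
    transfer = sum(rng.expovariate(1.0 / mean_task_delay) for _ in range(L))
    finish1 = tb + drain_time(q1 - L, rate1, rng)
    own_done = tb + drain_time(q2, rate2, rng)      # node 2 drains its own queue
    finish2 = max(own_done, tb + transfer) + drain_time(L, rate2, rng)
    return max(finish1, finish2)
```

Averaging many realizations for a skewed initial load such as (m, n) = (100, 0) shows the benefit of balancing: with K = 1 the expected completion time is far below the K = 0 (no balancing) case, despite the transfer delay.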

SLIDE 9

One-shot LB over Wireless LAN

  • A. A queuing model for one-shot LB (described later) has been developed to accommodate:
  • randomness of the communication delay,
  • randomness of the transfer delay,
  • load-size dependence of the transfer delay,
  • variability in processor speed among nodes.
  • B. Delays in wireless networks exhibit heavy random fluctuations.
  • C. The wireless testbed is the most suitable platform to validate the predictive ability of the analytical model.
  • D. Delay-probing experiments are conducted over the wireless LAN to estimate the channel statistics, which are later integrated into the model to develop the optimal LB policy.

SLIDE 10

Delay Probing Experiments

Setup (the same setup is used for the LB experiments):

  • Testbed: two 1 GHz Transmeta-processor machines communicating over the ECE wireless LAN (802.11b access point).
  • Each task is one row with a fixed number of elements, where each element is generated uniformly and independently between 10 B and 100 B:

    [ x_11 x_12 ... x_1n ]   <- one task (row)
    [ x_21 x_22 ... x_2n ]
    [  ...  ...  ...  ... ]
    [ x_m1 x_m2 ... x_mn ]   m rows = m tasks
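A workload of this shape is easy to generate. The sketch below (function name illustrative) treats each element as a byte count drawn uniformly from 10 to 100:

```python
import random

def make_load(m, n_elems, rng):
    """Generate m tasks; each task is one row of n_elems elements,
    each element drawn uniformly and independently from 10 B to 100 B."""
    return [[rng.randint(10, 100) for _ in range(n_elems)] for _ in range(m)]
```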

SLIDE 11

Delay Probing Experiments (cont.)

  • In this experiment, tasks are independent, and the size of each task is distributed as U([1 KB, 10 KB]). A load is a collection of tasks.
  • Task execution time is the time to multiply a row by a static matrix. The randomness in processing speed is introduced by the task size.
  • Both nodes use TCP to transfer loads of different sizes between them.
  • Load transfer delay is measured from the application layer's perspective, i.e., it is the delay in transferring the entire load.
  • Communication delay is the delay in transmitting a small time-stamped UDP packet carrying information about the number of tasks.

SLIDE 12

Empirical pdf of Delay

[Plots: empirical pdfs of the transfer delay (per task) and the communication delay.]

  • The average transfer delay per task is approximated as ~ exp(1/0.35).
  • The communication delay is approximated as ~ exp(1/0.089).
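Fitting an exponential pdf to delay samples reduces to a one-parameter maximum-likelihood estimate: the MLE rate is the reciprocal of the sample mean. The sketch below uses synthetic samples with mean 0.35 s in place of real probe data:

```python
import random

def exp_rate_mle(samples):
    """Maximum-likelihood rate of an exponential pdf: n / sum(samples)."""
    return len(samples) / sum(samples)

# Synthetic stand-in for probe samples: exponential delays with mean 0.35 s.
rng = random.Random(3)
delays = [rng.expovariate(1.0 / 0.35) for _ in range(20000)]
rate = exp_rate_mle(delays)   # close to 1/0.35, i.e. the fitted mean is ~0.35 s
```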
SLIDE 13

Empirical Estimate of Transfer Delay

  • Each load (batch of tasks) was sent 25 times and the average delay was taken.
  • Traffic in the network changed during the time of the experiments.
  • Fitting the parametric model

θ(L) = d_min + d (1 − e^{−βL}),

with d_min = 0.162 s, d = 0.15, β = 0.085.

  • The average transfer delay grows monotonically with the load size after some initial floor.
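The parametric model (in the form reconstructed above, which is an assumption) can be fitted by brute-force least squares; a grid search is enough for three parameters. Function names are illustrative:

```python
import math

def theta(L, d_min, d, beta):
    """Parametric transfer-delay model (reconstructed form, an assumption)."""
    return d_min + d * (1.0 - math.exp(-beta * L))

def fit_theta(loads, delays, grids):
    """Brute-force least-squares fit of (d_min, d, beta) over parameter grids."""
    best, best_sse = None, float("inf")
    for d_min in grids["d_min"]:
        for d in grids["d"]:
            for beta in grids["beta"]:
                sse = sum((theta(L, d_min, d, beta) - y) ** 2
                          for L, y in zip(loads, delays))
                if sse < best_sse:
                    best, best_sse = (d_min, d, beta), sse
    return best
```

On noiseless synthetic data generated from the reported fit (d_min = 0.162 s, d = 0.15, β = 0.085), the search recovers the true parameters when they lie on the grid.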

SLIDE 14

Empirical Estimate of Transfer Delay (cont.)

  • The previous experiments are modified to obtain a time-averaged estimate of the transfer delay.

[Plot: transfer delay (per task).]

  • The transfer delay per task can be approximated with an exponential pdf.
  • The time-averaged transfer delay grows monotonically, as in the parametric model.

SLIDE 15

Regeneration-Principle-Based Queuing Model to Calculate the Expected Overall Completion Time

Setup: two-node system. (The model has a trivial extension to n > 2, with little increase in algebraic complexity.)

  • m : initial tasks at node 1
  • n : initial tasks at node 2
  • t_b : LB instant
  • T_{m,n}(t_b) : overall completion time
  • K : balancing gain

Goal: calculate

μ_{m,n}^{(0,0)}(t_b) = E[ T_{m,n}(t_b) ],

where (0,0) is the knowledge state of the system at time t = 0; the possible knowledge states are (0,0), (0,1), (1,0), (1,1).

SLIDE 16

Example of the Regeneration Technique in the Context of Gambling

  • A gambler starts with an initial fortune of $x.
  • He bids a dollar at a time and wins each hand with probability p.
  • P(x) := P{ the gambler's fortune hits $20 | initial fortune = x }.
  • Regeneration event: the outcome of the first hand.

P(x) = p·P(x+1) + (1−p)·P(x−1), with P(0) = 0, P(20) = 1
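This boundary-value recursion can be solved numerically by fixed-point iteration, a minimal sketch of the regeneration idea (the function name is illustrative):

```python
def hit_top_probs(p, target=20, iters=20000):
    """P(x) = P{fortune hits `target` before 0 | start at x}, solved by
    repeatedly applying P(x) = p*P(x+1) + (1-p)*P(x-1) with the boundary
    conditions P(0) = 0 and P(target) = 1."""
    P = [0.0] * (target + 1)
    P[target] = 1.0
    for _ in range(iters):
        P = ([P[0]]
             + [p * P[x + 1] + (1 - p) * P[x - 1] for x in range(1, target)]
             + [P[target]])
    return P
```

For a fair game (p = 0.5) the solution is the linear profile P(x) = x/20; for p ≠ 0.5 it matches the classical closed form P(x) = (1 − r^x)/(1 − r^20) with r = (1−p)/p.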

SLIDE 17

Regeneration Events and Knowledge States

  • Regeneration event: the first event to occur among:
  • the completion of a task by any node,
  • the arrival of a communication sent by any node.
  • Upon the occurrence of a regeneration event:
  • the stochastic dynamics of the new queues remain unchanged;
  • the new queues have a different set of initial conditions, viz.,
  • a different load distribution if the event is a task completion,
  • a different knowledge state if the event is a communication arrival.
  • Knowledge states: (k1, k2), where ki ∈ {0, 1}.
  • (0,0) is the knowledge state at t = 0, where neither node knows the load-state information of the other.
  • (0,0) transits to (0,1) if node 2 receives the communication from node 1.
SLIDE 18

Regeneration random variable: τ

τ = min(W, X, Y, Z), where:

  • W = waiting time for executing one task at node 1, W ~ exp(λ_D1)
  • X = waiting time for executing one task at node 2, X ~ exp(λ_D2)
  • Y = arrival time of the communication from node 1 to node 2, Y ~ exp(λ_21)
  • Z = arrival time of the communication from node 2 to node 1, Z ~ exp(λ_12)

Hence τ is itself exponential:

f_τ(t) = λ e^{−λt} u(t), where λ = λ_D1 + λ_D2 + λ_21 + λ_12.
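The minimum-of-exponentials property is easy to check empirically. The sketch below uses the rate estimates reported later in the deck and verifies that the sample mean of τ approaches 1/λ:

```python
import random

# tau = min(W, X, Y, Z) of independent exponentials is exponential with rate
# lambda = sum of the individual rates. Rates below are the deck's estimates.
rates = {"W": 0.754, "X": 0.7847, "Y": 0.119, "Z": 0.119}
lam = sum(rates.values())   # total regeneration rate

rng = random.Random(11)
taus = [min(rng.expovariate(r) for r in rates.values()) for _ in range(50000)]
mean_tau = sum(taus) / len(taus)   # should approach 1/lam
```

The same construction also gives the branching probabilities used in the regeneration equation: the first event is, e.g., a node-1 task completion with probability λ_D1/λ.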

SLIDE 19

Regeneration Equation: (0,0) Case

μ_{m,n}^{(0,0)}(t_b) = ∫_{t_b}^{∞} f_τ(s) [ t_b + μ_{m,n}^{(0,0)}(0) ] ds
  + (λ_D1/λ) ∫_0^{t_b} f_τ(s) [ s + μ_{m−1,n}^{(0,0)}(t_b − s) ] ds
  + (λ_21/λ) ∫_0^{t_b} f_τ(s) [ s + μ_{m,n}^{(0,1)}(t_b − s) ] ds
  + (λ_D2/λ) ∫_0^{t_b} f_τ(s) [ s + μ_{m,n−1}^{(0,0)}(t_b − s) ] ds
  + (λ_12/λ) ∫_0^{t_b} f_τ(s) [ s + μ_{m,n}^{(1,0)}(t_b − s) ] ds

The first term covers the case τ > t_b, in which no regeneration event occurs before the balancing instant; the remaining terms condition on which event occurs first.

SLIDE 20

Regeneration Equation

For a general knowledge state (k1, k2):

μ_{m,n}^{(k1,k2)}(t_b) = ∫_{t_b}^{∞} f_τ(s) [ t_b + μ_{m,n}^{(k1,k2)}(0) ] ds
  + (λ_D1/λ) ∫_0^{t_b} f_τ(s) [ s + μ_{m−1,n}^{(k1,k2)}(t_b − s) ] ds
  + (λ_21/λ) ∫_0^{t_b} f_τ(s) [ s + μ_{m,n}^{(k1,1)}(t_b − s) ] ds
  + (λ_D2/λ) ∫_0^{t_b} f_τ(s) [ s + μ_{m,n−1}^{(k1,k2)}(t_b − s) ] ds
  + (λ_12/λ) ∫_0^{t_b} f_τ(s) [ s + μ_{m,n}^{(1,k2)}(t_b − s) ] ds

SLIDE 21

Regeneration Equation in Differential Form

dμ_{m,n}^{(k1,k2)}(t_b)/dt_b = λ_D1 μ_{m−1,n}^{(k1,k2)}(t_b) + λ_D2 μ_{m,n−1}^{(k1,k2)}(t_b) + λ_21 μ_{m,n}^{(k1,1)}(t_b) + λ_12 μ_{m,n}^{(1,k2)}(t_b) − λ μ_{m,n}^{(k1,k2)}(t_b) + 1

To solve these equations, we need to solve for the initial conditions μ_{m,n}^{(k1,k2)}(0).

[Figure: after balancing, node 1 holds m tasks and node 2 holds n tasks, with loads L12 and L21 in transit (arriving at TA1 and TA2); TC1 and TC2 are the two nodes' completion times.]

μ_{m,n}^{(k1,k2)}(0) = E[ T_C1 ∨ T_C2 ]

Exploiting the independence of T_C1 and T_C2, we obtain:

μ_{m,n}^{(k1,k2)}(0) = ∫_0^{∞} t [ f_{T_C1}(t) F_{T_C2}(t) + f_{T_C2}(t) F_{T_C1}(t) ] dt
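The initial-condition integral above can be evaluated numerically. As a simplified sketch, assume each node's completion time T_Ci is the sum of q_i i.i.d. exponential service times, i.e. Erlang(q_i, λ_Di), and ignore in-transit load; the function names are illustrative:

```python
import math

def erlang_pdf(t, k, lam):
    """pdf of an Erlang(k, lam) random variable (sum of k exponentials)."""
    return lam ** k * t ** (k - 1) * math.exp(-lam * t) / math.factorial(k - 1)

def erlang_cdf(t, k, lam):
    """cdf of an Erlang(k, lam) random variable."""
    return 1.0 - math.exp(-lam * t) * sum((lam * t) ** j / math.factorial(j)
                                          for j in range(k))

def expected_max_completion(k1, lam1, k2, lam2, steps=20000):
    """E[T_C1 v T_C2] via the integral above, with T_Ci ~ Erlang(k_i, lam_i)."""
    T = 10.0 * (k1 / lam1 + k2 / lam2)   # integration horizon (tail negligible)
    h = T / steps
    total = 0.0
    for i in range(1, steps + 1):
        t = i * h
        total += t * (erlang_pdf(t, k1, lam1) * erlang_cdf(t, k2, lam2)
                      + erlang_pdf(t, k2, lam2) * erlang_cdf(t, k1, lam1)) * h
    return total
```

Sanity checks: E[max(T_C1, T_C2)] lies between max(E[T_C1], E[T_C2]) and E[T_C1] + E[T_C2], and for two unit-rate exponentials the exact value is 3/2.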

SLIDE 22

Technique to Solve the Regeneration Equation

Solve the differential equations at each level of the recursion tree, then move one level up at a time:

  • Step A: solve for μ_{m,n}^{(1,1)}(t_b), the fully informed state, building up from the lower-order terms μ_{m−1,n}^{(1,1)}(t_b) and μ_{m,n−1}^{(1,1)}(t_b).
  • Step B: solve for μ_{m,n}^{(0,1)}(t_b), which couples to μ_{m−1,n}^{(0,1)}(t_b), μ_{m,n−1}^{(0,1)}(t_b), and μ_{m,n}^{(1,1)}(t_b).
  • Step C: likewise solve for μ_{m,n}^{(1,0)}(t_b).
  • Step D: finally solve for μ_{m,n}^{(0,0)}(t_b), which couples to μ_{m−1,n}^{(0,0)}(t_b), μ_{m,n−1}^{(0,0)}(t_b), μ_{m,n}^{(0,1)}(t_b), and μ_{m,n}^{(1,0)}(t_b).

SLIDE 23

Empirical Estimates of Network Delay and Processor Speed

From the delay-probing experiments, the following statistics were collected for use in the analytical model:

  • Average transfer delay parameters: d_min = 0.3462 s, d = 0.4, β = 0.2
  • Average communication rate: λ_12 = λ_21 = 0.119 s⁻¹
  • Average processor speeds: λ_D1 = 0.754 s⁻¹, λ_D2 = 0.7847 s⁻¹

SLIDE 24

Results: Optimization over t_b

[Plots: experimental results with fixed gain K = 1; theoretical predictions with K = 1.]

  • Around t_b = 0 s, the performance is very sensitive to the balancing instant.
  • In this region, the system knowledge state is likely to be hybrid, (0,1) or (1,0), and hence the LB results in a severely uneven distribution of the load.
  • The completion time drops significantly, to 64 s, if informed LB is done.
  • The experimental results (averaged over 20 realizations) agree fairly well with the theoretical prediction.

SLIDE 25

Results: Optimization over K

[Plots: experimental results with t_b = 100 ms; theoretical and simulation (+) predictions with t_b = 100 ms.]

  • K reduces the effect of hybrid (blind + informed) LB.
  • The optimal gain K is less than unity in all three sets of results: a conservative balancing policy outperforms full-extent LB.
  • Better agreement among the results can be obtained by:
  • realizing more experimental runs for a given K,
  • using time-averaged channel parameters in the queuing model,
  • improving the load transfer delay model.
SLIDE 26

Results of the Optimal LB Policy

Initial Workload (m,n) | Optimal K | Optimal t_b (s) | μ_{m,n}^{(0,0)}(t_b) (s) | Avg. Completion Time, Experiment (s)
(50,50)   | 1   | 7 | 41.78 | 40.08
(70,50)   | 0.9 | 4 | 49.58 | 48.53
(20,40)   | 1   | 5 | 25.3  | 25.78
(100,0)   | 0.8 |   | 49.7  | 46.59
(100,50)  | 0.9 | 5 | 61.7  | 62.53

  • Even for a balanced initial load state, viz. (m,n) = (50,50), there is an optimal K and an optimal t_b, due to the difference in the average processing speeds of the two nodes.

SLIDE 27

Conclusion

  • A regeneration-theory-based analytical model for LB has been verified through real-time LB experiments over the wireless LAN.
  • For every initial workload distribution, there exists an optimal K and an optimal t_b that minimize the overall completion time.
  • The optimal performance of the LB policy is attained through informed balancing.
  • In a delay-limited environment, a conservative choice of the balancing gain better suits uncertainties in the processor speed and delay.

SLIDE 28

Present Work

  • LB Policy with Node Failure.
  • Dynamic Load Balancing:
  • load (external to the system) arrives randomly at a certain rate;
  • at every arrival of this load, only the receiver node executes the optimal one-shot LB;
  • this is a sub-optimal policy that utilizes the optimal one-shot LB repeatedly.

[Plot: μ_{100,50}^{(0,0)}(t_b) vs. the balancing instant t_b (s).]

SLIDE 29

References

[1] M. M. Hayat et al., "Dynamic time delay models for load balancing. Part II: Stochastic analysis of the effect of delay uncertainty," Advances in Time Delay Systems, LNCSE vol. 38, pp. 355-368, Springer-Verlag, 2004.
[2] S. Dhakal et al., "On the optimization of load balancing in distributed networks in the presence of delay," Advances in Communication Control Networks, LNCIS vol. 308, pp. 223-244, Springer-Verlag, 2004.
[3] J. Ghanem et al., "Load balancing in distributed systems with large time delays: Theory and experiment," Proceedings of the IEEE/CSS 12th Mediterranean Conference on Control and Automation (MED '04), Aydin, Turkey, June 2004.
[4] F. Baccelli and P. Bremaud, Elements of Queueing Theory: Palm-Martingale Calculus and Stochastic Recurrences. New York: Springer-Verlag, 1994.