SLIDE 1

International Workshop on Data-Aware Distributed Computing June 24, 2008

A Generalized Replica Placement Strategy to Optimize Latency in a Wide Area Distributed Storage System

John A. Chandy

Department of Electrical and Computer Engineering

SLIDE 2

Distributed Storage

  • Local area network
      – Spread data across multiple networked nodes
      – Parallelism and higher throughput
  • Wide-area network
      – Instead of splitting data for scalability, replicate data for availability
      – Improves latency

SLIDE 3

Replica Placement

  • Where do you put the replicas?
      – Optimization problem
          • Minimize latency or maximize availability
          • Constraints: storage capacity, load balancing
      – Significant existing work
          • Latency optimization
              – Greedy algorithm, Qiu et al.
              – HotZone, Szymaniak et al.
                  » Popularity based
              – Lat-cdn, Pallis et al.
                  » Heuristic approach

SLIDE 4

Replica Placement

      – Availability optimization
          • Van Renesse
              – Place replicas until desired availability is reached
          • Farsite
              – Hill-climbing approach to replica placement
          • Xin et al.
              – Takes into account bimodal availability

SLIDE 5

Replica Placement

  • What's the problem?
      – Existing approaches assume that objects are completely replicated
      – Full replication has significant overhead
      – Use erasure codes instead
          • Less overhead
          • Better reliability than parity
          • Placement is much more complicated

SLIDE 6

Problem Formulation

  • K data objects
  • N storage nodes
  • C clients
  • Each object is split into n fragments, of which m fragments must be recovered to reconstruct the object
      – m = n
          • no redundancy
      – m = n-1
          • parity
      – m = 1
          • replication
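
The special cases of (n, m) listed above can be captured in a few lines. This is only an illustrative sketch: the function names are hypothetical, and the overhead figure n/m assumes every fragment is 1/m of the object size, which holds for ideal erasure codes but is not stated on the slide.

```python
def redundancy_scheme(n: int, m: int) -> str:
    """Name the scheme implied by n stored fragments with m needed to rebuild."""
    if not 1 <= m <= n:
        raise ValueError("need 1 <= m <= n")
    if m == n:
        return "no redundancy"
    if m == 1:
        # Checked before parity, so n = 2, m = 1 is reported as replication.
        return "replication"
    if m == n - 1:
        return "parity"
    return "erasure code"


def storage_overhead(n: int, m: int) -> float:
    """Stored bytes per byte of data, ASSUMING each fragment is 1/m of the object."""
    return n / m
```

With n = 8 and m = 4, the setting used later in the evaluation, the assumed overhead is 2x, against 8x for eight full replicas.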

SLIDE 7

Problem Formulation

  • Placement problem
      – Place fragments of each object on n of the N storage nodes
      – x_jk = 1 if a fragment of object k is placed on storage node j
  • Assignment problem
      – For each object, assign each client to m of the n storage nodes where the object fragments are placed
      – y_ijk = 1 if client i retrieves a fragment of object k from storage node j

SLIDE 8

Problem formulation

  • Overall objective is to minimize average latency
  • Cost function is

$$F(X, Y) = \sum_{i \in C} \sum_{j \in N} \sum_{k \in K} y_{ijk} \, l_{ij}$$

    where l_ij is the latency between client i and storage node j
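
A minimal sketch of evaluating this cost on small inputs follows. The nested-list layout (`y[i][j][k]` for client i, node j, object k, and `latency[i][j]`) and the function name are assumptions made for illustration.

```python
def total_latency(y, latency):
    """Sum latency[i][j] over every (client i, node j, object k) with y[i][j][k] == 1."""
    return sum(
        latency[i][j] * y_ijk
        for i, per_client in enumerate(y)
        for j, per_node in enumerate(per_client)
        for y_ijk in per_node
    )


# Two clients, two nodes, one object: each client reads from its nearest node.
latency = [[1, 3],
           [2, 1]]
y = [[[1], [0]],
     [[0], [1]]]
cost = total_latency(y, latency)
```

The placement X does not appear in the sum itself; it only restricts which y values are allowed.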

SLIDE 9

Problem formulation

  • Constraints:
      – x is a binary variable
      – y is a binary variable
      – Each object k has n fragments
      – Each client i requests m fragments of object k

$$x_{jk} \in \{0, 1\}$$

$$y_{ijk} \in \{0, 1\}$$

$$\sum_{j \in N} x_{jk} = n \quad \forall k$$

$$\sum_{j \in N} y_{ijk} = m \quad \forall i, k$$

SLIDE 10

Problem formulation

  • Constraints:
      – Client i should request a fragment of object k from storage node j only if that node stores the fragment
      – Storage allocation balancing: each node j stores the same number of fragments
      – Load balancing: each node j services the same number of clients

$$y_{ijk} \le x_{jk} \quad \forall i, j, k$$

$$\sum_{k \in K} x_{jk} = \frac{nK}{N} \quad \forall j$$

$$\sum_{k \in K} \sum_{i \in C} y_{ijk} = \frac{mCK}{N} \quad \forall j$$
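
The full constraint set can be checked mechanically on a toy instance. The sketch below assumes `x[j][k]` and `y[i][j][k]` are nested lists of 0/1 values indexed by node j, object k, and client i; the function name is hypothetical. The balance checks are multiplied through by N to avoid fractions.

```python
def is_feasible(x, y, n, m):
    """Check the placement and assignment constraints on 0/1 nested lists."""
    N, K, C = len(x), len(x[0]), len(y)
    # Each object k has exactly n fragments placed.
    if any(sum(x[j][k] for j in range(N)) != n for k in range(K)):
        return False
    # Each client i fetches exactly m fragments of each object k.
    if any(sum(y[i][j][k] for j in range(N)) != m
           for i in range(C) for k in range(K)):
        return False
    # A client may only fetch from a node that stores the fragment.
    if any(y[i][j][k] > x[j][k]
           for i in range(C) for j in range(N) for k in range(K)):
        return False
    # Storage balance: every node stores n*K/N fragments.
    if any(sum(x[j]) * N != n * K for j in range(N)):
        return False
    # Load balance: every node serves m*C*K/N fragment requests.
    return all(
        sum(y[i][j][k] for i in range(C) for k in range(K)) * N == m * C * K
        for j in range(N)
    )


x = [[1], [1]]                      # one object stored on both nodes (n = 2)
balanced = [[[1], [0]], [[0], [1]]]  # each client reads from a different node
skewed = [[[1], [0]], [[1], [0]]]    # both clients read from node 0
```

Here `is_feasible(x, balanced, 2, 1)` holds, while `skewed` satisfies every constraint except load balancing.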

SLIDE 11

Problem formulation

  • 0-1 integer linear programming problem
      – NK + CNK variables (NK for x, CNK for y)
      – K + CK + CNK + N + N constraints
      – K and C can be in the millions and N can be in the thousands
      – Problem is too large to solve with normal methods
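
To get a feel for the scale, the counts can be computed directly from the definitions: one x per (node, object) pair, one y per (client, node, object) triple, and one constraint per index combination in each constraint family. The function name is hypothetical, and the example sizes are only the orders of magnitude mentioned above.

```python
def ilp_size(C: int, N: int, K: int) -> tuple[int, int]:
    """Return (variables, constraints) for the 0-1 program."""
    variables = N * K + C * N * K                 # x_jk plus y_ijk
    constraints = K + C * K + C * N * K + N + N   # one term per constraint family
    return variables, constraints


variables, constraints = ilp_size(C=10**6, N=10**3, K=10**6)
```

At these sizes the y variables alone number 10^15, which is why the following slides reduce the object set.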

SLIDE 12

Problem formulation

  • Instead of doing global placement, do a placement on each individual object
      – Makes more sense since it is impractical to reallocate and replace fragments every time an object is created
      – Make x and y independent of k
          • Will cause load imbalance as all objects will be placed on the same nodes
      – Intermediate approach: consider only a subset of objects
  • Since each object has n fragments, we can ensure that each node has at least 1 fragment by setting

$$K = \frac{N}{n}$$

    and introducing a new constraint

$$\sum_{k \in K} x_{jk} = 1 \quad \forall j$$

SLIDE 13

Problem formulation

  • With reduced object set size, derive global placement P
  • For each new object, calculate a hash h based on object ID, name, contents, etc.
  • Place and assign object according to object h mod K in placement P.
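
The hash-based lookup can be sketched as follows. The slides do not name a hash function; SHA-256 over the object ID is an assumption here, as are the function names and the representation of P as a list of K node lists.

```python
import hashlib


def placement_slot(object_id: str, K: int) -> int:
    """Map an object ID to one of the K template objects via h mod K."""
    digest = hashlib.sha256(object_id.encode("utf-8")).digest()
    h = int.from_bytes(digest, "big")  # hash function choice is an assumption
    return h % K


def nodes_for(object_id: str, placement: list) -> list:
    """Return the storage nodes of the template object this ID maps to."""
    return placement[placement_slot(object_id, len(placement))]


# Hypothetical precomputed placement P with K = 2 template objects, n = 2.
P = [[0, 1], [2, 3]]
nodes = nodes_for("example-object", P)
```

Because the mapping is deterministic, any client can locate an object's fragments from its ID and P alone, without a directory lookup.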

SLIDE 14

Problem Approach

  • Heuristic approach to 0-1 integer linear programming problem
  • Start with an initial placement and assignment that is guaranteed to be feasible
      – Place first object on the first n nodes, place second object on the next n nodes, and so on
      – Assign first client to the first m nodes of the n storage nodes, next client to the next m nodes, and so on

$$x_{jk} = \left( \left\lfloor \frac{j}{n} \right\rfloor = k \right)$$

$$y_{ijk} = \left( \left\lfloor \frac{j}{m} \right\rfloor = k \, \frac{n}{m} + \left( i \bmod \frac{n}{m} \right) \right)$$
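
The initial solution above can be built directly from the two indicator formulas. This sketch uses 0-based indices and assumes that n divides N, that m divides n, and that K = N/n; the function name is hypothetical.

```python
def initial_solution(N: int, n: int, m: int, C: int):
    """Build the round-robin starting placement x[j][k] and assignment y[i][j][k]."""
    if N % n or n % m:
        raise ValueError("this sketch assumes n divides N and m divides n")
    K = N // n
    groups = n // m  # number of m-node groups within each object's n nodes
    # Object k occupies nodes k*n .. k*n + n - 1.
    x = [[int(j // n == k) for k in range(K)] for j in range(N)]
    # Client i reads object k from group number k*groups + (i mod groups).
    y = [
        [[int(j // m == k * groups + i % groups) for k in range(K)]
         for j in range(N)]
        for i in range(C)
    ]
    return x, y


x, y = initial_solution(N=8, n=4, m=2, C=4)
```

In this example, client 0 reads object 0 from nodes 0 and 1, while client 1 reads it from nodes 2 and 3. Load is exactly balanced only when C is a multiple of n/m.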

SLIDE 15

Problem Approach

  • Greedily alter solution until no improvements
  • 3 possible solution transformations
      – Swap assignment

$$y_{ijk} \leftrightarrow y_{i'j'k'} \quad \text{where } y_{ijk} = y_{i'j'k'} = 1$$

      – Swap placement

$$x_{jk} \leftrightarrow x_{j'k'} \quad \text{where } x_{jk} = x_{j'k'} = 1$$

      – Change assignment

$$y_{ijk} \rightarrow y_{i'j'k'} \quad \text{where } y_{ijk} = 1 \text{ and } y_{i'j'k'} = 0$$

SLIDE 16

Problem approach

  • Change assignment can introduce load imbalance
  • We can relax the load balance requirement

$$\frac{mCK}{N} (1 - \lambda) < \sum_{k \in K} \sum_{i \in C} y_{ijk} < \frac{mCK}{N} (1 + \lambda) \quad \forall j$$
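
A small check of the relaxed bound is sketched below, assuming `y[i][j][k]` is a nested 0/1 list and `lam` stands for the relaxation parameter λ; the function name is hypothetical.

```python
def within_relaxed_load(y, m: int, lam: float) -> bool:
    """True if every node's load lies strictly inside target*(1 -/+ lam)."""
    C, N, K = len(y), len(y[0]), len(y[0][0])
    target = m * C * K / N  # perfectly balanced load per node
    for j in range(N):
        load = sum(y[i][j][k] for i in range(C) for k in range(K))
        if not (target * (1 - lam) < load < target * (1 + lam)):
            return False
    return True


# Four clients, two nodes, one object, m = 1: node 0 serves 3 clients, node 1 serves 1.
y = [[[1], [0]], [[1], [0]], [[1], [0]], [[0], [1]]]
loose = within_relaxed_load(y, m=1, lam=0.6)    # bounds 0.8 < load < 3.2
tight = within_relaxed_load(y, m=1, lam=0.25)   # bounds 1.5 < load < 2.5
```

Because the inequalities are strict, λ = 0 rejects even a perfectly balanced assignment, so λ must be positive for the relaxed form to admit any solution.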

SLIDE 17

Algorithm

cost = F(X, Y)
do
    oldcost = cost
    for all objects k
        for all clients i
            for all storage nodes j
                delta_cost = change assignment
                if (delta_cost < 0)
                    accept change
        for clients with maximum latencies
            delta_cost = swap placement
            if (delta_cost < 0)
                accept swap
            delta_cost = swap assignment
            if (delta_cost < 0)
                accept swap
    cost = F(X, Y)
while cost < oldcost
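
The following is a deliberately reduced sketch of the loop above, not the full algorithm: it handles a single object with a fixed placement, applies only the change-assignment move, omits both swap moves, and enforces only an upper load cap per node. All names and the data layout are assumptions.

```python
def greedy_reassign(latency, assigned, replicas, max_load):
    """Greedy change-assignment passes; mutates `assigned` and returns the final cost.

    latency[i][j]: latency from client i to node j
    assigned[i]:   set of nodes client i currently reads from
    replicas:      set of nodes holding a fragment of the object
    max_load:      upper cap on clients per node (lower bound is ignored here)
    """
    load = {j: 0 for j in replicas}
    for nodes in assigned:
        for j in nodes:
            load[j] += 1
    cost = sum(latency[i][j] for i, nodes in enumerate(assigned) for j in nodes)
    while True:
        oldcost = cost
        for i, nodes in enumerate(assigned):
            for old in list(nodes):
                for new in replicas - nodes:
                    delta_cost = latency[i][new] - latency[i][old]
                    if delta_cost < 0 and load[new] < max_load:
                        # Accept the change: move client i from old to new.
                        nodes.remove(old)
                        nodes.add(new)
                        load[old] -= 1
                        load[new] += 1
                        cost += delta_cost
                        break
        if cost >= oldcost:  # no improvement in this pass
            return cost


latency = [[1, 5, 2],
           [4, 1, 3]]
assigned = [{1}, {0}]  # deliberately poor starting assignment, m = 1
final_cost = greedy_reassign(latency, assigned, replicas={0, 1, 2}, max_load=1)
```

In this example client 0 first moves to node 2 because node 0 is full, and reaches node 0 only on the second pass after client 1 has vacated it. Every accepted move strictly lowers the cost, so the loop terminates.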

SLIDE 18

Algorithm

  • Swaps provide the most improvement, but are very costly to evaluate
  • Assignment changes provide relatively little improvement, but are very easy to evaluate
  • Do mostly assignment changes and do swaps only for maximum latency clients
  • O(CN) computation due to mCK = mCN/n non-zero elements in matrix

SLIDE 19

Evaluation

  • Autonomous System Network generated with Inet topology generator
  • Similar to a content delivery network with storage nodes at AS nodes
  • Graphs with 3200, 4000, 5000, and 6000 nodes
  • Clients and storage nodes are equivalent
  • Latency is number of hops

SLIDE 20

Evaluation

  • N = 4000, n = 8, m = 4

SLIDE 21

Evaluation

  • Average latency vs. λ (n = 8, m = 4)

SLIDE 22

Evaluation

  • Average latency vs. N (varying n, m)

SLIDE 23

Evaluation

  • Average latency vs. N (varying n, m = 1)

SLIDE 24

Summary

  • Generalized replica placement algorithm suitable for fragmented objects: parity, erasure codes, secret sharing, etc.
  • Greedy algorithm based on assignment changes and swaps
  • Load balancing relaxation improves performance significantly