SLIDE 1

Autoplacer: Scalable Self-Tuning Data Placement in Distributed Key-value Stores

ICAC’13
João Paiva, Pedro Ruivo, Paolo Romano, Luís Rodrigues

Instituto Superior Técnico / Inesc-ID, Lisboa, Portugal

June 27, 2013

SLIDE 2

Outline

Introduction
Our approach
Evaluation
Conclusions

SLIDE 3

Motivation

Collocating processing with storage can improve performance.

◮ With random placement, nodes waste resources on inter-node communication.
◮ Optimizing data placement improves locality and reduces remote requests.

SLIDE 6

Approaches Using Offline Optimization

Algorithm:

  • 1. Gather access trace for all items
  • 2. Run offline optimization algorithms on traces
  • 3. Store solution in directory
  • 4. Locate data items by querying directory

◮ Fine-grained placement
◮ Costly to log all accesses
◮ Complex optimization
◮ Directory creates additional network usage

SLIDE 8

Main challenges

Cause: Key-Value stores may handle large amounts of data

Challenges:

  • 1. Collecting statistics: Obtaining usage statistics in an efficient manner.
  • 2. Optimization: Deriving fine-grained placement for data objects that exploits data locality.
  • 3. Fast lookup: Preserving fast lookup for data items.
SLIDE 9

Approaches to Data Access Locality

  • 1. Consistent Hashing (CH):

The “don’t care” approach

  • 2. Distributed Directories:

The “care too much” approach

SLIDE 10

Consistent Hashing

Don’t care for locality: items placed deterministically according to hash functions and full membership information.

◮ Simple to implement
◮ Solves the lookup challenge: lookups are purely local
◮ No control over data placement → bad locality
◮ Does not address the optimization challenge
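For illustration, a minimal consistent-hashing ring along these lines (a sketch, not the paper's implementation; node names and the virtual-node count are arbitrary):

```python
import bisect
import hashlib

def _h(s: str) -> int:
    # Stable hash so every node computes the same placement.
    return int(hashlib.md5(s.encode()).hexdigest(), 16)

class ConsistentHashRing:
    """Lookup needs only the hash function and the membership list:
    no directory and no extra network hop, but also no control over
    where any particular item lands."""
    def __init__(self, nodes, vnodes=64):
        # Each node gets several points on the ring to smooth the load.
        self._ring = sorted((_h(f"{n}#{v}"), n)
                            for n in nodes for v in range(vnodes))
        self._points = [p for p, _ in self._ring]

    def owner(self, item: str) -> str:
        # The first virtual node clockwise from the item's hash owns it.
        i = bisect.bisect(self._points, _h(item)) % len(self._ring)
        return self._ring[i][1]
```

Any node can evaluate `owner(item)` locally, which is exactly why lookups are cheap here and why placement cannot be tuned.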

SLIDE 12

Distributed Directories

Care too much about locality: nodes report usage statistics to a centralized optimizer; placement is defined in a distributed directory (which may be cached locally).

◮ Can solve statistics challenge using coarse statistics
◮ Solves optimization challenge with precise data placement control

Hindered by the lookup challenge:

◮ Additional network hop
◮ Hard to update

SLIDE 14

Outline

Introduction
Our approach
Evaluation
Conclusions

SLIDE 15

Our approach: beating the challenges

Best of both worlds

◮ Statistics challenge: gather statistics only for hotspot items
◮ Optimization challenge: fine-grained optimization for hotspots
◮ Lookup challenge: consistent hashing for the remaining items

SLIDE 16

Algorithm overview

Online, round-based approach:

  • 1. Statistics: Monitor data access to collect hotspots
  • 2. Optimization: Decide placement for hotspots
  • 3. Lookup: Encode / broadcast data placement
  • 4. Move data
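As an illustrative sketch of one round (toy data structures and a toy optimizer, not the paper's implementation), over a batch of (node, item) accesses:

```python
from collections import Counter, defaultdict

def run_round(accesses, nodes, ch_owner, k=2):
    """One toy Autoplacer-style round: (1) count access frequencies,
    (2) pick the k hottest items and place each on the node that
    accessed it most, (3) encode the decision as a lookup function
    that falls back to consistent hashing for every other item.
    (Step 4, actually moving the data, is omitted here.)"""
    freq = Counter(item for _, item in accesses)
    per_node = defaultdict(Counter)
    for node, item in accesses:
        per_node[node][item] += 1
    relocation = {}
    for item, _ in freq.most_common(k):
        relocation[item] = max(nodes, key=lambda n: per_node[n][item])
    return lambda item: relocation.get(item, ch_owner(item))
```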
SLIDE 18

Statistics: Data access monitoring

Key concept: Top-K stream analysis algorithm

◮ Lightweight
◮ Sub-linear space usage
◮ Inaccurate results... but with bounded error
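One classic top-k stream algorithm with these properties is Space-Saving, sketched here (a simplified illustration; the slides do not prescribe this exact variant):

```python
class SpaceSaving:
    """Top-k sketch with a fixed number of counters (sub-linear space).
    Counts may overestimate, but each item's error is bounded by the
    count it inherited when it evicted the smallest counter."""
    def __init__(self, capacity: int):
        self.capacity = capacity
        self.counters = {}  # item -> [count, error_bound]

    def observe(self, item):
        if item in self.counters:
            self.counters[item][0] += 1
        elif len(self.counters) < self.capacity:
            self.counters[item] = [1, 0]
        else:
            # Evict the minimum counter; the newcomer inherits its
            # count, which bounds the overestimation error.
            victim = min(self.counters, key=lambda i: self.counters[i][0])
            count, _ = self.counters.pop(victim)
            self.counters[item] = [count + 1, count]

    def top(self, k):
        return sorted(self.counters, key=lambda i: -self.counters[i][0])[:k]
```

Space usage is fixed by `capacity` no matter how many distinct keys flow past, which is what makes per-access monitoring affordable.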

SLIDE 22

Optimization

Integer Linear Programming problem formulation:

min Σ_{j∈N} Σ_{i∈O} [ X̄_ij (c_rr r_ij + c_rw w_ij) + X_ij (c_lr r_ij + c_lw w_ij) ]   (1)

subject to:  ∀i ∈ O : Σ_{j∈N} X_ij = d   and   ∀j ∈ N : Σ_{i∈O} X_ij ≤ S_j

(r_ij, w_ij: read/write frequency of node j on item i; c_lr, c_lw, c_rr, c_rw: local/remote read/write costs; X_ij = 1 iff item i is placed on node j, X̄_ij = 1 − X_ij; d: replication degree; S_j: capacity of node j)

Inaccurate input:

◮ Does not provide optimal placement
◮ Upper bound on the error

SLIDE 23

Accelerating optimization

  • 1. ILP Relaxed to Linear Programming problem
  • 2. Distributed optimization

LP relaxation

◮ Allow data item ownership to be in the [0, 1] interval

Distributed Optimization

◮ Partition by the N nodes
◮ Each node optimizes hotspots mapped to it by CH
◮ Strengthen capacity constraint
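As a toy stand-in for the per-node optimizer (a greedy heuristic, not the relaxed LP solver; the `cost` function and capacities here are hypothetical inputs that would come from the gathered statistics):

```python
def place_hotspots(hotspots, nodes, cost, capacity):
    """Greedily assign each hotspot to the cheapest node that still
    has spare capacity -- a crude analogue of each node solving the
    relaxed problem over its own CH-mapped partition of the hotspots."""
    load = {n: 0 for n in nodes}
    placement = {}
    for item in hotspots:
        open_nodes = [n for n in nodes if load[n] < capacity[n]]
        best = min(open_nodes, key=lambda n: cost(item, n))
        placement[item] = best
        load[best] += 1
    return placement
```

The capacity check plays the role of the strengthened capacity constraint: each node places only as many hotspots as its budget allows.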

SLIDE 27

Lookup: Encoding placement

Probabilistic Associative Array (PAA)

◮ Associative array interface (keys → values)
◮ Probabilistic and space-efficient
◮ Trades off space usage for accuracy

SLIDE 28

Probabilistic Associative Array: Usage

Building

  • 1. Build PAA from hotspot mappings
  • 2. Broadcast PAA

Looking up objects

◮ If item not in PAA, use Consistent Hashing
◮ If item is a hotspot, return PAA mapping
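A toy PAA along these lines (the Bloom filter is real; an exact dict stands in for the space-efficient decision-tree classifier, so only the membership side is probabilistic in this sketch):

```python
class ToyPAA:
    def __init__(self, hotspot_map, ch_owner, m=1024, k=3):
        self.m, self.k = m, k
        self.bits = 0                        # m-bit Bloom filter
        self.classifier = dict(hotspot_map)  # real PAA: decision tree
        self.ch_owner = ch_owner             # consistent-hashing fallback
        for key in hotspot_map:
            for i in self._positions(key):
                self.bits |= 1 << i

    def _positions(self, key):
        # k hash positions (Python's hash is stable within one run).
        return [hash((key, i)) % self.m for i in range(self.k)]

    def lookup(self, key):
        if all(self.bits >> i & 1 for i in self._positions(key)):
            # Possible hotspot: false positives can happen,
            # false negatives cannot.
            return self.classifier.get(key, self.ch_owner(key))
        return self.ch_owner(key)  # definitely not a hotspot
```

The lookup stays local either way: a bit test plus a classifier query for hotspots, plain consistent hashing for everything else.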

SLIDE 30

PAA: Building blocks

◮ Bloom Filter

Space-efficient membership test (is item in PAA?)

◮ Decision tree classifier

Space-efficient mapping (where is hotspot mapped to?)

SLIDE 32

PAA: Properties

Bloom Filter:

◮ False positives: may match items it was not supposed to.
◮ No false negatives: never returns ⊥ for items in the PAA.

Decision tree classifier:

◮ Inaccurate values (bounded error).
◮ Deterministic response: a deterministic (item → node) mapping.

SLIDE 35

Algorithm Review

Online, round-based approach:

  • 1. Statistics: Monitor data access to collect hotspots

Top-k stream analysis

  • 2. Optimization: Decide placement for hotspots

Lightweight distributed optimization

  • 3. Lookup: Encode / broadcast data placement

Probabilistic Associative Array

  • 4. Move data
SLIDE 39

Outline

Introduction
Our approach
Evaluation
Conclusions

SLIDE 40

Experimental settings

◮ Integrated into a distributed key-value store (JBoss Infinispan)
◮ 40 virtual machines (10 physical machines)
◮ Gigabit network

SLIDE 41

Modified TPC-C benchmark

Induce controllable locality:

◮ Probability p: nodes access data associated with a given warehouse.
◮ Probability 1 − p: nodes access data associated with a random warehouse.
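The locality knob can be expressed as (an illustrative sketch; the function and parameter names are ours, not the benchmark's):

```python
import random

def pick_warehouse(local_wh: int, all_wh: list, p: float, rng=random) -> int:
    """With probability p, touch the warehouse bound to this node;
    otherwise pick a warehouse uniformly at random."""
    if rng.random() < p:
        return local_wh
    return rng.choice(all_wh)
```

p = 1 gives perfectly partitionable traffic, while p = 0 degenerates to uniform access, where no placement policy can create locality.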

SLIDE 42

Remote operations

[Plot: percentage of remote operations over 30 minutes, for 100%, 90%, 50%, and 0% locality vs. the baseline]

SLIDE 43

Throughput

[Plot: transactions per second over 30 minutes, for 100%, 90%, 50%, and 0% locality vs. the baseline]

SLIDE 44

Directory effects

[Bar chart: transactions per second for Autoplacer, Directory, and Baseline at 100%, 90%, and 0% locality]

SLIDE 45

Outline

Introduction
Our approach
Evaluation
Conclusions

SLIDE 46

Conclusions

◮ Gather statistics only for hotspots
◮ Fine-grained hotspot placement
◮ Retain local lookups using the PAA
◮ Effective locality improvement
◮ Good network usage
◮ Considerable performance improvements

SLIDE 48

Thank you