Data Location Optimization for a Self-Organized Storage System

SLIDE 1

Data Location Optimization for a Self-Organized Storage System

Hannes Mühleisen, Tilman Walther and Robert Tolksdorf

SLIDE 2

[A. Bockoven]

SLIDE 3

[Thomas Schmickl]

SLIDE 4

Brood Sorting - Algorithm

item = null
while (true)
    if (item != null)
        if (similarity(item, nearbyItems()) > α)
            drop(item)
            item = null
    else
        item = min(similarity(nearbyItems()²))
        pickup(item)
    move()
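The loop above can be tried out in a few lines of Python. This is only a minimal sketch under assumptions that are not on the slide: items are floats in [0, 1], similarity is one minus the absolute difference, "nearby items" means the items at the agent's current site, sites form a ring, and the names `least_similar_index` and `brood_sort` are made up for illustration.

```python
import random

def similarity(item, others):
    """Mean similarity of item to others (1.0 = identical); 0.0 if others is empty."""
    if not others:
        return 0.0
    return sum(1.0 - abs(item - o) for o in others) / len(others)

def least_similar_index(items):
    """Index of the item that fits worst with the rest of the list."""
    def fit(i):
        return similarity(items[i], items[:i] + items[i + 1:])
    return min(range(len(items)), key=fit)

def brood_sort(sites, alpha=0.9, steps=5000, seed=0):
    """Let one agent walk over sites (a ring of lists of floats), sorting in place."""
    rng = random.Random(seed)
    pos, item = 0, None
    for _ in range(steps):
        here = sites[pos]
        if item is not None:
            # Carrying: drop only where the item resembles its surroundings.
            if similarity(item, here) > alpha:
                here.append(item)
                item = None
        elif here:
            # Empty-handed: pick up the worst-fitting local item.
            item = here.pop(least_similar_index(here))
        # Move one step left or right on the ring (assumed topology).
        pos = (pos + rng.choice((-1, 1))) % len(sites)
    if item is not None:
        sites[pos].append(item)  # do not lose a carried item when stopping
    return sites

rng = random.Random(1)
sites = [[rng.random() for _ in range(5)] for _ in range(8)]
before = sorted(x for s in sites for x in s)
brood_sort(sites)
after = sorted(x for s in sites for x in s)
```

The sketch only guarantees that no item is lost or duplicated; how well the sites end up clustered depends on `alpha`, the number of steps and the similarity measure chosen.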

SLIDE 5

Probabilistic Request Routing

[Figure: nodes S1 to S6, labels "#B" and "#B?", link percentages 70%, 25%, 95%, 50%, 50%, 95%, 10%, 85%]

[Lindgren03]
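One way to read probabilistic request routing is that each node keeps, per bucket, an estimate of how likely each neighbour is to lead to the data, and forwards a request to the most promising one. The sketch below illustrates that reading only; the greedy choice, the `route` function, the table layout and all probability values are hypothetical and do not reproduce the figure or the scheme in [Lindgren03].

```python
def route(tables, holders, start, bucket, max_hops=10):
    """Greedily follow the highest-probability neighbour towards bucket.

    tables:  node -> bucket -> {neighbour: estimated probability}
    holders: node -> set of buckets stored at that node
    Returns (path, found).
    """
    node = start
    path = [node]
    visited = {node}
    while bucket not in holders.get(node, set()) and len(path) <= max_hops:
        # Only consider neighbours that have not been visited yet.
        candidates = {
            n: p
            for n, p in tables.get(node, {}).get(bucket, {}).items()
            if n not in visited
        }
        if not candidates:
            return path, False  # dead end
        node = max(candidates, key=candidates.get)
        visited.add(node)
        path.append(node)
    return path, bucket in holders.get(node, set())

# Hypothetical routing tables for a request "#B?" (values are illustrative).
tables = {
    "S1": {"B": {"S2": 0.70, "S3": 0.25}},
    "S2": {"B": {"S1": 0.10, "S4": 0.95}},
    "S3": {"B": {"S1": 0.50}},
    "S4": {"B": {}},
}
holders = {"S4": {"B"}}

path, found = route(tables, holders, "S1", "B")
print(path, found)
```

A weighted random choice instead of `max` would spread requests over several neighbours; the greedy variant is used here only because it is deterministic and easy to follow.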

SLIDE 6

Research Question

Can brood sorting improve data placement in a large-scale distributed storage system based on probabilistic routing?

SLIDE 7

Some Adaptations

  • Data is clustered into a limited number of “buckets”
  • Movement is split up into two phases:
      • Search phase: Every node periodically generates a “profile” of its locally stored data and sends it on its way
      • Response phase: Nodes compare incoming profiles to locally stored data, generating movement responses
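The two phases can be sketched as follows. The slides do not say what a profile contains or how a response is computed, so this is an assumption-laden illustration: a profile is taken to be a count of items per bucket, a node offers to move an item when the sender holds more items of that bucket than it does, and `build_profile` and `movement_response` are hypothetical names.

```python
from collections import Counter

def build_profile(store):
    """Search phase: summarise local data as bucket -> number of items.

    store is a list of (bucket, item) pairs held by one node.
    """
    return Counter(bucket for bucket, _ in store)

def movement_response(profile, local_store):
    """Response phase: list local items that fit the sender better.

    An item is offered when the incoming profile shows more items of its
    bucket than this node holds itself (assumed rule, not from the slides).
    """
    local = build_profile(local_store)
    return [
        (bucket, item)
        for bucket, item in local_store
        if profile.get(bucket, 0) > local[bucket]
    ]

node1 = [("A", "a1"), ("A", "a2"), ("A", "a3"), ("B", "b1")]
node2 = [("A", "a4"), ("B", "b2"), ("B", "b3")]

profile = build_profile(node1)              # node 1 sends this on its way
moves = movement_response(profile, node2)   # node 2 answers with offers
print(moves)
```

With this rule, node 2 offers its single "A" item to node 1, which already holds three of that bucket, and keeps its "B" items because it holds more of them than node 1.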

SLIDE 8

[Figure: elements 1, 2 and 3; step (1), labelled "Profile"]

SLIDE 9

[Figure: elements 1, 2 and 3; step (2)]

SLIDE 10

[Figure: elements 1, 2 and 3; step (3), labelled "✓ Clean!"]

SLIDE 11

Evaluation

  • Cluster of 100 Linux nodes
  • Two datasets: random and synthetic
  • 1000 write operations, four phases
  • Recorded data:
      • Number of data items in the network
      • Number of successful movement operations
      • Bucket amount and size

SLIDE 12

[Figure: Data Items vs. Move Operations, synthetic/100nodes. x-axis: Sample; y-axes: Data Items, Move Operations]

SLIDE 13

[Figure: Bucket Amount vs. Average Size, synthetic/100nodes. x-axis: Sample; y-axes: Total Amount, Average Size]

SLIDE 14

[Figure: Data Items vs. Move Operations, random/100nodes. x-axis: Sample; y-axes: Data Items, Move Operations]

SLIDE 15

[Figure: Bucket Amount vs. Average Size, random/100nodes. x-axis: Sample; y-axes: Total Amount, Average Size]

SLIDE 16

Conclusion

  • Brood Sorting works! *

* YMMV

SLIDE 17

Thank You!

Web Page: http://hannes.muehleisen.org

Questions?