Astrolabe: A Robust and Scalable Technology for Distributed System


Astrolabe: A Robust and Scalable Technology for Distributed System. R. van Renesse, K.P. Birman, W. Vogels (Cornell University). Presentation by Konrad Iwanicki, October 3rd and 8th, 2012, Distributed Systems Course, University of Warsaw.


SLIDE 1

Astrolabe: A Robust and Scalable Technology for Distributed System

  • R. van Renesse, K.P. Birman, W. Vogels (Cornell University)

Presentation by Konrad Iwanicki

SLIDE 2

Introduction

  • Imagine an organization that needs to manage large collections of distributed resources, such as:
    • personal workstations,
    • dedicated nodes in a web farm,
    • or objects stored and services run on these computers.
  • Think really big:
    • Amazon.com,
    • Google.com,
    • University of Warsaw.
SLIDE 3

Sample Objects & Resources

pc372.mimuw.edu.pl
  Name        pc372
  Browser     IE
  Printer     OKI 783
  Disk_total  20GB
  Disk_used   5GB

laptop065.mimuw.edu.pl
  Name        laptop065
  Browser     Firefox
  Printer     -
  Disk_total  500GB
  Disk_used   413GB

SLIDE 4

Sample Management

Name        pc372     duch      laptop065
Browser     IE        -         Firefox
Printer     OKI 783   HP 3971   -
Disk_total  20GB      2000GB    500GB
Disk_used   5GB       587GB     413GB

laptop065.mimuw.edu.pl

SLIDE 5

Problems

  • The computers hosting the resources may be:
    • co-located in a room,
    • spread across a building or a campus,
    • scattered around the world.
  • Configurations of such systems change rapidly:
    • machine failures and connectivity changes are common,
    • significant adaptation may be necessary to provide the desired level of service.
  • How do you manage the resources in such systems?
  • In particular, how do you retrieve information about the resources?

SLIDE 6

Sample Management Queries

SELECT COUNT(user_name) AS firefox_user_count
FROM processes
WHERE name = 'firefox.exe'
GROUP BY name

SELECT FIRST(3, host_name) AS host
WHERE Disk_used/Disk_total > 80%
ORDER BY Disk_used/Disk_total DESC
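
As a rough illustration of what the second query asks for, the sketch below evaluates it in plain Python over the three sample machines from the earlier slides. The row layout and the helper name `fullest_hosts` are assumptions made for this example only; Astrolabe evaluates such queries through its own SQL-like aggregation language, not through Python.

```python
# Sample rows taken from the "Sample Management" slide (sizes in GB).
MACHINES = [
    {"host_name": "pc372", "disk_total": 20, "disk_used": 5},
    {"host_name": "duch", "disk_total": 2000, "disk_used": 587},
    {"host_name": "laptop065", "disk_total": 500, "disk_used": 413},
]


def fullest_hosts(rows, limit=3, threshold=0.80):
    """Return up to `limit` host names whose disk usage exceeds `threshold`,
    ordered from the fullest disk down (hypothetical helper)."""
    def usage(row):
        return row["disk_used"] / row["disk_total"]

    over = [row for row in rows if usage(row) > threshold]  # WHERE ... > 80%
    over.sort(key=usage, reverse=True)                      # ORDER BY ... DESC
    return [row["host_name"] for row in over[:limit]]       # FIRST(3, host_name)


print(fullest_hosts(MACHINES))
```

With the sample data only laptop065 (413GB of 500GB) is above the 80% threshold.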

SLIDE 7

Requirements

  • Resource information aggregation:
    • A convenient way of getting some summaries regarding the resources in the system.
  • Resource location:
    • A means for locating resources based on the summaries.
  • Scalability (in terms of the number of machines).
  • Robustness to changes in the network topology.
  • Security (access control and integrity).
SLIDE 8

A solution: Astrolabe

  • Developed at Cornell University and a start-up company, RNS.
  • Used by Amazon.com to manage its huge collections of machines and the services those machines run.
  • Werner Vogels is now the CTO of Amazon.com (http://www.allthingsdistributed.com/).

SLIDE 11

Zone Hierarchy

[Figure: zone hierarchy. Hosts pc372, duch, and laptop065 form zone mim; zones mim, wz, and geol form zone uw.]

SLIDE 15

Attribute List

[Figure: zone hierarchy. Hosts pc372, duch, and laptop065 form zone mim; zones mim, wz, and geol form zone uw.]

What are these attributes?

SLIDE 16

Aggregation Functions

[Figure: zone hierarchy (hosts pc372, duch, laptop065; zones mim, wz, geol; root uw) with two labels: "Aggregated attributes" and "Exported attributes".]

SLIDE 17

Aggregation Functions

Function                          Description
MIN(attribute)                    Find the minimum attribute.
MAX(attribute)                    Find the maximum attribute.
SUM(attribute)                    Sum the attributes.
COUNT(attribute)                  Count the attributes.
AVG(attribute [, weight])         Calculate a weighted average.
OR(attribute)                     Bit-wise OR of a bitmap.
AND(attribute)                    Bit-wise AND of a bitmap.
FIRST(n, attribute)               Return a set with the first n attributes.
RANDOM(n, attribute [, weight])   Return a set with n randomly selected attributes.

In general, aggregation functions should compact N values into a small output value. The size of the output value should be a small function of N:

  • preferably O(1),
  • alternatively O(log N).

Aggregated attributes are continuously recomputed.
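
To make the O(1) requirement concrete, here is a minimal Python sketch of how a zone might compact its child rows into one fixed-size summary and how a parent zone can reuse those summaries. The row fields, the `aggregate` and `host` helpers, and the value for zone wz are hypothetical; only the three disk-usage figures for mim come from the sample table.

```python
def aggregate(rows):
    """Compact child rows into one fixed-size summary row.

    Each row carries `count` (machines it represents), `disk_used_sum`
    and `disk_used_max`; a single host is a row with count == 1.
    """
    count = sum(r["count"] for r in rows)          # SUM of child counts
    total = sum(r["disk_used_sum"] for r in rows)  # SUM
    return {
        "count": count,
        "disk_used_sum": total,
        "disk_used_max": max(r["disk_used_max"] for r in rows),  # MAX
        "disk_used_avg": total / count,  # AVG weighted by machine count
    }


def host(disk_used):
    """Build the row a single machine exports (hypothetical layout)."""
    return {"count": 1, "disk_used_sum": disk_used, "disk_used_max": disk_used}


mim = aggregate([host(5), host(587), host(413)])  # pc372, duch, laptop065
wz = aggregate([host(100)])                       # made-up sibling zone data
uw = aggregate([mim, wz])                         # parent sees two rows only

print(mim["disk_used_avg"], uw["disk_used_avg"], uw["disk_used_max"])
```

The parent zone never sees the individual hosts: it works from one summary row per child zone, whose size does not depend on how many machines that child contains.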

SLIDE 18

Attribute Maintenance

[Figure: zone hierarchy. Hosts pc372, duch, and laptop065 form zone mim; zones mim, wz, and geol form zone uw.]

Who maintains attributes for a zone?

SLIDE 20

Attribute Maintenance

[Figure: zone hierarchy. Hosts pc372, duch, and laptop065 form zone mim, whose sibling zones are wz and geol.]

  • The members of the zone.
  • The members of the sibling zones.
SLIDE 21

Attribute Maintenance

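[Figure: zone hierarchy. Hosts pc372, duch, and laptop065 form zone mim; zones mim, wz, and geol form zone uw. The local machine is marked "me".]

  • My zones: I compute the attributes for these zones myself.
  • Sibling zones: I get the attributes for these zones from other machines.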

SLIDE 23

Attribute Maintenance

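[Figure: zone hierarchy. Hosts pc372, duch, and laptop065 form zone mim; zones mim, wz, and geol form zone uw; zones uw, pw, and pjwstk form zone waw; zones waw, kra, and lub form zone /pl. The local machine's own entry at each level is marked "me".]

How do I get the attributes of the sibling zones?

Zone paths from the local machine up to the root: /pl/waw/uw/mim/laptop065, /pl/waw/uw/mim, /pl/waw/uw, /pl/waw, /pl

All locally maintained attributes for zones: /pl/waw, /pl/kra, /pl/lub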
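
The split between "my zones" and "sibling zones" follows directly from the machine's path. The Python sketch below derives both for the example machine; the `CHILDREN` map is transcribed from the zone names on the slides, while the function names are made up for illustration.

```python
# Child zones per parent, as named on the slides.
CHILDREN = {
    "/pl": ["waw", "kra", "lub"],
    "/pl/waw": ["uw", "pw", "pjwstk"],
    "/pl/waw/uw": ["mim", "wz", "geol"],
    "/pl/waw/uw/mim": ["pc372", "duch", "laptop065"],
}


def enclosing_zones(machine_path):
    """Zones containing the machine, from its parent zone up to the root."""
    parts = machine_path.strip("/").split("/")
    return ["/" + "/".join(parts[:i]) for i in range(len(parts) - 1, 0, -1)]


def sibling_zones(path):
    """Zones (or hosts) that share a parent with `path`."""
    parent, _, name = path.rpartition("/")
    return [f"{parent}/{child}" for child in CHILDREN[parent] if child != name]


me = "/pl/waw/uw/mim/laptop065"
print(enclosing_zones(me))
print(sibling_zones("/pl/waw/uw/mim"))
```

The machine computes attributes for every zone returned by `enclosing_zones` itself, and has to learn the attributes of each `sibling_zones` result from other machines.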

SLIDE 24

Hierarchical Gossiping

/pl/waw/uw/mim/laptop065

Zone        Contacts
pc372       193.0.96.127
duch        193.0.96.2
laptop065   193.0.96.415
mim         null
wz          211.123.1.15, ...
geol        193.14.16.29, ...
uw          null
pw          195.28.56.201, ...
pjwstk      162.32.90.45, ...
waw         null
kra         147.13.132.137, ...
lub         182..132.137, ...

  • The kra entry is a list of contacts for zone /pl/kra.
  • The geol entry is a list of contacts for zone /pl/waw/uw/geol.

One of the attributes maintained locally for a zone is a list of contacts for the zone.

SLIDE 25

Hierarchical Gossiping

  • A computer in zone */Z/X computes the attributes for */Z/X locally, based on the child zone attributes.
  • To obtain attributes for a sibling zone, */Z/Y, the computer:
    • Selects a random contact IP for zone */Z/Y.
    • Communicates with this IP to exchange:
      – The attributes of all child zones of the parent zone, */Z.
      – The attributes of all common higher-level zones.
  • Suppose that A_local is the local (the computer's) value of attribute A, while A_remote is the remote (the contact's) value of A, as obtained in the exchange. The computer chooses the fresher attribute:
      – If TimeComputed(A_local) < TimeComputed(A_remote): A_local := A_remote
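
The "fresher value wins" rule can be sketched in a few lines of Python. The `Attribute` record, the keying by (zone, attribute name), and the choice to adopt entries the local side has never seen are assumptions of this example, not details given on the slide.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Attribute:
    value: object
    time_computed: float  # plays the role of TimeComputed(A)


def merge(local, remote):
    """Return a copy of `local` updated with every fresher entry from `remote`."""
    merged = dict(local)
    for key, theirs in remote.items():
        mine = merged.get(key)
        # Assumption: an entry unknown locally is adopted as-is.
        if mine is None or mine.time_computed < theirs.time_computed:
            merged[key] = theirs  # A_local := A_remote
    return merged


def exchange(a, b):
    """One symmetric gossip exchange between two computers."""
    return merge(a, b), merge(b, a)


mine = {("wz", "load"): Attribute(0.3, 10.0), ("geol", "load"): Attribute(0.9, 5.0)}
theirs = {("wz", "load"): Attribute(0.5, 12.0), ("geol", "load"): Attribute(0.7, 2.0)}
mine, theirs = exchange(mine, theirs)
print(mine == theirs)
```

After the exchange both sides hold the same entries as long as no two copies of an attribute carry the same timestamp with different values.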

SLIDE 26

Hierarchical Gossiping

/pl/waw/uw/mim/laptop065


  • Select zone to gossip for.

/pl/waw/uw/wz

SLIDE 28

Hierarchical Gossiping

/pl/waw/uw/mim/laptop065


  • Select zone to gossip for.
  • Randomly select a contact IP.
  • Exchange common zone attributes.

Exchange the common attributes with 211.123.1.15.

SLIDE 30

Hierarchical Gossiping

/pl/waw/uw/mim/laptop065


  • Select zone to gossip for.
  • Randomly select a contact IP.
  • Exchange common zone attributes.
  • Adopt the fresher attributes.
  • My exchanged attributes and the contact's are now the same.
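
The first two steps can be sketched in Python as follows. The addresses are copied from the contact table on the slides (only the first address of each list, since the rest is elided there, and without the lub entry, whose address is incomplete in the source). The full zone paths used as keys, the function name, and the uniform choice of zone are assumptions; the slides do not say how the zone is picked.

```python
import random

# Contact lists from the slide's table; my own zones have no contacts (None).
CONTACTS = {
    "/pl/waw/uw/mim/pc372": ["193.0.96.127"],
    "/pl/waw/uw/mim/duch": ["193.0.96.2"],
    "/pl/waw/uw/mim": None,
    "/pl/waw/uw/wz": ["211.123.1.15"],
    "/pl/waw/uw/geol": ["193.14.16.29"],
    "/pl/waw/uw": None,
    "/pl/waw/pw": ["195.28.56.201"],
    "/pl/waw/pjwstk": ["162.32.90.45"],
    "/pl/waw": None,
    "/pl/kra": ["147.13.132.137"],
}


def pick_gossip_target(contacts, rng=random):
    """Pick a sibling zone, then one of its contact IPs, both at random."""
    zones = [zone for zone, ips in contacts.items() if ips]  # skip my own zones
    zone = rng.choice(zones)
    return zone, rng.choice(contacts[zone])


zone, ip = pick_gossip_target(CONTACTS)
print(f"gossip for {zone} with {ip}")
```

In the walk-through on the slides the selected zone is /pl/waw/uw/wz and the contact is 211.123.1.15.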

SLIDE 33

Hierarchical Gossiping

  • The gossiping process is continuous.
    • Each computer initiates a gossip every 5 seconds.
    • It also receives gossip requests from other computers.
    • The gossip is performed at every level of the zone hierarchy.
  • If the attribute values did not change, the computers would all reach a consistent view:
    • Eventual consistency.
  • Gossiping is extremely robust to node failures and changes in connectivity:
    • If a contact dies, another contact can simply be chosen at random.
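
Eventual consistency can be observed in a toy simulation. The Python sketch below is a deliberately simplified, flat (non-hierarchical) gossip over a single attribute, written for illustration only; it is not Astrolabe's protocol, and the function name and parameters are made up.

```python
import random


def rounds_to_converge(n_nodes, seed=0, max_rounds=1000):
    """Simulate flat gossip of one attribute among n_nodes >= 2 computers.

    Returns the number of rounds until every node holds the fresh value.
    """
    rng = random.Random(seed)
    # All nodes start with a stale copy (timestamp 0); node 0 has an update.
    state = [("old", 0)] * n_nodes
    state[0] = ("new", 1)
    for round_no in range(1, max_rounds + 1):
        for node in range(n_nodes):
            peer = rng.randrange(n_nodes - 1)
            if peer >= node:
                peer += 1  # never gossip with yourself
            # Both sides keep whichever copy has the larger timestamp.
            fresher = max(state[node], state[peer], key=lambda entry: entry[1])
            state[node] = state[peer] = fresher
        if all(value == "new" for value, _ in state):
            return round_no
    raise RuntimeError("did not converge within max_rounds")


print(rounds_to_converge(64))
```

Because every node keeps picking random partners, a stale node is eventually reached by someone holding the fresh value; no single partner is essential, which is also why a dead contact only delays an update.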

SLIDE 35

Hierarchical Gossiping

  • How do computers know zone contact IPs?
  • They are simply gossiped like other attributes (details in the paper).

SLIDE 38

Usability

  • The aggregation functions and the attributes are not static.
    • They can be dynamically installed in the zones for which we are interested in some attribute values.
    • They propagate automatically to the sibling and parent zones (via gossiping).
    • This is a form of mobile code.
  • In this way, not only can we monitor aggregate information about resources, but we can also locate particular resources...
    • … and do other interesting stuff (details in the paper).
SLIDE 42

Security

  • Since aggregation functions are dynamically installed, we have to ensure some security.
  • The aggregation functions require certificates issued by a trusted certification authority.
  • An aggregation function or an attribute can be installed in a zone only if it is accompanied by a certificate for that zone.
  • In this way, Astrolabe ensures write access control and integrity (details in the paper).
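
The install-time check can be sketched in Python. This example substitutes a shared-secret HMAC for the certificates issued by a certification authority, purely to keep it self-contained; a real deployment would verify a public-key signature instead. The key, the function names, and the sample query are all hypothetical.

```python
import hashlib
import hmac

CA_KEY = b"demo-ca-secret"  # hypothetical stand-in for the CA's signing key


def issue_certificate(zone, function_text, key=CA_KEY):
    """Bind a function to one zone; stands in for a CA-issued certificate."""
    message = f"{zone}\n{function_text}".encode()
    return hmac.new(key, message, hashlib.sha256).hexdigest()


def install(installed, zone, function_text, certificate, key=CA_KEY):
    """Install the function in `zone` only if the certificate is for that zone."""
    expected = issue_certificate(zone, function_text, key)
    if not hmac.compare_digest(expected, certificate):
        return False  # wrong zone, altered function, or forged certificate
    installed.setdefault(zone, []).append(function_text)
    return True


query = "SELECT MAX(Disk_used) AS max_used"
cert = issue_certificate("/pl/waw/uw", query)
installed = {}
print(install(installed, "/pl/waw/uw", query, cert))  # certificate matches
print(install(installed, "/pl/kra", query, cert))     # certificate is for another zone
```

Binding the zone name into the signed message is what gives write access control per zone, and covering the function text is what protects its integrity.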

SLIDE 43

Performance

The average number of rounds to propagate an update depending on the number of sibling zones at every hierarchy level.
SLIDE 44

Performance

The average number of rounds to propagate an update depending on the probability of a message being lost (bf = 25).

SLIDE 45

Performance

The average number of rounds to propagate an update depending on the probability of a node being down (bf = 25).

SLIDE 46

Summary

  • The goal of Astrolabe is to manage a large collection of distributed objects and resources.
  • Astrolabe's design employs the following principles:
    • Scalability through hierarchy:
      – Zone hierarchy, hierarchical attribute aggregation
    • Flexibility through mobile code:
      – Dynamically installed aggregation functions
    • Robustness via a gossip-based peer-to-peer protocol:
      – Self-management and recovery
    • Security through certificates:
      – Integrity and write access control

SLIDE 47

Thank You

Questions?