Distributed Systems Principles and Paradigms Chapter 02 (version - - PDF document


Distributed Systems Principles and Paradigms Chapter 02 (version September 5, 2007 ) Maarten van Steen Vrije Universiteit Amsterdam, Faculty of Science Dept. Mathematics and Computer Science Room R4.20. Tel: (020) 598 7784


SLIDE 1

Distributed Systems

Principles and Paradigms

Chapter 02

(version September 5, 2007)

Maarten van Steen

Vrije Universiteit Amsterdam, Faculty of Science

Dept. Mathematics and Computer Science

Room R4.20. Tel: (020) 598 7784
E-mail: steen@cs.vu.nl, URL: www.cs.vu.nl/~steen/

01 Introduction
02 Architectures
03 Processes
04 Communication
05 Naming
06 Synchronization
07 Consistency and Replication
08 Fault Tolerance
09 Security
10 Distributed Object-Based Systems
11 Distributed File Systems
12 Distributed Web-Based Systems
13 Distributed Coordination-Based Systems

00 – 1 /

SLIDE 2

Architectures

  • Architectural styles
  • Software architectures
  • Architectures versus middleware
  • Self-management in distributed systems

02 – 1 Architectures/

SLIDE 3

Architectural styles (1/2)

Basic idea: Organize into logically different components, and subsequently distribute those components over the various machines.

[Figure: (a) layered style, with layers Layer 1, Layer 2, ..., Layer N-1, Layer N connected by a request flow and a response flow; (b) object-based style, with objects connected by method calls.]

Observation: (a) Layered style is used for client-server systems; (b) object-based style for distributed object systems.

02 – 2 Architectures/2.1 Architectural styles

SLIDE 4

Architectural Styles (2/2)

Observation: Decoupling processes in space (“anonymous”) and also time (“asynchronous”) has led to alternative styles:

[Figure: (a) components publish to an event bus, which delivers events to other components; (b) components publish to a shared (persistent) data space, from which data is delivered to other components.]

(a) Publish/subscribe and (b) Shared dataspace
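
The two styles can be illustrated with a minimal in-process sketch. The class and method names (EventBus, DataSpace, subscribe, publish, write, take) are hypothetical and chosen only for illustration; a real system would run these across machines.

```python
from collections import defaultdict


class EventBus:
    """Toy publish/subscribe bus (hypothetical, in-process only)."""

    def __init__(self):
        self._subscribers = defaultdict(list)   # topic -> list of handlers

    def subscribe(self, topic, handler):
        self._subscribers[topic].append(handler)

    def publish(self, topic, event):
        # The publisher never names a receiver: decoupling in space.
        for handler in self._subscribers[topic]:
            handler(event)


class DataSpace:
    """Toy shared data space: items persist until some component takes them."""

    def __init__(self):
        self._items = []

    def write(self, item):
        # The writer does not need the reader to be running: decoupling in time.
        self._items.append(item)

    def take(self, pattern):
        """Remove and return the first tuple matching pattern (None = wildcard)."""
        for i, item in enumerate(self._items):
            if len(item) == len(pattern) and all(
                p is None or p == v for p, v in zip(pattern, item)
            ):
                return self._items.pop(i)
        return None
```

In the bus, delivery happens at publish time, so only the identity of the receiver is hidden. In the data space, an item written now can be taken much later, which adds decoupling in time.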

02 – 3 Architectures/2.1 Architectural styles

SLIDE 5

Centralized Architectures

Basic Client–Server Model

Characteristics:

  • There are processes offering services (servers)
  • There are processes that use services (clients)
  • Clients and servers can be distributed across different machines
  • Clients follow request/reply model with respect to using services

[Figure: timeline of a client sending a request to a server and waiting for the result, while the server provides the service and returns a reply.]
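
The request/reply behavior can be sketched with two threads standing in for the two machines. This is a minimal sketch, assuming an in-memory queue as the "network"; the function names server and call, and the upper-casing "service", are made up for illustration.

```python
import queue
import threading


def server(requests):
    """Serve requests until a None sentinel arrives."""
    while True:
        msg = requests.get()
        if msg is None:
            break
        payload, reply_box = msg
        reply_box.put(payload.upper())      # "provide service", then reply


def call(requests, payload):
    """Client side: send a request, then block until the reply arrives."""
    reply_box = queue.Queue(maxsize=1)
    requests.put((payload, reply_box))      # request
    return reply_box.get(timeout=5)         # "wait for result"


requests = queue.Queue()
worker = threading.Thread(target=server, args=(requests,))
worker.start()
reply = call(requests, "hello")
requests.put(None)                          # shut the server down
worker.join()
print(reply)
```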

02 – 4 Architectures/2.2 System Architectures

SLIDE 6

Application Layering (1/2)

Traditional three-layered view:

  • User-interface layer contains units for an application’s user interface
  • Processing layer contains the functions of an application, i.e. without specific data
  • Data layer contains the data that a client wants to manipulate through the application components

Observation: This layering is found in many distributed information systems, using traditional database technology and accompanying applications.

02 – 5 Architectures/2.2 System Architectures

SLIDE 7

Application Layering (2/2)

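[Figure: a Web search application split into three levels. User-interface level: the user interface passes a keyword expression down and receives an HTML page containing a list. Processing level: a query generator issues database queries, a ranking algorithm turns Web page titles with meta-information into a ranked list of page titles, and an HTML generator produces the page. Data level: database with Web pages.]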
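
The flow through the three levels can be sketched as a chain of functions. This is a minimal sketch: the sample pages, the "hits" score used for ranking, and all function names are assumptions made for illustration.

```python
# Data level: a tiny in-memory "database" of pages (made-up sample data).
PAGES = [
    {"title": "Distributed systems overview", "hits": 40},
    {"title": "Peer-to-peer systems", "hits": 25},
    {"title": "Distributed file systems", "hits": 60},
]


def query_database(keywords):
    """Data level: return pages whose title contains every keyword."""
    return [
        page for page in PAGES
        if all(k.lower() in page["title"].lower() for k in keywords)
    ]


def rank(pages):
    """Processing level: order titles by a stand-in score (here: hits)."""
    ordered = sorted(pages, key=lambda page: page["hits"], reverse=True)
    return [page["title"] for page in ordered]


def to_html(titles):
    """Processing level: turn the ranked list into an HTML list."""
    return "<ul>" + "".join(f"<li>{t}</li>" for t in titles) + "</ul>"


def search(keyword_expression):
    """Entry point called by the user-interface level."""
    keywords = keyword_expression.split()        # query generator
    return to_html(rank(query_database(keywords)))
```

Because each level only calls the one below it, the levels can later be placed on different machines without changing their interfaces.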

02 – 6 Architectures/2.2 System Architectures

SLIDE 8

Multi-Tiered Architectures

Single-tiered: dumb terminal/mainframe configuration
Two-tiered: client/single server configuration
Three-tiered: each layer on separate machine

Traditional two-tiered configurations:

[Figure: five alternative organizations (a)-(e), each dividing the user interface, application, and database differently between the client machine and the server machine.]

02 – 7 Architectures/2.2 System Architectures

SLIDE 9

Decentralized Architectures

Observation: In the last couple of years we have been seeing a tremendous growth in peer-to-peer systems:

  • Structured P2P: nodes are organized following a specific distributed data structure
  • Unstructured P2P: nodes have randomly selected neighbors
  • Hybrid P2P: some nodes are appointed special functions in a well-organized fashion

Note: In virtually all cases, we are dealing with overlay networks: data is routed over connections set up between the nodes (cf. application-level multicasting).

02 – 8 Architectures/2.2 System Architectures

SLIDE 10

Structured P2P Systems (1/2)

Basic idea: Organize the nodes in a structured overlay network such as a logical ring, and make specific nodes responsible for services based only on their ID:

[Figure: logical ring of identifiers 0 to 15. Some identifiers are actual nodes, each with a set of associated data keys: {0,1}, {2,3,4}, {5,6,7}, {8,9,10,11,12}, {13,14,15}.]

Note: The system provides an operation LOOKUP(key) that will efficiently route the lookup request to the associated node.
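
The mapping from keys to nodes in the ring can be sketched as follows. This is a minimal sketch under two assumptions: the actual nodes are 1, 4, 7, 12 and 15 (inferred from the key sets in the figure), and a key belongs to the first node whose ID is greater than or equal to the key, wrapping around the ring. It computes the answer with full knowledge of all nodes and does not model the hop-by-hop routing that LOOKUP performs in a real overlay.

```python
from bisect import bisect_left

ID_SPACE = 16                  # identifiers 0..15, as in the figure
NODES = [1, 4, 7, 12, 15]      # assumed actual nodes, inferred from the key sets


def lookup(key, nodes=NODES):
    """Return the node responsible for key: first node ID >= key, with wrap-around."""
    ordered = sorted(nodes)
    index = bisect_left(ordered, key % ID_SPACE)
    return ordered[index % len(ordered)]


# Group every key in the identifier space by its responsible node.
responsibility = {n: [k for k in range(ID_SPACE) if lookup(k) == n] for n in NODES}
```

Under these assumptions, the resulting groups match the key sets listed in the figure.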

02 – 9 Architectures/2.2 System Architectures

SLIDE 11

Structured P2P Systems (2/2)

Other example: Organize nodes in a d-dimensional space and let every node take the responsibility for data in a specific region. When a node joins ⇒ split a region.

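[Figure: the two-dimensional space with corners (0,0), (1,0), (0,1), (1,1), divided into regions that each belong to an actual node. (a) Nodes at (0.2,0.8), (0.6,0.7), (0.9,0.9), (0.2,0.3), (0.7,0.2), (0.9,0.6); the region holding the keys associated with the node at (0.6,0.7) is highlighted. (b) The same space after a region has been split, now with nodes at (0.2,0.45) and (0.2,0.15) in place of (0.2,0.3).]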
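
Region splitting on a join can be sketched for the two-dimensional case. This is a simplified sketch, not the protocol of any particular system: the Region type, the rule of splitting the longer side in half, and the choice that the newcomer always takes the upper or right half are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Region:
    owner: str
    x0: float
    y0: float
    x1: float
    y1: float

    def contains(self, x, y):
        # Half-open intervals, so every interior point has exactly one owner.
        return self.x0 <= x < self.x1 and self.y0 <= y < self.y1


def find_region(regions, x, y):
    """Return the region (and hence the node) responsible for point (x, y)."""
    return next(r for r in regions if r.contains(x, y))


def join(regions, new_owner, x, y):
    """Split the region containing (x, y) in half; the new node takes one half."""
    old = find_region(regions, x, y)
    if old.x1 - old.x0 >= old.y1 - old.y0:      # split the longer side
        mid = (old.x0 + old.x1) / 2
        new = Region(new_owner, mid, old.y0, old.x1, old.y1)
        old.x1 = mid
    else:
        mid = (old.y0 + old.y1) / 2
        new = Region(new_owner, old.x0, mid, old.x1, old.y1)
        old.y1 = mid
    regions.append(new)
    return new
```

A key is mapped to a point in the space, and the node whose region contains that point stores the associated data.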

02 – 10 Architectures/2.2 System Architectures

SLIDE 12

Unstructured P2P Systems

Observation: Many unstructured P2P systems attempt to maintain a random graph:

Basic principle: Each node is required to be able to contact a randomly selected other node:

  • Let each peer maintain a partial view of the network, consisting of c other nodes
  • Each node P periodically selects a node Q from its partial view
  • P and Q exchange information and exchange members from their respective partial views

Observation: It turns out that, depending on the exchange, randomness, but also robustness of the network can be maintained.
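
One possible exchange can be sketched as a single-process simulation. The specific policy here (send c/2 random entries plus your own ID, prefer fresh entries when merging, truncate to c) is an assumption chosen for illustration; the slides deliberately leave the exchange policy open.

```python
import random


def exchange(views, p, q, c, rng):
    """P and Q swap part of their partial views, then truncate to c entries."""
    send_p = rng.sample(views[p], k=min(len(views[p]), c // 2)) + [p]
    send_q = rng.sample(views[q], k=min(len(views[q]), c // 2)) + [q]
    for me, received in ((p, send_q), (q, send_p)):
        # Fresh entries first; drop duplicates and any reference to ourselves.
        merged = [n for n in dict.fromkeys(received + views[me]) if n != me]
        views[me] = merged[:c]


def gossip_round(views, c, rng):
    """Every node P picks a node Q from its partial view and exchanges with it."""
    for p in list(views):
        q = rng.choice(views[p])
        exchange(views, p, q, c, rng)
```

Starting from a highly regular overlay such as a ring, repeated rounds shuffle the views. How close the result comes to a random graph depends on the exchange policy, which is the point of the observation above.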

02 – 11 Architectures/2.2 System Architectures

SLIDE 13

Topology Management of Overlay Networks (1/2)

Basic idea: Distinguish two layers: (1) maintain random partial views in lowest layer; (2) be selective on who you keep in higher-layer partial view.

[Figure: two-layer organization. The lower layer runs a protocol for a randomized view and maintains the random overlay, with links to randomly chosen other nodes. It passes random peers to the upper layer, which runs a protocol for a specific overlay and maintains the structured overlay, with links to topology-specific other nodes.]

Note: lower layer feeds upper layer with random nodes; upper layer is selective when it comes to keeping references.

02 – 12 Architectures/2.2 System Architectures

SLIDE 14

Topology Management of Overlay Networks (2/2)

Example: Consider an N × N grid. Keep only references to nearest neighbors:

\[ \|(a_1,a_2) - (b_1,b_2)\| = d_1 + d_2 \]

\[ d_i = \min\{N - |a_i - b_i|,\ |a_i - b_i|\} \]

Result: a nice torus will appear after a while:

[Figure: snapshots of the overlay over time, converging to a torus.]
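
The distance function and the selection step of the upper layer can be sketched directly from the formula. The function names and the choice of keeping k = 4 neighbors are assumptions made for illustration.

```python
def torus_distance(a, b, n):
    """Distance between grid points a and b on an n x n grid with wrap-around."""
    return sum(min(n - abs(ai - bi), abs(ai - bi)) for ai, bi in zip(a, b))


def nearest_neighbors(me, candidates, n, k=4):
    """Keep the k candidates closest to me (the selective upper layer)."""
    others = [c for c in candidates if c != me]
    return sorted(others, key=lambda c: torus_distance(me, c, n))[:k]
```

Each time the lower layer supplies new random peers, a node can rerun nearest_neighbors over its current references plus the newcomers. The wrap-around term N − |a_i − b_i| is what makes opposite edges of the grid adjacent, so the structure that emerges is a torus rather than a flat grid.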

02 – 13 Architectures/2.2 System Architectures

SLIDE 15

Superpeers

Observation: Sometimes it helps to select a few nodes to do specific work: superpeers

[Figure: regular peers attached to superpeers, which together form a superpeer network.]

Examples:

  • Peers maintaining an index (for search)
  • Peers monitoring the state of the network
  • Peers being able to set up connections

02 – 14 Architectures/2.2 System Architectures

SLIDE 16

Hybrid Architectures (1/2)

Observation: In many cases, client-server architectures are combined with peer-to-peer solutions

Example: Edge-server architectures, which are often used for Content Delivery Networks:

[Figure: clients and a content provider connect through ISPs and an enterprise network to the core Internet, with edge servers placed at the boundary of these networks.]

02 – 15 Architectures/2.2 System Architectures

SLIDE 17

Hybrid Architectures (2/2)

Example: Combining a P2P download protocol with a client-server architecture for controlling the downloads: BitTorrent

[Figure: a client node performs Lookup(F) on a BitTorrent Web page (Web server) and obtains a reference to a file server holding the .torrent file for F. The .torrent file contains a reference to a tracker, which provides a list of nodes storing F. The client node then exchanges data with K out of N nodes (Node 1, Node 2, ..., Node N).]

Basic idea: Once a node has identified where to download a file from, it joins a swarm of downloaders who in parallel get file chunks from the source, but also distribute these chunks amongst each other.

02 – 16 Architectures/2.2 System Architectures

SLIDE 18

Architectures versus Middleware

Problem: In many cases, distributed systems/applications are developed according to a specific architectural style. The chosen style may not be optimal in all cases ⇒ there is a need to (dynamically) adapt the behavior of the middleware when needed.

Interceptors: Intercept the usual flow of control when invoking a remote object:

[Figure: a client application calls B.do_something(value). The application stub turns this into invoke(B, &do_something, value), which can pass through a request-level interceptor. The object middleware turns it into send([B, "do_something", value]), which can pass through a message-level interceptor before the local OS sends it to object B. Both the nonintercepted call and the intercepted call are shown.]
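
A request-level interceptor can be sketched as a hook between the application stub and the middleware's send operation. This is a minimal sketch: the Middleware and Stub classes, the replicate interceptor, and the replica names B1 and B2 are hypothetical, and "sending" is only recorded in a list rather than put on a network.

```python
class Middleware:
    """Stand-in for object middleware; records what would go on the wire."""

    def __init__(self):
        self.sent = []
        self.request_interceptors = []

    def invoke(self, target, method, *args):
        # A request-level interceptor may rewrite one call into several.
        calls = [(target, method, args)]
        for intercept in self.request_interceptors:
            calls = [new for call in calls for new in intercept(*call)]
        for t, m, a in calls:
            self.send([t, m, *a])

    def send(self, message):
        self.sent.append(message)


class Stub:
    """Application stub: B.do_something(v) becomes invoke('B', 'do_something', v)."""

    def __init__(self, name, middleware):
        self._name, self._mw = name, middleware

    def __getattr__(self, method):
        return lambda *args: self._mw.invoke(self._name, method, *args)


def replicate(target, method, args, replicas=("B1", "B2")):
    """Hypothetical interceptor: fan a call to B out to its replicas."""
    if target == "B":
        return [(r, method, args) for r in replicas]
    return [(target, method, args)]
```

The client code stays the same whether or not an interceptor is installed, so behavior such as replication can be added or removed without touching the application.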

02 – 17 Architectures/2.3 Architectures versus Middleware

SLIDE 19

Adaptive Middleware

Separation of concerns: Try to separate extra functionalities and later weave them together into a single implementation ⇒ only toy examples so far.

Computational reflection: Let a program inspect itself at runtime and adapt/change its settings dynamically if necessary ⇒ mostly at language level and applicability unclear.

Component-based design: Organize a distributed application through components that can be dynamically replaced when needed ⇒ highly complex, also many intercomponent dependencies.

Observation: Do we need adaptive software at all, or is the issue adaptive systems?

02 – 18 Architectures/2.3 Architectures versus Middleware

SLIDE 20

Self-managing Distributed Systems

Observation: Distinction between system and software architectures blurs when automatic adaptivity needs to be taken into account:

  • Self-configuration
  • Self-managing
  • Self-healing
  • Self-optimizing
  • Self-*

Note: There is a lot of hype going on in this field of autonomic computing.

02 – 19 Architectures/2.4 Self-management in Distributed Systems

SLIDE 21

Feedback Control Model

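Observation: In many cases, self-* systems are organized as a feedback control system:

[Figure: feedback control loop. The core of the distributed system starts from an initial configuration, receives corrections (+/-), and is subject to uncontrollable parameters (disturbance / noise). Its observed output goes to metric estimation, whose measured output is compared with the reference input in the analysis component. Analysis issues adjustment triggers to the adjustment measures component, which feeds corrections back into the core.]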
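
The loop can be sketched as a simple proportional controller. This is a minimal sketch, assuming a single numeric metric and a correction proportional to the error; the function name, the gain value, and the toy "system" below are made up for illustration.

```python
def control_loop(measure, adjust, reference, steps, gain=0.5):
    """Repeatedly measure the output, compare it with the reference, and correct."""
    history = []
    for _ in range(steps):
        measured = measure()             # metric estimation
        error = reference - measured     # analysis against the reference input
        adjust(gain * error)             # adjustment measure (correction)
        history.append(measured)
    return history


# Toy system: its only state is one number, which corrections shift directly.
state = {"value": 0.0}


def measure():
    return state["value"]


def adjust(delta):
    state["value"] += delta


history = control_loop(measure, adjust, reference=100.0, steps=10)
```

With a gain of 0.5 the remaining error halves on every step. A real system would also have to cope with noisy measurements and delayed effects of corrections.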

02 – 20 Architectures/2.4 Self-management in Distributed Systems

SLIDE 22

Example: Globule

Globule: Collaborative CDN that analyzes traces to decide where replicas of Web content should be placed. Decisions are driven by a general cost model:

\[ \mathit{cost} = (w_1 \times m_1) + (w_2 \times m_2) + \cdots + (w_n \times m_n) \]

[Figure: clients reach an origin server and replica servers through ISPs, an enterprise network, and the core Internet.]

  • Globule origin server collects traces and does what-if analysis by checking what would have happened if page P had been placed at edge server S.
  • Many strategies are evaluated, and the best one is chosen.
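
The cost model and the selection step can be sketched in a few lines. The strategy names, the two metrics, and all numbers below are made-up sample values, not Globule data.

```python
def cost(weights, metrics):
    """Weighted sum: cost = w1*m1 + w2*m2 + ... + wn*mn."""
    return sum(w * m for w, m in zip(weights, metrics))


def best_strategy(weights, evaluated):
    """Pick the strategy with the lowest cost.

    evaluated maps a strategy name to the metric values that the what-if
    analysis estimated for it.
    """
    return min(evaluated, key=lambda name: cost(weights, evaluated[name]))


# Hypothetical example with two metrics: (client latency, update bandwidth).
weights = (1.0, 2.0)
evaluated = {
    "no_replication": (120, 0),
    "replicate_everywhere": (20, 80),
    "replicate_hot_pages": (40, 10),
}
choice = best_strategy(weights, evaluated)
```

Changing the weights changes which strategy wins, which is how such a model expresses what the operator cares about.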

02 – 21 Architectures/2.4 Self-management in Distributed Systems