Distributed Event Routing in Publish/Subscribe Systems



slide-1
SLIDE 1 Università di Roma “La Sapienza” Dipartimento di Informatica e Sistemistica Middleware Laboratory

MIDLAB

Distributed Event Routing in Publish/Subscribe Systems

Roberto Baldoni, Sapienza University of Rome
Joint work with Leonardo Querzoni, Saso Tarkoma, Antonino Virgillito
Goteborg - 25/3/2009

slide-2
SLIDE 2

Basic building blocks

■ The publish/subscribe communication paradigm:

■ Publishers: produce data in the form of events.
■ Subscribers: declare interest in published data by issuing subscriptions.
■ Each subscription is a filter on the set of published events.
■ An Event Notification Service (ENS) notifies each subscriber of every published event that matches at least one of its subscriptions.

[Figure: ENS operations: publish, notify, subscribe, unsubscribe]
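The four operations can be illustrated with a minimal, single-process sketch. The class name `CentralizedENS`, the `inbox` attribute, and the representation of a subscription as a Python predicate are assumptions made for illustration only; they are not taken from any system discussed in this presentation.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List

Event = Dict[str, Any]
Filter = Callable[[Event], bool]  # a subscription is a filter over events


@dataclass
class CentralizedENS:
    """Toy Event Notification Service running in a single process."""
    # subscriber id -> subscriptions issued by that subscriber
    subscriptions: Dict[str, List[Filter]] = field(default_factory=dict)
    # subscriber id -> events notified so far (stands in for a real callback)
    inbox: Dict[str, List[Event]] = field(default_factory=dict)

    def subscribe(self, subscriber: str, flt: Filter) -> None:
        self.subscriptions.setdefault(subscriber, []).append(flt)
        self.inbox.setdefault(subscriber, [])

    def unsubscribe(self, subscriber: str, flt: Filter) -> None:
        self.subscriptions[subscriber].remove(flt)

    def publish(self, event: Event) -> None:
        for subscriber, filters in self.subscriptions.items():
            # Notify once if the event matches at least one subscription.
            if any(f(event) for f in filters):
                self.notify(subscriber, event)

    def notify(self, subscriber: str, event: Event) -> None:
        self.inbox[subscriber].append(event)
```

A subscriber holding two subscriptions still receives a matching event only once, since `publish` checks for "at least one" match.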

slide-3
SLIDE 3

The pub/sub interaction model

■ Publish/subscribe was conceived as a comprehensive solution for the following problems:

■ Many-to-many communication model - Interactions take place in an environment where various information producers and consumers can communicate, all at the same time. Each piece of information can be delivered at the same time to various consumers. Each consumer receives information from various producers.
■ Space decoupling - Interacting parties do not need to know each other. Message addressing is based on content.
■ Time decoupling - Interacting parties do not need to be actively participating in the interaction at the same time. Information delivery is mediated through a third party.
■ Synchronization decoupling - Information flow from producers to consumers is also mediated, thus synchronization among interacting parties is not needed.
■ Push/Pull interactions - both methods are allowed.

■ These characteristics make pub/sub perfectly suited for distributed applications relying on document-centric communication.

slide-4
SLIDE 4

Event schema and subscription models

■ Events represent information structured following an event schema.
■ The event schema is fixed, defined a priori, and known to all the participants.
■ It defines a set of fields or attributes, each constituted by a name and a type. The types allowed depend on the specific implementation, but basic types (like integers, floats, booleans, strings) are usually available.
■ Given an event schema, an event is a collection of values, one for each attribute defined in the schema.

slide-5
SLIDE 5

■ Example: suppose we are dealing with an application whose purpose is to distribute updates about computer-related blogs.

Event Schema

name          type          allowed values
blog_name     string        ANY
address       URL           ANY
genre         enumeration   [hardware, software, peripherals, development]
author        string        ANY
abstract      string        ANY
rating        integer       [1-5]
update_date   date          >1-1-1970 00:00

Event

name          value
blog_name     Prad.de
address       http://www.prad.de/en/index.html
genre         peripherals
author        Mark Hansen
abstract      “The review of the new TFT panel...”
rating        4
update_date   26-4-2006 17:58

slide-6
SLIDE 6

■ Subscribers express their interest in specific events by issuing subscriptions.
■ A subscription is, generally speaking, a constraint expressed on the event schema.
■ The Event Notification Service will notify an event e to a subscriber x only if the values that define the event satisfy the constraint defined by one of the subscriptions s issued by x. In this case we say that e matches s.
■ Subscriptions can take various forms, depending on the subscription language and model employed by each specific implementation.
■ Example: a subscription can be a conjunction of constraints, each expressed on a single attribute. Each constraint can be as simple as a comparison operator (>, =, <) applied to an integer attribute, or as complex as a regular expression applied to a string.
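The matching rule for a conjunction of single-attribute constraints can be sketched as follows. The event reuses a few attributes of the blog example; the `Constraint` tuple layout and the helper names `regex_match` and `matches` are assumptions chosen for this illustration, not part of any specific subscription language.

```python
import operator
import re
from typing import Any, Callable, Dict, List, Tuple

# One constraint = (attribute name, binary operator, operand).
Constraint = Tuple[str, Callable[[Any, Any], bool], Any]


def regex_match(value: str, pattern: str) -> bool:
    """Constraint operator for strings: the value contains a regex match."""
    return re.search(pattern, value) is not None


def matches(event: Dict[str, Any], subscription: List[Constraint]) -> bool:
    """True iff the event satisfies every constraint (a conjunction)."""
    return all(
        attr in event and op(event[attr], operand)
        for attr, op, operand in subscription
    )


event = {
    "blog_name": "Prad.de",
    "genre": "peripherals",
    "author": "Mark Hansen",
    "rating": 4,
}

# "peripherals blogs rated at least 3 whose author's first name is Mark"
s = [
    ("genre", operator.eq, "peripherals"),
    ("rating", operator.ge, 3),
    ("author", regex_match, r"^Mark\b"),
]
```

Here `matches(event, s)` holds, while adding a constraint such as `("rating", operator.gt, 4)` makes the same event fall outside the subscription.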

slide-7
SLIDE 7

■ From an abstract point of view, the event schema defines an n-dimensional event space (where n is the number of attributes).
■ In this space each event e represents a point.
■ Each subscription s identifies a subspace.
■ An event e matches the subscription s if, and only if, the corresponding point is included in the portion of the event space delimited by s.

slide-8
SLIDE 8

■ Depending on the subscription model used, we distinguish various flavors of publish/subscribe:

■ Topic-based
■ Hierarchy-based
■ Content-based
■ Type-based
■ Concept-based
■ XML-based
■ ...

slide-9
SLIDE 9

■ Topic-based selection: data published in the system is mostly unstructured, but each event is “tagged” with the identifier of the topic it is published in. Subscribers issue subscriptions containing the topics they are interested in.
■ A topic can thus be represented as a “virtual channel” connecting producers to consumers. For this reason the problem of data distribution in topic-based publish/subscribe systems is considered quite close to group communication.

slide-10
SLIDE 10

■ Hierarchy-based selection: in this case too, each event is “tagged” with the topic it is published in, and subscribers issue subscriptions containing the topics they are interested in.
■ Contrary to the previous model, here topics are organized in a hierarchical structure which expresses a notion of containment between topics. When a subscriber subscribes to a topic, it will receive all the events published in that topic and in all the topics present in the corresponding sub-tree.
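The containment rule can be sketched in a few lines, assuming (purely for illustration) that topics are encoded as '/'-separated paths; the function name `covers` is likewise hypothetical.

```python
def covers(subscribed: str, published: str) -> bool:
    """True if `published` is the subscribed topic or lies in its sub-tree.

    Topics are assumed to be '/'-separated paths such as 'sports/football'.
    """
    sub = subscribed.strip("/").split("/")
    pub = published.strip("/").split("/")
    # The subscribed path must be a prefix of the published path,
    # compared component by component (not character by character).
    return pub[:len(sub)] == sub
```

Comparing whole path components avoids false matches between unrelated topics that merely share a textual prefix, such as `sports` and `sportscar`.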

slide-11
SLIDE 11

■ Content-based selection: the data published in the system is mostly structured. Each subscription can be expressed as a conjunction of constraints expressed on attributes. The Event Notification Service filters out useless events before notifying a subscriber.

[Figure: content-based filtering example with two events, event1 (e1): name=Acme cables, value=23$ and event2 (e2): name=Acme RE, value=18$]

slide-12
SLIDE 12

General architecture

■ The Event Notification Service is usually implemented as a:

■ Centralized service: the ENS is implemented on a single server.
■ Distributed service: the ENS is constituted by a set of nodes, event brokers, which cooperate to implement the service.

■ The latter is usually preferred for large settings where scalability is a fundamental issue.

slide-13
SLIDE 13
Event routing

■ Modern ENSs are implemented through a set of processes, called event brokers, forming an overlay network.
■ Each client (publisher or subscriber) accesses the service through a broker that masks the system complexity.
■ An event routing mechanism routes each event inside the ENS from the broker where it is published to the broker(s) where it must be notified.

slide-14
SLIDE 14

■ Event flooding: each event is broadcast from the publisher to the whole system.
■ The implementation is straightforward but very expensive.
■ This solution has the highest message overhead, with no memory overhead.

[Figure: broker overlay whose subscribers hold filters on attribute x: x>30, x=167, x<18 AND x>10, x=30 OR x>200, x=22, x=30, x<>30, x<5, x>10, x>40]

slide-15
SLIDE 15

■ Subscription flooding: each subscription is copied to every broker, in order to build locally complete subscription tables. These tables are then used to locally match events and directly notify interested subscribers. This approach suffers from a large memory overhead, but event diffusion is optimal. It is impractical in applications where subscriptions change frequently.

[Figure: broker overlay in which a broker stores a complete subscription table mapping each filter to a subscriber address, e.g. x>30 → IP x, x<>30 → IP y, x<5 → IP z, x>40 → IP w, x>10 → IP xyz]

slide-16
SLIDE 16

■ Filter-based routing: subscriptions are partially diffused in the system and used to build routing tables. These tables are then exploited during event diffusion to dynamically build a multicast tree that (hopefully) connects the publisher to all, and only, the interested subscribers.

[Figure: broker overlay whose subscribers hold filters on attribute x: x>30, x=167, x<18 AND x>10, x=30 OR x>200, x=22, x=30, x<>30, x<5, x>10, x>40]
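A toy version of filter-based routing on an acyclic broker overlay can be sketched as follows. The `Broker` class, its attribute names and the `link` helper are hypothetical. For brevity this sketch propagates every subscription along the whole tree; an actual system would diffuse subscriptions only partially, for example by not forwarding a filter already covered by one sent earlier.

```python
from collections import defaultdict
from typing import Callable, Dict, List, Optional

Filter = Callable[[int], bool]  # a filter over a single integer attribute x


class Broker:
    def __init__(self, name: str) -> None:
        self.name = name
        self.neighbors: List["Broker"] = []
        self.local: List[Filter] = []        # filters of local subscribers
        # routing table: neighbor name -> filters reachable through it
        self.table: Dict[str, List[Filter]] = defaultdict(list)
        self.seen: List[int] = []            # events that reached this broker
        self.delivered: List[int] = []       # events notified locally

    def subscribe(self, flt: Filter) -> None:
        self.local.append(flt)
        for n in self.neighbors:
            n.learn(flt, self)

    def learn(self, flt: Filter, came_from: "Broker") -> None:
        # Events matching flt must be sent toward came_from.
        self.table[came_from.name].append(flt)
        for n in self.neighbors:
            if n is not came_from:
                n.learn(flt, self)

    def publish(self, x: int, came_from: Optional["Broker"] = None) -> None:
        self.seen.append(x)
        if any(f(x) for f in self.local):
            self.delivered.append(x)
        # Forward only on links that lead to at least one matching filter.
        for n in self.neighbors:
            if n is not came_from and any(f(x) for f in self.table[n.name]):
                n.publish(x, self)


def link(a: Broker, b: Broker) -> None:
    a.neighbors.append(b)
    b.neighbors.append(a)
```

With brokers a - b - c in a chain and d attached to b, a subscription x>30 issued at c makes an event x=40 published at a travel a → b → c only, while x=5 never leaves a.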

slide-17
SLIDE 17

[Figure: filter-based routing example, animation step. Each broker's routing table associates a neighbor with a filter, with entries such as ANY, x>=30 OR (x<18 AND x>10), x>10 OR x<5, x>10 and x<5.]

slide-18
SLIDE 18

[Figure: filter-based routing example, animation step. Each broker's routing table associates a neighbor with a filter, with entries such as ANY, x>=30 OR (x<18 AND x>10), x>10 OR x<5, x>10 and x<5.]

slide-19
SLIDE 19

[Figure: filter-based routing example, animation step. Each broker's routing table associates a neighbor with a filter, with entries such as ANY, x>=30 OR (x<18 AND x>10), x>10 OR x<5, x>10 and x<5.]
slide-20
SLIDE 20

■ Rendez-vous routing: it is based on two functions, namely SN and EN, used to associate subscriptions and events, respectively, to brokers in the system.
■ Given a subscription s, SN(s) returns a set of nodes which are responsible for storing s and forwarding received events matching s to all the subscribers that issued it.
■ Given an event e, EN(e) returns a set of nodes which must receive e to match it against the subscriptions they store.
■ Event routing is a two-phase process: first an event e is sent to all brokers returned by EN(e), then those brokers match it against the subscriptions they store and notify the corresponding subscribers.
■ This approach works only if, for each subscription s and event e such that e matches s, the intersection between EN(e) and SN(s) is not empty (mapping intersection rule).
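One simple way to satisfy the mapping intersection rule is to partition the attribute space among brokers. The sketch below assumes a single integer attribute x in 0..99, four brokers with hypothetical names, and subscriptions expressed as half-open ranges; none of these choices comes from a specific system.

```python
from typing import Set, Tuple

BROKERS = ["b0", "b1", "b2", "b3"]
DOMAIN = 100                     # x takes values 0..99
WIDTH = DOMAIN // len(BROKERS)   # each broker owns 25 consecutive values

Subscription = Tuple[int, int]   # half-open range: lo <= x < hi


def EN(x: int) -> Set[str]:
    """Brokers that must receive an event with attribute value x."""
    return {BROKERS[x // WIDTH]}


def SN(s: Subscription) -> Set[str]:
    """Brokers that must store subscription s: every broker whose
    slice of the attribute space overlaps the subscribed range."""
    lo, hi = s
    return {BROKERS[i] for i in range(lo // WIDTH, (hi - 1) // WIDTH + 1)}


def matches(x: int, s: Subscription) -> bool:
    lo, hi = s
    return lo <= x < hi
```

Because a subscription is stored on every broker owning part of its range, any matching event is guaranteed to meet it on the broker that owns the event's value. The price is that wide subscriptions are replicated on many brokers.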

slide-21
SLIDE 21

■ Rendez-vous routing: example.
■ Phase 1: two nodes issue the same subscription S.
■ SN(S) = {4,a}

slide-22
SLIDE 22

■ Rendez-vous routing: example.
■ Phase 2: an event e matching S is routed toward the rendez-vous node, where it is matched against S.
■ EN(e) = {5,6,a}
■ Broker a is the rendez-vous point between event e and subscription S.

slide-23
SLIDE 23

■ A generic architecture of a publish/subscribe system:

[Figure: generic architecture of a publish/subscribe system]

From “Distributed Event Routing in Publish/Subscribe Communication Systems: a survey”, R. Baldoni, L. Querzoni, S. Tarkoma, A. Virgillito, MIDLAB tech. rep. 2007, to appear (Springer)
slide-24
SLIDE 24

Which pub-sub for a given environment

■ Type of dynamics of subscriptions
■ Type of dynamics of nodes (churn)
■ Type of dynamics of mobility

slide-25
SLIDE 25

Which pub-sub for a given environment

Type of dynamics of nodes (churn)

Pub-sub with broker overlay

  • managed environment
  • longlife peers
  • no churn

Pub-sub with structured overlay

  • unmanaged/managed environment
  • shortlife peers
  • low churn

Pub-sub with unstructured overlay

  • unmanaged environment
  • shortlife peers
  • high churn
slide-26
SLIDE 26

Future scenarios for pub/sub

■ Financial Infrastructures (the CoMiFin Project)

slide-27
SLIDE 27

Future scenarios for pub/sub

■ Smart Houses (the SM4ALL Project)

[Figure: SM4ALL architecture. Layers: User Layer, Composition Layer, Pervasive Layer. Components shown: brain-computer interface, traditional interface, composition engine, orchestration engine (embedded on device), user profiler & context manager, composite domotic service specification, goal(s) / desiderata, service(s) templates, repository, local repository (embedded on device), data distribution bus (P2P), services (embedded on device), middleware (embedded on device), ad hoc communications, device/sensor/appliance]
slide-28
SLIDE 28

SIENA

slide-29
SLIDE 29

■ Antonio Carzaniga, Matthew J. Rutherford, Alexander J. Wolf, “A Routing Scheme for Content-Based Networking”, INFOCOM 2004

slide-30
SLIDE 30

■ Each node has a service interface consisting of two operations:

■ send_message(m)
■ set_predicate(p)

■ A predicate is a disjunction of conjunctions of constraints on individual attributes.
■ A content-based network can be seen as a dynamically configurable broadcast network, where each message is treated as a broadcast message whose broadcast tree is dynamically pruned using content-based addresses.

slide-31
SLIDE 31

■ Combined Broadcast and Content-Based (CBCB) routing scheme.

■ Content-based layer: “prunes” broadcast forwarding paths
■ Broadcast layer: diffuses messages in the network
■ Overlay point-to-point network: manages connections

[Figure: broker overlay whose subscribers hold filters on attribute x: x>30, x=167, x<18 AND x>10, x=30 OR x>200, x=22, x=30, x<>30, x<5, x>10, x>40]

slide-32
SLIDE 32

■ The broadcast layer:

■ A broadcast function B : N × I → I* is available at each router. Given a source node s and an input interface i, it returns a set of output interfaces.
■ The broadcast function defines a broadcast tree rooted at each source node.
■ The broadcast function satisfies the all-pairs path symmetry property: for each pair of nodes x and y, the broadcast function defines two broadcast trees T_x and T_y, rooted at nodes x and y respectively, such that the path x⇝y in T_x is congruent to the reverse of the path y⇝x in T_y.

slide-33
SLIDE 33

■ Example:

[Figure]

slide-34
SLIDE 34

■ Example:

[Figure]

slide-35
SLIDE 35

■ The content-based layer:

■ Maintains forwarding state in the form of a content-based forwarding table. The table, for each node, associates a content-based address to each interface.

slide-36
SLIDE 36

■ The message forwarding mechanism:

■ The content-based forwarding table is used by a forwarding function F_c that, given a message m, selects the subset of interfaces associated with predicates matching m.
■ The result of F_c is then combined with the broadcast function B, computed for the original source of m.
■ A message is therefore forwarded along the set of interfaces returned by the following formula:

(B(source(m), incoming_if(m)) ∪ {I_0}) ∩ F_c(m)
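The forwarding formula translates directly into set operations. In the sketch below the forwarding table is a plain dictionary from interface name to predicate, the result of the broadcast function B is passed in as a precomputed set, and I_0 is assumed to denote the router's local interface. The function names and interface labels are hypothetical.

```python
from typing import Callable, Dict, Iterable, Set

Message = Dict[str, int]
Predicate = Callable[[Message], bool]


def content_forward(table: Dict[str, Predicate], m: Message) -> Set[str]:
    """F_c: interfaces whose content-based address matches m."""
    return {iface for iface, pred in table.items() if pred(m)}


def output_interfaces(
    broadcast_out: Iterable[str],     # B(source(m), incoming_if(m))
    table: Dict[str, Predicate],
    m: Message,
    local_if: str = "I0",
) -> Set[str]:
    """(B(source(m), incoming_if(m)) ∪ {I_0}) ∩ F_c(m)"""
    return (set(broadcast_out) | {local_if}) & content_forward(table, m)


# Example forwarding table of one router (illustrative values only).
table = {
    "I0": lambda m: m["x"] > 100,   # local applications
    "I1": lambda m: m["x"] > 10,
    "I2": lambda m: m["x"] > 10,
    "I3": lambda m: m["x"] < 5,
}
```

With a broadcast tree that continues on I1 and I3, a message with x=22 leaves only on I1: I2 matches the content but is not on the broadcast tree, and I3 is on the tree but does not match.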

slide-37
SLIDE 37

■ Example:

[Figure]

slide-38
SLIDE 38

■ Forwarding tables maintenance:

■ Push mechanism based on receiver advertisements.
■ Pull mechanism based on sender requests and update replies.

■ Receiver advertisements (RAs):

■ are issued by nodes periodically and/or when the node changes its local content-based address p_0.
■ Content-based RA ingress filtering: a router receiving through interface i an RA issued by node r and carrying content-based address p_RA first verifies whether or not the content-based address p_i associated with interface i covers p_RA. If p_i covers p_RA, then the router simply drops the RA.
■ Broadcast RA propagation: if p_i does not cover p_RA, then the router computes the set of next-hop links on the broadcast tree rooted in r (i.e., B(r, i)) and forwards the RA along those links.
■ Routing table update: if p_i does not cover p_RA, then the router also updates its routing table, adding p_RA to p_i, computing p_i ← p_i ∨ p_RA.
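The three RA rules can be sketched for a single router as follows. To keep the covering test trivial, a content-based address is modeled here as a finite set of integer values, so that "covers" becomes the superset relation and ∨ becomes set union. This modeling choice, the function name `process_ra` and the interface labels are assumptions for illustration.

```python
from typing import Callable, Dict, FrozenSet, Set

Address = FrozenSet[int]  # content-based address as a set of accepted values


def process_ra(
    table: Dict[str, Address],
    i: str,                              # interface the RA arrived on
    p_ra: Address,                       # address carried by the RA
    issuer: str,                         # node r that issued the RA
    B: Callable[[str, str], Set[str]],   # broadcast function
) -> Set[str]:
    """Return the interfaces on which the RA is forwarded (empty if dropped)."""
    p_i = table.get(i, frozenset())
    # Ingress filtering: p_i covers p_RA, so the RA carries nothing new.
    if p_i >= p_ra:
        return set()
    # Routing table update: p_i <- p_i OR p_RA.
    table[i] = p_i | p_ra
    # Broadcast propagation along the tree rooted at the issuer.
    return B(issuer, i)
```

Note that the table entry only ever grows: an RA carrying a narrower address is dropped and never shrinks p_i.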

slide-39
SLIDE 39

■ Example: Broker 6 issues subscription s1

[Figure: per-broker forwarding tables with columns i (interface) and pred (predicate), containing entries for s1]

slide-40
SLIDE 40

■ Example: Broker 2 issues subscription s2≺s1

[Figure: per-broker forwarding tables with columns i (interface) and pred (predicate), containing entries for s1 and, on some brokers, s2]

slide-41
SLIDE 41

■ Notice that, because of the ingress filtering rule, the RA protocol can only widen the selection of the content-based addresses stored in routing tables. In the long run, this may cause an “inflation” of those content-based addresses.
■ Example: Broker 6 substitutes its predicate with s3≺s1

[Figure: per-broker forwarding tables with columns i (interface) and pred (predicate); entries for s1 are shown alongside the new predicate s3]

slide-42
SLIDE 42


slide-43
SLIDE 43

SCRIBE

■ Miguel Castro, Peter Druschel, Anne-Marie Kermarrec and Antony Rowstron, “SCRIBE: A large-scale and decentralized application-level multicast infrastructure”, JSAC, 2002.

slide-44
SLIDE 44

■ Scribe is a topic-based publish/subscribe system able to support a large number of groups with a potentially large number of publishers and subscribers.
■ Each user in the system (publisher or subscriber) is also a broker. The event notification service is therefore constituted by all the users.
■ Users can join and leave the system. The event notification service can therefore change at runtime.
■ Scribe is built upon Pastry, a peer-to-peer location and routing service.
■ Pastry is used to build and maintain the application-level topology that connects brokers in the event notification service.
■ Pastry also provides applications with efficient primitives for object storage and location.

slide-45
SLIDE 45

■ Pastry implements a Distributed Hash Table:

■ Each object is associated with a key.
■ Each key is stored (together with the corresponding objects) in a node.
■ Each object can be efficiently located and retrieved knowing its key.

■ Each node participating in Pastry is identified by a 128-bit NodeID obtained by applying a hash function h to its IP address.
■ The NodeID is expressed in base 2^b, where b is a configuration parameter.
■ The function h evenly distributes node identifiers in the circular key-space [0, 2^128 - 1].
■ Each object is stored on the node whose NodeID is numerically closest to the object's key.

More details on Pastry...

slide-46
SLIDE 46

■ The main function provided by Pastry is route(msg,key).
■ Routing is realized by matching key prefixes with the nodes stored in each routing table.
■ In each routing step, the current node forwards the message to a node whose NodeID shares with the target key a prefix that is at least one digit longer than the prefix that the key shares with the current NodeID.
■ If no such node is found in the routing table, the message is forwarded to a node whose NodeID shares with the key a prefix as long as the current node's, but is numerically closer to the key than the current NodeID.
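The two routing rules can be sketched as a next-hop selection over a flat list of known nodes. This is a simplification: identifiers are short hexadecimal strings (base 2^b with b=4), the list `known` stands in for Pastry's routing table and leaf set, and numeric distance ignores the wrap-around of the circular key-space. The function names are hypothetical.

```python
from typing import List, Optional


def shared_prefix_len(a: str, b: str) -> int:
    """Number of leading digits that two identifiers have in common."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


def next_hop(current: str, key: str, known: List[str]) -> Optional[str]:
    """Pick the next node for `key`, or None if the current node is the
    best candidate among the nodes it knows."""
    cur_len = shared_prefix_len(current, key)

    # Rule 1: a node sharing a strictly longer prefix with the key.
    longer = [n for n in known if shared_prefix_len(n, key) > cur_len]
    if longer:
        return max(longer, key=lambda n: shared_prefix_len(n, key))

    # Rule 2: same prefix length, but numerically closer to the key.
    def dist(n: str) -> int:
        return abs(int(n, 16) - int(key, 16))

    closer = [
        n for n in known
        if shared_prefix_len(n, key) == cur_len and dist(n) < dist(current)
    ]
    return min(closer, key=dist) if closer else None
```

Since every hop either lengthens the shared prefix or reduces the numeric distance to the key, the procedure cannot loop.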

slide-47
SLIDE 47

■ Scribe uses the key-node mapping provided by Pastry to assign a rendez-vous node to each topic:

■ Each topic t (called Group in Scribe) is mapped to a key by applying h(t)
■ EN(e)=h(e), SN(s)=h(s)

■ Membership management:

■ Joining a group
■ Leaving a group

■ Message diffusion

slide-48
SLIDE 48

[Figure: Scribe example. Legend: Publisher, Subscriber, Pure forwarder, Rendez-vous node]

First Open Workshop Budapest 21-3-2007

slide-49
SLIDE 49


slide-50
SLIDE 50

■ R. Baldoni, R. Beraldi, V. Quema, L. Querzoni, S. Tucci Piergiovanni, “TERA: Topic-based Event Routing for Peer-to-Peer Architectures”, International Conference on Distributed Event-Based Systems (DEBS), 2007.

slide-51
SLIDE 51

■ Traffic confinement can be realized by solving three problems:

■ Interest clustering
Subscribers sharing similar interests should be arranged in the same cluster; ideally, given an event, all and only the subscribers interested in that event should be grouped in a single cluster.

■ Outer-cluster routing
Events can be published anywhere in the system. We need a mechanism able to bring each event from the node where it is published to at least one interested subscriber.

■ Inner-cluster diffusion
Once a subscriber receives an event it can simply broadcast it in the cluster it is part of.

slide-52
SLIDE 52

■ Scribe [Castro et al., IEEE Journal on Selected Areas in Communications, vol. 20, no. 8, 2002]

■ Topic-based publish/subscribe implemented on top of DHTs.
■ For each topic a single node is responsible for acting as a rendez-vous point between published events and issued subscriptions.
■ Problems:

■ single points of failure;
■ hot spots;
■ partial traffic confinement.

[Figure: Scribe example. Legend: Publisher, Subscriber, Pure forwarder, Rendez-vous node]
slide-53
SLIDE 53

■ A two-layer infrastructure:

■ All clients are connected by a single overlay network at the lower layer (general overlay).
■ Various overlay network instances at the upper layer connect clients subscribed to the same topics (topic overlays).

■ Event diffusion:

■ The event is routed in the general overlay toward one of the nodes subscribed to the target topic.
■ This node acts as an access point for the event, which is then diffused in the correct topic overlay (inner-cluster diffusion).
slide-54
SLIDE 54

TERA

■ Event routing in the general overlay is realized through a random walk.
■ The walk stops at the first broker that knows an access point for the target topic.

[Figure: random walk across brokers, each holding a table mapping topics to access points (AP), e.g. {x → B1, a → B5}, {t → B1, y → B6}, {a → B5, f → B6}, {e → B4, h → B4}]
slide-55
SLIDE 55

■ Each node locally maintains an Access Point Table (APT).
■ Each entry in the APT is a pair <topic, node address>.
■ An entry <t,n> represents the fact that n is an access point for topic t.
■ The length of the APT is fixed.
■ Goal:

■ each topic in the APT must be a uniform random sample among all the topics in the system;
■ the access point associated to a topic in an APT must be a uniform random sample among all the nodes subscribed to that topic.

[Figure: example APT with entries x → B1, a → B5]

slide-56
SLIDE 56

■ Subscription advertisement:

■ each node periodically advertises its subscriptions to a set of nodes chosen uniformly at random among the population;
■ each advertisement is a set of pairs <topic, popularity>;
■ an advertisement <t,p> represents the fact that there are (approximately) p nodes subscribed to topic t.

■ APT update. When a node receives an advertisement <t,p> from node n:

■ if the APT already contains an entry <t,m> for topic t, it simply sets m=n;
■ otherwise it adds a new entry <t,n> to the APT with probability 1/p.
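The APT update rule can be sketched as follows. The class and method names are hypothetical. The slides state that the APT has a fixed length but do not say what happens when a new entry must enter a full table, so the random eviction used below is an assumption of this sketch, not a description of TERA.

```python
import random
from typing import Dict, Optional


class AccessPointTable:
    def __init__(self, capacity: int, rng: Optional[random.Random] = None) -> None:
        self.capacity = capacity
        self.entries: Dict[str, str] = {}   # topic -> access point address
        self.rng = rng or random.Random()

    def on_advertisement(self, topic: str, popularity: int, sender: str) -> None:
        """Handle an advertisement <topic, popularity> received from sender."""
        if topic in self.entries:
            # Known topic: the sender simply becomes the new access point.
            self.entries[topic] = sender
        elif self.rng.random() < 1.0 / popularity:
            # Unknown topic: accept with probability 1/p, so that popular
            # topics (advertised by many nodes) are not over-represented.
            if len(self.entries) >= self.capacity:
                # ASSUMPTION: evict a uniformly random entry when full.
                victim = self.rng.choice(sorted(self.entries))
                del self.entries[victim]
            self.entries[topic] = sender
```

The 1/p acceptance probability compensates for the fact that a topic with p subscribers is advertised roughly p times as often as a topic with a single subscriber.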

slide-57
SLIDE 57

■ OMPs (overlay management protocols): Newscast, Cyclon, etc.

slide-58
SLIDE 58

■ “Mobile ad-hoc networks”

slide-59
SLIDE 59

■ R. Baldoni, R. Beraldi, G. Cugola, M. Migliavacca, L. Querzoni, “Structure-less Content-Based Routing in Mobile Ad Hoc Networks”, International Conference on Pervasive Services (ICPS), 2005.

slide-60
SLIDE 60

■ Environment:

■ Mobile ad-hoc networks (MANETs).
■ Mobile nodes that communicate through wireless links.
■ No fixed communication infrastructure.
■ Network topology is defined by node positioning and by the physical characteristics of the environment.
■ Network topology is continuously modified by node movements.
■ Limited available resources both on nodes and in the network.

■ Existing solutions:

■ Mesh based + multicast [E. Yoneki and J. Bacon, Pervasive Computing and Communications Workshops, 2004]
■ Spatial scoping [R. Meier and V. Cahill, International Workshop on Distributed Event-Based Systems, 2002] [H. Zhou and S. Singh, MobiHoc, 2000]

■ Contribution: routing structures are difficult to maintain in a dynamic network. Exploit probabilistic event filtering.

slide-61
SLIDE 61

Potential Advantages of Pub/Sub for Mobile Wireless

■ Decoupling of publishers and subscribers aids mobility
■ Decoupling of publishers and subscribers aids disconnected operation
■ Multicast delivery can exploit intrinsic broadcast properties of wireless

slide-62
SLIDE 62

Scenarios of mobility

■ One-hop mobile network
■ Centralized vs distributed dispatcher
■ JEDI 2001, Huang 2001

■ Multi-hop mobile network (MANET)
■ No wired infrastructure
■ Frequent changes in topology
■ (2002 – now…)

slide-63
SLIDE 63

Mobile ad-hoc network: issues

■ Constantly changing topology
■ Communication is less “reliable” than in wired systems due to disconnections (driven by mobility or voluntary)
■ Effects on how to design a middleware for pub/sub (e.g., to improve the performance of the event dispatching system by reducing expressiveness)

slide-64
SLIDE 64

Architectural Model for Mobile ad-hoc networks

[Figure: three protocol stacks. (1) Application / Pub/sub / Routing / MAC; (2) Application / PS-Routing / MAC; (3) Application / Pub/sub / MAC. References shown: [Huang et al 2002, Baldoni et al 2005, Bahemi et al 2005] [Mottola et al 2005, Bacon et al 2005] [Anceaume et al 2002, Picco et al 2003]]

slide-65
SLIDE 65

Topological Reconfiguration

■ Assumption: the underlying tree is kept connected and loop-free by some routing algorithm.
■ Target: rearrange the routes traversed by events in response to changes in the topology of the network of brokers.
■ Separation of concerns between connectivity layer and event dispatching layer.
■ Retrofitting reliability (events lost during reconfiguration) through gossip-based algorithms.

[Figure: protocol stack: Application / CB Pub/sub / Routing algorithm / MAC]

slide-66
SLIDE 66

Integration Approach

■ Assumption: the underlying tree is kept connected and loop-free by MAODV.
■ Target: maintaining a tree-shaped overlay network on top of the dynamic topology of a MANET.
■ Integration between connectivity layer and event dispatching layer.

[Figure: protocol stack: Application / PS-Routing / MAC]

slide-67
SLIDE 67

Broadcast-Based approach

■ Usage of multicast provided by the MAC
■ Routing structures are difficult to keep consistent in a dynamic network.
■ Exploit probabilistic event filtering

[Figure: protocol stack: Application / Pub/sub / MAC]

slide-68
SLIDE 68

■ Using deterministic structures for event filtering (like SIENA’s routing tables) requires huge overhead for their maintenance.
■ The authors propose a different strategy:

■ Lack of any predefined logical structure as a support to event filtering.
■ Event forwarding exploits the implicit local broadcast primitive provided by the wireless communication medium.
■ Each broker decides autonomously if and when a received event must be forwarded.
■ The decision is based on its proximity to target subscribers.
■ Proximity is estimated.

slide-69
SLIDE 69

■ Proximity to a subscriber is estimated by leveraging time-distance correlation.

■ Each broker periodically broadcasts a beacon within its communication range.

■ The beacon contains a summary of the subscriptions the broker stores.

■ Each time a broker receives a beacon, it updates a proximity table by adding:

■ The identifier of the broker that sent the beacon.
■ The subscriptions summary.
■ A time reference that is set to 0.

■ The time reference is periodically increased if a beacon from the same broker is not received.

■ When a time reference becomes greater than a predefined value T, the corresponding entry is removed from the proximity table.

■ Proximity to a broker Bi is defined as the ratio between the time reference recorded in the proximity table and T. (If there is no entry for Bi, the proximity value is 1.)
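
The beacon-driven proximity table can be sketched as follows. This is only an illustration under stated assumptions: the class and method names (`ProximityTable`, `on_beacon`, `tick`) are hypothetical, time is counted in whole periods, and every entry is aged on each tick because a received beacon resets its time reference to 0 anyway.

```python
from dataclasses import dataclass, field


@dataclass
class ProximityTable:
    """Per-broker proximity table driven by beacons (hypothetical names)."""
    max_age: int  # the predefined threshold T
    entries: dict = field(default_factory=dict)  # broker id -> [summary, time reference]

    def on_beacon(self, broker_id, summary):
        # A received beacon (re)creates the entry with time reference 0.
        self.entries[broker_id] = [summary, 0]

    def tick(self):
        # Called once per period: age every entry and drop those older than T.
        for broker_id in list(self.entries):
            self.entries[broker_id][1] += 1
            if self.entries[broker_id][1] > self.max_age:
                del self.entries[broker_id]

    def proximity(self, broker_id):
        # Ratio between the time reference and T; 1 when no entry exists.
        if broker_id not in self.entries:
            return 1.0
        return self.entries[broker_id][1] / self.max_age
```

A value close to 0 means a beacon from that broker was heard recently, which the approach interprets as the broker being nearby.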

slide-70
SLIDE 70

■ Event routing is realized through a store, delay and cancel-or-forward approach:

■ A destination list containing target subscribers is attached to each event (initially empty). A proximity value is associated with each target.

■ Each time a broker receives an event:

■ It checks whether the event matches locally stored subscriptions.
■ If its proximity table contains an entry for a broker listed in the destination list, with a lower proximity value, it schedules the event for forwarding.

■ The event is forwarded with a delay that is proportional to the proximity value.

■ If the broker receives the event again with a lower proximity value, it de-schedules the forwarding.

■ A credit-based mechanism allows a limited number of event forwardings even if the forwarding conditions are not met.
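
The forwarding decision can be sketched as below. The function names, the `delay_unit` constant, and the choice of the best (smallest) proximity when several targets qualify are assumptions made for illustration; the credit-based mechanism is left out.

```python
def forwarding_delay(destinations, local_proximity, delay_unit=1.0):
    """Return the delay before forwarding, or None when the broker stays silent.

    destinations: target broker id -> proximity value carried by the event.
    local_proximity: target broker id -> this broker's own proximity estimate.
    delay_unit: hypothetical scaling constant for the proportional delay.
    """
    better = [
        local_proximity[target]
        for target, carried in destinations.items()
        if target in local_proximity and local_proximity[target] < carried
    ]
    if not better:
        return None  # forwarding condition not met
    # Delay proportional to the proximity value: closer brokers fire first.
    return delay_unit * min(better)


def should_cancel(scheduled_proximity, overheard_proximity):
    # De-schedule when the same event is overheard with a lower proximity value.
    return overheard_proximity < scheduled_proximity
```

Because closer brokers wait less, they transmit first, and brokers farther away overhear the transmission and cancel their own.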

slide-71
SLIDE 71

slide-72
SLIDE 72

slide-73
SLIDE 73

slide-74
SLIDE 74

slide-75
SLIDE 75

slide-76
SLIDE 76

slide-77
SLIDE 77

slide-78
SLIDE 78

■ Last slide of the Goteborg presentation

■ Additional material follows

slide-79
SLIDE 79

■ “Compositional gossip”

■ Étienne Rivière, Roberto Baldoni, Harry Li, José …, “Compositional gossip: a conceptual architecture for …”, ACM Operating Systems Review, 2007.

slide-80
SLIDE 80

■ Gossip-based protocols share common general functionalities:

■ (i) selecting peers with which to exchange information,
■ (ii) determining which set of information to share between nodes,
■ (iii) updating the local view.

■ Building basic blocks to describe complex gossip-based applications:

■ SEL (select): the set of nodes (IP addresses) from which a peer to gossip with may be chosen.

■ EXC (exchange): the set of information (network component samples, that is, nodes, data, etc., depending on the protocol).
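
The three functionalities can be sketched as one gossip round. Everything here is an illustrative assumption rather than the architecture from the cited paper: views are plain sets of node names, the exchange is push-pull, and the merge keeps the first `view_size` names in sorted order only to stay deterministic.

```python
import random


def gossip_round(views, node, rng, sample_size=2, view_size=4):
    """One push-pull gossip exchange started by `node` (illustrative only)."""
    # (i) SEL: pick the gossip partner from the node's current view.
    partner = rng.choice(sorted(views[node]))
    # (ii) EXC: each side picks the set of entries it shares.
    sent = rng.sample(sorted(views[node]), min(sample_size, len(views[node])))
    received = rng.sample(sorted(views[partner]), min(sample_size, len(views[partner])))
    # (iii) Update both local views, never storing a node in its own view.
    views[node] = _merge(views[node], received, node, view_size)
    views[partner] = _merge(views[partner], sent + [node], partner, view_size)
    return partner


def _merge(view, incoming, owner, view_size):
    merged = (set(view) | set(incoming)) - {owner}
    return set(sorted(merged)[:view_size])
```

A real protocol would replace the sorted truncation with its own selection policy, for example one based on entry age or interest proximity.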
slide-81
SLIDE 81


slide-82
SLIDE 82

■ Group Composition block

■ Selection function

■ Membership management
■ Interest proximity
■ Network slicing

■ Node semantics (e.g., topic)

slide-83
SLIDE 83

■ TERA architecture

[Figure: TERA architecture. Labels: Topic-Based Publish/Subscribe Software Logic with subscribe(t), unsubscribe(t), publish(e,t), notify(e,t); Peer sampling with a Group Composition block (membership management, global overlay, push; local node id, size of the view, cycle period, local subscriptions); per-topic Broadcasting with publish(e,i), notify(e,i) over a Group Composition block (interest proximity, topic i overlay, push); the same for topic k.]
slide-84
SLIDE 84

■ Sub-2-sub architecture

slide-85
SLIDE 85

■ Sub-2-sub architecture

slide-86
SLIDE 86

■ Conclusion

■ Publish/subscribe systems are a flexible paradigm for future communication infrastructures.

■ P2P technologies are mature enough to be used in other contexts (data centers, finance, network management, etc.).

■ Time to go to the talk ☺

slide-87
SLIDE 87

■ Exercise

slide-88
SLIDE 88

■ Sender Requests and Update Replies:

■ A router uses sender requests (SRs) to pull content-based addresses from all receivers in order to update its routing table.

■ The results of an SR come back to the issuer of the SR through update replies (URs).

■ The SR/UR protocol is designed to complement the RA protocol. Specifically, it is intended to balance the effect of the address inflation caused by RAs, and also to compensate for possible losses in the propagation of RAs.

■ An SR issued by n is broadcast to all routers, following the broadcast paths defined at each router by the broadcast function B(n, ·).

■ A leaf router in the broadcast tree immediately replies with a UR containing its content-based address p0.

■ A non-leaf router assembles its UR by combining its own content-based address p0 with those of the URs received from downstream routers, and then sends its UR upstream.

■ The issuer of the SR processes incoming URs by updating its routing table. In particular, an issuer receiving a UR carrying predicate pUR from interface i updates its routing table entry for interface i with pi ← pUR.
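
The UR collection can be sketched as a recursive walk over the broadcast tree. This is a simplified illustration under assumptions: predicates are modelled as sets of subscription labels (a set stands for their disjunction), the tree and the names `update_reply` and `refresh_routing_table` are hypothetical, and one interface is assumed per downstream neighbour.

```python
def update_reply(router, tree, own_predicate):
    """Assemble the UR a router sends upstream, as a set of disjuncts."""
    # Start from the router's own content-based address p0 (may be empty).
    reply = set(own_predicate.get(router, ()))
    # Combine it with the URs received from downstream routers.
    for child in tree.get(router, ()):
        reply |= update_reply(child, tree, own_predicate)
    return reply


def refresh_routing_table(issuer, tree, own_predicate):
    # For each interface i of the issuer: p_i <- p_UR.
    return {
        child: update_reply(child, tree, own_predicate)
        for child in tree.get(issuer, ())
    }
```

For instance, with `tree = {5: [4, 3], 4: [1, 2], 3: [6]}` and local predicates `{1: {"s2"}, 2: {"s3"}, 6: {"s1"}}`, broker 5 ends up with `s2 ⋁ s3` on interface 4 and `s1` on interface 3.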

slide-89
SLIDE 89

■ Example: Broker 5 sends a Sender Request (SR) to refresh its forwarding table.

[Figure: broker 5's forwarding table (columns i, pred): interface 4 → s1, interface 3 → s1]

slide-90
SLIDE 90

■ Example: Update Replies (URs) are collected on the paths toward broker 5.

[Figure: URs [s2], [s3], [s2 ⋁ s3], [s2 ⋁ s3] and [ ] travel toward broker 5; its forwarding table (columns i, pred) becomes: interface 4 → s2 ⋁ s3, interface 3 → s2 ⋁ s3]

Exercise on SIENA

slide-91
SLIDE 91

■ Exercise: consider the following system:

[Figure: broker network used in the exercise]

■ The event space is represented by a single numerical attribute x, which can assume real values. Subscriptions can be expressed using the operators <, =, >.

slide-92
SLIDE 92

■ Subscribers issued the following subscriptions.

Subscriber  Subscription
A           x>23
B           x<0 OR x>90
C           x<40
D           x>25 AND x<60
E           x>5 AND x<18
F           x>5 AND x<10
G           x>15 AND x<20
H           x<12
I           x>50

■ First, define a spanning tree rooted at the broker associated with publisher P. Then, for every broker, compute the content-based forwarding table associated with this spanning tree. Finally, compute the path followed by event x=16 through the ENS.

slide-93
SLIDE 93

■ 1: Define a spanning tree rooted at broker 1.

■ Any tree that includes all the brokers is acceptable.

slide-94
SLIDE 94

■ The content of the subscription tables is computed starting from each subscriber and “climbing the tree” toward the root (broker 1).

Broker  Interface  Content-based address
1       2          x>50
1       3          x>23 OR (x<0 OR x>90) OR x<40 OR (x>25 AND x<60)
2       7          x>50
3       4          x>23 OR (x<0 OR x>90)
3       8          x<40 OR (x>25 AND x<60)
4       5          x>23 OR (x<0 OR x>90)
5       6          x>23 OR (x<0 OR x>90)
8       10         x<12 OR (x>15 AND x<20)
8       11         x>5 AND x<10
8       12         x<40 OR (x>5 AND x<18) OR (x>25 AND x<60)
10      9          x<12 OR (x>15 AND x<20)
11      13         x>5 AND x<10
12      14         (x>5 AND x<18) OR (x>25 AND x<60)
14      15         (x>5 AND x<18) OR (x>25 AND x<60)

■ We are referring to a run-time status where we can assume that, independently of the order in which subscriptions were issued, the tables’ content is perfect.

slide-95
SLIDE 95

■ Routing event x=16. Notified subscribers: C, E, G.

■ The table reports which content-based addresses are satisfied by the event (in blue on the original slide; marked in the last column here).

Broker  Interface  Content-based address                               Satisfied by x=16
1       2          x>50                                                no
1       3          x>23 OR (x<0 OR x>90) OR x<40 OR (x>25 AND x<60)    yes
2       7          x>50                                                no
3       4          x>23 OR (x<0 OR x>90)                               no
3       8          x<40 OR (x>25 AND x<60)                             yes
4       5          x>23 OR (x<0 OR x>90)                               no
5       6          x>23 OR (x<0 OR x>90)                               no
8       10         x<12 OR (x>15 AND x<20)                             yes
8       11         x>5 AND x<10                                        no
8       12         x<40 OR (x>5 AND x<18) OR (x>25 AND x<60)           yes
10      9          x<12 OR (x>15 AND x<20)                             yes
11      13         x>5 AND x<10                                        no
12      14         (x>5 AND x<18) OR (x>25 AND x<60)                   yes
14      15         (x>5 AND x<18) OR (x>25 AND x<60)                   yes
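
The exercise can be checked mechanically with a short script. It copies the subscriptions and the forwarding table from the slides; the one assumption, not stated in the slides, is that each interface number identifies the neighbouring broker reached through it. The function names are hypothetical.

```python
subscriptions = {
    "A": lambda x: x > 23,
    "B": lambda x: x < 0 or x > 90,
    "C": lambda x: x < 40,
    "D": lambda x: 25 < x < 60,
    "E": lambda x: 5 < x < 18,
    "F": lambda x: 5 < x < 10,
    "G": lambda x: 15 < x < 20,
    "H": lambda x: x < 12,
    "I": lambda x: x > 50,
}

# (broker, interface) -> content-based address, copied from the table.
forwarding = {
    (1, 2): lambda x: x > 50,
    (1, 3): lambda x: x > 23 or x < 0 or x > 90 or x < 40 or 25 < x < 60,
    (2, 7): lambda x: x > 50,
    (3, 4): lambda x: x > 23 or x < 0 or x > 90,
    (3, 8): lambda x: x < 40 or 25 < x < 60,
    (4, 5): lambda x: x > 23 or x < 0 or x > 90,
    (5, 6): lambda x: x > 23 or x < 0 or x > 90,
    (8, 10): lambda x: x < 12 or 15 < x < 20,
    (8, 11): lambda x: 5 < x < 10,
    (8, 12): lambda x: x < 40 or 5 < x < 18 or 25 < x < 60,
    (10, 9): lambda x: x < 12 or 15 < x < 20,
    (11, 13): lambda x: 5 < x < 10,
    (12, 14): lambda x: 5 < x < 18 or 25 < x < 60,
    (14, 15): lambda x: 5 < x < 18 or 25 < x < 60,
}


def notified(x):
    """Subscribers whose subscription matches the event value x."""
    return sorted(name for name, matches in subscriptions.items() if matches(x))


def route(x, root=1):
    """Hops taken by the event, following every satisfied table entry."""
    hops, frontier = [], [root]
    while frontier:
        broker = frontier.pop(0)
        for (src, dst), matches in forwarding.items():
            if src == broker and matches(x):
                hops.append((src, dst))
                frontier.append(dst)  # assumption: interface id == next broker
    return hops
```

Calling `notified(16)` yields C, E and G, and `route(16)` follows brokers 1, 3 and 8 before splitting toward 10 and 9 on one branch and 12, 14 and 15 on the other.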

slide-96
SLIDE 96

■ On the graph:

slide-97
SLIDE 97


slide-98
SLIDE 98

■ Each node maintains three data structures:

■ Leaf set, Routing table, Neighborhood set

■ Leaf set: contains the L/2 nodes with the numerically closest larger NodeIDs and the L/2 nodes with the numerically closest smaller NodeIDs, relative to the present node’s NodeID.

■ Example: node 60, L=6

LS60: 23 25 53 63 74 83
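
A minimal sketch of the leaf set computation, assuming integer NodeIDs, L of at least 2, and no wrap-around of the identifier space (a ring-based overlay would also wrap around its ends). The function name and the extra NodeIDs 10 and 91 in the usage note are made up for illustration.

```python
def leaf_set(node_id, all_ids, size):
    """Return the size/2 closest smaller and size/2 closest larger NodeIDs."""
    half = size // 2  # assumes size >= 2
    smaller = sorted(i for i in all_ids if i < node_id)
    larger = sorted(i for i in all_ids if i > node_id)
    # The closest smaller ids are at the end of `smaller`,
    # the closest larger ids at the start of `larger`.
    return smaller[-half:] + larger[:half]
```

With the NodeIDs 10, 23, 25, 53, 60, 63, 74, 83, 91, `leaf_set(60, ids, 6)` gives 23, 25, 53, 63, 74, 83, which matches the example for node 60.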

slide-99
SLIDE 99

■ Routing table: a matrix of log_{2^b} N rows and 2^b - 1 columns. Entries in the n-th row match the first n-1 digits of the current NodeID. The n-th digit takes one of the 2^b - 1 possible values other than the n-th digit of the current NodeID.

■ Example: routing table at node 10233102, b=2

Possible digit values:  0            1            2            3
Row 1                   -0-2212102   (1)          -2-2301203   -3-1203203
Row 2                   (0)          1-1-301233   1-2-230203   1-3-021022
Row 3                   10-0-31203   10-1-32102   (2)          10-3-23302
Row 4                   102-0-0230   102-1-1302   102-2-2302   (3)
Row 5                   1023-0-322   1023-1-000   1023-2-121   (3)
Row 6                   10233-0-01   (1)          10233-2-32   empty
Row 7                   (0)          empty        102331-2-0   empty
Row 8                   empty        empty        (2)          empty

Each entry reads: shared prefix, next digit, rest of the NodeID. A digit in parentheses is the current node's own digit for that row.
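
Looking up the routing table cell for a key can be sketched as follows, assuming NodeIDs and keys are digit strings of equal length in base 2^b. The function names are hypothetical and the keys in the usage note are made up to hit two cells of the example table.

```python
def shared_prefix_length(a, b):
    """Number of leading digits that two identifiers have in common."""
    n = 0
    while n < min(len(a), len(b)) and a[n] == b[n]:
        n += 1
    return n


def routing_cell(node_id, key, base_bits=2):
    """Return (row, column) of the routing-table cell used for `key`.

    Rows are 1-based; the column is the value of the first digit on which
    `key` differs from `node_id`. Returns None when key equals node_id.
    """
    if key == node_id:
        return None
    shared = shared_prefix_length(node_id, key)
    return shared + 1, int(key[shared], 2 ** base_bits)
```

At node 10233102, the key 10233232 shares the prefix 10233 and continues with digit 2, so the lookup returns row 6, column 2 (the cell holding 10233-2-32); the key 10323302 returns row 3, column 3 (the cell holding 10-3-23302).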

slide-100
SLIDE 100

■ When a node n wants to subscribe to t (join group t):

■ it invokes route(JOIN[t],h(t));
■ the message is routed toward the rendez-vous node for t;
■ each node n’ along the route checks a local groups list to see if it is currently a forwarder for t;
■ if so, it accepts n as a child and adds it to the local children table;
■ otherwise it adds t to the groups list, adds n to the children table and, finally, invokes route(JOIN[t],h(t)).

■ A node can unsubscribe from t at any time:

■ if it has no children, it sends a LEAVE message to its parent in the diffusion tree;
■ if it still has children for that group, it cannot leave the diffusion tree.

■ Routing is done in two steps:

■ the node that publishes the event for topic t invokes route(MCAST[e],h(t));
■ when the message reaches the rendez-vous point, it is diffused following the links defined by the children tables for that group.
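
The join and diffusion steps can be sketched as follows. This is a simplified illustration under assumptions: the overlay route is passed in as an explicit list of node ids (subscriber first, rendez-vous node last) instead of being computed by the underlying DHT, LEAVE handling is omitted, and the class and function names are hypothetical.

```python
class ScribeNode:
    def __init__(self, node_id):
        self.node_id = node_id
        self.groups = set()   # topics this node is a forwarder for
        self.children = {}    # topic -> set of child node ids


def join(topic, route_path, nodes):
    """Process JOIN[topic] along route_path (subscriber first, rendez-vous last)."""
    child = route_path[0]
    for hop in route_path[1:]:
        node = nodes[hop]
        already_forwarder = topic in node.groups
        node.groups.add(topic)
        node.children.setdefault(topic, set()).add(child)
        if already_forwarder:
            return  # the JOIN stops at the first existing forwarder
        child = hop  # otherwise this hop re-issues the JOIN on its own behalf


def multicast(topic, rendezvous, nodes):
    """Nodes reached when an event is diffused from the rendez-vous node."""
    reached, frontier = [], [rendezvous]
    while frontier:
        current = frontier.pop(0)
        reached.append(current)
        frontier.extend(sorted(nodes[current].children.get(topic, ())))
    return reached
```

If node 121 joins through 74 and 83 to rendez-vous node 73, and node 177 then joins through 83, the second JOIN stops at 83 because it is already a forwarder, and a later multicast from 73 reaches 83, 74, 177 and 121.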

slide-101
SLIDE 101

■ Example

[Figure: per-node tables with columns h(t), children, father for the topic with h(t) = 73; for example, one node lists children 177, 191 and father 83, another lists child 121 and father 74]

slide-102
SLIDE 102


slide-103
SLIDE 103

■ We want every topic to appear with the same probability in every APT, regardless of its popularity.

slide-104
SLIDE 104

■ What is the probability that an event is correctly routed in the general overlay toward an access point?

■ It depends on:

■ uniform randomness of the topics contained in access point tables;
■ access point table size;
■ random walk lifetime.

slide-105
SLIDE 105

■ Neighborhood set: list of the M closest nodes.

■ Node distance is measured using a proximity metric (IP hops, latency, bandwidth, etc.).

■ Nodes in this list are used to update entries in the routing table.

slide-106
SLIDE 106

■ The load imposed on nodes is fairly distributed:

■ no hot spots or single points of failure;
■ nodes that subscribe to more topics bear more load.

slide-107
SLIDE 107

■ Experiments show how the system scales with respect to:

■ Number of subscriptions.
■ Number of topics.
■ Event publication rate.
■ Number of nodes.

■ (The reference figure is given by a simple event flooding approach.)

slide-108
SLIDE 108

■ Experiments show how the system scales with respect to:

■ Number of subscriptions.
■ Number of topics.
■ Event publication rate.
■ Number of nodes.

■ (The reference figure is given by a simple event flooding approach.)

slide-109
SLIDE 109

■ Experiments show how the system scales with respect to:

■ Number of subscriptions.
■ Number of topics.
■ Event publication rate.
■ Number of nodes.

■ (The reference figure is given by a simple event flooding approach.)

slide-110
SLIDE 110

■ Experiments show how the system scales with respect to:

■ Number of subscriptions.
■ Number of topics.
■ Event publication rate.
■ Number of nodes.

■ (The reference figure is given by a simple event flooding approach.)

slide-111
SLIDE 111

■ Experiments show how the system scales with respect to:

■ Number of subscriptions.
■ Number of topics.
■ Event publication rate.
■ Number of nodes.

■ (The reference figure is given by a simple event flooding approach.)

[Figure: plot with two annotated components: "cost incurred to maintain the general overlay" and "cost incurred to diffuse events inside topic overlays"]