CCD: Efficient Customized Content Dissemination in Distributed Publish/Subscribe
H. Jafarpour, B. Hore, S. Mehrotra and N. Venkatasubramanian
Information Systems Group, Dept. of Computer Science, UC Irvine
Outline: customized content dissemination on distributed Pub/Sub (CCD)
- Motivation
- Problem definition and formulation
- CCD algorithm
- Heuristic CCD algorithm
- Experimental evaluation
One or a few generic messages sent to the entire impacted population can lead to under-response or over-response.
Goal: customized notifications are sent to the population using multiple modalities.
Leveraging the pub/sub framework for dissemination of rich content formats, e.g., multimedia content.
The same content format may not be consumable by all subscribers!
Customize content to the required formats before delivery!
[Figure: subscribers labeled "Español".]
How to specify the required formats?
Receiving context:
- Receiving device capabilities: display screen, available software, …
- Communication capabilities: available bandwidth
- User profile: location, language, …
Example subscriptions, each with its receiving context:
- Subscription: Context: PC, DSL, AVI
- Subscription: Context: Phone, 3G, FLV
- Subscription: Context: Laptop, 3G, AVI, Spanish subtitle
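The receiving contexts listed here could be captured in a small record type. The following is a minimal sketch, not the authors' subscription language: the class names, the field names, and the `topic` field are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ReceivingContext:
    """Hypothetical record of the context in which a subscriber receives content."""
    device: str                               # e.g. "PC", "Phone", "Laptop"
    network: str                              # e.g. "DSL", "3G"
    video_format: str                         # e.g. "AVI", "FLV"
    subtitle_language: Optional[str] = None   # e.g. "Spanish"


@dataclass(frozen=True)
class Subscription:
    topic: str                                # hypothetical content filter
    context: ReceivingContext

    def required_format(self):
        """Format the content must be customized to before delivery."""
        return (self.context.video_format, self.context.subtitle_language)


# The three example contexts from the slide.
subscriptions = [
    Subscription("alerts", ReceivingContext("PC", "DSL", "AVI")),
    Subscription("alerts", ReceivingContext("Phone", "3G", "FLV")),
    Subscription("alerts", ReceivingContext("Laptop", "3G", "AVI", "Spanish")),
]

# Distinct formats the system has to produce for this publication.
distinct_formats = {s.required_format() for s in subscriptions}
```

Grouping subscribers by `required_format()` is one way a broker could decide which formats it has to request from upstream.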
How is content customization done?
Adaptation operators, e.g., a transcoder operator:
- Input: original content, size 28MB
- Output: low-resolution, small content suitable for mobile clients, size 8MB
How do we perform content customization in distributed pub/sub infrastructures?
Option 1: Perform all the required customizations at the sender broker
[Figure: dissemination tree annotated with content sizes (28MB, 15MB, 12MB, 8MB); links that carry several formats at once transmit 28+12+8 = 48MB.]
Option 2: Perform all the required customizations at the proxy brokers (leaves)
[Figure: dissemination tree annotated with content sizes (28MB, 15MB, 12MB, 8MB); the original 28MB content is forwarded over the links and the same operator is repeated at several leaves.]
Option 3: Perform all the required customizations in the broker network
[Figure: dissemination tree annotated with content sizes (28MB, 15MB, 12MB, 8MB).]
DHT-based routing scheme; we use Tapestry [ZHS04].
[Figure: broker overlay with the rendezvous point marked.]
For a published content item, we can estimate the dissemination tree in the broker overlay network:
- Using DHT-based routing properties
- The dissemination tree is rooted at the corresponding rendezvous broker
[Figure: dissemination tree rooted at the rendezvous point.]
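As an illustration of how such a tree could be assembled, the sketch below merges per-subscriber overlay routes into one tree. It assumes the route from the rendezvous broker to each matching broker is already known; how Tapestry derives those routes is not modeled, and the broker names are hypothetical.

```python
def build_dissemination_tree(routes):
    """Merge routes (rendezvous -> ... -> matching broker) into a children map.

    routes: dict mapping a matching broker to its route, given as a list of
            broker names that starts at the rendezvous broker.
    Returns a dict mapping every broker on some route to the set of its children.
    """
    children = {}
    for path in routes.values():
        for parent, child in zip(path, path[1:]):
            children.setdefault(parent, set()).add(child)
            children.setdefault(child, set())
    return children


# Hypothetical routes from rendezvous broker "RP" to three matching brokers.
routes = {
    "B3": ["RP", "B1", "B3"],
    "B4": ["RP", "B1", "B4"],
    "B5": ["RP", "B2", "B5"],
}
tree = build_dissemination_tree(routes)
```

Because all routes start at the same rendezvous broker and share prefixes, the merged structure is a tree rooted at that broker.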
Content adaptation graph (CAG):
- Vertices: all possible content formats in the system
- Edges: all available adaptation operators in the system
Example formats:
  Size    Frame size   Frame rate
  28MB    1280x720     30
  15MB    704x576      30
  10MB    352x288      30
  8MB     128x96       30
A transmission (communication) cost is associated with each format:
- Sending content in format Fi from one broker to another incurs the transmission cost of Fi
A computation cost is associated with each operator:
- Performing operator O(i,j) on content incurs the computation cost of O(i,j)
Example CAG:
- V = {F1, F2, F3, F4}, with transmission costs F1/28, F2/15, F3/12, F4/8
- E = {O(1,2), O(1,3), O(1,4), O(2,3), O(2,4), O(3,4)}, with operator costs 60, 60, 60, 25, 25, 25
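The example CAG can be written down as two dictionaries, and the cheapest chain of operators between two formats found with Dijkstra's algorithm. This is a sketch under one assumption: the slide lists the operator costs without saying which operator each belongs to, so the costs are paired with the operators in the order shown.

```python
import heapq

# Transmission cost of each format (from the example CAG).
SIZE = {"F1": 28, "F2": 15, "F3": 12, "F4": 8}

# Operator costs; the pairing of costs to operators is an assumption.
OPS = {
    ("F1", "F2"): 60, ("F1", "F3"): 60, ("F1", "F4"): 60,
    ("F2", "F3"): 25, ("F2", "F4"): 25, ("F3", "F4"): 25,
}


def cheapest_conversion(src, dst, ops=OPS):
    """Minimum total operator cost to turn format src into format dst."""
    dist = {src: 0}
    heap = [(0, src)]
    while heap:
        cost, fmt = heapq.heappop(heap)
        if fmt == dst:
            return cost
        if cost > dist.get(fmt, float("inf")):
            continue  # stale heap entry
        for (i, j), op_cost in ops.items():
            if i == fmt and cost + op_cost < dist.get(j, float("inf")):
                dist[j] = cost + op_cost
                heapq.heappush(heap, (cost + op_cost, j))
    return float("inf")  # dst cannot be produced from src
```

With these costs, converting F1 directly to F4 is cheaper than going through F2, and no operator leads back from F4 to F1.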
A CCD plan for a content item is the dissemination tree in which:
- Each node (broker) is annotated with the operator(s) that are performed on it
- Each link is annotated with the format(s) that are transmitted over it
[Figure: example plan using the CAG with F1/28, F2/15, F3/12, F4/8 and operator costs 60 and 25. Node annotations: {O(1,2),O(2,4)}, {O(2,3)}, and {} for the remaining brokers. Link annotations: {F2}, {F2}, {F4}, {F2}, {F3}, {F4}.]
Communication cost for a plan:
- Sum of the transmission costs of all formats transmitted over all edges
Computation cost for a plan:
- Sum of the computation costs of all operators performed at all plan nodes
Total CCD plan cost:
- Communication cost + computation cost
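These cost definitions translate directly into code. The sketch below evaluates a small plan; the tree, the broker names, and the pairing of operator costs to operators are assumptions for illustration, and the plan is not the one drawn on the slide.

```python
SIZE = {"F1": 28, "F2": 15, "F3": 12, "F4": 8}
OP_COST = {
    ("F1", "F2"): 60, ("F1", "F3"): 60, ("F1", "F4"): 60,
    ("F2", "F3"): 25, ("F2", "F4"): 25, ("F3", "F4"): 25,
}


def communication_cost(link_formats):
    """Sum of transmission costs of all formats sent over all links."""
    return sum(SIZE[f] for formats in link_formats.values() for f in formats)


def computation_cost(node_ops):
    """Sum of the costs of all operators performed at all brokers."""
    return sum(OP_COST[op] for ops in node_ops.values() for op in ops)


def total_cost(node_ops, link_formats):
    return communication_cost(link_formats) + computation_cost(node_ops)


# Hypothetical plan: root converts F1 -> F2 -> F4, broker "a" converts F2 -> F3.
node_ops = {
    "root": [("F1", "F2"), ("F2", "F4")],
    "a": [("F2", "F3")],
    "b": [],
    "c": [],
}
link_formats = {
    ("root", "a"): {"F2"},
    ("root", "b"): {"F4"},
    ("a", "c"): {"F3"},
}
plan_total = total_cost(node_ops, link_formats)
```

Here the links carry 15 + 8 + 12 and the operators cost 60 + 25 + 25, so the plan total is the sum of both parts.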
Problem: for a given CAG and dissemination tree, find the CCD plan with minimum total cost.
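One way to write the problem formally is sketched below. The notation is an assumption introduced here, not the symbols used on the original slides: T = (N, L) is the dissemination tree, P a CCD plan, and a plan counts as valid when every broker can obtain its requested formats.

```latex
% Assumed notation (not from the slides):
%   \mathcal{F}_P(\ell): formats transmitted over link \ell in plan P
%   \mathcal{O}_P(n):    operators performed at broker n in plan P
%   c_t(F): transmission cost of format F;  c_o(O): computation cost of operator O
\begin{align}
C_{\mathrm{comm}}(P) &= \sum_{\ell \in L} \; \sum_{F \in \mathcal{F}_P(\ell)} c_t(F) \\
C_{\mathrm{comp}}(P) &= \sum_{n \in N} \; \sum_{O \in \mathcal{O}_P(n)} c_o(O) \\
P^{*} &= \arg\min_{P \text{ valid}} \bigl( C_{\mathrm{comm}}(P) + C_{\mathrm{comp}}(P) \bigr)
\end{align}
```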
Input:
- A dissemination tree
- A CAG
- The initial format
- The formats requested by each broker
Output:
- The minimum-cost CCD plan
- Based on dynamic programming
- Annotates the dissemination tree in a bottom-up fashion
- For each broker: assume the optimal sub plans are available for each child, then find the optimal plan for the broker accordingly
[Figure: broker Ni with child brokers Nj, Nk, ….]
[Figure: dissemination tree annotated with formats F1, F2, F3 and F4, next to the example CAG (F1/28, F2/15, F3/12, F4/8; operator costs 60 and 25).]
Input:
- All possible input format sets
- Requested formats
Output:
- Optimal plan for each input format set
Example (requested format set {F1, F3}): the candidate input format sets are {F1}, {F2}, {F1,F2}, …, {F1,F2,F3,F4}.
For input format set {F1}: the incoming link carries {F1}, the broker performs {O(1,3)}, and the plan cost is 28+60 = 88.
Input:
- All possible input format sets
- Optimal sub plan of each child node for any given input format set
Output:
- Optimal plan for the given input format set
[Figure: broker Ni with children Nj and Nk, each providing 2^m sub plans (one per input format set). For input format set {F1,F2} at Ni, combinations of the children's optimal sub plans for input sets {F1}, {F2}, …. are evaluated.]
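The bottom-up computation can be sketched as a memoized recursion over (broker, input format set) pairs. This is a brute-force illustration for a tiny CAG, not the authors' implementation: formats are numbered 1 to 4, the pairing of operator costs to operators is assumed, and the conversion cost inside a broker is found by enumerating operator subsets.

```python
from functools import lru_cache
from itertools import combinations

SIZE = {1: 28, 2: 15, 3: 12, 4: 8}                   # transmission cost per format
OPS = {(1, 2): 60, (1, 3): 60, (1, 4): 60,           # assumed operator costs
       (2, 3): 25, (2, 4): 25, (3, 4): 25}
INF = float("inf")


def subsets(items):
    items = tuple(items)
    for r in range(len(items) + 1):
        for combo in combinations(items, r):
            yield frozenset(combo)


@lru_cache(maxsize=None)
def conv_cost(have, want):
    """Cheapest operator set that makes every format in `want` available from `have`."""
    best = INF
    for ops in subsets(OPS):
        avail = set(have)
        changed = True
        while changed:                               # closure of `have` under `ops`
            changed = False
            for i, j in ops:
                if i in avail and j not in avail:
                    avail.add(j)
                    changed = True
        if want <= avail:
            best = min(best, sum(OPS[o] for o in ops))
    return best


def tx(formats):
    return sum(SIZE[f] for f in formats)


def optimal_cost(tree, requested, node, inp):
    """Minimum cost of serving the subtree at `node` given input set `inp`.

    The cost of the link into `node` is not included.
    """
    @lru_cache(maxsize=None)
    def best(n, s):
        result = INF
        for p in subsets(SIZE):                      # p: formats made available at n
            if not (s <= p and requested.get(n, frozenset()) <= p):
                continue
            cost = conv_cost(s, p)
            if cost == INF:
                continue
            for c in tree.get(n, ()):                # each child takes its best subset of p
                cost += min(tx(sc) + best(c, sc) for sc in subsets(p))
            result = min(result, cost)
        return result

    return best(node, frozenset(inp))
```

For a single broker that requests {F1, F3} and receives {F1}, `tx({1}) + optimal_cost({}, {"leaf": {1, 3}}, "leaf", {1})` gives 28 + 60. For a root with two children that both request F4, converting once at the root and sending F4 twice is the cheapest option. The enumeration over format subsets is what makes the running time exponential in the number of formats.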
Algorithm complexity
- n: number of nodes in the tree
- k_avg: average number of children of a node
- m: number of formats in the CAG
- Also depends on the complexity of the minimum conversion cost computation in the CAG
- Exponential in m, the CAG size
The directed Steiner tree problem can be reduced to CCD.
Directed Steiner tree problem: given a directed weighted graph G(V,E,w), a specified root r and a subset S of its vertices, find a tree of minimal weight rooted at r that includes all vertices in S.
Multilayer graph: Cartesian product of the CAG and the dissemination tree
[Figure: example CAG with formats F1/10, F2/5, F3/8, F4/15 and operator costs 7, 3, 5, 4; brokers in the tree request the format sets {F1}, {F1,F4}, {F4} and {F1,F3}. In the resulting multilayer graph the source and terminal nodes are marked.]
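A sketch of how the multilayer graph could be built: every (broker, format) pair becomes a node, operators become edges inside a broker, and tree links become edges between copies of the same format. The format sizes come from the example figure; which operator carries which of the costs 7, 3, 5, 4 is not recoverable from the slide, so the operator set below is an assumption, as are the broker names.

```python
def build_multilayer_graph(tree, sizes, ops):
    """Cartesian product of the dissemination tree and the CAG.

    tree:  dict parent -> list of children
    sizes: dict format -> transmission cost
    ops:   dict (from_format, to_format) -> computation cost
    Returns (nodes, edges) where edges maps (u, v) -> weight.
    """
    brokers = set(tree) | {c for children in tree.values() for c in children}
    nodes = {(b, f) for b in brokers for f in sizes}
    edges = {}
    for b in brokers:                       # operator edges inside each broker
        for (i, j), cost in ops.items():
            edges[((b, i), (b, j))] = cost
    for parent, children in tree.items():   # transmission edges along tree links
        for child in children:
            for f, size in sizes.items():
                edges[((parent, f), (child, f))] = size
    return nodes, edges


sizes = {"F1": 10, "F2": 5, "F3": 8, "F4": 15}
ops = {("F1", "F2"): 7, ("F1", "F3"): 3,    # assumed assignment of the costs
       ("F1", "F4"): 5, ("F3", "F2"): 4}
tree = {"N1": ["N2", "N3"]}                 # hypothetical brokers
nodes, edges = build_multilayer_graph(tree, sizes, ops)

source = ("N1", "F1")                       # root broker holding the initial format
terminals = {("N2", "F4"), ("N3", "F1"), ("N3", "F3")}   # requested (broker, format) pairs
```

A minimum-weight tree from `source` that reaches all `terminals` in this graph corresponds to a CCD plan, and the node count is the number of brokers times the number of formats.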
An i(i-1)k^(1/i)-approximation algorithm has been proposed for the directed Steiner tree problem:
- k is the number of terminals
- i is the algorithm's approximation parameter
- Time complexity is O(v^i k^(2i)), where v is the number of nodes in the multilayer graph, v = n · m
- High time complexity for large dissemination trees
Example: number of brokers n = 1000, number of formats m = 20, so v = 20000 and k <= 20000.
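To see how large this bound becomes, the example numbers can be plugged in. The choice i = 2 and the worst case k = v are assumptions made only for this illustration.

```latex
% Worked example, assuming i = 2 and the worst case k = v = 2 \times 10^{4}
\begin{align}
v &= n \cdot m = 1000 \cdot 20 = 2 \times 10^{4} \\
v^{i} k^{2i} &= v^{2} k^{4} = (2 \times 10^{4})^{6} = 64 \times 10^{24} = 6.4 \times 10^{25}
\end{align}
```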
An iterative heuristic algorithm:
- Start with an initial plan
- Pick a node in the plan for refinement
- Refine the one-level sub plan rooted at the selected node using the multilayer graph:
  - Operators performed in the sub plan
  - Formats transmitted from the parent to each child
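The overall loop can be sketched generically: pick a node, try a refinement there, and keep it only if the plan gets cheaper. Everything below the loop is a toy stand-in and not the refinement step from the talk: the plan encoding, `toy_cost`, `toy_refine`, and the round-robin node order are hypothetical.

```python
from itertools import cycle


def refine_iteratively(plan, plan_cost, refine_at, pick_node, iterations):
    """Generic local-improvement loop: accept a refinement only if it lowers the cost."""
    best_cost = plan_cost(plan)
    for _ in range(iterations):
        node = pick_node(plan)
        candidate = refine_at(plan, node)
        candidate_cost = plan_cost(candidate)
        if candidate_cost < best_cost:
            plan, best_cost = candidate, candidate_cost
    return plan, best_cost


# Toy setting: each leaf needs F4; a plan records where its F1 -> F4 conversion runs.
SIZE = {"F1": 28, "F4": 8}
OP_COST = 60


def toy_cost(plan):
    # One shared conversion at the root if any leaf relies on it.
    root_ops = OP_COST if "root" in plan.values() else 0
    # A leaf served by the root receives F4; otherwise it receives F1 and converts locally.
    per_leaf = sum(SIZE["F4"] if where == "root" else SIZE["F1"] + OP_COST
                   for where in plan.values())
    return root_ops + per_leaf


def toy_refine(plan, node):
    return {**plan, node: "root"}


initial = {"a": "leaf", "b": "leaf", "c": "leaf"}    # an "all in leaves" initial plan
order = cycle(initial)                               # round-robin stand-in for node selection
final_plan, final_cost = refine_iteratively(
    initial, toy_cost, toy_refine, lambda plan: next(order), iterations=3)
```

In this toy run each accepted step moves one leaf's conversion to the root, and the cost drops from three separate conversions to a single shared one.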
Initial plan selection
- Any valid plan can be used as the initial plan: All in leaves, All in root, Single-format
Node selection for plan refinement
- Random
- Slack: maximum expected benefit (cost reduction) from selecting a node
Communication cost slack
- Current communication cost - lower bound for communication cost
- Requires an estimate of the lower bound for communication cost
Computation cost slack
- Current computation cost - lower bound for computation cost
- Requires an estimate of the lower bound for computation cost
Total slack for a node
- Communication slack + computation slack
[Figure: lower-bound estimation. A format set {Fi, Fj, …, Fk} is mapped to {Fi^min, Fj^min, …, Fk^min}, and Max{Fi^min, Fj^min, …, Fk^min} is taken.]
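Slack-based selection reduces to a small computation once the lower bounds are known. The sketch below takes the bounds as given inputs, since how they are estimated is not reproduced here; the node names and numbers are hypothetical.

```python
def total_slack(comm_cost, comm_lower_bound, comp_cost, comp_lower_bound):
    """Communication slack plus computation slack for one node."""
    return (comm_cost - comm_lower_bound) + (comp_cost - comp_lower_bound)


def pick_max_slack(stats):
    """Return the node with the largest total slack.

    stats: dict node -> (comm_cost, comm_lower_bound, comp_cost, comp_lower_bound)
    """
    return max(stats, key=lambda node: total_slack(*stats[node]))


# Hypothetical per-node figures: current costs and their estimated lower bounds.
stats = {
    "N1": (56, 16, 120, 60),   # slack 40 + 60 = 100
    "N2": (30, 23, 25, 25),    # slack 7 + 0 = 7
    "N3": (28, 8, 60, 25),     # slack 20 + 35 = 55
}
next_node = pick_max_slack(stats)
```

A node with large slack is one where the current plan is far from the best it could possibly be, which makes it a promising candidate for refinement.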
System setup
- 1024 brokers
- Matching ratio: percentage of brokers with a matching subscription for a published content item
- Zipf and uniform distributions
- Communication and computation costs are assigned based on profiling
Dissemination scenarios
- Annotated map
- Customized video dissemination
- Synthetic scenarios
Cost reduction in CCD and Heuristic CCD algorithms
[Figure, two panels: cost reduction percentage (%) versus matching ratio (1, 5, 10, 20, 50, 70). Series: CCD vs. All In Leaves; Heuristic CCD vs. All In Leaves; Heuristic CCD vs. All In Root.]
[Figure, two panels: cost reduction percentage (%) versus iteration number. Panel "CCD vs. heuristic CCD": series for matching ratio = 5% and matching ratio = 50%. Panel "Slack vs. Random next step selection": series Slack and Random.]
nalini@ics.uci.edu http://www.ics.uci.edu/~nalini
Cost factor for performing operators at a broker
- Each broker Ni has a cost factor
- The cost of performing operator O(i,j) at Ni is computed from this cost factor and the operator's computation cost
Every link in the tree also has a cost factor
- Each link <Ni,Nj> has a cost factor
- The cost of transmitting content in format Fi over the link is computed from this cost factor and the format's transmission cost
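One plausible form for these heterogeneous costs is a multiplicative scaling of the base costs. This is an assumption, since the original formulas are not reproduced here, and all symbols below are introduced only for illustration.

```latex
% Assumption: cost factors scale the base costs multiplicatively.
%   \alpha_{N_a}: cost factor of broker N_a
%   \beta_{\langle N_a, N_b \rangle}: cost factor of link <N_a, N_b>
%   c_o(\cdot): base computation cost;  c_t(\cdot): base transmission cost
\begin{align}
\mathrm{cost}\bigl(O_{(i,j)} \text{ at } N_a\bigr) &= \alpha_{N_a} \cdot c_o\bigl(O_{(i,j)}\bigr) \\
\mathrm{cost}\bigl(F_i \text{ over } \langle N_a, N_b \rangle\bigr) &= \beta_{\langle N_a, N_b \rangle} \cdot c_t(F_i)
\end{align}
```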
CCD plan cost reduction considering heterogeneity
[Figure: cost reduction percentage (%) versus matching ratio (1, 5, 10, 20, 50, 70).]
[Figure: cost reduction percentage (%) versus number of publications (5 to 95), for matching ratio = 10%, 20% and 70%.]