CCD: Efficient Customized Content Dissemination in Distributed Publish/Subscribe

SLIDE 1

CCD: Efficient Customized Content Dissemination in Distributed Publish/Subscribe

H. Jafarpour, B. Hore, S. Mehrotra and N. Venkatasubramanian

Information Systems Group
Dept. of Computer Science
UC Irvine

SLIDE 2

Customized content dissemination on distributed Pub/Sub (CCD)

  • Motivation
  • Problem definition and formulation
  • CCD algorithm
  • Heuristic CCD algorithm
  • Experimental evaluation

SLIDE 3

Domain: Emergency Notification Systems

  • Problem: under response and over response when one or a few generic messages are sent to the entire impacted population
  • Goal: customized notifications are sent to the population using multiple modalities

SLIDE 4

Motivation

  • Leveraging the pub/sub framework for dissemination of rich content formats, e.g., multimedia content.
  • The same content format may not be consumable by all subscribers!

SLIDE 5

Customized delivery

[Figure: subscriber callouts reading "Español" and "Español!!!"]

Customize content to the required formats before delivery!

SLIDE 6

Subscriptions in CCD

  • How to specify required formats?
  • Receiving context:
      • Receiving device capabilities: display screen, available software, …
      • Communication capabilities: available bandwidth
      • User profile: location, language, …

Subscription (the same for all three example subscribers):

  • Team: USC
  • Video: Touch Down

Contexts (one per subscriber):

  • PC, DSL, AVI
  • Phone, 3G, FLV
  • Laptop, 3G, AVI, Spanish subtitle
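
Sketch (not from the slides): one possible way to keep the subscription separate from the receiving context, in Python. The class and field names are hypothetical; the three entries mirror the example on this slide.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Subscription:
    # What the subscriber is interested in (hypothetical field names).
    team: str
    video: str


@dataclass(frozen=True)
class Context:
    # How the subscriber can receive it (hypothetical field names).
    device: str
    network: str
    video_format: str
    subtitle: Optional[str] = None


# (subscription, context) pairs from the slide's example.
subscribers = [
    (Subscription("USC", "Touch Down"), Context("PC", "DSL", "AVI")),
    (Subscription("USC", "Touch Down"), Context("Phone", "3G", "FLV")),
    (Subscription("USC", "Touch Down"), Context("Laptop", "3G", "AVI", "Spanish")),
]


def required_formats(pairs):
    """Distinct (format, subtitle) variants that must be produced."""
    return {(ctx.video_format, ctx.subtitle) for _, ctx in pairs}
```

All three subscribers match the same publication, yet `required_formats(subscribers)` yields three distinct variants, which is the situation CCD addresses.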

SLIDE 7

Content customization

  • How is content customization done?
      • Adaptation operators

[Figure: a transcoder operator converts the original content (size: 28MB) into low-resolution, small content suitable for mobile clients (size: 8MB)]

SLIDE 8

Challenges

  • How do we perform content customization in distributed pub/sub infrastructures?

SLIDE 9

Challenges

  • Option 1: Perform all the required customizations in the sender broker

[Figure: dissemination tree with link loads 28MB, 15MB, 12MB and 8MB; some links carry several formats at once, e.g., 28+12+8 = 48MB]

SLIDE 10

Challenges

  • Option 2: Perform all the required customization in the proxy brokers (leaves)

[Figure: dissemination tree in which the 28MB original is sent over the links and the same operator is repeated at several leaf brokers ("Repeated Operator")]

SLIDE 11

Challenges

  • Option 3: Perform all the required customization in the broker overlay network

[Figure: dissemination tree with link loads 28MB, 15MB, 12MB and 8MB]

SLIDE 12

Customized content dissemination on distributed Pub/Sub (CCD)

  • Motivation
  • Problem definition and formulation
  • CCD algorithm
  • Heuristic CCD algorithm
  • Experimental evaluation

SLIDE 13

DHT-based pub/sub

  • DHT-based routing schema
      • We use Tapestry [ZHS04]

[Figure: broker overlay with the rendezvous point marked]

SLIDE 14

Dissemination tree

  • For a published content we can estimate the dissemination tree in the broker overlay network
      • Using DHT-based routing properties
      • The dissemination tree is rooted at the corresponding rendezvous broker

[Figure: dissemination tree rooted at the rendezvous point]

SLIDE 15

Content Adaptation Graph (CAG)

  • All possible content formats in the system
  • All available adaptation operators in the system

Example formats:

  • Size: 28MB, frame size: 1280x720, frame rate: 30
  • Size: 15MB, frame size: 704x576, frame rate: 30
  • Size: 10MB, frame size: 352x288, frame rate: 30
  • Size: 8MB, frame size: 128x96, frame rate: 30

SLIDE 16

Content Adaptation Graph (CAG)

  • A transmission (communication) cost is associated with each format
      • Sending content in format Fi from one broker to another incurs the transmission cost associated with Fi
  • A computation cost is associated with each operator
      • Performing operator O(i,j) on content incurs the computation cost associated with O(i,j)

[Figure: example CAG. Nodes are labeled format/transmission cost: F1/28, F2/15, F3/12, F4/8. Edges are labeled with computation costs: 60, 60, 60, 25, 25, 25.]

V = {F1, F2, F3, F4}
E = {O(1,2), O(1,3), O(1,4), O(2,3), O(2,4), O(3,4)}
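
Sketch (not from the slides): the example CAG as plain Python dictionaries, with an all-pairs minimum conversion cost computed by Floyd-Warshall. The transmission costs come from the figure labels. The assignment of the six edge costs to operators is an assumption: the three operators leaving F1 are taken to cost 60 and the other three 25.

```python
import math

# Transmission cost per format, from the figure labels F1/28, F2/15, F3/12, F4/8.
TX_COST = {1: 28, 2: 15, 3: 12, 4: 8}

# Computation cost per operator O(i,j). ASSUMPTION: which edge carries which
# of the costs 60, 60, 60, 25, 25, 25 is not recoverable from the slide text.
OP_COST = {(1, 2): 60, (1, 3): 60, (1, 4): 60,
           (2, 3): 25, (2, 4): 25, (3, 4): 25}


def min_conversion_costs(formats, op_cost):
    """Cheapest operator chain from every format to every other (Floyd-Warshall)."""
    dist = {(a, b): (0 if a == b else op_cost.get((a, b), math.inf))
            for a in formats for b in formats}
    for k in formats:
        for a in formats:
            for b in formats:
                if dist[a, k] + dist[k, b] < dist[a, b]:
                    dist[a, b] = dist[a, k] + dist[k, b]
    return dist


CONVERSION = min_conversion_costs(sorted(TX_COST), OP_COST)
```

Under these assumed costs every direct operator is already the cheapest route; `math.inf` marks conversions the CAG does not allow, such as F4 back to F1.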

SLIDE 17

CCD plan

  • A CCD plan for a content is the dissemination tree in which:
      • Each node (broker) is annotated with the operator(s) that are performed on it
      • Each link is annotated with the format(s) that are transmitted over it

[Figure: example plan. Node annotations: {O(1,2),O(2,4)}, {O(2,3)}, and {} for the remaining five brokers. Link annotations: {F2}, {F2}, {F4}, {F2}, {F3}, {F4}. Example CAG: F1/28, F2/15, F3/12, F4/8 with operator costs 60, 60, 60, 25, 25, 25.]

SLIDE 18

CCD plan cost

  • Communication cost for a plan
      • Sum of all costs for the formats transmitted through all edges
  • Computation cost for a plan
      • Sum of the costs for all operators in all plan nodes
  • Total CCD plan cost
      • Communication cost plus computation cost
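
Sketch (not from the slides): computing the cost of the example plan from the "CCD plan" slide. The node and link annotations are taken from that figure. Two things are assumptions: the mapping of the costs 60/25 to individual operators, and the total being the plain sum of the two parts.

```python
# Transmission cost per format (figure labels F1/28, F2/15, F3/12, F4/8).
TX_COST = {1: 28, 2: 15, 3: 12, 4: 8}

# ASSUMPTION: operators leaving F1 cost 60, the others 25.
OP_COST = {(1, 2): 60, (1, 3): 60, (1, 4): 60,
           (2, 3): 25, (2, 4): 25, (3, 4): 25}

# Annotations read off the example plan: operators per broker, formats per link.
NODE_OPS = [{(1, 2), (2, 4)}, {(2, 3)}, set(), set(), set(), set(), set()]
LINK_FORMATS = [{2}, {2}, {4}, {2}, {3}, {4}]


def plan_cost(node_ops, link_formats):
    """Return (communication, computation, total) for an annotated plan."""
    communication = sum(TX_COST[f] for formats in link_formats for f in formats)
    computation = sum(OP_COST[o] for ops in node_ops for o in ops)
    # ASSUMPTION: total cost is the unweighted sum of both parts.
    return communication, computation, communication + computation


COSTS = plan_cost(NODE_OPS, LINK_FORMATS)
```

The communication part (15+15+8+15+12+8 = 73) depends only on the figure labels; the computation part depends on the assumed operator costs.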

SLIDE 19

Problem definition

  • For a given CAG and dissemination tree, find the CCD plan with minimum total cost.

SLIDE 20

Customized content dissemination on distributed Pub/Sub (CCD)

  • Motivation
  • Problem definition and formulation
  • CCD algorithm
  • Heuristic CCD algorithm
  • Experimental evaluation

SLIDE 21

CCD algorithm

  • Input:
      • A dissemination tree
      • A CAG
      • The initial format
      • Requested formats by each broker
  • Output:
      • The minimum cost CCD plan

SLIDE 22

CCD algorithm

  • Based on dynamic programming
  • Annotates the dissemination tree in a bottom-up fashion
  • For each broker:
      • Assume all the optimal sub plans are available for each child
      • Find the optimal plan for the broker accordingly

[Figure: brokers Ni, Nj, Nk, ….]

SLIDE 23

CCD algorithm

[Figure: dissemination tree whose brokers are labeled with formats F1, F1, F1, F2, F3, F4, F4, F4, F2, shown with the example CAG (F1/28, F2/15, F3/12, F4/8; operator costs 60, 60, 60, 25, 25, 25)]

SLIDE 24

CCD algorithm in leaf broker

  • Input:
      • All possible input format sets
      • Requested formats
  • Output:
      • Optimal plan for each input format set

Input format set    Requested format set
{F1}                {F1, F3}
{F2}                {F1, F3}
{F1,F2}             {F1, F3}
….                  ….
{F1,F2,F3,F4}       {F1, F3}

[Figure: example CAG with F1/28, F2/15, F3/12, F4/8 and operator costs 60, 60, 60, 25, 25, 25]

Example for input format set {F1}: plan {O(1,3)}, plan cost: 28+60 = 88
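
Sketch (not from the slides): a brute-force version of the leaf-broker step. For a given input format set it tries every subset of operators and keeps the cheapest one that yields all requested formats. The operator cost assignment (60 for operators leaving F1, 25 otherwise) is an assumption, and exhaustive search is used only because the example CAG is tiny.

```python
from itertools import combinations

# Transmission costs from the figure labels; operator costs are an ASSUMED mapping.
TX_COST = {1: 28, 2: 15, 3: 12, 4: 8}
OP_COST = {(1, 2): 60, (1, 3): 60, (1, 4): 60,
           (2, 3): 25, (2, 4): 25, (3, 4): 25}


def reachable(inputs, ops):
    """Formats available after applying the operators in ops to the inputs."""
    have = set(inputs)
    changed = True
    while changed:
        changed = False
        for src, dst in ops:
            if src in have and dst not in have:
                have.add(dst)
                changed = True
    return have


def leaf_plan(inputs, requested):
    """Cheapest (cost, operators) that serves `requested`, or None if infeasible.

    The cost includes receiving the input formats over the incoming link.
    """
    best = None
    op_list = sorted(OP_COST)
    for r in range(len(op_list) + 1):
        for ops in combinations(op_list, r):
            if set(requested) <= reachable(inputs, ops):
                cost = (sum(TX_COST[f] for f in inputs)
                        + sum(OP_COST[o] for o in ops))
                if best is None or cost < best[0]:
                    best = (cost, set(ops))
    return best
```

With requested formats {F1, F3}, `leaf_plan({1}, {1, 3})` reproduces the slide's example (operator O(1,3), cost 28+60), while `leaf_plan({2}, {1, 3})` is infeasible because no operator in the CAG produces F1.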

SLIDE 25

CCD algorithm for a non-leaf broker

  • Input:
      • All possible input format sets
      • Optimal sub plan for child nodes for any given input format set
  • Output:
      • Optimal plan for the given input format set
  • Enumerate all combinations of sub plans
  • Enumerate all possible output format sets

[Figure: brokers Ni, Nj, Nk with input format set {F1,F2}; each child contributes 2^m sub plans, one optimal sub plan per input set ({F1}, {F2}, ….)]
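
Sketch (not from the slides): a small bottom-up dynamic program in the spirit of the steps above. For every broker and every possible input format set it stores the cheapest sub plan cost, choosing an operator set at the broker and, for each child, the best output format set. The tree, the requests, and the operator cost mapping are hypothetical, and exhaustive enumeration is only practical for a tiny CAG.

```python
from itertools import combinations

TX = {1: 28, 2: 15, 3: 12, 4: 8}                      # from the figure labels
OP = {(1, 2): 60, (1, 3): 60, (1, 4): 60,             # ASSUMED cost mapping
      (2, 3): 25, (2, 4): 25, (3, 4): 25}


def subsets(items):
    """All non-empty subsets of items, as frozensets."""
    items = sorted(items)
    return [frozenset(c) for r in range(1, len(items) + 1)
            for c in combinations(items, r)]


def closure(inputs, ops):
    """Formats available after applying the operators in ops to inputs."""
    have = set(inputs)
    changed = True
    while changed:
        changed = False
        for src, dst in ops:
            if src in have and dst not in have:
                have.add(dst)
                changed = True
    return frozenset(have)


OP_SETS = [frozenset()] + subsets(OP)


def solve(node, children, requests, table):
    """Fill table[node][I] with the min cost of the sub plan rooted at node
    when it receives format set I (incoming transmission included)."""
    for child in children.get(node, []):
        solve(child, children, requests, table)       # bottom-up
    table[node] = {}
    for inputs in subsets(TX):
        best = None
        for ops in OP_SETS:
            have = closure(inputs, ops)
            if not requests.get(node, frozenset()) <= have:
                continue
            cost = sum(TX[f] for f in inputs) + sum(OP[o] for o in ops)
            feasible = True
            for child in children.get(node, []):
                options = [table[child][out] for out in subsets(have)
                           if table[child][out] is not None]
                if not options:
                    feasible = False
                    break
                cost += min(options)                  # best output set per child
            if feasible and (best is None or cost < best):
                best = cost
        table[node][inputs] = best


# Hypothetical tree: root R with leaves A (wants F3) and B (wants F4).
CHILDREN = {"R": ["A", "B"]}
REQUESTS = {"A": frozenset({3}), "B": frozenset({4})}
TABLE = {}
solve("R", CHILDREN, REQUESTS, TABLE)
# The root holds the initial format F1 locally, so drop its incoming link cost.
ROOT_COST = TABLE["R"][frozenset({1})] - TX[1]
```

In this toy instance the cheapest plan converts F1 to F3 and F3 to F4 at the root and ships only the small formats, which costs less than sending F1 to both leaves.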

SLIDE 26

Complexity of CCD algorithm

  • Algorithm complexity is expressed in terms of:
      • n: number of nodes in the tree
      • k_avg: average number of children for a node
      • m: number of formats in the CAG
      • the complexity of minimum conversion cost computation in the CAG
  • Exponential in m, the CAG size

SLIDE 27

Customized content dissemination on distributed Pub/Sub (CCD)

  • Motivation
  • Problem definition and formulation
  • CCD algorithm
  • Heuristic CCD algorithm
  • Experimental evaluation

SLIDE 28

CCD Problem is NP-hard

  • The directed Steiner tree problem can be reduced to CCD
      • Directed Steiner tree: given a directed weighted graph G(V,E,w), a specified root r and a subset of its vertices S, find a tree rooted at r of minimal weight which includes all vertices in S.

SLIDE 29

Multilayer graph representation

  • Cartesian product of CAG and dissemination tree

[Figure: example CAG with F1/10, F2/5, F3/8, F4/15 and operator costs 7, 3, 5, 4, next to a dissemination tree whose brokers request {F1}, {F1,F4}, {F4} and {F1,F3}]
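
Sketch (not from the slides): building the multilayer graph as a Cartesian product. Each node is a (broker, format) pair; an edge inside a layer applies an operator at that broker, and an edge between layers sends one format from a parent to its child. The format costs follow the figure labels; the operator endpoints and the three-broker tree are hypothetical, since the slide text does not preserve them.

```python
# Transmission cost per format, from the figure labels F1/10, F2/5, F3/8, F4/15.
TX_COST = {1: 10, 2: 5, 3: 8, 4: 15}

# ASSUMPTION: which operator carries which of the costs 7, 3, 5, 4 is a guess.
OP_COST = {(1, 2): 7, (1, 3): 3, (2, 3): 5, (1, 4): 4}

# Hypothetical dissemination tree: child -> parent (the root maps to None).
PARENT = {"R": None, "A": "R", "B": "R"}


def build_multilayer_graph(parent, tx_cost, op_cost):
    """Return (nodes, edges) of the product of the tree and the CAG."""
    nodes = [(b, f) for b in parent for f in tx_cost]
    edges = {}
    for b in parent:
        for (src, dst), cost in op_cost.items():
            # Operator O(src,dst) performed at broker b.
            edges[(b, src), (b, dst)] = cost
    for child, par in parent.items():
        if par is not None:
            for f, cost in tx_cost.items():
                # Format f transmitted over the link par -> child.
                edges[(par, f), (child, f)] = cost
    return nodes, edges


NODES, EDGES = build_multilayer_graph(PARENT, TX_COST, OP_COST)
```

In this representation the source is the pair (root, initial format), and each requested format at a broker becomes a terminal, so a minimum cost plan corresponds to a directed Steiner tree in the graph.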

SLIDE 30

[Figure: multilayer graph built from the example CAG (F1/10, F2/5, F3/8, F4/15; operator costs 7, 5, 3, 4), with the source and terminal nodes marked]

SLIDE 31

Approximate Steiner tree over multilayer graph

  • An approximation algorithm has been proposed
      • k is the number of terminals
      • i is the algorithm approximation parameter
  • Time complexity is O(v^i k^(2i))
      • v is the number of nodes in the multilayer graph
  • High time complexity for large dissemination trees
      • v = n · m
  • Example:
      • Number of brokers (n) = 1000, number of formats (m) = 20
      • v = 20000, k <= 20000
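
Sketch (not from the slides): the arithmetic behind the example, reading the time bound as O(v^i k^(2i)) (this reading of the garbled exponents is an assumption). The function name is hypothetical and counts only the dominant term, ignoring constants.

```python
def steiner_bound(v, k, i):
    """Dominant term v^i * k^(2i) of the assumed time bound."""
    return v ** i * k ** (2 * i)


n, m = 1000, 20          # brokers and formats, from the slide's example
v = n * m                # nodes in the multilayer graph
k = v                    # worst case: every node is a terminal (k <= v)

BOUND_I1 = steiner_bound(v, k, 1)   # 20000^3
BOUND_I2 = steiner_bound(v, k, 2)   # 20000^6
```

Even for i = 1 the term is 8 × 10^12, and each increment of i multiplies it by another factor of v · k², which is why the slides move on to an iterative heuristic.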

SLIDE 32

Heuristic CCD algorithm

  • An iterative heuristic algorithm
      • Start with an initial plan
      • Pick a node in the plan for refinement
      • Refine the one-level sub plan rooted at the selected node using the multilayer graph

[Figure: one-level sub plan, showing the operators performed in the sub plan and the formats transmitted from the parent to each child]

SLIDE 33

Heuristic CCD algorithm

  • Initial plan selection
      • Any valid plan can be used as initial plan
          • All in leaves
          • All in root
          • Single-format
  • Node selection for plan refinement
      • Random
      • Slack
          • Maximum expected benefit (cost reduction) from selecting a node

SLIDE 34

Slack computation for a node

  • Communication cost slack
      • Current communication cost minus lower bound for communication cost
      • Estimation of lower bound for communication cost
  • Computation cost slack
      • Current computation cost minus lower bound for computation cost
      • Estimation of lower bound for computation cost
  • Total slack for a node
      • Communication slack + Computation slack

[Figure: a format set { Fi, Fj, …, Fk } mapped to { Fi min, Fj min, …, Fk min }, followed by Max { Fi min, Fj min, …, Fk min }]
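
Sketch (not from the slides): slack-based node selection, assuming the current costs and their lower bounds are already known for each node. How the lower bounds are estimated is not recoverable from the slide text, so the numbers below are made up and the function names are hypothetical.

```python
def total_slack(current, lower_bound):
    """Communication slack + computation slack for one node.

    Both arguments are (communication cost, computation cost) pairs.
    """
    comm_slack = current[0] - lower_bound[0]
    comp_slack = current[1] - lower_bound[1]
    return comm_slack + comp_slack


def select_node(current_costs, lower_bounds):
    """Pick the node with the largest total slack for the next refinement."""
    return max(current_costs,
               key=lambda node: total_slack(current_costs[node], lower_bounds[node]))


# Made-up values for three candidate nodes.
CURRENT = {"N1": (73, 110), "N2": (40, 25), "N3": (28, 60)}
LOWER = {"N1": (60, 85), "N2": (35, 25), "N3": (12, 0)}
CHOSEN = select_node(CURRENT, LOWER)
```

The node with the largest gap between its current cost and its lower bound is refined first, on the expectation that it offers the largest cost reduction.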

SLIDE 35

Customized content dissemination on distributed Pub/Sub (CCD)

  • Motivation
  • Problem definition and formulation
  • CCD algorithm
  • Heuristic CCD algorithm
  • Experimental evaluation

SLIDE 36

Experimental evaluation

  • System setup
      • 1024 brokers
      • Matching ratio: percentage of brokers with matching subscription for a published content
          • Zipf and uniform distributions
      • Communication and computation costs are assigned based on profiling

SLIDE 37

Experimental evaluation

  • Dissemination scenarios
      • Annotated map
      • Customized video dissemination
      • Synthetic scenarios

SLIDE 38

Cost reduction in CCD and Heuristic CCD algorithms

[Figure, two panels: cost reduction percentage (%) versus matching ratio (1, 5, 10, 20, 50, 70). First panel series: CCD vs. All In Leaves. Second panel series: Heuristic CCD vs. All In Leaves, Heuristic CCD vs. All In Root.]

SLIDE 39

[Figure, two panels: cost reduction percentage (%) versus iteration number. Panel "CCD vs. heuristic CCD": series for matching ratio = 5% and matching ratio = 50%. Panel "Slack vs. Random next step selection": series Slack and Random.]

SLIDE 40

Questions?

Nalini Venkatasubramanian

nalini@ics.uci.edu
http://www.ics.uci.edu/~nalini

SLIDE 41

Heterogeneity

  • Each broker Ni has a cost factor for performing operators
      • The cost of performing operator O(i,j) at Ni is computed using the cost factor of Ni
  • Every link <Ni,Nj> in the tree also has a cost factor
      • The cost of transmitting content in format Fi over the link is computed using the cost factor of the link

SLIDE 42

CCD plan cost reduction considering heterogeneity

[Figure: cost reduction percentage (%) versus matching ratio (1, 5, 10, 20, 50, 70)]

SLIDE 43

Concurrent publications

[Figure: cost reduction percentage (%) versus number of publications, with series for matching ratio = 10%, 20% and 70%]
