Efficient Content Location Using Interest-Based Locality in Peer-to-Peer Systems

SLIDE 1

paper

Efficient Content Location Using Interest-Based Locality in Peer-to-Peer Systems

authors: K. Sripanidkulchai et al. (CMU)

MPhil in ACS reviewer/presenter: S. Trajanovski (st508)

Data Centric Networking (R202)

SLIDE 2

Data Centric Networking (R202) presenter: Stojan Trajanovski (st508)

  • Challenges
  • file duplication
  • search algorithm
  • Different approaches
  • Centralized system (Napster)
  • Flooding (Gnutella)
  • Both have weaknesses

Motivation: File seeking in P2P systems

SLIDE 3

  • Central Server
  • one central node
  • not P2P in the strict sense
  • Performance
  • memory O(N)
  • searching O(1)
  • Resilience/Robustness
  • single point of failure: attacking the central node/server is enough

Motivation: Centralized system (Napster)
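
The O(N) memory and O(1) search figures above can be illustrated with a minimal sketch of a Napster-style directory. This is not Napster's actual protocol; the class name `CentralIndex` and its methods are hypothetical, chosen only to show that one server holds one index entry per shared file and answers a query with a single hash-table lookup.

```python
from collections import defaultdict


class CentralIndex:
    """Hypothetical central directory: maps file names to the peers sharing them."""

    def __init__(self):
        self._index = defaultdict(set)  # file name -> set of peer ids

    def publish(self, peer, files):
        # Each peer registers its files with the single central server.
        for name in files:
            self._index[name].add(peer)

    def lookup(self, name):
        # One hash-table probe, independent of how many peers exist.
        return sorted(self._index.get(name, ()))

    def entries(self):
        # Stored (file, peer) pairs: grows with the number of peers and files.
        return sum(len(peers) for peers in self._index.values())
```

Because every query goes through this one object, taking it offline disables search for all peers, which is the robustness weakness named on the slide.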

SLIDE 4

  • Sending the query to the neighbours, who forward it to theirs, and so on ...
  • first discovery
  • Performance
  • no indexing
  • searching O(N)
  • Features
  • robust
  • not scalable

Motivation: Massive flooding (Gnutella)

SLIDE 5

  • Starting point choice
  • Gnutella
  • Idea
  • robust & simple
  • improving scalability
  • global solution
  • main concept: interest-based locality
  • different from popular/famous

Motivation/Proposal: How could this be improved?

SLIDE 6

  • Building interest-based communities
  • usually exchange content
  • Examples
  • networking (Van Jacobson, Crowcroft …)
  • mathematics (Tao, Perelman …)
  • politics (Obama, Merkel …)
  • Counter examples
  • Golf or cricket players for ME

Proposal: Interest-based locality

SLIDE 7

  • Architecture
  • overlay on Gnutella network
  • communities
  • Entities
  • shortcuts (additional links)
  • Scenario
  • 1st: try to find the content within the interest group (via shortcuts)
  • 2nd: if that fails, fall back to Gnutella flooding

Proposal: The solution
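
The two-step scenario above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: `locate`, `naive_flood`, and the `peer_files` dictionary are hypothetical names, and `naive_flood` merely stands in for the underlying Gnutella search.

```python
def locate(content, shortcuts, peer_files, flood):
    """Return (provider, via_shortcut); provider is None if nobody has the content."""
    # 1st: ask the peers on the shortcut list, in list order.
    for peer in shortcuts:
        if content in peer_files.get(peer, ()):
            return peer, True
    # 2nd: fall back to the underlying Gnutella search.
    return flood(content), False


def naive_flood(peer_files):
    """Stand-in for Gnutella flooding: scans every peer (hypothetical helper)."""
    def flood(content):
        for peer in sorted(peer_files):
            if content in peer_files[peer]:
                return peer
        return None
    return flood
```

Keeping the fallback as a plain function argument mirrors the overlay idea: the shortcut layer sits on top of whatever search mechanism already exists and does not replace it.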

SLIDE 8

  • Shortcuts
  • keeping a limited list (up to 10 shortcuts)
  • priority links
  • Shortcut list ranking scheme
  • probability of providing content
  • path latency
  • available bandwidth
  • combination

Proposal: The solution
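
One way to picture a bounded, ranked shortcut list is the sketch below. It assumes ranking by the first metric on the slide (how often a shortcut has provided content), estimated here as hits divided by attempts; the function name `rank_shortcuts`, the `stats` layout, and the tie-breaking by peer id are illustrative assumptions. Latency or bandwidth could be substituted in `score` in the same way.

```python
MAX_SHORTCUTS = 10  # list size limit taken from the slide


def rank_shortcuts(stats, limit=MAX_SHORTCUTS):
    """stats maps peer -> (hits, attempts). Return up to `limit` peers, best first."""
    def score(peer):
        hits, attempts = stats[peer]
        # Estimated probability that this shortcut can provide requested content.
        return hits / attempts if attempts else 0.0

    # Highest score first; ties broken by peer id so the order is deterministic.
    ordered = sorted(stats, key=lambda peer: (-score(peer), peer))
    return ordered[:limit]
```

For example, with `stats = {"a": (1, 4), "b": (3, 4), "c": (0, 0)}` the peer `b` is queried first, because it answered three of its four past queries.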

SLIDE 9

  • Node (peer) addition
  • initial flooding (Gnutella-like)
  • forming the list (one shortcut at a time)
  • Later scenario
  • refining the list dynamically
  • some peers are added, others removed
  • Generic, widely applicable solution
  • works with other mechanisms (e.g. Kazaa)

Proposal: The solution
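
The list maintenance described above might look like the following sketch. Several details are assumptions made for illustration and are not stated on the slide: that one provider is picked from the peers that answered a flooded query (the `choose` parameter, random by default), that the list is kept best-first, and that the lowest-ranked entry is the one evicted when the list is full. The name `add_shortcut` is hypothetical.

```python
import random


def add_shortcut(shortcuts, providers, limit=10, choose=random.choice):
    """Return a new shortcut list with at most one provider added.

    shortcuts: current list, assumed ordered best-first.
    providers: peers that answered the most recent flooded query.
    """
    candidates = [p for p in providers if p not in shortcuts]
    if not candidates:
        return list(shortcuts)  # nothing new was learned from this query
    new_peer = choose(candidates)  # add only one shortcut at a time
    # Assumption: when the list is full, the lowest-ranked entries make room.
    kept = list(shortcuts)[: limit - 1]
    return kept + [new_peer]
```

A newly joined peer starts with an empty list, so its first queries are all resolved by flooding; each success then contributes one shortcut until the limit is reached.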

SLIDE 10

[Figure: (a) without shortcut, (b) with shortcut]

Proposal: Usual scenario

SLIDE 11

  • What is used?
  • different data traces
  • data from different sources
  • How?
  • methodology
  • Why?
  • Better understanding of the model
  • proof for improvement

Performance evaluation: Participants

SLIDE 12

  • Gnutella content location
  • TTL mechanism
  • avoid query duplication
  • Performance metrics
  • success rate
  • load characteristics
  • query scope
  • minimum reply path lengths
  • additional state

Performance evaluation
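
To make the TTL mechanism, duplicate avoidance, and two of the metrics above concrete, here is a minimal breadth-first sketch of a flooded query. It is a simplified model and not the Gnutella wire protocol: the function name `flood`, the adjacency-list `graph`, and the `files` dictionary are assumptions for illustration. It reports the minimum reply path length in hops and the query scope, taken here as the number of peers that received the query.

```python
from collections import deque


def flood(graph, files, source, content, ttl=7):
    """Return (min_hops, scope); min_hops is None when no peer within the TTL has the content."""
    seen = {source}                  # peers that already handled this query
    frontier = deque([(source, 0)])  # (peer, hops travelled so far)
    min_hops = None
    while frontier:
        peer, hops = frontier.popleft()
        # Breadth-first order guarantees the first hit is the closest one.
        if hops > 0 and min_hops is None and content in files.get(peer, ()):
            min_hops = hops
        if hops == ttl:
            continue                 # TTL exhausted: do not forward further
        for neighbour in graph.get(peer, ()):
            if neighbour not in seen:  # avoid query duplication
                seen.add(neighbour)
                frontier.append((neighbour, hops + 1))
    return min_hops, len(seen) - 1   # scope excludes the querying peer
```

The success rate over a workload would then be the fraction of queries for which `min_hops` is not `None`.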

SLIDE 13

  • Query workloads
  • different data traces
  • data from different sources
  • Boeing
  • Microsoft
  • CMU web
  • CMU Gnutella
  • CMU Kazaa

Performance evaluation: Methodology

SLIDE 14

  • Gnutella connectivity graph
  • using Gnutella topology
  • fitting to particular query workload
  • one with similar number of nodes
  • deleting nodes
  • degree distribution
  • max TTL = 7

Performance evaluation: Methodology

SLIDE 15

  • Web traces
  • all clients participate
  • after downloading the file, peer has it
  • no dynamic content
  • CMU Kazaa and Gnutella traces
  • clients and peers
  • after downloading the file, peer has it
  • no dynamic content

Performance evaluation: Storage and Replication models

SLIDE 16

[Figure: (a) success rate, (b) shortcuts target?]

Experimental results: Gnutella with shortcuts vs. pure Gnutella

SLIDE 17

[Figure: (a) load/packet, (b) shortest path/hops]

Experimental results: Gnutella with shortcuts vs. pure Gnutella

SLIDE 18

  • change everything (more shortcuts added at a time & unlimited list)
  • good performance (CMU Kazaa, Microsoft)
  • implementation difficulties
  • or change only one property?
  • search in shortcuts’ shortcuts
  • slightly improved performance (success rate/load)
  • increased shortest path length

Performance evaluation: Possible improvements/changes?

SLIDE 19

  • properties/structure
  • small-world behavior
  • web pages vs. web objects (files)
  • somewhat better than pure Gnutella
  • objects from different publishers?
  • capture interests across multiple publishers

Additional evaluation: Understanding interest-based locality

SLIDE 20

  • query caching
  • Ring searches
  • minimize random walks
  • effective for finding popular content
  • Kazaa
  • super-nodes
  • possible improvements to Kazaa (routing, load)
  • YouServ, BitTorrent, Squirrel

Related work: different from Gnutella

SLIDE 21

  • Pros
  • evaluated improvements for
  • web content (songs, movies, ...)
  • P2P systems
  • simple method (heuristic)
  • increased scalability
  • Cons
  • possible congestion at shortcuts
  • non-semantic matching (similar files)

Conclusion/Summary

SLIDE 22

  • Questions?
  • Discussion