You know the list - - PowerPoint PPT Presentation



SLIDE 1
  • Neil Daswani, Hector Garcia-Molina, Beverly Yang

  • P2P has lots of advantages

You know the list

But there are challenges to widespread (lasting) acceptance

Security, efficiency, QoS, transactions, etc.

Old distributed systems techniques don’t apply to the scale and nature of P2P systems

This paper looks at search and security

  • Not an exhaustive survey

Other applications besides data sharing

Other issues besides search and security

Other issues within search and security

Based on work within the Stanford Peers Group

  • Assume “pure” P2P

Their definition of “hybrid” is the Napster example

Challenges

Scale

Unreliability

SLIDE 2
  • Topology

How peers connect to each other

Autonomy vs. efficiency

Data placement

Both data and metadata

Message routing

How queries are propagated

Can utilize both topology and data placement

  • Expressiveness

How powerful is the query language?

Comprehensiveness

All results vs top K vs single

Autonomy

Peers may want to only connect to trusted peers

  • Efficiency

Bandwidth + processing + storage + …

Quality of Service (QoS)

User perceived qualities

Robustness

The above should stay good during churn (peers joining and leaving)

  • Key Lookup

DHTs

Keyword

Can DHTs handle this?

Ranked Keyword

Want to do ranking in the network if top K is less than total results

Aggregates

Want to do this in the network as well

SQL

PIER and PeerDB
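
A minimal sketch of the ranked-keyword point above, assuming each peer scores its own documents and only the best K travel upstream. The names `local_top_k` and `merge_top_k`, the peers, and the scores are all hypothetical and only serve the illustration.

```python
import heapq

def local_top_k(scored_docs, k):
    """Each peer keeps only its K best (doc_id, score) pairs."""
    return heapq.nlargest(k, scored_docs, key=lambda d: d[1])

def merge_top_k(partial_lists, k):
    """An intermediate peer merges partial lists and forwards only K results."""
    merged = [d for part in partial_lists for d in part]
    return heapq.nlargest(k, merged, key=lambda d: d[1])

# Hypothetical peers and relevance scores.
peers = {
    "p1": [("a", 0.9), ("b", 0.2), ("c", 0.5)],
    "p2": [("d", 0.8), ("e", 0.1)],
    "p3": [("f", 0.95), ("g", 0.3), ("h", 0.7)],
}
K = 3
partials = [local_top_k(docs, K) for docs in peers.values()]
in_network = merge_top_k(partials, K)

# Reference answer: ship every result to one place, then rank.
everything = [d for docs in peers.values() for d in docs]
centralized = heapq.nlargest(K, everything, key=lambda d: d[1])
```

Because a document outside a peer's local top K cannot be in the global top K, pruning at every hop gives the same answer as ranking everything centrally while moving at most K results per link.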

SLIDE 3
  • Decoupling autonomy and efficiency is a large challenge

With less autonomy, can bound the lookup cost (Chord)

By designating some nodes more equal than others, there are some nodes guaranteed to have the answer (super-peers)

Replication increases the chance of finding the answer on a random node

SkipNet makes progress by allowing the user to tune the autonomy vs. efficiency tradeoff
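
To make the Chord point concrete, here is a simplified sketch of finger-table lookup on an identifier ring. It assumes a static ring and builds every finger table from a global view, which a real deployment would not have; the `M` value, node IDs, and all names are illustrative assumptions, not the actual Chord protocol code.

```python
from bisect import bisect_left

M = 6               # identifier bits (assumed); the ring has 2**M positions
RING = 2 ** M

def in_arc(x, a, b, closed_right=False):
    """True if x lies on the clockwise arc (a, b), or (a, b] if closed_right."""
    if a < b:
        return a < x < b or (closed_right and x == b)
    # The arc wraps past zero; a == b means the whole ring.
    return x > a or x < b or (closed_right and x == b)

class Ring:
    def __init__(self, node_ids):
        self.nodes = sorted(set(n % RING for n in node_ids))
        # finger[i] of node n is the first node at or after n + 2**i.
        self.fingers = {
            n: [self.successor((n + 2 ** i) % RING) for i in range(M)]
            for n in self.nodes
        }

    def successor(self, ident):
        """First node at or after ident, wrapping around the ring."""
        i = bisect_left(self.nodes, ident)
        return self.nodes[i % len(self.nodes)]

    def lookup(self, start, key):
        """Route from start to the node responsible for key; return (owner, hops)."""
        n, hops = start, 0
        while not in_arc(key, n, self.fingers[n][0], closed_right=True):
            # Jump to the farthest finger that still precedes the key.
            for f in reversed(self.fingers[n]):
                if in_arc(f, n, key):
                    n = f
                    break
            hops += 1
        return self.fingers[n][0], hops

ring = Ring([1, 8, 14, 21, 32, 38, 42, 48, 51, 56])
owner, hops = ring.lookup(start=8, key=54)
```

The cost bound comes from giving up autonomy: a node's neighbors are dictated by the ring, and in this static setting each hop at least halves the remaining distance, so a lookup needs at most `M` hops.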

  • By imposing rigid requirements on the system, it becomes hard to maintain

  • Different metrics:

Number of results

Response time

Relevance (precision and recall)

Application specific

Example: Gnutella

Tradeoff between # results and cost

Directed BFS and concept clustering address this

What is the best technique to optimize this tradeoff?
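
The results-versus-cost tradeoff can be illustrated with a toy TTL-limited flood. This is plain flooding, not Directed BFS or concept clustering, and it simplifies Gnutella (for example, a peer here also sends the query back toward the sender). The overlay, the set of peers holding a match, and the function name are made up for the example.

```python
from collections import deque

def flood_search(graph, holders, source, ttl):
    """Flood a query up to ttl hops; return (results_found, messages_sent)."""
    seen = {source}
    frontier = deque([(source, ttl)])
    results = 0
    messages = 0
    while frontier:
        node, remaining = frontier.popleft()
        if remaining == 0:
            continue                      # TTL expired, stop forwarding
        for neighbor in graph[node]:
            messages += 1                 # every forwarded copy costs bandwidth
            if neighbor in seen:
                continue                  # duplicate query is dropped
            seen.add(neighbor)
            if neighbor in holders:
                results += 1
            frontier.append((neighbor, remaining - 1))
    return results, messages

# Hypothetical five-peer overlay; peers D and E hold a matching file.
graph = {
    "A": ["B", "C"],
    "B": ["A", "D"],
    "C": ["A", "D"],
    "D": ["B", "C", "E"],
    "E": ["D"],
}
holders = {"D", "E"}
costs = {ttl: flood_search(graph, holders, "A", ttl) for ttl in (1, 2, 3)}
```

In this toy overlay, raising the TTL finds more results, but the message count grows faster than the result count.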

  • Challenging because of the nature of P2P systems

Open

Autonomous

Have to assume a hostile environment

Address:

Availability

File authenticity

Anonymity

Access control

Want to prevent, detect, manage, and recover from attacks

SLIDE 4
  • Each node should be able to accept messages as well as offer services to the network

DoS Attack

Chosen-victim attack in Gnutella

A node directs all search queries it gets to a victim node

Adversaries take advantage of loose protocols

Need to prevent amplification and back-door access

  • Malicious nodes create Byzantine failures

Current approaches are unpopular because of complexity and overhead

They also assume complete and secure communication between nodes

How to deal with general node failures?

Being addressed by DHTs

Other issues:

Malicious query/storage flooding

File availability

  • No mention of OceanStore, etc.
  • What is the definition of authenticity?

Different than integrity

Solved with checksums/signatures
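
As a small illustration of the integrity half of this distinction, the sketch below checks a downloaded file against a digest published elsewhere. SHA-256 is an assumed choice and the helper names are hypothetical. Note that a matching digest only shows the bytes were not altered; it does not say whether the content is the authentic document.

```python
import hashlib

def sha256_hex(data: bytes) -> str:
    """Hex digest that the publisher would distribute alongside the file."""
    return hashlib.sha256(data).hexdigest()

def verify_integrity(data: bytes, expected_hex: str) -> bool:
    """True if the received bytes match the published digest."""
    return sha256_hex(data) == expected_hex

original = b"contents of the shared file"
published_digest = sha256_hex(original)

ok = verify_integrity(original, published_digest)
tampered = verify_integrity(original + b"!", published_digest)
```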

Oldest Document: the first submitted

Expert-based: a single expert deems a document authentic

Voting-based: majority of expert opinions determine authenticity

Reputation-based: weigh votes of some experts more
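
The voting-based and reputation-based definitions differ only in how votes are weighted, which a short sketch can show. The expert names, the reputation values, and the "more than half" threshold are assumptions made for the example.

```python
def majority_vote(votes):
    """Voting-based: authentic if more than half of the experts say so."""
    yes = sum(1 for verdict in votes.values() if verdict)
    return yes * 2 > len(votes)

def reputation_vote(votes, reputation):
    """Reputation-based: same rule, but each vote counts by its expert's weight."""
    yes = sum(reputation.get(e, 0.0) for e, verdict in votes.items() if verdict)
    total = sum(reputation.get(e, 0.0) for e in votes)
    return total > 0 and yes * 2 > total

# Hypothetical experts: one highly trusted expert disagrees with two others.
votes = {"alice": True, "bob": False, "carol": False}
reputation = {"alice": 5.0, "bob": 1.0, "carol": 1.0}

by_majority = majority_vote(votes)
by_reputation = reputation_vote(votes, reputation)
```

With these numbers the two definitions disagree: the majority rejects the document, while the reputation-weighted vote accepts it.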

  • Good for:

“Borrowing” music

Censorship resistance

Freedom of speech

Privacy protection

SLIDE 5

For anonymity, should not be able to determine which node an object is stored at

Vs.

For efficiency, should be able to determine exactly which node is responsible for an object

Onion routing/Crowds address anonymity through forwarding

Still have problems if nodes collude
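
A rough sketch of Crowds-style forwarding, to show how random relaying hides the initiator. The member names, the forwarding probability, and the function name are assumptions; the sketch leaves out encryption and the return path entirely.

```python
import random

def crowds_path(members, initiator, p_forward=0.75, rng=None):
    """Return the list of peers a request visits before it reaches the server."""
    if rng is None:
        rng = random.Random()
    # The initiator always hands the request to a random member (possibly itself).
    path = [initiator, rng.choice(members)]
    # Each relay flips a biased coin: forward again, or submit to the server.
    while rng.random() < p_forward:
        path.append(rng.choice(members))
    return path

members = ["n0", "n1", "n2", "n3", "n4"]
path = crowds_path(members, "n0", rng=random.Random(7))
shortest = crowds_path(members, "n0", p_forward=0.0, rng=random.Random(7))
```

A relay that receives the request cannot tell whether its predecessor originated it or merely forwarded it. Colluding relays can compare what they observed, which is the weakness noted above.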

  • Utility is limited if there are restrictions on data sharing, but some level is needed for legality

Endpoint vs. P2P network enforcement

  • What are the most pressing issues for P2P to become widely acceptable?

P2P vs centralized?

Structured vs unstructured?

Hybrid vs pure P2P?

Where will P2P make an impact?

…