SLIDE 1
SpamResist: Making Peer-to-Peer Tagging Systems Robust to Spam
Ennan Zhai, Ruichuan Chen, Eng Keong Lua*, Long Zhang,
SLIDE 2
SLIDE 3
Roadmap
I. What is tag spam in P2P tagging systems?
II. What are the existing solutions to this problem?
III. Our approach
SLIDE 4
Tagging Systems
Systems based on tagging services are already part of everyday life …
SLIDE 6
P2P Tagging Systems
To address challenges such as DoS attacks and single points of failure, tagging services have been introduced into P2P content systems …
For example, Tagster is an open-source DHT-based P2P tagging system (http://isweb.uni-koblenz.de/research/tagster).
SLIDE 8
Tag Spam
Search results for the tag “iphone” in MyWeb.
Clicking this link leads to the following page …
SLIDE 10
Tag Spam
This figure is unrelated to iPhones.
We can also observe that this site has been assigned many other popular but irrelevant tags.
SLIDE 11
Tag Spam
That is the problem of tag spam!
Definition of tag spam: erroneous or misleading tags generated by malicious users to confuse normal users of the system.
SLIDE 14
Related Work
- Detection-based mechanisms
- Demotion-based mechanisms
- Interface-based mechanisms
SLIDE 20
SpamResist
SpamResist is a demotion-based mechanism with two key parts:
- Reliability mechanism
- Social network-based enhancement
SLIDE 22
What is the reliability mechanism?
For each tag search (e.g., “Sea”), the client calculates a reliability degree for each respondent and uses weighted averaging to compute the ranking of the search results.
- Respondent: a peer that has annotated some of its local files with “Sea” and responds to the client with these files.
- Weight: the reliability degree of the owner of each returned resource.
How does the client compute the reliability degree for each peer?
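The slides do not spell out the weighted average, so the following is only a minimal sketch under a stated assumption: each respondent casts a vote of 1 for every file it returns and 0 for every other file, and a file's ranking score is the average of those votes weighted by the respondents' reliability degrees. The function name `rank_results` and the data layout are hypothetical.

```python
def rank_results(responses, reliability):
    """Rank the files returned for one tag search.

    responses:   dict mapping respondent id -> set of files it returned
    reliability: dict mapping respondent id -> reliability degree (the weight)
    Assumption (not from the slides): a respondent votes 1 for each file it
    returns and 0 otherwise, and votes are averaged with reliability weights.
    """
    total_weight = sum(reliability[peer] for peer in responses)
    scores = {}
    for peer, files in responses.items():
        for f in files:
            scores[f] = scores.get(f, 0.0) + reliability[peer]
    ranked = [
        (f, s / total_weight if total_weight else 0.0)
        for f, s in scores.items()
    ]
    # Highest score first; ties broken by file name for a stable order.
    ranked.sort(key=lambda item: (-item[1], item[0]))
    return ranked


# Hypothetical respondents B, C, D and their reliability degrees.
responses = {"B": {"sea.jpg", "beach.jpg"}, "C": {"sea.jpg"}, "D": {"ad.jpg"}}
reliability = {"B": 0.9, "C": 0.6, "D": 0.1}
ranking = rank_results(responses, reliability)
```

Under this reading, a file returned only by a low-reliability respondent (here `ad.jpg`) sinks to the bottom of the ranking rather than being removed, which matches the demotion-based framing.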
SLIDE 26
How to compute reliability?
The reliability degree is a personalized score that the client assigns to each peer. SpamResist proposes two schemes for the client to calculate it, one for each of two categories of peers:
- Unfamiliar peers: peers that the client has never interacted with.
- Interacted peers.
An interaction between A and B normally means that peer A downloads files from peer B.
SLIDE 29
Unfamiliar Peer’s Reliability
SLIDE 30
How to compute reliability?
- Interacted peers: peers that the client has interacted with.
SLIDE 32
Interacted Peer’s Reliability
The client stores its previous experiences with each interacted peer in its own experience vector $EV_{A,B}$:
$$EV_{A,B} = (v_{A,B,1}, v_{A,B,2}, \ldots, v_{A,B,n})$$
Specifically, for each peer B that client A has interacted with, A maintains a vector of length n storing its n most recent experiences with B; as new experiences are appended, the oldest ones are removed.
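The fixed-length, oldest-out behaviour of the experience vector can be sketched with a bounded queue. This is an illustrative sketch only: the class name `ExperienceStore` is hypothetical, and the slides do not say what an experience value looks like, so numeric values (for example 1.0 for a satisfying download and 0.0 otherwise) are an assumption.

```python
from collections import deque


class ExperienceStore:
    """Client-side store of the n most recent experiences per interacted peer."""

    def __init__(self, n):
        self.n = n
        self.vectors = {}  # peer id -> deque holding at most n experience values

    def record(self, peer, value):
        # deque(maxlen=n) discards the oldest entry automatically when full.
        self.vectors.setdefault(peer, deque(maxlen=self.n)).append(value)

    def vector(self, peer):
        # An empty list means the peer is still unfamiliar to this client.
        return list(self.vectors.get(peer, ()))


store = ExperienceStore(n=3)
for value in [1.0, 0.0, 1.0, 1.0]:  # assumed encoding of four experiences
    store.record("B", value)
recent = store.vector("B")  # the first value has been dropped
```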
SLIDE 33
Interacted Peer’s Reliability
SLIDE 34
Interacted Peer’s Reliability
The client stores its previous experiences with each interacted peer in its own experience vector $EV_{A,B} = (v_{A,B,1}, v_{A,B,2}, \ldots, v_{A,B,n})$.
The reliability degree from A to B (an interacted peer of A) is:
$$\frac{v_{A,B,1} + v_{A,B,2} + \cdots + v_{A,B,n}}{n}$$
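As a small worked instance of this average, the sketch below computes the reliability degree from a list of experience values. The function name `reliability_degree` and the 0/1 encoding of experiences are assumptions; dividing by the current length of the list, rather than by a fixed n, is also an assumption for vectors that are not yet full.

```python
def reliability_degree(experience_vector):
    """Arithmetic mean of the stored experience values v_{A,B,1..n}."""
    n = len(experience_vector)
    if n == 0:
        # No stored experience: B is an unfamiliar peer, handled by the
        # other scheme, which this sketch does not cover.
        raise ValueError("no experiences recorded for this peer")
    return sum(experience_vector) / n


ev_a_b = [1.0, 0.0, 1.0, 1.0]  # assumed encoding: 1.0 good, 0.0 bad
degree = reliability_degree(ev_a_b)  # 3.0 / 4 = 0.75
```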
SLIDE 37
Social Network-based Enhancement
- Re-compute the ranking score (RS) of each result file whose RS is lower than 0.5.
- If more than half of the client's friends give the file an RS higher than 0.5, re-locate its position in the results.
- The new RS is the average of the friends' scores that are higher than 0.5.
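The three rules can be sketched as a single function. This is a minimal sketch, not the authors' implementation: the name `enhance_ranking_score` is hypothetical, and how friends are found and queried is left out. The example values are the ones used in the sea.jpg slides.

```python
def enhance_ranking_score(own_rs, friend_rs, threshold=0.5):
    """Return the (possibly re-computed) ranking score of one result file.

    own_rs:    the client's own ranking score for the file
    friend_rs: list of ranking scores that the client's friends give the file
    """
    if own_rs >= threshold or not friend_rs:
        return own_rs  # only scores lower than the threshold are re-computed
    high = [rs for rs in friend_rs if rs > threshold]
    if len(high) * 2 > len(friend_rs):  # more than half of the friends
        return sum(high) / len(high)  # average of the scores above threshold
    return own_rs


promoted = enhance_ranking_score(0.4, [0.7, 0.8, 0.4])   # (0.7 + 0.8) / 2
unchanged = enhance_ranking_score(0.4, [0.5, 0.3, 0.2])  # no friend above 0.5
```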
SLIDE 38
Social Network-based Enhancement
Example: sea.jpg appears in Alice’s search results with a ranking score of 0.4.
- Alice’s friends give sea.jpg ranking scores of 0.7, 0.8, and 0.4.
- More than half of them (two of three) score it higher than 0.5, so Alice’s ranking score is re-computed as (0.7 + 0.8) / 2 = 0.75.
SLIDE 44
Social Network-based Enhancement
Counter-example: Alice’s friends give sea.jpg ranking scores of 0.5, 0.3, and 0.2. None is higher than 0.5, so Alice’s ranking score stays at 0.4.
SLIDE 45
Social Network-based Enhancement
- For details of the social network-based enhancement mechanism of SpamResist, please see our paper.
- For the practical issue of unreliable friends, please see our paper.
SLIDE 47
Evaluation
Search models:
- Boolean model
- Occurrence model
- Coincidence model
- PINTS
SLIDE 49
Boolean Model
In the Boolean model, the system ranks the results associated with the search tag randomly.
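A minimal sketch of this baseline, assuming annotations are available as (user, resource, tag) triples; that layout and the name `boolean_search` are illustrative choices, not taken from the slides.

```python
import random


def boolean_search(annotations, tag, rng=None):
    """Return every resource carrying the tag, in random order.

    annotations: iterable of (user, resource, tag) triples (assumed layout)
    """
    rng = rng or random.Random()
    results = sorted({res for _, res, t in annotations if t == tag})
    rng.shuffle(results)  # no ranking signal is used: the order is random
    return results


annotations = [
    ("u1", "a.jpg", "sea"),
    ("u2", "a.jpg", "sea"),
    ("u3", "b.jpg", "sea"),
    ("u1", "c.jpg", "car"),
]
hits = boolean_search(annotations, "sea", rng=random.Random(0))
```

Because a single matching annotation is enough for a resource to appear, one spammer can place a resource in the results under this model.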
SLIDE 52
Occurrence Model
The occurrence model ranks the search results by the number of annotations containing the searched tag and returns the top-ranked results.
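A minimal sketch of occurrence-based ranking, assuming annotations are unique (user, resource, tag) triples. The name `occurrence_search` and the `top_k` cut-off are hypothetical.

```python
from collections import Counter


def occurrence_search(annotations, tag, top_k=10):
    """Rank resources by how many annotations attach the tag to them.

    annotations: iterable of unique (user, resource, tag) triples (assumed)
    """
    counts = Counter(res for _, res, t in annotations if t == tag)
    # most_common sorts by count, highest first.
    return [res for res, _ in counts.most_common(top_k)]


annotations = [
    ("u1", "a.jpg", "sea"),
    ("u2", "a.jpg", "sea"),
    ("u3", "b.jpg", "sea"),
    ("u1", "c.jpg", "car"),
]
top = occurrence_search(annotations, "sea")
```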
SLIDE 55
Coincidence Model
The coincidence model assigns each user a score computed from the number of that user's annotations that overlap with those of other users in the system.
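The slide gives only the idea of counting overlapping annotations, so the sketch below takes one possible reading: a user's score is the number of other users' annotations that attach the same tag to the same resource. The function name `coincidence_scores` and the triple layout are assumptions.

```python
from collections import defaultdict


def coincidence_scores(annotations):
    """Score each user by agreement with other users.

    annotations: iterable of (user, resource, tag) triples (assumed layout)
    Returns a dict user -> number of other users' annotations that coincide
    with that user's (resource, tag) pairs.
    """
    posters = defaultdict(set)  # (resource, tag) -> users who posted it
    for user, res, tag in annotations:
        posters[(res, tag)].add(user)
    scores = defaultdict(int)
    for users in posters.values():
        for user in users:
            scores[user] += len(users) - 1  # everyone else who agrees
    return dict(scores)


annotations = [
    ("u1", "a.jpg", "sea"),
    ("u2", "a.jpg", "sea"),
    ("u3", "a.jpg", "iphone"),  # nobody else agrees with this tag
    ("u1", "b.jpg", "sea"),
    ("u2", "b.jpg", "sea"),
]
scores = coincidence_scores(annotations)
```

Under this reading, a user whose tags nobody else shares gets a score of 0, while a group of users posting the same tags raise each other's scores.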
SLIDE 58
PINTS
In PINTS, the client first generates a feature vector that stores a characteristic score for each peer in the system. Using this vector, the client then selectively aggregates the annotation information and ranks the search results randomly.
SLIDE 59
Evaluation
Threat models:
- Random attacks: attackers annotate resources in the system with randomly chosen misleading tags.
- Targeted attacks: attackers collude to annotate resources with the same misleading tags.
- Tricky attacks: attackers annotate resources with both correct and misleading tags. This attack can render some anti-spam schemes unusable.
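To make the three threat models concrete, here is a hedged sketch of how spam annotations might be generated in a simulation. Everything here is an assumption for illustration: the function names, the (user, resource, tag) layout, the `true_tags` ground truth, and the choice to let a tricky attacker tag the same resources both correctly and misleadingly. The evaluation setup in the paper may differ.

```python
import random


def random_attack(attacker, resources, tags, true_tags, rng):
    """Each resource gets one randomly chosen misleading tag."""
    spam = []
    for res in resources:
        # Assumes the vocabulary holds at least one wrong tag per resource.
        wrong = [t for t in tags if t not in true_tags[res]]
        spam.append((attacker, res, rng.choice(wrong)))
    return spam


def targeted_attack(attackers, resources, misleading_tag):
    """All colluding attackers post the same misleading tag."""
    return [(a, res, misleading_tag) for a in attackers for res in resources]


def tricky_attack(attacker, resources, tags, true_tags, rng):
    """Correct tags are mixed with misleading ones to look honest."""
    mixed = []
    for res in resources:
        mixed.extend((attacker, res, t) for t in sorted(true_tags[res]))
    mixed.extend(random_attack(attacker, resources, tags, true_tags, rng))
    return mixed


# Hypothetical ground truth for two resources.
tags = ["sea", "car", "iphone"]
true_tags = {"a.jpg": {"sea"}, "b.jpg": {"car"}}
resources = ["a.jpg", "b.jpg"]
rng = random.Random(0)

random_spam = random_attack("m1", resources, tags, true_tags, rng)
targeted_spam = targeted_attack(["m1", "m2"], resources, "iphone")
tricky_spam = tricky_attack("m3", resources, tags, true_tags, rng)
```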
SLIDE 60
Evaluation
The impact of random attacks with 20% random attackers (see the paper for more detail).
SLIDE 61
Evaluation
The impact of targeted attacks with 20% targeted attackers (see the paper for more detail).
SLIDE 62
Evaluation
The impact of tricky attacks with 20% tricky attackers (see the paper for more detail).
SLIDE 63
Conclusion
SpamResist is a novel social reliability-based mechanism that works toward spam-free, personalized tag search results in P2P tagging systems.
SLIDE 64