SLIDE 1

Outline Link Prediction Supervised Rank Aggregation based Link Prediction Experiment Conclusion

Supervised Rank Aggregation Approach for Link Prediction in Complex Networks

Manisha Pujari & Rushed Kanawati

LIPN - UMR CNRS 7030, Université Paris Nord
99 Av. J.B. Clement, 93430 Villetaneuse, FRANCE
manisha.pujari@lipn.univ-paris13.fr

16 April, 2012

Mining Social Network Dynamics Workshop, WWW-2012, Lyon, France

M.Pujari & R.Kanawati Supervised Rank Aggregation Approach for Link Prediction 1/22

SLIDE 2

1. Link Prediction
2. Supervised Rank Aggregation based Link Prediction
3. Experiment
4. Conclusion

SLIDE 3

Problem

Link Prediction: predicting new links between nodes of a graph.

Applications:
- Recommender systems
- Academic/Professional collaborations
- Identification of structures of criminal networks
- Biological networks

SLIDE 4

Link Prediction Approaches

- Dyadic: Computation of a link score for unlinked vertices
- Structural: Mining rules for the evolution of sub-graphs
- Topology based: Attributes computed for the graph
- Node-feature based: Attributes computed for nodes
- Hybrid: Combination of the two
- Temporal: Consider the dynamics of the network
- Static: Do not consider the dynamics of the network

SLIDE 5

Link Prediction Approaches

- Dyadic: Computation of a link score for unlinked vertices
- Topology based: Attributes computed for the graph
- Temporal: Consider the dynamics of the network

SLIDE 6

Dyadic Topological Approaches

Work of [Liben-Nowell & al.,2007]:
- Prediction on a co-authorship network.
- For each unlinked node pair (u, v), compute a set of topological attributes [A1, A2, ..., An].
- Rank all (u, v) based on attribute values.
- Considering only the top k ranked edges as predicted edges, the performance of each attribute is found.

Attributes:
- Neighborhood-based attributes: Jaccard's coefficient, Common neighbors, Adamic/Adar [Adamic & al.2003], Preferential attachment, etc.
- Distance-based attributes: Shortest path distance, Katz [Katz,1953], Maximum forest algorithm, etc.
- Centrality-based attributes: PageRank, Degree centrality, Clustering coefficient, etc.
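
The neighborhood-based attributes can be illustrated with a minimal Python sketch. The graph, the function names and the adjacency-dict representation are assumptions made for illustration, not the authors' code; Adamic/Adar is written in the form commonly used for link prediction (sum of 1/log(degree) over common neighbors).

```python
import math
from itertools import combinations

def neighborhood_scores(adj, u, v):
    """Return four neighborhood-based attributes for the node pair (u, v)."""
    nu, nv = adj[u], adj[v]
    common = nu & nv
    union = nu | nv
    return {
        "common_neighbors": len(common),
        "jaccard": len(common) / len(union) if union else 0.0,
        # A common neighbor has degree >= 2, so log(degree) is never zero.
        "adamic_adar": sum(1.0 / math.log(len(adj[z])) for z in common),
        "preferential_attachment": len(nu) * len(nv),
    }

def rank_unlinked_pairs(adj, attribute):
    """Rank all unlinked pairs by one attribute, highest score first."""
    pairs = [(u, v) for u, v in combinations(sorted(adj), 2) if v not in adj[u]]
    return sorted(pairs,
                  key=lambda p: neighborhood_scores(adj, *p)[attribute],
                  reverse=True)

# Toy undirected graph as an adjacency dict (hypothetical data).
toy_graph = {
    "a": {"b", "c"},
    "b": {"a", "c", "d"},
    "c": {"a", "b", "d"},
    "d": {"b", "c", "e"},
    "e": {"d"},
}

# The top k ranked pairs are taken as the predicted edges.
top_k = rank_unlinked_pairs(toy_graph, "common_neighbors")[:2]
print(top_k)
```

Each attribute gives its own ranking of the same unlinked pairs, so running `rank_unlinked_pairs` once per attribute yields one ranked list per attribute.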

SLIDE 7

Dyadic Topological Approaches

Combining the effect of different topological measures: application of supervised machine learning algorithms [Benchettara & al.,2010], [Hasan & al., 2006].

Examples: (Node_x, Node_y) → [a_0, a_1, a_2, ..., a_n]

SLIDE 8

Dyadic Topological Approaches

Can we apply rank aggregation methods?

SLIDE 9

Rank Aggregation (Social choice theory)

⇒ To find an aggregated list with minimum possible disagreement
⇒ Equal weight to all experts

Expert1 ⇒ L1 = [A, B, C, D]
Expert2 ⇒ L2 = [B, D, A, C]
Expert3 ⇒ L3 = [C, D, A, B]
...
Expertn ⇒ Ln = [D, C, A, B]
-------------------------------
Laggregate = [?, ?, ?, ?]

Types of input lists:
- Full/Complete lists
- Partial lists
- Disjoint lists

SLIDE 10

Distance Measure

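Spearman Footrule Distance:
$$F(L_1, L_2) = \sum_{i} | L_1(i) - L_2(i) |$$
Kendall Tau Distance:
$$K(L_1, L_2) = | \{ (i, j) \ \text{s.t.}\ L_1(i) < L_1(j) \ \&\ L_2(i) > L_2(j) \} |$$
where $L(i)$ is the position of element $i$ in list $L$.

Example: L1 = [A, B, C, D] and L2 = [B, D, C, A]

F(L1, L2) = | L1(A) - L2(A) | + | L1(B) - L2(B) | + | L1(C) - L2(C) | + | L1(D) - L2(D) | = 3 + 1 + 0 + 2 = 6
K(L1, L2) = 4 (the discordant pairs are (A, B), (A, C), (A, D) and (C, D))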
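
Both distances can be checked with a short Python sketch on the example lists. It assumes full lists over the same elements and 1-based positions; the function names are illustrative.

```python
from itertools import combinations

def positions(ranking):
    """Map each element to its 1-based position in the ranked list."""
    return {item: pos for pos, item in enumerate(ranking, start=1)}

def spearman_footrule(l1, l2):
    """Sum of absolute position differences over all elements."""
    p1, p2 = positions(l1), positions(l2)
    return sum(abs(p1[x] - p2[x]) for x in p1)

def kendall_tau(l1, l2):
    """Number of element pairs that the two lists order differently."""
    p1, p2 = positions(l1), positions(l2)
    return sum(1 for x, y in combinations(l1, 2)
               if (p1[x] - p1[y]) * (p2[x] - p2[y]) < 0)

L1 = ["A", "B", "C", "D"]
L2 = ["B", "D", "C", "A"]
print(spearman_footrule(L1, L2))
print(kendall_tau(L1, L2))
```

For full lists the footrule distance is always even, because the signed position differences sum to zero.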

SLIDE 11

Borda’s Method [Borda, 1781]

Based on absolute positioning of elements:
$$B_{L_k}(i) = | \{ j \in L_k : L_k(j) < L_k(i) \} | ; \qquad B(i) = \sum_{t=1}^{k} B_{L_t}(i) \qquad (1)$$
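
A minimal Python sketch of the Borda score as written in Eq. (1), assuming full lists. With this definition the score counts the elements ranked ahead of an element, so the sketch assumes that the smallest total comes first in the aggregate and that ties are broken alphabetically; both choices are assumptions.

```python
def borda_scores(rankings):
    """For each element, sum over all lists the number of elements ranked ahead of it."""
    scores = {}
    for ranking in rankings:
        for ahead, item in enumerate(ranking):
            # `ahead` is the count of elements j with L(j) < L(item).
            scores[item] = scores.get(item, 0) + ahead
    return scores

def borda_aggregate(rankings):
    """Order elements by total Borda score, smallest first (assumed convention)."""
    scores = borda_scores(rankings)
    return sorted(scores, key=lambda item: (scores[item], item))

lists = [
    ["A", "B", "C", "D"],
    ["B", "D", "A", "C"],
    ["C", "D", "A", "B"],
]
print(borda_scores(lists))
print(borda_aggregate(lists))
```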

SLIDE 12

Kemeny Optimal Aggregation [Dwork & al.,2001]

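Based on relative ranking of elements:
$$SK(\pi, L_1, L_2, L_3, \ldots, L_n) = \sum_{i \in [1, n]} K(\pi, L_i) \qquad (2)$$
The Kemeny optimal aggregation is the ordering $\pi$ that minimizes $SK$.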
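
For very small inputs the Kemeny optimal aggregation can be found by exhaustive search, as in the Python sketch below. This brute-force search is only an illustration: it tries every permutation, so it is unusable beyond a handful of elements. The function names are assumptions.

```python
from itertools import combinations, permutations

def kendall_tau(l1, l2):
    """Number of element pairs that the two lists order differently."""
    p1 = {x: i for i, x in enumerate(l1)}
    p2 = {x: i for i, x in enumerate(l2)}
    return sum(1 for x, y in combinations(l1, 2)
               if (p1[x] - p1[y]) * (p2[x] - p2[y]) < 0)

def kemeny_score(candidate, rankings):
    """SK: total Kendall tau distance from the candidate to all input lists."""
    return sum(kendall_tau(candidate, r) for r in rankings)

def kemeny_optimal(rankings):
    """Exhaustive search for the permutation with minimum SK."""
    items = sorted(rankings[0])
    return min(permutations(items), key=lambda pi: kemeny_score(pi, rankings))

lists = [["A", "B", "C"], ["A", "C", "B"], ["B", "A", "C"]]
best = kemeny_optimal(lists)
print(best, kemeny_score(best, lists))
```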

SLIDE 13

Supervised Rank Aggregation

Combining different rankings to get an aggregation, giving different weights to the experts.

⇒ Proposed approaches:
- Supervised Borda
- Supervised local Kemeny

w1 ← Expert1 ⇒ L1 → [k elements]
w2 ← Expert2 ⇒ L2 → [k elements]
w3 ← Expert3 ⇒ L3 → [k elements]
...
wn ← Expertn ⇒ Ln → [k elements]

SLIDE 14

Supervised Borda Method

Borda score:
$$B(i) = \sum_{t=1}^{n} w_t * B_{L_t}(i) ; \quad \text{where } t \in [1, n] \qquad (3)$$
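
A minimal Python sketch of a weighted Borda score, assuming one weight per ranked list and full lists. As in the unweighted case, it assumes that the score counts the elements ranked ahead, so the smallest weighted total comes first. Names and weights are illustrative.

```python
def supervised_borda(rankings, weights):
    """Weighted Borda: each list's contribution is scaled by its expert weight."""
    scores = {}
    for ranking, w in zip(rankings, weights):
        for ahead, item in enumerate(ranking):
            scores[item] = scores.get(item, 0.0) + w * ahead
    # Smallest weighted score first; ties broken alphabetically (assumption).
    order = sorted(scores, key=lambda item: (scores[item], item))
    return order, scores

lists = [
    ["A", "B", "C", "D"],
    ["B", "D", "A", "C"],
    ["C", "D", "A", "B"],
]
weights = [1.0, 1.0, 3.0]  # hypothetical weights: the third expert is trusted most
order, scores = supervised_borda(lists, weights)
print(order)
```

With these weights the aggregate follows the third list much more closely than an equal-weight Borda count would.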

SLIDE 15

Supervised Local Kemeny Aggregation

Steps:

1. Input: L = [L1, L2, ..., Ln], [w1, w2, ..., wn], m elements (U)
2. Initialize m × m matrix M with M(x, y) = 0
3. ∀(x, y) ∈ U, compute $score(x, y) = \sum_{i=1}^{n} (w_i * (x \succ y))$, where $x \succ y = 0$ if $L_i(x) < L_i(y)$ and $x \succ y = 1$ if $L_i(x) > L_i(y)$
4. If $score(x, y) > 0.5 * \sum_{i=1}^{n} w_i$, insert M(x, y) = true and M(y, x) = false
5. Initial aggregation R = L1
6. For x, y ∈ R, Swap(x, y) if M(x, y) = false
7. R is the final aggregation.
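
The steps can be sketched in Python as follows. The direction of the comparison and the order in which pairs are visited are not fixed by the steps, so this sketch makes two labeled assumptions: `prefers[(x, y)]` is true when lists holding more than half of the total weight place x ahead of y, and the swaps are done in an insertion-style pass over adjacent elements starting from L1. It is one possible reading, not the authors' implementation.

```python
def preference_matrix(rankings, weights):
    """prefers[(x, y)] is True when the weighted majority places x ahead of y (assumed direction)."""
    items = list(rankings[0])
    pos = [{x: i for i, x in enumerate(r)} for r in rankings]
    half = 0.5 * sum(weights)
    prefers = {}
    for x in items:
        for y in items:
            if x != y:
                score = sum(w for p, w in zip(pos, weights) if p[x] < p[y])
                prefers[(x, y)] = score > half
    return prefers

def supervised_local_kemeny(rankings, weights):
    prefers = preference_matrix(rankings, weights)
    result = list(rankings[0])  # initial aggregation R = L1
    # Move each element up while the weighted majority prefers it
    # to its immediate predecessor.
    for i in range(1, len(result)):
        j = i
        while j > 0 and prefers[(result[j], result[j - 1])]:
            result[j], result[j - 1] = result[j - 1], result[j]
            j -= 1
    return result

lists = [
    ["A", "B", "C", "D"],
    ["B", "D", "A", "C"],
    ["C", "D", "A", "B"],
]
print(supervised_local_kemeny(lists, [1.0, 1.0, 3.0]))
print(supervised_local_kemeny(lists, [1.0, 1.0, 1.0]))
```

Only adjacent elements are compared, so the result is locally consistent with the weighted majority but is not guaranteed to minimize the global Kemeny score.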

SLIDE 16

Supervised Local Kemeny Aggregation

SLIDE 17

Link Prediction based on Supervised Rank Aggregation

Examples: (Node_x, Node_y) → [a_0, a_1, a_2, ..., a_n]

Steps:

1. Rank learning examples by attribute values
2. Consider only top k examples and compute attribute weight w_ai
3. Rank test examples by attribute to get n ranked lists
4. Apply supervised rank aggregation
5. Consider only top k examples of the aggregate list and compute performance.

SLIDE 18

Link Prediction Based on Supervised Rank Aggregation

Computation of attribute weights:

Maximization of positive precision:
$$W_{a_i} = n * Precision_{a_i} \qquad (4)$$
Minimization of false positive rate:
$$W_{a_i} = \frac{n}{FPR_{a_i}} \qquad (5)$$
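
The two weighting schemes can be sketched in Python for a single attribute. The sketch assumes that the learning examples are already sorted by that attribute, that the top k are treated as predicted links, and that n is passed in by the caller (here the number of attributes, which is an assumption). The zero-FPR case is handled by returning infinity, which is also an assumption.

```python
def top_k_confusion(ranked_labels, k):
    """ranked_labels: 0/1 labels of learning examples sorted by one attribute."""
    tp = sum(ranked_labels[:k])   # true links among the top k
    fp = k - tp                   # non-links among the top k
    negatives = len(ranked_labels) - sum(ranked_labels)
    return tp, fp, negatives

def precision_weight(ranked_labels, k, n):
    """Weight in the style of Eq. (4): n times the precision at k."""
    tp, _, _ = top_k_confusion(ranked_labels, k)
    return n * tp / k

def fpr_weight(ranked_labels, k, n):
    """Weight in the style of Eq. (5): n divided by the false positive rate at k."""
    _, fp, negatives = top_k_confusion(ranked_labels, k)
    fpr = fp / negatives if negatives else 0.0
    # Assumption: an attribute with no false positives gets an infinite weight.
    return n / fpr if fpr > 0 else float("inf")

# Hypothetical labels of eight learning examples ranked by one attribute.
labels = [1, 0, 1, 0, 0, 0, 1, 0]
print(precision_weight(labels, k=4, n=3))
print(fpr_weight(labels, k=4, n=3))
```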

SLIDE 19

Experiment

DBLP database

Datasets    Training time   Validation time   Training examples     Test examples
                                              Positive   Total      Positive   Total
Dataset 1   [1970-1975]     [1971-1976]       30         1693       41         3471
Dataset 2   [1972-1977]     [1973-1978]       87         19332      82         18757
Dataset 3   [1974-1979]     [1975-1980]       102        35190      164        60046

Table: DBLP Datasets

Performance measure:
$$F = \frac{Precision * Recall}{Precision + Recall} \qquad (6)$$
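
A short Python sketch of the performance measure on a set of predicted links. It follows Eq. (6) as written; note that the usual F1 score is twice this value. The predicted and actual links are hypothetical.

```python
def precision_recall(predicted, actual):
    """Precision and recall of a set of predicted links against the true links."""
    predicted, actual = set(predicted), set(actual)
    tp = len(predicted & actual)
    precision = tp / len(predicted) if predicted else 0.0
    recall = tp / len(actual) if actual else 0.0
    return precision, recall

def f_measure(precision, recall):
    """F as written in Eq. (6); the usual F1 score is twice this value."""
    if precision + recall == 0:
        return 0.0
    return precision * recall / (precision + recall)

# Hypothetical top-k predictions and true new links.
predicted = [("a", "d"), ("b", "e"), ("c", "e")]
actual = [("a", "d"), ("c", "e"), ("a", "e"), ("b", "d")]
p, r = precision_recall(predicted, actual)
print(p, r, f_measure(p, r))
```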

SLIDE 20

Results

Experiment-1 : Performance based on learning on complete training dataset

SLIDE 21

Results

Experiment-2 : Performance in terms of precision based on learning on samples of training dataset

P is the number of positive examples and N is the number of negative examples. In any sample, N = m ∗ P where m is any positive integer.

SLIDE 22

Results

Experiment-3 : Performance based on learning on complete training dataset and validation on test sample (N = 5 ∗ P).

Datasets    Training examples     Test examples
            Positive   Total      Positive   Total
Dataset 1   30         1693       41         246
Dataset 2   87         19332      82         492
Dataset 3   102        35190      164        984

SLIDE 23

Conclusion

- A new definition for supervised rank aggregation.
- Application of supervised rank aggregation to link prediction.

Future work:

⇒ Validation on other types of networks, such as e-commerce networks.
⇒ Application to tag recommendation in folksonomies [Pujari & al., 2011], [Pujari & al., 2012].
⇒ Application of our approach with community detection methods.

SLIDE 25

References I

[Benchettara & al.,2010] N. Benchettara, R. Kanawati, C. Rouveirol. Supervised machine learning applied to link prediction in bipartite social networks. In International Conference on Advances in Social Network Analysis and Mining, ASONAM 2010, 2010.
[Dwork & al.,2001] C. Dwork, R. Kumar, M. Naor, D. Sivakumar. Rank aggregation methods for the Web. WWW '01: Proceedings of the 10th International Conference on World Wide Web, pages 613-622, 2001.
[Kumar & al., 2001] C. Dwork, R. Kumar, M. Naor, D. Sivakumar. Rank aggregation revisited. Manuscript, 2001.
[Katz,1953] L. Katz. A new status index derived from sociometric analysis. Psychometrika, Vol. 18, pages 39-43, 1953.
[Borda, 1781] J.C. Borda. Mémoire sur les élections au scrutin. Histoire de l'Académie Royale des Sciences, 1781.
[Fagin & al., 2003] R. Fagin, R. Kumar, D. Sivakumar. Efficient similarity search and classification via rank aggregation. Proceedings of the 2003 ACM SIGMOD International Conference on Management of Data, pages 301-312, 2003, New York.
[Sculley,2007] D. Sculley. Rank Aggregation for Similar Items. Proceedings of the Seventh SIAM International Conference on Data Mining, April 26-28, 2007, Minneapolis, Minnesota, USA.
[Adamic & al.2003] L. A. Adamic, O. Buyukkokten, and E. Adar. A social network caught in the web. First Monday, Vol. 8, No. 6 (June 2003).
[Bisson & al., 2008] G. Bisson and F. Hussain. χ-Sim: A new similarity measure for the co-clustering task. In Seventh International Conference on Machine Learning and Applications (ICMLA), IEEE Computer Society, 2008, pages 211-217.
[Mrosek & al.,2009] J. Mrosek, S. Bussmann, H. Albers, K. Posdziech, B. Hengefeld, N. Opperman, S. Robert and G. Spirar. Content- and graph-based tag recommendation: Two variations. ECML PKDD Discovery Challenge 2009 (DC09), 497, pages 189-199. Bled, Slovenia, CEUR Workshop Proceedings, September 2009.

SLIDE 26

References II

[Lipczak, 2008] M. Lipczak. Tag recommendation for folksonomies oriented towards individual users. In Proceedings of ECML PKDD Discovery Challenge (RSDC08), 2008, pages 84-95.
[Liben-Nowell & al.,2007] David Liben-Nowell and Jon Kleinberg. The link prediction problem for social networks. In Proceedings of the 16th International Conference on World Wide Web, pages 481-490, New York, USA, 2007.
[Hasan & al., 2006] Mohammad Al Hasan, Vineet Chaoji, Saeed Salem, and Mohammed Zaki. Link prediction using supervised learning. SIAM Workshop on Link Analysis, Counterterrorism and Security with SIAM Data Mining Conference, 2006.
[Acar & al.,2009] Evrim Acar, Daniel M. Dunlavy and Tamara G. Kolda. Link Prediction on Evolving Data Using Matrix and Tensor Factorizations. ICDM Workshops, pages 262-269, 2009.
[Lahiri & al., 2007] Mayank Lahiri and Tanya Y. Berger-Wolf. Structure Prediction in Temporal Networks using Frequent Subgraphs. CIDM, pages 35-42, 2007.
[Pujari & al., 2011] Manisha Pujari and Rushed Kanawati. Supervised machine learning link prediction approach for tag recommendation. 4th International Conference on Online Communities and Social Computing @ HCI International, Orlando, Florida, 9-14 July 2011.
[Pujari & al., 2012] Manisha Pujari and Rushed Kanawati. Tag recommendation by link prediction based on supervised machine learning. Sixth International AAAI Conference on Weblogs and Social Media (ICWSM 2012), 5-8 June 2012, Dublin.