Supervised Rank Aggregation Approach for Link Prediction in Complex - - PowerPoint PPT Presentation

SLIDE 1

Supervised Rank Aggregation Approach for Link Prediction in Complex Networks

Manisha Pujari & Rushed Kanawati

A3 firstname.lastname@lipn.univ-paris13.fr

17/10/2012

SLIDE 2

Outline Link Prediction Supervised Rank Aggregation Experiment Conclusion

1. Link Prediction
2. New Approach: Supervised Rank Aggregation
3. Experiment
4. Conclusion

  • M. Pujari & R. Kanawati

Link Prediction by Supervised Rank Aggregation 2/28

SLIDE 3

Link Prediction Problem

Link Prediction: predicting missing, hidden, or new links between nodes of a graph.

Applications:
  • Recommender systems
  • Academic/professional collaborations
  • Identification of structures of criminal networks
  • Biological networks

SLIDE 4

Link Prediction Approaches

  • Dyadic: computation of a link score for unlinked vertices
  • Structural: mining rules for the evolution of sub-graphs
  • Topology-based: attributes computed for the graph
  • Node-feature-based: attributes computed for nodes
  • Hybrid: combination of the two
  • Temporal: consider the dynamics of the network
  • Static: do not consider the dynamics of the network

SLIDE 5

Link Prediction Approaches

  • Dyadic: computation of a link score for unlinked vertices
  • Topology-based: attributes computed for the graph
  • Temporal: consider the dynamics of the network

SLIDE 6

Dyadic Topological Approaches

Work of [Liben-Nowell & al.,2007]

SLIDE 8

Combining the effect of different topological measures: Application of supervised machine learning algorithms

Work of [Hasan & al., 2006]

Examples: (Node_x, Node_y) → [a_0, a_1, ..., a_n]
  • [3, 1, Positive]
  • [1, 0.33, Negative]
  • ...

SLIDE 9

Combining the effect of different topological measures: Application of supervised machine learning algorithms

Work of [Benchettara & al.,2010]

SLIDE 11

Combining the effect of different topological measures: Application of supervised machine learning algorithms

Work of [Benchettara & al.,2010]

Can we apply rank aggregation methods?

SLIDE 12

Rank Aggregation (Social choice theory)

Combining various lists of ranked candidates to find a single list with the minimum possible disagreement.

Expert_1 ⇒ L_1 = [A, B, C, D]
Expert_2 ⇒ L_2 = [B, D, A, C]
Expert_3 ⇒ L_3 = [C, D, A, B]
...
Expert_n ⇒ L_n = [D, C, A, B]
----------------------------------
L_aggregate = [?, ?, ?, ?]
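The slide does not name the disagreement measure. A common choice, and the one underlying Kemeny aggregation, is the Kendall tau distance: the number of element pairs that two rankings order differently. The Python sketch below scores candidate aggregates against the example lists above, treating L_n as a fourth expert. The function names and the exhaustive search are illustrative assumptions, not part of the presentation.

```python
from itertools import combinations, permutations


def kendall_tau(a, b):
    """Count the element pairs that rankings a and b order differently."""
    pos_a = {x: i for i, x in enumerate(a)}
    pos_b = {x: i for i, x in enumerate(b)}
    return sum(
        1
        for x, y in combinations(a, 2)
        if (pos_a[x] - pos_a[y]) * (pos_b[x] - pos_b[y]) < 0
    )


def total_disagreement(candidate, lists):
    """Sum of Kendall tau distances from the candidate to every input list."""
    return sum(kendall_tau(candidate, ranked) for ranked in lists)


def brute_force_aggregate(lists):
    """Exhaustive search over all orderings; only feasible for a few candidates."""
    return min(
        permutations(lists[0]),
        key=lambda cand: total_disagreement(cand, lists),
    )


# Example lists from the slide (L_n used as a fourth expert).
experts = [list("ABCD"), list("BDAC"), list("CDAB"), list("DCAB")]
best = brute_force_aggregate(experts)
```

The exhaustive search grows factorially with the number of candidates, which is why heuristic aggregation methods are used in practice.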

SLIDE 13

Supervised Rank Aggregation

Combining different rankings to get an aggregation, giving different weights to the experts.

w_1 ← Expert_1 ⇒ L_1 → [k elements]
w_2 ← Expert_2 ⇒ L_2 → [k elements]
w_3 ← Expert_3 ⇒ L_3 → [k elements]
...
w_n ← Expert_n ⇒ L_n → [k elements]

We propose link prediction based on:
1. Supervised Borda
2. Supervised Kemeny

SLIDE 14

Supervised Borda Aggregation

Borda score:
  B(x) = \sum_{i=1}^{n} B_{L_i}(x), where B_{L_i}(x) = |\{ y \in L_i : L_i(y) > L_i(x) \}|   (1)

Supervised Borda score:
  B(x) = \sum_{i=1}^{n} w_i * B_{L_i}(x)   (2)

NOTE: L_i(x) represents the rank (or index) of element x in input list L_i. The lower the rank value, the higher the preference. U is the set of all elements in the lists.
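A minimal Python sketch of equations (1) and (2), assuming every input list ranks the same elements with the most preferred first. The function names and the alphabetical tie-break are assumptions made for illustration.

```python
def borda_scores(lists, weights=None):
    """Weighted Borda score of every element; weights default to 1 (plain Borda)."""
    if weights is None:
        weights = [1.0] * len(lists)
    scores = {}
    for ranked, w in zip(lists, weights):
        m = len(ranked)
        for rank, x in enumerate(ranked):
            # B_{L_i}(x): number of elements ranked below x in this list.
            scores[x] = scores.get(x, 0.0) + w * (m - 1 - rank)
    return scores


def borda_aggregate(lists, weights=None):
    """Elements sorted by decreasing score; ties broken by name (an assumption)."""
    scores = borda_scores(lists, weights)
    return sorted(scores, key=lambda x: (-scores[x], x))


lists = [["A", "B", "C"], ["B", "A", "C"], ["B", "C", "A"]]
unweighted = borda_aggregate(lists)
weighted = borda_aggregate(lists, weights=[5, 1, 1])
```

With equal weights the two lists that prefer B win; giving the first expert a weight of 5 moves A to the top, which is the effect the supervised variant is meant to capture.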
SLIDE 15

Kemeny Optimal Aggregation [Dwork & al.,2001]

  • Based on relative ranking of elements
  • NP-hard
  • Approximate Kemeny

SLIDE 16

Supervised Kemeny Aggregation

Inputs: ranked lists [L_1, L_2, ..., L_n] and weights [w_1, w_2, ..., w_n], each list with m elements (U)

Steps:
1. Initial aggregation R
2. ∀(x, y) ∈ R, compute score(x, y) = \sum_{i=1}^{n} w_i * Pref_i(x, y), where
     Pref_i(x, y) = 0 if y ≻ x, i.e. L_i(x) > L_i(y)
     Pref_i(x, y) = 1 if x ≻ y, i.e. L_i(x) < L_i(y)
3. If score(x, y) > w_T / 2, where w_T = \sum_{i=1}^{n} w_i, then x ≻_w y
4. Apply a sorting algorithm on R: Swap(x, y) only if y ≻_w x
5. R is the final aggregation.
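The five steps can be sketched in Python as follows. The slide leaves both the initial aggregation and the sorting algorithm open, so this sketch assumes the first input list as the starting point and an insertion sort with adjacent swaps. The function names are hypothetical.

```python
def weighted_prefers(x, y, positions, weights):
    """True if x >_w y: the lists ranking x before y hold more than half the weight."""
    score = sum(w for pos, w in zip(positions, weights) if pos[x] < pos[y])
    return score > sum(weights) / 2


def supervised_kemeny(lists, weights, initial=None):
    """Approximate weighted Kemeny aggregation by adjacent swaps."""
    # Rank lookup per list: positions[i][x] is L_i(x).
    positions = [{x: i for i, x in enumerate(ranked)} for ranked in lists]
    # Step 1: initial aggregation R (assumed: the first list unless given).
    r = list(initial if initial is not None else lists[0])
    # Steps 2-4: insertion sort, swapping a neighbouring pair only when the
    # lower element is preferred by the weighted majority.
    for i in range(1, len(r)):
        j = i
        while j > 0 and weighted_prefers(r[j], r[j - 1], positions, weights):
            r[j - 1], r[j] = r[j], r[j - 1]
            j -= 1
    # Step 5: R is the final aggregation.
    return r


lists = [["A", "B", "C"], ["B", "A", "C"], ["B", "C", "A"]]
equal = supervised_kemeny(lists, [1, 1, 1])
skewed = supervised_kemeny(lists, [5, 1, 1])
```

Because the comparison is a strict majority, pairs whose weighted score is exactly w_T / 2 keep their initial order, so the result can depend on the choice of R.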

SLIDE 17

Supervised Kemeny Aggregation

SLIDE 18

Link Prediction based on Supervised Rank Aggregation

Examples: (Node_x, Node_y) → [a_0, a_1, a_2, ..., a_n]

Steps:

Learning
1. Rank learning examples by attribute values
2. Consider only the top t examples and compute the attribute weight w_{a_i}

Validation
1. Rank test examples by attribute to get n ranked lists
2. Apply supervised rank aggregation
3. Consider only the top k examples of the aggregate list and compute performance.
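The ranking step shared by learning and validation can be sketched in Python as below. The toy node pairs, the third attribute, and the `higher_is_better` flags are assumptions introduced for illustration. The slides do not say which attributes are sorted in which direction.

```python
def rank_by_attributes(examples, higher_is_better):
    """Return one ranked list of node pairs per attribute.

    examples: dict mapping a node pair to its attribute vector.
    higher_is_better: one flag per attribute (assumed; False for a distance).
    """
    lists = []
    for i, descending in enumerate(higher_is_better):
        ranked = sorted(examples, key=lambda pair: examples[pair][i], reverse=descending)
        lists.append(ranked)
    return lists


def top_k(aggregate, k):
    """Keep only the k best-ranked examples of an aggregated list."""
    return list(aggregate[:k])


# Hypothetical examples: two similarity-like attributes and one distance.
examples = {
    ("x1", "y1"): [3, 1.0, 2],
    ("x2", "y2"): [1, 0.33, 4],
    ("x3", "y3"): [2, 0.5, 3],
}
lists = rank_by_attributes(examples, [True, True, False])
```

The resulting lists are what a supervised aggregation method takes as input, together with one weight per attribute.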

SLIDE 19

Link Prediction Based on Supervised Rank Aggregation

Computation of attribute weights:

Maximization of identification of positive examples:
  W_{a_i} = n * Precision_{a_i}   (3)
where n is the total number of attributes and Precision_{a_i} is the precision of attribute a_i.
precision = fraction of retrieved examples that are really positive

Minimization of identification of negative examples:
  W_{a_i} = n * (1 - FPR_{a_i})   (4)
where n is the total number of attributes and FPR_{a_i} is the false positive rate of attribute a_i.
false positive rate = fraction of negative examples retrieved as positive
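Equations (3) and (4) can be sketched in Python as follows, assuming that the "retrieved" examples of an attribute are the top t entries of the list ranked by that attribute. The function name, the `mode` parameter, and the zero-division guards are assumptions.

```python
def attribute_weights(ranked_lists, positives, t, mode="precision"):
    """One weight per attribute from its ranked list of learning examples.

    mode="precision": W = n * precision of the top t examples, eq. (3).
    mode="fpr":       W = n * (1 - false positive rate), eq. (4).
    """
    n = len(ranked_lists)
    weights = []
    for ranked in ranked_lists:
        top = ranked[:t]
        true_pos = sum(1 for e in top if e in positives)
        false_pos = len(top) - true_pos
        if mode == "precision":
            precision = true_pos / len(top) if top else 0.0
            weights.append(n * precision)
        else:
            negatives = sum(1 for e in ranked if e not in positives)
            fpr = false_pos / negatives if negatives else 0.0
            weights.append(n * (1 - fpr))
    return weights


# Toy learning set: two attributes, two positive and two negative examples.
ranked_lists = [["p1", "n1", "p2", "n2"], ["n1", "n2", "p1", "p2"]]
positives = {"p1", "p2"}
w_precision = attribute_weights(ranked_lists, positives, t=2)
w_fpr = attribute_weights(ranked_lists, positives, t=2, mode="fpr")
```

In this toy case the second attribute ranks only negatives in its top 2, so it receives a weight of 0 under both definitions.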

SLIDE 20

DBLP data

Author-Document bipartite graphs

Datasets   Training Time   Authors   Publications   Edges
Dataset1   [1970,1973]     2661      1487           6634
Dataset2   [1972,1975]     4536      2542           10855

Projected graphs

Datasets   Author Graph        Publication Graph
           Nodes     Edges     Nodes     Edges
Dataset1   2661      2575      1487      1520
Dataset2   4536      4510      2542      2813

Examples

Datasets   Training Time   Labeling Time   Testing Time   Training examples   Test examples
                                                          Pos      Neg        Pos     Neg
Dataset1   [1970,1973]     [1974,1975]     [1971,1974]    30       1663       41      3430
Dataset2   [1972,1975]     [1976,1977]     [1973,1976]    87       19245      82      18675

SLIDE 21

Topological Attributes

Neighborhood-based attributes:
  • Common neighbors: VC(x, y) = |Γ(x) ∩ Γ(y)|
  • Jaccard's coefficient: JC(x, y) = |Γ(x) ∩ Γ(y)| / |Γ(x) ∪ Γ(y)|
  • Adamic-Adar: AD(x, y) = \sum_{z ∈ Γ(x) ∩ Γ(y)} 1 / \log |Γ(z)| [Adamic & al.2003]
  • Preferential attachment: AP(x, y) = |Γ(x)| × |Γ(y)| [Huang & al., 2005]

Distance-based attributes:
  • Shortest path distance (Dis)
  • Katz: Katz(x, y) = \sum_{ℓ=1}^{∞} β^ℓ × path^{(ℓ)}_{x,y}, where path^{(ℓ)}_{x,y} is the number of paths between x and y of length ℓ and β is a positive parameter which favours shortest paths [Katz,1953]
  • Maximum forest algorithm (MFA) [Fouss & al., 2007]

Centrality-based attributes:
  • Product of PageRank (PPR) [Brin & al., 1998]
  • Product of degree centrality (PCD)
  • Product of clustering coefficient (PCF)
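The four neighborhood-based attributes can be computed directly from neighbour sets. The Python sketch below assumes an undirected graph stored as a dict of neighbour sets and uses the natural logarithm for Adamic-Adar. The toy graph and the function name are illustrative only.

```python
import math


def neighbourhood_attributes(adj, x, y):
    """VC, JC, AD and AP for the node pair (x, y).

    adj maps every node to the set of its neighbours, i.e. Gamma(node).
    """
    gx, gy = adj[x], adj[y]
    common = gx & gy
    union = gx | gy
    return {
        "VC": len(common),
        "JC": len(common) / len(union) if union else 0.0,
        # Skip degree-1 nodes to avoid dividing by log(1) = 0.
        "AD": sum(1 / math.log(len(adj[z])) for z in common if len(adj[z]) > 1),
        "AP": len(gx) * len(gy),
    }


# Toy undirected graph with edges a-c, a-d, b-c, b-d, b-e, d-e.
adj = {
    "a": {"c", "d"},
    "b": {"c", "d", "e"},
    "c": {"a", "b"},
    "d": {"a", "b", "e"},
    "e": {"b", "d"},
}
attrs = neighbourhood_attributes(adj, "a", "b")
```

For the unlinked pair (a, b) this gives two common neighbours, a Jaccard coefficient of 2/3, and a preferential attachment score of 6.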

SLIDE 22

Results

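Results (average precision) obtained by ranking the test examples by attribute values. Values are listed as Dataset1, Dataset2 where both are shown; where a row shows a single value, its column cannot be determined from the slide text.

Attributes:
  • Katz: no value shown
  • MFA: 0.0244, 0.0732
  • PPR: 0.0244, 0.0244
  • PCF: 0.0732, 0.0244
  • PCD: no value shown
  • VC: 0.5122, 0.4268
  • JC: 0.2195, 0.1707
  • AD: 0.1463, 0.1463
  • AP: 0.0488 (single value)
  • Dis: 0.0122 (single value)

Indirect attributes:
  • Indirect Katz: 0.1220, 0.1098
  • Indirect MFA: 0.0488, 0.0732
  • Indirect PPR: 0.0488 (single value)
  • Indirect PCF: 0.4878, 0.4756
  • Indirect PCD: 0.0244, 0.0122
  • Indirect VC: 0.0488, 0.0488
  • Indirect JC: 0.0488, 0.1098
  • Indirect AD: 0.0976, 0.0488
  • Indirect AP: 0.0244 (single value)
  • Indirect Dis: 0.6098, 0.5366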

SLIDE 23

Experiment-1

Performance measure: Precision = |Positive links ∩ Predicted links| / |Predicted links|

Figure: Precision for the complete test set when learning on the complete training set

SLIDE 24

Experiment-2(a): Performance of supervised Kemeny by varying K for validation

Dataset-1
Figure: Precision
Figure: Recall

SLIDE 25

Experiment-2(b): Performance of supervised Borda by varying K for validation

Dataset-1
Figure: Precision
Figure: Recall

SLIDE 26

Conclusion

  • Link prediction and temporal dyadic approaches
  • A new definition for supervised Kemeny aggregation
  • Application of supervised rank aggregation to link prediction

Perspectives:
  ⇒ Application of top-k aggregation [Kumar & al., 2009]: reduce the complexity caused by rank aggregation
  ⇒ Use of communities and community-based information (work with Zied Yakoubi)
  ⇒ Application to heterogeneous scientific collaboration networks
  ⇒ Application to tag recommendation in folksonomies

SLIDE 27

THANK YOU. QUESTIONS?

SLIDE 28

References I

[Benchettara & al.,2010] N. Benchettara, R. Kanawati, C. Rouveirol. Supervised machine learning applied to link prediction in bipartite social networks. In International Conference on Advances in Social Network Analysis and Mining, ASONAM 2010, 2010.

[Brin & al., 1998] Sergey Brin, Lawrence Page. The anatomy of a large-scale hypertextual Web search engine. Proceedings of the Seventh International Conference on the World Wide Web, 1998.

[Dwork & al.,2001] C. Dwork, R. Kumar, M. Naor, D. Sivakumar. Rank aggregation methods for the Web. WWW '01: Proceedings of the 10th International Conference on World Wide Web, pages 613-622, 2001.

[Kumar & al., 2001] C. Dwork, R. Kumar, M. Naor, D. Sivakumar. Rank aggregation revisited. Manuscript, 2001.

[Katz,1953] L. Katz. A new status index derived from sociometric analysis. Psychometrika, Vol. 18, pages 39-43, 1953.

[Borda, 1781] J. C. Borda. Mémoire sur les élections au scrutin. Histoire de l'Académie Royale des Sciences, 1781.

[Fagin & al., 2003] R. Fagin, R. Kumar, D. Sivakumar. Efficient similarity search and classification via rank aggregation. Proceedings of the 2003 ACM SIGMOD International Conference on Management of Data, pages 301-312, 2003, New York.

[Sculley,2007] D. Sculley. Rank Aggregation for Similar Items. Proceedings of the Seventh SIAM International Conference on Data Mining, April 26-28, 2007, Minneapolis, Minnesota, USA.

[Adamic & al.2003] L. A. Adamic, O. Buyukkokten, and E. Adar. A social network caught in the web. First Monday, Vol. 8, No. 6, June 2003.

[Bisson & al., 2008] G. Bisson and F. Hussain. χ-Sim: A new similarity measure for the co-clustering task. In Seventh International Conference on Machine Learning and Applications (ICMLA), IEEE Computer Society, 2008, pages 211-217.

SLIDE 29

References II

[Mrosek & al.,2009] J. Mrosek, S. Bussmann, H. Albers, K. Posdziech, B. Hengefeld, N. Opperman, S. Robert and G. Spirar. Content- and graph-based tag recommendation: Two variations. ECML PKDD Discovery Challenge 2009 (DC09), 497, pages 189-199. Bled, Slovenia, CEUR Workshop Proceedings, September 2009.

[Lipczak, 2008] M. Lipczak. Tag recommendation for folksonomies oriented towards individual users. In Proceedings of ECML PKDD Discovery Challenge (RSDC08), 2008, pages 84-95.

[Liben-Nowell & al.,2007] David Liben-Nowell and Jon Kleinberg. The link prediction problem for social networks. In Proceedings of the 16th International Conference on World Wide Web, pages 481-490, New York, USA, 2007.

[Hasan & al., 2006] Mohammad Al Hasan, Vineet Chaoji, Saeed Salem, and Mohammed Zaki. Link prediction using supervised learning. SIAM Workshop on Link Analysis, Counterterrorism and Security with SIAM Data Mining Conference, 2006.

[Huang & al., 2005] Zan Huang, Xin Li and Hsinchun Chen. Link prediction approach to collaborative filtering. Proceedings of the 5th ACM/IEEE-CS Joint Conference on Digital Libraries, pages 141-142, New York, USA, 2005.

[Fouss & al., 2007] Francois Fouss, Alain Pirotte, Jean-Michel Renders and Marco Saerens. Random-Walk Computation of Similarities between Nodes of a Graph with Application to Collaborative Recommendation. IEEE Transactions on Knowledge and Data Engineering, pages 355-369, 2007.

[Acar & al.,2009] Evrim Acar, Daniel M. Dunlavy and Tamara G. Kolda. Link Prediction on Evolving Data Using Matrix and Tensor Factorizations. ICDM Workshops, pages 262-269, 2009.

[Lahiri & al., 2007] Mayank Lahiri and Tanya Y. Berger-Wolf. Structure Prediction in Temporal Networks using Frequent Subgraphs. CIDM, pages 35-42, 2007.

[Kumar & al., 2009] Ravi Kumar, Kunal Punera, Torsten Suel and Sergei Vassilvitskii. Top-k aggregation using intersections of ranked inputs. Proceedings of the Second ACM International Conference on Web Search and Data Mining (WSDM), pages 222-231, 2009, Barcelona, Spain.

SLIDE 30

References III

[Pujari & al., 2011] Manisha Pujari and Rushed Kanawati. Supervised machine learning link prediction approach for tag recommendation. 4th International Conference on Online Communities and Social Computing @ HCI International, Orlando, Florida, 9-14 July 2011.

[Pujari & al., 2012] Manisha Pujari and Rushed Kanawati. Tag recommendation by link prediction based on supervised machine learning. Sixth International AAAI Conference on Weblogs and Social Media (ICWSM 2012), 5-8 June 2012, Dublin.

[Subbian & al., 2011] K. Subbian and P. Melville. Supervised rank aggregation for predicting influence in networks. In Proceedings of the IEEE Conference on Social Computing (SocialCom-2011), Boston, October 2011.

[Liu & al., 2007] Yu-Ting Liu, Tie-Yan Liu, Tao Qin, Zhi-Ming Ma, and Hang Li. Supervised rank aggregation. In Proceedings of the 16th International Conference on World Wide Web, WWW '07, pages 481-490, New York, NY, USA, 2007. ACM.
