 
Link prediction
The link prediction space is vast and imbalanced: real approaches focus only on the 2-hop social neighborhood, i.e., friends-of-friends. A new source of promising candidates for link prediction is the set of places visited by each user. Place-friends are users who check in at the same places but are not socially connected.
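The two candidate sources above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the friendship map is taken to be symmetric, check-ins are plain (user, place) tuples, and the function names and toy data are hypothetical, not taken from the Gowalla dataset or the authors' code.

```python
from collections import defaultdict
from itertools import combinations

def friends_of_friends(friends):
    """Unordered pairs that share a friend but are not friends themselves.

    `friends` maps each user to the set of their friends (assumed symmetric).
    """
    pairs = set()
    for nbrs in friends.values():
        for a, b in combinations(sorted(nbrs), 2):
            if b not in friends.get(a, set()):
                pairs.add((a, b))
    return pairs

def place_friends(checkins, friends):
    """Unordered pairs of non-friends who checked in at a common place."""
    visitors = defaultdict(set)
    for user, place in checkins:
        visitors[place].add(user)
    pairs = set()
    for users in visitors.values():
        for a, b in combinations(sorted(users), 2):
            if b not in friends.get(a, set()):
                pairs.add((a, b))
    return pairs

# Hypothetical toy data, for illustration only.
friends = {
    "ann": {"bob"},
    "bob": {"ann", "cat"},
    "cat": {"bob"},
    "dan": set(),
}
checkins = [
    ("ann", "cafe"), ("cat", "cafe"), ("dan", "cafe"),
    ("ann", "gym"), ("bob", "gym"),
]

fof = friends_of_friends(friends)      # candidates from the social graph
pf = place_friends(checkins, friends)  # candidates from shared places
```

Pairs are stored as sorted tuples so that the same pair found through different places or different common friends is counted once.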
The importance of Place-Friends
• We have analyzed four monthly snapshots of Gowalla data containing user profiles, friend lists and check-ins.
• We found that about 30% of new links are added among “place-friends”, or users who check in at the same places.
[Pie chart: share of new links by category (friends-of-friends only; friends-of-friends and place-friends; place-friends only; other). Segment labels: 14%, 16%, 32%, 38%.]
Reducing the link prediction space
Number of candidate pairs in the link prediction spaces: by focusing prediction efforts only on place-friends (PF) and friends-of-friends (FoF), the prediction space can be reduced by about 15 times, while still covering two-thirds of all new links.
[Bar chart: number of candidate pairs for FoF, PF, FoF and PF, and Complete; axis ticks from 10M to 10B.]
The focus theory of social ties
“A focus is a social, psychological, physical or legal entity around which joint activities are organized. [...] individuals whose activities are organized around the same focus will tend to become interpersonally tied. [...] The structure of the social ties is dependent upon the constraint and the size of the underlying foci.”
Scott Feld, “The Focused Organization of Social Ties”, American Journal of Sociology, Vol. 86, No. 5 (1981).
Physical places represent social foci and they correlate with the creation of social ties.
Place properties, focus theory and social ties
Place entropy: $E_k = -\sum_i q_{ik} \log q_{ik}$, where $q_{ik}$ is the fraction of check-ins at place $k$ made by user $i$.
[Plot: link probability (log scale, 10^-4 to 10^0) versus place entropy in bits (0 to 9), one curve per snapshot (Snapshots 1 to 4).]
A place with only a small number of regular users is likely to be of significant importance to them, such as a private house, gym or office. A place with sporadic check-ins made by many users is likely to be a public place without great significance to its visitors, such as a tourist site, airport or train station.
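As a rough illustration of the place entropy defined above, the sketch below computes it from the list of users behind each check-in at one place. It assumes a base-2 logarithm, since the plot axis is labelled in bits; the function name and the toy inputs are hypothetical.

```python
import math
from collections import Counter

def place_entropy(user_ids):
    """Entropy (in bits) of the distribution of check-ins over users at one place.

    `user_ids` holds one entry per check-in, naming the user who made it.
    """
    counts = Counter(user_ids)
    total = sum(counts.values())
    # q is the fraction of this place's check-ins made by one user.
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

# Hypothetical places: a home dominated by one user, and an airport
# where eight different users each checked in once.
home = ["ann"] * 9 + ["bob"]
airport = ["u1", "u2", "u3", "u4", "u5", "u6", "u7", "u8"]

home_entropy = place_entropy(home)
airport_entropy = place_entropy(airport)
```

The home-like place gets a low entropy because one user accounts for most check-ins, while the airport-like place, with check-ins spread evenly over eight users, gets the maximum entropy for that number of visitors.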
Prediction features
• Place features (place-friends only)
  ✦ (weighted) shared places
  ✦ (weighted) shared check-ins
  ✦ entropy of shared places
  ✦ popularity of shared places
• Social features (friends-of-friends only)
  ✦ shared friends
  ✦ Jaccard coefficient
  ✦ Adamic/Adar measure
• Global features (all pairs)
  ✦ geographic distance
  ✦ preferential attachment
  ✦ user activity
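The social features listed above, plus preferential attachment, have common textbook definitions that can be sketched as follows. The slides do not spell out the exact formulas used, so these definitions are an assumption: Jaccard as intersection over union of friend sets, Adamic/Adar as the sum of 1/log(degree) over shared friends, and preferential attachment as the product of the two degrees. The toy graph is hypothetical.

```python
import math

def shared_friends(friends, u, v):
    """Set of users who are friends with both u and v."""
    return friends[u] & friends[v]

def jaccard(friends, u, v):
    """Shared friends divided by the size of the union of both friend sets."""
    union = friends[u] | friends[v]
    return len(friends[u] & friends[v]) / len(union) if union else 0.0

def adamic_adar(friends, u, v):
    """Sum of 1/log(degree) over shared friends.

    A shared friend z is linked to both u and v (u != v), so its degree
    is at least 2 and the logarithm is never zero.
    """
    return sum(1.0 / math.log(len(friends[z])) for z in friends[u] & friends[v])

def preferential_attachment(friends, u, v):
    """Product of the two users' degrees."""
    return len(friends[u]) * len(friends[v])

# Hypothetical symmetric friendship graph: a 4-cycle ann-bob-cat-eve.
friends = {
    "ann": {"bob", "eve"},
    "bob": {"ann", "cat"},
    "cat": {"bob", "eve"},
    "eve": {"ann", "cat"},
}

common = shared_friends(friends, "ann", "cat")
jac = jaccard(friends, "ann", "cat")
aa = adamic_adar(friends, "ann", "cat")
pa = preferential_attachment(friends, "ann", "cat")
```

Adamic/Adar down-weights shared friends who have many friends of their own, which mirrors the way place entropy down-weights places visited by many different users.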
System design
We adopt a supervised learning approach to link prediction over three disjoint prediction sets:
• Social: links appearing only among friends-of-friends;
• Place: links appearing only among place-friends;
• Place-social: links appearing among pairs that are both friends-of-friends and place-friends.
We adopt place features, social features and global features across the three prediction spaces; then we train our models on one set of data and test them on future data.
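The split into three disjoint prediction sets can be sketched with plain set operations. This is a minimal illustration, assuming candidate pairs are stored as sorted tuples so that the same pair compares equal in both inputs; the function name and the toy pairs are hypothetical.

```python
def split_prediction_sets(fof_pairs, pf_pairs):
    """Partition candidate pairs into the three disjoint prediction sets."""
    fof, pf = set(fof_pairs), set(pf_pairs)
    return {
        "social": fof - pf,        # friends-of-friends only
        "place": pf - fof,         # place-friends only
        "place-social": fof & pf,  # both friends-of-friends and place-friends
    }

# Hypothetical candidate pairs.
fof_pairs = {("ann", "cat"), ("bob", "dan")}
pf_pairs = {("ann", "cat"), ("cat", "eve")}

sets = split_prediction_sets(fof_pairs, pf_pairs)
```

Because the three sets do not overlap, a separate model can be trained on each one with only the features that are defined for its pairs.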
Prediction performance: classifiers
Area under the ROC curve (AUC) for different classifiers on the three different prediction sets. Results obtained with 10-fold cross validation.
[Bar chart: AUC (axis from 0.5 to 1) on the Social, Place and Place-social sets for Model trees, Random forest, J48 and Naive Bayes.]
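For readers unfamiliar with the metric, AUC can be read as the probability that a randomly chosen new link receives a higher score than a randomly chosen non-link. The sketch below computes it directly from that definition, with ties counted as one half; it is a didactic illustration with hypothetical scores and labels, not the evaluation code behind the reported results.

```python
def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison of positives and negatives.

    `labels` uses 1 for pairs that became links and 0 for pairs that did not.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative example")
    # Count positive-over-negative wins; a tie counts as half a win.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical classifier scores for four candidate pairs.
example_auc = auc([0.9, 0.8, 0.3, 0.1], [1, 0, 1, 0])
```

This pairwise form is quadratic in the number of examples, so it suits small illustrations; since link prediction sets are heavily imbalanced, AUC is a natural choice because it does not depend on the ratio of links to non-links.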
Prediction performance: temporal snapshots
Prediction performance in terms of AUC of model trees on the three separate prediction sets in each temporal snapshot: results obtained by training on one month and testing on the next one.
[Line chart: AUC (axis from 0.8 to 1) for Month 1, Month 2 and Month 3 on the Social, Place and Place-social sets.]