Learning in Social Networks E. Viennet Laboratoire de Traitement et Transport de l’Information L2TI Université Paris 13 6/5/2009 E. Viennet (L2TI) Learning in Social Networks 6/5/2009 1 / 47

Agenda
1. Introduction to Social Networks
2. Detection of communities in networks
3. Node classification
4. Kernel methods for graphs

Learning from data
From tables to structured data...
Models: classification, regression, clustering...

Data mining and social networks
Relations, interactions → structure
Examples:
- Web
- Semantic networks
- Electronic mail
- Instant messaging (IM)
- Forums
- Telecommunications (cellphones, ...)
- Biology

Social network data is everywhere
- Call networks
- Email networks
- Movie networks
- Coauthor networks
- Affiliation networks
- Friendship networks
- Organizational networks

Firms are increasingly collecting data on explicit social networks of consumers

Another example: Twitter Social Network (2007, Bruno Peeters, Belgium)

Applications & problems
- Social networks: community and structure (animation, targeted marketing)
- WWW: search, information retrieval (group web sites or documents)
- Targeted marketing: identify groups of customers or products to make recommendations (targeted advertising, viral marketing)
- Personalization (interfaces, services)
- Epidemiology
- Fraud detection
- Security (counterterrorism)
- ...

Marketing & recommendation: the long tail
Chris Anderson, The Long Tail, Wired, Issue 12.10, October 2004

Marketing, recommendation and SN
Need for personalized recommendations!
- More than 50% of people do research online before purchasing electronics
- Personalized recommendations based on prior purchase patterns and ratings:
  - Amazon, “people who bought x also bought y”
  - MovieLens, “based on ratings of users like you...”
  - Epinions, “based on the opinions of the raters you trust...”
We are more influenced by our friends than by strangers!
- 68% of consumers consult friends and family before purchasing home electronics (Burke 2003)

Some interesting problems for data miners...
- Characterize networks
- Model diffusion of information (e.g., for viral marketing)
- Model evolution (link creation)
- Extract information for learning (node classification)

Our objectives today...
1. Give some insight into Social Network Analysis
2. Present some recent advances in community detection
3. Define the node classification problem
4. Show how to define kernels for graph data

Typical size of datasets used in the field
Dataset                        Number of nodes
E-mails of a lab (2 months)    ≈ 1000
E-mails (2 years)              ≈ 50000
Friendship among bloggers      4.4 million
Cellular phone calls (CDR)     ≈ 20 million
IM communications              240 million
Sparse networks: the number of links is proportional to the number of nodes.

What’s different about networked data?
A social network is a graph, but:
- nodes can have attributes
- edges (links) may or may not be weighted and/or directed
- so the similarity between two nodes is f(attributes, links)
- the network’s graph is not a simple random graph (it has special structural properties)
Nodes are not i.i.d.!

Small world effect
The shortest path between two random nodes is small on average.
This property is related to the distribution of the degrees of the nodes: scale-free network (Barabasi, 2000)
$P(\text{degree} = k) \propto k^{-\gamma}$
[Figure: random graph vs. scale-free graph (Albert et al., 2000)]
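
As a rough illustration of the degree distribution mentioned on this slide, the sketch below computes the empirical P(degree = k) from an undirected edge list. The function name `degree_distribution` and the toy edge list are assumptions made for this example; they do not come from the talk, and isolated nodes (which never appear in an edge list) are ignored.

```python
from collections import Counter

def degree_distribution(edges):
    """Return {k: fraction of nodes having degree k} for an undirected edge list."""
    degree = Counter()
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    counts = Counter(degree.values())  # how many nodes have each degree
    n = len(degree)                    # nodes that appear in at least one edge
    return {k: c / n for k, c in sorted(counts.items())}

# Toy graph (hypothetical): node 0 acts as a small hub
edges = [(0, 1), (0, 2), (0, 3), (0, 4), (1, 2)]
dist = degree_distribution(edges)
for k, p in dist.items():
    print(f"P(degree = {k}) = {p:.2f}")
```

On a real scale-free network, plotting these fractions on log-log axes would be expected to give a roughly straight line of slope −γ.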

Common properties characterizing nodes or links
Clustering coefficient: related to the number of neighbors of a node that are linked together (triangles) (Watts and Strogatz, 1998)
Betweenness: number of shortest paths passing through a given edge (or node) (Newman 2004)
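
A minimal sketch of the local clustering coefficient, assuming the usual definition: the fraction of pairs of neighbors of a node that are themselves linked. The helper names `build_adjacency` and `clustering_coefficient` and the toy graph are hypothetical, chosen only for illustration.

```python
from itertools import combinations

def build_adjacency(edges):
    """Adjacency sets for an undirected graph given as an edge list."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj

def clustering_coefficient(adj, node):
    """Fraction of neighbor pairs of `node` that are linked together."""
    neighbors = adj[node]
    k = len(neighbors)
    if k < 2:
        return 0.0  # no pair of neighbors, so no possible triangle
    linked = sum(1 for a, b in combinations(neighbors, 2) if b in adj[a])
    return 2.0 * linked / (k * (k - 1))

# Toy graph (hypothetical): a triangle 0-1-2 plus a pendant node 3 attached to 0
edges = [(0, 1), (0, 2), (0, 3), (1, 2)]
adj = build_adjacency(edges)
cc0 = clustering_coefficient(adj, 0)
cc1 = clustering_coefficient(adj, 1)
cc3 = clustering_coefficient(adj, 3)
```

Here node 0 has three neighbors and only one linked pair among them, while both neighbors of node 1 are linked to each other.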

Part 2: Detection of communities in networks

Communities in networks (P. Pons, 2007)
Finding communities = partitioning the graph into N clusters
Identifying = finding the (small) community around a given node

Model-based clustering for social networks
Simultaneously model the distribution of node attributes and positions in “social space”: latent variable model
Representation of the social network:
- The matrix $Y = (y_{ij})$ describes the links between nodes.
- $Z = (z_i)$, with $z_i \in \mathbb{R}^d$, gives the positions of the nodes in the “social space” $\mathbb{R}^d$.

Model-based clustering (continued): the model
Handcock & Raftery, 2006
$n$ nodes, $Y = (y_{ij})$ adjacency matrix (“sociomatrix”).
Links are considered independent:
$$P(Y \mid Z, X, \beta) = \prod_{i \neq j} P(y_{ij} \mid z_i, z_j, x_{ij}, \beta)$$
where
- $X$: attributes of nodes (or of pairs $(i, j)$)
- $\beta$: parameters of the model
Modeling by logistic regression:
$$\operatorname{logit} P(y_{ij} = 1 \mid z_i, z_j, x_{ij}, \beta) = \beta_0^T x_{ij} - \beta_1 |z_i - z_j| \quad \text{with} \quad \frac{1}{n} \sum_i |z_i|^2 = 1$$
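
To make the logistic link model concrete, here is a small sketch that evaluates the probability of a link from the log-odds β₀ᵀx_ij − β₁|z_i − z_j|. The function name `link_probability` and all numeric values are assumptions made for illustration; the positions are not rescaled to satisfy the normalization constraint on the z_i.

```python
import math

def link_probability(z_i, z_j, x_ij, beta0, beta1):
    """P(y_ij = 1) when logit(p) = beta0 . x_ij - beta1 * |z_i - z_j|."""
    distance = math.sqrt(sum((a - b) ** 2 for a, b in zip(z_i, z_j)))
    log_odds = sum(b * x for b, x in zip(beta0, x_ij)) - beta1 * distance
    return 1.0 / (1.0 + math.exp(-log_odds))  # inverse of the logit function

# Hypothetical parameter and covariate values
beta0, beta1 = (2.0,), 0.4
x_ij = (1.0,)
p_far = link_probability((0.0, 0.0), (3.0, 4.0), x_ij, beta0, beta1)   # distance 5
p_near = link_probability((0.0, 0.0), (0.3, 0.4), x_ij, beta0, beta1)  # distance 0.5
```

With β₁ > 0, nodes that are close in social space are more likely to be linked, which is what lets clusters of positions be read as communities.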

Model-based clustering (continued)
Clustering by modeling the coordinates $z_i$ with a Gaussian mixture:
$$p(z_i) \propto \sum_{g=1}^{G} \lambda_g \exp\left(-\frac{|z_i - \mu_g|^2}{2\sigma_g^2}\right) \quad \text{with} \quad \lambda_g > 0 \ \text{and} \ \sum_{g=1}^{G} \lambda_g = 1$$
$G$: number of clusters, fixed a priori
Estimation of parameters: maximum likelihood or Bayesian (Markov chain Monte Carlo)
⇒ estimation is computationally costly

Model-based clustering (continued): application
The choice of the number of clusters G can be posed as a model selection problem (e.g. BIC criterion) ⇒ slow!
Links between monks
Sociological study: “friendship” between monks
18 nodes (monks) ⇒ 3 groups of monks (matching those identified by sociologists)

Model-based clustering (continued): application 2
Links between teenagers in a school
Relations between 71 adolescents (here 6 clusters)

Model-based clustering: conclusions
- Complex methods (heavy computations) giving precise results
- Take into account both links and attributes at the same time
- Restricted to small problems!
⇒ we will now focus on “structural” methods (using only links)

Criterion: Modularity
Measures the quality of a clustering of the graph into $c$ communities:
$$Q = \sum_i \left( d_{ii} - \Big( \sum_j d_{ij} \Big)^2 \right)$$
$D$ is a $c \times c$ matrix whose elements $d_{ij}$ give the proportion of edges linking nodes of community $i$ to nodes of community $j$.
$Q \in [-1, 1]$ measures the density of links inside communities compared to links between communities.
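
The modularity formula can be sketched directly from an edge list and a node-to-community assignment. The code below assumes the convention of Newman & Girvan (2004), in which each edge between two different communities contributes half of its weight to d_ij and half to d_ji, so that the entries of D sum to 1; the slide does not spell this out. The function name `modularity` and the toy graph are hypothetical.

```python
def modularity(edges, community):
    """Q = sum_i (d_ii - (sum_j d_ij)^2) for an undirected, unweighted graph."""
    labels = sorted(set(community.values()))
    index = {label: i for i, label in enumerate(labels)}
    c = len(labels)
    d = [[0.0] * c for _ in range(c)]
    m = len(edges)
    for u, v in edges:
        a, b = index[community[u]], index[community[v]]
        # Assumed convention: split each edge evenly between d[a][b] and d[b][a];
        # an intra-community edge therefore adds 1/m to d[a][a].
        d[a][b] += 0.5 / m
        d[b][a] += 0.5 / m
    return sum(d[i][i] - sum(d[i]) ** 2 for i in range(c))

# Toy graph (hypothetical): two triangles joined by the single edge (2, 3)
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
two_groups = {0: "A", 1: "A", 2: "A", 3: "B", 4: "B", 5: "B"}
one_group = {node: "A" for node in range(6)}
q_two = modularity(edges, two_groups)
q_one = modularity(edges, one_group)
```

For this toy graph the two-triangle partition gives Q = 6/7 − 1/2 = 5/14, while putting every node in a single community gives Q = 0.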

Finding structural communities
Lots of recent work and progress...
Methods based on betweenness
First attempt: Newman & Girvan (2004)
Repeat:
1. compute the betweenness of edges
2. cut the most important edge
until no edges remain
Complexity for a sparse graph of n nodes:
$O(n^3)$          Newman & Girvan 2004
$O(n^2)$          Newman 2004
$O(n \log^2 n)$   Wakita & Tsurumi 2007
linear?           Blondel et al. (Louvain) 2008
⇒ less than 5 minutes for 1 million nodes, or 40 minutes for 23 million
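
The betweenness-based loop can be sketched in plain Python as follows. This is an illustrative sketch, not the authors' implementation: edge betweenness is computed with a Brandes-style breadth-first accumulation, the graph is assumed to be undirected, unweighted and free of self-loops, and the loop stops at the first split instead of continuing until no edges remain. All function names are hypothetical.

```python
from collections import deque

def edge_betweenness(adj):
    """Number of shortest paths through each edge (Brandes-style accumulation)."""
    score = {frozenset((u, v)): 0.0 for u in adj for v in adj[u]}
    for s in adj:
        # Breadth-first search from s, counting shortest paths (sigma)
        dist, sigma, preds = {s: 0}, {s: 1.0}, {s: []}
        order, queue = [], deque([s])
        while queue:
            u = queue.popleft()
            order.append(u)
            for v in adj[u]:
                if v not in dist:
                    dist[v], sigma[v], preds[v] = dist[u] + 1, 0.0, []
                    queue.append(v)
                if dist[v] == dist[u] + 1:
                    sigma[v] += sigma[u]
                    preds[v].append(u)
        # Push path counts back from the farthest nodes towards s
        delta = {u: 0.0 for u in order}
        for w in reversed(order):
            for u in preds[w]:
                share = sigma[u] / sigma[w] * (1.0 + delta[w])
                score[frozenset((u, w))] += share
                delta[u] += share
    # Each unordered pair of endpoints was visited from both ends
    return {edge: value / 2.0 for edge, value in score.items()}

def components(adj):
    """Connected components as a list of node sets."""
    seen, parts = set(), []
    for s in adj:
        if s in seen:
            continue
        part, stack = set(), [s]
        while stack:
            u = stack.pop()
            if u not in part:
                part.add(u)
                stack.extend(adj[u] - part)
        seen |= part
        parts.append(part)
    return parts

def first_split(edges):
    """Cut the highest-betweenness edge repeatedly until a new component appears."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    start = len(components(adj))
    while any(adj.values()):
        scores = edge_betweenness(adj)
        u, v = max(scores, key=scores.get)
        adj[u].discard(v)
        adj[v].discard(u)
        parts = components(adj)
        if len(parts) > start:
            return parts
    return components(adj)

# Toy graph (hypothetical): two triangles joined by the bridge (2, 3)
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
adj = {}
for u, v in edges:
    adj.setdefault(u, set()).add(v)
    adj.setdefault(v, set()).add(u)
bridge_score = edge_betweenness(adj)[frozenset((2, 3))]
parts = first_split(edges)
```

Recomputing all betweenness values after every cut is what makes this approach expensive on large graphs, which is the motivation for the faster methods listed on the slide.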

Finding communities: Louvain method
Local optimization by switching labels, considering only the neighborhood of each node.
Blondel et al., Fast unfolding of communities in large networks, 2008
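
The label-switching idea can be sketched as below. This covers only the local-move phase; the full Louvain method also aggregates communities into super-nodes and repeats, which is omitted here. The gain expression is the standard modularity gain for inserting a node into a community, scaled by a constant factor; the function name `local_moves` and the toy graph are assumptions made for this example, and the graph is assumed undirected, unweighted and without self-loops.

```python
def local_moves(edges):
    """Move each node to the neighboring community with the best modularity gain."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    m = len(edges)
    degree = {u: len(nbrs) for u, nbrs in adj.items()}
    community = {u: u for u in adj}  # every node starts in its own community
    total = dict(degree)             # summed degree of each community
    moved = True
    while moved:
        moved = False
        for u in adj:
            old = community[u]
            total[old] -= degree[u]  # take u out of its community
            links = {}               # number of links from u to each neighboring community
            for v in adj[u]:
                links[community[v]] = links.get(community[v], 0) + 1

            def gain(c):
                # Modularity gain of putting u into c, up to a constant factor
                return links.get(c, 0) - total[c] * degree[u] / (2.0 * m)

            best = old
            for c in links:
                if gain(c) > gain(best):
                    best = c
            community[u] = best
            total[best] += degree[u]
            if best != old:
                moved = True
    return community

# Toy graph (hypothetical): two triangles joined by the edge (2, 3)
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
community = local_moves(edges)
```

Each move looks only at the communities of a node's neighbors, so a full pass costs time proportional to the number of edges.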

Hierarchical communities and modularity
[Figure] From Newman & Girvan, 2004

Example (scientists’ collaboration network)
[Figure] From K. Martin and M. Avnet, 2006.

Identification of communities
Look for a neighborhood (micro-community) around a given node
