Vers une Analyse Conceptuelle des Réseaux Sociaux

SLIDE 1

Social Mining Concept of Frequent Link Experimental results Conclusion and Perspectives

Vers une Analyse Conceptuelle des Réseaux Sociaux (Towards a Conceptual Analysis of Social Networks)

Erick Stattner Martine Collard

Laboratory of Mathematics and Computer Science (LAMIA) University of the French West Indies and Guiana, France

MARAMI 2012

Erick Stattner, Martine Collard MARAMI 2012 1 / 27

SLIDE 2

Motivation

Social Mining
◮ The “New Science of Networks” focuses on interactions between entities and investigates new methods and techniques
◮ Knowledge extraction from data on real-world phenomena studied through interactions among individuals

Issues
◮ New data mining techniques:
  ◮ Link mining (node classification, link-based clustering, link prediction, frequent patterns, ...)
  ◮ Attributed graph mining (cohesive subgraphs, summarization, ...)

SLIDE 3

Data Mining Task

Context: search for frequent patterns to answer questions such as:

◮ Which groups of nodes are the most connected?
◮ Which node properties are most frequently found in connection?

Contribution: Search for Frequent Links in Social Networks

◮ between groups of nodes sharing internal common properties
◮ by combining network structure and node attribute values

[Figure: example network with nodes labeled b and r, illustrating the frequent link (b, r)]

SLIDE 4

Outline

1. Social Mining
   ◮ Frequent pattern discovery
   ◮ Node clustering
2. Concept of Frequent Link
3. Experimental results
4. Conclusion and Perspectives

SLIDE 5

Pattern Mining in Social Networks

Current Methods

Main methods:
◮ Link prediction
◮ Frequent pattern discovery
◮ Node clustering
◮ Formal concept analysis

SLIDE 6

Pattern Mining in Social Networks

Frequent pattern discovery

Frequent pattern discovery: a pattern is a subgraph, and the task is to search for subgraphs that occur frequently

◮ within a single large network
◮ across a set of networks

[Figure: network of 11 numbered nodes labeled X, Y or Z, shown with example labeled subgraphs]

SLIDE 7

Pattern Mining in Social Networks

Node clustering

Node clustering: uses links to detect subgraphs, or "communities"

Objective: identify groups of nodes that are densely connected within the network, by maximizing intra-cluster links while minimizing inter-cluster links

SLIDE 8

Pattern Mining in Social Networks

Hybrid Node clustering

Hybrid node clustering: based on links and on node attribute values

Objective: identify groups of nodes that share common contacts

SLIDE 9

Formal concept analysis

Formal concept of links: based on links and on nodes

Objective: identify groups of nodes that share common contacts

SLIDE 10

Pattern Mining in Social Networks

Observation

Current methods
◮ mainly use network structure
◮ often ignore node properties

The concept of frequent link
◮ combines information from both links and node attribute values
◮ represents a regularity involving two groups of nodes that share internal common characteristics

SLIDE 11

Outline

1. Social Mining
2. Concept of Frequent Link
   ◮ Definition
   ◮ Knowledge extracted
   ◮ Analogy with lattices of itemsets
3. Experimental results
4. Conclusion and Perspectives

SLIDE 12

Conceptual link

Definition

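$G = (V, E)$: a (directed) network
$V$ is defined as a relation $R(A_1, \dots, A_p)$, where $A_1, \dots, A_p$ are the node attributes
Each node $v \in V$ is defined by the itemset $A_1 = a_1$ and ... and $A_p = a_p$, written $a_1 \dots a_p$
For an itemset $m$, $V_m$ is the set of nodes satisfying $m$
If $sm$ is a sub-itemset of $m$, then $V_m \subseteq V_{sm}$; e.g. $V_{abc} \subseteq V_{ab}$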
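The node sets defined above can be illustrated with a minimal Python sketch. The toy attribute table and the helper name `nodes_satisfying` are assumptions made for this example, not part of the original slides; an itemset is represented as a dict of attribute/value pairs.

```python
# Hypothetical toy data: each node is described by its attribute values.
nodes = {
    1: {"gender": "M", "job": "yes"},
    2: {"gender": "F", "job": "yes"},
    3: {"gender": "M", "job": "no"},
}

def nodes_satisfying(nodes, itemset):
    """Return V_m: the nodes whose attributes match every item of the itemset m."""
    return {
        v for v, attrs in nodes.items()
        if all(attrs.get(a) == value for a, value in itemset.items())
    }

# sm = {gender=M} is a sub-itemset of m = {gender=M, job=yes}.
v_sm = nodes_satisfying(nodes, {"gender": "M"})
v_m = nodes_satisfying(nodes, {"gender": "M", "job": "yes"})

# Adding items to an itemset can only shrink the node set: V_m is a subset of V_sm.
assert v_m <= v_sm
```

Here `v_sm` contains nodes 1 and 3, while `v_m` contains only node 1, which matches the inclusion stated in the definition.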

SLIDE 13

Conceptual link

Definition

$G = (V, E)$: network
$I_V$: set of all possible itemsets on $G$
Left-hand side link set: $LE_m = \{ e \in E \,;\, e = (a, b),\ a \in V_m \}$
Right-hand side link set: $RE_m = \{ e \in E \,;\, e = (a, b),\ b \in V_m \}$

Conceptual link:

$(m_1, m_2) = LE_{m_1} \cap RE_{m_2}$ (1)

$(m_1, m_2) = \{ e \in E \,;\, e = (a, b),\ a \in V_{m_1} \text{ and } b \in V_{m_2} \}$ (2)
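As a small illustration of equations (1) and (2), the following Python sketch builds the left-hand and right-hand link sets and intersects them. The toy network (nodes colored b or r, as in the frequent link (b, r) example) and all function names are assumptions made for this example.

```python
# Hypothetical toy network: two "b" nodes, two "r" nodes, four directed links.
nodes = {1: {"color": "b"}, 2: {"color": "b"}, 3: {"color": "r"}, 4: {"color": "r"}}
edges = {(1, 3), (2, 3), (2, 4), (3, 4)}

def satisfies(attrs, itemset):
    """True if a node's attributes match every item of the itemset."""
    return all(attrs.get(a) == value for a, value in itemset.items())

def left_links(nodes, edges, m):
    """LE_m: links whose source node satisfies m."""
    return {(a, b) for (a, b) in edges if satisfies(nodes[a], m)}

def right_links(nodes, edges, m):
    """RE_m: links whose target node satisfies m."""
    return {(a, b) for (a, b) in edges if satisfies(nodes[b], m)}

def conceptual_link(nodes, edges, m1, m2):
    """(m1, m2) = LE_m1 intersected with RE_m2, as in equation (1)."""
    return left_links(nodes, edges, m1) & right_links(nodes, edges, m2)

link_br = conceptual_link(nodes, edges, {"color": "b"}, {"color": "r"})
```

On this toy network, `link_br` holds the three links that go from a b node to an r node; the link (3, 4) is excluded because its source is an r node.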

SLIDE 14

Frequent conceptual link

Definition

Support of $l = (m_1, m_2)$:

$supp[(m_1, m_2)] = \frac{|(m_1, m_2)|}{|E|}$

$\beta$: link support threshold

$(m_1, m_2)$ is a frequent conceptual link iff $supp[(m_1, m_2)] > \beta$
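The support and the frequency test can be sketched in a few lines of Python. The toy network, the threshold value and the function names below are assumptions chosen for illustration only.

```python
# Hypothetical toy network: two "b" nodes, two "r" nodes, four directed links.
nodes = {1: {"color": "b"}, 2: {"color": "b"}, 3: {"color": "r"}, 4: {"color": "r"}}
edges = [(1, 3), (2, 3), (2, 4), (3, 4)]

def support(nodes, edges, m1, m2):
    """Fraction of links whose source satisfies m1 and whose target satisfies m2."""
    def matches(v, m):
        return all(nodes[v].get(a) == value for a, value in m.items())
    linked = [(a, b) for (a, b) in edges if matches(a, m1) and matches(b, m2)]
    return len(linked) / len(edges)

def is_frequent(nodes, edges, m1, m2, beta):
    """A conceptual link is frequent when its support is strictly above beta."""
    return support(nodes, edges, m1, m2) > beta

b, r = {"color": "b"}, {"color": "r"}
supp_br = support(nodes, edges, b, r)   # 3 of the 4 links go from b to r
supp_rr = support(nodes, edges, r, r)   # only the link (3, 4) goes from r to r
```

With a hypothetical threshold of 0.5, the link (b, r) would be frequent and the link (r, r) would not.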

SLIDE 15

Frequent Links

Knowledge provided

Frequent links provide knowledge on the most connected groups of nodes in the social network, i.e. on the properties that are most often connected.

Example: bipartite customer-product network:

◮ $m_1$: Gender=‘M’ and Interest=‘computer science’
◮ $m_2$: Category=‘Science Fiction’ and Product=‘book’
◮ $supp[(m_1, m_2)] = 14\%$

SLIDE 16

Frequent conceptual link

Downward-closure property

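Sub and super conceptual links

$(sm_1, sm_2)$ is a sub conceptual link of $(m_1, m_2)$, written $(sm_1, sm_2) \subseteq (m_1, m_2)$, where $sm_1$ is a sub-itemset of $m_1$ and $sm_2$ is a sub-itemset of $m_2$. Conversely, $(m_1, m_2)$ is a super conceptual link of $(sm_1, sm_2)$.

Downward-closure property
◮ if $l$ is frequent, then all its sub-links $sl$ are also frequent
◮ if $l$ is infrequent, then all its super-links are also infrequent

This follows from $V_m \subseteq V_{sm}$: every link counted in $l$ is also counted in each of its sub-links, so $supp(sl) \geq supp(l)$.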
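The downward-closure property can be checked empirically on a small example. In the sketch below, the toy network and the helper names (`support`, `sub_itemsets`, `sub_links`) are assumptions for illustration; itemsets are represented as frozensets of (attribute, value) pairs.

```python
from itertools import combinations

# Hypothetical toy network with two attributes per node.
nodes = {
    1: {"gender": "M", "job": 1},
    2: {"gender": "M", "job": 2},
    3: {"gender": "F", "job": 1},
    4: {"gender": "F", "job": 2},
}
edges = [(1, 3), (1, 4), (2, 3), (3, 4)]

def support(m1, m2):
    """Fraction of links going from a node satisfying m1 to a node satisfying m2."""
    def matches(v, m):
        return all(nodes[v].get(a) == x for a, x in m)
    return sum(1 for a, b in edges if matches(a, m1) and matches(b, m2)) / len(edges)

def sub_itemsets(m):
    """All sub-itemsets of m, including m itself and the empty itemset."""
    items = sorted(m)
    return [frozenset(c) for k in range(len(items) + 1) for c in combinations(items, k)]

def sub_links(m1, m2):
    """All sub conceptual links (sm1, sm2) of (m1, m2)."""
    return [(s1, s2) for s1 in sub_itemsets(m1) for s2 in sub_itemsets(m2)]

m1 = frozenset({("gender", "M"), ("job", 1)})
m2 = frozenset({("gender", "F")})
link_support = support(m1, m2)
sub_supports = [support(s1, s2) for s1, s2 in sub_links(m1, m2)]

# No sub-link has a smaller support than the link itself.
assert all(s >= link_support for s in sub_supports)
```

This is why a mining procedure can safely discard every super-link of a link that is already below the threshold.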

SLIDE 17

Maximal frequent conceptual link

Definition

A frequent conceptual link $l = (m_1, m_2)$ is a maximal frequent conceptual link iff there is no frequent conceptual link $l'$ such that $l \subset l'$.
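A minimal sketch of the maximality filter follows. It assumes that the frequent links have already been computed and are given as pairs of frozensets of items; the example set of frequent links and the function names are hypothetical.

```python
def is_sub_link(l, other):
    """True if l is a sub-link of other: both itemsets of l are included in those of other."""
    return l[0] <= other[0] and l[1] <= other[1]

def maximal_links(frequent):
    """Keep the frequent links that are not a strict sub-link of another frequent link."""
    return {
        l for l in frequent
        if not any(l != other and is_sub_link(l, other) for other in frequent)
    }

# Hypothetical frequent links over the itemsets a, b and ab (E is the empty itemset).
E = frozenset()
a, b, ab = frozenset("a"), frozenset("b"), frozenset("ab")
frequent = {
    (E, E), (a, E), (b, E), (ab, E), (E, b), (a, b), (b, b), (ab, b),
    (E, a), (a, a),
}
maximal = maximal_links(frequent)
```

In this example only (ab, b) and (a, a) remain: every other frequent link is a sub-link of one of these two.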

SLIDE 18

Conceptual view

Lattice

Extraction of maximal frequent conceptual links on G
Concept lattice and search space reduction

[Figure: two lattices, (a) and (b), of conceptual links over the itemsets a, b and ab, ranging from (Φ, Φ) to (ab, ab), illustrating the concept lattice and search space reduction]

SLIDE 19

Conceptual view

Definition

$\beta$: link support threshold

$FLV_{max}$: set of all maximal frequent conceptual links on $G$
$FLV_{max}$ is the conceptual view of the social network $G$

[Figure: social network, its frequent conceptual links, and the resulting conceptual view for a support threshold β; labels 31%, 22%, 13%]
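To make the notion of conceptual view concrete, here is a naive end-to-end Python sketch. It enumerates every candidate itemset by brute force, which is an assumption of this example: the slides refer to a concept lattice and search space reduction, which this sketch does not implement. The toy network and all function names are hypothetical.

```python
from itertools import combinations, product

def all_itemsets(nodes):
    """Every itemset built from observed values, one value per attribute at most
    (the empty itemset is included)."""
    attrs = sorted({a for d in nodes.values() for a in d})
    values = {a: sorted({d[a] for d in nodes.values() if a in d}) for a in attrs}
    itemsets = []
    for k in range(len(attrs) + 1):
        for chosen in combinations(attrs, k):
            for vals in product(*(values[a] for a in chosen)):
                itemsets.append(frozenset(zip(chosen, vals)))
    return itemsets

def support(nodes, edges, m1, m2):
    """Fraction of links going from a node satisfying m1 to a node satisfying m2."""
    def matches(v, m):
        return all(nodes[v].get(a) == x for a, x in m)
    return sum(1 for a, b in edges if matches(a, m1) and matches(b, m2)) / len(edges)

def conceptual_view(nodes, edges, beta):
    """Maximal frequent conceptual links, mapped to their support."""
    itemsets = all_itemsets(nodes)
    frequent = {}
    for m1 in itemsets:
        for m2 in itemsets:
            s = support(nodes, edges, m1, m2)
            if s > beta:
                frequent[(m1, m2)] = s
    return {
        l: s for l, s in frequent.items()
        if not any(l != o and l[0] <= o[0] and l[1] <= o[1] for o in frequent)
    }

# Hypothetical toy network: two "b" nodes, two "r" nodes, four directed links.
nodes = {1: {"color": "b"}, 2: {"color": "b"}, 3: {"color": "r"}, 4: {"color": "r"}}
edges = [(1, 3), (2, 3), (2, 4), (3, 4)]
B, R = frozenset({("color", "b")}), frozenset({("color", "r")})

view_high = conceptual_view(nodes, edges, 0.5)
view_low = conceptual_view(nodes, edges, 0.2)
```

With the threshold 0.5 the view contains the single link (b, r); lowering it to 0.2 also admits (r, r). The view therefore grows as β decreases, at the cost of a less compact summary.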

SLIDE 20

Outline

1. Social Mining
2. Concept of Frequent Link
3. Experimental results
   ◮ Testbed
   ◮ Extracted patterns
4. Conclusion and Perspectives

SLIDE 21

Experimental results

Testbed

Testbed: sub-network of the proximity contact network of the City of Portland, simulated with Episim [Eubank, 2005]

Each node is described by:

◮ age class, i.e. $\lfloor age / 10 \rfloor$
◮ gender (1-male, 2-female)
◮ worker status
◮ type of relationship with householder
◮ contact class, i.e. $\lfloor degree / 2 \rfloor$
◮ sociability
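The two derived attributes in the list above are simple floor divisions. A minimal Python sketch, assuming non-negative integer ages and degrees (the function names are illustrative only):

```python
def age_class(age):
    """Age class as floor(age / 10): ages 40 to 49 all fall in class 4."""
    return age // 10

def contact_class(degree):
    """Contact class as floor(degree / 2): degrees 14 and 15 both fall in class 7."""
    return degree // 2

example_age_class = age_class(47)
example_contact_class = contact_class(15)
```

Discretizing age and degree this way keeps the number of distinct attribute values small, which in turn limits the number of candidate itemsets.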

General
  Origin      Portland
  Type        Undirected
  #nodes      3000
  #links      4683
  Density     0.00110413
  #comp       1
  cc avg      0.63627
Degree
  avg         3.087
  max         15

[Figure: degree distribution, with degree from 1 to 15 on the x-axis and frequency from 0.1 to 0.3 on the y-axis]

SLIDE 22

Experimental results

Extracted patterns

Some examples of extracted patterns:

β = 0.1

Maximal cfl                        Support
((4;∗;1;∗,∗,∗),(∗;∗;2;∗,∗,∗))      0.107
((2;∗;∗;2,∗,∗),(∗;∗;2;2,∗,∗))      0.105
((∗;1;1;∗,∗,∗),(∗;∗;1;∗,∗,∗))      0.113

“10.7% of the links of the network connect people aged 40 to 49 who have a job to people who do not have a job”

β = 0.2

Maximal cfl                        Support
((∗;2;∗;∗,∗,∗),(∗;∗;1;∗,∗,∗))      0.231
((∗;1;∗;∗,∗,∗),(∗;∗;2;∗,∗,∗))      0.288
((∗;2;∗;∗,∗,∗),(∗;1;∗;∗,∗,∗))      0.297

“23.1% of the links of the network connect men to people who have a job”
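The pattern notation can be read mechanically once an attribute order is fixed. The sketch below assumes that the six positions follow the order of the testbed attribute list (age class, gender, worker status, relationship with householder, contact class, sociability); this order is inferred from the first quoted example and is not stated explicitly in the slides. `None` stands for the wildcard ∗, and the function names are hypothetical.

```python
# Assumed attribute order, taken from the testbed description.
ATTRIBUTES = [
    "age class", "gender", "worker status",
    "relationship with householder", "contact class", "sociability",
]

def describe(itemset):
    """Turn a tuple such as (4, None, 1, None, None, None) into readable constraints."""
    parts = [
        f"{name}={value}"
        for name, value in zip(ATTRIBUTES, itemset)
        if value is not None
    ]
    return " and ".join(parts) if parts else "any node"

def describe_link(link, supp):
    """Render a conceptual link and its support as one sentence."""
    m1, m2 = link
    return f"{supp:.1%} of links connect [{describe(m1)}] to [{describe(m2)}]"

first_pattern = ((4, None, 1, None, None, None), (None, None, 2, None, None, None))
sentence = describe_link(first_pattern, 0.107)
```

For the first pattern of the β = 0.1 table, `sentence` reads "10.7% of links connect [age class=4 and worker status=1] to [worker status=2]", which lines up with the quoted interpretation if worker status 1 means having a job and 2 means not having one.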

SLIDE 23

Experimental results

Conceptual view

Summarization

SLIDE 24

Experimental results

Results

Network measures versus support threshold: Number of nodes and links (c), Density and clustering coeff. (d) and Degree distribution (e).

[Figure (c): number of nodes and number of links versus support threshold, for thresholds from 0.11 to 0.2]

[Figure (d): clustering coefficient and density versus support threshold, for thresholds from 0.11 to 0.2]

[Figure (e): degree distribution P(k) for k from 1 to 12, at support thresholds 0.1, 0.15 and 0.2]

SLIDE 25

Outline

1. Social Mining
2. Concept of Frequent Link
3. Experimental results
4. Conclusion and Perspectives

SLIDE 26

Conclusion and Perspectives

Conclusion:
◮ New approach for extracting frequent patterns in social data
◮ Combines information from both attribute values and links
◮ Two benefits:
  ◮ Extracts novel patterns: the most connected groups of nodes
  ◮ Provides a kind of summarized representation of the network

Perspectives:
◮ Optimization
◮ Scalability

SLIDE 27

Conclusion and Perspectives

Thank you for your attention!
