Chair of Computer Science 5 RWTH Aachen University Learning Layers - - PowerPoint PPT Presentation




Lehrstuhl Informatik 5 (Information Systems)

  • Prof. Dr. M. Jarke

Mohsen Shahriari, Ying Li, Ralf Klamma Learning Layers Analysis of Overlapping Communities in Signed Complex Networks

Slide 1

Analysis of Overlapping Communities in Signed Complex Networks

Mohsen Shahriari, Ying Li, Ralf Klamma
Advanced Community Information Systems (ACIS)
RWTH Aachen University, Germany
shahriari@dbis.rwth-aachen.de

Chair of Computer Science 5 RWTH Aachen University

Slide 2

Agenda

  • Introduction to OCD
  • Related Work
  • Motivation & Research Questions
  • Overlapping Community Detection (OCD) Algorithms for Signed Networks

  • Evaluation
  • Results
  • Conclusion and Outlook

Slide 3

Introduction to OCD in Signed Networks

  • Community detection is an important part of network analysis
  • Two key characteristics of signed social networks
  • Nodes in overlapping communities
  • Relations with signs
  • Community structure
  • Inside communities: dense, positive edges
  • Between communities: sparse, negative edges

[Figure: example signed network with positive edges inside communities]

Slide 4

Motivation

  • Practical application of OCD in signed networks like
  • Informal learning networks
  • Review sites
  • Open source developer networks
  • Contribute to the current research on OCD in signed networks, which has the following deficiencies
  • Few algorithms
  • No comparison between available algorithms

Slide 5

Related Work on Community Detection in Signed Graphs

  • Non-overlapping community detection
  • Agent-based finding and extracting communities (FEC) [YaCL07]
  • Two-step approach by maximizing modularity and minimizing frustration [AnMa12]
  • Clustering re-clustering algorithm (CRA) [AmPi13]
  • Overlapping community detection
  • Signed Disassortative Degree Mixing and Information Diffusion Algorithm (SDMID) [ShKl15]
  • Signed Probabilistic Mixture Model (SPM) [CWYT14]
  • Multi-objective Evolutionary Algorithm based on Similarity for Community Detection in Signed Networks (MEAs-SN) [LiLJ14]

Slide 6

Research Questions

  • How do Signed Disassortative Degree Mixing and Information Diffusion (SDMID), Signed Probabilistic Mixture model (SPM) and Multi-objective Evolutionary Algorithm (MEA) perform in comparison with each other, in terms of knowledge-driven and statistical metrics?
  • What are the structural properties of covers detected by SDMID, SPM and MEA and how do they differ?

Slide 7

Signed Disassortative Degree Mixing and Information Diffusion Algorithm: Phase 1

Identify leaders

  • Calculate the Local Leadership Value (LLD) using the effective degree (ED) and the normalized disassortativeness (DASS):

$$ED(i) = \frac{\max\left(in^{+}(i) - in^{-}(i),\, 0\right)}{in^{+}(i) + in^{-}(i)}$$

$$DASS(i) = \frac{\sum_{j \in Nei(i)} \left(\deg(i) - \deg(j)\right)}{\sum_{j \in Nei(i)} \left(\deg(i) + \deg(j)\right)}$$

$$LLD(i) = \alpha \times DASS(i) + (1 - \alpha) \times ED(i)$$

  • Identify local leaders: $\forall j \in Nei(i),\ LLD(i) \geq LLD(j)$
  • Identify global leaders: $|FL(i)| > \frac{\sum_{j \in LL} |FL(j)|}{|LL|}$

where FL: Follower Set, LL: Local Leader Set
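The leader identification step can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' implementation: `in+(i)` and `in-(i)` are taken as counts of positive and negative incoming edges, `deg` as the number of distinct neighbours in either direction, and `alpha=0.5` as an arbitrary default. The function names and the edge dictionary format are hypothetical. Global leaders are left out because the slide does not define how follower sets are built.

```python
from collections import defaultdict


def leadership_values(edges, alpha=0.5):
    """Compute LLD per node.

    edges: dict mapping a directed pair (source, target) to a sign, +1 or -1.
    Returns (lld, nei), where nei maps each node to its neighbour set.
    """
    pos_in = defaultdict(int)   # assumed meaning of in+(i)
    neg_in = defaultdict(int)   # assumed meaning of in-(i)
    nei = defaultdict(set)      # Nei(i), ignoring edge direction (assumption)
    for (u, v), sign in edges.items():
        if sign > 0:
            pos_in[v] += 1
        else:
            neg_in[v] += 1
        nei[u].add(v)
        nei[v].add(u)
    deg = {i: len(nei[i]) for i in nei}

    lld = {}
    for i in nei:
        total_in = pos_in[i] + neg_in[i]
        # Effective degree: share of incoming edges that are net positive.
        ed = max(pos_in[i] - neg_in[i], 0) / total_in if total_in else 0.0
        # Normalized disassortativeness over the neighbourhood.
        num = sum(deg[i] - deg[j] for j in nei[i])
        den = sum(deg[i] + deg[j] for j in nei[i])
        dass = num / den if den else 0.0
        lld[i] = alpha * dass + (1 - alpha) * ed
    return lld, nei


def local_leaders(lld, nei):
    """A node is a local leader if no neighbour has a larger LLD."""
    return {i for i in lld if all(lld[i] >= lld[j] for j in nei[i])}


edges = {("b", "a"): 1, ("c", "a"): 1, ("d", "a"): -1}
lld, nei = leadership_values(edges)
leaders = local_leaders(lld, nei)
```

In this toy star graph only the hub `a` qualifies as a local leader, since it has both the highest degree and a net positive in-degree.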

Slide 8

Signed Disassortative Degree Mixing and Information Diffusion Algorithm: Phase 2

Cascading (network coordination game)

  • Assign a leader node k behavior B and all other nodes behavior A
  • Node i with current behavior A changes its behavior to that of its neighbors (B) if the potential payoff $p_B(i)$ is above a predefined threshold, i.e. LLD:

$$p_B(i) = \frac{\left|\{u \mid u \in Nei^{+}(i) \text{ and } behavior(u) = B\}\right| - \left|\{v \mid v \in Nei^{-}(i) \text{ and } behavior(v) = B\}\right|}{\left|\{u \mid u \in Nei^{+}(i) \text{ and } behavior(u) = B\}\right| + \left|\{v \mid v \in Nei^{-}(i) \text{ and } behavior(v) = B\}\right|}$$

[Figure: cascade on an example signed network in three steps; node values 0.6, 0.7, 0.5 and 0.2]
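A minimal sketch of the cascade, assuming that the payoff counts positive neighbours with behavior B minus negative neighbours with behavior B, and that each node carries its own threshold. The function names, the neighbour dictionaries and the toy thresholds are hypothetical and only serve the illustration.

```python
def payoff_b(node, pos_nei, neg_nei, behavior):
    """Potential payoff p_B(node) of adopting behavior B."""
    pos_b = sum(1 for u in pos_nei.get(node, ()) if behavior[u] == "B")
    neg_b = sum(1 for v in neg_nei.get(node, ()) if behavior[v] == "B")
    total = pos_b + neg_b
    # No neighbour has adopted B yet: treat the payoff as 0 (assumption).
    return (pos_b - neg_b) / total if total else 0.0


def cascade(leader, nodes, pos_nei, neg_nei, threshold):
    """Spread behavior B from one leader until no node changes any more."""
    behavior = {n: "A" for n in nodes}
    behavior[leader] = "B"
    changed = True
    while changed:
        changed = False
        for n in nodes:
            if behavior[n] == "A" and payoff_b(n, pos_nei, neg_nei, behavior) > threshold[n]:
                behavior[n] = "B"
                changed = True
    return {n for n in nodes if behavior[n] == "B"}


nodes = ["k", "a", "b", "c"]
pos_nei = {"a": {"k"}, "b": {"a"}}   # positive neighbours per node
neg_nei = {"c": {"k"}}               # negative neighbours per node
threshold = {"k": 0.2, "a": 0.6, "b": 0.7, "c": 0.5}
community = cascade("k", nodes, pos_nei, neg_nei, threshold)
```

The loop terminates because nodes only ever switch from A to B. Node `c` stays out of the leader's community because its only link to an adopter is negative.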

Slide 9

Signed Probabilistic Mixture Model

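  • Based on the Expectation-Maximization (EM) method
  • Maximize the log function of the marginal likelihood of the signed network:

$$P(E \mid \omega, \theta) = \prod_{e_{ij} \in E} \left( \sum_{rr} \omega_{rr} \theta_{ri} \theta_{rj} \right)^{A_{ij}^{+}} \left( \sum_{rs\,(r \neq s)} \omega_{rs} \theta_{ri} \theta_{sj} \right)^{A_{ij}^{-}}$$

Expectation

  • Use $\omega, \theta$ to compute
  • The probability of a positive edge from a community r: $p_1$
  • The probability of a negative edge from two communities r and s: $p_2$

Maximization

  • Update $\omega, \theta$ with $p_1$ and $p_2$ by maximizing $\ln P(E \mid \omega, \theta)$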
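To make the objective concrete, the following sketch evaluates the log-likelihood for fixed parameters. It does not implement the E and M updates. It assumes that a positive edge contributes only the within-community term and a negative edge only the between-community term, each weighted by the absolute edge weight. The function name and the nested-list layout of `omega` and `theta` are hypothetical.

```python
import math


def log_likelihood(edges, omega, theta):
    """Evaluate ln P(E | omega, theta) for a signed network.

    edges: dict (i, j) -> signed weight A_ij
    omega: k x k nested list, omega[r][s]
    theta: k x n nested list, theta[r][i]
    Raises ValueError if an observed edge has probability 0 under the model.
    """
    k = len(omega)
    total = 0.0
    for (i, j), a in edges.items():
        if a > 0:
            # Positive edge: both endpoints drawn from the same community r.
            p = sum(omega[r][r] * theta[r][i] * theta[r][j] for r in range(k))
        else:
            # Negative edge: endpoints drawn from two different communities.
            p = sum(omega[r][s] * theta[r][i] * theta[s][j]
                    for r in range(k) for s in range(k) if r != s)
        total += abs(a) * math.log(p)
    return total


omega = [[0.4, 0.1], [0.1, 0.4]]
theta = [[0.5, 0.5, 0.0], [0.0, 0.0, 1.0]]
edges = {(0, 1): 1, (0, 2): -1}
ll = log_likelihood(edges, omega, theta)
```

An EM loop would alternate between computing the edge probabilities from the current parameters and re-estimating the parameters so that this value does not decrease.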

Slide 10

Multi-Objective Evolutionary Algorithm Based on Similarity for Community Detection in Signed Networks

  • Based upon structural similarity between adjacent nodes

$$s(u, v) = \frac{\sum_{x \in B(u) \cap B(v)} \psi(x)}{\sqrt{\sum_{x \in B(u)} w_{ux}^{2}} \cdot \sqrt{\sum_{x \in B(v)} w_{vx}^{2}}}$$

where $\psi(x) = 0$ if $w_{ux} < 0$ and $w_{vx} < 0$; $w_{ux} w_{vx}$ otherwise

  • Objective functions
  • Maximize the sum of positive similarities within communities
  • Maximize the sum of negative similarities between communities
  • Optimal solution is selected with MOEA/D (multiobjective evolutionary algorithm based on decomposition) [ZhLi07]
  • Decomposition into scalar optimization subproblems
  • Simultaneous optimization of these subproblems
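The signed similarity can be sketched as follows. The sketch assumes that B(u) is the set of nodes listed in u's weight dictionary (callers who want a closed neighbourhood would add a self-entry) and that the denominator uses square roots of the squared-weight sums. The function name and the data layout are hypothetical.

```python
import math


def signed_similarity(u, v, w):
    """Structural similarity s(u, v) on a signed, weighted graph.

    w: dict node -> dict neighbour -> signed weight.
    """
    def psi(x):
        # Two negative links to a common neighbour count as no evidence.
        if w[u][x] < 0 and w[v][x] < 0:
            return 0.0
        return w[u][x] * w[v][x]

    common = set(w[u]) & set(w[v])
    num = sum(psi(x) for x in common)
    norm_u = math.sqrt(sum(wx ** 2 for wx in w[u].values()))
    norm_v = math.sqrt(sum(wx ** 2 for wx in w[v].values()))
    den = norm_u * norm_v
    return num / den if den else 0.0


w = {
    "u": {"v": 1, "x": 1, "y": -1},
    "v": {"u": 1, "x": 1},
}
s_uv = signed_similarity("u", "v", w)
```

A common neighbour that both nodes like raises the similarity, while one that is liked by one node and disliked by the other lowers it.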

Slide 11

Evaluation Metrics

  • Normalized mutual information: regards $M_{ik}$, $M_{il'}$ as two random variables and determines the mutual information ($M_i$: membership vector, k: k-th community in detected cover, $l'$: $l'$-th community in real cover)
  • Signed modularity: measures the strength of a community partition by taking into account the degree distribution

$$Q_{SO} = \frac{1}{2(w^{+})_e + 2|(w^{-})_e|} \sum_{e_{ij}} \left[ w_{ij} - \left( \frac{w_i^{+} w_j^{+}}{2(w^{+})_e} - \frac{w_i^{-} w_j^{-}}{2|(w^{-})_e|} \right) \right] \delta(C_i, C_j)$$

where $\delta(C_i, C_j)$: number of communities in which $e_{ij}$ resides

  • Frustration: normalized weighted sum of the weights of negative edges inside communities and positive edges between communities

$$Frustration = \frac{\alpha \times |(w_{intra}^{-})_e| + (1 - \alpha) \times |(w_{inter}^{+})_e|}{(w^{+})_e + |(w^{-})_e|}$$

  • Execution time
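As a worked example of the frustration metric, the sketch below computes it for an overlapping cover. It assumes that an edge counts as intra-community when its endpoints share at least one community, and it uses `alpha=0.5` as an arbitrary default. The function name and the input layout are hypothetical.

```python
def frustration(edges, membership, alpha=0.5):
    """Weighted share of edges whose sign contradicts the cover.

    edges: dict (i, j) -> signed weight
    membership: dict node -> set of community ids
    """
    w_pos = w_neg = intra_neg = inter_pos = 0.0
    for (i, j), w in edges.items():
        same = bool(membership[i] & membership[j])
        if w > 0:
            w_pos += w
            if not same:
                inter_pos += w      # positive edge between communities
        elif w < 0:
            w_neg += -w
            if same:
                intra_neg += -w     # negative edge inside a community
    total = w_pos + w_neg
    return (alpha * intra_neg + (1 - alpha) * inter_pos) / total if total else 0.0


edges = {(1, 2): 1, (2, 3): -1, (3, 4): 1, (1, 4): -1}
membership = {1: {0}, 2: {0}, 3: {0, 1}, 4: {1}}
value = frustration(edges, membership)
```

Here only the negative edge between nodes 2 and 3 lies inside a community, so one of four unit-weight edges is frustrated and is weighted by alpha.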

Slide 12

Synthetic Network Generator

  • Comes from the idea of [LiLJ14] and is based on the Lancichinetti-Fortunato-Radicchi (LFR) model (directed and unweighted) and a model from [YaCL07]
  • Parameters
  • From LFR: no. of nodes, average/max degree, minus exponents for the degree and community size distributions, which are power laws, min/max community size, no. of overlapping nodes, no. of communities, fraction of edges that each node shares with other communities
  • From [YaCL07]: proportion of negative edges inside communities P- and proportion of positive edges between communities P+
  • Generation
  • Generate a normal LFR network
  • Negate all inter-community edges
  • Randomly negate P- of all intra-community edges
  • Randomly negate P+ of all inter-community edges
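The sign assignment steps can be sketched as below. The LFR generation itself is not shown: the sketch assumes an unsigned edge list and a membership map are already available, and it treats an edge as intra-community when its endpoints share a community. The function name, its parameters and the rounding of the sampled counts are hypothetical choices.

```python
import random


def assign_signs(edges, membership, p_minus, p_plus, seed=None):
    """Turn an unsigned benchmark network into a signed one.

    edges: list of (i, j) pairs
    membership: dict node -> set of community ids
    p_minus: fraction of intra-community edges made negative
    p_plus: fraction of inter-community edges made positive
    """
    rng = random.Random(seed)
    intra = [e for e in edges if membership[e[0]] & membership[e[1]]]
    inter = [e for e in edges if not (membership[e[0]] & membership[e[1]])]

    # Start positive inside communities, negative between them.
    signs = {e: 1 for e in intra}
    signs.update({e: -1 for e in inter})

    # Add noise: flip a random share of each group.
    for e in rng.sample(intra, round(p_minus * len(intra))):
        signs[e] = -1
    for e in rng.sample(inter, round(p_plus * len(inter))):
        signs[e] = 1
    return signs


edges = [(1, 2), (2, 3), (1, 3), (3, 4), (4, 5), (5, 6), (4, 6)]
membership = {1: {0}, 2: {0}, 3: {0}, 4: {1}, 5: {1}, 6: {1}}
signs = assign_signs(edges, membership, p_minus=0.5, p_plus=0.0, seed=1)
```

With P- = P+ = 0 the result is a perfectly balanced network, so the two parameters directly control how much noise the detection algorithms have to cope with.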

Slide 13

Experiments on Benchmark Networks: Community Structure (1)

[Figure: Community Distribution; no. of communities by community size for SDMID, MEA, SPM and Ground Truth; two panels, maxc=35 and maxc=40]

Parameters: n=100, k=3, maxk=6, μ=0.1, t1=-2.0, t2=-1.0, minc=5, on=5, om=2, P-=0.01, P+=0.01

  • SDMID has a community distribution more similar to the ground truth than MEA and SPM
  • SPM detects the biggest community sizes

Slide 14

Experiments on Benchmark Networks: Community Structure (2)

[Figure: Standalone Nodes; no. of nodes for SDMID, MEA, SPM and Ground Truth; three panels]

[Figure: Nodes in Overlapping Communities; no. of nodes for SDMID, MEA, SPM and Ground Truth; three panels]

  • MEA detects the highest number of standalone nodes
  • SDMID also identifies some of the nodes as standalone
  • SDMID assigns most of the nodes as overlapping

Slide 15

Experiment on Real World Network Wiki-Elec: Metric Values

[Figure: Experiment on Wiki-Elec; modularity, frustration and execution time in minutes for SDMID, MEA and SPM; data labels: 0.28, 0.21, 0.26, 0.10, 0.11, 0.10, 0.16, 3,101, 1,760]

  • SDMID has the highest modularity value
  • SDMID and SPM obtain the lowest frustration values
  • SDMID is the best regarding the execution time

Slide 16

Experiments on Real World Network Wiki-Elec: Community Structure

[Figure: Community Distribution (size>1); no. of communities by community size for SDMID, MEA and SPM]

[Figure: Standalone Nodes; no. of nodes for SDMID, MEA and SPM; data labels: 149, 3,250, 77]

[Figure: Nodes in Overlapping Communities; no. of nodes for SDMID, MEA and SPM; data labels: 6,853, 5, 6,354]

  • MEA detects most of the nodes as standalone, and most of the nodes are in one community
  • The fewest standalone nodes are observed with SDMID and SPM
  • SDMID and SPM detect a similarly high number of overlapping nodes

Slide 17

Experiment Summary: Evaluation Radar

[Figure: evaluation radar for SDMID, MEA and SPM; Wiki-Elec Dataset (modularity, frustration, execution time) and Benchmark Networks (modularity, frustration, NMI, execution time)]

  • In Wiki-Elec, SDMID has the best performance regarding modularity, execution time and frustration
  • In benchmark networks, SDMID has better performance regarding modularity, execution time and NMI
  • Performance of SPM is better regarding frustration

Slide 18

Experiment Summary: Community Structure

  • SDMID
  • Big-sized communities
  • Large areas of overlapping
  • MEAs-SN
  • Small-sized communities
  • Few nodes in the overlapping area
  • Large number of stand-alone nodes
  • SPM
  • Predefined number of communities k
  • Large areas of overlapping with a small k

Slide 19

Conclusion & Message

  • We compared the SDMID, SPM and MEA OCD algorithms from different aspects
  • There are few algorithms for overlapping community detection in signed networks
  • Currently SDMID and SPM are the best options to be applied on signed network datasets
  • SDMID is the fastest and has the highest modularity
  • SDMID obtained the best performance on the real world network Wiki-Elec
  • SDMID might be a better choice when diffusion of opinions is preferred across community borders

Slide 20

References

  • [CWYT14] Yi Chen, Xiaolong Wang, Bo Yuan and Buzhou Tang. Overlapping Community Detection in Networks with Positive and Negative Links. In: Journal of Statistical Mechanics: Theory and Experiment 2014.3: P03021, 2014.
  • [LiLJ14] Chenlong Liu, Jing Liu and Zhongzhou Jiang. A Multiobjective Evolutionary Algorithm Based on Similarity for Community Detection from Signed Social Networks. In: IEEE Transactions on Cybernetics 44.12: pp. 2274-2286, 2014.
  • [ShKl15] Mohsen Shahriari and Ralf Klamma. Signed Social Networks: Link Prediction and Overlapping Community Detection. In: Proceedings of IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining. 2015.
  • [YaCL07] Bo Yang, William K. Cheung and Jiming Liu. Community Mining from Signed Social Networks. In: IEEE Transactions on Knowledge and Data Engineering 19.10: pp. 1333-1348, 2007.
  • [ZhLi07] Qingfu Zhang and Hui Li. MOEA/D: A Multiobjective Evolutionary Algorithm Based on Decomposition. In: IEEE Transactions on Evolutionary Computation 11.6: pp. 712-731, 2007.

Slide 21

Thank you!