Empirical Comparison of Approximate Inference Algorithms for - - PowerPoint PPT Presentation


Empirical Comparison of Approximate Inference Algorithms for Networked Data Prithviraj Sen Lise Getoor Department of Computer Science University of Maryland, College Park. Workshop on Open Problems in Statistical Relational Learning, 2006


SLIDE 1

Empirical Comparison of Approximate Inference Algorithms for Networked Data

Prithviraj Sen Lise Getoor

Department of Computer Science University of Maryland, College Park.

Workshop on Open Problems in Statistical Relational Learning, 2006

Empirical Comparison of Approximate Inference Algorithms for Networked Data –Prithviraj Sen, Lise Getoor 1/16

SLIDE 2

Introduction

Recent, widespread interest in structured classification. Numerous approximate inference algorithms for networked data exist. We empirically compare three of the most popular ones:

  • Iterative Classification Algorithm
  • Mean Field Relaxation Labeling
  • Loopy Belief Propagation

SLIDE 3

Parameters of Interest

  • Performance on random graph data.
  • Effects of noise in attribute values.
  • Effects of noise in correlations across links.
  • Effects of varying link density.
  • Effects of different link patterns.

SLIDE 4

Iterative Classification Algorithm (ICA)

Simple, greedy, iterative algorithm. Introduced by Besag [Besag, 1986]. In each iteration, for each node, looks at neighbourhood class labels.

$$b_i(y) \leftarrow \alpha \, \phi_i(y) \exp\Big\{ \sum_{Y_j \in N(Y_i)} w_{y, y_j} \Big\}$$
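The ICA update can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the adjacency dictionary `neighbors`, the local potentials `phi`, the weight table `w`, the bootstrap from local potentials and the stopping rule are all assumptions made for the example. Because only the arg max of the belief is kept, the normaliser α is omitted.

```python
import math


def ica(neighbors, phi, w, max_iters=50):
    """Hard-label iterative classification (illustrative sketch).

    neighbors: dict node -> list of neighbouring nodes (assumed format)
    phi:       dict node -> list of local potentials, one per label
    w:         w[y][y2] is the weight for a node labelled y next to one labelled y2
    """
    num_labels = len(w)
    # Assumption: bootstrap every node with the label its local potential prefers.
    labels = {i: max(range(num_labels), key=lambda y: phi[i][y]) for i in neighbors}
    for _ in range(max_iters):
        changed = False
        for i in neighbors:
            # Unnormalised belief: phi_i(y) * exp(sum over neighbours of w[y][y_j]).
            scores = [
                phi[i][y] * math.exp(sum(w[y][labels[j]] for j in neighbors[i]))
                for y in range(num_labels)
            ]
            best = max(range(num_labels), key=lambda y: scores[y])
            if best != labels[i]:
                labels[i] = best
                changed = True
        if not changed:  # assumed stopping rule: no label changed in a full sweep
            break
    return labels
```

On a three-node chain whose end nodes strongly prefer label 0 and whose middle node weakly prefers label 1, agreement-favouring weights pull the middle node over to label 0.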

SLIDE 5

Mean Field Relaxation Labeling (MF)

Soft version of ICA. Many other versions exist.

Discovered by vision community [Hummel & Zucker, 1983]. In each iteration, for each node, looks at each neighbour's label distribution.

$$b_i(y) \leftarrow \alpha \, \phi_i(y) \exp\Big\{ \sum_{Y_j \in N(Y_i),\, y'} w_{y, y'} \, b_j(y') \Big\}$$
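A minimal sketch of the mean field update, under the same assumed data layout as one would use for ICA (`neighbors`, `phi` and `w` are hypothetical names, not from the slides). Beliefs are updated in place, one node at a time, and the convergence tolerance `tol` is an assumption; the slides do not say which schedule or stopping rule was used.

```python
import math


def mean_field(neighbors, phi, w, max_iters=100, tol=1e-6):
    """Mean field relaxation labeling (illustrative sketch).

    Returns dict node -> list of label probabilities b_i(y).
    """
    num_labels = len(w)
    # Assumption: start from the normalised local potentials.
    beliefs = {i: [p / sum(phi[i]) for p in phi[i]] for i in neighbors}
    for _ in range(max_iters):
        max_change = 0.0
        for i in neighbors:
            scores = []
            for y in range(num_labels):
                # Expected weight under the neighbours' current label distributions.
                s = sum(
                    w[y][y2] * beliefs[j][y2]
                    for j in neighbors[i]
                    for y2 in range(num_labels)
                )
                scores.append(phi[i][y] * math.exp(s))
            z = sum(scores)  # dividing by z plays the role of alpha
            new = [s / z for s in scores]
            max_change = max(
                max_change, max(abs(a - b) for a, b in zip(new, beliefs[i]))
            )
            beliefs[i] = new
        if max_change < tol:
            break
    return beliefs
```

The only difference from the hard-label update is that each neighbour contributes a weighted average over its label distribution rather than a single label.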

SLIDE 6

Loopy Belief Propagation (LBP)

Message-passing algorithm. Attempts to stop sending messages in loops. Discovered by iterative decoding community [Kschischang & Frey, 1998, McEliece et al, 1998, Kschischang et al, 2001]. Messages computed without considering destination node's message.

$$m_{i \to j}(y) \leftarrow \alpha \sum_{y'} \phi_i(y') \, e^{w_{y, y'}} \prod_{Y_k \in N(Y_i) \setminus Y_j} m_{k \to i}(y')$$

$$b_i(y) \leftarrow \alpha \, \phi_i(y) \prod_{Y_j \in N(Y_i)} m_{j \to i}(y)$$
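The two LBP updates can be sketched as follows. This is an illustrative version under stated assumptions: the names `neighbors`, `phi` and `w`, the uniform message initialisation, the synchronous schedule and the tolerance `tol` are not taken from the slides. It needs Python 3.8 or later for `math.prod`.

```python
import math


def loopy_bp(neighbors, phi, w, max_iters=100, tol=1e-6):
    """Sum-product loopy belief propagation on a pairwise model (sketch).

    neighbors must be symmetric: j in neighbors[i] implies i in neighbors[j].
    """
    num_labels = len(w)
    uniform = [1.0 / num_labels] * num_labels
    # One message per directed edge, initialised uniformly (assumption).
    msgs = {(i, j): list(uniform) for i in neighbors for j in neighbors[i]}
    for _ in range(max_iters):
        new_msgs = {}
        for (i, j) in msgs:
            out = []
            for y in range(num_labels):
                total = 0.0
                for y2 in range(num_labels):
                    # Product over i's neighbours except the destination j.
                    incoming = math.prod(
                        msgs[(k, i)][y2] for k in neighbors[i] if k != j
                    )
                    total += phi[i][y2] * math.exp(w[y][y2]) * incoming
                out.append(total)
            z = sum(out)  # normalisation plays the role of alpha
            new_msgs[(i, j)] = [v / z for v in out]
        delta = max(
            (abs(a - b) for e in msgs for a, b in zip(new_msgs[e], msgs[e])),
            default=0.0,
        )
        msgs = new_msgs
        if delta < tol:
            break
    beliefs = {}
    for i in neighbors:
        raw = [
            phi[i][y] * math.prod(msgs[(j, i)][y] for j in neighbors[i])
            for y in range(num_labels)
        ]
        z = sum(raw)
        beliefs[i] = [v / z for v in raw]
    return beliefs
```

On a graph without cycles these updates give exact marginals; on graphs with cycles they are only an approximation and may not converge, which is why the iteration cap is there.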

SLIDE 7

Synthetic Graph Generation Algorithm

Based on power-law graph generation algorithm [Bollobas et al, 2003]

1: Begin with a single node graph G.
2: repeat
3:   With probability α introduce an edge in G
4:   With probability 1 − α introduce a new node with a randomly sampled label, connect new node to G
5: until size of G = n
6: generate attributes for all nodes.
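A rough Python rendering of steps 1 to 5, to make the control flow concrete. Several details are assumptions, since this slide does not give them: endpoints are picked uniformly at random here, an edge step is skipped while G has fewer than two nodes, duplicate edges are merged, and attribute generation (step 6) is left out.

```python
import random


def generate_graph(n, alpha, num_labels=2, rng=None):
    """Grow a labelled graph until it has n nodes (illustrative sketch).

    Returns (labels, edges): labels[v] is the label of node v and
    edges is a set of (u, v) pairs with u < v.
    """
    rng = rng or random.Random()
    labels = [rng.randrange(num_labels)]  # step 1: single-node graph
    edges = set()
    while len(labels) < n:  # steps 2 and 5
        if rng.random() < alpha and len(labels) >= 2:
            # Step 3: add an edge between two existing nodes.
            # Uniform endpoint choice is an assumption of this sketch.
            u, v = rng.sample(range(len(labels)), 2)
            edges.add((min(u, v), max(u, v)))
        else:
            # Step 4: add a node with a random label and connect it to G.
            new = len(labels)
            target = rng.randrange(new)
            labels.append(rng.randrange(num_labels))
            edges.add((target, new))
    return labels, edges
```

Larger values of `alpha` spend more iterations on step 3, so the resulting graph has more edges per node.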

SLIDE 8

Preferential attachment scheme used

When choosing node to link node ν to:

  • With probability ρ choose node with same label.
  • With probability 1 − ρ choose node with different label.
  • Preference given to nodes with high degree.
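One way to realise this choice is sketched below. The function name `choose_target`, the degree-plus-one weighting and the fallback when no node has the wanted label are assumptions for illustration; the slides only state that high-degree nodes are preferred.

```python
import random


def choose_target(node_label, labels, degrees, rho, rng):
    """Pick an existing node for node nu to link to (illustrative sketch).

    labels, degrees: lists indexed by the existing candidate nodes
    (node nu itself is assumed not to be among them).
    """
    # With probability rho look for the same label, otherwise a different one.
    want_same = rng.random() < rho
    candidates = [
        v for v, lab in enumerate(labels) if (lab == node_label) == want_same
    ]
    if not candidates:
        # Assumed fallback: no node has the wanted label, so consider all nodes.
        candidates = list(range(len(labels)))
    # Preferential attachment: weight grows with degree. The +1 keeps
    # zero-degree nodes reachable and is an assumption of this sketch.
    weights = [degrees[v] + 1 for v in candidates]
    return rng.choices(candidates, weights=weights, k=1)[0]
```

With `rho` close to 1 almost every link joins nodes of the same label, and with `rho` close to 0 almost every link joins nodes of different labels.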

[Figure: log-log plot of Frequency against Degree, showing the degree distribution together with the curve exp(12-3*log(x)).]

SLIDE 9

Experimental setup

Performed 3-fold cross validation. Metric used: avg. classification accuracy. Compared three models: ICA, MF, LBP. Performed experiments on binary class data.

Parameters of interest

  • α: controls number of edges.
  • ρ: controls degree of correlation across edges.
  • ω: controls noise in attribute values.
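The evaluation protocol can be sketched as below. How the folds were formed on networked data is not described in the slides, so the random split over node indices, the helper names and the `predict` callback are all assumptions.

```python
import random


def k_fold_indices(num_items, k=3, seed=0):
    """Shuffle item indices and deal them into k folds."""
    idx = list(range(num_items))
    random.Random(seed).shuffle(idx)
    return [idx[f::k] for f in range(k)]


def accuracy(predicted, truth, test_nodes):
    """Fraction of test nodes whose predicted label matches the true label."""
    correct = sum(predicted[i] == truth[i] for i in test_nodes)
    return correct / len(test_nodes)


def cross_validated_accuracy(truth, predict, k=3, seed=0):
    """Average accuracy over k folds.

    predict(train_nodes, test_nodes) is a hypothetical callback that
    returns a dict mapping each test node to its predicted label.
    """
    folds = k_fold_indices(len(truth), k, seed)
    scores = []
    for f, test_nodes in enumerate(folds):
        train_nodes = [i for g, fold in enumerate(folds) if g != f for i in fold]
        predicted = predict(train_nodes, test_nodes)
        scores.append(accuracy(predicted, truth, test_nodes))
    return sum(scores) / k
```

Any of the three inference algorithms could be wrapped in `predict`, with the training nodes' labels treated as observed.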

SLIDE 10

Varying Correlations across Links

[Figure: three plots of avg. accuracy (%) (40 to 100) against ρ (0.5 to 1) for LBP, MF and ICA, titled "Varying link noise with α = 0.1", "Varying link noise with α = 0.3" and "Varying link noise with α = 0.5".]

SLIDE 11

Varying Correlations across Links – contd.

[Figure: three plots of avg. accuracy (%) (40 to 100) against ρ (0.1 to 0.5) for LBP, MF and ICA, titled "Varying link noise with α = 0.1", "Varying link noise with α = 0.3" and "Varying link noise with α = 0.5".]

SLIDE 12

Varying Attribute Noise

[Figure: three plots of avg. accuracy (%) (50 to 100) against ω (0.2 to 1) for LBP, MF and ICA, each titled "Varying attr. noise with α = 0.1".]

SLIDE 13

Varying Attribute Noise – contd.

[Figure: three plots of avg. accuracy (%) (20 to 100) against ω (0.2 to 1) for LBP, MF and ICA, titled "Varying attr. noise with α = 0.1", "Varying attr. noise with α = 0.3" and "Varying attr. noise with α = 0.5".]

SLIDE 14

Effect of different link patterns

In the case of Homophily or Perfect Assortative Mixing (figure on left), the generated graphs form densely connected clusters, introducing closed loops that hamper LBP and MF.

SLIDE 15

Conclusion

  • We empirically compared three of the most popular approximate inference techniques for networked data.
  • MF tends to get stuck at local minima in a variety of cases, e.g., high link correlation, high link density.
  • LBP tends to face issues in the presence of high link density and a specific type of link pattern known as Homophily or Perfect Assortative Mixing but otherwise performs well.
  • We found that LBP's convergence does not necessarily indicate good results.
  • ICA is the most consistent of the three approaches considered, returning reasonable results in a wide variety of conditions.

SLIDE 16

References

  • J. Besag, "On the statistical analysis of dirty pictures", Journal of the Royal Statistical Society, 1986.

  • R. Hummel and S. Zucker, "On the foundations of relaxation labeling processes", IEEE Trans. on Pattern Analysis and Machine Intelligence, 1983.

  • F. R. Kschischang and B. J. Frey, "Iterative decoding of compound codes by probability propagation in graphical models", IEEE Journal on Selected Areas in Communications, 1998.

  • F. R. Kschischang, B. J. Frey and H. A. Loeliger, "Factor graphs and the sum-product algorithm", IEEE Trans. on Information Theory, 2001.

  • R. J. McEliece, D. J. C. MacKay and J. F. Cheng, "Turbo decoding as an instance of Pearl's belief propagation algorithm", IEEE Journal on Selected Areas in Communications, 1998.

  • B. Bollobas, C. Borgs, J. T. Chayes and O. Riordan, "Directed scale-free graphs", In Proceedings of ACM-SIAM Symposium on Discrete Algorithms, 2003.
