Bayesian nonparametric models for bipartite graphs
François Caron
Department of Statistics, Oxford
Statistics Colloquium, Harvard University
November 11, 2013
Bipartite networks
[Figure: bipartite graph with nodes labelled Readers/Customers, A1, A2, B1, B2, B3, B4]
[Figure: two log-log panels, each titled "Degree Distribution"]
◮ Exponential random graphs, stochastic block models, Rasch models, etc.
◮ Do not capture power-law behavior
◮ Inference does not scale well with the number of nodes
◮ Preferential attachment
◮ Lacks interpretable parameters, non-exchangeability
◮ Dirichlet Process Mixtures: Clustering/density estimation with
◮ Language modeling, image segmentation
◮ Number of nodes is fixed and dimension of the latent structure
◮ Can capture power-law degree distributions for books
◮ Poisson degree distribution for readers
◮ $z_{ij} = 1$ if reader $i$ has read book $j$, 0 otherwise
◮ $\{\theta_j\}$ is the set of books
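As a minimal illustration of this notation, the indicators $z_{ij}$ can be stored as a binary reader-by-book matrix, from which reader and book degrees are row and column sums. The toy matrix and the variable names below are hypothetical and not taken from the talk.

```python
# Hypothetical toy data: 3 readers (rows) x 4 books (columns).
# Z[i][j] = 1 if reader i has read book j, 0 otherwise.
Z = [
    [1, 0, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
]

# Degree of a reader: number of books that reader has read (row sum).
reader_degree = [sum(row) for row in Z]

# Degree of a book: number of readers who have read it (column sum).
book_degree = [sum(col) for col in zip(*Z)]

print(reader_degree)  # [2, 2, 1]
print(book_degree)    # [3, 1, 1, 0]
```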
◮ Latent scores $s_{ij} \sim \mathrm{Gumbel}(\log(w_j), 1)$
◮ All books with a score above $-\log(\gamma_i)$ are retained, others are
[Figure: two panels indexed by book: book popularity; book scores with the threshold $-\log(\gamma_i)$]
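The latent-score construction can be checked numerically. Since a Gumbel(log w_j, 1) variable has CDF exp(-w_j e^{-s}), the probability that a score exceeds -log(gamma_i) is 1 - exp(-gamma_i w_j). The sketch below simulates the thresholding and compares empirical link frequencies with that value. The weights, the helper names, and the number of draws are illustrative assumptions; only the Gumbel score and the threshold come from the slide.

```python
import math
import random


def gumbel(loc, rng):
    """Draw from Gumbel(loc, 1) as loc - log(E) with E ~ Exp(1)."""
    e = rng.expovariate(1.0)
    while e == 0.0:  # guard against log(0)
        e = rng.expovariate(1.0)
    return loc - math.log(e)


def simulate_links(weights, gamma_i, rng):
    """Return z_ij for one reader: 1 when score s_ij exceeds -log(gamma_i)."""
    threshold = -math.log(gamma_i)
    return [1 if gumbel(math.log(w), rng) > threshold else 0 for w in weights]


rng = random.Random(0)
weights = [0.1, 0.5, 2.0]  # hypothetical book popularities w_j
gamma_i = 2.0              # reader parameter (illustrative value)
n_draws = 20000

# Empirical frequency with which each book is retained.
counts = [0] * len(weights)
for _ in range(n_draws):
    for j, z in enumerate(simulate_links(weights, gamma_i, rng)):
        counts[j] += z
freq = [c / n_draws for c in counts]

# Closed-form link probability implied by the Gumbel CDF.
exact = [1.0 - math.exp(-gamma_i * w) for w in weights]

for f, p in zip(freq, exact):
    print(f"empirical {f:.3f}  vs  1 - exp(-gamma*w) = {p:.3f}")
```

More popular books (larger w_j) have scores shifted upward and are retained more often, while a larger gamma_i lowers the threshold so the reader keeps more books.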
[Figure: two panels indexed by book: book scores with the threshold $-\log(\gamma_i)$; censored scores]
$\frac{\alpha}{\Gamma(1-\sigma)} w^{-\sigma-1} e^{-\tau w}$, $\tau = 1$, $\gamma_i = 2$.
[Figure: six panels; axes: Books and Readers]
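The expression on this slide has the form of a generalized gamma Lévy intensity. As a hedged worked step, write it as $\lambda(w)$ (a label assumed here, not taken from the slide) and assume reader $i$ links to a book of weight $w$ with probability $1 - e^{-\gamma_i w}$. The expected number of books read by one reader then has a closed form for $0 < \sigma < 1$; the symbol $\psi$ is also notation introduced for this sketch.

```latex
\[
\psi(\gamma_i)
  = \int_0^\infty \left(1 - e^{-\gamma_i w}\right)
    \frac{\alpha}{\Gamma(1-\sigma)}\, w^{-\sigma-1} e^{-\tau w}\, dw
  = \frac{\alpha}{\sigma}\left[(\gamma_i + \tau)^{\sigma} - \tau^{\sigma}\right]
\]
```

With the values $\tau = 1$ and $\gamma_i = 2$ shown on the slide, this evaluates to $\frac{\alpha}{\sigma}(3^{\sigma} - 1)$.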
◮ The total number of books read by $n$ readers is $O(n^\sigma)$
◮ Asymptotically, the proportion of books read by $m$ readers is $O(m^{-1-\sigma})$
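A small numeric check of the first growth claim, as a sketch under assumptions not stated on this slide: all readers share one parameter gamma, links are independent with probability 1 - exp(-gamma * w), and book weights follow a generalized gamma process with Lévy intensity alpha / Gamma(1 - sigma) * w^(-sigma - 1) * exp(-tau * w). Under those assumptions the expected number of distinct books read by n readers is (alpha / sigma) * ((n * gamma + tau)^sigma - tau^sigma), which grows like n^sigma. The function name and the values alpha = 1 and sigma = 0.5 are illustrative.

```python
def expected_books(n, alpha, sigma, tau, gamma):
    """Expected number of distinct books read by n readers sharing gamma.

    Closed form of the integral of (1 - exp(-n*gamma*w)) against the
    generalized gamma Levy intensity, valid for 0 < sigma < 1.
    """
    return alpha / sigma * ((n * gamma + tau) ** sigma - tau ** sigma)


# Illustrative parameter values (alpha and sigma are hypothetical).
alpha, sigma, tau, gamma = 1.0, 0.5, 1.0, 2.0

# Constant in front of n**sigma in the large-n limit.
limit = alpha * gamma ** sigma / sigma

# Ratio of the expected count to n**sigma for increasing n.
ratios = {
    n: expected_books(n, alpha, sigma, tau, gamma) / n ** sigma
    for n in (10, 1000, 100000)
}

for n, r in ratios.items():
    print(f"n = {n:>6}: expected books / n^sigma = {r:.4f}")
print(f"limiting constant alpha * gamma^sigma / sigma = {limit:.4f}")
```

The ratio stabilises as n grows, which is the O(n^sigma) behaviour stated in the bullet above.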
◮ Stable Indian Buffet Process
◮ Proposed model where $G$ follows a Generalized Gamma process of
◮ with shared and unknown $\gamma_i = \gamma$
◮ with nonparametric prior where $\Gamma$ follows a generalized gamma process
[Figure: six log-log panels of degree distributions; x-axis: Degree; legend: Model, Data]
[Figure: six log-log panels of degree distributions; x-axis: Degree; legend: Model, Data]
[Figure: posterior histograms of $\sigma_\gamma$ and $\sigma_w$]