
Diversified Recommendation on Graphs: Pitfalls, Measures, and Algorithms



  1. Diversified Recommendation on Graphs: Pitfalls, Measures, and Algorithms
     Onur Küçüktunç (1,2), Erik Saule (1), Kamer Kaya (1), Ümit V. Çatalyürek (1,3)
     (1) Dept. of Biomedical Informatics, (2) Dept. of Computer Science and Engineering, (3) Dept. of Electrical and Computer Engineering, The Ohio State University
     WWW 2013, May 13–17, 2013, Rio de Janeiro, Brazil.

  2. Outline
     • Problem definition
       – Motivation
       – Result diversification algorithms
     • How to measure diversity
       – Classical relevance and diversity measures
       – Bicriteria optimization?!
       – Combined measures
     • Best Coverage method
       – Complexity, submodularity
       – A greedy solution, relaxation
     • Experiments
     Kucuktunc et al. “Diversified Recommendation on Graphs: Pitfalls, Measures, and Algorithms”, WWW ’13

  3. Problem definition
     Let $G = (V, E)$ be an undirected graph. Given a set of $m$ seed nodes $Q = \{q_1, \ldots, q_m\}$ s.t. $Q \subseteq V$, and a parameter $k$, return top-$k$ items which are relevant to the ones in $Q$, but diverse among themselves, covering different aspects of the query.
     • Online shopping
       – $G = (V, E)$: product co-purchasing network
       – $Q \subseteq V$: one product, previous purchases, page visit history
       – $R \subset V$: product recommendations (“you might also like…”)
     • Academic
       – $G = (V, E)$: paper-to-paper citations, collaboration network
       – $Q \subseteq V$: paper/field of interest, set of references, researcher himself/herself
       – $R \subset V$: references for related work, new collaborators
     • Social
       – $G = (V, E)$: friendship network
       – $Q \subseteq V$: user himself/herself, set of people
       – $R \subset V$: friend recommendations (“you might also know…”)

  4. Problem definition
     Let $G = (V, E)$ be an undirected graph. Given a set of $m$ seed nodes $Q = \{q_1, \ldots, q_m\}$ s.t. $Q \subseteq V$, and a parameter $k$, return top-$k$ items which are relevant to the ones in $Q$, but diverse among themselves, covering different aspects of the query.
     • We assume that the graph itself is the only information we have, and no categories or intents are available
       – no comparisons to intent-aware algorithms [Agrawal09, Welch11, etc.]
       – but we will compare against intent-aware measures
     • Relevance scores are obtained with Personalized PageRank (PPR) [Haveliwala02]:
       $$p^*(v) = \begin{cases} 1/m, & \text{if } v \in Q \\ 0, & \text{otherwise.} \end{cases}$$
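The PPR scores used throughout the deck can be approximated with a plain power iteration. The sketch below is illustrative only: the slide gives just the restart vector $p^*$, so the damping factor `d`, the iteration count, the adjacency-dict representation, and the handling of dangling vertices are all assumptions.

```python
def personalized_pagerank(adj, seeds, d=0.85, iters=100):
    """Power iteration for PPR on an undirected graph.

    adj   : dict mapping each vertex to a list of its neighbors
    seeds : the query set Q; restarts jump uniformly to one of its members
    d     : probability of following an edge (assumed value, not from the slides)
    """
    m = len(seeds)
    # Restart vector p* from the slide: 1/m on seed vertices, 0 elsewhere.
    p_star = {v: (1.0 / m if v in seeds else 0.0) for v in adj}
    pi = dict(p_star)
    for _ in range(iters):
        nxt = {v: (1.0 - d) * p_star[v] for v in adj}
        for u, nbrs in adj.items():
            if not nbrs:
                # Dangling vertex (assumption): return its mass to the seeds.
                for q in seeds:
                    nxt[q] += d * pi[u] / m
                continue
            share = d * pi[u] / len(nbrs)
            for v in nbrs:
                nxt[v] += share
        pi = nxt
    return pi


# Toy path graph 0 - 1 - 2 - 3 - 4 with a single seed at vertex 0.
path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
scores = personalized_pagerank(path, seeds={0})
```

The scores sum to one and decay with distance from the seed, which is what makes a plain top-$k$ by $\pi$ cluster around $Q$.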

  5. Result diversification algorithms
     • GrassHopper [Zhu07]
       – ranks the graph $k$ times
       – turns the highest-ranked vertex into a sink node at each iteration
     [Figure, panels (a)–(d): GrassHopper on a sample graph. The highest-ranked vertex $g_1$ is turned into a sink node, $g_2$ becomes the highest-ranked vertex in the next step, then $g_3$; the result set grows as $R = \{g_1\}$, $R = \{g_1, g_2\}$, $R = \{g_1, g_2, g_3\}$.]

  6. Result diversification algorithms
     • GrassHopper [Zhu07]
       – ranks the graph $k$ times
       – turns the highest-ranked vertex into a sink node at each iteration
     • DivRank [Mei10]
       – based on vertex-reinforced random walks (VRRW)
       – adjusts the transition matrix based on the number of visits to the vertices (rich-gets-richer mechanism)
     [Figure: a sample graph, its weighting with PPR, and a diverse weighting.]

  7. Result diversification algorithms
     • GrassHopper [Zhu07]
       – ranks the graph $k$ times
       – turns the highest-ranked vertex into a sink node at each iteration
     • DivRank [Mei10]
       – based on vertex-reinforced random walks (VRRW)
       – adjusts the transition matrix based on the number of visits to the vertices (rich-gets-richer mechanism)
     • Dragon [Tong11]
       – based on optimizing the goodness measure
       – punishes the score when two neighbors are included in the results

  8. Measuring diversity
     Relevance measures
     • Normalized relevance: $\mathrm{rel}(S) = \dfrac{\sum_{v \in S} \pi_v}{\sum_{i=1}^{k} \hat{\pi}_i}$
     • Difference ratio: $\mathrm{diff}(S, \hat{S}) = 1 - \dfrac{|S \cap \hat{S}|}{|S|}$
     • nDCG: $\mathrm{nDCG}_k = \dfrac{\pi_{s_1} + \sum_{i=2}^{k} \frac{\pi_{s_i}}{\log_2 i}}{\hat{\pi}_1 + \sum_{i=2}^{k} \frac{\hat{\pi}_i}{\log_2 i}}$
     Diversity measures
     • $\ell$-step graph density: $\mathrm{dens}_\ell(S) = \dfrac{\sum_{u,v \in S,\, u \neq v} d_\ell(u, v)}{|S| \times (|S| - 1)}$
     • $\ell$-expansion ratio: $\sigma_\ell(S) = \dfrac{|N_\ell(S)|}{n}$, where $N_\ell(S) = S \cup \{v \in (V - S) : \exists u \in S,\ d(u, v) \le \ell\}$
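A minimal sketch of these measures on a toy graph may help make the notation concrete. Two readings are assumptions rather than statements from the slide: the hatted quantities ($\hat{S}$, $\hat{\pi}_i$) are taken to be the top-$k$ list by PPR score, and $d_\ell(u, v)$ is taken to be 1 when $v$ lies within $\ell$ hops of $u$ and 0 otherwise. The helper `expand` and the toy scores are hypothetical.

```python
import math


def expand(adj, S, ell):
    """N_l(S): S plus every vertex within ell hops of a member of S."""
    covered = set(S)
    for _ in range(ell):
        covered |= {w for v in covered for w in adj[v]}
    return covered


def rel(pi, S, k):
    top = sorted(pi.values(), reverse=True)[:k]   # assumed meaning of the hatted scores
    return sum(pi[v] for v in S) / sum(top)


def diff(S, S_hat):
    return 1.0 - len(set(S) & set(S_hat)) / len(S)


def ndcg(pi, ranked, k):
    def dcg(vals):
        # Position 1 is undiscounted; position i >= 2 is divided by log2(i).
        return vals[0] + sum(x / math.log2(i) for i, x in enumerate(vals[1:], start=2))
    ideal = sorted(pi.values(), reverse=True)[:k]
    return dcg([pi[v] for v in ranked[:k]]) / dcg(ideal)


def dens(adj, S, ell):
    S = list(S)
    hits = 0
    for u in S:
        reach = expand(adj, [u], ell)             # assumed: d_l(u, v) = 1 iff v in reach
        hits += sum(1 for v in S if v != u and v in reach)
    return hits / (len(S) * (len(S) - 1))


def sigma(adj, S, ell):
    return len(expand(adj, S, ell)) / len(adj)


# Path graph 0 - 1 - 2 - 3 - 4 with made-up relevance scores.
path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
pi = {0: 0.40, 1: 0.30, 2: 0.15, 3: 0.10, 4: 0.05}
top2, spread = [0, 1], [0, 4]
```

On this toy input the adjacent pair `top2` has full relevance but the highest density and the lower expansion ratio, while `spread` gives up relevance for coverage.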

  9. Bicriteria optimization measures
     • aggregate a relevance and a diversity measure
     • [Carbonell98]: $f_{MMR}(S) = (1 - \lambda) \sum_{v \in S} \pi_v - \lambda \sum_{u \in S} \max_{v \in S,\, u \neq v} \mathrm{sim}(u, v)$
     • [Li11]: $f_L(S) = \sum_{v \in S} \pi_v + \lambda \dfrac{|N(S)|}{n}$
     • [Vieira11]: $f_{MSD}(S) = (k - 1)(1 - \lambda) \sum_{v \in S} \pi_v + 2\lambda \sum_{u \in S} \sum_{v \in S,\, u \neq v} \mathrm{div}(u, v)$
     • max-sum diversification, max-min diversification, k-similar diversification set, etc. [Gollapudi09]
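The three aggregate objectives can be written down almost literally. In this sketch `sim` and `div` are caller-supplied functions, since the slide does not fix them; $N(S)$ is assumed to be $S$ plus its direct neighbors; and the double sum in $f_{MSD}$ runs over ordered pairs, as the slide's notation suggests. The toy similarity (1 for adjacent vertices, 0 otherwise) is purely illustrative.

```python
def f_mmr(pi, S, lam, sim):
    relevance = sum(pi[v] for v in S)
    # For each result, its similarity to the most similar other result.
    redundancy = sum(max((sim(u, v) for v in S if v != u), default=0.0) for u in S)
    return (1 - lam) * relevance - lam * redundancy


def f_l(pi, S, lam, adj):
    covered = set(S)
    for v in S:
        covered.update(adj[v])          # assumed: N(S) is the 1-step neighborhood
    return sum(pi[v] for v in S) + lam * len(covered) / len(adj)


def f_msd(pi, S, lam, div):
    k = len(S)
    relevance = sum(pi[v] for v in S)
    pairwise = sum(div(u, v) for u in S for v in S if u != v)   # ordered pairs
    return (k - 1) * (1 - lam) * relevance + 2 * lam * pairwise


# Path graph 0 - 1 - 2 - 3 - 4 with made-up relevance scores.
path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
pi = {0: 0.40, 1: 0.30, 2: 0.15, 3: 0.10, 4: 0.05}


def sim(u, v):
    return 1.0 if v in path[u] else 0.0   # hypothetical similarity


def div(u, v):
    return 1.0 - sim(u, v)


adjacent, spread = [0, 1], [0, 4]
```

With $\lambda = 0.5$, `f_mmr` and `f_msd` rank `spread` above `adjacent` on this input, whereas `f_l` prefers `adjacent`.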

  10. Bicriteria optimization is not the answer
      • Objective: diversify top-10 results
      • Two query-oblivious algorithms:
        – top-% + random
        – top-% + greedy-$\sigma_2$
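The slide names the two baselines without spelling them out. One plausible reading of "top-% + random", sketched below, keeps the given percentage of the top-$k$ list from the relevance ranking and fills the remaining slots with vertices drawn at random, without consulting the query. The function name, the rounding rule, and the toy scores are assumptions.

```python
import random


def top_percent_plus_random(pi, k, percent, rng):
    """Keep the top `percent`% of a top-k list and fill the rest at random.

    This is one reading of "top-% + random"; the slides do not define it.
    """
    ranked = sorted(pi, key=pi.get, reverse=True)
    n_top = k * percent // 100            # slots that keep their top-ranked item
    head = ranked[:n_top]
    # The remaining slots ignore the query: any other vertex may be drawn.
    tail = rng.sample(ranked[n_top:], k - n_top)
    return head + tail


pi = {"a": 0.30, "b": 0.25, "c": 0.20, "d": 0.10, "e": 0.08, "f": 0.07}
picked = top_percent_plus_random(pi, k=4, percent=50, rng=random.Random(0))
```

Under the same reading, "top-% + greedy-$\sigma_2$" would fill the remaining slots by greedily maximizing the 2-step expansion ratio instead of sampling.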

  11. Bicriteria optimization is not the answer
      • normalized relevance and 2-step graph density
      [Figure: two scatter plots with rel on the x-axis; one panel plots $\mathrm{dens}_2$ and the other $\sigma_2$ on the y-axis. Series: top-90%, top-75%, top-50%, top-25% + random, All random, and top-90%, top-75%, top-50%, top-25% + greedy-$\sigma_2$, All greedy-$\sigma_2$. Arrows mark the "better" direction.]
      • evaluating result diversification as a bicriteria optimization problem with
        – a relevance measure that ignores diversity, and
        – a diversity measure that ignores relevancy.

  12. A better measure? Combine both
      • We need a combined measure that tightly integrates both relevance and diversity aspects of the result set
      • goodness [Tong11]
        $$f_G(S) = 2 \sum_{i \in S} \pi_i - d \sum_{i,j \in S} A(j, i)\, \pi_j - (1 - d) \sum_{j \in S} \pi_j \sum_{i \in S} p^*(i)$$
        – first term: max-sum relevance
        – second term: penalize the score when two results share an edge
        – downside: highly dominated by relevance

  13. Proposed measure: $\ell$-step expanded relevance
      • a combined measure of:
        – $\ell$-step expansion ratio ($\sigma_2$)
        – relevance scores ($\pi$)
      • quantifies: relevance of the covered region of the graph
      • $\ell$-step expanded relevance:
        $$\mathrm{exprel}_\ell(S) = \sum_{v \in N_\ell(S)} \pi_v$$
        where $N_\ell(S)$ is the $\ell$-step expansion set of the result set $S$, and $\pi$ is the PPR scores of the items in the graph.
      • do some sanity check with this new measure
      [Figure: $\mathrm{exprel}_2$ against $k$ (5, 10, 20, 50, 100) for top-90%, top-75%, top-50%, top-25% + random, All random, and top-90%, top-75%, top-50%, top-25% + greedy-$\sigma_2$, All greedy-$\sigma_2$.]
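As a small illustration of the definition, the sketch below sums the scores of every vertex within $\ell$ hops of the result set. The adjacency-dict representation, the helper name `expand`, and the toy scores are assumptions.

```python
def expand(adj, S, ell):
    """N_l(S): S plus every vertex within ell hops of a member of S."""
    covered = set(S)
    for _ in range(ell):
        covered |= {w for v in covered for w in adj[v]}
    return covered


def exprel(adj, pi, S, ell):
    """l-step expanded relevance: total score of the region covered by S."""
    return sum(pi[v] for v in expand(adj, S, ell))


# Path graph 0 - 1 - 2 - 3 - 4 with made-up relevance scores.
path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
pi = {0: 0.40, 1: 0.30, 2: 0.15, 3: 0.10, 4: 0.05}

top2_score = exprel(path, pi, [0, 1], ell=1)     # covers {0, 1, 2}
spread_score = exprel(path, pi, [1, 3], ell=1)   # covers every vertex
```

Here the two most relevant vertices cover 0.85 of the total score, while the less redundant pair `[1, 3]` covers all of it, so the measure rewards relevance and coverage together.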

  14. Correlations of the measures
      [Figure: pairwise correlations of the measures, grouped into relevance and diversity measures.]
      • goodness is dominated by the relevancy measures
      • exprel has no high correlations with other relevance or diversity measures

  15. Proposed algorithm: Best Coverage
      • Can we use $\ell$-step expanded relevance as an objective function?
      • Define: $\mathrm{exprel}_\ell$-diversified top-$k$ ranking (DTR$_\ell$)
        $$S = \operatorname*{argmax}_{S' \subseteq V,\ |S'| = k} \mathrm{exprel}_\ell(S')$$
      • Complexity: generalization of weighted maximum coverage problem
        – NP-hard!
        – but $\mathrm{exprel}_\ell$ is a submodular function (Lemma 4.2)
        – a greedy solution (Algorithm 1) that selects the item with the highest marginal utility
          $$g(v, S) = \sum_{v' \in N_\ell(\{v\}) - N_\ell(S)} \pi_{v'}$$
          at each step is the best possible polynomial time approximation (proof based on [Nemhauser78])
      • Relaxation: computes BestCoverage on highest ranked vertices to improve runtime

      ALGORITHM 1: BestCoverage
      Input: $k, G, \pi, \ell$
      Output: a list of recommendations $S$
        $S = \emptyset$
        while $|S| < k$ do
          $v^* \leftarrow \operatorname{argmax}_v g(v, S)$
          $S \leftarrow S \cup \{v^*\}$
        return $S$

      ALGORITHM 2: BestCoverage (relaxed)
      Input: $k, G, \pi, \ell$
      Output: a list of recommendations $S$
        $S = \emptyset$
        Sort($V$) w.r.t. $\pi_i$ non-increasing
        $S_1 \leftarrow V[1..k']$, i.e., top-$k'$ vertices where $k' = k\bar{\delta}^{\ell}$
        $\forall v \in S_1$, $g(v) \leftarrow g(v, \emptyset)$
        $\forall v \in S_1$, $c(v) \leftarrow$ Uncovered
        while $|S| < k$ do
          $v^* \leftarrow \operatorname{argmax}_{v \in S_1} g(v)$
          $S \leftarrow S \cup \{v^*\}$
          $S_2 \leftarrow N_\ell(\{v^*\})$
          for each $v' \in S_2$ do
            if $c(v') =$ Uncovered then
              $S_3 \leftarrow N_\ell(\{v'\})$
              $\forall u \in S_3$, $g(u) \leftarrow g(u) - \pi_{v'}$
              $c(v') \leftarrow$ Covered
        return $S$
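Both algorithms can be sketched in a few lines of Python. This is a hedged illustration, not the authors' implementation: the graph is an adjacency dict, `k_prime` is passed in explicitly instead of being derived from the graph, already selected vertices are skipped explicitly in the argmax, and the coverage flags are kept as a set over all vertices. The toy scores are made up.

```python
def expand(adj, S, ell):
    """N_l(S): S plus every vertex within ell hops of a member of S."""
    covered = set(S)
    for _ in range(ell):
        covered |= {w for v in covered for w in adj[v]}
    return covered


def best_coverage(adj, pi, k, ell):
    """Greedy selection by marginal utility g(v, S), as in Algorithm 1."""
    S, covered = [], set()
    while len(S) < k:
        best, best_gain = None, -1.0
        for v in adj:
            if v in S:
                continue
            # g(v, S): score of the vertices v would newly cover.
            gain = sum(pi[w] for w in expand(adj, [v], ell) - covered)
            if gain > best_gain:
                best, best_gain = v, gain
        S.append(best)
        covered |= expand(adj, [best], ell)
    return S


def best_coverage_relaxed(adj, pi, k, ell, k_prime):
    """Restrict candidates to the k_prime highest-scored vertices (Algorithm 2)."""
    candidates = sorted(adj, key=pi.get, reverse=True)[:k_prime]
    gain = {v: sum(pi[w] for w in expand(adj, [v], ell)) for v in candidates}
    covered, S = set(), []
    while len(S) < k:
        best = max((v for v in candidates if v not in S), key=gain.get)
        S.append(best)
        for w in expand(adj, [best], ell):
            if w not in covered:
                covered.add(w)
                # w is now covered, so it no longer adds utility to anyone near it.
                for u in expand(adj, [w], ell):
                    if u in gain:
                        gain[u] -= pi[w]
    return S


# Path graph 0 - 1 - 2 - 3 - 4 - 5 with made-up relevance scores.
path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3, 5], 5: [4]}
pi = {0: 0.35, 1: 0.25, 2: 0.10, 3: 0.07, 4: 0.15, 5: 0.08}

greedy = best_coverage(path, pi, k=2, ell=1)
relaxed = best_coverage_relaxed(path, pi, k=2, ell=1, k_prime=4)
narrow = best_coverage_relaxed(path, pi, k=2, ell=1, k_prime=2)
```

On this input the greedy and the relaxed variant with `k_prime=4` both return `[1, 4]`, which covers the whole path, while `k_prime=2` can only choose among the two top-scored vertices and returns `[1, 0]`. The example shows the trade-off the relaxation makes: fewer candidates to score, at the risk of missing a vertex that covers a distant region.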
