SLIDE 1

Diversified Recommendation on Graphs: Pitfalls, Measures, and Algorithms

Onur Küçüktunç1,2 Erik Saule1 Kamer Kaya1 Ümit V. Çatalyürek1,3

WWW 2013, May 13–17, 2013, Rio de Janeiro, Brazil.

  • 1Dept. Biomedical Informatics
  • 2Dept. of Computer Science and Engineering
  • 3Dept. of Electrical and Computer Engineering

The Ohio State University

SLIDE 2

Kucuktunc et al. “Diversified Recommendation on Graphs: Pitfalls, Measures, and Algorithms”, WWW’13 2/25

Outline

  • Problem definition
    – Motivation
    – Result diversification algorithms
  • How to measure diversity
    – Classical relevance and diversity measures
    – Bicriteria optimization?!
    – Combined measures
  • Best Coverage method
    – Complexity, submodularity
    – A greedy solution, relaxation
  • Experiments
SLIDE 3

Problem definition

Let G = (V, E) be an undirected graph. Given a set of m seed nodes Q = {q1, . . . , qm} s.t. Q ⊆ V, and a parameter k, return a result set R ⊂ V of top-k items which are relevant to the ones in Q, but diverse among themselves, covering different aspects of the query.

Example applications:

  • Online shopping: product co-purchasing graph
    – query: one product, previous purchases, page visit history
    – output: product recommendations ("you might also like…")
  • Academic: paper-to-paper citation graph / collaboration network
    – query: paper/field of interest, set of references, the researcher himself/herself
    – output: references for related work, new collaborators
  • Social: friendship network
    – query: the user himself/herself, a set of people
    – output: friend recommendations ("you might also know…")

SLIDE 4

Problem definition

Let G = (V, E) be an undirected graph. Given a set of m seed nodes Q = {q1, . . . , qm} s.t. Q ⊆ V , and a parameter k, return top-k items which are relevant to the ones in Q, but diverse among themselves, covering different aspects of the query.

  • We assume that the graph itself is the only information we have, and no categories or intents are available
    – no comparisons to intent-aware algorithms [Agrawal09, Welch11, etc.]
    – but we will compare against intent-aware measures
  • Relevance scores are obtained with Personalized PageRank (PPR) [Haveliwala02], with the restart distribution

      p∗(v) = 1/m if v ∈ Q, 0 otherwise.
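Since everything downstream builds on these PPR scores, here is a minimal power-iteration sketch of PPR with the seed-restart prior above (the function name and damping factor d = 0.85 are my choices, not from the slides):

```python
def personalized_pagerank(adj, seeds, d=0.85, iters=100, tol=1e-10):
    """adj: {node: [neighbors]}; seeds: the query set Q.
    Returns the stationary distribution pi of a random walk that,
    with probability 1-d, restarts at a uniformly chosen seed."""
    restart = {v: (1.0 / len(seeds) if v in seeds else 0.0) for v in adj}
    pi = dict(restart)  # start from the restart distribution
    for _ in range(iters):
        new = {v: (1 - d) * restart[v] for v in adj}
        for u, nbrs in adj.items():
            if nbrs:
                share = d * pi[u] / len(nbrs)
                for v in nbrs:
                    new[v] += share
            else:
                # dangling node: send its mass back to the restart prior
                for v in adj:
                    new[v] += d * pi[u] * restart[v]
        delta = sum(abs(new[v] - pi[v]) for v in adj)
        pi = new
        if delta < tol:
            break
    return pi

# tiny path graph 0-1-2-3 with query Q = {0}: mass decays away from the seed
adj = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
pi = personalized_pagerank(adj, {0})
```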
SLIDE 5

Result diversification algorithms

  • GrassHopper [Zhu07]

– ranks the graph k times

  • turns the highest-ranked vertex into a sink node at each iteration

[Figure: toy example in four panels — (a) the initial PPR ranking with g1 highest; (b)–(d) after each highest-ranked vertex is turned into a sink, the next pick emerges: R = {g1}, then {g1, g2}, then {g1, g2, g3}.]
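The re-ranking loop can be sketched as follows. Note this is a loose stand-in: GrassHopper turns picked vertices into sinks of an absorbing random walk, while here we simply delete their edges, and degree centrality substitutes for the actual ranking function:

```python
def grasshopper_sketch(adj, k, rank_fn):
    """Iterative re-ranking: rank the graph, take the best unselected
    vertex, absorb it, re-rank. rank_fn(adj) -> {node: score}."""
    g = {u: list(nbrs) for u, nbrs in adj.items()}
    selected = []
    for _ in range(k):
        scores = rank_fn(g)
        v = max((u for u in g if u not in selected), key=lambda u: scores[u])
        selected.append(v)
        # crude "sink": drop all edges touching v so it stops pulling mass
        g = {u: ([] if u == v else [w for w in nbrs if w != v])
             for u, nbrs in g.items()}
    return selected

# demo: a star {0;1,2,3} plus an edge 4-5, ranked by degree centrality
degree = lambda g: {u: len(nbrs) for u, nbrs in g.items()}
adj = {0: [1, 2, 3], 1: [0], 2: [0], 3: [0], 4: [5], 5: [4]}
picks = grasshopper_sketch(adj, 2, degree)
```

After the hub 0 is absorbed, its satellites lose their score, so the second pick comes from the other component — the diversification effect the slide illustrates.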

SLIDE 6

Result diversification algorithms

  • GrassHopper [Zhu07]

– ranks the graph k times

  • turns the highest-ranked vertex into a sink node at each iteration
  • DivRank [Mei10]

– based on vertex-reinforced random walks (VRRW)

  • adjusts the transition matrix based on the number of visits to the

vertices (rich-gets-richer mechanism)

[Figure: a sample graph, its weighting with PPR, and the diverse weighting produced by DivRank.]
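A toy simulation of the vertex-reinforced walk behind DivRank (a sketch of the rich-gets-richer idea only; the paper works with a closed-form transition-matrix update, and all names and parameters here are mine):

```python
import random

def vrrw_sketch(adj, steps=30000, jump=0.25, rng_seed=7):
    """Vertex-reinforced random walk: the walker prefers neighbors it has
    already visited often, so visit mass concentrates on a few vertices
    instead of spreading over all high-degree ones."""
    rng = random.Random(rng_seed)
    nodes = list(adj)
    visits = {v: 1 for v in nodes}  # smoothing prior: every vertex seen once
    cur = nodes[0]
    for _ in range(steps):
        if rng.random() < jump or not adj[cur]:
            cur = rng.choice(nodes)  # organic / teleport move
        else:
            nbrs = adj[cur]
            # transition probability proportional to past visit counts
            cur = rng.choices(nbrs, weights=[visits[v] for v in nbrs])[0]
        visits[cur] += 1
    total = sum(visits.values())
    return {v: c / total for v, c in visits.items()}

# star graph: the hub absorbs most of the reinforced visit mass
scores = vrrw_sketch({0: [1, 2, 3], 1: [0], 2: [0], 3: [0]})
```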

SLIDE 7

Result diversification algorithms

  • GrassHopper [Zhu07]

– ranks the graph k times

  • turns the highest-ranked vertex into a sink node at each iteration
  • DivRank [Mei10]

– based on vertex-reinforced random walks (VRRW)

  • adjusts the transition matrix based on the number of visits to the

vertices (rich-gets-richer mechanism)

  • Dragon [Tong11]

– based on optimizing the goodness measure

  • punishes the score when two neighbors are included in the

results

SLIDE 8

Measuring diversity

Relevance measures

  • Normalized relevance: rel(S) = (Σ_{v∈S} π_v) / (Σ_{i=1}^{k} π̂_i), where π̂ holds the k highest PPR scores
  • Difference ratio: diff(S, Ŝ) = 1 − |S ∩ Ŝ| / |S|
  • nDCG_k = (π_{s_1} + Σ_{i=2}^{k} π_{s_i} / log₂ i) / (π̂_1 + Σ_{i=2}^{k} π̂_i / log₂ i)

Diversity measures

  • ℓ-step graph density: dens_ℓ(S) = (Σ_{u,v∈S, u≠v} d_ℓ(u, v)) / (|S| × (|S| − 1))
  • ℓ-expansion ratio: σ_ℓ(S) = |N_ℓ(S)| / n, where N_ℓ(S) = S ∪ {v ∈ (V − S) : ∃u ∈ S, d(u, v) ≤ ℓ}
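The two diversity measures can be sketched directly from the definitions above (helper names are mine; `adj` is an adjacency list, and d_ℓ(u, v) is taken to be 1 iff u and v are within ℓ hops):

```python
def expansion_set(adj, S, ell):
    """N_ell(S): S plus every vertex within ell hops of some u in S."""
    covered, frontier = set(S), set(S)
    for _ in range(ell):
        frontier = {v for u in frontier for v in adj[u]} - covered
        covered |= frontier
    return covered

def expansion_ratio(adj, S, ell):
    """sigma_ell(S) = |N_ell(S)| / n: fraction of the graph covered."""
    return len(expansion_set(adj, S, ell)) / len(adj)

def density(adj, S, ell):
    """dens_ell(S): fraction of ordered result pairs within ell hops of
    each other (lower = more diverse)."""
    S = list(S)
    hits = sum(1 for i, u in enumerate(S) for v in S[i + 1:]
               if v in expansion_set(adj, [u], ell))
    return 2 * hits / (len(S) * (len(S) - 1))

# path graph 0-1-2-3-4: the spread-out set {0, 4} covers more, packs less
adj = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
```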

SLIDE 9

Bicriteria optimization measures

  • aggregate a relevance and a diversity measure
  • [Carbonell98]
  • [Li11]
  • [Vieira11]
  • max-sum diversification, max-min diversification, k-similar diversification set, etc. [Gollapudi09]

  f_MMR(S) = (1 − λ) Σ_{v∈S} π_v − λ Σ_{u∈S} max_{v∈S, v≠u} sim(u, v)

  f_L(S) = Σ_{v∈S} π_v + λ |N(S)| / n

  f_MSD(S) = (k − 1)(1 − λ) Σ_{v∈S} π_v + 2λ Σ_{u∈S} Σ_{v∈S, v≠u} div(u, v)
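For the MMR-style objective, the standard greedy optimizer adds one item at a time; a sketch with hypothetical `scores` (PPR values) and `sim` inputs:

```python
def mmr_select(scores, sim, k, lam=0.5):
    """Greedy MMR: each step adds the item maximizing
    (1 - lam) * relevance - lam * max similarity to the selected set."""
    S, candidates = [], set(scores)
    for _ in range(min(k, len(scores))):
        def marginal(u):
            penalty = max((sim(u, v) for v in S), default=0.0)
            return (1 - lam) * scores[u] - lam * penalty
        best = max(candidates, key=marginal)
        S.append(best)
        candidates.remove(best)
    return S

# 'a' and 'b' are near-duplicates; MMR keeps one and adds the distinct 'c'
scores = {"a": 1.0, "b": 0.9, "c": 0.5}
sim = lambda u, v: 1.0 if {u, v} == {"a", "b"} else 0.0
```

With λ = 0 the selection degenerates to plain top-k by relevance, which is exactly the behavior the bicriteria discussion warns about.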

SLIDE 10

Bicriteria optimization is not the answer

  • Objective: diversify top-10 results
  • Two query-oblivious algorithms:

  – top-% + random
  – top-% + greedy-σ2

SLIDE 11

Bicriteria optimization is not the answer

  • normalized relevance and 2-step graph density
  • evaluating result diversification as a bicriteria optimization problem with
    – a relevance measure that ignores diversity, and
    – a diversity measure that ignores relevancy

  [Figure: rel vs. dens2 and rel vs. σ2 for top-90/75/50/25% + random and + greedy-σ2 (arrows mark the "better" direction); the query-oblivious baselines look competitive under both bicriteria views.]

SLIDE 12

A better measure? Combine both

  • We need a combined measure that tightly integrates both relevance and diversity aspects of the result set
  • goodness [Tong11]
    – downside: highly dominated by relevance

  f_G(S) = 2 Σ_{i∈S} π_i − d Σ_{i,j∈S} A(j, i) π_j − (1 − d) Σ_{j∈S} π_j Σ_{i∈S} p∗(i)

  (the first term is max-sum relevance; the other terms penalize the score when two results share an edge)
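A sketch of evaluating the goodness formula above, assuming A is the degree-normalized adjacency matrix (the slide leaves A implicit, so that normalization is my assumption):

```python
def goodness(S, pi, adj, restart, d=0.85):
    """f_G(S) = 2*sum pi_i - d*sum A(j,i)*pi_j - (1-d)*sum pi_j * sum p*(i),
    with A(j, i) = 1/deg(j) for an edge j-i (assumed normalization).
    adj: {node: set(neighbors)}; restart: the PPR prior p*."""
    rel = 2 * sum(pi[i] for i in S)
    edge_pen = d * sum(pi[j] / len(adj[j])
                       for j in S for i in S if i in adj[j])
    prior_pen = (1 - d) * sum(pi[j] for j in S) * sum(restart[i] for i in S)
    return rel - edge_pen - prior_pen

# 4-cycle 0-1-2-3: the adjacent pair {0, 1} is penalized, the opposite
# pair {0, 2} is not, so goodness prefers the non-neighboring results
adj = {0: {1, 3}, 1: {0, 2}, 2: {1, 3}, 3: {0, 2}}
pi = {v: 0.25 for v in adj}
restart = {v: 0.25 for v in adj}
```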

SLIDE 13

Proposed measure: l-step expanded relevance

  • a combined measure of
    – ℓ-step expansion ratio (σℓ)
    – relevance scores (π)
  • quantifies the relevance of the covered region of the graph
  • sanity check with this new measure:

  ℓ-step expanded relevance: exprel_ℓ(S) = Σ_{v∈N_ℓ(S)} π_v, where N_ℓ(S) is the ℓ-step expansion set of the result set S, and π holds the PPR scores of the items in the graph.

  [Figure: exprel2 vs. k for the top-% + random and top-% + greedy-σ2 baselines; under exprel2 the query-oblivious baselines no longer look competitive.]
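exprelℓ is straightforward to evaluate: expand the result set ℓ hops, then sum the PPR mass it covers (a sketch; names are mine):

```python
def exprel(adj, pi, S, ell=2):
    """exprel_ell(S): total PPR mass pi over the ell-step expansion set
    N_ell(S) -- the relevance of the region the result set covers."""
    covered, frontier = set(S), set(S)
    for _ in range(ell):
        frontier = {v for u in frontier for v in adj[u]} - covered
        covered |= frontier
    return sum(pi[v] for v in covered)

# path 0-1-2-3-4 with all the PPR mass near the seed end: covering a large
# but irrelevant region (around 4) scores poorly, unlike plain coverage
adj = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
pi = {0: 0.4, 1: 0.3, 2: 0.2, 3: 0.1, 4: 0.0}
```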

SLIDE 14

Correlations of the measures

relevance diversity

goodness is dominated by the relevancy measures exprel has no high correlations with

  • ther relevance or diversity measures
SLIDE 15

Proposed algorithm: Best Coverage

  • Can we use ℓ-step expanded relevance as an objective function?
  • Define the exprelℓ-diversified top-k ranking (DTRℓ):

      S = argmax_{S′ ⊆ V, |S′| = k} exprel_ℓ(S′)

  • Complexity: a generalization of the weighted maximum coverage problem
    – NP-hard!
    – but exprelℓ is a submodular function (Lemma 4.2)
    – a greedy solution (Algorithm 1) that selects the item with the highest marginal utility at each step is the best possible polynomial-time approximation (proof based on [Nemhauser78])
  • Relaxation: computes BestCoverage only on the highest-ranked vertices to improve runtime

  The marginal utility of v w.r.t. the current result set S is g(v, S) = Σ_{v′ ∈ N_ℓ({v}) − N_ℓ(S)} π_{v′}.

  ALGORITHM 1: BestCoverage
    Input: k, G, π, ℓ
    Output: a list of recommendations S
    S ← ∅
    while |S| < k do
      v∗ ← argmax_v g(v, S)
      S ← S ∪ {v∗}
    return S

  ALGORITHM 2: BestCoverage (relaxed)
    Input: k, G, π, ℓ
    Output: a list of recommendations S
    S ← ∅
    Sort(V) w.r.t. π_i, non-increasing
    S1 ← V[1..k′], i.e., the top-k′ vertices, where k′ = k·δ̄^ℓ
    ∀v ∈ S1: g(v) ← g(v, ∅); c(v) ← Uncovered
    while |S| < k do
      v∗ ← argmax_{v∈S1} g(v)
      S ← S ∪ {v∗}
      S2 ← N_ℓ({v∗})
      for each v′ ∈ S2 do
        if c(v′) = Uncovered then
          S3 ← N_ℓ({v′})
          ∀u ∈ S3: g(u) ← g(u) − π_{v′}
          c(v′) ← Covered
    return S
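Algorithm 1 can be sketched directly, recomputing the marginal gain g(v, S) naively instead of using the relaxed bookkeeping of Algorithm 2 (names are mine):

```python
def n_ell(adj, S, ell):
    """ell-step expansion set N_ell(S)."""
    covered, frontier = set(S), set(S)
    for _ in range(ell):
        frontier = {v for u in frontier for v in adj[u]} - covered
        covered |= frontier
    return covered

def best_coverage(adj, pi, k, ell=1):
    """Greedy (1 - 1/e)-approximation of DTR_ell: each step picks the
    vertex whose ell-neighborhood adds the most uncovered PPR mass."""
    S, covered = [], set()
    for _ in range(k):
        gains = {v: sum(pi[u] for u in n_ell(adj, [v], ell) - covered)
                 for v in adj if v not in S}
        v_star = max(gains, key=gains.get)
        S.append(v_star)
        covered |= n_ell(adj, [v_star], ell)
    return S

# two triangles {0,1,2} and {3,4,5}; the first pick covers the heavier
# triangle entirely, so the second pick jumps to the other one
adj = {0: {1, 2}, 1: {0, 2}, 2: {0, 1}, 3: {4, 5}, 4: {3, 5}, 5: {3, 4}}
pi = {0: 0.3, 1: 0.2, 2: 0.2, 3: 0.1, 4: 0.1, 5: 0.1}
S = best_coverage(adj, pi, k=2, ell=1)
```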

SLIDE 16

Experiments

  • 5 target application areas, 5 graphs from SNAP
  • Queries generated based on 3 scenario types
    – one random vertex
    – random vertices from one area of interest
    – multiple vertices from multiple areas of interest

  Dataset           |V|      |E|     δ̄     D   D90%  CC
  amazon0601        403.3K   3.3M    16.8  21  7.6   0.42
  ca-AstroPh        18.7K    396.1K  42.2  14  5.1   0.63
  cit-Patents       3.7M     16.5M   8.7   22  9.4   0.09
  soc-LiveJournal1  4.8M     68.9M   28.4  18  6.5   0.31
  web-Google        875.7K   5.1M    11.6  22  8.1   0.60

SLIDE 17

Results – relevance

  • Methods should trade off relevance for better diversity
  • Normalized relevance of the top-k set is always 1
  • DRAGON always returns results sharing 70% of their items with top-k, with more than 80% rel score

  [Figure: rel vs. k on amazon0601 and soc-LiveJournal1 (combined scenario) for PPR (top-k), GrassHopper, Dragon, PDivRank, CDivRank, k-RLM, GSparse, BC1, BC2, and the relaxed BC variants.]

SLIDE 18

Results – coverage

  • ℓ-step expansion ratio (σ2) gives the graph coverage of the result set: better coverage = better diversity
  • BestCoverage and DivRank variants, especially BC2 and PDivRank, have the highest coverage

  [Figure: σ2 vs. k on amazon0601 and ca-AstroPh (combined scenario) for the same algorithms.]

SLIDE 19

Results – expanded relevance

  • a combined measure for relevance and diversity
  • BestCoverage variants and GrassHopper perform better
  • although PDivRank gives the highest coverage on the amazon graph, it fails to cover the relevant parts!

  [Figure: exprel2 vs. k on amazon0601 and soc-LiveJournal1 (combined scenario); annotations highlight BC2, BC1, GrassHopper, and PDivRank.]

SLIDE 20

Results – efficiency

  • BC1 always performs better, with a running time less than DivRank and GrassHopper
  • BC1 (relaxed) offers reasonable diversity, with very little overhead on top of the PPR computation

  [Figure: running time (sec) vs. k on ca-AstroPh and soc-LiveJournal1 (combined scenario) and on cit-Patents and web-Google (scenario 1); BC1 runs faster than the DivRank variants, and BC1 (relaxed) stays close to plain PPR.]

SLIDE 21

Results – intent aware experiments

  • evaluation of intent-oblivious algorithms against intent-aware measures
  • two measures
    – group coverage [Li11]
    – S-recall [Zhai03]
  • the cit-Patents dataset has the categorical information
    – 426 class labels, belonging to 36 subtopics

SLIDE 22

Results – intent aware experiments

  • group coverage [Li11]
    – How many different groups are covered by the results?
    – omits the actual intent of the query
  • top-k results are not diverse enough
  • AllRandom results cover the largest number of groups
  • PDivRank and BC2 follow

  [Figure: class coverage and subtopic coverage vs. k for PPR (top-k), Dragon, P/CDivRank, k-RLM, the BC variants, and AllRandom; AllRandom is highest, followed by PDivRank and BC2, with top-k lowest.]

SLIDE 23

[Figure: S-recall (class) and S-recall (subtopic) vs. k.]

Results – intent aware experiments

  • S-recall [Zhai03], Intent-coverage [Zhu11]
    – percentage of relevant subtopics covered by the result set
    – the intent is given by the classes of the seed nodes
  • AllRandom brings irrelevant items from the search space
  • top-k results do not have the necessary diversity
  • BC2 variants and BC1 perform better than DivRank
  • BC1 (relaxed) and DivRank score similarly, but BC1 (relaxed) is much faster

SLIDE 24

Conclusions

  • Result diversification should not be evaluated as a bicriteria optimization problem with
    – a relevance measure that ignores diversity, and
    – a diversity measure that ignores relevancy
  • ℓ-step expanded relevance is a simple measure that combines both relevance and diversity
  • BestCoverage, a greedy solution that maximizes exprelℓ, is a (1 − 1/e)-approximation of the optimal solution
  • BestCoverage variants perform better than the others, and the relaxation is extremely efficient
  • goodness in DRAGON is dominated by relevancy
  • DivRank variants implicitly optimize the expansion ratio
SLIDE 25

Thank you

  • For more information visit http://bmi.osu.edu/hpc
  • Research at the HPC Lab is funded by