Ranking prediction by online learning


SLIDE 1

Ranking prediction by online learning

Róbert Pálovics

Informatics Laboratory, Department of Computer and Automation Research Institute, Hungarian Academy of Sciences

https://dms.sztaki.hu/en

July 2, 2015

SLIDE 2

OUTLINE

◮ Online ranking prediction
◮ Exploiting social influence in online RS
◮ Location-aware online learning

SLIDE 3

RECOMMENDER SYSTEMS

◮ Utility matrix R, with only a few known values
◮ Rating prediction vs. ranking prediction
◮ Explicit vs. implicit data
◮ Collaborative filtering vs. content-based

SLIDE 4

ONLINE RANKING PREDICTION

◮ Online recommendation

– after each event, recommend a new top list of items
– after each event, update the recommender model
– implicit data

◮ Temporal evaluation

– for each tuple <u, i, t> (user, item, timestamp)
– evaluate the single tuple in question against the recommended top list

◮ Iterate over the dataset only once

(Figure: timeline of <u, i, t> tuples)
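The evaluate-then-update protocol above can be sketched in Python. The `model` interface (`top_list`, `update`) is an assumption for illustration, not part of the slides; DCG@K with a single relevant item is used as the per-event score.

```python
import math

def evaluate_online(events, model, K=10):
    """Single pass over (user, item, timestamp) tuples in temporal order:
    score the current model on each event first, then learn from it."""
    total, n = 0.0, 0
    for user, item, t in sorted(events, key=lambda e: e[2]):
        toplist = model.top_list(user, K)       # recommend before seeing the event
        if item in toplist:
            rank = toplist.index(item) + 1      # 1-based rank of the relevant item
            total += 1.0 / math.log2(rank + 1)  # DCG@K with one relevant item
        n += 1
        model.update(user, item, t)             # then update the online model
    return total / n                            # average DCG@K over the timeline
```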

SLIDE 5

ONLINE RANKING PREDICTION

◮ Evaluate the single tuple in question against the recommended top list
◮ There is only one relevant item, so use

DCG@K(i) = 0 if rank(i) > K; 1 / log2(rank(i) + 1) otherwise.

(Figure: rank(i) of item i in the top list recommended for <u, i, t>)
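The single-relevant-item DCG above is a one-liner; a minimal sketch:

```python
import math

def dcg_at_k(rank, K):
    """DCG@K with exactly one relevant item: 0 if the item is ranked
    below position K, else 1 / log2(rank + 1) for a 1-based rank."""
    if rank > K:
        return 0.0
    return 1.0 / math.log2(rank + 1)
```

A hit at rank 1 scores 1.0, and the score decays logarithmically down the list.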

SLIDE 6

MATRIX FACTORIZATION

◮ Model: R̂ = P · Q, where P ∈ ℝ^(n×k) and Q ∈ ℝ^(k×m), so r̂_ui = p_u · q_i
◮ Objective: mean squared error (MSE), for (u, i) ∈ Tr

F_ui = (r_ui − r̂_ui)²

◮ Optimization: stochastic gradient descent (SGD)

p_u ← p_u − lrate · ∂F/∂p_u = p_u − lrate · Err · q_i

(Figure: utility matrix R factorized into user factors P and item factors Q)
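One SGD step on F_ui can be sketched as below; dense Python lists stand in for P and Q, Err is taken as (r_ui − r̂_ui) with the constant factor 2 folded into the learning rate, and the optional `reg` regularization term is an addition not shown on the slide.

```python
def sgd_step(P, Q, u, i, r_ui, lrate=0.05, reg=0.0):
    """One SGD update on F_ui = (r_ui - p_u . q_i)^2.
    P is a list of user factor vectors (n x k); Q is k x m."""
    k = len(P[u])
    pred = sum(P[u][f] * Q[f][i] for f in range(k))  # r̂_ui = p_u · q_i
    err = r_ui - pred                                # Err (sign folded in)
    for f in range(k):
        pu, qi = P[u][f], Q[f][i]
        P[u][f] += lrate * (err * qi - reg * pu)     # update user factor
        Q[f][i] += lrate * (err * pu - reg * qi)     # update item factor
    return err
```

Repeating the step on the same (u, i) pair drives the prediction toward r_ui.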

SLIDE 7

ONLINE MATRIX FACTORIZATION

◮ Single iteration over the training data in temporal order
◮ Update after each new element
◮ High learning rates
◮ More emphasis on recent events
◮ Works well on non-stationary datasets

SLIDE 8

NETWORK INFLUENCE

◮ User-User social graph + User-Item activity time series (bipartite graph)
◮ Detect social influences and influential pairs
◮ Improve top-k recommendation

(Figure: social network of users u and v next to their activity time series)

SLIDE 9

LAST.FM

◮ Online music-based social networking service
◮ "Scrobbling": collecting the listening activity of users
◮ Music recommendation system
◮ Social network
◮ Users see each other's scrobbling activity

SLIDE 10

INFLUENCE PROBABILITY

◮ Key concept: influence between neighbors u and v

– subsequent scrobble: v scrobbles artist a, then u scrobbles a within Δt ≤ t (written v →(a; Δt≤t) u)
– and the reason is influence

◮ Influence probability

P(Influence, v →(a; Δt≤t) u) = P(Influence | v →(a; Δt≤t) u) · P(v →(a; Δt≤t) u)

(Figure: scrobble time series of users v and u; a subsequent scrobble of the same artist marks possible influence)

SLIDE 11

INFLUENCE PROBABILITY, LEFT TERM

P(Influence, v →(a; Δt≤t) u) = P(Influence | v →(a; Δt≤t) u) · P(v →(a; Δt≤t) u)

◮ Approximation by measurements:

P(Influence | v →(a; Δt≤t) u) ≈ P(Influence | Δt ≤ t) ≈ 1 − c · log t

◮ A slowly decreasing logarithmic function
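The measured left term is a simple decay in t; a minimal sketch, where the constant c and the natural logarithm are illustrative assumptions (the slides do not give the fitted value or the log base), clipped so the result stays a valid probability:

```python
import math

def influence_given_dt(t, c=0.05):
    """P(Influence | Δt <= t) ≈ 1 - c·log t, a slowly decreasing
    function of the time gap threshold t (c is a placeholder constant)."""
    return min(1.0, max(0.0, 1.0 - c * math.log(t)))
```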

SLIDE 12

INFLUENCE PROBABILITY, RIGHT TERM

P(Influence, v →(a; Δt≤t) u) = P(Influence | v →(a; Δt≤t) u) · P(v →(a; Δt≤t) u)

◮ Probability of the event v →(a; Δt≤t) u in the time series
◮ Learned by modeling

(Figure: scrobbles of artist a by users v and u in the time series)

SLIDE 13

EXPERIMENTS - ABOUT LAST.FM

◮ Available to us under an NDA with Last.fm; selection criteria applied
◮ Structure: social network + scrobbling time series

– 71,000 users, 285,241 edges
– 2-year scrobble timeline, 2,073,395 artists
– between 1 January 2010 and 31 December 2011
– 979,391,001 scrobbles
– 57,274,158 first-time scrobbles

◮ We train factor models only on the first-time scrobbles
◮ Artists with popularity less than 14 are excluded
◮ Evaluation on each first-time scrobble in the second year

SLIDE 14

EXPERIMENTS - FINAL COMBINATION

◮ Factor and influence models combine well; the average improvement is

– 7% for DCG@10

(Figure: average DCG@10 over time (days); the factor + influence combination stays above the plain factor model)

SLIDE 15

LOCATION-AWARE ONLINE LEARNING

◮ Twitter dataset
◮ Temporal hashtag recommendation
◮ Twitter: highly non-stationary data
◮ (u, h, l, t) tuples with geoinfo
◮ Idea: tree structure of geographical areas

number of records: 6,978,478
number of unique user-hashtag pairs: 2,993,183
number of users: 792,860
number of items: 268,489
number of countries: 49

SLIDE 16

TREE CONSTRUCTION

◮ 214,230 nodes containing 190,315 leaves. ◮ The depth of the tree is 6 ◮ The hashtag time series data covered 30,450 leaves from

the whole tree. World Europe Austria Vienna ... ... Graz ... Germany ... Africa Asia ...
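A minimal sketch of this tree: each node stores its parent, and Path(l) walks from a leaf up to the root. The node names are the illustrative ones from the figure, not the full geographic hierarchy.

```python
# Parent map for a tiny fragment of the geographic tree (illustrative names).
parent = {
    "Vienna": "Austria", "Graz": "Austria",
    "Austria": "Europe", "Germany": "Europe",
    "Europe": "World", "Africa": "World", "Asia": "World",
}

def path(leaf):
    """Return the nodes from a leaf up to the root, i.e. Path(l)."""
    nodes = [leaf]
    while nodes[-1] in parent:          # the root has no parent entry
        nodes.append(parent[nodes[-1]])
    return nodes
```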

SLIDE 17

RECENCY

(Figure: log-log histogram of inter-event times, N(IET = t) against t (sec))

P(τ = t) = (α − 1) · t^(−α) and P(1 ≤ τ ≤ t) = 1 − t^(1−α)

P(t < τ ≤ t + Δt | τ > t) = [P(τ ≤ t + Δt) − P(τ ≤ t)] / [1 − P(τ ≤ t)]
                          = 1 − (1 + Δt/t)^(1−α)
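The conditional recency probability above is a one-line formula; a sketch, where the default α is an illustrative value rather than the one fitted on the Twitter data:

```python
def recency_prob(t, dt, alpha=1.5):
    """P(t < τ <= t + Δt | τ > t) = 1 - (1 + Δt/t)^(1-α) for a power-law
    inter-event time distribution P(τ = t) = (α - 1)·t^(-α)."""
    return 1.0 - (1.0 + dt / t) ** (1.0 - alpha)
```

For α = 2 this reduces to Δt / (t + Δt), so the probability of a repeat event shrinks as the time since the last occurrence grows.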

SLIDE 18

MODELING

◮ Online MF as a baseline → NOT working!
◮ Tree + Recency + Bias model:

r̂(u, h, t, l) = Σ_{n ∈ Path(l)} ŵ_n · f(t − t_{n,h})

◮ Node biases ŵ_n learned with SGD
◮ ŵ_n already includes node reliability and popularity
◮ Different heuristic baselines
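The scoring sum over Path(l) can be sketched as below. The slides do not spell out the recency function f, so a power-law decay f(x) = (1 + x)^(−α) is assumed here for illustration; `w` holds the learned node biases ŵ_n and `last_seen[(n, h)]` is t_{n,h}, the last time hashtag h occurred at node n.

```python
def score(leaf, hashtag, t, w, last_seen, parent, alpha=1.5):
    """r̂(u, h, t, l) = sum over n in Path(l) of ŵ_n · f(t - t_{n,h}),
    walking the parent map from the leaf up to the root."""
    s, node = 0.0, leaf
    while node is not None:
        t_nh = last_seen.get((node, hashtag))
        if t_nh is not None:                              # h was seen at this node
            s += w.get(node, 0.0) * (1.0 + (t - t_nh)) ** (-alpha)
        node = parent.get(node)                           # climb toward the root
    return s
```

Hashtags never seen anywhere on the path score zero, so the model naturally falls back to broader regions when a leaf has no history.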

SLIDE 19

RESULTS

(Figure: average cumulative DCG@100 over time (days) for the models: world, leaves, countries, countries without recency, tree, and tree with learned node weights)