ranking prediction by online learning



  1. Ranking prediction by online learning Róbert Pálovics Informatics Laboratory, Department of Computer and Automation Research Institute, Hungarian Academy of Sciences https://dms.sztaki.hu/en July 2, 2015

  2. OUTLINE
     ◮ Online ranking prediction
     ◮ Exploiting social influence in online RS
     ◮ Location-aware online learning

  3. RECOMMENDER SYSTEMS
     ◮ Utility matrix R, only a few known values
     ◮ Rating prediction vs. ranking prediction
     ◮ Explicit vs. implicit data
     ◮ Collaborative filtering vs. content-based

  4. ONLINE RANKING PREDICTION
     ◮ Online recommendation
       – after each event, recommend a new top list of items
       – after each event, update the recommender model
       – implicit data
     ◮ Temporal evaluation
       – for each tuple <u, i, t> (user, item, timestamp)
       – evaluate the given single tuple against the recommended top list
     ◮ Iterate over the dataset only once
     [Figure: <u, i, t> tuples along a time axis]
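
The recommend, evaluate, update cycle described on this slide can be sketched as a single pass over the events. This is a minimal illustration, not the author's implementation: the popularity counter is a stand-in model chosen for brevity, the hit rate replaces the DCG measure used in the slides, and the name `online_evaluate` is hypothetical.

```python
from collections import Counter

def online_evaluate(events, k=10):
    """Single pass over (user, item, timestamp) events in temporal order.

    For each event: recommend a top-k list, score the event against it,
    and only then update the model with that event.
    """
    popularity = Counter()  # stand-in model (assumption): global item popularity
    hits = 0
    for user, item, timestamp in sorted(events, key=lambda e: e[2]):
        top_list = [i for i, _ in popularity.most_common(k)]  # 1. recommend
        hits += item in top_list                               # 2. evaluate
        popularity[item] += 1                                  # 3. update
    return hits / len(events) if events else 0.0

events = [("u1", "a", 1), ("u2", "a", 2), ("u3", "b", 3), ("u1", "b", 4)]
hit_rate = online_evaluate(events, k=10)
```

Because the model is updated only after the event has been scored, no event is ever evaluated against a model that has already seen it.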

  5. ONLINE RANKING PREDICTION
     ◮ Evaluate the given single tuple against the recommended top list
     ◮ There is only one single relevant item, so use
       $$\mathrm{DCG@K}(i) = \begin{cases} 0 & \text{if } \mathrm{rank}(i) > K, \\ \dfrac{1}{\log_2(\mathrm{rank}(i) + 1)} & \text{otherwise.} \end{cases}$$
     [Figure: position rank(i) of item i in the top list for <u, i, t>]
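
With a single relevant item, the DCG@K measure on this slide reduces to a function of that item's rank. A minimal sketch, assuming the top list is an ordered Python list with the best item first (the function name `dcg_at_k` is illustrative):

```python
import math

def dcg_at_k(top_list, item, k):
    """DCG@K when `item` is the only relevant item for the event."""
    if item not in top_list[:k]:
        return 0.0                       # rank(i) > K, or item not listed at all
    rank = top_list.index(item) + 1      # ranks are 1-based
    return 1.0 / math.log2(rank + 1)

first = dcg_at_k(["a", "b", "c"], "a", k=10)   # item at rank 1
third = dcg_at_k(["a", "b", "c"], "c", k=10)   # item at rank 3
```

The score is 1 when the item tops the list and decays logarithmically with its rank.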

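  6. MATRIX FACTORIZATION
     ◮ Model: $\hat{R} = P \cdot Q$, where $P \in \mathbb{R}^{n \times k}$ and $Q \in \mathbb{R}^{k \times m}$, so that $\hat{r}_{ui} = p_u \cdot q_i$
     ◮ Objective: mean squared error (MSE), for $(u, i) \in Tr$: $F_{ui} = (r_{ui} - \hat{r}_{ui})^2$
     ◮ Optimization: stochastic gradient descent (SGD)
       $$p_u \leftarrow p_u - \mathit{lrate} \cdot \frac{\partial F}{\partial p_u} = p_u - \mathit{lrate} \cdot \mathit{Err} \cdot q_i$$
       (here $\mathit{Err} = 2(\hat{r}_{ui} - r_{ui})$; the constant factor 2 is often absorbed into lrate)
     [Figure: users × items matrix with entry r, user factor p and item factor q]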
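
The SGD update on this slide can be sketched for one observed (u, i) pair as follows. The slide shows only the update of p_u; the symmetric update of q_i is an assumption here, as is taking Err as the prediction minus the rating with the factor 2 absorbed into the learning rate. The name `sgd_step` is hypothetical.

```python
def sgd_step(p_u, q_i, r_ui, lrate=0.1):
    """One SGD update of the user and item factors for a single (u, i) pair."""
    r_hat = sum(p * q for p, q in zip(p_u, q_i))   # prediction p_u . q_i
    err = r_hat - r_ui                             # Err, factor 2 absorbed into lrate
    new_p = [p - lrate * err * q for p, q in zip(p_u, q_i)]
    new_q = [q - lrate * err * p for p, q in zip(p_u, q_i)]  # assumed symmetric update
    return new_p, new_q

p_u, q_i = [0.1, 0.2], [0.3, 0.4]
new_p, new_q = sgd_step(p_u, q_i, r_ui=1.0)
```

Both new vectors are computed from the old values, so the two updates do not interfere with each other within a step.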

  7. ONLINE MATRIX FACTORIZATION
     ◮ Single iteration over the training data in temporal order
     ◮ Update after each new element
     ◮ High learning rates
     ◮ More emphasis on recent events
     ◮ Works well on non-stationary datasets

  8. NETWORK INFLUENCE
     ◮ User-user social graph + user-item activity time series (bipartite graph)
     ◮ Detect social influences, influential pairs
     ◮ Improve top-k recommendation
     [Figure: activity time series of users u and v, who are connected in the social network]

  9. LAST.FM
     ◮ Online music-based social networking service
     ◮ "Scrobbling": collecting the listening activity of users
     ◮ Music recommendation system
     ◮ Social network
     ◮ Users see each other's scrobbling activity

  10. INFLUENCE PROBABILITY
      ◮ Key concept: influence between neighbors u and v
        – subsequent scrobble, $v \xrightarrow{a;\,\Delta t \le t} u$
        – and the reason is influence
      ◮ Influence probability
        $$P(\text{Influence},\ v \xrightarrow{a;\,\Delta t \le t} u) = P(\text{Influence} \mid v \xrightarrow{a;\,\Delta t \le t} u) \cdot P(v \xrightarrow{a;\,\Delta t \le t} u)$$
      [Figure: scrobble timelines of users v and u; a scrobble <v, a, t_v> is followed by the subsequent scrobble <u, a, t_u>, a possible influence]

  11. INFLUENCE PROBABILITY, LEFT TERM
      $$P(\text{Influence},\ v \xrightarrow{a;\,\Delta t \le t} u) = P(\text{Influence} \mid v \xrightarrow{a;\,\Delta t \le t} u) \cdot P(v \xrightarrow{a;\,\Delta t \le t} u)$$
      ◮ Approximation by measurements
        $$P(\text{Influence} \mid v \xrightarrow{a;\,\Delta t \le t} u) \approx P(\text{Influence} \mid \Delta t \le t) \approx 1 - c \log t$$
      ◮ Slowly decreasing logarithmic function

  12. INFLUENCE PROBABILITY, RIGHT TERM
      $$P(\text{Influence},\ v \xrightarrow{a;\,\Delta t \le t} u) = P(\text{Influence} \mid v \xrightarrow{a;\,\Delta t \le t} u) \cdot P(v \xrightarrow{a;\,\Delta t \le t} u)$$
      ◮ Probability of the event $v \xrightarrow{a;\,\Delta t \le t} u$ in the time series
      ◮ Learned by modeling
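
The product of the left and right terms can be sketched as follows. Several details are assumptions rather than the author's choices: the value of the constant c, the use of the natural logarithm, the clipping to [0, 1], and the function names. The right term is passed in as a number because the slides only say that it is learned by modeling.

```python
import math

def influence_given_delay(t, c=0.05):
    """Left term, approximated as 1 - c*log(t) for a delay bound t > 0.

    Clipped to [0, 1] so that it stays a valid probability (an assumption).
    """
    return min(1.0, max(0.0, 1.0 - c * math.log(t)))

def influence_probability(t, p_subsequent, c=0.05):
    """Joint probability: left term times the modeled right term."""
    return influence_given_delay(t, c) * p_subsequent

within_minute = influence_probability(60, p_subsequent=0.2)
within_hour = influence_probability(3600, p_subsequent=0.2)
```

With the right term held fixed, a longer delay bound gives a lower influence probability, matching the slowly decreasing logarithmic shape of the left term.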

  13. EXPERIMENTS - ABOUT LAST.FM
      ◮ Available to us under an NDA with Last.fm, selection criteria
      ◮ Structure: network + scrobbling time series
        – 71,000 users, 285,241 edges
        – 2-year scrobble timeline, 2,073,395 artists
        – between 01 January 2010 and 31 December 2011
        – 979,391,001 scrobbles
        – 57,274,158 first-time scrobbles
      ◮ We train factor models only on the first-time scrobbles
      ◮ Artists with popularity less than 14 are excluded
      ◮ Evaluation on each first-time scrobble in the second year

  14. EXPERIMENTS - FINAL COMBINATION
      ◮ Factor and influence models combine well; the average improvement is
        – 7% for DCG@10
      [Figure: average DCG (K = 10) vs. time (days, 0 to 100) for the factor and factor + influence models]

  15. LOCATION-AWARE ONLINE LEARNING
      ◮ Twitter dataset
      ◮ Temporal hashtag recommendation
      ◮ Twitter: highly non-stationary data
      ◮ (u, h, l, t) tuples with geoinfo
      ◮ Idea: tree structure of geographical areas

      number of records                        6,978,478
      number of unique user-hashtag pairs      2,993,183
      number of users                            792,860
      number of items                            268,489
      number of countries                             49

  16. TREE CONSTRUCTION
      ◮ 214,230 nodes, of which 190,315 are leaves
      ◮ The depth of the tree is 6
      ◮ The hashtag time series data covers 30,450 leaves of the whole tree
      [Figure: example of the tree: World → Europe, Asia, Africa, ...; Europe → Germany, Austria, ...; Austria → Vienna, Graz, ...]

  17. RECENCY
      [Figure: log-log plot of N(IET = t) against t (sec)]
      $$P(\tau = t) = (\alpha - 1) \cdot t^{-\alpha} \quad \text{and} \quad P(1 \le \tau \le t) = 1 - t^{(1 - \alpha)}$$
      $$P(t < \tau \le t + \Delta t \mid \tau > t) = \frac{P(\tau \le t + \Delta t) - P(\tau \le t)}{1 - P(\tau \le t)} = 1 - \left(1 + \frac{\Delta t}{t}\right)^{(1 - \alpha)}$$
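
The closed form of the conditional probability on this slide can be checked numerically against its definition. A minimal sketch, assuming t ≥ 1 and α > 1; the function names and the example values of t, Δt and α are illustrative only.

```python
def cdf(t, alpha):
    """P(1 <= tau <= t) = 1 - t^(1 - alpha), valid for t >= 1 and alpha > 1."""
    return 1.0 - t ** (1.0 - alpha)

def prob_event_within(t, dt, alpha):
    """P(t < tau <= t + dt | tau > t) in closed form."""
    return 1.0 - (1.0 + dt / t) ** (1.0 - alpha)

t, dt, alpha = 10.0, 5.0, 1.5
closed_form = prob_event_within(t, dt, alpha)
# Same quantity computed directly from the definition of conditional probability.
from_definition = (cdf(t + dt, alpha) - cdf(t, alpha)) / (1.0 - cdf(t, alpha))
```

The two values agree up to floating-point error. The closed form depends on t only through the ratio Δt/t, so the longer an event has been absent, the less likely it is to occur in the next fixed-length window.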

  18. MODELING
      ◮ Online MF as baseline → NOT working!
      ◮ Tree + Recency + Bias model:
        $$\hat{r}(u, h, t, l) = \sum_{n \in \mathrm{Path}(l)} \hat{w}_n \cdot f(t - t_n, h)$$
      ◮ $\hat{w}_n$: node biases learned with SGD
      ◮ $\hat{w}_n$ already includes node reliability and popularity
      ◮ Different heuristic baselines
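
The Tree + Recency + Bias score sums weighted recency terms over the nodes on the path from the tweet's location to the root. The sketch below is an illustration under stated assumptions: the slide does not give the form of f, so `recency` is a hypothetical placeholder; t_n is assumed to be the last time the hashtag was seen at node n; nodes without a learned weight default to 1.0; and the user u is omitted because the right-hand side of the formula does not depend on it.

```python
def path_to_root(node, parent):
    """Nodes from a leaf up to the root, following child -> parent links."""
    path = []
    while node is not None:
        path.append(node)
        node = parent.get(node)  # the root has no parent entry
    return path

def recency(dt):
    """Hypothetical recency function f; its real form is not given in the slides."""
    return 1.0 / (1.0 + dt)

def score(hashtag, t, leaf, parent, weights, last_seen):
    """Sum of w_n * f(t - t_n) over the nodes n on Path(leaf)."""
    total = 0.0
    for n in path_to_root(leaf, parent):
        t_n = last_seen.get((n, hashtag))
        if t_n is None:          # hashtag never seen at this node: no contribution
            continue
        total += weights.get(n, 1.0) * recency(t - t_n)
    return total

parent = {"Vienna": "Austria", "Austria": "Europe", "Europe": "World"}
weights = {"Vienna": 2.0, "World": 0.5}
last_seen = {("Vienna", "#x"): 9.0, ("World", "#x"): 6.0}
s = score("#x", 10.0, "Vienna", parent, weights, last_seen)
```

In this toy example the leaf contributes more than the root, both because its weight is higher and because the hashtag was seen there more recently.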

  19. RESULTS
      [Figure: average cumulative DCG@100 vs. time (days, 0 to 20); curves: world, leaves, countries, countries without recency, tree, tree with learned node weights]
