SVD-LDA: Topic Modeling for Full-Text Recommender Systems
Sergey Nikolenko (PowerPoint presentation transcript)


  1. SVD-LDA: Topic Modeling for Full-Text Recommender Systems. Sergey Nikolenko, Steklov Mathematical Institute at St. Petersburg; Laboratory for Internet Studies, National Research University Higher School of Economics, St. Petersburg. October 30, 2015.

  2. Outline:
     1. Recommender systems: Intro; Recsys overview.
     2. From LDA to SVD-LDA: Latent Dirichlet Allocation; SVD-LDA.

  3. Overview. A very brief overview of the paper:
     - our main goal is to recommend full-text items (posts in social networks, web pages, etc.) to users;
     - in particular, we want to extend recommender systems with features coming from the texts;
     - this is especially important for the cold-start problem;
     - these features can come from topic modeling;
     - in this work, we combine the classical SVD and LDA models into one, training them together.

  4. Recommender systems. Recommender systems analyze user interests and attempt to predict what the current user will be most interested in right now. Collaborative filtering: given a sparse matrix of ratings assigned by users to items, predict the unknown ratings (and hence recommend the items with the best predictions):
     - nearest-neighbor methods (user-user and item-item), e.g., GroupLens;
     - SVD (singular value decomposition): decompose the user × item matrix, reducing dimensionality as user × item = (user × feature)(feature × item) (with very few features compared to the numbers of users and items), learning user and item features that can be used to make predictions; a toy sketch follows below.
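A minimal sketch of this decomposition on a made-up ratings matrix. The data and the feature count k are invented for illustration, and a plain dense SVD via NumPy stands in for what is, in a real recommender, a learned low-rank factorization over a sparse matrix:

```python
# Toy illustration of user x item = (user x feature)(feature x item);
# the ratings matrix and k are made up.
import numpy as np

ratings = np.array([[5., 4., 0., 1.],
                    [4., 5., 1., 0.],
                    [1., 0., 5., 4.],
                    [0., 1., 4., 5.]])  # user x item

k = 2  # number of latent features, far fewer than users or items
U, s, Vt = np.linalg.svd(ratings, full_matrices=False)
user_features = U[:, :k] * s[:k]   # user x feature
item_features = Vt[:k, :]          # feature x item

# Reconstructed low-rank matrix: predictions for all (user, item) pairs.
predictions = user_features @ item_features
print(np.round(predictions, 1))
```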

  5. Recommender systems. Formally speaking, SVD models a rating as
     \hat{r}^{\mathrm{SVD}}_{i,a} = \mu + b_i + b_a + q_a^\top p_i,
     where
     - b_i is the baseline predictor for item i;
     - b_a is the baseline predictor for user a;
     - q_a and p_i are feature vectors for user a and item i.
     Then b_i, b_a, q_a, p_i are trained together by fitting actual ratings to the model (by alternating least squares). Importantly for us, if you have likes/dislikes rather than explicit ratings, you can use logistic SVD (trained by alternating logistic regression):
     p(\mathrm{Like}_{i,a}) = \sigma\left( \mu + b_i + b_a + q_a^\top p_i \right).
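A sketch of fitting this model. The slide trains by alternating least squares; for brevity this toy version uses plain stochastic gradient descent instead, on made-up (user, item, rating) triples with hypothetical hyperparameters:

```python
# Fit r_hat = mu + b_i + b_a + q_a^T p_i by SGD (the slide uses ALS;
# SGD keeps the sketch short). All data and hyperparameters are toy.
import numpy as np

rng = np.random.default_rng(0)
n_users, n_items, k = 4, 4, 2
triples = [(0, 0, 5.0), (0, 1, 4.0), (1, 0, 4.0),
           (2, 2, 5.0), (3, 3, 5.0)]  # (user a, item i, rating)

mu = np.mean([r for _, _, r in triples])
b_item = np.zeros(n_items)                    # b_i
b_user = np.zeros(n_users)                    # b_a
q = rng.normal(scale=0.1, size=(n_users, k))  # q_a, user features
p = rng.normal(scale=0.1, size=(n_items, k))  # p_i, item features

lr, reg = 0.01, 0.05
for _ in range(200):
    for a, i, r in triples:
        err = r - (mu + b_item[i] + b_user[a] + q[a] @ p[i])
        b_item[i] += lr * (err - reg * b_item[i])
        b_user[a] += lr * (err - reg * b_user[a])
        qa_old = q[a].copy()
        q[a] += lr * (err * p[i] - reg * q[a])
        p[i] += lr * (err * qa_old - reg * p[i])

# Predicted rating of user 1 for item 1:
print(mu + b_item[1] + b_user[1] + q[1] @ p[1])
```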

  6. Recommender systems. Many modifications of classical recommender systems use additional information:
     - implicit user preferences (what the user viewed), e.g., SVD++;
     - the times when ratings appear, e.g., timeSVD++;
     - the social graph, when the users' social network profiles are available;
     - context-aware recommendations (time of day, situation, company, etc.);
     - recommendations aware of other recommendations (optimizing diversity, novelty, serendipity).
     In this work, we concentrate on the textual content of items.

  7. Recommender systems. The main dataset for the project comes from Surfingbird, a Russian recommender system:
     - Surfingbird recommends web pages to users;
     - a user clicks "Surf", sees a new page, and maybe rates it by clicking "Like" or "Dislike";
     - web pages usually have content, often textual content;
     - the text may be very useful for recommendations; how do we use it?

  8. Outline:
     1. Recommender systems: Intro; Recsys overview.
     2. From LDA to SVD-LDA: Latent Dirichlet Allocation; SVD-LDA.

  9. Topic modeling with LDA. Latent Dirichlet Allocation (LDA) is a topic model for a corpus of texts:
     - a document is represented as a mixture of topics;
     - a topic is a distribution over words;
     - to generate a document, for each word we sample a topic and then sample a word from that topic;
     - by learning these distributions, we learn what topics appear in a dataset and in which documents.
     A toy simulation of this generative story is sketched below.
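A toy simulation of the generative story above; the vocabulary, topic count, and Dirichlet hyperparameters are all invented for illustration:

```python
# Generate a document the LDA way: sample a topic per word, then a word
# from that topic. All inputs are made up.
import numpy as np

rng = np.random.default_rng(0)
vocab = ["ball", "game", "vote", "party", "gene", "cell"]
n_topics, alpha, beta = 3, 0.5, 0.1

# Topics: distributions over words (phi), drawn from Dirichlet(beta).
phi = rng.dirichlet([beta] * len(vocab), size=n_topics)

def generate_document(n_words):
    theta = rng.dirichlet([alpha] * n_topics)  # topic mixture of the doc
    words = []
    for _ in range(n_words):
        z = rng.choice(n_topics, p=theta)       # sample a topic
        w = rng.choice(len(vocab), p=phi[z])    # sample a word from it
        words.append(vocab[w])
    return words

print(generate_document(10))
```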

  10. Topic modeling with LDA. Sample LDA result from (Blei, 2012). [figure omitted in this transcript]

  11. Topic modeling with LDA. Another sample LDA result from (Blei, 2012). [figure omitted in this transcript]

  12. PGM for LDA. [plate diagram of the LDA probabilistic graphical model omitted in this transcript]

  13. Inference in LDA. There are two major approaches to inference in probabilistic models with a loopy factor graph, such as LDA:
     - variational approximations simplify the graph by approximating the underlying distribution with a simpler one that has new parameters subject to optimization;
     - Gibbs sampling approximates the underlying distribution by repeatedly sampling subsets of variables conditional on fixed values of all other variables.
     Both approaches have been applied to LDA. In a way, LDA is similar to SVD: it performs dimensionality reduction and, so to speak, decomposes document × word = (document × topic)(topic × word).

  14. LDA likelihood. Thus, the total likelihood of the LDA model is
     p(z, w \mid \alpha, \beta) = \int_{\theta,\phi} p(\theta \mid \alpha)\, p(z \mid \theta)\, p(w \mid z, \phi)\, p(\phi \mid \beta)\, d\theta\, d\phi.
     In collapsed Gibbs sampling, we sample
     p(z_w = t \mid z_{-w}, w, \alpha, \beta) \propto q(z_w, t, z_{-w}, w, \alpha, \beta) = \frac{n^{(d)}_{-w,t} + \alpha}{\sum_{t' \in T} \left( n^{(d)}_{-w,t'} + \alpha \right)} \cdot \frac{n^{(w)}_{-w,t} + \beta}{\sum_{w' \in W} \left( n^{(w')}_{-w,t} + \beta \right)},
     where n^{(d)}_{-w,t} is the number of words in document d assigned to topic t, n^{(w)}_{-w,t} is the number of occurrences of word w assigned to topic t, and both counts exclude the current word w.
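A sketch of one collapsed Gibbs update implementing the proportionality above. The counts and sizes are toy values; a real implementation maintains these count tables incrementally over the whole corpus:

```python
# Resample z_w for a single token using the count-based formula.
# Counts exclude the current token (the "-w" subscript).
import numpy as np

rng = np.random.default_rng(0)
n_topics, n_docs, n_words = 3, 2, 6
alpha, beta = 0.5, 0.1

# n_dt[d, t]: tokens of doc d assigned to topic t;
# n_tw[t, w]: tokens of word w assigned to topic t (corpus-wide).
n_dt = rng.integers(1, 5, size=(n_docs, n_topics)).astype(float)
n_tw = rng.integers(1, 5, size=(n_topics, n_words)).astype(float)

def resample(d, w, old_t):
    n_dt[d, old_t] -= 1; n_tw[old_t, w] -= 1   # remove current token
    probs = ((n_dt[d] + alpha) / (n_dt[d].sum() + n_topics * alpha)
             * (n_tw[:, w] + beta) / (n_tw.sum(axis=1) + n_words * beta))
    new_t = rng.choice(n_topics, p=probs / probs.sum())
    n_dt[d, new_t] += 1; n_tw[new_t, w] += 1   # add it back
    return new_t

print(resample(d=0, w=2, old_t=1))
```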

  15. LDA extensions. There already exist LDA extensions relevant to our research:
     - DiscLDA: LDA for classification, with a class-dependent transformation of the topic mixtures;
     - Supervised LDA: documents come with a response variable, and we mine topics that are indicative of the response;
     - TagLDA: words have tags that mark context or linguistic features;
     - Tag-LDA: documents have topical tags, and the goal is to recommend new tags to documents;
     - Topics over Time: topics change their proportions with time;
     - hierarchical modifications with nested topics are also important.
     In this work, we develop a novel extension: SVD-LDA.

  16. Supervised LDA. We begin with supervised LDA, which:
     - assumes that each document has a response variable;
     - aims to predict this variable rather than just "learn something about the dataset";
     - asks: can we learn topics that are relevant to this specific response variable?
     In recommender systems, the response variable would be the probability of a like, an explicit rating, or some other desirable action. This adds new variables to the graph.

  17. PGM for sLDA. [plate diagram of the supervised LDA graphical model omitted in this transcript]

  18. Supervised LDA. Mathematically, we add a factor corresponding to the response variable (Gaussian in sLDA):
     p(y_d \mid z_d, b, a, \sigma^2) \propto \exp\left( -\frac{1}{2\sigma^2} \left( y_d - b^\top \bar{z}_d - a \right)^2 \right),
     where \bar{z}_d is the vector of empirical topic frequencies in document d. The total likelihood is now
     p(z \mid w, y, b, \alpha, \beta, \sigma^2) \propto \prod_d \frac{B(n_d + \alpha)}{B(\alpha)} \prod_t \frac{B(n_t + \beta)}{B(\beta)} \prod_d \exp\left( -\frac{1}{2\sigma^2} \left( y_d - b^\top \bar{z}_d - a \right)^2 \right).

  19. Supervised LDA. Gibbs sampling now goes as
     p(z_w = t \mid z_{-w}, w, \alpha, \beta) \propto q(z_w, t, z_{-w}, w, \alpha, \beta) \exp\left( -\frac{1}{2\sigma^2} \left( y_d - b^\top \bar{z}_d - a \right)^2 \right) = \frac{n^{(d)}_{-w,t} + \alpha}{\sum_{t' \in T} \left( n^{(d)}_{-w,t'} + \alpha \right)} \cdot \frac{n^{(w)}_{-w,t} + \beta}{\sum_{w' \in W} \left( n^{(w')}_{-w,t} + \beta \right)} \exp\left( -\frac{1}{2\sigma^2} \left( y_d - b^\top \bar{z}_d - a \right)^2 \right),
     but it is now a two-step iterative algorithm:
     - sample z according to the equations above;
     - train b, a as a regression.
     A schematic sketch of this alternation follows below.
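A schematic sketch of the two-step alternation, under toy data and with the Gibbs count bookkeeping elided; the per-topic response reweighting and the least-squares refit are the two steps named above:

```python
# Two-step sLDA loop: (1) weight candidate topics by the response factor
# exp(-(y_d - b^T zbar_d - a)^2 / (2 sigma^2)) during Gibbs sampling,
# (2) refit (b, a) by regression on topic frequencies zbar_d.
# All sizes and data are toy; sigma^2 = 1 for simplicity.
import numpy as np

rng = np.random.default_rng(0)
n_docs, n_topics, N_d = 5, 3, 20
y = rng.normal(size=n_docs)                     # per-document responses
zbar = rng.dirichlet([1.0] * n_topics, n_docs)  # topic frequencies per doc
b, a = np.zeros(n_topics), 0.0

for _ in range(10):
    # Step 1 (schematic): per-topic response factors for one token of doc d.
    for d in range(n_docs):
        # zbar_d if the token were reassigned to each topic t in turn:
        cand = zbar[d] * (N_d - 1) / N_d + np.eye(n_topics) / N_d
        weights = np.exp(-0.5 * (y[d] - (cand @ b + a)) ** 2)
        # ...multiply the usual LDA count ratios by `weights` and resample;
        # updating the counts and zbar[d] is omitted in this sketch.

    # Step 2: refit (b, a) as a regression y ~ zbar (with intercept).
    X = np.hstack([zbar, np.ones((n_docs, 1))])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    b, a = coef[:-1], coef[-1]

print(b, a)
```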

  20. Logistic sLDA. Hence, our first results:
     - a Gibbs sampling scheme for supervised LDA (the original paper offered only variational approximations);
     - an extension of supervised LDA to handle logistic (binary) response variables:
       p = \sigma\left( b^\top \bar{z}_d + a \right).

  21. Logistic sLDA. Continuing these results: logistic regression is used to train b, a, and the Gibbs sampling goes as
     p(z_w = t \mid z_{-w}, w, \alpha, \beta) \propto q(z_w, t, z_{-w}, w, \alpha, \beta) \prod_{x \in X_d} \sigma\left( b^\top \bar{z}_d + a \right)^{y_x} \left( 1 - \sigma\left( b^\top \bar{z}_d + a \right) \right)^{1 - y_x} = \frac{n^{(d)}_{-w,t} + \alpha}{\sum_{t' \in T} \left( n^{(d)}_{-w,t'} + \alpha \right)} \cdot \frac{n^{(w)}_{-w,t} + \beta}{\sum_{w' \in W} \left( n^{(w')}_{-w,t} + \beta \right)} \times \exp\left( s_d \log p_d + (|X_d| - s_d) \log(1 - p_d) \right),
     where X_d is the set of ratings for document d, p_d = \sigma(b^\top \bar{z}_d + a), and s_d = \sum_{x \in X_d} y_x is the number of likes.
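A sketch of computing the logistic response factor for one document; the weights b, intercept a, topic frequencies z̄_d, and the like/dislike list X_d are all made-up inputs:

```python
# Bernoulli likelihood factor for one document d, as in the sampling
# weight above: p_d = sigma(b^T zbar_d + a), s_d = number of likes.
import numpy as np

def sigma(x):
    return 1.0 / (1.0 + np.exp(-x))

b = np.array([0.8, -0.3, 0.1])      # regression weights over topics
a = -0.2                            # intercept
zbar_d = np.array([0.5, 0.3, 0.2])  # topic frequencies of document d
X_d = [1, 1, 0, 1]                  # likes/dislikes y_x for document d

p_d = sigma(b @ zbar_d + a)
s_d = sum(X_d)
log_factor = s_d * np.log(p_d) + (len(X_d) - s_d) * np.log(1.0 - p_d)
print(p_d, np.exp(log_factor))
```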
