 
Combining Factorization Model and Additive Forest for Recommendation
Presenter: Tianqi Chen
Team ACMClass@SJTU
August 11, 2012

Team ACMClass@SJTU
◮ Original team name: undergrads
◮ Members are students from ACMClass in SJTU
◮ All members are undergraduates, except the presenter :)

Outline
◮ Overview of General Approach
◮ Go Beyond Factorization Models
◮ More Example Models used in Solution
◮ Results and Conclusion

Overview of Our Solution
Figure: solution overview diagram. Labels in the diagram:
◮ Modeling Approach: Rank Optimization; One Joint Model, No Ensemble
◮ Final Solution: Combination of Factorization Models and Additive Forest
◮ Incorporated Information: social network/action, user age/gender, item taxonomy, timestamp, …
◮ Annotation: "Focus point of this presentation"

Feature-based Matrix Factorization
\[
\hat{r}_{ui} = \Big( \sum_{c \in C(u)} \alpha_c^{(u)} p_c \Big)^T \Big( \sum_{c \in C(i)} \beta_c^{(i)} q_c \Big) + \sum_{c \in C(u,i)} \gamma_c^{(u,i)} g_c \tag{1}
\]
◮ $\Theta = \{p, q, g\}$, trained via stochastic gradient descent
◮ $\alpha_c^{(u)}$: user feature of user u: user social network/action, keyword/tag
◮ $\beta_c^{(i)}$: item feature weight of item (celebrity) i: item taxonomy/network
◮ $\gamma_c^{(u,i)}$: global feature related to the interaction between u and i: user age/gender bias
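
A minimal Python sketch of how a score is assembled in the feature-based factorization model of Eq. (1), assuming sparse features are stored as dictionaries. The function names, feature ids, and toy numbers are hypothetical and are not taken from the SVDFeature code.

```python
from typing import Dict, List

Vector = List[float]
SparseFeatures = Dict[str, float]  # feature id -> feature weight


def weighted_sum(features: SparseFeatures, factors: Dict[str, Vector], dim: int) -> Vector:
    """Return sum_c weight_c * factors[c] as a dense vector of length dim."""
    total = [0.0] * dim
    for feature_id, weight in features.items():
        for k, value in enumerate(factors[feature_id]):
            total[k] += weight * value
    return total


def predict(user_feats: SparseFeatures, item_feats: SparseFeatures,
            global_feats: SparseFeatures, p: Dict[str, Vector],
            q: Dict[str, Vector], g: Dict[str, float], dim: int) -> float:
    """Inner product of the combined user and item factors plus the global bias terms."""
    user_vec = weighted_sum(user_feats, p, dim)
    item_vec = weighted_sum(item_feats, q, dim)
    interaction = sum(a * b for a, b in zip(user_vec, item_vec))
    bias = sum(weight * g[c] for c, weight in global_feats.items())
    return interaction + bias


# Toy parameters (hypothetical): two user features, one item feature, one global feature.
p = {"follows:A": [1.0, 0.0], "follows:B": [0.0, 2.0]}
q = {"item:7": [0.5, 0.25]}
g = {"age:18-25": 0.1}
score = predict({"follows:A": 0.5, "follows:B": 0.5}, {"item:7": 1.0},
                {"age:18-25": 1.0}, p, q, g, dim=2)
```

Training by stochastic gradient descent would update only the entries of p, q, and g touched by the active features of each example; that step is omitted here.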

Additive Forest
\[
\hat{r}_{ui} = \sum_{s=1}^{S} f_{s,\mathrm{root}(i,s)}(x_u) \tag{2}
\]
◮ $x_u$: property feature of user u
◮ $f_{s,\mathrm{root}(i,s)}$: function defined by a regression tree
◮ Learning via gradient boosting algorithm
Figure: every forest (Forest 1, Forest 2, ...) holds one tree per item (item 1, item 2, ..., item k; e.g. item i: Kaifu LEE). The trees split on user properties such as Major=IT?, Occupation=Student?, Age<25?, Age>18?, Gender=Female?, Age>12?, and the leaves predict Like or Dislike.
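
The prediction rule of Eq. (2) can be sketched in a few lines of Python: each forest stores one regression tree per item, and the score is the sum of the outputs of the trees rooted at that item. The class names, split conditions, and leaf values below are hypothetical illustrations, not the trees learned in the solution.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Union

User = Dict[str, object]


@dataclass
class Leaf:
    value: float


@dataclass
class Split:
    test: Callable[[User], bool]  # condition on the user's properties
    yes: "Node"
    no: "Node"


Node = Union[Leaf, Split]


def evaluate(node: Node, user: User) -> float:
    """Walk from the root to a leaf, following the yes/no branches."""
    while isinstance(node, Split):
        node = node.yes if node.test(user) else node.no
    return node.value


def forest_score(forests: List[Dict[str, Node]], item: str, user: User) -> float:
    """Sum the outputs of the trees rooted at `item`, one tree per forest."""
    return sum(evaluate(forest[item], user) for forest in forests if item in forest)


# Two toy forests, each holding a single tree for the hypothetical item "item_k".
forests = [
    {"item_k": Split(lambda u: u["age"] < 25, Leaf(1.0), Leaf(0.0))},
    {"item_k": Split(lambda u: u["major"] == "IT", Leaf(0.5), Leaf(-0.5))},
]
young_it_student = forest_score(forests, "item_k", {"age": 20, "major": "IT"})
older_artist = forest_score(forests, "item_k", {"age": 30, "major": "Art"})
```

Gradient boosting would add the forests one at a time, each fitted to what the previous ones still get wrong; only the scoring side is shown here.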

An Example of Additive Forest
Figure: an individual tree for each item. Forest 1 ($f_1$) and Forest 2 ($f_2$) each contain a tree rooted at SIGKDD and a tree rooted at Barbie. Split conditions shown in the figure: age<20?, is female?, age<18?, Major=CS?, Has User Tag Data Mining?, age<14?. Leaf values shown: 0, +1, 0.2, 0.5, +2. Forest 2 is learned to complement Forest 1.

Factorization Model vs Additive Forest

                                        Factorization              Additive Forest
  handling of sparse matrix data        very well                  capable, not very well
  combination of different information  linear combination         nonlinear composition
  handling of continuous property       need predefined            automatic segmentation
                                        segmentation
  model complexity control              regularization             feature selection, pruning

◮ Both models have their own advantages in different aspects
◮ Understanding their properties and knowing when to use which one is very important

Information Combination: User Social Network

Factorization Model
\[
\hat{r}_{ui} = \Big( \frac{1}{\sqrt{|F(u)|}} \sum_{j \in F(u)} p_j \Big)^T q_i
\]
◮ $F(u)$: follow set of u
◮ Linear combination
◮ Score for users who "Follow both A and B" = $p_A^T q_i + p_B^T q_i$

Additive Forest
◮ Condition composition
◮ Feature selection
◮ Specific score for condition "Follow A and B"
Figure: tree with root: item i, splitting on Follow B? and Follow A?, with leaf values +1 and 0.
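
To make the contrast concrete, here is a small Python sketch under simplifying assumptions: the factorization score is written without its normalization factor, and the tree is a hand-written stand-in for a learned one. The vectors and leaf values are made up for illustration.

```python
from typing import Dict, List, Set


def dot(a: List[float], b: List[float]) -> float:
    return sum(x * y for x, y in zip(a, b))


def linear_score(follows: Set[str], p: Dict[str, List[float]], q_i: List[float]) -> float:
    """Unnormalized factorization score: each followee contributes p_j^T q_i independently."""
    return sum(dot(p[j], q_i) for j in follows)


def tree_score(follows: Set[str]) -> float:
    """Hypothetical tree for one item: only the conjunction 'follows A and B' gets +1."""
    if "B" in follows:
        return 1.0 if "A" in follows else 0.0
    return 0.0


p = {"A": [1.0, 0.0], "B": [0.0, 1.0]}
q_i = [0.3, -0.3]

only_a = linear_score({"A"}, p, q_i)
only_b = linear_score({"B"}, p, q_i)
both = linear_score({"A", "B"}, p, q_i)
```

In the linear model the score for following both A and B is forced to equal the sum of the two single-followee scores, whereas the tree can assign the conjunction a value unrelated to what either condition gets alone.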

Continuous Feature Handling: User Age

Factorization Model
\[
\hat{r}'_{ui} = p_u^T q_i + W_{i,\mathrm{ag}(u)} \tag{3}
\]
◮ $\mathrm{ag}(u)$: age segment index
◮ Requires predefined partition
Figure: age bias parameters $W_{i,1}, W_{i,2}, W_{i,3}, W_{i,4}$ over the age partition points 10, 20, 30.

Additive Forest
◮ Automatically finds the splitting point
Figure: tree with root: item i, splitting on age < 17? and age < 10?, with leaf values 0, +1, -1.
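
How a tree can find a splitting point on age without a predefined partition can be illustrated with a least-squares regression stump. This is a generic sketch, assuming a squared-error criterion and made-up data; the actual solution may use a different splitting criterion.

```python
from typing import List, Tuple


def best_split(ages: List[float], targets: List[float]) -> Tuple[float, float, float]:
    """Return (threshold, mean below, mean at or above) minimizing squared error
    for the rule `age < threshold`."""
    pairs = sorted(zip(ages, targets))
    best = None
    for k in range(1, len(pairs)):
        if pairs[k - 1][0] == pairs[k][0]:
            continue  # no threshold fits between two equal ages
        left = [t for _, t in pairs[:k]]
        right = [t for _, t in pairs[k:]]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        sse = (sum((t - left_mean) ** 2 for t in left)
               + sum((t - right_mean) ** 2 for t in right))
        threshold = (pairs[k - 1][0] + pairs[k][0]) / 2  # midpoint between neighbours
        if best is None or sse < best[0]:
            best = (sse, threshold, left_mean, right_mean)
    if best is None:
        raise ValueError("need at least two distinct ages")
    return best[1], best[2], best[3]


# Toy data: users younger than 17 liked the item (target 1), older users did not (target 0).
ages = [8.0, 12.0, 15.0, 16.0, 18.0, 25.0, 40.0]
liked = [1.0, 1.0, 1.0, 1.0, 0.0, 0.0, 0.0]
threshold, below, above = best_split(ages, liked)
```

With a predefined partition at 10, 20, and 30, the boundary near 17 in this toy data would fall inside a single segment and could not be represented exactly.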

Time-aware Model

Traditional Time Bin Model
\[
\hat{r}'_{ui}(t) = \hat{r}_{ui} + b_{i,\mathrm{binid}(t)}
\]
◮ $\mathrm{binid}(t)$: time bin index of t

Our Time-aware Model
\[
\hat{r}'_{ui}(t) = \hat{r}_{ui} + \sum_{s=1}^{S} f_{s,i}(t)
\]
◮ $f_{s,i}(t)$: k-piece step function

Figure: Comparison of Two Temporal Models. (a) Item Time Bin, (b) K-piece Step Function. Both panels plot Bias against Time, with marks at $t_1, t_2, t_3, t_4$.
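
The two temporal bias terms can be compared with a short Python sketch. It assumes fixed bin edges for the time-bin model and, as a simplification, represents each learned step function as a two-piece stump; the thresholds and values are invented for illustration.

```python
from bisect import bisect_right
from typing import List, Tuple


def time_bin_bias(t: float, bin_edges: List[float], bin_bias: List[float]) -> float:
    """Look up the bias of the predefined bin containing t (len(bin_bias) == len(bin_edges) + 1)."""
    return bin_bias[bisect_right(bin_edges, t)]


def step_function_bias(t: float, stumps: List[Tuple[float, float, float]]) -> float:
    """Sum of learned step functions; each stump is (threshold, value before, value after)."""
    return sum(before if t < threshold else after for threshold, before, after in stumps)


# Predefined bins: (-inf, 10), [10, 20), [20, 30), [30, inf)
bin_edges = [10.0, 20.0, 30.0]
bin_bias = [0.1, 0.3, -0.2, 0.0]

# Two stumps whose thresholds would be chosen by the learner rather than fixed in advance.
stumps = [(12.0, 0.0, 0.5), (27.0, 0.25, -0.25)]

bias_bin_15 = time_bin_bias(15.0, bin_edges, bin_bias)
bias_step_15 = step_function_bias(15.0, stumps)
```

The sum of the stumps is still a piecewise-constant function of time, but its breakpoints (12 and 27 here) are free parameters instead of being tied to a fixed grid.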

User Sequential Pattern
\[
\hat{r}'_{ui}(t) = \hat{r}_{ui} + \sum_{s=1}^{S} f_s(x_{\mathrm{seq}}) \tag{4}
\]
Features included in $x_{\mathrm{seq}}$:
◮ time difference between clicks
◮ average click speed of the current user

Figure: Single Variable Pattern $\sum_{s=1}^{S} f_s(\Delta t)$. Both panels plot $f(\Delta t)$ against $\Delta t$ (sec) over 0 to 100 seconds. (a) $\Delta t = t_{\mathrm{next}} - t_{\mathrm{curr}}$, (b) $\Delta t = t_{\mathrm{curr}} - t_{\mathrm{prev}}$.
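
A possible way to derive the sequential features from one user's click timestamps is sketched below in Python. The exact feature definitions are assumptions: "average click speed" is taken here to be the mean gap between consecutive clicks, and the timestamps are assumed to be sorted and given in seconds.

```python
from typing import List, Optional, Tuple

# (t_next - t_curr, t_curr - t_prev, average gap between the user's clicks)
SeqFeatures = Tuple[Optional[float], Optional[float], Optional[float]]


def sequence_features(click_times: List[float]) -> List[SeqFeatures]:
    """Build one feature tuple per click; None marks a value that does not exist
    (no previous click, no next click, or a single-click history)."""
    n = len(click_times)
    avg_gap = (click_times[-1] - click_times[0]) / (n - 1) if n > 1 else None
    features = []
    for k, t in enumerate(click_times):
        to_next = click_times[k + 1] - t if k + 1 < n else None
        from_prev = t - click_times[k - 1] if k > 0 else None
        features.append((to_next, from_prev, avg_gap))
    return features


# Toy history: three quick clicks followed by a pause of one minute.
feats = sequence_features([0.0, 2.0, 3.0, 63.0])
```

Each tuple would then be passed to the trees $f_s$ as $x_{\mathrm{seq}}$, which lets the forest learn its own thresholds on the time gaps.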

Final Model
\[
\hat{r}_{ui} = \Big( \sum_{c \in C(u)} \alpha_c^{(u)} p_c \Big)^T \Big( \sum_{c \in C(i)} \beta_c^{(i)} q_c \Big) + \sum_{c \in C(u,i)} \gamma_c^{(u,i)} g_c + \sum_{s=1}^{S} f_{s,\mathrm{root}(s,i)}(x_{ui}) \tag{5}
\]
◮ Combination of all the factorization models and the additive forest
◮ Boosting from the result of the factorization part
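
"Boosting from the result of the factorization part" can be read as fitting the trees to whatever the factorization scores leave unexplained. The Python sketch below illustrates that idea with squared-error residuals and single-feature stumps; this is an assumption made for brevity, since the solution itself optimizes a ranking objective, and all data here is invented.

```python
from typing import List, Tuple

Stump = Tuple[float, float, float]  # (threshold, value if x < threshold, value otherwise)


def fit_stump(xs: List[float], residuals: List[float]) -> Stump:
    """Least-squares regression stump on a single feature."""
    pairs = sorted(zip(xs, residuals))
    best = None
    for k in range(1, len(pairs)):
        if pairs[k - 1][0] == pairs[k][0]:
            continue  # cannot split between equal feature values
        left = [r for _, r in pairs[:k]]
        right = [r for _, r in pairs[k:]]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        sse = (sum((r - left_mean) ** 2 for r in left)
               + sum((r - right_mean) ** 2 for r in right))
        if best is None or sse < best[0]:
            best = (sse, (pairs[k - 1][0] + pairs[k][0]) / 2, left_mean, right_mean)
    if best is None:
        raise ValueError("need at least two distinct feature values")
    return best[1], best[2], best[3]


def boost(xs: List[float], targets: List[float], base_scores: List[float],
          rounds: int) -> Tuple[List[Stump], List[float]]:
    """Start from the factorization scores and add stumps fitted to the residuals."""
    preds = list(base_scores)
    stumps = []
    for _ in range(rounds):
        residuals = [y - p for y, p in zip(targets, preds)]
        threshold, low, high = fit_stump(xs, residuals)
        stumps.append((threshold, low, high))
        preds = [p + (low if x < threshold else high) for p, x in zip(preds, xs)]
    return stumps, preds


def sse(targets: List[float], preds: List[float]) -> float:
    return sum((y - p) ** 2 for y, p in zip(targets, preds))


ages = [15.0, 16.0, 30.0, 40.0]
targets = [1.0, 1.0, 0.0, 0.0]
fm_scores = [0.6, 0.4, 0.5, 0.3]  # hypothetical output of the factorization part
stumps, final_scores = boost(ages, targets, fm_scores, rounds=1)
```

Because the trees start from the factorization output rather than from zero, they only need to model the structure the factorization part missed, such as the age effect in this toy data.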

Experiment Results

  ID  model                     public   private   Δ public   Δ private
  --  ------------------------  -------  --------  ---------  ----------
  1   item bias                 34.6%    34.0%
  2   1 + user follow/action    36.7%    35.8%     2.1%       1.8%
  3   2 + user age/gender       38.0%    37.2%     1.3%       1.4%
  4   3 + user tag/keyword      38.5%    37.6%     0.5%       0.4%
  5   4 + item taxonomy         38.7%    37.8%     0.2%       0.2%
  6   5 + time-aware model      39.0%    37.9%     0.3%       0.1%
  7   6 + age/gender (forest)   39.1%    38.0%     0.1%       0.1%
  8   7 + sequential patterns   44.2%    42.7%     5.1%       4.7%

Table: MAP@3 of different methods

◮ User modeling and sequential patterns contribute the most
◮ The time-aware model is more effective on the public data
◮ All of them are important for winning
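
For readers unfamiliar with the metric, here is a sketch of MAP@3 in Python. It uses one common definition of average precision at k, normalized by min(number of relevant items, k); the competition's exact normalization may differ, so treat that choice as an assumption. Ranked lists are assumed to contain no duplicates.

```python
from typing import List, Sequence, Set


def average_precision_at_k(ranked: Sequence[str], relevant: Set[str], k: int = 3) -> float:
    """Average of precision@rank over the ranks (within the top k) that hold a relevant item."""
    if not relevant:
        return 0.0
    hits = 0
    total = 0.0
    for rank, item in enumerate(ranked[:k], start=1):
        if item in relevant:
            hits += 1
            total += hits / rank  # precision at this rank
    return total / min(len(relevant), k)


def mean_average_precision_at_k(rankings: List[Sequence[str]],
                                relevants: List[Set[str]], k: int = 3) -> float:
    """Mean of the per-user average precision."""
    scores = [average_precision_at_k(r, rel, k) for r, rel in zip(rankings, relevants)]
    return sum(scores) / len(scores)


# One user who accepted the items ranked first and third: (1/1 + 2/3) / 2
ap = average_precision_at_k(["a", "b", "c"], {"a", "c"})
map3 = mean_average_precision_at_k([["a", "b", "c"], ["x", "y", "z"]], [{"a", "c"}, {"q"}])
```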

Summary
◮ Ensemble methods did not seem to help in our experiments
◮ Choose the right methods to utilize different kinds of data
◮ Factorization models are powerful, but also have drawbacks
◮ Additive forest can automatically cut continuous features, sometimes more cleverly than a human
◮ Use automatic cutting to build a robust time-aware model
◮ Fully utilize the available information
◮ Source code: svdfeature.apexlab.org

Thank You, Questions?

Appendix
◮ The remaining slides are an appendix

Objective Function
◮ Loss function of Pairwise Ranking: AUC optimization
\[
L_u = \frac{1}{|\{(i,j) \mid r_{ui} > r_{uj}\}|} \sum_{(i,j):\, r_{ui} > r_{uj}} C(\hat{r}_{ui} - \hat{r}_{uj}) \tag{6}
\]
◮ Pseudo loss function of LambdaRank: MAP optimization
\[
L_u = \frac{1}{|\{(i,j) \mid r_{ui} > r_{uj}\}|} \sum_{(i,j):\, r_{ui} > r_{uj}} |\Delta_{ij}\mathrm{MAP}| \, C(\hat{r}_{ui} - \hat{r}_{uj}) \tag{7}
\]
◮ $\Delta_{ij}\mathrm{MAP}$ is the change in MAP when we swap i and j in the current list
◮ $C(x)$ is a surrogate convex loss function
  ◮ logistic loss (BPR): $C(x) = \ln(1 + e^{-x})$
  ◮ hinge loss (maximum margin): $C(x) = \max(0, 1 - x)$
◮ $L_u$ is normalized by the number of pairs ($|\{(i,j) \mid r_{ui} > r_{uj}\}|$): balance over all users is important
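
The per-user pairwise loss of Eq. (6) is easy to write down directly. The Python sketch below assumes scores and labels are given as dictionaries keyed by item id (a hypothetical layout) and leaves out the $|\Delta_{ij}\mathrm{MAP}|$ weighting of Eq. (7).

```python
import math
from typing import Callable, Dict


def logistic_loss(x: float) -> float:
    """C(x) = ln(1 + exp(-x)), the BPR-style surrogate."""
    return math.log1p(math.exp(-x))


def hinge_loss(x: float) -> float:
    """C(x) = max(0, 1 - x), the maximum-margin surrogate."""
    return max(0.0, 1.0 - x)


def pairwise_user_loss(scores: Dict[str, float], labels: Dict[str, float],
                       surrogate: Callable[[float], float]) -> float:
    """Average surrogate loss over all pairs (i, j) where item i is preferred to item j."""
    pairs = [(i, j) for i in labels for j in labels if labels[i] > labels[j]]
    if not pairs:
        return 0.0  # a user without any preference pair contributes nothing
    return sum(surrogate(scores[i] - scores[j]) for i, j in pairs) / len(pairs)


# One accepted item ("a") and two rejected ones; "c" is wrongly scored above "a".
labels = {"a": 1.0, "b": 0.0, "c": 0.0}
scores = {"a": 1.0, "b": 0.5, "c": 1.5}
hinge_value = pairwise_user_loss(scores, labels, hinge_loss)
logistic_value = pairwise_user_loss(scores, labels, logistic_loss)
```

Dividing by the number of pairs keeps a user with many recorded preferences from dominating the objective, which is the balancing effect the last bullet refers to.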