RaFM: Rank-Aware Factorization Machines - PowerPoint PPT Presentation

  1. RaFM: Rank-Aware Factorization Machines. Presented by Yin Zheng on behalf of Xiaoshuang Chen, Yin Zheng, Jiaxing Wang, Wenye Ma, Junzhou Huang.

  2. Motivation: Factorization Machines. Different features have different frequencies of occurrence. FMs learn a factorized embedding $V_i$ for each feature and model pairwise interactions:
$$\hat{y} = b + \sum_{i \in F} w_i x_i + \sum_{i, j \in F,\ i < j} \langle V_i, V_j \rangle\, x_i x_j, \qquad \langle V_i, V_j \rangle = \sum_{f=1}^{D} V_{i,f}\, V_{j,f}.$$
What is the best rank $D$ of the embeddings?
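
The FM prediction above can be evaluated in linear time via the standard reformulation of the pairwise term. A minimal NumPy sketch (not the authors' code; the dense layout and the names `b`, `w`, `V`, `x` are assumptions for illustration):

```python
import numpy as np

def fm_predict(b, w, V, x):
    """Factorization Machine prediction (illustrative sketch).

    b : float         global bias
    w : (n,) array    linear weights
    V : (n, D) array  one rank-D embedding per feature
    x : (n,) array    feature vector
    """
    # Pairwise term sum_{i<j} <V_i, V_j> x_i x_j computed in O(n*D) as
    # 0.5 * ( ||sum_i V_i x_i||^2 - sum_i ||V_i x_i||^2 ).
    vx = V * x[:, None]
    pairwise = 0.5 * (np.sum(vx.sum(axis=0) ** 2) - np.sum(vx ** 2))
    return b + w @ x + pairwise
```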

  3. Motivation: Performance of FMs with fixed ranks (MovieLens Tag). [Plot omitted: small ranks underfit, large ranks overfit.]

  4. Basic Model: Rank-Aware Factorization Machines. [Diagram comparing a high-rank FM, a low-rank FM, and the rank-aware FM.]

  5. Basic Model: Rank-Aware Factorization Machines. Each feature $i$ keeps multiple embeddings with different ranks, $V_i = \{v_i^{(1)}, v_i^{(2)}, \ldots, v_i^{(k_i)}\}$, where $k_i$ (a hyperparameter) is the largest rank level of feature $i$, chosen to avoid overfitting. Each pairwise interaction is computed at the proper rank level $k_{ij} = \min(k_i, k_j)$:
$$\hat{y} = b + \sum_{i \in F} w_i x_i + \sum_{i, j \in F,\ i < j} \langle v_i^{(k_{ij})}, v_j^{(k_{ij})} \rangle\, x_i x_j.$$
Remaining questions: What are the time and space complexities? How can RaFM be trained efficiently?
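
Written directly from this definition, the RaFM score just picks, for each feature pair, the embeddings at level $\min(k_i, k_j)$. A naive O(n^2) NumPy sketch (illustrative only; the list-of-embeddings layout `V[i][p-1]` is an assumption) before the efficient computation of the later slides:

```python
import numpy as np

def rafm_predict_naive(b, w, V, k, x):
    """RaFM prediction straight from the definition (illustrative, O(n^2) in features).

    V : V[i][p-1] is v_i^{(p)}, the level-p embedding of feature i, for 1 <= p <= k[i]
    k : k[i] is the highest embedding level of feature i
    """
    y = b + w @ x
    n = len(x)
    for i in range(n):
        for j in range(i + 1, n):
            p = min(k[i], k[j])                        # rank-aware level for the pair (i, j)
            y += (V[i][p - 1] @ V[j][p - 1]) * x[i] * x[j]
    return y
```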

  6. Space Complexity: Active and Inactive Factors. Define the feature set $F^{(k)} = \{i \in F : k_i \ge k\}$. A factor $v_i^{(p)}$ is active if $i \in F^{(p)}$ and inactive if $i \in F \setminus F^{(p)}$; inactive factors need not be stored. Keeping only the active factors gives space complexity
$$O\!\left(\sum_{k=1}^{m} D^{(k)}\, |F^{(k)}|\right),$$
where $D^{(k)}$ is the dimension of the level-$k$ embeddings and $m$ is the number of rank levels.
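
One way to realize the "store only active factors" idea is to keep one embedding table per level, indexed only over $F^{(k)}$. A small sketch under that assumed layout (not the authors' implementation):

```python
import numpy as np

def build_active_tables(ranks, dims, seed=0):
    """Allocate only active factors: the level-p table covers F^{(p)} = {i : ranks[i] >= p}.

    ranks : ranks[i] = k_i          dims : dims[p-1] = D^{(p)}
    Total storage is sum_p D^{(p)} * |F^{(p)}|, matching the space bound on this slide.
    """
    rng = np.random.default_rng(seed)
    tables, row_of = [], []
    for p in range(1, len(dims) + 1):
        active = [i for i, k_i in enumerate(ranks) if k_i >= p]      # F^{(p)}
        row_of.append({i: r for r, i in enumerate(active)})          # feature id -> table row
        tables.append(0.01 * rng.standard_normal((len(active), dims[p - 1])))
    return tables, row_of

def lookup(tables, row_of, p, i):
    """Return v_i^{(p)}; valid only when feature i is active at level p (k_i >= p)."""
    return tables[p - 1][row_of[p - 1][i]]
```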

  7. Time Complexity. With $F^{(k)} = \{i \in F : k_i \ge k\}$ and the clipped level $\tilde{k}_{ij}^{(l,k)} = \max\big(l, \min(k_{ij}, k)\big)$, define the auxiliary variables
$$A_{l,k} = \sum_{i, j \in F^{(k)},\ i < j} \langle v_i^{(l)}, v_j^{(l)} \rangle\, x_i x_j = \frac{1}{2}\left[\Big\|\sum_{i \in F^{(k)}} v_i^{(l)} x_i\Big\|^2 - \sum_{i \in F^{(k)}} \big\|v_i^{(l)} x_i\big\|^2\right],$$
computable in $O(D^{(l)} |F^{(k)}|)$, and
$$B_{l,k} = \sum_{i, j \in F^{(l)},\ i < j} \langle v_i^{(\tilde{k}_{ij}^{(l,k)})}, v_j^{(\tilde{k}_{ij}^{(l,k)})} \rangle\, x_i x_j,$$
so that the pairwise term of RaFM is exactly $B_{1,m}$. It is easy to prove that $B_{l,k+1} - B_{l,k} = A_{k+1,k+1} - A_{k,k+1}$, so $B_{1,m}$ can be accumulated level by level with overall time complexity $O\!\left(\sum_{k=1}^{m} D^{(k)} |F^{(k)}|\right)$.
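
The identity above suggests computing $B_{1,m}$ level by level: start from $B_{1,1} = A_{1,1}$ and add $A_{k+1,k+1} - A_{k,k+1}$ at each level, evaluating every $A$ term with the usual FM trick. A hedged sketch, reusing the `V[i][p-1]` layout assumed in the naive version:

```python
import numpy as np

def rafm_pairwise_fast(V, k, x, m):
    """Pairwise term B_{1,m} via B_{l,k+1} - B_{l,k} = A_{k+1,k+1} - A_{k,k+1} (sketch)."""

    def A(l, kk):
        # A_{l,kk} = sum_{i<j in F^(kk)} <v_i^(l), v_j^(l)> x_i x_j, via the FM trick,
        # in O(D^(l) * |F^(kk)|).  Valid because kk >= l, so v_i^(l) exists for i in F^(kk).
        idx = [i for i in range(len(x)) if k[i] >= kk]               # F^(kk)
        if not idx:
            return 0.0
        vx = np.stack([V[i][l - 1] * x[i] for i in idx])
        return 0.5 * (np.sum(vx.sum(axis=0) ** 2) - np.sum(vx ** 2))

    B = A(1, 1)                      # B_{1,1}: every pair scored with level-1 embeddings
    for kk in range(1, m):
        B += A(kk + 1, kk + 1) - A(kk, kk + 1)
    return B                         # equals B_{1,m}, i.e. the RaFM pairwise term
```

The total cost is the sum of the individual $A$ evaluations, which is $O\!\left(\sum_{k} D^{(k)} |F^{(k)}|\right)$ as stated on the slide.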

  8. Learning Algorithm: Free and Dependent Factors. Recall $F^{(k)} = \{i \in F : k_i \ge k\}$. A factor $v_i^{(p)}$ is free if $i \in F^{(p)} \setminus F^{(p+1)}$ (level $p$ is the highest level of feature $i$) and dependent if $i \in F^{(p+1)}$. Training is formulated as a bi-level optimization:
$$\min \frac{1}{N} \sum_{x} L\big(B_{1,m}, y\big) \quad \text{s.t.} \quad v^{(p)}_{F^{(p+1)}} = \arg\min \frac{1}{N} \sum_{x} L\big(B_{1,p}, B_{1,p+1}\big), \quad 1 \le p \le m-1,$$
i.e., the free factors minimize the prediction loss, while the dependent factors are pushed to approximate the free (higher-level) factors; the validity of this approximation is proved by Thm. 6.
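
One possible way to realize this bi-level scheme is to alternate gradient steps: each feature's top-level (free) factor follows the prediction loss on $B_{1,m}$, while its lower-level (dependent) factors follow the approximation loss between $B_{1,p}$ and $B_{1,p+1}$, with the higher-level term treated as a fixed target within the step. The sketch below is only an illustration of that structure, not the authors' training algorithm; it reuses `rafm_pairwise_fast` from the previous slide and finite-difference gradients to stay self-contained.

```python
import numpy as np

def _num_grad(f, v, eps=1e-4):
    # Finite-difference gradient; used only to keep this illustration dependency-free.
    g = np.zeros_like(v)
    for t in range(v.size):
        e = np.zeros_like(v); e[t] = eps
        g[t] = (f(v + e) - f(v - e)) / (2 * eps)
    return g

def alternating_step(b, w, V, k, x, y, m, lr=0.05):
    """One alternating update for the bi-level objective (illustrative sketch only)."""
    predict = lambda: b + w @ x + rafm_pairwise_fast(V, k, x, m)

    # Outer problem: the free factor of feature i is v_i^{(k_i)}; it follows L(B_{1,m}, y).
    for i in range(len(x)):
        p = k[i]
        def f_out(v, i=i, p=p):
            old, V[i][p - 1] = V[i][p - 1], v
            val = (predict() - y) ** 2
            V[i][p - 1] = old
            return val
        V[i][p - 1] -= lr * _num_grad(f_out, V[i][p - 1])

    # Inner problems: dependent factors v_i^{(p)} with i in F^{(p+1)} push B_{1,p}
    # toward B_{1,p+1} (the higher-level representation built from free factors).
    for p in range(1, m):
        target = rafm_pairwise_fast(V, k, x, p + 1)       # fixed target for this step
        for i in range(len(x)):
            if k[i] >= p + 1:                             # i in F^{(p+1)} -> dependent
                def f_in(v, i=i, p=p):
                    old, V[i][p - 1] = V[i][p - 1], v
                    val = (rafm_pairwise_fast(V, k, x, p) - target) ** 2
                    V[i][p - 1] = old
                    return val
                V[i][p - 1] -= lr * _num_grad(f_in, V[i][p - 1])
    return b, w, V
```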

  9. Experiment. RaFM outperforms FM and is also more computationally efficient. Relative to FM: improvement of 0.5%~15%, model size of 20%~66%, training time of 24%~95%.

  10. Experiment: Results on the Tencent CTR Dataset, RaFM vs. FM. RaFM uses ranks 32 + 512; RaFM-low has performance similar to FM-32.

  11. Poster: Pacific Ballroom, Jun 13th, 6:30 PM ~ 9:00 PM, Poster ID 220. Thanks! Code: https://github.com/cxsmarkchan/RaFM. Xiaoshuang Chen: https://cxsmarkchan.github.io. Yin Zheng: https://sites.google.com/site/zhengyin1126
