RaFM
Rank-Aware Factorization Machines
Yin Zheng, on behalf of Xiaoshuang Chen, Yin Zheng, Jiaxing Wang, Wenye Ma, Junzhou Huang
Motivation
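Factorization Machines

Factorized embeddings for each feature:
$$\mathcal{V}_i = \mathbf{v}_i$$

Modeling pairwise interactions:
$$\langle \mathcal{V}_i, \mathcal{V}_j \rangle_{FM} = \langle \mathbf{v}_i, \mathbf{v}_j \rangle = \sum_{f=1}^{D} v_{i,f}\, v_{j,f}$$
$$\hat{y} = \mathrm{bias} + \sum_{i \in \mathcal{F}} w_i x_i + \sum_{i<j \in \mathcal{F}} \langle \mathcal{V}_i, \mathcal{V}_j \rangle\, x_i x_j$$

Different features have different frequencies of occurrence.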
What is the best rank of the embeddings?
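The FM prediction on this slide (bias, linear term, and pairwise term with inner products of rank-D embeddings) can be sketched in plain Python. This is a minimal illustration under assumptions: the names `fm_predict` and `dot`, the sparse dict layout, and the toy numbers are all hypothetical and not taken from the authors' code.

```python
from itertools import combinations


def dot(u, v):
    """Inner product of two equal-length embedding vectors."""
    return sum(a * b for a, b in zip(u, v))


def fm_predict(x, bias, w, v):
    """Fixed-rank FM prediction (hypothetical helper).

    x: dict feature index -> value (only non-zero features)
    w: dict feature index -> linear weight
    v: dict feature index -> embedding of length D
    """
    linear = sum(w[i] * xi for i, xi in x.items())
    # Direct double loop over feature pairs i < j.
    pairwise = sum(
        dot(v[i], v[j]) * x[i] * x[j]
        for i, j in combinations(sorted(x), 2)
    )
    return bias + linear + pairwise


# Toy example with D = 2 (made-up values).
x = {0: 1.0, 1: 1.0, 2: 2.0}
w = {0: 0.1, 1: -0.2, 2: 0.3}
v = {0: [1.0, 0.0], 1: [0.5, 0.5], 2: [0.0, 1.0]}
y_hat = fm_predict(x, 0.5, w, v)
```

Every feature here shares the same rank D, which is exactly the choice the question above puts in doubt.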
Motivation
Figure: Performance of FMs with fixed ranks on MovieLens Tag (annotated regions: Overfitting, Underfitting)
Basic Model
Figure: Rank-Aware Factorization Machines (panels: Low-Rank FM, High-Rank FM, Rank-Aware FM)
Basic Model
Rank-Aware Factorization Machines
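Multiple embeddings with different ranks:
$$\mathcal{V}_i = \{\mathbf{v}_i^{(1)}, \mathbf{v}_i^{(2)}, \dots, \mathbf{v}_i^{(k_i)}\}$$
$k_i$: the largest rank to avoid overfitting (hyperparameters)

Choose a proper rank for computation of pairwise interaction:
$$k_{ij} = \min(k_i, k_j)$$
$$\langle \mathcal{V}_i, \mathcal{V}_j \rangle_{RaFM} = \langle \mathbf{v}_i^{(k_{ij})}, \mathbf{v}_j^{(k_{ij})} \rangle$$
$$\hat{y} = \mathrm{bias} + \sum_{i \in \mathcal{F}} w_i x_i + \sum_{i<j \in \mathcal{F}} \langle \mathcal{V}_i, \mathcal{V}_j \rangle\, x_i x_j$$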
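The rank selection on this slide can be sketched as a direct double loop: each feature keeps a list of embeddings of increasing rank, and a pair (i, j) uses the highest rank level both features have. This is a sketch under assumptions: the function name `rafm_pairwise_naive`, the list-of-lists layout, and the toy values are hypothetical.

```python
from itertools import combinations


def dot(u, v):
    """Inner product of two equal-length embedding vectors."""
    return sum(a * b for a, b in zip(u, v))


def rafm_pairwise_naive(x, V):
    """Pairwise RaFM term computed pair by pair (hypothetical helper).

    x: dict feature index -> value
    V: dict feature index -> [v_i^(1), ..., v_i^(k_i)], so k_i = len(V[i])
    """
    total = 0.0
    for i, j in combinations(sorted(x), 2):
        k_ij = min(len(V[i]), len(V[j]))  # highest level shared by i and j
        total += dot(V[i][k_ij - 1], V[j][k_ij - 1]) * x[i] * x[j]
    return total


# Toy example: two rank levels with D_1 = 1 and D_2 = 2 (made-up values).
V = {
    0: [[1.0], [1.0, 2.0]],  # k_0 = 2
    1: [[2.0], [0.5, 1.0]],  # k_1 = 2
    2: [[3.0]],              # k_2 = 1
}
x = {0: 1.0, 1: 1.0, 2: 1.0}
pairwise = rafm_pairwise_naive(x, V)
```

In this toy setup only the pair (0, 1) uses the rank-2 embeddings; any pair involving feature 2 falls back to rank 1.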
Space Complexity
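Active and Inactive Factors

Described by feature sets:
$$\mathcal{F}_k = \{\, i \in \mathcal{F} : k_i \ge k \,\}$$

Active factors: $\mathbf{v}^{(p)}_{\mathcal{F}_p}$
Inactive factors: $\mathbf{v}^{(p)}_{\mathcal{F} \setminus \mathcal{F}_p}$ (need NOT be stored!)

Space complexity:
$$O\Big(\sum_{k=1}^{m} D_k\, |\mathcal{F}_k|\Big)$$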
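To make the storage count concrete, the sketch below sums D_k times the size of F_k over all rank levels, where F_k is the set of features whose largest level k_i is at least k. The function name, the rank values, and the feature assignment are hypothetical and chosen only for illustration.

```python
def rafm_embedding_params(ranks, k):
    """Number of stored embedding parameters (hypothetical helper).

    ranks: [D_1, ..., D_m], the rank of each level
    k: dict feature index -> k_i, the largest level kept for that feature
    """
    total = 0
    for level, d in enumerate(ranks, start=1):
        # F_level: features that keep an active embedding at this level.
        f_level = [i for i, ki in k.items() if ki >= level]
        total += d * len(f_level)
    return total


# Toy setup: 10 features, ranks D_1 = 4 and D_2 = 16, and only
# features 0..2 keep the rank-16 embedding (all made-up numbers).
ranks = [4, 16]
k = {i: (2 if i < 3 else 1) for i in range(10)}

rafm_params = rafm_embedding_params(ranks, k)      # 4*10 + 16*3 = 88
full_fm_params = ranks[-1] * len(k)                # 16*10 = 160
```

In this toy setup the rank-aware layout stores 88 embedding parameters, against 160 for a single rank-16 FM over the same features.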
Time Complexity
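Feature sets:
$$\mathcal{F}_k = \{\, i \in \mathcal{F} : k_i \ge k \,\}$$

Auxiliary Variables
$$A_{l,k} = \sum_{i<j \in \mathcal{F}_k} \langle \mathbf{v}_i^{(l)}, \mathbf{v}_j^{(l)} \rangle\, x_i x_j = \frac{1}{2}\Big[\Big\| \sum_{i \in \mathcal{F}_k} \mathbf{v}_i^{(l)} x_i \Big\|_2^2 - \sum_{i \in \mathcal{F}_k} \big\| \mathbf{v}_i^{(l)} x_i \big\|_2^2\Big], \quad \text{cost } O(D_l\, |\mathcal{F}_k|)$$
$$B_{l,k} = \sum_{i<j} \langle \mathbf{v}_i^{(k_{ij}^{l,k})}, \mathbf{v}_j^{(k_{ij}^{l,k})} \rangle\, x_i x_j, \qquad k_{ij}^{l,k} = \max\big(l, \min(k_{ij}, k)\big)$$

It is easy to prove that
$$B_{l,k+1} = B_{l,k} - A_{k,k+1} + A_{k+1,k+1}$$

The RaFM pairwise term is $B_{1,m}$.

Time complexity:
$$O\Big(\sum_{k=1}^{m} D_k\, |\mathcal{F}_k|\Big)$$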
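A minimal Python sketch of the auxiliary-variable computation on this slide. `aux_a` evaluates the rank-l pairwise term over the features in F_k with the square-of-sum identity. `rafm_pairwise_fast` then accumulates B_{1,m} by adding A_{k+1,k+1} and subtracting A_{k,k+1} at each level, and the result is checked against a direct double loop. The sketch assumes every feature has at least a rank-1 embedding, so that B_{1,1} = A_{1,1}. All function names, the data layout, and the toy values are hypothetical, not the authors' implementation.

```python
from itertools import combinations


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def aux_a(x, V, l, k):
    """A_{l,k}: level-l pairwise term over features with k_i >= k.

    Uses 0.5 * (||sum_i v_i x_i||^2 - sum_i ||v_i x_i||^2), which is
    linear in the number of features instead of quadratic.
    """
    active = [i for i in x if len(V[i]) >= k]
    if not active:
        return 0.0
    dim = len(V[active[0]][l - 1])
    s = [0.0] * dim   # running sum of v_i^(l) * x_i
    sq = 0.0          # running sum of ||v_i^(l) * x_i||^2
    for i in active:
        vi = V[i][l - 1]
        for f in range(dim):
            s[f] += vi[f] * x[i]
        sq += dot(vi, vi) * x[i] ** 2
    return 0.5 * (dot(s, s) - sq)


def rafm_pairwise_fast(x, V, m):
    """B_{1,m} built level by level from B_{1,1} = A_{1,1}."""
    b = aux_a(x, V, 1, 1)
    for k in range(1, m):
        # Pairs inside F_{k+1} switch from level k to level k+1.
        b += aux_a(x, V, k + 1, k + 1) - aux_a(x, V, k, k + 1)
    return b


def rafm_pairwise_naive(x, V):
    """Reference double loop using level min(k_i, k_j) for each pair."""
    total = 0.0
    for i, j in combinations(sorted(x), 2):
        k_ij = min(len(V[i]), len(V[j]))
        total += dot(V[i][k_ij - 1], V[j][k_ij - 1]) * x[i] * x[j]
    return total


# Toy example: D_1 = 1, D_2 = 2, feature 2 only has the rank-1 level.
V = {0: [[1.0], [1.0, 2.0]], 1: [[2.0], [0.5, 1.0]], 2: [[3.0]]}
x = {0: 1.0, 1: 2.0, 2: 0.5}
fast = rafm_pairwise_fast(x, V, m=2)
naive = rafm_pairwise_naive(x, V)
```

The subtraction term A_{k,k+1} needs the level-k embeddings of features that also have level k+1, so those lower-rank embeddings have to be kept even though the final interaction uses the higher rank.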
Learning Algorithm
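Free and Dependent Factors

Feature sets:
$$\mathcal{F}_k = \{\, i \in \mathcal{F} : k_i \ge k \,\}$$

Free factors: $\mathbf{v}^{(p)}_{\mathcal{F}_p \setminus \mathcal{F}_{p+1}}$
Dependent factors: $\mathbf{v}^{(p)}_{\mathcal{F}_{p+1}}$
Inactive factors: $\mathbf{v}^{(p)}_{\mathcal{F} \setminus \mathcal{F}_p}$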
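The three factor types can be illustrated by partitioning the features at each rank level p: a level-p factor is free if p is the feature's largest level, dependent if the feature also has level p+1, and inactive if the feature never reaches level p. This is a minimal sketch; the function name `classify_factors` and the toy assignment of k_i are hypothetical.

```python
def classify_factors(k, m):
    """Split features into free / dependent / inactive per level.

    k: dict feature index -> k_i (largest level of that feature)
    m: number of rank levels
    Returns: dict level p -> dict with three sets of feature indices.
    """
    result = {}
    for p in range(1, m + 1):
        f_p = {i for i, ki in k.items() if ki >= p}         # F_p
        f_next = {i for i, ki in k.items() if ki >= p + 1}  # F_{p+1}
        result[p] = {
            "free": f_p - f_next,      # level p is the feature's top level
            "dependent": f_next,       # a higher level also exists
            "inactive": set(k) - f_p,  # level p is never used
        }
    return result


# Toy example: features 0 and 1 have two levels, feature 2 has one.
k = {0: 2, 1: 2, 2: 1}
parts = classify_factors(k, m=2)
```

At the top level m the dependent set is always empty, since no feature has a level above m.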
Bi-Level Optimization
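$$\min\ \frac{1}{N} \sum_{\mathbf{x}} L\big(B_{1,m},\, y\big)$$
$$\mathbf{v}^{(p)}_{\mathcal{F}_{p+1}} = \arg\min\ \frac{1}{N} \sum_{\mathbf{x}} L\big(B_{1,p},\, B_{1,p+1}\big), \quad 1 \le p < m$$

Pushing dependent factors to approximate free factors
Proved by Thm. 6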
Experiment
RaFM is more efficient than FM.
Improvement: 0.5%~15%
Model Size: 20%~66%
Training Time: 24%~95%
Experiment
RaFM vs. FM: Results on Tencent CTR Dataset
RaFM-low has performance similar to FM-32.
RaFM: 32 + 512
Code: https://github.com/cxsmarkchan/RaFM
Xiaoshuang Chen: https://cxsmarkchan.github.io
Yin Zheng: https://sites.google.com/site/zhengyin1126
Poster: Pacific Ballroom, Jun 13th, 6:30PM~9:00PM, Poster ID 220