Sequence-Aware Factored Mixed Similarity Model for Next-Item Recommendation




  1. Sequence-Aware Factored Mixed Similarity Model for Next-Item Recommendation
Liulan Zhong, Jing Lin, Weike Pan∗ and Zhong Ming∗
zhongliulan2017@email.szu.edu.cn, linjing2018@email.szu.edu.cn, panweike@szu.edu.cn, mingz@szu.edu.cn
National Engineering Laboratory for Big Data System Computing Technology, Guangdong Laboratory of Artificial Intelligence and Digital Economy (SZ), College of Computer Science and Software Engineering, Shenzhen University, Shenzhen, China

  2. Introduction: Problem Definition (Next-Item Recommendation)
Input: $(u, \mathcal{S}^u)$, i.e., a sequence of items for each user $u$.
Goal: Rank the unobserved items at user $u$'s next step by estimating the scores $\hat{r}_{uj}$, $j \in \mathcal{I} \setminus \mathcal{I}_u$, to form the recommendation list.
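To make the input/output concrete, here is a minimal sketch of the next-item setting on toy, made-up data; the dictionary layout, item IDs, and the `recommend`/`score_fn` names are illustrative assumptions, not part of the paper.

```python
# Toy sequences S^u (ordered interaction histories) and the full item set I.
sequences = {
    "u1": ["i3", "i7", "i2"],
    "u2": ["i5", "i3"],
}
all_items = {"i1", "i2", "i3", "i4", "i5", "i6", "i7"}

def recommend(user, score_fn, top_k=3):
    """Rank the unobserved items j in I \\ I_u by a model-given score_fn(u, j)."""
    seen = set(sequences[user])
    candidates = all_items - seen
    return sorted(candidates, key=lambda j: score_fn(user, j), reverse=True)[:top_k]
```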

  3. Introduction: Notations (1/2)
Table: Some notations and explanations.
  n: number of users
  m: number of items
  u: user ID, u ∈ {1, 2, ..., n}
  i: item ID, i ∈ {1, 2, ..., m}
  U: the whole set of users
  I: the whole set of items
  P: the whole set of observed (u, i) pairs
  A: a sampled set of unobserved (u, i) pairs
  I_u: the set of items that user u has interacted with
  d ∈ R: number of latent dimensions
  V_i·, W_i· ∈ R^{1×d}: item-specific latent feature vectors w.r.t. item i
  b_i ∈ R: item bias
  γ: learning rate
  α_w, α_v, β_η, β_v: tradeoff parameters of the regularization terms
  T: iteration number
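As a rough correspondence between this notation table and code, the sketch below initializes the learnable quantities it lists; the toy sizes n, m, d, L and the small random initialization are illustrative assumptions.

```python
import numpy as np

n, m, d, L = 100, 500, 20, 3           # users, items, latent dimensions, Markov order (toy sizes)
rng = np.random.default_rng(0)

V     = rng.normal(0.0, 0.01, (m, d))  # V_i. : item factors used on the prediction side
W     = rng.normal(0.0, 0.01, (m, d))  # W_i. : item factors used on the history side
b     = np.zeros(m)                    # b_i  : item biases
eta   = np.zeros(L)                    # eta   : global weighting vector over the last L positions
eta_u = np.zeros((n, L))               # eta^u : personalized weighting vectors
```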

  4. Introduction: Notations (2/2)
Table: Some notations and explanations.
  S^u = {i_u^1, i_u^2, ..., i_u^{|S^u|}}: a sequence of items for user u
  i_u^t: the t-th item in S^u
  $\hat{r}_{u i_u^t}$: predicted preference of user u to item i_u^t
  s_ij: predefined similarity between item i and item j
  λ: tradeoff parameter in the mixed similarity
  L: the order of the Markov chains
  ℓ: the ℓ-th order of the Markov chains, ℓ ∈ {1, 2, ..., L}
  i_u^{t-ℓ}: the (t-ℓ)-th item in S^u
  η ∈ R^{1×L}: global weighting vector
  η^u ∈ R^{1×L}: personalized weighting vector w.r.t. user u

  5. Background: Motivation
Previously proposed methods usually model the general representation and the sequential representation in two separate factorization components. Our proposed model integrates the items' general similarity and the items' learnable sequential representations in one unified component.
It builds on Fossil [He et al., 2016], which captures short-term sequential information via high-order Markov chains. The rationale behind Fossil's weighting term $\eta_\ell + \eta_\ell^u$ is that each of the previous L positions should contribute with a different weight to the high-order smoothness; however, it lacks a weight contribution from the specific identity of the latest items.

  6. Background: Fossil Prediction Rule
On the basis of FISM [Kabbur et al., 2013], Fossil combines a similarity-based method and high-order Markov chains. The prediction function is as follows:

$$\hat{r}_{u i_u^t} = b_{i_u^t} + \bar{U}_{u\cdot}^{-i_u^t} V_{i_u^t \cdot}^T, \qquad (1)$$

where

$$\bar{U}_{u\cdot}^{-i_u^t} = \frac{1}{\sqrt{|\mathcal{I}_u \setminus \{i_u^t\}|}} \sum_{i' \in \mathcal{I}_u \setminus \{i_u^t\}} W_{i'\cdot} + \sum_{\ell=1}^{L} (\eta_\ell + \eta_\ell^u)\, W_{i_u^{t-\ell} \cdot}, \qquad (2)$$

and $\eta_\ell^u$ controls the weight of user u's preference and sequential dynamics, while $\eta_\ell$ is a global parameter shared by all the users.
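A minimal sketch of Eqs. (1)-(2), reusing the parameter arrays from the initialization sketch above; the function name and the list-based history handling are illustrative, and items are assumed to be integer indices.

```python
import numpy as np

def fossil_score(u, seq, t, V, W, b, eta, eta_u, L):
    """Fossil prediction for the item at position t of user u's sequence seq."""
    target = seq[t]
    history = [i for i in set(seq) if i != target]            # I_u \ {i_u^t}
    # long-term part: normalized sum of W over the history (left term of Eq. (2))
    u_bar = W[history].sum(axis=0) / np.sqrt(max(len(history), 1))
    # short-term part: weighted last-L items, i.e. the high-order Markov chain (right term)
    for l in range(1, L + 1):
        if t - l >= 0:
            u_bar += (eta[l - 1] + eta_u[u, l - 1]) * W[seq[t - l]]
    return b[target] + u_bar @ V[target]                      # Eq. (1)
```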

  7. Background: Overview of Our Solution
Sequence-Aware Factored Mixed Similarity Model (S-FMSM): our S-FMSM considers the weights of both the specific history item $i_u^{t-\ell}$ and its relative position in contributing to the target item $i_u^t$ for sequence modeling.

  8. Method: S-FMSM Prediction Rule
The predicted preference of user u to item $i_u^t$:

$$\hat{r}_{u i_u^t} = b_{i_u^t} + \bar{U}_{u\cdot}^{-i_u^t} V_{i_u^t \cdot}^T, \qquad (3)$$

where

$$\bar{U}_{u\cdot}^{-i_u^t} = \frac{1}{\sqrt{|\mathcal{I}_u \setminus \{i_u^t\}|}} \sum_{i' \in \mathcal{I}_u \setminus \{i_u^t\}} W_{i'\cdot} + \sum_{\ell=1}^{L} (\eta_\ell + \eta_\ell^u)\big((1-\lambda) + \lambda s_{i_u^t i_u^{t-\ell}}\big)\, W_{i_u^{t-\ell} \cdot}. \qquad (4)$$

Notes:
$s_{i_u^t i_u^{t-\ell}}$ is the cosine similarity between item $i_u^t$ and item $i_u^{t-\ell}$. In fact, what it captures is the weight of the history item $i_u^{t-\ell}$ in contributing to the target item $i_u^t$.
The tradeoff parameter λ, tuned among {0, 0.2, 0.4, 0.6, 0.8, 1}, adjusts the influence of $s_{i_u^t i_u^{t-\ell}}$ in preference prediction. Notice that when λ = 0, the model reduces to Fossil [He et al., 2016].
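A matching sketch of Eqs. (3)-(4). It scores an arbitrary candidate item j (the positive $i_u^t$ or a sampled negative), and takes the predefined item-item cosine similarities as a precomputed matrix S; how S is built (e.g. from the interaction matrix) is an assumption here, and setting lam = 0 recovers the Fossil sketch above.

```python
import numpy as np

def sfmsm_score(u, seq, t, j, V, W, b, eta, eta_u, S, L, lam):
    """S-FMSM prediction for candidate item j at step t of user u's sequence seq.
    S is an m x m matrix of predefined cosine similarities s_ij; lam is the tradeoff λ."""
    history = [i for i in set(seq) if i != j]                 # I_u \ {j}; equals I_u if j is unobserved
    u_bar = W[history].sum(axis=0) / np.sqrt(max(len(history), 1))
    for l in range(1, L + 1):
        if t - l >= 0:
            prev = seq[t - l]
            mix = (1.0 - lam) + lam * S[j, prev]              # (1 - λ) + λ s_{j, i_u^{t-ℓ}}
            u_bar += (eta[l - 1] + eta_u[u, l - 1]) * mix * W[prev]
    return b[j] + u_bar @ V[j]                                # Eq. (3)
```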

  9. Method: S-FMSM Objective Function
The objective function is as follows:

$$\min_{\Theta} \sum_{u \in \mathcal{U}} \; \sum_{i_u^t \in \mathcal{S}^u,\, t \neq 1} \; \sum_{j \notin \mathcal{I}_u} f_{u i_u^t j}, \qquad (5)$$

where $\Theta = \{V_{i\cdot}, W_{i\cdot}, b_i, \eta_\ell, \eta_\ell^u\}$, $i = 1, 2, \ldots, m$; $u = 1, 2, \ldots, n$; $\ell = 1, 2, \ldots, L$, and

$$f_{u i_u^t j} = -\ln \sigma(\hat{r}_{u i_u^t} - \hat{r}_{uj}) + \frac{\alpha_v}{2} \|V_{i_u^t \cdot}\|^2 + \frac{\alpha_v}{2} \|V_{j\cdot}\|^2 + \frac{\alpha_w}{2} \sum_{i' \in \mathcal{I}_u} \|W_{i'\cdot}\|^2 + \frac{\beta_v}{2} b_{i_u^t}^2 + \frac{\beta_v}{2} b_j^2 + \frac{\beta_\eta}{2} \|\eta_\ell\|^2 + \frac{\beta_\eta}{2} \|\eta_\ell^u\|^2$$

is a tentative objective function for a randomly sampled triple $(u, i_u^t, j)$ via "first positive $(u, i_u^t)$, then negative $j$".
Notes: Because the pairwise preference assumption relaxes the pointwise preference assumption, we adopt a personalized pairwise ranking loss and minimize it.
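A sketch of the per-triple term f in Eq. (5), built on the sfmsm_score sketch above; the regularization weights are the α/β hyperparameters from the notation table with made-up default values.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def triple_loss(u, seq, t, j, V, W, b, eta, eta_u, S, L, lam,
                alpha_v=0.01, alpha_w=0.01, beta_v=0.01, beta_eta=0.01):
    """f_{u, i_u^t, j}: pairwise ranking loss plus L2 regularization for one sampled triple."""
    pos = seq[t]
    r_pos = sfmsm_score(u, seq, t, pos, V, W, b, eta, eta_u, S, L, lam)
    r_neg = sfmsm_score(u, seq, t, j, V, W, b, eta, eta_u, S, L, lam)
    rank_loss = -np.log(sigmoid(r_pos - r_neg))               # -ln σ(r_pos − r_neg)
    reg = (alpha_v / 2) * (V[pos] @ V[pos] + V[j] @ V[j]) \
        + (alpha_w / 2) * sum(W[i] @ W[i] for i in set(seq)) \
        + (beta_v / 2) * (b[pos] ** 2 + b[j] ** 2) \
        + (beta_eta / 2) * (eta @ eta + eta_u[u] @ eta_u[u])
    return rank_loss + reg
```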

  10. Method: Gradients (1/2)
The gradient of each parameter $\theta \in \Theta$, i.e., $\nabla\theta = \frac{\partial f_{u i_u^t j}}{\partial \theta}$, is computed as follows:

$$\nabla b_{i_u^t} = \beta_v b_{i_u^t} + (-1)\,\sigma(\hat{r}_{uj} - \hat{r}_{u i_u^t}), \qquad (6)$$

$$\nabla b_j = \beta_v b_j + \sigma(\hat{r}_{uj} - \hat{r}_{u i_u^t}), \qquad (7)$$

$$\nabla V_{i_u^t \cdot} = \alpha_v V_{i_u^t \cdot} + (-1)\,\sigma(\hat{r}_{uj} - \hat{r}_{u i_u^t}) \Big[ \frac{1}{\sqrt{|\mathcal{I}_u \setminus \{i_u^t\}|}} \sum_{i' \in \mathcal{I}_u \setminus \{i_u^t\}} W_{i'\cdot} + \sum_{\ell=1}^{L} (\eta_\ell + \eta_\ell^u)\big((1-\lambda) + \lambda s_{i_u^t i_u^{t-\ell}}\big) W_{i_u^{t-\ell} \cdot} \Big], \qquad (8)$$

$$\nabla V_{j\cdot} = \alpha_v V_{j\cdot} + \sigma(\hat{r}_{uj} - \hat{r}_{u i_u^t}) \Big[ \frac{1}{\sqrt{|\mathcal{I}_u|}} \sum_{i' \in \mathcal{I}_u} W_{i'\cdot} + \sum_{\ell=1}^{L} (\eta_\ell + \eta_\ell^u)\big((1-\lambda) + \lambda s_{j i_u^{t-\ell}}\big) W_{i_u^{t-\ell} \cdot} \Big]. \qquad (9)$$

  11. Method: Gradients (2/2)

$$\nabla \eta_\ell = \beta_\eta \eta_\ell + (-1)\,\sigma(\hat{r}_{uj} - \hat{r}_{u i_u^t})\, W_{i_u^{t-\ell} \cdot} \Big[ V_{i_u^t \cdot}^T \big((1-\lambda) + \lambda s_{i_u^t i_u^{t-\ell}}\big) - V_{j\cdot}^T \big((1-\lambda) + \lambda s_{j i_u^{t-\ell}}\big) \Big], \quad \ell = 1, \ldots, L, \qquad (10)$$

$$\nabla \eta_\ell^u = \beta_\eta \eta_\ell^u + (-1)\,\sigma(\hat{r}_{uj} - \hat{r}_{u i_u^t})\, W_{i_u^{t-\ell} \cdot} \Big[ V_{i_u^t \cdot}^T \big((1-\lambda) + \lambda s_{i_u^t i_u^{t-\ell}}\big) - V_{j\cdot}^T \big((1-\lambda) + \lambda s_{j i_u^{t-\ell}}\big) \Big], \quad \ell = 1, \ldots, L, \qquad (11)$$

$$\nabla W_{i'\cdot} = \alpha_w W_{i'\cdot} + (-1)\,\sigma(\hat{r}_{uj} - \hat{r}_{u i_u^t}) \Big[ \frac{1}{\sqrt{|\mathcal{I}_u \setminus \{i_u^t\}|}} V_{i_u^t \cdot} - \frac{1}{\sqrt{|\mathcal{I}_u|}} V_{j\cdot} \Big], \quad i' \in \mathcal{I}_u \setminus \{i_u^t, i_u^{t-1}, \ldots, i_u^{t-L}\}, \qquad (12)$$

$$\nabla W_{i_u^t \cdot} = \alpha_w W_{i_u^t \cdot} + (-1)\,\sigma(\hat{r}_{uj} - \hat{r}_{u i_u^t}) \Big[ -\frac{1}{\sqrt{|\mathcal{I}_u|}} V_{j\cdot} \Big], \qquad (13)$$

$$\nabla W_{i_u^{t-\ell} \cdot} = \alpha_w W_{i_u^{t-\ell} \cdot} + (-1)\,\sigma(\hat{r}_{uj} - \hat{r}_{u i_u^t}) \Big[ \Big( V_{i_u^t \cdot} \big((1-\lambda) + \lambda s_{i_u^t i_u^{t-\ell}}\big) - V_{j\cdot} \big((1-\lambda) + \lambda s_{j i_u^{t-\ell}}\big) \Big)(\eta_\ell + \eta_\ell^u) \Big], \quad \ell = 1, \ldots, L. \qquad (14)$$
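To show how Eqs. (6)-(14) translate to code without repeating all nine of them, here is a sketch of just the two bias gradients (6)-(7); the remaining gradients follow the same pattern of a regularization term plus a σ-weighted error term. The helper name and the reuse of sigmoid from the loss sketch are assumptions.

```python
def bias_gradients(r_pos, r_neg, b, pos, neg, beta_v):
    """Eqs. (6)-(7): gradients of f w.r.t. the positive and negative item biases."""
    e = sigmoid(r_neg - r_pos)            # σ(r̂_uj − r̂_{u i_u^t}), shared by every gradient
    grad_b_pos = beta_v * b[pos] - e      # Eq. (6): the (−1) factor flips the error sign
    grad_b_neg = beta_v * b[neg] + e      # Eq. (7)
    return grad_b_pos, grad_b_neg
```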

  12. Method: Update Rules
We have the update rule for each parameter:

$$\theta = \theta - \gamma \nabla\theta, \qquad (15)$$

where $\gamma > 0$ is the learning rate and $\theta \in \Theta$, with $\Theta = \{V_{i\cdot}, W_{i\cdot}, b_i, \eta_\ell, \eta_\ell^u\}$, $i = 1, 2, \ldots, m$; $u = 1, 2, \ldots, n$; $\ell = 1, 2, \ldots, L$.
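A minimal sketch of Eq. (15) as an in-place SGD step on the entry of a parameter array touched by the current triple; the helper name and the default γ are illustrative.

```python
def sgd_update(param, index, grad, gamma=0.01):
    """Eq. (15): θ ← θ − γ ∇θ, applied to one row/entry of a parameter array."""
    param[index] -= gamma * grad

# e.g. sgd_update(b, pos, grad_b_pos) and sgd_update(b, j, grad_b_neg) after Eqs. (6)-(7)
```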

  13. Method: Algorithm
Algorithm 1: The algorithm of S-FMSM.
1: Initialize the model parameters.
2: for t = 1, ..., T do
3:   for each (u, i_u^t) ∈ P in a random order do
4:     Randomly pick an item j from I \ I_u
5:     Calculate the gradients according to Eqs. (6)-(14)
6:     Update the model parameters via Eq. (15)
7:   end for
8: end for
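A sketch of Algorithm 1 as a training loop over the observed (u, t) positions; compute_gradients is a hypothetical stand-in for Eqs. (6)-(14) (for instance built from the bias_gradients pattern above), assumed to return (array, index, gradient) triples so the Eq. (15) update can be applied uniformly.

```python
import random

def train(P, sequences, all_items, compute_gradients, T=50, gamma=0.01):
    """P is a list of observed (u, t) positions; sequences maps u -> S^u as item indices."""
    for epoch in range(T):                                    # lines 1-2: T passes over the data
        random.shuffle(P)                                     # line 3: random order over P
        for u, t in P:
            seq = sequences[u]
            unobserved = list(all_items - set(seq))
            j = random.choice(unobserved)                     # line 4: sample j from I \ I_u
            for array, index, g in compute_gradients(u, seq, t, j):  # line 5: Eqs. (6)-(14)
                array[index] -= gamma * g                     # line 6: Eq. (15)
```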

  14. Experiments: Datasets (1/2)
We adopt two groups of commonly used datasets in the experiments: the MovieLens data, i.e., MovieLens 100K (ML100K) and MovieLens 1M (ML1M), and the Amazon e-commerce data, i.e., Office Products (Office), Automotive (Auto), Video Games (Video), and Cell Phones & Accessories (Cell).
We treat all the observed behaviors as positive feedback and preprocess each dataset as follows (a sketch of these steps follows this slide):
  - We remove the records of users who rate fewer than five times;
  - We remove the records of items that are rated fewer than five times;
  - We sort all the records by timestamp and split each user's sequence into three parts: the item(s) at the last step for test, the item(s) at the penultimate step for validation, and the remaining items for training.
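The sketch below implements the three preprocessing steps on a pandas log with (user, item, timestamp) columns; the column names, the single-pass (rather than iterative) 5-record filter, and taking exactly one record per user for test and validation are assumptions.

```python
import pandas as pd

def preprocess(df, min_count=5):
    """Filter sparse users/items, sort by time, and split into train/validation/test."""
    # drop users with fewer than five records, then items rated fewer than five times
    df = df[df.groupby("user")["item"].transform("size") >= min_count]
    df = df[df.groupby("item")["user"].transform("size") >= min_count]
    # sort each user's records by timestamp
    df = df.sort_values(["user", "timestamp"])
    # position from the end of each user's sequence: 0 = last step, 1 = penultimate step
    pos_from_end = df.groupby("user").cumcount(ascending=False)
    test  = df[pos_from_end == 0]
    valid = df[pos_from_end == 1]
    train = df[pos_from_end >= 2]
    return train, valid, test
```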
