BERT4Rec: Sequential Recommendation with Bidirectional Encoder Representations from Transformer

  1. BERT4Rec: Sequential Recommendation with Bidirectional Encoder Representations from Transformer. Advisor: Jia-Ling Koh. Presenter: You-Xiang Chen. Source: CIKM’19. Date: 2020/04/20

  2. INTRODUCTION

  3. Introduction: Sequential Recommendation. [Figure: a recommender system uses a user's historical subsequence to predict the target item]

  4. Motivation & Goal ▪ Unidirectional models often assume a rigidly ordered sequence over the data, which does not always hold for user behavior in real-world applications. ▪ Proposal: a bidirectional self-attention network, BERT4Rec.

  5. Motivation & Goal ▪ Conventional bidirectional models encode each historical subsequence to predict the target item. ▪ This approach is very time- and resource-consuming, since a new sample must be created for each position in the sequence and each sample is predicted separately. ▪ Solution: introduce the Cloze task to produce more samples and train a more powerful model.
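
The Cloze idea on this slide can be sketched in a few lines: instead of building one sample per position, some items in the sequence are replaced with a mask token and the model is asked to recover them. This is only an illustrative sketch; the token name `MASK`, the helper `cloze_mask`, and the masking probability are assumptions, not values taken from the slides.

```python
import random

MASK = "[mask]"  # special token; the name is an assumption for this sketch


def cloze_mask(sequence, mask_prob, rng):
    """Randomly replace items with MASK and return (masked_sequence, labels).

    labels[i] holds the original item at masked positions and None elsewhere,
    so one pass over the sequence yields one training target per masked item.
    """
    masked, labels = [], []
    for item in sequence:
        if rng.random() < mask_prob:
            masked.append(MASK)
            labels.append(item)
        else:
            masked.append(item)
            labels.append(None)
    return masked, labels


rng = random.Random(0)  # fixed seed so the sketch is repeatable
history = ["v1", "v2", "v3", "v4", "v5"]
masked, labels = cloze_mask(history, mask_prob=0.4, rng=rng)
print(masked)
print(labels)
```

Re-running the masking with a different seed gives a different masked copy of the same history, which is how the Cloze task can produce more training samples from one sequence.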

  6. METHOD

  7. Problem Statement: sets of users and items; interaction sequence; output

  8. Framework

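  9. Embedding Layer: the input representation $h_i^0$ of each position is the sum of the item's embedding (a row of the item embedding matrix) and the position's embedding (a row of the d-dim. position embedding matrix).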
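
In BERT4Rec the input representation of each position is the element-wise sum of an item embedding and a position embedding. A minimal sketch of that lookup-and-add step follows; the function name `embed_sequence` and the toy tables (with d = 2) are made up for illustration.

```python
def embed_sequence(item_ids, item_emb, pos_emb):
    """Return one vector per position: item embedding + position embedding."""
    return [
        [v + p for v, p in zip(item_emb[item], pos_emb[pos])]
        for pos, item in enumerate(item_ids)
    ]


# Toy embedding tables with d = 2; the values are illustrative only.
item_emb = {"a": [1.0, 0.0], "b": [0.0, 1.0]}
pos_emb = [[0.5, 0.5], [0.25, -0.25]]

h0 = embed_sequence(["b", "a"], item_emb, pos_emb)
print(h0)  # [[0.5, 1.5], [1.25, -0.25]]
```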

  10. Transformer Layer: Multi-Head Self-Attention. $H^l = [h_1^l, h_2^l, \dots, h_t^l]$. Multi-head attention projects $H^l$ into $n$ subspaces and applies Scaled Dot-Product Attention in each.
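
Scaled dot-product attention can be sketched for a single head as below. This is a plain-Python illustration rather than the presenter's implementation: it omits the learned per-head projections, and the helper names are hypothetical. No causal mask is applied, so every position attends to both its left and right context, which is the bidirectional behavior the deck argues for.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def scaled_dot_product_attention(Q, K, V):
    """Compute softmax(Q K^T / sqrt(d_k)) V for lists of row vectors."""
    d_k = len(K[0])  # per-head dimension
    out = []
    for q in Q:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k) for k in K]
        weights = softmax(scores)
        out.append(
            [sum(w * v[j] for w, v in zip(weights, V)) for j in range(len(V[0]))]
        )
    return out


# Self-attention on a toy sequence of three 2-dimensional hidden vectors.
X = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
attended = scaled_dot_product_attention(X, X, X)
print(attended)
```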

  11. Transformer

  12. Multi-Head Attention

  13. Transformer Layer: Position-wise Feed-Forward Network, applied separately and identically at each position, with a Gaussian Error Linear Unit (GELU) activation function.
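
A position-wise feed-forward network with GELU can be sketched as two affine maps with the activation in between, applied to each row independently. The weight shapes and values below are illustrative assumptions, and `position_wise_ffn` is a hypothetical helper name.

```python
import math


def gelu(x):
    """Exact GELU: x * Phi(x), where Phi is the standard normal CDF."""
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))


def position_wise_ffn(H, W1, b1, W2, b2):
    """Apply GELU(h W1 + b1) W2 + b2 to every row h of H independently."""

    def affine(x, W, b):
        return [
            sum(xi * W[i][j] for i, xi in enumerate(x)) + b[j]
            for j in range(len(b))
        ]

    return [affine([gelu(z) for z in affine(h, W1, b1)], W2, b2) for h in H]


# Toy weights: d = 2, inner dimension 3 (values are illustrative only).
W1 = [[1.0, 0.0, 1.0], [0.0, 1.0, -1.0]]
b1 = [0.0, 0.0, 0.0]
W2 = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
b2 = [0.0, 0.0]

H = [[1.0, -1.0], [0.5, 2.0], [1.0, -1.0]]
ffn_out = position_wise_ffn(H, W1, b1, W2, b2)
print(ffn_out)
```

Rows 0 and 2 of `H` are identical, so their outputs are identical too: the network does not look at neighboring positions.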

  14. Gaussian Error Linear Units https://arxiv.org/pdf/1606.08415.pdf

  15. Transformer Layer: Stacking Transformer Layers. LN(·): layer normalization function (https://arxiv.org/pdf/1607.06450.pdf)
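
Stacking is commonly done by wrapping each sublayer in a residual connection followed by layer normalization, LN(x + Sublayer(x)). The sketch below assumes that form, leaves out dropout and the learnable scale and bias of LN, and uses hypothetical helper names and stand-in sublayers.

```python
import math


def layer_norm(x, eps=1e-5):
    """Normalize a vector to zero mean and (approximately) unit variance."""
    mean = sum(x) / len(x)
    var = sum((xi - mean) ** 2 for xi in x) / len(x)
    return [(xi - mean) / math.sqrt(var + eps) for xi in x]


def residual_block(x, sublayer):
    """LN(x + sublayer(x)); dropout is omitted in this sketch."""
    return layer_norm([a + b for a, b in zip(x, sublayer(x))])


def stack(x, sublayers):
    """Apply the residual blocks one after another."""
    for sublayer in sublayers:
        x = residual_block(x, sublayer)
    return x


# Stand-in sublayers; a real layer would use attention and the feed-forward net.
double = lambda v: [2.0 * t for t in v]
flip = lambda v: [-0.5 * t for t in v]

x = [1.0, 2.0, 3.0, 4.0]
y = stack(x, [double, flip])
print(y)
```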

  16. Output Layer. $W^P$: learnable projection matrix; $E$: embedding matrix of items
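
Using the two matrices named on this slide, the output layer can be sketched as a projection followed by a dot product with every item embedding and a softmax over items, in the form softmax(GELU(h W^P + b^P) E^T + b^O). The bias names `b_P` and `b_O` and all numeric values are assumptions for illustration.

```python
import math


def gelu(x):
    """Exact GELU: x * Phi(x), where Phi is the standard normal CDF."""
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def output_layer(h, W_P, b_P, E, b_O):
    """Return softmax(GELU(h W_P + b_P) E^T + b_O), one probability per item."""
    d = len(h)
    hidden = [
        gelu(sum(h[i] * W_P[i][j] for i in range(d)) + b_P[j]) for j in range(d)
    ]
    logits = [
        sum(hj * ej for hj, ej in zip(hidden, e_row)) + b
        for e_row, b in zip(E, b_O)
    ]
    return softmax(logits)


# Toy setup: d = 2 and three candidate items (values are illustrative only).
h = [1.0, -0.5]                            # final hidden state at a masked position
W_P = [[1.0, 0.0], [0.0, 1.0]]             # projection matrix
b_P = [0.0, 0.0]
E = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]   # item embedding matrix, one row per item
b_O = [0.0, 0.0, 0.0]

probs = output_layer(h, W_P, b_P, E, b_O)
print(probs)
```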

  17. Model Learning. Annotations: "masked version for user behavior"; "the masked item"
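
The BERT4Rec paper trains on the masked version of the user behavior with the negative log-likelihood of the masked items, averaged over the masked positions. A small sketch of that loss follows; the function name `masked_nll` and the toy probabilities are hypothetical.

```python
import math


def masked_nll(probs, labels):
    """Average -log p(true item) over masked positions only.

    probs[i] maps each candidate item to its predicted probability at position i;
    labels[i] is the true item at a masked position, or None if it was not masked.
    """
    terms = [
        -math.log(p[label]) for p, label in zip(probs, labels) if label is not None
    ]
    return sum(terms) / len(terms)


# Two positions; only the first one was masked (toy numbers).
toy_probs = [{"a": 0.5, "b": 0.5}, {"a": 0.25, "b": 0.75}]
toy_labels = ["a", None]
loss = masked_nll(toy_probs, toy_labels)
print(loss)
```

Unmasked positions contribute nothing to the loss, so the second position's prediction is ignored here.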

  18. EXPERIMENT

  19. Datasets

  20. Baselines. Non-sequential: ▪ POP ▪ BPR-MF ▪ NCF. Sequential: ▪ FPMC (Markov chain) ▪ GRU4Rec+ ▪ Caser ▪ SASRec

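  21. Evaluation metrics
      Hit Ratio: $\mathrm{HR@}K = \frac{\text{Number of Hits@}K}{\text{GT}}$
      Mean Reciprocal Rank: $\mathrm{MRR} = \frac{1}{|Q|} \sum_{i=1}^{|Q|} \frac{1}{\mathrm{rank}_i}$
      Normalized Discounted Cumulative Gain: $\mathrm{DCG}_k = \sum_{i=1}^{k} \frac{2^{rel_i} - 1}{\log_2(i + 1)}$, $\mathrm{NDCG@}K = \frac{\mathrm{DCG@}K}{\mathrm{IDCG}}$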
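
The three metrics can be computed directly from the rank of the ground-truth item in each test case. The sketch assumes one ground-truth item per test case with binary relevance, in which case IDCG = 1 and the DCG of a hit at rank r reduces to 1 / log2(r + 1). The function names and the example ranks are made up.

```python
import math


def hit_ratio_at_k(ranks, k):
    """Fraction of test cases whose ground-truth item is ranked in the top k."""
    return sum(1 for r in ranks if r <= k) / len(ranks)


def mean_reciprocal_rank(ranks):
    """Average of 1 / rank over all test cases."""
    return sum(1.0 / r for r in ranks) / len(ranks)


def ndcg_at_k(ranks, k):
    """NDCG@k with a single relevant item per case (IDCG = 1)."""
    return sum(1.0 / math.log2(r + 1) for r in ranks if r <= k) / len(ranks)


ranks = [1, 3, 12]  # 1-based rank of the ground-truth item for three test cases
print(hit_ratio_at_k(ranks, 10))    # 2 of 3 cases are hits
print(mean_reciprocal_rank(ranks))  # (1 + 1/3 + 1/12) / 3
print(ndcg_at_k(ranks, 10))         # (1 + 1/2 + 0) / 3
```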

  22. Performance. [Results table; annotations: Transformer, sequential, non-sequential, worst]

  23. Analysis on Bidirection and Cloze

  24. CONCLUSION ▪ We introduce a deep bidirectional sequential model called BERT4Rec for sequential recommendation. ▪ For model training, we introduce the Cloze task which predicts the masked items using both left and right context. ▪ Extensive experimental results on four real-world datasets show that our model outperforms state-of-the-art baselines.
