Future of Personalized Recommendation Systems, Xing Xie, Microsoft



SLIDE 1

Future of Personalized Recommendation Systems

Xing Xie Microsoft Research Asia

SLIDE 2

Recommendation Everywhere

SLIDE 3

Personalized News Feed

SLIDE 4

Online Advertising

SLIDE 5

History

  • 1990s (Tapestry, GroupLens): content-based filtering (CB) and collaborative filtering (CF)
  • 2006 (Netflix Prize): factorization-based models, e.g. SVD++
  • 2010 (various data competitions): hybrid models with machine learning (LR, FM, GBDT, etc.); pair-wise ranking
  • 2015 (deep learning): flourishing neural models such as PNN, Wide&Deep, DeepFM, xDeepFM
  • Ongoing: explainable recommendation, knowledge-enhanced recommendation, reinforcement learning, transfer learning, …
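The factorization-based line above can be illustrated with a minimal sketch: plain matrix factorization trained by SGD (not SVD++ itself; all hyperparameters and ratings below are illustrative).

```python
import numpy as np

def factorize(ratings, k=2, lr=0.05, reg=0.02, epochs=500, seed=0):
    """Approximate each rating r_ui by a dot product p_u . q_i of latent
    vectors, learned by stochastic gradient descent on observed ratings."""
    rng = np.random.default_rng(seed)
    P = {u: rng.normal(scale=0.1, size=k) for u, _, _ in ratings}
    Q = {i: rng.normal(scale=0.1, size=k) for _, i, _ in ratings}
    for _ in range(epochs):
        for u, i, r in ratings:
            err = r - P[u] @ Q[i]          # prediction error on this rating
            P[u] += lr * (err * Q[i] - reg * P[u])
            Q[i] += lr * (err * P[u] - reg * Q[i])
    return P, Q

data = [("u1", "i1", 5.0), ("u1", "i2", 1.0),
        ("u2", "i1", 4.0), ("u2", "i3", 5.0)]
P, Q = factorize(data)
prediction = P["u1"] @ Q["i1"]   # close to the observed rating 5
```

SVD++ extends this basic scheme with bias terms and implicit-feedback signals.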

SLIDE 6

Our Research

  • Deep learning based user modeling
  • Deep learning based recommendation
  • Knowledge enhanced recommendation
  • Explainable recommendation

SLIDE 7

Microsoft Recommenders

  • Helping researchers and developers quickly select, prototype, demonstrate, and productionize a recommender system
  • Accelerating enterprise-grade development and deployment of a recommender system into production
  • https://github.com/microsoft/recommenders
SLIDE 8

User Behavioral Data

SLIDE 9

Explicit User Representation

  • Demographic: age, gender, life stage, marital status, residence, education, vocation
  • Personality: openness, conscientiousness, extraversion, agreeableness, neuroticism, impulsivity, novelty-seeking, indecisiveness
  • Interests: food, book, movie, music, sport, restaurant
  • Status: emotion, event, health, wealth, device
  • Social: friend, coworker, spouse, children, other relatives, tie strength
  • Schedule: task, driving route, metro/bus line, appointment, vacation

SLIDE 10

Explicit vs Implicit

Implicit User Representation

  • IDs, texts, images, and networks are mapped to ID/text/image/network embeddings, which deep models fuse into user and item embeddings (end-to-end DNN)

Explicit User Representation

  • Feature engineering followed by classification/regression models

Representation | Pros | Cons
Explicit | Easy to understand; can be directly bid on by advertisers | Hard to obtain training data; difficult to satisfy complex and global needs
Implicit | Unified and heterogeneous user representation; end-to-end learning | Difficult to explain; needs fine-tuning for each task
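The implicit route can be sketched as follows; the embedding tables, sizes, and the single dense fusion layer are illustrative stand-ins for the deep models in the figure.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative embedding tables for two behavior sources (sizes made up).
id_table = rng.standard_normal((1000, 16))    # user-ID embeddings
text_table = rng.standard_normal((5000, 16))  # word embeddings

def implicit_user_vector(user_id, word_ids, w, b):
    """Fuse an ID embedding and pooled text embeddings into one user
    vector with a single dense layer (learned end-to-end in a real model)."""
    id_emb = id_table[user_id]
    text_emb = text_table[word_ids].mean(axis=0)  # simple text pooling
    x = np.concatenate([id_emb, text_emb])
    return np.tanh(w @ x + b)

w = rng.standard_normal((8, 32)) * 0.1
b = np.zeros(8)
user_vec = implicit_user_vector(42, [7, 123, 4500], w, b)  # 8-dim user embedding
```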

SLIDE 11

Query Log based User Modeling

Example queries: "gifts for classmates", "cool math games", "mickey mouse cartoon", "shower chair for elderly", "presbyopic glasses costco", "hearing aids", "groom to bride gifts", "tie clips", "philips shaver", "lipstick color chart", "womans ana blouse", "Dior Makeup"

Chuhan Wu, Fangzhao Wu, Junxin Liu, Shaojian He, Yongfeng Huang, Xing Xie, Neural Demographic Prediction using Search Query, WSDM 2019

SLIDE 12

Query Log based User Modeling

Example queries: "birthday gift for grandson", "central garden street", "google my health plan", "medicaid new York", "medicaid for elderly in new York", "alcohol treatment", "amazon.com", "documentary grandson", "youtube"

  • Different records have different informativeness
  • Neighboring records may be related, while distant ones usually are not
  • Different words may have different importance
  • The same word may have different importance in different contexts
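These observations motivate attention: weight each word (and, analogously, each record) by its informativeness before pooling. A minimal sketch with a fixed, hand-picked query vector; the real model learns these weights.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def attentive_pool(word_vecs, query):
    """Score each word against a query vector, then pool the words with
    softmax weights so that informative words dominate the result."""
    weights = softmax(word_vecs @ query)
    return weights @ word_vecs, weights

word_vecs = np.array([[1.0, 0.0],
                      [0.0, 1.0],
                      [3.0, 0.0],   # the "informative" word: aligns with the query
                      [1.0, 1.0]])
query = np.array([1.0, 0.0])
record_vec, weights = attentive_pool(word_vecs, query)
# weights.argmax() == 2: the pooled vector is dominated by word 2
```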

SLIDE 13

Query Log based User Modeling

SLIDE 14

Experiments

  • Dataset:
  • 15,346,617 users in total with age category labels
  • Randomly sampled 10,000 users for experiments
  • Search queries posted from October 1, 2017 to March 31, 2018

(Tables/figures: mapping between age category and age range; distribution of age category; distribution of query number per user; distribution of query length)

SLIDE 15

Experiments

  • Discrete features with a linear model
  • Continuous features with a linear model
  • Flat DNN models
  • Hierarchical LSTM model

SLIDE 16

User Age Inference

Queries from a young user vs. queries from an elderly user

SLIDE 17

Car / Pet Segment

SLIDE 18

Universal User Representation

  • Existing user representation learning methods are task-specific
  • Difficult to generalize to other tasks
  • Rely heavily on labeled data
  • Costly to exploit heterogeneous unlabeled user behavior data
  • Goal: learn universal user representations from heterogeneous and multi-source user data
  • Capture global patterns of online users
  • Easily applied to different tasks as additional user features
  • Do not rely on manually labeled data
SLIDE 19

Deep Learning Based Recommender System

  • Learning latent representations
  • Learning feature interactions

SLIDE 20

Motivations

  • We try to design a new neural structure that
  • Automatically learns explicit high-order interactions
  • Vector-wise interaction, rather than bit-wise
  • Different types of feature interactions can be combined easily
  • Goals
  • Higher accuracy
  • Reducing manual feature engineering work

Jianxun Lian, Xiaohuan Zhou, Fuzheng Zhang, Zhongxia Chen, Xing Xie, Guangzhong Sun, xDeepFM: Combining Explicit and Implicit Feature Interactions for Recommender Systems, KDD 2018

SLIDE 21

Compressed Interaction Network (CIN)
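A minimal numpy sketch of the CIN computation (layer widths and random weights are illustrative): each layer forms vector-wise (Hadamard) interactions between the previous layer's feature maps and the raw field embeddings, then compresses them with learned weights.

```python
import numpy as np

def cin_layer(x0, xk, num_feature_maps, rng):
    """One CIN layer: vector-wise interactions between the raw field
    embeddings x0 (m, D) and the previous layer's output xk (H_k, D)."""
    m, D = x0.shape
    hk = xk.shape[0]
    # z[i, j, d] = xk[i, d] * x0[j, d]: Hadamard interactions per dimension.
    z = np.einsum('id,jd->ijd', xk, x0)        # (H_k, m, D)
    w = rng.standard_normal((num_feature_maps, hk, m))
    # Each feature map compresses the H_k x m interaction grid.
    return np.einsum('hij,ijd->hd', w, z)      # (num_feature_maps, D)

rng = np.random.default_rng(0)
x0 = rng.standard_normal((5, 8))   # 5 fields, embedding size 8
x1 = cin_layer(x0, x0, 6, rng)     # first CIN layer: 2nd-order interactions
x2 = cin_layer(x0, x1, 4, rng)     # second CIN layer: 3rd-order interactions
# Sum pooling over the embedding dimension gives the CIN output features.
pooled = np.concatenate([x1.sum(axis=1), x2.sum(axis=1)])  # length 6 + 4
```

Because interactions are taken at the level of whole embedding vectors, the learned interactions are vector-wise and of explicit, bounded order (one extra order per layer).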

SLIDE 22

Relation with CNN

(Figure: CIN viewed as a CNN. The interaction tensor plays the role of the image; filters slide along the embedding dimension D, producing feature maps 1 through H_{K+1}.)

SLIDE 23

Extreme Deep Factorization Machine (xDeepFM)

  • Combines explicit and implicit feature interaction networks
  • Integrates both memorization and generalization
SLIDE 24

Data

  • Criteo: ads click-through-rate prediction
  • Dianping: restaurant recommendation
  • Bing News: news recommendation
SLIDE 25

Experiments

SLIDE 26

Experiments

SLIDE 27

Knowledge Graph

  • A kind of semantic network, where nodes represent entities or concepts and edges represent the semantic relations between them

SLIDE 28

Knowledge Enhanced Recommendation

  • Precision
    • More semantic content about items
    • Deep user interest
  • Diversity
    • Different types of relations in the knowledge graph
    • Extend user interest along different paths
  • Explainability
    • Connect user interest with recommendation results
    • Improve user satisfaction, boost user trust

SLIDE 29

Knowledge Graph Embedding

  • Learns a low-dimensional vector for each entity and relation in the KG, preserving structural and semantic knowledge
  • Distance-based models: apply a distance-based score function to estimate the probability of a triple (TransE, TransH, TransR, etc.)
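For instance, TransE models a relation as a translation in embedding space: a triple (h, r, t) is plausible when h + r ≈ t. A toy illustration with hand-made vectors:

```python
import numpy as np

def transe_score(h, r, t):
    """TransE distance score: smaller means the triple (h, r, t) is more plausible."""
    return np.linalg.norm(h + r - t)

dim = 4
head = np.zeros(dim)
rel = np.ones(dim)
good_tail = head + rel   # exactly satisfies h + r = t, so score is 0
bad_tail = head - rel    # far from the translated head, so score is larger
```

Training learns embeddings that drive this distance down for observed triples and up for corrupted ones.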

SLIDE 30

Knowledge Graph Embedding

  • Matching-based models: apply a similarity-based score function to estimate the probability of a triple (SME, NTN, MLP, NAM, etc.)

SLIDE 31

Knowledge Graph Embedding

Three ways to combine KG embedding (KGE) with the recommendation task (RS):
  • Successive training: first learn entity and relation vectors from the KG, then feed them into the RS to learn user and item vectors
  • Alternate training: alternate between the KGE objective and the RS objective
  • Joint training: learn entity/relation vectors and user/item vectors together

SLIDE 32

Deep Knowledge-aware Network

Hongwei Wang, Fuzheng Zhang, Xing Xie, Minyi Guo, DKN: Deep Knowledge-Aware Network for News Recommendation, WWW 2018

SLIDE 33

Deep Knowledge-aware Network

SLIDE 34

Extract Knowledge Representations

  • Additionally use contextual entity embeddings to include structural information
  • The context of an entity is its one-step neighbors
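One simple way to realize this, consistent with the description above: take the contextual embedding of an entity to be the average of its one-step neighbors' embeddings (entity names and vectors below are made up):

```python
import numpy as np

def context_embedding(entity, neighbors, embeddings):
    """Contextual entity embedding: the average of the embeddings of the
    entity's one-step neighbors in the knowledge graph."""
    return np.mean([embeddings[n] for n in neighbors[entity]], axis=0)

embeddings = {"Entity_A": np.array([1.0, 0.0]),
              "Entity_B": np.array([0.0, 2.0]),
              "Entity_C": np.array([4.0, 0.0])}
neighbors = {"Entity_A": ["Entity_B", "Entity_C"]}
ctx = context_embedding("Entity_A", neighbors, embeddings)  # [2.0, 1.0]
```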
SLIDE 35

Deep Knowledge-aware Network

SLIDE 36

Experiments

SLIDE 37

Examples

SLIDE 38

Ripple Network

  • User interests act as seed entities and propagate through the graph step by step
  • Preference strength decays during propagation

Hongwei Wang, et al., Ripple Network: Propagating User Preferences on the Knowledge Graph for Recommender Systems, CIKM 2018
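The propagate-with-decay idea can be sketched on a toy graph. The entities, edges, and fixed decay constant below are illustrative; the real model learns attention weights rather than using a fixed decay.

```python
from collections import defaultdict

def propagate(graph, seeds, hops=2, decay=0.5):
    """Spread unit preference from seed entities outward along the
    graph, multiplying by a decay factor at every hop."""
    scores = defaultdict(float)
    frontier = {s: 1.0 for s in seeds}
    for _ in range(hops):
        nxt = defaultdict(float)
        for node, weight in frontier.items():
            scores[node] += weight
            for nb in graph.get(node, []):
                nxt[nb] += weight * decay
        frontier = nxt
    for node, weight in frontier.items():
        scores[node] += weight
    return dict(scores)

kg = {"ForrestGump": ["RobertZemeckis", "TomHanks"],   # hypothetical edges
      "TomHanks": ["CastAway"],
      "RobertZemeckis": ["BackToTheFuture"]}
scores = propagate(kg, ["ForrestGump"])
# seed: 1.0; one-hop neighbors: 0.5; two-hop entities: 0.25
```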

SLIDE 39

Ripple Network

SLIDE 40

Experiments

SLIDE 41

Example

SLIDE 42

Explainable Recommendation Systems

  • Presentation quality: effectiveness, persuasiveness, readability
  • Model explainability: transparency, trust

SLIDE 43

Explainable Recommendation Systems

Example explanations:
  • "Their tan tan noodles are made of magic. The chili oil is really appetizing. However, prices are on the high side." (Fog Harbor Fish House)
  • "1-800-FLOWERS.COM – Elegant Flowers for Lovers"

Presentation quality (effectiveness, persuasiveness, readability) vs. model explainability (transparency, trust)

SLIDE 44

Problem Definition

  • Input
    • User set 𝑉, where 𝑣 ∈ 𝑉 is a user
    • Item set 𝑊, where 𝑤 ∈ 𝑊 is an item
    • A recommendation model to be explained, 𝑔(𝑣, 𝑤)
  • Output
    • Explanation 𝑨 = expgen(…), generated based on the selected interpretable components

Notation: 𝑣 is the user ID and user attributes; 𝑗 is the item ID; 𝑚𝑘 is the 𝑘th interpretable component, which is either selected or not selected

SLIDE 45

Outline

  • Can we enhance persuasiveness (presentation quality) in a data-driven way?
  • Can we build an explainable deep model (enhance model explainability)?
  • Can we design a pipeline that better balances presentation quality and model explainability?

  • Explainable Recommendation Through Attentive Multi-View Learning, AAAI 2019
  • A Reinforcement Learning Framework for Explainable Recommendation, ICDM 2018
  • Feedback-Aware Generative Model, shipped to Bing Ads; revenue increased by 0.5%

SLIDE 46

Explainable Recommendation for Ads

  • Search Ads
  • Native Ads / Outlook.com
  • Native Ads / MSN
  • Advertiser Platform

SLIDE 47

Feedback Aware Generative Model

  • Traditional Seq2Seq model: maximize the likelihood of the reference outputs

    argmax_𝜄 Σ_𝑗 𝑞(𝑧𝑗 | 𝑦𝑗; 𝜄)

  • Feedback-aware model: maximize the expected reward of outputs sampled from the model

    argmax_𝜄 Σ_𝑗 𝐸_{𝑧𝑗 ~ 𝑞(𝑧𝑗 | 𝑦𝑗; 𝜄)} [𝑠(𝑦𝑗, 𝑧𝑗)]

  Input 𝑦𝑗: ad title, category, keyword, sitelink title
  Output 𝑧𝑗: ad title, ad description, sitelink description
  Reward 𝑠(∙): CTR

  Example: for the input "Ad title: Flowers delivered today; Category: Occasions & Gifts", the model generates "Elegant flowers for any occasion. 100% smile guarantee!"
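The slides don't specify the estimator, but an expected-reward objective of this form is commonly optimized with a policy-gradient (REINFORCE-style) update. A toy sketch with a categorical "generator" over three candidate outputs, where output 2 is assumed to have the highest CTR:

```python
import numpy as np

def reinforce_step(logits, reward_fn, lr=0.5, n_samples=2000, rng=None):
    """One REINFORCE update on a categorical 'generator': sample outputs,
    weight their log-prob gradients by reward, so that high-reward outputs
    become more likely under the feedback-aware objective."""
    rng = rng or np.random.default_rng(0)
    probs = np.exp(logits - logits.max())
    probs /= probs.sum()
    grad = np.zeros_like(logits)
    for _ in range(n_samples):
        y = rng.choice(len(probs), p=probs)
        one_hot = np.eye(len(probs))[y]
        grad += reward_fn(y) * (one_hot - probs)  # r * d(log p(y))/d(logits)
    return logits + lr * grad / n_samples

logits = np.zeros(3)                         # three candidate outputs
reward = lambda y: 1.0 if y == 2 else 0.0    # pretend output 2 has the best CTR
for _ in range(50):
    logits = reinforce_step(logits, reward)
# After training, the generator strongly prefers output 2.
```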

SLIDE 48

Example Results

Input AdTitle US passport application Output AdDescriptions

Find US passport application and related articles. Search now! Quick & easy application. Apply for your passport online today! Quick & easy application. Find government passport application and related articles. Government passport application. Quick and easy to search results! Start your passport online today. Apply now & find the best results! Open your passport online today. 100% free tool!

Input AdTitle job applications online Output AdDescriptions

New: job application online. Apply today & find your perfect job! Now hiring - submit an application. Browse full & part time positions. 3 open positions left -- apply now! Jobs in your area Open positions left -- apply now! Job application online. 7 open positions left -- apply now! Jobs in your area Sales positions open. Hiring now - apply today!

  • The model can generate persuasive phrases
  • The results are diversified
  • The model can differentiate similar inputs

SLIDE 49

Explainable Recommendation Through Attentive Multi-View Learning

  • Existing methods are either "deep but unexplainable" or "explainable but shallow"
  • We want to develop an explainable deep model which
    • Achieves state-of-the-art accuracy while remaining explainable
    • Models multi-level user interest in an unsupervised manner

(Figure: example multi-level interest hierarchies, connected by IsA relations, for a 26-year-old female user and a 30-year-old male user)

SLIDE 50

Model

  • Hierarchical propagation of user-feature interest
  • Attentive multi-view learning
  • Hierarchical propagation of item-feature quality
  • Explanation template: "You might be interested in [features in E], on which this item performs well"

SLIDE 51

Data

  • Amazon Review and Yelp: user, item, rating, review text, timestamp

SLIDE 52

Accuracy

𝜇𝑤: weight for the co-regularization term

SLIDE 53

Explainability

  • 20 participants, all Yelp users
  • Collect their Yelp reviews and generate personalized explanations
  • Ask them to rate the usefulness of each explanation
SLIDE 54

Reinforcement Learning Framework for Explainable Recommendation

SLIDE 55

Coupled Agents

SLIDE 56

Optimization Goal

Reward 𝑠 combines model explainability and presentation quality

SLIDE 57

Evaluation

𝑁𝑑: presentation quality; 𝑁𝑓: explainability

SLIDE 58

Case Study

(Figure: frequent words in the reviews of User A and User B; words related to food vs. words related to services)

SLIDE 59

Conclusions and Future Work

  • Personalized recommendation systems will continue to develop in various directions, including effectiveness, diversity, computational efficiency, and explainability
  • Develop an easy-to-use tool for implementing deep learning based user representation and recommendation models
  • Collaborate with researchers in psychology, sociology, and other disciplines

SLIDE 60

Thanks!