Recommender Systems: The Power of Personalization (PowerPoint PPT Presentation)

SLIDE 1

Recommender Systems: The Power of Personalization

  • Presenter: Dr. Joseph A. Konstan, University of Minnesota (konstan@cs.umn.edu)
  • Moderator: Dr. Gary M. Olson, University of California, Irvine (golson@uci.edu)

SLIDE 2

ACM Learning Center (http://learning.acm.org)

  • 1,300+ trusted technical books and videos by leading publishers including O’Reilly, Morgan Kaufmann, and others
  • Online courses with assessments and certification-track mentoring; member discounts at partner institutions
  • Learning Webinars on big topics (Cloud Computing/Mobile Development, Cybersecurity, Big Data)
  • ACM Tech Packs on big current computing topics: annotated bibliographies compiled by subject experts
  • Learning Paths (accessible entry points into popular languages)
  • Popular video tutorials/keynotes from the ACM Digital Library; podcasts with industry leaders/award winners

SLIDE 3

SLIDE 4

A Bit of History

  • Ants, Cavemen, and Early Recommender Systems

– The emergence of critics

  • Information Retrieval and Filtering
  • Manual Collaborative Filtering
  • Automated Collaborative Filtering
  • The Commercial Era
SLIDE 5

A Bit of History

  • Ants, Cavemen, and Early Recommender Systems

– The emergence of critics

  • Information Retrieval and Filtering
  • Manual Collaborative Filtering
  • Automated Collaborative Filtering
  • The Commercial Era
SLIDE 6

Information Retrieval

  • Static content base

– Invest time in indexing content

  • Dynamic information need

– Queries presented in “real time”

  • Common approach: TF-IDF (term frequency, inverse document frequency)

– Rank documents by term overlap
– Rank terms by frequency
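A minimal sketch of TF-IDF ranking under the usual definitions; the corpus, tokenizer, and weighting below are illustrative assumptions, not taken from the talk:

```python
# Minimal TF-IDF sketch: documents are ranked by the summed
# TF-IDF weight of the query terms they contain.
import math
from collections import Counter

docs = [
    "the cat sat on the mat",
    "the dog chased the cat",
    "dogs and cats make good pets",
]
tokenized = [d.split() for d in docs]
N = len(tokenized)

def idf(term):
    # Inverse document frequency: rare terms get higher weight.
    df = sum(1 for doc in tokenized if term in doc)
    return math.log(N / df) if df else 0.0

def score(query, doc):
    tf = Counter(doc)  # term frequency within this document
    return sum(tf[t] * idf(t) for t in query.split())

query = "cat"
ranked = sorted(range(N), key=lambda i: score(query, tokenized[i]), reverse=True)
```

Real IR systems add normalization and smoothing, but the core idea is the same: overlap with the query, weighted so that rare terms count more.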

SLIDE 7

SLIDE 8

SLIDE 9

Information Filtering

  • Reverse assumptions from IR

– Static information need
– Dynamic content base

  • Invest effort in modeling user need

– Hand-created “profile”
– Machine-learned profile
– Feedback/updates

  • Pass new content through filters
SLIDE 10

SLIDE 11

A Bit of History

  • Ants, Cavemen, and Early Recommender Systems

– The emergence of critics

  • Information Retrieval and Filtering
  • Manual Collaborative Filtering
  • Automated Collaborative Filtering
  • The Commercial Era
SLIDE 12

Collaborative Filtering

  • Premise

– Information needs more complex than keywords or topics: quality and taste

  • Small Community: Manual

– Tapestry: database of content & comments
– Active CF: easy mechanisms for forwarding content to relevant readers

SLIDE 13

A Bit of History

  • Ants, Cavemen, and Early Recommender Systems

– The emergence of critics

  • Information Retrieval and Filtering
  • Manual Collaborative Filtering
  • Automated Collaborative Filtering
  • The Commercial Era
SLIDE 14

Automated CF

  • The GroupLens Project (CSCW ’94)

– ACF for Usenet News

  • users rate items
  • users are correlated with other users
  • personal predictions for unrated items

– Nearest-Neighbor Approach

  • find people with history of agreement
  • assume stable tastes
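The "find people with history of agreement" step can be sketched as a user-user correlation over co-rated items. The users, items, and 1-5 rating scale below are made up for illustration; Pearson correlation is one common agreement measure:

```python
# Sketch of the nearest-neighbor correlation step in user-user CF.
# Ratings data here is invented for illustration only.
from math import sqrt

ratings = {  # user -> {item: rating on a 1-5 scale}
    "alice": {"a": 5, "b": 3, "c": 4},
    "bob":   {"a": 4, "b": 2, "c": 5},
    "carol": {"a": 1, "b": 5, "c": 2},
}

def pearson(u, v):
    # Correlate two users over the items both have rated.
    common = set(ratings[u]) & set(ratings[v])
    if len(common) < 2:
        return 0.0
    ru = [ratings[u][i] for i in common]
    rv = [ratings[v][i] for i in common]
    mu, mv = sum(ru) / len(ru), sum(rv) / len(rv)
    num = sum((x - mu) * (y - mv) for x, y in zip(ru, rv))
    den = sqrt(sum((x - mu) ** 2 for x in ru)) * sqrt(sum((y - mv) ** 2 for y in rv))
    return num / den if den else 0.0
```

Users with high positive correlation become the "neighbors" whose ratings drive personal predictions.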
SLIDE 15

Usenet Interface

SLIDE 16

Does it Work?

  • Yes: The numbers don’t lie!

– Usenet trial: rating/prediction correlation

  • rec.humor: 0.62 (personalized) vs. 0.49 (avg.)
  • comp.os.linux.system: 0.55 (pers.) vs. 0.41 (avg.)
  • rec.food.recipes: 0.33 (pers.) vs. 0.05 (avg.)

– Significantly more accurate than predicting the average or modal rating
– Higher accuracy when partitioned by newsgroup

SLIDE 17

It Works Meaningfully Well!

  • Relationship with User Behavior

– Users were twice as likely to read items predicted 4/5 as those predicted 1/2/3

  • Users Like GroupLens

– Some users stayed 12 months after the trial!

SLIDE 18

A Bit of History

  • Ants, Cavemen, and Early Recommender Systems

– The emergence of critics

  • Information Retrieval and Filtering
  • Manual Collaborative Filtering
  • Automated Collaborative Filtering
  • The Commercial Era
SLIDE 19

SLIDE 20

SLIDE 21

Amazon.com

SLIDE 22

SLIDE 23

Recommenders

  • Tools to help identify worthwhile stuff

– Filtering interfaces

  • E-mail filters, clipping services

– Recommendation interfaces

  • Suggestion lists, “top-n,” offers and promotions

– Prediction interfaces

  • Evaluate candidates, predicted ratings
SLIDE 24

Historical Challenges

  • Collecting Opinion and Experience Data
  • Finding the Relevant Data for a Purpose
  • Presenting the Data in a Useful Way
SLIDE 25

Recommender Application Space

SLIDE 26

Scope of Recommenders

  • Purely Editorial Recommenders
  • Content Filtering Recommenders
  • Collaborative Filtering Recommenders
  • Hybrid Recommenders
SLIDE 27

Recommender Application Space

  • Dimensions of Analysis

– Domain
– Purpose
– Whose Opinion
– Personalization Level
– Privacy and Trustworthiness
– Interfaces
– <Algorithms Inside>

SLIDE 28

Domains of Recommendation

  • Content to Commerce

– News, information, “text”
– Products, vendors, bundles

SLIDE 29

Google: Content Example

SLIDE 30


SLIDE 31

Purposes of Recommendation

  • The recommendations themselves

– Sales
– Information

  • Education of user/customer
  • Build a community of users/customers around products or content

SLIDE 32

Buy.com customers also bought

SLIDE 33

Epinions Sienna overview

SLIDE 34

OWL Tips

SLIDE 35

ReferralWeb

SLIDE 36

Whose Opinion?

  • “Experts”
  • Ordinary “phoaks”
  • People like you
SLIDE 37

Wine.com Expert recommendations

SLIDE 38

PHOAKS

SLIDE 39

Personalization Level

  • Generic

– Everyone receives same recommendations

  • Demographic

– Matches a target group

  • Ephemeral

– Matches current activity

  • Persistent

– Matches long-term interests

SLIDE 40

Lands’ End

SLIDE 41

Brooks Brothers

SLIDE 42

Amazon.com

SLIDE 43

Cdnow album advisor

SLIDE 44

CDNow Album advisor recommendations

SLIDE 45

SLIDE 46

Privacy and Trustworthiness

  • Who knows what about me?

– Personal information revealed
– Identity
– Deniability of preferences

  • Is the recommendation honest?

– Biases built-in by operator

  • “business rules”

– Vulnerability to external manipulation

SLIDE 47

Interfaces

  • Types of Output

– Predictions
– Recommendations
– Filtering
– Organic vs. explicit presentation

  • Agent/ Discussion Interface Example
  • Types of Input

– Explicit
– Implicit

SLIDE 48

Wide Range of Algorithms

  • Simple Keyword Vector Matches
  • Pure Nearest-Neighbor Collaborative Filtering
  • Machine Learning on Content or Ratings
SLIDE 49

Collaborative Filtering: Techniques and Issues

SLIDE 50

Collaborative Filtering Algorithms

  • Non-Personalized Summary Statistics
  • K-Nearest Neighbor
  • Dimensionality Reduction
  • Content + Collaborative Filtering
  • Graph Techniques
  • Clustering
  • Classifier Learning
SLIDE 51

Teaming Up to Find Cheap Travel

  • Expedia.com

– “Data it gathers anyway”
– (Mostly) no cost to helper
– Valuable information that is otherwise hard to acquire
– Little processing, lots of collaboration

SLIDE 52

Expedia Fare Compare # 1

SLIDE 53

Expedia Fare Compare # 2

SLIDE 54

Zagat Guide Amsterdam Overview

SLIDE 55

Zagat Guide Detail

SLIDE 56

Zagat: Is Non-Personalized Good Enough?

  • What happened to my favorite guide?

– They let you rate the restaurants!

  • What should be done?

– Personalized guides, from the people who “know good restaurants!”

SLIDE 57

Collaborative Filtering Algorithms

  • Non-Personalized Summary Statistics
  • K-Nearest Neighbor

– user-user
– item-item

  • Dimensionality Reduction
  • Content + Collaborative Filtering
  • Graph Techniques
  • Clustering
  • Classifier Learning
SLIDE 58

CF Classic: K-Nearest Neighbor User-User

[Diagram: a C.F. Engine maintaining Ratings and Correlations stores, shown step by step in the following slides]

SLIDE 59

CF Classic: Submit Ratings


SLIDE 60

CF Classic: Store Ratings


SLIDE 61

CF Classic: Compute Correlations


SLIDE 62

CF Classic: Request Recommendations


SLIDE 63

CF Classic: Identify Neighbors


SLIDE 64

CF Classic: Select Items; Predict Ratings


SLIDE 65

Understanding the Computation

Movies: Hoop Dreams, Star Wars, Pretty Woman, Titanic, Blimp, Rocky XV. Ratings are on an A–F scale; Joe needs predictions (“?”) for Blimp and Rocky XV.

| User   | Ratings          |
|--------|------------------|
| Joe    | D, A, B, D, ?, ? |
| John   | A, F, D, F       |
| Susan  | A, A, A, A, A, A |
| Pat    | D, A, C          |
| Jean   | A, C, A, C, A    |
| Ben    | F, A, F          |
| Nathan | D, A, A          |
SLIDES 66–71

Understanding the Computation (animation steps repeating the ratings table from Slide 65)
SLIDE 72

ML-home

SLIDE 73

ML-scifi-search

SLIDE 74

ML-clist

SLIDE 75

ML-rate

SLIDE 76

ML-search

SLIDE 77

ML-buddies

SLIDE 78

User-User Collaborative Filtering

[Diagram: the target customer’s unknown rating is predicted as a weighted sum of neighbors’ ratings, here yielding 3]

SLIDE 79

A Challenge: Sparsity

  • Many E-commerce and content applications have many more customers than products
  • Many customers have no relationship
  • Most products have some relationship
SLIDE 80

Another challenge: Synonymy

  • Similar products treated differently

– Have skim milk? Want whole milk too?

  • Increases apparent sparsity
  • Results in poor quality

SLIDE 81

Item-Item Collaborative Filtering


SLIDE 82

Item-Item Collaborative Filtering


SLIDE 83

Item-Item Collaborative Filtering


SLIDE 84

Item Similarities

[Diagram: an m × n user-item ratings matrix R; the similarity s(i,j) between items i and j is computed from the users who rated both items]

SLIDE 85

Item-Item Matrix Formulation

[Diagram: for a target item i, the 5 closest neighbor items (ranked 1st–5th) are selected by similarity s(i,j); the user’s raw ratings on those neighbors generate the prediction via a weighted sum, or an approximation based on linear regression]
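A rough sketch of the item-item approach: compute similarities between item rating vectors (plain cosine here; adjusted cosine is another common choice), then predict from the user's own ratings on similar items. The data is invented for illustration:

```python
# Item-item CF sketch: cosine similarity over co-rating users,
# then a weighted sum over the target user's own ratings.
from math import sqrt

# item -> {user: rating}; toy data, not from the talk
item_ratings = {
    "x": {"u1": 5, "u2": 3, "u3": 4},
    "y": {"u1": 4, "u2": 2, "u3": 5},
    "z": {"u1": 1, "u2": 5},
}

def cosine(i, j):
    # Similarity computed only over users who rated both items.
    common = set(item_ratings[i]) & set(item_ratings[j])
    if not common:
        return 0.0
    num = sum(item_ratings[i][u] * item_ratings[j][u] for u in common)
    den = (sqrt(sum(item_ratings[i][u] ** 2 for u in common))
           * sqrt(sum(item_ratings[j][u] ** 2 for u in common)))
    return num / den if den else 0.0

def predict(user, target):
    # Weighted sum of the user's ratings on items similar to the target.
    pairs = [(cosine(target, j), r[user])
             for j, r in item_ratings.items()
             if j != target and user in r]
    den = sum(s for s, _ in pairs)
    return sum(s * x for s, x in pairs) / den if den else None
```

Because item-item similarities are relatively stable, they can be precomputed into a model, which is the source of the performance gain mentioned on the next slide.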

SLIDE 86

Item-Item Discussion

  • Good quality, in sparse situations
  • Promising for incremental model building

– Small quality degradation

  • Nature of recommendations changes

– Big performance gain

SLIDE 87

Collaborative Filtering Algorithms

  • Non-Personalized Summary Statistics
  • K-Nearest Neighbor
  • Dimensionality Reduction

– Singular Value Decomposition
– Factor Analysis

  • Content + Collaborative Filtering
  • Graph Techniques
  • Clustering
  • Classifier Learning
SLIDE 88

Dimensionality Reduction

  • Latent Semantic Indexing

– Used by the IR community
– Worked well with the vector space model
– Used Singular Value Decomposition (SVD)

  • Main Idea

– Term-document matching in feature space
– Captures latent association
– Reduced space is less noisy

SLIDE 89

SVD: Mathematical Background

R (m × n) = U (m × r) · S (r × r) · V′ (r × n)

Keeping only the k largest singular values yields Uk (m × k), Sk (k × k), and Vk′ (k × n). The reconstructed matrix Rk = Uk · Sk · Vk′ is the closest rank-k matrix to the original matrix R.
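The rank-k reconstruction can be sketched with an off-the-shelf SVD; the ratings matrix below is an arbitrary toy example, with 0 standing in for missing ratings:

```python
# Rank-k approximation of a ratings matrix via SVD.
import numpy as np

R = np.array([[5., 3., 0., 1.],
              [4., 0., 0., 1.],
              [1., 1., 0., 5.],
              [1., 0., 0., 4.]])

# Thin SVD: R = U @ diag(s) @ Vt
U, s, Vt = np.linalg.svd(R, full_matrices=False)

k = 2  # keep the k largest singular values
Rk = U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]  # closest rank-k matrix to R
```

By the Eckart-Young theorem, no other rank-k matrix is closer to R in Frobenius norm; in CF the reduced factors also serve directly as the compact user and item representations.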

SLIDE 90

SVD for Collaborative Filtering

  • 1. Low-dimensional representation

– Store the m × k and k × n factors: O(m+n) storage requirement

  • 2. Direct prediction of the full m × n matrix

SLIDE 91

Singular Value Decomposition

  • Reduce dimensionality of problem

– Results in small, fast model
– Richer neighbor network

  • Incremental update

– Folding in
– Model update

  • Trend

– Towards use of probabilistic LSI

SLIDE 92

Collaborative Filtering Algorithms

  • Non-Personalized Summary Statistics
  • K-Nearest Neighbor
  • Dimensionality Reduction
  • Content + Collaborative Filtering
  • Graph Techniques

– Horting: Navigate Similarity Graph

  • Clustering
  • Classifier Learning

– Rule-Induction Learning
– Bayesian Belief Networks

SLIDE 93

Resources

  • Survey Articles

– Recommender Systems: From Algorithms to User Experience (2012): http://www.grouplens.org/node/480
– Collaborative Filtering Recommender Systems (2011): http://www.grouplens.org/node/475

  • Books

– Recommender Systems: An Introduction (2010) by Jannach et al.
– Recommender Systems Handbook (2010) by Ricci et al.

  • Software Tools

– LensKit: http://lenskit.grouplens.org
– MyMedia: http://www.mymediaproject.org
– Mahout: http://mahout.apache.org

SLIDE 94

ACM: The Learning Continues

  • Questions about this webinar? learning@acm.org
  • ACM Learning Center: http://learning.acm.org
  • ACM SIGCHI: http://www.sigchi.org
  • ACM Conference on Recommender Systems: http://recsys.acm.org