Modeling User Behavior and Interactions - PowerPoint PPT Presentation

Modeling User Behavior and Interactions. Lecture 4: Search Personalization. Eugene Agichtein, Emory University. Lecture 4 Outline: 1. Approaches to Search Personalization; 2. Dimensions of Personalization.


  1. Modeling User Behavior and Interactions. Lecture 4: Search Personalization. Eugene Agichtein, Emory University.

  2. Lecture 4 Outline
      1. Approaches to Search Personalization
      2. Dimensions of Personalization
         1. Which queries to personalize?
         2. What input to use for personalization?
         3. Granularity: personalization vs. groupization
         4. Context: geographical, search session
      Eugene Agichtein, RuSSIR 2009, September 11-15, Petrozavodsk, Russia

  3. Approaches to Personalization
      1. Pitkow et al., 2002
      2. Qiu et al., 2006
      3. Jeh et al., 2003
      4. Teevan et al., 2005
      5. Das et al., 2007
      Figure adapted from: Personalized Search on the World Wide Web, by Micarelli, A., Gasparetti, F., Sciarrone, F., and Gauch, S., LNCS 2007.

  4. When to Personalize. Figure adapted from: Personalized Search on the World Wide Web, by Micarelli, A., Gasparetti, F., Sciarrone, F., and Gauch, S., LNCS 2007.

  5. Example: Outride. From Pitkow et al., 2002.

  6. Outride (Results). From Pitkow et al., 2002.

  7. Input to Personalization
      • Behavior (clicks): Qiu and Cho, 2006
        – Use clicks to tune a personalized (topic-sensitive) PageRank model for each user
        – Use personalized PageRank to re-rank web search results
      • Profile (user model): SeeSaw (Teevan et al., 2005)

  8. PageRank Computation
      I: set of incoming links; O: set of outgoing links
      c: dampening factor (~0.15), or "teleportation probability"
      E: some probability vector over the web pages

      PR(p) = (1 − c) · Σ_{q ∈ I(p)} PR(q) / |O(q)| + c · E(p)

      The E vector can be:
      • Uniformly distributed probabilities over all web pages (democratic)
      • Probabilities biased toward a number of important pages
        – Top levels of web servers
        – Hub/authority pages
      • Used for customization (personalization)
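The recursion on this slide can be sketched as a short power-iteration loop. This is an illustrative implementation, not the lecture's code: the graph representation, function name, and iteration count are assumptions, and dangling pages (no outgoing links) simply leak their mass for simplicity.

```python
def pagerank(out_links, E, c=0.15, iters=50):
    """Power-iteration PageRank with a teleportation vector E.

    out_links: dict mapping each page to the list of pages it links to.
    E: teleportation probabilities over pages (should sum to 1).
    c: dampening factor, i.e. the probability of jumping according to E.
    Note: dangling pages (empty out_links) lose their mass in this sketch.
    """
    pages = list(out_links)
    pr = {p: 1.0 / len(pages) for p in pages}          # uniform start
    for _ in range(iters):
        new = {p: c * E.get(p, 0.0) for p in pages}    # teleportation term
        for q in pages:
            if out_links[q]:
                share = (1 - c) * pr[q] / len(out_links[q])
                for p in out_links[q]:
                    new[p] += share                    # link-following term
        pr = new
    return pr

# tiny 3-page cycle, uniform teleportation
graph = {'a': ['b'], 'b': ['c'], 'c': ['a']}
pr = pagerank(graph, {p: 1 / 3 for p in graph})
# same graph, E biased entirely toward page 'a' (the personalized variant)
pr_biased = pagerank(graph, {'a': 1.0, 'b': 0.0, 'c': 0.0})
```

Setting E to a distribution concentrated on "important" pages, as in `pr_biased`, is exactly the biased variant the slide describes for customization.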

  9. Topic-Sensitive PageRank
      • Uninfluenced PageRank: "A page is important if many important pages point to it."
      • Influenced PageRank: "A page is important if many important pages point to it, and, by the way, the following are by definition important pages."
      Main idea:
      • Assign multiple a-priori "importance" estimates to pages, with respect to a set of topics
      • One PageRank score per basis topic
        – Query-specific rank score (+)
        – Makes use of context (+)
        – Inexpensive at runtime (+)

  10. PageRank vs. Topic-Sensitive PageRank
      • PageRank. Offline input: web graph G; offline output: a single rank vector r (page → page importance), computed by PageRank(). At query time, the query processor applies the same r to every query.
      • Topic-Sensitive PageRank. Offline input: the Web W and basis topics [c1, ..., c16], e.g. the 16 first-level categories of ODP; a classifier assigns (page, topic) → rank scores, and TSPageRank() outputs a list of rank vectors [r1, ..., r16], where rj gives each page's importance in topic cj. At query time, the query processor combines these vectors using the query and its context (e.g., Yahoo! or ODP categories).
      Eugene Agichtein, RuSSIR 2009, September 11-15, Petrozavodsk, Russia
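At query time, Topic-Sensitive PageRank mixes the precomputed per-topic vectors by the query's topic distribution: rank(p) = Σ_j P(cj | query) · rj(p). A minimal sketch of that mixing step, where the function name, pages, and topic distribution are invented for illustration:

```python
def ts_rank(topic_ranks, topic_probs, page):
    # Mix per-topic PageRank scores by the query's (context-dependent)
    # topic distribution: rank(p) = sum_j P(c_j | query) * r_j(p).
    return sum(prob * topic_ranks[topic].get(page, 0.0)
               for topic, prob in topic_probs.items())

# hypothetical precomputed vectors for two ODP-style basis topics
topic_ranks = {'Sports':    {'espn.com': 0.4,  'acm.org': 0.05},
               'Computers': {'espn.com': 0.02, 'acm.org': 0.3}}
# query classified as 75% Sports, 25% Computers
score = ts_rank(topic_ranks, {'Sports': 0.75, 'Computers': 0.25}, 'espn.com')
```

The classifier's output only changes the mixing weights, which is why the scheme stays inexpensive at runtime.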

  11. Input to Personalization
      • Behavior (clicks): Qiu and Cho, 2006
        – Use clicks to tune a personalized (topic-sensitive) PageRank model for each user
          • Map clicked results to ODP
        – Use personalized PageRank to re-rank web search results
      • Profile (user model): SeeSaw (Teevan et al., 2005)

  12. PS Search Engine (Profile-Based) [Teevan et al., 2005]. Screenshot for the example query "bellevue". User profile: content, interaction history.

  13. Result Re-Ranking
      • Ensures privacy
      • Good evaluation framework
      • Can look at rich user profiles
      • Look at lightweight user models
        – Collected on the server side
        – Sent as query expansion

  14. BM25 with Relevance Feedback
      Score = Σ_i tf_i · w_i

      w_i = log [ (r_i + 0.5)(N − n_i − R + r_i + 0.5) / ((n_i − r_i + 0.5)(R − r_i + 0.5)) ]

      where N is the number of documents in the corpus, n_i the number of documents containing term i, R the number of known relevant documents, and r_i the number of relevant documents containing term i.
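The weight above can be computed directly. A minimal sketch, with variable names following the slide; note that with no feedback (R = r_i = 0) it reduces to a plain IDF-style weight:

```python
import math

def rf_weight(N, n_i, R, r_i):
    # BM25 relevance-feedback term weight (Robertson/Sparck Jones form),
    # with 0.5 smoothing in every factor, as on the slide.
    return math.log((r_i + 0.5) * (N - n_i - R + r_i + 0.5)
                    / ((n_i - r_i + 0.5) * (R - r_i + 0.5)))

def score(tfs, weights):
    # Document score: sum of tf_i * w_i over the query terms.
    return sum(tf * w for tf, w in zip(tfs, weights))

no_feedback = rf_weight(1000, 10, 0, 0)     # reduces to an IDF-style weight
with_feedback = rf_weight(1000, 10, 10, 5)  # half the relevant docs contain the term
```

As expected, a term that occurs in many of the known relevant documents receives a larger weight than the same term without feedback.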

  15. User Model as Relevance Feedback
      Score = Σ_i tf_i · w_i

      The user model is folded into the corpus statistics:
      N' = N + R,  n_i' = n_i + r_i

      w_i = log [ (r_i + 0.5)(N' − n_i' − R + r_i + 0.5) / ((n_i' − r_i + 0.5)(R − r_i + 0.5)) ]
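The substitution N' = N + R, n_i' = n_i + r_i is a one-line change on top of the standard relevance-feedback weight. A sketch under that reading of the slide (the function name is illustrative):

```python
import math

def personalized_weight(N, n_i, R, r_i):
    # Fold the user model into the corpus statistics: the R documents the
    # user has seen are added to the corpus (N' = N + R), and the r_i of
    # them containing term i are added to its document frequency.
    N2, n2 = N + R, n_i + r_i
    return math.log((r_i + 0.5) * (N2 - n2 - R + r_i + 0.5)
                    / ((n2 - r_i + 0.5) * (R - r_i + 0.5)))

w = personalized_weight(1000, 10, 10, 5)
```

Because N' − n_i' − R + r_i simplifies to N − n_i, only the denominator's document-frequency factor actually changes relative to the unpersonalized weight.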

  16. User Model as Relevance Feedback
      Score = Σ_i tf_i · w_i
      [Diagram contrasting world-focused matching, where the corpus statistics N, n_i are drawn from the web and the relevance statistics R, r_i from the user model, with query-focused matching, where both are restricted to documents related to the query.]

  17. User Representation
      • Stuff I've Seen (SIS) index
        – MSR research project [Dumais et al.]
        – An index of everything a user has seen
      • Recently indexed documents
      • Web documents in the SIS index
      • Query history
      • None

  18. World Representation
      • Document representation
        – Full text
        – Title and snippet
      • Corpus representation
        – Web
        – Result set: title and snippet
        – Result set: full text

  19. Parameters
      • Matching: query-focused, world-focused
      • User representation: all of SIS, recent SIS, web documents in SIS, query history, none
      • World representation: full text, title and snippet
      • Query expansion: web, result set (full text), result set (title and snippet)

  20. Results: SeeSaw Improves Retrieval
      [Bar chart of DCG (0 to 0.6) for the conditions None (no user model), Rand (random), RF (relevance feedback), SS (SeeSaw), Web, and Combo.]

  21. Results: Feature Contribution
      [DCG bar chart (0 to 0.6) over the conditions None, Rand, RF, SS, Web, and Combo.]

  22. Summary
      • A rich user model is important for search personalization
      • SeeSaw improves text-based retrieval
      • Other features are needed to improve web retrieval in the future
      • Lots of room for improvement
      [Bar chart (0 to 1) over the conditions None, SS, Web, Group, ?]

  23. Evaluating Personalized Search
      • Explicit judgments (offline and in situ)
        – Evaluate components before the system
        – NOTE: what is relevant for you
      • Deploy the system
        – Verbatim feedback, questionnaires, etc.
        – Measure behavioral interactions (e.g., clicks, reformulation, abandonment, etc.)
        – Click biases: order, presentation, etc.
        – Interleaving for unbiased clicks
      • Link implicit and explicit (Curious Browser plugin)
      • Beyond a single query -> sessions and beyond
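"Interleaving for unbiased clicks" merges the two rankings being compared into one result list and credits each click to the ranker that contributed the clicked result, so position bias affects both sides equally. A simplified sketch of one common variant, team-draft interleaving (here each round the two sides pick once, in coin-flip order; function and document names are invented):

```python
import random

def team_draft_interleave(ranking_a, ranking_b, seed=None):
    """Merge two rankings: each round a coin flip decides which side drafts
    first, then both pick their best not-yet-chosen result. Returns the
    merged list and each result's team, used to credit clicks."""
    rng = random.Random(seed)
    merged, team, seen = [], {}, set()
    todo = set(ranking_a) | set(ranking_b)
    while seen != todo:
        sides = [('A', ranking_a), ('B', ranking_b)]
        if rng.random() < 0.5:
            sides.reverse()               # coin flip: who drafts first
        for name, ranking in sides:
            pick = next((d for d in ranking if d not in seen), None)
            if pick is not None:
                merged.append(pick)
                seen.add(pick)
                team[pick] = name
    return merged, team

merged, team = team_draft_interleave(['d1', 'd2', 'd3'], ['d2', 'd4', 'd3'], seed=0)
```

In a live comparison, clicks on team-A results count for ranker A and clicks on team-B results for ranker B, and the winner is the side with more credited clicks.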

  24. User Control in Personalization (RF). J-S. Ahn, P. Brusilovsky, D. He, and S.Y. Syn. Open user profiles for adaptive news systems: Help or harm? WWW 2007.

  25. Study: Comparing Personalization Strategies [Dou et al., 2007]
      • 10,000 users, 56,000 queries, and 94,000 clicks over 12 days
      • Used the first 11 days' worth of data to form user profiles and clicks
      • Simulated the application of five different personalization algorithms on the remaining 4,600 queries from the last day of the log
      • Retrieved the top 50 results for each query from the comparison search engine and assumed that clicking a link indicated a relevance judgment for the query
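This evaluation setup can be mimicked in a few lines: treat each logged click as a relevance judgment and score a re-ranking by how highly it places the clicked results. Average rank of the clicked results is one simple such metric; it is used here for illustration and is not necessarily the paper's exact metric, and the log entries below are hypothetical.

```python
def avg_click_rank(reranked, clicked):
    # Average 1-based position of the clicked (i.e. "relevant") results
    # in a re-ranked list; lower is better.
    ranks = [i + 1 for i, doc in enumerate(reranked) if doc in clicked]
    return sum(ranks) / len(ranks) if ranks else float('inf')

# hypothetical log entry: for this query the user clicked b and d
baseline = avg_click_rank(['a', 'b', 'c', 'd'], {'b', 'd'})
personalized = avg_click_rank(['b', 'd', 'a', 'c'], {'b', 'd'})
```

A personalization strategy that moves clicked results toward the top (as in `personalized`) scores better than the baseline ordering under this metric.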

  26. Results: Which Strategy is Most Effective? [Dou et al., 2007]
      • Compared two click-based (behavior) personalization strategies to three profile-based strategies
      • Click-based strategies appear more effective than profile-based (but carefully combining historical profile data helps slightly)
      • Search context is crucial
      • Personalization effectiveness varies by query
      • Evaluated using naïve click models
