
Relevance Feedback in Web Search. Sergei Vassilvitskii (Stanford University) and Eric Brill (Microsoft Research). PowerPoint PPT presentation.



  1. Relevance Feedback in Web Search Sergei Vassilvitskii (Stanford University) Eric Brill (Microsoft Research)

  2. Introduction
  • Web search is a non-interactive system; the exceptions are spell checking and query suggestions.
  • By design, search engines are stateless.
  • But many searches become interactive: query, get results back, reformulate the query...
  • This interaction can be used to infer user intent.

  3. Relevance Feedback

  4. Using This Information
  • Classical methods: e.g. Rocchio's term reweighting (TF-IDF) + cosine similarity scores.
  • But there is more information here: what can the structure of the web tell us?
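
As a concrete illustration, here is a minimal Python sketch of Rocchio-style relevance feedback over sparse TF-IDF term-weight vectors, scored with cosine similarity. The function names, the (alpha, beta, gamma) defaults, and the toy vectors are illustrative assumptions, not the authors' implementation.

```python
import math
from collections import Counter

def cosine(u, v):
    """Cosine similarity between two sparse term-weight vectors (dicts)."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    nu = math.sqrt(sum(w * w for w in u.values()))
    nv = math.sqrt(sum(w * w for w in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0

def rocchio(query_vec, relevant, irrelevant, alpha=1.0, beta=0.75, gamma=0.15):
    """Classic Rocchio update: move the query vector toward the centroid of
    relevant documents and away from the centroid of irrelevant ones."""
    new_q = Counter({t: alpha * w for t, w in query_vec.items()})
    for doc in relevant:
        for t, w in doc.items():
            new_q[t] += beta * w / len(relevant)
    for doc in irrelevant:
        for t, w in doc.items():
            new_q[t] -= gamma * w / len(irrelevant)
    # Negative term weights are conventionally clipped to zero.
    return {t: w for t, w in new_q.items() if w > 0}
```

After one update, terms from relevant documents enter the query and terms unique to irrelevant documents are suppressed, so re-scoring with cosine similarity promotes relevant-looking pages.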

  5. Hypothesis
  • For a given query, relevant pages tend to point to other relevant pages.
  ➡ Similar to PageRank.

  6. Hypothesis
  • For a given query:
  • Relevant pages tend to point to other relevant pages. ➡ Similar to PageRank.
  • Irrelevant pages tend to be pointed to by other irrelevant pages. ➡ A "reverse PageRank": those who point to web spam are likely to be spammers.

  7. Dataset
  • 9,500 queries; for each query, 5-30 result URLs.
  • Each URL rated on a scale of 1 (poor) to 5 (perfect).
  • Total: 150,000 (query, url, rating) triples.
  • We use this data to simulate relevance feedback by revealing the ratings of only some URLs.

  8. Hypothesis Validation: Baseline
  • Relevance distribution of all URLs in the dataset.
  [Bar chart: fraction of URLs at each rating, 1 to 5]

  9. Hypothesis Validation: Baseline vs. Perfect Targets
  • Relevance distribution of all URLs in the dataset, compared to that of the URLs that are targets of links from perfect results.
  [Bar chart: the two rating distributions over ratings 1 to 5]
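
The comparison above could be computed along these lines. The `links` graph, `rating` table, and helper names are hypothetical stand-ins for the real dataset, included only to make the two histograms concrete.

```python
from collections import Counter

def rating_histogram(ratings):
    """Normalized distribution over the rating scale 1 (poor) to 5 (perfect)."""
    counts = Counter(ratings)
    total = sum(counts.values()) or 1
    return {r: counts.get(r, 0) / total for r in range(1, 6)}

# Hypothetical link graph and relevance judgments for a single query.
links = {"perfect.com": ["good.com", "ok.com"]}
rating = {"perfect.com": 5, "good.com": 4, "ok.com": 3, "spam.com": 1}

# Baseline: rating distribution of all judged URLs.
baseline = rating_histogram(rating.values())

# Comparison: rating distribution of pages linked to by perfect (rating-5) results.
perfect_pages = [u for u, r in rating.items() if r == 5]
targets = [t for p in perfect_pages for t in links.get(p, ()) if t in rating]
target_dist = rating_histogram(rating[t] for t in targets)
```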

  10. Towards an Algorithm
  [Diagram: six search results, url 1 through url 6]

  11-14. Towards an Algorithm (animation build-up)
  [Diagram: the six URLs marked as unrated, good, or bad results, connected by links between them]

  15. Percolating the Ratings
  • To calculate the effect on a URL u:
  • Begin with a probability distribution on the relevance of u (the baseline histogram).
  • For every highly rated document v: if there exists a short path v → u, update u.
  • For every irrelevant document v: if there exists a short path u → v, update u.
  • Finally, combine the static score with the relevance-feedback information.
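
A minimal sketch of the percolation step, assuming a simple multiplicative reweighting of the distribution (the actual update rule is not specified on the slide) and a BFS check for short directed paths of at most 4 hops:

```python
def short_path_exists(graph, src, dst, max_hops=4):
    """BFS: is there a directed path from src to dst of at most max_hops links?"""
    frontier, seen = {src}, {src}
    for _ in range(max_hops):
        nxt = set()
        for node in frontier:
            for nbr in graph.get(node, ()):
                if nbr == dst:
                    return True
                if nbr not in seen:
                    seen.add(nbr)
                    nxt.add(nbr)
        frontier = nxt
    return False

def percolate(graph, baseline, good, bad, u, boost=2.0):
    """Start from the baseline relevance distribution for u (rating -> prob)
    and reweight it for every short path from a good result into u,
    or from u into a bad result, then renormalize."""
    dist = dict(baseline)
    for v in good:
        if short_path_exists(graph, v, u):
            for r in dist:           # evidence of relevance: boost high ratings
                if r >= 4:
                    dist[r] *= boost
    for v in bad:
        if short_path_exists(graph, u, v):
            for r in dist:           # evidence of irrelevance: boost low ratings
                if r <= 2:
                    dist[r] *= boost
    total = sum(dist.values())
    return {r: p / total for r, p in dist.items()}
```

The resulting distribution can then be folded into the static score, e.g. via its expected rating; how the two are combined is left open on the slide.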

  16. Algorithm Parameters
  • If there exists a "short" path...
  • The strength of the signal decreases with path length.
  • The recall of the system increases with path length.
  • For computational reasons, we looked at paths of 4 hops or less.

  17. Algorithm Parameters
  • If there exists a "short" path...
  • The strength of the signal decreases with path length.
  • The recall of the system increases with path length.
  • For computational reasons, we looked at paths of 4 hops or less.
  • ...update u: maintain a probability distribution on the relevance of u.

  18. Experimental Setup
  • For each query in the dataset, split the URLs into:
  • Train: the relevance is revealed to the algorithm.
  • Test: only the static score is revealed.
  • Compare the ranking of the test URLs by their static score vs. their static + RF scores.
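
The setup might be simulated as follows. The deterministic split, the toy score tables, and the additive static + RF combination are illustrative assumptions, not the paper's protocol.

```python
def split_urls(judged, train_fraction=0.5):
    """Split one query's judged URLs: ratings of the train half are revealed
    to the algorithm; the test half is ranked and evaluated."""
    urls = sorted(judged)                       # deterministic for the sketch
    k = max(1, int(len(urls) * train_fraction))
    return {u: judged[u] for u in urls[:k]}, urls[k:]

def rank(urls, score):
    """Order URLs by descending score."""
    return sorted(urls, key=score, reverse=True)

# Hypothetical scores for the held-out test URLs of one query.
static = {"a": 0.9, "b": 0.7, "c": 0.5}
rf_adjustment = {"a": 0.0, "b": 0.4, "c": 0.1}

static_order = rank(static, static.get)
combined_order = rank(static, lambda u: static[u] + rf_adjustment[u])
```

Comparing `static_order` against `combined_order` with a rank-quality measure (next slide) is exactly the static-score vs. static + RF comparison described above.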

  19. Evaluation Measure
  • Measure: NDCG (Normalized Discounted Cumulative Gain):
    NDCG ∝ Σ_i (2^rel(i) − 1) / log(1 + i)
  • Why NDCG?
  • Sensitive to the position of the highest-rated page.
  • Logarithmic discounting of results.
  • Normalized across lists of different lengths.
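
A small sketch of NDCG exactly as defined above: gain 2^rel(i) − 1 at 1-based position i, discounted by log(1 + i), and normalized by the DCG of the ideal (descending) ordering. The slides report NDCG on a 0-100 scale; this sketch uses 0-1.

```python
import math

def dcg(rels):
    """Discounted cumulative gain: sum over 1-based positions i of
    (2^rel(i) - 1) / log(1 + i)."""
    return sum((2 ** rel - 1) / math.log(1 + i)
               for i, rel in enumerate(rels, start=1))

def ndcg(rels):
    """Normalize by the DCG of the ideal (descending) ordering, so a
    perfectly ordered list scores 1.0."""
    ideal = dcg(sorted(rels, reverse=True))
    return dcg(rels) / ideal if ideal else 0.0
```

The exponential gain makes the measure especially sensitive to where the highest-rated page lands, which is why swapping a rating-5 page out of the top position costs far more than swapping lower-rated pages.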

  20-22. Result Summary
  • NDCG change for the algorithm (Alg) vs. Rocchio on three subsets of queries:
  • The complete dataset.
  • Only queries with NDCG < 100.
  • Only queries with NDCG < 85.
  [Bar chart: NDCG change for each method on each subset]
  • Rocchio demotes the best result; performance increases for the harder queries.

  23. Result Summary (2)
  • Recall of the algorithm (Alg) vs. Rocchio on the three datasets:
  • The complete dataset.
  • Only queries with NDCG < 100.
  • Only queries with NDCG < 85.
  [Bar chart: recall on a 0-30 scale]

  24. Result Summary (3)
  • Many more experiments:
  • How does the number of URLs rated affect the results?
  • Are some URLs better to rate than others?
  • Can we predict when recall will be low?

  25. Future Work
  • Hybrid systems: combining text-based and link-based RF approaches.
  • Learning feedback from clickthrough data.
  • Large-scale experimental evaluation of different RF approaches.

  26. Thank You. Any questions?
