Search Evaluation at Grooveshark
Yoni Teitelbaum
2013-07-02
Traditional Evaluation: TREC
Image Courtesy of TREC, http://trec.nist.gov
Disadvantages of TREC-Style Evaluation Methods
1. Expensive:
   a. e.g., 2005 GOV2 collection
      i. > 45k judgments [2]
      ii. > 25 million documents [3]
2. Mostly news articles
   a. significantly different data set than GS songs database
GS Weaknesses: Small Team, Few Resources
GS Strengths:
We’ve got a huge audience!
A/B Testing Using Click Data
A Group Sees: Song 1, Song 2, Song 3, Song 4
B Group Sees: Song 2, Song 3, Song 1, Song 4
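An A/B test of this kind needs each user to land in the same group on every search. A minimal sketch of one common way to do that, assuming a string user ID and a hash-based split; the function name `assign_group` and the experiment label are hypothetical, not taken from the talk:

```python
import hashlib


def assign_group(user_id: str, experiment: str = "search-ranking-test") -> str:
    """Deterministically map a user to group "A" or "B".

    Hypothetical helper: hashing the experiment label together with the
    user ID keeps a user's group stable across requests and independent
    of any other experiment that uses a different label.
    """
    digest = hashlib.sha256(f"{experiment}:{user_id}".encode("utf-8")).digest()
    # Use the parity of the first byte for a roughly even split.
    return "A" if digest[0] % 2 == 0 else "B"


if __name__ == "__main__":
    for uid in ("user-1", "user-2", "user-3"):
        print(uid, assign_group(uid))
```

Group A would then be served one ranking and group B the other, with clicks logged per group.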
What to Measure?
- Average Rank of Click?
- Bounce Rate (% of Searches Without a Click)
- Average Amount of Time Spent on Search Page?
- Median Rank of Click?
- ...?
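Several of the candidate metrics above can be computed directly from a click log. A minimal sketch, assuming each search is recorded as a list of the 1-based ranks that were clicked (an empty list meaning no click); the log format and the function name `click_metrics` are assumptions for illustration:

```python
from statistics import mean, median


def click_metrics(searches):
    """Compute simple click metrics for one test group.

    `searches` is a list with one entry per search; each entry is a list
    of the 1-based result ranks the user clicked (empty if no click).
    """
    if not searches:
        raise ValueError("need at least one search")
    all_ranks = [rank for clicks in searches for rank in clicks]
    bounces = sum(1 for clicks in searches if not clicks)
    return {
        "average_click_rank": mean(all_ranks) if all_ranks else None,
        "median_click_rank": median(all_ranks) if all_ranks else None,
        "bounce_rate": bounces / len(searches),
    }


if __name__ == "__main__":
    group_a = [[1], [3, 2], [], [4]]
    print(click_metrics(group_a))
```

Running the same function on each group's log gives the numbers to compare between A and B.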
So Which One's Better?
"Gold Standard" Algorithms [4]
Song 7 Song 2 Song 3 Song 5 Song 4 Song 6 Song 1 Song 8
Low Power on Conventional Metrics
Image courtesy of Radlinski, Kurup, and Joachims, 2008.
Low Power Cont'd
Image courtesy of Radlinski, Kurup, and Joachims, 2008.
Interleaving Method [5]
Algorithm A: Song 1A, Song 2A, Song 3A
Algorithm B: Song 1B, Song 2B, Song 3B
Interleaving Method
User Sees...
Song 1A Song 1B Song 2A Song 3A Song 2B Song 3B
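The idea of merging two rankings into the single list a user sees can be sketched in code. The example below uses team-draft interleaving, a widely used variant; it is not necessarily the scheme used at Grooveshark or the balanced scheme described in [5]. The function names and the choice to keep drafting until both rankings are exhausted are assumptions:

```python
import random


def team_draft_interleave(ranking_a, ranking_b, rng=None):
    """Merge two rankings; return (interleaved list, song -> team map)."""
    rng = rng or random.Random()
    interleaved, team, seen = [], {}, set()
    count_a = count_b = 0
    ia = ib = 0
    while True:
        # Skip songs that were already placed by the other algorithm.
        while ia < len(ranking_a) and ranking_a[ia] in seen:
            ia += 1
        while ib < len(ranking_b) and ranking_b[ib] in seen:
            ib += 1
        a_left, b_left = ia < len(ranking_a), ib < len(ranking_b)
        if not a_left and not b_left:
            break
        # The team with fewer picks goes next; a coin flip breaks ties.
        a_turn = count_a < count_b or (count_a == count_b and rng.random() < 0.5)
        if (a_turn and a_left) or not b_left:
            song, label = ranking_a[ia], "A"
            count_a += 1
        else:
            song, label = ranking_b[ib], "B"
            count_b += 1
        interleaved.append(song)
        team[song] = label
        seen.add(song)
    return interleaved, team


def credit_clicks(team, clicked_songs):
    """Return "A", "B", or "tie" for one search, based on clicked songs."""
    a = sum(1 for s in clicked_songs if team.get(s) == "A")
    b = sum(1 for s in clicked_songs if team.get(s) == "B")
    return "A" if a > b else "B" if b > a else "tie"


if __name__ == "__main__":
    shown, team = team_draft_interleave(
        ["Song 1", "Song 2", "Song 3"], ["Song 2", "Song 4", "Song 1"]
    )
    print(shown, credit_clicks(team, [shown[0]]))
```

Each search then yields a win for A, a win for B, or a tie, and those per-search outcomes are what gets aggregated afterwards.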
R Script to Process Results
Results From Interleaving Test
The Whole Stack
- HTML client (JavaScript)
- Server (PHP)
- Hive / Hadoop (SQL)
- Binomial Test (R script)
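The final stage of the stack is a binomial test on the interleaving outcomes. The original R script is not reproduced in these slides, so the following is only a sketch in Python of an exact two-sided binomial (sign) test under the null hypothesis that either algorithm is equally likely to win a search; the function name and the decision to drop ties before testing are assumptions:

```python
from math import comb


def binomial_two_sided_p(wins_a: int, wins_b: int) -> float:
    """Exact two-sided p-value for H0: P(A wins) = 0.5, ties excluded."""
    n = wins_a + wins_b
    if n == 0:
        return 1.0  # no decided searches, so no evidence either way
    k = max(wins_a, wins_b)
    # Probability of a result at least this lopsided in one direction.
    tail = sum(comb(n, i) for i in range(k, n + 1)) / 2**n
    # The null distribution is symmetric, so double the one-sided tail.
    return min(1.0, 2 * tail)


if __name__ == "__main__":
    print(binomial_two_sided_p(60, 40))
```

A small p-value suggests that users' clicks favor one algorithm more often than chance alone would explain.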
References
1. Text Retrieval Conference. http://trec.nist.gov/
2. TREC list of judgments for 2005 ad hoc query track. http://trec.nist.gov/data/terabyte/05/05.adhoc_qrels
3. University of Glasgow, Information Retrieval Group. http://ir.dcs.gla.ac.uk/test_collections/gov2-summary.htm
4. F. Radlinski, M. Kurup, and T. Joachims. How does clickthrough data reflect retrieval quality? In Conference on Information and Knowledge Management (CIKM), 2008.
5. T. Joachims. Evaluating retrieval performance using clickthrough data. In J. Franke, G. Nakhaeizadeh, and I. Renz, editors, Text Mining, pages 79-96. Physica/Springer Verlag, 2003.