
Query Session Detection as a Cascade (Matthias Hagen, Benno Stein)

Matthias Hagen, Benno Stein, Tino Rüb. Bauhaus-Universität Weimar, matthias.hagen@uni-weimar.de. CIKM 2011, Glasgow, Scotland, October 25, 2011.


  1. Query Session Detection as a Cascade
     Matthias Hagen, Benno Stein, Tino Rüb
     Bauhaus-Universität Weimar, matthias.hagen@uni-weimar.de
     CIKM 2011, Glasgow, Scotland, October 25, 2011
     Hagen, Stein, Rüb: Query Session Detection as a Cascade

  2. It’s quiz time!

  3. It’s quiz time! What is the user searching for? paris hilton

  4. Without context . . . paris hilton
     source: [http://upload.wikimedia.org/wikipedia/commons/2/26/Paris Hilton 3 Crop.jpg]

  5. What if you knew the previous queries?
     paris hotels
     paris marriott
     paris hyatt
     paris hilton

  6. What if you knew the previous queries?
     paris hotels
     paris marriott
     paris hyatt
     paris hilton
     sources: [http://www.alison-anderson.com/wp-content/uploads/hilton hotel paris 2.jpg] [http://maps.google.com/] [http://upload.wikimedia.org/wikipedia/en/e/eb/HI mk logo hiltonbrandlogo.jpg]

  7. Query sessions: same information need
     The benefits
     - Improved understanding of user intent
     - Improved retrieval performance via session knowledge

  8. Query sessions: same information need
     The benefits
     - Improved understanding of user intent
     - Improved retrieval performance via session knowledge
     The “minor” issue
     - Users do not announce when they start querying for a new information need.

  9. A typical query log

     User  Query                Click domain + Click rank  Time
     42    istanbul             en.wikipedia.org 1         2011-10-22 20:34:17
     42    istanbul archeology                             2011-10-23 12:02:54
     42    istanbul archeology  www.turizm.tr 6            2011-10-23 12:03:15
     42    istanbul archeology  www.arkeoloji.tr 13        2011-10-23 18:24:07
     42    constantinople                                  2011-10-23 19:12:40
     42    constantinople       en.wikipedia.org 4         2011-10-23 19:13:02
     42    soccr glasgo                                    2011-10-23 19:16:01
     42    soccer glasgow                                  2011-10-23 19:16:11
     42    soccer glasgow       www.soccer.uk 3            2011-10-23 19:16:15
     42    celtics vs rangers                              2011-10-23 20:33:04
     42    celtics vs rangers   en.wikipedia.org 5         2011-10-23 20:33:12
     42    old firm                                        2011-10-23 22:42:48

  10. How to determine the break points?

      User  Query                Click domain + Click rank  Time
      42    istanbul             en.wikipedia.org 1         2011-10-22 20:34:17
      42    istanbul archeology                             2011-10-23 12:02:54
      42    istanbul archeology  www.turizm.tr 6            2011-10-23 12:03:15
      42    istanbul archeology  www.arkeoloji.tr 13        2011-10-23 18:24:07
      42    constantinople                                  2011-10-23 19:12:40
      42    constantinople       en.wikipedia.org 4         2011-10-23 19:13:02
      - - - - - - - - - - - - - - - - - -
      42    soccr glasgo                                    2011-10-23 19:16:01
      42    soccer glasgow                                  2011-10-23 19:16:11
      42    soccer glasgow       www.soccer.uk 3            2011-10-23 19:16:15
      42    celtics vs rangers                              2011-10-23 20:33:04
      42    celtics vs rangers   en.wikipedia.org 5         2011-10-23 20:33:12
      42    old firm                                        2011-10-23 22:42:48

  11. The key is . . . Automatic query session detection

  12. Automatic query session detection
      Usual “technique”
      - Check for each pair of consecutive queries whether they express the same or a new information need.
      Example
      42  2011-10-22 20:34:17  istanbul
            same
      42  2011-10-23 18:24:07  istanbul archeology
            same
      42  2011-10-23 19:12:40  constantinople
      - - - - - - - - -  new
      42  2011-10-23 19:16:11  soccer glasgow
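The consecutive-query check on slide 12 can be sketched as a loop that opens a new session whenever a pairwise predicate answers "new". This is a minimal illustration, not the authors' code: `split_sessions` and `shares_term` are hypothetical names, and the term-sharing predicate is only a placeholder for whichever features are actually used.

```python
from datetime import datetime
from typing import Callable, List, Tuple

Entry = Tuple[datetime, str]  # (timestamp, query string) of a single user


def split_sessions(log: List[Entry],
                   same_need: Callable[[Entry, Entry], bool]) -> List[List[Entry]]:
    """Group a chronologically sorted log into sessions.

    Each query is compared only with its direct predecessor; a new
    session starts whenever the predicate reports a new information need.
    """
    sessions: List[List[Entry]] = []
    for entry in log:
        if sessions and same_need(sessions[-1][-1], entry):
            sessions[-1].append(entry)
        else:
            sessions.append([entry])
    return sessions


def shares_term(prev: Entry, cur: Entry) -> bool:
    """Placeholder predicate (an assumption): same need if any term is shared."""
    return bool(set(prev[1].split()) & set(cur[1].split()))


log = [
    (datetime(2011, 10, 22, 20, 34, 17), "istanbul"),
    (datetime(2011, 10, 23, 18, 24, 7), "istanbul archeology"),
    (datetime(2011, 10, 23, 19, 12, 40), "constantinople"),
    (datetime(2011, 10, 23, 19, 16, 11), "soccer glasgow"),
]
sessions = split_sessions(log, shares_term)
```

With this purely lexical placeholder, "constantinople" is split off from the Istanbul queries, although the slide labels it as the same information need. That gap is what the semantic features discussed later are meant to close.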

  13. Typical features
      Temporal thresholds
      - 5 minutes [Silverstein et al., 1999]
      - 10–15 minutes [He and Göker, 2000]
      - 30 minutes [Downey et al., 2007]
      - user specific [Murray et al., 2006]
      Lexical similarity
      - n-gram overlap [Zhang and Moffat, 2006]
      - Levenshtein distance [Jones and Klinkner, 2008]
      Semantic similarity
      - Search results [Radlinski and Joachims, 2005]
      - ESA [Lucchese et al., 2011]
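Three of the features listed on slide 13 are simple enough to sketch directly. The code below is an illustration under assumptions: the function names are hypothetical, the 30-minute default is just one of the thresholds cited on the slide, and the Jaccard coefficient over character 3-grams is an assumed overlap measure, not necessarily the one used in the cited work.

```python
from datetime import datetime, timedelta


def within_time_threshold(t_prev: datetime, t_cur: datetime, minutes: int = 30) -> bool:
    """Temporal feature: the gap between two queries is at most `minutes`."""
    return (t_cur - t_prev) <= timedelta(minutes=minutes)


def char_ngrams(text: str, n: int = 3) -> set:
    """Set of character n-grams of the lower-cased query."""
    text = text.lower()
    return {text[i:i + n] for i in range(len(text) - n + 1)}


def ngram_overlap(q1: str, q2: str, n: int = 3) -> float:
    """Lexical feature: Jaccard overlap of character n-grams (assumed measure)."""
    a, b = char_ngrams(q1, n), char_ngrams(q2, n)
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


def levenshtein(a: str, b: str) -> int:
    """Lexical feature: edit distance with unit-cost insert, delete, substitute."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        cur = [i]
        for j, cb in enumerate(b, start=1):
            cur.append(min(prev[j] + 1,                 # delete ca
                           cur[j - 1] + 1,              # insert cb
                           prev[j - 1] + (ca != cb)))   # substitute or match
        prev = cur
    return prev[-1]
```

On the log from the earlier slides, the typo correction "soccr glasgo" to "soccer glasgow" has edit distance 2 and a clearly positive 3-gram overlap, while "celtics vs rangers" and "old firm" share no 3-gram at all.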

  14. Previous methods
      Feature combinations
      - More accurate than single features
      - One of the best: Geometric method (time + lexical) [Gayo-Avello, 2009]

  15. Previous methods
      Feature combinations
      - More accurate than single features
      - One of the best: Geometric method (time + lexical) [Gayo-Avello, 2009]
      Shortcomings
      - All features evaluated simultaneously → runtime
      - Geometric method ignores semantics → accuracy
      Examples
      - Subset test suffices: soccer, soccer glasgow (same)
      - Geometric method fails: celtics vs rangers, old firm (same)

  16. We address the shortcomings in a cascade . . .
      source: [http://wp.ltchambon.com/wp-content/uploads/2010/09/Cascade-de-Tufs-Baume-les-messieurs-Jura.jpg]

  17. . . . well . . . a small 4-step cascade
      source: [http://www.solarshop.com/solarpix/Solar Cascade 4 Tier GreenL.jpg]

  18. . . . well . . . a small 4-step cascade
      Step 1: Subset test ↘
      Step 2: Geometric method ↘
      Step 3: ESA similarity ↙
      Step 4: Search results
      source: [http://www.solarshop.com/solarpix/Solar Cascade 4 Tier GreenL.jpg]
      Basic Idea
      - Feature cost (runtime) increases from step to step.
      - Expensive features are computed only if the previous steps are “unreliable.”
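The basic idea of slide 18 can be sketched as a generic control loop: steps are ordered by cost, and a later step runs only when every earlier one abstains. The slides do not define when a step counts as "unreliable", so this sketch assumes each step returns `None` to abstain. All names here are hypothetical, and the two sample steps are stand-ins, not the four steps of the talk.

```python
from typing import Callable, Optional, Sequence, Tuple

# A step returns True (same need), False (new need), or None (unreliable).
Step = Callable[[str, str], Optional[bool]]


def cascade_decision(prev_query: str, cur_query: str,
                     steps: Sequence[Step],
                     default: bool = False) -> Tuple[bool, int]:
    """Run steps from cheapest to most expensive; stop at the first verdict.

    Returns the verdict and the 1-based number of the deciding step,
    or (default, 0) if every step abstained (an assumed fallback).
    """
    for number, step in enumerate(steps, start=1):
        verdict = step(prev_query, cur_query)
        if verdict is not None:
            return verdict, number
    return default, 0


def cheap_term_step(prev_query: str, cur_query: str) -> Optional[bool]:
    """Stand-in cheap step: confident only when one term set contains the other."""
    prev_terms, cur_terms = set(prev_query.split()), set(cur_query.split())
    if prev_terms and cur_terms and (prev_terms <= cur_terms or cur_terms <= prev_terms):
        return True
    return None  # unreliable: defer to a more expensive step


def expensive_stub_step(prev_query: str, cur_query: str) -> Optional[bool]:
    """Stand-in for a costly feature; always answers 'new' in this sketch."""
    return False


steps = [cheap_term_step, expensive_stub_step]
easy = cascade_decision("soccer", "soccer glasgow", steps)
hard = cascade_decision("celtics vs rangers", "old firm", steps)
```

The first pair is settled by the cheap step alone, so the expensive step is never invoked for it; only the second pair pays the higher cost.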

  19. Step 1: Subset test

      User  Query                Click domain + Click rank  Time
      42    istanbul             en.wikipedia.org 1         2011-10-22 20:34:17
      42    istanbul archeology                             2011-10-23 12:02:54
      42    istanbul archeology  www.turizm.tr 6            2011-10-23 12:03:15
      42    istanbul archeology  www.arkeoloji.tr 13        2011-10-23 18:24:07
      - - - - - - - - - - - - - - - - - -
      42    constantinople                                  2011-10-23 19:12:40
      42    constantinople       en.wikipedia.org 4         2011-10-23 19:13:02
      - - - - - - - - - - - - - - - - - -
      42    soccr glasgo                                    2011-10-23 19:16:01
      - - - - - - - - - - - - - - - - - -
      42    soccer glasgow                                  2011-10-23 19:16:11
      42    soccer glasgow       www.soccer.uk 3            2011-10-23 19:16:15
      - - - - - - - - - - - - - - - - - -
      42    celtics vs rangers                              2011-10-23 20:33:04
      42    celtics vs rangers   en.wikipedia.org 5         2011-10-23 20:33:12
      - - - - - - - - - - - - - - - - - -
      42    old firm                                        2011-10-23 22:42:48
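The break points shown for Step 1 are consistent with a term-level subset test between consecutive queries. The sketch below assumes whitespace tokenization and a check in both directions (repetition, specialization, or generalization); the slides do not spell out these details, and `subset_test` and `break_positions` are hypothetical names.

```python
from typing import List


def subset_test(prev_query: str, cur_query: str) -> bool:
    """True if the terms of one query are a subset of the other's terms."""
    prev_terms = set(prev_query.lower().split())
    cur_terms = set(cur_query.lower().split())
    if not prev_terms or not cur_terms:
        return False  # an empty query carries no evidence
    return prev_terms <= cur_terms or cur_terms <= prev_terms


def break_positions(queries: List[str]) -> List[int]:
    """Indices of queries that are not linked to their predecessor by the test."""
    return [i for i in range(1, len(queries))
            if not subset_test(queries[i - 1], queries[i])]


# Query column of the example log, one entry per log row.
queries = [
    "istanbul", "istanbul archeology", "istanbul archeology", "istanbul archeology",
    "constantinople", "constantinople",
    "soccr glasgo",
    "soccer glasgow", "soccer glasgow",
    "celtics vs rangers", "celtics vs rangers",
    "old firm",
]
breaks = break_positions(queries)
```

Under these assumptions the test leaves five open break points, matching the five separator lines on the slide; the later, more expensive steps then decide which of them to close.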

  20. Step 2: Geometric method [Gayo-Avello, 2009]

      User  Query                Click domain + Click rank  Time
      42    istanbul             en.wikipedia.org 1         2011-10-22 20:34:17
      42    istanbul archeology                             2011-10-23 12:02:54
      42    istanbul archeology  www.turizm.tr 6            2011-10-23 12:03:15
      42    istanbul archeology  www.arkeoloji.tr 13        2011-10-23 18:24:07
      - - - - - - - - - - - - - - - - - -
      42    constantinople                                  2011-10-23 19:12:40
      42    constantinople       en.wikipedia.org 4         2011-10-23 19:13:02
      - - - - - - - - - - - - - - - - - -
      42    soccr glasgo                                    2011-10-23 19:16:01
      42    soccer glasgow                                  2011-10-23 19:16:11
      42    soccer glasgow       www.soccer.uk 3            2011-10-23 19:16:15
      - - - - - - - - - - - - - - - - - -
      42    celtics vs rangers                              2011-10-23 20:33:04
      42    celtics vs rangers   en.wikipedia.org 5         2011-10-23 20:33:12
      - - - - - - - - - - - - - - - - - -
      42    old firm                                        2011-10-23 22:42:48

  21. Step 3: Explicit Semantic Analysis [Gabrilovich and Markovitch, 2007]

      User  Query                Click domain + Click rank  Time
      42    istanbul             en.wikipedia.org 1         2011-10-22 20:34:17
      42    istanbul archeology                             2011-10-23 12:02:54
      42    istanbul archeology  www.turizm.tr 6            2011-10-23 12:03:15
      42    istanbul archeology  www.arkeoloji.tr 13        2011-10-23 18:24:07
      42    constantinople                                  2011-10-23 19:12:40
      42    constantinople       en.wikipedia.org 4         2011-10-23 19:13:02
      - - - - - - - - - - - - - - - - - -
      42    soccr glasgo                                    2011-10-23 19:16:01
      42    soccer glasgow                                  2011-10-23 19:16:11
      42    soccer glasgow       www.soccer.uk 3            2011-10-23 19:16:15
      42    celtics vs rangers                              2011-10-23 20:33:04
      42    celtics vs rangers   en.wikipedia.org 5         2011-10-23 20:33:12
      - - - - - - - - - - - - - - - - - -
      42    old firm                                        2011-10-23 22:42:48
