
Query Understanding in Web Search - by Large Scale Log Data Mining and Statistical Learning



  1. Query Understanding in Web Search - by Large Scale Log Data Mining and Statistical Learning. Hang Li, Microsoft Research Asia. Joint work with colleagues, interns, and collaborators. COLING 2010 NLPIX Workshop, August 28, 2010.

  2. Web Search Is Part of Our Life

  3. Search System = "Black Boxes"

  4. Advanced Web Search Technologies Are Used… • Natural Language Processing • Information Retrieval • Data Mining • Statistical Learning • Large Scale Distributed Computing

  5. Web Search Relies on NLP and IR • Query Understanding – Classification, structure prediction, topic modeling, similarity learning • Document Understanding – Classification, structure prediction, topic modeling, learning on graph • Query-Document Matching – Language model, similarity learning • Ranking – Learning to rank • User Understanding – Classification, topic modeling

  6. Query Understanding • Input: query • Output: query representation – Refined query (e.g., spelling error correction) – Similar queries – Categories – Topics – Key phrases – Named entities

  7. This Talk = Query Understanding

  8. Talk Outline: Three Projects • LOGAL: Search and Browse Log Mining Platform • Semantic Matching: Improving Tail Query Relevance • Context-aware Search: Better Search Using Context Information

  9. PROJECT: LOGAL (LOG OBJECT GALLERY). Joint work with Daxin Jiang, Xiaohui Sun

  10. LOGAL: Search and Browse Log Mining Platform

  11. Data Structure of Search/Browse Logs • Various types of data • Complex relationships among data objects – Hierarchical relationship – Sequential relationship [Figure: log objects arranged from top to bottom: Users, Sessions, Queries, Search result pages, Srch/Ads clicks, Follow-up clicks]
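
The hierarchical and sequential relationships on this slide could be modeled roughly as nested records. The sketch below is only an illustration: the class names, field names, and the `count_clicks` helper are assumptions made for this example, not the LOGAL schema.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Click:
    url: str
    kind: str  # assumed labels: "search", "ads", or "follow-up"


@dataclass
class ResultPage:
    urls: List[str]
    clicks: List[Click] = field(default_factory=list)


@dataclass
class Query:
    text: str
    result_pages: List[ResultPage] = field(default_factory=list)


@dataclass
class Session:
    # List order encodes the sequential relationship between queries.
    queries: List[Query] = field(default_factory=list)


@dataclass
class User:
    user_id: str
    sessions: List[Session] = field(default_factory=list)


def count_clicks(user: User) -> int:
    """Walk the hierarchy user -> session -> query -> result page -> click."""
    return sum(
        len(page.clicks)
        for session in user.sessions
        for query in session.queries
        for page in query.result_pages
    )


# Toy data, invented for the example.
page = ResultPage(
    urls=["d1", "d2"],
    clicks=[Click("d1", "search"), Click("d1/about", "follow-up")],
)
user = User("u1", sessions=[Session(queries=[Query("coling 2010", [page])])])
print(count_clicks(user))
```

Nesting keeps the hierarchy explicit, while list order inside a session preserves the sequence of queries.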

  12. Rich Log Mining Applications • Query Understanding – Query Expansion – Query Suggestion – Query Substitution – Query Classification – Keyword Generation • Document Understanding – Document Annotation – Document Classification – Document Summarization – Search Results Clustering • User Understanding – Search UI Design – Behavior Targeting Ads – Personalized Search – User Satisfaction Prediction • Query-Doc Matching – Contextual Advertising – Ad Click-Through Prediction – Web Site Recommendation – Document & Ad (re-)Ranking – Search Results Diversification

  13. The Problem: a huge gap between the data and the applications • Apps: Query Understanding, Document Understanding, User Understanding, Query-Doc Matching • Log data: Search Log, Ads. Log, Web Site Log, Toolbar Data

  14. The Problem: each researcher or developer 1. has to access the raw log data directly, 2. has to build the application from scratch. Very difficult to build large-scale log mining applications • Apps: Query Understanding, Document Understanding, User Understanding, Query-Doc Matching • Log data: Search Log, Ads. Log, Web Site Log, Toolbar Data

  15. Log Data Mining Platform • App level: Query Understanding, User Understanding, Query-Doc Matching, Document Understanding • Middle level (data platform): Log Objects Gallery (LOGAL) • Raw data level (raw logs): Search Log, Toolbar Data, Ads. Log, Web Site Log

  16. Query Histogram

      Query            Count
      facebook         3,157 K
      google           1,796 K
      youtube          1,162 K
      myspace            702 K
      facebook com       665 K
      yahoo              658 K
      yahoo mail         486 K
      yahoo com          486 K
      ebay               486 K
      facebook login     445 K

      Example applications: • Query auto completion • Query suggestion • Query analysis: temporal changes of query frequency
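
A query histogram and the auto-completion application listed on this slide can be sketched in a few lines. This is a minimal illustration on a made-up log; the function names, the lowercasing step, and the ranking rule (count first, then alphabetical) are assumptions, not the platform's actual behavior.

```python
from collections import Counter


def build_histogram(queries):
    """Count how often each (lowercased, trimmed) query string occurs."""
    return Counter(q.strip().lower() for q in queries if q.strip())


def autocomplete(histogram, prefix, k=3):
    """Return up to k queries starting with prefix, most frequent first."""
    prefix = prefix.lower()
    matches = [(q, c) for q, c in histogram.items() if q.startswith(prefix)]
    matches.sort(key=lambda qc: (-qc[1], qc[0]))
    return [q for q, _ in matches[:k]]


# Toy log, invented for the example.
log = ["facebook", "google", "facebook", "facebook login",
       "yahoo mail", "Facebook", "yahoo", "yahoo"]
hist = build_histogram(log)
print(hist.most_common(2))       # [('facebook', 3), ('yahoo', 2)]
print(autocomplete(hist, "fa"))  # ['facebook', 'facebook login']
```

The linear scan over all queries is only suitable for toy data; at log scale a prefix index such as a trie would be the more natural choice.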

  17. Click-through Bipartite • Example applications – Document (re-)ranking – Search results clustering – Web page summarization – Query suggestion [Figure: click-through bipartite]
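
One way to picture the click-through bipartite is as two adjacency maps, query to clicked documents and document to clicking queries. The sketch below builds both from (query, document) click pairs and uses them for a naive form of query suggestion: queries that share clicked documents. The scoring rule (number of shared documents) and all names and data are assumptions for illustration, not the method used in the talk.

```python
from collections import defaultdict


def build_bipartite(clicks):
    """Build query->doc and doc->query click-count maps from (query, doc) pairs."""
    q2d = defaultdict(lambda: defaultdict(int))
    d2q = defaultdict(lambda: defaultdict(int))
    for query, doc in clicks:
        q2d[query][doc] += 1
        d2q[doc][query] += 1
    return q2d, d2q


def suggest(query, q2d, d2q):
    """Rank other queries by how many clicked documents they share with query."""
    shared = defaultdict(int)
    for doc in q2d.get(query, {}):
        for other in d2q[doc]:
            if other != query:
                shared[other] += 1
    return sorted(shared, key=lambda q: (-shared[q], q))


# Toy click log, invented for the example (d1..d3 are placeholder documents).
clicks = [
    ("utube", "d1"), ("youtube", "d1"), ("youtube", "d1"),
    ("you tube videos", "d1"), ("youtube", "d2"),
    ("you tube videos", "d2"), ("facebook", "d3"),
]
q2d, d2q = build_bipartite(clicks)
print(suggest("youtube", q2d, d2q))  # ['you tube videos', 'utube']
```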

  18. Click Pattern [Figure: for a query, click patterns Pattern 1, Pattern 2, …, Pattern n over documents Doc 1 … Doc N, each pattern with a count] • Example applications – Estimate relevance of document to query – Predict users' satisfaction – Query classification (informational vs navigational)
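
As a rough illustration of click patterns and of the informational vs navigational application, the sketch below counts patterns (the set of clicked result positions per impression) and classifies a query by click entropy. Click entropy and the threshold of 1.0 bit are stand-ins chosen for this example; the slide does not say which features or classifier are actually used.

```python
import math
from collections import Counter


def click_patterns(impressions):
    """Count each distinct pattern, i.e. the sorted tuple of clicked positions."""
    return Counter(tuple(sorted(p)) for p in impressions)


def click_entropy(impressions):
    """Entropy (in bits) of the distribution of clicks over result positions."""
    clicks = Counter(pos for p in impressions for pos in p)
    total = sum(clicks.values())
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log2(c / total) for c in clicks.values())


def classify(impressions, threshold=1.0):
    """Assumed rule: concentrated clicks (low entropy) suggest a navigational query."""
    if click_entropy(impressions) < threshold:
        return "navigational"
    return "informational"


# Toy impressions, invented for the example: each tuple lists clicked positions.
nav_query = [(1,), (1,), (1,), (1, 2)]
info_query = [(1, 3), (2,), (4, 5), (3,)]
print(click_patterns(nav_query))
print(classify(nav_query), classify(info_query))
```

With this toy data almost all clicks for the first query land on position 1, so its entropy falls below the threshold, while the second query spreads clicks over five positions.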

  19. Session Pattern • Example applications – Doc (re-)ranking – Query suggestion – Site recommendation – User satisfaction prediction [Figure: user activities in a session: Query, Srch click, Ads click, …, Browse. Srch click: search click; Ads click: advertisement click]

  20. PROJECT: SEMANTIC MATCHING. Joint work with Gu Xu, Jun Xu, Jingfang Xu

  21. Semantic Matching: Improving Tail Query Relevance

  22. Different Queries Can Represent Same Intent: “Distance between Sun and Earth” - Luke DeLorme
      • "how far" earth sun • "how far" sun earth
      • average distance earth sun • average distance from earth to sun • average distance from the earth to the sun
      • distance between earth & sun • distance between earth and sun • distance between earth and the sun • distance between earth sun • distance between sun and earth • distance between the earth and sun • distance between the earth and the sun • distance between the sun and earth • distance between the sun and the earth
      • distance earth and sun • distance earth from sun • distance earth is from the sun • distance earth sun • distance earth to sun • distance earth to the sun
      • distance from earth to sun • distance from earth to the sun • distance from sun to earth • distance from sun to the earth • distance from the earth to the sun • distance from the sun to earth • distance from the sun to the earth
      • distance of earth from sun • distance of earth from the sun • distance of earth to sun • distance of earth to the sun • distance of sun from earth • distance of sun from the earth • distance of sun to earth • distance of the earth from the sun • distance of the earth to the sun • distance of the sun from earth • distance of the sun from the earth • distance of the sun to earth • distance of the sun to the earth
      • distance sun • distance sun and earth • distance sun earth • distance sun from earth • distance sun to earth • distance to sun from earth • distance to the sun from earth • earth and sun distance
      • how far away is the sun from earth • how far away is the sun from the earth • how far earth from sun • how far earth is from the sun • how far earth sun • how far from earth is the sun • how far from earth to sun • how far from the earth to the sun • how far from the sun is earth • how far from the sun is the earth
      • how far is earth away from the sun • how far is earth from sun • how far is earth from the sun • how far is earth to the sun • how far is it from earth to the sun • how far is it from the earth to the sun • how far is sun from earth
      • how far is the earth away from the sun • how far is the earth from sun • how far is the earth from the sun • how far is the earth to the sun • how far is the sun • how far is the sun away from earth • how far is the sun away from the earth

  23. Different Levels of Semantic Matching (level of semantics, from highest to lowest) • Structure: match intent with answers (structures of query and document) – "Microsoft Office home": find homepage of Microsoft Office – "21 movie": find movie named 21 – "buy laptop less than 1000": find online dealers to buy laptop with less than 1000 dollars • Topic: match topics of query and documents – "Microsoft Office" (Topic: PC Software) vs "… working for Microsoft … my office is in …" (Topic: Personal Homepage) • Word Sense: match terms with same meanings – utube / youtube – NY / New York – motherboard / mainboard • Term: match exactly same terms – NY / New York – disk / disc
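
The difference between the two lowest levels, exact term matching and word-sense matching, can be shown with a toy matcher. The sketch below assumes a small hand-written table of equivalent terms taken from the slide's examples; the table, the choice of canonical forms, and the function names are all assumptions for illustration, and a real system would learn such equivalences from data.

```python
# Assumed equivalence table built from the slide's examples; the canonical
# form on the right-hand side is an arbitrary choice made for this sketch.
SAME_MEANING = {"utube": "youtube", "ny": "new york", "mainboard": "motherboard"}


def term_match(query, doc):
    """Term level: every query term must appear verbatim in the document."""
    doc_terms = set(doc.lower().split())
    return all(term in doc_terms for term in query.lower().split())


def normalize(text):
    """Rewrite each term to its canonical form if the table knows one."""
    return " ".join(SAME_MEANING.get(t, t) for t in text.lower().split())


def sense_match(query, doc):
    """Word-sense level: compare after mapping equivalent terms together."""
    return term_match(normalize(query), normalize(doc))


print(term_match("utube videos", "youtube videos online"))   # False
print(sense_match("utube videos", "youtube videos online"))  # True
print(sense_match("NY hotels", "cheap New York hotels"))     # True
```

Topic and structure matching need models beyond a lookup table, so they are left out of this sketch.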

  24. Semantic Matching Is Useful for • General Search Relevance • Vertical Search • Entity Search • Task Completion

  25. SYSTEM VIEW OF SEMANTIC MATCHING

  26. Overall System [Figure: system diagram with the labels Query, Online Query Processing, Query Representation, Semantic Matching, Ranked Documents, Query Knowledge, Document Representations, Online, Offline, Query Index, Document Index, Offline Query Processing, Offline Document Processing, Search Log Data, Web Data]

  27. Online Query Processing (example, from the input query to the structured representation) • Input: michael jordan berkele • Query Refinement (Sense): michael jordan berkeley • Similar Query Finding: michael I. jordan berkeley; michael jordan berkeley • Query Topic Identification (Topic): michael I. jordan berkeley: academic; michael jordan berkeley: academic • Named Entity Recognition in Query (Structure): [michael I. jordan: PersonName] [berkeley: Location]: academic; [michael jordan: PersonName] [berkeley: Location]: academic
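
The four stages on this slide can be chained as a toy pipeline. In the sketch below every stage is a dictionary lookup over hand-written tables that cover only the slide's example; the tables, the function names, the lowercasing of the query, and the stage order (refinement, similar queries, topic, named entities) are assumptions inferred from the example, whereas the real system would rely on models mined from search logs.

```python
# Hypothetical lookup tables standing in for learned models.
SPELLING = {"berkele": "berkeley"}
SIMILAR = {"michael jordan berkeley": ["michael i. jordan berkeley"]}
TOPIC = {
    "michael jordan berkeley": "academic",
    "michael i. jordan berkeley": "academic",
}
ENTITIES = {
    "michael i. jordan": "PersonName",
    "michael jordan": "PersonName",
    "berkeley": "Location",
}


def refine(query):
    """Query refinement: fix known misspellings term by term."""
    return " ".join(SPELLING.get(t, t) for t in query.lower().split())


def similar_queries(query):
    """Similar query finding: the query itself plus known variants."""
    return [query] + SIMILAR.get(query, [])


def topic_of(query):
    """Query topic identification via table lookup."""
    return TOPIC.get(query, "unknown")


def tag_entities(query):
    """Named entity recognition by greedy longest match against ENTITIES."""
    tokens, out, i = query.split(), [], 0
    while i < len(tokens):
        for j in range(len(tokens), i, -1):
            span = " ".join(tokens[i:j])
            if span in ENTITIES:
                out.append(f"[{span}: {ENTITIES[span]}]")
                i = j
                break
        else:
            out.append(tokens[i])  # no entity starts here; keep the raw token
            i += 1
    return " ".join(out)


def understand(query):
    """Run all four stages and return one representation per similar query."""
    refined = refine(query)
    return [f"{tag_entities(q)}: {topic_of(q)}" for q in similar_queries(refined)]


for representation in understand("michael jordan berkele"):
    print(representation)
```

Each stage only adds information to the output of the previous one, which is why the stages compose as plain function calls here.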
