
Improving Web Search with Language Technologies, by Thomas Hofmann



  1. Improving Web Search with Language Technologies
     Thomas Hofmann, Director of Engineering - Zurich

  2. Improving Web Search with Language Technologies
     1 Lexical Semantics
     2 Machine Translation
     3 Information Extraction
     4 Automatic Speech Recognition

  3. 1 Lexical Semantics: Improving Ads Targeting & Search Quality

  4. Natural Language Processing for Search Quality
     Two main ingredients: stemming and synonyms
     Challenges for synonym expansion:
     - Learning of lexical semantics from data
     - High precision in order to avoid loss of topicality
     - Use context cues to trigger synonyms

  5. Natural Language Processing for Search Quality
     Synonym expansion depends on context:
     - ab = Alberta
     - ab = Allen Bradley
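
The "ab" example above can be sketched as a small rule table in which each candidate synonym fires only when a context cue appears elsewhere in the query. This is a minimal illustration, not the method used in the talk: the names `SYNONYM_RULES` and `expand_query` and every cue word are hypothetical, and the slides describe learning such semantics from data rather than writing rules by hand.

```python
# Hypothetical rule table: each expansion of a token is gated by context cues.
# The cue words are illustrative assumptions, not taken from the slides.
SYNONYM_RULES = {
    "ab": [
        ("alberta", {"canada", "calgary", "edmonton", "province"}),
        ("allen bradley", {"plc", "relay", "controller", "automation"}),
    ],
}


def expand_query(query: str) -> list[str]:
    """Return the query tokens plus any synonyms whose context cues fire."""
    tokens = query.lower().split()
    expanded = list(tokens)
    for i, token in enumerate(tokens):
        # Context is every other token in the query.
        context = set(tokens[:i] + tokens[i + 1:])
        for synonym, cues in SYNONYM_RULES.get(token, []):
            if context & cues:
                expanded.append(synonym)
    return expanded


print(expand_query("hotels calgary ab"))   # adds "alberta"
print(expand_query("ab plc programming"))  # adds "allen bradley"
print(expand_query("ab workout"))          # no cue fires, no expansion
```

Leaving the query unexpanded when no cue fires favors precision over recall, in line with the high-precision challenge named on the previous slide.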

  6. Expanded Matching in On-line Ads Targeting
     Targeting mechanisms for AdWords: match user queries with advertiser (bidded) keywords
     Types of matches:
     - Phrase match: all tokens from a keyword appear consecutively in the query, and in the same order
       (keyword) used cars -> (query) cheap used cars
     - Broad match: all tokens from a keyword appear somewhere in the query, regardless of order
       (keyword) used cars -> (query) used toyota cars
     - Expanded broad match: some tokens from a keyword or its related words appear in the query
       (keyword) used cars -> (query) used automobiles, automobiles
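
The three match types defined on this slide can be written as simple token predicates. This is a sketch under stated assumptions: whitespace tokenization, case-insensitive comparison, and a hand-written `RELATED` table standing in for whatever related-word source a production system would use. The function names are hypothetical and do not reflect the AdWords implementation.

```python
# Hypothetical related-word table; a real system would learn or curate this.
RELATED = {"cars": {"automobiles"}}


def phrase_match(keyword: str, query: str) -> bool:
    """All keyword tokens appear consecutively in the query, in the same order."""
    k, q = keyword.lower().split(), query.lower().split()
    if not k:
        return False
    n = len(k)
    return any(q[i:i + n] == k for i in range(len(q) - n + 1))


def broad_match(keyword: str, query: str) -> bool:
    """All keyword tokens appear somewhere in the query, in any order."""
    k = keyword.lower().split()
    q = set(query.lower().split())
    return bool(k) and all(token in q for token in k)


def expanded_broad_match(keyword: str, query: str, related=RELATED) -> bool:
    """Some keyword token, or one of its related words, appears in the query."""
    q = set(query.lower().split())
    for token in keyword.lower().split():
        if token in q or related.get(token, set()) & q:
            return True
    return False


print(phrase_match("used cars", "cheap used cars"))        # True
print(broad_match("used cars", "used toyota cars"))        # True
print(phrase_match("used cars", "used toyota cars"))       # False
print(expanded_broad_match("used cars", "automobiles"))    # True
```

Each predicate is strictly looser than the one before it, which is why the slide's expanded broad match example accepts the query "automobiles" even though neither keyword token appears in it.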

  7. Expanded Matching in On-line Ads Targeting

  8. 2 Machine Translation: Enriching Web Content

  9. Machine Translation for Web Search
     Machine translation system developed in-house at Google (Franz Och)
     Goals: enrich Web content in languages with limited content
     Usage: Web page translation, “Translate this page” link on result page, cross-language retrieval (Russian, Arabic)
     Challenges in machine translation:
     - MT from English into other target languages
     - MT for any text types & topics
     - Model size optimization & efficient search
     - Interface, usability, user feedback

  10. translate.google.com

  11. translate.google.com

  12. Search Results: “Translate this page” link

  13. Translation in Google Toolbar

  14. Translation Feedback (launched in Feb ‘07)

  15. 3 Information Extraction: Supporting Question Answer Retrieval

  16. Information Extraction for Question-Answer Retrieval
      Open domain extraction of facts from the Web
      Goals: provide succinct answers to queries that are questions
      Usage: currently triggers a special “search onebox” to deliver a fact
      Challenges in information extraction:
      - Reliability of extracted facts
      - Coverage of relevant facts from all domains
      - Reputation of sources and combination thereof
      - Triggering of Q&A retrieval
      - Combination of evidence and inference
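
One way to picture the "reputation of sources and combination thereof" and "triggering" challenges is a reputation-weighted vote over extracted candidate answers, with a threshold that decides whether any answer is shown at all. This is only an illustrative sketch: the slides do not describe the actual scoring, and the function `best_fact`, the source names, the weights, and the threshold below are all made-up assumptions.

```python
from collections import defaultdict


def best_fact(candidates, reputation, min_score=1.0):
    """Pick the answer with the highest summed source reputation.

    candidates: iterable of (answer, source) pairs extracted from pages.
    reputation: hypothetical mapping from source to a trust weight.
    Returns (answer, supporting_sources), or None when the evidence is too
    weak to trigger an answer.
    """
    scores = defaultdict(float)
    sources = defaultdict(list)
    # dict.fromkeys removes duplicate (answer, source) pairs but keeps order,
    # so a single source cannot vote twice for the same answer.
    for answer, source in dict.fromkeys(candidates):
        scores[answer] += reputation.get(source, 0.0)
        sources[answer].append(source)
    if not scores:
        return None
    answer = max(scores, key=scores.get)
    if scores[answer] < min_score:
        return None  # below threshold: do not trigger
    return answer, sources[answer]


# Made-up example data, for illustration only.
reputation = {"site-a": 0.75, "site-b": 0.5, "site-c": 0.25}
candidates = [("A", "site-a"), ("A", "site-b"), ("B", "site-c"), ("A", "site-a")]
print(best_fact(candidates, reputation))
print(best_fact([("B", "site-c")], reputation))
```

Returning the supporting sources alongside the answer matches the goal of presenting a fact together with a source reference.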

  17. Question Answering Retrieval: Example
      Compile a fact with a source reference for simple question-like queries

  18. 4 Automatic Speech Recognition: 1-800-GOOG-411

  19. Automatic Speech Recognition
      1-800-GOOG-411 service from mobile phones
      Goals: local business information completely free, directly from your phone
      Usage: easy to use speech interface for mobile devices
      Challenges:
      - Speaker variability
      - Background noise
      - Navigation & usability

