SLIDE 1

Going Beyond the Document-Query Lexical Match

Oren Kurland

Faculty of Industrial Engineering and Management Technion

1 / 29

SLIDE 2

Search engines

2 / 29

SLIDE 3

The ad hoc retrieval task

Relevance Ranking Rank documents in a corpus by their relevance to the information need expressed by a given query

3 / 29

SLIDE 4

The vector space model

Salton ’68

q = “Technion”

  • q =< 0, . . . , 0, 1, 0, . . . , 0 >

d = “Technion faculty student Technion”

  • d =< 0, . . . , 0, 1, 0, . . . , 0, 1, 0, . . . , 0, 2, 0, . . . , 0 >

score_VS(d; q) := cos(d, q)

Term weighting scheme: TF.IDF
  • TF: the number of occurrences of the term in the document
  • IDF: the inverse of the document frequency of the term

4 / 29
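The vector-space score above can be sketched in a few lines of Python. This is a toy illustration, not production code: `df` is an assumed precomputed document-frequency table and `n_docs` the corpus size.

```python
import math
from collections import Counter

def tfidf_vector(text, df, n_docs):
    """Build a sparse TF.IDF vector: term frequency times log inverse
    document frequency. Terms absent from df are dropped."""
    tf = Counter(text.split())
    return {w: tf[w] * math.log(n_docs / df[w]) for w in tf if w in df}

def cosine(u, v):
    """Cosine similarity between two sparse vectors (dicts)."""
    dot = sum(u[w] * v.get(w, 0.0) for w in u)
    nu = math.sqrt(sum(x * x for x in u.values()))
    nv = math.sqrt(sum(x * x for x in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0
```

With the slide's example, score_VS(d; q) = cosine(tfidf_vector("technion", ...), tfidf_vector("technion faculty student technion", ...)).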

SLIDE 5

The language modeling approach

Ponte&Croft ’98

Ranking:

score_LM(d; q) := Π_{w∈q} p(w|d)

p(w|d) is the probability that w is generated from a language model induced from d

A language model (smoothed with the corpus model):

p("Hello" | "Hello Hello World") := (1 − λ)·(2/3) + λ·p("Hello" | corpus)

5 / 29
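The query-likelihood score with the slide's smoothing scheme (Jelinek-Mercer interpolation with the corpus model) can be sketched as follows; `lam` plays the role of λ, and the corpus is passed as one long string for simplicity:

```python
import math
from collections import Counter

def lm_score(query, doc, corpus, lam=0.5):
    """Query-likelihood ranking: log of the product over query terms of
    (1 - lam) * p(w|d) + lam * p(w|corpus)."""
    d_tf, c_tf = Counter(doc.split()), Counter(corpus.split())
    d_len, c_len = sum(d_tf.values()), sum(c_tf.values())
    score = 0.0
    for w in query.split():
        p_d = d_tf[w] / d_len        # maximum-likelihood estimate from d
        p_c = c_tf[w] / c_len        # corpus (background) estimate
        score += math.log((1 - lam) * p_d + lam * p_c)
    return score
```

For the slide's example, p("Hello" | "Hello Hello World") corresponds to the term inside the log with p_d = 2/3.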

SLIDE 6

The document-query similarity estimate

Retrieval frameworks:
  • Probabilistic retrieval (Maron&Kuhns ’60); Okapi BM25 (Robertson et al. ’93)
  • Vector space model (Salton ’68)
  • The inference network model (Turtle&Croft ’90)
  • Pivoted document length normalization (Singhal et al. ’96)
  • Language modeling (Ponte&Croft ’98)
  • Divergence from randomness (Amati&van Rijsbergen ’00)

Are all these the same? It’s all about TF, IDF and length normalization (Fang et al. ’04, ’09)

Axiomatization of document-query similarity functions used for ranking (Fang et al. ’05)

6 / 29

SLIDE 7

Web search

A variety of relevance signals:
  • The similarity between the page and the query (query dependent)
  • The similarity between the anchor text and the query (query dependent)
  • The PageRank score of the page (query independent)
  • Additional document quality measures, e.g., spam score, entropy (query independent)
  • The clickthrough rate for the page (query independent)
  • ...

7 / 29

SLIDE 8

Learning to rank

A training set: {(f(qi, dj), l(qi, dj))}_{i,j}
  • qi: query
  • dj: document
  • f(qi, dj): a representation for the pair (qi, dj)
  • l(qi, dj): a relevance judgment for the pair (qi, dj)

Minimize a loss function using pointwise/pairwise/listwise approaches (Liu ’09)

8 / 29
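A minimal sketch of the pairwise flavor, assuming a linear scoring model; this is a generic hinge-loss update for illustration, not any specific published ranker:

```python
def pairwise_hinge_step(w, f_rel, f_nonrel, lr=0.1):
    """One pairwise learning-to-rank update: if the relevant document's
    score does not exceed the non-relevant one's by a margin of 1,
    nudge the weight vector w toward the relevant document's features."""
    score = lambda f: sum(wi * fi for wi, fi in zip(w, f))
    if score(f_rel) - score(f_nonrel) < 1.0:  # margin violated
        w = [wi + lr * (r - n) for wi, r, n in zip(w, f_rel, f_nonrel)]
    return w
```

Repeating this step over all training pairs (f(qi, dj), l(qi, dj)) drives the model to order relevant above non-relevant documents per query.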

SLIDE 9

Observations

Relevance is determined based on whether the document content satisfies the information need expressed by the query

The document-query similarity is among the most important features for ranking pages in Web search (Liu ’09)

Can document-query similarity estimates be further improved?

9 / 29

SLIDE 10

The surface-level document query similarity

The vocabulary mismatch problem: relevant documents might not contain some, or even all, query terms
  • Short queries
  • Short documents (e.g., tweets)

Example:
  • query: “shipment vehicles”
  • document: “cargo freight truck”

10 / 29

SLIDE 11

The risk minimization framework

Lafferty&Zhai ’01

11 / 29

SLIDE 12

Semantic matching

Li&Xu ’13

  • Query reformulation
  • Term dependence models
  • Translation models
  • Topic models
  • Latent space models

12 / 29

SLIDE 13

Short queries

Automatic query expansion:
  • Global methods analyze the corpus or external resources in a query-independent fashion
  • Local methods rely on some initial search

Global methods: using WordNet (Voorhees ’94), a large external corpus (Diaz&Metzler ’06), Wikipedia (Xu et al. ’09)

Translation model (Berger&Lafferty ’99):

p(q|d) := Π_{w∈q} Σ_{w′∈Lexicon} p(w′|d)·T(w′|w)

Estimating T using mutual information (Karimzadehgan&Zhai ’09); effective for microblog search (Karimzadehgan et al. ’13)

13 / 29
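The translation-model score can be sketched as below. The translation table `T` is a toy stand-in for a learned table (e.g., one estimated via mutual information); its entries here are made-up illustrative values:

```python
from collections import Counter

def translation_score(query, doc, T):
    """Translation-model retrieval in the spirit of Berger&Lafferty '99:
    p(q|d) = product over query terms w of
             sum over document terms w' of p(w'|d) * T(w', w).
    T maps (w_prime, w) pairs to translation probabilities; summing over
    the document's own terms (rather than the full lexicon) is a
    simplification, since other terms contribute p(w'|d) = 0."""
    tf = Counter(doc.split())
    dlen = sum(tf.values())
    p = 1.0
    for w in query.split():
        p *= sum((tf[wp] / dlen) * T.get((wp, w), 0.0) for wp in tf)
    return p
```

With the earlier vocabulary-mismatch example, a table entry like T[("cargo", "shipment")] lets "cargo freight truck" match the query "shipment vehicles" despite zero term overlap.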

SLIDE 14

Pseudo-feedback-based query expansion

Utilize information from documents that are highly ranked by an initial search performed in response to the query

Relevance modeling (Lavrenko&Croft ’01): a generative theory of relevance. The query and the relevant documents are sampled from the same language model (the relevance model, R)

p(w|R) := λ·p(w|q) + (1 − λ)·Σ_{d∈Dinit} p(w|d)·p(d|q)

score(d; q) := −KL(p(·|R) ‖ p(·|d))

State-of-the-art (unigram) pseudo-feedback-based query expansion approach (Lv&Zhai ’09)

How do we set λ? Adaptive/selective query expansion ...

14 / 29
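The relevance-model estimate can be sketched directly from the formula; `top_docs` are the documents highly ranked by the initial search, and `doc_scores` are their p(d|q) values (assumed already normalized to sum to 1):

```python
from collections import Counter

def relevance_model(query, top_docs, doc_scores, lam=0.5):
    """Relevance-model estimate in the spirit of Lavrenko&Croft '01:
    p(w|R) = lam * p(w|q) + (1 - lam) * sum_d p(w|d) * p(d|q)."""
    q_tf = Counter(query.split())
    q_len = sum(q_tf.values())
    p_r = Counter()
    for w in q_tf:                      # the lam * p(w|q) component
        p_r[w] += lam * q_tf[w] / q_len
    for doc, p_dq in zip(top_docs, doc_scores):
        tf = Counter(doc.split())
        dlen = sum(tf.values())
        for w, c in tf.items():         # the (1 - lam) * p(w|d) * p(d|q) mass
            p_r[w] += (1 - lam) * (c / dlen) * p_dq
    return p_r
```

Documents are then ranked by the (negated) KL divergence between p(·|R) and each document model.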

SLIDE 15

Beyond bag-of-terms (unigram) representations

Markov Random Fields (Metzler&Croft ’05)

Q: query composed of the terms q1, q2, . . .
D: document
p(Q, D) = ?

15 / 29

SLIDE 16

Markov Random Fields

P(D|Q) rank= Σ_{c∈Cliques(G)} λc·f(c)

G: graph; f(c): feature function
  • Unigram features: f_T(c) := log p(qi|D)
  • Ordered phrase features: f_O(c) := log p(ow(qi, . . . , qi+k)|D)
  • Unordered phrase features: f_U(c) := log p(uw(qi, . . . , qi+k)|D)

Additional models:
  • Linear discriminant model (Gao et al. ’05)
  • Differential concept weighting (Shi&Nie ’10)
  • Modeling higher-order term (concept) dependencies using query hypergraphs (Bendersky&Croft ’12)

16 / 29
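A sequential-dependence-style score in this spirit can be sketched as a weighted sum of the three feature types. The add-0.5 smoothing and window size here are crude simplifications for illustration; the original model uses Dirichlet-smoothed language models and learned clique weights:

```python
import math
from collections import Counter

def sdm_score(query, doc, weights=(0.8, 0.1, 0.1)):
    """MRF-style scoring sketch (after Metzler&Croft '05): a weighted sum
    of unigram, ordered-bigram, and unordered-window log features."""
    q, d = query.split(), doc.split()
    tf, bigrams, n = Counter(d), Counter(zip(d, d[1:])), len(d)

    def uw_count(a, b, window=4):
        # co-occurrences of a and b within an unordered window
        return sum(1 for i in range(n)
                   for j in range(i + 1, min(i + window, n))
                   if {d[i], d[j]} == {a, b})

    lam_t, lam_o, lam_u = weights
    score = sum(lam_t * math.log((tf[w] + 0.5) / (n + 1)) for w in q)
    for a, b in zip(q, q[1:]):
        score += lam_o * math.log((bigrams[(a, b)] + 0.5) / (n + 1))
        score += lam_u * math.log((uw_count(a, b) + 0.5) / (n + 1))
    return score
```

Documents that contain the query terms as an ordered phrase get credit from all three feature types, not just the unigram one.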

SLIDE 17

Latent concept expansion

Metzler&Croft ’07

p(E|Q) := Σ_{D∈Dinit} (f_QD(Q, D) + f_D(D) + f_QD(E, D) + f_Q(E))

Additional models:
  • Using hierarchical Markov Random Fields for query expansion (Lang et al. ’10)
  • Learning concept importance (Bendersky et al. ’11)

17 / 29

SLIDE 18

Parametrized concept weighting

Bendersky et al. ’11

score(D; Q) := Σ_{T∈T} Σ_{c∈T} λc·f(c, D)

c: concept
T : types of concepts: query terms, phrases (bigrams), biterms, expansion terms

λc := Σ_{ϕ∈ΦT} wϕ·ϕ(c)

ΦT : a set of feature (importance) functions for a concept of type T (e.g., using the corpus, Google n-grams, Wikipedia, a search log)

score(D; Q) = Σ_{T∈T} Σ_{c∈T} Σ_{ϕ∈ΦT} wϕ·ϕ(c)·f(c, D)

18 / 29
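The parametrized weighting above can be sketched as follows. The dictionary layout and the feature values in the usage below are illustrative assumptions; in the actual framework the weights w_ϕ are learned:

```python
def concept_score(doc_match, importance_feats, feat_weights):
    """Parametrized concept weighting sketch (in the spirit of
    Bendersky et al. '11): score(D; Q) = sum_c lambda_c * f(c, D),
    with lambda_c = sum_phi w_phi * phi(c).
    doc_match: {concept: f(c, D)} document matching scores;
    importance_feats: {concept: {feat_name: phi(c)}};
    feat_weights: {feat_name: w_phi}."""
    score = 0.0
    for c, f_cd in doc_match.items():
        lam_c = sum(feat_weights.get(name, 0.0) * val
                    for name, val in importance_feats.get(c, {}).items())
        score += lam_c * f_cd
    return score
```

Each concept's contribution is its document match f(c, D) scaled by an importance weight λc assembled from its features.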

SLIDE 19

Positional language models

Lv&Zhai ’09

c(w, i): count of term w at position i in document D
k(i, j): the term count propagated to position i from position j

c′(w, i) := Σ_{j=1}^{N} c(w, j)·k(i, j)

p(w|D, i) := c′(w, i) / Σ_{w′∈Vocabulary} c′(w′, i)

score(Q, D, i) := −KL(p(·|Q) ‖ p(·|D, i))

Query expansion: a positional relevance language model (Lv&Zhai ’10)

19 / 29
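The propagated counts c′(w, i) can be sketched with a Gaussian propagation kernel, one of the kernels considered in this line of work; the kernel width `sigma` is an illustrative choice:

```python
import math
from collections import Counter

def positional_lm(doc_terms, i, sigma=2.0):
    """Positional language model sketch (after Lv&Zhai '09):
    c'(w, i) = sum_j c(w, j) * k(i, j), with the Gaussian kernel
    k(i, j) = exp(-(i - j)^2 / (2 * sigma^2)); the model at position i
    is the normalized propagated counts."""
    c_prime = Counter()
    for j, w in enumerate(doc_terms):
        c_prime[w] += math.exp(-((i - j) ** 2) / (2 * sigma ** 2))
    total = sum(c_prime.values())
    return {w: c / total for w, c in c_prime.items()}
```

Terms near position i dominate its language model, so matching against p(·|D, i) rewards documents whose query-related content is locally concentrated.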

SLIDE 20

Matching in a latent space

The term-document matrix:

         w1          w2
d1   f(w1; d1)   f(w2; d1)
d2   f(w1; d2)   f(w2; d2)

Latent Semantic Analysis (LSA; Deerwester et al. ’90): low-rank approximation using SVD

A_k = argmin_{X: rank(X)=k} ‖A − X‖_F

Probabilistic Latent Semantic Analysis (pLSA; Hofmann ’99)

Supervised methods for doc-query matching in a latent space (Bai et al. ’09, Huang et al. ’13, Wu et al. ’13)

20 / 29

SLIDE 21

The cluster hypothesis

The cluster hypothesis (Jardine&van Rijsbergen ’71, van Rijsbergen ’79): closely associated documents tend to be relevant to the same requests

Leveraging the hypothesis: enrich a document representation using information induced from its corpus context

21 / 29

SLIDE 22

Smoothing document representations

p(w|D) := λ1·c(w, D)/Σ_{w′} c(w′, D) + λ2·c(w, corpus)/Σ_{w′} c(w′, corpus) + λ3·Σ_{t∈Topics} p(w|t)·p(t|D)

Topics:
  • Clusters with which D is associated (Kurland&Lee ’04, Liu&Croft ’04)
  • LDA (Blei et al. ’03; Wei&Croft ’06), pLSA (Hofmann ’99; Lu et al. ’11), or PAM (Li&McCallum ’06; Yi&Allan ’09)

22 / 29
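The three-way mixture can be sketched directly; the topic distributions and mixture weights in the usage are toy values, since in practice they come from clustering or a fitted topic model:

```python
from collections import Counter

def smoothed_p(w, doc, corpus, topics, p_topic_doc, lams=(0.6, 0.3, 0.1)):
    """Cluster/topic-smoothed document model sketch:
    p(w|D) = lam1 * p_mle(w|D) + lam2 * p_mle(w|corpus)
           + lam3 * sum_t p(w|t) * p(t|D).
    topics: {topic: {word: p(w|t)}}; p_topic_doc: {topic: p(t|D)}."""
    lam1, lam2, lam3 = lams
    d_tf, c_tf = Counter(doc.split()), Counter(corpus.split())
    p_d = d_tf[w] / sum(d_tf.values())
    p_c = c_tf[w] / sum(c_tf.values())
    p_t = sum(topics[t].get(w, 0.0) * p_topic_doc.get(t, 0.0) for t in topics)
    return lam1 * p_d + lam2 * p_c + lam3 * p_t
```

The topic component lets a document assign non-zero probability to query terms it never contains, as long as its clusters or topics do.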

SLIDE 23

Smoothing document representations

Empirical observations (Yi&Allan ’09):
  • Using more sophisticated topic models doesn’t yield improved retrieval effectiveness
  • Using nearest-neighbors clusters as “topics” results in retrieval performance as good as that of using topic models
  • Pseudo-feedback-based query expansion (specifically, relevance modeling) outperforms using topic models

Cluster-based smoothing is highly effective for microblog retrieval (Efron ’11)

23 / 29

SLIDE 24

A different approach to utilizing corpus context

Cluster ranking

Pipeline:
  • Query
  • Initial list of documents (document ranking method)
  • Set of clusters (clustering method)
  • Ranking of clusters (cluster ranking method)
  • Ranking of documents (each cluster is replaced with its documents)

24 / 29
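The pipeline's last two steps can be sketched as below. Scoring a cluster by the mean query similarity of its documents is one simple choice of cluster ranking method, used here for illustration:

```python
def cluster_rank(query_sim, clusters):
    """Cluster-ranking sketch: score each cluster by the mean query
    similarity of its documents, rank the clusters, then replace each
    cluster with its documents (ordered by query similarity).
    query_sim: {doc: similarity score}; clusters: list of doc lists."""
    ranked_clusters = sorted(
        clusters,
        key=lambda c: sum(query_sim[d] for d in c) / len(c),
        reverse=True)
    ranking, seen = [], set()
    for cluster in ranked_clusters:
        for d in sorted(cluster, key=query_sim.get, reverse=True):
            if d not in seen:            # a document may appear in
                seen.add(d)              # several overlapping clusters
                ranking.append(d)
    return ranking
```

Note how a moderately scored document can be promoted past a higher-scoring one when it belongs to a strong cluster.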

SLIDE 25

The optimal cluster

[Figure: p@5 of doc-query similarity ranking vs. query expansion vs. an oracle experiment that selects the optimal cluster]

25 / 29

SLIDE 26

Ranking clusters using Markov Random Fields

Raiber&Kurland ’13

Winner of the Web track in TREC 2013

p(c|q) := p(c, q) / p(q) rank= p(c, q) rank= Σ_{l∈Cliques(G)} λl·fl(l)

fl(l): feature function defined over the clique l

26 / 29

SLIDE 27

Challenges

Query = “oren kurland dblp”

Search #1 vs. Search #2

27 / 29

SLIDE 28

Challenges

  • Prediction over queries: fixing the query-document similarity estimate, which queries will result in better performance than others? (Carmel&Yom-Tov ’10)
  • Prediction over similarity functions: fixing the query, which similarity function will result in better performance than others? (Balasubramanian&Allan ’10)
  • Fusion/rank aggregation of similarity functions: aggregating the results attained using different document-query similarity measures (Fox&Shaw ’92)
  • Fusion of document representations: effectively integrating cluster/topic-based representations with passage-based representations (Krikon&Kurland ’10)
  • Adaptive/selective query expansion: should we expand a given query? (Cronen-Townsend et al. ’02, Amati et al. ’04)
  • Cluster ranking: devising novel query-cluster similarity measures
  • Adversarial retrieval: devising document-query similarity functions in light of search engine optimization (Raiber et al. ’10)
  • More semantic analysis? (e.g., Symonds et al. ’12-’14; Bruza&Song ’02)

28 / 29

SLIDE 29

Summary

From surface-level document-query similarity to:
  • Automatic query expansion; more generally, query reformulation (e.g., query segmentation, query reduction)
  • Translation models
  • Term-dependence models
  • Dimension reduction, topic modeling, and (supervised) latent-space models
  • Cluster ranking

Query-document similarity estimates: there is still a long way to go

29 / 29