1/25
Attentive Neural Architecture for Ad-hoc Structured Document Retrieval
Saeid Balaneshin 1 Alexander Kotov 1 Fedor Nikolaev 1,2
1Textual Data Analytics Lab, Department of Computer Science, Wayne State University
2Kazan Federal University
2/25
IR research traditionally views documents as holistic and homogeneous units of text. The task of retrieving structured (multi-field) documents arises in many information access scenarios:
◮ Entity retrieval from knowledge graph(s)
◮ Web document retrieval
◮ Product search in e-Commerce
3/25
Example (entity fields): Names, Attributes, Categories, Similar Entity Names, Related Entity Names
4/25
Example (product fields): Title, Description, Attributes
5/25
Example (Web page fields): Title, Texts in Large Font, Contents, Incoming Hyper-links, Document Meta-data, Alternative Texts for Images
6/25
Document Retrieval vs. Structured Document Retrieval:
◮ Document Retrieval: aggregating heuristics calculated at the document or collection level (# of terms, IDF, document length)
◮ Structured Document Retrieval: aggregating heuristics calculated at the level of document fields into the matching score of an entire document; has to deal with lexically similar, but semantically diverse fields
7/25
Aggregation of field-level statistics of query terms in structured document retrieval is informed by the relative importance of document fields, which depends on:
◮ properties or semantics of document fields: e.g. a query term matched in a section of a Web page that is in larger font should have a different importance than a query term matched in other sections
◮ query intent: e.g. in the query “attractive outdoor light with security features”, “attractive” refers to the product description, “outdoor light” to the product name and “security features” to the product attributes
8/25
Mixture of Language Models (MLM) [Ogilvie and Callan, SIGIR’03]
Document D with F fields is ranked w.r.t. query Q according to:

P(Q|D) \stackrel{rank}{=} \prod_{q_i \in Q} P(q_i|\theta_D)^{n(q_i,Q)}, \quad \text{where} \quad P(q_i|\theta_D) = \sum_{j=1}^{F} w_j P(q_i|\theta_j)
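The MLM score can be sketched in a few lines of Python. This is a toy illustration rather than the original implementation: the field language models and field weights below are hypothetical, and the 1e-6 probability floor stands in for proper smoothing.

```python
import math

def mlm_score(query_terms, field_lms, field_weights):
    """MLM log-score: log P(Q|D) with P(q|theta_D) = sum_j w_j * P(q|theta_j).

    field_lms maps field name -> {term: probability};
    field_weights maps field name -> w_j (the weights should sum to 1).
    Repeating a term in query_terms accounts for the n(q, Q) exponent."""
    log_p = 0.0
    for q in query_terms:
        p_q = sum(field_weights[f] * lm.get(q, 1e-6)  # 1e-6 floor ~ smoothing
                  for f, lm in field_lms.items())
        log_p += math.log(p_q)
    return log_p

# Toy example: two fields with hypothetical term probabilities
field_lms = {"title": {"turin": 0.2, "italy": 0.1},
             "attributes": {"capital": 0.05, "italy": 0.15}}
weights = {"title": 0.5, "attributes": 0.5}
score = mlm_score(["capital", "italy"], field_lms, weights)
```

A query term that appears in any field raises the mixture probability, so documents matching more query terms in highly weighted fields score higher.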
9/25
Fielded Sequential Dependence Model (FSDM) [Zhiltsov et al., SIGIR’15]
Extends SDM to the case of structured document retrieval (i.e. accounts for both unigram and sequential bigram concepts in a query and document structure). Document D with F fields is ranked w.r.t. query Q according to:

P(D|Q) \stackrel{rank}{=} \lambda_T \sum_{q_i \in Q} \tilde{f}_T(q_i, D) + \lambda_O \sum_{q_i, q_{i+1} \in Q} \tilde{f}_O(q_i, q_{i+1}, D) + \lambda_U \sum_{q_i, q_{i+1} \in Q} \tilde{f}_U(q_i, q_{i+1}, D)

Potential function for query unigram q_i:

\tilde{f}_T(q_i, D) = \log \sum_{j=1}^{F} w_j P(q_i|\theta_j)
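A hedged sketch of the FSDM score under the same toy setup: the per-field unigram and bigram “language models” and the lambda values below are hypothetical stand-ins for the model's term, ordered-bigram and unordered-window statistics.

```python
import math

def f_T(q, field_weights, unigram_lms):
    """Unigram potential: log of the fielded mixture probability of q."""
    return math.log(sum(field_weights[f] * unigram_lms[f].get(q, 1e-6)
                        for f in field_weights))

def fsdm_score(query_terms, field_weights, unigram_lms, ordered_lms,
               unordered_lms, lambdas=(0.8, 0.1, 0.1)):
    """FSDM-style score (sketch): weighted sum of unigram, ordered-bigram
    and unordered-window potentials over a fielded mixture."""
    lt, lo, lu = lambdas
    score = lt * sum(f_T(q, field_weights, unigram_lms) for q in query_terms)
    for qi, qj in zip(query_terms, query_terms[1:]):
        big = (qi, qj)
        score += lo * math.log(sum(field_weights[f] * ordered_lms[f].get(big, 1e-6)
                                   for f in field_weights))
        score += lu * math.log(sum(field_weights[f] * unordered_lms[f].get(big, 1e-6)
                                   for f in field_weights))
    return score
```

With equal unigram statistics, a document whose fields also contain the query bigram receives a higher score than one that does not.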
10/25
Methods for structured document retrieval (SDR) face three major challenges:
◮ identifying the key concepts (words or phrases) in keyword queries
◮ semantic matching of the key query concepts in different fields
◮ aggregating the scores of the matched query phrases into the matching score of an entire document
Key limitation: all previously proposed SDR methods are based on lexical matching and thus suffer from the lexical gap
11/25
Attention-based Neural Architecture for Ad-hoc Structured Document Retrieval (ANSR):
◮ Input: embeddings of words in a query and document fields
◮ Pooling layers: create compressed interaction matrices of the same dimensions between unigram- and bigram-based query and document field phrases
◮ Matching score aggregation layers: combine the matching scores of query phrases in different document fields into the matching score of the entire document, based on the relative importance of query phrases and document fields
◮ Document field attention layers: calculate the relative importance of document fields
◮ Query phrase attention layers: calculate the relative importance of query phrases
12/25
Step 1: create distributed representations of a query and each document field
Query: automobile capital and the Detroit of Italy
Document: http://dbpedia.org/page/Turin
[Figure: excerpts from the Turin entity’s fields, e.g. attributes (“Taurinum … Turin is an important business and cultural center in northern Italy, capital city of the Piedmont region located mainly on the left bank of the Po River … it is also dubbed la capitale Sabauda (Savoyard capital)”) and related entity names (“Teatro Carignano”, “Royal House of Savoy”, “Milan”, “Genoa”, “Alessandro Pertini”, …), mapped to their distributed (word embedding) representations.]
13/25
Step 2: create document fields interaction matrix for each query phrase
[Figure: compressed interaction matrices between unigram-based query phrases (“automobile capital”, “Italy”) and the terms of the attributes and related entity names fields, computed from the distributed representations; example cell values are similarity scores such as 0.28, 0.30, 0.22, 0.19 for “automobile capital” and 0.35, 0.34, 0.36, 0.39 for “Italy”.]
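The pooling step can be illustrated with a small NumPy sketch. Assumptions not fixed by the slides: cosine similarity as the interaction function, k-max pooling as the compression, and random placeholder embeddings.

```python
import numpy as np

def compressed_interactions(query_vec, field_term_vecs, k=4):
    """Cosine similarities between one query phrase vector and every term
    vector in a field, compressed by k-max pooling to a fixed-size row
    (shorter fields are zero-padded), so fields of any length yield
    interaction matrices of the same dimensions."""
    sims = field_term_vecs @ query_vec
    sims = sims / (np.linalg.norm(field_term_vecs, axis=1)
                   * np.linalg.norm(query_vec) + 1e-9)
    top = np.sort(sims)[::-1][:k]   # k largest similarities, descending
    if top.size < k:                # pad rows coming from short fields
        top = np.pad(top, (0, k - top.size))
    return top

rng = np.random.default_rng(0)
q = rng.normal(size=50)                 # a query phrase embedding
field = rng.normal(size=(30, 50))       # a 30-term document field
row = compressed_interactions(q, field, k=4)
```

The same routine applied to each (query phrase, field) pair fills one fixed-size interaction matrix per query phrase, regardless of how long each field is.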
14/25
Goal: compute the importance weights of document fields for aggregating the matching scores of query phrases
Document: http://dbpedia.org/page/Turin
[Figure: a softmax layer maps document field features to importance weights, e.g. 0.21 for attributes and 0.18 for related entity names.]
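Both attention layers reduce to a softmax over learned scores. A minimal sketch follows, with hypothetical linear scoring parameters w and b standing in for the layers' learned weights (ANSR's actual feature extraction is not reproduced here); the same function serves for query phrase attention.

```python
import numpy as np

def attention_weights(feature_vecs, w, b=0.0):
    """Softmax attention (sketch): each field (or query phrase) is scored by
    a linear layer on its feature vector; softmax turns the scores into
    importance weights that are positive and sum to one."""
    scores = feature_vecs @ w + b
    scores = scores - scores.max()      # subtract max for numerical stability
    exp = np.exp(scores)
    return exp / exp.sum()

rng = np.random.default_rng(1)
fields = rng.normal(size=(5, 16))       # 5 document fields, 16-dim features
weights = attention_weights(fields, rng.normal(size=16))
```

Because the weights are normalized, a field (or phrase) can only gain importance at the expense of the others, which is what lets the model focus on the most informative parts of the document and query.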
15/25
Goal: compute the importance weights of query phrases for aggregating the matching scores of query phrases of the same type
Query: automobile capital and the Detroit of Italy
[Figure: a softmax layer maps query phrase features to importance weights, e.g. 0.24 for “automobile capital” and 0.19 for “Italy”.]
16/25
[Figure: the matching scores of each query phrase (“automobile capital”, “Italy”) in all document fields are read off the compressed interaction matrices, aggregated across fields per phrase using the field importance weights, and then the per-phrase scores of all unigram- and bigram-based query phrases are aggregated into a single document matching score using the phrase importance weights.]
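The two-level aggregation can be sketched as a pair of weighted sums. The matching scores below reuse the example values from the interaction-matrix slide; the field and phrase attention weights are hypothetical.

```python
import numpy as np

def aggregate_scores(phrase_field_scores, field_weights, phrase_weights):
    """Two-level aggregation (sketch): per-phrase matching scores across
    fields are combined with field attention weights, then the resulting
    per-phrase scores are combined with phrase attention weights into one
    document matching score."""
    per_phrase = phrase_field_scores @ field_weights   # shape: (n_phrases,)
    return float(per_phrase @ phrase_weights)

# 2 query phrases x 4 document fields; hypothetical attention weights
scores = np.array([[0.28, 0.30, 0.22, 0.19],    # "automobile capital"
                   [0.35, 0.34, 0.36, 0.39]])   # "Italy"
field_w = np.array([0.4, 0.3, 0.2, 0.1])
phrase_w = np.array([0.55, 0.45])
doc_score = aggregate_scores(scores, field_w, phrase_w)
```

Since both weight vectors sum to one, the document score stays on the same scale as the individual matching scores, which keeps the unigram and bigram branches comparable before the final combination.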
17/25
ANSR is trained to minimize a contrastive max-margin loss, given a collection of triplets <q, d_n, d_r> consisting of a relevant document d_r and a non-relevant document d_n for query q:

\min_{W} \sum_{\langle q, d_n, d_r \rangle} \max(0, \zeta - s(q, d_r) + s(q, d_n)) + \frac{\gamma}{2} \|W\|_2^2
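The loss for a single triplet can be sketched as follows; the values of the margin ζ and the regularization strength γ below are hypothetical.

```python
import numpy as np

def max_margin_loss(s_rel, s_nonrel, W, zeta=1.0, gamma=0.01):
    """Contrastive max-margin loss (sketch) for one <q, d_n, d_r> triplet:
    a hinge on the score gap s(q, d_r) - s(q, d_n) plus an L2 penalty on
    the model parameters W."""
    hinge = max(0.0, zeta - s_rel + s_nonrel)
    return hinge + 0.5 * gamma * float(np.sum(W * W))

W = np.ones(4)                            # placeholder parameter vector
loss_good = max_margin_loss(2.0, 0.5, W)  # margin satisfied: only L2 remains
loss_bad = max_margin_loss(0.5, 2.0, W)   # margin violated: hinge is active
```

The hinge is zero whenever the relevant document outscores the non-relevant one by at least ζ, so training only pushes on triplets the model still gets wrong.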
18/25
Language modeling and probabilistic baselines:
◮ PRMS (Probabilistic Retrieval Model for Semistructured Data) [Kim, Xue and Croft, ECIR’09]
◮ MLM (Mixture of Language Models) [Ogilvie and Callan, SIGIR’03]
◮ BM25F [Robertson, Zaragoza and Taylor, CIKM’04]
◮ FSDM (Fielded Sequential Dependence Model) [Zhiltsov, Kotov and Nikolaev, SIGIR’15]
Neural baselines:
◮ DRMM (Deep Relevance Matching Model) [Guo, Fan, Ai and Croft, CIKM’16]
◮ DESM (Dual Embedding Space Model) [Nalisnick, Mitra, Craswell and Caruana, WWW’16]
◮ NRM-F (Neural Ranking Model with Multiple Document Fields) [Zamani, Mitra, Song, Craswell and Tiwary, WSDM’18]
19/25
GOV2 collection:

Method  | MAP              | P@10             | NDCG@10
PRMS    | 0.1964 (-39.49%) | 0.4058 (-32.62%) | 0.3448 (-30.16%)
MLM     | 0.2908 (-10.41%) | 0.5648 (-6.23%)  | 0.4729 (-4.21%)
BM25F   | 0.2954 (-9.00%)  | 0.5478 (-9.05%)  | 0.4556 (-7.72%)
FSDM    | 0.3012 (-7.21%)  | 0.5817 (-3.42%)  | 0.4789 (-3.00%)
DESM    | 0.2968 (-8.56%)  | 0.5714 (-5.13%)  | 0.4575 (-7.33%)
DRMM    | 0.3113 (-4.10%)  | 0.5880 (-2.37%)  | 0.4722 (-4.35%)
NRM-F*  | 0.1491 (-54.07%) | 0.2903 (-51.80%) | 0.2132 (-56.82%)
ANSR    | 0.3246           | 0.6023           | 0.4937
ANSR achieved 7.21% and 3.00% improvement over FSDM in terms of MAP and NDCG@10
HomeDepot collection:

Method  | MAP              | P@10             | NDCG@10
PRMS    | 0.2287 (-19.64%) | 0.1080 (-21.57%) | 0.2641 (-17.57%)
MLM     | 0.2476 (-13.00%) | 0.1183 (-14.09%) | 0.2893 (-9.71%)
BM25F   | 0.2537 (-10.86%) | 0.1201 (-12.78%) | 0.2952 (-7.87%)
FSDM    | 0.2591 (-8.96%)  | 0.1206 (-12.42%) | 0.3024 (-5.62%)
DESM    | 0.2349 (-17.46%) | 0.1107 (-19.61%) | 0.2943 (-8.15%)
DRMM    | 0.2484 (-12.72%) | 0.1131 (-17.86%) | 0.2952 (-7.87%)
NRM-F*  | 0.1536 (-46.03%) | 0.0723 (-47.49%) | 0.1832 (-42.82%)
ANSR    | 0.2846           | 0.1377           | 0.3204
ANSR achieved 8.96% and 5.62% improvement over FSDM as well as 12.72% and 7.87% improvement over DRMM in terms of MAP and NDCG@10
DBpedia-v2 collection:

Method  | MAP              | P@10             | NDCG@10
PRMS    | 0.2934 (-26.50%) | 0.3594 (-15.55%) | 0.4126 (-14.26%)
MLM     | 0.3467 (-13.15%) | 0.3887 (-8.67%)  | 0.4365 (-9.29%)
BM25F   | 0.3799 (-4.83%)  | 0.4077 (-4.21%)  | 0.4605 (-4.30%)
FSDM    | 0.3679 (-7.84%)  | 0.4073 (-4.30%)  | 0.4524 (-5.99%)
DESM    | 0.3523 (-11.75%) | 0.3894 (-8.51%)  | 0.4527 (-5.92%)
DRMM    | 0.3682 (-7.77%)  | 0.4012 (-5.73%)  | 0.4515 (-6.17%)
NRM-F*  | 0.1878 (-52.96%) | 0.2092 (-50.85%) | 0.2402 (-50.08%)
ANSR    | 0.3992           | 0.4256           | 0.4812
ANSR achieved 4.83% and 4.30% improvement over BM25F as well as 7.77% and 6.17% improvement over DRMM in terms of MAP and NDCG@10
20/25
[Figure: per-query difference in average precision between ANSR and FSDM on (a) GOV2, (b) HomeDepot and (c) DBpedia-v2; queries on the x-axis, AP difference on the y-axis.]
ANSR has higher average precision than FSDM for 58.88% of the queries in the HomeDepot collection. In GOV2, the magnitude of improvements in average precision is 1.66 times greater than the magnitude of reductions. This indicates a superior ability of ANSR to deal with documents that have long fields, due to its utilization of compressed representations and explicit correction of the pooling bias.
21/25
ANSR-no-pooling: select the first k terms in each document field instead of pooling
[Figure: MAP of ANSR vs. ANSR-no-pooling as a function of k (4 to 12) on (a) GOV2, (b) HomeDepot and (c) DBpedia-v2.]
ANSR has substantially better retrieval accuracy in terms of MAP than ANSR-no-pooling. The optimal value of k depends on the collection and the retrieval task: ANSR performs best on the GOV2, DBpedia-v2 and HomeDepot collections when k = 10, k = 6 and k = 6, respectively.
22/25
The best performing query is “single lever hole bathroom sink faucet”:
◮ only one relevant document with the title “Belle Foret Single Hole 1-Handle High Arc Bathroom Vessel Faucet in Chrome with Metal Lever Handles” in relevance judgments
◮ This document has longer fields than the average field length in this collection
23/25
The worst performing query is “popular”:
◮ only one relevant document with the title “Bloomsz Most Popular Water Plant Collection (8-Pack)” in relevance judgments
◮ ANSR ranked the document with the title “South Shore Furniture Popular Twin Mates Bed in Mocha” at the top, since it has more words that are semantically similar to the query term “popular”
◮ This can be a consequence of ANSR’s use of word embeddings, which can cause topic drift for very short queries
24/25
◮ ANSR utilizes pooling to generate fixed-size interaction matrices between representations of phrases in a query and document fields, and employs an attention mechanism to focus on the most important document fields and query phrases
◮ ANSR includes layers to compute and aggregate the relevance score of a structured document at different levels
◮ ANSR outperforms state-of-the-art LM and neural baselines in different SDR tasks, such as Web search, product search and entity retrieval from a knowledge graph
25/25