Accelerating Document Retrieval and Ranking for Cognitive Applications
Presenters:
Tim Kaldewey – Performance Architect
David Wendt – Performance Engineer

Disclaimer: The authors' views expressed in this presentation do not necessarily reflect the views of IBM.

Watson evolution 40x* *http://www-03.ibm.com/software/businesscasestudies/us/en/corp?synkey=Y362451T34615G34

A “brainwave” for answering a question
[Figure; axis label: Time [ms]]

Background
• Querying unstructured data (text) to identify relevant documents is a prerequisite for many cognitive data processing (NLP) tasks
• The large number of queries and the volume of unstructured data require a high-performance mechanism
Example:
– A Lucene index of Wikipedia (5 million docs) is 105 GB
– An average search comprises 7 terms (keywords)
– On average, 115 thousand documents are scored per search
• Scoring of candidate documents and passages is highly parallelizable
➔ Acceleration can be leveraged to improve response time and/or to enable more complex queries that improve accuracy

Document Search
Question: This provincial government of Canada is officially known as the government of Newfoundland and what region?
The index is organized in term-document format.
• Retrieve the documents that are most likely to have the answer(s) to the question
• Search for documents that contain the words from the question
• Rank the documents based on
– How frequently the words and word combinations appear in each document
– The distance between these words in those documents
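
As a rough illustration of a term-document index, the sketch below maps each term to the documents that contain it, together with the word positions. The names `build_index` and `candidates` and the three toy documents are assumptions made for this example; Lucene's actual index format is far more elaborate.

```python
from collections import defaultdict

def build_index(docs):
    """Map each term to {doc_id: [positions]} (a toy term-document index)."""
    index = defaultdict(dict)
    for doc_id, text in docs.items():
        for pos, term in enumerate(text.lower().split()):
            index[term].setdefault(doc_id, []).append(pos)
    return index

def candidates(index, terms):
    """Return the ids of documents that contain at least one query term."""
    found = set()
    for term in terms:
        found.update(index.get(term, {}))
    return found

# Hypothetical toy corpus, for illustration only.
docs = {
    1: "newfoundland and labrador is a province of canada",
    2: "ottawa is the capital of canada",
    3: "lucene is a search library",
}
index = build_index(docs)
```

Calling `candidates(index, ["canada", "newfoundland"])` yields the documents that would then be ranked; the stored positions are what a distance-based ranking would consume.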

Anatomy of a Lucene Query
Turn text into a Lucene query to retrieve relevant documents.
Question: This provincial government of Canada is officially known as the government of Newfoundland and what region?
Query: +canada +newfoundland +provinci +govern +offici +known^0.5 +region "provinci govern"~2 "govern canada"~2 "offici known"~2^0.9 "known govern"~2 "govern newfoundland"~2 "offici region"~3
• Words are stemmed and some stop words (the, of, as, …) are removed.
• Key words become term clauses: canada newfoundland provinci govern offici …
– Scores are computed based on term frequency.
• Word pairs (phrases) become span clauses: "provinci govern"~2 …
– Scores are computed based on the frequency of the phrase and the distance between its words.
• Complex queries (e.g. nested span clauses) can improve accuracy by scoring more relevant documents higher.
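
To make the two clause kinds concrete, here is a minimal sketch that splits a query string of the form shown above into term clauses and span clauses. The regular expression and the function name `parse_query` are assumptions for this example and cover only the syntax that appears on the slide (`+term`, `^boost`, `"a b"~slop`); this is not Lucene's query parser.

```python
import re

# One alternative per clause kind; only the syntax seen on the slide is handled.
CLAUSE = re.compile(
    r'"(?P<phrase>[^"]+)"~(?P<slop>\d+)(?:\^(?P<pboost>[\d.]+))?'  # span clause
    r'|\+?(?P<term>\w+)(?:\^(?P<tboost>[\d.]+))?'                   # term clause
)

def parse_query(query):
    """Return (terms, spans): terms as (word, boost), spans as (words, slop, boost)."""
    terms, spans = [], []
    for m in CLAUSE.finditer(query):
        if m.group("phrase"):
            spans.append((tuple(m.group("phrase").split()),
                          int(m.group("slop")),
                          float(m.group("pboost") or 1.0)))  # boost defaults to 1.0
        else:
            terms.append((m.group("term"), float(m.group("tboost") or 1.0)))
    return terms, spans

terms, spans = parse_query('+canada +known^0.5 "provinci govern"~2 "offici known"~2^0.9')
```

Here `terms` holds the keyword clauses with their boosts and `spans` holds the word pairs with their slop and boost.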

Scoring term clauses
• Lucene is very efficient, making only one pass to match and score
• Index format is optimized for speed in matching terms to documents
• For each document, score each term clause and then sum the scores
• Scorer takes three values:
– Term frequency
– Document length
– Term probability
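
The slide does not name the scoring formula. One formula that uses exactly these three inputs is a Jelinek-Mercer smoothed language-model score, so the sketch below uses that form purely as an assumption. The smoothing weight `LAMBDA`, the function names, and the per-clause boost are illustrative, not taken from the presentation.

```python
import math

LAMBDA = 0.1  # smoothing weight; an illustrative value, not from the slides

def term_score(term_freq, doc_length, term_prob, lam=LAMBDA):
    """Jelinek-Mercer style score (an assumed formula) from the three slide inputs."""
    return math.log(1.0 + ((1.0 - lam) * term_freq / doc_length) / (lam * term_prob))

def score_document(clauses, doc_length):
    """Sum the per-clause scores; clauses is a list of (term_freq, term_prob, boost)."""
    return sum(boost * term_score(tf, doc_length, p) for tf, p, boost in clauses)
```

A term that does not occur in the document contributes zero, and the document score is simply the sum over its term clauses, as the slide describes.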

Scoring span clauses
"provinci govern"~2 "govern canada"~2 "offici known"~2 "known govern"~2 "govern newfoundland"~2 "offici region"~3
Scoring here uses a ‘sloppy’ frequency value, calculated from how often the term pair appears and how close together the two terms are.
Clause form: span(term1,term2,slop,order)
Example: span(provinci,govern,2,false)
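
A simplified sketch of a sloppy frequency for one document follows. It assumes each term's positions are available as a sorted list, weights each match by 1/(1 + gap), and pairs every position of the first term with its nearest partner. The matching rule and the name `sloppy_freq` are assumptions for illustration; Lucene's real span matching handles more cases, and the nested loop here is quadratic, which a production version would avoid.

```python
def sloppy_freq(pos1, pos2, slop, in_order):
    """Sum 1/(1+gap) over positions of term1 whose nearest term2 is within slop.

    gap is the number of word positions between the two terms.
    If in_order is True, term2 must appear after term1.
    """
    total = 0.0
    for p1 in pos1:
        gaps = [abs(p2 - p1) - 1 for p2 in pos2
                if p2 != p1 and (p2 > p1 or not in_order)]
        if gaps and min(gaps) <= slop:
            total += 1.0 / (1.0 + min(gaps))
    return total

# span(provinci,govern,2,false) on hypothetical position vectors
freq = sloppy_freq([3, 20], [4, 23], slop=2, in_order=False)
```

In this example the adjacent pair at positions 3 and 4 contributes 1, and the pair at 20 and 23 (two words apart) contributes 1/3.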

Scoring span clauses – continued
span(provinci,govern,2,false)
• Position vectors vary in length per term per document.

Analysis
• Scoring for each document is independent of other documents
• At the end, scores are sorted to provide the document rank order

Perfect for GPU
• Floating-point operations for thousands of items (documents) that can run in parallel
• Each query clause is implemented as a set of kernels, and the scores accumulate in a float array in which each element is the score for a unique document
• The top N ranked document ids are returned to the host application
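
The accumulate-then-rank pattern can be sketched on the CPU as follows. This is a sequential stand-in, not GPU code: on the GPU each loop iteration would be handled by its own thread. The function names `accumulate` and `top_n` and the sample scores are assumptions for this example.

```python
import heapq

def accumulate(scores, doc_ids, clause_scores):
    """Add one clause's per-document scores into the shared score array."""
    for doc_id, s in zip(doc_ids, clause_scores):
        scores[doc_id] += s

def top_n(scores, n):
    """Return the n highest-scoring (doc_id, score) pairs, best first."""
    ranked = heapq.nlargest(n, enumerate(scores), key=lambda pair: pair[1])
    return [(doc_id, s) for doc_id, s in ranked if s > 0.0]

scores = [0.0] * 6                                # one slot per document
accumulate(scores, [1, 3, 4], [0.5, 1.0, 0.25])   # first clause
accumulate(scores, [3, 4], [0.5, 1.0])            # second clause
best = top_n(scores, 2)
```

Because each array element belongs to exactly one document, clauses can add into it without any coordination between documents.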

Scoring on the GPU
• We used the Thrust library for sorting and intersecting, which made it easier to include a CPU-only alternative
• All term clauses are scored first and can be calculated in a single kernel (loop)
• Spans are computed to maximize caching of term position values
• Once scored, the results are sorted and the top N document ids are returned along with their scores
Only 5 custom kernels were required.
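
The slide does not say what is intersected. A plausible use, assumed here, is finding the documents that contain both terms of a span clause. The sketch below shows a merge-style intersection of two sorted document id lists, which is the CPU counterpart of a sorted set-intersection primitive; `intersect_sorted` is a name chosen for this example.

```python
def intersect_sorted(a, b):
    """Return the ids present in both sorted lists, in sorted order."""
    i = j = 0
    out = []
    while i < len(a) and j < len(b):
        if a[i] == b[j]:
            out.append(a[i])
            i += 1
            j += 1
        elif a[i] < b[j]:
            i += 1   # a[i] cannot occur in b any more
        else:
            j += 1   # b[j] cannot occur in a any more
    return out

both = intersect_sorted([1, 4, 7, 9], [2, 4, 9, 12])
```

Only the documents in `both` need their position vectors compared when a span clause is scored.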

Results

Making it Real
• Accessing the index data: ids, frequencies, positions
• Managing GPU access
• Recursion for nested clauses
• Scoring special cases
• Coverage of query types

Shared index data
• The first approach was to create a custom index with only the values we needed for scoring.
• Sharing the index with the rest of Lucene would be ideal, but how much would this cost us?

Shared index data - results

Managing GPU access
• Need to handle simultaneous queries from many host threads
• A dedicated set of streams, one per host thread, to handle each query
• The number of streams is limited by the available GPU memory and the index size
• Once the GPU is fully utilized, additional host threads can be blocked or can fall back to calling Lucene directly
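
The block-or-fall-back policy can be sketched with a small pool of stream ids. This is a simplification under stated assumptions: the class `StreamPool`, the callables `gpu_search` and `cpu_search`, and the pool (rather than a fixed stream per host thread) are all hypothetical stand-ins; no real CUDA streams are created.

```python
import queue

class StreamPool:
    """Hand out a limited number of stream ids; fall back to the CPU when none is free."""

    def __init__(self, num_streams):
        self._free = queue.Queue()
        for stream_id in range(num_streams):
            self._free.put(stream_id)

    def run(self, query, gpu_search, cpu_search, block=False):
        try:
            # block=True makes the caller wait; block=False falls back immediately.
            stream_id = self._free.get(block=block)
        except queue.Empty:
            return cpu_search(query)          # GPU fully utilized
        try:
            return gpu_search(query, stream_id)
        finally:
            self._free.put(stream_id)         # always return the stream to the pool

# Hypothetical search functions, for illustration only.
def cpu_search(query):
    return ("cpu", query)

def gpu_search(query, stream_id):
    return ("gpu", stream_id, query)

pool = StreamPool(num_streams=1)
first = pool.run("canada", gpu_search, cpu_search)
```

In practice `num_streams` would be derived from the available GPU memory and the index size, as the slide notes.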

Recursion for nested spans
• Although CUDA supports recursion, the unknown stack size becomes an issue.
• Implemented the recursion as loops and managed an explicit ("fake") stack in global memory
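
Replacing recursion with a loop and an explicit stack can be sketched as below for a nested clause tree. The node kinds `"term"`, `"sum"`, and `"max"` are hypothetical placeholders for real clause types, and a Python list stands in for what would be a preallocated buffer in GPU global memory.

```python
def evaluate(root):
    """Evaluate a nested clause tree without recursion (post-order traversal)."""
    stack = [(root, False)]   # (node, children_already_pushed)
    results = []              # values of finished sub-clauses
    while stack:
        node, expanded = stack.pop()
        kind = node[0]
        if kind == "term":
            results.append(node[1])            # leaf: value is ready
        elif not expanded:
            stack.append((node, True))         # revisit after the children
            for child in node[1]:
                stack.append((child, False))
        else:
            values = [results.pop() for _ in node[1]]
            results.append(sum(values) if kind == "sum" else max(values))
    return results[0]

tree = ("sum", [("term", 1.0),
                ("max", [("term", 2.0), ("term", 0.5)]),
                ("term", 0.25)])
value = evaluate(tree)
```

The maximum depth of `stack` is bounded by the size of the clause tree, so its capacity can be fixed before the kernel runs.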

Query Types vs Coverage
• Query types are unique combinations of search clauses: terms, spanNear, spanOr, nested spans, etc.
• Coverage progresses from the most common clause type to the least common.

Scoring span clauses has special cases
• There are some special cases, such as when phrases overlap.

Conclusion
• Speedup of half an order of magnitude
• Many challenges: shared index, query types, recursion, …
• GPU performance is even higher for complex queries
– Words that match many documents, which require more threads
– Complex span clauses with many position values
• Speeding up queries allows building more complex queries and scoring documents better, which may help improve accuracy

Questions?
