AutoAdapt @ TREC 2010 Dyaa Albakour October 7, 2010 Dyaa Albakour - PowerPoint PPT Presentation

Table of contents
1 The AutoAdapt Project
2 TREC 2010
    What is TREC?
    The Session Track
3 ClueWeb09 and Indexing
4 Experiments
    Overview
    Baseline 1
    Baseline 2
    The AutoAdapt Approach
5 Future Work

Update on the AutoAdapt Project
- Ant Colony Optimisation for Deriving Suggestions from Intranet Query Logs (WI10 paper).
- A Methodology for Simulated Experiments in Interactive Search (SimInt 2010 @ SIGIR).
- Towards Adaptive Search in Digital Libraries (submitted as a book chapter for AT4DL).
- Building an adaptive search system.
- Collaborating with a number of industrial partners.

What is TREC?
- Purpose: to support research within the information retrieval community by providing the infrastructure necessary for large-scale evaluation of text retrieval methodologies.
- Co-sponsored by the National Institute of Standards and Technology (NIST) and the U.S. Department of Defense. Started in 1992.
- Annual competition:
    - Tracks are announced in February.
    - Results are usually submitted in summer.
    - Assessments are back in September.
    - The conference takes place in November.
- Seven tracks in TREC 2010: Blog Track, Chemical IR Track, Entity Track, Legal Track, Relevance Feedback Track, Session Track, Web Track.

The Session Track
- Evaluate the effectiveness of search engines in interpreting query reformulations.
- A good search engine should be able to utilise the previous queries in the sequence of a session to provide better results that reflect the user needs throughout the session.
- Example:
    - Britney Spears → Paris Hilton
    - France Hotels → Paris Hilton
- The Session Track provides a framework to assess this particular issue in information retrieval systems.

The Session Track - The Task
- Only sessions with two queries are considered this year.
- Participants are given a set of 150 query pairs. Each query pair (original query, query reformulation) represents a user session.
- Participants are asked to submit three ranked lists of documents from the ClueWeb09 dataset:
    - One for the original query (RL1).
    - One for the query reformulation, ignoring the original query (RL2).
    - One for the query reformulation, taking the original query into consideration (RL3).

The Session Track - Type of Queries
1 Generalisation: ‘low carb high fat diet’ → ‘types of diets’.
2 Specification: ‘us map’ → ‘us map states and capitals’.
3 Drifting/Parallel Reformulation: ‘music man performances’ → ‘music man script’.

The Session Track - Evaluation
1 Can search engines improve their performance for a given query using previous queries? (RL2, RL3)
2 How do they perform over an entire session? (RL1, RL3)
- PC(10) and nDCG(10) will be exactly estimated.
- Participants can be ranked and their performance can be compared over RL2 and RL3.
- The primary comparison measure between participants is nDCG(10) for RL3.
- Documents that appear in RL1 will be penalised if they reappear in RL2 and RL3.
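As a rough illustration of the primary measure, the sketch below computes nDCG at a cutoff of 10 from graded relevance judgements. The slides do not state the exact gain and discount functions used by the track, so the common (2^g - 1) / log2(rank + 1) form is an assumption here, and the function names are hypothetical.

```python
import math

def dcg_at_k(grades, k=10):
    """Discounted cumulative gain over the top-k relevance grades.

    Assumes the common (2^g - 1) / log2(rank + 1) formulation; the track's
    official evaluation may use a different gain or discount.
    """
    return sum((2 ** g - 1) / math.log2(rank + 1)
               for rank, g in enumerate(grades[:k], start=1))

def ndcg_at_k(grades, judged_grades, k=10):
    """DCG of the submitted ranking divided by the DCG of an ideal ranking.

    grades: relevance grades of the submitted list, in rank order.
    judged_grades: grades of all judged documents for the query.
    """
    ideal = dcg_at_k(sorted(judged_grades, reverse=True), k)
    return dcg_at_k(grades, k) / ideal if ideal > 0 else 0.0
```

A ranking that places the judged documents in decreasing order of grade scores 1.0; pushing a relevant document down the list lowers the score.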

The ClueWeb09 Dataset
- 1,040,809,705 (about 1 billion) web pages, in 10 languages.
- ClueWeb09 Category B: 50 million English pages (Tier 1 web crawl).
- A public index is available through the Indri search engine.
- The Indri search engine supports language-model retrieval (query likelihood model).
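To make the query likelihood model more concrete, here is a minimal sketch that scores a document by the log-probability of generating the query terms from a smoothed document language model. Dirichlet smoothing and the value of `mu` are illustrative assumptions, not settings taken from the slides, and the function is a toy stand-in rather than Indri's implementation.

```python
import math
from collections import Counter

def query_likelihood(query_terms, doc_terms, collection_tf, collection_len, mu=2500.0):
    """Log query likelihood with Dirichlet smoothing (assumed smoothing method).

    collection_tf maps each term to its frequency in the whole collection;
    collection_len is the total number of term occurrences in the collection.
    """
    tf = Counter(doc_terms)
    score = 0.0
    for term in query_terms:
        p_collection = collection_tf.get(term, 0) / collection_len
        p = (tf[term] + mu * p_collection) / (len(doc_terms) + mu)
        if p == 0.0:
            # The term never occurs in the collection, so no document can generate it.
            return float("-inf")
        score += math.log(p)
    return score
```

Documents are then ranked by decreasing score, so a document that contains the query terms ranks above one that only receives collection-level probability mass.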

The Runs Matrix

            RL1      RL2      RL3
System 1    $D_q$    $D_r$    (baseline 1)
System 2    $D_q$    $D_r$    (baseline 2)
System 3    $D_q$    $D_r$    (AutoAdapt Approach)

- $q$: the original query, consisting of a number of terms $qt_i$.
- $r$: the reformulated query, consisting of a number of terms $rt_i$.
- $D_q$: a ranked list of documents returned by Indri (query likelihood model):
  $D_q = \langle d_{q,1}, d_{q,2}, \ldots, d_{q,n} \rangle$, with $d_{q,i} \notin \mathrm{SPAM}$ and $n < 1000$.
- 70% of ClueWeb09 documents are considered spam.
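The constraints on the returned lists (no spam documents, fewer than 1000 results) can be sketched as a small post-processing step. How spam documents are identified is not described in the slides, so the `spam_ids` set and the function name below are hypothetical.

```python
def filter_ranked_list(ranked_doc_ids, spam_ids, max_len=999):
    """Drop documents flagged as spam, keep the original order, cap the length.

    spam_ids is a hypothetical set of document identifiers considered spam.
    max_len=999 reflects the n < 1000 constraint on the ranked list.
    """
    kept = [doc_id for doc_id in ranked_doc_ids if doc_id not in spam_ids]
    return kept[:max_len]
```

The filter keeps the retrieval order intact, so the remaining documents are still ranked by their original score.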

Baseline 1
For RL3, we return the list $D_{q+r}$:
- Submit a query made of the terms of both queries ($qt \cup rt$).
- Use the Indri #combine function.
- Example: ‘becoming dj’ → ‘dj jobs’
- Submitted Indri query: #combine(becoming dj jobs)
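A minimal sketch of how the Baseline 1 query string could be assembled, assuming whitespace tokenisation and that repeated terms are kept only once, in order of first appearance. The function name is hypothetical; only the #combine operator comes from the Indri query language.

```python
def combine_query(original, reformulation):
    """Build an Indri #combine query from the union of both queries' terms."""
    terms = []
    for term in original.split() + reformulation.split():
        if term not in terms:  # keep the first occurrence only
            terms.append(term)
    return "#combine(" + " ".join(terms) + ")"
```

For the example session, `combine_query("becoming dj", "dj jobs")` yields `#combine(becoming dj jobs)`.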

Baseline 2
For RL3, we return the list:
$D_r - D_q = \{ d \; ; \; d \in D_r, \; d \notin D_q \}$
The documents in $D_r - D_q$ are ordered using their ranking in $D_r$.
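The Baseline 2 list is an order-preserving set difference, which can be sketched as follows. Ranked lists are assumed to be Python lists of document identifiers, and the function name is hypothetical.

```python
def rank_difference(d_r, d_q):
    """Return the documents of d_r that are not in d_q, in their d_r order."""
    seen_for_original = set(d_q)
    return [doc_id for doc_id in d_r if doc_id not in seen_for_original]
```

For instance, with `d_r = ["d3", "d1", "d5", "d2"]` and `d_q = ["d1", "d2"]`, the result is `["d3", "d5"]`: documents already returned for the original query are removed and the rest keep their relative ranking.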

Mining Query Logs
- Fonseca’s association rules are mined from query logs to extract query suggestions [3].
