RMIT at the NTCIR-13 We Want Web Task
Luke Gallagher
with Joel Mackenzie, Rodger Benham, Ruey-Cheng Chen, Falk Scholer, and J. Shane Culpepper
School of Science (Computer Science), RMIT University
NTCIR ’17 (December 8, 2017)
Gallagher, Mackenzie, Benham, Chen, Scholer and Culpepper. NTCIR ’17 RMIT at NTCIR-13 WWW 1 / 10

Large Scale Search: The Big Picture
[Figure: search pipeline diagram. Recoverable labels: Document Collections; Text Processing, Feature Extraction, Precomputation; Structured Text Index; Index; SNs; Resource Selection; Learning to Rank; Top-k Results; Query (Information Need); Query Parser; Query Rewriting; "??".]

WWW English Subtask
• RMIT submitted four systems for the English subtask
• Classic effectiveness techniques:
  • Term dependencies (FDM, SDM)
  • Query Expansion
  • Field extents (title, inlink, body)
• Static document features:
  • PageRank
  • Spaminess
Victor Lavrenko and W. Bruce Croft. In: Proc. SIGIR. 2001.
D. Metzler and W. B. Croft. In: Proc. SIGIR. 2005.

System Configurations: English Subtask
• RMIT-1: SDM Fields + RM3 Query Expansion (10, 50, 0.6)
• RMIT-2: Linear combination of RMIT-1 + 0.25 × PageRank Priors
• RMIT-3: FDM + RM3 Query Expansion (20, 10, 0.8)
• RMIT-4: n-gram Fields + RM3 Query Expansion (10, 50, 0.6)
Post-retrieval spam filtering was applied to all systems except RMIT-1. Documents with a spam score less than 70 were removed from retrieved results.
Gordon V. Cormack, Mark D. Smucker, and Charles L. Clarke. In: Inf. Retr. (2011).
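The two post-processing steps named on this slide (the 0.25 × PageRank prior used by RMIT-2 and the spam cut-off at 70) can be sketched roughly as below. This is an illustration under assumptions, not the authors' pipeline: the function names, the `(doc_id, score)` ranking format, the additive form of the combination, and the treatment of documents with no spam score or prior are all hypothetical, and the slide does not say how scores or priors were normalised.

```python
from typing import Dict, List, Tuple

Ranking = List[Tuple[str, float]]  # (doc_id, retrieval_score), best first

SPAM_THRESHOLD = 70    # from the slide: scores below 70 are removed
PAGERANK_WEIGHT = 0.25  # from the slide: RMIT-1 + 0.25 x PageRank prior


def add_pagerank_prior(ranking: Ranking, priors: Dict[str, float],
                       weight: float = PAGERANK_WEIGHT) -> Ranking:
    """Add weight * prior to each score and re-sort (assumed additive form).

    Documents without a prior get 0.0 (an assumption).
    """
    rescored = [(doc, score + weight * priors.get(doc, 0.0))
                for doc, score in ranking]
    return sorted(rescored, key=lambda pair: pair[1], reverse=True)


def filter_spam(ranking: Ranking, spam_scores: Dict[str, int],
                threshold: int = SPAM_THRESHOLD) -> Ranking:
    """Drop documents whose spam score is below the threshold.

    Documents with no spam score are dropped too (an assumption).
    """
    return [(doc, score) for doc, score in ranking
            if spam_scores.get(doc, 0) >= threshold]


if __name__ == "__main__":
    run = [("d1", -5.0), ("d2", -5.5), ("d3", -6.0)]
    spam = {"d1": 90, "d2": 40, "d3": 70}
    priors = {"d1": 0.0, "d3": 8.0}
    print(add_pagerank_prior(filter_spam(run, spam), priors))
```

Filtering after retrieval, rather than at indexing time, keeps the index unchanged and lets the threshold vary per run.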

Structured Fields-Based Query
Query: “big red house”

    #weight(
      α1 #combine(big.title red.title house.title)
      α2 #combine(big.inlink red.inlink house.inlink)
      α3 #weight(
        β1 #combine(big.body red.body house.body)
        β2 #combine(#1(big.body red.body) #1(red.body house.body))
        β3 #combine(#uw8(big.body red.body) #uw8(red.body house.body))
      )
    )

RMIT-1 values were (α1, α2, α3) = (0.20, 0.05, 0.75), and (β1, β2, β3) = (0.8, 0.1, 0.1). Tuned on CW09B 200 topics.
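A query of this shape can be generated mechanically from the query terms. The sketch below is only an illustration: the helper names are hypothetical, the default weights are the RMIT-1 values quoted on the slide, and the single-term fallback (dropping the ordered and unordered window clauses when there are no adjacent term pairs) is an assumption, since the slide only shows a three-term query.

```python
from typing import Sequence, Tuple


def combine(terms: Sequence[str], field: str) -> str:
    """#combine over every term restricted to one field."""
    return "#combine(" + " ".join(f"{t}.{field}" for t in terms) + ")"


def pair_windows(op: str, terms: Sequence[str], field: str) -> str:
    """#combine of an operator (#1 or #uw8) over adjacent term pairs."""
    inner = " ".join(f"#{op}({a}.{field} {b}.{field})"
                     for a, b in zip(terms, terms[1:]))
    return f"#combine({inner})"


def build_fields_query(
    query: str,
    alphas: Tuple[float, float, float] = (0.20, 0.05, 0.75),
    betas: Tuple[float, float, float] = (0.8, 0.1, 0.1),
) -> str:
    terms = query.split()
    a1, a2, a3 = alphas
    b1, b2, b3 = betas
    if len(terms) < 2:
        # Assumption: with no term pairs, use only the unigram body clause.
        body = combine(terms, "body")
    else:
        body = (f"#weight( {b1} {combine(terms, 'body')} "
                f"{b2} {pair_windows('1', terms, 'body')} "
                f"{b3} {pair_windows('uw8', terms, 'body')} )")
    return (f"#weight( {a1} {combine(terms, 'title')} "
            f"{a2} {combine(terms, 'inlink')} "
            f"{a3} {body} )")


if __name__ == "__main__":
    print(build_fields_query("big red house"))
```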

English Subtask Results (CW12B)

System    ERR@5    ERR@10   NDCG@5     NDCG@10    RBP@0.9
RMIT-3    0.5065   0.5207   0.3977     0.3968     0.7670+0.0242
RMIT-2    0.5285   0.5378   0.4186     0.4069     0.7533+0.0228
RMIT-4    [a]      [a]      0.4402     0.4249     0.7422+0.0270
RMIT-1    [a]      [a]      [b]        [b]        0.8438+0.0221 ‡

Post-hoc analysis of submissions
BM25      0.4760   0.4879   0.3718     0.3713     0.6509+0.1919 ‡
R3-NQE    0.4955   0.5096   0.3884     0.3879     0.7560+0.0348
R2-NQE    0.5279   0.5403   0.4161     0.4125     0.7537+0.0408
R4-NQE    0.5533   0.5637   0.4276     0.4071     0.7238+0.0456
RBC-14    [c]      [c]      0.4817 ‡   0.4776 ‡   0.8263+0.0025 ‡
R1-NQE    [c]      [c]      0.4723 †   0.4877 ‡   0.8220+0.0453 ‡

[a] The ERR@5/ERR@10 pairs 0.5635/0.5728 and 0.5548/0.5712 belong to RMIT-4 and RMIT-1; the extracted text does not show which pair goes with which system.
[b] The RMIT-1 NDCG values are 0.4783 ‡ and 0.4670; their column order is not recoverable.
[c] The ERR@5/ERR@10 pairs 0.5819/0.5951 and 0.5743/0.5884 belong to RBC-14 and R1-NQE; the assignment is not recoverable.

Holm corrected pairwise statistical tests, with † and ‡ indicating significance at p = 0.05 and p = 0.01 respectively relative to RMIT-3.
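The RBP@0.9 column reports values of the form x+y. Assuming this follows the usual convention for rank-biased precision, x is the score from judged documents and y is the residual, the maximum extra score that unjudged documents and unseen ranks could still contribute. A minimal sketch of that calculation, with a hypothetical function name and input format (a list of per-rank gains, `None` for unjudged):

```python
from typing import List, Optional, Tuple


def rbp_with_residual(gains: List[Optional[float]],
                      p: float = 0.9) -> Tuple[float, float]:
    """Return (base, residual) for rank-biased precision.

    gains[i] is the gain in [0, 1] of the document at rank i + 1,
    or None if that document was never judged.
    """
    base = 0.0
    residual = 0.0
    for i, gain in enumerate(gains):
        weight = (1.0 - p) * p ** i
        if gain is None:
            residual += weight       # unjudged: could be fully relevant
        else:
            base += weight * gain
    residual += p ** len(gains)      # everything below the last rank
    return base, residual


if __name__ == "__main__":
    base, residual = rbp_with_residual([1.0, None, 0.0, 1.0], p=0.9)
    print(f"{base:.4f}+{residual:.4f}")
```

Read this way, a large residual means a larger share of the ranking was unjudged.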

NDCG@10: RMIT-1 vs. BM25

Topic   RMIT-1   BM25     ∆         Query
83      0.4129   0.7189   -0.3060   jetstar airlines hong kong
88      0.3179   0.5943   -0.2764   mexico climate
57      0.3209   0.5893   -0.2684   axle ratio
54      0.4505   0.7025   -0.2519   anime pillow
41      0.2281   0.4676   -0.2395   autumn
71      0.5948   0.0000   +0.5948   dog food for allergies
46      0.6958   0.1423   +0.5535   musical note
45      0.6399   0.0812   +0.5586   commendatory term
30      0.9458   0.3898   +0.5561   robot
28      0.5113   0.0000   +0.5113   typing practice

Post-hoc Query Expansion Analysis
• Query Expansion was used in all submitted systems
• What happens if we turn it off?
• Use the same system configuration without Query Expansion

NDCG@10
System A   System B   Win   Tie   Loss   Win/Loss   ΣWin     ΣLoss    ΣWin/ΣLoss
R1-NQE     RMIT-1     32    39    29     1.103      16.033   15.391   1.042
R2-NQE     RMIT-2     29    45    26     1.115      12.957   11.518   1.125
R3-NQE     RMIT-3 †   11    70    19     0.579      4.943    6.909    0.715
R4-NQE     RMIT-4 ‡   12    58    30     0.400      4.883    13.655   0.358
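A win/tie/loss comparison between two systems can be computed from per-topic scores roughly as follows. This is a sketch under assumptions: the function name and the dictionary input format are hypothetical, and the slide does not say how ties were defined, so the `eps` tolerance (exact equality by default) is an assumption.

```python
from typing import Dict, Optional


def win_tie_loss(a: Dict[int, float], b: Dict[int, float],
                 eps: float = 0.0) -> Dict[str, Optional[float]]:
    """Compare system A against system B on the topics they share.

    A "win" is a topic where A scores more than eps above B.
    sum_win and sum_loss add up the absolute score differences.
    """
    wins = ties = losses = 0
    sum_win = sum_loss = 0.0
    for topic in sorted(a.keys() & b.keys()):
        delta = a[topic] - b[topic]
        if delta > eps:
            wins += 1
            sum_win += delta
        elif delta < -eps:
            losses += 1
            sum_loss += -delta
        else:
            ties += 1
    return {
        "win": wins, "tie": ties, "loss": losses,
        "win_loss": wins / losses if losses else None,
        "sum_win": sum_win, "sum_loss": sum_loss,
        "sum_ratio": sum_win / sum_loss if sum_loss else None,
    }


if __name__ == "__main__":
    # Hypothetical per-topic NDCG@10 scores for two systems.
    system_a = {1: 0.5, 2: 0.2, 3: 0.4, 4: 0.9}
    system_b = {1: 0.3, 2: 0.2, 3: 0.7, 4: 0.8}
    print(win_tie_loss(system_a, system_b))
```

A ratio above 1 favours System A; the summed ratio also reflects how large the wins and losses are, not just how many there are.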

Conclusions and Future Work
• We Want Web task helps to drive research in other sub-fields and vice-versa
• This round we focused on classic retrieval techniques that are known to be effective
• Aim to participate in Chinese subtask in future rounds
• Use more sophisticated techniques in future (LTR, Duet Matching)

Thank You!
