TIRA: Configuring, Executing, and Disseminating Information Retrieval Experiments
Tim Gollub, Benno Stein, Steven Burrows, Dennis Hoppe
Webis Group, Bauhaus-Universität Weimar
www.webis.de
❑ A longitudinal study has shown consistent selection of weak baselines in … [Armstrong et al., 2009]
❑ A polarizing article describes how biases in research approaches lead to the … [Ioannidis, 2005]
❑ The SWIRL 2012 meeting of 45 information retrieval researchers considered … [Allan et al., 2012]
❑ “We have to explore systematically the independent parameters of …” [Fuhr, Salton Award Speech, SIGIR 2012]
© www.webis.de 2012
❑ Increase acknowledgment for publishing experiments, data, and software.
❑ Decrease the overhead of publishing experiments.
❑ Enables public research on private data.
❑ Enables comparisons with private …
❑ Enables linkage of experimental results …
❑ Enables reproduction of results on the …
❑ Enables the specification of whole …
localhost:2306/programs/examples/MyProgram?p1=42&p2=Method1&p2=Method2
tira@node1:~$ ./myprogram.sh -p1 42 -p2 "method1"
tira@node2:~$ ./myprogram.sh -p1 42 -p2 "method2"
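The URL and the two shell invocations above suggest that a repeated query parameter (`p2=Method1&p2=Method2`) expands into one program run per value. The following Python sketch shows one way such an expansion could work, assuming a plain Cartesian product over all parameter values. The function names and the flag mapping are illustrative assumptions, not TIRA's actual code.

```python
import shlex
from itertools import product
from urllib.parse import parse_qs


def expand_runs(url):
    """Return one parameter dict per combination of query values."""
    # parse_qs collects repeated keys into lists,
    # e.g. {"p1": ["42"], "p2": ["Method1", "Method2"]}.
    query = parse_qs(url.partition("?")[2])
    names = list(query)
    return [dict(zip(names, combo))
            for combo in product(*(query[name] for name in names))]


def to_command(script, params):
    """Render one run as a shell command line (hypothetical flag mapping)."""
    args = [script]
    for name, value in params.items():
        args += [f"-{name}", value]
    return shlex.join(args)


url = "localhost:2306/programs/examples/MyProgram?p1=42&p2=Method1&p2=Method2"
runs = expand_runs(url)
commands = [to_command("./myprogram.sh", run) for run in runs]
for command in commands:
    print(command)
```

The sketch passes parameter values through unchanged. The slide shows lowercase `method1` and `method2` on the command line, so the real system may normalize or map values in a way not captured here.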
❑ Enables a widespread usage of the …
❑ Enables the deployment of any …
❑ Enables efficient computation of pending …
❑ Enables retrieval and maintenance of …
❑ Conduct shared work on the same …
www.evaluatir.org
expdb.cs.kuleuven.be
www.mlcomp.org
www.myexperiment.org
www.music-ir.org
www.tunedit.org
pipes.yahoo.com
Program Record
❑ A JSON-based program deployment descriptor. Example:
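The example from the original slide is not preserved in this transcript. Purely as a hypothetical illustration, a descriptor for the `MyProgram` example might bundle a command template with its parameter declarations. Every key name below (`name`, `command`, `parameters`, `type`, `values`) is an assumption and not TIRA's actual schema. The Python snippet parses such a descriptor and checks that each `$` placeholder in the command is declared.

```python
import json
import re

# Hypothetical descriptor: all keys are assumptions made for illustration.
descriptor_json = """
{
  "name": "MyProgram",
  "command": "./myprogram.sh -p1 $p1 -p2 $p2",
  "parameters": {
    "p1": {"type": "integer"},
    "p2": {"type": "choice", "values": ["Method1", "Method2"]}
  }
}
"""

descriptor = json.loads(descriptor_json)

# Collect the $name placeholders used in the command template.
placeholders = set(re.findall(r"\$(\w+)", descriptor["command"]))

# Any placeholder without a matching parameter declaration is an error.
undeclared = placeholders - set(descriptor["parameters"])
print("undeclared placeholders:", sorted(undeclared))
```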
Experiment Database
❑ Stores completed as well as pending experiments.
❑ Indexes the input parameters and provides basic retrieval functionality.
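To make the two responsibilities concrete, here is a minimal in-memory sketch in Python. It assumes an inverted index from (parameter, value) pairs to record ids, so that a partial parameter assignment can be matched by intersecting id sets. The class and method names are hypothetical, and the actual storage backend of TIRA is not described in these slides.

```python
from collections import defaultdict


class ExperimentStore:
    """In-memory stand-in for an experiment database (illustrative only)."""

    def __init__(self):
        self.records = []               # record id = position in this list
        self.index = defaultdict(set)   # (parameter, value) -> record ids

    def add(self, params, result=None):
        """Store an experiment; it stays pending until it has a result."""
        record_id = len(self.records)
        status = "pending" if result is None else "completed"
        self.records.append(
            {"params": dict(params), "status": status, "result": result})
        for item in params.items():
            self.index[item].add(record_id)
        return record_id

    def find(self, partial):
        """Return records matching every (name, value) pair in `partial`."""
        ids = set(range(len(self.records)))
        for item in partial.items():
            ids &= self.index.get(item, set())
        return [self.records[i] for i in sorted(ids)]


store = ExperimentStore()
store.add({"p1": "42", "p2": "Method1"}, result="some output")  # made-up result
store.add({"p1": "42", "p2": "Method2"})
store.add({"p1": "7", "p2": "Method1"})

matches = store.find({"p1": "42"})
print([(m["params"]["p2"], m["status"]) for m in matches])
```

An empty query returns every record, and a query with an unknown value returns an empty list.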
TIRA Server
❑ Retrieves experiments based on a (partial) experiment query.
❑ Requests execution of experiment series based on a query.
❑ Realizes web abstraction and creation of TIRA networks.
HTTP Client
❑ Either a Web browser, a client program using the TIRA API, or a remote TiraServer.
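One possible reading of the first two server duties is a lookup-or-create step: for every experiment in a requested series, return the stored record if one exists, and otherwise create a pending record so that it gets executed later. The Python sketch below illustrates that reading with a plain dictionary as the database. This interpretation, the key layout, and the function name are assumptions rather than documented TIRA behavior.

```python
def request_series(database, program, series):
    """Return one record per experiment, creating pending ones as needed."""
    records = []
    for params in series:
        # A program name plus its sorted parameters identifies an experiment.
        key = (program, tuple(sorted(params.items())))
        if key not in database:
            database[key] = {"params": params, "status": "pending", "result": None}
        records.append(database[key])
    return records


database = {}
series = [{"p1": "42", "p2": "Method1"}, {"p1": "42", "p2": "Method2"}]

first = request_series(database, "MyProgram", series)
first[0].update(status="completed", result="some output")  # simulate a finished run

# Asking again reuses the stored records instead of creating duplicates.
second = request_series(database, "MyProgram", series)
print(len(database), [record["status"] for record in second])
```

Under this reading, repeated requests for the same experiment never trigger a second computation.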
Program Wrapper
❑ Continuously queries the ExperimentDatabase for pending experiments.
❑ Registers matching experiments with the ProgramScheduler execution queue.
❑ Updates the ExperimentDatabase with notifications and results.
Program Scheduler
❑ Maintains a pool of system threads.
❑ Requests execution of the next experiments in the queue.
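The interplay of polling, queueing, and pooled execution can be sketched in Python with `queue.Queue` and `concurrent.futures.ThreadPoolExecutor`. This is a simplified illustration under stated assumptions: experiments are plain dictionaries, the polling step runs once instead of continuously, and the function names and the `queued` status are hypothetical.

```python
import queue
from concurrent.futures import ThreadPoolExecutor


def register_pending(experiments, program, execution_queue):
    """Polling step: queue the pending experiments of one program."""
    for experiment in experiments:
        if experiment["program"] == program and experiment["status"] == "pending":
            experiment["status"] = "queued"
            execution_queue.put(experiment)


def queued_items(execution_queue):
    """Yield queued experiments until the queue is empty."""
    while True:
        try:
            yield execution_queue.get_nowait()
        except queue.Empty:
            return


def drain(execution_queue, execute, max_workers=2):
    """Scheduling step: run queued experiments on a fixed pool of threads."""
    def run(experiment):
        experiment["result"] = execute(experiment["params"])
        experiment["status"] = "completed"

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = [pool.submit(run, e) for e in queued_items(execution_queue)]
        for future in futures:
            future.result()  # re-raise any exception from a worker thread


experiments = [
    {"program": "MyProgram", "params": {"p1": 42, "p2": "method1"},
     "status": "pending", "result": None},
    {"program": "MyProgram", "params": {"p1": 42, "p2": "method2"},
     "status": "pending", "result": None},
    {"program": "OtherProgram", "params": {}, "status": "pending", "result": None},
]

execution_queue = queue.Queue()
register_pending(experiments, "MyProgram", execution_queue)
drain(execution_queue, execute=lambda params: f"ran {params['p2']}")
statuses = [e["status"] for e in experiments]
print(statuses)
```

The experiment of `OtherProgram` stays pending because only experiments of the polled program are registered.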
[Figure: TIRA architecture. Components: front-end process, back-end process, Experiment Database, Program Record, TIRA Server, HTTP Client, Program Scheduler, Program Wrapper. Edge labels: retrieve, create, update; query, execute, update; register, execute; update, lookup. Multiplicities: 1..n.]
❑ Detailed comparison subtask:
❑ Evaluation metric is the plagdet score:
❑ TIRA has been used for the training and evaluation phases.
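The formula for the score is missing from this transcript. In the PAN evaluation literature, plagdet is commonly defined as the F1 measure of precision and recall divided by log2(1 + granularity), where granularity is at least 1. The Python sketch below follows that definition. It assumes precision, recall, and granularity have already been computed, and the input values in the usage lines are made up.

```python
import math


def plagdet(precision, recall, granularity):
    """plagdet = F1(precision, recall) / log2(1 + granularity)."""
    if granularity < 1:
        raise ValueError("granularity is expected to be at least 1")
    if precision + recall == 0:
        return 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return f1 / math.log2(1 + granularity)


# Made-up inputs: a perfect detector, and one that splits detections.
perfect = plagdet(1.0, 1.0, 1.0)
fragmented = plagdet(0.5, 0.5, 3.0)
print(perfect, fragmented)
```

A granularity of 1 leaves F1 unchanged, while detections reported in several fragments are penalized through the logarithm.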
❑ Participants upload detection results for …
❑ From the user inputs the program …
❑ Detection results are unzipped and …
❑ Participants receive performance results …
❑ The training service served as a …
tira@node1:~$ unzip -o $Detection -d det && python $PROGRAM/perfmeasure.py
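The command above contains `$Detection` and `$PROGRAM` placeholders that are filled in before execution. How TIRA performs this substitution is not specified in the slides. As a rough sketch, Python's `string.Template` uses the same `$name` syntax and can illustrate the idea. The helper name, the file name, and the directory below are hypothetical, and quoting values with `shlex.quote` is an added precaution, not something the source describes.

```python
import shlex
from string import Template

COMMAND = "unzip -o $Detection -d det && python $PROGRAM/perfmeasure.py"


def render(template, **values):
    """Fill $name placeholders, shell-quoting every substituted value."""
    quoted = {name: shlex.quote(str(value)) for name, value in values.items()}
    # substitute() raises KeyError if a placeholder has no value.
    return Template(template).substitute(quoted)


# Hypothetical values for illustration.
rendered = render(COMMAND, Detection="detections.zip", PROGRAM="/opt/programs/pan")
print(rendered)
```

Quoting matters because uploaded file names are user input: a name containing spaces or shell metacharacters would otherwise change the meaning of the command.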
❑ TIRA servers are provided for two …
❑ Participants submit their plagiarism …
❑ A third TIRA server controls the overall …
[Figure labels: Windows 7, Ubuntu 12.04, tira@localhost, tira@buw]
❑ Task. Group the ranked lists from search results into coherent clusters to …
❑ Benefit. Fetch search results from multiple search engines for storage as …
❑ Task. Pre-compute structural design behavior through learning from large …
❑ Benefit. Easily walk through large parameter spaces and avoid duplication …
❑ Keep it simple.
❑ System independence is a key requirement.
❑ Create more incentives to use TIRA as a leaderboard.
❑ The powerful parameter-substitution mechanism made it easy to get valid …
❑ Automated program deployment, e.g. Google App Engine.
❑ Move from open source to open development.