Explaining Query Modifications
An alternative interpretation of term addition and removal
Vera Hollink, Jiyin He, Arjen de Vries CWI, the Netherlands
Monday, March 26, 12
Query modifications
merlin
merlin legend (term addition)
merlin avalon (term substitution)
merlin avalon arthur (term addition)
avalon arthur (term removal)
screenshot mac (different)
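The labels in the example session can be reproduced by comparing the term sets of consecutive queries. The sketch below is only an illustration: the function name `classify_modification` and the whitespace tokenisation are assumptions, not the procedure used in the study.

```python
def classify_modification(previous: str, current: str) -> str:
    """Label the step from `previous` to `current` by comparing term sets.

    Hypothetical helper: terms are lower-cased, whitespace-separated tokens.
    """
    old = set(previous.lower().split())
    new = set(current.lower().split())
    if old == new:
        return "same"
    if old < new:            # every old term kept, at least one new term
        return "term addition"
    if new < old:            # only terms dropped
        return "term removal"
    if old & new:            # some terms kept, some replaced
        return "term substitution"
    return "different"       # no terms in common


# The session shown on the slide.
session = [
    "merlin",
    "merlin legend",
    "merlin avalon",
    "merlin avalon arthur",
    "avalon arthur",
    "screenshot mac",
]
labels = [classify_modification(a, b) for a, b in zip(session, session[1:])]
for query, label in zip(session[1:], labels):
    print(f"{query}: {label}")
```

Strict subset tests keep addition and removal apart from substitution, which needs both a shared and a replaced term.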
query terms
([…] and Dennis, 1997; Costa and Seco, 2008; He et al., 2002; Jansen et al., 2009)
[Figure: documents containing terms A and B, with the result sets for the queries "A" and "A B". Labels: addition / specification; removal / generalization.]
[Figure: documents containing terms A and B, with the result sets for the queries "A" and "A B". Labels: addition / generalization; removal / specification.]
documents that contain all query terms
the added term
Intersection-based / Union-based
[Figure: the two interpretations side by side. Intersection-based: addition / specification, removal / generalization. Union-based: addition / generalization, removal / specification. Result-set labels, in order of appearance: Diverse, Coherent; Coherent, Diverse; High coverage, Low coverage.]
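The two interpretations can be contrasted on a toy collection. Under intersection semantics a document must contain every query term, so adding a term can only shrink the result set. Under union semantics one matching term is enough, so adding a term can only grow it. The documents, the `retrieve` function and its `mode` parameter below are hypothetical and serve only as an illustration.

```python
# Hypothetical toy collection; not data from the study.
DOCS = {
    "d1": "merlin the wizard",
    "d2": "merlin and the isle of avalon",
    "d3": "avalon in arthurian legend",
    "d4": "screenshot shortcuts on a mac",
}


def retrieve(query: str, docs: dict, mode: str) -> set:
    """Return ids of documents matching the query.

    mode="intersection": a document must contain all query terms.
    mode="union": a document must contain at least one query term.
    """
    if mode not in ("intersection", "union"):
        raise ValueError("mode must be 'intersection' or 'union'")
    terms = set(query.lower().split())
    match = all if mode == "intersection" else any
    return {
        doc_id
        for doc_id, text in docs.items()
        if match(term in text.lower().split() for term in terms)
    }


inter_a = retrieve("merlin", DOCS, "intersection")
inter_ab = retrieve("merlin avalon", DOCS, "intersection")
union_a = retrieve("merlin", DOCS, "union")
union_ab = retrieve("merlin avalon", DOCS, "union")

print(sorted(inter_a), "->", sorted(inter_ab))   # addition narrows the set
print(sorted(union_a), "->", sorted(union_ab))   # addition widens the set
```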
decrease the coherence of the result sets?
and term additions occur when all results do contain all terms?
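As a rough illustration of the measures used in the results, the sketch below assumes that coverage is the fraction of results containing all query terms (as the research question above suggests) and that average similarity is the mean pairwise cosine similarity of bag-of-words vectors. Both definitions, and all function names, are assumptions made for this example and not the authors' formulas.

```python
import math
from collections import Counter
from itertools import combinations


def coverage(query: str, results: list) -> float:
    """Assumed definition: share of result texts containing every query term."""
    if not results:
        return 0.0
    terms = set(query.lower().split())
    hits = sum(1 for text in results if terms <= set(text.lower().split()))
    return hits / len(results)


def cosine(a: str, b: str) -> float:
    """Cosine similarity of two texts using raw term counts."""
    va, vb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(va[t] * vb[t] for t in va)
    norm = math.sqrt(sum(c * c for c in va.values())) * math.sqrt(
        sum(c * c for c in vb.values())
    )
    return dot / norm if norm else 0.0


def average_similarity(results: list) -> float:
    """Assumed definition: mean cosine similarity over all result pairs."""
    pairs = list(combinations(results, 2))
    if not pairs:
        return 0.0
    return sum(cosine(a, b) for a, b in pairs) / len(pairs)


# Hypothetical result snippets for the query "merlin avalon".
results = ["merlin avalon", "merlin wizard"]
print(coverage("merlin avalon", results))
print(average_similarity(results))
```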
judgements in determining the coherency of a result set (Cohen’s kappa = 0.70)
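Cohen's kappa corrects the observed agreement between two annotators for the agreement expected by chance: kappa = (p_o - p_e) / (1 - p_e). A minimal sketch of that computation follows. The labels are made up for illustration and are not the judgements collected in the study.

```python
from collections import Counter


def cohens_kappa(labels_a: list, labels_b: list) -> float:
    """Cohen's kappa for two annotators labelling the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two non-empty label lists of equal length")
    n = len(labels_a)
    # Observed agreement: share of items with identical labels.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement from each annotator's label distribution.
    count_a, count_b = Counter(labels_a), Counter(labels_b)
    expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both annotators used one single label throughout
    return (observed - expected) / (1 - expected)


# Hypothetical judgements: "c" = coherent, "d" = diverse.
annotator_1 = ["c", "c", "d", "d"]
annotator_2 = ["c", "c", "d", "c"]
print(cohens_kappa(annotator_1, annotator_2))
```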
Logs         News      iCLEF 08/09   Web
All          556,007   49,174        20,000
2 terms      282,039   15,713        4,842
>=2 terms    355,660   44,132        17,659
Data                 Coherence (A R)   Avg Sim (A R)   Coverage (A R)
News    all          0.65 ≫ 0.56       0.56 ≫ 0.52     0.90 ≫ 0.29
        2 terms      0.66 ≫ 0.57       0.56 ≫ 0.52     0.78 ≫ 0.40
        >=2 terms    0.66 ≫ 0.56       0.56 ≫ 0.52     0.73 ≫ 0.29
iCLEF   all          0.94 ≫ 0.71       0.32 ≫ 0.29     0.80 ≫ 0.39
        2 terms      0.94 ≫ 0.73       0.34 ≫ 0.27     0.81 ≫ 0.51
        >=2 terms    0.94 ≫ 0.71       0.35 ≫ 0.29     0.75 ≫ 0.39
Web     all          0.68 ≫ 0.64       0.28 ≫ 0.27     0.69 ≫ 0.35
        2 terms      0.70 ≫ 0.58       0.29 ≫ 0.25     0.80 ≫ 0.61
        >=2 terms    0.73 ≫ 0.64       0.30 ≫ 0.27     0.64 ≫ 0.35
A = term addition, R = term removal.
≫/≪ indicates significantly larger/smaller with p-value <0.01 using the Wilcoxon rank sum test.
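For readers unfamiliar with the test, the sketch below shows a two-sided Wilcoxon rank sum test using the normal approximation and average ranks for ties. It applies no tie correction to the variance. The function name and the sample values are hypothetical, and the sketch illustrates the test rather than reproducing the analysis behind the table.

```python
import math


def rank_sum_test(x: list, y: list) -> tuple:
    """Return (z, two-sided p-value) for the Wilcoxon rank sum test.

    Uses the normal approximation without a tie correction of the variance.
    """
    if not x or not y:
        raise ValueError("both samples must be non-empty")
    # Pool the samples, remembering which sample each value came from.
    pooled = sorted((v, g) for g, sample in enumerate((x, y)) for v in sample)
    rank_sum_x = 0.0
    i = 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j][0] == pooled[i][0]:
            j += 1
        # Positions i..j-1 are tied; they share the mean of ranks i+1..j.
        avg_rank = (i + 1 + j) / 2
        rank_sum_x += avg_rank * sum(1 for k in range(i, j) if pooled[k][1] == 0)
        i = j
    n1, n2 = len(x), len(y)
    mean = n1 * (n1 + n2 + 1) / 2
    sd = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (rank_sum_x - mean) / sd
    p = math.erfc(abs(z) / math.sqrt(2))  # equals 2 * (1 - Phi(|z|))
    return z, p


# Hypothetical per-query coverage values for additions and removals.
additions = [0.9, 0.8, 0.85, 0.95, 0.7, 0.75, 0.88, 0.92]
removals = [0.3, 0.2, 0.25, 0.4, 0.1, 0.35, 0.28, 0.15]
z, p = rank_sum_test(additions, removals)
print(f"z = {z:.2f}, significant at 0.01: {p < 0.01}")
```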
Data                 Coherence (A R)   Avg Sim (A R)   Coverage (A R)
all
News 2 terms
>=2 terms
all
iCLEF 2 terms
>=2 terms
all
0.002 << 0.010 -0.320 << 0.337 Web 2 terms
>=2 terms
≫/≪ indicates significantly larger/smaller with p-value <0.01 using the Wilcoxon rank sum test.
[Figure: three histograms (Web, iCLEF, News) of relative frequency against coverage in bins of 0.1, for term addition and term removal.]
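A histogram of this kind can be derived by assigning each coverage value to one of ten bins of width 0.1 and normalising the counts. The sketch below assumes coverage values in [0, 1] and puts the value 1.0 in the last bin. The function name and the input values are hypothetical.

```python
def relative_frequencies(values: list, bins: int = 10) -> list:
    """Relative frequency of values in [0, 1] over equal-width bins."""
    counts = [0] * bins
    for v in values:
        if not 0.0 <= v <= 1.0:
            raise ValueError("coverage values must lie in [0, 1]")
        # int() truncates towards zero; 1.0 is clamped into the last bin.
        index = min(int(v * bins), bins - 1)
        counts[index] += 1
    total = len(values)
    return [c / total for c in counts] if total else [0.0] * bins


# Hypothetical coverage values for a handful of queries.
freqs = relative_frequencies([0.05, 0.95, 1.0, 0.55])
for lower, freq in enumerate(freqs):
    print(f"[{lower / 10:.1f}, {(lower + 1) / 10:.1f}): {freq:.2f}")
```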
between query modification and result set coherence
interpretation is not always valid
explanation to query modifications
based interpretation may lead to a biased view of the intentions behind query modifications
[…], Bonchi, F., Castillo, C., Vigna, S.: Query reformulation mining: models, patterns, and applications. Information Retrieval 14(3), 257–289 (2010)
[…], Dennis, S.: Query reformulation on the internet: empirical data and the hyperindex search engine. In: RIAO’97. pp. 488–499 (1997)
[…], Seco, N.: Hyponymy extraction and web search behavior analysis based on query reformulation. In: IBERAMIA’08 (2008)
[…] session identification. Information Processing and Management 38(5), 727–742 (2002)
[…] web searching. JASIST 60(7), 1358–1371 (2009)
the modified queries and the difference of the two
“incomparable”
judgements (Cohen’s kappa = 0.70)
Coherence scores (Strong negative, Roughly equal, Strong positive) by Majority
More specific: 23, 3
Equal specific: 3, 26, 6
Less specific: 5, 20