13
Road Map
- Introduction
- How we use BGM for filtering
  – Using expert’s heuristics as a Bayesian prior (SIGIR 04)
  – Exploration and exploitation trade off using Bayesian active learning (ICML 03)
  – Combining multiple forms of evidence using graphical models (HLT 05)
  – Collaborative adaptive user modeling with explicit & implicit feedback (CIKM 06)
- Contribution and future work
14
Motivation: Using Heuristics as Bayesian Prior
$P(y = \text{yes} \mid x, w)$
Graphical model nodes: x (document), y (relevant), w (parameter)
$P(w \mid \text{mean}, \text{variance})$
Priors: mean, variance
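The two quantities on this slide can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: it assumes the standard logistic (sigmoid) form for P(y = yes | x, w), a bias term stored as the last entry of w, and an isotropic Gaussian prior with a single shared variance. The function names `p_relevant` and `log_gaussian_prior` are hypothetical.

```python
import math

def p_relevant(x, w):
    """P(y = yes | x, w), assuming a logistic model; w[-1] is the bias term."""
    # zip stops at len(x), so the bias w[-1] is added separately.
    z = sum(wi * xi for wi, xi in zip(w, x)) + w[-1]
    return 1.0 / (1.0 + math.exp(-z))

def log_gaussian_prior(w, mean, variance):
    """log N(w | mean, variance * I): an assumed isotropic Gaussian prior on w."""
    n = len(w)
    sq_dist = sum((wi - mi) ** 2 for wi, mi in zip(w, mean))
    return -0.5 * n * math.log(2 * math.pi * variance) - sq_dist / (2 * variance)

# A document with all-zero features and zero bias gets probability 0.5.
prob = p_relevant([0.0, 0.0], [1.0, -2.0, 0.0])
```

The prior is largest at its mean, so a heuristic that supplies a good mean pulls the learned w toward a sensible starting point when training data is scarce.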
15
When is it Expected to Work?
[Figure: performance vs. number of training data]
- Learner: low bias (hypothesis: logistic regression)
- Heuristic algorithm used to estimate prior: low variance (Rocchio algorithm)
16
Method: Convert Decision Boundary to Prior Distribution
[Figure: document space (N) mapped to logistic regression parameter space (N+1); Rocchio + threshold => $w_R$]
- Step 1: Heuristic algorithm => $w_R$
- Step 2: $w_m = \arg\max_{w} \prod_{i=1}^{T} p(y_i \mid x_i, w)$ subject to $\mathrm{cosine}(w, w_R) = 1$
  Equivalently, $w_m = \alpha^* w_R$, where $\alpha^* = \arg\max_{\alpha} \prod_{i=1}^{T} p(y_i \mid x_i, \alpha w_R)$
- Step 3: Use $w_m$ as logistic regression prior mean
  $P(w) = N(w \mid w_m, v)$
- Step 4: Estimate posterior distribution of logistic parameter
  $P(w \mid D_t) = \frac{1}{Z(D_t)} \prod_{i} p(y_i \mid x_i, w) \, P(w)$
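The conversion from a decision boundary to a prior can be sketched as follows. This is a hedged illustration under several assumptions, not the authors' code: labels are coded as +1/-1, the bias is the last entry of w, the scale is found by ternary search over an assumed interval [0, 50], the prior covariance is isotropic, and the last step returns only the posterior mode (a MAP estimate) rather than the full posterior distribution. All function names, the vector `w_r`, and the toy data are hypothetical.

```python
import math

def log_sigmoid(z):
    """Numerically stable log(1 / (1 + exp(-z)))."""
    return -math.log1p(math.exp(-z)) if z >= 0 else z - math.log1p(math.exp(z))

def score(w, x):
    """Linear score w . [x, 1]; the last entry of w is the bias."""
    return sum(wi * xi for wi, xi in zip(w, x)) + w[-1]

def log_likelihood(w, data):
    """Sum of log p(y_i | x_i, w), with labels y in {+1, -1}."""
    return sum(log_sigmoid(y * score(w, x)) for x, y in data)

def scale_to_prior_mean(w_r, data, lo=0.0, hi=50.0, iters=100):
    """Find alpha* maximizing the likelihood of alpha * w_R, then w_m = alpha* w_R.
    The log-likelihood is concave in alpha, so ternary search on [lo, hi] applies."""
    for _ in range(iters):
        a = lo + (hi - lo) / 3
        b = hi - (hi - lo) / 3
        if log_likelihood([a * v for v in w_r], data) < log_likelihood([b * v for v in w_r], data):
            lo = a
        else:
            hi = b
    alpha = (lo + hi) / 2
    return alpha, [alpha * v for v in w_r]

def map_estimate(data, w_m, variance, lr=0.1, steps=2000):
    """Posterior mode only: gradient ascent on log-likelihood + log N(w | w_m, variance * I)."""
    w = list(w_m)
    for _ in range(steps):
        grad = [-(wi - mi) / variance for wi, mi in zip(w, w_m)]
        for x, y in data:
            # d/dw log sigmoid(y * s) = y * (1 - sigmoid(y * s)) * [x, 1]
            coeff = y * (1.0 - math.exp(log_sigmoid(y * score(w, x))))
            for j, xj in enumerate(list(x) + [1.0]):
                grad[j] += coeff * xj
        w = [wi + lr * g for wi, g in zip(w, grad)]
    return w

# Toy data: (features, label). w_r stands in for a Rocchio direction plus threshold.
data = [([2.0, 1.0], 1), ([1.5, 0.5], 1), ([0.2, 0.1], -1),
        ([0.0, 0.5], -1), ([0.6, 0.6], -1), ([0.3, 0.5], 1)]
w_r = [1.0, 1.0, -1.0]
alpha, w_m = scale_to_prior_mean(w_r, data)
w_map = map_estimate(data, w_m, variance=1.0)
```

Because `w_m` is a positive multiple of `w_r`, it keeps the heuristic's decision boundary and only changes how confident the logistic model is about it. The toy data is deliberately not separable along `w_r`; if it were, the best scale would grow without bound and the search would stop at the upper end of the interval.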
17
Results
[Figure: normalized utility on TREC 11 Adaptive Filtering Data; bars: Logistic Regression, Rocchio, Logistic_UnscaledRocchio]
- Best TREC official result: 0.475
- Similar performance on TREC 9 data
[Figure: TDT 2004 results reported by NIST; bars: LR_Rocchio, Team 1, Team 2, Team 4]
- Team_1 reported a slightly better result (0.7328) at the TDT workshop
18
Road Map
- Introduction
- How we use BGM for filtering
  – Using expert’s heuristics as a Bayesian prior
  – Exploration and exploitation trade off using Bayesian active learning
  – Combining multiple forms of evidence using graphical models
  – Collaborative adaptive user modeling with explicit & implicit feedback
- Contribution and future work