A Two-Stage Approach to Domain Adaptation for Statistical Classifiers
Jing Jiang & ChengXiang Zhai
Department of Computer Science University of Illinois at Urbana-Champaign
What is domain adaptation?
[Figure: named entity recognition (NER). In the standard setting, an NER classifier is trained on New York Times data and tested on New York Times data. When labeled New York Times data is not available, the classifier is trained on Reuters data and tested on New York Times data.]
[Figure: train → test pairs. NER classifier: NYT → NYT versus Reuters → NYT. Gene name recognizer: mouse → mouse versus fly → mouse.]
Public email collection → personal inboxes
Digital cameras → cell phones
Movies → books
… learning?
Domain adaptation: learning methods that are aware of the difference between the training and test domains.
wingless daughterless eyeless apexless …
CD38 PABPC5 …
[Figure: features of the source domain and the target domain, split into generalizable features and domain-specific features.]
Previous work models it implicitly [Blitzer et al. 2006, Ben-David et al. 2007, Daumé III 2007].
… multiple source (training) domains.
Some work requires labeled target data [Daumé III 2007].
… semi-supervised learning.
Previous work does not incorporate semi-supervised learning [Blitzer et al. 2006, Ben-David et al. 2007, Daumé III 2007].
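$$p(y \mid x; w) = \frac{\exp(w_y^T x)}{\sum_{y'} \exp(w_{y'}^T x)}$$
Example input: "… and wingless are expressed in…"
$$\hat{w} = \arg\min_{w} \left[ \lambda \|w\|^2 - \frac{1}{N} \sum_{i=1}^{N} \log p(y_i \mid x_i; w) \right]$$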
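As an illustration of the classifier on this slide, the following is a minimal sketch of a multiclass logistic regression model and its regularized training objective. It assumes one weight vector per class and an L2 penalty weighted by `lam`; the function names and the toy data are hypothetical and are not taken from the talk.

```python
import math

def class_probs(weights, x):
    """Return p(y | x; w) for every class y, with one weight vector per class."""
    scores = [sum(w_j * x_j for w_j, x_j in zip(w_y, x)) for w_y in weights]
    top = max(scores)  # subtract the largest score for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def objective(weights, data, lam):
    """lam * ||w||^2 minus the average log-likelihood of the labeled data."""
    penalty = lam * sum(w_j * w_j for w_y in weights for w_j in w_y)
    avg_ll = sum(math.log(class_probs(weights, x)[y]) for x, y in data) / len(data)
    return penalty - avg_ll

# Hypothetical toy problem: 2 classes, 2 binary features.
weights = [[0.0, 0.0], [0.0, 0.0]]
data = [([1.0, 0.0], 1), ([0.0, 1.0], 0)]
loss_at_zero = objective(weights, data, lam=0.1)
```

With all weights at zero, every class is equally likely, so the objective reduces to the log of the number of classes. A training procedure would minimize `objective` over `weights`; the optimizer is left out of this sketch.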
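Generalization stage:
$$(\hat{A}, \hat{v}, \{\hat{u}_k\}) = \arg\min_{A, v, \{u_k\}} \left[ \lambda \|v\|^2 + \lambda_s \sum_{k=1}^{K} \|u_k\|^2 - \frac{1}{K} \sum_{k=1}^{K} \frac{1}{N_k} \sum_{i=1}^{N_k} \log p(y_i^k \mid x_i^k; A^T v + u_k) \right]$$
$\lambda_s \gg 1$: to penalize domain-specific features

Adaptation stage:
$$(\hat{v}, \hat{u}_t, \{\hat{u}_k\}) = \arg\min_{v, u_t, \{u_k\}} \left[ \lambda \|v\|^2 + \lambda_s \sum_{k=1}^{K} \|u_k\|^2 + \lambda_t \|u_t\|^2 - \frac{1}{K+1} \left( \sum_{k=1}^{K} \frac{1}{N_k} \sum_{i=1}^{N_k} \log p(y_i^k \mid x_i^k; A^T v + u_k) + \frac{1}{m} \sum_{i=1}^{m} \log p(y_i^t \mid x_i^t; A^T v + u_t) \right) \right]$$
$\lambda_t = 1 \ll \lambda_s$: to pick up domain-specific features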
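One reading of the generalization-stage objective can be sketched in code. The sketch assumes that the weight vector for source domain k is the sum of shared weights v on the features selected by A and a domain-specific vector u_k, and that u_k is penalized with a large `lam_s`. For brevity it uses a binary classifier rather than the multiclass model, and it encodes A as a list of selected feature indices; both choices, and all names, are assumptions made for illustration.

```python
import math

def domain_weights(selected, v, u_k):
    """Compute w_k = A^T v + u_k, where `selected` lists the feature
    indices that A picks out (an assumed encoding of A)."""
    w = list(u_k)
    for j, feat in enumerate(selected):
        w[feat] += v[j]
    return w

def log_lik(w, x, y):
    """Binary logistic log-likelihood log p(y | x; w), with y in {0, 1}."""
    score = sum(w_j * x_j for w_j, x_j in zip(w, x))
    p1 = 1.0 / (1.0 + math.exp(-score))
    return math.log(p1 if y == 1 else 1.0 - p1)

def generalization_objective(selected, v, us, domains, lam, lam_s):
    """lam*||v||^2 + lam_s*sum_k ||u_k||^2 minus the per-domain average
    log-likelihood, averaged over the K source domains."""
    penalty = lam * sum(a * a for a in v)
    penalty += lam_s * sum(a * a for u in us for a in u)
    fit = 0.0
    for u_k, data in zip(us, domains):
        w_k = domain_weights(selected, v, u_k)
        fit += sum(log_lik(w_k, x, y) for x, y in data) / len(data)
    return penalty - fit / len(domains)

# Hypothetical toy setup: 2 source domains, 2 features, feature 0 shared.
domains = [[([1.0, 0.0], 1)], [([1.0, 1.0], 1)]]
zero_u = [[0.0, 0.0], [0.0, 0.0]]
shared = generalization_objective([0], [1.0], zero_u, domains, 0.1, 10.0)
specific = generalization_objective([0], [0.0], [[1.0, 0.0], [1.0, 0.0]],
                                    domains, 0.1, 10.0)
```

In this toy case both settings give the same predictions, but putting the weight in the domain-specific vectors costs far more under the large `lam_s`, which is the intended pressure toward shared, generalizable weights.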
Joint optimization: alternating optimization of the objective over $A$, $v$ and $\{u_k\}$
Domain cross validation
Idea: train on (K − 1) source domains and test on the held-out source domain
Approximation:
$w_f^k$: weight for feature $f$ learned from domain $k$
$\bar{w}_f^k$: weight for feature $f$ learned from other domains
rank features by $\sum_{k=1}^{K} w_f^k \cdot \bar{w}_f^k$
See paper for details
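The ranking used in the domain cross validation approximation can be sketched as follows. The sketch assumes that the per-domain weights have already been learned elsewhere and are passed in as plain lists; the function name and the weight values are hypothetical. A feature scores high only when it gets a large weight both in a domain and in the remaining domains.

```python
def rank_features(own_weights, other_weights):
    """Rank feature indices by the sum over domains k of
    own_weights[k][f] * other_weights[k][f], highest score first.

    own_weights[k][f]: weight of feature f learned from domain k alone.
    other_weights[k][f]: weight of feature f learned from the other domains.
    """
    num_features = len(own_weights[0])
    scores = [
        sum(own[f] * other[f] for own, other in zip(own_weights, other_weights))
        for f in range(num_features)
    ]
    order = sorted(range(num_features), key=lambda f: scores[f], reverse=True)
    return order, scores

# Hypothetical weights for 2 domains and 3 features.
own_weights = [[1.5, 0.1, 2.0], [1.8, 0.0, 0.05]]
other_weights = [[1.8, 0.0, 0.05], [1.5, 0.1, 2.0]]
order, scores = rank_features(own_weights, other_weights)
```

Here feature 0 ranks first because its weight is large in every domain, while feature 2 is heavily weighted in only one domain and therefore scores low. Taking the top-ranked features as the generalizable ones is an assumption of this sketch; the slide defers the details to the paper.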
[Figure: feature weights w (for example, for the feature "expressed") learned from domains D1, D2, …, Dk-1 and Dk (fly); weight values shown: 1.5 and 0.05, 2.0 and 1.2, 1.8 and 0.1.]
BioCreative Challenge Task 1B
Gene/protein name recognition
3 organisms/domains: fly, mouse and yeast
2 organisms for training, 1 for testing
F1 as performance measure
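The evaluation setup can be sketched as follows: each organism in turn is held out as the test domain while the other two are used for training, and performance is reported as F1. The function names and the counts in the example are hypothetical; only the three organisms and the use of F1 come from the slide.

```python
def leave_one_out_splits(domains):
    """Yield (training domains, test domain) pairs, holding out one domain at a time."""
    for test in domains:
        yield [d for d in domains if d != test], test

def f1_score(true_positives, false_positives, false_negatives):
    """F1 is the harmonic mean of precision and recall; returns 0.0 when undefined."""
    if true_positives == 0:
        return 0.0
    precision = true_positives / (true_positives + false_positives)
    recall = true_positives / (true_positives + false_negatives)
    return 2 * precision * recall / (precision + recall)

splits = list(leave_one_out_splits(["fly", "mouse", "yeast"]))
example_f1 = f1_score(true_positives=8, false_positives=2, false_negatives=2)
```

With three organisms this gives three train/test configurations, one per held-out organism.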
Columns: source domains → target domain
Method              F+M → Y   M+Y → F   Y+F → M
BL                  0.633     0.129     0.416
DA-1 (joint-opt)    0.627     0.153     0.425
DA-2 (domain CV)    0.654     0.195     0.470
F: fly, M: mouse, Y: yeast
Columns: source domains → target domain
Method       F+M → Y   M+Y → F   Y+F → M
BL-SSL       0.633     0.241     0.458
DA-2-SSL     0.759     0.305     0.501
F: fly, M: mouse, Y: yeast
Generalization: outperformed standard supervised learning
Adaptation: outperformed standard bootstrapping
Domain cross validation is more effective than joint optimization
Single source domain?
Setting parameters h and m