Connecting the Dots with Landmarks: Discriminatively Learning Domain-Invariant Features for Unsupervised Domain Adaptation
Boqing Gong, University of Southern California
Joint work with Kristen Grauman and Fei Sha

The perils of mismatched data
The Office dataset: the same object categories, drawn from different sources, follow different underlying distributions, so classifiers overfit to each dataset's idiosyncrasies. Images from [Saenko et al. '10].
Unsupervised domain adaptation: a source domain with labeled data, and a target domain with no labels for training. Goal: learn a classifier that works well on the target despite the different distributions.
Prior work falls into three broad families:
- Reweighting source instances: [Shimodaira '00] [Huang et al. '07] [Bickel et al. '07] [Sugiyama et al. '08] [Sethy et al. '06, '09]
- Adapting classifiers/models: [Evgeniou and Pontil '05] [Duan et al. '09, '10] [Daumé III et al. '10] [Saenko et al. '10] [Kulis et al. '11] [Chen et al. '11]
- Learning new feature representations: [Blitzer et al. '06] [Daumé III '07] [Argyriou et al. '08] [Pan et al. '09] [Gopalan et al. '11] [Chen et al. '12] [Gong et al. '12] [Muandet et al. '13]
[This work] combines instance selection with discriminative feature learning.
Limitations of prior work: it attempts to adapt all source data points, including "hard" ones, and it learns discrimination biased to the source rather than optimized w.r.t. the target. Desiderata: ease the adaptation difficulty, and provide discrimination biased toward the target.
Overview of the approach:
1. Identify landmarks (coarse to fine-grained)
2. Construct auxiliary domain adaptation tasks
3. Obtain domain-invariant features
4. Predict target labels
Step 1: Identify landmarks, i.e. labeled source instances distributed similarly to the target, compared at coarse to fine-grained scales. Distributions are compared with a Gaussian kernel. Plus: it is universal (characteristic). Minus: how should the bandwidth be chosen? Answer: examine the distributions at multiple granularities. Multiple bandwidths give multiple sets of landmarks.
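To make the idea concrete, here is a minimal sketch of bandwidth-dependent landmark selection. The paper formulates selection as a constrained optimization (including a class-balance constraint); the greedy variant below, which picks source points minimizing the maximum mean discrepancy (MMD) to the target under a Gaussian kernel, and all names in it, are simplified stand-ins for illustration only:

```python
import numpy as np

def gaussian_kernel(X, Y, sigma):
    # Pairwise Gaussian (RBF) kernel between the rows of X and Y.
    d2 = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2.0 * sigma ** 2))

def mmd2(X, Y, sigma):
    # Squared maximum mean discrepancy between samples X and Y.
    return (gaussian_kernel(X, X, sigma).mean()
            + gaussian_kernel(Y, Y, sigma).mean()
            - 2.0 * gaussian_kernel(X, Y, sigma).mean())

def greedy_landmarks(source, target, sigma, k):
    # Greedily pick k source points whose empirical distribution
    # best matches the target under the bandwidth-sigma kernel.
    chosen, remaining = [], list(range(len(source)))
    for _ in range(k):
        best = min(remaining,
                   key=lambda i: mmd2(source[chosen + [i]], target, sigma))
        chosen.append(best)
        remaining.remove(best)
    return chosen

rng = np.random.default_rng(0)
source = rng.normal(0.0, 1.0, size=(40, 2))   # toy source sample
target = rng.normal(2.0, 1.0, size=(40, 2))   # toy, shifted target sample
# Multiple bandwidths -> multiple landmark sets (coarse to fine-grained).
landmark_sets = {s: greedy_landmarks(source, target, s, k=10)
                 for s in (0.5, 1.0, 2.0)}
```

The greedy rule stands in for the paper's joint optimization, but it captures the key point: which source points look "target-like" depends on the bandwidth at which the two distributions are compared.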
[Figure: landmark selection for the Headphone and Mug categories at bandwidths such as σ=2; the selected source instances (landmarks) resemble the target distribution, while the rest remain unselected.]
Step 2: Construct auxiliary domain adaptation tasks. Each set of landmarks is moved to the target side: the new source is the original source minus the landmarks, and the new target is the original target plus the landmarks. Each auxiliary task is then solved with the geodesic flow kernel [Gong et al. '12].
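The bookkeeping for the auxiliary tasks is plain index arithmetic. In this sketch the indices and landmark sets are hypothetical, not the paper's actual data; each bandwidth's landmark set spawns one auxiliary task:

```python
import numpy as np

n_source, n_target = 40, 30
source_idx = np.arange(n_source)
target_idx = np.arange(n_source, n_source + n_target)

# Hypothetical landmark sets from step 1, keyed by kernel bandwidth.
landmark_sets = {0.5: np.array([1, 4, 7]), 2.0: np.array([2, 4, 9, 11])}

aux_tasks = {}
for sigma, lm in landmark_sets.items():
    new_source = np.setdiff1d(source_idx, lm)      # original source minus landmarks
    new_target = np.concatenate([target_idx, lm])  # original target plus landmarks
    aux_tasks[sigma] = (new_source, new_target)
```

Because the landmarks already resemble the target, each auxiliary task faces a smaller distribution gap than the original source-to-target problem.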
Step 3: Obtain domain-invariant features. Each auxiliary task yields a kernel; the kernels are combined, with combination weights learned discriminatively using the labeled landmarks as proxies for the target. Step 4: Predict target labels with a classifier trained on the resulting domain-invariant features.
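A minimal sketch of steps 3 and 4, under stated assumptions: in the paper each auxiliary kernel would come from the geodesic flow kernel and the weights would minimize an SVM loss on the landmarks; here the kernels are plain RBF stand-ins, the labels are synthetic, and the discriminative criterion is leave-one-out 1-NN accuracy over a small weight grid:

```python
import numpy as np
from itertools import product

rng = np.random.default_rng(1)
n = 60
X = rng.normal(size=(n, 5))
y = (X[:, 0] > 0).astype(int)            # toy labels
G = [np.exp(-((X[:, None] - X[None]) ** 2).sum(-1) / (2 * s ** 2))
     for s in (0.5, 1.0, 2.0)]           # stand-ins for the auxiliary kernels

landmarks = np.arange(20)  # indices whose (source) labels act as target proxies

def loo_knn_accuracy(F, idx, labels):
    # Leave-one-out 1-NN accuracy under the kernel-induced distance
    # d^2(i, j) = F_ii + F_jj - 2 F_ij, restricted to the given indices.
    diag = np.diag(F)[idx]
    d = diag[:, None] + diag[None, :] - 2 * F[np.ix_(idx, idx)]
    np.fill_diagonal(d, np.inf)
    return (labels[d.argmin(axis=1)] == labels).mean()

# Discriminatively pick convex combination weights on the landmarks.
grid = [w for w in product(np.linspace(0, 1, 5), repeat=len(G))
        if abs(sum(w) - 1) < 1e-9]
best_w = max(grid, key=lambda w: loo_knn_accuracy(
    sum(wq * Gq for wq, Gq in zip(w, G)), landmarks, y[landmarks]))
F = sum(wq * Gq for wq, Gq in zip(best_w, G))  # combined kernel for the final classifier
```

The point of the sketch is the shape of the computation: the only labeled data used to tune the combination are the landmarks, which is how the method gets a discriminative loss despite the unlabeled target.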
Datasets: object recognition — Office [Saenko et al. '10] and Caltech-256 [Griffin et al. '07]; sentiment analysis — books, DVDs, electronics, kitchen appliances [Blitzer et al. '07].
Baselines: [Huang et al. '07] (instance reweighting); [Blitzer et al. '06], [Pan et al. '09], [Gopalan et al. '11], and [Gong et al. '12] (GFK) (feature learning).
[Figure: accuracy (%) on Office/Caltech object recognition tasks (A→C, A→D, C→A, C→W, W→A, W→C), comparing No adaptation, Gopalan et al. '11, Pan et al. '09, GFK, and Landmark (ours).]
[Figure: accuracy (%) on sentiment analysis tasks (K→D, D→B, B→E, E→K), comparing Pan et al. '09, Gopalan et al. '11, GFK, Saenko et al. '10, Blitzer et al. '06, Huang et al. '07, and Landmark (ours).]
[Figure: accuracy (%) on the object recognition tasks (A→C through W→D), ablating the selection strategy: non-landmarks vs. random selection vs. landmarks.]
Summary — what makes landmarks work:
- An intrinsic structure shared between domains
- Labeled source instances distributed similarly to the target
- Auxiliary tasks that are provably easier to solve
- A discriminative loss despite the unlabeled target
[Figure: accuracy (%) on the object recognition tasks (A→C through W→D), with vs. without the class-balance constraint in landmark selection.]