

SLIDE 1

Towards Accurate Model Selection in Deep Unsupervised Domain Adaptation

Kaichao You¹, Ximei Wang¹, Mingsheng Long¹, Michael I. Jordan²

¹School of Software, Tsinghua University · ¹National Engineering Lab for Big Data Software · ²University of California, Berkeley

International Conference on Machine Learning (ICML), 2019

Kaichao You et al Deep Embedded Validation June 12, 2019 1 / 10

SLIDE 2


Outline

1. Validation in UDA: the problem
2. IWCV: the previous solution
3. Deep Embedded Validation
4. Experiments


SLIDE 3

Validation in UDA: the problem

Supervised Learning


SLIDE 4

Validation in UDA: the problem

Supervised Learning vs. Unsupervised Domain Adaptation


SLIDE 5


Outline

1. Validation in UDA: the problem
2. IWCV: the previous solution
3. Deep Embedded Validation
4. Experiments


SLIDE 6

IWCV: the previous solution

Covariate Shift Assumption: p(y|x) = q(y|x)


SLIDE 7

IWCV: the previous solution

Covariate Shift Assumption: p(y|x) = q(y|x)
Model Selection: estimate the Target Risk R(g) = E_{x∼q}[ℓ(g(x), y)]


SLIDE 8

IWCV: the previous solution

Covariate Shift Assumption: p(y|x) = q(y|x)
Model Selection: estimate the Target Risk R(g) = E_{x∼q}[ℓ(g(x), y)]
Importance Weighted Cross Validation¹: E_{x∼p}[w(x)·ℓ(g(x), y)] = E_{x∼p}[(q(x)/p(x))·ℓ(g(x), y)] = E_{x∼q}[ℓ(g(x), y)] = R(g)

¹ Covariate shift adaptation by importance weighted cross validation, JMLR 2007
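The chain of equalities above can be checked numerically. A minimal sketch with 1-D Gaussians (the densities p, q and the loss function below are illustrative choices, not from the paper):

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative source p = N(0, 1) and target q = N(1, 1).
def p_pdf(x): return np.exp(-x**2 / 2) / np.sqrt(2 * np.pi)
def q_pdf(x): return np.exp(-(x - 1)**2 / 2) / np.sqrt(2 * np.pi)

def loss(x):  # stand-in for l(g(x), y): any bounded function of x
    return (np.sin(3 * x) > 0).astype(float)

n = 200_000
xs = rng.normal(0.0, 1.0, n)           # source validation samples
xt = rng.normal(1.0, 1.0, n)           # target samples (reference only)

w = q_pdf(xs) / p_pdf(xs)              # density ratio w(x) = q(x)/p(x)
iwcv_estimate = np.mean(w * loss(xs))  # E_{x~p}[w(x) * loss] = R(g)
target_risk = np.mean(loss(xt))        # direct Monte-Carlo target risk

print(iwcv_estimate, target_risk)      # agree up to sampling noise
```

The weighted source average matches the target average because the exact density ratio is used here; in practice the ratio itself must be estimated.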

SLIDE 9

IWCV: the previous solution

Covariate Shift Assumption: p(y|x) = q(y|x)
Model Selection: estimate the Target Risk R(g) = E_{x∼q}[ℓ(g(x), y)]
Importance Weighted Cross Validation¹: E_{x∼p}[w(x)·ℓ(g(x), y)] = E_{x∼p}[(q(x)/p(x))·ℓ(g(x), y)] = E_{x∼q}[ℓ(g(x), y)] = R(g)

Unbiased, but the variance is unbounded; moreover, the density ratio w(x) = q(x)/p(x) is not readily accessible.

¹ Covariate shift adaptation by importance weighted cross validation, JMLR 2007

SLIDE 10


Outline

1. Validation in UDA: the problem
2. IWCV: the previous solution
3. Deep Embedded Validation
4. Experiments


SLIDE 11

Deep Embedded Validation

IWCV's variance¹: Var_{x∼p}[ℓ·w] ≤ d_{α+1}(q‖p)·R(g)^{1−1/α} − R(g)²

¹ Learning Bounds for Importance Weighting, NeurIPS 2010

SLIDE 12

Deep Embedded Validation

IWCV's variance¹: Var_{x∼p}[ℓ·w] ≤ d_{α+1}(q‖p)·R(g)^{1−1/α} − R(g)²

Feature adaptation reduces distribution discrepancy²

¹ Learning Bounds for Importance Weighting, NeurIPS 2010
² Conditional Adversarial Domain Adaptation, NeurIPS 2018

SLIDE 13

Deep Embedded Validation

IWCV's variance¹: Var_{x∼p}[ℓ·w] ≤ d_{α+1}(q‖p)·R(g)^{1−1/α} − R(g)²

Feature adaptation reduces distribution discrepancy²

Control variate explicitly reduces the variance:
Let E[z] = ζ and E[t] = τ, and define z⋆ = z + η(t − τ).
E[z⋆] = E[z] + η·E[t − τ] = ζ + η·(E[t] − τ) = ζ
Var[z⋆] = Var[z + η(t − τ)] = η²·Var[t] + 2η·Cov(z, t) + Var[z]
min_η Var[z⋆] = (1 − ρ²_{z,t})·Var[z], attained at η̂ = −Cov(z, t)/Var[t]

¹ Learning Bounds for Importance Weighting, NeurIPS 2010
² Conditional Adversarial Domain Adaptation, NeurIPS 2018
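The control-variate derivation can be seen in action with a toy Monte-Carlo estimate; the variables z and t and their correlation below are illustrative, and η̂ is estimated from the samples themselves:

```python
import numpy as np

rng = np.random.default_rng(0)

# Correlated pair: z is the quantity of interest, t has known mean tau.
n, tau = 10_000, 0.0
t = rng.normal(tau, 1.0, n)
z = 2.0 * t + rng.normal(0.0, 0.5, n)  # E[z] = 0, strongly correlated with t

# Optimal coefficient eta_hat = -Cov(z, t) / Var[t], estimated from data.
eta_hat = -np.cov(z, t)[0, 1] / np.var(t)
z_star = z + eta_hat * (t - tau)

# z_star keeps the mean but has variance ~ (1 - rho^2) * Var[z].
rho = np.corrcoef(z, t)[0, 1]
print(np.mean(z), np.mean(z_star))
print(np.var(z), np.var(z_star))
```

The stronger the correlation ρ between z and t, the larger the variance reduction; with ρ² ≈ 0.94 here, z⋆ has roughly 6% of the variance of z.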

SLIDE 14

Deep Embedded Validation

IWCV's variance¹: Var_{x∼p}[ℓ·w] ≤ d_{α+1}(q‖p)·R(g)^{1−1/α} − R(g)²

Feature adaptation reduces distribution discrepancy²

Control variate explicitly reduces the variance:
Let E[z] = ζ and E[t] = τ, and define z⋆ = z + η(t − τ).
E[z⋆] = E[z] + η·E[t − τ] = ζ + η·(E[t] − τ) = ζ
Var[z⋆] = Var[z + η(t − τ)] = η²·Var[t] + 2η·Cov(z, t) + Var[z]
min_η Var[z⋆] = (1 − ρ²_{z,t})·Var[z], attained at η̂ = −Cov(z, t)/Var[t]

Density ratio can be estimated discriminatively.³

¹ Learning Bounds for Importance Weighting, NeurIPS 2010
² Conditional Adversarial Domain Adaptation, NeurIPS 2018
³ Discriminative learning for differing training and test distributions, ICML 2007
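One way to realize the discriminative idea (footnote 3): train a probabilistic source-vs-target classifier d(x) ≈ P(target | x); with equal sample sizes, q(x)/p(x) = d(x)/(1 − d(x)). A minimal numpy sketch with a hand-rolled logistic regression (the 1-D Gaussians and training hyperparameters below are illustrative, not the paper's setup):

```python
import numpy as np

rng = np.random.default_rng(0)

# Source p = N(0, 1), target q = N(1, 1): the true log density ratio is
# log q(x)/p(x) = x - 0.5, linear in x, so logistic regression can recover it.
n = 5000
xs = rng.normal(0.0, 1.0, n)  # source, domain label 0
xt = rng.normal(1.0, 1.0, n)  # target, domain label 1
x = np.concatenate([xs, xt])
d = np.concatenate([np.zeros(n), np.ones(n)])

# Plain gradient-descent logistic regression: d_hat(x) = sigmoid(a*x + b).
a, b, lr = 0.0, 0.0, 0.5
for _ in range(3000):
    p_hat = 1.0 / (1.0 + np.exp(-(a * x + b)))
    grad = p_hat - d
    a -= lr * np.mean(grad * x)
    b -= lr * np.mean(grad)

def w(x):  # estimated density ratio q(x)/p(x), equal sample sizes assumed
    d_hat = 1.0 / (1.0 + np.exp(-(a * x + b)))
    return d_hat / (1.0 - d_hat)

print(a, b)    # approach the true coefficients 1.0 and -0.5 (up to noise)
print(w(0.5))  # true ratio at x = 0.5 is exp(0) = 1
```

The same recipe applies to high-dimensional inputs; with unequal sample sizes the ratio picks up a factor n_source/n_target.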

SLIDE 15


Outline

1. Validation in UDA: the problem
2. IWCV: the previous solution
3. Deep Embedded Validation
4. Experiments


SLIDE 16

Experiments

Experiments on a toy problem under covariate shift

[Figure: train/test input densities; y = f(x) with fitted models at λ = 0, 0.5, 1; error rate and standard deviation vs. λ for Source Risk, IWCV, Target Risk, and DEV]


SLIDE 17

Experiments

Experiments on a toy problem under covariate shift

[Figure: train/test input densities; y = f(x) with fitted models at λ = 0, 0.5, 1; error rate and standard deviation vs. λ for Source Risk, IWCV, Target Risk, and DEV]

Experiments on real-world problems

Various datasets: VisDA / Office / Digits
Various models: CDAN, MCD, GTA
Deep Embedded Validation is empirically validated
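As a rough illustration of how the ingredients combine, the sketch below estimates the target risk from labeled source samples, taking z = w·ℓ with the weights themselves as the control variate t (τ = E_p[w] = 1). This is a simplified sketch of the recipe, using analytic densities rather than the learned, feature-based ratios of the actual method:

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative source p = N(0, 1), target q = N(1, 1), loss as a function of x.
def p_pdf(x): return np.exp(-x**2 / 2) / np.sqrt(2 * np.pi)
def q_pdf(x): return np.exp(-(x - 1)**2 / 2) / np.sqrt(2 * np.pi)
def loss(x): return (x > 0.5).astype(float)

n = 50_000
xs = rng.normal(0.0, 1.0, n)  # labeled source validation set
w = q_pdf(xs) / p_pdf(xs)     # density ratio, with E_p[w] = 1 known
z = w * loss(xs)              # importance-weighted losses

# Control variate t = w with known mean tau = 1; eta from the samples.
eta = -np.cov(z, w)[0, 1] / np.var(w)
dev_estimate = np.mean(z) + eta * (np.mean(w) - 1.0)

target_risk = np.mean(loss(rng.normal(1.0, 1.0, n)))  # reference value
print(dev_estimate, target_risk)
```

The control variate costs nothing extra (the weights are already computed) and shrinks the variance of the plain importance-weighted estimate whenever w and w·ℓ are correlated.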


SLIDE 18


Thanks!

Code available at github.com/thuml/Deep-Embedded-Validation
Poster: tonight at Pacific Ballroom #259
