Towards Assumption-free Unsupervised Domain Adaptation for Visual Recognition - PowerPoint PPT Presentation



SLIDE 1

Towards Assumption-free Unsupervised Domain Adaptation for Visual Recognition

Jinhua Ma, School of Data and Computer Science, Sun Yat-sen University

SLIDE 2

Outline

  • Background and Motivations
  • Unsupervised Domain Adaptation
  • Estimating label information in target domain
  • Domain-shared Group-sparse Representation
  • Conclusion and Future Work
SLIDE 3

Background and Motivations

  • Traditional machine learning problem

Diagram: Training Data → Machine Learning Model → apply the learned model to Test Data → Result

References: [1] T. Sim, et al. The CMU Pose, Illumination, and Expression (PIE) Database. (AFGR 2002)

Figure: training and test images from the PIE dataset [1]

SLIDE 4

Background and Motivations

  • Dataset bias (generalization) problem:
  • Performance drops across datasets due to distribution mismatch

  • Source: license-plate images downloaded from Wikipedia

Figure: example training and test license-plate datasets

SLIDE 5

Background and Motivations

  • Basic Idea of Domain Adaptation
  • Good performance if many labeled target data are available

Diagram: labeled training data (source domain) + test data with or without labels (target domain) → domain adaptation → model* for the target domain (*model: classifier/detector/estimator/…)

Face recognition accuracy (%) of 1-NN on the test dataset:

Training Dataset        Test Dataset         2000 target data   1000 target data   100 target data   No target data
Frontal faces (PIE27)   Side faces (PIE05)   67.65              59.52              29.49             29.10

Reference: T. Sim, et al. The CMU Pose, Illumination, and Expression (PIE) Database. (AFGR 2002)

SLIDE 6

Background and Motivations

  • Labels are unavailable in the test dataset in many applications
  • Large number of datasets (e.g., large-scale camera networks)
  • Difficulty of labeling (e.g., medical images)
  • Unsupervised Domain Adaptation

Diagram: labeled training data (source domain) + test data without labels (target domain) → unsupervised domain adaptation → model for the target domain

SLIDE 7

Background and Motivations

  • Applications

References:
[1] K. Saenko, et al. Adversarial Discriminative Domain Adaptation. (CVPR 2017)
[2] Baoyao Yang, Andy J. Ma, P. C. Yuen. Domain-shared Group-sparse Dictionary Learning for Unsupervised Domain Adaptation. (AAAI 2018)
[3] Baoyao Yang, Andy J. Ma, P. C. Yuen. Body Parts Synthesis for Cross-Quality Pose Estimation. (TCSVT 2018)
[4] Andy J. Ma, Pong C. Yuen, Jiawei Li and Ping Li. Cross-Domain Person Re-identification Using Domain Adaptation Ranking SVMs. (TIP 2015)
[5] Yang Zhang, et al. Curriculum Domain Adaptation for Semantic Segmentation of Urban Scenes. (ICCV 2017)

Figure: Cross-dataset Object Recognition [1,2], Cross-dataset Re-identification [4], Cross-domain Segmentation [5], Cross-domain Pose Estimation [3] (source vs. target images / image pairs)

SLIDE 8

Background and Motivations

  • Challenges in Unsupervised Domain Adaptation
  • Joint distribution alignment WITHOUT labeled data in the target domain
  • Joint distribution mismatch: P(X_s, Y_s) ≠ P(X_t, Y_t)
  • Equal conditional distribution assumption: P(Y_s | X_s) = P(Y_t | X_t)
  • Marginal distribution alignment: P(X_s) ~ P(X_t)

(X_s: source data, Y_s: source labels, X_t: target data, Y_t: target labels)
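Marginal alignment is typically enforced by minimizing a distribution discrepancy such as MMD between P(X_s) and P(X_t). A minimal numerical sketch (the RBF kernel, bandwidth, and all names are my own illustration, not the slides' formulation):

```python
import numpy as np

def rbf_mmd2(X_s, X_t, gamma=1.0):
    """Squared MMD between source and target samples under an RBF kernel.

    Marginal-alignment methods drive this quantity down without using
    any target labels."""
    def k(A, B):
        # pairwise squared Euclidean distances, then the RBF kernel
        d2 = np.sum(A**2, 1)[:, None] + np.sum(B**2, 1)[None, :] - 2 * A @ B.T
        return np.exp(-gamma * d2)
    return k(X_s, X_s).mean() + k(X_t, X_t).mean() - 2 * k(X_s, X_t).mean()

rng = np.random.default_rng(0)
same = rbf_mmd2(rng.normal(0, 1, (200, 3)), rng.normal(0, 1, (200, 3)))
shifted = rbf_mmd2(rng.normal(0, 1, (200, 3)), rng.normal(2, 1, (200, 3)))
assert same < shifted  # identical marginals yield a much smaller MMD
```

Note that this measures only the marginals; aligning the conditionals P(Y|X) additionally requires label information in the target domain, which is exactly the difficulty the next slides address.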

SLIDE 9

Background and Motivations

  • Challenges in Unsupervised Domain Adaptation
  • The equal conditional distribution assumption is NOT valid in general: P(Y_s | X_s) ≠ P(Y_t | X_t)
  • Estimating labels in the target domain enables conditional distribution alignment

Figure: training dataset vs. test dataset example

SLIDE 10

Background and Motivations

  • Challenges in Unsupervised Domain Adaptation
  • Imbalanced unlabeled data in the target domain
  • Example applications: person re-identification, object detection, pose estimation, etc.
  • In re-identification: # of negatives >> # of positives
  • Difficulty in estimating positive labels (the minority class) accurately

Figure: (a) balanced data vs. (b) imbalanced data

SLIDE 11

Estimating Label Information

  • Estimating the positive (minority-class) mean [1]
  • Estimating the positive (minority-class) region [2]
  • Dynamic Graph Matching [3]

References:
  • 1. A. J. Ma, J. W. Li, P. C. Yuen and P. Li, "Cross-domain person re-identification using domain adaptation ranking SVMs", IEEE Transactions on Image Processing (TIP), Vol. 24, No. 5, pp. 1599-1613, 2015.
  • 2. J. W. Li, A. J. Ma and P. C. Yuen, "Semi-Supervised Region Metric Learning for Person Re-identification", International Journal of Computer Vision (IJCV), in press, 2018.
  • 3. M. Ye, A. J. Ma, L. Zheng, J. W. Li, and P. C. Yuen, "Dynamic Label Graph Matching for Unsupervised Video Re-identification", International Conference on Computer Vision (ICCV), 2017.

SLIDE 12

Positive Mean Strategy

  • For linear models, the confidence score can be calculated by a weighted summation, i.e.
      f(x) = w^T x    (*)
  • If positive data were available in the target domain, taking the summation over positive samples in (*) gives
      (1/|P|) Σ_{i∈P} f(x_i) = w^T m_p    (**)
  • (**) is a necessary condition of (*) and depends only on the positive mean m_p
SLIDE 13

Positive Mean Strategy

  • Estimator 1 using the distribution relationship
  • If the target prior probability p is known,
      m̂_p = (m_u − (1 − p) m̂_n) / p
    where m_u is the mean of the unlabeled image pairs and m̂_n is the mean of the available negatives (an estimate of the negative mean)
  • Unreliable in practice: the estimation error is large, since p is small under heavy imbalance
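The decomposition behind this estimator can be sketched numerically (function name, prior value, and synthetic data are my own illustrations): since the unlabeled mean satisfies m_u = p·m_pos + (1 − p)·m_neg, the positive mean follows by rearrangement.

```python
import numpy as np

def positive_mean_from_prior(m_u, m_neg, p):
    # m_u = p * m_pos + (1 - p) * m_neg  =>  solve for the positive mean
    return (m_u - (1.0 - p) * m_neg) / p

rng = np.random.default_rng(1)
pos = rng.normal(3.0, 1.0, (500, 4))       # minority class
neg = rng.normal(0.0, 1.0, (9500, 4))      # majority class
unlabeled = np.vstack([pos, neg])
p = len(pos) / len(unlabeled)              # known prior: 0.05
est = positive_mean_from_prior(unlabeled.mean(0), neg.mean(0), p)
# With exact inputs the recovery is exact; in practice the 1/p factor
# amplifies any error in m_u or m_neg, which is why this estimator is
# unreliable under heavy imbalance.
assert np.allclose(est, pos.mean(0))
```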
SLIDE 14

Positive Mean Strategy

  • Estimator 2 using domain similarity
  • Assume that the target positive mean is close to the source positive mean
  • The upper bound of the estimation error is then governed by the source–target domain discrepancy

SLIDE 15

Positive Mean Strategy

  • Estimator 3 using potential positive samples
  • Find potential positive samples by maximizing a similarity-based score, where the probability of being positive is modeled as ∝ exp(·)
  • The positive mean is then estimated from the selected potential positives
SLIDE 16

Positive Mean Strategy

  • Combining the Estimated Means
  • Only Estimators 2 and 3 are employed, for their reliability on imbalanced data
  • Treat the estimate as a random vector, with the two estimators assumed independent
  • Solve the resulting maximum-likelihood problem under a Gaussian distribution assumption, yielding a weighted combination of the two estimates
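Under the stated Gaussian and independence assumptions, the maximum-likelihood combination of two mean estimates reduces to inverse-variance weighting; a sketch with placeholder variances (not values from the paper):

```python
import numpy as np

def combine_estimates(m2, var2, m3, var3):
    # ML fusion of two independent Gaussian estimates of the same mean:
    # weight each estimate by its inverse variance
    w2, w3 = 1.0 / var2, 1.0 / var3
    return (w2 * m2 + w3 * m3) / (w2 + w3)

m = combine_estimates(np.array([1.0, 2.0]), 0.5, np.array([3.0, 4.0]), 1.5)
assert np.allclose(m, [1.5, 2.5])  # the lower-variance estimate dominates
```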
SLIDE 17

Positive Mean Strategy

  • Adaptive Ranking SVMs
  • Asymmetric domain adaptation model on the target domain
  • The Adaptive Ranking SVM objective takes the standard ranking-SVM form: minimize a quadratic regularizer plus weighted hinge losses (weight μ), subject to margin constraints on ranking pairs with non-negative slack variables
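To make the objective concrete, here is a toy ranking-SVM trained by hinge-loss subgradient descent on synthetic difference vectors. This is my own simplification, not the Adaptive Ranking SVM itself (which additionally incorporates the estimated positive mean and domain-adaptive weighting):

```python
import numpy as np

def rank_svm(X_pos, X_neg, C=1.0, lr=0.01, epochs=200):
    # Learn w so that positive difference vectors score above the margin
    # and negative ones below it (hinge loss + quadratic regularizer).
    X = np.vstack([X_pos, X_neg])
    y = np.concatenate([np.ones(len(X_pos)), -np.ones(len(X_neg))])
    w = np.zeros(X.shape[1])
    for _ in range(epochs):
        viol = y * (X @ w) < 1                      # hinge-loss violators
        grad = w - C * (y[viol][:, None] * X[viol]).sum(axis=0)
        w -= lr * grad
    return w

rng = np.random.default_rng(2)
X_pos = rng.normal(1.0, 1.0, (50, 5))    # synthetic "same person" pairs
X_neg = rng.normal(-1.0, 1.0, (50, 5))   # synthetic "different person" pairs
w = rank_svm(X_pos, X_neg)
```

After training, positive difference vectors should score higher on average than negative ones.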
SLIDE 18

Experimental Results on Re-ID

  • CMC curves comparing RankSVM [BMVC'10] trained on the target or source domain, and a non-learning metric

Figure panels: i-LIDS → CUHK, i-LIDS → PRID, i-LIDS → VIPeR

SLIDE 19

Experimental Results on Re-ID

  • CMC curves comparing existing Domain Adaptation methods, i.e. TCA [TNN'11] & GFS [ICCV'11]

Figure panels: i-LIDS → CUHK, i-LIDS → PRID, i-LIDS → VIPeR

SLIDE 20

Experimental Results

  • Accuracy (%) comparing existing Re-ID algorithms on VIPeR

Method             Source   Rank 1   Rank 5   Rank 10   Rank 20
Ours               PRID     44.94    64.15    71.33     77.48
Ours               CUHK     47.47    66.84    72.67     78.94
Ours               i-LIDS   45.35    66.16    73.43     78.77
SDALF [CVPR'10]    ----     19.87    38.89    49.37     65.73
CPS [BMVC'11]      ----     21.84    44.00    57.21     71.00
eLDFV [ECCVW'12]   ----     22.34    47.00    60.04     71.00
CIS [PAMI'13]      ----     24.24    44.91    56.55     69.40
eSDC [CVPR'13]     ----     26.74    50.70    62.37     76.36

SLIDE 21

Positive Region Strategy

  • Region representation
  • Regularized affine hull based positive region representation:
      R⁺ = { Σ_i a_i x_i⁺ : Σ_i a_i = 1 }, with a norm constraint on a (regularization)
  • Region-to-point distance metric learning
  • Maximize the distance between the positive region and the negatives

Figure: positives, positive region, negatives, and the region-to-point distance
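The region-to-point distance has a closed form via the KKT conditions of the affine-constrained least-squares problem. A sketch, where the ℓ2 regularizer `lam` and all names are my own choices standing in for the paper's regularized affine hull:

```python
import numpy as np

def region_to_point(X, z, lam=0.1):
    # Distance from point z to the regularized affine hull of the rows of X:
    #   min_a ||X^T a - z||^2 + lam ||a||^2   s.t.   sum(a) = 1
    n = len(X)
    G = X @ X.T + lam * np.eye(n)              # regularized Gram matrix
    # KKT system: [G  1; 1^T  0] [a; nu] = [X z; 1]
    A = np.block([[G, np.ones((n, 1))],
                  [np.ones((1, n)), np.zeros((1, 1))]])
    b = np.r_[X @ z, 1.0]
    a = np.linalg.solve(A, b)[:n]
    return np.linalg.norm(X.T @ a - z)

X = np.array([[0.0, 0.0], [2.0, 0.0], [0.0, 2.0]])  # region samples
assert region_to_point(X, np.array([0.5, 0.5])) < region_to_point(X, np.array([5.0, 5.0]))
```

A nearby point attains a much smaller distance than a far one, since reaching far points requires large, heavily penalized coefficients.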

SLIDE 22

Positive Region Strategy

  • Positive region estimation without using positives
  • Positive neighbors: unlabeled samples close to the positives
  • The positive region is estimated as close to the region (regularized affine hull) of the positive neighbors

Figure: positives, positive neighbors, estimated vs. ground-truth positive region, negatives, and the region-to-point distance

SLIDE 23

Positive Region Strategy

  • Positive region estimation without using positives
  • Positive neighbors: samples close to the positives in a projective space φ(·)
  • Select positive neighbors by maximizing the probability that at least one positive locates in the ε-neighbourhood:
      p̂ = 1 − Π_j (1 − p_j), with the per-neighbour probabilities p_j computed from the distances ‖φ(x) − φ(x_j)‖ in the projective space
  • Solve the optimization problem by label propagation with cross-person score distribution alignment
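The "at least one positive in the ε-neighbourhood" probability has the form p̂ = 1 − Π_j (1 − p_j); a sketch with an assumed exponential model for the per-neighbour probabilities (the decay rate `beta` is my own choice):

```python
import numpy as np

def at_least_one_positive(dists, beta=1.0):
    # per-neighbour positive probability, decaying with distance
    p = np.exp(-beta * np.asarray(dists) ** 2)
    # complement of the event "every neighbour is negative"
    return 1.0 - np.prod(1.0 - p)

near_far = at_least_one_positive([0.1, 2.0])
far_far = at_least_one_positive([2.0, 2.0])
assert far_far < near_far <= 1.0  # one close neighbour dominates the bound
```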

SLIDE 24

Positive Region Strategy

  • Positive Region Metric Learning
  • Learn the point-to-point distance function by minimizing a loss that pushes the estimated positive region away from the negatives, subject to region-to-point distance constraints
  • Solved by iteratively computing the region coefficients and updating the metric function
SLIDE 25

Experimental Results On Mars

  • Accuracy (%) of improved classifiers trained on the source domain & a non-learning metric

                          CUHK01 (source)           VIPeR (source)
Method                    R1      R5      R10       R1      R5      R10
Ours + AdaRSVM            32.53   53.21   62.43     31.60   51.29   60.62
Source AdaRSVM [TIP'15]   21.27   35.83   43.61     16.60   30.10   37.23
Ours + XQDA               37.08   55.30   64.41     40.56   55.52   64.40
Source XQDA [CVPR'15]     22.85   39.88   48.23     24.55   41.49   50.07
Ours + L1                 33.67   54.99   63.54     33.67   54.99   63.54
L1                        14.60   26.58   33.98     14.60   26.58   33.98

SLIDE 26

Experimental Results On Mars

  • Accuracy (%) comparing existing classification methods dealing with mislabeled data

                       CUHK01 (source)           VIPeR (source)
Method                 R1      R5      R10       R1      R5      R10
Ours + XQDA            37.08   55.30   64.41     40.56   55.52   64.40
RLR [ECML PKDD'10]     4.13    7.69    10.27     9.98    19.03   24.55
IW [PAMI'16]           26.54   43.14   51.69     26.99   43.17   50.89

SLIDE 27

Experimental Results On Mars

  • Accuracy (%) comparing existing semi-supervised learning methods

                      CUHK01 (source)           VIPeR (source)
Method                R1      R5      R10       R1      R5      R10
Ours + XQDA           37.08   55.30   64.41     40.56   55.52   64.40
SSR-IPP [ACCV'14]     35.10   52.08   59.68     35.01   51.60   59.62
SLISC [IJCAI'11]      29.28   44.83   52.82     28.89   44.21   51.66
SSC [CVPR'14]         34.26   50.36   57.51     31.38   46.35   54.07

SLIDE 28

Experimental Results On Mars

  • Accuracy (%) comparing existing unsupervised Re-ID methods
  • GRDL is an unsupervised learning method; AdaRSVM & UMDL are domain adaptation methods

                     CUHK01 (source)           VIPeR (source)
Method               R1      R5      R10       R1      R5      R10
Ours + XQDA          37.08   55.30   64.41     40.56   55.52   64.40
AdaRSVM [TIP'15]     21.27   35.83   43.61     16.60   30.10   37.23
UMDL [CVPR'16]       36.82   54.19   62.74     36.43   54.28   62.41
GRDL [ECCV'16]       34.03   51.07   59.26     34.03   51.07   59.26

SLIDE 29

Graph Matching Strategy

  • Why Graph Matching for label estimation (e.g. in Re-ID)?
  • Minimizing the global matching cost

Diagram: Cam A vs. Cam B; node = image sequence, edge = intra-camera sequence similarity
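Minimizing the global matching cost is an assignment problem between the two cameras' sequences; a brute-force sketch on toy features (real systems solve this with the Hungarian algorithm, e.g. scipy's `linear_sum_assignment`, rather than enumerating permutations):

```python
import numpy as np
from itertools import permutations

def match_cameras(feats_a, feats_b):
    # pairwise matching costs between camera-A and camera-B sequences
    cost = np.linalg.norm(feats_a[:, None] - feats_b[None, :], axis=2)
    n = len(feats_a)
    # pick the one-to-one assignment with the minimum global cost
    best = min(permutations(range(n)),
               key=lambda p: sum(cost[i, p[i]] for i in range(n)))
    return list(best)

a = np.array([[0.0, 0.0], [5.0, 5.0], [9.0, 0.0]])
b = np.array([[5.1, 5.0], [0.2, 0.1], [8.9, 0.2]])  # same people, shuffled
assert match_cameras(a, b) == [1, 0, 2]   # recovered correspondence
```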

SLIDE 30

Graph Matching Strategy

  • Dynamic Graph Matching Framework

Diagram: Input (unlabeled data) → graph construction → graph matching → intermediate labels → label re-weighting → metric learning → learnt metric; iterate until the dynamic graph matching iteration ends, then output the estimated labels to a machine learning model

Graph matching is dynamically updated by learning a better distance metric

SLIDE 31

Graph Matching Strategy

  • Dynamic Graph Matching
  • In iteration t:
  • Graph matching is conducted to obtain intermediate labels
  • Label re-weighting is introduced to handle the noisy labels
  • Metric learning is adopted to learn a better feature space
  • Update graph matching

Diagram: graph matching → label re-weighting (true vs. false matchings, e.g. weight 0.8) → metric learning → updated matching (iteration t → t+1)

SLIDE 32

Experiments on Re-ID

  • Evaluation of Dynamic Iterative Updating
  • Rank-1 accuracy at each iteration
  • For training: the label estimation accuracy is improved significantly with our dynamic updating procedure
  • For testing: the rank-1 matching accuracy is also improved over the iterations

SLIDE 33

Experiments on Re-ID

  • Comparison with state-of-the-art methods

(Red indicates the best performance; Blue the second best)

  • The proposed method outperforms existing unsupervised Re-ID methods in most cases
  • The estimated labels are suitable for other supervised learning methods, and thus yield quite promising results on the large-scale MARS dataset with deep learning methods

SLIDE 34

Domain-shared Group-sparse Representation

  • Joint distribution alignment without estimating target labels

References:
  • 1. B. Y. Yang, A. J. Ma, and P. C. Yuen, "Domain-Shared Group-Sparse Dictionary Learning for Unsupervised Domain Adaptation", AAAI Conference on Artificial Intelligence, 2018.
  • 2. B. Y. Yang, A. J. Ma, and P. C. Yuen, "Learning domain-shared group-sparse representation for unsupervised domain adaptation", Pattern Recognition 81: 615-632, 2018.

Joint alignment P(X_s, Y_s) ~ P(X_t, Y_t) decomposes into marginal distribution alignment P(X_s) ~ P(X_t) and conditional distribution alignment P(Y_s | X_s) ~ P(Y_t | X_t). Challenge: how to align the conditionals without target labels?

SLIDE 35

Domain-shared Group-sparse Representation

  • Idea: derive learning criteria for conditional distribution alignment

Diagram: target data without labels → target sparse representation; source data with labels → source group-sparse representation; dictionary atoms are grouped as specific to class 1, specific to class 2, and a remainder; the goal is to align the conditional distributions between the two representations

Criteria (domain-shared group sparsity):
  • 1. the target representation is sparse in groups
  • 2. the source and target representations share the same class order

Domain-shared group sparsity is shown to be equivalent to equal conditional distributions

SLIDE 36

Domain-shared Group-sparse Representation

  • Domain-shared group-sparse dictionary learning
  • Objective: minimize (i) the reconstruction errors in the source and target domains, (ii) a conditional distribution alignment term via domain-shared group sparsity, and (iii) a marginal distribution alignment term via MMD [Gretton et al., JMLR'12]

SLIDE 37

Domain-shared Group-sparse Representation

  • Domain-shared group-sparse dictionary learning
  • Domain-shared property: the dictionary is shared across the source and target domains
SLIDE 38

Domain-shared Group-sparse Representation

  • Domain-shared group-sparse dictionary learning

38

Reconstruction errors in source and target domains Conditional distribution alignment with domain‐shared group sparsity Marginal distribution alignment with MMD [Gretton et al, JMLR12’]

min

,,, , , , , , , , , ,

Source‐domain group sparsity:

|,

  • ° ,
  • |
  • Achieve based on labels

Samples from different classes respond to different sub dictionaries

SLIDE 39

Domain-shared Group-sparse Representation

  • Domain-shared group-sparse dictionary learning

39

Reconstruction errors in source and target domains Conditional distribution alignment with domain‐shared group sparsity Marginal distribution alignment with MMD [Gretton et al, JMLR12’]

min

,,, , , , , , , , , ,

Target‐domain group sparsity:

| ,:

  • ° ,:
  • |
  • Responding to only one group

Each sample does not respond to different sub dictionaries simultaneously
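Why a group penalty drives each sample toward a single sub-dictionary can be seen from an ℓ2,1-style sum of group norms; the group layout and matrices below are my own illustration, not the paper's:

```python
import numpy as np

def group_sparsity(A, groups):
    # sum of group-wise Frobenius norms over rows (dictionary atoms)
    return sum(np.linalg.norm(A[g]) for g in groups)

groups = [np.arange(0, 3), np.arange(3, 6)]     # two class-specific groups
A_one = np.zeros((6, 4)); A_one[:3] = 1.0       # responds to one group only
A_both = np.full((6, 4), np.sqrt(0.5))          # same total energy, spread
# Equal Frobenius energy, but concentrating on a single group costs less,
# so minimization pushes each sample onto one sub-dictionary.
assert group_sparsity(A_one, groups) < group_sparsity(A_both, groups)
```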

SLIDE 40

Domain-shared Group-sparse Representation

  • Domain-shared group-sparse dictionary learning

40

Reconstruction errors in source and target domains Conditional distribution alignment with domain‐shared group sparsity Marginal distribution alignment with MMD [Gretton et al, JMLR12’]

min

,,, , , , , , , , , , | ,:

  • ° ,:
  • |
  • |,
  • ° ,
  • |
SLIDE 41

Domain-shared Group-sparse Representation

  • Learning the target classifier
  • The domain-shared coefficients carry only domain-shared information, which is not discriminative enough without target-specific information → incorporate target-specific information from the target data
  • Propagate label information from source to target via the shared coefficients, and within the target data via pairwise similarities
  • For simplicity, employ a linear classifier, trained by minimizing a similarity-weighted classification loss plus a norm regularizer on the classifier
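A one-step label-propagation sketch (my own minimal version, not the paper's solver): source labels spread to target samples through row-normalized similarities, which here stand in for the similarities computed from the shared coefficients.

```python
import numpy as np

def propagate_labels(S, y_src):
    # S: (n_target x n_source) similarities; y_src: one-hot source labels
    W = S / S.sum(axis=1, keepdims=True)   # row-normalized graph weights
    return (W @ y_src).argmax(axis=1)      # similarity-weighted class vote

S = np.array([[0.9, 0.1, 0.0],
              [0.1, 0.8, 0.1]])            # 2 target vs. 3 source samples
y_src = np.eye(3)[[0, 1, 1]]               # source labels: classes 0, 1, 1
assert propagate_labels(S, y_src).tolist() == [0, 1]
```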

SLIDE 42

Experiments

  • Face Recognition across Blur and Illumination
  • Dataset: CMU-PIE dataset [Sim et al., AFGR'02]
  • Setting 1:
  • Source: images under 11 illumination conditions
  • Target: images under the other 10 illumination conditions with different blur kernels
  • Setting 2: reverse the source and target domains of Setting 1
  • Cross-dataset Object Recognition
  • Dataset: Office+Caltech dataset [Saenko et al., ECCV'10]
  • Feature: DeCAF6 [Donahue et al., ICML'14]

SLIDE 43

Experimental Results on Face Recognition

Face recognition accuracy (%) across blur and illumination variations:

                    Setting 1                      Setting 2
Method              σ=3    σ=4    L=9    L=11      σ=3    σ=4    L=9    L=11
SVM_s               73.8   70.0   72.9   67.4      70.9   66.5   74.1   67.1
GFK                 77.7   74.7   80.9   73.5      85.3   83.2   85.3   81.5
SA                  76.5   75.3   78.2   75.3      75.0   71.8   76.5   77.4
SIDL                71.5   68.5   73.5   64.4      83.5   80.6   87.1   82.1
TKL                 77.9   77.4   78.2   76.8      80.9   80.6   81.8   77.4
CORAL               76.2   72.9   78.5   67.1      83.2   81.2   84.1   81.8
JDA                 74.1   74.7   78.5   62.7      87.4   83.5   86.2   76.8
TJM                 59.7   56.8   60.3   42.1      50.6   51.5   56.5   40.9
CTC                 68.8   65.3   70.0   62.7      74.7   73.8   77.1   72.1
DsGsDL_NT (Ours)    86.5   86.8   89.1   77.9      98.5   98.5   97.9   97.4
DsGsDL (Ours)       88.8   87.9   80.2   80.2      98.8   99.4   98.2   98.8

  • Achieves good performance even without target-specific information
  • Performance is further improved with target-specific information
SLIDE 44

Experimental Results on Object Recognition

Object recognition accuracy (%) across different datasets
(Red, blue and green represent the best, second best and third best results for each dataset pair)

Pair      SVM_s   GFK    SA     SIDL   TKL    CORAL   JDA    TJM    CTC    DsGsDL_NT (Ours)   DsGsDL (Ours)
D → W     99.3    97.3   98.2   99.7   98.3   99.0    99.3   98.0   97.3   99.3               100
D → A     83.2    81.6   81.1   85.2   77.0   86.0    90.5   87.0   86.1   90.1               91.7
D → C     76.7    75.4   76.1   80.8   74.1   77.5    84.1   76.8   74.1   84.3               85.4
W → D     99.4    96.8   98.7   100    100    99.4    99.4   100    99.4   100                100
W → A     81.2    87.9   77.2   82.3   76.6   82.4    90.5   85.9   78.4   88.1               91.5
W → C     74.8    77.3   72.6   75.3   71.2   75.8    83.8   75.4   74.1   82.5               85.3
A → D     82.8    84.1   81.3   72.0   79.0   80.9    77.7   86.0   82.8   89.8               93.6
A → W     78.0    82.0   74.6   65.8   76.6   74.6    80.3   80.0   73.2   83.7               88.5
A → C     83.0    83.9   81.9   81.2   80.1   83.4    80.9   78.0   79.0   85.6               87.4
C → D     87.3    84.7   83.5   76.4   87.3   87.9    80.9   93.6   79.6   95.5               97.5
C → W     80.3    86.1   76.2   71.5   82.4   80.0    82.4   81.7   72.2   94.9               96.3
C → A     90.8    91.7   89.8   85.7   91.0   91.7    86.7   92.4   87.5   93.0               93.2
Average   84.7    85.7   82.6   81.3   82.8   84.9    85.5   86.2   82.0   90.6               92.5

SLIDE 45

Conclusion

  • Elimination of the equal conditional distribution assumption in Unsupervised Domain Adaptation
  • Estimate label information in the target domain under the imbalanced unlabeled data setting
  • Estimate weak label information via the positive (minority-class) mean or the positive (minority-class) region in the target domain
  • Estimate labels more accurately by dynamic graph matching
  • Derive learning criteria for equal conditional distributions
  • Domain-shared Group-sparse Representation

SLIDE 46

Future Work

  • Cross-modality Domain Adaptation
  • Various representations; features with different dimensions
  • Learning with only a source-domain classifier
  • Source-domain data is not available
  • Multiple Source Domain Adaptation
  • Domain Similarity Evaluation
  • More Applications
  • Cross-dataset detection, cross-dataset segmentation

SLIDE 47

Acknowledgement

  • SDCS, SYSU
  • COMP, HKBU
  • Prof. Pong C Yuen, HKBU
  • Dr. Jiawei Li, HKBU
  • Dr. Baoyao Yang, HKBU
  • Mang Ye, HKBU
