Towards Assumption-free Unsupervised Domain Adaptation for Visual Recognition
Outline
- Background and Motivations
- Unsupervised Domain Adaptation
- Estimating label information in target domain
- Domain-shared Group-sparse Representation
- Conclusion and Future Works
Background and Motivations
- Traditional machine learning problem
Training Data → Machine Learning Model
Test Data → Apply the learned Model → Result
References: [1] T. Sim, et al. The CMU Pose, Illumination, and Expression (PIE) Database. (AFGR 2002)
Test and training images from the PIE dataset [1]
Background and Motivations
- Dataset bias (generalization) problem:
- Performance drops across datasets due to distribution mismatch
- Source: images of license plates are downloaded from Wikipedia
(Figure: training dataset vs. test dataset)
Background and Motivations
- Basic Idea of Domain Adaptation
- Good performance if many labeled target data are available
Domain Adaptation
Labeled training data (Source Domain) + Test data w or w/o labels (Target Domain) → Model* for the target domain
(Model: classifier/detector/estimator/…)
Accuracy (%) of 1-NN on the test dataset:

Training Dataset        Test Dataset         2000 target data   1000 target data   100 target data   No target data
Frontal faces (PIE27)   Side faces (PIE05)   67.65              59.52              29.49             29.10

Face recognition accuracy (%) on the test dataset
Reference: T. Sim, et al. The CMU Pose, Illumination, and Expression (PIE) Database. (AFGR 2002)
Background and Motivations
- Labels are unavailable in the test dataset in many applications
- Large number of datasets (e.g., large-scale camera networks)
- Difficulty of labeling (e.g., medical images)
- Unsupervised Domain Adaptation
Unsupervised Domain Adaptation
Labeled training data (Source Domain) + Test data w/o labels (Target Domain) → Model for the target domain
Background and Motivations
- Applications
References:
[1] E. Tzeng, et al. Adversarial Discriminative Domain Adaptation. (CVPR 2017)
[2] B. Yang, A. J. Ma, P. C. Yuen. Domain-shared Group-sparse Dictionary Learning for Unsupervised Domain Adaptation. (AAAI 2018)
[3] B. Yang, A. J. Ma, P. C. Yuen. Body Parts Synthesis for Cross-Quality Pose Estimation. (TCSVT 2018)
[4] A. J. Ma, P. C. Yuen, J. Li, P. Li. Cross-Domain Person Re-identification Using Domain Adaptation Ranking SVMs. (TIP 2015)
[5] Y. Zhang, et al. Curriculum Domain Adaptation for Semantic Segmentation of Urban Scenes. (ICCV 2017)
Cross-dataset Object Recognition [1,2] · Cross-dataset Re-identification [4] · Cross-domain Segmentation [5] · Cross-domain Pose Estimation [3]
Background and Motivations
- Challenges in Unsupervised Domain Adaptation
- Joint distribution alignment WITHOUT labeled data in the target domain
- Joint distribution mismatch: P_s(X_s, Y_s) ≠ P_t(X_t, Y_t)
- Equal conditional distribution assumption: P_s(Y_s | X_s) = P_t(Y_t | X_t)
- Marginal distribution alignment: P_s(X_s) ≈ P_t(X_t)
X_s: source data; Y_s: source labels; X_t: target data; Y_t: target labels
Background and Motivations
- Challenges in Unsupervised Domain Adaptation
- Equal conditional distribution assumption NOT valid: P_s(Y_s | X_s) ≠ P_t(Y_t | X_t)
- Estimating labels in the target domain for conditional distribution alignment: P_s(Y_s | X_s) ≈ P_t(Ŷ_t | X_t)
(Figure: training dataset vs. test dataset)
Background and Motivations
- Challenges in Unsupervised Domain Adaptation
- Imbalanced unlabeled data in the target domain
- Example applications: person re-identification, object detection, pose estimation, etc.
- In re-identification: # of negatives >> # of positives
- Difficulty in estimating positive labels (minority class) accurately
(a) Balanced data (b) Imbalanced data
Estimating Label Information
- Estimating positive (minority-class) mean [1]
- Estimating positive (minority-class) region [2]
- Dynamic Graph matching [3]
Reference
- 1. A. J. Ma, J. Li, P. C. Yuen, and P. Li, "Cross-domain person re-identification using domain adaptation ranking SVMs", IEEE Transactions on Image Processing (TIP), Vol. 24, No. 5, pp. 1599-1613, 2015.
- 2. J. Li, A. J. Ma, and P. C. Yuen, "Semi-Supervised Region Metric Learning for Person Re-identification", International Journal of Computer Vision (IJCV), in press, 2018.
- 3. M. Ye, A. J. Ma, L. Zheng, J. Li, and P. C. Yuen, "Dynamic Label Graph Matching for Unsupervised Video Re-identification", International Conference on Computer Vision (ICCV), 2017.
Positive Mean Strategy
- With a linear model, the confidence score is computed as a weighted sum, i.e.
  f(x) = w^T x    (*)
- If positive data were available in the target domain, taking the sum of (*) over the positive samples gives
  (1/n_+) Σ_{i: y_i = +1} f(x_i) = w^T μ_+    (**)
- (**) is a necessary condition of (*) and depends only on the positive mean μ_+
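The property that (**) depends only on the positive mean follows directly from linearity; a minimal numeric check with synthetic data (w and X_pos are illustrative placeholders, not the paper's model):

```python
import numpy as np

# Sketch: for a linear scoring model f(x) = w.T @ x, the average confidence
# score over the positive samples equals the score of the positive mean --
# the property (**) derived from (*). All data here is synthetic.
rng = np.random.default_rng(0)
w = rng.normal(size=5)                  # linear model weights (assumed given)
X_pos = rng.normal(size=(20, 5))        # positive samples in the target domain

mean_score = (X_pos @ w).mean()         # average of f(x) over positives
score_of_mean = w @ X_pos.mean(axis=0)  # f(mu_plus), needs only the positive mean

print(np.isclose(mean_score, score_of_mean))  # -> True
```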
Positive Mean Strategy
- Estimator 1 using the distribution relationship
- If the target prior probability p_+ is known,
  μ̂_+ = ( μ_u − (1 − p_+) μ_− ) / p_+
  μ_u: mean of the unlabeled image pairs
  μ_−: mean of the available negatives, used as an estimate of the negative mean
- Unreliable in practice
- The estimation error is large, since # of negatives ≫ # of positives
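Estimator 1 can be sketched as follows, assuming a known prior p_+ and fully synthetic data (the means, prior, and sample sizes are illustrative, not from the paper); note how the division by the small prior p_+ amplifies noise in μ_u, which is why the estimator is unreliable in practice:

```python
import numpy as np

# Sketch of Estimator 1: with known positive prior p_plus,
# mu_u = p_plus * mu_plus + (1 - p_plus) * mu_minus, so the positive mean
# can be recovered by inverting this relation. Synthetic data only.
rng = np.random.default_rng(1)
p_plus = 0.05                                  # minority-class prior (assumed known)
mu_plus_true = np.array([2.0, -1.0, 0.5])
mu_minus = np.zeros(3)                         # negative mean (from available negatives)

n = 10000
n_pos = int(n * p_plus)
X_pos = rng.normal(loc=mu_plus_true, scale=1.0, size=(n_pos, 3))
X_neg = rng.normal(loc=mu_minus, scale=1.0, size=(n - n_pos, 3))
mu_u = np.vstack([X_pos, X_neg]).mean(axis=0)  # mean of the unlabeled pool

mu_plus_hat = (mu_u - (1 - p_plus) * mu_minus) / p_plus
print(mu_plus_hat)  # noisy estimate of mu_plus_true; error grows as p_plus shrinks
```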
Positive Mean Strategy
- Estimator 2 using domain similarity
- Assume that the target positive mean is close to the source positive mean, i.e. μ_+^t ≈ μ_+^s
- The estimation error is then upper-bounded by the cross-domain difference of the positive means
Positive Mean Strategy
- Estimator 3 using potential positive samples
- Find potential positive samples by taking, for each query, the candidate with the maximum matching score
- The probability of a candidate being positive is modeled as ∝ exp(·) of its matching score
- The positive mean is estimated from the potential positive samples
Positive Mean Strategy
- Combining the Estimated Means
- Only μ̂_2 and μ̂_3 are employed, for their reliability under imbalanced data
- Let the positive mean be a random vector; μ̂_2 and μ̂_3 are independent estimates
- Solving the maximum-likelihood problem under a Gaussian distribution assumption yields a weighted combination of μ̂_2 and μ̂_3
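For two independent Gaussian estimates, the maximum-likelihood combination is the inverse-variance weighted average; a small sketch (the variances and mean values are illustrative placeholders, not values from the paper):

```python
import numpy as np

# Sketch: combining two independent estimates of the positive mean under a
# Gaussian assumption. The ML solution weights each estimate by the inverse
# of its variance, so the more reliable estimate dominates.
mu2 = np.array([1.0, 2.0])    # estimator 2 (domain similarity)
mu3 = np.array([1.4, 1.8])    # estimator 3 (potential positives)
sigma2_2, sigma2_3 = 0.5, 0.25  # illustrative variances

w2 = 1.0 / sigma2_2
w3 = 1.0 / sigma2_3
mu_combined = (w2 * mu2 + w3 * mu3) / (w2 + w3)
print(mu_combined)  # lies between mu2 and mu3, closer to the lower-variance mu3
```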
Positive Mean Strategy
- Adaptive Ranking SVMs
- Asymmetric domain adaptation model on the target domain
- Adaptive Ranking SVM with the estimated positive mean:
  min_{w, ξ} (1/2) ‖w‖² + μ Σ_i ξ_i
  s.t. w^T ( μ̂_+ − x_i^− ) ≥ 1 − ξ_i,  ξ_i ≥ 0,  ∀ i
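The ranking idea can be sketched with a plain hinge loss on difference vectors, optimized by subgradient descent; this is an illustrative stand-in for the adaptive ranking SVM (the data, learning rate, and constant C are assumptions), not the paper's exact formulation:

```python
import numpy as np

# Minimal sketch of a ranking-SVM-style objective using an estimated positive
# mean: encourage w.T @ (mu_plus_hat - x_neg) >= 1 for every negative via a
# hinge loss, optimized by plain subgradient descent. Synthetic data only.
rng = np.random.default_rng(2)
mu_plus_hat = np.array([2.0, 2.0])
X_neg = rng.normal(size=(50, 2))            # synthetic negatives around the origin
diffs = mu_plus_hat - X_neg                 # ranking difference vectors

w = np.zeros(2)
C, lr = 1.0, 0.05
for _ in range(200):
    margins = diffs @ w
    active = margins < 1.0                  # constraints violating the margin
    grad = w - C * diffs[active].mean(axis=0) if active.any() else w
    w -= lr * grad

violations = int(np.sum(diffs @ w < 0))
print(violations)  # most negatives now rank below the estimated positive mean
```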
Experimental Results on Re-ID
- CMC curves comparing with RankSVM [BMVC'10] trained on the target or source domain, and with a non-learning metric
i-LIDS to CUHK i-LIDS to PRID i-LIDS to VIPeR
Experimental Results on Re-ID
- CMC curves comparing with existing domain adaptation methods, i.e. TCA [TNN'11] & GFS [ICCV'11]
i-LIDS to CUHK i-LIDS to PRID i-LIDS to VIPeR
Experimental Results
- Accuracy (%) comparing with existing Re-ID algorithms on VIPeR

Method             Source   Rank 1  Rank 5  Rank 10  Rank 20
Ours               PRID     44.94   64.15   71.33    77.48
Ours               CUHK     47.47   66.84   72.67    78.94
Ours               i-LIDS   45.35   66.16   73.43    78.77
SDALF [CVPR'10]    --       19.87   38.89   49.37    65.73
CPS [BMVC'11]      --       21.84   44.00   57.21    71.00
eLDFV [ECCVW'12]   --       22.34   47.00   60.04    71.00
CIS [PAMI'13]      --       24.24   44.91   56.55    69.40
eSDC [CVPR'13]     --       26.74   50.70   62.37    76.36
Positive Region Strategy
- Region representation
- Regularized affine hull based positive region representation:
  H_+ = { Σ_i a_i x_i^+ | Σ_i a_i = 1, ‖a‖ ≤ σ }
- Region-to-point distance metric learning
- Maximize the distance between the positive region and the negatives

(Figure: positives, positive region, negatives, region-to-point distance)
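The region-to-point distance for an affine-hull region can be sketched as a small equality-constrained least-squares problem; this sketch drops the norm regularizer ‖a‖ ≤ σ for simplicity and is an illustration, not the paper's exact solver:

```python
import numpy as np

# Sketch of a region-to-point distance for an affine-hull region
# H = { X.T @ a : sum(a) = 1 }. Dropping the norm regularizer,
# min_a ||X.T @ a - z||^2 s.t. 1.T a = 1 is solved via its KKT linear system.
def region_to_point_dist(X, z):
    # X: (n, d) points spanning the region; z: (d,) query point
    n = X.shape[0]
    G = X @ X.T                                   # Gram matrix, (n, n)
    kkt = np.block([[2 * G, np.ones((n, 1))],
                    [np.ones((1, n)), np.zeros((1, 1))]])
    rhs = np.concatenate([2 * X @ z, [1.0]])
    sol = np.linalg.lstsq(kkt, rhs, rcond=None)[0]
    a = sol[:n]                                   # optimal affine coefficients
    return np.linalg.norm(X.T @ a - z)

X = np.array([[0.0, 0.0], [1.0, 0.0]])            # affine hull = the line y = 0
print(region_to_point_dist(X, np.array([0.5, 2.0])))  # -> 2.0 (distance to the line)
```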
Positive Region Strategy
- Positive region estimation without using positives
- Positive neighbors are close to the positives:
  N_+ = { x | ∃ x^+, d(x, x^+) ≤ ε }
- The positive region is close to the region of the positive neighbors:
  Ĥ_+ = { Σ_i a_i x̃_i | Σ_i a_i = 1, ‖a‖ ≤ σ, x̃_i ∈ N_+ }

(Figure: positives, positive neighbors, estimated vs. ground-truth positive region, negatives, region-to-point distance)
Positive Region Strategy
- Positive region estimation without using positives
- Positive neighbors are close to the positives in a projective space:
  N_+ = { x | ∃ x^+, ‖φ(x) − φ(x^+)‖ ≤ ε }
- Select positive neighbors by maximizing the probability that at least one positive locates in the ε-neighbourhood:
  max_{N ⊂ U} p̂(N),  where p̂(N) = 1 − Π_{x ∈ N} (1 − p(x)) and p(x) ∝ exp(−‖φ(x) − φ(x^+)‖)
- Solve the optimization problem by label propagation with cross-person score distribution alignment
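The at-least-one-positive probability used for neighbor selection is a simple complement of a product; a minimal sketch (the per-neighbor probabilities are illustrative placeholders):

```python
# Sketch of the neighbor-selection criterion: the probability that at least
# one true positive lies in the epsilon-neighbourhood is 1 - prod(1 - p_j),
# where p_j is the (modeled) probability that neighbor j is a positive.
def prob_at_least_one_positive(probs):
    out = 1.0
    for p in probs:
        out *= (1.0 - p)          # probability that no neighbor is positive
    return 1.0 - out

print(prob_at_least_one_positive([0.5, 0.5]))   # -> 0.75
print(prob_at_least_one_positive([0.9, 0.1]))   # -> 0.91
```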
Positive Region Strategy
- Positive Region Metric Learning
- Learn the point-to-point distance function d_M by
  min_M Σ_i Σ_j ℓ( d_M(Ĥ_+^i, x_j^−) ) + Ω(M)
  s.t. d_M(Ĥ_+, x^−) = min_{z ∈ Ĥ_+} d_M(z, x^−)
- Solved by iteratively computing the closest point in the region and updating the metric function
Experimental Results on MARS
- Accuracy (%) on improving classifiers trained on the source domain & a non-learning metric

                          Source: CUHK01          Source: VIPeR
Method                    1      5      10        1      5      10
Ours + AdaRSVM            32.53  53.21  62.43     31.60  51.29  60.62
Source AdaRSVM [TIP'15]   21.27  35.83  43.61     16.60  30.10  37.23
Ours + XQDA               37.08  55.30  64.41     40.56  55.52  64.40
Source XQDA [CVPR'15]     22.85  39.88  48.23     24.55  41.49  50.07
Ours + L1                 33.67  54.99  63.54     33.67  54.99  63.54
L1                        14.60  26.58  33.98     14.60  26.58  33.98
Experimental Results on MARS
- Accuracy (%) comparing with existing classification methods dealing with mislabeled data

                      Source: CUHK01          Source: VIPeR
Method                1      5      10        1      5      10
Ours + XQDA           37.08  55.30  64.41     40.56  55.52  64.40
RLR [ECML PKDD'10]    4.13   7.69   10.27     9.98   19.03  24.55
IW [PAMI'16]          26.54  43.14  51.69     26.99  43.17  50.89
Experimental Results on MARS
- Accuracy (%) comparing with existing semi-supervised learning methods

                    Source: CUHK01          Source: VIPeR
Method              1      5      10        1      5      10
Ours + XQDA         37.08  55.30  64.41     40.56  55.52  64.40
SSR-IPP [ACCV'14]   35.10  52.08  59.68     35.01  51.60  59.62
SLISC [IJCAI'11]    29.28  44.83  52.82     28.89  44.21  51.66
SSC [CVPR'14]       34.26  50.36  57.51     31.38  46.35  54.07
Experimental Results on MARS
- Accuracy (%) comparing with existing unsupervised Re-ID methods
- GRDL is an unsupervised learning method; AdaRSVM & UMDL are domain adaptation methods

                    Source: CUHK01          Source: VIPeR
Method              1      5      10        1      5      10
Ours + XQDA         37.08  55.30  64.41     40.56  55.52  64.40
AdaRSVM [TIP'15]    21.27  35.83  43.61     16.60  30.10  37.23
UMDL [CVPR'16]      36.82  54.19  62.74     36.43  54.28  62.41
GRDL [ECCV'16]      34.03  51.07  59.26     34.03  51.07  59.26
Graph Matching Strategy
- Why graph matching for label estimation (e.g., in Re-ID)?
- Minimizing the global matching cost
Cam A / Cam B (Node: image sequence; Edge: intra-camera sequence similarity)
Graph Matching Strategy
- Dynamic Graph Matching Framework
Input: unlabeled data → graph construction → graph matching → intermediate labels → label re-weighting → metric learning → learnt metric; iterate until the procedure ends, then output the estimated labels and the machine learning model
Graph matching is dynamically updated by learning a better distance metric
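The cross-camera matching step can be sketched as minimum-cost bipartite assignment; with a handful of nodes an exhaustive search over permutations suffices (the Hungarian algorithm does the same at scale). Features and the correspondence here are synthetic placeholders:

```python
import numpy as np
from itertools import permutations

# Sketch: build a cost matrix between camera-A and camera-B sequences and pick
# the one-to-one assignment minimizing the global matching cost.
rng = np.random.default_rng(3)
feats_a = rng.normal(size=(4, 8))                 # camera-A sequence features
true_perm = np.array([2, 0, 3, 1])                # ground-truth identity order
feats_b = feats_a[true_perm] + 0.05 * rng.normal(size=(4, 8))  # same people, camera B

cost = np.linalg.norm(feats_a[:, None, :] - feats_b[None, :, :], axis=2)
best_cost, best = min(
    (cost[np.arange(4), list(p)].sum(), p) for p in permutations(range(4))
)
print(best)  # camera-A node i is matched to camera-B node best[i]
```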
Graph Matching Strategy
- Dynamic Graph Matching
- In iteration t:
- Graph matching is conducted to obtain intermediate labels
- Label re-weighting is introduced to handle the noisy labels
- Metric learning is adopted to learn a better feature space
- Update graph matching

(Figure: graph matching → label re-weighting → updated matching → metric learning, from iteration t to t+1, with true and false matchings re-weighted)
Experiments on Re-ID
- Evaluation of Dynamic Iterative Updating
- Rank-1 accuracy at each iteration
- For training: the label estimation accuracy is improved significantly with our dynamic updating procedure
- For testing: the rank-1 matching accuracy is also improved over the iterations
Experiments on Re-ID
- Comparison with the state of the art

Red indicates the best performance; blue the second best.
- The proposed method outperforms existing unsupervised Re-ID methods in most cases.
- The estimated labels are also suitable for other supervised learning methods, and thus yield quite promising results on the large-scale MARS dataset with deep learning methods.
Domain-shared Group-sparse Representation
- Joint distribution alignment without estimating target labels
Reference
- 1. B. Y. Yang, A. J. Ma, and P. C. Yuen, "Domain-Shared Group-Sparse Dictionary Learning for Unsupervised Domain Adaptation", AAAI Conference on Artificial Intelligence, 2018.
- 2. B. Y. Yang, A. J. Ma, and P. C. Yuen, "Learning domain-shared group-sparse representation for unsupervised domain adaptation", Pattern Recognition 81: 615-632, 2018.
Marginal distribution alignment: P_s(X_s) ≈ P_t(X_t)
Conditional distribution alignment: P_s(Y_s | X_s) ≈ P_t(Y_t | X_t)
Challenge: how to align the conditional distributions without target labels?
Domain-shared Group-sparse Representation
- Idea: derive learning criteria for conditional distribution alignment
- Target data w/o labels → target sparse representation A_t
- Source data w. labels → source group-sparse representation A_s, with coefficient groups specific to class 1, class 2, …, plus a remainder
- Align the conditional distributions between A_s and A_t
Criteria:
- 1. A_t is sparse in groups
- 2. A_s and A_t share the same class order
Domain-shared group sparsity is equivalent to equal conditional distributions
Domain-shared Group-sparse Representation
- Domain-shared group-sparse dictionary learning
- Objective: reconstruction errors in the source and target domains + conditional distribution alignment with domain-shared group sparsity + marginal distribution alignment with MMD [Gretton et al., JMLR'12]
  min_{D, A_s, A_t} ‖X_s − D A_s‖_F² + ‖X_t − D A_t‖_F² + λ1 Ω_s(A_s) + λ2 Ω_t(A_t) + λ3 MMD(A_s, A_t)
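The marginal-alignment term uses the maximum mean discrepancy of Gretton et al.; a minimal sketch of the (biased) squared MMD with an RBF kernel, where the bandwidth and sample data are illustrative assumptions:

```python
import numpy as np

# Sketch of the MMD term used for marginal alignment: the squared maximum mean
# discrepancy between source and target coefficients with an RBF kernel
# [Gretton et al., JMLR'12]. Biased estimator for brevity.
def mmd2_rbf(X, Y, gamma=1.0):
    def k(A, B):
        d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)  # pairwise sq. dists
        return np.exp(-gamma * d2)
    return k(X, X).mean() + k(Y, Y).mean() - 2 * k(X, Y).mean()

rng = np.random.default_rng(4)
A_s = rng.normal(0.0, 1.0, size=(100, 3))       # "source" coefficients
A_t_far = rng.normal(3.0, 1.0, size=(100, 3))   # shifted "target" coefficients
A_t_near = rng.normal(0.0, 1.0, size=(100, 3))  # well-aligned "target" coefficients

print(mmd2_rbf(A_s, A_t_far) > mmd2_rbf(A_s, A_t_near))  # -> True
```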
Domain-shared Group-sparse Representation
- Domain-shared group-sparse dictionary learning (cont.)
- Domain-shared property: the dictionary D is shared across the source and target domains
Domain-shared Group-sparse Representation
- Domain-shared group-sparse dictionary learning (cont.)
- Source-domain group sparsity Ω_s(A_s): achieved based on the source labels, so that samples from different classes respond to different sub-dictionaries
Domain-shared Group-sparse Representation
- Domain-shared group-sparse dictionary learning (cont.)
- Target-domain group sparsity Ω_t(A_t): each target sample responds to only one group, i.e. it does not respond to different sub-dictionaries simultaneously
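Group sparsity of this kind is typically enforced with a group-lasso-style penalty, whose proximal operator shrinks each group's coefficient block as a whole; a sketch with an illustrative group layout (not necessarily the paper's exact regularizer):

```python
import numpy as np

# Sketch: proximal operator of the group-lasso penalty. Each group's block is
# shrunk by its norm, so weak groups are zeroed out entirely and a sample ends
# up responding to few sub-dictionaries.
def group_soft_threshold(a, groups, lam):
    out = np.zeros_like(a)
    for g in groups:
        norm = np.linalg.norm(a[g])
        if norm > lam:
            out[g] = (1.0 - lam / norm) * a[g]  # shrink the whole block
    return out                                  # blocks with norm <= lam stay zero

a = np.array([0.9, 1.2, 0.05, -0.1, 0.02])
groups = [np.array([0, 1]), np.array([2, 3, 4])]   # two sub-dictionaries
print(group_soft_threshold(a, groups, 0.5))  # second group is zeroed out entirely
```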
Domain-shared Group-sparse Representation
- Learning the target classifier
- The domain-shared coefficients A_t carry domain-shared information, but are not discriminative enough without target-specific information; incorporate target-specific information from the target data
- Propagate label information
- From source to target, via the similarity between source and target coefficients
- Within the target data, via the similarity between target samples
- For simplicity, employ a linear classifier
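Propagating labels from source to target over a similarity graph can be sketched with the standard iteration F ← α S F + (1 − α) F0; the data, similarity kernel, and α are illustrative assumptions, not the paper's exact formulation:

```python
import numpy as np

# Sketch of label propagation: S is a row-normalized similarity matrix over all
# samples, F0 holds the known source labels (one-hot) with zero rows for the
# unlabeled target samples, and the iteration spreads labels along the graph.
rng = np.random.default_rng(5)
src = rng.normal(size=(6, 2)) + np.repeat([[0, 0], [4, 4]], 3, axis=0)  # 2 classes
tgt = np.array([[0.2, -0.1], [4.1, 3.8]])          # unlabeled target samples
X = np.vstack([src, tgt])

d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
W = np.exp(-d2)                                     # similarity graph
S = W / W.sum(axis=1, keepdims=True)                # row-normalized

F0 = np.zeros((8, 2))
F0[:3, 0] = 1.0                                     # source labels, class 0
F0[3:6, 1] = 1.0                                    # source labels, class 1
F, alpha = F0.copy(), 0.8
for _ in range(50):
    F = alpha * S @ F + (1 - alpha) * F0

print(F[6:].argmax(axis=1))  # target samples inherit the labels of nearby clusters
```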
Experiments
- Face Recognition across Blur and Illumination
- Dataset: CMU-PIE dataset [Sim et al., AFGR'02]
- Setting 1:
- Source: images under 11 illumination conditions
- Target: images under the other 10 illumination conditions with different blur kernels
- Setting 2:
- Reverse the source and target domains of Setting 1
- Cross-dataset Object Recognition
- Dataset: Office+Caltech dataset [Saenko et al., ECCV'10]
- Feature: DeCAF6 [Donahue et al., ICML'14]
Experimental Results on Face Recognition
Face recognition accuracy (%) across blur and illumination variations

                   Setting 1                  Setting 2
Method             σ=3   σ=4   L=9   L=11     σ=3   σ=4   L=9   L=11
SVM_s              73.8  70.0  72.9  67.4     70.9  66.5  74.1  67.1
GFK                77.7  74.7  80.9  73.5     85.3  83.2  85.3  81.5
SA                 76.5  75.3  78.2  75.3     75.0  71.8  76.5  77.4
SIDL               71.5  68.5  73.5  64.4     83.5  80.6  87.1  82.1
TKL                77.9  77.4  78.2  76.8     80.9  80.6  81.8  77.4
CORAL              76.2  72.9  78.5  67.1     83.2  81.2  84.1  81.8
JDA                74.1  74.7  78.5  62.7     87.4  83.5  86.2  76.8
TJM                59.7  56.8  60.3  42.1     50.6  51.5  56.5  40.9
CTC                68.8  65.3  70.0  62.7     74.7  73.8  77.1  72.1
DsGsDL_NT (Ours)   86.5  86.8  89.1  77.9     98.5  98.5  97.9  97.4
DsGsDL (Ours)      88.8  87.9  80.2  80.2     98.8  99.4  98.2  98.8

- Achieves good performance even without target-specific information
- Performance is further improved with target-specific information
Experimental Results on Object Recognition
Object recognition accuracy (%) across different datasets
(Colors red, blue, and green represent the best, second best, and third best results for each pair of datasets, respectively)

Pair   SVM_s  GFK   SA    SIDL  TKL   CORAL  JDA   TJM   CTC   DsGsDL_NT (Ours)  DsGsDL (Ours)
D→W    99.3   97.3  98.2  99.7  98.3  99.0   99.3  98.0  97.3  99.3              100
D→A    83.2   81.6  81.1  85.2  77.0  86.0   90.5  87.0  86.1  90.1              91.7
D→C    76.7   75.4  76.1  80.8  74.1  77.5   84.1  76.8  74.1  84.3              85.4
W→D    99.4   96.8  98.7  100   100   99.4   99.4  100   99.4  100               100
W→A    81.2   87.9  77.2  82.3  76.6  82.4   90.5  85.9  78.4  88.1              91.5
W→C    74.8   77.3  72.6  75.3  71.2  75.8   83.8  75.4  74.1  82.5              85.3
A→D    82.8   84.1  81.3  72.0  79.0  80.9   77.7  86.0  82.8  89.8              93.6
A→W    78.0   82.0  74.6  65.8  76.6  74.6   80.3  80.0  73.2  83.7              88.5
A→C    83.0   83.9  81.9  81.2  80.1  83.4   80.9  78.0  79.0  85.6              87.4
C→D    87.3   84.7  83.5  76.4  87.3  87.9   80.9  93.6  79.6  95.5              97.5
C→W    80.3   86.1  76.2  71.5  82.4  80.0   82.4  81.7  72.2  94.9              96.3
C→A    90.8   91.7  89.8  85.7  91.0  91.7   86.7  92.4  87.5  93.0              93.2
Avg    84.7   85.7  82.6  81.3  82.8  84.9   85.5  86.2  82.0  90.6              92.5
Conclusion
- Elimination of the equal conditional distribution assumption in Unsupervised Domain Adaptation
- Estimate label information in the target domain under the imbalanced unlabeled data setting
- Estimate weak label information by the positive (minority-class) mean or the positive (minority-class) region in the target domain
- Estimate labels more accurately by dynamic graph matching
- Derive learning criteria for equal conditional distributions
- Domain-shared Group-sparse Representation
Future Works
- Cross-modality Domain Adaptation
- Various representations; features with different dimensions
- Learning with only the source domain classifier
- Source domain data is not available
- Multiple Source Domain Adaptation
- Domain Similarity Evaluation
- More Applications
- Cross-dataset detection, cross-dataset segmentation
Acknowledgement
- SDCS, SYSU
- COMP, HKBU
- Prof. Pong C Yuen, HKBU
- Dr. Jiawei Li, HKBU
- Dr. Baoyao Yang, HKBU
- Mang Ye, HKBU