Breaking Inter-Layer Co-Adaptation by Classifier Anonymization

ICML 2019. Ikuro Sato (1), Kohta Ishikawa (1), Guoqing Liu (1), Masayuki Tanaka (2). (1) Denso IT Laboratory, Inc., Japan; (2) National Institute of Advanced Industrial Science and Technology, Japan.


  1. ICML 2019. Breaking Inter-Layer Co-Adaptation by Classifier Anonymization. Ikuro Sato (1), Kohta Ishikawa (1), Guoqing Liu (1), Masayuki Tanaka (2). (1) Denso IT Laboratory, Inc., Japan; (2) National Institute of Advanced Industrial Science and Technology, Japan. I. Sato et al., Breaking Inter-Layer Co-Adaptation by Classifier Anonymization, ICML 2019.

  2. Summary first. About what? Breaking co-adaptation between feature extractor and classifier. How? By the classifier anonymization technique. Theory? Proved: features form a simple point-like distribution. In reality? The point-like property is largely confirmed on real datasets.

  3. E2E optimization scheme flourishes. Is it always good? E2E optimization: $(\phi^\star, \theta^\star) = \arg\min_{\phi,\theta} \frac{1}{|\mathcal{D}|} \sum_{(x,t)\in\mathcal{D}} L(C_\theta(F_\phi(x)), t)$. DNN pipeline: input $x$, feature extractor output $F_\phi(x)$, classifier output $C_\theta(F_\phi(x))$, loss with target $t$: $L(C_\theta(F_\phi(x)), t)$. The feature extractor $F_{\phi^\star}$ adapts to a particular classifier $C_\theta$. Toy example (2-class regression with targets '+1' and '-1'; axes: feature dim-1 and feature dim-2; color: $C_\theta$ value): features may form an excessively complex distribution (disjointed, split).

  4. FOCA: Feature-extractor Optimization through Classifier Anonymization. FOCA: $\phi^\star = \arg\min_{\phi} \frac{1}{|\mathcal{D}|} \sum_{(x,t)\in\mathcal{D}} \mathbb{E}_{\theta\sim\Theta_\phi} L(C_\theta(F_\phi(x)), t)$. Random weak classifier: $\theta \sim \Theta_\phi$. Want to know more about $\Theta_\phi$? Please come to the poster! The feature extractor $F_{\phi^\star}$ adapts to a set of weak classifiers $C_\theta$. Features form a simple point-like distribution per class under some conditions (figure axes: feature dim-1 and feature dim-2).

  5. Proposition about the point-like property. In words: if the feature extractor has sufficient representation ability, all input data of the same class are projected to a single point in the feature space, in a class-separable way, under certain conditions. Please see the paper for the proof.

  6. Toy problem demonstration. (Figure: x-axis, feature dim. #1; y-axis, feature dim. #2; legend: data used to generate classifier, decision boundary; frames run from start to end.) A small-batch classifier works as a weak classifier for the entire dataset. Small perturbations lead to a point-like distribution.

  7. Experiment #1: partial-dataset training. What we wish to confirm: do the full-dataset classifier and the partial-dataset classifier perform similarly for a given $F_{\phi^\star}$?

  8. Experiment #1: partial-dataset training. CIFAR-10 test error rates (figure: classifier trained with large dataset vs. classifier trained with small dataset). The performance gap is large for other methods and much smaller for FOCA, which is one indication of the point-like property. (The same, fixed feature extractor is used within each method.)

  9. More experiments … including: • Approximate geodesic distance measurements between large- and small-dataset solutions. • Low-dimensional analyses to further study the point-like property.

  10. Poster #28 tonight. What? Breaking co-adaptation between feature extractor and classifier. How? By classifier anonymization. Theory? Proved: features form a simple point-like distribution. Reality? The point-like property is largely confirmed on real datasets.
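
The contrast between slides 3 and 4 can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' implementation: features are given directly as 2-D points (standing in for a feature extractor with sufficient representation ability), the loss is squared error on targets +1 and -1, and the weak classifier is a hypothetical linear map fitted to the class means of a small random batch. The slides do not define the classifier distribution (they defer to the poster), so `fit_weak_classifier` and `foca_loss` are assumed stand-ins.

```python
import random


def class_mean(points):
    """Component-wise mean of a list of 2-D points."""
    n = len(points)
    return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)


def fit_weak_classifier(pos_batch, neg_batch):
    """Hypothetical weak classifier (an assumption, not from the slides):
    the linear map that sends the batch mean of class +1 to +1 and the
    batch mean of class -1 to -1."""
    mp, mn = class_mean(pos_batch), class_mean(neg_batch)
    d = (mp[0] - mn[0], mp[1] - mn[1])
    norm2 = d[0] ** 2 + d[1] ** 2
    w = (2 * d[0] / norm2, 2 * d[1] / norm2)
    b = -(w[0] * (mp[0] + mn[0]) + w[1] * (mp[1] + mn[1])) / 2
    return w, b


def mean_loss(w, b, features, targets):
    """Average squared loss of one linear classifier over all features."""
    total = 0.0
    for z, t in zip(features, targets):
        total += (w[0] * z[0] + w[1] * z[1] + b - t) ** 2
    return total / len(features)


def foca_loss(features, targets, n_classifiers=200, batch_per_class=2, seed=0):
    """Monte Carlo estimate of the loss averaged over weak classifiers,
    each fitted on a small random batch of the current features."""
    rng = random.Random(seed)
    pos = [z for z, t in zip(features, targets) if t == 1]
    neg = [z for z, t in zip(features, targets) if t == -1]
    total = 0.0
    for _ in range(n_classifiers):
        w, b = fit_weak_classifier(rng.sample(pos, batch_per_class),
                                   rng.sample(neg, batch_per_class))
        total += mean_loss(w, b, features, targets)
    return total / n_classifiers


targets = [1] * 4 + [-1] * 4
# Features co-adapted to one fixed classifier: correct along dim-1 only.
line_like = [(1.0, y) for y in (-1.5, -0.5, 0.5, 1.5)] + \
            [(-1.0, y) for y in (-1.5, -0.5, 0.5, 1.5)]
# Point-like features: one point per class.
point_like = [(1.0, 0.0)] * 4 + [(-1.0, 0.0)] * 4
fixed_classifier = ((1.0, 0.0), 0.0)  # reads feature dim-1 only

print("single-classifier loss, line-like :", mean_loss(*fixed_classifier, line_like, targets))
print("weak-classifier loss,   line-like :", foca_loss(line_like, targets))
print("weak-classifier loss,   point-like:", foca_loss(point_like, targets))
```

In this toy setting the single fixed classifier is already satisfied by features spread along feature dim-2, whereas the loss averaged over batch-fitted classifiers is driven to zero only when each class collapses to a point, which is the intuition behind the point-like property.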

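The logic of Experiment #1 (slides 7 and 8) can also be sketched on toy data. The sketch below assumes 1-D features, a hypothetical nearest-class-mean classifier, and error measured on the same toy features rather than on a held-out test set; none of these choices come from the slides, and the numbers have no relation to the CIFAR-10 results. It compares a classifier fitted on all features with classifiers fitted on one random sample per class, for a fixed set of features.

```python
import random


def fit_nearest_mean(features, targets):
    """Hypothetical classifier: store the mean feature of each class."""
    pos = [z for z, t in zip(features, targets) if t == 1]
    neg = [z for z, t in zip(features, targets) if t == -1]
    return sum(pos) / len(pos), sum(neg) / len(neg)


def error_rate(means, features, targets):
    """Fraction of features assigned to the wrong class mean."""
    mu_pos, mu_neg = means
    wrong = 0
    for z, t in zip(features, targets):
        pred = 1 if abs(z - mu_pos) < abs(z - mu_neg) else -1
        wrong += pred != t
    return wrong / len(features)


def mean_gap(features, targets, per_class=1, trials=200, seed=0):
    """Average error of small-subset classifiers minus the error of the
    full-dataset classifier, with the features held fixed."""
    rng = random.Random(seed)
    full_err = error_rate(fit_nearest_mean(features, targets), features, targets)
    pos_idx = [i for i, t in enumerate(targets) if t == 1]
    neg_idx = [i for i, t in enumerate(targets) if t == -1]
    gap = 0.0
    for _ in range(trials):
        idx = rng.sample(pos_idx, per_class) + rng.sample(neg_idx, per_class)
        small = fit_nearest_mean([features[i] for i in idx],
                                 [targets[i] for i in idx])
        gap += error_rate(small, features, targets) - full_err
    return gap / trials


targets = [1] * 4 + [-1] * 4
point_like = [1.0] * 4 + [-1.0] * 4            # one point per class
split = [1.0, 1.0, 1.0, 5.0] + [-1.0] * 4      # class +1 is split in two

print("gap, point-like features:", mean_gap(point_like, targets))
print("gap, split features     :", mean_gap(split, targets))
```

With point-like features every subset yields the same classifier as the full dataset, so the gap vanishes; with split features the small-subset classifier depends on which samples it happens to see.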