

  1. Co-Representation Network for Generalized Zero-Shot Learning. Fei Zhang, Guangming Shi, Xidian University. ICML 2019.

  2. Introduction
     ➢ Classic deep CNN: predict directly from labeled training data.
     ➢ Transfer learning (data requirements decrease):
       • Few-Shot Learning
       • One-Shot Learning
       • Zero-Shot Learning (ZSL)
     ➢ ZSL setting: the source space (seen classes) and the target space (unseen classes) are bridged by a semantic space (attributes such as legs or fur, or word2vecs).
       • Conventional ZSL (CZSL): predict among unseen classes only.
       • Generalized ZSL (GZSL): predict among both seen and unseen classes.

  3. Bias Problem
     ➢ Existing embedding models for GZSL:
       • Visual space to semantic space.
       • Visual & semantic space to a latent space.
       • Semantic space to visual space.
     Figure: average per-class top-1 accuracy in % on unseen classes of various models (DAP, CONSE, SSE, LATEM, ALE, DEVISE, SJE, SYNC, SAE, GFZSL) under CZSL vs. GZSL settings; accuracy drops sharply when moving from CZSL to GZSL.
     ➢ Bias problem: unseen samples are easily classified into similar seen classes, e.g. Zebra → Horse.
     (Xian, Yongqin, et al. "Zero-Shot Learning - A Comprehensive Evaluation of the Good, the Bad and the Ugly." IEEE TPAMI 2017.)

  4. Our Model
     ➢ Co-Representation Network (CRnet)
     Figure: architecture. The image input passes through a CNN to produce an image feature; the semantic input passes through K expert modules f_1 ... f_K, whose outputs O_1 ... O_K are combined by the cooperation module f into a feature anchor. The anchor is concatenated (C) with the image feature and fed to the relation module g, whose similarity output predicts the class (Horse, Zebra, Panda, Tiger, ...). The whole network is trained with back-propagation.
     Components:
       1. A cooperation module for visual feature representation (our main contribution).
       2. A pre-trained CNN (ResNet-101) for feature extraction.
       3. A relation module for similarity output, i.e. the classification.
     (Sung, Flood, et al. "Learning to Compare: Relation Network for Few-Shot Learning." CVPR 2018.)

  5. Algorithm
     ➢ Initialization
       • Perform K-means clustering on the semantic space (semantic vectors → K clustering centers).
       • Expert module f_k (k = 1 ... K): a single-layer perceptron, one per cluster.
     ➢ Cooperation Module
       • Sum the outputs O_1 ... O_K of the expert modules to obtain the feature anchor in the visual embedding space.
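The initialization and cooperation steps above can be sketched as follows. This is a minimal illustration, not the authors' implementation: dimensions, the ReLU activation, and the plain K-means loop are assumptions for the sketch.

```python
# Sketch: K-means over the semantic space, then K single-layer "expert"
# perceptrons whose outputs are summed into one feature anchor.
import numpy as np

rng = np.random.default_rng(0)

def kmeans(X, k, iters=20):
    """Plain K-means; one expert module is associated with each cluster."""
    centers = X[rng.choice(len(X), k, replace=False)]
    for _ in range(iters):
        # assign each semantic vector to its nearest center
        labels = np.argmin(((X[:, None] - centers[None]) ** 2).sum(-1), axis=1)
        for j in range(k):
            if (labels == j).any():
                centers[j] = X[labels == j].mean(0)
    return centers, labels

class CooperationModule:
    """K single-layer perceptrons (experts); their outputs are summed."""
    def __init__(self, sem_dim, vis_dim, k):
        self.W = [rng.normal(scale=0.1, size=(sem_dim, vis_dim)) for _ in range(k)]
        self.b = [np.zeros(vis_dim) for _ in range(k)]

    def forward(self, s):
        # expert outputs O_k (ReLU assumed), summed into one feature anchor
        outs = [np.maximum(s @ W + b, 0.0) for W, b in zip(self.W, self.b)]
        return np.sum(outs, axis=0)

# toy semantic space: 10 classes described by 5 attributes
S = rng.normal(size=(10, 5))
centers, labels = kmeans(S, k=3)
coop = CooperationModule(sem_dim=5, vis_dim=8, k=3)
anchor = coop.forward(S[0])   # feature anchor for class 0
```

In the real model the experts are trained jointly end-to-end; K-means only determines K and a sensible starting partition of the semantic space.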

  6. Algorithm
     ➢ Relation Module
       • Concatenate the feature anchor (the output of the cooperation module) and the image feature v as the input.
       • A two-layer perceptron with a sigmoid output.
     ➢ Training
       • Objective function: regress the similarity output toward the ground-truth match label (1 for the correct class, 0 otherwise).
       • Trained in an end-to-end manner.
     • When the model converges, the cooperation module divides the semantic space into several parts; semantic vectors located in different parts are projected by different expert modules.
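A hedged sketch of the relation module described above: concatenate the anchor with the image feature and score the pair with a two-layer perceptron ending in a sigmoid. The hidden size, ReLU activation, and squared-error loss (as in the Relation Network it builds on) are illustrative assumptions.

```python
# Sketch: relation module g scoring (feature anchor, image feature) pairs.
import numpy as np

rng = np.random.default_rng(1)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class RelationModule:
    def __init__(self, vis_dim, hidden=16):
        in_dim = 2 * vis_dim                       # anchor ++ image feature
        self.W1 = rng.normal(scale=0.1, size=(in_dim, hidden))
        self.b1 = np.zeros(hidden)
        self.W2 = rng.normal(scale=0.1, size=(hidden, 1))
        self.b2 = np.zeros(1)

    def forward(self, anchor, image_feat):
        x = np.concatenate([anchor, image_feat])   # the concatenation step "C"
        h = np.maximum(x @ self.W1 + self.b1, 0.0) # hidden layer (ReLU assumed)
        return sigmoid(h @ self.W2 + self.b2)[0]   # similarity score in (0, 1)

vis_dim = 8
rel = RelationModule(vis_dim)
anchor = rng.normal(size=vis_dim)    # would come from the cooperation module
feat = rng.normal(size=vis_dim)      # would come from the CNN backbone
score = rel.forward(anchor, feat)

# ground-truth: 1 if the anchor's class matches the image, else 0;
# squared-error loss per pair (assumed, following Relation Network)
target = 1.0
loss = (score - target) ** 2
```

At test time the image feature is scored against every class anchor and the highest-scoring class is predicted.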

  7. Benchmark Results

  8. Analysis
     ➢ Bias Problem
       • Unseen anchors distribute too close to seen anchors in the embedding space used for classification.
     ➢ Local Relative Distance (LRD)
       • We propose the LRD as a metric for the bias problem: a larger LRD means a more uniform embedding space, i.e. a slighter bias problem.
       • Illustration (1-d semantic space to 1-d visual embedding space): a mapping with high local linearity spreads anchors apart (slight bias problem); a flatter mapping crowds them together (serious bias problem).
       • High local linearity results in a larger LRD.
       • The cooperation module actually learns a piecewise linear function of K+1 pieces with high local linearity.
     (f_G: general fitting curve; f_CR: fitting curve of CRnet; S: semantic space; V: visual embedding space.)
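The "K+1 pieces" claim can be checked directly in 1-d: a sum of K ReLU experts f_k(s) = relu(w_k s + b_k) is piecewise linear, with one breakpoint per expert at s = -b_k / w_k and a constant slope between consecutive breakpoints. The specific weights below are made up for the demonstration.

```python
# Numerically verify that a sum of K 1-d ReLU experts is piecewise linear
# with at most K + 1 pieces (constant slope between breakpoints).
import numpy as np

K = 3
w = np.array([1.0, -2.0, 0.5])    # illustrative expert weights
b = np.array([0.0, 1.0, -0.4])    # illustrative expert biases

def f_cr(s):
    """Cooperation-module-style sum of ReLU experts, 1-d case."""
    return np.maximum(w * s + b, 0.0).sum()

# breakpoints where an individual ReLU switches on or off
breaks = sorted(-b / w)           # here: [0.0, 0.5, 0.8] -> 4 pieces

def slope(s, eps=1e-6):
    """Numerical slope of f_cr at s (valid away from breakpoints)."""
    return (f_cr(s + eps) - f_cr(s - eps)) / (2 * eps)
```

Between any two neighboring breakpoints the slope is constant, so K experts yield at most K + 1 linear pieces, matching the slide's characterization of the cooperation module.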

  9. Contrast Experiments
     ➢ Relation Network (RN)
       • A two-layer perceptron is used instead of the cooperation module.
     (Sung, Flood, et al. "Learning to Compare: Relation Network for Few-Shot Learning." CVPR 2018.)
     Figure: CRnet vs. RN. Both concatenate a feature anchor with the CNN image feature and classify via the relation module g; CRnet produces the anchor with K expert modules and the cooperation module f, while RN maps the input semantic vectors with a single two-layer perceptron.

  10. Contrast Experiments
     ➢ Results. Compared with RN, CRnet achieves:
       • A slighter bias problem.
       • More sparse and discriminative features.
       • A more uniform embedding space (larger LRD).
     Figure: bar chart of per-class Bias Rate and per-class Error Rate of RN and CRnet on AwA2 (unseen classes 0-10 and the average). Bias Rate: the rate in % of misclassification into the closest seen class; Error Rate: per-class classification error rate in %.

  11. Summary
     ➢ Co-Representation Network
       • A decomposition method for projecting the semantic space to the visual embedding space.
       • A cooperation module for representation and a learnable relation module for classification.
     ✓ Trained in an end-to-end manner.
     ✓ A slighter bias problem leads to good performance on GZSL.
     Other advantages:
     ✓ Simple structure with high expandability.
     ✓ No need for semantic information of unseen classes during training (compared with generative models).
     Email: f.zhang@stu.xidian.edu.cn
