Co-Representation Network for Generalized Zero-Shot Learning
Fei Zhang, Guangming Shi
XIDIAN UNIVERSITY
ICML 2019

Introduction
➢ Classic Deep CNN
➢ Transfer Learning
• Few-Shot Learning
• One-Shot Learning
• Zero-Shot Learning (ZSL)
(Data requirements decrease down the list.)
[Figure: knowledge is transferred from the source space (seen classes) to the target space (unseen classes) through a semantic space (attributes, word2vecs; e.g. Legs, Fur). Conventional ZSL (CZSL) predicts over the target space only; Generalized ZSL (GZSL) predicts over both the source and target spaces.]

Bias Problem
➢ Existing Embedding Models for GZSL
• Visual Space to Semantic Space
• Visual & Semantic Space to a Latent Space
• Semantic Space to Visual Space
➢ Bias Problem
• Unseen samples are easily classified into similar seen classes, e.g. Zebra → Horse.
[Figure: bar chart of average per-class top-1 accuracy in % on unseen classes of various models (DAP, CONSE, SSE, LATEM, ALE, DEVISE, SJE, SYNC, SAE, GFZSL) following CZSL settings and GZSL settings. The CZSL bars range from 44.1 to 68.3; the GZSL bars range from 0 to 16.8.]
(Xian, Yongqin, et al. "Zero-Shot Learning - A Comprehensive Evaluation of the Good, the Bad and the Ugly." IEEE TPAMI 2017.)
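
The metric named on the chart, average per-class top-1 accuracy, can be sketched as below. This is a minimal illustration with hypothetical toy labels, not the evaluation code of the cited benchmark; the function name `average_per_class_top1` is an assumption made for this example.

```python
from collections import defaultdict

def average_per_class_top1(y_true, y_pred):
    """Mean of the per-class top-1 accuracies, in percent."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for t, p in zip(y_true, y_pred):
        total[t] += 1
        correct[t] += int(t == p)
    per_class = [correct[c] / total[c] for c in total]
    return 100.0 * sum(per_class) / len(per_class)

# Hypothetical toy data: every zebra image is predicted as the seen class "horse".
y_true = ["zebra", "zebra", "zebra", "panda"]
y_pred = ["horse", "horse", "horse", "panda"]
acc = average_per_class_top1(y_true, y_pred)
```

Averaging per class rather than per sample keeps a class with many test images from dominating the score, which is why a single biased class (zebra here) pulls the result down sharply.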

Our Model
[Figure: CRnet architecture. Semantic input → cooperation module f (expert modules f_1, f_2, ..., f_K with outputs O_1, O_2, ..., O_K) → feature anchor. Image input → CNN → image feature. The feature anchor and image feature are concatenated (C) and passed to the relation module g, which predicts the similarity output (example classes: Horse, Zebra, Panda, Tiger). The model is trained by back propagation.]
➢ Co-Representation Network (CRnet)
1. A cooperation module for visual feature representation (our main contribution).
2. A pre-trained CNN (ResNet-101) for feature extraction.
3. A relation module for similarity output, i.e. the classification. (Sung, Flood, et al. "Learning to Compare: Relation Network for Few-Shot Learning." CVPR 2018.)

Algorithm
➢ Initialization Algorithm
• Perform K-means clustering on the semantic space.
• [Formulas on slide: semantic vectors; clustering center; expert module k.]
• Each expert module f_1, f_2, ..., f_K is a single-layer perceptron with output O_1, O_2, ..., O_K.
➢ Cooperation Module
• Sum the outputs of expert modules to obtain the feature anchor in the visual embedding space.
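
The initialization and cooperation steps above can be sketched in plain Python as follows. The slide's formulas did not survive, so two details are assumptions made for this sketch and not taken from the source: each expert receives the semantic vector minus its clustering center, and each expert uses a ReLU activation. The toy semantic vectors, dimensions, and all names (`kmeans`, `Expert`, `cooperation_module`) are hypothetical.

```python
import math
import random

def kmeans(points, k, iters=20, seed=0):
    """Plain Lloyd's K-means; returns k cluster centers."""
    rng = random.Random(seed)
    centers = [list(p) for p in rng.sample(points, k)]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda j: math.dist(p, centers[j]))
            groups[nearest].append(p)
        for j, g in enumerate(groups):
            if g:  # keep the old center if a cluster goes empty
                centers[j] = [sum(col) / len(g) for col in zip(*g)]
    return centers

class Expert:
    """Single-layer perceptron tied to one clustering center (assumed form)."""
    def __init__(self, center, out_dim, rng):
        self.center = center
        self.W = [[rng.gauss(0, 0.1) for _ in center] for _ in range(out_dim)]
        self.b = [0.0] * out_dim

    def __call__(self, a):
        # Assumption: the input is centered on this expert's clustering center.
        x = [ai - ci for ai, ci in zip(a, self.center)]
        # Assumption: ReLU activation on the single linear layer.
        return [max(0.0, sum(w * xi for w, xi in zip(row, x)) + bi)
                for row, bi in zip(self.W, self.b)]

def cooperation_module(experts, a):
    """Feature anchor = element-wise sum of all expert outputs."""
    outputs = [f(a) for f in experts]
    return [sum(vals) for vals in zip(*outputs)]

# Hypothetical 2-d semantic vectors for six classes.
semantic = [[0.0, 0.1], [0.1, 0.0], [1.0, 0.9], [0.9, 1.0], [0.0, 1.0], [0.1, 0.9]]
rng = random.Random(0)
centers = kmeans(semantic, k=3)
experts = [Expert(c, out_dim=4, rng=rng) for c in centers]
anchor = cooperation_module(experts, semantic[0])
```

In a real model the anchor dimension would match the CNN feature dimension and the weights would be trained; the sketch only shows how clustering fixes one center per expert and how the outputs are summed.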

Algorithm
➢ Relation Module
• Concatenate the feature anchor (output of cooperation module) and the image feature v as the input.
• Two-layer perceptron with Sigmoid.
➢ Training
• [Formulas on slide: objective function; ground-truth.]
• End-to-end manner.
• When the model converges, cooperation module divides the semantic space into several parts.
• Semantic vectors located in different parts are projected by several different expert modules.
[Figure: the semantic space before and after it is divided into parts.]
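
A minimal sketch of the relation module described above: concatenation followed by a two-layer perceptron with a sigmoid output. The slide's objective function and ground-truth definition were lost, so the loss here is an assumption: mean squared error against a 0/1 target (1 when the anchor's class matches the image, 0 otherwise), in the style of the cited Relation Network paper. The ReLU hidden activation, the layer sizes, and all names are also hypothetical.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def linear(W, b, x):
    return [sum(w * xi for w, xi in zip(row, x)) + bi for row, bi in zip(W, b)]

class RelationModule:
    """Two-layer perceptron: one hidden layer, then one sigmoid similarity score."""
    def __init__(self, in_dim, hidden_dim, rng):
        self.W1 = [[rng.gauss(0, 0.1) for _ in range(in_dim)] for _ in range(hidden_dim)]
        self.b1 = [0.0] * hidden_dim
        self.W2 = [[rng.gauss(0, 0.1) for _ in range(hidden_dim)]]
        self.b2 = [0.0]

    def __call__(self, anchor, image_feature):
        x = list(anchor) + list(image_feature)  # concatenate anchor and image feature
        h = [max(0.0, z) for z in linear(self.W1, self.b1, x)]  # assumption: ReLU
        return sigmoid(linear(self.W2, self.b2, h)[0])

def mse_loss(scores, targets):
    # Assumption: mean squared error against 0/1 ground-truth similarities.
    return sum((s - t) ** 2 for s, t in zip(scores, targets)) / len(scores)

rng = random.Random(0)
g = RelationModule(in_dim=8, hidden_dim=5, rng=rng)
image_feature = [0.2, 0.0, 0.5, 0.1]  # hypothetical CNN feature of one image
anchors = {"horse": [0.3, 0.0, 0.4, 0.1], "zebra": [0.0, 0.6, 0.1, 0.2]}
scores = {name: g(a, image_feature) for name, a in anchors.items()}
predicted = max(scores, key=scores.get)  # class with the highest similarity
loss = mse_loss([scores["horse"], scores["zebra"]], [1.0, 0.0])
```

Classification picks the class whose anchor gets the highest similarity score, so the same module serves both training and prediction.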

Benchmark Results

Analysis
➢ Bias Problem
• Unseen anchors distribute too close to seen anchors in the embedding space used for classification.
➢ Local Relative Distance (LRD)
• We propose the LRD as a metric for bias problem. [Formula on slide.]
• Larger LRD means a more uniform embedding space, i.e. slighter bias problem.
[Figure: 1-d semantic space to 1-d visual embedding space, with two panels: serious bias problem and slight bias problem. f_G: general fitting curve; f_CR: fitting curve of CRnet; S: semantic space; V: visual embedding space.]
• High local linearity results in larger LRD.
• Cooperation module actually learns a piecewise linear function of K+1 pieces with high local linearity.
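
The K+1 pieces claim can be checked numerically in the 1-d case. The sketch below assumes, as an illustration only, that each expert has the form ReLU(w_k * (x - c_k)) with a positive weight and no bias, so each clustering center c_k is a breakpoint; the centers and weights are hypothetical values. It does not compute LRD, whose formula is not recoverable from the slide text.

```python
def cooperation_1d(x, centers, weights):
    """Sum of K one-dimensional experts, each ReLU(w_k * (x - c_k))."""
    return sum(max(0.0, w * (x - c)) for c, w in zip(centers, weights))

def piece_slopes(centers, weights, eps=1e-3):
    """Slope of the summed function inside each interval between breakpoints."""
    cuts = sorted(centers)
    # One probe point left of all breakpoints, one between each pair, one to the right.
    probes = ([cuts[0] - 1.0]
              + [(a + b) / 2 for a, b in zip(cuts, cuts[1:])]
              + [cuts[-1] + 1.0])
    return [round((cooperation_1d(p + eps, centers, weights)
                   - cooperation_1d(p - eps, centers, weights)) / (2 * eps), 6)
            for p in probes]

centers = [0.0, 1.0, 2.0]    # hypothetical clustering centers (K = 3)
weights = [1.0, 0.5, 0.25]   # hypothetical positive expert weights
slopes = piece_slopes(centers, weights)
```

With K = 3 experts the function has four distinct slopes, one per interval, and is exactly linear inside each interval. Nonzero biases or mixed-sign weights would move or merge the breakpoints, so K+1 is the count for this simplified setting.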

Contrast Experiments
➢ Relation Network (RN)
• A two-layer perceptron instead of cooperation module is used. (Sung, Flood, et al. "Learning to Compare: Relation Network for Few-Shot Learning." CVPR 2018.)
[Figure: CRnet vs RN. CRnet: semantic input → cooperation module f (expert modules f_1, f_2, ..., f_K) → feature anchor, concatenated (C) with the CNN image feature → relation module g → similarity output. RN: semantic vectors input → two-layer perceptron → feature anchor, concatenated (C) with the CNN image feature → relation module g → similarity output. Example classes: Horse, Zebra, Panda, Tiger.]

Contrast Experiments
➢ Results
Compared with RN, CRnet achieves:
• Slighter Bias Problem
• More Sparse and Discriminative Features
• More Uniform Embedding Space (Larger LRD)
Figure. Bar chart of per-class Bias Rate and per-class Error Rate of RN and CRnet on AwA2 (x-axis: unseen class index 1 to 10 and average; y-axis: rate in %). Bias Rate: The rate in % of misclassification into the closest seen class; Error Rate: Per-class classification Error Rate in %.
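
The two rates defined in the figure caption can be computed as in the sketch below. The mapping from each unseen class to its closest seen class is passed in as an input because the slide does not say how "closest" is determined; that mapping, the toy labels, and the name `per_class_rates` are hypothetical.

```python
def per_class_rates(y_true, y_pred, closest_seen):
    """Return {class: (bias_rate, error_rate)} in percent for each unseen class."""
    rates = {}
    for cls, seen in closest_seen.items():
        preds = [p for t, p in zip(y_true, y_pred) if t == cls]
        errors = sum(p != cls for p in preds)   # any wrong prediction
        biased = sum(p == seen for p in preds)  # wrong and equal to the closest seen class
        rates[cls] = (100.0 * biased / len(preds), 100.0 * errors / len(preds))
    return rates

closest_seen = {"zebra": "horse"}  # hypothetical closest-seen-class mapping
y_true = ["zebra"] * 4
y_pred = ["horse", "zebra", "zebra", "tiger"]
rates = per_class_rates(y_true, y_pred, closest_seen)
```

The bias rate can never exceed the error rate for a class, since every biased prediction is also an error; the gap between the two shows how much of the error comes from other confusions.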

Summarize
➢ Co-representation network
• Decomposition method for projecting semantic space to visual embedding space.
• Cooperation module for representation and learnable relation module for classification.
✓ Training in an end-to-end manner.
✓ Slighter bias problem leads to a good performance on GZSL.
Other advantages:
✓ Simple structure with high expandability.
✓ No need for semantic information of unseen classes during training (compared with generative models).
Email: f.zhang@stu.xidian.edu.cn
