Learning from Limited Data, The University of Tokyo / RIKEN AIP (PowerPoint PPT Presentation)



SLIDE 1

Learning from Limited Data

The University of Tokyo / RIKEN AIP Tatsuya Harada

GTC March 18, 2019

SLIDE 2

Deep Neural Networks for Visual Recognition

  • Tasks in the visual recognition field
  • Object class recognition
  • Object detection
  • Image caption generation
  • Semantic and instance segmentation
  • Image generation
  • Style transfer
  • DNNs have become an indispensable module.
  • A large amount of labeled data is needed to train DNNs.
  • Reducing annotation cost is therefore highly desirable.

Deep Neural Networks Applications

[Figure: application examples, e.g. image caption generation ("A yellow train on the tracks near a train station.") and recognition/detection of objects such as cellphone, cup, book, and laptop, shown as input/output pairs.]

SLIDE 3

Can we learn Deep Neural Networks from limited Supervised Information?

SLIDE 4

Topics

Recent progress in our team (MIL, the University of Tokyo) on learning from limited data:
  • Between-class learning (BC learning)
  • Unsupervised domain adaptation
    • Closed-set domain adaptation
    • Open set domain adaptation
    • Adaptive object detection

SLIDE 5

Between-class Learning

Yuji Tokozume, Yoshitaka Ushiku, Tatsuya Harada. "Learning from Between-class Examples for Deep Sound Recognition." ICLR 2018.
Yuji Tokozume, Yoshitaka Ushiku, Tatsuya Harada. "Between-class Learning for Image Classification." CVPR 2018.

Learning from Limited Data

  • Y. Tokozume
  • Y. Ushiku
  • T. Harada
SLIDE 6

Standard Supervised Learning

[Figure: one example (Dog) is randomly selected from the training dataset (Dog, Cat, Bird) and augmented; the model is trained to output Dog 1, Cat 0, Bird 0.]

1. Select one example from the training dataset.
2. Train the model to output 1 for the corresponding class and 0 for the other classes.

SLIDE 7

Between-class (BC) Learning

Proposed method:
1. Select two training examples from different classes.
2. Mix those examples with a random ratio.
3. Train the model to output the mixing ratio for the mixed classes.

[Figure: a Dog example and a Cat example are randomly selected, augmented, and mixed with ratio 0.7 : 0.3; the model is trained with a KL-divergence loss to output Dog 0.7, Cat 0.3, Bird 0.]

At test time, a single example is input into the network.

Merits:
  • Generates virtually infinite training data from limited data.
  • Learns a more discriminative feature space than standard learning.
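The mixing step above is simple to implement. A minimal sketch in plain Python (the function name and the toy vectors are ours, not from the slides; the CVPR 2018 image version additionally normalizes the mixture, which this sketch omits):

```python
import random

def bc_mix(x1, y1, x2, y2, r=None):
    """Between-class mixing (sketch): blend two examples from different
    classes with ratio r and give the blend a ratio-valued label."""
    if r is None:
        r = random.uniform(0.0, 1.0)  # step 2: random mixing ratio
    x = [r * a + (1.0 - r) * b for a, b in zip(x1, x2)]  # mixed input
    y = [r * a + (1.0 - r) * b for a, b in zip(y1, y2)]  # mixed label
    return x, y

# Mix a "dog" example with a "cat" example at ratio 0.7 : 0.3;
# the model is then trained (e.g. with a KL-divergence loss) to output
# the ratio-valued label y: roughly Dog 0.7, Cat 0.3, Bird 0.
x, y = bc_mix([1.0, 0.0, 2.0], [1, 0, 0],
              [0.0, 1.0, 0.0], [0, 1, 0], r=0.7)
```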

SLIDE 8

BC learning for sounds

[Figure: two training examples, a dog sound (labels Dog: 1, Cat: 0, Bird: 0) and a cat sound (labels Dog: 0, Cat: 1, Bird: 0), are mixed with a random ratio r, producing a sound of "a dog and a cat" with labels Dog: r, Cat: 1 − r, Bird: 0. G1, G2: sound pressure levels of the two sounds [dB].]
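For sounds, the mixing accounts for the sound pressure levels G1, G2 [dB] of the two waveforms so that the perceived ratio matches r. A sketch of such a level-compensated mix, following our reading of the ICLR 2018 formulation (variable names are ours):

```python
import math

def bc_mix_sounds(x1, x2, g1, g2, r):
    """Mix two waveforms so that their *perceived* ratio is r, even when
    one sound is louder: p compensates for the dB gap g1 - g2, and the
    denominator keeps the mixture's energy roughly constant."""
    p = 1.0 / (1.0 + 10.0 ** ((g1 - g2) / 20.0) * (1.0 - r) / r)
    norm = math.sqrt(p ** 2 + (1.0 - p) ** 2)
    return [(p * a + (1.0 - p) * b) / norm for a, b in zip(x1, x2)]

# Two equally loud sounds mixed half-and-half: p = 0.5
mixed = bc_mix_sounds([1.0, 0.0], [0.0, 1.0], g1=60.0, g2=60.0, r=0.5)
```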

SLIDE 9

Results of Sound Recognition

① Various models ② Various datasets ③ Compatible with strong data augmentation ④ Surpass the human level

Applying BC learning improves recognition performance for any sound-recognition network.

SLIDE 10

Results on CIFAR


Our preliminary results were presented in ILSVRC2017 on July 26, 2017.

SLIDE 11

How BC Learning Works

[Figure: feature distributions of class A, class B, and the mixture rA + (1 − r)B.]

  • Small Fisher's criterion → overlap among distributions → large BC-learning loss (less discriminative).
  • Large Fisher's criterion → no overlap among distributions → small BC-learning loss (more discriminative).

SLIDE 12

How BC Learning Works

[Figure: classes A, B, C with decision boundaries and the mixed distribution rA + (1 − r)B.]

  • Large correlation among classes → a mixture of classes A and B may be classified into class C → large BC-learning loss.
  • Small correlation among classes → a mixture of classes A and B is not classified into class C → small BC-learning loss.

In classification, the distributions must be uncorrelated because the teaching signal is discrete.

SLIDE 13

Knowledge Transfer

[Figure: a child learns "Doggie" from a picture book, then recognizes a real dog ("Doggie!"); learning from picture books illustrates domain adaptation. Images by GraphicMama-team and Chiemsee2016 on Pixabay.]

SLIDE 14

Domain Adaptation (DA)

Problems:
  • A supervised learning model needs many labeled examples.
  • Collecting them in various domains is costly.

Goal:
  • Transfer knowledge from the source domain (rich supervised data) to the target domain (small supervised data).
  • A classifier that works well on the target domain.

Unsupervised Domain Adaptation (UDA)

Labeled examples are given only in the source domain. There are no labeled examples in the target domain.

[Figure: source domain = labeled synthetic images; target domain = unlabeled real images.]

SLIDE 15

Distribution Matching for Unsupervised Domain Adaptation

Distribution matching based method

  • Match distributions of source and target features
  • Domain Classifier (GAN) [Ganin et al., 2015]
  • Maximum Mean Discrepancy [Long et al., 2015]

[Figure: a shared feature extractor maps source (labeled) and target (unlabeled) examples into one feature space; before adaptation the source and target distributions are separated by the decision boundary, after adaptation they are matched.]

SLIDE 16

Adversarial Domain Adaptation

[Figure: a feature extractor feeds a category classifier and a domain classifier; source and target feature distributions before and after adaptation.]

Training the feature extractor adversarially against the domain classifier works well. Components: category classifier, domain classifier, feature extractor.

Problems:
  • The whole distributions are matched.
  • Category information in the source domain is ignored.

Tzeng, Eric, et al. "Adversarial Discriminative Domain Adaptation." CVPR 2017.

SLIDE 17

Unsupervised Domain Adaptation using Classifier Discrepancy

Kuniaki Saito¹, Kohei Watanabe¹, Yoshitaka Ushiku¹, Tatsuya Harada¹,²
1: The University of Tokyo, 2: RIKEN
CVPR 2018 (oral presentation)

  • K. Saito
  • Y. Ushiku
  • K. Watanabe
  • T. Harada
SLIDE 18

Proposed Approach

  • Consider class-specific distributions.
  • Use the decision boundary to align distributions.

[Figure: previous work matches the whole source/target distributions; the proposed method aligns class-specific distributions (class A, class B) using the decision boundary, shown before and after adaptation.]

SLIDE 19

Key Idea

[Figure: two classifiers F1 and F2 over source and target features. Repeat: maximize the discrepancy by learning the classifiers, then minimize the discrepancy by learning the feature space.]

Discrepancy: the disagreement between the predictions of the two classifiers on the same example; a target example that gets different predictions from the two classifiers has a large discrepancy.
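Concretely, the discrepancy can be measured as the mean absolute difference (L1 distance) between the two classifiers' class-probability outputs on the same input; a minimal sketch (function name is ours):

```python
def discrepancy(p1, p2):
    """L1 discrepancy between two classifiers' class-probability outputs
    for the same example. It is zero when F1 and F2 agree and large for
    ambiguous target examples near the decision boundary."""
    return sum(abs(a - b) for a, b in zip(p1, p2)) / len(p1)

# Total disagreement vs. perfect agreement on a 2-class example:
d_max = discrepancy([1.0, 0.0], [0.0, 1.0])  # -> 1.0
d_min = discrepancy([0.5, 0.5], [0.5, 0.5])  # -> 0.0
```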

SLIDE 20

[Figure: network architecture; a shared feature generator feeds two classifiers F1 and F2, trained with classification losses L1class and L2class plus a discrepancy loss.]

Algorithm:
1. Train the generator and both classifiers to classify source examples correctly.
2. Fix the generator, and find classifiers F1, F2 that maximize the discrepancy on target examples.
3. Fix the classifiers F1, F2, and find a feature generator that minimizes the discrepancy on target examples.
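The alternation can be sketched numerically. Below, a toy 1-D "feature" z stands in for the generator's output on one target example, two logistic classifiers stand in for F1 and F2, and numerical gradients replace backprop; this setup is our illustration, not the paper's implementation (which also keeps training on labeled source data, step 1 above):

```python
import math

def sigmoid(u):
    return 1.0 / (1.0 + math.exp(-u))

def disc(z, w1, w2):
    """Discrepancy of two 1-D logistic classifiers on feature z."""
    return abs(sigmoid(w1 * z) - sigmoid(w2 * z))

def num_grad(f, x, eps=1e-5):
    """Central-difference numerical gradient."""
    return (f(x + eps) - f(x - eps)) / (2.0 * eps)

z, w1, w2, lr = 1.0, 0.5, 0.6, 0.3
for _ in range(50):
    # Step 2: fix the generator (z); update classifiers to MAXIMIZE discrepancy.
    w1 += lr * num_grad(lambda w: disc(z, w, w2), w1)
    w2 += lr * num_grad(lambda w: disc(z, w1, w), w2)
    # Step 3: fix the classifiers; update the feature to MINIMIZE discrepancy.
    z -= lr * num_grad(lambda t: disc(t, w1, w2), z)
```

In the real method, z is produced by a CNN and the min-max game is played over whole mini-batches with SGD.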

SLIDE 21

[Figure: instead of two fixed classifiers F1 and F2, a single classifier F is used, and two classifiers are sampled from it by applying dropout twice.]

Improvement by dropout: select the two classifiers by dropout!

Kuniaki Saito, Yoshitaka Ushiku, Tatsuya Harada, Kate Saenko. "Adversarial Dropout Regularization." ICLR 2018.

SLIDE 22

Why Does the Discrepancy Method Work Well?

The expected error in the target domain is bounded by the expected error in the source domain, a divergence term, and the shared error of the ideal joint hypothesis. Here the hypotheses are h = F1 ∘ G and h′ = F2 ∘ G (classifiers composed with the feature generator). The shared-error term is assumed to be low if h and h′ can classify source samples correctly, so maximizing the discrepancy over the classifiers and minimizing it over the feature generator minimizes this upper bound.
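Written out, the bound the slide paraphrases is the standard domain adaptation bound of Ben-David et al. (notation ours):

```latex
R_{\mathcal{T}}(h) \;\le\; R_{\mathcal{S}}(h)
  \;+\; \tfrac{1}{2}\, d_{\mathcal{H}\Delta\mathcal{H}}(\mathcal{S}, \mathcal{T})
  \;+\; \lambda,
\qquad
d_{\mathcal{H}\Delta\mathcal{H}}(\mathcal{S}, \mathcal{T})
  = 2 \sup_{h, h' \in \mathcal{H}}
    \Bigl|\, \Pr_{x \sim \mathcal{S}}\bigl[h(x) \ne h'(x)\bigr]
          - \Pr_{x \sim \mathcal{T}}\bigl[h(x) \ne h'(x)\bigr] \,\Bigr|
```

R_T and R_S are the expected target and source errors, λ is the shared error of the ideal joint hypothesis, and with h = F1 ∘ G, h′ = F2 ∘ G the maximized classifier discrepancy estimates the divergence term.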

SLIDE 23

Object Classification

  • Synthetic images to real images (12 classes).
  • Fine-tuning of ImageNet-pre-trained ResNet-101 [He et al., CVPR 2016].

[Figure: source = synthetic images, target = real images.]

SLIDE 24

Semantic Segmentation

 Simulated images (GTA5) to real images (Cityscapes)
 Fine-tuning of ImageNet-pre-trained VGG and Dilated Residual Network [Yu et al., 2017]
 Discrepancy is calculated pixel-wise
 Evaluation by mean IoU (TP / (TP + FP + FN))

GTA5 (source) → Cityscapes (target)

[Figure: per-class IoU bar chart (road, sidewalk, building, wall, fence, pole, light, sign, vegetation, terrain, sky, person, rider, car, truck, bus, train, motorcycle, bicycle), comparing "source only" with ours.]
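The evaluation metric above, mean IoU = TP / (TP + FP + FN) averaged over classes, can be computed from a confusion matrix; a small sketch (function name is ours):

```python
def mean_iou(conf):
    """Mean intersection-over-union from a confusion matrix, where
    conf[i][j] counts pixels of true class i predicted as class j."""
    n = len(conf)
    ious = []
    for c in range(n):
        tp = conf[c][c]
        fp = sum(conf[r][c] for r in range(n)) - tp   # predicted c, wrong
        fn = sum(conf[c]) - tp                        # true c, missed
        if tp + fp + fn > 0:                          # skip absent classes
            ious.append(tp / (tp + fp + fn))
    return sum(ious) / len(ious)

# Perfect 2-class prediction:
m = mean_iou([[5, 0], [0, 5]])  # -> 1.0
```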

SLIDE 25

Qualitative Results

[Figure: qualitative segmentation results; columns: RGB input, ground truth, source only, adapted (ours).]

SLIDE 26

Open Set Domain Adaptation (OSDA)

[Figure: in closed domain adaptation, source and target share all classes; in open set domain adaptation (P. P. Busto+, ICCV17), the target additionally contains unknown examples.]

・In closed domain adaptation, source and target completely share classes, and target examples are unlabeled.
・Open set: the target contains unknown categories.
・The open set situation is more realistic.

SLIDE 27

Distribution Matching for Open Set DA

Closed-set domain adaptation matches the distributions of source and target features.

[Figure: feature extractor, category classifier, and domain classifier; source/target distributions before and after adaptation, shown for closed-set DA and open set DA with decision boundaries and examples of unknown categories.]

Problem in the open set:
  • Examples of unknown categories are also aligned with the distributions of known categories.
  • Examples of unknown categories are therefore classified into known categories.

SLIDE 28

Open Set Domain Adaptation by Backpropagation

  • K. Saito
  • Y. Ushiku
  • S. Yamamoto
  • T. Harada

Kuniaki Saito¹, Shohei Yamamoto¹, Yoshitaka Ushiku¹, Tatsuya Harada¹,²
1: The University of Tokyo, 2: RIKEN
ECCV 2018

SLIDE 29

Idea

[Figure: feature distributions before and after adaptation, with examples of unknown categories kept separate from the known-category clusters.]

  • Separate examples of the unknown category from those of known categories in the target domain.
  • Align the distribution of known categories in the target domain with the source distribution.
  • The feature generator should have the option either to align a target example with the source distribution or to reject it as the unknown category.

SLIDE 30

Proposed Method

[Figure: a feature generator feeds a classifier whose outputs are the known classes plus one "unknown" class.]

Classifier:
  • Minimizes the classification loss to correctly categorize source examples.
  • Maximizes the adversarial loss for target examples.

Feature generator:
  • Minimizes the adversarial loss to deceive the classifier for target examples.
  • Is thereby given the choice to align a target example with the source distribution or to reject it as the unknown category.

The adversarial loss compares the classifier's "unknown" probability with a fixed boundary t = 1/2.
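The adversarial loss with the fixed boundary t = 1/2 can be sketched as a cross-entropy on the classifier's "unknown" probability (a simplified scalar version; the paper applies it per target example, with gradient reversal for the generator; function name is ours):

```python
import math

def osda_adv_loss(p_unknown, t=0.5):
    """Cross-entropy between the probability assigned to the 'unknown'
    class for a target example and the fixed boundary t. One player pulls
    p_unknown toward t; the other plays adversarially, pushing p_unknown
    toward 0 (align with the source) or toward 1 (reject as unknown)."""
    eps = 1e-12  # numerical safety
    return -t * math.log(p_unknown + eps) - (1.0 - t) * math.log(1.0 - p_unknown + eps)

# The loss is smallest exactly at the boundary p_unknown = t = 0.5:
losses = [osda_adv_loss(p) for p in (0.1, 0.5, 0.9)]
```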

SLIDE 31

Experimental Results for Office Dataset

・11-category classification (10 known classes + 1 unknown).
・The dataset consists of 31 classes; 10 classes were selected as shared classes, and classes 21-31 are used as unknown samples in the target domain.
・BP and MMD are distribution-matching-based methods.
・OS* is measured only on the known classes.

SLIDE 32

Experimental Results for VisDA Dataset

[Figure: source domain = labeled synthetic images; target domain = unlabeled real images.]

  • The VisDA dataset consists of 12 categories in total.
  • We choose 6 categories from them and set the other 6 categories as the unknown class.
SLIDE 33

Experimental Results on Digits Dataset

Blue: Source Known, Red: Target Known, Green: Target Unknown BP aligns target unknown with source known whereas ours rejects the target unknown.

SLIDE 34

Unsupervised Domain Adaptation for Object Detection

  • Can we realize object detection using domain matching method?
  • Source: w/ category and bounding box
  • Target: w/o category and bounding box

[Figure: source images have category and bounding-box annotations; target images have neither; the desired output is detections on the target domain.]

SLIDE 35

Strong Global Distribution Alignment

[Figure: strong global distribution alignment, before and after adaptation. It works for similar domains (good) but can hurt for dissimilar domains (bad?).]

Layout, number and combination of objects can be different.

SLIDE 36

Strong Instance Distribution Alignment

[Figure: strong instance-level distribution alignment, before and after adaptation. It works for both similar and dissimilar domains (good).]

Problem

  • To effectively conduct feature alignment, region

proposals have to precisely localize objects of interest.

How to obtain good Region Proposal Networks?

SLIDE 37

Problems of UDA for Object Detection

  • Global distribution alignment
  • Strong global distribution alignment is not appropriate for object detection.
  • Instance distribution alignment
  • Strong instance distribution alignment might be appropriate.
  • However, it is hard to obtain good region proposals in the target domain, because there are no ground-truth bounding boxes there.

SLIDE 38

Strong-Weak Distribution Alignment for Adaptive Object Detection

Kuniaki Saito¹, Yoshitaka Ushiku², Tatsuya Harada²,³, Kate Saenko¹
1: Boston University, 2: The University of Tokyo, 3: RIKEN
To appear in CVPR 2019

  • K. Saito
  • Y. Ushiku
  • T. Harada
  • K. Saenko
SLIDE 39

Key Idea

[Figure: detection network; low-level features get strong local alignment and high-level features get weak global alignment before the class/bbox heads.]

・Weak global alignment of high-level features (category).
・Strong local alignment of low-level features (texture, color).

SLIDE 40

Proposal: Strong Local Alignment

  • Make local features domain-invariant.
  • Extract a local feature from each receptive field in a low-level layer.

[Figure: strong distribution alignment in the low-level feature space, one local feature per receptive field.]

SLIDE 41

Proposal: Weak Global Alignment

  • Forcing full alignment of high-level features degrades DA performance.
  • Instead, partially align the high-level features.

[Figure: weak distribution alignment in the high-level feature space; a domain classifier scores each example as similar to the source or similar to the target.]

SLIDE 42

Proposal: Weak Global Alignment

  • Examples similar to the other domain are hard-to-classify examples for the domain classifier.
  • Objective of the domain classifier:
    • Higher weight on hard examples.
    • Lower weight on easy examples.
  • Implemented with the focal loss (T.-Y. Lin+, ICCV17).

[Figure: easy-to-classify and hard-to-classify examples around the domain classifier's boundary, before and after adaptation.]
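The focal-loss weighting referenced above can be sketched in scalar form (T.-Y. Lin et al.'s focal loss applied, as here, to the domain classifier's probability for the correct domain; function name is ours):

```python
import math

def focal_loss(p_correct, gamma=2.0):
    """FL(p) = -(1 - p)^gamma * log(p). Easy examples (p near 1) are
    down-weighted by (1 - p)^gamma, so training focuses on hard examples,
    i.e. images that look similar to the other domain."""
    return -((1.0 - p_correct) ** gamma) * math.log(p_correct)

easy = focal_loss(0.95)  # confidently classified domain: small loss
hard = focal_loss(0.55)  # near the boundary: much larger loss
```

With gamma = 0 the focal loss reduces to the ordinary cross-entropy.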

SLIDE 43

[Figure: network architecture. A Faster R-CNN module with an RPN extracts features of each region from source and target images. A local domain classifier network (Conv) is attached to the local (low-level) features through a gradient reversal layer (GRL) and gives the local alignment objective; a global domain classifier network (FC) is attached to the global (high-level) features through a GRL and gives the global alignment objective with its domain prediction; the detection heads (class, bbox, e.g. "bird") give the object detection objective. A context vector from the domain classifiers is fed into the detection head.]

  • GRL: gradient reversal layer.
  • Context vector: stabilizes adversarial training.
  • Local alignment objective: L2 distance between prediction and label (as used in CycleGAN).

SLIDE 44

Experiment 1: Adaptation Between Dissimilar Domains

  • Pascal VOC to Clipart and Watercolor

[Figure: source domain = Pascal VOC; target domains = Clipart and Watercolor.]

SLIDE 45

Experiment 1: Adaptation Between Dissimilar Domains

  • Pascal VOC to Clipart and Watercolor
SLIDE 46

Results on Clipart

 Strong global alignment (BDC-Faster: 27.8 → 25.6 %, DA-Faster: 27.8 → 19.8 %) degrades performance.
 Weak global alignment improves performance by 9.8 % (25.6 → 36.4 %).
 Strong local alignment improves performance by 2.7 % (27.8 → 30.5 %).
 The method combining weak global alignment, strong local alignment, and the context vector is the best (38.1 %).

G: Global Alignment, I: Instance, CTX: Context Vector, L: Local, P: Pixel
Pascal VOC → Clipart

SLIDE 47

Results on Watercolor

G: Global Alignment, I: Instance, CTX: Context Vector, L: Local, P: Pixel

 Weak global alignment improves performance by 4.3 % (45.5 → 49.8 %).
 Strong local alignment improves performance by 7.5 % (44.6 → 52.1 %).
 The method combining weak global alignment, strong local alignment, the context vector, and pixel-level alignment is the best.
 Local-level alignment was effective, and the method reaches oracle-level performance.

Pascal VOC → Watercolor

SLIDE 48

[Figure: feature visualizations. Baseline domain-classifier method (mAP: 25.6) vs. ours with weak global alignment only (mAP: 36.4).]

  • Results of adaptation between dissimilar domains (Pascal VOC → Clipart).
  • Blue: source examples, Red: target examples.
SLIDE 49

[Figure: source/target features under weak global alignment.]

・Weak global alignment focuses on samples that are similar to the other domain.

SLIDE 50

Experiment 2: Adaptation Between Similar Domains

  • Cityscapes to Foggy Cityscapes

G: Global Alignment, I: Instance, CTX: Context Vector, L: Local

SLIDE 51

Experiment 3: Adaptation from Synthetic to Real

GTA → Cityscapes
・Pixel-level and local-level adaptation are good.
・Combining pixel-level adaptation with ours is better.
・EFL performs better than the baselines: weak global alignment is effective!
G: Global Alignment, I: Instance, CTX: Context Vector, L: Local, P: Pixel

SLIDE 52

Visualization of Domain Evidence

  • Visualization of the evidence for the global-level domain classifier's prediction using Grad-CAM.
  • Evidence for why the domain classifier thinks the image comes from the source or the target.
  • The feature extractor seems to focus on cars to deceive the domain classifier.

[Figure: heatmaps showing evidence of the target domain and evidence of the source domain.]

SLIDE 53

Take Home Messages

 Learning from Limited Data

 Knowledge Transfer  Domain Adaptation  Between-class learning

 Between-class learning (BC learning)

 Mix two training examples with a random ratio  Train the model to output the mixing ratio  Simple to implement

 Unsupervised domain adaptation

 Considering class specific distribution matching and adversarial training are effective for unsupervised domain adaptation.

 Open set domain adaptation

 Giving an option for the feature extractor to select known or unknown patterns is practical in the open set domain adaptation.

 Adaptive Object Detection

 Weak global feature alignment and strong local feature alignment are effective for adaptive object detection.