Simulation for Real Boqing Gong Tencent AI Lab Department of - - PowerPoint PPT Presentation



slide-1
SLIDE 1

Domain Adaptation & Transfer: All You Need to Use Simulation “for Real”

Boqing Gong

Tencent AI Lab

Department of Computer Science

slide-2
SLIDE 2

An intelligent robot

slide-3
SLIDE 3

Image credit: https://www.cityscapes-dataset.com/

Semantic segmentation of urban scenes

Assign each pixel a semantic label

An appealing application: self-driving
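
As a minimal sketch of what "assign each pixel a semantic label" means in practice: given per-pixel class scores (here made-up numbers, not the output of any real network), the label map is the per-pixel argmax. The class names and the `segment` helper are hypothetical and only serve the illustration.

```python
# Hypothetical class names; a real benchmark defines its own label set.
CLASSES = ["road", "sky", "tree"]


def segment(scores):
    """Turn per-pixel class scores into a per-pixel label map.

    scores[i][j] is a list holding one score per class for pixel (i, j).
    The result has the same height and width and stores class indices.
    """
    return [
        [max(range(len(pixel)), key=pixel.__getitem__) for pixel in row]
        for row in scores
    ]


# A toy 2x2 "image" with made-up scores for the three classes.
toy_scores = [
    [[0.1, 0.8, 0.1], [0.2, 0.7, 0.1]],
    [[0.9, 0.05, 0.05], [0.3, 0.1, 0.6]],
]

label_map = segment(toy_scores)
named = [[CLASSES[k] for k in row] for row in label_map]
print(named)
```

A segmentation CNN produces the score tensor; the argmax step itself is the same regardless of the network.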

slide-4
SLIDE 4

Triumphal approach: convolutional neural networks (CNNs)

Long, J., Shelhamer, E., & Darrell, T. (2015). Fully convolutional networks for semantic segmentation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition.

slide-5
SLIDE 5

Image credit: https://www.cityscapes-dataset.com/

To teach/train CNNs to segment images and videos

About 1.5 hrs to label one such image!

Cityscapes: 30k images captured from 50 cities

Only 5k are well labeled thus far

slide-6
SLIDE 6

Labeling-free training data by simulation

Image credit: http://synthia-dataset.net/

slide-7
SLIDE 7

Simulation to real world: catastrophic performance drop

[Bar chart, axis 15 to 60: Simulation→Simulation 60, Simulation→Cityscapes 22]

slide-8
SLIDE 8

Cause: standard assumption in machine learning

Same underlying distribution for training and testing

Consequence:

Poor cross-domain generalization

Brittle systems in dynamic and changing environments

The perils of mismatched domains


slide-9
SLIDE 9

Synthetic imagery → Real photos

The perils of mismatched domains

[Zhang et al., ICCV’17]

slide-10
SLIDE 10

The perils of mismatched domains

[Jamal et al., CVPR’18]

Adapting face detector to a user’s album

slide-11
SLIDE 11

The perils of mismatched domains

Attribute detection

Middle-level concepts describing objects, faces, etc.

Shared by different categories

[Gan et al., CVPR’17]

slide-12
SLIDE 12

The perils of mismatched domains

Personalization of video summarizers

[Figure: example query concepts: Car, Children, Drink, Flowers, Street, Area, Food, Water]

(a) Input: Video & Query

(b) Algorithm: Sequential & Hierarchical Determinantal Point Process (SH-DPP)

(c) Output: Summary

Important & diverse shots → Query-relevant, important, & diverse shots

[Sharghi et al., ECCV’16, CVPR’17, ECCV’18]

slide-13
SLIDE 13

The perils of mismatched domains

Webly supervised learning

[Gan et al., ECCV’16, CVPR’18] [Ding et al., WACV’18]

slide-14
SLIDE 14

Setup

Source domain (with labeled data)

Target domain (no labels for training)

Objective

Learn models to work well on target

Abstract form: unsupervised domain adaptation (DA)

Different distributions


slide-15
SLIDE 15

Existing methods

Correcting sampling bias

[Shimodaira, ’00] [Huang et al., Bickel et al., ’07] [Sugiyama et al., ’08] [Sethy et al., ’06] [Sethy et al., ’09]

Adjusting mismatched models

[Evgeniou and Pontil, ’05] [Duan et al., ’09] [Duan et al., Daumé III et al., Saenko et al., ’10] [Kulis et al., Chen et al., ’11]

Inferring domain-invariant features

[Pan et al., ’09] [Blitzer et al., ’06] [Gopalan et al., ’11] [Chen et al., ’12] [Daumé III, ’07] [Argyriou et al, ’08] [Gong et al., ’12] [Muandet et al., ’13]
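
To make the first family above concrete, here is a minimal sketch of correcting sampling bias by importance weighting: each source sample is reweighted by w(x) = p_target(x) / p_source(x), so the weighted source loss estimates the target loss. The two toy distributions, the per-sample losses, and the function names are assumptions made for illustration; in practice the density ratio is unknown and has to be estimated, which is what the cited methods address.

```python
def importance_weights(xs, p_source, p_target):
    """w(x) = p_target(x) / p_source(x) for each source sample x."""
    return [p_target[x] / p_source[x] for x in xs]


def weighted_risk(losses, weights):
    """Importance-weighted average of per-sample source losses."""
    return sum(w * l for w, l in zip(weights, losses)) / len(losses)


# Made-up toy domains over two discrete "scene types".
p_source = {"sunny": 0.8, "rainy": 0.2}
p_target = {"sunny": 0.5, "rainy": 0.5}

# Five source samples drawn in proportion to p_source,
# with hypothetical per-sample losses of some fixed model.
source_x = ["sunny", "sunny", "sunny", "sunny", "rainy"]
source_loss = [0.1, 0.1, 0.1, 0.1, 0.9]

w = importance_weights(source_x, p_source, p_target)
plain = sum(source_loss) / len(source_loss)   # unweighted source risk
reweighted = weighted_risk(source_loss, w)    # estimate of the target risk
print(plain, reweighted)
```

In this toy case the unweighted source risk understates the error on the rainy scenes that dominate the target, while the reweighted risk matches the target expectation 0.5 * 0.1 + 0.5 * 0.9.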

slide-16
SLIDE 16

[Figure panels: Image, Baseline, Ours, Ground truth]

slide-17
SLIDE 17

Let a teacher model hint the segmentation net (student)

Input: An urban scene image

Algorithm: Logistic regression

Output: Label distributions

[Bar chart: label distribution over Sky, Road, Pedestrian, Traffic Sign, Tree; axis 0% to 40%]
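
A minimal sketch of what a "label distribution" is, assuming it means the fraction of an image's pixels that belong to each class. The helper name and the toy label map are hypothetical; the teacher on this slide predicts such a distribution from the image, whereas the code below only computes it from a known label map.

```python
from collections import Counter


def label_distribution(label_map, num_classes):
    """Fraction of pixels assigned to each class in one image."""
    counts = Counter(label for row in label_map for label in row)
    total = sum(counts.values())
    return [counts.get(k, 0) / total for k in range(num_classes)]


# Toy 2x4 label map. Class ids 0..4 stand in for five classes such as
# Sky, Road, Pedestrian, Traffic Sign, Tree (the id order is an assumption).
toy = [
    [0, 0, 0, 4],
    [1, 1, 1, 4],
]

dist = label_distribution(toy, num_classes=5)
print(dist)
```

Because the entries sum to one, such a vector can be compared with the distribution implied by a network's predictions on a target image.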

slide-18
SLIDE 18

Let a second teacher model hint the segmentation net (student)

Input: An urban scene image

Algorithm: Super-pixel + Logistic regression

Output: Labels of some super-pixels (e.g., Road, Sidewalk)

slide-19
SLIDE 19

Curriculum domain adaptation for training CNNs [ICCV’17]

$$\min_{\Theta} \; L(Y_s, \widehat{Y}_s) + d\big(p_t, \, p_t(\widehat{Y}_t)\big)$$

s : Source, t : Target

[Bar chart: label distribution over Sky, Road, Pedestrian, Traffic Sign, Tree; axis 0% to 40%]
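
The objective on this slide can be sketched numerically as follows. This is only an illustration under stated assumptions: cross-entropy is used both for the source loss L and as a stand-in for the distance d, the distribution implied by the student's target predictions is taken to be the mean of its per-pixel class probabilities, and the `weight` parameter (default 1.0, which matches the formula as written) is hypothetical.

```python
import math


def cross_entropy(p, q, eps=1e-12):
    """H(p, q) = -sum_k p[k] * log(q[k]); eps guards against log(0)."""
    return -sum(pk * math.log(qk + eps) for pk, qk in zip(p, q))


def one_hot(k, n):
    return [1.0 if i == k else 0.0 for i in range(n)]


def mean_prediction(pixel_probs):
    """Average per-pixel class probabilities into one image-level distribution."""
    n = len(pixel_probs)
    return [sum(col) / n for col in zip(*pixel_probs)]


def curriculum_objective(src_probs, src_labels, tgt_probs, tgt_label_dist, weight=1.0):
    num_classes = len(tgt_label_dist)
    # Source term: mean pixel-wise cross-entropy against ground-truth labels.
    seg_loss = sum(
        cross_entropy(one_hot(y, num_classes), p)
        for y, p in zip(src_labels, src_probs)
    ) / len(src_labels)
    # Target term: distance between the teacher's label distribution and
    # the distribution implied by the student's target predictions.
    dist_loss = cross_entropy(tgt_label_dist, mean_prediction(tgt_probs))
    return seg_loss + weight * dist_loss


# Toy two-class example with two pixels per image (all numbers made up).
src_probs = [[0.9, 0.1], [0.2, 0.8]]
src_labels = [0, 1]
teacher_dist = [0.5, 0.5]

matching = curriculum_objective(src_probs, src_labels, [[0.6, 0.4], [0.4, 0.6]], teacher_dist)
mismatched = curriculum_objective(src_probs, src_labels, [[0.9, 0.1], [0.9, 0.1]], teacher_dist)
print(matching, mismatched)
```

The target term needs no target labels: it only penalizes predictions whose class proportions disagree with the teacher's estimate, so `mismatched` comes out larger than `matching`.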

slide-20
SLIDE 20

Curriculum domain adaptation

[Diagram with parts A, B, C: label-distribution bar chart (Sky, Road, Pedestrian, Traffic Sign, Tree; axis 0% to 40%) and super-pixel labels (Road, Sidewalk)]

slide-21
SLIDE 21

Cityscapes: Train/val/test: 2993/503/1531

slide-22
SLIDE 22

GTA: 24,996 images from the video game

slide-23
SLIDE 23

SYNTHIA: 9,400 images

slide-24
SLIDE 24

Simulation to real world: catastrophic performance drop

[Bar chart, axis 15 to 60: Simulation→Sim 60, Sim→Cityscapes 22, Adaptation 31]

[Zhang et al., ICCV’17]

slide-25
SLIDE 25

Recent progress

[Bar chart, axis 15 to 60. Bar labels: Ours, Ours 2018, FCAN, Semi-DA, Real2Real. Bar values shown: 58, 53, 47, 41, 31]

slide-26
SLIDE 26

Domain-invariant features

Importance sampling of data

Adapt background models, etc.

Curriculum domain adaptation

Style transfer, etc.

Simulation to reality for segmentation, detection, dynamics planning & control, etc.

Domain adaptation: key to using simulation “for real”

slide-28
SLIDE 28

Domain adaptation → domain generalization

[Diagram: seen domains 1, 2, …, C, where domain c provides labeled training pairs (x, a)_{m_c}, m_c = 1, 2, …; unseen domains C+1, C+2, …; test samples (x, ?)_n, n = 1, 2, …]

Training data sampled from C related domains

Test data from both seen & unseen domains
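
The data protocol on this slide can be sketched as below: training pairs (x, a) come only from the C seen domains, while test inputs (x, ?) come from both seen and unseen domains with the label withheld. The function name, the triple layout, and the `holdout_every` rule for reserving seen-domain test samples are assumptions made for the illustration.

```python
def make_splits(samples, seen_domains, holdout_every=4):
    """Split (domain, x, a) triples into labeled training data and unlabeled test data.

    Training data comes only from seen domains. Test data mixes held-out
    samples from seen domains with every sample from unseen domains.
    """
    train, test = [], []
    per_domain_count = {}
    for domain, x, a in samples:
        i = per_domain_count.get(domain, 0)
        per_domain_count[domain] = i + 1
        if domain in seen_domains and i % holdout_every != 0:
            train.append((domain, x, a))
        else:
            test.append((domain, x))  # label a is withheld: (x, ?)
    return train, test


# Toy data: domains 1 and 2 are seen (C = 2), domain 3 is unseen.
samples = (
    [(1, f"x1_{i}", "a") for i in range(4)]
    + [(2, f"x2_{i}", "a") for i in range(4)]
    + [(3, f"x3_{i}", "a") for i in range(2)]
)

train, test = make_splits(samples, seen_domains={1, 2})
print(len(train), len(test))
```

Unlike the unsupervised adaptation setup, no sample from domain 3 is available during training, not even unlabeled.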

slide-29
SLIDE 29

Simulation for domain generalization

[Diagram: grid of M scenes by N tasks, with seen and unseen combinations; Setting 3]

Synthesize Policy for Transfer and Adaptation across Environments and Tasks

[NIPS’18, Spotlight]

slide-30
SLIDE 30

What to simulate?

Rare events

slide-31
SLIDE 31

What to simulate? Active Simulation

[Diagram: Simulator and Reality in a loop, labeled “More data, better model” and “Actively tune simulator”]

[Proof-of-concept paper submitted]

slide-32
SLIDE 32

Thank you!