

SLIDE 1

Research Conference 2018

Environment-agnostic Multitask Learning for Natural Language Grounded Navigation

Xin (Eric) Wang*, Vihan Jain*, Eugene Ie, William Yang Wang, Zornitsa Kozareva, Sujith Ravi

SLIDE 2

Natural Language Grounded Navigation

Command embodied agents to navigate the 3D world with natural language, such as coarse-/fine-grained instructions, questions, and dialog

Person: Can you grab the plant for me?

Agent: Sure. Where is it?

Person: Gotcha. Get out of the room and go towards the kitchen. The plant is on the window near the kitchen.

SLIDE 3

Vision-and-Language Navigation (VLN)

  • Given a fine-grained instruction and a starting location
  • The agent must reach the target location by following the natural language instruction
  • Room-to-Room (R2R) Dataset

Anderson et al., CVPR 2018

SLIDE 4

Cooperative Vision-and-Dialog Navigation (CVDN)

  • Both the Navigator and the Oracle are given a hint (e.g., the goal room contains a mat)
  • Navigator: goes towards the goal room and can stop anytime to ask a question
  • Oracle: foresees the next best steps and answers the questions

Thomason et al., CoRL 2019

SLIDE 5

Sub-task: Navigation from Dialog History (NDH)

  • Given the dialog history, predict the navigation actions that bring the agent closer to the goal room

Thomason et al., CoRL 2019

SLIDE 6

Challenge


SLIDE 7

Poor Generalization Issue

  • Navigation models tend to overfit seen environments and perform poorly on unseen environments

[Figure: evaluation environments are identical to the training environments (seen) or different from them (unseen)]

SLIDE 8

Data Scarcity Is a Big Problem

  • Real-world experiments are NOT scalable
  • Data collection is prohibitively expensive and time-consuming
  • Models break under distribution shift


SLIDE 9

Environment-agnostic Multitask Navigation


SLIDE 10
Towards Generalizable Navigation

  • Multitask learning: transfer knowledge across tasks
  • Environment-agnostic learning: learn invariant representations that generalize better to unseen environments

SLIDE 11

A Strong Baseline for VLN: RCM

[Architecture: VLN instruction → word embedding → language encoder; paired demo path → panoramic features → trajectory encoder; cross-modal attention (CM-ATT) → action predictor]

Wang et al., CVPR 2019


Leave the living room. Go through the hallway with paintings on the wall and head to the kitchen. Stop next to the wooden dining table.

SLIDE 12

Multitask RCM

[Architecture: VLN instructions and NDH dialogs share a joint word embedding, language encoder, trajectory encoder over panoramic features, CM-ATT, and action predictor; both tasks supply paired demo paths]

Interleaved Multitask Data Sampling
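The interleaved sampling can be sketched as alternating minibatches drawn from the two task datasets, so each optimization step updates the shared model on one task at a time. This is an illustrative sketch; the batch size, shuffling, and sampling ratio here are assumptions, not the paper's exact pipeline:

```python
import itertools
import random

def interleave_batches(vln_data, ndh_data, batch_size=2, seed=0):
    """Yield (task, minibatch) pairs alternating between VLN and NDH.

    zip_longest keeps draining the larger dataset after the smaller
    one is exhausted, so no data is dropped."""
    rng = random.Random(seed)
    vln, ndh = vln_data[:], ndh_data[:]
    rng.shuffle(vln)
    rng.shuffle(ndh)
    vln_iter = (vln[i:i + batch_size] for i in range(0, len(vln), batch_size))
    ndh_iter = (ndh[i:i + batch_size] for i in range(0, len(ndh), batch_size))
    for vln_batch, ndh_batch in itertools.zip_longest(vln_iter, ndh_iter):
        if vln_batch:
            yield ("VLN", vln_batch)
        if ndh_batch:
            yield ("NDH", ndh_batch)

# With equally sized datasets the task schedule strictly alternates:
schedule = [task for task, _ in
            interleave_batches(list(range(4)), list(range(4)))]
# schedule == ["VLN", "NDH", "VLN", "NDH"]
```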

SLIDE 13

Multitask Reinforcement Learning

  • Reward shaping:
      ○ VLN: distance to the goal location
      ○ NDH: distance to the goal room
  • Navigation loss: reinforcement learning + supervised learning
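The two ingredients above can be sketched in a few lines: a distance-based shaped reward (reduction in distance to the goal location for VLN, or the goal room for NDH), and a navigation loss mixing an RL term with a supervised term. The mixing weight is an illustrative assumption, not the paper's hyperparameter:

```python
def shaped_rewards(dist_to_target):
    """Per-step shaped reward = reduction in distance to the target:
    positive when the agent moves closer, negative when it moves away."""
    return [dist_to_target[t] - dist_to_target[t + 1]
            for t in range(len(dist_to_target) - 1)]

def navigation_loss(rl_loss, supervised_loss, rl_weight=0.5):
    """Mix the policy-gradient (RL) term with the teacher-forcing
    (supervised) term; rl_weight is an assumed hyperparameter."""
    return rl_weight * rl_loss + (1.0 - rl_weight) * supervised_loss

# An agent that moves 10 -> 8 -> 5 -> 5.5 units from the goal:
rewards = shaped_rewards([10.0, 8.0, 5.0, 5.5])
# rewards == [2.0, 3.0, -0.5]  (the backtracking step is penalized)
```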
SLIDE 14
Effect of Multitask RL

  • NDH benefits from VLN
  • VLN benefits from NDH, which carries more fine-grained information about paths
      ○ Extending visual paths alone is NOT helpful
  • Multitask RL improves generalization: the seen-unseen gap is narrowed

SLIDE 15

Effect of Multitask RL


Multitask learning benefits from

  • More occurrences of underrepresented words
  • Shared semantic encoding of whole sentences

SLIDE 16

Environment-agnostic Representation Learning

[Architecture: NDH dialog or VLN instruction → word embedding → language encoder; trajectory encoder → CM-ATT → action predictor; the trajectory representation also feeds, through a gradient reversal layer, an environment classifier predicting the house label y]

  • An environment classifier is trained to predict the environment identity; the gradient reversal layer flips its gradient so the shared encoder learns environment-invariant representations
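The gradient reversal layer named on the slide is identity in the forward pass and negates (and scales) the gradient in the backward pass: the classifier learns to identify the house, while the reversed gradient trains the shared encoder to make that identification hard. A minimal framework-free sketch; `lam` is an assumed reversal-strength hyperparameter:

```python
class GradientReversal:
    """Identity forward; gradient scaled by -lam backward, so the encoder
    ascends the environment classifier's loss and becomes env-agnostic."""

    def __init__(self, lam=1.0):  # lam: illustrative reversal strength
        self.lam = lam

    def forward(self, features):
        return list(features)  # features pass through unchanged

    def backward(self, grad_from_classifier):
        # flip the sign of the incoming gradient before it reaches the encoder
        return [-self.lam * g for g in grad_from_classifier]

grl = GradientReversal(lam=0.5)
out = grl.forward([1.0, 2.0])     # [1.0, 2.0] -- unchanged
grad = grl.backward([0.4, -0.2])  # [-0.2, 0.1] -- reversed and scaled
```

In an autograd framework this is typically implemented as a custom op with these exact forward/backward rules, placed between the trajectory encoder and the environment classifier.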
SLIDE 17

Environment-Aware versus Environment-Agnostic

                 NDH (Progress)        VLN (Success Rate)
                 Seen     Unseen       Seen     Unseen
  RCM            6.49     2.64         52.39    42.93
  EnvAware       8.38     1.81         57.59    38.83
  EnvAgnostic    6.07     3.15         52.79    44.4

  • Env-aware learning tends to overfit seen environments
  • Env-agnostic learning generalizes better on unseen environments
  • (Potential) Combining env-aware and env-agnostic learning via meta-learning may get the best of both worlds
SLIDE 18

Environment-Aware versus Environment-Agnostic

[Figure: learned representations of seen and unseen environments under EnvAware vs. EnvAgnostic learning]

SLIDE 19

Environment-agnostic Multitask Learning Framework


SLIDE 20

Effect of Environment-agnostic Multitask Learning


SLIDE 21

Ranking 1st on CVDN Leaderboard


https://evalai.cloudcv.org/web/challenges/challenge-page/463/leaderboard/1292

SLIDE 22

Future Work


SLIDE 23

Generalized Navigation on Street View

TouchDown (Chen et al., 2019), StreetLearn (Mirowski et al., 2018), TalkTheWalk (de Vries et al., 2018)

SLIDE 24

Thanks!

Paper: https://arxiv.org/abs/2003.00443
Code: https://github.com/google-research/valan