
Stay on the Path: Instruction Fidelity in Vision-and-Language Navigation



  1. Stay on the Path: Instruction Fidelity in Vision-and-Language Navigation. Vihan Jain*, Gabriel Magalhaes*, Alexander Ku*, Ashish Vaswani, Eugene Ie, Jason Baldridge (Google Research). *Equal contribution. ACL, Florence, 29 July 2019.

  2. Vision-and-Language Navigation (VLN) ● Language ● Perception ● Planning ● Action

  3. Vision-and-Language Navigation (VLN): an example from the Room-to-Room (R2R) dataset [1]. [1] Anderson et al. Vision-and-Language Navigation: Interpreting Visually-Grounded Navigation Instructions in Real Environments, CVPR 2018.

  4. Key Contributions ● Data. Example instructions: "Make a left down at the narrow hall... Go out the door and wait." "Turn around and enter the bedroom... Walk into the doorway and stop."

  5. Key Contributions ● Data ● Evaluation

  6. Key Contributions ● Data ● Evaluation ● Agent training. [Figure: the agent-environment RL loop; the agent takes action a_t and receives reward r_t, here derived from CLS.]

  7. R2R → R4R. R4R is built by joining pairs of R2R paths: a path (a_1, ..., a_n) is extended with a path (b_1, ..., b_m) whenever the first ends near where the second starts, i.e. d(a_n, b_1) < d_th, and the corresponding instructions are concatenated: "Make a left down at the narrow hall... Go out the door and wait. Turn around and enter the bedroom... Walk into the doorway and stop." R2R-to-R4R code is at https://github.com/google-research/google-research/tree/master/r4r (a sketch of the joining rule follows below).
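A minimal sketch of the joining rule, assuming each episode is a dict with a "path" (list of graph nodes) and an "instruction" (string), and that `dist` and `d_th` come from the environment; these names are illustrative, and the released r4r code is the reference implementation:

```python
def join_episodes(episodes, dist, d_th=3.0):
    """Concatenate ordered pairs of R2R episodes whose paths connect.

    `dist` is the environment's node-to-node distance function and `d_th`
    the joining threshold (3.0 m mirrors the usual R2R success radius;
    an assumption here).
    """
    joined = []
    for first in episodes:
        for second in episodes:
            if first is second:
                continue
            # Join only when the first path ends near the second path's start.
            if dist(first["path"][-1], second["path"][0]) < d_th:
                joined.append({
                    "path": first["path"] + second["path"],
                    "instruction": first["instruction"] + " " + second["instruction"],
                })
    return joined
```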

  8. R2R vs. R4R

  9. VLN Evaluation: Success Rate (SR). An episode counts as a success iff the agent stops near the end of the reference path: success = 1[d(p_5, r_5) < d_th]. [Figure: reference path r_1, ..., r_5 and agent path p_1, ..., p_5 with shared start r_1 = p_1.]

  10. VLN Evaluation: Success Rate (SR). [Figure: an agent path that wanders far from the reference path but still ends near r_5 receives success = 1.]
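As a minimal sketch (reusing the illustrative `dist` and `d_th` from the R4R sketch above), SR inspects only the endpoints, which is exactly why a wandering path can still score a full success:

```python
def success(agent_path, ref_path, dist, d_th=3.0):
    # Success Rate looks only at the final positions: any detour that
    # still stops within d_th of the goal node counts as a success.
    return float(dist(agent_path[-1], ref_path[-1]) < d_th)
```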

  11. VLN Evaluation: SPL, Success weighted by Path Length [1]. Success is discounted by path efficiency: in the figure the agent succeeds but travels 10 units where the shortest path is 4, so spl = 4/10 = 0.4. [1] Anderson et al. On Evaluation of Embodied Navigation Agents, arXiv 2018.

  12. VLN Evaluation: SPL. [Figure: two agent paths of equal, optimal length both receive spl = 1, even though only one of them follows the reference path.] [1] Anderson et al. On Evaluation of Embodied Navigation Agents, arXiv 2018.
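A per-episode sketch of SPL under these definitions (benchmarks average it over episodes):

```python
def spl(success, shortest_len, agent_len):
    # SPL = S * l / max(p, l): success discounted by path efficiency.
    # It compares only lengths, so an optimal-length route that reaches
    # the goal the "wrong way" still scores 1, as in the figure above.
    return success * shortest_len / max(agent_len, shortest_len)
```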

  13. VLN Evaluation: SED, Success weighted by Edit Distance [1]. Success is discounted by the normalized edit distance between the predicted and reference action sequences: agent path 1 matches the reference exactly (sed = 1 - 0 = 1), while agent path 2 requires 3 edits out of 4 actions (sed = 1 - 3/4 = 0.25). [1] Chen et al. Touchdown: Natural Language Navigation and Spatial Reasoning in Visual Street Environments, CVPR 2019.

  14. VLN Evaluation: SED. [Figure: two paths that closely shadow the reference without sharing its edges both receive SED = 0.]
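A sketch of SED over discrete action sequences; the normalization by the longer sequence length is an assumption consistent with the 1 - 3/4 example above (see Chen et al. for the exact definition):

```python
def edit_distance(a, b):
    # Levenshtein distance via dynamic programming.
    dp = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i in range(len(a) + 1):
        dp[i][0] = i
    for j in range(len(b) + 1):
        dp[0][j] = j
    for i in range(1, len(a) + 1):
        for j in range(1, len(b) + 1):
            dp[i][j] = min(dp[i - 1][j] + 1,                           # deletion
                           dp[i][j - 1] + 1,                           # insertion
                           dp[i - 1][j - 1] + (a[i - 1] != b[j - 1]))  # substitution
    return dp[-1][-1]

def sed(success, agent_actions, ref_actions):
    # Success weighted by normalized edit distance: near-miss paths that
    # never share the reference's edges still collapse to 0 on failure.
    norm = max(len(agent_actions), len(ref_actions))
    return success * (1 - edit_distance(agent_actions, ref_actions) / norm)
```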

  15. CLS: New VLN Evaluation Metric ● Coverage weighted by Length Score (CLS): the product of Path Coverage (PC) and Length Score (LS), CLS(P, R) = PC(P, R) · LS(P, R), where R is the reference path and P is the agent's predicted path.

  16. CLS: New VLN Evaluation Metric ● Path Coverage (PC): the average coverage of each node in the reference path with respect to the predicted path, PC(P, R) = (1/|R|) Σ_{r ∈ R} exp(−d(r, P) / d_th), where d(r, P) is the distance from node r to the nearest node of P. [Figure: distances d_1, d_2, d_3 from reference nodes to the agent's predicted path.]

  17. CLS: New VLN Evaluation Metric ● Expected optimal path length (EPL) is a function of path coverage: EPL = PC(P, R) · PL(R), where PL denotes path length. ● Length Score (LS) compares the length of the predicted path P to EPL: LS(P, R) = EPL / (EPL + |EPL − PL(P)|).
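Putting the three definitions together, a compact sketch of CLS (illustrative node and `dist` representation as in the earlier sketches; the released r4r code contains the reference implementation):

```python
import math

def path_length(path, dist):
    # PL(path): sum of distances along consecutive nodes.
    return sum(dist(u, v) for u, v in zip(path, path[1:]))

def cls_score(pred, ref, dist, d_th=3.0):
    """CLS(P, R) = PC(P, R) * LS(P, R), following the slide definitions."""
    # Path Coverage: average soft coverage of each reference node by the
    # predicted path, with an exponential falloff in d(r, P).
    pc = sum(math.exp(-min(dist(r, p) for p in pred) / d_th)
             for r in ref) / len(ref)
    # Expected optimal path length, and the Length Score comparing the
    # predicted path's length against it.
    epl = pc * path_length(ref, dist)
    if epl == 0.0:
        return 0.0
    ls = epl / (epl + abs(epl - path_length(pred, dist)))
    return pc * ls
```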

  18. CLS: Desirable Properties
  ● Path Similarity Measure: PC measures how well the predicted path covers the nodes of the reference path.
  ● Soft Penalties: both PC and LS are continuous measures.
  ● Unique Optimum: a predicted path achieves the maximum score if and only if it is equal to the reference path.
  ● Scale Invariance: both PC and LS are invariant thanks to the graph-invariant constant d_th.
  ● Tractability: computation time is O(|P|·|R|) for PC and O(|P|+|R|) for LS.

  19. Training VLN Agents ● Architecture similar to the RCM model [1]: a language encoder over instruction tokens x_1, ..., x_n and per-step visual encoders over scenes v_1, v_2, v_3, ..., decoding actions a_1, a_2, a_3, ... [1] Wang et al. Reinforced Cross-Modal Matching and Self-Supervised Imitation Learning for Vision-Language Navigation, CoRR 2018.
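A heavily simplified PyTorch sketch of this encoder-decoder shape, with hypothetical dimensions and layer choices (a toy stand-in, not the RCM implementation):

```python
import torch
import torch.nn as nn

class ToyVLNAgent(nn.Module):
    """Instruction encoder + recurrent action decoder with attention."""

    def __init__(self, vocab_size, num_actions, d=256):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d)
        self.encoder = nn.LSTM(d, d, batch_first=True)
        self.decoder = nn.LSTMCell(2 * d, d)  # input: visual + attended text
        self.attn = nn.MultiheadAttention(d, num_heads=1, batch_first=True)
        self.vis_proj = nn.Linear(2048, d)    # e.g. CNN features -> d
        self.act_head = nn.Linear(d, num_actions)

    def forward(self, tokens, visual_feats):
        # tokens: (B, n) instruction ids; visual_feats: (B, T, 2048).
        text, _ = self.encoder(self.embed(tokens))           # (B, n, d)
        B, T, _ = visual_feats.shape
        h = text.new_zeros(B, text.size(-1))
        c = torch.zeros_like(h)
        logits = []
        for t in range(T):
            v = self.vis_proj(visual_feats[:, t])            # (B, d)
            # Attend over the instruction with the current decoder state.
            ctx, _ = self.attn(h.unsqueeze(1), text, text)   # (B, 1, d)
            h, c = self.decoder(torch.cat([v, ctx.squeeze(1)], -1), (h, c))
            logits.append(self.act_head(h))                  # scores for a_t
        return torch.stack(logits, dim=1)                    # (B, T, actions)
```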

  20. Training VLN Agents ● Goal-oriented agents are encouraged to pursue the goal node only: the immediate reward after taking action a_t at time step t in an episode of length T is zero until the final step, where r_T = 1 on success and r_T = 0 otherwise.

  21. Training VLN Agents ● Fidelity-oriented agents must reach the goal node and conform to the reference path R: the terminal reward is based on CLS, so a trajectory that ignores the instructions earns CLS ~ 0 while one that follows them earns CLS ~ 1 (a sketch of both reward schemes follows below).
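A schematic contrast of the two terminal reward schemes as drawn on these slides (reusing the illustrative `dist`/`d_th` and the `cls_score` sketch above; any shaping terms used in training are omitted here):

```python
def goal_reward(t, T, agent_path, ref_path, dist, d_th=3.0):
    # Goal-oriented: sparse terminal reward, 1 iff the episode stops
    # within d_th of the goal node (the Success Rate criterion).
    if t < T:
        return 0.0
    return float(dist(agent_path[-1], ref_path[-1]) < d_th)

def fidelity_reward(t, T, agent_path, ref_path, dist, d_th=3.0):
    # Fidelity-oriented: the terminal reward depends on how well the
    # whole trajectory conforms to the reference path, via CLS.
    if t < T:
        return 0.0
    return cls_score(agent_path, ref_path, dist, d_th)
```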

  22. R2R Performance ● Fidelity-oriented agents perform slightly better on SPL and CLS ● SPL appears consistent with CLS (results on the Validation Unseen split).

  23. R2R Performance ● Ablation studies: an agent optimized to reach the goal may incidentally appear to be conforming to the instructions (results on the Validation Unseen split).

  24. R4R Performance ● Fidelity-oriented agents outperform goal-oriented agents (results on the Validation Unseen split).

  25. R4R Performance ● Ablation studies: fidelity-oriented agents attend more carefully to the instructions (results on the Validation Unseen split).

  26. Recent Work ● Effective and General Evaluation for Instruction Conditioned Navigation using Dynamic Time Warping: https://arxiv.org/abs/1907.05446 ● A suite of DTW-based [1] evaluation metrics for general instruction-conditioned robotic tasks, including VLN. [1] Berndt and Clifford. Using Dynamic Time Warping to Find Patterns in Time Series, AAAI Workshop on Knowledge Discovery in Databases (AAAIWS), 1994.
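For context, a minimal sketch of a DTW-based path metric in that spirit; the exp(−DTW/(|R|·d_th)) normalization shown here is this transcript's assumption, so see the linked paper for the exact nDTW definition:

```python
import math

def dtw(pred, ref, dist):
    # Classic dynamic-time-warping alignment cost between two paths.
    INF = float("inf")
    dp = [[INF] * (len(ref) + 1) for _ in range(len(pred) + 1)]
    dp[0][0] = 0.0
    for i in range(1, len(pred) + 1):
        for j in range(1, len(ref) + 1):
            cost = dist(pred[i - 1], ref[j - 1])
            dp[i][j] = cost + min(dp[i - 1][j], dp[i][j - 1], dp[i - 1][j - 1])
    return dp[-1][-1]

def ndtw(pred, ref, dist, d_th=3.0):
    # Normalized DTW: maps the alignment cost into (0, 1]; higher is better.
    return math.exp(-dtw(pred, ref, dist) / (len(ref) * d_th))
```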

  27. Conclusion ● Data: R4R ● Evaluation: CLS ● Agent training: fidelity-oriented agents.

  28. Thank You! Questions?
