A Fast and Accurate One-Stage Approach to Visual Grounding

Authors: Zhengyuan Yang, Boqing Gong, Liwei Wang, Wenbing Huang, Dong Yu, Jiebo Luo
Presenter: Tianlang Chen

Visual grounding: grounding a language query onto a region of the image.


  1. A Fast and Accurate One-Stage Approach to Visual Grounding
     Zhengyuan Yang, Boqing Gong, Liwei Wang, Wenbing Huang, Dong Yu, Jiebo Luo
     Presenter: Tianlang Chen

  2. Visual grounding
     • Grounding a language query onto a region of the image
       – Phrase localization
       – Referring expression comprehension
     • Example query: "bottom right grass"

  3. Existing framework
     • Two-stage framework
     • Example query: "center building"

  4. Existing framework
     • Performance is capped by the quality of the region candidates
     • Slow at inference time
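
The "capped by the region candidates" point can be illustrated with a minimal sketch. It assumes the common convention that a prediction counts as correct when its IoU with the ground-truth box is at least 0.5; the slides do not state a threshold. The function names and toy boxes are hypothetical, not taken from the authors' code.

```python
def iou(box_a, box_b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def candidate_upper_bound(samples, threshold=0.5):
    """Fraction of samples where at least one candidate covers the target.

    samples: list of (ground_truth_box, [candidate_boxes]) pairs.
    Even a perfect second-stage ranker cannot exceed this fraction,
    because it can only pick among the given candidates.
    """
    if not samples:
        return 0.0
    hits = sum(
        1 for gt, candidates in samples
        if any(iou(gt, c) >= threshold for c in candidates)
    )
    return hits / len(samples)


# Toy data, made up for illustration only.
samples = [
    ((0, 0, 10, 10), [(0, 0, 9, 10), (20, 20, 30, 30)]),  # one candidate matches
    ((0, 0, 10, 10), [(8, 8, 20, 20)]),                   # no candidate matches
]
print(candidate_upper_bound(samples))
```

In this toy case only one of the two targets is covered by any candidate, so no ranking stage, however good, could ground the second query correctly.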

  5. One-stage visual grounding
     • One-stage approach
     • Generally applicable to the sub-tasks of grounding

  6. Why one-stage visual grounding
     • No region candidates -> 7 to 20% higher accuracy
     • One-stage -> 10x faster

  7. Architecture overview
     • Encoder
     • Fusion module
     • Grounding module

  8. Architecture: encoder
     • Visual encoder: DarkNet53+FPN
     • Language encoder: BERT, LSTM, FV
     • Spatial encoder: for location-related queries
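
One way a spatial encoder could expose location information is a per-cell map of normalized coordinates. The sketch below is an assumption for illustration: the slides only say the spatial encoder targets location-related queries, and the eight-value layout (cell corners, cell center, cell size) and the name `coord_features` are hypothetical.

```python
def coord_features(width, height):
    """Return a height x width grid of normalized coordinate features.

    Each cell holds [x_min, y_min, x_center, y_center, x_max, y_max,
    1/width, 1/height], all expressed as fractions of the image size.
    The layout is an assumption, not taken from the slides.
    """
    grid = []
    for j in range(height):
        row = []
        for i in range(width):
            row.append([
                i / width, j / height,
                (i + 0.5) / width, (j + 0.5) / height,
                (i + 1) / width, (j + 1) / height,
                1 / width, 1 / height,
            ])
        grid.append(row)
    return grid


feats = coord_features(4, 2)
print(feats[1][3])  # bottom-right cell of a 4 x 2 grid
```

Features of this kind give every grid cell an explicit notion of where it sits, which is what a query such as "bottom right grass" needs.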

  9. Architecture: fusion module
     • Image-level fusion
       – Multiple resolutions
       – Three parts of input features
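
Image-level fusion at multiple resolutions can be sketched as follows. The sketch assumes the three input parts are the visual, language, and spatial features listed on the encoder slide, and that they are combined by per-cell concatenation; the slides confirm neither detail. The helpers `fuse` and `toy_map`, and all sizes and values, are hypothetical.

```python
def fuse(visual_map, text_embedding, spatial_map):
    """Concatenate visual, text, and spatial features at every grid cell.

    visual_map and spatial_map are height x width grids of feature lists;
    text_embedding is one list that is repeated at each cell.
    """
    fused = []
    for vis_row, spa_row in zip(visual_map, spatial_map):
        fused.append([vis + text_embedding + spa
                      for vis, spa in zip(vis_row, spa_row)])
    return fused


def toy_map(height, width, channels, value):
    """Build a constant height x width x channels feature grid."""
    return [[[value] * channels for _ in range(width)] for _ in range(height)]


text = [0.1, 0.2, 0.3]          # stand-in for a query embedding
fused_levels = []
for size in (2, 4):             # two toy resolutions
    visual = toy_map(size, size, 4, 1.0)
    spatial = toy_map(size, size, 2, 0.5)
    fused_levels.append(fuse(visual, text, spatial))

for level in fused_levels:
    print(len(level), len(level[0]), len(level[0][0]))
```

Each level keeps its own grid size while every cell carries the same query embedding, so the grounding module can score each location against the whole query.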

  10. Architecture: grounding module
      • Output format: box + confidence
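
With a "box + confidence" output, one plausible way to obtain the final region is to take the box with the highest confidence. This is a minimal sketch under that assumption; the slides do not describe the selection rule, and `best_box` and the toy predictions are hypothetical.

```python
def best_box(predictions):
    """Return the (box, confidence) pair with the highest confidence.

    predictions: non-empty iterable of ((x1, y1, x2, y2), confidence).
    """
    return max(predictions, key=lambda p: p[1])


# Toy predictions, made up for illustration only.
predictions = [
    ((10, 10, 50, 50), 0.12),
    ((200, 180, 320, 300), 0.91),
    ((40, 60, 90, 120), 0.35),
]
box, confidence = best_box(predictions)
print(box, confidence)
```

Because a single box is read off directly, no separate candidate-generation or ranking stage is involved in this sketch.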

  11. Datasets
      • Phrase localization: Flickr 30K Entities
      • Referring expression comprehension: ReferItGame
      • Example query: "the black backpack on the bottom right"

  12. Comparison to other methods

  13. Qualitative results
      • Figure legend: ground truth (gt), two-stage prediction, ours
      • Reasons for improvement
        – Union of multiple objects
        – Stuff as opposed to things
        – Challenging regions

  14. A Fast and Accurate One-Stage Approach to Visual Grounding
      • Code & models: https://github.com/zyang-ur/onestage_grounding
      • Poster: #26
      • Contact: zyang39@cs.rochester.edu
