SLIDE 1
Goal
- Image captioning is subjective and ill-posed: there are many valid ways to describe any given image, which makes evaluation difficult
- Referring expression: an unambiguous text description that applies to exactly one object or region in the image
Generation and Comprehension of Unambiguous Object Descriptions
Tied weights
Using GT or multibox proposals at test time
Ground truth sentence (comprehension task)
Generated sentence (generation task)