A Distributed Representation Based Query Expansion Approach for Image Captioning


slide-1
SLIDE 1

A Distributed Representation Based Query Expansion Approach for Image Captioning

Semih Yagcioglu, Erkut Erdem, Aykut Erdem, Ruket Çakıcı

Hacettepe University, Computer Vision Lab

Middle East Technical University, Department of Computer Engineering

slide-2
SLIDE 2
our approach

a simple, data-driven, transfer-based approach using distributed representations

slide-3
SLIDE 3

image representation

  • features from 16-layer VGG network (fc7)
  • 4096 dimensions
slide-4
SLIDE 4

visual retrieval and adaptive inlier selection
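
The slide does not spell out how retrieval or inlier selection is done, so the following is only a sketch under stated assumptions: images are compared by Euclidean distance between their feature vectors, and a candidate counts as an inlier when its distance to the query is within a factor of (1 + epsilon) of the closest candidate's distance. The function name `retrieve_inliers`, the `epsilon` parameter, and the toy 2-dimensional features (standing in for 4096-dimensional fc7 features) are all hypothetical.

```python
import math
from typing import Dict, List, Sequence

def euclidean(u: Sequence[float], v: Sequence[float]) -> float:
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def retrieve_inliers(query: Sequence[float],
                     features: Dict[str, Sequence[float]],
                     epsilon: float = 0.15) -> List[str]:
    """Return image ids sorted by distance, keeping only adaptive inliers.

    Assumed rule (not stated on the slide): keep image i when
    dist(query, i) <= (1 + epsilon) * dist(query, closest image).
    """
    if not features:
        return []
    dists = {img_id: euclidean(query, f) for img_id, f in features.items()}
    closest = min(dists.values())
    keep = [img_id for img_id, d in dists.items() if d <= (1 + epsilon) * closest]
    return sorted(keep, key=lambda img_id: dists[img_id])

# Toy features: I1 and I2 are close to the query, I3 is far away.
toy_features = {"I1": [3.0, 4.0], "I2": [0.0, 5.5], "I3": [6.0, 8.0]}
inliers = retrieve_inliers([0.0, 0.0], toy_features)
```

Because the cutoff scales with the nearest distance, the number of retained neighbours varies per query instead of being a fixed k.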

slide-5
SLIDE 5

Initial ranking

Query image: Iq

Visually similar images and their captions:

  • I1, c1: A man climbs up a snowy mountain.
  • I2, c2: A boy in orange jacket appears unhappy.
  • I5, c5: A person wearing a red jacket climbs a snowy hill.

slide-6
SLIDE 6

Initial ranking

Visually similar images and their captions:

  • I1, c1: A man climbs up a snowy mountain.
  • I2, c2: A boy in orange jacket appears unhappy.
  • I5, c5: A person wearing a red jacket climbs a snowy hill.

Query expansion using distributed representations: c1, c5, c2

our query expansion approach

swap modalities from the visual domain to a textual one

slide-7
SLIDE 7

word representation

  • word2vec model (Mikolov et al., 2013)
  • GloVe model (Pennington et al., 2014)
  • word vectors, 500 dimensions
  • MS COCO captions as corpus (617K)
slide-8
SLIDE 8

words to captions

  • sum each word vector in a caption
  • sentence vector c to represent captions
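
The two bullets above can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' code: the toy 3-dimensional `word_vectors` table stands in for the 500-dimensional word2vec or GloVe vectors, and the function name `caption_vector`, the whitespace tokenization, and the decision to skip out-of-vocabulary words are assumptions.

```python
from typing import Dict, List

# Toy 3-dimensional word vectors with made-up values (the slides use
# 500-dimensional vectors trained on MS COCO captions).
word_vectors: Dict[str, List[float]] = {
    "a": [0.1, 0.0, 0.0],
    "man": [0.5, 0.2, 0.1],
    "climbs": [0.0, 0.7, 0.3],
    "mountain": [0.2, 0.1, 0.9],
}

def caption_vector(caption: str, vectors: Dict[str, List[float]]) -> List[float]:
    """Sum the vectors of all known words in a caption to get its sentence vector."""
    dim = len(next(iter(vectors.values())))
    total = [0.0] * dim
    for token in caption.lower().split():
        token = token.strip(".,!?")  # crude punctuation handling (assumption)
        if token in vectors:  # assumption: unknown words are skipped
            total = [t + v for t, v in zip(total, vectors[token])]
    return total

# Sentence vector c for one caption; "a" occurs twice and is added twice.
c = caption_vector("A man climbs a mountain.", word_vectors)
```

Summing keeps the sentence vector in the same space as the word vectors, so captions of any length can be compared directly.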
slide-9
SLIDE 9

calculating the new textual query
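
The formula on this slide did not survive extraction. One plausible reading, offered here as an assumption and not as the authors' definition, is that the new query is a weighted average of the retrieved caption vectors, where each caption is weighted by the visual similarity between the query image and the image it describes. The name `expand_query` and the example numbers are hypothetical.

```python
from typing import List, Sequence

def expand_query(caption_vecs: Sequence[Sequence[float]],
                 visual_sims: Sequence[float]) -> List[float]:
    """Similarity-weighted average of caption vectors (assumed form of the query)."""
    if not caption_vecs or len(caption_vecs) != len(visual_sims):
        raise ValueError("need exactly one similarity per caption vector")
    weight_sum = sum(visual_sims)
    if weight_sum == 0:
        raise ValueError("similarities must not sum to zero")
    dim = len(caption_vecs[0])
    q = [0.0] * dim
    for vec, w in zip(caption_vecs, visual_sims):
        for i in range(dim):
            q[i] += w * vec[i]
    return [x / weight_sum for x in q]

# Three caption vectors; the first belongs to the most similar image.
q = expand_query([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]], [0.5, 0.25, 0.25])
```

Under this reading, captions of images that look more like the query pull the textual query further toward themselves.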

slide-10
SLIDE 10

Query expansion using distributed representations

Final ranking, from which the caption is transferred: c1, c5, c2

  • c2: A boy in orange jacket appears unhappy.
  • c1: A man climbs up a snowy mountain.
  • c5: A person wearing a red jacket climbs a snowy hill.

re-ranking via cosine similarity
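
Re-ranking by cosine similarity can be sketched as follows. The 2-dimensional vectors are made up so that the result matches the order shown on the slide (c1, c5, c2); the function names `cosine` and `rerank`, and the choice to treat the top-ranked caption as the transferred one, are assumptions for illustration.

```python
import math
from typing import Dict, List, Sequence

def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)

def rerank(query: Sequence[float],
           captions: Dict[str, Sequence[float]]) -> List[str]:
    """Order caption ids by decreasing cosine similarity to the query vector."""
    return sorted(captions,
                  key=lambda cid: cosine(query, captions[cid]),
                  reverse=True)

# Hypothetical sentence vectors for the three candidate captions.
candidates = {"c1": [0.9, 0.1], "c2": [0.1, 0.9], "c5": [0.7, 0.3]}
ranking = rerank([1.0, 0.0], candidates)
transferred = ranking[0]  # assumption: the top-ranked caption is transferred
```

Cosine similarity ignores vector length, which matters here because summed word vectors grow with caption length.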

slide-11
SLIDE 11

experimental setup

Dataset      # Images   # Captions
Flickr8K     8K         5
Flickr30K    30K        5
MS COCO      123K       5

slide-12
SLIDE 12

the good, the bad and the ugly

results

slide-13
SLIDE 13

a man in a black shirt and his little girl wearing orange are sharing a treat

slide-14
SLIDE 14

a construction crew in orange vests working near train tracks

slide-15
SLIDE 15

a green bird perched on top of a tree filled with pink flowers

slide-16
SLIDE 16

a white cat is sitting in a bathroom sink

slide-17
SLIDE 17

a boy is holding a dog that is wearing a hat

slide-18
SLIDE 18

a man wearing a santa hat holding a dog posing for a picture

a boy is holding a dog that is wearing a hat

slide-19
SLIDE 19

quantitative evaluation

  • VC (Ordonez et al. 2011)
  • MC-KL, MC-SB (Mason and Charniak 2014)
  • BLEU, METEOR, CIDEr
  • Flickr8K, Flickr30K and MS COCO
slide-20
SLIDE 20

quantitative evaluation

slide-21
SLIDE 21

human evaluation

  • rated for relevancy on a scale of 1 to 5
  • Crowdflower with at least 5 annotators
slide-22
SLIDE 22

concluding remarks

  • a simple yet effective data-driven image captioning approach
  • future work could focus on
      • other pooling approaches such as using Fisher vectors (Klein et al. 2015)
      • incorporating syntactic relations (Socher et al. 2015)
  • source code will soon be available at github.com/semihyagcioglu/image-captioning