
Image Identification with Natural Language Specification
Qi Feng, Donghyun Kim
Department of Computer Science, Boston University
fung@bu.edu, donhk@bu.edu
December 08, 2017

Outline
1 Introduction
2 Methods
3 Results
4 Saliency Map

Photo Search
Figure: Screenshot of a natural language search on Google Photos.

The Problem
Figure: Identification of the target image by natural language specification.

GloVe Embedding
GloVe is an unsupervised learning algorithm for obtaining vector representations for words [2].
Figure: The projection of word embeddings into 2D space.

The Baseline Model
▶ Cosine similarity
  ▶ average of word embeddings [2]
  ▶ the input query
  ▶ a generated caption for an image [4]
▶ The Inception v3
  ▶ pretrained on the ILSVRC-2012-CLS [3].
▶ The language model
  ▶ trained for 20,000 iterations on MSCOCO [1].
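The baseline's scoring step can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes captions have already been generated for each image, and it uses a made-up 3-dimensional embedding table (`TOY_EMBEDDINGS`) in place of real GloVe vectors. The function names `average_embedding`, `cosine_similarity`, and `rank_images` are hypothetical.

```python
import math

# Toy 3-dimensional embeddings, invented for illustration only.
# Real GloVe vectors have far more dimensions and a much larger vocabulary.
TOY_EMBEDDINGS = {
    "dog": [0.9, 0.1, 0.0],
    "puppy": [0.8, 0.2, 0.1],
    "grass": [0.1, 0.9, 0.0],
    "car": [0.0, 0.1, 0.9],
    "street": [0.1, 0.2, 0.8],
}


def average_embedding(sentence, embeddings):
    """Average the vectors of the words in `sentence` that have an embedding."""
    vectors = [embeddings[w] for w in sentence.lower().split() if w in embeddings]
    if not vectors:
        raise ValueError("no known words in sentence")
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def cosine_similarity(a, b):
    """Cosine of the angle between two non-zero vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)


def rank_images(query, captions, embeddings):
    """Return image ids ordered from most to least similar to the query.

    `captions` maps an image id to a caption assumed to come from a
    captioning model; the captioning model itself is outside this sketch.
    """
    q = average_embedding(query, embeddings)
    scores = {
        image_id: cosine_similarity(q, average_embedding(caption, embeddings))
        for image_id, caption in captions.items()
    }
    return sorted(scores, key=scores.get, reverse=True)


if __name__ == "__main__":
    captions = {"img_a": "dog grass", "img_b": "car street"}
    print(rank_images("puppy grass", captions, TOY_EMBEDDINGS))
```

Averaging discards word order, so "dog chasing a car" and "car chasing a dog" receive identical scores under this scheme.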

The Proposed Model
Figure: The proposed model (diagram labels: Image, Query, CNN, Visual Representation, concat, Language Model (LSTM), Similarity). Red rounded rectangles are inputs to the model. The blue rectangle is the intermediate result from the convolutional neural network.

Training and Testing
Figure: Positive Training Data

Results

Results cont.
▶ The Baseline Model: 91.1%
▶ The Proposed Model: 93.4%

Excitation Back-propagation for Saliency Map
▶ Goal
  ▶ Find the salient region of the input, using back-propagation, in order to interpret the model's predictions.
▶ Assumptions
  ▶ The response of an activation neuron is non-negative.
  ▶ An activation neuron is tuned to detect certain visual features. Its response is positively correlated with its confidence in the detection.
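The two assumptions above motivate a propagation rule that passes probability mass only through excitatory (positive-weight) connections, in proportion to each child neuron's non-negative response. The sketch below applies the commonly described form of that rule to a single linear layer; it is an illustration under that assumption, not the authors' implementation, and the function name `excitation_backprop_layer` is hypothetical.

```python
def excitation_backprop_layer(activations, weights, parent_probs):
    """Redistribute parent probabilities onto child neurons for one layer.

    activations[j]  : non-negative response of child neuron j
    weights[i][j]   : weight on the connection from child j to parent i
    parent_probs[i] : probability mass currently assigned to parent i

    Each parent splits its mass among its children in proportion to
    activations[j] * max(weights[i][j], 0), so negative weights carry nothing.
    """
    n_children = len(activations)
    child_probs = [0.0] * n_children
    for i, p_parent in enumerate(parent_probs):
        # Excitatory contribution of every child to parent i.
        contrib = [activations[j] * max(weights[i][j], 0.0) for j in range(n_children)]
        total = sum(contrib)
        if total == 0.0:
            # Parent i receives no excitatory input; its mass is dropped.
            continue
        for j in range(n_children):
            child_probs[j] += p_parent * contrib[j] / total
    return child_probs


if __name__ == "__main__":
    activations = [1.0, 2.0, 0.0]
    weights = [
        [1.0, 1.0, 5.0],   # parent 0
        [-1.0, 1.0, 1.0],  # parent 1 (its connection to child 0 is inhibitory)
    ]
    print(excitation_backprop_layer(activations, weights, [0.5, 0.5]))
```

Applying the function layer by layer from the output back to the input yields a map over input positions; as long as every parent has some excitatory input, the total probability mass is preserved at each step.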

Spatial and Temporal Saliency
Figure: Spatial and temporal saliency on MS-COCO. Original images are on the left and saliency maps on the right. The query is shown under each image. The word in red has the maximum temporal saliency.

Conclusion
▶ A model that identifies an image from a natural language specification.
▶ An RNN to measure the similarity between images and queries.
▶ Excitation Back-propagation for finding spatial and temporal groundings.

[1] Tsung-Yi Lin, Michael Maire, Serge J. Belongie, Lubomir D. Bourdev, Ross B. Girshick, James Hays, Pietro Perona, Deva Ramanan, Piotr Dollár, and C. Lawrence Zitnick. Microsoft COCO: common objects in context. CoRR, abs/1405.0312, 2014.
[2] Jeffrey Pennington, Richard Socher, and Christopher Manning. GloVe: Global vectors for word representation. In Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP), pages 1532–1543, 2014.
[3] Olga Russakovsky, Jia Deng, Hao Su, Jonathan Krause, Sanjeev Satheesh, Sean Ma, Zhiheng Huang, Andrej Karpathy, Aditya Khosla, Michael Bernstein, Alexander C. Berg, and Li Fei-Fei. ImageNet Large Scale Visual Recognition Challenge. International Journal of Computer Vision (IJCV), 115(3):211–252, 2015.
[4] Oriol Vinyals, Alexander Toshev, Samy Bengio, and Dumitru Erhan. Show and tell: Lessons learned from the 2015 MSCOCO image captioning challenge. CoRR, abs/1609.06647, 2016.
