Deep Networks for Computer Vision at Google, Chuck Rosenberg



SLIDE 1

Deep Networks for Computer Vision at Google

Chuck Rosenberg

ImageNet ILSVRC Workshop September 12, 2014

SLIDE 2

Quick Intro

Private Photo Search and Public Image Search Teams

Our work: Pixels → Knowledge

Google Photos

SLIDE 3

Applications

  • Google Image Search
  • Search by Image

SLIDE 4

Applications - Photo Search

SLIDE 5

Applications - Auto Curation

SLIDE 6

Google Photos - Auto Awesome

SLIDE 7

More Image Understanding at Google

  • YouTube
  • Google Shopping
  • StreetView / Maps
  • Self-Driving Cars
  • Robotics
  • Advertising
  • Much more...

SLIDE 8

Understanding is about extracting Knowledge

SLIDE 9

Image Understanding: Pixels → Entities

Single-word entities go way beyond simple objects! My photos of …

  • objects: “dog”
  • fine-grained objects: “Husky”
  • scenes: “beach”, “sunset”
  • actions: “kitesurfing”, “kiss”
  • emotions: “happiness”, “laughter”
  • events: “birthday”, “basketball game”
  • abstract concepts: “love”, “zen”

SLIDE 10

The Deep and now Deeper Hammer

[Diagram: Pixels → Deep Neural Network → Target output]

“ImageNet Classification with Deep Convolutional Neural Networks”, Krizhevsky, Sutskever, Hinton, NIPS 2012

Deep learning infrastructure by the Google Brain team

SLIDE 11

Personal Photos - Example Annotations

Christmas tree, Red, Christmas decoration, Christmas, Crowd, Cheering, People, Stadium, Play, Meal, Cake, Child, Hummingbird, Macro photography, Reflection, Red

SLIDE 12

More Example Network Annotations

SLIDE 13

More Example Network Annotations

SLIDE 14

Google Network Stats

  • Training Data
      ○ ImageNet 1K: ~1M images
      ○ X-Net: hundreds of millions of images
  • Label Set
      ○ ImageNet: 1K labels
      ○ X-Net: tens of thousands of labels
  • Ground Truth Issues
      ○ Incomplete Training Data
      ○ Noisy Training Data

SLIDE 15

Challenge: Incomplete label ground-truth

The problem becomes increasingly serious as we add more types of entities and fine-grained categories. An image may be labeled “Airedale Terrier” but not “Terrier”, “Dog”, “Animal”, or “Pet”, “Cute”, or “Curb”, “Grass”, “Street”, ...
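To make the gap concrete, here is a minimal sketch of how a single fine-grained annotation leaves its more general labels unstated. The `PARENT` hierarchy and the function names `implied_labels` and `missing_labels` are hypothetical, chosen for illustration only; the talk does not describe any particular label hierarchy or fix.

```python
# Hypothetical toy hierarchy (child label -> parent label), for illustration only.
PARENT = {
    "Airedale Terrier": "Terrier",
    "Terrier": "Dog",
    "Dog": "Animal",
}


def implied_labels(label, parent=PARENT):
    """Return the ancestors of `label` in the hierarchy, nearest first."""
    ancestors = []
    while label in parent:
        label = parent[label]
        ancestors.append(label)
    return ancestors


def missing_labels(annotated, parent=PARENT):
    """Return labels implied by the annotations but absent from them."""
    implied = set()
    for label in annotated:
        implied.update(implied_labels(label, parent))
    return sorted(implied - set(annotated))


# An image annotated only with the fine-grained label.
gaps = missing_labels({"Airedale Terrier"})
```

Labels such as “Pet”, “Cute”, or “Grass” are not ancestors of “Airedale Terrier”, so a hierarchy walk like this would not recover them; they illustrate the part of the problem that a taxonomy alone cannot address.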

SLIDE 16

Challenge: Noisy data

“Tortoise” “Tortoise Shell Sunglasses” “Random noise”

SLIDE 17

Image Understanding: Localization

  • object detection
  • scene parsing
  • pose estimation

[Image labels: dog, human, running; mountain, sky, grass, road]

SLIDE 18

Sample detections

ImageNet, Pascal VOC

SLIDE 19

Google Confidential and Proprietary

Training Embeddings Using Triplets

  • Training data consists of triplets: an anchor image, a positive image, and a negative image.

  • Loss function:

[Diagram: ... Deep Neural Net → L2 → Embedding → Triplet Loss; inputs: Anchor, Positive, Negative]

[1] “Learning Fine-grained Image Similarity with Deep Ranking”, Wang, Song, Leung, Rosenberg, Wang, Philbin, Chen, Wu, CVPR 2014
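The loss formula on this slide did not survive in the transcript. As a rough sketch only, the following assumes a commonly used hinge-style triplet loss over squared L2 distances between L2-normalized embeddings. The function names and the `margin` value of 0.2 are assumptions for illustration, not taken from the slide.

```python
import math


def l2_normalize(v):
    """Scale a vector to unit L2 norm (a zero vector is returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0 else list(v)


def squared_l2(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def triplet_loss(anchor, positive, negative, margin=0.2):
    """Hinge-style triplet loss (assumed form).

    The loss is zero once the anchor is closer to the positive than to the
    negative by at least `margin`; otherwise it grows linearly.
    """
    d_pos = squared_l2(anchor, positive)
    d_neg = squared_l2(anchor, negative)
    return max(0.0, margin + d_pos - d_neg)


# Toy 2-D embeddings standing in for the network's output.
anchor = l2_normalize([3.0, 4.0])
positive = l2_normalize([3.0, 4.1])
negative = l2_normalize([-4.0, 3.0])
loss = triplet_loss(anchor, positive, negative)
```

In this toy case the negative is far from the anchor and the positive is very close, so the triplet already satisfies the margin and contributes no loss. Under this assumed form, only triplets that violate the margin would produce a gradient.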


SLIDE 20


Embedding Results

SLIDE 21


Embedding Results

SLIDE 22


Embedding Results

SLIDE 23


SLIDE 24

Some Take Aways

What Works

  • ImageNet - of course! =)
  • More data leads to better performance
  • Deeper and bigger networks lead to better performance
  • Networks handle many diverse problems very well

What Needs Work

  • More insight into the “Black Box” - diagnosis and understanding
  • Understand and improve training data efficiency
  • Efficient means of collecting more training data
  • Better ways to deal with noisy training data
SLIDE 25

Thanks to the teams...

  • Image Understanding Team
  • Google Photos Team
  • Google Brain Team
  • Google Research
  • Our great interns

And We’re Hiring! I’m: chuck@google.com

SLIDE 26

The End