Learning Visual Semantics: Models, Massive Computation, and Innovative Applications Tutorial at CVPR 2014 June 23rd, 1:00pm-5:00pm, Columbus, OH Introduction Instructors: Shih-Fu Chang John Smith Rogerio Feris Liangliang Cao Columbia


SLIDE 1

Learning Visual Semantics: Models, Massive Computation, and Innovative Applications

Tutorial at CVPR 2014 June 23rd, 1:00pm-5:00pm, Columbus, OH

SLIDE 2

Introduction

Instructors:

Shih-Fu Chang, John Smith, Rogerio Feris, Liangliang Cao
Columbia University, IBM T. J. Watson Research Center

SLIDE 3

Introduction

1970s

Early Days of Computer Vision

SLIDE 4

Introduction

First Digital Camera (1975)

  • 0.01 megapixels
  • 23 seconds to record a photo to cassette

SLIDE 5

Introduction

  • Datasets with 5 or 10 images
  • Large-scale experiment: 800 photos (Takeo Kanade thesis, 1973)

[D. Marr, 1976]

SLIDE 6

Introduction

Today

Visual Data is Exploding!

SLIDE 7

Introduction

Announcement of Pope Benedict in 2005

SLIDE 8

Introduction

Announcement of Pope Francis in 2013

Rapid proliferation of mobile devices equipped with cameras

SLIDE 9

Introduction

  • Billions of cell phones equipped with cameras
  • ~500 billion consumer photos are taken each year worldwide
  • ~500 million photos taken per year in NYC alone
  • Hundreds of millions of Facebook photo uploads per day

Era of Big Visual Data

SLIDE 10

Introduction

SLIDE 11

Introduction

Exciting Time for Computer Vision

  • + Data
  • + Computational Processing
  • + Advances in Computer Vision and Machine Learning

Major opportunities for systems that automatically extract visual semantics from images and videos

SLIDE 12

Examples of Practical Application Areas

SLIDE 13

Examples of Application Areas

Smart Surveillance

“Show me all images of people matching the suspect description from time X to time Y from all cameras in area Z.”

Visual Semantics: Fine-grained person attributes

Slide credit: Rogerio Feris

SLIDE 14

Examples of Application Areas

Medical Imaging

MRI Brain Axial, DX Torso, DX Cervical Spine, PET Color, DX Appendage, MRI Knee

Visual Semantics: Medical Image Modality and Anatomy

Slide credit: John Smith

SLIDE 15

Examples of Application Areas

Astronomy

[Cui et al, WACV 2015] http://www.galaxyzoo.org/

Visual Semantics: morphological galaxy attributes

Slide credit: Rogerio Feris

Huge dataset of galaxy images makes manual labeling infeasible (important to understand star formation, gas fraction, galaxy evolution, …)

SLIDE 16

Examples of Application Areas

Nature / Ecology

Snapshot Serengeti
http://www.snapshotserengeti.org/
http://www.youtube.com/watch?v=AUL03ivS8bY

Understanding how competing species coexist is a fundamental theme in ecology, with important implications for biodiversity and the sustainability of life on Earth.

Visual Semantics: species of animals from camera traps

Slide credit: Rogerio Feris

SLIDE 17

Examples of Application Areas

Nature / Ecology

Slide credit: Rogerio Feris

Plant Species [Kumar et al, ECCV 2012]
Used by botanists, educators, …

Bird Species http://www.vision.caltech.edu/visipedia/
Understanding of migration, conservation, …

SLIDE 18

Examples of Application Areas

Social Media: Visual Sentiment Analysis

Colorful clouds, Misty night, Colorful butterfly, Crying baby [Borth et al, ACM MM 2013]

Slide credit: Rogerio Feris

SLIDE 19

Many more applications …

Google Goggles

Amazon

[Kovashka et al, CVPR 2012]

Slide credit: Rogerio Feris

SLIDE 20

Tutorial Overview

SLIDE 21

Tutorial Overview

Objectives:

  • Cover state-of-the-art techniques for learning visual semantics from images and videos

  • Focus on intuitive, semantic visual representations
  • Provide tools for scalable learning of semantic models
  • Cover innovative and practical applications
  • Provide pointers to related source code and datasets

SLIDE 22

Tutorial Overview

Part I: Feature Extraction, Coding, and Pooling (Liangliang)

  • Brief introduction to local feature descriptors, coding, and pooling
  • Focus on modern representations such as Fisher Vector and Sparse Coding
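
To make the coding and pooling steps concrete, here is a minimal sketch using the simplest possible choices: hard-assignment vector quantization for coding, and average or max pooling. This is not the Fisher Vector or Sparse Coding representation the tutorial focuses on; it only illustrates the extract, code, pool pipeline. The function names, the toy 2-D "descriptors", and the 3-word codebook are all hypothetical.

```python
from math import dist  # Euclidean distance (Python 3.8+)


def encode_hard(descriptor, codebook):
    """Code one descriptor as a one-hot vector selecting its nearest codeword."""
    nearest = min(range(len(codebook)), key=lambda k: dist(descriptor, codebook[k]))
    return [1.0 if k == nearest else 0.0 for k in range(len(codebook))]


def pool(codes, mode="average"):
    """Aggregate per-descriptor codes into a single image-level vector."""
    columns = list(zip(*codes))  # one column per codeword
    if mode == "max":
        return [max(col) for col in columns]
    return [sum(col) / len(col) for col in columns]


# Toy data (hypothetical): real local descriptors such as SIFT have 128 dimensions.
codebook = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
descriptors = [(0.1, 0.0), (0.9, 0.1), (1.1, -0.1), (0.0, 0.8)]

codes = [encode_hard(d, codebook) for d in descriptors]
histogram = pool(codes, "average")  # normalized bag-of-visual-words histogram
presence = pool(codes, "max")       # which codewords occur at least once
```

Average pooling of one-hot codes yields the familiar bag-of-visual-words histogram; richer coding schemes replace `encode_hard` while keeping the same overall structure.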

SLIDE 23

Tutorial Overview

Part I: Feature Extraction, Coding, and Pooling (Liangliang)

  • Connections to feature learning approaches (e.g., deep convolutional neural networks)

Picture credit: Kai Yu

SLIDE 24

Tutorial Overview

Part II: Large-Scale Semantic Modeling (John Smith)

  • Semantic Concept Modeling: Historic Overview

Picture credit: John Smith

SLIDE 25

Tutorial Overview

Part II: Large-Scale Semantic Modeling (John Smith)

  • How to deal with class imbalance?
  • How to scale to millions of semantic unit models?

Picture credit: John Smith
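
One common heuristic for the class imbalance question above is to weight each training example inversely to its class frequency, so that rare positives are not swamped by abundant negatives. The sketch below is an assumption for illustration, not necessarily the approach presented in this part of the tutorial; the function name and the toy label distribution are hypothetical.

```python
from collections import Counter


def inverse_frequency_weights(labels):
    """Per-class weight n / (k * count), so every class has equal total weight."""
    counts = Counter(labels)
    n, k = len(labels), len(counts)
    return {cls: n / (k * count) for cls, count in counts.items()}


# Toy concept detector data (hypothetical): 5 positives among 100 examples.
labels = [1] * 5 + [0] * 95
weights = inverse_frequency_weights(labels)

# Total weight contributed by each class to a weighted training loss.
total_positive = weights[1] * 5
total_negative = weights[0] * 95
```

With these weights, each class contributes the same total mass to a weighted loss regardless of how many examples it has.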

SLIDE 26

Tutorial Overview

Part III: Shifting from naming to describing: semantic attribute models (Rogerio Feris)

  • Scalable learning with Attribute Models / Zero-Shot Learning

[Lampert et al, CVPR 2009]
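
The idea behind attribute-based zero-shot learning can be sketched as follows: attribute classifiers are trained on seen classes, and an unseen class is recognized by matching the predicted attributes against its known attribute signature. The code below is a simplified illustration in the spirit of direct attribute prediction; it omits the attribute priors used in the cited work, and the class names, attribute names, and probabilities are hypothetical.

```python
from math import log


def class_log_score(attr_probs, signature, eps=1e-9):
    """Log-likelihood of a binary attribute signature under predicted attribute probabilities."""
    return sum(
        log(max(p if present else 1.0 - p, eps))
        for p, present in zip(attr_probs, signature)
    )


def predict_unseen(attr_probs, signatures):
    """Pick the unseen class whose attribute signature best matches the predictions."""
    return max(signatures, key=lambda cls: class_log_score(attr_probs, signatures[cls]))


# Hypothetical attributes: (striped, four_legged, aquatic)
signatures = {
    "zebra": (1, 1, 0),
    "whale": (0, 0, 1),
}

# Hypothetical outputs of attribute classifiers for one test image.
attr_probs = (0.9, 0.8, 0.1)
prediction = predict_unseen(attr_probs, signatures)
```

No training image of either class is needed at test time; only the attribute classifiers and the per-class signatures are.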

SLIDE 27

Tutorial Overview

Part III: Shifting from naming to describing: semantic attribute models (Rogerio Feris)

  • Attribute-based Search

Slide credit: Rogerio Feris
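
As a rough illustration of attribute-based search, the sketch below ranks images by the summed confidence of the attributes named in a query. It assumes attribute detector scores have already been computed and stored per image; the index contents, attribute names, and scoring rule are hypothetical choices, not the tutorial's method.

```python
def search(index, query_attributes, top_k=2):
    """Rank images by the sum of their scores for the queried attributes."""
    def score(item):
        _, attribute_scores = item
        return sum(attribute_scores.get(a, 0.0) for a in query_attributes)

    ranked = sorted(index.items(), key=score, reverse=True)
    return [image_id for image_id, _ in ranked[:top_k]]


# Hypothetical index: image id -> attribute detector confidences in [0, 1].
index = {
    "img_a": {"red_shirt": 0.9, "backpack": 0.2},
    "img_b": {"red_shirt": 0.8, "backpack": 0.9},
    "img_c": {"red_shirt": 0.1, "backpack": 0.7},
}

results = search(index, ["red_shirt", "backpack"])
```

A query such as "red shirt and backpack" then reduces to a lookup and a sort over precomputed scores, which is what makes this style of search practical over large collections.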

SLIDE 28

Tutorial Overview

Part IV: High-level Semantic Modeling: Visual Sentiment Analysis (Shih-Fu Chang)

  • Semantic models for encoding emotions in social media