SLIDE 1

Learning Visual Semantics: Models, Massive Computation, and Innovative Applications

Tutorial at CVPR 2014 June 23rd, 1:00pm-5:00pm, Columbus, OH

SLIDE 2

Learning Visual Semantics: Models, Massive Computation, and Innovative Applications CVPR 2014

Introduction

Instructors:

Shih-Fu Chang, John Smith, Rogerio Feris, Liangliang Cao

Columbia University, IBM T. J. Watson Research Center

SLIDE 3

Introduction

1970s

Early Days of Computer Vision

SLIDE 4

Introduction

First Digital Camera (1975)

 0.01 Megapixels  23 seconds to record a photo to cassette

SLIDE 5

Introduction

• Datasets with 5 or 10 images
• Large-Scale Experiment: 800 photos (Takeo Kanade Thesis, 1973)

[D. Marr, 1976]

SLIDE 6

Introduction

Today

Visual Data is Exploding!

SLIDE 7

Introduction

Announcement of Pope Benedict in 2005

SLIDE 8

Introduction

Announcement of Pope Francis in 2013

Rapid proliferation of mobile devices equipped with cameras

SLIDE 9

Introduction

• Billions of cell phones equipped with cameras
• ~500 billion consumer photos are taken each year worldwide
• ~500 million photos taken per year in NYC alone
• Hundreds of millions of Facebook photo uploads per day

Era of Big Visual Data

SLIDE 10

Introduction

SLIDE 11

Introduction

Exciting Time for Computer Vision

• + DATA
• + Computational Processing
• + Advances in Computer Vision and Machine Learning

Major opportunities for systems that automatically extract visual semantics from images and videos

SLIDE 12

Examples of Practical Application Areas

SLIDE 13

Examples of Application Areas

Smart Surveillance

“Show me all images of people matching the suspect description from time X to time Y from all cameras in area Z.”

Visual Semantics: Fine-grained person attributes

Slide credit: Rogerio Feris

SLIDE 14

Examples of Application Areas

Medical Imaging

[Figure: example images labeled MRI Brain Axial, DX Torso, DX Cervical Spine, PET Color, DX Appendage, MRI Knee]

Visual Semantics: Medical Image Modality and Anatomy

Slide credit: John Smith

SLIDE 15

Examples of Application Areas

Astronomy

[Cui et al, WACV 2015] http://www.galaxyzoo.org/

Visual Semantics: morphological galaxy attributes

Slide credit: Rogerio Feris

Huge dataset of galaxy images makes manual labeling infeasible (important to understand star formation, gas fraction, galaxy evolution, …)

SLIDE 16

Examples of Application Areas

Nature / Ecology

http://www.youtube.com/watch?v=AUL03ivS8bY http://www.snapshotserengeti.org/

Snapshot Serengeti

Understanding how competing species coexist is a fundamental theme in ecology, with important implications for biodiversity and the sustainability of life on Earth.

Visual Semantics: species of animals from camera traps

Slide credit: Rogerio Feris

SLIDE 17

Examples of Application Areas

Nature / Ecology

Slide credit: Rogerio Feris

Plant Species [Kumar et al, ECCV 2012]: Used by botanists, educators, …

Bird Species (http://www.vision.caltech.edu/visipedia/): Understanding of migration, conservation, …

SLIDE 18

Examples of Application Areas

Social Media: Visual Sentiment Analysis

Colorful clouds, Misty night, Colorful butterfly, Crying baby [Borth et al, ACM MM 2013]

Slide credit: Rogerio Feris

SLIDE 19

Many more applications …

Google Goggles

Amazon

[Kovashka et al, CVPR 2012]

Slide credit: Rogerio Feris

SLIDE 20

Tutorial Overview

SLIDE 21

Tutorial Overview

Objectives:

Cover state-of-the-art techniques for learning visual semantics from images and videos

Focus on intuitive, semantic visual representations
Provide tools for scalable learning of semantic models
Cover innovative and practical applications
Provide pointers to related source code and datasets

SLIDE 22

Tutorial Overview

Part I: Feature Extraction, Coding, and Pooling (Liangliang)

• Brief introduction to local feature descriptors, coding, and pooling
• Focus on modern representations such as Fisher Vector and Sparse Coding
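
The coding and pooling steps named on this slide can be illustrated with a minimal sketch. It assumes the simplest possible coding scheme, hard assignment to the nearest codeword (a bag-of-visual-words histogram), and not the Fisher Vector or Sparse Coding representations the tutorial focuses on. The codebook and descriptors are toy 2-D values invented for illustration; real local descriptors have many more dimensions.

```python
from typing import List, Sequence

Vector = Sequence[float]


def squared_distance(a: Vector, b: Vector) -> float:
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def hard_assign(descriptor: Vector, codebook: List[Vector]) -> List[float]:
    """Coding step: one-hot code that marks the nearest codeword."""
    nearest = min(
        range(len(codebook)),
        key=lambda k: squared_distance(descriptor, codebook[k]),
    )
    return [1.0 if k == nearest else 0.0 for k in range(len(codebook))]


def average_pool(codes: List[List[float]]) -> List[float]:
    """Pooling step: average all codes into one fixed-length vector."""
    return [sum(column) / len(codes) for column in zip(*codes)]


def max_pool(codes: List[List[float]]) -> List[float]:
    """Pooling step: keep the strongest response per codeword."""
    return [max(column) for column in zip(*codes)]


# Toy data (assumed for illustration): 3 codewords, 4 local descriptors.
codebook = [(0.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
descriptors = [(0.1, 0.0), (0.9, 1.1), (0.2, 0.1), (0.8, 0.9)]

codes = [hard_assign(d, codebook) for d in descriptors]
histogram = average_pool(codes)  # normalized codeword frequencies
presence = max_pool(codes)       # which codewords occur at all
```

Whatever the number of local descriptors in an image, pooling yields a vector whose length equals the codebook size, which is what lets a standard classifier consume the result.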

SLIDE 23

Tutorial Overview

Part I: Feature Extraction, Coding, and Pooling (Liangliang)

• Connections to feature learning approaches (e.g., deep convolutional neural networks)

Picture credit: Kai Yu

SLIDE 24

Tutorial Overview

Part II: Large-Scale Semantic Modeling (John Smith)

• Semantic Concept Modeling: Historic Overview

Picture credit: John Smith

SLIDE 25

Tutorial Overview

Part II: Large-Scale Semantic Modeling (John Smith)

• How to deal with class imbalance?
• How to scale to millions of semantic unit models?

Picture credit: John Smith
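
As one possible answer to the class imbalance question, the sketch below computes inverse-frequency class weights, a common general-purpose heuristic. This is an assumption chosen for illustration and is not necessarily the approach presented in this part of the tutorial. The function name and the toy label counts are hypothetical.

```python
from collections import Counter
from typing import Dict, Hashable, Iterable


def balanced_class_weights(labels: Iterable[Hashable]) -> Dict[Hashable, float]:
    """Weight each class by total / (num_classes * class_count).

    Rare classes get weights above 1 and frequent classes below 1,
    so every class contributes equally to a weighted training loss.
    """
    counts = Counter(labels)
    total = sum(counts.values())
    return {
        label: total / (len(counts) * count)
        for label, count in counts.items()
    }


# Toy example (assumed): a concept present in 10 of 100 training images.
labels = ["negative"] * 90 + ["positive"] * 10
weights = balanced_class_weights(labels)
```

With these weights, each positive example counts nine times as much as a negative one, and the summed weight over all examples still equals the dataset size.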

SLIDE 26

Tutorial Overview

Part III: Shifting from naming to describing: semantic attribute models (Rogerio Feris)

• Scalable learning with Attribute Models / Zero-Shot Learning

[Lampert et al, CVPR 2009]
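
The idea of zero-shot learning with attributes can be sketched as follows: classes never seen during training are described by attribute signatures, and an image is assigned to the class whose signature best agrees with the attribute probabilities predicted for that image. This is a simplified illustration in the spirit of attribute-based classification, not a reproduction of the cited method (for example, it ignores attribute priors). The attribute names, class signatures, and probabilities are invented for the example.

```python
import math
from typing import Dict, List


def class_log_score(attr_probs: List[float], signature: List[int]) -> float:
    """Log-likelihood of a binary attribute signature, treating
    attributes as independent given the image."""
    eps = 1e-9  # keeps log() finite for probabilities of exactly 0 or 1
    score = 0.0
    for prob, present in zip(attr_probs, signature):
        prob = min(max(prob, eps), 1.0 - eps)
        score += math.log(prob if present else 1.0 - prob)
    return score


def zero_shot_predict(attr_probs: List[float],
                      signatures: Dict[str, List[int]]) -> str:
    """Return the class whose signature best matches the predictions."""
    return max(
        signatures,
        key=lambda name: class_log_score(attr_probs, signatures[name]),
    )


# Hypothetical attributes: [striped, four_legged, lives_in_water]
signatures = {
    "zebra": [1, 1, 0],
    "dolphin": [0, 0, 1],
    "horse": [0, 1, 0],
}

# Assumed outputs of pre-trained attribute classifiers for one image.
attr_probs = [0.9, 0.8, 0.1]
prediction = zero_shot_predict(attr_probs, signatures)
```

No training image of any listed class is needed here; only the attribute classifiers are learned from data, and adding a new class amounts to adding one signature.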

SLIDE 27

Tutorial Overview

Part III: Shifting from naming to describing: semantic attribute models (Rogerio Feris)

• Attribute-based Search

Slide credit: Rogerio Feris

SLIDE 28

Tutorial Overview

Part IV: High-level Semantic Modeling: Visual Sentiment Analysis (Shih-Fu Chang)

• Semantic models for encoding emotions in social media