Teaching computers about visual categories
Kristen Grauman, Department of Computer Science, University of Texas at Austin
Visual category recognition
Goal: recognize and detect categories of visually and semantically related objects, scenes, and activities.
The need for visual recognition
Scientific data analysis, augmented reality, robotics, indexing by content, surveillance.
Difficulty of category recognition
Illumination, object pose, clutter, viewpoint, intra-class appearance variation, occlusions.
~30,000 possible categories to distinguish! [Biederman 1987]
Progress charted by datasets
[Timeline: Roberts 1963 … COIL 1996 … MIT-CMU Faces, UIUC Cars, INRIA Pedestrians (~2000) … MSRC 21 Objects, Caltech-101, Caltech-256 (~2005) … PASCAL VOC detection challenge (2007) … Faces in the Wild, 80M Tiny Images, Birds-200, PASCAL VOC, ImageNet (2008-2013).]
Learning-based methods
Last ~10 years: impressive strides by learning appearance models (usually discriminative).
[Figure: annotator labels training images as "car" / "non-car"; the learned classifier is applied to a novel image.]
[Papageorgiou & Poggio 1998, Schneiderman & Kanade 2000, Viola & Jones 2001, Dalal & Triggs 2005, Grauman & Darrell 2005, Lazebnik et al. 2006, Felzenszwalb et al. 2008, …]
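As a hedged illustration of this standard supervised pipeline (not the cited authors' code; feature extraction is elided and random stand-in data is used):

```python
# Minimal sketch: train a binary car / non-car classifier on precomputed
# image features, then classify a novel image. Data here is synthetic.
import numpy as np
from sklearn.svm import LinearSVC

rng = np.random.default_rng(0)
X_train = rng.standard_normal((200, 512))   # stand-in feature vectors (e.g., HOG)
y_train = rng.integers(0, 2, size=200)      # 1 = "car", 0 = "non-car"

clf = LinearSVC(C=1.0).fit(X_train, y_train)

x_novel = rng.standard_normal((1, 512))     # features of a novel image
print("car" if clf.predict(x_novel)[0] == 1 else "non-car")
```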
Exuberance for image data (and their category labels)
ImageNet: 14M images, 1K+ labeled object categories [Deng et al. 2009-2012]
80M Tiny Images: 80M images, 53K noisily labeled object categories [Torralba et al. 2008]
SUN Database: 131K images, 902 labeled scene categories, 4K labeled object categories [Xiao et al. 2010]
Problem
[Plots (log scale), 1998-2013: difficulty and scale of the recognition task climb steeply, while complexity of supervision stays flat.]
While the complexity and scale of the recognition task have escalated dramatically, our means of "teaching" visual categories remain shallow.
Envisioning a broader channel
Human annotator
“This image has a cow in it.”
More labeled images ↔ more accurate models?
Envisioning a broader channel
Need richer means to teach the system about the visual world.
Envisioning a broader channel
Today: a narrow human→system channel connects Vision and Learning.
Next 10 years: a broader human↔system channel spanning Vision, Learning, Human computation, Language, Robotics, Multi-agent systems, and Knowledge representation.
Our goal
Teaching computers about visual categories must be an ongoing, interactive process, with communication that goes beyond labels. This talk:
Active learning for visual recognition
[Diagram: loop among current classifiers, unlabeled data, an active request "?", the annotator, and labeled data.]
[Mackay 1992, Cohn et al. 1996, Freund et al. 1997, Lindenbaum et al. 1999, Tong & Koller 2000, Schohn and Cohn 2000, Campbell et al. 2000, Roy & McCallum 2001, Kapoor et al. 2007,…]
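A hedged sketch of this loop (the names train, select, and annotator are hypothetical placeholders, not any particular system's API):

```python
# Skeleton of pool-based active learning: repeatedly ask the annotator for
# the label of whichever example the selection criterion deems most valuable.
def active_learning_loop(labeled, unlabeled, annotator, train, select, rounds=50):
    model = train(labeled)                    # fit current classifiers
    for _ in range(min(rounds, len(unlabeled))):
        x = select(model, unlabeled)          # the active request "?"
        labeled.append((x, annotator(x)))     # annotator supplies the label
        unlabeled.remove(x)
        model = train(labeled)                # retrain with the new label
    return model
```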
Active learning for visual recognition
[Diagram as above; plot: accuracy vs. number of labels added, with the active curve rising faster than the passive one.]
Intent: better models, obtained faster and more cheaply.
Problem: Active selection and recognition
Multiple levels of annotation are possible, from less expensive to obtain (e.g., an image-level label) to more expensive to obtain (e.g., a full segmentation).
Our idea: Cost-sensitive multi-question active learning
Select with a criterion that weighs both:
– which example to annotate, and
– what kind of annotation to request for it
as compared to
– the predicted effort the request would require
[Vijayanarasimhan & Grauman, NIPS 2008, CVPR 2009]
Decision-theoretic multi-question criterion
Value of asking a given question about a given data object:
VOI(request) = current misclassification risk − estimated risk if the candidate request were answered − cost of getting the answer
Three "levels" of requests to choose from: label the image ("does it contain the object?"), outline a specified object, or completely segment the image, naming all objects in the image.
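A hedged sketch of this criterion (the estimator methods on model are placeholders, not the papers' exact risk computations):

```python
# Decision-theoretic selection: score every (example, request) pair by
# VOI = current risk - expected risk if answered - cost, and ask the best.
REQUEST_TYPES = ["image_label", "outline_object", "name_all_and_segment"]

def voi(model, example, request):
    current_risk = model.risk()                                    # risk now
    expected_risk = model.expected_risk_if_answered(example, request)
    cost = model.predicted_cost(example, request)                  # e.g., seconds
    return current_risk - expected_risk - cost

def next_request(model, unlabeled):
    # Exhaustively evaluate candidates; later slides make this sub-linear.
    return max(((x, q) for x in unlabeled for q in REQUEST_TYPES),
               key=lambda xq: voi(model, *xq))
```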
Predicting effort
How much annotation effort is required for an unlabeled image? Which image would you rather annotate?
Predicting effort
We estimate labeling difficulty from visual content.
Other forms of effort cost: expertise required, resolution of data, how far the robot must move, length of video clip,…
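A hedged sketch of effort prediction as regression (the features and regressor here are stand-ins, not the papers' descriptors):

```python
# Learn to predict annotation time from image features, then use the
# predictions as the cost term in the VOI criterion above.
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

rng = np.random.default_rng(1)
X = rng.standard_normal((300, 32))   # e.g., edge density, region counts, ...
t = rng.gamma(2.0, 10.0, size=300)   # observed annotation times (seconds)

effort_model = GradientBoostingRegressor().fit(X, t)
predicted_seconds = effort_model.predict(X[:5])   # effort estimates for new images
```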
Multi-question active learning
[Diagram: loop among current classifiers, unlabeled data, the annotator, and labeled data.]
“Completely segment image #32.” “Does image #7 contain a cow?”
[Vijayanarasimhan & Grauman, NIPS 2008, CVPR 2009]
Multi-question active learning curves
[Learning curves: accuracy vs. annotation effort.]
Multi-question active learning with objects and attributes
Weigh the relative impact of an object label or an attribute label at each iteration.
"What is this object?" "Does this object have spots?"
[Diagram: loop among current model, unlabeled data, the annotator, and labeled data.]
[Kovashka et al., ICCV 2011]
Budgeted batch active learning
Select a batch of examples that together improves the classifier objective and meets the annotation budget.
[Diagram: unlabeled data with per-example annotation costs ($); loop among current model, annotator, and labeled data.]
[Vijayanarasimhan et al., CVPR 2010]
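A hedged sketch of selection under a budget (the paper optimizes the classifier objective jointly over the batch; this knapsack-style greedy is a simplification):

```python
# Greedily take the best utility-per-cost examples until the budget is spent.
def select_batch(candidates, budget):
    """candidates: iterable of (example, utility, cost) triples."""
    batch, spent = [], 0.0
    ranked = sorted(candidates, key=lambda c: c[1] / max(c[2], 1e-9), reverse=True)
    for example, utility, cost in ranked:
        if spent + cost <= budget:
            batch.append(example)
            spent += cost
    return batch
```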
Problem: “Sandbox” active learning
Thus far, tested only in artificial settings:
[Plot: accuracy vs. actual time, active vs. passive.]
~10^3 prepared images: small scale, biased.
Our idea: Live active learning
Large-scale active learning of object detectors with crawled data and crowdsourced labels. How to scale active learning to massive unlabeled pools of data?
Pool-based active learning
e.g., select point nearest to hyperplane decision boundary for labeling.
[Figure: data points and hyperplane w; the unlabeled point nearest the boundary is marked "?" for labeling.]
[Tong & Koller, 2000; Schohn & Cohn, 2000; Campbell et al. 2000]
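A minimal sketch of this simple-margin criterion on synthetic data:

```python
# Query the unlabeled point with the smallest |decision value|, i.e., the
# one nearest the current SVM hyperplane.
import numpy as np
from sklearn.svm import LinearSVC

rng = np.random.default_rng(0)
X_lab = rng.standard_normal((40, 2))
y_lab = (X_lab[:, 0] > 0).astype(int)       # toy labels
X_unlab = rng.standard_normal((1000, 2))

clf = LinearSVC().fit(X_lab, y_lab)
margins = np.abs(clf.decision_function(X_unlab))   # distance to boundary, up to ||w||
query_index = int(np.argmin(margins))              # most uncertain example to label
```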
Sub-linear time active selection
We propose a novel hashing approach to identify the most uncertain examples in sub-linear time.
[Diagram: the current classifier and unlabeled data are mapped into a hash table (codes 110, 111, 101); the colliding bucket yields the actively selected examples.]
[Jain, Vijayanarasimhan, Grauman, NIPS 2010]
Hashing a hyperplane query
At each iteration of the learning loop, our hash functions map the current hyperplane directly to its nearest unlabeled points.
[Figure: hyperplane w^(t) hashes to neighbors x_1^(t), x_2^(t); the updated hyperplane w^(t+1) hashes to x_1^(t+1), x_2^(t+1), x_3^(t+1); schematically, h(w) → {x_1, …, x_k}.]
Guarantee high probability of collision for points near the decision boundary: hashing a point x as [sign(u·x), sign(v·x)] and the hyperplane w as [sign(u·w), sign(−v·w)] gives Pr[collision] = (θ/π)(1 − θ/π), where θ is the angle between x and w; this is maximized when x is perpendicular to w, i.e., right at the boundary.
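A sketch of this two-bit hash under my simplified reading of H-Hash (illustrative parameters; the function names are mine, and real deployments use several tables and multi-probe lookups):

```python
# Hyperplane hashing sketch: points and hyperplane assumed L2-normalized.
import numpy as np
from collections import defaultdict

rng = np.random.default_rng(0)
d, n_pairs = 128, 4                        # feature dim, (u, v) bit pairs per code
U = rng.standard_normal((n_pairs, d))      # random directions for the first bit
V = rng.standard_normal((n_pairs, d))      # random directions for the second bit

def code_point(x):
    return tuple(np.concatenate([np.sign(U @ x), np.sign(V @ x)]))

def code_hyperplane(w):
    # Negating the second projection makes collisions most likely for points
    # nearly perpendicular to w, i.e., near the decision boundary.
    return tuple(np.concatenate([np.sign(U @ w), np.sign(-(V @ w))]))

# Build the table once; probe with the current hyperplane at each iteration.
X = rng.standard_normal((10000, d))
X /= np.linalg.norm(X, axis=1, keepdims=True)
table = defaultdict(list)
for i, x in enumerate(X):
    table[code_point(x)].append(i)

w = rng.standard_normal(d)
w /= np.linalg.norm(w)
candidates = table[code_hyperplane(w)]     # uncertain examples via one lookup
```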
H-Hash result on 1M Tiny Images
By minimizing both selection and labeling time, we obtain the best accuracy per unit time.
[Plots: (1) time spent searching per selection, H-Hash Active vs. Exhaustive Active; (2) accuracy improvements as more data is labeled, for H-Hash Active, Exhaustive Active, and Passive; (3) improvement in AUROC vs. selection + labeling time (hrs), accounting for all costs.]
PASCAL Visual Object Classes (VOC) challenge
http://pascallin.ecs.soton.ac.uk/challenges/VOC/
Live active learning
[Pipeline diagram: crawled unlabeled images → jumping window candidates → unlabeled windows hashed into a hash table of image windows (codes 1111, 1010, 1100); the current "bicycle" hyperplane w is hashed via h(w) to retrieve actively selected examples; crowdsourced annotations → consensus (mean shift) → annotated data → updated hyperplane.]
[Vijayanarasimhan & Grauman CVPR 2011]
For 4.5 million unlabeled instances: roughly 10 minutes of machine time per iteration.
Live active learning results
PASCAL VOC objects - Flickr test set
Outperforms the status-quo data collection approach (keyword search plus image labeling).
Live active learning results
What does the live learning system ask first?
[Figure: first selections made when learning "boat", live active learning (ours) vs. the keyword+image baseline.]
Ongoing challenges in active visual learning
remains challenging
Our goal
Teaching computers about visual categories must be an ongoing, interactive process, with communication that goes beyond labels. This talk:
Visual attributes
Example attributes: brown, indoors, flat, four-legged, high heel, red, metallic, …
[Oliva et al. 2001, Ferrari & Zisserman 2007, Kumar et al. 2008, Farhadi et al. 2009, Lampert et al. 2009, Endres et al. 2010, Wang & Mori 2010, Berg et al. 2010, Branson et al. 2010, Parikh & Grauman 2011, …]
Attributes
A mule… is furry, has four legs, has a tail.
[Figure: three horses, two donkeys, and a mule.]
Binary attributes
A mule… is furry, has four legs, has a tail.
[Ferrari & Zisserman 2007, Kumar et al. 2008, Farhadi et al. 2009, Lampert et al. 2009, Endres et al. 2010, Wang & Mori 2010, Berg et al. 2010, Branson et al. 2010, …]
Relative attributes
A mule… is furry, has four legs, has a tail; its tail is longer than donkeys', and its legs are shorter than horses'.
Relative attributes
Idea: represent visual comparisons between classes, images, and their properties.
[Diagram: concepts linked by shared and relative properties, e.g., "bright", "brighter than".]
[Parikh & Grauman, ICCV 2011]
How to teach relative visual concepts?
[Figure: faces, each rated individually on a 1-4 scale: "How much is the person smiling?"]
Rating each image on an absolute scale is hard to calibrate; a relative comparison between two images (less or more?) is easier to give.
Learning relative attributes
For each attribute a, use ordered image pairs to train a ranking function
r_a(x) = w_a^T x
such that w_a^T x_i > w_a^T x_j whenever image i shows the attribute more than image j, where x denotes image features.
[Parikh & Grauman, ICCV 2011; Joachims 2002]
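A minimal sketch of the pairwise ranking formulation [Joachims 2002] on toy data:

```python
# Each ordered pair (i preferred over j) becomes a classification example on
# the difference x_i - x_j; a linear SVM without intercept yields w for
# r(x) = w.T @ x.
import numpy as np
from sklearn.svm import LinearSVC

rng = np.random.default_rng(0)
X = rng.standard_normal((100, 64))               # image features
pairs = [(i, i + 50) for i in range(50)]         # toy ordered pairs: i ≻ i+50

diffs = np.vstack([X[i] - X[j] for i, j in pairs] +
                  [X[j] - X[i] for i, j in pairs])
labels = np.array([1] * len(pairs) + [-1] * len(pairs))

ranker = LinearSVC(fit_intercept=False).fit(diffs, labels)
w = ranker.coef_.ravel()
strength = X @ w        # predicted attribute "strength" for every image
```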
Relating images
Rather than simply label images with their properties… [Figure: images tagged with binary labels "not bright", "smiling", "not natural".]
Relating images
Now we can compare images by each attribute's "strength". [Figure: images ordered along the "bright", "smiling", and "natural" axes.]
Learning with visual comparisons
Enable new modes of human-system communication.
Relative zero-shot learning
Training: images from S seen categories, and relative descriptions of U unseen categories (need not use all attributes, nor all seen categories).
Testing: categorize an image into one of the S+U classes.
[Example descriptions: orderings of Scarlett, Clive, Hugh, Jared, Miley along "age"; Jared vs. Miley along "smiling".]
Relative zero-shot learning
Predict new classes based on their relationships to existing classes – even without training images.
[Figure: seen and unseen celebrity classes (e.g., Scarlett, Jared, Hugh, Miley, Clive) placed along the learned "smiling" and "age" rank axes.]
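A hedged sketch of the zero-shot step with made-up attribute coordinates; the paper models each class as a Gaussian over rank scores, while this simplification uses nearest class means:

```python
# Place an unseen class between the seen classes it is described relative to
# (per attribute axis), then label a test image by the nearest class mean in
# rank space. Axes here: (smiling, age); values are illustrative only.
import numpy as np

seen_means = {"Clive": np.array([0.2, 0.8]),
              "Miley": np.array([0.9, 0.3])}
# Description: the unseen class lies between Clive and Miley on both
# attributes, so place its mean midway between theirs.
unseen_means = {"Scarlett": (seen_means["Clive"] + seen_means["Miley"]) / 2}

def classify(rank_scores):
    """rank_scores: the test image's predicted attribute strengths."""
    all_means = {**seen_means, **unseen_means}
    return min(all_means, key=lambda c: np.linalg.norm(rank_scores - all_means[c]))

print(classify(np.array([0.5, 0.5])))   # -> "Scarlett"
```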
Relative zero-shot learning
Comparative descriptions are more discriminative than categorical descriptions.
[Bar chart: zero-shot accuracy (%) on Outdoor Scenes and Public Figures; relative attributes (ranker) outperform binary attributes.]
Learning with visual comparisons
Enable new modes of human-system communication.
Soliciting visual rationales
Main idea: ask the annotator not just for a label, but also for a visual rationale ("How can you tell?").
"Is the team winning?" "Is her form good?" "Is it a safe route?" – each followed by "How can you tell?"
[Donahue and Grauman, ICCV 2011; Zaidan et al., NAACL HLT 2007]
Soliciting visual rationales
Spatial rationales (mark the image regions that support the label) and attribute rationales (name the properties that support it).
[Example: "Hot or Not?" – "How can you tell?"]
Soliciting visual rationales
[Bar charts: accuracy on the Hot or Not attractiveness task and on scene categories, training with original labels only vs. labels + rationales; rationales improve accuracy in both cases.]
[Donahue & Grauman, ICCV 2011]
Learning with visual comparisons
Enable new modes of human-system communication.
Interactive visual search
Traditional binary relevance feedback offers only coarse communication between user and system.
[Example: query "white high heels"; the user can mark results only as relevant or irrelevant.]
[Rui et al. 1998, Zhou et al. 2003, …]
WhittleSearch: Relative attribute feedback
Whittle away irrelevant images via precise semantic feedback.
Query: "white high-heeled shoes"
[Figure: initial top search results → feedback "shinier than these", "less formal than these" → refined top search results.]
[Kovashka, Parikh, and Grauman, CVPR 2012]
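A hedged sketch of the whittling step, assuming per-image attribute strengths predicted by learned rankers as above (the full system also ranks images by how many constraints they satisfy):

```python
# Keep database images whose predicted attribute strengths satisfy every
# relative-feedback constraint.
import numpy as np

def whittle(strengths, feedback):
    """strengths: (n_images, n_attributes) predicted attribute strengths.
    feedback: list of (attribute_index, reference_strength, direction),
              where direction is +1 for "more than" and -1 for "less than"."""
    keep = np.ones(len(strengths), dtype=bool)
    for a, ref, direction in feedback:
        keep &= direction * (strengths[:, a] - ref) > 0
    return np.flatnonzero(keep)

# e.g., "shinier than this reference" (attr 0), "less formal" (attr 1):
strengths = np.random.default_rng(0).random((1000, 2))
survivors = whittle(strengths, [(0, 0.6, +1), (1, 0.4, -1)])
```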
Beyond pairwise comparisons… visual analogies
[Diagram: concepts and their properties arranged into analogy pairs.]
[Hwang, Grauman, & Sha, ICML 2013]
Learning with visual analogies
Regularize object models with analogies.
Example: planet : sun = electron : ? → nucleus
[Hwang, Grauman, & Sha, ICML 2013]
Learning with visual analogies
Regularize object models with analogies: learn a semantic embedding of the input space in which each analogy p : q = r : s holds as a vector constraint, p − q ≈ r − s.
[Hwang, Grauman, & Sha, ICML 2013]
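A hedged sketch of the analogy term alone (the ICML 2013 model couples it with a discriminative embedding objective; this shows only the constraint penalty):

```python
# Penalize how far each analogy p : q = r : s is from holding as a vector
# equation (Mp - Mq) ≈ (Mr - Ms) under a linear embedding M.
import numpy as np

def analogy_loss(M, X, analogies):
    """M: (k, d) linear embedding; X: (n, d) inputs;
    analogies: list of (p, q, r, s) row indices into X."""
    loss = 0.0
    for p, q, r, s in analogies:
        diff = M @ (X[p] - X[q]) - M @ (X[r] - X[s])
        loss += float(diff @ diff)
    return loss
```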
Visual analogies
GRE-like visual analogy tests: given A : B = C : ?, complete the analogy.
[Bar chart: analogy completion accuracy (%) for chance, a semantic embedding [Weinberger, 2009], and our analogy-preserving embedding.]
[Hwang, Grauman, & Sha, ICML 2013]
Teaching visual recognition systems
Today: a narrow human→system channel connects Vision and Learning.
Next 10 years: a broader human↔system channel spanning Vision, Learning, Human computation, Language, Robotics, Multi-agent systems, and Knowledge representation.
Important next directions
– embodied, egocentric
– fine-grained recognition
Acknowledgements
NSF, ONR, DARPA, Luce Foundation, Google, Microsoft
Sudheendra Vijayanarasimhan, Yong Jae Lee, Jaechul Kim, Sung Ju Hwang, Adriana Kovashka, Chao-Yeh Chen, Suyog Jain, Aron Yu, Dinesh Jayaraman, Jeff Donahue
Devi Parikh (Virginia Tech), Fei Sha (USC), Prateek Jain (MSR), Trevor Darrell (UC Berkeley)
all my UT colleagues
Summary
– Large-scale interactive/active learning systems
– Representing relative visual comparisons
Progress demands that more AI ideas convene.
References
A. Kovashka, D. Parikh, and K. Grauman. WhittleSearch: Image Search with Relative Attribute Feedback. CVPR 2012.
A. Kovashka, S. Vijayanarasimhan, and K. Grauman. Actively Selecting Annotations Among Objects and Attributes. ICCV 2011.