
9/6/2012
Visual Recognition
Kristen Grauman
Dept of Computer Science

Plan for today
• Topic overview:
  – What does the visual recognition problem entail?
  – Why are these hard problems?
  – What works today?
• Course overview:
  – Requirements
  – Syllabus tour

Computer Vision
• Automatic understanding of images and video
  – Computing properties of the 3D world from visual data (measurement)
  – Algorithms and representations to allow a machine to recognize objects, people, scenes, and activities (perception and interpretation)
  – Algorithms to mine, search, and interact with visual data (search and organization)

What does recognition involve?
Slide by Fei-Fei Li

Detection: are there people?
Slide by Fei-Fei Li

Activity: What are they doing?
Slide by Fei-Fei Li

Object categorization
mountain, tree, building, banner, street lamp, vendor, people
Slide by Fei-Fei Li

Instance recognition
Potala Palace
A particular sign

Scene and context categorization
• outdoor
• city
• …

Attribute recognition
gray, made of fabric, crowded, flat

Object Categorization
• Task Description
  – “Given a small number of training images of a category, recognize a-priori unknown instances of that category and assign the correct category label.”
• Which categories are feasible visually?
  “Fido”, German shepherd, dog, animal, living being
K. Grauman, B. Leibe

Visual Object Categories
• Basic Level Categories in human categorization [Rosch 76, Lakoff 87]
  – The highest level at which category members have similar perceived shape
  – The highest level at which a single mental image reflects the entire category
  – The level at which human subjects are usually fastest at identifying category members
  – The first level named and understood by children
  – The highest level at which a person uses similar motor actions for interaction with category members
K. Grauman, B. Leibe

Visual Object Categories
• Basic-level categories in humans seem to be defined predominantly visually.
• There is evidence that humans (usually) start with basic-level categorization before doing identification.
  – Basic-level categorization is easier and faster for humans than object identification!
  – How does this transfer to automatic classification algorithms?
Figure (category hierarchy, top to bottom): animal, …, quadruped, … (abstract levels); dog, cat, cow (basic level); German shepherd, Doberman, …; “Fido” (individual level)
K. Grauman, B. Leibe

How many object categories are there?
Biederman 1987
Source: Fei-Fei Li, Rob Fergus, Antonio Torralba.
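The hierarchy in the figure can be sketched as a small parent map. This is only one reading of the diagram: the levels the slide elides with "…" are skipped, and the names `PARENT`, `BASIC_LEVEL`, `path_to_root`, and `basic_level_of` are illustrative, not part of the course material.

```python
# Parent links read off the slide's hierarchy diagram. Assumption: the levels
# elided with "…" on the slide are skipped, so "quadruped" links directly to "animal".
PARENT = {
    "Fido": "German shepherd",
    "German shepherd": "dog",
    "Doberman": "dog",
    "dog": "quadruped",
    "cat": "quadruped",
    "cow": "quadruped",
    "quadruped": "animal",
}

# Labels the slide marks as basic level.
BASIC_LEVEL = {"dog", "cat", "cow"}


def path_to_root(label):
    """Return the chain of labels from `label` up to the most abstract one."""
    path = [label]
    while path[-1] in PARENT:
        path.append(PARENT[path[-1]])
    return path


def basic_level_of(label):
    """Return the basic-level category at or above `label`, or None if there is none."""
    for node in path_to_root(label):
        if node in BASIC_LEVEL:
            return node
    return None


if __name__ == "__main__":
    print(path_to_root("Fido"))
    print(basic_level_of("Doberman"))
```

Identifying “Fido” corresponds to the bottom of the chain, while basic-level categorization only needs to reach "dog".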

Other Types of Categories
• Functional Categories
  – e.g. chairs = “something you can sit on”
K. Grauman, B. Leibe

Why recognition?
– Recognition is a fundamental part of perception
  • e.g., robots, autonomous agents
– Organize and give access to visual content
  • Connect to information
  • Detect trends and themes
• Why now?

Autonomous agents able to detect objects
http://www.darpa.mil/grandchallenge/gallery.asp

Posing visual queries
Yeh et al., MIT; Belhumeur et al.; Kooaba, Bay & Quack et al.

Finding visually similar objects

Exploring community photo collections
Snavely et al.; Simon & Seitz

Discovering visual patterns
Objects: Sivic & Zisserman
Categories: Lee & Grauman
Actions: Wang et al.

Auto-annotation
Gammeter et al.; T. Berg et al.

Challenges

Challenges: robustness
Illumination, object pose, clutter, occlusions, intra-class appearance, viewpoint

Challenges: context and human experience
Context cues

Challenges: context and human experience
Context cues, function, dynamics
Video credit: J. Davis

Challenges: scale, efficiency
• Half of the cerebral cortex in primates is devoted to processing visual information
• ~20 hours of video added to YouTube per minute
• ~5,000 new tagged photos added to Flickr per minute
• Thousands to millions of pixels in an image
• 30+ degrees of freedom in the pose of articulated objects (humans)
• 3,000-30,000 human recognizable object categories
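To put the two upload rates above on a daily footing, here is a quick back-of-the-envelope calculation. It is a sketch that uses only the slide's approximate per-minute figures; the constant names are illustrative.

```python
# Approximate rates quoted on the slide (as of 2012).
VIDEO_HOURS_PER_MINUTE = 20      # hours of video added to YouTube per minute
PHOTOS_PER_MINUTE = 5_000        # new tagged photos added to Flickr per minute

MINUTES_PER_HOUR = 60
MINUTES_PER_DAY = 60 * 24

# Hours of new video arriving per day.
VIDEO_HOURS_PER_DAY = VIDEO_HOURS_PER_MINUTE * MINUTES_PER_DAY

# New tagged photos arriving per day.
PHOTOS_PER_DAY = PHOTOS_PER_MINUTE * MINUTES_PER_DAY

# Hours of video arriving per hour of wall-clock time, i.e. how many
# real-time viewers (or processing streams) it would take just to keep up.
REALTIME_VIEWERS = VIDEO_HOURS_PER_MINUTE * MINUTES_PER_HOUR

if __name__ == "__main__":
    print(f"{VIDEO_HOURS_PER_DAY:,} hours of video per day")
    print(f"{PHOTOS_PER_DAY:,} tagged photos per day")
    print(f"{REALTIME_VIEWERS:,} real-time streams needed to keep pace")
```

At these rates, any method that processes video slower than roughly 1,200 times real time on a single machine cannot keep up without parallelism.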

Challenges: learning with minimal supervision
(Figure: examples ordered from more to less supervision)

What kinds of things work best today?
Reading license plates, zip codes, checks
Frontal face detection
Recognizing flat, textured objects (like books, CD covers, posters)
Fingerprint recognition

Inputs in 1963…
L. G. Roberts, Machine Perception of Three Dimensional Solids, Ph.D. thesis, MIT Department of Electrical Engineering, 1963.

… and inputs today
Movies, news, sports
Personal photo albums
Medical and scientific images
Surveillance and security
Slide credit: L. Lazebnik

… and inputs today
350 mil. photos, 1 mil. added daily
916,271 titles
1.6 bil. images indexed as of summer 2005
10 mil. videos, 65,000 added daily
Images on the Web
Movies, news, sports
Satellite imagery
City streets

Introductions

This course
• Focus on current research in
  – Object recognition and categorization
  – Image/video retrieval, annotation
  – Activity recognition
• High-level vision and learning problems, innovative applications.

Goals
• Understand current approaches
• Analyze
• Identify interesting research questions

Expectations
• Discussions will center on recent papers in the field
  – Paper reviews each week
• Student presentations
  – Papers and background reading
  – Experiment presentation
• 2 implementation assignments
• Project
Workload is fairly high

Prerequisites
• Courses in:
  – Computer vision
  – Machine learning
• Ability to analyze high-level conference papers

Paper reviews
• Each week, review two of the assigned papers.
• Email me and TA by Thurs 9 PM
• Skip reviews the week(s) you are presenting.

Paper review guidelines
• Brief (2-3 sentences) summary
• Main contribution
• Strengths? Weaknesses?
• How convincing are the experiments? Suggestions to improve them?
• Extensions?
• Additional comments, unclear points
• Relationships observed between the papers we are reading

Paper presentation guidelines
• Read 3 selected papers in topic area
• Well-organized talk, about 30-45 minutes
• What to cover?
  – Problem overview, motivation
  – Algorithm explanation, technical details
  – Any commonalities, important differences between techniques covered in the papers.
• See handout and class webpage for more details.

Experiment guidelines
• Implement/download code for a main idea in the paper and show us toy examples:
  – Experiment with different types of (mini) training/testing data sets
  – Evaluate sensitivity to important parameter settings
  – Show (on a small scale) an example to analyze a strength/weakness of the approach
• Present in class, about 30 minutes.
• Share links to any tools or data.

Timetable for presenters
• For papers or experiments, by the Friday the week before your presentation is scheduled:
  – Email draft slides to me, and schedule a time to meet, do dry run, discuss.
  – This is a hard deadline: 5 points off automatically per day late
• See course webpage for examples of good reviews, presentations.

Projects
Possibilities:
  – Extend a technique studied in class
  – Analysis and empirical evaluation of an existing technique
  – Comparison between two approaches
  – Design and evaluate a novel approach
  – Thorough survey / review paper
• Work in pairs, except for survey.

Miscellaneous
• Feedback welcome and useful
• No laptops, phones, etc. in class please
• Check class website
• I’ll use Blackboard to email class

Syllabus tour
I. Object recognition fundamentals
II. Beyond modeling individual objects
III. Human-centered recognition

Syllabus tour
I. Object recognition fundamentals
  A. Local features and matching object instances
  B. Large-scale search and mining
  C. Classification and detection of categories
  D. Mid-level representations

Local features and matching object instances
Local invariant features, detection and description
Matching models to images
Indexing specific objects with bag-of-words descriptors
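As a rough illustration of the last item, the sketch below quantizes local descriptors to their nearest "visual word", builds a per-image word histogram, and uses an inverted index to look up candidate images. It is a toy version under stated assumptions: the vocabulary is given rather than learned by clustering, descriptors are plain tuples of floats, and all function names are hypothetical rather than taken from the course or any library.

```python
from collections import Counter


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length descriptors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest_word(descriptor, vocabulary):
    """Index of the vocabulary entry (visual word) closest to the descriptor."""
    return min(range(len(vocabulary)),
               key=lambda i: squared_distance(descriptor, vocabulary[i]))


def bag_of_words(descriptors, vocabulary):
    """Histogram counting how many descriptors fall on each visual word."""
    counts = Counter(nearest_word(d, vocabulary) for d in descriptors)
    return [counts[i] for i in range(len(vocabulary))]


def build_inverted_index(image_histograms):
    """Map each visual word to the set of image ids in which it occurs."""
    index = {}
    for image_id, hist in image_histograms.items():
        for word, count in enumerate(hist):
            if count > 0:
                index.setdefault(word, set()).add(image_id)
    return index


def candidates(query_hist, index):
    """Image ids sharing at least one visual word with the query."""
    found = set()
    for word, count in enumerate(query_hist):
        if count > 0:
            found |= index.get(word, set())
    return found


if __name__ == "__main__":
    # Toy 2-D "descriptors"; real local descriptors are much higher-dimensional.
    vocabulary = [(0.0, 0.0), (1.0, 1.0), (5.0, 5.0)]
    database = {
        "A": bag_of_words([(0.1, 0.0), (0.9, 1.1), (1.0, 0.8)], vocabulary),
        "B": bag_of_words([(4.8, 5.1), (5.2, 4.9)], vocabulary),
    }
    index = build_inverted_index(database)
    query = bag_of_words([(1.1, 1.0)], vocabulary)
    print(candidates(query, index))
```

The inverted index is what makes the lookup scale: a query only touches images that share a word with it instead of comparing against every image in the database.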
