
Introductions: Instructor Prof. Kristen Grauman, TA Wei-Lin Hsiao



  1. CS381V - lecture 1 - course intro
     Visual Recognition, Fall 2017

     Introductions
     • Instructor: Prof. Kristen Grauman
     • TA: Wei-Lin Hsiao

     Today
     • Course overview
     • Requirements, logistics

     What is computer vision?
     Done?

     Computer Vision
     • Automatic understanding of images and video
       1. Computing properties of the 3D world from visual data (measurement)

     1. Vision for measurement
     • Real-time stereo
     • Structure from motion
     • Tracking
     Credits: NASA Mars Rover; Demirdjian et al.; Snavely et al.; Wang et al.

  2. 2. Vision for perception, interpretation
     Label types: Objects, Activities, Scenes, Locations, Text / writing, Faces, Gestures, Motions, Emotions…
     Example labels on an amusement park photo: amusement park, sky, The Wicked Twister, Cedar Point, Ferris wheel, ride, 12 E, Lake Erie, water, tree, people waiting in line, people sitting on ride, umbrellas, maxair, carousel, deck, bench, pedestrians

     3. Visual search, organization
     Diagram labels: Query; Image or video archives; Relevant content

     Computer Vision
     • Automatic understanding of images and video
       1. Computing properties of the 3D world from visual data (measurement)
       2. Algorithms and representations to allow a machine to recognize objects, people, scenes, and activities (perception and interpretation)
       3. Algorithms to mine, search, and interact with visual data (search and organization)

     Related disciplines
     • Artificial intelligence
     • Machine learning
     • Graphics
     • Image processing
     • Cognitive science
     • Algorithms
     • Computer vision (at the center)

     Course focus

  3. Visual data in 1963
     L. G. Roberts, Machine Perception of Three Dimensional Solids, Ph.D. thesis, MIT Department of Electrical Engineering, 1963.

     Vision and graphics
     • Vision: Images to Model
     • Graphics: Model to Images
     Inverse problems: analysis and synthesis.

     Visual data in 2017
     • Personal photo albums
     • Movies, news, sports
     • Medical and scientific images
     • Surveillance and security
     Slide credit: L. Lazebnik

     Why recognition?
     – Recognition a fundamental part of perception
       • e.g., robots, autonomous agents
     – Organize and give access to visual content
       • Connect to information
       • Detect trends and themes
     • Why now?

     Autonomous agents able to detect objects
     http://www.darpa.mil/grandchallenge/gallery.asp

     Faces
     • Setting camera focus via face detection
     • Camera waits for everyone to smile to take a photo [Canon]

  4. Finding visually similar objects

     Posing visual queries
     • Yeh et al., MIT
     • Belhumeur et al.
     • Kooaba, Bay & Quack et al.

     Discovering visual patterns
     • Objects: Sivic & Zisserman
     • Categories: Lee & Grauman
     • Actions: Wang et al.

     Exploring community photo collections
     • Snavely et al.
     • Simon & Seitz

     Auto-annotation
     • Gammeter et al.
     • T. Berg et al.

     Video-based interfaces
     • Human joystick, NewsBreaker Live
     • Assistive technology systems: Camera Mouse, Boston College
     • Microsoft Kinect

  5. Obstacles?
     What else?

     What the computer gets

     Why is vision difficult?
     • Ill-posed problem: real world much more complex than what we can measure in images
       – 3D → 2D
     • Impossible to literally “invert” image formation process

     Challenges: many nuisance parameters
     • Illumination
     • Object pose
     • Clutter
     • Occlusions
     • Intra-class appearance
     • Viewpoint

     Challenges: intra-class variation
     Slide credit: Fei-Fei, Fergus & Torralba
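The "3D → 2D" point above can be illustrated with a small sketch. It assumes an idealized pinhole camera, which the slides do not specify; the function name `project` and the `focal_length` parameter are hypothetical and chosen only for illustration. Two different 3D points on the same viewing ray land on the same image location, so depth cannot be recovered from a single projected point.

```python
def project(point, focal_length=1.0):
    """Project a 3D point (X, Y, Z) to 2D image coordinates.

    Assumes an idealized pinhole camera at the origin looking down +Z.
    This is an illustrative model, not one given in the slides.
    """
    x, y, z = point
    if z <= 0:
        raise ValueError("point must be in front of the camera (Z > 0)")
    return (focal_length * x / z, focal_length * y / z)


near = (1.0, 2.0, 4.0)
far = tuple(3.0 * c for c in near)  # same viewing ray, three times farther away

# Both points map to the same pixel: the projection discards depth.
assert project(near) == project(far)
print(project(near), project(far))
```

Because every point along a ray collapses to one image location, any attempt to "invert" the projection needs extra information, such as a second view, motion, or prior knowledge about the scene.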

  6. Challenges: importance of context
     Video credit: Rob Fergus and Antonio Torralba

     Challenges: complexity
     • Millions of pixels in an image
     • 30,000 human recognizable object categories
     • 30+ degrees of freedom in the pose of articulated objects (humans)
     • 300+ hours of new video on YouTube per minute
     • …
     • About half of the cerebral cortex in primates is devoted to processing visual information [Felleman and van Essen 1991]
     Slide credit: Fei-Fei, Fergus & Torralba

     Progress charted by datasets
     Timeline marks: 1963 … 1996, 2000
     • Roberts 1963
     • COIL
     • MIT-CMU Faces
     • INRIA Pedestrians
     • UIUC Cars

  7. Progress charted by datasets
     Timeline marks: 1963 … 1996, 2000, 2005, 2007, 2008, 2013
     • MSRC 21 Objects
     • Caltech-101
     • Caltech-256
     • PASCAL VOC
     • ImageNet
     • 80M Tiny Images
     • Birds-200
     • Faces in the Wild

     Expanding horizons: large-scale recognition

     Expanding horizons: captioning
     https://pdollar.wordpress.com/2015/01/21/image-captioning/

     Expanding horizons: visual question answering

     Expanding horizons: vision for autonomous vehicles
     KITTI dataset – Andreas Geiger et al.

  8. Expanding horizons: interactive visual search
     WhittleSearch – Adriana Kovashka et al.

     Expanding horizons: first-person vision
     Activities of Daily Living – Hamed Pirsiavash et al.

     This course
     • Focus on current research in
       – Object recognition and categorization
       – Image/video retrieval, annotation
       – Some activity recognition
       – Related applications
     • High-level vision and learning problems, innovative applications.

     Goals
     • Understand current approaches
     • Analyze
     • Identify interesting research questions
     • Some hands-on experience

     Prerequisites
     • Courses in:
       – Computer vision
       – Machine learning
     • Ability to analyze high-level conference papers

  9. Basic format
     • Early weeks (1-4):
       – Lectures by instructor
       – CNN tutorial
       – Paper reading
     • Later weeks (5-11):
       – Paper discussion
       – Experiment
       – External paper presentation

     Overview of requirements
     • Discussions will center on recent papers in the field
       – Write 2 paper reviews each week, due Mon
       – Serve as proponent/opponent ~twice
     • Student presentations
       – Present an “external” paper from syllabus
       – Experiment on an assigned paper
     • 2 implementation assignments
     • Project with a partner
     Workload is fairly high

     Assigned vs. external papers
     • Assigned
     • External: for inquiring minds
     http://vision.cs.utexas.edu/381V-fall2017

     Paper reviews
     • Each week, review two of the assigned papers.
     • Separately, summarize 2-3 “discussion points”
     • Post each separately to Piazza following instructions on course “requirements” page.
     • Skip reviews the week(s) you are presenting an external paper or experiment.

     Paper review guidelines
     • Brief (2-3 sentences) summary
     • Main contribution
     • Strengths? Weaknesses?
     • How convincing are the experiments? Suggestions to improve them?
     • Extensions? What’s inspiring?
     • Additional comments, unclear points
     • Relationships observed between the papers we are reading
     • Due 8 pm Monday on Piazza

     Discussion point guidelines
     • ~2-3 sentences/bullets per reviewed paper
     • Recap of salient parts of your reviews
       – Key observations, lingering questions, interesting connections, etc.
     • Will be shared to our class via Piazza
     • Discussion points required for each class session (due 8 pm Monday)
     • All encouraged to browse and post before and after class
