Introductions Computer Vision Automatic understanding of images - PDF document

CS 376 Computer Vision: Lecture 1 (Jan 18, 2018). Kristen Grauman, University of Texas at Austin. Topics: What is computer vision?; Introductions; Computer Vision as the automatic understanding of images and video.


  1. CS 376 Computer Vision: Lecture 1

     Computer Vision
     Jan 18, 2018
     Kristen Grauman, University of Texas at Austin

     Introductions
     • Instructor:
       – Prof. Kristen Grauman
     • TAs:
       – Thomas Crosley
       – Kapil Krishnakumar
       – Shubham Sharma

     Today
     • Course overview
     • Requirements, logistics

     What is computer vision?
     Done?

     Computer Vision
     • Automatic understanding of images and video
       1. Computing properties of the 3D world from visual data (measurement)

     1. Vision for measurement
     • Real-time stereo
     • Structure from motion
     • Tracking
     Image credits: NASA Mars Rover; Demirdjian et al.; Snavely et al.; Wang et al.

  2. Computer Vision
     • Automatic understanding of images and video
       1. Computing properties of the 3D world from visual data (measurement)
       2. Algorithms and representations to allow a machine to recognize objects, people, scenes, and activities. (perception and interpretation)
       3. Algorithms to mine, search, and interact with visual data (search and organization)

     2. Vision for perception, interpretation
     • What to recognize: Objects, Activities, Scenes, Locations, Text / writing, Faces, Gestures, Motions, Emotions…
     • Labels on the example image: amusement park, sky, The Wicked Twister, Cedar Point, Ferris wheel, ride, 12 E, Lake Erie, water, tree, people waiting in line, people sitting on ride, umbrellas, maxair, carousel, deck, bench, pedestrians

     3. Visual search, organization
     • Query → Image or video archives → Relevant content

     Related disciplines
     • Artificial intelligence
     • Machine learning
     • Graphics
     • Image processing
     • Cognitive science
     • Algorithms
     (Computer vision sits at the center of the diagram.)

  3. Vision and graphics
     • Images → Vision → Model
     • Model → Graphics → Images
     • Inverse problems: analysis and synthesis.

     Visual data in 1963
     • L. G. Roberts, Machine Perception of Three Dimensional Solids, Ph.D. thesis, MIT Department of Electrical Engineering, 1963.

     Visual data in 2018
     • Personal photo albums
     • Movies, news, sports
     • Medical and scientific images
     • Surveillance and security
     Slide credit: L. Lazebnik

     Why vision?
     • As image sources multiply, so do applications
       – Relieve humans of boring, easy tasks
       – Enhance human abilities
       – Advance human-computer interaction, visualization
       – Perception for robotics / autonomous agents
       – Organize and give access to visual content

     Faces and digital cameras
     • Setting camera focus via face detection
     • Camera waits for everyone to smile to take a photo [Canon]

     Linking to info with a mobile device
     • Situated search (Yeh et al., MIT)
     • Google Goggles
     • MSR Lincoln
     • kooaba

  4. Special visual effects
     • The Matrix
     • What Dreams May Come
     • Mocap for Pirates of the Caribbean, Industrial Light and Magic
     Source: S. Seitz

     Video-based interfaces
     • Human joystick, NewsBreaker Live
     • Assistive technology systems: Camera Mouse, Boston College
     • Microsoft Kinect

     Safety & security
     • Navigation, driver safety
     • Monitoring pool (Poseidon)
     • Surveillance
     • Pedestrian detection (MERL, Viola et al.)

     Vision for medical & neuroimages
     • fMRI data (Golland et al.)
     • Image guided surgery (MIT AI Vision Group)

     What else?

     Obstacles?

  5. What the computer gets

     Why is vision difficult?
     • Ill-posed problem: real world much more complex than what we can measure in images
       – 3D → 2D
     • Impossible to literally “invert” image formation process

     Challenges: many nuisance parameters
     • Illumination
     • Object pose
     • Clutter
     • Occlusions
     • Intra-class appearance
     • Viewpoint

     Challenges: intra-class variation
     slide credit: Fei-Fei, Fergus & Torralba

     Challenges: importance of context
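
The "3D → 2D" point on the "Why is vision difficult?" slide can be illustrated with a small sketch. It assumes an idealized pinhole camera, a model the slides do not name, and the function `project` and the sample points are hypothetical. Under that assumption, every 3D point along the same ray through the camera center lands on the same image location, so depth cannot be recovered from a single projection. Python is used here for illustration only; the course itself uses Matlab.

```python
def project(point, focal_length=1.0):
    """Pinhole projection of a 3D point (X, Y, Z) onto the image plane.

    Assumes an idealized pinhole camera at the origin looking down +Z.
    """
    x, y, z = point
    if z <= 0:
        raise ValueError("point must lie in front of the camera (Z > 0)")
    return (focal_length * x / z, focal_length * y / z)


# Two different 3D points on the same ray through the camera center.
near = (1.0, 2.0, 4.0)
far = tuple(3.0 * c for c in near)  # three times farther away

print(project(near))  # (0.25, 0.5)
print(project(far))   # (0.25, 0.5)
```

Both calls return the same 2D coordinates, which is one concrete sense in which image formation cannot be literally inverted: the image alone does not say which of the two points was observed.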

  6. Challenges: importance of context
     slide credit: Fei-Fei, Fergus & Torralba

     Challenges: complexity
     • Millions of pixels in an image
     • 30,000 human recognizable object categories
     • 30+ degrees of freedom in the pose of articulated objects (humans)
     • Billions of images online
     • 144K hours of new video on YouTube daily
     • …
     • About half of the cerebral cortex in primates is devoted to processing visual information [Felleman and van Essen 1991]

     Progress charted by datasets
     (Timeline figure built up over four slides; axis marks: 1963 … 1996, 2000, 2005, 2007, 2008, 2018.)
     • Roberts 1963
     • COIL
     • MIT-CMU Faces
     • INRIA Pedestrians
     • UIUC Cars
     • MSRC 21 Objects
     • Caltech-101
     • Caltech-256
     • PASCAL VOC
     • ImageNet
     • 80M Tiny Images
     • Birds-200
     • Faces in the Wild

  7. Expanding horizons: large-scale recognition

     Expanding horizons: interactive visual search
     • WhittleSearch – Adriana Kovashka et al.

     Expanding horizons: captioning
     • https://pdollar.wordpress.com/2015/01/21/image-captioning/

     Expanding horizons: first-person vision
     • Activities of Daily Living – Hamed Pirsiavash et al.

     Expanding horizons: vision for autonomous vehicles
     • KITTI dataset – Andreas Geiger et al.

     Brainstorm
     Pick an application or task among any of those we’ve described so far.
     1. What functionality should the system have?
     2. Intuitively, what are the technical sub-problems that must be solved?

  8. Goals of this course
     • Upper division undergrad course
     • Introduction to primary topics
       – Fundamentals of computer vision: image processing, grouping, multiple views
       – Recognition: emphasis on supervised learning (~last third of the course)
     • Hands-on experience with algorithms
     • Views of vision as a research area

     Topics overview
     • Features & filters
     • Grouping & fitting
     • Multiple views
     • Recognition

     Features and filters
     • Transforming and describing images; textures, colors, edges

     Grouping & fitting
     • Clustering, segmentation, fitting; what parts belong together?
     [fig from Shi et al]

     Multiple views
     • Matching, invariant features, stereo vision, instance recognition
     Figure credits: Lowe; Hartley and Zisserman; Fei-Fei Li

     Recognition and learning
     • Recognizing categories (objects, scenes, activities, attributes…), learning techniques

  9. Textbooks
     • Recommended book:
       – Computer Vision: Algorithms and Applications
       – By Rick Szeliski
       – http://szeliski.org/Book/

     Requirements / Grading
     • Programming assignments (50%)
     • Midterm exam (15%)
     • Final exam (25%)
     • Class participation, including attendance (10%)
     • Check grades on Canvas
       – A quote from a prior student evaluation: “To be honest, I think without going to class, the course would be very hard.”

     Assignments
     • Majority: programming problems
       – Implementation
       – Explanation, results
     • Code in Matlab, available on CS Unix machines (see course page)
     • Optional LaTeX templates
     • Most of these assignments take significant time to do. We recommend starting early.

     Matlab
     • Built-in toolboxes for low-level image processing, visualization
     • Compact programs
     • Intuitive interactive debugging
     • Widely used in engineering

     Assignment 0
     • A0: Matlab warmup + basic image manipulation
     • Out today, due Tues Jan 23
     • Verify CS account and Matlab access
     • Look at the tutorial online

     Digital images
     • Images as matrices

  10. Digital images
      • Width 520, height 500; indices start at i=1, j=1
      • Intensity: [0,255]
      • im[176][201] has value 164
      • im[194][203] has value 37

      Color images, RGB color space
      • Channels: R, G, B
      Image from Fei-Fei Li

      Preview of assignments
      • Grouping for segmentation
      • Image mosaics / stitching
      • Seam carving
      • Matching and recognition
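
The "Digital images" slide above treats a grayscale image as a matrix of intensities, and a color image as one value per R, G, and B channel. A minimal sketch of that idea follows, in plain Python for illustration (the course itself uses Matlab). It makes several assumptions: the first index is the row and the second the column, indices are 1-based as the slide's i=1, j=1 labels suggest, and the all-zero placeholder image and the helpers `set_pixel` and `get_pixel` are hypothetical.

```python
HEIGHT, WIDTH = 500, 520  # rows and columns, matching the slide's dimensions

# Grayscale image: HEIGHT rows, each holding WIDTH intensities in [0, 255].
# The all-zero starting image is a placeholder, not the slide's picture.
im = [[0] * WIDTH for _ in range(HEIGHT)]


def set_pixel(image, i, j, value):
    """Store an intensity at row i, column j (1-based, an assumption)."""
    if not 0 <= value <= 255:
        raise ValueError("intensity must be in [0, 255]")
    image[i - 1][j - 1] = value


def get_pixel(image, i, j):
    """Read the intensity at row i, column j (1-based)."""
    return image[i - 1][j - 1]


# The two sample values quoted on the slide.
set_pixel(im, 176, 201, 164)
set_pixel(im, 194, 203, 37)

# Color image: every pixel holds an (R, G, B) triple instead of one number.
rgb = [[(0, 0, 0)] * WIDTH for _ in range(HEIGHT)]
rgb[0][0] = (255, 0, 0)  # top-left pixel set to pure red

print(get_pixel(im, 176, 201), get_pixel(im, 194, 203))
```

The 1-based helpers exist only to mirror the slide's indexing; native Python lists are 0-based, which is why each access subtracts one.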

  11. Preview of assignments
      • Object detection

      Miscellaneous
      • Slides, announcements via class website
      • No laptops, phones, tablets, etc. open in class please.
      • Please use the front rows

      Collaboration policy
      All responses and code must be written individually unless otherwise specified. Students submitting answers or code found to be identical or substantially similar (due to inappropriate collaboration) risk failing the course.

      Assignment deadlines
      • Due about every two weeks
        – Tentative deadlines posted online but could slightly shift depending on lecture pace
      • Assignments in by 11:59 PM on due date
        – Submit on Canvas, following submission instructions given in assignment.
        – Deadlines are firm. We’ll use timestamp.
      • Use Piazza, office hours for questions

      Coming up
      • Now: check out Matlab tutorial online
      • A0 due Tues Jan 23
      • Textbook reading posted for next week
