


  1. Introductions
     • Instructor: Prof. Kristen Grauman
     • Primary TA: Kai-Yang Chiang
     • Extra office hours: Chao-Yeh Chen

     Honors Machine Learning and Vision
     Kristen Grauman, UT Austin

     What is computer vision?

     Today
     • Course overview
     • Requirements, logistics

     Done?

     1. Vision for measurement
     • Real-time stereo
     • Structure from motion
     • Tracking
     Image credits: NASA Mars Rover; Demirdjian et al.; Snavely et al.; Wang et al.

     Computer Vision
     • Automatic understanding of images and video
       1. Computing properties of the 3D world from visual data (measurement)

  2. 2. Vision for perception, interpretation
     • Objects, activities, scenes, locations, text / writing, faces, gestures, motions, emotions…
     • Labels on the example image: amusement park, Cedar Point, sky, Lake Erie, water, The Wicked Twister, Ferris wheel, ride, maxair, carousel, deck, bench, umbrellas, tree, pedestrians, people waiting in line, people sitting on ride

     Computer Vision
     • Automatic understanding of images and video
       1. Computing properties of the 3D world from visual data (measurement)
       2. Algorithms and representations to allow a machine to recognize objects, people, scenes, and activities (perception and interpretation)

     3. Visual search, organization
     • Query
     • Image or video archives
     • Relevant content

     Related disciplines
     • Artificial intelligence
     • Machine learning
     • Graphics
     • Computer vision
     • Image processing
     • Cognitive science
     • Algorithms

     Computer Vision
     • Automatic understanding of images and video
       1. Computing properties of the 3D world from visual data (measurement)
       2. Algorithms and representations to allow a machine to recognize objects, people, scenes, and activities (perception and interpretation)
       3. Algorithms to mine, search, and interact with visual data (search and organization)
     Course focus

  3. Visual data in 1963
     L. G. Roberts, Machine Perception of Three Dimensional Solids, Ph.D. thesis, MIT Department of Electrical Engineering, 1963.

     Vision and graphics
     • Vision: images → model
     • Graphics: model → images
     Inverse problems: analysis and synthesis.

     Visual data in 2015
     • Personal photo albums
     • Movies, news, sports
     • Medical and scientific images
     • Surveillance and security
     Slide credit: L. Lazebnik

     Why vision?
     • As image sources multiply, so do applications
       – Relieve humans of boring, easy tasks
       – Enhance human abilities
       – Advance human-computer interaction, visualization
       – Perception for robotics / autonomous agents
       – Organize and give access to visual content

     Faces and digital cameras
     • Setting camera focus via face detection
     • Camera waits for everyone to smile to take a photo [Canon]

     Linking to info with a mobile device
     • Situated search, Yeh et al., MIT
     • Google Goggles
     • MSR Lincoln
     • kooaba

  4. Video-based interfaces
     • Human joystick, NewsBreaker Live
     • Assistive technology systems: Camera Mouse, Boston College
     • Microsoft Kinect

     What else?

     Vision for medical & neuroimages
     • fMRI data, Golland et al.
     • Image guided surgery, MIT AI Vision Group

     Special visual effects
     • The Matrix
     • What Dreams May Come
     • Mocap for Pirates of the Caribbean, Industrial Light and Magic
     Source: S. Seitz

     Obstacles?

     Safety & security
     • Navigation, driver safety
     • Monitoring pool (Poseidon)
     • Surveillance
     • Pedestrian detection, MERL, Viola et al.

  5. What the computer gets

     Why is vision difficult?
     • Ill-posed problem: real world much more complex than what we can measure in images
       – 3D → 2D
     • Impossible to literally “invert” image formation process

     Challenges: many nuisance parameters
     • Illumination
     • Object pose
     • Clutter
     • Occlusions
     • Intra-class appearance
     • Viewpoint

     Challenges: intra-class variation
     Slide credit: Fei-Fei, Fergus & Torralba

     Challenges: importance of context
     Credit: Antonio Torralba and Rob Fergus
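To make the "3D → 2D" point concrete, here is a minimal sketch of why image formation cannot be literally inverted. It assumes a simple pinhole (perspective) camera model, which the slides do not spell out; the function name `project` and the focal length `f` are illustrative choices, not course code. Two different 3D points on the same viewing ray land on the same image location, so depth is lost.

```python
# Minimal sketch: perspective projection is many-to-one.
# Assumption: an ideal pinhole camera with focal length f (not stated in the slides).

def project(point, f=1.0):
    """Project a 3D point (X, Y, Z) with Z > 0 onto the image plane at distance f."""
    X, Y, Z = point
    return (f * X / Z, f * Y / Z)

near = (1.0, 2.0, 4.0)
far = (2.0, 4.0, 8.0)  # same viewing ray, twice as far from the camera

p_near = project(near)
p_far = project(far)

# Both points map to the same 2D location, so the image alone
# cannot tell us which of them produced the measurement.
same_ray_collides = p_near == p_far
print(p_near, p_far, same_ray_collides)
```

Any point of the form (kX, kY, kZ) with k > 0 projects to the same place, which is one way to read "ill-posed": extra assumptions or extra views are needed to recover the 3D world.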

  6. Challenges: importance of context

     Challenges: complexity
     • Millions of pixels in an image
     • 30,000 human recognizable object categories
     • 30+ degrees of freedom in the pose of articulated objects (humans)
     • Billions of images online
     • 144K hours of new video on YouTube daily
     • …
     • About half of the cerebral cortex in primates is devoted to processing visual information [Felleman and van Essen 1991]
     Slide credit: Fei-Fei, Fergus & Torralba

     Progress charted by datasets
     • Datasets shown: Roberts 1963; COIL; MIT-CMU Faces; INRIA Pedestrians; UIUC Cars; MSRC 21 Objects; Caltech-101; Caltech-256; PASCAL VOC; ImageNet; 80M Tiny Images; Birds-200; Faces in the Wild
     • Timeline: 1963 … 1996, 2000, 2005, 2007, 2008, 2013
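As a rough sense of scale for the "144K hours of new video on YouTube daily" bullet under Challenges: complexity, the following sketch converts that figure into other units. The 144K number is taken from the slide; the variable names are illustrative only.

```python
# Back-of-the-envelope conversion of the daily upload figure quoted on the slide.
HOURS_UPLOADED_PER_DAY = 144_000  # "144K hours of new video on YouTube daily"
MINUTES_PER_DAY = 24 * 60

# Hours of video arriving during each minute of wall-clock time.
hours_per_minute = HOURS_UPLOADED_PER_DAY / MINUTES_PER_DAY

# Time one person would need to watch a single day's uploads without pausing.
days_to_watch = HOURS_UPLOADED_PER_DAY / 24
years_to_watch = days_to_watch / 365

print(hours_per_minute, days_to_watch, round(years_to_watch, 1))
```

Under that figure, 100 hours of video arrive every minute, and one day's uploads would take roughly 16 years to watch end to end.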

  7. Expanding horizons: large-scale recognition

     Expanding horizons: captioning
     https://pdollar.wordpress.com/2015/01/21/image-captioning/

     Expanding horizons: interactive visual search
     WhittleSearch – Adriana Kovashka et al.

     Expanding horizons: vision for autonomous vehicles
     KITTI dataset – Andreas Geiger et al.

     Expanding horizons: first-person vision
     Activities of Daily Living – Hamed Pirsiavash et al.

     Brainstorm
     Pick an application or task among any of those we’ve described so far.
     1. What functionality should the system have?
     2. Intuitively, what are the technical sub-problems that must be solved?

  8. Topics overview
     • Features & filters
     • Grouping & fitting
     • Multiple views
     • Recognition

     Goals of this course
     • Upper division honors undergrad course
     • Introduction to primary topics
       – Special focus on machine learning methods
       – Distinct from 378 3D Reconstruction, but some pieces of overlap
     • Hands-on experience with algorithms
     • Views of vision as a research area

     Features and filters
     Transforming and describing images; textures, colors, edges

     Grouping & fitting
     Clustering, segmentation, fitting; what parts belong together?
     [fig from Shi et al]

     Multiple views
     Matching, invariant features, stereo vision, instance recognition
     Credits: Lowe; Hartley and Zisserman

     Recognition and learning
     Recognizing categories (objects, scenes, activities, attributes…), learning techniques
     Credit: Fei-Fei Li

  9. Textbooks
     • Recommended book:
       – Computer Vision: Algorithms and Applications
       – By Rick Szeliski
       – http://szeliski.org/Book/

     Requirements / Grading
     • Problem sets (50%)
     • Midterm exam (15%)
     • Final exam (25%)
     • Class participation, including attendance (10%)
     • Check grades on Canvas
       – A quote from a prior student evaluation: “To be honest, I think without going to class, the course would be very hard.”

     Assignments
     • Some short answer concept questions
     • Programming problem
       – Implementation
       – Explanation, results
     • Code in Matlab – available on CS Unix machines (see course page)
     • Most of these assignments take significant time to do. We recommend starting early.

     Matlab
     • Built-in toolboxes for low-level image processing, visualization
     • Compact programs
     • Intuitive interactive debugging
     • Widely used in engineering

     Assignment 0
     • A0: Matlab warmup + basic image manipulation
     • Out today, due Fri Sept 4
     • Verify CS account and Matlab access
     • Look at the tutorial online

     Digital images
     Images as matrices

  10. Digital images
      • Image grid: width 520, height 500; indices start at i=1, j=1
      • Intensity: [0,255]
      • im[176][201] has value 164
      • im[194][203] has value 37

      Color images, RGB color space
      • Channels: R, G, B

      Preview of assignments
      • Seam carving
      • Grouping for segmentation
      • Image mosaics / stitching
      • Matching and recognition
      Image from Fei-Fei Li
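The "images as matrices" idea on the Digital images slides can be sketched in a few lines. This is a hedged illustration in plain Python rather than the Matlab used in the course: the tiny 4 x 5 image, the variable names, and the pixel values are all made up for the example, and Python indexes from 0 whereas the slide (like Matlab) starts at i=1, j=1.

```python
# A grayscale image as a matrix: one intensity in [0, 255] per pixel.
# Tiny stand-in for the 500 x 520 image on the slide; the values are invented.
height, width = 4, 5
gray = [[(row * width + col) * 10 for col in range(width)]
        for row in range(height)]

# Indexing is [row][column]. Python counts from 0, so gray[2][3] is the
# pixel in the third row and fourth column.
value = gray[2][3]

# A color image stores three values per pixel, one each for R, G and B.
rgb = [
    [(255, 0, 0), (0, 255, 0)],
    [(0, 0, 255), (255, 255, 255)],
]

# Pull out a single channel as its own matrix (0 = R, 1 = G, 2 = B).
red_channel = [[pixel[0] for pixel in row] for row in rgb]

print(value, red_channel)
```

In Matlab the same pixel would be written with 1-based indices, for example `im(3, 4)` for the third row and fourth column.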

  11. Collaboration policy
      All responses and code must be written individually unless otherwise specified. Students submitting answers or code found to be identical or substantially similar (due to inappropriate collaboration) risk failing the course.

      Preview of assignments
      • Object detection
      Image courtesy of James Hays

      Assignment deadlines
      • Due about every two weeks
        – Tentative deadlines posted online but could slightly shift depending on lecture pace
      • Assignments in by 11:59 PM on due date
        – Submit on Canvas, following submission instructions given in assignment.
        – Deadlines are firm. We’ll use timestamp.

      Miscellaneous
      • Slides, announcements via class website
      • No laptops, phones, etc. open in class please.
      • Use our office hours!

      Coming up
      • Now: check out Matlab tutorial online
      • A0 due Fri Sept 4
      • Textbook reading posted for next week
