(c) 2003 Thomas G. Dietterich

Perception (Vision)

  • Sensors

– images (RGB, infrared, multispectral, hyperspectral)
– touch sensors
– sound


Perceptual Tasks

  • Scene Understanding

– Reconstruct the location and orientation (“pose”) of all objects in the scene
– If objects are moving, determine their velocity (rotational and translational)

  • Object Recognition

– Identify object against arbitrary background
– Face recognition
– “Target” recognition

  • Task-specific Perception (minimum perception needed to carry out the task)

– Obstacle avoidance
– Landmark identification


Scene Understanding: Vision as Inverse Graphics

[Diagram: Computer Graphics maps the 3-D World to a 2-D Image; Computer Vision inverts this mapping]

Fundamental problem:

  • The 3-D → 2-D transformation loses information


3-D → 2-D Information Loss


Probabilistic Formulation

  • I: image
  • W: world
  • Goal:

– argmax_W P(W|I) = argmax_W P(I|W) · P(W)
– Which worlds are more likely?
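A minimal discrete sketch of this formulation (the candidate worlds and all probability values below are made up purely for illustration):

```python
import numpy as np

# Toy discrete version of argmax_W P(W|I) = argmax_W P(I|W) * P(W).
# Two candidate "worlds" and one observed image I; numbers are illustrative.
worlds = ["cube", "sphere"]
prior = np.array([0.7, 0.3])        # P(W): cubes assumed more common a priori
likelihood = np.array([0.2, 0.9])   # P(I|W): the observed image fits a sphere better

posterior_unnorm = likelihood * prior   # proportional to P(W|I)
best = worlds[int(np.argmax(posterior_unnorm))]
print(best)   # the likelihood outweighs the prior here
```

Even though the prior favors "cube", the likelihood term dominates, so the posterior argmax picks "sphere".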


Image Formation

  • Object location (x, y, z) and pose (r, θ, ω)
  • Object surface color
  • Object surface material (reflectance properties)
  • Light source position and color
  • Camera position and focal length

Inverse Graphics Fallacy

  • We don’t really need to know the location of every leaf on a tree to avoid hitting the tree while driving
  • Only extract the information necessary for intelligent behavior!

– obstacle avoidance
– face recognition
– finding objects in your room

  • The probabilistic framework is still useful in each of these tasks

We do not form complete models of the world from images


Another Example


And Another


The Point:

  • We only attend to the “relevant” part of the image


Computer Vision


Bottom-Up vs. Top-Down

  • Bottom-Up processing

– starts with the image and performs operations in parallel on each pixel
– find edges, find regions
– extract other important cues C

  • Top-Down processing

– starts with P(W) expectations
– computes P(C | W) for groups of cues C


Edge Detection


Edge Detection (2)

[Grid of grey-level pixel intensity values from the image; brightness falls from roughly 255 to 93 across the region, forming an edge]


Look for changes in brightness

  • Compute Spatial Derivative
  • Compute Magnitude
  • Threshold

∂I(x, y)/∂x , ∂I(x, y)/∂y

magnitude = sqrt( (∂I(x, y)/∂x)² + (∂I(x, y)/∂y)² )
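The three steps above (derivative, magnitude, threshold) can be sketched in a few lines of numpy; `edge_map`, the threshold value, and the toy image are illustrative choices, not part of the original slides:

```python
import numpy as np

def edge_map(I, theta):
    """Mark pixels where the gradient magnitude exceeds threshold theta."""
    dIdy, dIdx = np.gradient(I.astype(float))   # spatial derivatives (rows = y, cols = x)
    mag = np.sqrt(dIdx**2 + dIdy**2)            # gradient magnitude
    return mag > theta

# A synthetic image: dark on the left, bright on the right.
I = np.zeros((8, 8))
I[:, 4:] = 100.0
edges = edge_map(I, theta=10.0)
```

On this clean step image, only the pixels straddling the brightness jump are marked as edges.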


Problem: Images are Noisy

  • intensity values:
  • derivative:

[Plot: a noisy intensity step from 10 to 100 and its derivative; a noise spike also crosses the threshold, so a false edge is detected alongside the true edge]


Solution: Smooth Edges Prior to Edge Detection

[Plots: the noisy intensity profile, its smoothed version, and the derivative of the smoothed intensities]

Derivative of Smoothed Intensities:


Efficient Implementation: Convolutions

  • Smoothing: Convolve image with gaussian
  • f(x,y) = I(x,y) the image intensities
  • g(u,v) = (1 / (2πσ²)) · e^(−(u² + v²) / (2σ²))

h = f ∗ g

h(x, y) = Σ_{u=−∞..+∞} Σ_{v=−∞..+∞} f(u, v) · g(x − u, y − v)
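A minimal sketch of Gaussian smoothing by convolution, exploiting the fact that the 2-D Gaussian is separable so rows and columns can be convolved independently; the function names, σ, and kernel radius are illustrative choices:

```python
import numpy as np

def gaussian_kernel(sigma, radius):
    """Sampled, normalized 1-D Gaussian."""
    u = np.arange(-radius, radius + 1)
    g = np.exp(-u**2 / (2 * sigma**2))
    return g / g.sum()   # normalize so smoothing preserves mean brightness

def smooth(I, sigma=1.0, radius=3):
    """Separable Gaussian smoothing: convolve rows, then columns."""
    g = gaussian_kernel(sigma, radius)
    rows = np.apply_along_axis(lambda r: np.convolve(r, g, mode="same"), 1, I)
    return np.apply_along_axis(lambda c: np.convolve(c, g, mode="same"), 0, rows)

# Smoothing a noisy constant-brightness patch reduces the pixel-to-pixel variation.
noisy = np.random.default_rng(0).normal(50, 10, size=(16, 16))
smoothed = smooth(noisy)
```

Away from the image borders, the smoothed patch has visibly lower variance than the raw noise.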


Convolutions can be performed using Fast Fourier Transform

  • FFT[f ∗ g] = FFT[f] · FFT[g]

– The FFT of a convolution is the product of the FFTs of the functions

  • f ∗ g = FFT⁻¹(FFT[f] · FFT[g])
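The convolution theorem can be checked numerically on 1-D signals (the FFT computes the circular convolution, so the direct sum below wraps indices to match):

```python
import numpy as np

# Check FFT[f*g] = FFT[f] . FFT[g] numerically, i.e.
# f*g = IFFT(FFT[f] . FFT[g]) for circular convolution.
rng = np.random.default_rng(1)
f = rng.normal(size=32)
g = rng.normal(size=32)

# Circular convolution computed directly from the definition...
direct = np.array([sum(f[u] * g[(x - u) % 32] for u in range(32))
                   for x in range(32)])
# ...and via the FFT.
via_fft = np.real(np.fft.ifft(np.fft.fft(f) * np.fft.fft(g)))

print(np.allclose(direct, via_fft))   # prints True
```

For an n-point signal this turns an O(n²) sum into O(n log n) work, which is why large smoothing kernels are applied this way.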


Computing the Derivative

  • (f * g)’ = f * (g’)

– The derivative of a convolution can be computed by first differentiating one of the functions

  • To take the derivative of the image after gaussian smoothing, first differentiate the gaussian and then smooth with that!
  • Can only be done in one dimension: do it separately for x and y.
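A small 1-D numerical check of (f ∗ g)′ = f ∗ (g′): with a discrete finite-difference operator d, associativity of convolution makes "smooth then differentiate" identical to "differentiate the kernel first" (the signal and σ below are arbitrary illustrations):

```python
import numpy as np

# Discrete check of (f*g)' = f*(g'): with finite-difference operator d,
# associativity of convolution gives d*(f*g) = f*(d*g) exactly.
sigma = 2.0
u = np.arange(-6, 7)
g = np.exp(-u**2 / (2 * sigma**2))
g /= g.sum()                             # sampled Gaussian kernel

f = np.sin(np.linspace(0, 3, 50))        # a stand-in 1-D "image" row
d = np.array([1.0, -1.0])                # finite-difference (discrete derivative)

smooth_then_diff = np.convolve(np.convolve(f, g), d)
diff_kernel_first = np.convolve(f, np.convolve(g, d))
print(np.allclose(smooth_then_diff, diff_kernel_first))   # prints True
```

Differentiating the small kernel once is far cheaper than differentiating every smoothed image.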


Canny Edge Detector

  • Define an edge where R(x, y) > θ (a threshold)

f_V(u, v) = G′_σ(u) · G_σ(v)
f_H(u, v) = G_σ(u) · G′_σ(v)

R_V = I ∗ f_V
R_H = I ∗ f_H
R(x, y) = R_V(x, y)² + R_H(x, y)²
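A hedged sketch of this separable edge response (not the full Canny detector, which also does non-maximum suppression and hysteresis; σ, radius, and the test image are illustrative):

```python
import numpy as np

def gaussian_and_derivative(sigma, radius):
    """Sampled Gaussian G and its derivative G' on [-radius, radius]."""
    u = np.arange(-radius, radius + 1, dtype=float)
    G = np.exp(-u**2 / (2 * sigma**2))
    G /= G.sum()
    Gp = -u / sigma**2 * G
    return G, Gp

def conv_sep(I, row_k, col_k):
    """Convolve each row with row_k, then each column with col_k."""
    tmp = np.apply_along_axis(lambda r: np.convolve(r, row_k, mode="same"), 1, I)
    return np.apply_along_axis(lambda c: np.convolve(c, col_k, mode="same"), 0, tmp)

def edge_response(I, sigma=1.0, radius=3):
    """R = R_V^2 + R_H^2 from the two separable filters f_V and f_H."""
    G, Gp = gaussian_and_derivative(sigma, radius)
    RV = conv_sep(I, Gp, G)   # f_V: differentiate across columns, smooth down rows
    RH = conv_sep(I, G, Gp)   # f_H: smooth across columns, differentiate down rows
    return RV**2 + RH**2

I = np.zeros((12, 12))
I[:, 6:] = 1.0               # vertical step edge
R = edge_response(I)
```

The response R is large only next to the vertical step, so thresholding R > θ recovers the edge.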


Results


Interpreting Edges

  • Edges can be caused by many different phenomena in the world:

– depth discontinuities
– changes in surface orientation
– changes in surface color
– changes in illumination


Example Optical Illusion

[Movie: steps]


Bayesian Model-Based Vision

(Dan Huttenlocher & Pedro Felzenszwalb)

  • Goal: Locate and track people in images


White Lie Warning

  • The actual method is significantly different from the version I’m describing here
  • For the real story, see the following paper:

– Efficient Matching of Pictorial Structures, Proceedings of the IEEE Computer Vision and Pattern Recognition Conference, pp. 66-73, 2000
– http://www.cs.cornell.edu/~dph/


Probabilistic Model of a Person

  • 10 body parts connected at points
  • probability distribution over the locations of the points
  • probability distribution over relative orientations of the parts
  • appearance distribution tells what each part looks like

P(L|I) ∝ P(I|L) · P(L)


Relationship between body part locations

  • Each body part is represented as a rectangle
  • si = degree of foreshortening
  • (xj, yj) = relative offset
  • θi,j = relative orientation

[Diagram: two rectangles joined at a point, annotated with si, (xj, yj), θij, and (xi, yi)]


Bayesian Network Model

[Bayesian network diagram: the torso node (xi, yi, si) connects to limb nodes such as the left upper arm (xj, yj, sj) and (xk, yk, sk), with angle parameters θi,j, θj,k and variances σx,i, σy,i, σx,j, σy,j, σx,k, σy,k]

P(si) = Gauss(si; 1, σs,i)
P(xj | xi, si) = Gauss(xj; xi + δx,i,j · si, σx,i)
P(yj | yi, si) = Gauss(yj; yi + δy,i,j · si, σy,i)
P(θi,j) = vonMises(θi,j; µi,j, ki,j)
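A hedged sketch of sampling one link of this generative model, with Gaussians for foreshortening and joint offsets and a von Mises distribution for the relative angle. Every numeric parameter below (positions, offsets, variances, µ, k) is made up purely for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)

# Torso position (chosen in "Step 1"); hypothetical values:
x_i, y_i = 100.0, 80.0

# Foreshortening: P(s_i) = Gauss(s_i; 1, sigma_s)
s_i = rng.normal(loc=1.0, scale=0.1)

# Joint position of a connected part, with the nominal offset scaled
# by the foreshortening, as in P(x_j | x_i, s_i):
delta_x, delta_y = 20.0, -5.0          # nominal joint offset (illustrative)
sigma_x, sigma_y = 2.0, 2.0
x_j = rng.normal(loc=x_i + delta_x * s_i, scale=sigma_x)
y_j = rng.normal(loc=y_i + delta_y * s_i, scale=sigma_y)

# Relative orientation: P(theta_ij) = vonMises(mu, k); the von Mises
# distribution is the natural "Gaussian on a circle" for angles.
theta_ij = rng.vonmises(mu=0.5, kappa=4.0)
```

Repeating this down the kinematic tree (torso, upper arms, forearms, legs, head) generates a full body configuration L, exactly the sequence of steps the next slides walk through.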


Generating a Person: Step 1: Position of Torso


Step 2: Foreshortening of Torso


Step 3: Arm, Leg, and Head Joints


Choose Angle for Each Body Part


Choose Foreshortening for each part


Choose joints of next parts


Choose Angles of Forearms and Lower Legs


Choose foreshortening of forearms and lower legs


Appearance Model

  • Each pixel z is either a foreground pixel (a body part) or a background pixel.
  • P(fz = true | z ∈ Area1) = q1
  • P(fz = true | z ∈ Area2) = q2
  • P(fz = true | z ∈ Area3) = 0.5

[Diagram: Area 1, Area 2, and Area 3 (the whole image)]


Appearance Model (2)

  • Each part has an average grey level (and a variance). Each pixel z generates its grey level from a Gaussian distribution:

– P(gz | fz=true, z ∈ parti) = Gauss(gz; µi, σi)

  • Background pixels have an average grey level and variance:

– P(gz | fz=false, z ∈ background) = Gauss(gz; µb, σb)

  • Does not handle overlapping body parts


Generating the Image

  • Generate body location and pose
  • Generate foreground/background for each pixel independently
  • Generate pixel grey levels
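The last two steps can be sketched for a single rectangular part; the foreground probabilities, grey-level means, and part geometry below are all made-up illustrative parameters, not values from the paper:

```python
import numpy as np

rng = np.random.default_rng(2)
H, W = 20, 20
q1 = 0.9                          # P(foreground) inside the part (illustrative)
q_bg = 0.05                       # P(foreground) elsewhere (illustrative)
mu_part, sigma_part = 80.0, 5.0   # dark part against...
mu_bg, sigma_bg = 160.0, 5.0      # ...a bright background

# One rectangular "part" occupying rows 5..14, cols 8..11:
in_part = np.zeros((H, W), dtype=bool)
in_part[5:15, 8:12] = True

# Step: foreground/background drawn independently per pixel.
foreground = rng.random((H, W)) < np.where(in_part, q1, q_bg)

# Step: each pixel's grey level from the matching Gaussian.
grey = np.where(foreground,
                rng.normal(mu_part, sigma_part, (H, W)),
                rng.normal(mu_bg, sigma_bg, (H, W)))
```

The resulting image is darker inside the part rectangle than in the background, which is what the likelihood P(I|L) rewards when the hypothesized part location is correct.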

Training

  • All model parameters can be fit by supervised training

– Manually identify location and orientation of body parts
– Fit joint location and angle distributions, foreshortening distributions
– Fit q1 and q2 foreground probabilities
– Fit grey level distributions


Examples


More Examples


More examples


Implementation Tricks

  • argmax_L P(L|I)

– In theory this would require iterating over all locations L in the image I
– In practice, the authors developed clever algorithms that use gaussian filter banks to find promising locations and dynamic programming methods to compute the probabilities


Task-Specific Computer Vision: CMU NavLab Autonomous Driving

  • Camera mounted on the rear-view mirror takes an image of the road ahead of the vehicle
  • Goal: Determine the curvature of the road and the location of the vehicle in the lane


NavLab (2)

  • Trapezoidal region is extracted

– based on camera geometry and vehicle speed
– so that each scan line in the trapezoid covers the same size region in the physical world (assuming a flat road surface)

  • The trapezoidal image is then re-sampled to produce a rectangular image
  • For each of several road curvature hypotheses, the rectangular image is recomputed to produce an image that would be straight if the curvature hypothesis is correct
  • These images are scored to see which one gives the straightest image, and the corresponding curvature hypothesis is accepted


RALPH images


Choosing the Best Road Curvature Hypothesis

  • argmax_h straightness(transformedImage(I, h))


Measuring Straightness

S(x) = Σ_y I(x, y)

straightness = Σ_x |S(x) − S(x+1)|
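The score above is easy to sketch: sum each column, then add up the absolute differences between adjacent column sums. A crisp vertical lane marking concentrates brightness in a few columns and so produces sharp column-sum transitions, while a still-curved marking smears its brightness across columns (the toy images below are illustrative):

```python
import numpy as np

def straightness(I):
    S = I.sum(axis=0)                 # S(x) = sum_y I(x, y)
    return np.abs(np.diff(S)).sum()   # sum_x |S(x) - S(x+1)|

straight = np.zeros((10, 10))
straight[:, 4] = 255.0                # vertical stripe: correctly unwarped

curved = np.zeros((10, 10))           # stripe drifts sideways: wrong hypothesis
for y in range(10):
    curved[y, 4 + y // 4] = 255.0

print(straightness(straight) > straightness(curved))   # prints True
```

The curvature hypothesis whose transformed image maximizes this score is the one accepted.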


Discussion

  • Method works for any kind of systematic coloring of the road surface

– lane marking
– ruts
– tire tracks in snow or rain
– oil droppings in center of lane


Determining Lateral Position

  • At a time when the vehicle is centered in the lane, store a template S(x) for all columns x.

– driver pushes a button

  • Compare the current S(x) to the stored template under various lateral offsets to find the best match; this gives the lateral position.
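A minimal sketch of that comparison: slide the stored column-sum template across a range of lateral offsets and keep the offset with the smallest mismatch. The function name, the sum-of-absolute-differences mismatch measure, and the toy profiles are assumptions for illustration:

```python
import numpy as np

def best_offset(template, current, max_shift=5):
    """Return the lateral shift (in columns) that best aligns template to current."""
    scores = {}
    for d in range(-max_shift, max_shift + 1):
        shifted = np.roll(template, d)
        scores[d] = np.abs(shifted - current).sum()   # mismatch at this offset
    return min(scores, key=scores.get)

template = np.zeros(40)
template[18:22] = 100.0                 # S(x) stored while centered in the lane
current = np.roll(template, 3)          # vehicle has drifted 3 columns sideways
print(best_offset(template, current))   # prints 3
```

The recovered offset directly gives the vehicle's lateral displacement from the lane center.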


Rapidly Learning New Templates

  • Subdivide the current rectangle into 2 parts

– Near field is used to determine the current lateral position
– Far field is used to capture a new template


No Hands Across America

  • 2797/2849 miles (98.2%) driven autonomously


Computer Vision Summary

  • Many different visual tasks require different amounts of analysis
  • Inverse computer graphics is overkill in most cases
  • Low-level vision: smoothing, edge detection, region finding

– example: Canny edge detector

  • Probabilistic vision methods: H&F people tracker
  • Task-specific vision: NavLab lane keeper