Cue combinations, Bayesian models, Thurs. March 1, 2018



slide-1
SLIDE 1

COMP 546

Lecture 15

Cue combinations, Bayesian models

  • Thurs. March 1, 2018
slide-2
SLIDE 2

Visual Cues:

image properties that can tell us about scene properties

Image

  • texture (size, shape, density)
  • shading
  • binocular disparities
  • motion (from moving observer)
  • defocus blur

Scene

  • depth gradient (slant, tilt)
  • surface curvature
  • depth

slide-3
SLIDE 3

Last lecture: Likelihood

$p(I = i \mid S = s)$

  • Probability of measuring image $I = i$, when the scene is $S = s$ (called the "likelihood" of scene $S = s$, given the image $I = i$).

  • Maximum likelihood method: choose the $S = s$ that maximizes $p(I = i \mid S = s)$.
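
The maximum likelihood method can be sketched numerically. This is a minimal illustration, assuming a Gaussian-shaped likelihood evaluated on a grid of candidate slants; the function names, the grid, and the values of `s_peak` and `sigma` are hypothetical and not taken from the lecture.

```python
import math

def gaussian_likelihood(s, s_peak, sigma):
    """Unnormalized Gaussian-shaped likelihood p(I = i | S = s), as a function of s."""
    return math.exp(-((s - s_peak) ** 2) / (2.0 * sigma ** 2))

def maximum_likelihood(candidates, likelihood):
    """Return the candidate scene value s with the largest likelihood."""
    return max(candidates, key=likelihood)

# Hypothetical candidates: slants from 0 to 90 degrees in 1-degree steps.
candidates = [float(s) for s in range(0, 91)]

def likelihood(s):
    # Hypothetical likelihood for one measured image, peaking at 30 degrees.
    return gaussian_likelihood(s, s_peak=30.0, sigma=5.0)

s_ml = maximum_likelihood(candidates, likelihood)
print(s_ml)
```

A grid search is used only because it is easy to read; for a single Gaussian the maximum is simply at `s_peak`.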

slide-4
SLIDE 4

This lecture: How to combine cues?

$p(I_1, I_2 \mid S)$

slide-5
SLIDE 5

Example [Hillis 2004]: three stimulus conditions, namely texture only (monocular), stereo only, and texture and stereo.

slide-6
SLIDE 6

Assume the likelihood function is "conditionally independent":

$p(I_1, I_2 \mid S) = p(I_1 \mid S)\, p(I_2 \mid S)$

e.g. $I_1$ is texture, $I_2$ is binocular disparity.
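
Conditional independence means the joint likelihood is just the product of the per-cue likelihoods. The sketch below assumes Gaussian-shaped likelihoods; the cue parameters and the names `gaussian_likelihood` and `joint_likelihood` are hypothetical.

```python
import math

def gaussian_likelihood(s, s_peak, sigma):
    """Unnormalized Gaussian-shaped likelihood of scene value s for one cue."""
    return math.exp(-((s - s_peak) ** 2) / (2.0 * sigma ** 2))

def joint_likelihood(s, cues):
    """p(I1, I2, ... | S = s) under conditional independence.

    cues is a list of (s_peak, sigma) pairs, one pair per cue.
    """
    product = 1.0
    for s_peak, sigma in cues:
        product *= gaussian_likelihood(s, s_peak, sigma)
    return product

# Hypothetical parameters: (texture cue, binocular disparity cue).
cues = [(28.0, 4.0), (34.0, 8.0)]
print(joint_likelihood(30.0, cues))
```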

slide-7
SLIDE 7

Assume $p(I_1 = i_1 \mid S = s)$ and $p(I_2 = i_2 \mid S = s)$ are Gaussian shaped.

[Figure: the likelihood curves $p(I_1 \mid S)$ and $p(I_2 \mid S)$ plotted against $S = s$.]

slide-8
SLIDE 8

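Assume $p(I_1 = i_1 \mid S = s)$ and $p(I_2 = i_2 \mid S = s)$ are Gaussian shaped. Their maxima might occur at different values of $s$, labelled $s_1$ and $s_2$. Why?

[Figure: the likelihood curves $p(I_1 \mid S)$ and $p(I_2 \mid S)$ plotted against $S = s$, with maxima at $s_1$ and $s_2$.]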

slide-9
SLIDE 9

We want to find the $s$ that maximizes:

$p(I_1 \mid S = s)\, p(I_2 \mid S = s) = e^{-\frac{(s - s_1)^2}{2\sigma_1^2}}\, e^{-\frac{(s - s_2)^2}{2\sigma_2^2}}$

slide-10
SLIDE 10

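We want to find the $s$ that maximizes:

$p(I_1 \mid S = s)\, p(I_2 \mid S = s) = e^{-\frac{(s - s_1)^2}{2\sigma_1^2}}\, e^{-\frac{(s - s_2)^2}{2\sigma_2^2}}$

So, taking the negative logarithm, we want to find the $s$ that minimizes:

$\frac{(s - s_1)^2}{2\sigma_1^2} + \frac{(s - s_2)^2}{2\sigma_2^2}$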
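
The minimization can be worked through in a few lines. This is a sketch of the standard calculus argument, not a copy of the lecture notes; the name $E(s)$ for the quantity being minimized is an assumption introduced here.

```latex
\begin{align*}
E(s) &= \frac{(s - s_1)^2}{2\sigma_1^2} + \frac{(s - s_2)^2}{2\sigma_2^2} \\
\frac{dE}{ds} &= \frac{s - s_1}{\sigma_1^2} + \frac{s - s_2}{\sigma_2^2} = 0 \\
s &= \frac{\sigma_2^2\, s_1 + \sigma_1^2\, s_2}{\sigma_1^2 + \sigma_2^2}
\end{align*}
```

The minimizing $s$ is a weighted average of $s_1$ and $s_2$, with each peak weighted by the other cue's variance.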

slide-11
SLIDE 11

"Linear Cue Combination"

The lecture notes show that the solution $S = s$ is where

$s = w_1 s_1 + w_2 s_2$, with $w_1 + w_2 = 1$ and $0 < w_i < 1$.

slide-12
SLIDE 12

The lecture notes show that the solution $S = s$ is where

$s = w_1 s_1 + w_2 s_2$, with $w_1 + w_2 = 1$ and $0 < w_i < 1$, and

$w_1 = \frac{\sigma_2^2}{\sigma_1^2 + \sigma_2^2}$, $w_2 = \frac{\sigma_1^2}{\sigma_1^2 + \sigma_2^2}$

Thus, the less reliable cue (larger $\sigma$) gets less weight.
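
The weights can be computed directly from the two standard deviations. This is a minimal sketch of linear cue combination; the function names and the numeric values are hypothetical.

```python
def cue_weights(sigma1, sigma2):
    """Weights for linear cue combination: each cue is weighted by the other cue's variance."""
    total = sigma1 ** 2 + sigma2 ** 2
    w1 = sigma2 ** 2 / total
    w2 = sigma1 ** 2 / total
    return w1, w2

def combine_cues(s1, sigma1, s2, sigma2):
    """Combined estimate s = w1 * s1 + w2 * s2."""
    w1, w2 = cue_weights(sigma1, sigma2)
    return w1 * s1 + w2 * s2

# Hypothetical values: cue 1 peaks at 28 deg (sigma 4), cue 2 at 34 deg (sigma 8).
w1, w2 = cue_weights(4.0, 8.0)
s_hat = combine_cues(28.0, 4.0, 34.0, 8.0)
print(w1, w2, s_hat)
```

With these numbers the more reliable cue gets a weight of about 0.8, so the combined estimate (about 29.2 deg) lies much closer to 28 than to 34.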

slide-13
SLIDE 13

Example [Hillis 2004]: texture only (monocular) and stereo only.

Measure slant discrimination thresholds for each cue in isolation. Estimate the likelihood function parameters ($s_1$, $\sigma_1$, $s_2$, $\sigma_2$).

slide-14
SLIDE 14

texture and stereo

… then

  • present cues together
  • measure thresholds for $S$
  • convert thresholds to likelihood parameters ($s$, $\sigma$)
slide-15
SLIDE 15

texture and stereo

… then

  • present cues together
  • measure thresholds for $S$
  • convert thresholds to likelihood parameters ($s$, $\sigma$)
  • examine if these values are consistent with the model*

$s = w_1 s_1 + w_2 s_2$

*The model also makes a prediction about $\sigma$ in the combined case.

slide-16
SLIDE 16

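texture and stereo

[Figure: the likelihood curves $p(I_1 \mid S)$ and $p(I_2 \mid S)$ plotted against $S = s$, with maxima at $s_1$ and $s_2$.]

The experimenter can manipulate $s_1$, $s_2$, $\sigma_1$, $\sigma_2$ and predict the effect on the perception of slant.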

slide-17
SLIDE 17

slide-18
SLIDE 18

$p(I = i \mid S = s) \neq p(S = s \mid I = i)$

The left side is the likelihood of scene $s$, given image $i$. The right side is the probability of scene $s$, given image $i$. What is the crucial difference?

slide-19
SLIDE 19
slide-20
SLIDE 20

Three scenes: wire frame with independently chosen depths, regular solid cube, flat drawing. [Kersten & Yuille 2003]

All scenes above have the same likelihood $p(I = i \mid S = s)$. Why do we prefer the regular solid cube?

slide-21
SLIDE 21

Some scenes may have a larger probability $p(S = s)$. The marginal probability $p(S = s)$ is called the "prior".

slide-22
SLIDE 22

$p(I \mid S) \equiv \frac{p(I, S)}{p(S)}$

$p(S \mid I) \equiv \frac{p(I, S)}{p(I)}$

Thus,

$p(I \mid S)\, p(S) = p(S \mid I)\, p(I)$

slide-23
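SLIDE 23

Bayes Theorem

$p(I \mid S) \equiv \frac{p(I, S)}{p(S)}$

$p(S \mid I) \equiv \frac{p(I, S)}{p(I)}$

Thus,

$p(S \mid I) = \frac{p(I \mid S)\, p(S)}{p(I)}$

Here $p(S \mid I)$ is the posterior, $p(I \mid S)$ the likelihood, $p(S)$ the scene prior, and $p(I)$ the image prior.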
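
Bayes theorem is easy to check on a small discrete example. The sketch below assumes three candidate scenes with equal likelihoods, loosely modelled on the cube example; all the numbers and the function name `posterior` are hypothetical.

```python
def posterior(likelihood, prior):
    """Apply Bayes theorem over a discrete set of scenes.

    likelihood[s] = p(I = i | S = s) and prior[s] = p(S = s);
    the result maps each s to p(S = s | I = i).
    """
    joint = {s: likelihood[s] * prior[s] for s in prior}
    p_image = sum(joint.values())  # image prior p(I = i)
    return {s: joint[s] / p_image for s in joint}

# Hypothetical numbers: equal likelihoods, unequal scene priors.
likelihood = {"wire frame": 0.5, "solid cube": 0.5, "flat drawing": 0.5}
prior = {"wire frame": 0.1, "solid cube": 0.7, "flat drawing": 0.2}

post = posterior(likelihood, prior)
print(max(post, key=post.get))
```

Because the likelihoods are equal, the posterior ranking is decided entirely by the prior.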

slide-24
SLIDE 24

Maximum 'a Posteriori' (MAP)

Given an image, $I = i$, find the scene $S = s$ that maximizes $p(S = s \mid I = i)$.

$p(S \mid I) = \frac{p(I \mid S)\, p(S)}{p(I)}$

Here $p(S \mid I)$ is the posterior, $p(I \mid S)$ the likelihood, $p(S)$ the scene prior, and $p(I)$ the image prior.

slide-25
SLIDE 25

Maximum 'a Posteriori' (MAP)

Given an image, $I = i$, find the scene $S = s$ that maximizes $p(S = s \mid I = i)$.

$p(S \mid I) = \frac{p(I \mid S)\, p(S)}{p(I)}$

Here $p(S \mid I)$ is the posterior, $p(I \mid S)$ the likelihood, $p(S)$ the scene prior, and $p(I)$ the image prior.

We don't care about $p(I = i)$. Why not?
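
One way to see the answer: $p(I = i)$ does not depend on $s$, so dividing by it cannot change which $s$ gives the maximum. Written out with the symbols already used on the slide:

```latex
\[
\arg\max_{s}\, p(S = s \mid I = i)
= \arg\max_{s}\, \frac{p(I = i \mid S = s)\, p(S = s)}{p(I = i)}
= \arg\max_{s}\, p(I = i \mid S = s)\, p(S = s)
\]
```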

slide-26
SLIDE 26

$p(S \mid I) = \frac{p(I \mid S)\, p(S)}{p(I)}$

Here $p(S \mid I)$ is the posterior, $p(I \mid S)$ the likelihood, $p(S)$ the scene prior, and $p(I)$ the image prior, which is a constant.

If the prior $p(S)$ is uniform then maximum likelihood gives the same solution as maximum a posteriori (MAP). Interesting cases arise when the prior is non-uniform.
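
The difference between a uniform and a non-uniform prior can be seen in a small grid search. This is a sketch under assumed Gaussian shapes; the peaks, widths, and function names are hypothetical.

```python
import math

def gaussian(s, peak, sigma):
    """Unnormalized Gaussian-shaped function of s."""
    return math.exp(-((s - peak) ** 2) / (2.0 * sigma ** 2))

def likelihood(s):
    # Hypothetical likelihood p(I = i | S = s), peaking at 40 degrees.
    return gaussian(s, 40.0, 10.0)

def uniform_prior(s):
    return 1.0

def peaked_prior(s):
    # Hypothetical non-uniform prior p(S = s), peaking at 20 degrees.
    return gaussian(s, 20.0, 10.0)

candidates = [float(s) for s in range(0, 91)]

# p(I = i) is omitted everywhere because it does not depend on s.
s_ml = max(candidates, key=likelihood)
s_map_uniform = max(candidates, key=lambda s: likelihood(s) * uniform_prior(s))
s_map_peaked = max(candidates, key=lambda s: likelihood(s) * peaked_prior(s))
print(s_ml, s_map_uniform, s_map_peaked)
```

With the uniform prior the MAP estimate coincides with the maximum likelihood estimate; with the peaked prior it is pulled toward the prior's peak.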

slide-27
SLIDE 27

[Figure: a likelihood and a prior.]

slide-28
SLIDE 28

http://www.youtube.com/watch?v=Ttd0YjXF0no

Ames Room

https://www.youtube.com/watch?v=gJhyu6nlGt8

slide-29
SLIDE 29

Priors ("Natural Scene Statistics")

  • intensity
  • orientation of image lines, edges
  • disparity
  • motion
  • surface slant, tilt
slide-30
SLIDE 30
  • orientation $\theta$ of lines, edges

$p(S = \theta)$ [Girshick 2011]

People are indeed better at discriminating vertical and horizontal orientations than oblique orientations. Why? Because they use a prior?

slide-31
SLIDE 31

surface slant $\sigma$ and tilt $\tau$

Here we represent (slant, tilt) using a concave hemisphere. See next slide.

[Figure labels: floor, ceiling]

slide-32
SLIDE 32

$p(S = (\sigma, \tau))$ [Adams & Elder 2016]

We represent slants and tilts using a concave hemisphere. Each disk shows $p(\sigma, \tau)$ for surfaces visible over a range of viewing direction elevations, relative to the line of sight.

slide-33
SLIDE 33

$p(S = (\sigma, \tau))$

slide-34
SLIDE 34

$p(S = (\sigma, \tau))$

slide-35
SLIDE 35

Maximum a Posteriori (MAP)

$p(S \mid I = i) \propto p(I = i \mid S)\, p(S)$

That is, the posterior is proportional to the likelihood times the prior; the constant $p(I = i)$ is dropped.

Choose the $S$ = (slant, tilt) that maximizes the posterior.

slide-36
SLIDE 36

Likelihood functions can have more than one maximum, i.e. overall convex or concave?

[Figure: likelihood $p(I = i \mid S)$ over (slant, tilt).]

slide-37
SLIDE 37

Depth Reversal Ambiguity and Shading

[Figure: likelihood $p(I = i \mid S)$ over (slant, tilt).]

A valley illuminated from the right produces the same shading as a hill illuminated from the left (see Exercise).
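
The depth reversal ambiguity can be checked numerically. The sketch below assumes Lambertian shading (intensity equal to the dot product of unit normal and light direction, with shadows ignored) on a one-dimensional Gaussian bump; the shading model, the bump shape, the 30 degree light angle, and all names are assumptions made for illustration.

```python
import math

def lambertian_shading(slope, light):
    """Shading n . l for a 1-D height profile z(x) with local slope dz/dx.

    The unit normal is (-slope, 1) / sqrt(1 + slope^2); light is a unit
    vector (l_x, l_z) pointing toward the light source.
    """
    l_x, l_z = light
    return (-slope * l_x + l_z) / math.sqrt(1.0 + slope ** 2)

def bump_slope(x, sign):
    """Slope of z(x) = sign * exp(-x^2): sign=+1 is a hill, sign=-1 is a valley."""
    return sign * (-2.0 * x) * math.exp(-x * x)

xs = [i / 10.0 for i in range(-20, 21)]
angle = math.radians(30.0)  # hypothetical light angle away from vertical
light_from_left = (-math.sin(angle), math.cos(angle))
light_from_right = (math.sin(angle), math.cos(angle))

hill_lit_from_left = [lambertian_shading(bump_slope(x, +1), light_from_left) for x in xs]
valley_lit_from_right = [lambertian_shading(bump_slope(x, -1), light_from_right) for x in xs]

same_image = all(math.isclose(a, b) for a, b in zip(hill_lit_from_left, valley_lit_from_right))
print(same_image)
```

Negating the heights flips the sign of every slope, and mirroring the light flips the sign of its horizontal component, so the two sign changes cancel in the dot product.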

slide-38
SLIDE 38

What "priors" does the visual system use to resolve such twofold ambiguities? Let's look at a few related examples.

slide-39
SLIDE 39

You can perceive the center point as a hill or a valley. When you see it as a hill, you perceive the tilt as 180 deg (leftward). But when you see it as a valley, the tilt is 0 deg (rightward).

slide-40
SLIDE 40

We tend to see the center as a hill. Why ?

slide-41
SLIDE 41

We tend to see the center as a valley. Why ?

slide-42
SLIDE 42

The visual system uses three priors to resolve the depth reversal ambiguity:

  • surface orientation: p(floor) > p(ceiling)
  • light source direction: p(above) > p(below)
  • 'global' surface curvature: p(convex) > p(concave)
slide-43
SLIDE 43

Example in which all three prior assumptions are met

  • light from above
  • viewpoint from above (floor)
  • shape is convex

slide-44
SLIDE 44

Example in which all three prior assumptions fail

  • shape is concave
  • viewpoint from below (ceiling)
  • light from below

slide-45
SLIDE 45

floor ceiling

Convex shape, illuminated from above the line of sight

slide-46
SLIDE 46

floor ceiling

Concave shape, illuminated from below the line of sight

slide-47
SLIDE 47

We showed how people combined the three different "priors".

Percent correct in judging local "hill" or "valley":

  = 50
    +/- 10 (floor vs. ceiling)
    +/- 10 (light from above vs. below)
    +/- 10 (globally convex/concave)

[Langer and Buelthoff, 2001]
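
Read as an additive model, the formula above gives one prediction per combination of the three priors. The sketch below enumerates all eight combinations; it assumes each term is +10 when the stimulus agrees with the prior and -10 otherwise, and the function and constant names are hypothetical.

```python
from itertools import product

BASELINE = 50  # percent correct at chance
EFFECT = 10    # percentage points contributed by each prior, as on the slide

def predicted_percent_correct(floor, light_above, convex):
    """Each argument is True when the stimulus agrees with that prior."""
    total = BASELINE
    for prior_met in (floor, light_above, convex):
        total += EFFECT if prior_met else -EFFECT
    return total

# All eight combinations of (floor, light_above, convex).
predictions = {
    combo: predicted_percent_correct(*combo)
    for combo in product([True, False], repeat=3)
}
print(max(predictions.values()), min(predictions.values()))
```

The extremes are 80 when all three priors are met and 20 when all three fail.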

slide-48
SLIDE 48

Best (80%) Worst (20%)

slide-49
SLIDE 49

These look weird, but in different ways. How ?

slide-50
SLIDE 50

Reminder

  • A2 is due tonight
  • Midterm (optional) is first class after Study Break