SLIDE 1

Perceptual and Sensory Augmented Computing Computer Vision WS 08/09

Image Segmentation

Marc Pollefeys ETH Zurich

Slide credits:

  • V. Ferrari, K. Grauman, B. Leibe, S. Lazebnik,
  • S. Seitz, Y. Boykov, W. Freeman, P. Kohli

SLIDE 2

Topics of This Lecture

  • Introduction
      ➢ Gestalt principles
      ➢ Image segmentation
  • Segmentation as clustering
      ➢ k-Means
      ➢ Feature spaces
      ➢ Mixture of Gaussians, EM
  • Model-free clustering: Mean-Shift
  • Graph theoretic segmentation: Normalized Cuts
  • Interactive Segmentation with GraphCuts

SLIDE 3

Examples of Grouping in Vision

  • Determining image regions
  • Grouping video frames into shots
  • Object-level grouping
  • Figure-ground

What things should be grouped? What cues indicate groups?

Slide credit: Kristen Grauman

SLIDE 4

Similarity in appearance

http://chicagoist.com/attachments/chicagoist_alicia/GEESE.jpg, http://wwwdelivery.superstock.com/WI/223/1532/PreviewComp/SuperStock_1532R-0831.jpg

Slide adapted from Kristen Grauman

SLIDE 5

Symmetry

http://seedmagazine.com/news/2006/10/beauty_is_in_the_processingtim.php

Slide credit: Kristen Grauman

SLIDE 6

Common Fate

Image credit: Arthus-Bertrand (via F. Durand)

Slide credit: Kristen Grauman

SLIDE 7

Proximity

http://www.capital.edu/Resources/Images/outside6_035.jpg

Slide credit: Kristen Grauman

SLIDE 8

The Gestalt School

  • Grouping is key to visual perception
  • Elements in a collection can have properties that result from relationships
      ➢ “The whole is greater than the sum of its parts”

[Figure panels: illusory/subjective contours, occlusion, familiar configuration]

http://en.wikipedia.org/wiki/Gestalt_psychology

Slide credit: Svetlana Lazebnik

Image source: Steve Lehar

SLIDE 9

Gestalt Theory

  • Gestalt: whole or group
      ➢ Whole is greater than sum of its parts
      ➢ Relationships among parts can yield new properties/features
  • Psychologists identified a series of factors that predispose a set of elements to be grouped (by the human visual system)

“I stand at the window and see a house, trees, sky. Theoretically I might say there were 327 brightnesses and nuances of colour. Do I have "327"? No. I have sky, house, and trees.”

Max Wertheimer (1880-1943), Untersuchungen zur Lehre von der Gestalt, Psychologische Forschung, Vol. 4, pp. 301-350, 1923
http://psy.ed.asu.edu/~classics/Wertheimer/Forms/forms.htm

Slide credit: B. Leibe

SLIDE 10

Gestalt Factors

These factors make intuitive sense, but are very difficult to translate into algorithms.

Image source: Forsyth & Ponce

Slide credit: B. Leibe

SLIDE 11

Continuity through Occlusion Cues

Slide credit: B. Leibe

SLIDE 12

Continuity through Occlusion Cues

Continuity, explanation by occlusion

Slide credit: B. Leibe

SLIDE 13

Continuity through Occlusion Cues

Image source: Forsyth & Ponce

Slide credit: B. Leibe

SLIDE 14

Continuity through Occlusion Cues

Image source: Forsyth & Ponce

Slide credit: B. Leibe

SLIDE 15

Figure-Ground Discrimination

Slide credit: B. Leibe

SLIDE 16

The Ultimate Gestalt test

Slide adapted from B. Leibe

SLIDE 17

Image Segmentation

  • Goal: identify groups of pixels that go together

Slide credit: Steve Seitz, Kristen Grauman

SLIDE 18

The Goals of Segmentation

  • Separate image into objects

[Figure panels: image, human segmentation]

Slide credit: Svetlana Lazebnik

SLIDE 19

The Goals of Segmentation

  • Separate image into objects
  • Group together similar-looking pixels for efficiency of further processing
      ➢ “superpixels”

  • X. Ren and J. Malik. Learning a classification model for segmentation. ICCV 2003.

Slide credit: Svetlana Lazebnik

SLIDE 30

Topics of This Lecture

  • Introduction
      ➢ Gestalt principles
      ➢ Image segmentation
  • Segmentation as clustering
      ➢ k-Means
      ➢ Feature spaces
      ➢ Mixture of Gaussians, EM
  • Model-free clustering: Mean-Shift
  • Graph theoretic segmentation: Normalized Cuts
  • Interactive Segmentation with GraphCuts

SLIDE 31

Image Segmentation: Toy Example

  • These intensities define the three groups.
  • We could label every pixel in the image according to which of these it is.
      ➢ i.e. segment the image based on the intensity feature.
  • What if the image isn’t quite so simple?

[Figure: input image and its intensity plot; groups 1, 2, 3 = black pixels, gray pixels, white pixels]

Slide credit: Kristen Grauman

SLIDE 32

[Figure: two input images, each with its histogram (axes: intensity, pixel count)]

Slide credit: Kristen Grauman

SLIDE 33

  • Now how to determine the three main intensities that define our groups?
  • We need to cluster.

[Figure: input image with its histogram (axes: intensity, pixel count)]

Slide credit: Kristen Grauman

SLIDE 34

  • Goal: choose three “centers” as the representative intensities, and label every pixel according to which of these centers it is nearest to.
  • Best cluster centers are those that minimize the SSD (sum of squared differences) between all points and their nearest cluster center c_i:

$$\sum_{\text{clusters } i} \; \sum_{\text{points } p \text{ in cluster } i} \| p - c_i \|^2$$

[Figure: intensity axis with three cluster centers (1, 2, 3); tick marks at 190 and 255]

Slide credit: Kristen Grauman

SLIDE 35

Clustering

  • With this objective, it is a “chicken and egg” problem:
      ➢ If we knew the cluster centers, we could allocate points to groups by assigning each to its closest center.
      ➢ If we knew the group memberships, we could get the centers by computing the mean per group.

Slide credit: Kristen Grauman

SLIDE 36

K-Means Clustering

  • Basic idea: randomly initialize the k cluster centers, and iterate between the two steps we just saw.
  • 1. Randomly initialize the cluster centers, c_1, ..., c_K
  • 2. Given cluster centers, determine points in each cluster
          – For each point p, find the closest c_i. Put p into cluster i
  • 3. Given points in each cluster, solve for c_i
          – Set c_i to be the mean of points in cluster i
  • 4. If c_i have changed, repeat Step 2
  • Properties
      ➢ Will always converge to some solution
      ➢ Can be a “local minimum”
          – Does not always find the global minimum of the objective function:

$$\sum_{\text{clusters } i} \; \sum_{\text{points } p \text{ in cluster } i} \| p - c_i \|^2$$

Slide credit: Steve Seitz
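
The four steps above can be sketched in a few lines of Python. This is a minimal illustration on scalar intensities, not the lecture's own implementation; the function names `kmeans_1d` and `ssd`, the `seed` and `max_iter` parameters, and the sample `pixels` are assumptions made for the example.

```python
import random

def kmeans_1d(values, k, seed=0, max_iter=100):
    """Lloyd-style k-means on scalar values; returns (centers, labels)."""
    rng = random.Random(seed)
    centers = rng.sample(values, k)            # step 1: random initial centers
    labels = [0] * len(values)
    for _ in range(max_iter):
        # step 2: put every point into the cluster of its closest center
        labels = [min(range(k), key=lambda i: (v - centers[i]) ** 2) for v in values]
        # step 3: set each center to the mean of the points in its cluster
        new_centers = []
        for i in range(k):
            members = [v for v, l in zip(values, labels) if l == i]
            new_centers.append(sum(members) / len(members) if members else centers[i])
        # step 4: stop as soon as the centers no longer change
        if new_centers == centers:
            break
        centers = new_centers
    return centers, labels

def ssd(values, centers, labels):
    """Objective: sum of squared differences to the assigned center."""
    return sum((v - centers[l]) ** 2 for v, l in zip(values, labels))

# Hypothetical pixel intensities with dark, gray and bright groups.
pixels = [0, 2, 3, 5, 120, 125, 130, 128, 250, 252, 255, 249]
centers, labels = kmeans_1d(pixels, k=3)
print(sorted(centers), ssd(pixels, centers, labels))
```

Because the result depends on the random initial centers, a different `seed` can end in a different local minimum of the SSD, which is the property noted on the slide.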

SLIDE 37

Segmentation as Clustering

[Figure: results for K=2 and K=3]

    % Cluster pixel intensities and paint each pixel with its cluster mean.
    img_as_col = double(im(:));               % flatten the image into one column
    cluster_membs = kmeans(img_as_col, K);    % cluster index (1..K) per pixel
    labelim = zeros(size(im));
    for i = 1:K
        inds = find(cluster_membs == i);      % pixels assigned to cluster i
        meanval = mean(img_as_col(inds));     % mean intensity of cluster i
        labelim(inds) = meanval;
    end

Slide credit: Kristen Grauman

SLIDE 38

K-Means Clustering

  • Java demo:

http://home.dei.polimi.it/matteucc/Clustering/tutorial_html/AppletKM.html

SLIDE 39

Feature Space

  • Depending on what we choose as the feature space, we can group pixels in different ways.
  • Grouping pixels based on intensity similarity
  • Feature space: intensity value (1D)

Slide credit: Kristen Grauman

SLIDE 40

Feature Space

  • Depending on what we choose as the feature space, we can group pixels in different ways.
  • Grouping pixels based on color similarity
  • Feature space: color value (3D)

[Figure: pixels plotted in R, G, B space. Example values: (R=255, G=200, B=250), (R=245, G=220, B=248), (R=15, G=189, B=2), (R=3, G=12, B=2)]

Slide credit: Kristen Grauman

SLIDE 41

Segmentation as Clustering

  • Depending on what we choose as the feature space, we can group pixels in different ways.
  • Grouping pixels based on texture similarity
  • Feature space: filter bank responses (e.g. 24D)

[Figure: filter bank of 24 filters with responses F1, F2, ..., F24]

Slide credit: Kristen Grauman

SLIDE 42

Spatial coherence

  • Assign a cluster label per pixel → possible discontinuities
  • How can we ensure they are spatially smooth?

[Figure: original image, and the image labeled by cluster center’s intensity (clusters 1, 2, 3)]

Slide adapted from Kristen Grauman

SLIDE 43

Spatial coherence

  • Depending on what we choose as the feature space, we can group pixels in different ways.
  • Grouping pixels based on intensity+position similarity
    ⇒ Way to encode both similarity and proximity.

[Figure: feature space with axes X, Y, and intensity]

Slide adapted from Kristen Grauman

SLIDE 44

K-Means without spatial information

  • K-means clustering based on intensity or color is essentially vector quantization of the image attributes
      ➢ Clusters don’t have to be spatially coherent

[Figure panels: image, intensity-based clusters, color-based clusters]

Slide adapted from Svetlana Lazebnik

Image source: Forsyth & Ponce

SLIDE 45

K-Means with spatial information

  • K-means clustering based on intensity or color is essentially vector quantization of the image attributes
      ➢ Clusters don’t have to be spatially coherent
  • Clustering based on (r,g,b,x,y) values enforces more spatial coherence

Slide adapted from Svetlana Lazebnik

Image source: Forsyth & Ponce
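
As a rough illustration of the (r,g,b,x,y) feature space, the sketch below builds one feature vector per pixel and compares distances with and without the position terms. The `spatial_weight` factor is an assumption added for the example (the slides do not say how color and position are balanced), and `pixel_features` and `sq_dist` are hypothetical helper names.

```python
def pixel_features(image, spatial_weight=1.0):
    """image: list of rows of (r, g, b) tuples -> list of (r, g, b, x, y) vectors."""
    feats = []
    for y, row in enumerate(image):
        for x, (r, g, b) in enumerate(row):
            # position is scaled so it can be traded off against color
            feats.append((r, g, b, spatial_weight * x, spatial_weight * y))
    return feats

def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))

# Tiny 2x3 image: red pixels with two blue ones mixed in.
image = [[(255, 0, 0), (255, 0, 0), (0, 0, 255)],
         [(255, 0, 0), (0, 0, 255), (255, 0, 0)]]

color_only = pixel_features(image, spatial_weight=0.0)
with_position = pixel_features(image, spatial_weight=50.0)

# Two red pixels at opposite corners: identical in color space,
# but separated once position is part of the feature vector.
print(sq_dist(color_only[0], color_only[5]))
print(sq_dist(with_position[0], with_position[5]))
```

Feeding the second set of vectors to a clustering routine makes distant pixels less likely to share a cluster even when their colors match, which is the spatial coherence effect described above.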

SLIDE 46

Summary K-Means

  • Pros
      ➢ Simple, fast to compute
      ➢ Converges to local minimum of within-cluster squared error
  • Cons/issues
      ➢ Setting k?
      ➢ Sensitive to initial centers
      ➢ Sensitive to outliers
      ➢ Detects spherical clusters only
      ➢ Assuming means can be computed

Slide credit: Kristen Grauman

SLIDE 47

Probabilistic Clustering

  • Basic questions
      ➢ What’s the probability that a point x is in cluster m?
      ➢ What’s the shape of each cluster?
  • K-means doesn’t answer these questions.
  • Basic idea
      ➢ Instead of treating the data as a bunch of points, assume that they are all generated by sampling a continuous function.
      ➢ This function is called a generative model.
      ➢ Defined by a vector of parameters θ

Slide credit: Steve Seitz

SLIDE 48

Mixture of Gaussians

  • One generative model is a mixture of Gaussians (MoG)
      ➢ K Gaussian blobs with means µ_b, covariance matrices V_b, dimension d
          – Blob b defined by:

$$P(x \mid \mu_b, V_b) = \frac{1}{\sqrt{(2\pi)^d \, |V_b|}} \exp\left\{ -\tfrac{1}{2} (x - \mu_b)^T V_b^{-1} (x - \mu_b) \right\}$$

      ➢ Blob b is selected with probability $\alpha_b$
      ➢ The likelihood of observing x is a weighted mixture of Gaussians

$$P(x \mid \theta) = \sum_{b=1}^{K} \alpha_b \, P(x \mid \mu_b, V_b)$$

Slide adapted from Steve Seitz

SLIDE 49

Expectation Maximization (EM)

  • Goal
      ➢ Find blob parameters θ that maximize the likelihood function over all N data points
  • Approach:
      1. E-step: given current guess of blobs, compute probabilistic ownership of each point
      2. M-step: given ownership probabilities, update blobs to maximize likelihood function
      3. Repeat until convergence

Slide adapted from Steve Seitz

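SLIDE 50

EM Details

  • E-step
      ➢ Compute probability that point x is in blob b, given current guess of θ

$$P(b \mid x, \mu_b, V_b) = \frac{\alpha_b \, P(x \mid \mu_b, V_b)}{\sum_{j=1}^{K} \alpha_j \, P(x \mid \mu_j, V_j)}$$

  • M-step (N data points)
      ➢ Compute overall probability that blob b is selected

$$\alpha_b^{new} = \frac{1}{N} \sum_{i=1}^{N} P(b \mid x_i, \mu_b, V_b)$$

      ➢ Mean of blob b

$$\mu_b^{new} = \frac{\sum_{i=1}^{N} x_i \, P(b \mid x_i, \mu_b, V_b)}{\sum_{i=1}^{N} P(b \mid x_i, \mu_b, V_b)}$$

      ➢ Covariance of blob b

$$V_b^{new} = \frac{\sum_{i=1}^{N} (x_i - \mu_b^{new})(x_i - \mu_b^{new})^T \, P(b \mid x_i, \mu_b, V_b)}{\sum_{i=1}^{N} P(b \mid x_i, \mu_b, V_b)}$$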

Slide adapted from Steve Seitz
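
The E-step and M-step can be sketched for the one-dimensional case, where each covariance matrix reduces to a scalar variance. This is a minimal sketch under that assumption; `em_1d`, the `min_var` guard, the fixed iteration count, and the sample data are illustrative choices rather than anything prescribed by the slides.

```python
import math

def gaussian(x, mu, var):
    """1D Gaussian density with mean mu and variance var."""
    return math.exp(-0.5 * (x - mu) ** 2 / var) / math.sqrt(2 * math.pi * var)

def em_1d(data, mus, variances, alphas, n_iter=50, min_var=1e-6):
    """EM for a 1D mixture of Gaussians; returns (alphas, mus, variances, resp)."""
    mus, variances, alphas = list(mus), list(variances), list(alphas)
    k, n = len(mus), len(data)
    resp = []
    for _ in range(n_iter):
        # E-step: probabilistic ownership of every point by every blob
        resp = []
        for x in data:
            w = [alphas[b] * gaussian(x, mus[b], variances[b]) for b in range(k)]
            total = sum(w)
            resp.append([wb / total for wb in w])
        # M-step: update selection probability, mean and variance of each blob
        for b in range(k):
            nb = sum(r[b] for r in resp)
            alphas[b] = nb / n
            mus[b] = sum(r[b] * x for r, x in zip(resp, data)) / nb
            var = sum(r[b] * (x - mus[b]) ** 2 for r, x in zip(resp, data)) / nb
            variances[b] = max(var, min_var)   # guard against collapsing blobs
    return alphas, mus, variances, resp

# Hypothetical data drawn around 1 and around 5.
data = [1.0, 1.2, 0.8, 1.1, 5.0, 5.2, 4.9, 5.1]
alphas, mus, variances, resp = em_1d(
    data, mus=[0.0, 6.0], variances=[1.0, 1.0], alphas=[0.5, 0.5])
print(alphas, mus, variances)
```

Unlike k-means, `resp` keeps a soft assignment for every point, and the `min_var` floor is one simple way to sidestep the numerical problems that arise when a blob shrinks onto a few points.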

SLIDE 51

Applications of EM

  • Turns out this is useful for all sorts of problems
      ➢ Any clustering problem
      ➢ Any model estimation problem
      ➢ Missing data problems
      ➢ Finding outliers
      ➢ Segmentation problems
          – Segmentation based on color
          – Segmentation based on motion
          – Foreground/background separation
      ➢ ...
  • EM demo
      ➢ http://lcn.epfl.ch/tutorial/english/gaussian/html/index.html

Slide credit: Steve Seitz

SLIDE 52

Segmentation with EM

[Figure: original image and EM segmentation results for k=2, k=3, k=4, k=5]

Image source: Serge Belongie

Slide credit: B. Leibe

SLIDE 53

Summary: Mixtures of Gaussians, EM

  • Pros
      ➢ Probabilistic interpretation
      ➢ Soft assignments between data points and clusters
      ➢ Generative model, can predict novel data points
      ➢ Relatively compact storage
  • Cons
      ➢ Initialization
          – often a good idea to start from output of k-means
      ➢ Local minima
      ➢ Need to know number of components K
          – solutions: model selection (AIC, BIC), Dirichlet process mixture
      ➢ Need to choose generative model (math form of a cluster?)
      ➢ Numerical problems are often a nuisance

Slide adapted from B. Leibe

SLIDE 54

Topics of This Lecture

  • Introduction
      ➢ Gestalt principles
      ➢ Image segmentation
  • Segmentation as clustering
      ➢ k-Means
      ➢ Feature spaces
      ➢ Mixture of Gaussians, EM
  • Model-free clustering: Mean-Shift
  • Graph theoretic segmentation: Normalized Cuts
  • Interactive Segmentation with GraphCuts

SLIDE 55

Finding Modes in a Histogram

  • How many modes are there?
      ➢ Mode = local maximum of a given distribution
      ➢ Easy to see, hard to compute

Slide adapted from Steve Seitz

SLIDE 56

Mean-Shift Segmentation

  • An advanced and versatile technique for clustering-based segmentation

http://coewww.rutgers.edu/riul/FORMER/comanici/MSPAMI/msPamiResults.html

  • D. Comaniciu and P. Meer, Mean Shift: A Robust Approach toward Feature Space Analysis, PAMI 2002.

Slide credit: Svetlana Lazebnik

SLIDE 57

Mean-Shift Algorithm

  • Iterative Mode Search
      1. Initialize random seed center and window W
      2. Calculate center of gravity (the “mean”) of W
      3. Shift the search window to the mean
      4. Repeat steps 2+3 until convergence

Slide adapted from Steve Seitz
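
The iterative mode search can be sketched for scalar data with a flat window of radius h. This is an illustrative sketch only: the flat (uniform) window, the tolerance `tol`, the rounding used to merge nearby modes, and the sample points are assumptions, and practical mean-shift implementations often use smooth kernels instead.

```python
def mean_shift_mode(points, start, h, tol=1e-6, max_iter=1000):
    """Follow mean-shift steps from `start` with a flat window of radius h."""
    center = start
    for _ in range(max_iter):
        window = [p for p in points if abs(p - center) <= h]
        if not window:
            break
        mean = sum(window) / len(window)     # center of gravity of the window
        if abs(mean - center) < tol:         # converged: the window stopped moving
            return mean
        center = mean                        # shift the window to the mean
    return center

points = [1.0, 1.2, 0.8, 1.1, 5.0, 5.2, 4.9, 5.1]

# Start one search at every data point and merge searches that end at the same mode.
modes = [round(mean_shift_mode(points, p, h=1.0), 3) for p in points]
clusters = {}
for p, m in zip(points, modes):
    clusters.setdefault(m, []).append(p)
print(clusters)
```

Grouping points by the mode they converge to is the attraction-basin view of clustering: the number of clusters is not fixed in advance but follows from the window size h.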

SLIDES 58-64

Mean-Shift

[Figure sequence: the region of interest is repeatedly moved along the mean-shift vector to its center of mass; the last frame shows only the region of interest and its center of mass]

Slide by Y. Ukrainitz & B. Sarel

SLIDE 65

Real Modality Analysis

  • Tessellate the space with windows
  • Run the procedure in parallel

Slide by Y. Ukrainitz & B. Sarel

SLIDE 66

Real Modality Analysis

  • The blue data points were traversed by the windows towards the mode.

Slide by Y. Ukrainitz & B. Sarel

SLIDE 67

Mean-Shift Clustering

  • Cluster: all data points in the attraction basin of a mode
  • Attraction basin: the region for which all trajectories lead to the same mode

Slide by Y. Ukrainitz & B. Sarel

SLIDE 68

Mean-Shift Clustering/Segmentation

  • Choose features (color, gradients, texture, etc.)
  • Initialize windows at individual pixel locations
  • Start mean-shift from each window until convergence
  • Merge windows that end up near the same “peak” or mode

Slide adapted from Svetlana Lazebnik

SLIDE 69

Mean-Shift Segmentation Results

http://coewww.rutgers.edu/riul/FORMER/comanici/MSPAMI/msPamiResults.html

Slide credit: Svetlana Lazebnik

SLIDE 70

More Results

Slide credit: Svetlana Lazebnik

SLIDE 71

Summary Mean-Shift

  • Pros
      ➢ General, application-independent tool
      ➢ Model-free, does not assume any prior shape (spherical, elliptical, etc.) on data clusters
      ➢ Just a single parameter (window size h)
          – h has a physical meaning (unlike k-means) == scale of clustering
      ➢ Finds variable number of modes given the same h
      ➢ Robust to outliers
  • Cons
      ➢ Output depends on window size h
      ➢ Window size (bandwidth) selection is not trivial
      ➢ Computationally rather expensive
      ➢ Does not scale well with dimension of feature space

Slide adapted from Svetlana Lazebnik

SLIDE 72

Topics of This Lecture

  • Introduction
      ➢ Gestalt principles
      ➢ Image segmentation
  • Segmentation as clustering
      ➢ k-Means
      ➢ Feature spaces
      ➢ Mixture of Gaussians, EM
  • Model-free clustering: Mean-Shift
  • Graph theoretic segmentation: Normalized Cuts
  • Interactive Segmentation with GraphCuts

SLIDE 73

Images as Graphs

  • Fully-connected graph
      ➢ Node (vertex) for every pixel
      ➢ Edge between every pair of pixels (p,q)
      ➢ Affinity weight w_pq for each edge
          – w_pq measures similarity
          – Similarity is inversely proportional to difference (in color, texture, position, …)

[Figure: pixels p and q connected by an edge with weight w_pq]

Slide adapted from Steve Seitz

SLIDE 74

Segmentation by Graph Cuts

  • Break Graph into Segments
      ➢ Delete edges crossing between segments
      ➢ Easiest to break edges with low similarity (low weight)
          – Similar pixels should be in the same segments
          – Dissimilar pixels should be in different segments

[Figure: graph with edge weights w broken into segments A, B, C]

Slide adapted from Steve Seitz

SLIDE 75

Measuring Affinity

  • Distance

$$aff(x, y) = \exp\left\{ -\frac{1}{2\sigma_d^2} \, \|x - y\|^2 \right\}$$

  • Intensity

$$aff(x, y) = \exp\left\{ -\frac{1}{2\sigma_d^2} \, \|I(x) - I(y)\|^2 \right\}$$

  • Color (some suitable color space distance)

$$aff(x, y) = \exp\left\{ -\frac{1}{2\sigma_d^2} \, \big( dist(c(x), c(y)) \big)^2 \right\}$$

  • Texture (vectors of filter outputs)

$$aff(x, y) = \exp\left\{ -\frac{1}{2\sigma_d^2} \, \|f(x) - f(y)\|^2 \right\}$$

Source: Forsyth & Ponce

SLIDE 76

Scale Affects Affinity

  • Small σ: group only nearby points
  • Large σ: group far-away points

[Figure panels: data points; results for small σ, medium σ, large σ]

Slide adapted from Svetlana Lazebnik
Image Source: Forsyth & Ponce
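
The exponential affinity and the role of σ can be illustrated with a short sketch. The same function covers the distance, intensity and texture cases, since only the feature vector changes; `affinity`, `affinity_matrix` and the sample intensities are names and values assumed for the example.

```python
import math

def affinity(fx, fy, sigma):
    """exp(-||fx - fy||^2 / (2 sigma^2)) for two equal-length feature vectors."""
    sq = sum((a - b) ** 2 for a, b in zip(fx, fy))
    return math.exp(-sq / (2.0 * sigma ** 2))

def affinity_matrix(features, sigma):
    """Dense matrix W with W[i][j] = affinity(features[i], features[j])."""
    return [[affinity(fi, fj, sigma) for fj in features] for fi in features]

# Four pixels described by intensity only: two dark, two bright.
intensities = [(10.0,), (12.0,), (200.0,), (205.0,)]

W_small = affinity_matrix(intensities, sigma=1.0)     # only near-identical pixels are linked
W_medium = affinity_matrix(intensities, sigma=10.0)   # links within each group
W_large = affinity_matrix(intensities, sigma=500.0)   # even far-away pixels are linked

for name, W in [("small", W_small), ("medium", W_medium), ("large", W_large)]:
    print(name, [round(w, 3) for w in W[0]])
```

With the medium σ the matrix is close to block-diagonal, which is the structure that makes a good cut easy to see.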

SLIDE 77

Graph Cut (GC)

  • GC = edges whose removal partitions a graph in two
  • Cost of a cut
      ➢ Sum of weights of cut edges:

$$cut(A, B) = \sum_{p \in A,\, q \in B} w_{p,q}$$

  • A graph cut gives us a segmentation
      ➢ What is a “good” graph cut and how do we find one?

[Figure: graph partitioned into A and B]

Slide adapted from Steve Seitz

SLIDE 78

Graph Cut

Image Source: Forsyth & Ponce

Here, the cut is nicely defined by the block-diagonal structure of the affinity matrix. ⇒ How can this be generalized?

Slide credit: B. Leibe

SLIDE 79

Minimum Cut

  • We can do segmentation by finding the minimum cut in a graph
      ➢ Efficient algorithms exist for doing this
  • Drawback:
      ➢ Weight of cut proportional to number of edges in the cut
      ➢ Minimum cut tends to cut off very small, isolated components

[Figure: the ideal cut, and cuts with lesser weight than the ideal cut]

Slide credit: Khurram Hassan-Shafique

SLIDE 80

Normalized Cut (NCut)

  • Min-cut has bias toward partitioning out small segments
  • This can be fixed by normalizing for size of segments
  • The normalized cut cost is:

$$NCut(A, B) = \frac{cut(A, B)}{assoc(A, V)} + \frac{cut(A, B)}{assoc(B, V)}$$

    assoc(A,V) = sum of weights from A to all nodes in the graph

  • The exact solution is NP-hard but an approximation can be computed by solving a generalized eigenvalue problem.

  • J. Shi and J. Malik. Normalized cuts and image segmentation. PAMI 2000

Slide adapted from Svetlana Lazebnik
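
A small worked example makes the min-cut bias and the effect of the normalization concrete. The graph below is made up for illustration: two tight groups of three nodes joined by a weak bridge, plus one node that is only weakly attached; `cut`, `assoc`, `ncut` and `link` are hypothetical helper names.

```python
def cut(W, A, B):
    """Sum of weights of edges going from node set A to node set B."""
    return sum(W[p][q] for p in A for q in B)

def assoc(W, A):
    """Sum of weights from nodes in A to all nodes of the graph."""
    return sum(W[p][q] for p in A for q in range(len(W)))

def ncut(W, A, B):
    """Normalized cut cost of the partition (A, B)."""
    c = cut(W, A, B)
    return c / assoc(W, A) + c / assoc(W, B)

n = 7
W = [[0.0] * n for _ in range(n)]

def link(p, q, w):
    W[p][q] = W[q][p] = w

for p, q in [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5)]:
    link(p, q, 1.0)      # strong edges inside the two groups
link(2, 3, 0.3)          # bridge between the groups
link(5, 6, 0.2)          # node 6 is almost isolated

isolate = ([0, 1, 2, 3, 4, 5], [6])
balanced = ([0, 1, 2], [3, 4, 5, 6])

print(cut(W, *isolate), cut(W, *balanced))     # plain cut prefers isolating node 6
print(ncut(W, *isolate), ncut(W, *balanced))   # normalized cut prefers the balanced split
```

The plain cut cost is lower for chopping off the single node, while the normalized cost penalizes that partition heavily because assoc of the one-node side is tiny.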

SLIDE 81

Interpretation as a Dynamical System

  • Treat the edges as springs and ‘shake’ the system
      ➢ Elasticity proportional to cost
      ➢ Vibration “modes” correspond to segments
          – Can compute these by solving a generalized eigenvector problem

Slide adapted from Steve Seitz

SLIDE 82

NCuts as a Generalized Eigenvector Problem

  • Definitions
      ➢ W: the affinity matrix, $W(i,j) = w_{i,j}$
      ➢ D: the diag. matrix, $D(i,i) = \sum_j W(i,j)$
      ➢ x: a vector in $\{1, -1\}^N$, $x(i) = 1 \Leftrightarrow i \in A$
  • Rewriting Normalized Cut in matrix form:

$$NCut(A, B) = \frac{cut(A, B)}{assoc(A, V)} + \frac{cut(A, B)}{assoc(B, V)}$$

$$= \frac{1}{4} \left[ \frac{(1 + x)^T (D - W)(1 + x)}{k \, 1^T D 1} + \frac{(1 - x)^T (D - W)(1 - x)}{(1 - k) \, 1^T D 1} \right]; \qquad k = \frac{\sum_{x_i > 0} D(i,i)}{\sum_i D(i,i)}$$

$$= \ldots$$

    The factor 1/4 arises because (1 + x) and (1 − x) take the values 2 and 0; it does not change the minimizer.

Slide credit: Jitendra Malik

SLIDE 83

Some More Math…

Slide credit: Jitendra Malik

SLIDE 84

NCuts as a Generalized Eigenvalue Problem

  • After simplification, we get

$$NCut(A, B) = \frac{y^T (D - W) y}{y^T D y}, \quad \text{with } y_i \in \{1, -b\},\ y^T D \mathbf{1} = 0.$$

    This is hard, as y is discrete! Relaxation: y is continuous.

  • This is a Rayleigh Quotient
      ➢ Solution given by the “generalized” eigenvalue problem

$$(D - W) y = \lambda D y$$

      ➢ Solved by converting to standard eigenvalue problem

$$D^{-1/2} (D - W) D^{-1/2} z = \lambda z, \quad \text{with } z = D^{1/2} y$$

  • Subtleties
      ➢ Smallest eigenvector is $z_0 = D^{1/2} \mathbf{1}$ with eigenvalue 0 (and $y_0 = \mathbf{1}$)
      ➢ Optimal solution is second smallest eigenvector
      ➢ Gives continuous result; must convert into discrete values of y

SLIDE 85

NCuts Example

Smallest eigenvectors

Image source: Shi & Malik

NCuts segments

Slide credit: B. Leibe

SLIDE 86

NCuts: Overall Procedure

  • 1. Construct a weighted graph G=(V,E) from an image.
  • 2. Connect each pair of pixels, and assign graph edge weights,
         W(i,j) = Prob. that i and j belong to the same region.
  • 3. Solve $(D - W) y = \lambda D y$ for the smallest few eigenvectors. This yields a continuous solution.
  • 4. Threshold eigenvectors to get a discrete cut
      ➢ This is where the approximation is made (we’re not solving NP).
  • 5. Recursively subdivide if NCut value is below a pre-specified value.

Slide credit: Jitendra Malik

NCuts Matlab code available at http://www.cis.upenn.edu/~jshi/software/
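
Steps 3 and 4 can be sketched without a linear-algebra library by using power iteration on the normalized affinity matrix. This is only an illustration on a toy graph: power iteration with deflation stands in for a real eigensolver, thresholding at zero is one simple choice, and `ncut_bipartition` and the six-node matrix `W` are assumptions made for the example.

```python
import math

def ncut_bipartition(W, n_iter=500):
    """Approximate two-way normalized cut of a small dense affinity matrix."""
    n = len(W)
    d = [sum(row) for row in W]                       # D(i,i) = sum_j W(i,j)
    s = [math.sqrt(di) for di in d]
    # M = I + D^{-1/2} W D^{-1/2}. Its largest eigenvectors are the smallest
    # eigenvectors of D^{-1/2} (D - W) D^{-1/2}, and all its eigenvalues are >= 0.
    M = [[(1.0 if i == j else 0.0) + W[i][j] / (s[i] * s[j]) for j in range(n)]
         for i in range(n)]
    norm0 = math.sqrt(sum(d))
    z0 = [si / norm0 for si in s]                     # trivial eigenvector D^{1/2} 1
    v = [float(i) for i in range(n)]                  # arbitrary fixed start vector
    for _ in range(n_iter):
        v = [sum(M[i][j] * v[j] for j in range(n)) for i in range(n)]
        dot = sum(vi * zi for vi, zi in zip(v, z0))
        v = [vi - dot * zi for vi, zi in zip(v, z0)]  # project out the trivial solution
        norm = math.sqrt(sum(vi * vi for vi in v))
        v = [vi / norm for vi in v]
    y = [vi / si for vi, si in zip(v, s)]             # y = D^{-1/2} z
    labels = [1 if yi > 0 else 0 for yi in y]         # threshold the continuous solution
    return labels, y

# Two triangles joined by one weak edge (2-3).
W = [[0.0, 1.0, 1.0, 0.0, 0.0, 0.0],
     [1.0, 0.0, 1.0, 0.0, 0.0, 0.0],
     [1.0, 1.0, 0.0, 0.1, 0.0, 0.0],
     [0.0, 0.0, 0.1, 0.0, 1.0, 1.0],
     [0.0, 0.0, 0.0, 1.0, 0.0, 1.0],
     [0.0, 0.0, 0.0, 1.0, 1.0, 0.0]]
labels, y = ncut_bipartition(W)
print(labels)
```

For real images the affinity matrix is large, so a sparse eigensolver would normally replace the dense power iteration used here.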

SLIDE 87

Color Image Segmentation with NCuts

Image Source: Shi & Malik

Slide credit: Steve Seitz

SLIDE 88

Results with Color & Texture

SLIDE 89

Summary: Normalized Cuts

  • Pros:
      ➢ Generic framework, flexible to choice of function that computes weights (“affinities”) between nodes
      ➢ Does not require any model of the data distribution
  • Cons:
      ➢ Time and memory complexity can be high
          – Dense, highly connected graphs ⇒ many affinity computations
          – Solving eigenvalue problem
      ➢ Preference for balanced partitions
          – If a region is uniform, NCuts will find the modes of vibration of the image dimensions

Slide credit: Kristen Grauman

SLIDE 90

Segmentation: Caveats

  • We’ve looked at bottom-up ways to segment an image into regions, yet finding meaningful segments is intertwined with the recognition problem.
  • Often want to avoid making hard decisions too soon
  • Difficult to evaluate; when is a segmentation successful?

Slide credit: Kristen Grauman

SLIDE 91

Topics of This Lecture

  • Introduction
      ➢ Gestalt principles
      ➢ Image segmentation
  • Segmentation as clustering
      ➢ k-Means
      ➢ Feature spaces
      ➢ Mixture of Gaussians, EM
  • Model-free clustering: Mean-Shift
  • Graph theoretic segmentation: Normalized Cuts
  • Interactive Segmentation with GraphCuts

SLIDE 92

Markov Random Fields

  • Allow rich probabilistic models for images
  • But built in a local, modular way
      ➢ Learn local effects, get global effects out

[Figure: observed evidence, hidden “true states”, and neighborhood relations]

Slide credit: William Freeman

SLIDE 93

MRF Nodes as Pixels (or Patches)

[Figure: image pixels y_i linked to states x_i (e.g. foreground/background) by Φ(x_i, y_i); neighboring states linked by Ψ(x_i, x_j)]

Slide adapted from William Freeman

SLIDE 94

Network Joint Probability

$$P(x, y) = \prod_i \Phi(x_i, y_i) \prod_{i,j} \Psi(x_i, x_j)$$

  • x: states; y: image
  • Φ(x_i, y_i): image-state compatibility function (local observations)
  • Ψ(x_i, x_j): state-state compatibility function (neighboring nodes)

Slide adapted from William Freeman

SLIDE 95

Energy Formulation

  • Joint probability

$$P(x, y) = \prod_i \Phi(x_i, y_i) \prod_{i,j} \Psi(x_i, x_j)$$

  • Maximizing the joint probability is the same as minimizing the negative log

$$-\log P(x, y) = -\sum_i \log \Phi(x_i, y_i) - \sum_{i,j} \log \Psi(x_i, x_j)$$

$$E(x, y) = \sum_i \phi(x_i, y_i) + \sum_{i,j} \psi(x_i, x_j), \quad \text{with } \phi = -\log \Phi,\ \psi = -\log \Psi$$

  • This is similar to free-energy problems in statistical mechanics (spin glass theory). We therefore draw the analogy and call E an energy function.
  • ϕ and ψ are called potentials.

Slide credit: B. Leibe

SLIDE 96

Energy Formulation

  • Energy function

$$E(x, y) = \underbrace{\sum_i \phi(x_i, y_i)}_{\text{unary potentials}} + \underbrace{\sum_{i,j} \psi(x_i, x_j)}_{\text{pairwise potentials}}$$

  • Unary potentials ϕ
      ➢ Encode local information about the given pixel/patch
      ➢ How likely is a pixel/patch to be in a certain state (e.g. foreground/background)?
  • Pairwise potentials ψ
      ➢ Encode neighborhood information
      ➢ How different is a pixel/patch’s label from that of its neighbor? (e.g. here independent of image data, but later based on intensity/color/texture difference)

Slide adapted from B. Leibe
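
The energy of a labeling can be evaluated directly from its unary and pairwise terms. The sketch below uses a four-pixel chain with made-up unary costs and a plain Potts pairwise term (a constant penalty whenever neighbors disagree); `mrf_energy`, `potts` and all numbers are assumptions for illustration, and the exhaustive search is only feasible because the example is tiny.

```python
from itertools import product

def potts(a, b, penalty=1.0):
    """Pairwise potential: no cost for equal labels, fixed penalty otherwise."""
    return 0.0 if a == b else penalty

def mrf_energy(labels, unary, edges, pairwise=potts):
    """E = sum_i unary[i][labels[i]] + sum over edges (i, j) of pairwise(labels[i], labels[j])."""
    e = sum(unary[i][labels[i]] for i in range(len(labels)))
    e += sum(pairwise(labels[i], labels[j]) for i, j in edges)
    return e

# unary[i][s] is the cost of giving pixel i the state s (0 = background, 1 = foreground).
unary = [[0.1, 2.0], [0.3, 1.5], [1.2, 0.9], [2.0, 0.1]]
edges = [(0, 1), (1, 2), (2, 3)]          # neighbors along a row of four pixels

smooth = (0, 0, 1, 1)
noisy = (0, 1, 0, 1)
print(mrf_energy(smooth, unary, edges), mrf_energy(noisy, unary, edges))

# Brute-force minimization over all 2^4 labelings.
best = min(product([0, 1], repeat=4), key=lambda l: mrf_energy(l, unary, edges))
print(best)
```

Exhaustive search grows exponentially with the number of pixels, which is why the inference algorithms listed on the next slide are needed for real images.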

SLIDE 97

Energy Minimization

  • Goal:
      ➢ Infer the optimal labeling of the MRF.
  • Many inference algorithms are available, e.g.
      ➢ Gibbs sampling, simulated annealing
      ➢ Iterated conditional modes (ICM)
      ➢ Variational methods
      ➢ Belief propagation
      ➢ Graph cuts
  • Recently, Graph Cuts have become a popular tool
      ➢ Only suitable for a certain class of energy functions
      ➢ But the solution can be obtained very fast for typical vision problems (~1MPixel/sec).

[Figure: MRF with unary potentials ϕ(x_i, y_i) and pairwise potentials ψ(x_i, x_j)]

Slide credit: B. Leibe

SLIDE 98

Graph Cuts for Optimal Boundary Detection

  • Idea: convert MRF into source-sink graph

[Figure: pixel grid connected by n-links, with terminals s and t attached as hard constraints, and a cut separating them]

Minimum cost cut can be computed in polynomial time (max-flow/min-cut algorithms)

[Boykov & Jolly, ICCV’01]
Slide adapted from Yuri Boykov

slide-99
SLIDE 99

Perceptual and Sensory Augmented Computing Computer Vision WS 08/09

Simple Example of Energy

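$$E(L) = \sum_p D_p(L_p) + \sum_{pq \in N} w_{pq} \cdot \delta(L_p \neq L_q), \qquad L_p \in \{s, t\}$$

Ø First sum: regional term (t-links)
Ø Second sum: boundary term (n-links)
Ø L_p ∈ {s, t}: binary segmentation

$$w_{pq} = \exp\left\{ -\frac{\Delta I_{pq}^2}{2\sigma^2} \right\}$$

[Figure: graph with terminals s and t, t-link weights D_p(s) and D_p(t), and a cut; plot of w_pq with labels ΔI_pq and σ]

Slide credit: Yuri Boykov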
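A minimal sketch of how this energy could be evaluated for a given labeling, assuming a 1D row of pixels so that the neighborhood N is just the set of adjacent pairs. The intensities, the data costs and the value of σ are hypothetical and chosen only for illustration.

```python
import math

def boundary_weight(i_p, i_q, sigma):
    """w_pq = exp(-(I_p - I_q)^2 / (2 sigma^2)): close to 1 in flat regions, close to 0 across edges."""
    return math.exp(-((i_p - i_q) ** 2) / (2.0 * sigma ** 2))

def energy(labels, intensities, data_cost, sigma):
    """Return (regional term, boundary term) of E(L) on a 1D chain of pixels."""
    regional = sum(data_cost[p][lab] for p, lab in enumerate(labels))
    boundary = 0.0
    for p in range(len(labels) - 1):
        if labels[p] != labels[p + 1]:  # delta(L_p != L_q)
            boundary += boundary_weight(intensities[p], intensities[p + 1], sigma)
    return regional, boundary

# Hypothetical toy signal: dark background on the left, bright object on the right.
intensities = [10, 12, 11, 90, 92, 91]
# Assumed data costs: distance to an "expected" intensity (90 for s, 10 for t).
data_cost = [{"s": abs(i - 90) / 100.0, "t": abs(i - 10) / 100.0} for i in intensities]

good = ["t", "t", "t", "s", "s", "s"]  # label change at the strong intensity edge
bad = ["t", "t", "s", "s", "s", "s"]   # label change inside the flat region

good_total = sum(energy(good, intensities, data_cost, sigma=10.0))
bad_total = sum(energy(bad, intensities, data_cost, sigma=10.0))
```

In this toy case the labeling that switches at the intensity edge pays almost no boundary cost, while the one that switches inside the flat region pays a boundary cost close to 1 plus a larger regional cost.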

slide-100
SLIDE 100

Perceptual and Sensory Augmented Computing Computer Vision WS 08/09

Adding Regional Properties

[Figure: graph with n-links of weight w_pq between neighboring pixels, t-links of weight D_p(s) and D_p(t) to the terminals s and t, and a cut]

NOTE: hard constraints are not required, in general.

Regional bias example

Suppose we are given “expected” intensities $I^s$ and $I^t$ of object and background:

$$D_p(s) \propto \exp\left( -\|I_p - I^s\|^2 / 2\sigma^2 \right)$$

$$D_p(t) \propto \exp\left( -\|I_p - I^t\|^2 / 2\sigma^2 \right)$$

[Boykov & Jolly, ICCV’01] Slide credit: Yuri Boykov

slide-101
SLIDE 101

Perceptual and Sensory Augmented Computing Computer Vision WS 08/09

Adding Regional Properties

[Figure: graph with n-links of weight w_pq, t-links of weight D_p(s) and D_p(t) to the terminals s and t, and a cut]

$$D_p(s) \propto \exp\left( -\|I_p - I^s\|^2 / 2\sigma^2 \right)$$

$$D_p(t) \propto \exp\left( -\|I_p - I^t\|^2 / 2\sigma^2 \right)$$

EM-style optimization: the “expected” intensities $I^s$ and $I^t$ of object and background can be re-estimated.

[Boykov & Jolly, ICCV’01] Slide credit: Yuri Boykov

slide-102
SLIDE 102

Perceptual and Sensory Augmented Computing Computer Vision WS 08/09

Adding Regional Properties

  • More generally, regional bias can be based on any intensity models of object and background

$$D_p(L_p) = -\log \Pr(I_p \mid L_p)$$

given object and background intensity histograms

[Figure: graph with t-links D_p(s) and D_p(t) and a cut; intensity histograms Pr(I_p | s) and Pr(I_p | t) over intensity I, read off at I_p]

[Boykov & Jolly, ICCV’01] Slide credit: Yuri Boykov
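A small sketch of the histogram-based regional term D_p(L_p) = −log Pr(I_p | L_p). The seed intensities, the number of bins and the smoothing constant `eps` (added so that empty bins do not produce log 0) are assumptions made for this illustration, not part of the slide.

```python
import math

def normalized_histogram(samples, num_bins=8, max_value=256, eps=1e-6):
    """Histogram of integer intensities, normalized to sum to 1 (eps avoids empty bins)."""
    counts = [eps] * num_bins
    for v in samples:
        counts[min(v * num_bins // max_value, num_bins - 1)] += 1
    total = sum(counts)
    return [c / total for c in counts]

def data_cost(intensity, hist, max_value=256):
    """D_p(L_p) = -log Pr(I_p | L_p), read off the histogram of label L_p."""
    b = min(intensity * len(hist) // max_value, len(hist) - 1)
    return -math.log(hist[b])

# Hypothetical seed pixels, e.g. taken from user brush strokes.
object_seeds = [200, 210, 220, 205]
background_seeds = [20, 30, 25, 40]

hist_s = normalized_histogram(object_seeds)
hist_t = normalized_histogram(background_seeds)

bright_pixel = 215
cost_s = data_cost(bright_pixel, hist_s)  # small: the pixel fits the object model
cost_t = data_cost(bright_pixel, hist_t)  # large: the pixel does not fit the background model
```

A bright pixel is cheap to label as object and expensive to label as background, which is exactly the bias the t-links are meant to encode.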

slide-103
SLIDE 103

Perceptual and Sensory Augmented Computing Computer Vision WS 08/09

How to Set the Potentials? Some Examples

  • Color potentials

Ø e.g. modeled with a Mixture of Gaussians

$$\pi(x_i, y_i; \theta_\pi) = \log \sum_k \theta_\pi(x_i, k) \, P(k \mid x_i) \, N(y_i; \bar{y}_k, \Sigma_k)$$

  • Edge potentials

Ø e.g. a “contrast sensitive Potts model”

$$\phi(x_i, x_j, g_{ij}(y); \theta_\phi) = -\theta_\phi^T g_{ij}(y) \, \delta(x_i \neq x_j)$$

where

$$g_{ij}(y) = e^{-\beta \|y_i - y_j\|^2}, \qquad \beta = \left( 2 \cdot \mathrm{avg}\left( \|y_i - y_j\|^2 \right) \right)^{-1}$$

  • Parameters θπ, θφ need to be learned, too!

[Shotton & Winn, ECCV’06]

Slide credit: B. Leibe
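A rough sketch of the contrast-sensitive edge feature g_ij, assuming scalar pixel values and β set to the inverse of twice the average squared difference over all neighboring pairs. The pixel values and the chain-shaped neighborhood are hypothetical.

```python
import math

def contrast_features(values, edges):
    """Return (beta, [g_ij for each edge]) with g_ij = exp(-beta * (y_i - y_j)^2)."""
    sq_diffs = [(values[i] - values[j]) ** 2 for i, j in edges]
    mean_sq = sum(sq_diffs) / len(sq_diffs)
    # beta = 1 / (2 * avg squared difference); fall back to 0 for a constant image.
    beta = 1.0 / (2.0 * mean_sq) if mean_sq > 0 else 0.0
    return beta, [math.exp(-beta * d) for d in sq_diffs]

# Hypothetical row of pixel values with one strong edge between index 2 and 3.
values = [10, 12, 11, 90, 92]
edges = [(0, 1), (1, 2), (2, 3), (3, 4)]

beta, g = contrast_features(values, edges)
```

Because β adapts to the average contrast of the image, the label-change penalty stays high inside flat regions and drops only where the local difference is large compared to that average.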

slide-104
SLIDE 104

Perceptual and Sensory Augmented Computing Computer Vision WS 08/09

How Does it Work? The s-t-Mincut Problem

[Figure: example graph with nodes Source, v1, v2 and Sink and edge costs 2, 5, 9, 4, 2, 1]

Graph (V, E, C)

Ø Vertices V = {v1, v2, ..., vn}
Ø Edges E = {(v1, v2), ...}
Ø Costs C = {c(1, 2), ...}

Slide credit: Pushmeet Kohli

slide-105
SLIDE 105

Perceptual and Sensory Augmented Computing Computer Vision WS 08/09

The s-t-Mincut Problem

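[Figure: example graph with nodes Source, v1, v2 and Sink and edge costs 2, 5, 9, 4, 2, 1; the highlighted cut crosses the edges with costs 5, 2 and 9]

Slide credit: Pushmeet Kohli

What is an st-cut?

An st-cut (S,T) divides the nodes between source and sink.

What is the cost of an st-cut?

Sum of cost of all edges going from S to T

5 + 2 + 9 = 16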

slide-106
SLIDE 106

Perceptual and Sensory Augmented Computing Computer Vision WS 08/09

The s-t-Mincut Problem

[Figure: example graph with nodes Source, v1, v2 and Sink and edge costs 2, 5, 9, 4, 2, 1; the highlighted cut crosses the edges with costs 2, 1 and 4]

Slide credit: Pushmeet Kohli

What is an st-cut?

An st-cut (S,T) divides the nodes between source and sink.

What is the cost of an st-cut?

Sum of cost of all edges going from S to T

What is the st-mincut?

st-cut with the minimum cost

2 + 1 + 4 = 7
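The two cut costs above can be checked by enumerating every st-cut of the example graph. This is only a sketch: the assignment of the six costs to directed edges is not recoverable from the figure text and is inferred here from the two sums 5 + 2 + 9 = 16 and 2 + 1 + 4 = 7, so treat the edge list as an assumption.

```python
from itertools import combinations

# Assumed directed edge costs (inferred from the two worked cut costs).
capacity = {
    ("s", "v1"): 2, ("s", "v2"): 9,
    ("v1", "t"): 5, ("v2", "t"): 4,
    ("v1", "v2"): 2, ("v2", "v1"): 1,
}

def cut_cost(source_side, capacity):
    """Sum of the costs of all edges going from S to T."""
    return sum(c for (u, v), c in capacity.items()
               if u in source_side and v not in source_side)

def all_st_cuts(inner_nodes, capacity):
    """Map every source side S (which always contains 's', never 't') to its cut cost."""
    cuts = {}
    for r in range(len(inner_nodes) + 1):
        for subset in combinations(inner_nodes, r):
            side = frozenset(("s",) + subset)
            cuts[side] = cut_cost(side, capacity)
    return cuts

cuts = all_st_cuts(("v1", "v2"), capacity)
best_side = min(cuts, key=cuts.get)  # source side of the st-mincut
```

Enumeration needs 2^n cuts for n inner nodes, so it is only useful as a sanity check on tiny graphs; real problems use max-flow instead.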

slide-107
SLIDE 107

Perceptual and Sensory Augmented Computing Computer Vision WS 08/09

History of Maxflow Algorithms

Augmenting Path and Push-Relabel

n: #nodes, m: #edges, U: maximum edge weight

Algorithms assume non-negative edge weights

Slide credit: Andrew Goldberg

slide-108
SLIDE 108

Perceptual and Sensory Augmented Computing Computer Vision WS 08/09

How to Compute the s-t-Mincut?

[Figure: example graph with nodes Source, v1, v2 and Sink and edge capacities 2, 5, 9, 4, 2, 1]

Solve the dual maximum flow problem

Ø Compute the maximum flow between Source and Sink

Constraints

Ø Edges: Flow ≤ Capacity
Ø Nodes: Flow in = Flow out

Min-cut/Max-flow Theorem

Ø In every network, the maximum flow equals the cost of the st-mincut

Slide credit: Pushmeet Kohli

slide-109
SLIDE 109

Perceptual and Sensory Augmented Computing Computer Vision WS 08/09

Maxflow Algorithms

[Figure: example graph with nodes Source, v1, v2 and Sink and edge capacities 2, 5, 9, 4, 2, 1]

Slide credit: Pushmeet Kohli

Augmenting Path Based Algorithms

  • 1. Find path from source to sink with positive capacity
  • 2. Push maximum possible flow through this path
  • 3. Repeat until no path can be found

Algorithms assume non-negative capacity

Flow = 0

slide-110
SLIDE 110

Perceptual and Sensory Augmented Computing Computer Vision WS 08/09

Maxflow Algorithms

[Figure: same graph; an augmenting path through the edges with capacities 2 and 5 is highlighted; other capacities 9, 4, 2, 1]

Slide credit: Pushmeet Kohli

Flow = 0

slide-111
SLIDE 111

Perceptual and Sensory Augmented Computing Computer Vision WS 08/09

Maxflow Algorithms

[Figure: same graph; flow is pushed along the highlighted path, whose edges are now labeled 5-2 and 2-2; other capacities 9, 4, 2, 1]

Slide credit: Pushmeet Kohli

Flow = 0 + 2

slide-112
SLIDE 112

Perceptual and Sensory Augmented Computing Computer Vision WS 08/09

Maxflow Algorithms

[Figure: same graph after the first augmentation; the edge of capacity 5 is left with 3; other capacities 9, 4, 2, 1]

Slide credit: Pushmeet Kohli

Flow = 2

slide-113
SLIDE 113

Perceptual and Sensory Augmented Computing Computer Vision WS 08/09

Maxflow Algorithms

[Figure: residual graph with capacities 3, 9, 4, 2, 1]

Slide credit: Pushmeet Kohli

Flow = 2

slide-114
SLIDE 114

Perceptual and Sensory Augmented Computing Computer Vision WS 08/09

Maxflow Algorithms

[Figure: residual graph; an augmenting path through the edges with capacities 9 and 4 is highlighted; other capacities 3, 2, 1]

Slide credit: Pushmeet Kohli

Flow = 2

slide-115
SLIDE 115

Perceptual and Sensory Augmented Computing Computer Vision WS 08/09

Maxflow Algorithms

[Figure: residual graph; flow is pushed along the highlighted path and the edge of capacity 9 is left with 5; other capacities 3, 2, 1]

Slide credit: Pushmeet Kohli

Flow = 2 + 4

slide-116
SLIDE 116

Perceptual and Sensory Augmented Computing Computer Vision WS 08/09

Maxflow Algorithms

[Figure: residual graph with capacities 3, 5, 2, 1]

Slide credit: Pushmeet Kohli

Flow = 6

slide-117
SLIDE 117

Perceptual and Sensory Augmented Computing Computer Vision WS 08/09

Maxflow Algorithms

[Figure: residual graph; an augmenting path through the edges with capacities 3, 5 and 1 is highlighted; other capacity 2]

Slide credit: Pushmeet Kohli

Flow = 6

slide-118
SLIDE 118

Perceptual and Sensory Augmented Computing Computer Vision WS 08/09

Maxflow Algorithms

[Figure: residual graph; flow is pushed along the highlighted path, whose edges are now labeled 2, 4 and 1-1; other capacity 2]

Slide credit: Pushmeet Kohli

Flow = 6 + 1

slide-119
SLIDE 119

Perceptual and Sensory Augmented Computing Computer Vision WS 08/09

Maxflow Algorithms

[Figure: graph after the last augmentation, with edge labels 3, 5, 2]

Slide credit: Pushmeet Kohli

Flow = 7

slide-120
SLIDE 120

Perceptual and Sensory Augmented Computing Computer Vision WS 08/09

Maxflow Algorithms

[Figure: final graph, with edge labels 3, 5, 2]

Slide credit: Pushmeet Kohli

Flow = 7
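The three steps of the augmenting path scheme can be sketched as follows. This sketch uses breadth-first search to find each path (the Edmonds-Karp variant), which is one possible choice and not necessarily the one used on the slides. The edge directions of the example graph are an assumption inferred from the cut costs 16 and 7 shown earlier.

```python
from collections import deque

def max_flow(capacity, source, sink):
    """Return (max flow value, set of nodes on the source side of a min cut)."""
    # Residual capacities; every edge gets a reverse edge starting at 0.
    residual = {}
    for (u, v), c in capacity.items():
        residual.setdefault(u, {})
        residual.setdefault(v, {})
        residual[u][v] = residual[u].get(v, 0) + c
        residual[v].setdefault(u, 0)
    flow = 0
    while True:
        # Step 1: find a path from source to sink with positive residual capacity.
        parent = {source: None}
        queue = deque([source])
        while queue and sink not in parent:
            u = queue.popleft()
            for v, c in residual[u].items():
                if c > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if sink not in parent:
            break  # Step 3: no augmenting path is left.
        # Step 2: push the maximum possible flow (the bottleneck) along the path.
        bottleneck = float("inf")
        v = sink
        while parent[v] is not None:
            bottleneck = min(bottleneck, residual[parent[v]][v])
            v = parent[v]
        v = sink
        while parent[v] is not None:
            u = parent[v]
            residual[u][v] -= bottleneck
            residual[v][u] += bottleneck
            v = u
        flow += bottleneck
    # Nodes still reachable from the source form the source side of a min cut.
    return flow, set(parent)

# Assumed edge directions for the example graph.
capacity = {
    ("s", "v1"): 2, ("s", "v2"): 9,
    ("v1", "t"): 5, ("v2", "t"): 4,
    ("v1", "v2"): 2, ("v2", "v1"): 1,
}
flow, source_side = max_flow(capacity, "s", "t")
```

Under these assumptions the flow value agrees with the st-mincut cost of 7 from the earlier slide, as the min-cut/max-flow theorem requires.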

slide-121
SLIDE 121

Perceptual and Sensory Augmented Computing Computer Vision WS 08/09

Maxflow in Computer Vision

  • Specialized algorithms for vision problems

Ø Grid graphs
Ø Low connectivity (m ~ O(n))

  • Dual search tree augmenting path algorithm [Boykov and Kolmogorov PAMI 2004]

Ø Finds approximate shortest augmenting paths efficiently
Ø High worst-case time complexity
Ø Empirically outperforms other algorithms on vision problems
Ø Efficient code available on the web: http://www.adastral.ucl.ac.uk/~vladkolm/software.html

Slide credit: Pushmeet Kohli

slide-122
SLIDE 122

Perceptual and Sensory Augmented Computing Computer Vision WS 08/09

When Can s-t Graph Cuts Be Applied?

  • s-t graph cuts can only globally minimize binary energies that are submodular.

$$E(L) = \sum_p E_p(L_p) + \sum_{pq \in N} E(L_p, L_q), \qquad L_p \in \{s, t\}$$

Ø First sum: regional term (t-links)
Ø Second sum: boundary term (n-links)

E(L) can be minimized by s-t graph cuts if the pairwise terms satisfy submodularity (“convexity”):

$$E(s, s) + E(t, t) \leq E(s, t) + E(t, s)$$

  • Non-submodular cases can still be addressed with some optimality guarantees.

Ø Current research topic

[Boros & Hammer, 2002, Kolmogorov & Zabih, 2004] Slide credit: B. Leibe
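The submodularity condition is easy to test for a single pairwise term. A minimal sketch, assuming the pairwise energy is given as a table over the four label combinations; the helper names and the example values are hypothetical.

```python
def is_submodular(pairwise):
    """Check E(s,s) + E(t,t) <= E(s,t) + E(t,s) for one pairwise term."""
    return (pairwise[("s", "s")] + pairwise[("t", "t")]
            <= pairwise[("s", "t")] + pairwise[("t", "s")])

def potts(weight):
    """Potts term: zero cost for equal labels, `weight` for different labels."""
    return {("s", "s"): 0, ("t", "t"): 0, ("s", "t"): weight, ("t", "s"): weight}

smoothing_term = potts(3)  # penalizes label changes
# A term that rewards neighboring pixels for taking different labels.
repulsive_term = {("s", "s"): 1, ("t", "t"): 1, ("s", "t"): 0, ("t", "s"): 0}

ok = is_submodular(smoothing_term)
not_ok = is_submodular(repulsive_term)
```

A Potts term with non-negative weight always passes the check, which is why smoothness priors of this kind fit the graph cut framework; terms that prefer disagreeing neighbors do not.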

slide-123
SLIDE 123

Perceptual and Sensory Augmented Computing Computer Vision WS 08/09

Dealing with Non-Binary Cases

  • For image segmentation, the limitation to binary energies is a nuisance.

⇒ Binary segmentation only

  • We would also like to solve multi-label problems.

Ø NP-hard problem with 3 or more labels

  • There exist some approximation algorithms which extend graph cuts to the multi-label case

Ø α-Expansion
Ø αβ-Swap

  • They are no longer guaranteed to return the globally optimal result.

Ø But α-Expansion has a guaranteed approximation quality (2-approx) and converges in a few iterations.

Slide credit: B. Leibe

slide-124
SLIDE 124

Perceptual and Sensory Augmented Computing Computer Vision WS 08/09

α-Expansion Move

  • Basic idea:

Ø Break multi-way cut computation into a sequence of binary s-t cuts.

[Figure: a region labeled α next to the region of other labels]

Slide credit: Yuri Boykov

slide-125
SLIDE 125

Perceptual and Sensory Augmented Computing Computer Vision WS 08/09

α-Expansion Algorithm

  • 1. Start with any initial solution
  • 2. For each label “α” in any order

1. Compute the optimal α-expansion move (s-t graph cut): set the label of each node to α or leave it at its current label (so s = α and t = current label)

2. Decline the move if there is no energy decrease

  • 3. Iterate step 2.
  • Stop when no expansion move would decrease the energy
  • Why good? → each move is optimal within a very large set of possible segmentations (2^N)

Slide adapted from Yuri Boykov
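The outer loop of the algorithm can be sketched on a toy problem. Note the substitution: the optimal expansion move is found here by exhaustively trying all 2^N binary choices instead of by an s-t graph cut, which is only feasible for a handful of nodes. The 1D chain, the Potts smoothness weight and the observed values are all hypothetical.

```python
from itertools import product

def energy(labels, unary, weight):
    """Potts energy on a 1D chain: data costs plus `weight` per label change."""
    data = sum(unary[p][lab] for p, lab in enumerate(labels))
    smooth = sum(weight for a, b in zip(labels, labels[1:]) if a != b)
    return data + smooth

def best_expansion_move(labels, alpha, unary, weight):
    """Best labeling in which every node keeps its label or switches to alpha.

    Brute force over all 2^N choices; stands in for the s-t graph cut.
    """
    best, best_e = list(labels), energy(labels, unary, weight)
    for switch in product((False, True), repeat=len(labels)):
        cand = [alpha if s else lab for s, lab in zip(switch, labels)]
        e = energy(cand, unary, weight)
        if e < best_e:
            best, best_e = cand, e
    return best, best_e

def alpha_expansion(labels, num_labels, unary, weight):
    labels = list(labels)
    current = energy(labels, unary, weight)
    improved = True
    while improved:  # stop when no expansion move decreases the energy
        improved = False
        for alpha in range(num_labels):
            cand, e = best_expansion_move(labels, alpha, unary, weight)
            if e < current:  # decline moves without an energy decrease
                labels, current, improved = cand, e, True
    return labels, current

# Hypothetical noisy observations of a 3-label signal.
observed = [0, 0, 1, 0, 2, 2, 1, 2]
unary = [[2 * abs(obs - lab) for lab in range(3)] for obs in observed]
weight = 3
initial = [1] * len(observed)
final, final_energy = alpha_expansion(initial, 3, unary, weight)
```

The loop terminates because every accepted move strictly lowers an integer-valued energy. With a real graph cut solver, only `best_expansion_move` would change.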

slide-126
SLIDE 126

Perceptual and Sensory Augmented Computing Computer Vision WS 08/09

α-Expansion Moves

  • In each α-expansion a given label “α” grabs space from other labels

[Figure: initial solution followed by a sequence of seven α-expansion moves]

For each move we choose the expansion that gives the largest decrease in the energy: binary optimization problem

Slide credit: Yuri Boykov

slide-127
SLIDE 127

Perceptual and Sensory Augmented Computing Computer Vision WS 08/09

GraphCut Applications: “GrabCut”

  • Interactive Image Segmentation [Boykov & Jolly, ICCV’01]

Ø Rough region cues sufficient
Ø Segmentation boundary can be extracted from edges

  • Procedure

Ø User marks foreground and background regions with a brush → get initial segmentation → correct by additional brush strokes

[Figure panels: User segmentation cues; Additional segmentation cues]

Slide adapted from Matthieu Bray

slide-128
SLIDE 128

Perceptual and Sensory Augmented Computing Computer Vision WS 08/09

GrabCut: Data Model

  • Obtained from interactive user input

Ø User marks foreground and background regions with a brush
Ø Alternatively, user can specify a bounding box

[Figure labels: Foreground color; Background color; Global optimum of the unary energy]

Slide adapted from Carsten Rother

slide-129
SLIDE 129

Perceptual and Sensory Augmented Computing Computer Vision WS 08/09

GrabCut: Coherence Model

  • An object is a coherent set of pixels:

$$\psi(x, y) = \gamma \sum_{(m,n) \in C} \delta[x_n \neq x_m] \, e^{-\beta \|y_m - y_n\|^2}$$

How to choose γ ?

[Plot: Error (%) over training set; the value 25 appears on the plot]

Slide credit: Carsten Rother

slide-130
SLIDE 130

Perceptual and Sensory Augmented Computing Computer Vision WS 08/09

Iterated Graph Cuts

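[Figure panels: Result; Energy after each iteration (iterations 1 to 4)]

Color model (Mixture of Gaussians)

[Figure: two plots of pixel colors in the R-G plane, the first labeled “Foreground & Background” and “Background”, the second labeled “Foreground” and “Background”]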

Slide credit: Carsten Rother

slide-131
SLIDE 131

Perceptual and Sensory Augmented Computing Computer Vision WS 08/09

GrabCut: Example Results

slide-132
SLIDE 132

Perceptual and Sensory Augmented Computing Computer Vision WS 08/09

Improving Efficiency of Segmentation

  • Problem: Images contain many pixels

Ø Even with efficient graph cuts, an MRF formulation has too many nodes for interactive results.

  • Efficiency trick: Superpixels

Ø Group together similar-looking pixels for efficiency of further processing.
Ø Cheap, local oversegmentation
Ø Important to ensure that superpixels do not cross boundaries

  • Several different approaches possible

Ø Superpixel code available here: http://www.cs.sfu.ca/~mori/research/superpixels/

Image source: Greg Mori

Slide credit: B. Leibe

slide-133
SLIDE 133

Perceptual and Sensory Augmented Computing Computer Vision WS 08/09

Superpixels for Pre-Segmentation

[Figure panels: Speedup; Graph structure]

Slide credit: B. Leibe

slide-134
SLIDE 134

Perceptual and Sensory Augmented Computing Computer Vision WS 08/09

Summary: Graph Cuts Segmentation

  • Pros

Ø Powerful technique, based on probabilistic model (MRF). Ø Applicable for a wide range of problems. Ø Very efficient algorithms available for vision problems. Ø Becoming a de-facto standard for many segmentation tasks.

  • Cons/Issues

Ø Graph cuts can only solve a limited class of models

– Submodular energy functions – Can capture only part of the expressiveness of MRFs

Ø Only approximate algorithms available for multi-label case

Slide credit: B. Leibe