Lecture: k-means & mean-shift clustering
Juan Carlos Niebles and Ranjay Krishna
Stanford Vision and Learning Lab
CS 131, 22-Oct-2019


SLIDE 1

Lecture: k-means & mean-shift clustering

Juan Carlos Niebles and Ranjay Krishna Stanford Vision and Learning Lab

SLIDE 2

CS 131 Roadmap

  • Pixels: Convolutions, Edges, Descriptors
  • Segments: Resizing, Segmentation, Clustering
  • Images: Recognition, Detection, Machine learning
  • Videos: Motion, Tracking
  • Web: Neural networks, Convolutional neural networks

SLIDE 3

Recap: Image Segmentation

  • Goal: identify groups of pixels that go together
SLIDE 4

Recap: Gestalt Theory

  • Gestalt: whole or group
    – The whole is greater than the sum of its parts
    – Relationships among parts can yield new properties/features
  • Psychologists identified a series of factors that predispose a set of elements to be grouped (by the human visual system)

"I stand at the window and see a house, trees, sky. Theoretically I might say there were 327 brightnesses and nuances of colour. Do I have '327'? No. I have sky, house, and trees."

Max Wertheimer (1880-1943)

Untersuchungen zur Lehre von der Gestalt, Psychologische Forschung, Vol. 4, pp. 301-350, 1923. http://psy.ed.asu.edu/~classics/Wertheimer/Forms/forms.htm

SLIDE 5

Recap: Gestalt Factors

  • These factors make intuitive sense, but are very difficult to translate into algorithms.
SLIDE 6

What will we learn today?

  • K-means clustering
  • Mean-shift clustering

Reading: [FP] Chapters: 14.2, 14.4

  • D. Comaniciu and P. Meer, Mean Shift: A Robust Approach toward Feature Space Analysis, PAMI 2002.

SLIDE 7

What will we learn today?

  • K-means clustering
  • Mean-shift clustering

Reading: [FP] Chapters: 14.2, 14.4

  • D. Comaniciu and P. Meer, Mean Shift: A Robust Approach toward Feature Space Analysis, PAMI 2002.
SLIDE 8

Image Segmentation: Toy Example

  • These intensities define the three groups.
  • We could label every pixel in the image according to which of these primary intensities it is.
    – i.e., segment the image based on the intensity feature.
  • What if the image isn't quite so simple?

[Figure: input image and its intensity histogram, with black, gray, and white pixels forming three groups labeled 1, 2, 3]

Slide credit: Kristen Grauman

SLIDE 9

[Figure: two input images and their intensity histograms (pixel count vs. intensity)]

Slide credit: Kristen Grauman

SLIDE 10

  • Now, how do we determine the three main intensities that define our groups?
  • We need to cluster.

[Figure: input image and its intensity histogram]

Slide credit: Kristen Grauman

SLIDE 11

  • Goal: choose three "centers" as the representative intensities, and label every pixel according to which of these centers it is nearest to.
  • Best cluster centers are those that minimize the sum of squared distances (SSD) between all points and their nearest cluster center $c_i$:

$$\text{SSD} = \sum_{\text{cluster } i} \; \sum_{x \in \text{cluster } i} (x - c_i)^2$$

[Figure: intensity histogram over 0-255 with three cluster centers marked 1, 2, 3]

Slide credit: Kristen Grauman
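As a concrete illustration, a minimal NumPy sketch of this objective (a hypothetical helper, not from the slides; `points`, `centers`, and `labels` are assumed inputs):

```python
import numpy as np

def ssd(points, centers, labels):
    """Sum of squared distances between each point and its assigned center.

    points:  (N, d) data points
    centers: (K, d) cluster centers
    labels:  (N,) index of the center assigned to each point
    """
    residuals = points - centers[labels]   # (N, d) per-point residuals
    return np.sum(residuals ** 2)
```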

SLIDE 12

Clustering for Summarization

Goal: cluster to minimize variance in the data given clusters
  – Preserve information

$$c^*, \delta^* = \arg\min_{c,\,\delta} \frac{1}{N} \sum_{i}^{K} \sum_{j}^{N} \delta_{ij} \, (c_i - x_j)^2$$

where $x_j$ is a data point, $c_i$ is a cluster center, and $\delta_{ij}$ indicates whether $x_j$ is assigned to cluster $i$.

Slide: Derek Hoiem

SLIDE 13

Clustering

  • With this objective, it is a “chicken and egg” problem:

  – If we knew the cluster centers, we could allocate points to groups by assigning each to its closest center.
  – If we knew the group memberships, we could get the centers by computing the mean per group.

Slide credit: Kristen Grauman

SLIDE 14

K-means clustering

1. Initialize ($t = 0$): cluster centers $c_1, \dots, c_K$
2. Compute $\delta^t$: assign each point to the closest center
   – $\delta^t$ denotes the set of assignments of each point $x_j$ to cluster $c_i$ at iteration $t$:
   $$\delta^t = \arg\min_{\delta} \frac{1}{N} \sum_{i}^{K} \sum_{j}^{N} \delta_{ij}^{\,t-1} \, (c_i^{\,t-1} - x_j)^2$$
3. Compute $c^t$: update cluster centers as the mean of the points:
   $$c^t = \arg\min_{c} \frac{1}{N} \sum_{i}^{K} \sum_{j}^{N} \delta_{ij}^{\,t} \, (c_i^{\,t-1} - x_j)^2$$
4. Update $t = t + 1$; repeat steps 2-3 until stopped

Slide: Derek Hoiem
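A compact NumPy sketch of this loop (a minimal illustration assuming random initialization and Euclidean distance; not the course's reference implementation):

```python
import numpy as np

def kmeans(points, k, n_iters=100, seed=0):
    """Basic k-means: points is (N, d); returns (centers, labels)."""
    rng = np.random.default_rng(seed)
    # 1. Initialize: pick K distinct data points as initial centers
    centers = points[rng.choice(len(points), size=k, replace=False)]
    for _ in range(n_iters):
        # 2. Assignment step: label each point with its nearest center
        dists = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
        labels = np.argmin(dists, axis=1)
        # 3. Update step: move each center to the mean of its points
        new_centers = np.array([
            points[labels == i].mean(axis=0) if np.any(labels == i) else centers[i]
            for i in range(k)
        ])
        # 4. Stop when the centers no longer change
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return centers, labels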

SLIDE 15

K-means clustering

1. Initialize ($t = 0$): cluster centers $c_1, \dots, c_K$
   • Commonly used: random initialization
   • Or greedily choose K to minimize residual
2. Compute $\delta^t$: assign each point to the closest center
   • Typical distance measures:
     – Euclidean: $\text{sim}(x, x') = x^T x'$
     – Cosine: $\text{sim}(x, x') = \dfrac{x^T x'}{\|x\| \cdot \|x'\|}$
     – Others
3. Compute $c^t$: update cluster centers as the mean of the points:
   $$c^t = \arg\min_{c} \frac{1}{N} \sum_{i}^{K} \sum_{j}^{N} \delta_{ij}^{\,t} \, (c_i^{\,t-1} - x_j)^2$$
4. Update $t = t + 1$; repeat steps 2-3 until $c^t$ doesn't change anymore.

Slide: Derek Hoiem
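The two measures above as a small NumPy sketch (hypothetical helper names):

```python
import numpy as np

def inner_product_sim(x, y):
    # sim(x, x') = x^T x', as written on the slide
    return x @ y

def cosine_sim(x, y):
    # Inner product normalized by the vector lengths
    return (x @ y) / (np.linalg.norm(x) * np.linalg.norm(y))
```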

SLIDE 16

K-means clustering

  • Java demo: http://home.dei.polimi.it/matteucc/Clustering/tutorial_html/AppletKM.html

1. Initialize cluster centers
2. Assign points to clusters
3. Re-compute means
Repeat (2) and (3)

Illustration source: Wikipedia

SLIDE 17

K-means clustering

  • Converges to a local minimum solution
    – Initialize multiple runs
  • Better fit for spherical data
  • Need to pick K (number of clusters)
SLIDE 18

Segmentation as Clustering

[Figure: original image segmented into 2 clusters and into 3 clusters]

SLIDE 19

K-Means++

  • Can we prevent arbitrarily bad local minima?

1. Randomly choose the first center.
2. Pick a new center with probability proportional to $(x - c_i)^2$
   – i.e., the contribution of $x$ to the total error.
3. Repeat until K centers.

  • Expected error = $O(\log K)$ × optimal

Arthur & Vassilvitskii 2007

Slide credit: Steve Seitz
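A sketch of this seeding rule in NumPy (illustrative, assuming `points` is an (N, d) array; `kmeans_pp_init` is a hypothetical name):

```python
import numpy as np

def kmeans_pp_init(points, k, seed=0):
    """k-means++ seeding: spread initial centers out in proportion
    to squared distance from the centers chosen so far."""
    rng = np.random.default_rng(seed)
    # 1. First center: uniformly at random
    centers = [points[rng.integers(len(points))]]
    for _ in range(k - 1):
        # Squared distance from each point to its nearest chosen center
        d2 = np.min(
            [np.sum((points - c) ** 2, axis=1) for c in centers], axis=0
        )
        # 2. Sample the next center with probability proportional to d^2
        probs = d2 / d2.sum()
        centers.append(points[rng.choice(len(points), p=probs)])
    return np.array(centers)
```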

SLIDE 20

Feature Space

  • Depending on what we choose as the feature space, we can group pixels in different ways.
  • Grouping pixels based on intensity similarity
  • Feature space: intensity value (1D)

Slide credit: Kristen Grauman

SLIDE 21

Feature Space

  • Depending on what we choose as the feature space, we can group pixels in different ways.
  • Grouping pixels based on color similarity
  • Feature space: color value (3D)

[Figure: pixels plotted as points in (R, G, B) space, e.g., (255, 200, 250), (245, 220, 248), (15, 189, 2), (3, 12, 2)]

Slide credit: Kristen Grauman
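To make this concrete, a minimal sketch (assuming the image is an (H, W, 3) uint8 NumPy array; `color_features` is a hypothetical helper):

```python
import numpy as np

def color_features(image):
    """Flatten an (H, W, 3) RGB image into an (H*W, 3) feature matrix,
    one 3D color vector per pixel."""
    return image.reshape(-1, 3).astype(np.float64)
```

Feeding these features to the k-means sketch from earlier clusters pixels purely by color, ignoring where they sit in the image.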

SLIDE 22

Feature Space

  • Depending on what we choose as the feature space, we can group pixels in different ways.
  • Grouping pixels based on texture similarity
  • Feature space: filter bank responses (e.g., 24D)

[Figure: a filter bank of 24 filters, F1 ... F24]

Slide credit: Kristen Grauman

SLIDE 23

Smoothing Out Cluster Assignments

  • Assigning a cluster label per pixel may yield outliers.
  • How can we ensure they are spatially smooth?

[Figure: original image vs. image labeled by each cluster center's intensity, with isolated outlier pixels]

Slide credit: Kristen Grauman

SLIDE 24

Segmentation as Clustering

  • Depending on what we choose as the feature space, we can group pixels in different ways.
  • Grouping pixels based on intensity+position similarity
    ⇒ A way to encode both similarity and proximity.

[Figure: pixels plotted in (X, Y, Intensity) space]

Slide credit: Kristen Grauman

SLIDE 25

K-Means Clustering Results

  • K-means clustering based on intensity or color is essentially vector quantization of the image attributes
    – Clusters don't have to be spatially coherent

[Figure: image, intensity-based clusters, color-based clusters]

Image source: Forsyth & Ponce

SLIDE 26

K-Means Clustering Results

  • K-means clustering based on intensity or color is essentially vector quantization of the image attributes
    – Clusters don't have to be spatially coherent
  • Clustering based on (r,g,b,x,y) values enforces more spatial coherence (see the sketch below)

Image source: Forsyth & Ponce
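A sketch of this feature construction (illustrative; `pos_weight` is an assumed knob, not from the slides, trading spatial coherence against color similarity):

```python
import numpy as np

def rgbxy_features(image, pos_weight=1.0):
    """Build an (H*W, 5) matrix of (r, g, b, x, y) features per pixel.
    pos_weight scales how much position counts relative to color."""
    h, w, _ = image.shape
    rgb = image.reshape(-1, 3).astype(np.float64)
    ys, xs = np.mgrid[0:h, 0:w]                      # pixel coordinates
    xy = np.stack([xs.ravel(), ys.ravel()], axis=1).astype(np.float64)
    return np.hstack([rgb, pos_weight * xy])
```

Clustering these 5D vectors with the earlier k-means sketch groups pixels that are close in both color and image position.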

SLIDE 27

How to evaluate clusters?

  • Generative

– How well are points reconstructed from the clusters?

  • Discriminative

– How well do the clusters correspond to labels?

  • Can we correctly classify which pixels belong to the panda?

– Note: unsupervised clustering does not aim to be discriminative as we don’t have the labels.

Slide: Derek Hoiem

SLIDE 28

How to choose the number of clusters?

Try different numbers of clusters in a validation set and look at performance.

Slide: Derek Hoiem
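One common way to do this, sketched with scikit-learn (an assumption: the elbow heuristic on within-cluster SSD, which the slide doesn't prescribe):

```python
from sklearn.cluster import KMeans

def ssd_vs_k(points, k_values):
    """Fit k-means for each K and record the total within-cluster SSD
    (scikit-learn calls it inertia_); an elbow in this curve, measured
    on a validation set, suggests a reasonable K."""
    return {
        k: KMeans(n_clusters=k, n_init=10).fit(points).inertia_
        for k in k_values
    }

# e.g., ssd_vs_k(validation_points, range(2, 11))
```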

SLIDE 29

K-Means pros and cons

  • Pros
    – Finds cluster centers that minimize conditional variance (good representation of data)
    – Simple and fast, easy to implement
  • Cons
    – Need to choose K
    – Sensitive to outliers
    – Prone to local minima
    – All clusters have the same parameters (e.g., the distance measure is non-adaptive)
    – Can be slow: each iteration is O(KNd) for N d-dimensional points; e.g., K=10 clusters on a one-megapixel RGB image (d=3) is on the order of 10 × 10^6 × 3 = 3×10^7 distance terms per iteration
  • Usage
    – Unsupervised clustering
    – Rarely used for pixel segmentation
SLIDE 30

What will we learn today?

  • K-means clustering
  • Mean-shift clustering

Reading: [FP] Chapters: 14.2, 14.4

  • D. Comaniciu and P. Meer, Mean Shift: A Robust Approach toward Feature Space Analysis, PAMI 2002.
SLIDE 31

Mean-Shift Segmentation

  • An advanced and versatile technique for clustering-based segmentation

http://www.caip.rutgers.edu/~comanici/MSPAMI/msPamiResults.html

  • D. Comaniciu and P. Meer, Mean Shift: A Robust Approach toward Feature Space Analysis, PAMI 2002.

Slide credit: Svetlana Lazebnik

SLIDE 32

Mean-Shift Algorithm

  • Iterative Mode Search
    1. Initialize a random seed point and window W
    2. Calculate the center of gravity (the "mean") of W
    3. Shift the search window to the mean
    4. Repeat step 2 until convergence

Slide credit: Steve Seitz
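A minimal sketch of this mode search, assuming a flat circular window of radius `bandwidth` (a Gaussian-weighted mean is the common alternative; `mean_shift_mode` is a hypothetical name):

```python
import numpy as np

def mean_shift_mode(points, seed, bandwidth, tol=1e-5, max_iters=500):
    """Follow the mean-shift trajectory from `seed` to a density mode.

    points:    (N, d) data
    seed:      (d,) starting location of the window
    bandwidth: radius of the flat window W
    """
    mean = np.asarray(seed, dtype=np.float64)
    for _ in range(max_iters):
        # 2. Center of gravity of the points inside the window
        in_window = np.linalg.norm(points - mean, axis=1) < bandwidth
        if not np.any(in_window):
            break
        new_mean = points[in_window].mean(axis=0)
        # 3./4. Shift the window; stop when it no longer moves
        if np.linalg.norm(new_mean - mean) < tol:
            break
        mean = new_mean
    return mean
```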

SLIDES 33-39

Mean-Shift

[Figures: an animated sequence showing the region of interest, its center of mass, and the mean-shift vector; the window repeatedly shifts toward the local center of mass until it converges on a mode]

Slide by Y. Ukrainitz & B. Sarel

SLIDE 40

Real Modality Analysis

  • Tessellate the space with windows
  • Run the procedure in parallel

Slide by Y. Ukrainitz & B. Sarel

SLIDE 41

Real Modality Analysis

The blue data points were traversed by the windows towards the mode.

Slide by Y. Ukrainitz & B. Sarel

SLIDE 42

Mean-Shift Clustering

  • Cluster: all data points in the attraction basin of a mode
  • Attraction basin: the region for which all trajectories lead to the same mode

Slide by Y. Ukrainitz & B. Sarel

SLIDE 43

Mean-Shift Clustering/Segmentation

  • Find features (color, gradients, texture, etc.)
  • Initialize windows at individual pixel locations
  • Perform mean shift for each window until convergence
  • Merge windows that end up near the same "peak" or mode (see the sketch below)

Slide credit: Svetlana Lazebnik
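An end-to-end sketch using scikit-learn's off-the-shelf MeanShift (color-only features for brevity, so it omits the gradient/texture options above; the bandwidth value is an assumption to tune):

```python
import numpy as np
from sklearn.cluster import MeanShift

def mean_shift_segment(image, bandwidth=30.0):
    """Segment an (H, W, 3) RGB image by mean-shift in color space."""
    h, w, _ = image.shape
    features = image.reshape(-1, 3).astype(np.float64)
    ms = MeanShift(bandwidth=bandwidth, bin_seeding=True)
    labels = ms.fit_predict(features)
    # Paint each pixel with the mode (cluster center) it converged to
    segmented = ms.cluster_centers_[labels].reshape(h, w, 3)
    return segmented.astype(np.uint8), labels.reshape(h, w)
```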

SLIDE 44

Mean-Shift Segmentation Results

http://www.caip.rutgers.edu/~comanici/MSPAMI/msPamiResults.html

Slide credit: Svetlana Lazebnik

SLIDE 45

More Results


Slide credit: Svetlana Lazebnik

SLIDE 46

More Results


SLIDE 47

Problem: Computational Complexity

  • Need to shift many windows…
  • Many computations will be redundant.


Slide credit: Bastian Leibe

SLIDE 48

Speedups: Basin of Attraction

  • 1. Assign all points within radius r of the end point to the mode.

[Figure: points within radius r of the converged mode]

Slide credit: Bastian Leibe
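In code, this speedup might look like the following sketch (it reuses the hypothetical `mean_shift_mode` helper from the earlier sketch; the merge threshold for nearby modes is an assumption):

```python
import numpy as np

def cluster_with_basins(points, bandwidth, r):
    """Speedup 1: once a trajectory converges to a mode, claim every
    point within radius r of that mode instead of re-running them."""
    labels = np.full(len(points), -1)
    modes = []
    for j in range(len(points)):
        if labels[j] != -1:
            continue                      # already claimed by a basin
        mode = mean_shift_mode(points, points[j], bandwidth)
        # Merge with an existing mode if we landed near one
        for i, m in enumerate(modes):
            if np.linalg.norm(mode - m) < bandwidth / 2:
                break
        else:
            modes.append(mode)
            i = len(modes) - 1
        labels[np.linalg.norm(points - mode, axis=1) < r] = i
        labels[j] = i                     # ensure the seed itself is labeled
    return np.array(modes), labels
```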

SLIDE 49

Speedups

  • 2. Assign all points within radius r/c of the search path to the mode → reduces the number of data points to search.

[Figure: points within radius r/c of the window's trajectory]

Slide credit: Bastian Leibe

SLIDE 50

Technical Details

Comaniciu & Meer, 2002

SLIDE 51

Other Kernels


SLIDE 52

Technical Details

Comaniciu & Meer, 2002

  • Term 1: this is proportional to the density estimate at x (similar to equation 1 from two slides ago).
  • Term 2: this is the mean-shift vector that points towards the direction of maximum density. Both terms come from taking the derivative of the kernel density estimate:
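The equation on this slide is an image in the original. For reference, the standard decomposition from Comaniciu & Meer (2002): with kernel density estimate $\hat{f}(x) = \frac{1}{nh^d} \sum_{i=1}^{n} K\!\left(\frac{x - x_i}{h}\right)$, kernel profile $k$ with $K(u) = c_k\, k(\|u\|^2)$, and $g(s) = -k'(s)$, the gradient factors as

$$\nabla \hat{f}(x) \propto \underbrace{\left[ \sum_{i=1}^{n} g\!\left( \left\| \frac{x - x_i}{h} \right\|^2 \right) \right]}_{\text{Term 1}} \cdot \underbrace{\left[ \frac{\sum_{i=1}^{n} x_i\, g\!\left( \left\| \frac{x - x_i}{h} \right\|^2 \right)}{\sum_{i=1}^{n} g\!\left( \left\| \frac{x - x_i}{h} \right\|^2 \right)} - x \right]}_{\text{Term 2: mean-shift vector } m(x)}$$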

SLIDE 53

Technical Details

Comaniciu & Meer, 2002

Finally, the mean shift procedure from a given point $x^t$ is:
  1. Compute the mean-shift vector m:
  2. Translate the density window:
  3. Iterate steps 1 and 2 until convergence.
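The two update equations referenced in steps 1 and 2 are images in the original; in the notation of the previous slide they read (standard form from the cited paper):

$$m(x^t) = \frac{\sum_{i=1}^{n} x_i\, g\!\left( \left\| \frac{x^t - x_i}{h} \right\|^2 \right)}{\sum_{i=1}^{n} g\!\left( \left\| \frac{x^t - x_i}{h} \right\|^2 \right)} - x^t, \qquad x^{t+1} = x^t + m(x^t)$$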
SLIDE 54

Summary Mean-Shift

  • Pros
    – General, application-independent tool
    – Model-free: does not assume any prior shape (spherical, elliptical, etc.) on data clusters
    – Just a single parameter (window size h)
      • h has a physical meaning (unlike k-means)
    – Finds a variable number of modes
    – Robust to outliers
  • Cons
    – Output depends on window size
    – Window size (bandwidth) selection is not trivial
    – Computationally (relatively) expensive (~2 s/image)
    – Does not scale well with dimension of feature space

Slide credit: Svetlana Lazebnik

SLIDE 55

What have we learned today?

  • K-means clustering
  • Mean-shift clustering

Reading: [FP] Chapters: 14.2, 14.4

  • D. Comaniciu and P. Meer, Mean Shift: A Robust Approach toward Feature Space Analysis, PAMI 2002.