Unsupervised Image Segmentation Using Comparative Reasoning and Random Walks (PowerPoint presentation)


SLIDE 1

Unsupervised Image Segmentation Using Comparative Reasoning and Random Walks

Anuva Kulkarni Carnegie Mellon University Filipe Condessa Carnegie Mellon, IST-University of Lisbon Jelena Kovacevic Carnegie Mellon University

SLIDE 2

Outline

  • Motivation
    – Training-free methods
    – Comparative Reasoning
    – Related work
  • Approach
    – Winner Take All (WTA) Hash
    – Clustering based on Random Walks
  • Some experimental results

SLIDE 3

Acknowledgements

  • Example and test images taken from
    – Berkeley Segmentation Dataset (BSDS)
    – The Prague Texture Segmentation Data Generator and Benchmark

SLIDE 4

Motivation

  • Goals:
    – Segment images where the number of classes is unknown
    – Eliminate training data (it may not be available)
    – Fast pre-processing step for classification
  • Segmentation is a similarity search
  • Comparative Reasoning is rank correlation, using the machine-learning concept of "hashing"

SLIDE 5

Hashing

  • Used to speed up the searching process
  • A 'hash function' maps data values to keys or 'hash codes'
  • A hash table is a shortened representation of the data

[Figure: a hash function maps a value to a key/hash code such as 0111; a hash table maps hash values 001, 010, 011, 100 to Bird_type1, Bird_type2, Dog_type1, Fox_type1]

SLIDE 6

Hashing

  • Similar data points have the same (or nearby) hash keys or "hash codes"
  • Properties of hash functions:
    – A hash function always returns a number for an object
    – Two equal objects always have the same number
    – Two unequal objects may not always have different numbers

[Figure: input data mapped to hash codes]


Images from https://upload.wikimedia.org/wikipedia/commons/3/32/House_sparrow04.jpg (Wikipedia) and www.weknowyourdreams.com
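These properties can be demonstrated with Python's built-in `hash()`, used here purely as a stand-in for the slide's hash function (the object names are taken from the hash-table figure):

```python
# Python's built-in hash() satisfies the three properties above.
a = "Bird_type1"
b = "Dog_type1"

print(hash(a))                        # 1. always returns a number for an object
print(hash(a) == hash("Bird_type1"))  # 2. equal objects -> same number: True

# 3. unequal objects MAY share a number: folding hashes into a few
#    buckets (as a small hash table does) makes collisions routine
def bucket(obj, n_buckets=4):
    return hash(obj) % n_buckets

print(bucket(a), bucket(b))  # bucket indices in 0..3; they may coincide
```

Collisions are not a defect here: for similarity search, we *want* similar items to land in the same bucket.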

SLIDE 7

Hashing for Segmentation

  • Each pixel is described by some feature vectors (e.g., color)
  • Hashing is used to cluster them into groups

[Figure: color features of each pixel are computed; similar features are hashed into the same groups, with codes such as 1110, 0110, 0111, 0001]

SLIDE 8

Segmentation and Randomized Hashing

  • Random hashing, i.e., using a hash code to indicate the region in which a feature vector lies after splitting the space with a set of randomly chosen splitting planes

  • C. J. Taylor and A. Cowley, "Fast segmentation via randomized hashing," in BMVC, pp. 1–11, 2009.

[Figure: feature space split by randomly chosen planes 1–3; each region is labeled with a binary code, e.g. 1001, 0111, 0000]
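A minimal sketch of the idea, assuming toy 2-D features and splitting planes through the origin (this illustrates the hashing scheme, not Taylor and Cowley's actual implementation):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy 2-D feature vectors: two nearby points and one distant point
points = np.array([[2.0, 1.0], [2.1, 0.9], [-3.0, 4.0]])

# Four randomly chosen splitting planes through the origin; each plane
# contributes one bit of the region code (which side the point lies on)
planes = rng.standard_normal((4, 2))

bits = (points @ planes.T) > 0
codes = ["".join("1" if b else "0" for b in row) for row in bits]
print(codes)  # one 4-bit region code per point; exact codes depend on the draw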

SLIDE 9

Winner Take All (WTA) Hash

  • A way to convert feature vectors into compact binary hash codes
  • The absolute value of a feature does not matter; only the ordering of the values matters
  • Rank correlation is preserved
    – Stability
  • Distance between hashes approximates rank correlation

  • J. Yagnik, D. Strelow, D. A. Ross, and R.-S. Lin, "The power of comparative reasoning," in ICCV, pp. 2431–2438, IEEE, 2011.

SLIDE 10

Calculating WTA Hash

  • Consider 3 feature vectors

Step 1: Create random permutations

feature 1 = [13  4  2 11  5  3]
feature 2 = [12  5  3 10  4  2]
feature 3 = [ 1 90 44  5 15  6]

Permutation vector θ = [3 1 5 2 6 4]

Step 1: permute each feature with θ:
feature 1 → [ 2 13  5  4  3 11]
feature 2 → [ 3 12  4  5  2 10]
feature 3 → [44  1 15 90  6  5]

SLIDE 11

Calculating WTA Hash

  • Step 2: Choose first K entries. Let K=3

Step 1 (permute with θ) gives:
feature 1 → [ 2 13  5  4  3 11]
feature 2 → [ 3 12  4  5  2 10]
feature 3 → [44  1 15 90  6  5]

Step 2: choose the first K = 3 entries:
feature 1 → [ 2 13  5]
feature 2 → [ 3 12  4]
feature 3 → [44  1 15]

SLIDE 12

Calculating WTA Hash

  • Step 3: Pick the index of the maximum entry; this index is the hash code 'h' of that feature vector

Step 2 (first K = 3 entries of each permuted feature):
feature 1 → [ 2 13  5]
feature 2 → [ 3 12  4]
feature 3 → [44  1 15]

Step 3: the hash code is the index of the top entry out of the K:
feature 1 → h = 2
feature 2 → h = 2
feature 3 → h = 1
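The three steps can be sketched in a few lines of Python, using the feature vectors and permutation θ from the slides' example (indices are reported 1-based to match the slides):

```python
import numpy as np

def wta_hash(x, theta, K):
    """WTA hash of one feature vector.
    Step 1: permute x with theta; Step 2: keep the first K entries;
    Step 3: the hash code is the index of the maximum among them."""
    return int(np.argmax(np.asarray(x)[theta][:K]))

theta = np.array([3, 1, 5, 2, 6, 4]) - 1  # the slides' 1-based permutation
f1 = [13, 4, 2, 11, 5, 3]
f2 = [12, 5, 3, 10, 4, 2]   # f1 with every entry perturbed by one
f3 = [1, 90, 44, 5, 15, 6]  # a very different feature

# Report 1-based indices to match the slides
codes = [wta_hash(f, theta, K=3) + 1 for f in (f1, f2, f3)]
print(codes)  # [2, 2, 1] -- features 1 and 2 collide, feature 3 does not
```

In practice many independent permutations θ are drawn, so each feature vector gets a vector of hash codes rather than a single one.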

SLIDE 13

Calculating WTA Hash

Notice that Feature 2 is just Feature 1 perturbed by one, but Feature 3 is very different

With the same permutation θ and K = 3, the hash codes are h = 2, h = 2, and h = 1: Features 1 and 2 are similar and receive the same code, while Feature 3 does not.

SLIDE 14

Random Walks

  • Understanding proximity in graphs
  • Useful for propagation in graphs: creates probability maps
  • Similar to an electrical network with voltages and resistances
  • It is supervised: the user must specify seeds

[Figure: a small graph viewed as an electrical network; seed nodes are held at fixed voltages, edge weights (here 2, 2, 2, 1, 1, 1, 1) act as conductances, and the resulting interior node voltages (e.g. 0.16 V, 0.05 V) give the walker's probabilities]
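The electrical analogy can be made concrete: hold the seeds at fixed voltages and solve the graph Laplacian system; each interior node's voltage is the probability that a random walker started there reaches one seed before the other. A minimal sketch on a 4-node path graph with unit edge weights (an illustration of the analogy, not the full algorithm of Block III):

```python
import numpy as np

# Path graph: seed0 -- a -- b -- seed1, unit edge weights (conductances).
# Nodes 0 and 3 are seeds held at voltages 1 and 0; solve for nodes 1, 2.
W = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
L = np.diag(W.sum(axis=1)) - W  # graph Laplacian

seeds, interior = [0, 3], [1, 2]
v_seeds = np.array([1.0, 0.0])  # fixed seed voltages

# Dirichlet problem: L_uu x = -L_us v_s (the random-walker linear system)
L_uu = L[np.ix_(interior, interior)]
L_us = L[np.ix_(interior, seeds)]
x = np.linalg.solve(L_uu, -L_us @ v_seeds)
print(x)  # approximately [0.667, 0.333]
```

The node adjacent to seed 0 reaches it with probability 2/3; the node adjacent to seed 1 with probability 1/3, exactly as the resistor intuition suggests.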

SLIDE 15

Our Approach

[Pipeline:
Block I (Similarity Search): input image → random projections → WTA hash
Block II: transform to a graph with (nodes, edges)
Block III (RW Algorithm): automatic seed selection → probabilities from the RW algorithm → stop? If no, reselect seeds; if yes, produce the segmented output]

SLIDE 16

Block I: Similarity Search

[Pipeline figure repeated, with Block I (random projections and WTA hash) highlighted]

SLIDE 17

WTA hash

  • Image dimensions: P × Q × d
  • Project onto R randomly chosen hyperplanes
    – Each point in the image has R feature vectors

[Figure: the P × Q × d image is vectorized into a PQ × d matrix, then randomly projected onto R pairs of points]
SLIDE 18

WTA hash

  • Run WTA hash N times.

[Figure: after the random projections, each of the PQ points has R features]

Run the WTA hash for each point in the image. With K = 3, the possible hash-code values are 00, 01, 11. Repeat this N times to get a PQ × N matrix of hash codes.

SLIDE 19

Block II: Create Graph

[Pipeline figure repeated, with Block II (graph construction) highlighted]

SLIDE 20

Create Graph

  • Run the WTA hash N times → each point has N hash codes
  • The image is transformed into a lattice graph
  • Calculate the edge weight between nodes i and j:

ωi,j = exp(−β νi,j)

where:
νi,j = dH(i, j) / γ
dH(i, j) = average Hamming distance over all N hash codes of i and j
γ = scaling factor
β = weight parameter for the RW algorithm
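A sketch of the edge-weight computation. The hash codes and the values of β, γ, and N here are made up for illustration; only the formulas come from the slide:

```python
import numpy as np

beta, gamma, N = 90.0, 4.0, 8  # illustrative parameter values

# Hypothetical hash codes for three pixels (N codes each, values in 0..3)
codes = np.array([[1, 2, 1, 0, 3, 1, 2, 0],
                  [1, 2, 1, 0, 3, 1, 2, 1],   # nearly identical to pixel 0
                  [3, 0, 0, 2, 1, 2, 3, 3]])  # unrelated to pixel 0

def edge_weight(i, j):
    d_h = np.mean(codes[i] != codes[j])  # avg. Hamming distance over N codes
    nu = d_h / gamma                     # nu_ij = dH(i, j) / gamma
    return float(np.exp(-beta * nu))     # w_ij = exp(-beta * nu_ij)

w_near = edge_weight(0, 1)
w_far = edge_weight(0, 2)
print(w_near, w_far)  # the similar pair gets a far larger edge weight
```

Because the exponential decays quickly, nearly identical hash-code rows produce strong edges while dissimilar ones are effectively disconnected, which is what the RW clustering needs.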

SLIDE 21

Block III: RW Algorithm

[Pipeline figure repeated, with Block III (the RW algorithm) highlighted]

SLIDE 22

Seed Selection

  • The RW algorithm needs initial seeds to be defined
  • Seeds are drawn unsupervised using a Dirichlet process DP(G0, α)
    – G0 is the base distribution
    – α is the discovery parameter
  • Larger α leads to discovery of more classes

[Figure: samples drawn from DP(G0, α) for α = 1, α = 10, and α = 100; more classes are discovered as α grows]

SLIDE 23

Seed Selection

  • The probability that a new seed belongs to a new class is proportional to α
  • The probability for the ith sample having class label yi is given by a result of Blackwell and MacQueen (1973):

p(yi = c | y−i, α) = (nc−i + α/Ctot) / (n − 1 + α)

where:
Ctot = total number of classes
yi = class label c, with c ∈ {1, 2, ..., Ctot}
y−i = {yj | j ≠ i}
nc−i = number of samples in the cth class, excluding the ith sample

SLIDE 24

Seed Selection

  • Unsupervised, hence Ctot is taken to infinity
  • "Clustering effect" or "rich get richer"
  • Probability that a new class is discovered:

Class is non-empty:
lim (Ctot → ∞) p(yi = c | y−i, α) = nc−i / (n − 1 + α)    for all c with nc−i > 0

Class is empty or new:
lim (Ctot → ∞) p(yi ≠ yj for all j < i | y−i, α) = α / (n − 1 + α)    for all c with nc−i = 0
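The two limiting cases can be computed together; `crp_probs` is a hypothetical helper, and the class counts and α values below are illustrative:

```python
def crp_probs(counts, alpha):
    """Limiting class probabilities for the ith sample, given the counts
    n_c^{-i} of existing classes (excluding sample i).
    Returns one probability per existing class, plus one for a new class."""
    n = sum(counts) + 1                     # sample i is the nth sample
    denom = n - 1 + alpha
    existing = [c / denom for c in counts]  # n_c^{-i} / (n - 1 + alpha)
    new = alpha / denom                     # alpha / (n - 1 + alpha)
    return existing + [new]

# 4 earlier samples in two classes; alpha controls new-class discovery
print(crp_probs([3, 1], alpha=2.0))   # existing: 3/6 and 1/6; new class: 2/6
print(crp_probs([3, 1], alpha=20.0))  # the new-class probability dominates
```

The "rich get richer" effect is visible directly: the largest class has the largest probability of attracting the next sample, unless α is large enough to favor discovering a new class.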

SLIDE 25

Random Walks

  • Use the RW algorithm to generate probability maps in each iteration
  • Entropy is calculated from the probability maps
  • Entropy-based stopping criterion
    – As cluster purity increases, average image entropy decreases
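One way such an entropy criterion might be computed (a sketch; `avg_entropy` and the example probability maps are assumptions, not the authors' code):

```python
import numpy as np

def avg_entropy(prob_maps, eps=1e-12):
    """Average per-pixel Shannon entropy of RW probability maps.
    prob_maps: shape (num_classes, num_pixels); each column sums to 1."""
    p = np.clip(prob_maps, eps, 1.0)
    return float(np.mean(-np.sum(p * np.log2(p), axis=0)))

# Hypothetical probability maps for 2 classes over 4 pixels
uncertain = np.array([[0.5, 0.6, 0.4, 0.5],
                      [0.5, 0.4, 0.6, 0.5]])
confident = np.array([[0.99, 0.98, 0.01, 0.02],
                      [0.01, 0.02, 0.99, 0.98]])

# As the segmentation becomes purer, the average entropy drops;
# iterating (add seeds, rerun RW) can stop once the drop levels off
print(avg_entropy(uncertain) > avg_entropy(confident))  # True
```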

SLIDE 26

Experimental Results

[Figure: automatically picked seeds and segmentations on the Berkeley segmentation subset (avg. GCE of dataset = 0.186) and on histology images]

SLIDE 27

Experimental Results

[Figure: Prague texture benchmark results; TexGeo avg. GCE of dataset = 0.134, TexBTF avg. GCE of dataset = 0.061]
SLIDE 28

Experimental Results

  • Comparison measure: Global Consistency Error (GCE)*
    – Lower GCE indicates lower error


*C. Fowlkes, D. Martin, and J. Malik, “Learning affinity functions for image segmentation: Combining patch-based and gradient-based approaches,” vol. 2, pp. II–54, IEEE, 2003.

GCE score by number of features:

No. of features | BSDSubset | TexBTF | TexColor | TexGeo
10              | 0.179     | 0.063  | 0.159    | 0.102
20              | 0.180     | 0.065  | 0.159    | 0.129
40              | 0.186     | 0.061  | 0.156    | 0.134

SLIDE 29

Experimental Results

  • Comparison measure: Global Consistency Error (GCE)
    – Lower GCE indicates lower error
  • Comparison with other methods**:
    – Performed on the BSDS subset

Method           | GCE
Human            | 0.080
RAD              | 0.205
Seed             | 0.209
Learned Affinity | 0.214
Mean Shift       | 0.260
Normalized cuts  | 0.336

**E. Vazquez, J. Van De Weijer, and R. Baldrich, "Image segmentation in the presence of shadows and highlights," pp. 1–14, Springer, 2008.
SLIDE 30

Conclusions

  • Comparative reasoning and the Winner Take All hash enable fast similarity search
  • Our method performs unsupervised segmentation using context (Random Walks-based clustering)
  • There is no need to predefine the number of classes
  • The method can be used as a pre-processing step for classification of hyperspectral images, biomedical images, etc.

SLIDE 31

Thank you
