K-Medoids for K-Means Seeding, James Newling & François Fleuret




SLIDE 1

K-Medoids for K-Means Seeding

James Newling & François Fleuret

Machine Learning Group,

Idiap Research Institute & EPFL

December 5th, 2017

ÉCOLE POLYTECHNIQUE FÉDÉRALE DE LAUSANNE

SLIDE 2

The standard K-means pipeline

First: Seeding.
Second: Lloyd’s (a.k.a. K-means) algorithm.

[Figure: simulated data, K = 122, N = 25K. Uniform seeding then LLOYD gives E = 0.105; K-means++ seeding then LLOYD gives E = 0.072.]
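The two-stage pipeline on this slide can be sketched roughly as follows. This is a minimal NumPy illustration, not the authors' implementation. It assumes that E is the mean squared distance from each sample to its nearest center (the slides do not define E), and all function names are made up for the example.

```python
import numpy as np

def energy(X, C):
    """Assumed definition of E: mean squared distance to the nearest center."""
    d2 = ((X[:, None, :] - C[None, :, :]) ** 2).sum(axis=2)
    return d2.min(axis=1).mean()

def seed_uniform(X, K, rng):
    """Uniform seeding: pick K distinct samples at random."""
    return X[rng.choice(len(X), size=K, replace=False)].copy()

def seed_kmeanspp(X, K, rng):
    """K-means++ seeding: the first center is uniform, each later center is
    drawn with probability proportional to its squared distance to the
    nearest center chosen so far."""
    first = rng.integers(len(X))
    centers = [X[first]]
    d2 = ((X - X[first]) ** 2).sum(axis=1)
    for _ in range(K - 1):
        idx = rng.choice(len(X), p=d2 / d2.sum())
        centers.append(X[idx])
        d2 = np.minimum(d2, ((X - X[idx]) ** 2).sum(axis=1))
    return np.array(centers)

def lloyd(X, C, n_iter=100):
    """Lloyd's algorithm: alternate nearest-center assignment and mean update."""
    C = C.copy()
    for _ in range(n_iter):
        d2 = ((X[:, None, :] - C[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        new_C = C.copy()
        for k in range(len(C)):
            members = X[labels == k]
            if len(members) > 0:  # an empty cluster keeps its old center
                new_C[k] = members.mean(axis=0)
        if np.allclose(new_C, C):
            break
        C = new_C
    return C
```

Calling `lloyd(X, seed_kmeanspp(X, K, rng))` runs the pipeline end to end, and `energy` can be used to compare it with uniform seeding on the same data.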

SLIDE 3

The standard K-means pipeline (+CLARANS)

[Figure: simulated data, K = 122, N = 25K. Without CLARANS: uniform seeding then LLOYD gives E = 0.105; K-means++ seeding then LLOYD gives E = 0.072. With CLARANS inserted between seeding and LLOYD: E = 0.032 for both seedings.]

SLIDE 4

CLARANS of Ng and Han (1994)

while not converged do
    randomly choose 1 center and 1 non-center
    if swapping them decreases E then
        implement the swap
    end if
end while
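The loop above could be written in Python roughly as follows. This is a deliberately naive sketch with several assumptions that are not on the slide: `D` is a precomputed N x N dissimilarity matrix (for K-means seeding one might fill it with squared Euclidean distances), E is the mean dissimilarity to the nearest center, and "while not converged" is replaced by a fixed budget of proposals. The function names are hypothetical.

```python
import numpy as np

def medoid_energy(D, medoids):
    """Assumed E: mean dissimilarity from each sample to its nearest medoid."""
    return D[:, medoids].min(axis=1).mean()

def clarans(D, medoids, n_proposals=1000, seed=0):
    """Random-swap search: propose (center, non-center) pairs and keep a swap
    only if it lowers E. E is recomputed from scratch for every proposal,
    which is simple but slow."""
    rng = np.random.default_rng(seed)
    medoids = list(medoids)
    E = medoid_energy(D, medoids)
    N = D.shape[0]
    for _ in range(n_proposals):
        pos = rng.integers(len(medoids))  # which center to swap out
        cand = rng.integers(N)            # which sample to swap in
        if cand in medoids:               # must be a non-center
            continue
        trial = medoids.copy()
        trial[pos] = cand
        E_trial = medoid_energy(D, trial)
        if E_trial < E:                   # accept only improving swaps
            medoids, E = trial, E_trial
    return medoids, E
```

Because a swap is only ever accepted when it lowers E, the returned energy is never worse than that of the initial centers.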

SLIDE 5

CLARANS of Ng and Han (1994)

Avoids local minima of LLOYD by:

  • long-range swaps
  • updating centers and samples simultaneously.

SLIDE 6

CLARANS of Ng and Han (1994)

We present algorithmic improvements, where

  • computing new E is O(N/K)
  • implementing swap is O(N).
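The slide does not say how the faster evaluation is achieved. One ingredient that swap-based methods commonly rely on is caching, for every sample, its nearest and second-nearest center, so that a proposed swap can be scored without rebuilding the full sample-to-center table. The sketch below illustrates only that caching idea. It still costs O(N) per proposal and does not reproduce the O(N/K) figure, which would need further bookkeeping not described here. `D` (a precomputed N x N dissimilarity matrix), the definition of E as a mean, and the function names are all assumptions of the example.

```python
import numpy as np

def nearest_two(D, medoids):
    """For each sample: position (within `medoids`) of its nearest medoid,
    the distance d1 to it, and the distance d2 to the second nearest.
    Requires at least two medoids."""
    sub = D[:, medoids]
    order = np.argsort(sub, axis=1)
    rows = np.arange(D.shape[0])
    a1 = order[:, 0]
    return a1, sub[rows, a1], sub[rows, order[:, 1]]

def swap_energy(D, a1, d1, d2, pos, cand):
    """E after replacing medoid number `pos` by sample `cand`, from the cache.

    Samples whose nearest medoid is removed fall back to their second nearest;
    all others keep d1. Any sample may then switch to the new medoid if it is
    closer."""
    fallback = np.where(a1 == pos, d2, d1)
    return np.minimum(fallback, D[:, cand]).mean()
```

If a swap is accepted, the cached arrays have to be refreshed, for example by calling `nearest_two` again with the updated centers.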

SLIDE 7

Results

  • RNA dataset, d = 8, N = 16 × 10^4, K = 400
  • 50 runs without CLARANS (red), 24 runs with (blue).

[Figure: E (1.0 to 1.8) versus time in seconds (0 to 3.5) for K-means++ then LLOYD (red) and K-means++ then CLARANS then LLOYD (blue).]

  • On 16 datasets, geometric mean improvement is 3%.

Available on GitHub: CLARANS with the Levenshtein metric for sequence data, the l0, l1, ..., l∞ metrics for sparse/dense vectors, and many others.
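Since K-medoids only needs a dissimilarity between pairs of samples, sequence data can be handled by plugging in an edit distance. As an illustration, here is the textbook dynamic-programming Levenshtein distance with a small pairwise-matrix helper. It is not taken from the authors' code, and `pairwise` is a hypothetical helper name.

```python
def levenshtein(a, b):
    """Minimum number of single-character insertions, deletions and
    substitutions needed to turn sequence a into sequence b."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(prev[j] + 1,                # delete ca
                            curr[j - 1] + 1,            # insert cb
                            prev[j - 1] + (ca != cb)))  # substitute or match
        prev = curr
    return prev[-1]

def pairwise(seqs, dist=levenshtein):
    """Full dissimilarity matrix as a list of lists (quadratic in len(seqs))."""
    return [[dist(s, t) for t in seqs] for s in seqs]
```

A matrix built this way can be passed to any swap-based K-medoids routine that works on precomputed dissimilarities.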

SLIDE 8

The end

james.newling@idiap.ch