

  1. k-variates++: more pluses in the k-means++ (Poster #29, Mon. 3-7pm)
 Richard Nock, Raphaël Canyasse, Roksana Boreli, Frank Nielsen
 DATA61 (formerly NICTA) | ANU | TECHNION | ECOLE POLYTECHNIQUE | UNSW | SONY CS LABS, INC.
 www.data61.csiro.au

  2. In this talk
 ❖ A generalization of the popular k-means++ seeding: k-variates++
 ❖ Two theorems on k-variates++ (and more: see poster and paper!)
   ❖ guarantees on approximation of the global optimum
   ❖ likelihood ratio bound between neighbouring instances
 ❖ Applications: "reductions" between clustering algorithms + approximation bounds of new clustering algorithms, privacy
 k-variates++: more pluses in the k-means++ | Richard Nock, Raphael Canyasse, Roksana Boreli & Frank Nielsen, ICML 2016

  5. Motivation
 ❖ k-means++ seeding = a gold standard in clustering:
   ❖ utterly simple to implement (iteratively pick centers ∼ squared distance to previous centers)
   ❖ assumption-free (expected) approximation guarantee wrt the k-means global optimum:
     E_C[potential] ≤ 8 (2 + log k) · φ_opt (Arthur & Vassilvitskii, SODA 2007)
 ❖ Inspired many variants (tensor clustering, distributed, data stream, on-line, parallel, clustering without closed-form centroids, etc.)

  6. Motivation
 ❖ Approaches are spawns of k-means++:
   ❖ modify the algorithm (e.g. ∼ k-variates++), or
   ❖ use it as a building block
 ❖ Our objective:
   ❖ put all in the same "bag": a generalisation of k-means++ from which such approaches would be just "instantiations" ⇒ reductions, more applications
   ❖ because general ⇒ new applications

  7. k-means++ (Arthur & Vassilvitskii, SODA'07)
 Input: data A ⊂ R^d with |A| = m, k ∈ N*;
 Step 1: Initialise centers C ← ∅;
 Step 2: for t = 1, 2, ..., k:
   2.1: randomly sample a ∼ q_t over A, with q_1 = u_m and, for t > 1,
        q_t(a) = D_t(a) · (Σ_{a'∈A} D_t(a'))^{-1}, where D_t(a) = min_{x∈C} ‖a − x‖²_2;
   2.2: x ← a;
   2.3: C ← C ∪ {x};
 Output: C;
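As a concreteness check, the seeding loop above can be sketched in plain Python (our own sketch, not the authors' code; function and variable names are ours):

```python
import random

def kmeans_pp_seed(A, k, rng=None):
    """Sketch of k-means++ seeding (Arthur & Vassilvitskii, SODA'07).

    A: list of points (tuples of floats), k: number of centers.
    Returns k centers drawn from A."""
    rng = rng or random.Random(0)

    def sqdist(p, q):
        return sum((pi - qi) ** 2 for pi, qi in zip(p, q))

    # t = 1: q_1 = u_m, i.e. the first center is uniform over A.
    C = [rng.choice(A)]
    for t in range(2, k + 1):
        # D_t(a) = min_{x in C} ||a - x||^2 ; q_t(a) proportional to D_t(a).
        D = [min(sqdist(a, x) for x in C) for a in A]
        r = rng.random() * sum(D)
        acc, pick = 0.0, A[-1]
        for a, d in zip(A, D):
            acc += d
            if acc >= r:
                pick = a
                break
        C.append(pick)  # steps 2.2-2.3: x <- a ; C <- C ∪ {x}
    return C
```

The cumulative-sum loop implements sampling proportionally to D_t without normalising q_t explicitly.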

  8. k-variates++
 Input: data A ⊂ R^d with |A| = m, k ∈ N*, random variables {X_a, a ∈ A}, probe functions ℘_t: A → R^d (t ≥ 1);
 Step 1: Initialise centers C ← ∅;
 Step 2: for t = 1, 2, ..., k:
   2.1: randomly sample a ∼ q_t over A, with q_1 = u_m and, for t > 1,
        q_t(a) = D_t(a) · (Σ_{a'∈A} D_t(a'))^{-1}, where D_t(a) = min_{x∈C} ‖℘_t(a) − x‖²_2;
   2.2: randomly sample x ∼ X_a;
   2.3: C ← C ∪ {x};
 Output: C;
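The generalization admits the same sketch with two plug-in points: the probe ℘_t used to compute the sampling weights, and the density X_a from which the new center is drawn (again our own illustrative code; `probe` and `sample_X` are hypothetical names). With the identity probe and Dirac densities it reduces to plain k-means++:

```python
import random

def k_variates_pp(A, k, probe, sample_X, rng=None):
    """Sketch of k-variates++ seeding: probe(t, a) plays the role of ℘_t(a),
    sample_X(a, rng) draws from the random variable X_a."""
    rng = rng or random.Random(0)

    def sqdist(p, q):
        return sum((pi - qi) ** 2 for pi, qi in zip(p, q))

    C = []
    for t in range(1, k + 1):
        if t == 1:
            a = rng.choice(A)  # q_1 = u_m (uniform over A)
        else:
            # q_t(a) proportional to D_t(a) = min_{x in C} ||℘_t(a) - x||^2
            D = [min(sqdist(probe(t, b), x) for x in C) for b in A]
            r = rng.random() * sum(D)
            acc, a = 0.0, A[-1]
            for b, d in zip(A, D):
                acc += d
                if acc >= r:
                    a = b
                    break
        C.append(sample_X(a, rng))  # step 2.2: x ~ X_a
    return C

# Recovering k-means++: identity probe, Dirac density X_a at a.
identity_probe = lambda t, a: a
dirac = lambda a, rng: a
```

Distributed, streaming or private variants would swap in other `probe`/`sample_X` pairs.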

  9. Two theorems & applications

  10. Theorem 1: approximation of the global optimum
 ❖ k-means potential for C: φ(A; C) = Σ_{a∈A} ‖a − c(a)‖²_2, with c(a) = argmin_{c∈C} ‖a − c‖²_2
 ❖ Suppose ℘_t is η-stretching (η ≥ 0): for any optimal cluster A with size > 1 and any a' ∈ A,
   φ(A; {a'}) ≤ (1 + η) · (φ(A; C) / φ(℘_t(A); C)) · φ(℘_t(A); {℘_t(a')}), ∀t
 ❖ Then E_{C ∼ k-variates++}[φ(A; C)] ≤ (2 + log k) · Φ, with
   Φ = (6 + 4η) φ_opt + 2 φ_bias + 2 φ_var
   φ_opt = Σ_{a∈A} ‖a − c_opt(a)‖²_2
   φ_bias = Σ_{a∈A} ‖E[X_a] − c_opt(a)‖²_2
   φ_var = Σ_{a∈A} tr(cov[X_a])
 ❖ k-means++ special case: probe = Id, X_a = Diracs ⇒ φ_bias = φ_opt, φ_var = 0, η = 0 ⇒ Φ = 8 φ_opt
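A quick algebraic sanity check of the k-means++ special case (plain Python; the number is arbitrary, only the collapse of Φ is being checked): with the identity probe and Dirac densities, φ_bias = φ_opt, φ_var = 0 and η = 0, so Φ collapses to 8 φ_opt, recovering the Arthur-Vassilvitskii constant.

```python
def Phi(phi_opt, phi_bias, phi_var, eta):
    # Φ = (6 + 4η) φ_opt + 2 φ_bias + 2 φ_var  (Theorem 1)
    return (6 + 4 * eta) * phi_opt + 2 * phi_bias + 2 * phi_var

phi_opt = 3.5  # arbitrary; the collapse holds for any value
assert Phi(phi_opt, phi_bias=phi_opt, phi_var=0.0, eta=0.0) == 8 * phi_opt
```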

  13. Remarks
 ❖ Guarantee approaches the statistical lower bound (Fréchet-Cramér-Rao-Darmois)
 ❖ Can be better than the Arthur-Vassilvitskii bound, in particular if φ_bias < φ_opt
 ❖ φ_bias = a knob through which background / domain knowledge may improve the general bound
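To see the φ_bias knob at work numerically (illustrative numbers, not from the paper): if the densities X_a concentrate closer to the optimal centers than the data itself, φ_bias < φ_opt and Φ drops below the k-means++ value 8 φ_opt, provided η and φ_var stay small.

```python
def Phi(phi_opt, phi_bias, phi_var, eta):
    # Φ = (6 + 4η) φ_opt + 2 φ_bias + 2 φ_var  (Theorem 1)
    return (6 + 4 * eta) * phi_opt + 2 * phi_bias + 2 * phi_var

phi_opt = 1.0
# Hypothetical instance: densities cut the bias to 0.4 φ_opt at a small
# variance cost of 0.1 φ_opt, with a non-stretching probe (η = 0).
better = Phi(phi_opt, phi_bias=0.4 * phi_opt, phi_var=0.1 * phi_opt, eta=0.0)
assert better < 8 * phi_opt  # the general bound beats the k-means++ one
```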

  14. Applications
 ❖ Reductions from k-variates++ ⇒ approximability ratios:
   ❖ pick a clustering algorithm L,
   ❖ show that the expected output of L = that of k-variates++ for particular choices of X and ℘_t (note: no computational constraint, just need existence),
   ❖ get an approximability ratio for L!

  15. Summary (poster, paper)

 | Setting     | Algorithm L   | Probe functions ℘_t                                | Densities X                  |
 |-------------|---------------|----------------------------------------------------|------------------------------|
 | Batch       | k-means++     | Identity                                           | Diracs                       |
 | Distributed | d-k-means++   | Identity                                           | Uniform, support = subsets   |
 | Distributed | p+d-k-means++ | Identity                                           | Non-uniform, compact support |
 | Streaming   | s-k-means++   | synopses                                           | Diracs                       |
 | On-line     | ol-k-means++  | point (batch not hit) / closest center (batch hit) | Diracs                       |
