
Separation and convexity properties of hierarchical and non hierarchical clustering



  1. Separation and convexity properties of hierarchical and non hierarchical clustering. Patrice Bertrand (CEREMADE, Université Paris-Dauphine, Paris, France). Joint work with Jean Diatta (LIM, Université de La Réunion, Saint-Denis, France).

  2. Plan: (1) Background; (2) Ternary separation and convexity; (3) Characterizations of clustering structures; (4) Application to Cluster Analysis.


  6. Multi-level clustering structures
     • Hierarchies: Johnson (1967), Benzécri (1973)
     • Weak hierarchies: Bandelt & Dress (1989, 1994), Diatta & Fichet (1994, 1998), Bertrand & Janowitz (2002)
     • Pyramids (or pseudo-hierarchies): Diday (1984, 1986), Fichet (1984, 1986)
     • Paired hierarchies: Bertrand (2002, 2008), Bertrand & Brucker (2007)

  7. Definitions. A pair $\{A, B\}$ of subsets of $E$ (the ground set) is said to be
     ◮ hierarchical if $A \cap B \in \{A, B, \emptyset\}$.
     If $\{A, B\}$ is not hierarchical, then $A$ and $B$ cross each other. (Figure: two overlapping sets $A$ and $B$.)
     We use the following terminology for $\mathcal{F} \subseteq 2^E$:
     ◮ set-system: $\emptyset \notin \mathcal{F}$ and $E \in \mathcal{F}$
     ◮ total: for all $x \in E$, $\{x\} \in \mathcal{F}$
     ◮ closed: $\mathcal{F}$ is closed under nonempty intersections, i.e. $\forall \mathcal{G} \subseteq \mathcal{F}$, $\bigcap \mathcal{G} \in \mathcal{F} \cup \{\emptyset\}$
     ◮ (strongly) hierarchical: each pair $\{X, Y\} \subseteq \mathcal{F}$ is hierarchical

  8. Weak hierarchies. A collection $\mathcal{F} \subseteq 2^E$ is said to be weakly hierarchical if
     $\forall X, Y, Z \in \mathcal{F}$, $X \cap Y \cap Z \in \{X \cap Y,\ Y \cap Z,\ X \cap Z\}$.
     Necessary and sufficient condition: there are no $A_1, A_2, A_3 \in \mathcal{F}$ and $a_1, a_2, a_3 \in E$ such that $a_i \in A_j \iff i \neq j$.
     (Figure, forbidden configuration: three sets $A_1, A_2, A_3$ and three points $a_1, a_2, a_3$.)
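The triple-intersection definition of a weak hierarchy can be checked directly on a small finite family. The following is a minimal sketch, not taken from the slides; the function name `is_weakly_hierarchical` and the example families `pyramid` and `triangle` are illustrative assumptions.

```python
from itertools import combinations


def is_weakly_hierarchical(family):
    """Check X ∩ Y ∩ Z ∈ {X ∩ Y, Y ∩ Z, X ∩ Z} for every triple of members.

    Triples with a repeated member satisfy the condition trivially,
    so only triples of distinct members are tested.
    """
    sets = {frozenset(s) for s in family}
    for x, y, z in combinations(sets, 3):
        if (x & y & z) not in (x & y, y & z, x & z):
            return False
    return True


# Hypothetical examples: intervals of the order a < b < c, and the
# forbidden configuration made of the three 2-element subsets.
pyramid = [{"a"}, {"b"}, {"c"}, {"a", "b"}, {"b", "c"}, {"a", "b", "c"}]
triangle = [{"a", "b"}, {"b", "c"}, {"a", "c"}]

print(is_weakly_hierarchical(pyramid))
print(is_weakly_hierarchical(triangle))
```

The brute-force loop is cubic in the number of members, which is fine for illustration but not intended for large families.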

  9. Paired hierarchies. A collection $\mathcal{F} \subseteq 2^E$ is called paired hierarchical if each $\mathcal{F}$-member crosses at most one $\mathcal{F}$-member.
     Necessary and sufficient conditions:
     ◮ $\forall X, Y, Z \in \mathcal{F}$, at least 2 of $\{X, Y\}$, $\{Y, Z\}$, $\{X, Z\}$ are hierarchical
     ◮ "$X$ crosses $Y$" defines an equivalence relation whose class sizes are at most 2
     ◮ Forbidden configurations: (a), (b), (c) (figures omitted)
     The term paired-hierarchy is used since $\{\bigcup G : G \text{ is a class}\}$ is a hierarchy.
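As an illustration of the defining condition (each member crosses at most one other member), here is a small brute-force sketch. The names `cross` and `is_paired_hierarchical` and the sample families are assumptions made for this example only.

```python
def cross(a, b):
    """Two sets cross when their intersection is neither empty nor one of them."""
    inter = a & b
    return len(inter) > 0 and inter != a and inter != b


def is_paired_hierarchical(family):
    """Return True if every member crosses at most one other member."""
    sets = list({frozenset(s) for s in family})
    for x in sets:
        crossings = sum(1 for y in sets if y != x and cross(x, y))
        if crossings > 1:
            return False
    return True


# Hypothetical examples on the points a, b, c, d.
one_crossing = [{"a", "b"}, {"b", "c"}, {"a", "b", "c", "d"}]
two_crossings = [{"a", "b"}, {"b", "c"}, {"c", "d"}]

print(is_paired_hierarchical(one_crossing))
print(is_paired_hierarchical(two_crossings))
```

In `two_crossings` the set {b, c} crosses both {a, b} and {c, d}, so the family is not paired hierarchical.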

  10. Examples and counter-examples (figures omitted): paired hierarchies and weak hierarchies drawn on the points a, b, c, d.

  11. Correspondences between dissimilarities and multi-level clustering structures:
      $d$ (dissimilarity on $E$) $\longleftrightarrow$ $(\mathcal{F}, f)$, with $\mathcal{F} \subseteq 2^E$ and $f : \mathcal{F} \to \mathbb{R}_+$ increasing.
      $\phi : (\mathcal{F}, f) \mapsto \phi(\mathcal{F}, f)$ with $\phi(\mathcal{F}, f)(x, y) = \min \{ f(A) : x, y \in A,\ A \in \mathcal{F} \}$.
      Conversely, each dissimilarity $d$ is associated with:
      ◮ $D_d(x, y)$: closed ball of center $x \in E$ and radius $r = d(x, y)$,
        $D_d(x, y) = \{ z \in E : d(z, x) \le d(x, y) \}$
      ◮ $B_d(x, y)$: 2-ball generated by $x, y \in E$, in the sense of $d$,
        $B_d(x, y) = D_d(x, y) \cap D_d(y, x) = \{ z \in E : \max \{ d(z, x), d(z, y) \} \le d(x, y) \}$
      (Figure: the 2-ball generated by $x$ and $y$.)
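The map from an indexed family to a dissimilarity, and the two kinds of balls, can be sketched for a finite ground set as follows. This is an illustrative sketch under stated assumptions: the indexed family is stored as a dict from `frozenset` to index value, the function names are invented for this example, and returning `math.inf` when no member contains both points is a convention chosen here (it cannot happen when the ground set itself is a member).

```python
import math


def induced_dissimilarity(indexed_family, x, y):
    """phi(F, f)(x, y): smallest index f(A) over members A containing x and y."""
    values = [f_a for a, f_a in indexed_family.items() if x in a and y in a]
    return min(values) if values else math.inf  # assumption: inf if no member


def closed_ball(d, ground, x, y):
    """D_d(x, y): closed ball of center x and radius d(x, y)."""
    return {z for z in ground if d(z, x) <= d(x, y)}


def two_ball(d, ground, x, y):
    """B_d(x, y) = D_d(x, y) ∩ D_d(y, x)."""
    return closed_ball(d, ground, x, y) & closed_ball(d, ground, y, x)


# Hypothetical example: intervals of a < b < c indexed by f(A) = |A| - 1.
ground = {"a", "b", "c"}
members = [{"a"}, {"b"}, {"c"}, {"a", "b"}, {"b", "c"}, {"a", "b", "c"}]
indexed_family = {frozenset(m): len(m) - 1 for m in members}


def d(x, y):
    return induced_dissimilarity(indexed_family, x, y)


print(d("a", "c"), two_ball(d, ground, "a", "b"))
```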

  12. Separation relation. A ternary relation designates any subset of $E^3$. A (ternary) separation relation is a ternary relation of the following form:
      ◮ Given $\mathcal{F} \subseteq 2^E$, the ternary separation relation $s(\mathcal{F})$ is defined by $(x, y, z) \in s(\mathcal{F})$ if there exists an $\mathcal{F}$-member which contains $x$ and $y$ but not $z$.
      In what follows, we simply write $xyz \in s(\mathcal{F})$ in place of $(x, y, z) \in s(\mathcal{F})$.
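For a finite family, the separation relation can be enumerated directly from its definition. The sketch below is illustrative only; the function name `separation_relation` and the sample family are assumptions.

```python
from itertools import product


def separation_relation(family, ground):
    """Return s(F): all triples (x, y, z) such that some member of F
    contains x and y but not z."""
    sets = [frozenset(s) for s in family]
    return {
        (x, y, z)
        for x, y, z in product(ground, repeat=3)
        if any(x in s and y in s and z not in s for s in sets)
    }


# Hypothetical example on E = {a, b, c}.
sample_ground = ["a", "b", "c"]
sample_family = [{"a"}, {"b"}, {"c"}, {"a", "b"}, {"a", "b", "c"}]
s = separation_relation(sample_family, sample_ground)

print(("a", "b", "c") in s)  # {a, b} contains a and b but not c
print(("a", "c", "b") in s)  # only E contains both a and c
```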

  13. Convexity. Abstract convexity (van de Vel; cf. early 1950s).
      ◮ A collection $\mathcal{C} \subseteq 2^E$ is called a convexity on $E$ if $\emptyset, E \in \mathcal{C}$ and $\mathcal{C}$ is closed both under intersections and nested unions. $(E, \mathcal{C})$ is called a convex structure or a convexity space. Convex set: any member of $\mathcal{C}$.
      ◮ $\forall A \subseteq E$, $\mathrm{conv}_{\mathcal{C}}(A) = \langle A \rangle_{\mathcal{C}} = \bigcap \{ C : A \subseteq C \in \mathcal{C} \}$ is called the (convex) hull of $A$.
      ◮ Notation: $\langle a, b \rangle := \langle a, b \rangle_{\mathcal{C}}$
      ◮ Segment joining $a$ and $b$: the 2-polytope $\mathrm{conv}(\{a, b\})$
      ◮ $\langle\,,\rangle_{\mathcal{C}} : (a, b) \in E^2 \mapsto \langle a, b \rangle_{\mathcal{C}} \in 2^E$ is called the segment operator of the convexity $\mathcal{C}$.
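On a finite ground set, the hull of a set is simply the intersection of all convex sets containing it. A minimal sketch, assuming the convexity is given as an explicit list of sets (the function name `convex_hull` and the sample convexity are illustrative):

```python
def convex_hull(convexity, subset):
    """Intersect all members of the convexity that contain `subset`."""
    subset = frozenset(subset)
    hull = None
    for c in convexity:
        c = frozenset(c)
        if subset <= c:
            hull = c if hull is None else hull & c
    if hull is None:
        # Cannot happen when E belongs to the convexity and subset ⊆ E.
        raise ValueError("no convex set contains the given subset")
    return hull


# Hypothetical convexity on E = {a, b, c}: the empty set plus the
# intervals of the order a < b < c.
sample_convexity = [set(), {"a"}, {"b"}, {"c"}, {"a", "b"}, {"b", "c"},
                    {"a", "b", "c"}]

print(convex_hull(sample_convexity, {"a", "c"}))  # segment joining a and c
```

Calling `convex_hull` on a two-element set gives the segment joining the two points in this finite setting.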

  14. Arity. The arity of $\mathcal{C}$ is $\le n$ if a set $C \subseteq E$ is convex whenever $\langle F \rangle = \mathrm{conv}(F) \subseteq C$ for all $F \subseteq C$ with $\#F \le n$.
      Rank. $A \subseteq E$ is called convexly independent if $a \notin \langle A \setminus \{a\} \rangle$ for all $a \in A$. The rank of a convex structure $(E, \mathcal{C})$ is defined as the maximum size of a convexly independent set.
      Interval operator. $I : E \times E \to 2^E$ is called an interval operator on $E$ if $\forall a, b \in E$, $a, b \in I(a, b) = I(b, a)$. $I(a, b)$: interval between $a$ and $b$; $(E, I)$: interval space.
      Example: the segment operator $\langle\,,\rangle_{\mathcal{C}} : (a, b) \in E^2 \mapsto \langle a, b \rangle_{\mathcal{C}}$ of any convexity $\mathcal{C}$.

  15. Notations.
      $\mathcal{G}_I := \{ C \subseteq E \mid \forall x, y \in C,\ I(x, y) \subseteq C \}$ is the convexity induced by $I$.
      $\mathcal{G}_{\langle,\rangle_{\mathcal{C}}}$ := interval convexity induced by the segment operator $\langle\,,\rangle_{\mathcal{C}}$.
      $\langle a, b \rangle_I$: segment between $a$ and $b$ in the sense of the convexity $\mathcal{G}_I$.
      Properties (Calder (1971)).
      ◮ $\forall a, b \in E$, $I(a, b) \subseteq \langle a, b \rangle_I$
      ◮ A convexity is induced by an interval operator iff its arity is $\le 2$.
      ◮ The hull of a set $A$ in an interval space is given by $\langle A \rangle = \bigcup_{k=0}^{\infty} A_k$, where $A_0 = A$ and, for all $k \in \mathbb{N}$, $A_{k+1} = \bigcup \{ I(a, a') \mid a, a' \in A_k \}$.
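The iterative hull formula and the membership test for the induced convexity translate almost literally into code when the ground set is finite. The sketch below is illustrative; `interval_hull`, `is_interval_convex` and the toy operator `midpoint_interval` are names and choices made for this example, not objects from the slides.

```python
def interval_hull(interval, subset):
    """Iterate A_{k+1} = union of I(a, a') over a, a' in A_k until stable."""
    current = set(subset)
    while True:
        nxt = set(current)
        for a in current:
            for b in current:
                nxt |= interval(a, b)
        if nxt == current:
            return frozenset(current)
        current = nxt


def is_interval_convex(interval, subset):
    """Membership in G_I: I(x, y) ⊆ C for all x, y in C."""
    c = set(subset)
    return all(interval(x, y) <= c for x in c for y in c)


def midpoint_interval(a, b):
    """Toy interval operator on integers: a, b and their midpoint if integral."""
    points = {a, b}
    if (a + b) % 2 == 0:
        points.add((a + b) // 2)
    return points


print(sorted(interval_hull(midpoint_interval, {0, 8})))
```

With this toy operator the hull of {0, 8} needs several iterations (first 4, then 2 and 6, then the odd points), which shows why the formula takes a union over all k rather than a single step.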

  16. Convexity induced by $\langle\,,\rangle_{\mathcal{C}}$.
      Lemma 1. Let $\mathcal{C}$ be a convexity on $E$.
      (i) $\langle\,,\rangle_{\mathcal{C}}$ and $\langle\,,\rangle_{\langle,\rangle_{\mathcal{C}}}$ coincide.
      (ii) We have $\{ \langle a, b \rangle_{\mathcal{C}} \mid a, b \in E \} \subseteq \mathcal{C} \subseteq \mathcal{G}_{\langle,\rangle_{\mathcal{C}}}$, where the two inclusions may be strict.
      Remark. It is easily checked that $xyz \in s(\mathcal{C}) \iff z \notin \langle x, y \rangle_{\mathcal{C}}$.

  17. Interval operators and Cluster Analysis.
      ◮ $B_d$ and $D_d$ are two interval operators defined on $E$.
      Lemma 2. For every dissimilarity $d$ on $E$ and all $x, y \in E$, there exist $u, v \in E$ such that $\langle x, y \rangle_{B_d} = \langle u, v \rangle_{B_d} = B_d(u, v)$.

  18. Separation, Interval operators and Weak Hierarchies.
      ◮ Bandelt and Dress (1994): a set-system $\mathcal{C}$ is weakly hierarchical iff for all distinct $x_1, x_2, x_3$ in $E$, $s(\mathcal{C})$ does not contain all three of $x_1 x_2 x_3$, $x_2 x_3 x_1$ and $x_3 x_1 x_2$.
      ◮ Let $I$ be an interval operator on $E$, and consider the condition
        (W) No $x, y, z \in E$ exist such that $x \notin I(y, z)$, $y \notin I(x, z)$ and $z \notin I(x, y)$.
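The separation-based criterion attributed to Bandelt and Dress can be tested by brute force on small examples. The following sketch is illustrative; the function names and the two sample set-systems are assumptions made here.

```python
from itertools import combinations


def separates(sets, x, y, z):
    """xyz ∈ s(C): some member contains x and y but not z."""
    return any(x in s and y in s and z not in s for s in sets)


def weakly_hierarchical_by_separation(family, ground):
    """Reject as soon as some triple of distinct points is separated
    in all three cyclic ways."""
    sets = [frozenset(s) for s in family]
    for x1, x2, x3 in combinations(sorted(ground), 3):
        if (separates(sets, x1, x2, x3)
                and separates(sets, x2, x3, x1)
                and separates(sets, x3, x1, x2)):
            return False
    return True


# Hypothetical set-systems on E = {a, b, c}.
E = {"a", "b", "c"}
intervals = [{"a"}, {"b"}, {"c"}, {"a", "b"}, {"b", "c"}, E]
all_pairs = [{"a", "b"}, {"b", "c"}, {"a", "c"}, E]

print(weakly_hierarchical_by_separation(intervals, E))
print(weakly_hierarchical_by_separation(all_pairs, E))
```

Only unordered triples are enumerated because the three separations form a set that is invariant under permuting the three points.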

  19. Proposition 3. Let $I$ be an interval operator and consider the following statements:
      (i) $I$ satisfies (W)
      (ii) $\langle\,,\rangle_I$ satisfies (W)
      (iii) $\mathcal{G}_I$ is weakly hierarchical
      (iv) $\mathcal{G}_I$ is of rank at most 2, i.e. if $\emptyset \neq A \subseteq E$, then $\langle A \rangle_I$ is of the form $\langle a, b \rangle_I$ for some $a, b \in A$.
      Then (i) $\Rightarrow$ (ii) $\Leftrightarrow$ (iii) $\Leftrightarrow$ (iv).

  20. Corollary 4. If the interval operator $I$ satisfies (W), then $\mathcal{G}_I = \{ \langle a, b \rangle_I \mid a, b \in E \}$.
      Definition 5 ($k$-ball). Let $A \subseteq E$ with $\#A = k > 2$, and denote $B^d_A = \{ x \in E \mid \forall a \in A,\ d(a, x) \le \mathrm{diam}_d A \}$.
      Proposition 6. If $(\mathcal{C}, f)$ is an indexed closed weakly hierarchical set system such that $f^{-1}(0) = \{ X \in \mathcal{C} \mid f(X) = 0 \}$ is a partition of $E$, then $B^{\phi((\mathcal{C}, f))}_A = \langle A \rangle_{\mathcal{C} \cup \{\emptyset\}}$ for every nonempty subset $A$ of $E$.
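The k-ball of Definition 5 is straightforward to compute once a dissimilarity is available as a function. A minimal sketch, with the absolute difference on a few integers as a purely hypothetical dissimilarity (the names `diameter`, `k_ball` and `abs_diff` are assumptions):

```python
def diameter(d, subset):
    """diam_d(A): largest dissimilarity between two points of A."""
    return max(d(a, b) for a in subset for b in subset)


def k_ball(d, ground, subset):
    """B^d_A: points within diam_d(A) of every point of A."""
    diam = diameter(d, subset)
    return {x for x in ground if all(d(a, x) <= diam for a in subset)}


def abs_diff(x, y):
    """Hypothetical dissimilarity on integers."""
    return abs(x - y)


points = set(range(10))
print(sorted(k_ball(abs_diff, points, {2, 3, 5})))
```

Here the diameter of {2, 3, 5} is 3, so the ball keeps the points within distance 3 of each of 2, 3 and 5.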

  21. Notation 7. $B_{(\mathcal{C}, f)} := B_{\phi((\mathcal{C}, f))}$
      Corollary 8. Let $\mathcal{C}$ be a set-system on $E$. The following are equivalent:
      (i) $\mathcal{C}$ is closed and weakly hierarchical
      (ii) $\mathcal{C} \cup \{\emptyset\} = \mathcal{G}_I$ for some interval operator $I$ satisfying (W)
      (iii) $\mathcal{C} \cup \{\emptyset\} = \mathcal{G}_{B_{(\mathcal{C}, f)}}$ for some index $f$ on $\mathcal{C}$ satisfying $\bigcup f^{-1}(0) = E$
      (iv) $\mathcal{C} \cup \{\emptyset\} = \mathcal{G}_{B_{(\mathcal{C}, f)}}$ for every index $f$ on $\mathcal{C}$ satisfying $\bigcup f^{-1}(0) = E$
      Criterion to recognize whether a set system is weakly hierarchical: define $f$ by $f(A) := |A| - 1$.
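One possible reading of this recognition criterion, for a small total set-system, is sketched below: index each member by f(A) = |A| - 1, build the induced dissimilarity, form the 2-balls, enumerate the interval convexity they induce, and compare it with the family plus the empty set. This is an exploratory brute-force sketch under those assumptions, not the authors' algorithm; it enumerates all subsets of E, so it only suits tiny examples, and the function name is invented here.

```python
from itertools import combinations


def matches_two_ball_convexity(family, ground):
    """Compare C ∪ {∅} with the convexity induced by the 2-balls of the
    dissimilarity obtained from the index f(A) = |A| - 1.

    Assumes the family contains the ground set and all singletons.
    """
    sets = {frozenset(s) for s in family}
    points = sorted(ground)

    def d(x, y):
        # Induced dissimilarity: smallest |A| - 1 over members containing x, y.
        return min(len(s) - 1 for s in sets if x in s and y in s)

    def two_ball(x, y):
        r = d(x, y)
        return frozenset(z for z in points if max(d(z, x), d(z, y)) <= r)

    balls = {(x, y): two_ball(x, y) for x in points for y in points}

    def is_convex(c):
        return all(balls[x, y] <= c for x in c for y in c)

    induced = set()
    for k in range(len(points) + 1):
        for combo in combinations(points, k):
            candidate = frozenset(combo)
            if is_convex(candidate):
                induced.add(candidate)
    return induced == sets | {frozenset()}


# Hypothetical total set-systems on E = {a, b, c}.
universe = {"a", "b", "c"}
singletons = [{"a"}, {"b"}, {"c"}]
weak = singletons + [{"a", "b"}, {"b", "c"}, universe]
not_weak = singletons + [{"a", "b"}, {"b", "c"}, {"a", "c"}, universe]

print(matches_two_ball_convexity(weak, universe))
print(matches_two_ball_convexity(not_weak, universe))
```

In the second example every pair of distinct points is at dissimilarity 1, so each 2-ball is the whole ground set and the three 2-element members are not convex for the induced convexity.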
