Jian Pei: CMPT 459/741, Clustering (4)



Fuzzy Clustering

  • Each point xi takes a probability wij to belong

to a cluster Cj

  • Requirements

– For each point xi: $\sum_{j=1}^{k} w_{ij} = 1$
– For each cluster Cj: $0 < \sum_{i=1}^{m} w_{ij} < m$


Fuzzy C-Means (FCM)

Select an initial fuzzy pseudo-partition, i.e., assign values to all the wij
Repeat
  Compute the centroid of each cluster using the fuzzy pseudo-partition
  Recompute the fuzzy pseudo-partition, i.e., the wij
Until the centroids do not change (or the change is below some threshold)


Critical Details

  • Optimization on the sum of the squared error (SSE):

    $SSE(C_1, \ldots, C_k) = \sum_{j=1}^{k} \sum_{i=1}^{m} w_{ij}^{p} \, dist(x_i, c_j)^2$

  • Computing centroids:

    $c_j = \sum_{i=1}^{m} w_{ij}^{p} x_i \Big/ \sum_{i=1}^{m} w_{ij}^{p}$

  • Updating the fuzzy pseudo-partition:

    $w_{ij} = \left(1 / dist(x_i, c_j)^2\right)^{\frac{1}{p-1}} \Big/ \sum_{q=1}^{k} \left(1 / dist(x_i, c_q)^2\right)^{\frac{1}{p-1}}$

– When p = 2: $w_{ij} = \left(1 / dist(x_i, c_j)^2\right) \Big/ \sum_{q=1}^{k} \left(1 / dist(x_i, c_q)^2\right)$
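
These two update rules map directly onto a few lines of NumPy. The sketch below is a minimal illustration of one FCM iteration, not the lecture's reference implementation; the names fcm_step, X, W, and p are introduced only for this example.

import numpy as np

def fcm_step(X, W, p=2):
    """One FCM iteration: recompute centroids from W, then recompute W.
    X: (m, d) data matrix, W: (m, k) fuzzy pseudo-partition, p > 1."""
    Wp = W ** p
    # c_j = sum_i w_ij^p x_i / sum_i w_ij^p
    centroids = (Wp.T @ X) / Wp.sum(axis=0)[:, None]
    # squared distances dist(x_i, c_j)^2
    d2 = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    d2 = np.maximum(d2, 1e-12)                       # guard against division by zero
    # w_ij = (1/d2_ij)^(1/(p-1)) / sum_q (1/d2_iq)^(1/(p-1))
    inv = (1.0 / d2) ** (1.0 / (p - 1))
    return centroids, inv / inv.sum(axis=1, keepdims=True)

Iterating fcm_step until the centroids stop moving gives the loop described on the previous slide.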


Choice of P

  • When p → 1, FCM behaves like traditional

k-means

  • When p is larger, the cluster centroids

approach the global centroid of all data points

  • The partition becomes fuzzier as p

increases


Effectiveness


Mixture Models

  • A cluster can be modeled as a probability

distribution

– Practically, assume each distribution can be approximated well by a multivariate normal distribution

  • Multiple clusters form a mixture of different

probability distributions

  • A data set is a set of observations from a

mixture of models


Object Probability

  • Suppose there are k clusters and a set X of

m objects

– Let the j-th cluster have parameter θj = (µj, σj) – The probability that a point is in the j-th cluster is wj, w1 + …+ wk = 1

  • The probability of an object x is

    $prob(x \mid \Theta) = \sum_{j=1}^{k} w_j \, p_j(x \mid \theta_j)$

  • The probability of the data set X is

    $prob(X \mid \Theta) = \prod_{i=1}^{m} prob(x_i \mid \Theta) = \prod_{i=1}^{m} \sum_{j=1}^{k} w_j \, p_j(x_i \mid \theta_j)$
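
For a 1-d Gaussian mixture these two formulas take only a few lines of NumPy. The sketch below is illustrative; mixture_prob and its arguments are names invented for this example, and the component densities are assumed to be univariate Gaussians as in the example on the next slide (the equal weights 0.5 are also an assumption of this sketch).

import numpy as np

def mixture_prob(X, weights, mus, sigmas):
    """Per-object prob(x_i | Theta) = sum_j w_j p_j(x_i | theta_j),
    and prob(X | Theta) as the product of the per-object values."""
    X = np.asarray(X, dtype=float)[:, None]                      # (m, 1)
    dens = np.exp(-(X - mus) ** 2 / (2 * sigmas ** 2)) / (np.sqrt(2 * np.pi) * sigmas)
    per_object = (weights * dens).sum(axis=1)                    # (m,)
    return per_object, per_object.prod()

# parameters from the next slide's example, with assumed equal weights
probs, likelihood = mixture_prob([-3.9, 4.1], np.array([0.5, 0.5]),
                                 np.array([-4.0, 4.0]), np.array([2.0, 2.0]))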


Example

A univariate Gaussian:

$prob(x \mid \theta) = \frac{1}{\sqrt{2\pi}\,\sigma} e^{-\frac{(x-\mu)^2}{2\sigma^2}}$

With $\theta_1 = (-4, 2)$ and $\theta_2 = (4, 2)$:

$prob(x \mid \Theta) = \frac{1}{2\sqrt{2\pi}} e^{-\frac{(x+4)^2}{8}} + \frac{1}{2\sqrt{2\pi}} e^{-\frac{(x-4)^2}{8}}$


Maximum Likelihood Estimation

  • Maximum likelihood principle: if we know a

set of objects is from one distribution but do not know the parameter, we can choose the parameter that maximizes the probability

  • Maximize

    $prob(X \mid \Theta) = \prod_{i=1}^{m} \frac{1}{\sqrt{2\pi}\,\sigma} e^{-\frac{(x_i - \mu)^2}{2\sigma^2}}$

– Equivalently, maximize

    $\log prob(X \mid \Theta) = -\sum_{i=1}^{m} \frac{(x_i - \mu)^2}{2\sigma^2} - 0.5\, m \log 2\pi - m \log \sigma$
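
The log form follows from taking the logarithm of the product of Gaussian densities; a short derivation (standard algebra, added here for completeness):

$\log prob(X \mid \Theta)
  = \log \prod_{i=1}^{m} \frac{1}{\sqrt{2\pi}\,\sigma} e^{-\frac{(x_i - \mu)^2}{2\sigma^2}}
  = \sum_{i=1}^{m} \left( -\frac{(x_i - \mu)^2}{2\sigma^2} - \log(\sqrt{2\pi}\,\sigma) \right)
  = -\sum_{i=1}^{m} \frac{(x_i - \mu)^2}{2\sigma^2} - 0.5\, m \log 2\pi - m \log \sigma$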


EM Algorithm

  • Expectation Maximization algorithm

Select an initial set of model parameters
Repeat
  Expectation Step: for each object, calculate the probability that it belongs to each distribution θi, i.e., prob(xi | θi)
  Maximization Step: given the probabilities from the expectation step, find the new estimates of the parameters that maximize the expected likelihood
Until the parameters are stable
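
A minimal sketch of EM for a 1-d Gaussian mixture, assuming Gaussian components and NumPy; em_gmm and its defaults are illustrative names, not the lecture's code.

import numpy as np

def em_gmm(X, k, n_iter=100, seed=0):
    """E-step: membership probabilities; M-step: re-estimate weights, means, sigmas."""
    rng = np.random.default_rng(seed)
    X = np.asarray(X, dtype=float)
    w = np.full(k, 1.0 / k)
    mu = rng.choice(X, size=k, replace=False)
    sigma = np.full(k, X.std())
    for _ in range(n_iter):
        # E-step: prob(cluster j | x_i), i.e. the normalized w_j p_j(x_i | theta_j)
        dens = np.exp(-(X[:, None] - mu) ** 2 / (2 * sigma ** 2)) / (np.sqrt(2 * np.pi) * sigma)
        resp = w * dens
        resp /= resp.sum(axis=1, keepdims=True)
        # M-step: parameters that maximize the expected likelihood
        nk = resp.sum(axis=0)
        w = nk / len(X)
        mu = (resp * X[:, None]).sum(axis=0) / nk
        sigma = np.maximum(np.sqrt((resp * (X[:, None] - mu) ** 2).sum(axis=0) / nk), 1e-6)
    return w, mu, sigma

A convergence check on the parameters (rather than a fixed n_iter) matches the "until the parameters are stable" condition above.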


Advantages and Disadvantages

  • Mixture models are more general than k-

means and fuzzy c-means

  • Clusters can be characterized by a small

number of parameters

  • The results may satisfy the statistical

assumptions of the generative models

  • Computationally expensive
  • Need large data sets
  • Hard to estimate the number of clusters

Grid-based Clustering Methods

  • Ideas

– Using multi-resolution grid data structures – Using dense grid cells to form clusters

  • Several interesting methods

– CLIQUE – STING – WaveCluster


CLIQUE

  • CLustering In QUEst
  • Automatically identify subspaces of a high

dimensional data space

  • Both density-based and grid-based

CLIQUE: the Ideas

  • Partition each dimension into the same

number of equal length intervals

– Partition an m-dimensional data space into non-overlapping rectangular units
  • A unit is dense if the number of data points

in the unit exceeds a threshold

  • A cluster is a maximal set of connected

dense units within a subspace


CLIQUE: the Method

  • Partition the data space and find the number of

points in each cell of the partition

– Apriori: a k-d cell cannot be dense if one of its (k-1)-d projections is not dense

  • Identify clusters:

– Determine dense units in all subspaces of interest, and determine connected dense units in all subspaces of interest

  • Generate minimal description for the clusters

– Determine the minimal cover for each cluster
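
As a rough illustration of the grid-and-density step only (not the full CLIQUE algorithm, which also runs the Apriori-style search over subspaces and builds the minimal descriptions), the sketch below partitions each dimension into equal-length intervals, counts points per cell of the full space, and keeps the dense cells; dense_units, n_intervals, and threshold are names chosen for this example.

import numpy as np

def dense_units(X, n_intervals=10, threshold=5):
    """Partition every dimension into equal-length intervals, count points
    per grid cell, and return the cells whose count exceeds the threshold."""
    X = np.asarray(X, dtype=float)
    lo, hi = X.min(axis=0), X.max(axis=0)
    span = np.where(hi > lo, hi - lo, 1.0)           # avoid zero-width dimensions
    idx = np.clip(((X - lo) / span * n_intervals).astype(int), 0, n_intervals - 1)
    cells, counts = np.unique(idx, axis=0, return_counts=True)
    return {tuple(c): int(n) for c, n in zip(cells, counts) if n > threshold}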


CLIQUE: An Example

[Figure: dense units in the (age, salary (×$10,000)) and (age, vacation (week)) subspaces, with age ranging over 20-60]


CLIQUE: Pros and Cons

  • Automatically find subspaces of the highest

dimensionality with high density clusters

  • Insensitive to the order of input

– Does not presume any canonical data distribution

  • Scale linearly with the size of input
  • Scale well with the number of dimensions
  • The clustering result may be degraded at

the expense of simplicity of the method


Bad Cases for CLIQUE

Parts of a cluster may be missed
A cluster from CLIQUE may contain noise


Dimensionality Reduction

  • Clustering a high dimensional data set is

challenging

– Distance between two points could be dominated by noise

  • Dimensionality reduction: choosing the informative

dimensions for clustering analysis

– Feature selection: choosing a subset of existing dimensions – Feature construction: construct a new (small) set of informative attributes


Variance and Covariance

  • Given a set of 1-d points, how different are

those points?

– Standard deviation: $s = \sqrt{\frac{\sum_{i=1}^{n}(X_i - \bar{X})^2}{n-1}}$
– Variance: $s^2 = \frac{\sum_{i=1}^{n}(X_i - \bar{X})^2}{n-1}$

  • Given a set of 2-d points, are the two dimensions correlated?

– Covariance: $cov(X, Y) = \frac{\sum_{i=1}^{n}(X_i - \bar{X})(Y_i - \bar{Y})}{n-1}$


Principal Components

Art work and example from http://csnet.otago.ac.nz/cosc453/student_tutorials/principal_components.pdf


Step 1: Mean Subtraction

  • Subtract the mean from each dimension for each

data point

  • Intuition: centralizing the data set

Step 2: Covariance Matrix

$C = \begin{pmatrix} cov(D_1, D_1) & cov(D_1, D_2) & \cdots & cov(D_1, D_n) \\ cov(D_2, D_1) & cov(D_2, D_2) & \cdots & cov(D_2, D_n) \\ \vdots & \vdots & \ddots & \vdots \\ cov(D_n, D_1) & cov(D_n, D_2) & \cdots & cov(D_n, D_n) \end{pmatrix}$

where $D_1, \ldots, D_n$ are the n dimensions of the data

slide-24
SLIDE 24

Jian Pei: CMPT 459/741 Clustering (4) 24

Step 3: Eigenvectors and Eigenvalues

  • Compute the eigenvectors and the

eigenvalues of the covariance matrix

– Intuition: find those direction-invariant vectors as candidates for new attributes – Eigenvalues indicate how much the direction-invariant vectors are scaled; the larger, the better for manifesting the data variance


Step 4: Forming New Features

  • Choose the principal components and form new

features

– Typically, choose the top-k components


New Features

NewData = RowFeatureVector x RowDataAdjust

The first principal component is used
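
The four steps map onto a short NumPy routine. The sketch below is a minimal illustration (the name pca_transform is this example's, not the tutorial's): mean-subtract, build the covariance matrix, keep the top-k eigenvectors as the feature vector, and project the data onto them.

import numpy as np

def pca_transform(X, k):
    """Steps 1-4: mean subtraction, covariance matrix, top-k eigenvectors, projection."""
    X = np.asarray(X, dtype=float)
    X_adj = X - X.mean(axis=0)                       # Step 1: mean subtraction
    C = np.cov(X_adj, rowvar=False)                  # Step 2: covariance matrix
    eigvals, eigvecs = np.linalg.eigh(C)             # Step 3: eigenvectors and eigenvalues
    feature_vector = eigvecs[:, np.argsort(eigvals)[::-1][:k]]
    return X_adj @ feature_vector                    # Step 4: NewData

# e.g. project 2-d points onto the first principal component: pca_transform(points, k=1)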


Clustering in Derived Space

[Figure: 2-d data points plotted against axes X and Y with origin O]

  • 0.707x + 0.707y

Spectral Clustering

[Framework: Data → Affinity matrix W = [W_ij] → A = f(W) → compute the leading k eigenvectors of A (Av = λv) → clustering in the new space → project back to W to cluster the original data]


Affinity Matrix

  • Using a distance measure:

    $W_{ij} = e^{-\frac{dist(o_i, o_j)}{\sigma}}$

    where σ is a scaling parameter controlling how fast the affinity Wij decreases as the distance increases

  • In the Ng-Jordan-Weiss algorithm, Wii is set to 0


Clustering

  • In the Ng-Jordan-Weiss algorithm, we define a diagonal matrix D such that

    $D_{ii} = \sum_{j=1}^{n} W_{ij}$

  • Then, $A = D^{-\frac{1}{2}} W D^{-\frac{1}{2}}$

  • Use the k leading eigenvectors of A to form a new space

  • Map the original data to the new space and conduct clustering
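
A compact sketch of the Ng-Jordan-Weiss pipeline, assuming the affinity from the previous slide and using NumPy plus scikit-learn's KMeans for the clustering step in the new space; njw_spectral and sigma are names for this example.

import numpy as np
from sklearn.cluster import KMeans

def njw_spectral(X, k, sigma=1.0):
    """Affinity W, A = D^{-1/2} W D^{-1/2}, k leading eigenvectors, then k-means."""
    X = np.asarray(X, dtype=float)
    d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    W = np.exp(-d / sigma)                           # affinity as on the previous slide
    np.fill_diagonal(W, 0.0)                         # W_ii = 0 in NJW
    D_inv_sqrt = np.diag(1.0 / np.sqrt(W.sum(axis=1)))
    A = D_inv_sqrt @ W @ D_inv_sqrt
    eigvals, eigvecs = np.linalg.eigh(A)
    V = eigvecs[:, np.argsort(eigvals)[::-1][:k]]    # k leading eigenvectors
    V /= np.linalg.norm(V, axis=1, keepdims=True)    # row normalization, as in NJW
    return KMeans(n_clusters=k, n_init=10).fit_predict(V)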


Is a Clustering Good?

  • Feasibility

– Applying any clustering methods on a uniformly distributed data set is meaningless

  • Quality

– Are the clustering results meeting users' interest? – Clustering patients into clusters corresponding to various diseases or sub-phenotypes is meaningful – Clustering patients into clusters corresponding to male or female is not meaningful



Major Tasks

  • Assessing clustering tendency

– Are there non-random structures in the data?

  • Determining the number of clusters or other

critical parameters

  • Measuring clustering quality



Uniformly Distributed Data

  • Clustering uniformly distributed data is

meaningless

  • A uniformly distributed data set is generated

by a uniform data distribution



Hopkins Statistic

  • Hypothesis: the data is generated by a

uniform distribution in a space

  • Sample n points, p1, …, pn, uniformly from

the space of D

  • For each point pi, find the nearest neighbor of pi in D; let xi be the distance between pi and its nearest neighbor in D:

    $x_i = \min_{v \in D} \{dist(p_i, v)\}$


Hopkins Statistic

  • Sample n points, q1, …, qn, uniformly from D

  • For each qi, find the nearest neighbor of qi in D – {qi}; let yi be the distance between qi and its nearest neighbor in D – {qi}:

    $y_i = \min_{v \in D,\, v \neq q_i} \{dist(q_i, v)\}$

  • Calculate the Hopkins Statistic H:

    $H = \frac{\sum_{i=1}^{n} y_i}{\sum_{i=1}^{n} x_i + \sum_{i=1}^{n} y_i}$
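
A minimal NumPy sketch of the statistic as defined above. The name hopkins and the choice to draw the uniform probes from the bounding box of D are assumptions of this example.

import numpy as np

def hopkins(D, n, seed=0):
    """x_i: uniform probes to their nearest data point; y_i: sampled data
    points to their nearest other data point; H = sum(y) / (sum(x) + sum(y))."""
    rng = np.random.default_rng(seed)
    D = np.asarray(D, dtype=float)
    lo, hi = D.min(axis=0), D.max(axis=0)
    probes = rng.uniform(lo, hi, size=(n, D.shape[1]))            # p_1, ..., p_n
    x = np.linalg.norm(probes[:, None, :] - D[None, :, :], axis=2).min(axis=1)
    samples = D[rng.choice(len(D), size=n, replace=False)]        # q_1, ..., q_n
    dq = np.linalg.norm(samples[:, None, :] - D[None, :, :], axis=2)
    dq[dq == 0] = np.inf                                          # exclude the point itself
    y = dq.min(axis=1)
    return y.sum() / (x.sum() + y.sum())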


Explanation

  • If D is uniformly distributed, then $\sum_{i=1}^{n} y_i$ and $\sum_{i=1}^{n} x_i$ would be close to each other, and thus H would be around 0.5

  • If D is skewed, then $\sum_{i=1}^{n} y_i$ would be substantially smaller, and thus H would be close to 0

  • If H > 0.5, then it is unlikely that D has statistically significant clusters


Finding the Number of Clusters

  • Depending on many factors

– The shape and scale of the distribution in the data set – The clustering resolution required by the user

  • Many methods exist

– Set $k = \sqrt{n/2}$; each cluster has $\sqrt{2n}$ points on average – Plot the sum of within-cluster variances with respect to k, and find the first (or the most significant) turning point
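
For the second heuristic, the within-cluster variance curve can be computed with scikit-learn's KMeans, whose inertia_ attribute is the sum of within-cluster squared distances; the helper below is an illustrative sketch, not part of the lecture.

import numpy as np
from sklearn.cluster import KMeans

def within_cluster_variances(X, k_values):
    """Inertia for each candidate k; plot against k and look for the turning point."""
    return {k: KMeans(n_clusters=k, n_init=10).fit(X).inertia_ for k in k_values}

# e.g. scores = within_cluster_variances(X, range(1, 11))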


A Cross-Validation Method

  • Divide the data set D into m parts
  • Use m – 1 parts to find a clustering
  • Use the remaining part as the test set to test

the quality of the clustering

– For each point in the test set, find the closest centroid or cluster center – Use the squared distances between all points in the test set and the corresponding centroids to measure how well the clustering model fits the test set

  • Repeat m times for each value of k, use the

average as the quality measure
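
A sketch of this procedure for one value of k, assuming NumPy arrays and scikit-learn's KMeans as the clustering method; cv_clustering_score and its defaults are illustrative names for this example.

import numpy as np
from sklearn.cluster import KMeans

def cv_clustering_score(X, k, m=10, seed=0):
    """Fit k-means on m-1 folds, sum the squared distances from held-out points
    to their closest centroids, and average over the m repetitions."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(X)), m)
    scores = []
    for i in range(m):
        test = X[folds[i]]
        train = X[np.concatenate([folds[j] for j in range(m) if j != i])]
        centroids = KMeans(n_clusters=k, n_init=10).fit(train).cluster_centers_
        d2 = ((test[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        scores.append(d2.min(axis=1).sum())
    return float(np.mean(scores))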



Measuring Clustering Quality

  • Ground truth: the ideal clustering

determined by human experts

  • Two situations

– There is a known ground truth – the extrinsic (supervised) methods, comparing the clustering against the ground truth – The ground truth is unavailable – the intrinsic (unsupervised) methods, measuring how well the clusters are separated



Quality in Extrinsic Methods

  • Cluster homogeneity: the more pure the

clusters in a clustering are, the better the clustering

  • Cluster completeness: objects in the same

cluster in the ground truth should be clustered together

  • Rag bag: putting a heterogeneous object into a

pure cluster is worse than putting it into a rag bag

  • Small cluster preservation: splitting a small

cluster in the ground truth into pieces is worse than splitting a bigger one



Bcubed Precision and Recall

  • D = {o1, …, on}

– L(oi) is the cluster of oi given by the ground truth

  • C is a clustering on D

– C(oi) is the cluster-id of oi in C

  • For two objects oi and oj, the correctness is 1 if L(oi) = L(oj) ⟺ C(oi) = C(oj), and 0 otherwise


Bcubed Precision and Recall

  • Precision:

    $\text{Precision BCubed} = \frac{1}{n} \sum_{i=1}^{n} \frac{\sum_{j:\, i \neq j,\, C(o_i) = C(o_j)} Correctness(o_i, o_j)}{\|\{o_j \mid i \neq j,\, C(o_i) = C(o_j)\}\|}$

  • Recall:

    $\text{Recall BCubed} = \frac{1}{n} \sum_{i=1}^{n} \frac{\sum_{j:\, i \neq j,\, L(o_i) = L(o_j)} Correctness(o_i, o_j)}{\|\{o_j \mid i \neq j,\, L(o_i) = L(o_j)\}\|}$
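
A direct, unoptimized sketch of the two formulas; bcubed is a name chosen for this example, and objects whose cluster (or ground-truth category) contains no other object are skipped to avoid a zero denominator.

import numpy as np

def bcubed(C, L):
    """BCubed precision/recall from clustering labels C and ground-truth labels L."""
    C, L = np.asarray(C), np.asarray(L)
    n = len(C)
    precisions, recalls = [], []
    for i in range(n):
        others = np.arange(n) != i
        same_c = (C == C[i]) & others                # objects sharing o_i's cluster
        same_l = (L == L[i]) & others                # objects sharing o_i's category
        correct = same_c & same_l                    # Correctness(o_i, o_j) = 1
        if same_c.any():
            precisions.append(correct.sum() / same_c.sum())
        if same_l.any():
            recalls.append(correct.sum() / same_l.sum())
    return float(np.mean(precisions)), float(np.mean(recalls))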


Silhouette Coefficient

  • No ground truth is assumed
  • Suppose a data set D of n objects is partitioned

into k clusters, C1, …, Ck

  • For each object o,

– Calculate a(o), the average distance between o and every other object in the same cluster (compactness of a cluster; the smaller, the better) – Calculate b(o), the minimum average distance from o to every object in a cluster that o does not belong to (degree of separation from other clusters; the larger, the better)


Silhouette Coefficient

  • Then

    $a(o) = \frac{\sum_{o' \in C_i,\, o' \neq o} dist(o, o')}{|C_i| - 1}$

    $b(o) = \min_{C_j:\, o \notin C_j} \left\{ \frac{\sum_{o' \in C_j} dist(o, o')}{|C_j|} \right\}$

    $s(o) = \frac{b(o) - a(o)}{\max\{a(o), b(o)\}}$

  • Use the average silhouette coefficient of all objects as the overall measure
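
A small NumPy sketch of a(o), b(o), and s(o) as defined above; silhouette is an illustrative name, and singleton clusters are skipped because a(o) is undefined for them. scikit-learn's silhouette_score computes the same overall measure.

import numpy as np

def silhouette(X, labels):
    """Average silhouette coefficient over all objects."""
    X, labels = np.asarray(X, dtype=float), np.asarray(labels)
    dist = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    scores = []
    for i in range(len(X)):
        same = labels == labels[i]
        if same.sum() < 2:
            continue                                 # a(o) undefined for a singleton cluster
        a = dist[i, same].sum() / (same.sum() - 1)   # distance to itself is 0, so this is a(o)
        b = min(dist[i, labels == c].mean() for c in np.unique(labels) if c != labels[i])
        scores.append((b - a) / max(a, b))
    return float(np.mean(scores))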


Multi-Clustering

  • A data set may be clustered in different

ways

– In different subspaces, that is, using different attributes – Using different similarity measures – Using different clustering methods

  • Some different clusterings may capture

different meanings of categorization

– Orthogonal clusterings

  • Putting users in the loop



To-Do List

  • Read Chapters 10.5, 10.6, and 11.1
  • Find out how Gaussian mixture can be used

in SPARK MLlib

  • (for thesis-based graduate students only)

Learn LDA (Latent Dirichlet allocation) by yourself
