Fuzzy Systems Fuzzy Clustering Rudolf Kruse Christian Moewes - - PowerPoint PPT Presentation



SLIDE 1

Fuzzy Systems

Fuzzy Clustering

Rudolf Kruse Christian Moewes

{kruse,cmoewes}@iws.cs.uni-magdeburg.de Otto-von-Guericke University of Magdeburg Faculty of Computer Science Department of Knowledge Processing and Language Engineering

  • R. Kruse, C. Moewes

Fuzzy Systems – Fuzzy Clustering 2009/12/13 1 / 76

slide-2
SLIDE 2

Outline

  • 1. Fuzzy Data Analysis

  Representation of a Datum · Data Analysis

  • 2. Clustering
  • 3. Basic Clustering Algorithms
  • 4. Distance Function Variants
  • 5. Objective Function Variants
  • 6. Cluster Validity
SLIDE 3

Fuzzy Data Analysis

datum:

  • something given
  • gets its sense in a certain context
  • describes the condition of a certain “thing”
  • carries information only if there are at least two different possibilities for the condition

  • is seen as the realization of a certain variable of a universe
SLIDE 4

Representation of a Datum

  • characteristic yes/no: universe consists of two elements
  • characteristic gradations: universe (finite), grade (figures)
  • observations/measurements: universe (Euclidean space)
  • continuous observations in space or time: universe (Hilbert space), e.g., spectrogram

  • gray-shaded images: universe (depends), e.g., x-ray images
  • expert opinion: universe (logic), e.g., statements, facts, rules
SLIDE 5

Data Analysis

1st level

  • valuation and examination with regard to simple, essential characteristics
  • analysis of frequency, reliability tests, outliers, credibility

2nd level

  • pattern matching
  • grouping observations (according to background knowledge, . . . )
  • maybe transformation with the aim of finding structures within the data

explorative data analysis

  • examination of data without a previously chosen mathematical model
SLIDE 6

Data Analysis

3rd level

  • analysis of data regarding one or more mathematical models
  • qualitative
  • formation relating to additional characteristics expressed by quality
  • e.g., introduction of the term of similarity for cluster analysis
  • quantitative
  • recognition of functional relations
  • e.g., approximation or regression analysis
SLIDE 7

Data Analysis

4th level

  • conclusion and evaluation of the conclusion
  • prediction of future or missing data (e.g., time line analysis)
  • assigning data to standards (e.g., spectrogram analysis)
  • combination of data (e.g., data fusion)
  • valuation of conclusions
  • possibly learning from data, model revision

problem

  • what to do in case of vague, imprecise or inconsistent data

⇒ fuzzy data analysis

  • common data is analyzed with fuzzy methods
SLIDE 8

Outline

  • 1. Fuzzy Data Analysis
  • 2. Clustering
  • 3. Basic Clustering Algorithms
  • 4. Distance Function Variants
  • 5. Objective Function Variants
  • 6. Cluster Validity
SLIDE 9

Clustering

  • clustering is an unsupervised learning task
  • goal: divide dataset s.t. both constraints hold
  • objects belonging to same cluster are as similar as possible
  • objects belonging to different clusters are as dissimilar as possible
  • similarity is usually measured in terms of distance function
  • the smaller the distance, the more similar two data tuples

Definition: d : ℝᵖ × ℝᵖ → [0, ∞) is a distance function if ∀x, y, z ∈ ℝᵖ: (i) d(x, y) = 0 ⇔ x = y (identity), (ii) d(x, y) = d(y, x) (symmetry), (iii) d(x, z) ≤ d(x, y) + d(y, z) (triangle inequality).

SLIDE 10

Distance Functions

Illustration of distance functions

  • Minkowski family

    d_k(x, y) = ( Σ_{d=1}^p (x_d − y_d)^k )^{1/k}

  • well-known special cases from this family are
    k = 1 : Manhattan or city block distance,
    k = 2 : Euclidean distance,
    k → ∞ : maximum distance, i.e. d_∞(x, y) = max_{d=1,…,p} |x_d − y_d|

[Figure: unit circles of the three distances for k = 1, k = 2, k → ∞]
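The family above can be sketched in a few lines of Python (a minimal illustration, not part of the original slides; the function name `minkowski` is our own, and the absolute value is used so that odd k also works):

```python
import math

def minkowski(x, y, k):
    """Minkowski distance d_k between two points given as sequences."""
    if math.isinf(k):  # k -> infinity: maximum (Chebyshev) distance
        return max(abs(a - b) for a, b in zip(x, y))
    return sum(abs(a - b) ** k for a, b in zip(x, y)) ** (1.0 / k)

x, y = (0.0, 0.0), (3.0, 4.0)
print(minkowski(x, y, 1))         # Manhattan / city block: 7.0
print(minkowski(x, y, 2))         # Euclidean: 5.0
print(minkowski(x, y, math.inf))  # maximum distance: 4.0
```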

SLIDE 11

Partitioning Algorithms

  • here, we only focus on partitioning algorithms
  • i.e., given c ∈ ℕ, find the best partition of the data into c groups
  • different from hierarchical techniques, which organize data in a nested sequence of groups

  • usually number of (true) clusters is unknown
  • using partitioning methods, however, we must specify c
SLIDE 12

Prototype-based Clustering

  • we focus on prototype-based clustering algorithms
  • i.e., clusters are represented by cluster prototypes Ci, i = 1, . . . , c
  • prototypes capture structure (distribution) of data in each cluster
  • set of prototypes C = {C1, . . . , Cc}
  • prototype Ci is n-tuple which consists of
  • cluster center ci, and
  • some additional parameters about size and shape of cluster
  • prototypes are constructed by clustering algorithms
SLIDE 13

Outline

  • 1. Fuzzy Data Analysis
  • 2. Clustering
  • 3. Basic Clustering Algorithms

  Hard c-means · Fuzzy c-means · Possibilistic c-means · Comparison of FCM and PCM

  • 4. Distance Function Variants
  • 5. Objective Function Variants
  • 6. Cluster Validity
SLIDE 14

Basic Clustering Algorithms

Center Vectors and Objective Functions

  • consider simplest cluster prototypes, i.e., center vectors Ci = (ci)
  • distance measure d based on an inner product, e.g., the Euclidean distance

  • all algorithms are based on objective functions J
  • quantify goodness of cluster models
  • must be minimized to obtain optimal clusters
  • algorithms determine best decomposition by minimizing J
SLIDE 15

Hard c-means

  • each data point xj in the dataset X = {x1, . . . , xn}, X ⊆ ℝᵖ, is assigned to exactly one cluster ⇒ each cluster Γi ⊂ X
  • the set of clusters Γ = {Γ1, . . . , Γc} must be an exhaustive partition of X into c non-empty and pairwise disjoint subsets Γi, 1 < c < n
  • the data partition is optimal when the sum of squared distances between cluster centers and the data points assigned to them is minimal

  • clusters should be as homogeneous as possible
SLIDE 16

Hard c-means

  • objective function of the hard c-means:

    J_h(X, U_h, C) = Σ_{i=1}^c Σ_{j=1}^n u_ij d_ij²

  • U = (u_ij) ∈ {0, 1}^{c×n} is called the partition matrix with

    u_ij = 1 if x_j ∈ Γ_i, and 0 otherwise

  • each data point is assigned to exactly one cluster:

    Σ_{i=1}^c u_ij = 1, ∀j ∈ {1, . . . , n}

  • every cluster must contain at least one data point:

    Σ_{j=1}^n u_ij > 0, ∀i ∈ {1, . . . , c}

SLIDE 17

Alternating Optimization Scheme

  • Jh depends on c and assignment U of data points to clusters
  • finding parameters that minimize Jh is NP-hard
  • hard c-means minimizes Jh by alternating optimization (AO)
  • 1. parameters to optimize are split into two groups
  • 2. one group is optimized holding the other group fixed (and vice

versa)

  • 3. iterative update scheme is repeated until convergence
  • it cannot be guaranteed that global optimum will be reached
  • algorithm may get stuck in local minimum
SLIDE 18

AO Scheme for Hard c-means

  • 1. choose initial ci, e.g., by randomly picking c data points from X
  • 2. hold C fixed and determine U that minimizes Jh
  • each data point is assigned to its closest cluster center:

    u_ij = 1 if i = argmin_{k=1,…,c} d_kj, and 0 otherwise

  • any other assignment would not minimize Jh for fixed clusters
  • 3. hold U fixed, update ci as the mean of all xj assigned to them
  • the mean minimizes the sum of squared distances in Jh, formally

    c_i = Σ_{j=1}^n u_ij x_j / Σ_{j=1}^n u_ij

  • 4. both steps are repeated until no change in C or U can be observed
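The four steps above can be sketched in plain Python (a minimal sketch under the slide's assumptions; `hard_c_means` and `dist2` are our own names, and ties and empty clusters are handled naively):

```python
import random

def dist2(x, y):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(x, y))

def hard_c_means(data, c, max_iter=100, seed=0):
    """Alternating optimization for hard c-means."""
    centers = random.Random(seed).sample(data, c)   # step 1: pick c data points
    assign = None
    for _ in range(max_iter):
        # step 2: assign each point to its closest cluster center
        new_assign = [min(range(c), key=lambda i: dist2(x, centers[i]))
                      for x in data]
        if new_assign == assign:                    # step 4: stop when U is stable
            break
        assign = new_assign
        # step 3: recompute each center as the mean of its assigned points
        for i in range(c):
            members = [x for x, a in zip(data, assign) if a == i]
            if members:
                centers[i] = tuple(sum(v) / len(members) for v in zip(*members))
    return centers, assign
```

On two well-separated groups the scheme converges in a few iterations; on harder data several restarts with different seeds are advisable, as the discussion slide notes.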
SLIDE 19

Example

  • symmetric dataset with two clusters
  • hard c-means assigns crisp label to data point in middle
  • is this very intuitive?
SLIDE 20

Discussion

  • hard c-means tends to get stuck in local minimum
  • it is necessary to conduct several runs with different initializations [Duda and Hart, 1973]
  • sophisticated initialization methods can be used as well, e.g., Latin hypercube sampling [McKay et al., 1979]
  • the best result of many clusterings can be chosen based on Jh
  • crisp memberships {0, 1} prohibit ambiguous assignments
  • when clusters are badly delineated or overlapping, relaxing the requirement uij ∈ {0, 1} is needed

SLIDE 21

Fuzzy Clustering

  • allows gradual memberships of data points to clusters in [0, 1]
  • flexibility to express: data point can belong to more than 1 cluster
  • membership degrees
  • offer finer degree of detail of data model
  • express how ambiguously/definitely xj should belong to Γi
  • solution spaces in form of fuzzy partitions of X = {x1, . . . , xn}
SLIDE 22

Fuzzy Clustering

  • clusters Γi have been classical subsets so far
  • now, they are represented by fuzzy sets µΓi of X
  • cluster assignment uij is now the membership degree of xj to Γi s.t. uij = µΓi(xj) ∈ [0, 1]

  • fuzzy label vector u = (u1j, . . . , ucj)T is linked to each xj
  • U = (uij) = (u1, . . . , un) is then called fuzzy partition matrix
  • two types of fuzzy cluster partitions have evolved
  • i.e., probabilistic and possibilistic
  • differ in constraints they place on membership degrees
SLIDE 23

Probabilistic Cluster Partition

Definition: Let X = {x1, . . . , xn} be the set of given examples and let c be the number of clusters (1 < c < n) represented by the fuzzy sets µΓi (i = 1, . . . , c). Then we call Uf = (uij) = (µΓi(xj)) a probabilistic cluster partition of X if

    Σ_{j=1}^n u_ij > 0, ∀i ∈ {1, . . . , c},   and
    Σ_{i=1}^c u_ij = 1, ∀j ∈ {1, . . . , n}

hold. The uij ∈ [0, 1] are interpreted as the membership degree of datum xj to cluster Γi relative to all other clusters.

SLIDE 24

Probabilistic Cluster Partition

  • first constraint guarantees that no cluster is empty
  • corresponds to the requirement in classical cluster analysis ⇒ no cluster, represented as a (classical) subset of X, is empty
  • 2nd condition: the sum of membership degrees must be 1 for each xj
  • each datum receives the same weight in comparison to all other data ⇒ all data are (equally) included into the cluster partition
  • related to classical clustering: partitions are exhaustive
  • consequence of both constraints:
  • no cluster can contain full membership of all data points
  • membership degrees for a given datum resemble the probabilities of being a member of the corresponding cluster

SLIDE 25

Example

[Figure: hard c-means (left) vs. fuzzy c-means (right) on the symmetric dataset]

  • the equidistant data point in the middle of the figure no longer gets an arbitrary assignment
  • in the fuzzy partition, it can be associated with the membership vector (0.5, 0.5)ᵀ to express the ambiguity of the assignment

SLIDE 26

Objective Function

  • minimize the objective function

    J_f(X, U_f, C) = Σ_{i=1}^c Σ_{j=1}^n u_ij^m d_ij²

    subject to

    Σ_{i=1}^c u_ij = 1, ∀j ∈ {1, . . . , n}   and   Σ_{j=1}^n u_ij > 0, ∀i ∈ {1, . . . , c}

  • the parameter m ∈ ℝ with m > 1 is called the fuzzifier

SLIDE 27

Fuzzifier

  • actual value of m determines “fuzziness” of classification
  • for m = 1 (i.e., Jh = Jf ), assignments remain hard
  • fuzzifiers m > 1 lead to fuzzy memberships [Bezdek, 1973]
  • clusters become softer/harder with higher/lower m
  • usually m = 2
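The effect of m can be seen directly from the FCM membership formula derived later in the lecture (a small illustration; `fcm_membership` is our own name):

```python
def fcm_membership(d2s, m):
    """Membership of one point to each cluster, computed from its squared
    distances d2s: u_i = d_i^(2/(1-m)) / sum_k d_k^(2/(1-m))."""
    w = [d ** (1.0 / (1.0 - m)) for d in d2s]
    s = sum(w)
    return [wi / s for wi in w]

# same squared distances (1 and 4), different fuzzifiers:
nearly_crisp = fcm_membership([1.0, 4.0], m=1.1)  # close to a hard assignment
soft = fcm_membership([1.0, 4.0], m=3.0)          # much softer memberships
```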
SLIDE 28

Reminder: Function Optimization

  • task: find x = (x1, . . . , xm) s.t. f(x) = f(x1, . . . , xm) is optimal
  • often feasible approach:
  • necessary condition for a (local) optimum (max./min.): the partial derivatives w.r.t. the parameters vanish ⇒ (try to) solve the equation system coming from setting all partial derivatives w.r.t. the parameters equal to zero
  • example task: minimize f(x, y) = x² + y² + xy − 4x − 5y
  • solution procedure:
  • 1. take the partial derivatives of the objective function and set them to zero:

    ∂f/∂x = 2x + y − 4 = 0,   ∂f/∂y = 2y + x − 5 = 0

  • 2. solve the resulting (here: linear) equation system: x = 1, y = 2
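The worked example can be checked numerically (a quick sanity check, not part of the original slides):

```python
def f(x, y):
    return x**2 + y**2 + x*y - 4*x - 5*y

x0, y0 = 1.0, 2.0
# both partial derivatives vanish at (1, 2)
assert 2*x0 + y0 - 4 == 0 and 2*y0 + x0 - 5 == 0
# nearby points give no smaller value, consistent with a minimum
assert all(f(x0 + dx, y0 + dy) >= f(x0, y0)
           for dx in (-0.1, 0.0, 0.1) for dy in (-0.1, 0.0, 0.1))
```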

SLIDE 29

Function Optimization with Constraints

  • often function must be optimized subject to certain constraints
  • here: restriction to k equality constraints Ci(x) = 0, i = 1, . . . , k
  • note: equality constraints describe subspace of domain of f
  • problem of optimization with constraints
  • the gradient of f may vanish outside the constrained subspace ⇒ unacceptable solution (violated constraints)
  • derivatives need not vanish at an optimum in the constrained subspace
  • one way to handle this problem are generalized coordinates:
  • exploit the dependence between parameters specified by the Ci to express some parameters by others ⇒ reduce the set x to a set x′ of independent parameters (generalized coordinates)
  • problem: can be clumsy (if possible at all) because the Ci may not allow this

SLIDE 30

Function Optimization with Constraints

  • a much more elegant approach is based on the following insights:
  • let x∗ be a (local) optimum of f(x) in the constrained subspace
    ⇒ the gradient ∇ₓf(x∗) must be perpendicular to the constrained subspace
  • the gradients ∇ₓCj(x∗), 1 ≤ j ≤ k, are all perpendicular to the constrained subspace (on which the Cj are constant, namely 0)
  • together they span the subspace perpendicular to the constrained subspace
    ⇒ it must be possible to find values λj, 1 ≤ j ≤ k, s.t.

    ∇ₓf(x∗) + Σ_{j=1}^k λj ∇ₓCj(x∗) = 0

  • if the gradients of the constraints are linearly independent, the λj are uniquely determined
SLIDE 31

Function Optimization: Lagrange Theory

⇒ we obtain Method of Lagrange Multipliers:

  • given: f (x) to be optimized and k equality constraints

Cj(x) = 0, 1 ≤ j ≤ k

  • procedure:
  • 1. construct the so-called Lagrange function by incorporating the Ci, i = 1, . . . , k, with (unknown) Lagrange multipliers λi:

    L(x, λ1, . . . , λk) = f(x) + Σ_{i=1}^k λi Ci(x)

  • 2. set the partial derivatives of the Lagrange function equal to zero:

    ∂L/∂x1 = 0, . . . , ∂L/∂xm = 0,   ∂L/∂λ1 = 0, . . . , ∂L/∂λk = 0

  • 3. (try to) solve resulting equation system
SLIDE 32

Function Optimization: Lagrange Theory

Observations

  • due to the representation of the gradient of f(x) at a local optimum x∗ in the constrained subspace, the gradient of L w.r.t. x vanishes at x∗ ⇒ the standard approach works again
  • if the constraints are satisfied, the additional terms have no influence ⇒ the original task is not modified (same objective function)
  • taking the partial derivative w.r.t. a Lagrange multiplier reproduces the corresponding equality constraint:

    ∀j; 1 ≤ j ≤ k : ∂L/∂λj (x, λ1, . . . , λk) = Cj(x)

    ⇒ the constraints enter the equation system to solve in a natural way

SLIDE 33

Lagrange Theory: Example 1

  • example task: minimize f(x, y) = x² + y² subject to x + y = 1
  • solution procedure:
  • 1. rewrite the constraint s.t. one side is zero: x + y − 1 = 0
  • 2. construct the Lagrange function by incorporating the constraint into f with a Lagrange multiplier λ:

    L(x, y, λ) = x² + y² + λ(x + y − 1)

  • 3. take the partial derivatives of the Lagrange function and set them to zero (necessary conditions for a minimum):

    ∂L/∂x = 2x + λ = 0,   ∂L/∂y = 2y + λ = 0,   ∂L/∂λ = x + y − 1 = 0

  • 4. solve the resulting (here: linear) equation system: λ = −1, x = y = 1/2
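The result is easy to verify numerically (a quick check, not part of the original slides):

```python
x, y, lam = 0.5, 0.5, -1.0
assert 2*x + lam == 0        # dL/dx vanishes
assert 2*y + lam == 0        # dL/dy vanishes
assert x + y - 1 == 0        # dL/dlambda reproduces the constraint
# along the constraint line y = 1 - x, the point x = y = 0.5 is minimal
assert all(x**2 + y**2 <= t**2 + (1 - t)**2 for t in (0.0, 0.25, 0.75, 1.0))
```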

SLIDE 34

Lagrange Theory: Example 2

  • example task: find the side lengths x, y, z of a box with maximum volume for a given area S of the surface
  • formally: maximize f(x, y, z) = xyz subject to 2xy + 2xz + 2yz = S
  • solution procedure:
  • 1. the constraint is C(x, y, z) = 2xy + 2xz + 2yz − S = 0
  • 2. the Lagrange function is

    L(x, y, z, λ) = xyz + λ(2xy + 2xz + 2yz − S)

  • 3. taking the partial derivatives yields (in addition to the constraint):

    ∂L/∂x = yz + 2λ(y + z) = 0,   ∂L/∂y = xz + 2λ(x + z) = 0,   ∂L/∂z = xy + 2λ(x + y) = 0

  • 4. the solution is λ = −(1/4)√(S/6), x = y = z = √(S/6), i.e., the box is a cube

SLIDE 35

Optimizing the Membership Degrees

  • Jf is alternately optimized, i.e.,
  • 1. optimize U for fixed cluster parameters: Uτ = jU(Cτ−1)
  • 2. optimize C for fixed membership degrees: Cτ = jC(Uτ)
  • update formulas: set the derivative of Jf w.r.t. the parameters U, C to zero
  • the resulting equations form the fuzzy c-means (FCM) algorithm
  • membership degrees are chosen according to [Bezdek, 1981]

    u_ij = 1 / Σ_{k=1}^c (d_ij² / d_kj²)^{1/(m−1)} = d_ij^{−2/(m−1)} / Σ_{k=1}^c d_kj^{−2/(m−1)}

  • independent of the chosen distance measure
SLIDE 36

First Step: Fix the cluster parameters

  • introduce Lagrange multipliers λj, 1 ≤ j ≤ n, to incorporate the constraints ∀j; 1 ≤ j ≤ n : Σ_{i=1}^c u_ij = 1
  • this yields the Lagrange function (to be minimized)

    L(X, Uf, C, Λ) = Σ_{i=1}^c Σ_{j=1}^n u_ij^m d_ij² + Σ_{j=1}^n λj (1 − Σ_{i=1}^c u_ij),

    where the first term equals J(X, Uf, C)

  • necessary condition for a minimum: the partial derivatives of the Lagrange function w.r.t. the membership degrees vanish, i.e.

    ∂L/∂u_kl = m u_kl^{m−1} d_kl² − λl = 0,

    which leads to ∀i; 1 ≤ i ≤ c, ∀j; 1 ≤ j ≤ n:

    u_ij = ( λj / (m d_ij²) )^{1/(m−1)}

SLIDE 37

Optimizing the Membership Degrees

  • summing these equations over the clusters (in order to exploit the corresponding constraints on the membership degrees), we get

    1 = Σ_{i=1}^c u_ij = Σ_{i=1}^c ( λj / (m d_ij²) )^{1/(m−1)}

    ⇒ the λj, 1 ≤ j ≤ n, are

    λj = ( Σ_{i=1}^c (m d_ij²)^{1/(1−m)} )^{1−m}

  • inserting this into the equation for the membership degrees yields, ∀i; 1 ≤ i ≤ c, ∀j; 1 ≤ j ≤ n:

    u_ij = d_ij^{2/(1−m)} / Σ_{k=1}^c d_kj^{2/(1−m)}

  • update formula results regardless of distance measure
SLIDE 38

Optimizing the Cluster Prototypes

  • the update formulas jC depend on
  • the cluster parameters (location, shape, size) and
  • the chosen distance measure
    ⇒ a general update formula cannot be given
  • for the basic fuzzy c-means model
  • cluster centers serve as prototypes
  • distance measure: the metric induced by the inner product
    ⇒ second step: the derivatives of Jf w.r.t. the centers yield [Bezdek, 1981]

    c_i = Σ_{j=1}^n u_ij^m x_j / Σ_{j=1}^n u_ij^m
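Putting the two update formulas together gives a compact FCM sketch in plain Python (our own minimal implementation with a deterministic initialization; a real run would add random restarts and a convergence test, as the discussion slides suggest):

```python
def dist2(x, y):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(x, y))

def fcm(data, c, m=2.0, iters=50):
    """Alternate the membership and center updates of fuzzy c-means."""
    centers = [tuple(p) for p in data[:c]]   # deterministic init for the sketch
    u = []
    for _ in range(iters):
        # membership update: u_ij = d_ij^(2/(1-m)) / sum_k d_kj^(2/(1-m))
        u = []
        for x in data:
            d = [max(dist2(x, ci), 1e-12) for ci in centers]  # avoid div by 0
            w = [dk ** (1.0 / (1.0 - m)) for dk in d]
            s = sum(w)
            u.append([wi / s for wi in w])
        # center update: c_i = sum_j u_ij^m x_j / sum_j u_ij^m
        for i in range(c):
            wm = [row[i] ** m for row in u]
            s = sum(wm)
            centers[i] = tuple(
                sum(wj * xj[k] for wj, xj in zip(wm, data)) / s
                for k in range(len(data[0])))
    return centers, u
```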

SLIDE 39

Discussion

  • FCM can be initialized with randomly placed cluster centers
  • updating in the AO scheme can be stopped if
  • number of iterations τ exceeds some predefined τmax, or
  • changes in prototypes are smaller than some termination accuracy
  • fuzzy c-means algorithm is stable and robust
  • compared with hard c-means
  • it is quite insensitive to initialization and
  • it is not likely to get stuck in an undesired local minimum
  • FCM converges in a saddle point or minimum (but not in a maximum) [Bezdek, 1981]

SLIDE 40

Problems with Probabilistic c-means

[Figure: two point clusters Γ1 and Γ2; x1 lies midway between them, x2 lies far away from both]

  • x1 has same distance to Γ1 and Γ2 ⇒ µΓ1(x1) = µΓ2(x1) = 0.5
  • however, same degrees of membership are assigned to x2
  • problem is due to normalization
  • better reading of the memberships: “if xj must be assigned to a cluster, then with probability uij to Γi”

SLIDE 41

Problems with Probabilistic c-means

  • normalization of memberships: a problem for noise and outliers
  • fixed data point weight ⇒ high membership of noisy data, although it lies at a large distance from the bulk of the data ⇒ bad effect on the clustering result
  • dropping the normalization constraint

    Σ_{i=1}^c u_ij = 1, ∀j ∈ {1, . . . , n}

    leads to more intuitive membership assignments

SLIDE 42

Possibilistic Cluster Partition

Definition: Let X = {x1, . . . , xn} be the set of given examples and let c be the number of clusters (1 < c < n) represented by the fuzzy sets µΓi (i = 1, . . . , c). Then we call Up = (uij) = (µΓi(xj)) a possibilistic cluster partition of X if

    Σ_{j=1}^n u_ij > 0, ∀i ∈ {1, . . . , c}

holds. The uij ∈ [0, 1] are interpreted as the degree of representativity or typicality of the datum xj to cluster Γi.

  • now, the uij for xj resemble the possibility of xj being a member of the corresponding cluster

SLIDE 43

Possibilistic Fuzzy Clustering

  • Jf would not be appropriate for possibilistic fuzzy clustering
  • dropping normalization constraint leads to minimum for all uij = 0
  • i.e., data points are not assigned to any Γi and all Γi are empty

⇒ penalty term is introduced which forces all uij away from zero

  • the objective function Jf is modified to

    J_p(X, U_p, C) = Σ_{i=1}^c Σ_{j=1}^n u_ij^m d_ij² + Σ_{i=1}^c η_i Σ_{j=1}^n (1 − u_ij)^m,   where η_i > 0 (1 ≤ i ≤ c)

  • ηi balance contrary objectives expressed in the two terms of Jp
SLIDE 44

Optimizing the Membership Degrees

  • update formula for the membership degrees:

    u_ij = 1 / ( 1 + (d_ij² / η_i)^{1/(m−1)} )

  • membership of xj to cluster i depends only on dij to this cluster
  • small distance corresponds to high degree of membership
  • larger distances result in low membership degrees

⇒ uij’s have typicality interpretation

SLIDE 45

Interpretation of ηi

  • the update equation helps to explain the parameters ηi
  • considering m = 2 and substituting d_ij² = η_i yields u_ij = 0.5
    ⇒ η_i determines the distance to Γi at which u_ij equals 0.5

  • ηi can have different geometrical interpretation
  • hyperspherical clusters (e.g., PCM) ⇒ √ηi is mean diameter
SLIDE 46

Estimating ηi

  • if such properties are known, ηi can be set a priori
  • if all clusters have same properties, same value for all clusters
  • however, information on actual shape is often unknown a priori
  • parameters must be estimated, e.g., by FCM
  • use the fuzzy intra-cluster distance, i.e., for all Γi, 1 ≤ i ≤ c:

    η_i = Σ_{j=1}^n u_ij^m d_ij² / Σ_{j=1}^n u_ij^m

SLIDE 47

Optimizing the Cluster Centers

  • the update equations jC are derived by setting the derivative of Jp w.r.t. the prototype parameters to zero (holding Up fixed)
  • the update equations for the cluster prototypes are identical to those of FCM
    ⇒ the cluster centers in the PCM algorithm are re-estimated as

    c_i = Σ_{j=1}^n u_ij^m x_j / Σ_{j=1}^n u_ij^m
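The possibilistic membership update can be sketched likewise (our own names; the η_i are fixed a priori here, although in practice they are usually estimated from an FCM run as described on the previous slides):

```python
def dist2(x, y):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(x, y))

def pcm_memberships(data, centers, eta, m=2.0):
    """PCM update: u_ij = 1 / (1 + (d_ij^2 / eta_i)^(1/(m-1))).
    Each membership depends only on the distance to its own cluster."""
    return [[1.0 / (1.0 + (dist2(x, ci) / ei) ** (1.0 / (m - 1.0)))
             for ci, ei in zip(centers, eta)]
            for x in data]

# with m = 2, a point at squared distance eta_i gets membership exactly 0.5
u = pcm_memberships([(1.0, 0.0)], [(0.0, 0.0)], [1.0])
print(u[0][0])   # 0.5
```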

SLIDE 48

Comparison of FCM and PCM

  • FCM (left) and PCM (right) of Iris dataset into 3 clusters
  • FCM divides space, PCM depends on typicality to closest clusters
  • FCM and PCM divide dataset into 3 and 2 clusters, resp.
  • behavior is specific to PCM
  • FCM drives centers apart due to normalization, PCM does not
SLIDE 49

Cluster Coincidence

characteristic | FCM | PCM
data partition | forced to be exhaustive | not forced to
membership degrees | distributed | determined by data
cluster interaction | covers whole data | none
intra-cluster distance | high | low
cluster number c | exhaustively used | upper bound

  • clusters can coincide, clusters might not even cover data
  • PCM tends to interpret such data as outliers by low memberships
  • better coverage is obtained as follows
  • 1. usually FCM is used to initialize PCM (i.e., prototypes, ηi, c)
  • 2. after the first PCM run, the ηi are re-estimated
  • 3. the improved estimates are used for a second PCM run as the final solution
SLIDE 50

Cluster Repulsion I

  • Jp is truly minimized only if all cluster centers are identical
  • other results are achieved when PCM gets stuck in local minimum
  • PCM can be improved by modifying Jp

    J_rp(X, U_p, C) = Σ_{i=1}^c Σ_{j=1}^n u_ij^m d_ij² + Σ_{i=1}^c η_i Σ_{j=1}^n (1 − u_ij)^m + Σ_{i=1}^c γ_i Σ_{k=1, k≠i}^c 1 / (η d(c_i, c_k)²)

  • γi controls strength of cluster repulsion
  • η makes repulsion independent of normalization of data attributes
SLIDE 51

Cluster Repulsion II

  • the minimization conditions lead to the update equation

    c_i = ( Σ_{j=1}^n u_ij^m x_j − γ_i Σ_{k=1, k≠i}^c c_k / d(c_i, c_k)⁴ ) / ( Σ_{j=1}^n u_ij^m − γ_i Σ_{k=1, k≠i}^c 1 / d(c_i, c_k)⁴ )

  • this equation shows effect of repulsion between clusters
  • cluster is attracted by data assigned to it
  • it is simultaneously repelled by other clusters
  • update equation of PCM for membership degrees is not modified
  • better detection of shape of very close or overlapping clusters
SLIDE 52

Recognition of Positions and Shapes

  • possibilistic models do not have only problematic properties
  • cluster prototypes are more intuitive
  • memberships depend only on distance to one cluster
  • shape & size of clusters better fits data clouds than FCM
  • less sensitive to outliers and noise

⇒ attractive tool in image processing

SLIDE 53

Outline

  • 1. Fuzzy Data Analysis
  • 2. Clustering
  • 3. Basic Clustering Algorithms
  • 4. Distance Function Variants

  Gustafson-Kessel Algorithm · Fuzzy Shell Clustering · Kernel-based Fuzzy Clustering

  • 5. Objective Function Variants
  • 6. Cluster Validity
SLIDE 54

Distance Function Variants

  • so far, only Euclidean distance leading to standard FCM and PCM
  • Euclidean distance only allows spherical clusters
  • several variants have been proposed to relax this constraint
  • fuzzy Gustafson-Kessel algorithm
  • fuzzy shell clustering algorithms
  • kernel-based variants
  • they can be applied to both FCM and PCM
SLIDE 55

Gustafson-Kessel Algorithm

  • [Gustafson and Kessel, 1979] replaced the Euclidean distance by a cluster-specific Mahalanobis distance
  • for cluster Γi, its associated Mahalanobis distance is defined as

    d²(x_j, C_i) = (x_j − c_i)ᵀ Σ_i⁻¹ (x_j − c_i),

    where Σ_i is the covariance matrix of the cluster
  • Euclidean distance leads to ∀i : Σi = I, i.e., identity matrix
  • Gustafson-Kessel (GK) algorithm leads to prototypes Ci = (ci, Σi)
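The cluster-specific distance can be illustrated for the 2-D case, where Σ_i⁻¹ has a closed form (a hand-rolled sketch; a general implementation would use a linear-algebra library):

```python
def mahalanobis2_2d(x, c, cov):
    """Squared Mahalanobis distance (x - c)^T cov^-1 (x - c) in 2-D,
    with cov inverted by the closed-form 2x2 inverse."""
    (s11, s12), (s21, s22) = cov
    det = s11 * s22 - s12 * s21
    inv = ((s22 / det, -s12 / det), (-s21 / det, s11 / det))
    v = (x[0] - c[0], x[1] - c[1])
    w0 = inv[0][0] * v[0] + inv[0][1] * v[1]
    w1 = inv[1][0] * v[0] + inv[1][1] * v[1]
    return v[0] * w0 + v[1] * w1

# with cov = I the distance reduces to the squared Euclidean one
print(mahalanobis2_2d((3.0, 4.0), (0.0, 0.0), ((1.0, 0.0), (0.0, 1.0))))  # 25.0
# an elongated cluster (large variance along x) down-weights x-offsets
print(mahalanobis2_2d((3.0, 0.0), (0.0, 0.0), ((9.0, 0.0), (0.0, 1.0))))  # ~1.0
```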
SLIDE 56

Gustafson-Kessel Algorithm

  • specific constraints can be taken into account, e.g.,
  • restricting to axis-parallel cluster shapes
  • by considering only diagonal matrices
  • usually preferred when clustering is applied for fuzzy rule generation
  • to be discussed later in the lecture
  • cluster sizes can be controlled by ϱi > 0 demanding det(Σi) = ϱi
  • usually clusters are equally sized by det(Σi) = 1
SLIDE 57

Objective Function

  • J and the update equations for ci and U are identical to those of FCM and PCM, resp.
  • the update equations for the covariance matrices are

    Σ_i = Σ_i∗ / (det Σ_i∗)^{1/p}   where   Σ_i∗ = Σ_{j=1}^n u_ij (x_j − c_i)(x_j − c_i)ᵀ / Σ_{j=1}^n u_ij

  • they are defined as covariance of data assigned to cluster i
  • Σi are modified to incorporate fuzzy assignment
SLIDE 58

Summary

  • GK extracts more information than standard FCM and PCM
  • GK is more sensitive to initialization
  • initializing GK using few runs of FCM or PCM is recommended
  • compared to FCM or PCM, due to matrix inversions GK is
  • computationally costly
  • hard to apply to huge datasets
  • restriction to axis-parallel clusters reduces computational costs
SLIDE 59

Fuzzy Shell Clustering

  • up to now we searched for convex “cloud-like” clusters
  • corresponding algorithms are called solid clustering algorithms
  • especially useful in data analysis
  • for image recognition and analysis, variants of FCM and PCM have been proposed to detect lines, circles or ellipses ⇒ shell clustering algorithms

  • replace Euclidean by other distances
SLIDE 60

Fuzzy c-varieties Algorithm

  • the fuzzy c-varieties (FCV) algorithm recognizes lines, planes, or hyperplanes
  • each cluster is an affine subspace characterized by a point and a set of orthogonal unit vectors, Ci = (ci, ei1, . . . , eiq), where q is the dimension of the affine subspace
  • distance between data point xj and cluster i:

    d²(x_j, c_i) = ‖x_j − c_i‖² − Σ_{l=1}^q ((x_j − c_i)ᵀ e_il)²

  • also used for locally linear models of data with underlying functional interrelations

SLIDE 61

Other Shell Clustering Algorithms

Name | Prototypes
adaptive fuzzy c-elliptotypes (AFCE) | line segments
fuzzy c-shells | circles
fuzzy c-ellipsoidal shells | ellipses
fuzzy c-quadric shells (FCQS) | hyperbolas, parabolas
fuzzy c-rectangular shells (FCRS) | rectangles

[Figure: example results of AFCE, FCQS, and FCRS]

SLIDE 62

Kernel-based Fuzzy Clustering

  • kernel variants modify the distance function to handle non-vectorial data, e.g., sequences, trees, graphs
  • kernel methods [Schölkopf and Smola, 2001] extend classic linear algorithms to non-linear ones without changing the algorithms themselves
  • data points may be non-vectorial, so we write xj for a data object rather than a vector
  • kernel methods are based on a mapping φ : X → H
  • input space X, feature space H (of higher or even infinite dimension)
  • H must be a Hilbert space, i.e., a space with a dot product
SLIDE 63

Principle

  • the data are not handled directly in H; they appear only inside dot products
  • kernel function k : X × X → ℝ with ∀x, x′ ∈ X : ⟨φ(x), φ(x′)⟩ = k(x, x′) ⇒ no need to know φ explicitly
  • scalar products in H only depend on k and the data ⇒ kernel trick
  • kernel methods are algorithms that use only scalar products between data points
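To illustrate the kernel trick, the Gaussian (RBF) kernel is a standard example of such a k: its feature map φ is infinite-dimensional, yet the dot product in H is cheap to evaluate. A minimal sketch (function and variable names are illustrative):

```python
import numpy as np

def rbf_kernel(x, y, gamma=1.0):
    """Gaussian kernel k(x, y) = exp(-gamma * ||x - y||^2).
    Equals <phi(x), phi(y)> in an infinite-dimensional Hilbert space,
    computed without ever constructing phi explicitly."""
    d = x - y
    return float(np.exp(-gamma * (d @ d)))

# Gram matrix of all pairwise dot products in H for a small dataset
X = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 2.0]])
K = np.array([[rbf_kernel(a, b) for b in X] for a in X])
print(K)   # symmetric, with ones on the diagonal since k(x, x) = 1
```

Any kernel-method algorithm that needs only scalar products can then run entirely on the Gram matrix K.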
SLIDE 64

Kernel Fuzzy Clustering

  • the kernel framework has been applied to fuzzy clustering
  • fuzzy shell clustering extracts prototypes, kernel methods do not
  • they compute the similarity between x, x′ ∈ X
  • cluster representatives have no explicit representation
  • the kernel variant of FCM [Wu et al., 2003] transposes Jf to H
  • the centers cφi ∈ H are linear combinations of the transformed data

cφi = ∑_{r=1}^{n} air φ(xr)

SLIDE 65

Kernel Fuzzy Clustering

  • the Euclidean distance between points and centers in H is

d²φir = ‖φ(xr) − cφi‖² = krr − 2 ∑_{s=1}^{n} ais krs + ∑_{s,t=1}^{n} ais ait kst,  where krs ≡ k(xr, xs)

  • the objective function becomes

Jφ(X, Uφ, C) = ∑_{i=1}^{c} ∑_{r=1}^{n} uir^m d²φir

  • minimization leads to the following update equations

uir = 1 / ∑_{l=1}^{c} (d²φir / d²φlr)^{1/(m−1)},  air = uir^m / ∑_{s=1}^{n} uis^m,  cφi = ∑_{r=1}^{n} uir^m φ(xr) / ∑_{s=1}^{n} uis^m
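These update equations can be sketched as an alternating-optimization loop over a precomputed Gram matrix. Everything beyond the formulas above (names, example data, iteration count) is an illustrative assumption:

```python
import numpy as np

def kernel_fcm(K, c, m=2.0, iters=100, seed=0):
    """Kernel fuzzy c-means on a precomputed Gram matrix K (n x n).
    Returns memberships U (c x n) and coefficients A (c x n),
    with implicit centers c_i^phi = sum_r a_ir phi(x_r)."""
    n = K.shape[0]
    rng = np.random.default_rng(seed)
    U = rng.random((c, n))
    U /= U.sum(axis=0)                       # memberships sum to 1 per point
    for _ in range(iters):
        Um = U ** m
        A = Um / Um.sum(axis=1, keepdims=True)     # a_ir = u_ir^m / sum_s u_is^m
        # d^2_ir = k_rr - 2 sum_s a_is k_rs + sum_{s,t} a_is a_it k_st
        D2 = (np.diag(K)[None, :] - 2.0 * A @ K
              + np.einsum('is,st,it->i', A, K, A)[:, None])
        D2 = np.maximum(D2, 1e-12)            # guard against round-off
        U = D2 ** (-1.0 / (m - 1.0))
        U /= U.sum(axis=0)                    # normalized membership update
    return U, A

# two well-separated 1-D groups; the linear kernel K = X X^T recovers plain FCM
X = np.array([[0.0], [0.1], [5.0], [5.1]])
U, A = kernel_fcm(X @ X.T, c=2)
print(np.round(U, 2))
```

With a non-linear kernel the same loop runs unchanged; only K is computed differently.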

SLIDE 66

Kernel Fuzzy Clustering

Summary

  • the update equations (and Jφ) are expressed solely in terms of k
  • for the Euclidean distance, the membership degrees are identical to FCM
  • cluster centers: weighted mean of the data (comparable to FCM)
  • disadvantages of kernel methods:
  • the choice of a proper kernel and its parameters is needed
  • this choice is similar to feature selection and data representation
  • cluster centers belong to H (no explicit representation)
  • only the weighting coefficients air are known
SLIDE 67

Outline

  • 1. Fuzzy Data Analysis
  • 2. Clustering
  • 3. Basic Clustering Algorithms
  • 4. Distance Function Variants
  • 5. Objective Function Variants

Noise Clustering
Fuzzifier Variants

  • 6. Cluster Validity
SLIDE 68

Objective Function Variants

  • so far: variants of FCM with different distance functions
  • now: other variants based on modifications of J
  • aim: improving the clustering results, e.g., for noisy data
  • there are many different variants; they fall into the following categories:
  • explicitly handling noisy data
  • modifying the fuzzifier m in the objective function
  • new terms in the objective function (e.g., to optimize the cluster number)
  • improving PCM w.r.t. the coinciding-cluster problem
SLIDE 69

Noise Clustering

  • noise clustering (NC) adds one noise cluster to the c clusters
  • it shall group the noisy data points or outliers
  • it is not explicitly associated with any prototype
  • it is directly associated with the distance between its implicit prototype and the data
  • the center of the noise cluster has a constant distance δ to all data points
  • 1. all points have the same “probability” of belonging to the noise cluster
  • 2. during optimization, this “probability” is adapted
SLIDE 70

Noise Clustering

  • the noise cluster is added to the objective function like any other cluster

Jnc(X, U, C) = ∑_{i=1}^{c} ∑_{j=1}^{n} uij^m dij² + ∑_{k=1}^{n} δ² (1 − ∑_{i=1}^{c} uik)^m

  • the added term is similar to the terms in the first sum
  • the distance to the cluster prototype is replaced by δ
  • outliers can thus have low membership degrees to all standard clusters
  • Jnc requires setting the parameter δ, e.g.

δ² = λ · (1/(c · n)) ∑_{i=1}^{c} ∑_{j=1}^{n} dij²

  • λ is a user-defined parameter: a low λ leads to a high number of outliers
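As a sketch, the noise cluster can be treated as a (c+1)-th cluster at constant squared distance δ² in the FCM-style membership update; the function and the example data below are illustrative assumptions, not from the slides:

```python
import numpy as np

def nc_memberships(D2, delta2, m=2.0):
    """Noise-clustering membership update: the noise cluster acts as an
    extra cluster at constant squared distance delta2 from every point,
    u_ij = (d2_ij)^(-1/(m-1)) / (sum_l (d2_lj)^(-1/(m-1)) + delta2^(-1/(m-1)))."""
    e = 1.0 / (m - 1.0)
    inv = D2 ** -e                          # (c x n) inverse-distance weights
    denom = inv.sum(axis=0) + delta2 ** -e  # noise cluster joins the sum
    return inv / denom                      # memberships of the c real clusters

# squared distances of 3 points to c = 2 clusters; point 2 is far from both
D2 = np.array([[1.0,  9.0, 100.0],
               [9.0,  1.0, 121.0]])
lam = 0.5
delta2 = lam * D2.mean()        # delta^2 = lambda * (1/(c n)) * sum d2_ij
U = nc_memberships(D2, delta2)
print(np.round(U, 2))
print(np.round(1 - U.sum(axis=0), 2))   # membership in the noise cluster
```

The outlier (third point) keeps most of its membership in the noise cluster, so it barely distorts the real prototypes.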
SLIDE 71

Fuzzifier Variants

  • the fuzzifier m introduces a problem:

uij ∈ {0, 1} if m = 1,  uij ∈ ]0, 1[ if m > 1

  • this is a disadvantage for noisy datasets (to be discussed in the exercise)
  • possible solution: a convex combination of hard and fuzzy c-means

Jhf(X, U, C) = ∑_{i=1}^{c} ∑_{j=1}^{n} (α uij + (1 − α) uij²) dij²

where α ∈ [0, 1] is a user-defined threshold
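A quick numeric sketch of Jhf (the membership and distance values are illustrative): for α = 1 only the linear (hard c-means style) term remains, for α = 0 only the quadratic FCM term with m = 2.

```python
import numpy as np

def j_hf(U, D2, alpha):
    """Objective mixing hard (linear) and fuzzy (quadratic) weighting:
    J_hf = sum_ij (alpha * u_ij + (1 - alpha) * u_ij^2) * d2_ij."""
    return float(((alpha * U + (1.0 - alpha) * U ** 2) * D2).sum())

U  = np.array([[0.8, 0.3], [0.2, 0.7]])   # memberships, columns sum to 1
D2 = np.array([[1.0, 4.0], [9.0, 1.0]])   # squared distances d2_ij
print(j_hf(U, D2, alpha=0.0))   # 1.85 — pure quadratic (FCM, m = 2) term
print(j_hf(U, D2, alpha=1.0))   # 4.5  — pure linear (hard c-means) term
```

Intermediate α trades off between crisp and fuzzy behavior of the resulting partition.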

SLIDE 72

Outline

  • 1. Fuzzy Data Analysis
  • 2. Clustering
  • 3. Basic Clustering Algorithms
  • 4. Distance Function Variants
  • 5. Objective Function Variants
  • 6. Cluster Validity
SLIDE 73

Cluster Validity

Judgment of Classification by Validity Measures

  • validity measures can be based on several criteria, e.g.
  • membership degrees should be close to 0 or 1, e.g., the partition coefficient

PC = (1/n) ∑_{i=1}^{c} ∑_{j=1}^{n} uij²

  • compactness of the clusters, e.g., the average partition density

APD = (1/c) ∑_{i=1}^{c} (∑_{j∈Yi} uij) / √|Σi|,  where Yi = {j ∈ ℕ, j ≤ n | (xj − µi)⊤ Σi⁻¹ (xj − µi) < 1}

  • especially for FCM: the partition entropy

PE = − ∑_{i=1}^{c} ∑_{j=1}^{n} uij log uij
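PC and PE can be computed directly from the membership matrix U; the helper names and the toy matrices below are illustrative (APD additionally needs the cluster covariances Σi and is omitted here):

```python
import numpy as np

def partition_coefficient(U):
    """PC = (1/n) sum_ij u_ij^2; equals 1 for a crisp partition
    and 1/c for a maximally fuzzy (uniform) one."""
    return float((U ** 2).sum() / U.shape[1])

def partition_entropy(U):
    """PE = -sum_ij u_ij log u_ij; 0 for a crisp partition
    (using the convention 0 * log 0 = 0)."""
    V = U[U > 0]
    return float(-(V * np.log(V)).sum())

crisp = np.array([[1.0, 1.0, 0.0], [0.0, 0.0, 1.0]])   # hard assignment
fuzzy = np.full((2, 3), 0.5)                            # maximally fuzzy
print(partition_coefficient(crisp), partition_entropy(crisp))  # 1.0 0.0
print(partition_coefficient(fuzzy), partition_entropy(fuzzy))  # 0.5 ~2.08
```

High PC and low PE indicate a crisp, well-separated partition; both are often plotted over the cluster number c to pick a good value.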

SLIDE 74

Software and Literature

“Information Miner 2” and “Fuzzy Cluster Analysis”

SLIDE 75

References I

Bezdek, J. (1973). Fuzzy Mathematics in Pattern Classification. PhD thesis, Applied Math. Center, Ithaca, USA.

Bezdek, J. (1981). Pattern Recognition With Fuzzy Objective Function Algorithms. Plenum Press, New York, NY, USA.

Duda, R. and Hart, P. (1973). Pattern Classification and Scene Analysis. John Wiley & Sons, Inc., New York, NY, USA.

Gustafson, E. E. and Kessel, W. C. (1979). Fuzzy clustering with a fuzzy covariance matrix. In Proceedings of the IEEE Conference on Decision and Control, pages 761–766, Piscataway, NJ, USA. IEEE Press.

Höppner, F., Klawonn, F., Kruse, R., and Runkler, T. (1999). Fuzzy Cluster Analysis: Methods for Classification, Data Analysis and Image Recognition. John Wiley & Sons Ltd, New York, NY, USA.

SLIDE 76

References II

McKay, M. D., Beckman, R. J., and Conover, W. J. (1979). A comparison of three methods for selecting values of input variables in the analysis of output from a computer code. Technometrics, 21(2):239–245.

Schölkopf, B. and Smola, A. J. (2001). Learning With Kernels: Support Vector Machines, Regularization, Optimization and Beyond. MIT Press, Cambridge, MA, USA.

Wu, Z., Xie, W., and Yu, J. (2003). Fuzzy c-means clustering algorithm based on kernel method. In Proceedings of the Fifth International Conference on Computational Intelligence and Multimedia Applications (ICCIMA), pages 1–6.
