Clustering
DWML, 2007 1/27
Density Based Clustering
DBSCAN
Idea: identify contiguous regions of high density.
Step 1: classification of points
1.: Choose parameters ε, k
Step 2: Define Connectivity
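A minimal Python sketch of the two steps, assuming the usual DBSCAN definitions, which the slide text does not spell out: a point is a core point if at least k points (itself included) lie within distance ε of it, points within ε of a core point join that core point's cluster, and all remaining points are noise. The function name `dbscan`, the label convention (-1 for noise) and the sample points are illustrative choices.

```python
from math import dist

def dbscan(points, eps, k):
    """Label each point with a cluster id (0, 1, ...) or -1 for noise."""
    n = len(points)
    # Neighborhoods: indices of all points within distance eps (the point itself included).
    neighbors = [[j for j in range(n) if dist(points[i], points[j]) <= eps]
                 for i in range(n)]
    # Step 1: a point is a core point if its eps-neighborhood holds at least k points.
    is_core = [len(nb) >= k for nb in neighbors]

    labels = [-1] * n
    cluster = 0
    for i in range(n):
        if not is_core[i] or labels[i] != -1:
            continue
        # Step 2: grow a cluster from core point i along chains of core points.
        labels[i] = cluster
        stack = [i]
        while stack:
            p = stack.pop()  # only core points are ever pushed
            for q in neighbors[p]:
                if labels[q] == -1:
                    labels[q] = cluster
                    if is_core[q]:
                        stack.append(q)
        cluster += 1
    return labels

# Hypothetical data: two dense groups and one isolated point.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0),
          (10.0, 10.0), (10.0, 11.0), (11.0, 10.0),
          (5.0, 50.0)]
labels = dbscan(points, eps=1.5, k=3)
print(labels)  # [0, 0, 0, 0, 1, 1, 1, -1]
```

The quadratic neighborhood computation keeps the sketch short; a spatial index would be the usual choice for larger data sets.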
Setting k and ε
Pros and Cons
+ Can detect clusters of highly irregular shape
+ Robust with respect to outliers
Probabilistic Model for Clustering
$\sum_{i=1}^{k} \ldots \qquad \sum_{i=1}^{k} \ldots$
Clustering principle
$\prod_{j=1}^{N} P(a_j)$, where $P(a) = \sum_{i=1}^{k} P_i(a)\lambda_i$.
Mixture of Gaussians
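As an illustration, a mixture with one-dimensional Gaussian components can be evaluated as below. This is a sketch under assumptions: the mixture has the form P(a) = Σ P_i(a)λ_i, the components are univariate, and the function names and parameter values are made up for the example.

```python
import math

def gaussian_pdf(x, mean, std):
    """Density of a one-dimensional Gaussian N(mean, std^2) at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))

def mixture_pdf(x, weights, means, stds):
    """P(x) = sum_i lambda_i * P_i(x) with Gaussian components P_i."""
    return sum(w * gaussian_pdf(x, m, s) for w, m, s in zip(weights, means, stds))

def log_likelihood(data, weights, means, stds):
    """Log of the product of the mixture density over all data points."""
    return sum(math.log(mixture_pdf(x, weights, means, stds)) for x in data)

# Hypothetical parameters and data.
weights, means, stds = [0.5, 0.5], [0.0, 5.0], [1.0, 1.0]
data = [-0.2, 0.1, 4.8, 5.3]
near = log_likelihood(data, weights, means, stds)
far = log_likelihood(data, weights, [20.0, 30.0], stds)
print(near > far)  # True: components placed on the data explain it better
```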
Mixture Model → Data
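Generating data from a mixture model can be sketched in two stages: pick a component with probability equal to its mixture weight, then draw the data point from that component. The sketch assumes one-dimensional Gaussian components; the function name `sample_mixture`, the fixed seed and all parameter values are illustrative.

```python
import random

def sample_mixture(n, weights, means, stds, seed=0):
    """Draw n (value, component index) pairs from a Gaussian mixture."""
    rng = random.Random(seed)
    samples = []
    for _ in range(n):
        # First pick a component i with probability lambda_i ...
        i = rng.choices(range(len(weights)), weights=weights)[0]
        # ... then draw the data point from that component's Gaussian.
        samples.append((rng.gauss(means[i], stds[i]), i))
    return samples

# Hypothetical mixture: 30% of the mass around 0, 70% around 10.
samples = sample_mixture(1000, [0.3, 0.7], [0.0, 10.0], [1.0, 1.0])
share_second = sum(1 for _, i in samples if i == 1) / len(samples)
print(round(share_second, 2))
```

In a clustering setting only the values would be observed; the component index is the hidden cluster label.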
Data → Clustering
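One way to turn a mixture model into a clustering is to assign every data point to the component with the highest posterior probability. The sketch below assumes the mixture parameters are already known (fitting them is not shown) and uses one-dimensional Gaussian components; all names and values are illustrative.

```python
import math

def gaussian_pdf(x, mean, std):
    """Density of a one-dimensional Gaussian N(mean, std^2) at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))

def posteriors(x, weights, means, stds):
    """P(component i | x), proportional to lambda_i * P_i(x)."""
    joint = [w * gaussian_pdf(x, m, s) for w, m, s in zip(weights, means, stds)]
    total = sum(joint)
    return [j / total for j in joint]

def assign(data, weights, means, stds):
    """Hard clustering: index of the most probable component for each point."""
    result = []
    for x in data:
        post = posteriors(x, weights, means, stds)
        result.append(max(range(len(post)), key=post.__getitem__))
    return result

# Hypothetical parameters and data.
clusters = assign([-0.5, 0.3, 4.6, 5.2], [0.5, 0.5], [0.0, 5.0], [1.0, 1.0])
print(clusters)  # [0, 0, 1, 1]
```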
Gaussian Mixture Models
Naive Bayes Mixture
Clustering as fitting Incomplete Data
Scoring a Clustering
$\sum_{i=1}^{k} \sum_{s,s' \in S_i} \ldots$
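A score of this shape, a sum over clusters S_i of a term over all pairs s, s′ in S_i, can be sketched as follows. The summand is not preserved in the text, so the Euclidean distance used here is an assumption, as is counting each unordered pair once; lower scores then mean tighter clusters.

```python
from itertools import combinations
from math import dist

def within_cluster_score(clusters):
    """Sum over clusters S_i of the distances between all pairs s, s' in S_i."""
    return sum(dist(s, t) for S in clusters for s, t in combinations(S, 2))

# Hypothetical data: the same four points grouped in two different ways.
tight = [[(0, 0), (0, 1)], [(5, 5), (5, 6)]]
loose = [[(0, 0), (5, 5)], [(0, 1), (5, 6)]]
print(within_cluster_score(tight), within_cluster_score(loose))
```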
Axioms for Clustering
Market basket data
Rule structure
Antecedent
Consequent
Issues to consider when learning association rules
Interesting association rules
Support, Confidence
Support, Accuracy Example: The rule
Example: In fraud detection we would be interested in rules with high confidence although
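Support and confidence can be computed directly from a list of transactions. The sketch assumes the standard definitions: the support of an itemset is the fraction of transactions containing it, and the confidence of a rule is the support of antecedent and consequent together divided by the support of the antecedent. The baskets below are hypothetical.

```python
def support(itemset, transactions):
    """Fraction of transactions that contain every item of itemset."""
    itemset = set(itemset)
    return sum(1 for t in transactions if itemset <= t) / len(transactions)

def confidence(antecedent, consequent, transactions):
    """Support of antecedent and consequent together, relative to the antecedent alone."""
    return (support(set(antecedent) | set(consequent), transactions)
            / support(antecedent, transactions))

# Hypothetical market basket data, not the table from the slides.
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]
rule_support = support({"bread", "milk"}, transactions)
rule_confidence = confidence({"bread"}, {"milk"}, transactions)
print(rule_support, rule_confidence)
```

Here the rule bread → milk holds in 2 of 4 transactions (support 0.5) and in 2 of the 3 transactions that contain bread (confidence 2/3).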
Association rule mining
Frequent sets
The APriori Algorithm
ID Asparagus Beans Broccoli Corn Pepper Squash Tomatoes 1 1 1 1 2 1 1 1 3 1 1 1 1 4 1 1 1 1 5 1 1 1 6 1 1 1 1 7 1 1
Frequent 1-item sets (F1):
freq(Asparagus) = 4
freq(Beans) = 3
freq(Broccoli) = 2
freq(Corn) = 5
freq(Pepper) = 2
freq(Squash) = 3
freq(Tomatoes) = 4
Candidate 2-item sets (C2): All possible combinations.
Frequent 2-item sets (F2):
A Be Br C P S T
A 2 1 2 1 2 2
Be 1 1 2 2
Br 1 1
C 2 2 3
P 1
S 2
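The level-wise search for frequent sets can be sketched in Python as follows. This is a minimal version under assumptions: itemsets count as frequent when they occur in at least `min_count` transactions (the threshold used on the slide is not preserved), and the baskets are hypothetical rather than the table above.

```python
from itertools import combinations

def apriori(transactions, min_count):
    """Return {frozenset: count} for all itemsets found in at least min_count transactions."""
    def count(candidates):
        return {c: sum(1 for t in transactions if c <= t) for c in candidates}

    items = {item for t in transactions for item in t}
    # F1: frequent 1-item sets.
    frequent = {c: n for c, n in count({frozenset([i]) for i in items}).items()
                if n >= min_count}
    result = dict(frequent)
    size = 2
    while frequent:
        # Candidates: unions of two frequent (size-1)-sets that have exactly `size` items ...
        candidates = {a | b for a in frequent for b in frequent if len(a | b) == size}
        # ... pruned so that every (size-1)-subset is itself frequent.
        candidates = {c for c in candidates
                      if all(frozenset(s) in frequent for s in combinations(c, size - 1))}
        frequent = {c: n for c, n in count(candidates).items() if n >= min_count}
        result.update(frequent)
        size += 1
    return result

# Hypothetical baskets, not the table from the slides.
transactions = [
    {"corn", "tomato", "beans"},
    {"corn", "tomato"},
    {"corn", "beans"},
    {"tomato", "squash"},
]
frequent_sets = apriori(transactions, min_count=2)
for itemset, n in sorted(frequent_sets.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), n)
```

The pruning step is what keeps the candidate sets small: {beans, corn, tomato} is never counted because its subset {beans, tomato} is not frequent.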
Frequent sets → Association rules
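Once the frequent sets and their counts are known, candidate rules follow by splitting each frequent set into antecedent and consequent and keeping the splits with high enough confidence. The sketch assumes the standard confidence definition and a hypothetical threshold `min_confidence`; the counts are made up for the example.

```python
from itertools import combinations

def rules_from_frequent_sets(counts, min_confidence):
    """Turn frequent itemsets with counts into rules (antecedent, consequent, confidence)."""
    rules = []
    for itemset, n in counts.items():
        for r in range(1, len(itemset)):
            for antecedent in map(frozenset, combinations(itemset, r)):
                # Every subset of a frequent set is frequent, so its count is available.
                conf = n / counts[antecedent]
                if conf >= min_confidence:
                    rules.append((antecedent, itemset - antecedent, conf))
    return rules

# Hypothetical counts of frequent itemsets, not taken from the slides.
counts = {
    frozenset({"corn"}): 3,
    frozenset({"tomato"}): 3,
    frozenset({"beans"}): 2,
    frozenset({"corn", "tomato"}): 2,
    frozenset({"corn", "beans"}): 2,
}
rules = rules_from_frequent_sets(counts, min_confidence=0.9)
for antecedent, consequent, conf in rules:
    print(sorted(antecedent), "->", sorted(consequent), round(conf, 2))
```

With these counts only beans → corn passes the threshold (confidence 2/2); the other three splits each reach 2/3.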
Not all rules are interesting
ID Asparagus Beans Broccoli Corn Pepper Squash Tomatoes 1 1 1 1 2 1 1 1 3 1 1 1 1 4 1 1 1 1 5 1 1 1 6 1 1 1 1 7 1 1