Data Mining and Machine Learning: Fundamental Concepts and - PowerPoint PPT Presentation

Data Mining and Machine Learning: Fundamental Concepts and Algorithms dataminingbook.info Mohammed J. Zaki 1 Wagner Meira Jr. 2 1 Department of Computer Science Rensselaer Polytechnic Institute, Troy, NY, USA 2 Department of Computer Science Universidade Federal de Minas Gerais, Belo Horizonte, Brazil Chap. 12: Pattern and Rule Assessment Zaki & Meira Jr. (RPI and UFMG) Data Mining and Machine Learning Chap. 12: Pattern and Rule Assessment

Rule Assessment Measures: Support and Confidence Support: The support of the rule is defined as the number of transactions that contain both X and Y , that is, sup ( X − → Y ) = sup ( XY ) = | t ( XY ) | The relative support is the fraction of transactions that contain both X and Y , that is, the empirical joint probability of the items comprising the rule → Y ) = P ( XY ) = rsup ( XY ) = sup ( XY ) rsup ( X − | D | Confidence: The conf idence of a rule is the conditional probability that a transaction contains the consequent Y given that it contains the antecedent X : → Y ) = P ( Y | X ) = P ( XY ) P ( X ) = rsup ( XY ) rsup ( X ) = sup ( XY ) conf ( X − sup ( X ) Zaki & Meira Jr. (RPI and UFMG) Data Mining and Machine Learning Chap. 12: Pattern and Rule Assessment

Example Dataset: Support and Confidence Tid Items 1 ABDE 2 BCE 3 ABDE 4 ABCE 5 ABCDE 6 BCD Rule confidence Rule conf Frequent itemsets: minsup = 3 A − → E 1.00 sup rsup Itemsets E − → A 0.80 3 0.5 ABD , ABDE , AD , ADE BCE , BDE , CE , DE B − → E 0.83 4 0 . 67 A , C , D , AB , ABE , AE , BC , BD E − → B 1.00 5 0 . 83 E , BE E − → BC 0.60 6 1 . 0 B BC − → E 0.75 Zaki & Meira Jr. (RPI and UFMG) Data Mining and Machine Learning Chap. 12: Pattern and Rule Assessment

Rule Assessment Measures: Lift, Leverage and Jaccard Lift: Lift is defined as the ratio of the observed joint probability of X and Y to the expected joint probability if they were statistically independent, that is, P ( XY ) rsup ( X ) · rsup ( Y ) = conf ( X − rsup ( XY ) → Y ) lift ( X − → Y ) = P ( X ) · P ( Y ) = rsup ( Y ) Leverage: Leverage measures the difference between the observed and expected joint probability of XY assuming that X and Y are independent leverage ( X − → Y ) = P ( XY ) − P ( X ) · P ( Y ) = rsup ( XY ) − rsup ( X ) · rsup ( Y ) Jaccard: The Jaccard coefficient measures the similarity between two sets. When applied as a rule assessment measure it computes the similarity between the tidsets of X and Y : → Y ) = | t ( X ) ∩ t ( Y ) | jaccard ( X − | t ( X ) ∪ t ( Y ) | P ( XY ) = P ( X ) + P ( Y ) − P ( XY ) Zaki & Meira Jr. (RPI and UFMG) Data Mining and Machine Learning Chap. 12: Pattern and Rule Assessment

Lift, Leverage, Jaccard, Support and Confidence Rule lift AE − → BC 0.75 Rule rsup lift leverage ACD − → E 0.17 1.20 0.03 CE − → AB 1.00 AC − → E 0.33 1.20 0.06 BE − → AC 1.20 AB − → D 0.50 1.12 0.06 A − → E 0.67 1.20 0.11 Rule rsup conf lift E − → AC 0.33 0.40 1.20 Rule rsup lift jaccard − → 0.67 0.80 1.20 − → 0.33 0.75 0.33 E AB A C B − → E 0.83 0.83 1.00 Zaki & Meira Jr. (RPI and UFMG) Data Mining and Machine Learning Chap. 12: Pattern and Rule Assessment

Contingency Table for X and Y Y ¬ Y sup ( XY ) sup ( X ¬ Y ) sup ( X ) X ¬ X sup ( ¬ XY ) sup ( ¬ X ¬ Y ) sup ( ¬ X ) sup ( Y ) sup ( ¬ Y ) | D | Zaki & Meira Jr. (RPI and UFMG) Data Mining and Machine Learning Chap. 12: Pattern and Rule Assessment

Rule Assessment Measures: Conviction Define ¬ X to be the event that X is not contained in a transaction, that is, X �⊆ t ∈ T , and likewise for ¬ Y . There are, in general, four possible events depending on the occurrence or non-occurrence of the itemsets X and Y as depicted in the contingency table. Conviction measures the expected error of the rule, that is, how often X occurs in a transaction where Y does not. It is thus a measure of the strength of a rule with respect to the complement of the consequent, defined as → Y ) = P ( X ) · P ( ¬ Y ) 1 conv ( X − = P ( X ¬ Y ) lift ( X − → ¬ Y ) If the joint probability of X ¬ Y is less than that expected under independence of X and ¬ Y , then conviction is high, and vice versa. Zaki & Meira Jr. (RPI and UFMG) Data Mining and Machine Learning Chap. 12: Pattern and Rule Assessment

Rule Conviction Rule rsup conf lift conv A − → DE 0.50 0.75 1.50 2.00 − → 0.50 1.00 1.50 ∞ DE A E − → C 0.50 0.60 0.90 0.83 C − → E 0.50 0.75 0.90 0.68 Zaki & Meira Jr. (RPI and UFMG) Data Mining and Machine Learning Chap. 12: Pattern and Rule Assessment

Rule Assessment Measures: Odds Ratio The odds ratio utilizes all four entries from the contingency table. Let us divide the dataset into two groups of transactions – those that contain X and those that do not contain X . Define the odds of Y in these two groups as follows: odds ( Y | X ) = P ( XY ) / P ( X ) P ( X ¬ Y ) / P ( X ) = P ( XY ) P ( X ¬ Y ) odds ( Y |¬ X ) = P ( ¬ XY ) / P ( ¬ X ) P ( ¬ X ¬ Y ) / P ( ¬ X ) = P ( ¬ XY ) P ( ¬ X ¬ Y ) The odds ratio is then defined as the ratio of these two odds: → Y ) = odds ( Y | X ) odds ( Y |¬ X ) = P ( XY ) · P ( ¬ X ¬ Y ) oddsratio ( X − P ( X ¬ Y ) · P ( ¬ XY ) = sup ( XY ) · sup ( ¬ X ¬ Y ) sup ( X ¬ Y ) · sup ( ¬ XY ) If X and Y are independent, then odds ratio has value 1. Zaki & Meira Jr. (RPI and UFMG) Data Mining and Machine Learning Chap. 12: Pattern and Rule Assessment

Odds Ratio Let us compare the odds ratio for two rules, C − → A and D − → A . The contingency tables for A and C , and for A and D , are given below: C ¬ C D ¬ D 2 2 3 1 A A ¬ A 2 0 ¬ A 1 1 The odds ratio values for the two rules are given as → A ) = sup ( AC ) · sup ( ¬ A ¬ C ) sup ( A ¬ C ) · sup ( ¬ AC ) = 2 × 0 oddsratio ( C − 2 × 2 = 0 → A ) = sup ( AD ) · sup ( ¬ A ¬ D ) sup ( A ¬ D ) · sup ( ¬ AD ) = 3 × 1 oddsratio ( D − 1 × 1 = 3 Zaki & Meira Jr. (RPI and UFMG) Data Mining and Machine Learning Chap. 12: Pattern and Rule Assessment

Iris Data: Discretization Attribute Range or value Label 4.30–5.55 sl 1 Sepal length 5.55–6.15 sl 2 6.15–7.90 sl 3 2.00–2.95 sw 1 Sepal width 2.95–3.35 sw 2 3.35–4.40 sw 3 1.00–2.45 pl 1 Petal length 2.45–4.75 pl 2 4.75–6.90 pl 3 0.10–0.80 pw 1 0.80–1.75 pw 2 Petal width 1.75–2.50 pw 3 Iris-setosa c 1 Class Iris-versicolor c 2 Iris-virginica c 3 Zaki & Meira Jr. (RPI and UFMG) Data Mining and Machine Learning Chap. 12: Pattern and Rule Assessment

uT bC rS rS rSrS rSrS rSrSrSrS rSrS rSrS bC bC bC bCbCbCbCbCbC rSrSrS bCbCbCbC bCbCbC bCbCbC bCbCbC bC rS uT uT uT uT uT rSrSrS rS uT uTuT uT uT uT uT uT uTuT uTuT uT uT uTuT uTuT rS uTuT uTuT uTuT rS rS rS rS rS rS rS rSrS uT bC uT rSrS rS rS uT rS rS rSrS rS rS rSrSrS rSrSrS rS rS rSrS rS rSrSrSrS rSrS rSrS bC bC bC bC bCbCbCbCbCbC bCbCbCbC bCbCbC bCbCbC bCbCbC uT rS rS uTuT uTuT uT uT uT uT uT uTuT uTuT uTuT uTuT Iris: Support vs. Confidence, and Conviction vs. Lift conf conv bC Iris-setosa ( c 1 ) 1 . 00 30 . 0 rS uT rS Iris-versicolor ( c 2 ) uT Iris-virginica ( c 3 ) 25 . 0 0 . 75 20 . 0 0 . 50 15 . 0 10 . 0 bC Iris-setosa ( c 1 ) 0 . 25 rS Iris-versicolor ( c 2 ) 5 . 0 uT Iris-virginica ( c 3 ) 0 0 0 0 . 1 0 . 2 0 . 3 0 . 4 0 0 . 5 1 . 0 1 . 5 2 . 0 2 . 5 3 . 0 rsup lift (a) Support vs. confidence (b) Lift vs. conviction Zaki & Meira Jr. (RPI and UFMG) Data Mining and Machine Learning Chap. 12: Pattern and Rule Assessment

Iris Data: Best Class-specific Rules Best Rules by Support and Confidence Rule rsup conf lift conv { pl 1 , pw 1 } − → c 1 0.333 1.00 3.00 ∞ pw 2 − → c 2 0.327 0.91 2.72 6.00 pl 3 − → c 3 0.327 0.89 2.67 5.24 Best Rules by Lift and Conviction Rule rsup conf lift conv { pl 1 , pw 1 } − → c 1 0.33 1.00 3.00 ∞ { pl 2 , pw 2 } − → c 2 0.29 0.98 2.93 15.00 { sl 3 , pl 3 , pw 3 } − → c 3 0.25 1.00 3.00 ∞ Zaki & Meira Jr. (RPI and UFMG) Data Mining and Machine Learning Chap. 12: Pattern and Rule Assessment

Pattern Assessment Measures: Support and Lift Support: The most basic measures are support and relative support, giving the number and fraction of transactions in D that contain the itemset X : rsup ( X ) = sup ( X ) sup ( X ) = | t ( X ) | | D | Lift: The lift of a k -itemset X = { x 1 , x 2 ,..., x k } is defined as P ( X ) rsup ( X ) lift ( X , D ) = = � k � k i = 1 P ( x i ) i = 1 rsup ( x i ) Generalized Lift: Assume that { X 1 , X 2 ,..., X q } is a q -partition of X , i.e., a partitioning of X into q nonempty and disjoint itemsets X i . Define the generalized lift of X over partitions of size q as follows: � � P ( X ) lift q ( X ) = min � q i = 1 P ( X i ) X 1 ,..., X q This is, the least value of lift over all q -partitions X . Zaki & Meira Jr. (RPI and UFMG) Data Mining and Machine Learning Chap. 12: Pattern and Rule Assessment

Data Mining and Machine Learning: Fundamental Concepts and - PowerPoint PPT Presentation

Data Mining and Machine Learning: Fundamental Concepts and Algorithms dataminingbook.info Mohammed J. Zaki 1 Wagner Meira Jr. 2 1 Department of Computer Science Rensselaer Polytechnic Institute, Troy, NY, USA 2 Department of Computer Science

Web Mining Web Mining Web Mining Web Mining Web mining is the use of data mining techniques

Introduction What is data mining? to Data Mining: On what kind of data? Data Mining

Data mining Machine Intelligence Thomas D. Nielsen September 2008 Data mining September 2008

Web Mining Web Mining Web mining is the use of data mining techniques to automatically

Data Mining: Concepts and Techniques Chapter 1 Introduction 1 August 19, 2013

Data Mining Practical Machine Learning Tools and Techniques Slides for Chapter 1 of Data Mining by

Introduction What is data mining? to Data mining functionalities Data Mining Major

DATA MINING LECTURE 2 What is data? The data mining pipeline What is Data Mining? Data

Introduction to Machine Learning Introduction to Machine Learning Introduction to Machine

MACHINE LEARNING kernels 1 MACHINE LEARNING 2012 MACHINE LEARNING Kernels: Intuition How

Data Mining: Concepts and Techniques Web Mining Li Xiong Slides credits: Jiawei Han and

Data Mining 2020 Frequent Pattern Mining (2) Ad Feelders Universiteit Utrecht October 2, 2020

Data Mining and Machine Learning: Fundamental Concepts and Algorithms dataminingbook.info

Data Mining and Machine Learning: Fundamental Concepts and Algorithms dataminingbook.info

Data Mining and Machine Learning: Fundamental Concepts and Algorithms dataminingbook.info

Data Mining and Machine Learning: Fundamental Concepts and Algorithms dataminingbook.info

Sta$s$cal Significance Tes$ng In Theory and In Prac$ce Ben

Users Really Do Plug in USB Drives They Find Matthew Tischer, Zakir Durumeric, Sam Foster, Sunny

When its better to ask forgiveness than get permission Chris Thompson, Maritza Johnson, Serge

The intersection axiom of conditional independence : some new results Richard D. Gill

Hypothesis Testing Part I James J. Heckman University of Chicago Econ 312, Spring 2019 Heckman

Interesting Patterns Jilles Vreeken 15 May 2015 Questions of the Day What is interestingness?

Normalization and differential expression II Katharina H oel Statistical Analysis of RNA-Seq

Table of contents 1. Introduction: You are already an experimentalist 2. Conditions 3. Items

Sambuz

Useful Links

Newsletter

Mail Us