1 of 22
Efficiently Mining Long Patterns from Databases
Roberto Bayardo IBM Almaden Research Center
The Problem
Current flock of algorithms for mining frequent itemsets in databases:
Use (almost exclusively) subset-infrequency pruning -
2 of 22
eggs&bread, eggs&butter, and bread&butter are known to be frequent
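The fragment above appears to illustrate subset-infrequency pruning: a larger itemset is only worth counting once all of its immediate subsets are known to be frequent. The following is a minimal sketch of that check, not the presenter's code. The function name `is_candidate` and the use of `frozenset` are illustrative assumptions.

```python
from itertools import combinations


def is_candidate(itemset, frequent):
    """Return True if every (k-1)-item subset of `itemset` is in `frequent`.

    `is_candidate` is a hypothetical helper name; `frequent` is assumed to be
    a set of frozensets already known to be frequent.
    """
    k = len(itemset)
    return all(frozenset(sub) in frequent for sub in combinations(itemset, k - 1))


# The three pairs named on the slide, assumed known to be frequent.
frequent_pairs = {
    frozenset({"eggs", "bread"}),
    frozenset({"eggs", "butter"}),
    frozenset({"bread", "butter"}),
}

triple = frozenset({"eggs", "bread", "butter"})

# All three 2-item subsets are frequent, so the triple survives pruning.
print(is_candidate(triple, frequent_pairs))

# Drop one pair: the triple is pruned without ever being counted.
without_one = frequent_pairs - {frozenset({"bread", "butter"})}
print(is_candidate(triple, without_one))
```

Note that passing the check only makes the triple a candidate. Its support still has to be counted against the database.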
7 of 22
[Figure: all 16 itemsets over the items 1, 2, 3, 4: {}; 1; 2; 3; 4; 1,2; 1,3; 1,4; 2,3; 2,4; 3,4; 1,2,3; 1,2,4; 1,3,4; 2,3,4; 1,2,3,4]
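The itemsets listed above are exactly the 16 subsets of four items. One plausible reading, which the surviving slide text does not confirm, is that the figure arranges them as a set-enumeration tree in which each node extends its parent only with items that come later in a fixed order. The sketch below generates the nodes under that assumption. The function name `set_enumeration` is hypothetical.

```python
def set_enumeration(items):
    """Yield every itemset over `items` in depth-first tree order.

    Assumption: each node is extended only with items that appear later in
    `items`, so every subset is generated exactly once.
    """
    def expand(prefix, start):
        yield prefix
        for i in range(start, len(items)):
            # Child node: parent itemset plus one later item.
            yield from expand(prefix + (items[i],), i + 1)

    yield from expand((), 0)


nodes = list(set_enumeration([1, 2, 3, 4]))
for node in nodes:
    print(node)
```

Because children only add later items, no itemset is reached along two different paths, which is what makes such a tree convenient for organizing a search over itemsets.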
18 of 22
[Figure: CPU Time (sec) versus Support (%) for Max-Miner, Apriori-LB, and Apriori]
19 of 22
(external slide)
20 of 22
[Figure: DB passes versus Length of longest pattern; series: census*, chess, connect-4, splice, mushroom, retail]