Parameter-free Mining of Non-redundant Discriminative Itemsets



SLIDE 1

An Exhaustive Covering Approach to Parameter-free Mining of Non-redundant Discriminative Itemsets

Yoshitaka Kameya, Meijo University

DaWaK-16

SLIDE 2

Outline

  • Background
  • Our proposal
  • Experiments

SLIDE 4

Background: Discriminative Patterns (1)

  • Discriminative patterns:

– Show differences between two groups (classes)
– Used for:

  • Characterizing the positive class
  • Building more precise classifiers

[Figure: an example discriminative pattern x, {milk=True, aquatic=False} → +, shown against transactions with class labels (+: positive class, –: negative class)]

SLIDE 5

Background: Discriminative Patterns (2)

  • Discriminative patterns tend to be more meaningful than frequent patterns (thanks to class labels)
  • Are class labels always available?

– Comparing groups is a standard starting point in data analysis
– Clustering can find groups (classes)

[Figure: original data → 1. Clustering → clusters → 2. Discriminative pattern mining → clusters labeled with discriminative patterns (cluster labeling)]

SLIDE 6

Background: Discriminative Patterns (3)

  • Quality score: Measures the overlap between pattern x and positive class c
  • Most popular quality scores are not anti-monotonic:

– Confidence, Lift
– Support difference, Weighted relative accuracy, Leverage
– F-score, Dice, Jaccard
– ...

→ Branch & bound pruning is often used [Morishita+ 00][Zimmermann+ 09][Nijssen+ 09]

[Figure: two diagrams of pattern x and class c, one labeled "Quality is high" and one labeled "Quality is low"]
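
As one concrete instance of such a quality score, the F-score that appears in the later pattern tables can be written with covered-transaction counts. The symbols $D^{+}$, $D^{-}$, TP, FP and FN are notation assumed here for illustration; they do not appear on the slides.

```latex
\begin{align*}
\mathrm{TP}(x) &= |\{\, t \in D^{+} : x \subseteq t \,\}|, \qquad
\mathrm{FP}(x) = |\{\, t \in D^{-} : x \subseteq t \,\}|, \qquad
\mathrm{FN}(x) = |D^{+}| - \mathrm{TP}(x) \\[4pt]
F(x) &= \frac{2\,\mathrm{TP}(x)}{2\,\mathrm{TP}(x) + \mathrm{FP}(x) + \mathrm{FN}(x)}
\end{align*}
```

On the toy dataset of slide 7, {A} has TP = 4, FP = 3, FN = 1, giving 8/12 ≈ 0.67, while its super-pattern {A, C} has TP = 3, FP = 0, FN = 2, giving 6/8 = 0.75. The score can therefore rise when a pattern is extended, which is what "not anti-monotonic" means here.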

SLIDE 7

Background: Coping with redundancy (1)

  • Example: Item A is relevant to the positive class

Dataset (TIDs 1-5: positive transactions, TIDs 6-10: negative transactions)

TID  Class  Transaction
1    +      {A, B, D, E}
2    +      {A, B, C, D, E}
3    +      {A, C, D, E}
4    +      {A, B, C}
5    +      {B}
6    –      {A, B, D, E}
7    –      {B, C, D, E}
8    –      {C, D, E}
9    –      {A, D, E}
10   –      {A, D}

Top-15 patterns (+1 due to tie score)

Rank  Pattern       F-score  TIDs Covered
1     {A, C}        0.75     2, 3, 4
2     {B}           0.73     1, 2, 4, 5
3     {A}           0.67     1, 2, 3, 4
3     {A, B}        0.67     1, 2, 4
5     {A, D, E}     0.60     1, 2, 3
5     {A, E}        0.60     1, 2, 3
5     {C}           0.60     2, 3, 4
8     {A, B, C}     0.57     2, 4
8     {A, C, D}     0.57     2, 3
8     {A, C, D, E}  0.57     2, 3
8     {A, C, E}     0.57     2, 3
12    {A, D}        0.55     1, 2, 3
13    {A, B, D}     0.50     1, 2
13    {A, B, D, E}  0.50     1, 2
13    {A, B, E}     0.50     1, 2
13    {B, C}        0.50     2, 4

→ Patterns containing A tend to be top-ranked in the candidate list (most of them are redundant)
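
The F-scores and covered TIDs in the table can be recomputed from the dataset. The following is a minimal sketch, assuming the usual F1 definition over positive and negative cover counts; the helper names `cover` and `f_score` are made up for this illustration.

```python
# Toy dataset from slide 7: TIDs 1-5 are positive, TIDs 6-10 are negative.
POS = {1: set("ABDE"), 2: set("ABCDE"), 3: set("ACDE"), 4: set("ABC"), 5: set("B")}
NEG = {6: set("ABDE"), 7: set("BCDE"), 8: set("CDE"), 9: set("ADE"), 10: set("AD")}


def cover(pattern, transactions):
    """TIDs of the transactions that contain every item of the pattern."""
    return sorted(tid for tid, items in transactions.items() if set(pattern) <= items)


def f_score(pattern):
    """F-score of `pattern` as a predictor of the positive class."""
    tp = len(cover(pattern, POS))   # positive transactions covered
    fp = len(cover(pattern, NEG))   # negative transactions covered
    fn = len(POS) - tp              # positive transactions missed
    return 2 * tp / (2 * tp + fp + fn)


if __name__ == "__main__":
    for p in ["AC", "B", "A", "AB"]:
        print(sorted(p), round(f_score(p), 2), cover(p, POS))
```

Only the positive TIDs are reported as "covered", which matches the "TIDs Covered" column of the table.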

SLIDE 8

Background: Coping with redundancy (2)

  • Set-inclusion-based constraints

– Closedness [Pasquier+ 99]
– Productivity [Bayardo 00][Webb 07]

SLIDE 9

Background: Coping with redundancy (2)

  • Set-inclusion-based constraints

– Closedness [Pasquier+ 99]
– Productivity [Bayardo 00][Webb 07]

[Table: the 16 ranked patterns from slide 7]

Closedness: For patterns covering the same (positive) transactions, pick the largest one

SLIDE 11

Background: Coping with redundancy (2)

  • Set-inclusion-based constraints

– Closedness [Pasquier+ 99]
– Productivity [Bayardo 00][Webb 07]

[Table: the 16 ranked patterns from slide 7]

Closedness: 16 patterns → 8 patterns

SLIDE 12

Background: Coping with redundancy (2)

  • Set-inclusion-based constraints

– Closedness [Pasquier+ 99]
– Productivity [Bayardo 00][Webb 07]

[Table: the 16 ranked patterns from slide 7]

Productivity: If a super-pattern has no higher quality, remove it

SLIDE 14

Background: Coping with redundancy (2)

  • Set-inclusion-based constraints

– Closedness [Pasquier+ 99]
– Productivity [Bayardo 00][Webb 07]

[Table: the 16 ranked patterns from slide 7]

Productivity: 16 patterns → 4 patterns

SLIDE 15

Background: Coping with redundancy (2)

  • Set-inclusion-based constraints

– Productivity + Closedness [Kameya+ 13]

[Table: the 16 ranked patterns from slide 7]

Productivity + Closedness: 16 patterns → 3 patterns
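
The counts 8, 4 and 3 can be checked directly on the 16 listed patterns. This is a sketch under two assumptions that the slides do not spell out: a pattern is compared only against sub-patterns that are also in the list (the empty pattern is ignored), and "no higher quality" means the sub-pattern's score is greater than or equal to the super-pattern's. The function names are hypothetical.

```python
# Positive and negative transactions of the toy dataset (slide 7).
POS = {1: frozenset("ABDE"), 2: frozenset("ABCDE"), 3: frozenset("ACDE"),
       4: frozenset("ABC"), 5: frozenset("B")}
NEG = [frozenset("ABDE"), frozenset("BCDE"), frozenset("CDE"),
       frozenset("ADE"), frozenset("AD")]

# The 16 ranked patterns listed on slide 7.
PATTERNS = [frozenset(p) for p in
            ["AC", "B", "A", "AB", "ADE", "AE", "C", "ABC", "ACD", "ACDE",
             "ACE", "AD", "ABD", "ABDE", "ABE", "BC"]]


def pos_cover(x):
    """Set of positive TIDs whose transaction contains x."""
    return frozenset(tid for tid, t in POS.items() if x <= t)


def f_score(x):
    tp = len(pos_cover(x))
    fp = sum(x <= t for t in NEG)
    return 2 * tp / (tp + fp + len(POS))  # same as 2TP / (2TP + FP + FN)


def closed(patterns):
    """Per set of covered positive transactions, keep the largest pattern."""
    largest = {}
    for x in patterns:
        key = pos_cover(x)
        if key not in largest or len(x) > len(largest[key]):
            largest[key] = x
    return set(largest.values())


def productive(patterns):
    """Drop x if a listed proper sub-pattern scores at least as high as x."""
    return {x for x in patterns
            if not any(y < x and f_score(y) >= f_score(x) for y in patterns)}


if __name__ == "__main__":
    both = closed(PATTERNS) & productive(PATTERNS)
    print(len(closed(PATTERNS)), len(productive(PATTERNS)), len(both))
```

Under these assumptions the pattern {C} survives the productivity filter but not closedness, because {A, C} covers the same positive transactions.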

SLIDE 16

Background: Coping with redundancy (3)

  • The best-covering constraint

– In the same spirit as the HCC (highest confidence covering) constraint in HARMONY [Wang+ 05]

[Table: the 16 ranked patterns from slide 7]

Best-covering: Every pattern must be the best for at least one positive transaction

SLIDE 18

Background: Coping with redundancy (3)

  • The best-covering constraint

– In the same spirit as the HCC (highest confidence covering) constraint in HARMONY [Wang+ 05]

[Table: the 16 ranked patterns from slide 7]

Best-covering: 16 patterns → 2 patterns

[Table: the original dataset from slide 7]
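
The reduction to 2 patterns can be reproduced by brute force: for each positive transaction, find the highest-scoring pattern among all non-empty itemsets it contains. This sketch is only a check of the definition on the toy data, not the search procedure of the talk; keeping every tied winner is an assumption.

```python
from itertools import combinations

POS = {1: frozenset("ABDE"), 2: frozenset("ABCDE"), 3: frozenset("ACDE"),
       4: frozenset("ABC"), 5: frozenset("B")}
NEG = [frozenset("ABDE"), frozenset("BCDE"), frozenset("CDE"),
       frozenset("ADE"), frozenset("AD")]
ITEMS = "ABCDE"


def f_score(x):
    tp = sum(x <= t for t in POS.values())
    fp = sum(x <= t for t in NEG)
    return 2 * tp / (tp + fp + len(POS))


def best_covering():
    """Patterns that have the highest F-score for at least one positive transaction."""
    all_patterns = [frozenset(c)
                    for r in range(1, len(ITEMS) + 1)
                    for c in combinations(ITEMS, r)]
    winners = set()
    for items in POS.values():
        competitors = [x for x in all_patterns if x <= items]
        top = max(f_score(x) for x in competitors)
        winners |= {x for x in competitors if f_score(x) == top}
    return winners


if __name__ == "__main__":
    print(sorted("".join(sorted(x)) for x in best_covering()))
```

Here {A, C} is the best pattern for TIDs 2, 3 and 4, and {B} is the best for TIDs 1 and 5.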

SLIDE 19

Background: Control parameters

  • Minimum support (minsup) is a sensitive control parameter
  • Top-k mining [Han+ 02]:

– k = "# of output patterns"
– k is fairly easy to specify because we usually know how many patterns we can handle (k is more human-centric than minsup)
– However, we do not know exactly in advance how many useful patterns we can mine
– Is it possible to remove even k?

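
To make the role of k concrete, here is a generic sketch of keeping the k best-scored patterns with a min-heap. It is not the procedure of [Han+ 02]; the function name `top_k` and the (score, pattern) input format are assumptions for illustration.

```python
import heapq


def top_k(scored_patterns, k):
    """Return the k highest-scored (score, pattern) pairs, best first."""
    heap = []  # min-heap: heap[0] is the weakest pattern currently kept
    for score, pattern in scored_patterns:
        if len(heap) < k:
            heapq.heappush(heap, (score, pattern))
        elif score > heap[0][0]:
            # The new pattern beats the current k-th best, so it replaces it.
            heapq.heapreplace(heap, (score, pattern))
    return sorted(heap, reverse=True)


if __name__ == "__main__":
    scored = [(0.75, "AC"), (0.73, "B"), (0.67, "A"), (0.60, "C")]
    print(top_k(scored, 2))
```

Once the heap is full, the score at `heap[0]` acts as a threshold that rises during the search, which is why the choice of k determines what is kept and what is lost.
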
SLIDE 20

Background: Sequential covering (1)

  • Sequential covering:

– One traditional way of building a rule-based classifier

  • Procedure:

– Iterate until there are no uncovered positive examples

  • Induce a new rule r
  • Remove all positive examples covered by r

[Figure: positive examples and negative examples]
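
The procedure above can be sketched as a short loop. The slides do not say how each rule is induced, so this sketch assumes the simplest option: pick the itemset with the highest F-score on the positive examples that remain. It reuses the toy dataset of slide 7, and all names are hypothetical.

```python
from itertools import combinations

POS = {1: frozenset("ABDE"), 2: frozenset("ABCDE"), 3: frozenset("ACDE"),
       4: frozenset("ABC"), 5: frozenset("B")}
NEG = [frozenset("ABDE"), frozenset("BCDE"), frozenset("CDE"),
       frozenset("ADE"), frozenset("AD")]
ITEMS = "ABCDE"
CANDIDATES = [frozenset(c)
              for r in range(1, len(ITEMS) + 1)
              for c in combinations(ITEMS, r)]


def f_score(rule, positives):
    """F-score of `rule` measured against the remaining positive examples."""
    tp = sum(rule <= t for t in positives.values())
    fp = sum(rule <= t for t in NEG)
    return 2 * tp / (tp + fp + len(positives))


def sequential_covering():
    remaining = dict(POS)  # uncovered positive examples
    rules = []
    while remaining:
        # Induce a new rule r (assumed: exhaustive argmax of the F-score).
        rule = max(CANDIDATES, key=lambda r: f_score(r, remaining))
        covered = [tid for tid, t in remaining.items() if rule <= t]
        if not covered:  # nothing left that any candidate covers
            break
        rules.append(rule)
        # Remove all positive examples covered by r.
        for tid in covered:
            del remaining[tid]
    return rules


if __name__ == "__main__":
    print(["".join(sorted(r)) for r in sequential_covering()])
```

Note that the second rule is scored against only the two positive examples left after the first rule, so each later rule is learned from a smaller and differently distributed set.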

SLIDE 21

Background: Sequential covering (1)

[Figure: positive and negative examples, with the positive examples covered by rule r1 marked]

SLIDE 22

Background: Sequential covering (1)

[Figure: positive and negative examples; the covered positive examples are removed]

SLIDE 23

Background: Sequential covering (1)

[Figure: positive and negative examples, with the positive examples covered by rule r2 marked]

SLIDE 25

Background: Sequential covering (2)

  • Problems in removing positive examples:

– Rules generated later may not be meaningful
– The number of positive examples decreases [Domingos 94]
→ Rules generated later may not be statistically reliable

[Figure: positive and negative examples; the next rules must be learned from positive examples under a biased distribution]

SLIDE 26

Our proposal

  • ExCover: an efficient and exact method for finding non-redundant discriminative itemsets
  • Features:

– Exhaustive search, unlike sequential covering
– Best-covering constraint, tighter than productivity → fewer redundant patterns
– No control parameters limiting the search space

SLIDE 27

Outline

  ✓ Background
  • Our proposal

– Best-covering constraint
– ExCover

  • Experiments

SLIDE 29

Best-covering constraint (1)

  • Best-covering constraint:

“Every pattern must have the highest quality for at least one positive transaction it covers”

[Figure: "covers" relation between the possible patterns (including x) and the positive transactions t1 to t11]

SLIDE 30

Best-covering constraint (2)

  • Best-covering constraint:

“Every pattern must have the highest quality for at least one positive transaction it covers”

[Figure: the same "covers" relation over t1 to t11, with the instances of x marked]

SLIDE 31

Best-covering constraint (3)

  • Best-covering constraint:

“Every pattern must have the highest quality for at least one positive transaction it covers”

[Figure: the same "covers" relation over t1 to t11, with the instances of x and the competitors for t3 marked]

SLIDE 32

Best-covering constraint (3)

  • Best-covering constraint:

“Every pattern must have the highest quality for at least one positive transaction it covers”

[Figure: the same "covers" relation over t1 to t11, with the instances of x and the competitors for t7 marked]

SLIDE 33

Best-covering constraint (3)

  • Best-covering constraint:

“Every pattern must have the highest quality for at least one positive transaction it covers”

[Figure: the same "covers" relation over t1 to t11, with the instances of x and the competitors for t9 marked]

We can also say: x must have higher quality than any other competitor for some instance
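
The constraint can be stated a little more formally as follows. This is a sketch in notation assumed here (positive transactions $D^{+}$, quality score $Q$, and the competitors of $x$ for $t$ taken to be the other patterns covering $t$); the slides give only the verbal definition.

```latex
\[
x \text{ is best-covering}
\;\iff\;
\exists\, t \in D^{+} :\quad
x \subseteq t
\;\wedge\;
\forall z \subseteq t,\ z \neq x :\; Q(x) > Q(z)
\]
```

The strict inequality follows the "higher quality than any other competitor" wording. Whether tied patterns also count as best is not settled by that wording; the candidate table described later keeps "best competitor(s)" per transaction, which suggests that ties are retained.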

SLIDE 34

Best-covering constraint (4)

  • Tightness:

Best-covering is tighter than productivity

Sketch of proof
– Every sub-pattern of x is a competitor of x
– If x is best-covering, its sub-patterns must have lower quality
– Productivity: x must have higher quality than its sub-patterns

  • Branch & bound pruning:

We can safely prune x and its descendants when the upper bound of x's quality is lower than the quality of the current best competitor for every positive transaction that x covers

(Best-covering: x must have higher quality than any other competitor for some instance)
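
The pruning test can be sketched for the F-score. The slides do not give the upper bound, so the bound below is an assumption: a super-pattern can never cover more positive transactions than x, and at best it covers no negative ones, which gives 2·TP / (TP + |D+|). Function and parameter names are hypothetical.

```python
def f_upper_bound(tp, n_pos):
    """Optimistic F-score of any super-pattern of a pattern covering `tp` positives.

    Assumes the best case for descendants: all `tp` positives stay covered
    and no negative transaction is covered.
    """
    return 2 * tp / (tp + n_pos)


def can_prune(tp, n_pos, best_quality_per_instance):
    """True if no descendant can become best for any positive transaction x covers.

    `best_quality_per_instance` holds, for each positive transaction covered by x,
    the quality of the best competitor found so far (0.0 if none was found yet).
    """
    upper = f_upper_bound(tp, n_pos)
    return all(upper < q for q in best_quality_per_instance)


if __name__ == "__main__":
    # Toy data of slide 7: {B, D} covers TIDs 1 and 2, whose current bests
    # are {B} (8/11) and {A, C} (0.75); the bound 4/7 is below both.
    print(can_prune(2, 5, [8 / 11, 0.75]))
```

The test is strict on purpose: a descendant that could merely tie with a current best is not pruned, so no pattern that might be best is lost.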

SLIDE 35

Outline

  ✓ Background
  • Our proposal

– ✓ Best-covering constraint
– ExCover

  • Experiments

SLIDE 36

ExCover: Search space

  • Basic strategy:

– Depth-first search by a variant [Kameya+ 13] of LCM [Uno+ 04]:

  • Only visits patterns closed on positive transactions

→ The closedness constraint is built-in

  • Visits shorter patterns including high-quality items earlier

→ There is more chance of pruning

[Figure: enumeration tree of closed patterns, {B} {A} {B, A} {A, C} {B, A, C} {A, E, D} {B, A, E, D} {A, C, E, D} {B, A, C, E, D}, with nested regions marking all combinations of B; of B and A; and of B, A and C. Quality of item: B > A > C > E > D]
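
Both the item order and the nine closed patterns in the enumeration tree can be recomputed from the toy dataset of slide 7. This sketch enumerates every itemset and takes its closure on the positive transactions; it illustrates what the search space contains, not how the LCM variant traverses it. The F-score is assumed as the item quality.

```python
from itertools import combinations

POS = [set("ABDE"), set("ABCDE"), set("ACDE"), set("ABC"), set("B")]
NEG = [set("ABDE"), set("BCDE"), set("CDE"), set("ADE"), set("AD")]
ITEMS = "ABCDE"


def f_score(x):
    tp = sum(x <= t for t in POS)
    fp = sum(x <= t for t in NEG)
    return 2 * tp / (tp + fp + len(POS))


def closure(x):
    """Intersection of all positive transactions containing x (None if there are none)."""
    covering = [t for t in POS if x <= t]
    return frozenset(set.intersection(*covering)) if covering else None


# Items sorted by decreasing quality of the single-item pattern.
item_order = sorted(ITEMS, key=lambda i: f_score({i}), reverse=True)

# A pattern is closed on the positive transactions if it equals its own closure.
closed_patterns = {closure(set(c))
                   for r in range(1, len(ITEMS) + 1)
                   for c in combinations(ITEMS, r)} - {None}

if __name__ == "__main__":
    print(item_order)
    print(sorted("".join(sorted(p)) for p in closed_patterns))
```

For example, {C} is not closed on the positive transactions because every positive transaction containing C also contains A, so its closure is {A, C}.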

SLIDE 37

ExCover: Candidate table

  • Basic strategy (cont’d):

– Top-1 (Top-k with k = 1) mining concurrently for each positive transaction

  • Candidate patterns are maintained in the candidate table following the best-covering constraint

SLIDE 38

ExCover: Candidate table

[Figure: "covers" relation between patterns (including x) and positive transactions, with the instances of x marked]

SLIDE 39

ExCover: Candidate table

  • Basic strategy (cont’d):

– Top-1 (Top-k with k = 1) mining concurrently for each positive transaction

  • Candidate table is a map:

Positive transaction t → Best competitor(s) for t

[Figure: the current pattern x next to the candidate table, which has one row per positive transaction]

SLIDE 40

ExCover: Candidate table

[Figure: candidate table when the current pattern x is visited. Rows for the instances of x: t3 → (empty), t7 → z1, t9 → z2]

SLIDE 42

ExCover: Candidate table

[Figure: candidate table after x fills the empty entry. Rows for the instances of x: t3 → x, t7 → z1, t9 → z2]

SLIDE 44

ExCover: Candidate table

[Figure: candidate table after x replaces z1 because Quality(x) > Quality(z1). Rows for the instances of x: t3 → x, t7 → x, t9 → z2]

SLIDE 46

ExCover: Candidate table

[Figure: candidate table where z2 is kept because Quality(x) < Quality(z2). Rows for the instances of x: t3 → x, t7 → x, t9 → z2]

SLIDE 48

ExCover: Candidate table

[Figure: candidate table with rows t3 → z1, t7 → z2, t9 → z3 for the instances of the current pattern x. The upper bound of x's quality is compared with each entry: Quality(x) < Quality(z1), Quality(x) < Quality(z2), Quality(x) < Quality(z3) → Pruned!]
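
The candidate-table walk-through can be turned into a small end-to-end sketch on the toy dataset of slide 7. It is a simplification under stated assumptions, not the ExCover implementation: it enumerates plain itemsets depth-first instead of using the LCM variant for closed patterns, it uses the F-score with the optimistic bound 2·TP / (TP + |D+|), and it keeps tied patterns in the same table entry. All names are hypothetical.

```python
POS = {1: frozenset("ABDE"), 2: frozenset("ABCDE"), 3: frozenset("ACDE"),
       4: frozenset("ABC"), 5: frozenset("B")}
NEG = [frozenset("ABDE"), frozenset("BCDE"), frozenset("CDE"),
       frozenset("ADE"), frozenset("AD")]
ITEMS = "BACED"  # items in decreasing quality, as on slide 36


def excover_sketch():
    # Candidate table: positive TID -> (best quality so far, best pattern(s)).
    table = {tid: (0.0, set()) for tid in POS}
    visited = 0

    def visit(x, start):
        nonlocal visited
        tids = [tid for tid, t in POS.items() if x <= t]  # instances of x
        if not tids:
            return
        visited += 1
        tp, n_pos = len(tids), len(POS)
        fp = sum(x <= t for t in NEG)
        quality = 2 * tp / (tp + fp + n_pos)
        # Concurrent top-1 update for every instance of x.
        for tid in tids:
            best, patterns = table[tid]
            if quality > best:
                table[tid] = (quality, {x})
            elif quality == best:
                patterns.add(x)
        # Safe pruning: no descendant can reach the best of any instance.
        upper = 2 * tp / (tp + n_pos)
        if all(upper < table[tid][0] for tid in tids):
            return
        for i in range(start, len(ITEMS)):
            visit(x | {ITEMS[i]}, i + 1)

    for i in range(len(ITEMS)):
        visit(frozenset(ITEMS[i]), i + 1)
    result = set().union(*(patterns for _, patterns in table.values()))
    return result, visited


if __name__ == "__main__":
    patterns, visited = excover_sketch()
    print(sorted("".join(sorted(p)) for p in patterns), visited)
```

No minsup or k appears anywhere: the only thresholds are the per-transaction best qualities in the table, which rise as the search proceeds and let the bound cut off branches without losing any best-covering pattern.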

SLIDE 49

ExCover: Property

  • ExCover is...

– Exhaustive

  • Only performs safe branch & bound pruning

– Parameter-free

  • Conducts concurrent top-1 mining (k = 1 is fixed inside the algorithm)

SLIDE 50

ExCover: Related work

  • HARMONY [Wang+ 05]

– Uses the same strategy as ExCover
– However, its original paper does not discuss redundancy
– Uses confidence p(c | x) as the quality score

  • Confidence prefers highly specific patterns

→ It is not easy to obtain its upper bound

  • A user-specified minsup is required for pruning

SLIDE 51

Outline

  ✓ Background
  ✓ Our proposal

– ✓ Best-covering constraint
– ✓ ExCover

  • Experiments

SLIDE 52

Experiments: Outline

  • We use datasets from UCI ML Repository
  • Experiment 1:

– Detailed analysis on redundancy among patterns using the Mushroom dataset

  • Experiment 2:

– Analysis on search performance using 16 datasets preprocessed by the CP4IM project:

Dataset            #Trans.  #Items
anneal                 812      93
audiology              216     148
australian-credit      653     125
german-credit        1,000     112
heart-cleveland        296      95
hepatitis              137      68
hypothyroid          3,247      88
kr-vs-kp             3,196      73
lymph                  148      68
mushroom             8,124     110
primary-tumor          336      31
soybean                630      50
splice-1             3,190     287
tic-tac-toe            958      28
vote                   435      48
zoo-1                  101      36

SLIDE 53

Experiment 1: Mushroom

Productivity + Closedness + Top-k [Kameya+ 13] (k = 30)

Rank  Pattern                                                  F-score
1     {odor=n, veil-type=p}                                    0.881
2     {gill-size=b, stalk-surface-above-ring=s, veil-type=p}   0.866
3     {gill-size=b, stalk-surface-below-ring=s, veil-type=p}   0.837
4     {gill-size=b, veil-type=p}                               0.798
5     {stalk-surface-above-ring=s, veil-type=p}                0.776
6     {ring-type=p, veil-type=p}                               0.771
7     {stalk-surface-below-ring=s, veil-type=p}                0.744
8     {veil-type=p}                                            0.682

Annotations on the table: "Covers 4,112 out of 4,208 positive transactions"; "Covers remaining 96 positive transactions"

SLIDE 54

Experiment 1: Mushroom

Productivity + Closedness + Top-k [Kameya+ 13] (k = 30)

[Table: the 8 ranked patterns from slide 53]

ExCover

Rank  Pattern                                                  F-score
1     {odor=n, veil-type=p}                                    0.881
2     {gill-size=b, stalk-surface-above-ring=s, veil-type=p}   0.866
3     {stalk-surface-above-ring=s, veil-type=p}                0.776

Specifying k < 5 loses information from 96 positive transactions!
We only need 3 best-covering patterns to summarize the entire dataset

Annotation on the table: "Covers remaining 96 positive transactions"

SLIDE 55

Experiment 2: Settings

  • 16 datasets preprocessed by the CP4IM project
  • Previous method in comparison [Kameya+ 13]:

– Productivity + Closedness + Top-k
– k was chosen from 10, 100 and 1,000

SLIDE 56

Experiment 2: #Patterns

  • ExCover outputs a more compact set of patterns
  • # of output patterns was moderate and did not vary

[Figure: # of output patterns; legend entry: Productivity + Closedness + Top-10]

SLIDE 57

Experiment 2: Search space

  • Search space = # of visited patterns in depth-first search

SLIDE 58

Experiment 2: Running time

  • Our implementation: in Java
  • Running time averaged over 30 runs
  • For most datasets, ExCover finishes within one second

[Figure: running time (second)]

SLIDE 59

Summary

  • ExCover: an efficient and exact method for finding non-redundant discriminative itemsets

– Works under the best-covering constraint
– Requires no control parameters limiting the search space
– Finds a more compact set of patterns in a shorter time

Future work

  • Transactions including numeric values
  • Building classifiers from best-covering patterns
  • Sequence pattern mining