
CloseViz: Visualizing Useful Patterns. Chris Carmichael and Carson K. Leung, Department of Computer Science, The University of Manitoba, Canada. UP @ KDD 2010.


SLIDE 1

Carmichael & Leung (U Manitoba, Canada) UP@KDD 2010 / 1

CloseViz: Visualizing Useful Patterns

Chris Carmichael, Carson K. Leung
Department of Computer Science
The University of Manitoba, Canada
UP @ KDD 2010

Outline

  • Introduction
  • Motivation & related work
      • Existing visualizers
  • Proposed visualizer
      • CloseViz: Visualizing closed frequent patterns
  • Conclusions

SLIDE 2

Introduction & Motivation

  • Focus: the KDD task of frequent pattern mining
  • Motivation: Since the introduction of frequent pattern mining, many algorithms have been developed
      • They mostly return the mined results in textual form
  • “A picture is worth a thousand words”
      • Visual representation helps users gain insight into massive amounts of data or information

Motivation: Existing Visualizers

  • Many were designed to visualize association rules (e.g., {apples, bananas} → {cherries, dates})
  • More recently, visualizers have appeared that can be used for visualizing frequent patterns

SLIDE 3

A Sample Visualizer #1

  • Designed to visualize association rules
  • Can be used for visualizing frequent patterns
  • Uses a 2D space consisting of many vertical axes
  • Evenly distributes domain items along these vertical axes
  • Represents an itemset X as a curve
  • Uses thickness of the curve to indicate frequency of an itemset X

[Figure: domain items e, c, b, d, a placed along the vertical axes]

A Sample Visualizer #1

  • {a,c,d}, {b,c,d,e}
  • frequency(e) ≥ frequency(c) ≥ frequency(b) ≥ frequency(d) ≥ frequency(a)

[Figure: curves for the two itemsets over domain items e, c, b, d, a]

SLIDE 4

A Sample Visualizer #1

  • Do these curves represent itemsets…
      {a,c,d} & {b,c,d,e}
    or
      {a,c,d,e} & {b,c,d}?

[Figure: crossing curves over domain items e, c, b, d, a]

A Sample Visualizer #1

Problems:

  • 1. Does not clearly show the (absolute) frequency of a domain item
  • 2. Not easy to tell the (absolute) frequency of an itemset by judging the thickness of curves
  • 3. Curves cross over each other

[Figure: curves over domain items e, c, b, d, a]
SLIDE 5

A Sample Visualizer #2: FIsViz [PAKDD’08]

  • Designed to visualize frequent patterns
  • Uses a 2D space with domain items on the x-axis & frequency on the y-axis
  • Represents an itemset X as a polyline

[Figure: x-axis: domain items (a, b, c); y-axis: frequency (50% to 80%)]
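
As a rough illustration of the polyline idea, the sketch below computes the points a polyline for an itemset X could pass through. It assumes (the slide does not say this) that the vertex above the i-th item is drawn at the frequency of the prefix made of the first i items of X. The toy transaction database and the names `support` and `polyline_vertices` are hypothetical and do not come from FIsViz itself.

```python
from fractions import Fraction

# Hypothetical toy transaction database (not from the slides).
TRANSACTIONS = [
    {"a", "b", "c"},
    {"a", "b", "c"},
    {"a", "b"},
    {"a", "c"},
    {"b"},
]


def support(itemset, transactions):
    """Fraction of transactions that contain every item of `itemset`."""
    items = set(itemset)
    hits = sum(1 for t in transactions if items <= t)
    return Fraction(hits, len(transactions))


def polyline_vertices(itemset, transactions):
    """Return (item, frequency) vertices, one per item in sorted order.

    Assumption: the vertex above the i-th item sits at the frequency of
    the prefix made of the first i items of the itemset.
    """
    ordered = sorted(itemset)
    return [
        (item, support(ordered[: i + 1], transactions))
        for i, item in enumerate(ordered)
    ]


for item, freq in polyline_vertices({"a", "b", "c"}, TRANSACTIONS):
    print(f"{item}: {float(freq):.0%}")
```

Under this assumption the polyline can only stay level or descend from left to right, because adding an item to a prefix never increases its frequency.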

A Sample Visualizer #2: FIsViz

Advantages:

  • 1. Clearly shows the frequency of a domain item
      • E.g., frequency({b}) = 70%
  • 2. Easy to tell the frequency of an itemset
      • E.g., freq({a,b,c}) = 50%

[Figure: x-axis: domain items (a, b, c); y-axis: frequency (50% to 80%)]

SLIDE 6

A Sample Visualizer #2: FIsViz

Potential problem:

  • Polylines bend & cross over each other
      • E.g., do these polylines represent itemsets…
          {a,c,d} & {b,c,e}
        or
          {a,c,e} & {b,c,d}?

[Figure: x-axis: domain items (a, b, c, d, e); y-axis: frequency (50% to 80%)]

A Sample Visualizer #3: WiFIsViz [ICDM’08]

  • Also designed to visualize frequent patterns
  • Uses a 2D space with domain items on the x-axis & frequency on the y-axis
  • Represents an itemset X as a horizontal line

[Figure: x-axis: domain items (a, b, c); y-axis: frequency (50% to 80%)]

SLIDE 7

A Sample Visualizer #3: WiFIsViz

Advantages:

  • 1. Clearly shows the frequency of a domain item
      • E.g., frequency({b}) = 70%
  • 2. Easy to tell the frequency of an itemset
      • E.g., freq({a,b,c}) = 50%

[Figure: x-axis: domain items (a, b, c); y-axis: frequency (50% to 80%)]

A Sample Visualizer #3: WiFIsViz

Potential problems:

  • 1. Shows all frequent patterns
      • Lots of horizontal lines
  • 2. Multiple frequent patterns may have the same frequency
      • Broad band for each frequency value
        or
      • Many horizontal lines project onto one: info loss ({a,b,c,d} is at 60% or 50%?)
  • 3. Uses different icons (unfilled vs. filled circles)

[Figure: x-axis: domain items (a, b, c, d); y-axis: frequency (50%, 55%, 60%)]
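
To make the first two problems concrete, the sketch below enumerates every frequent itemset of a tiny database by brute force and groups the itemsets by frequency. Each group corresponds to patterns that would have to share a single y-value. The database, the threshold and the function names are hypothetical, and the brute-force enumeration is only suitable for toy inputs; it is not the mining algorithm behind any of the visualizers.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical toy transaction database (not from the slides).
TRANSACTIONS = [
    {"a", "b", "c", "d"},
    {"a", "b", "c", "d"},
    {"a", "b", "c"},
    {"a", "b"},
]


def frequent_itemsets(transactions, min_count):
    """Map each frequent itemset (a frozenset) to its absolute count."""
    items = sorted(set().union(*transactions))
    result = {}
    for size in range(1, len(items) + 1):
        for combo in combinations(items, size):
            count = sum(1 for t in transactions if set(combo) <= t)
            if count >= min_count:
                result[frozenset(combo)] = count
    return result


def group_by_count(patterns):
    """Group itemsets by count; each group would share one y-value."""
    groups = defaultdict(list)
    for itemset, count in patterns.items():
        groups[count].append(itemset)
    return dict(groups)


patterns = frequent_itemsets(TRANSACTIONS, min_count=2)
groups = group_by_count(patterns)
for count in sorted(groups, reverse=True):
    print(f"count {count}: {len(groups[count])} itemsets")
```

With only four domain items, all 15 non-empty itemsets are frequent here, and they fall into just three frequency levels.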

SLIDE 8

Problem Statement

  • We provide users with a visualizer that is designed for showing only useful patterns & that avoids aforementioned potential problems
  • Contribution: We propose CloseViz (which shows closed frequent patterns)
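
For readers unfamiliar with the term: a frequent itemset is closed when no proper superset has the same frequency. Equivalently, it equals the intersection of all transactions that contain it. The sketch below uses that second formulation on a hypothetical toy database. The function names and the brute-force enumeration are illustrative assumptions, not the mining procedure used by CloseViz.

```python
from itertools import combinations

# Hypothetical toy transaction database (not from the slides).
TRANSACTIONS = [
    {"a", "b", "c", "d"},
    {"a", "b", "c", "d"},
    {"a", "b", "c"},
    {"a", "b"},
]


def closure(itemset, transactions):
    """Intersection of all transactions containing `itemset`.

    Returns None when no transaction contains the itemset.
    """
    covering = [set(t) for t in transactions if itemset <= t]
    if not covering:
        return None
    return frozenset(set.intersection(*covering))


def closed_frequent_itemsets(transactions, min_count):
    """Map each closed frequent itemset to its absolute count."""
    items = sorted(set().union(*transactions))
    closed = {}
    for size in range(1, len(items) + 1):
        for combo in combinations(items, size):
            candidate = frozenset(combo)
            count = sum(1 for t in transactions if candidate <= t)
            # Keep only frequent itemsets that equal their own closure.
            if count >= min_count and closure(candidate, transactions) == candidate:
                closed[candidate] = count
    return closed


closed = closed_frequent_itemsets(TRANSACTIONS, min_count=2)
for itemset, count in sorted(closed.items(), key=lambda kv: -kv[1]):
    print(sorted(itemset), count)
```

On this toy input, every non-empty itemset is frequent at the chosen threshold, yet only three of them are closed.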

Our Visualizer: CloseViz

SLIDE 9

CloseViz

  • Like WiFIsViz, CloseViz ...
      • uses a 2D space with domain items on the x-axis & frequency on the y-axis
      • represents an itemset X as a horizontal line
  • Unlike WiFIsViz, CloseViz ...
      • shows closed frequent patterns (instead of all frequent patterns)
      • uses only one type of icon (i.e., unfilled circle)
      • distinguishes real patterns vs. the results of projection

CloseViz

  • 1. Shows closed frequent patterns

[Figure: two side-by-side panels, one labeled WiFIsViz; x-axis: domain items (a, b, c, d); y-axis: frequency (50%, 55%, 60%)]

SLIDE 10

CloseViz

  • 2. Uses only unfilled circles

[Figure: two side-by-side panels, one labeled CloseViz; x-axis: domain items (a, b, c, d); y-axis: frequency (50%, 55%, 60%)]

CloseViz

  • 3. Represents real closed patterns by solid lines, results of projection by dashed lines

[Figure: two side-by-side panels, WiFIsViz and CloseViz; x-axis: domain items (a, b, c, d); y-axis: frequency (50%, 55%, 60%)]

SLIDE 11

Sample Screenshots


Screenshot of FIsViz

SLIDE 12

Screenshot of WiFIsViz


Screenshot of CloseViz

SLIDE 13

Conclusions

  • We proposed CloseViz, which provides users with a visualizer that ...
      • is designed for showing useful patterns (namely, closed frequent patterns)
      • avoids aforementioned potential problems of existing visualizers
  • CloseViz …
      • reduces #patterns to be shown
      • allows visual exploration
      • retains all important info (closed patterns can serve as surrogates for all frequent patterns)
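
The surrogate property can be sketched as follows: the count of any frequent itemset equals the largest count among the closed patterns that contain it, so nothing is lost by showing only the closed ones. The closed patterns listed below are hypothetical, and `recover_count` is an illustrative helper, not part of CloseViz.

```python
# Hypothetical closed frequent patterns with absolute counts
# (not taken from the slides).
CLOSED = {
    frozenset({"a", "b"}): 4,
    frozenset({"a", "b", "c"}): 3,
    frozenset({"a", "b", "c", "d"}): 2,
}


def recover_count(itemset, closed):
    """Largest count among the closed supersets of `itemset`.

    Returns None when no closed superset exists, which means the
    itemset is not frequent.
    """
    target = frozenset(itemset)
    counts = [count for pattern, count in closed.items() if target <= pattern]
    return max(counts) if counts else None


print(recover_count({"a"}, CLOSED))       # 4
print(recover_count({"c"}, CLOSED))       # 3
print(recover_count({"b", "d"}, CLOSED))  # 2
print(recover_count({"e"}, CLOSED))       # None
```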

SLIDE 14

Thank you / Merci

kleung [AT] cs.umanitoba.ca
www.cs.umanitoba.ca/~kleung
dblab.cs.umanitoba.ca