707.009 Foundations of Knowledge Management: Categorization & Formal Concept Analysis



slide-1
SLIDE 1

Knowledge Management Institute

707.009 Foundations of Knowledge Management: „Categorization & Formal Concept Analysis“

Markus Strohmaier

  • Univ. Ass. / Assistant Professor

Knowledge Management Institute
Graz University of Technology, Austria
e-mail: markus.strohmaier@tugraz.at
web: http://www.kmi.tugraz.at/staff/markus

Markus Strohmaier 2011

slide-2
SLIDE 2


Slides in part based on

  • Gerd Stumme

    – Course at Otto-von-Guericke Universität Magdeburg / Summer Term 2003
    – ECML PKDD Tutorial

  • Rudolf Wille

    – „Formal Concept Analysis as Mathematical Theory of Concepts and Concept Hierarchies“, in Formal Concept Analysis, eds. B. Ganter et al., LNAI 3626, pp. 1-33 (2005)

Further Literature:

  • http://www.aifb.uni-karlsruhe.de/WBS/gst/FBA03/chapter1_2.pdf (Ganter / Stumme)

slide-3
SLIDE 3


Overview

Today's Agenda: Categorization & Formal Concept Analysis

  • Formal Context
  • Formal Concepts
  • Formal Concept Lattices
  • FCA Implications
  • Constructing Concept Lattices


slide-4
SLIDE 4


Categorization [Mervis Rosch 1981]

Intension (Meaning)

  • The specification of those qualities that a thing must have to be a member of the class

Extension (the objects in the class)

  • Things that have those qualities

slide-5
SLIDE 5


Categorization [Mervis Rosch 1981]

Six salient problems:

  • Arbitrariness of categories. Are there any a priori reasons for dividing objects into categories, or is this division initially arbitrary?
  • Equivalence of category members. Are all category members equally representative of the category, as has often been assumed?
  • Determinacy of category membership and representation. Are categories specified by necessary and sufficient conditions for membership? Are boundaries of categories well defined?
  • The nature of abstraction. How much abstraction is required, that is, do we need only memory for individual exemplars to account for categorization? Or, at the other extreme, are higher-order abstractions of general knowledge, beyond the individual categories, necessary?
  • Decomposability of categories into elements. Does a reasonable explanation of objects consist in their decomposition into elementary qualities?
  • The nature of attributes. What are the characteristics of these "attributes" into which categories are to be decomposed?

slide-6
SLIDE 6


Formal Concept Analysis

Running Example:

Taste: Sweet/Sour, Shape: Round/Long/.., Color: Red/Yellow/.., Texture: Smooth/Bumpy, ..

[Mervis Rosch 1981]

„I would only like to point out that there was a time when the similarity of sensations was made the basis for categorizing plants and animals. Think [...] of the early taxonomies of Ulisse Aldrovandi from the 16th century, who put the hideous animals (the spiders, newts and snakes) and the beauties (the leopards, the eagles, etc.) into separate groups [of living beings].“

Heinz von Foerster, Wahrheit ist die Erfindung eines Lügners, pp. 22/23

slide-7
SLIDE 7


Terminology

ISO 704: Terminology Work: Principles and Methods
DIN 2330: Begriffe und ihre Benennungen (Concepts and Their Designations)

  • Representation level: Name, Definition
  • Concept level: Concept with attribute a, attribute b, attribute c (e.g. Apple: Taste, Color, Shape)
  • Object level: Object 1, Object 2, Object 3, each with property A, property B, property C

slide-8
SLIDE 8


Formal Concept Analysis [Wille 2005]

Models concepts as units of thought. A concept is constituted by its:

  • Extension: consists of all objects belonging to a concept
  • Intension: consists of all attributes common to all those objects

Concepts „live“ in relationships with many other concepts, where the sub-concept-superconcept relation plays a prominent role.

slide-9
SLIDE 9


Formal Concept Analysis [Wille 2005]

Formal context: A formal context is a triple (G, M, I) for which G and M are sets while I is a binary relation between G and M.

Formal concept: A formal concept of a formal context K := (G, M, I) is defined as a pair (A, B) with A ⊆ G, B ⊆ M, A = B′, and B = A′; A and B are called the extent and the intent of the formal concept (A, B), respectively.

slide-10
SLIDE 10


Formal Concept Analysis

Taste: Sweet/Sour, Shape: Round/Long/.., Color: Red/Yellow/.., Texture: Smooth/Bumpy, ..

Def.: A formal context is a triple (G, M, I), where

  • G is a set of objects,
  • M is a set of attributes,
  • and I is a relation between G and M.

(g, m) ∈ I is read as „object g has attribute m“.
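As a minimal sketch of this definition, the triple (G, M, I) can be written down with plain Python sets. The fruit names and their attributes are an assumption for illustration: they are read off the many-valued fruit table in the scaling slide of this deck, and the cross table shown on the slide itself is not preserved and may differ. The names `ROWS` and `has_attribute` are hypothetical.

```python
# Assumed running example: each fruit mapped to the attributes it has.
# (Hypothetical encoding; the slide's own cross table is not preserved.)
ROWS = {
    "apple": {"red", "green", "yellow", "round", "sweet", "sour", "smooth"},
    "lemon": {"yellow", "round", "sour", "bumpy"},
    "banana": {"yellow", "long", "sweet", "smooth"},
    "strawberries": {"red", "round", "sweet", "bumpy"},
    "grapes": {"red", "green", "round", "sweet", "sour", "smooth"},
    "pear": {"yellow", "long", "sweet", "smooth"},
}

G = set(ROWS)                    # set of objects
M = set().union(*ROWS.values())  # set of attributes
I = {(g, m) for g, attrs in ROWS.items() for m in attrs}  # incidence relation


def has_attribute(g, m):
    """Return True if (g, m) is in I, read as 'object g has attribute m'."""
    return (g, m) in I


print(has_attribute("lemon", "sour"))    # True
print(has_attribute("banana", "round"))  # False
```

Storing I as a set of pairs mirrors the definition directly; a cross table is the same relation drawn as a grid.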

slide-11
SLIDE 11


Formal Concept Analysis

Derivation Operators

Taste: Sweet/Sour, Shape: Round/Long/.., Color: Red/Yellow/.., Texture: Smooth/Bumpy, ..

  • A ⊆ G, B ⊆ M (A…Extent, B…Intent)
  • A′ := {m ∈ M | (g, m) ∈ I for all g ∈ A}: all attributes shared by all objects of A
  • B′ := {g ∈ G | (g, m) ∈ I for all m ∈ B}: all objects having all attributes of B

[Cross table of the running example, highlighting a set of objects A1 and its derivation A1′]

A formal concept is defined as a pair (A, B) with A = B′ and B = A′.
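The two derivation operators can be sketched in a few lines of Python. The fruit context below is an assumption taken from the many-valued table in the scaling slide (the slide's own cross table is not preserved), and the function names `intent` and `extent` are hypothetical.

```python
# Assumed running example (hypothetical encoding of the fruit context).
ROWS = {
    "apple": {"red", "green", "yellow", "round", "sweet", "sour", "smooth"},
    "lemon": {"yellow", "round", "sour", "bumpy"},
    "banana": {"yellow", "long", "sweet", "smooth"},
    "strawberries": {"red", "round", "sweet", "bumpy"},
    "grapes": {"red", "green", "round", "sweet", "sour", "smooth"},
    "pear": {"yellow", "long", "sweet", "smooth"},
}
ALL_ATTRIBUTES = set().union(*ROWS.values())


def intent(objects):
    """A': all attributes shared by every object in `objects`."""
    shared = set(ALL_ATTRIBUTES)  # the empty object set shares every attribute
    for g in objects:
        shared &= ROWS[g]
    return shared


def extent(attributes):
    """B': all objects that have every attribute in `attributes`."""
    return {g for g, attrs in ROWS.items() if set(attributes) <= attrs}


A = {"apple", "grapes"}
B = intent(A)
print(sorted(B))       # attributes common to apple and grapes
print(extent(B) == A)  # True, so (A, B) is a formal concept of this context
```

Applying one operator and then the other always yields a closed set, which is why (B′, B′′) is a formal concept for any attribute set B.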

slide-12
SLIDE 12


Formal Concept Analysis

Def.: A formal concept is a pair (A, B), with

  • A ⊆ G, B ⊆ M
  • A′ = B and B′ = A

(A′: all attributes shared by all objects of A; B′: all objects having all attributes of B)

Taste: Sweet/Sour, Shape: Round/Long/.., Color: Red/Yellow/.., Texture: Smooth/Bumpy, ..

Set A is called the extent (a set of objects) and set B is called the intent (a set of attributes) of the formal concept (A, B).

[Cross table of the running example with the intent and the extent of one concept marked]

slide-13
SLIDE 13


Formal Concept Analysis

Sub/Superconcept Relation

Taste: Sweet/Sour, Shape: Round/Long/.., Color: Red/Yellow/.., Texture: Smooth/Bumpy, ..

  • A ⊆ G, B ⊆ M
  • A′: all attributes shared by all objects of A
  • B′: all objects having all attributes of B

[Cross table highlighting two concepts, (A1, B1) with B1 = A1′ and (A2, B2) with B2 = A2′, one in orange and one in blue]

The orange concept is a subconcept of the blue concept, since its extent is contained in the blue one (equivalently, the blue intent is contained in the orange one).
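The sub/superconcept relation can be checked mechanically. The sketch below assumes the fruit context from the many-valued table in the scaling slide; the two example concepts are chosen for illustration and are not necessarily the orange and blue concepts on the slide. All function names are hypothetical.

```python
# Assumed running example (hypothetical encoding of the fruit context).
ROWS = {
    "apple": {"red", "green", "yellow", "round", "sweet", "sour", "smooth"},
    "lemon": {"yellow", "round", "sour", "bumpy"},
    "banana": {"yellow", "long", "sweet", "smooth"},
    "strawberries": {"red", "round", "sweet", "bumpy"},
    "grapes": {"red", "green", "round", "sweet", "sour", "smooth"},
    "pear": {"yellow", "long", "sweet", "smooth"},
}
ALL_ATTRIBUTES = set().union(*ROWS.values())


def intent(objects):
    """A': attributes shared by all given objects."""
    shared = set(ALL_ATTRIBUTES)
    for g in objects:
        shared &= ROWS[g]
    return shared


def extent(attributes):
    """B': objects having all given attributes."""
    return {g for g, attrs in ROWS.items() if set(attributes) <= attrs}


def concept_from_attributes(attributes):
    """Return the formal concept (B', B'') generated by an attribute set B."""
    a = extent(attributes)
    return frozenset(a), frozenset(intent(a))


def is_subconcept(c1, c2):
    """(A1, B1) <= (A2, B2) holds iff the extent A1 is contained in A2."""
    return c1[0] <= c2[0]


small = concept_from_attributes({"smooth", "red", "sour"})
large = concept_from_attributes({"smooth"})
print(is_subconcept(small, large))  # True: extent of `small` lies inside extent of `large`
print(large[1] <= small[1])         # True: the intents are ordered the other way round
```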

slide-14
SLIDE 14


Formal Concept Analysis

Concept Lattices (cf. Galois Lattices)

Taste: Sweet/Sour, Shape: Round/Long/.., Color: Red/Yellow/.., Texture: Smooth/Bumpy, ..

[Cross table and lattice diagram highlighting A1, A1′, A2, A2′ and the concepts C1 and C2, with extent and intent marked]

Where are C1 & C2 located?

Formal concept C1 = (A1, A1′): the set of objects that are „yellow“, „sweet“ and „smooth“.

slide-15
SLIDE 15


Formal Concept Analysis

Concept Lattices (cf. Galois Lattices)

Taste: Sweet/Sour, Shape: Round/Long/.., Color: Red/Yellow/.., Texture: Smooth/Bumpy, ..

[Cross table and lattice diagram highlighting A1 and A1′; axes labelled Intent (Attributes) and Extent (Objects)]

slide-16
SLIDE 16


Formal Concept Analysis

Concept Lattices

Def.: The concept lattice of a formal context (G, M, I) is the set of all formal concepts of (G, M, I), together with the partial order (A1, B1) ≤ (A2, B2) :⇔ A1 ⊆ A2 (⇔ B1 ⊇ B2). The concept lattice is denoted by B(G, M, I).

Theorem: The concept lattice is a lattice, i.e. for two concepts (A1, B1) and (A2, B2), there is always

  • a greatest common subconcept
  • and a least common superconcept

slide-17
SLIDE 17


Formal Concept Analysis

Greatest Common Subconcept

Which objects share the attributes „smooth“, „red“ and „sour“?

A: Grapes, Apples (greatest common subconcept, infimum)

  • a greatest common subconcept
  • and a least common superconcept

slide-18
SLIDE 18


Formal Concept Analysis

Least Common Superconcept

Which attributes do the objects „strawberries“ and „lemon“ share?

A: Bumpy, round (least common superconcept, supremum)

  • a greatest common subconcept
  • and a least common superconcept
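The two questions about the infimum and the supremum can be reproduced with a short sketch. It assumes the fruit context from the many-valued table in the scaling slide and uses the standard construction (intersect the extents for the infimum, intersect the intents for the supremum), which the slides do not spell out. The function names are hypothetical.

```python
# Assumed running example (hypothetical encoding of the fruit context).
ROWS = {
    "apple": {"red", "green", "yellow", "round", "sweet", "sour", "smooth"},
    "lemon": {"yellow", "round", "sour", "bumpy"},
    "banana": {"yellow", "long", "sweet", "smooth"},
    "strawberries": {"red", "round", "sweet", "bumpy"},
    "grapes": {"red", "green", "round", "sweet", "sour", "smooth"},
    "pear": {"yellow", "long", "sweet", "smooth"},
}
ALL_ATTRIBUTES = set().union(*ROWS.values())


def intent(objects):
    """A': attributes shared by all given objects."""
    shared = set(ALL_ATTRIBUTES)
    for g in objects:
        shared &= ROWS[g]
    return shared


def extent(attributes):
    """B': objects having all given attributes."""
    return {g for g, attrs in ROWS.items() if set(attributes) <= attrs}


def attribute_concept(m):
    """Concept generated by a single attribute m: ({m}', {m}'')."""
    a = extent({m})
    return frozenset(a), frozenset(intent(a))


def object_concept(g):
    """Concept generated by a single object g: ({g}'', {g}')."""
    b = intent({g})
    return frozenset(extent(b)), frozenset(b)


def meet(c1, c2):
    """Greatest common subconcept: intersect the extents, derive the intent."""
    a = c1[0] & c2[0]
    return frozenset(a), frozenset(intent(a))


def join(c1, c2):
    """Least common superconcept: intersect the intents, derive the extent."""
    b = c1[1] & c2[1]
    return frozenset(extent(b)), frozenset(b)


infimum = meet(meet(attribute_concept("smooth"), attribute_concept("red")),
               attribute_concept("sour"))
print(sorted(infimum[0]))   # ['apple', 'grapes']

supremum = join(object_concept("strawberries"), object_concept("lemon"))
print(sorted(supremum[1]))  # ['bumpy', 'round']
```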

slide-19
SLIDE 19


Formal Concept Analysis

Implications

Taste: Sweet/Sour, Shape: Round/Long/.., Color: Red/Yellow/.., Texture: Smooth/Bumpy, ..

Def.: An implication B1 → B2 holds in a context (G, M, I) if every object intent respects B1 → B2, i.e. if each object that has all the attributes in B1 also has all the attributes in B2. We also say B1 → B2 is an implication of (G, M, I).
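The definition translates into a one-line check. The sketch below assumes the fruit context from the many-valued table in the scaling slide; `holds` and `object_count` are hypothetical names, and reading the object count as the number of objects that satisfy the premise is an assumption about how the count is defined.

```python
# Assumed running example (hypothetical encoding of the fruit context).
ROWS = {
    "apple": {"red", "green", "yellow", "round", "sweet", "sour", "smooth"},
    "lemon": {"yellow", "round", "sour", "bumpy"},
    "banana": {"yellow", "long", "sweet", "smooth"},
    "strawberries": {"red", "round", "sweet", "bumpy"},
    "grapes": {"red", "green", "round", "sweet", "sour", "smooth"},
    "pear": {"yellow", "long", "sweet", "smooth"},
}


def holds(premise, conclusion):
    """B1 -> B2 holds if every object with all of B1 also has all of B2."""
    return all(conclusion <= attrs
               for attrs in ROWS.values() if premise <= attrs)


def object_count(premise):
    """Number of objects that have every attribute of the premise (assumed meaning of the count)."""
    return sum(1 for attrs in ROWS.values() if premise <= attrs)


print(holds({"bumpy"}, {"round"}))   # True: lemon and strawberries are both round
print(holds({"round"}, {"sweet"}))   # False: lemon is round but not sweet
print(holds({"smooth"}, {"sweet"}), object_count({"smooth"}))  # True 4
```

An implication whose premise no object satisfies holds trivially, so the count helps to separate well-supported implications from vacuous ones.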

slide-20
SLIDE 20


Formal Concept Analysis

Implications

Taste: Sweet/Sour, Shape: Round/Long/.., Color: Red/Yellow/.., Texture: Smooth/Bumpy, ..

[Figure not preserved; the only surviving label is „Object count f“]

slide-21
SLIDE 21


Formal Concept Analysis

Applications / AOL Search Query Logs

Implications:

slide-22
SLIDE 22


Formal Concept Analysis

Applications / Goal Tagging

Implications:

slide-23
SLIDE 23


Formal Concept Analysis

Applications / Bugs - Bugzilla

Implications:

slide-24
SLIDE 24


FCA / Scaling

What hasn't been mentioned yet: transforming many-valued into single-valued contexts

Many-valued context:

              Color             Shape  Taste       Texture
Apple         Red/green/yellow  Round  Sweet/sour  Smooth
Lemon         Yellow            Round  Sour        Bumpy
Banana        Yellow            Long   Sweet       Smooth
Strawberries  Red               Round  Sweet       Bumpy
Grapes        Red/green         Round  Sweet/sour  Smooth
Pear          Yellow            Long   Sweet       Smooth

Via scales, e.g. scale S1 for Color:

S1      red  green  yellow
red     X
green        X
yellow              X
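The scale S1 has crosses only on the diagonal, so each value becomes its own single-valued attribute. A minimal sketch of that transformation follows, assuming that entries such as „Red/green/yellow“ mean the object has all of the listed values; `MANY_VALUED` and `nominal_scale` are hypothetical names.

```python
# Many-valued context from the table: object -> attribute -> set of values.
# Treating "Red/green/yellow" as a set of values is an assumption.
MANY_VALUED = {
    "apple": {"color": {"red", "green", "yellow"}, "shape": {"round"},
              "taste": {"sweet", "sour"}, "texture": {"smooth"}},
    "lemon": {"color": {"yellow"}, "shape": {"round"},
              "taste": {"sour"}, "texture": {"bumpy"}},
    "banana": {"color": {"yellow"}, "shape": {"long"},
               "taste": {"sweet"}, "texture": {"smooth"}},
    "strawberries": {"color": {"red"}, "shape": {"round"},
                     "taste": {"sweet"}, "texture": {"bumpy"}},
    "grapes": {"color": {"red", "green"}, "shape": {"round"},
               "taste": {"sweet", "sour"}, "texture": {"smooth"}},
    "pear": {"color": {"yellow"}, "shape": {"long"},
             "taste": {"sweet"}, "texture": {"smooth"}},
}


def nominal_scale(many_valued):
    """Turn every (attribute, value) pair into one single-valued attribute."""
    return {
        obj: {f"{attr}={value}" for attr, values in row.items() for value in values}
        for obj, row in many_valued.items()
    }


single_valued = nominal_scale(MANY_VALUED)
print(sorted(single_valued["lemon"]))
# ['color=yellow', 'shape=round', 'taste=sour', 'texture=bumpy']
```

Other scale types, for example ordinal scales for ordered values, would need a different mapping; only the diagonal (nominal) case is sketched here.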

slide-25
SLIDE 25


Knowledge Acquisition

CIMIANO, HOTHO, & STAAB 2005

slide-26
SLIDE 26


Knowledge Acquisition

CIMIANO, HOTHO, & STAAB 2005

slide-27
SLIDE 27


Knowledge Acquisition

CIMIANO, HOTHO, & STAAB 2005

slide-28
SLIDE 28

Knowledge Acquisition

CIMIANO, HOTHO, & STAAB 2005

slide-29
SLIDE 29


Formal Concept Analysis

Lattice Construction Algorithms

A Naive Approach:

Ganter / Stumme 2003
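The algorithm shown on this slide is not preserved in the transcript. As a hedged illustration only, and not necessarily the approach from Ganter / Stumme, one simple way to enumerate all formal concepts is to close the attribute extents under intersection and then derive the intent of each extent. The fruit context is assumed from the many-valued table in the scaling slide, and all names are hypothetical.

```python
# Assumed running example (hypothetical encoding of the fruit context).
ROWS = {
    "apple": {"red", "green", "yellow", "round", "sweet", "sour", "smooth"},
    "lemon": {"yellow", "round", "sour", "bumpy"},
    "banana": {"yellow", "long", "sweet", "smooth"},
    "strawberries": {"red", "round", "sweet", "bumpy"},
    "grapes": {"red", "green", "round", "sweet", "sour", "smooth"},
    "pear": {"yellow", "long", "sweet", "smooth"},
}
ALL_ATTRIBUTES = set().union(*ROWS.values())


def intent(objects):
    """A': attributes shared by all given objects."""
    shared = set(ALL_ATTRIBUTES)
    for g in objects:
        shared &= ROWS[g]
    return shared


def all_concepts():
    """Enumerate all formal concepts by intersecting attribute extents."""
    extents = {frozenset(ROWS)}  # the full object set G is always an extent
    for m in ALL_ATTRIBUTES:
        m_extent = frozenset(g for g, attrs in ROWS.items() if m in attrs)
        # Intersect the new attribute extent with every extent found so far.
        extents |= {m_extent & e for e in extents}
    return {(a, frozenset(intent(a))) for a in extents}


concepts = all_concepts()
for a, b in sorted(concepts, key=lambda c: (len(c[0]), sorted(c[0]))):
    print(sorted(a), sorted(b))
```

This works because every extent is an intersection of attribute extents; it is simple but can be slow, since the number of concepts may grow exponentially with the size of the context.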

slide-30
SLIDE 30


Formal Concept Analysis

Lattice Construction Algorithms

Example:

(Formal Concepts) Ganter / Stumme 2003

slide-31
SLIDE 31


Formal Concept Analysis

Lattice Construction Algorithms

A Naive Approach:

Ganter / Stumme 2003

slide-32
SLIDE 32


Formal Concept Analysis

Lattice Construction Algorithms

Example:

„5“ is a subconcept of „10“:

FC(5): ({T2,T7},{e}) ≤ FC(10): ({T1,T2,T3,T4,T5,T6,T7},∅)

Why is there no line between FC 4 and FC 7?

(∅,{a,b,c,d,e}) ≤ ({T4},{a,b,c})

A circle for a concept is always positioned higher than all circles for its proper subconcepts.

Ganter / Stumme 2003

slide-33
SLIDE 33


Formal Concept Analysis

Lattice Construction Algorithms

Example ctd.:

Reduced Labeling (top to bottom)

[Line diagram of the concept lattice with the concepts numbered 1 to 10; Ganter / Stumme 2003]

slide-34
SLIDE 34


Formal Concept Analysis

Tools, e.g.:

  • ConExp http://conexp.sourceforge.net/index.html
  • Networks.tb http://networks-tb.sourceforge.net/
  • JaLaBa http://maarten.janssenweb.net/jalaba/JaLaBA.pl

Further Information

  • FCA Homepage http://www.upriss.org.uk/fca/fca.html


slide-35
SLIDE 35


Formal Concept Analysis


  • ConExp http://conexp.sourceforge.net/index.html


slide-36
SLIDE 36


Bonus Task

  • Sketch a categorization example
  • Define a Formal Context, for which |G| >= 10, |M| >= 10 and |I| ~ |G| + |M|
  • Use ConExp to
    – Name all elements of G and M (choose plausible G and M)
    – Represent your formal context (choose plausible I)
    – Draw the Concept Lattice
    – Calculate Implications (there should be at least one implication with f > 3)
  • Submit
    – A one-page .pdf file that contains 1) your context, 2) the laid-out(!) lattice, 3) the top 10 implications and 4) a brief interpretation of your example
  • Name the pdf file using the following syntax: „GWM11-BT2-YOURMATR-YOURLASTNAME.pdf“
    – To me via e-mail using subject „[GWM11-BT2-YOURMATR]“
    – before the beginning of next week's class

slide-37
SLIDE 37


Any questions?

See you next week!