Formal Concept Analysis II: Closure Systems and Implications


  1. Formal Concept Analysis II: Closure Systems and Implications. Robert Jäschke, Asmelash Teka Hadgu. FG Wissensbasierte Systeme/L3S Research Center, Leibniz Universität Hannover. Slides based on a lecture by Prof. Gerd Stumme. Robert Jäschke (FG KBS), Formal Concept Analysis, 1 / 25

  2. Agenda. 4 Implications: Attribute Logic; Concept Intents and Implications; Implications and Closure Systems; Pseudo-Intents and the Stem Base; Computing the Stem Base With Next Closure; Bases of Association Rules.

  3. Implications. Def.: An implication $X \to Y$ holds in a context if every object that has all attributes from $X$ also has all attributes from $Y$. [Figure: diagram labeled with the attributes NPS Guided Tours, Hiking, Fishing, Horseback Riding, Swimming, Boating, Bicycle Trail, Cross Country Ski Trail and the objects Muir Woods, John Muir, Pinnacles, Lava Beds, Fort Point, Joshua Tree, Cabrillo, Channel Islands, Death Valley, Devils Postpile, Kings Canyon, Redwood, Sequoia, Golden Gate, Point Reyes, Lassen Volcanic, Santa Monica Mountains, Yosemite, Whiskeytown-Shasta-Trinity.] Examples: {Swimming} → {Hiking}; {Boating} → {Swimming, Hiking, NPS Guided Tours, Fishing, Horseback Riding}; {Bicycle Trail, NPS Guided Tours} → {Swimming, Hiking, Horseback Riding}.
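
The definition on this slide can be checked mechanically. The following is a minimal Python sketch; the three-object context and the function name `holds` are hypothetical illustrations, not the parks data from the figure.

```python
# Hypothetical toy context (not from the slides): object -> set of its attributes.
context = {
    "g1": {"a", "b"},
    "g2": {"a", "c"},
    "g3": {"a", "b", "c"},
}

def holds(context, premise, conclusion):
    """True if every object that has all attributes of `premise`
    also has all attributes of `conclusion`."""
    return all(
        conclusion <= attrs
        for attrs in context.values()
        if premise <= attrs
    )

print(holds(context, {"b"}, {"a"}))  # True: g1 and g3 have b, and both have a
print(holds(context, {"a"}, {"b"}))  # False: g2 has a but not b
```

Note that an implication whose premise is shared by no object holds vacuously under this definition.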

  4. Attribute Logic. [Figure: attributes overlap, disjoint, parallel, common vertex, common segment, common edge.] We are dealing with implications over a possibly infinite set of objects!

  5. Concept Intents and Implications. Def.: A subset $T \subseteq M$ respects an implication $A \to B$ if $A \nsubseteq T$ or $B \subseteq T$ holds. (We then also say that $T$ is a model of $A \to B$.) $T$ respects a set $\mathcal{L}$ of implications if $T$ respects every implication in $\mathcal{L}$. Lemma: An implication $A \to B$ holds in a context iff $B \subseteq A''$ ($\Leftrightarrow A' \subseteq B'$). It is then respected by all concept intents.
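
To make the lemma concrete, here is a small Python sketch of the derivation operators and the "respects" test. The context and all names (`CONTEXT`, `extent`, `intent`, `respects`, `holds`) are assumptions chosen for illustration.

```python
from itertools import combinations

# Hypothetical toy context (not from the slides): object -> attributes.
CONTEXT = {"g1": {"a", "b"}, "g2": {"a", "c"}, "g3": {"a", "b", "c"}}
M = {"a", "b", "c"}

def extent(attrs):
    """A': all objects that have every attribute in attrs."""
    return {g for g, has in CONTEXT.items() if attrs <= has}

def intent(objs):
    """Attributes shared by every object in objs (all of M if objs is empty)."""
    shared = set(M)
    for g in objs:
        shared &= CONTEXT[g]
    return shared

def respects(T, A, B):
    """T respects A -> B if A is not a subset of T, or B is a subset of T."""
    return not A <= T or B <= T

def holds(A, B):
    """Lemma: A -> B holds in the context iff B is a subset of A''."""
    return B <= intent(extent(A))

# All concept intents of this small context, obtained as X'' for every subset X of M.
intents = {
    frozenset(intent(extent(set(combo))))
    for r in range(len(M) + 1)
    for combo in combinations(sorted(M), r)
}

print(holds({"b"}, {"a"}))                               # True
print(all(respects(T, {"b"}, {"a"}) for T in intents))   # True
print(respects({"b"}, {"b"}, {"a"}))                     # False: {b} is not an intent
```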

  6. Implications and Closure Systems. Lemma: If $\mathcal{L}$ is a set of implications in $M$, then $\mathrm{Mod}(\mathcal{L}) := \{ X \subseteq M \mid X \text{ respects } \mathcal{L} \}$ is a closure system on $M$. The respective closure operator $X \mapsto \mathcal{L}(X)$ is constructed in the following way: For a set $X \subseteq M$, let $X^{\mathcal{L}} := X \cup \bigcup \{ B \mid A \to B \in \mathcal{L},\ A \subseteq X \}$. We form the sets $X^{\mathcal{L}}, X^{\mathcal{L}\mathcal{L}}, X^{\mathcal{L}\mathcal{L}\mathcal{L}}, \dots$ until a set $\mathcal{L}(X) := X^{\mathcal{L} \dots \mathcal{L}}$ is obtained with $\mathcal{L}(X)^{\mathcal{L}} = \mathcal{L}(X)$ (i.e., a fixpoint). (Footnote: If $M$ is infinite, this may require infinitely many iterations.) $\mathcal{L}(X)$ is then the closure of $X$ for the closure system $\mathrm{Mod}(\mathcal{L})$.
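
The fixpoint construction above can be sketched in a few lines of Python for a finite attribute set. The function names `step` and `closure` and the two example implications are hypothetical; implications are represented as (premise, conclusion) pairs of sets.

```python
def step(X, implications):
    """One pass X -> X^L: add the conclusion B of every A -> B whose premise A is contained in X."""
    result = set(X)
    for premise, conclusion in implications:
        if premise <= X:
            result |= conclusion
    return result

def closure(X, implications):
    """Iterate X^L, X^LL, ... until a fixpoint L(X) is reached (terminates for finite M)."""
    current = set(X)
    while True:
        nxt = step(current, implications)
        if nxt == current:
            return current
        current = nxt

# Hypothetical implications over M = {a, b, c, d}: a -> b and {b, c} -> d.
L = [({"a"}, {"b"}), ({"b", "c"}, {"d"})]

print(sorted(closure({"a", "c"}, L)))  # ['a', 'b', 'c', 'd'] (needs two passes)
print(sorted(closure({"c"}, L)))       # ['c'] (already respects L)
```

The first call shows why a single pass is not enough: `d` only becomes derivable after `b` has been added.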

  7. Implications and Closure Systems. Def.: An implication $A \to B$ follows (semantically) from a set $\mathcal{L}$ of implications in $M$ if each subset of $M$ respecting $\mathcal{L}$ also respects $A \to B$. A family $\mathcal{L}$ of implications is called closed if every implication following from $\mathcal{L}$ is already contained in $\mathcal{L}$. Lemma: A set $\mathcal{L}$ of implications in $M$ is closed iff the following conditions (Armstrong Rules) are satisfied for all $W, X, Y, Z \subseteq M$: (1) $X \to X \in \mathcal{L}$, (2) if $X \to Y \in \mathcal{L}$, then $X \cup Z \to Y \in \mathcal{L}$, (3) if $X \to Y \in \mathcal{L}$ and $Y \cup Z \to W \in \mathcal{L}$, then $X \cup Z \to W \in \mathcal{L}$. Remark: You should know these rules from the database lecture!
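
Semantic entailment can be tested in two ways for a finite $M$: directly from the definition, by enumerating all subsets, or via the standard equivalence that $A \to B$ follows from $\mathcal{L}$ iff $B \subseteq \mathcal{L}(A)$. The Python sketch below implements both; all names and the example implications are hypothetical.

```python
from itertools import combinations

def closure(X, implications):
    """Closure L(X): apply implications until nothing changes."""
    current = set(X)
    while True:
        nxt = set(current)
        for premise, conclusion in implications:
            if premise <= current:
                nxt |= conclusion
        if nxt == current:
            return current
        current = nxt

def respects(T, A, B):
    return not A <= T or B <= T

def follows_by_definition(M, implications, A, B):
    """Every subset of M that respects all implications must also respect A -> B."""
    for r in range(len(M) + 1):
        for combo in combinations(sorted(M), r):
            T = set(combo)
            if all(respects(T, P, C) for P, C in implications) and not respects(T, A, B):
                return False
    return True

def follows(implications, A, B):
    """Equivalent test: B is contained in the closure of A."""
    return B <= closure(A, implications)

M = {"a", "b", "c", "d"}
L = [({"a"}, {"b"}), ({"b", "c"}, {"d"})]

# Armstrong rule 3 with X={a}, Y={b}, Z={c}, W={d}: {a, c} -> {d} follows from L.
print(follows(L, {"a", "c"}, {"d"}), follows_by_definition(M, L, {"a", "c"}, {"d"}))  # True True
print(follows(L, {"c"}, {"d"}), follows_by_definition(M, L, {"c"}, {"d"}))            # False False
```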

  8. Pseudo-Intents and the Stem Base. Def.: A set $\mathcal{L}$ of implications of a context $(G, M, I)$ is called complete if every implication that holds in $(G, M, I)$ follows from $\mathcal{L}$. A set $\mathcal{L}$ of implications is called non-redundant if no implication in $\mathcal{L}$ follows from the other implications in $\mathcal{L}$. Def.: $P \subseteq M$ is called a pseudo-intent of $(G, M, I)$ if $P \neq P''$ and $Q'' \subseteq P$ holds for every pseudo-intent $Q \subsetneq P$. Theorem: The set of implications $\mathcal{L} := \{ P \to P'' \mid P \text{ is a pseudo-intent} \}$ is non-redundant and complete. We call $\mathcal{L}$ the stem base.
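
Because the definition of a pseudo-intent is recursive over proper subsets, it can be evaluated by brute force when $M$ is small: visit the subsets of $M$ by increasing size, so that every smaller pseudo-intent is already known. The Python sketch below does this for a hypothetical four-attribute context; it is exponential in $|M|$ and only meant to illustrate the definition.

```python
from itertools import combinations

# Hypothetical toy context (not from the slides).
CONTEXT = {"g1": {"a", "b"}, "g2": {"c"}, "g3": {"a", "b", "c", "d"}}
M = {"a", "b", "c", "d"}

def double_prime(attrs):
    """attrs'': intersect the attribute sets of all objects that have attrs (M if none does)."""
    result = set(M)
    for has in CONTEXT.values():
        if attrs <= has:
            result &= has
    return result

def pseudo_intents():
    found = []
    for r in range(len(M) + 1):                    # smaller sets first
        for combo in combinations(sorted(M), r):
            P = set(combo)
            if double_prime(P) == P:
                continue                           # P is an intent, not a pseudo-intent
            # Q < P tests for a proper subset.
            if all(double_prime(Q) <= P for Q in found if Q < P):
                found.append(P)
    return found

for P in pseudo_intents():
    print(sorted(P), "->", sorted(double_prime(P)))
# ['a'] -> ['a', 'b']
# ['b'] -> ['a', 'b']
# ['d'] -> ['a', 'b', 'c', 'd']
# ['a', 'b', 'c'] -> ['a', 'b', 'c', 'd']
```

In this example `{a, c}` is not a pseudo-intent although it is not closed: the pseudo-intent `{a}` is a proper subset, but its closure `{a, b}` is not contained in `{a, c}`.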

  9. Pseudo-Intents and the Stem Base. Example: membership of developing countries in supranational groups (Source: Lexikon Dritte Welt. Rowohlt-Verlag, Reinbek 1993).

  12. Pseudo-Intents and the Stem Base. Stem base of the developing countries context: {OPEC} → {Group of 77, Non-Aligned}; {MSAC} → {Group of 77}; {Non-Aligned} → {Group of 77}; {Group of 77, Non-Aligned, MSAC, OPEC} → {LLDC, AKP}; {Group of 77, Non-Aligned, LLDC, OPEC} → {MSAC, AKP}.

  13. Computing the Stem Base With Next Closure. The computation is based on the following theorem. Theorem: The set of all concept intents and pseudo-intents is a closure system. The corresponding closure operator is given as follows: Starting with a set $X$, we compute $X^{\mathcal{L}^\bullet} := X \cup \bigcup \{ B \mid A \to B \in \mathcal{L},\ A \subset X,\ A \neq X \}$, then $X^{\mathcal{L}^\bullet \mathcal{L}^\bullet} := X^{\mathcal{L}^\bullet} \cup \bigcup \{ B \mid A \to B \in \mathcal{L},\ A \subset X^{\mathcal{L}^\bullet},\ A \neq X^{\mathcal{L}^\bullet} \}$, etc., until we reach a set $\mathcal{L}^\bullet(X)$ with $\mathcal{L}^\bullet(X) = \mathcal{L}^\bullet(X)^{\mathcal{L}^\bullet}$. This is then the wanted intent or pseudo-intent.
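
The only difference from the ordinary closure under implications is that a premise must be a proper subset of the current set. A minimal Python sketch, with the hypothetical name `bullet_closure` and a one-implication example:

```python
def bullet_closure(X, implications):
    """L•(X): like the closure under implications, but an implication A -> B
    fires only if A is a proper subset of the current set."""
    current = set(X)
    while True:
        nxt = set(current)
        for premise, conclusion in implications:
            if premise < current:          # proper subset: premise != current
                nxt |= conclusion
        if nxt == current:
            return current
        current = nxt

# Hypothetical stem-base fragment: {a} -> {a, b}.
L = [({"a"}, {"a", "b"})]

print(sorted(bullet_closure({"a"}, L)))       # ['a'] (the premise itself stays fixed)
print(sorted(bullet_closure({"a", "c"}, L)))  # ['a', 'b', 'c']
```

The first call illustrates why pseudo-intents are closed under this operator: the premise `{a}` is not a proper subset of itself, so its own implication does not fire.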

  14. Computing the Stem Base With Next Closure. The algorithm Next Closure to compute all concept intents and the stem base: (1) The set $\mathcal{L}$ of all implications is initialized to $\mathcal{L} = \emptyset$. (2) The lectically first concept intent or pseudo-intent is $\emptyset$. (3) If $A$ is an intent or a pseudo-intent, the lectically next intent/pseudo-intent is computed by checking all $i \in M \setminus A$ in descending order, until $A <_i \mathcal{L}^\bullet(A + i)$ holds. Then $\mathcal{L}^\bullet(A + i)$ is the next intent or pseudo-intent. (4) If $\mathcal{L}^\bullet(A + i) = (\mathcal{L}^\bullet(A + i))''$ holds, then $\mathcal{L}^\bullet(A + i)$ is a concept intent; otherwise it is a pseudo-intent and the implication $\mathcal{L}^\bullet(A + i) \to (\mathcal{L}^\bullet(A + i))''$ is added to $\mathcal{L}$. (5) If $\mathcal{L}^\bullet(A + i) = M$, finish. Else, set $A \leftarrow \mathcal{L}^\bullet(A + i)$ and continue with Step 3.
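
The five steps can be sketched in Python as follows. This is an illustration under stated assumptions, not the lecture's reference implementation: it assumes the usual Next Closure conventions that $A + i$ means $(A \cap \{m_1, \dots, m_{i-1}\}) \cup \{m_i\}$ for a fixed linear order on $M$, and that $A <_i B$ means $m_i \in B \setminus A$ while $A$ and $B$ agree on all attributes before $m_i$. The context and all function names are hypothetical.

```python
def stem_base(context, attributes):
    """Return (intents, implications) in lectic order; `attributes` fixes the linear order on M."""
    M = list(attributes)
    full = set(M)

    def double_prime(X):
        result = set(full)
        for has in context.values():
            if X <= has:
                result &= has
        return result

    def bullet_closure(X, implications):
        current = set(X)
        while True:
            nxt = set(current)
            for premise, conclusion in implications:
                if premise < current:              # proper subset only
                    nxt |= conclusion
            if nxt == current:
                return current
            current = nxt

    def next_closed(A, implications):
        for i in range(len(M) - 1, -1, -1):        # step 3: descending order
            if M[i] in A:
                continue
            before = set(M[:i])
            candidate = bullet_closure((A & before) | {M[i]}, implications)
            if not (candidate - A) & before:       # A <_i candidate
                return candidate
        return None

    intents, implications = [], []                 # step 1
    A = set()                                      # step 2
    while A is not None:
        closed = double_prime(A)
        if closed == A:                            # step 4: intent or pseudo-intent?
            intents.append(A)
        else:
            implications.append((A, closed))
        if A == full:                              # step 5
            break
        A = next_closed(A, implications)
    return intents, implications

# Hypothetical toy context (not the example from the slides).
CONTEXT = {"g1": {"a", "b"}, "g2": {"c"}, "g3": {"a", "b", "c", "d"}}
intents, implications = stem_base(CONTEXT, ["a", "b", "c", "d"])
print([sorted(x) for x in intents])
# [[], ['c'], ['a', 'b'], ['a', 'b', 'c', 'd']]
print([(sorted(p), sorted(c)) for p, c in implications])
# [(['d'], ['a', 'b', 'c', 'd']), (['b'], ['a', 'b']), (['a'], ['a', 'b']), (['a', 'b', 'c'], ['a', 'b', 'c', 'd'])]
```

The sketch also tests the first set $\emptyset$ for being an intent or a pseudo-intent before searching for its successor, which the step list leaves implicit.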

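  15. Computing the Stem Base With Next Closure. Example: a context with the objects 1, 2, 3 and the attributes a, b, c, e (object 1 has two attributes, object 2 has two, object 3 has three; the column positions of the crosses were lost in extraction). Columns of the worked table: $A$ | $i$ | $A + i$ | $\mathcal{L}^\bullet(A + i)$ | $A <_i \mathcal{L}^\bullet(A + i)$? | $(\mathcal{L}^\bullet(A + i))''$ | $\mathcal{L}$ | new intent.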

  17. Bases of Association Rules. Example: {veil color: white, gill spacing: close} → {gill attachment: free}, support: 78.52%, confidence: 99.60%. The input data to compute association rules can be represented as a formal context $(G, M, I)$: $M$ is a set of items (things, products of a market basket), $G$ contains the transaction ids, and the relation $I$ is the list of transactions.

  18. Bases of Association Rules. The support of an implication is the fraction of all objects that have all attributes from the premise and the conclusion. (Repetition: the support of an attribute set $X \subseteq M$ is $\mathrm{supp}(X) := \frac{|X'|}{|G|}$.) Def.: The support of a rule $X \to Y$ is given by $\mathrm{supp}(X \to Y) := \mathrm{supp}(X \cup Y)$. The confidence is the fraction of all objects that fulfill both the premise and the conclusion among those objects that fulfill the premise. Def.: The confidence of a rule $X \to Y$ is given by $\mathrm{conf}(X \to Y) := \frac{\mathrm{supp}(X \cup Y)}{\mathrm{supp}(X)}$.
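
Both definitions translate directly into code. The Python sketch below uses exact fractions to avoid rounding; the four market-basket transactions and the function names are hypothetical.

```python
from fractions import Fraction

# Hypothetical transactions: transaction id (object) -> items (attributes).
TRANSACTIONS = {
    "t1": {"bread", "milk"},
    "t2": {"bread", "butter"},
    "t3": {"bread", "milk", "butter"},
    "t4": {"milk"},
}

def supp(X):
    """supp(X) = |X'| / |G|: fraction of transactions containing all items of X."""
    matching = sum(1 for items in TRANSACTIONS.values() if X <= items)
    return Fraction(matching, len(TRANSACTIONS))

def rule_supp(X, Y):
    """supp(X -> Y) = supp(X ∪ Y)."""
    return supp(X | Y)

def conf(X, Y):
    """conf(X -> Y) = supp(X ∪ Y) / supp(X); undefined (ZeroDivisionError) if supp(X) = 0."""
    return supp(X | Y) / supp(X)

print(rule_supp({"bread"}, {"milk"}))  # 1/2
print(conf({"bread"}, {"milk"}))       # 2/3
```

A rule with confidence 1 is exactly an implication that holds in the context.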

  19. Bases of Association Rules. Classical data mining task: For given $\mathit{minsupp}, \mathit{minconf} \in [0, 1]$, find all rules with a support and confidence above these bounds. Our task: finding a base of rules, i.e., a minimal set of rules from which all other rules follow.

  20. Bases of Association Rules. From $B' = B'''$ follows $\mathrm{supp}(B) = \frac{|B'|}{|G|} = \frac{|B'''|}{|G|} = \mathrm{supp}(B'')$. Theorem: $X \to Y$ and $X'' \to Y''$ have the same support and the same confidence. To compute all association rules it is thus sufficient to compute the support of all frequent sets $B$ with $B = B''$ (i.e., the intents of the iceberg concept lattice).
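
The theorem can be derived in three lines from the definitions of support and confidence. The sketch below additionally uses the standard identity $(A \cup B)' = A' \cap B'$ for the derivation operator, which is not stated on the slide.

```latex
\begin{align*}
(X \cup Y)' &= X' \cap Y' = X''' \cap Y''' = (X'' \cup Y'')' \\
\mathrm{supp}(X \to Y) &= \frac{|(X \cup Y)'|}{|G|}
  = \frac{|(X'' \cup Y'')'|}{|G|} = \mathrm{supp}(X'' \to Y'') \\
\mathrm{conf}(X \to Y) &= \frac{\mathrm{supp}(X \cup Y)}{\mathrm{supp}(X)}
  = \frac{\mathrm{supp}(X'' \cup Y'')}{\mathrm{supp}(X'')} = \mathrm{conf}(X'' \to Y'')
\end{align*}
```

The last line uses $\mathrm{supp}(X) = \mathrm{supp}(X'')$, which is the statement at the start of the slide with $B = X$.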

  21. Bases of Association Rules: The Benefit of Iceberg Concept Lattices (Compared to Frequent Itemsets). [Figure: iceberg concept lattice for minsupp = 70% with the attribute labels veil type: partial, gill attachment: free, ring number: one, veil color: white, gill spacing: close; node supports: 100 %, 92.30 %, 97.43 %, 97.62 %, 81.08 %, 90.02 %, 97.34 %, 76.81 %, 78.80 %, 89.92 %, 78.52 %, 74.52 %.] 32 frequent itemsets are represented by 12 frequent concept intents. ➞ more efficient computation (e.g., Titanic) ➞ fewer rules (without loss of information!)
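
The reduction from frequent itemsets to frequent concept intents can be reproduced on a small scale. The Python sketch below enumerates all frequent itemsets by brute force and collapses them to their closures; the four transactions, the threshold, and the function names are hypothetical, and the enumeration is exponential rather than an efficient algorithm such as Titanic.

```python
from itertools import combinations

# Hypothetical transactions (not the mushroom data from the figure).
CONTEXT = {"t1": {"a", "b"}, "t2": {"a", "b", "c"}, "t3": {"a", "b", "c"}, "t4": {"c"}}
ITEMS = {"a", "b", "c"}

def supp(X):
    return sum(1 for has in CONTEXT.values() if X <= has) / len(CONTEXT)

def double_prime(X):
    """X'': items common to all transactions that contain X."""
    result = set(ITEMS)
    for has in CONTEXT.values():
        if X <= has:
            result &= has
    return result

def frequent_itemsets(minsupp):
    return [
        frozenset(combo)
        for r in range(len(ITEMS) + 1)
        for combo in combinations(sorted(ITEMS), r)
        if supp(set(combo)) >= minsupp
    ]

def frequent_intents(minsupp):
    """Closures of the frequent itemsets; each has the same support as the sets it represents."""
    return {frozenset(double_prime(set(X))) for X in frequent_itemsets(minsupp)}

itemsets = frequent_itemsets(0.5)
intents = frequent_intents(0.5)
print(len(itemsets), len(intents))        # 8 4
print(sorted(sorted(x) for x in intents))
# [[], ['a', 'b'], ['a', 'b', 'c'], ['c']]
```

Here eight frequent itemsets collapse to four frequent intents, the same effect the slide reports as 32 versus 12.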

  22. Bases of Association Rules: The Benefit of Iceberg Concept Lattices (Compared to Frequent Itemsets). [Figure: iceberg concept lattice with the attribute labels gill attachment: free, veil type: partial, veil color: white, gill spacing: close, ring number: one; edge confidences: 97.6%, 97.4%, 97.2%, 97.5%, 99.9%, 99.7%, 99.6%, 99.9%, 97.0%.] Association rules can be visualized in the (iceberg) concept lattice: exact association rules (implications) have conf = 100%; (approximate) association rules have conf < 100%.
