Preprocessing input data for machine learning by FCA
Jan OUTRATA
Dept. Computer Science
Palacký University, Olomouc, Czech Republic
CLA 2010, Oct 19–21, Sevilla
Jan Outrata (Palacký University) Preprocessing input data . . . CLA 2010 1 / 24
– introduction and related work
– preliminaries on Boolean Factor Analysis (BFA) and decision trees
– preprocessing input data using BFA
– example
– experimental evaluation
– conclusions and future research
– FCA is often used for data preprocessing for (other) DM or ML methods to improve their results
– results of DM and ML methods depend on the structure of the data = on the attributes in the case of object-attribute data
– data preprocessing . . . transformation of attributes

Our approach:
– formal concepts are used to create new attributes
– which ones? → factor concepts obtained by Boolean Factor Analysis (BFA, described by FCA by Belohlavek, Vychodil, 2006)
– new attributes = factors, either
  1. added to the original attributes, or
  2. replacing the original attributes . . . reduction of the dimensionality of the data (fewer factors)

Main question: can factors describe the input data better for DM/ML methods?
Related work (focused on decision tree induction):
– constructive induction / feature construction . . . new attributes as conjunctions/disjunctions, arithmetic operations, etc. of original attributes
– oblique decision trees . . . multiple attributes used in a splitting condition (e.g. linear combinations)
– work utilizing FCA? → construction of the whole learning model (lattice-based/concept-based learning, Mephu Nguifo et al., Kuznetsov and others)
Boolean Factor Analysis (BFA) = decomposition of a (binary) object-attribute data matrix I into the Boolean product of an object-factor matrix A and a factor-attribute matrix B:

I_ij = (A ∘ B)_ij = ⋁_{l=1}^{k} A_il · B_lj

A_il = 1 . . . factor l applies to object i
B_lj = 1 . . . attribute j is one of the manifestations of factor l
(A ∘ B)_ij . . . "object i has attribute j if and only if there is a factor l such that l applies to i and j is one of the manifestations of l"
factors ≈ new attributes

Problem: find a number k of factors as small as possible

[Example: a Boolean data matrix decomposed into the product of an object-factor matrix and a factor-attribute matrix]
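The Boolean product can be sketched in a few lines of Python. The matrices below are a made-up illustration (not the slide's example): a 4 × 5 data matrix expressed by k = 2 factors.

```python
def boolean_product(A, B):
    """Boolean matrix product: (A o B)_ij = OR over l of (A_il AND B_lj)."""
    k = len(B)       # number of factors
    m = len(B[0])    # number of attributes
    return [[int(any(row[l] and B[l][j] for l in range(k)))
             for j in range(m)] for row in A]

# Hypothetical object-factor and factor-attribute matrices:
A = [[1, 0],
     [1, 1],
     [0, 1],
     [0, 1]]
B = [[1, 1, 0, 0, 1],
     [0, 0, 1, 1, 1]]
I = boolean_product(A, B)   # reconstructs the 4x5 data matrix
```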
Belohlavek R., Vychodil V.: Discovery of optimal factors in binary data via a novel method of matrix decomposition. J. Comput. System Sci. 76(1) (2010), 3–20.

Matrices A and B can be constructed from a set F of formal concepts of the input data ⟨X, Y, I⟩, so-called factor concepts:

F = {⟨A_1, B_1⟩, . . . , ⟨A_k, B_k⟩} ⊆ B(X, Y, I)
l-th column of A_F = characteristic vector of A_l
l-th row of B_F = characteristic vector of B_l

Decomposition using formal concepts to determine factors is optimal:

Theorem
Let I = A ∘ B for n × k and k × m binary matrices A and B. Then there exists a set F ⊆ B(X, Y, I) of formal concepts of I with |F| ≤ k such that for the n × |F| and |F| × m binary matrices A_F and B_F we have I = A_F ∘ B_F.
. . . vector in the Boolean space {0, 1}^k of factors, row of A

Mappings g: {0, 1}^m → {0, 1}^k and h: {0, 1}^k → {0, 1}^m:

(g(P))_l = ⋀_{j=1}^{m} (B_lj → P_j)
(h(Q))_j = ⋁_{l=1}^{k} (Q_l · B_lj)

(g(P))_l = 1 iff the l-th row of B is included in P
(h(Q))_j = 1 iff attribute j is a manifestation of at least one factor from Q
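The two mappings are direct to implement. A sketch (B is the factor-attribute matrix; the example matrix is hypothetical, reused for illustration only):

```python
def g(P, B):
    """Attribute space -> factor space:
    (g(P))_l = 1 iff the l-th row of B is included in P."""
    return [int(all(P[j] >= B_l[j] for j in range(len(P)))) for B_l in B]

def h(Q, B):
    """Factor space -> attribute space:
    (h(Q))_j = 1 iff attribute j is a manifestation of some factor in Q."""
    m = len(B[0])
    return [int(any(q and B_l[j] for q, B_l in zip(Q, B))) for j in range(m)]

# Hypothetical factor-attribute matrix: k = 2 factors, m = 5 attributes
B = [[1, 1, 0, 0, 1],
     [0, 0, 1, 1, 1]]
P = [1, 1, 0, 0, 1]   # object in the attribute space
Q = g(P, B)           # its description in the factor space
```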
Decision tree . . . approximate representation of a (finite-valued) function
. . . the function is described by assignment of class labels to vectors of attribute values – used for classification of vectors (objects) into classes
A     B    C      f(A, B, C)
good  yes  false  yes
good  no   false  no
bad   no   false  no
good  no   true   yes
bad   yes  true   yes

[Decision tree diagram: root test on B (yes → class yes), then a test on C (true → yes, false → no)]
non-leaf tree node . . . test on a splitting attribute . . . covered collection of objects is split under the possible outcomes of the test (= values of the splitting attribute) leaf tree node . . . covers (majority of) objects with the same class label
Decision tree induction problem . . . to construct a decision tree that
1. approximates well the function described by (few) objects (training data)
2. classifies well "unseen" objects (testing data)

Algorithms:
– common strategy: recursively splitting tree nodes (collections of objects)
– the problem of selection of a splitting attribute ⇒ local optimization
– selection criteria . . . based on measures defined in terms of the class distribution of objects in nodes before and after splitting → entropy and information gain measures, Gini index, classification error etc.
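The entropy-based selection criterion can be sketched as follows, using the A/B/C example table from the earlier slide (lower weighted entropy after splitting means a better splitting attribute):

```python
from collections import Counter
from math import log2

def entropy(labels):
    """Usual class entropy of a collection of objects."""
    n = len(labels)
    return -sum(c / n * log2(c / n) for c in Counter(labels).values())

def split_entropy(values, labels):
    """Weighted sum of entropies of the subcollections obtained by
    splitting on an attribute; the lower, the better the attribute."""
    n = len(labels)
    total = 0.0
    for v in set(values):
        sub = [l for x, l in zip(values, labels) if x == v]
        total += len(sub) / n * entropy(sub)
    return total

# Attribute B and class f(A, B, C) from the example table:
B_vals = ["yes", "no", "no", "no", "yes"]
f_vals = ["yes", "no", "no", "yes", "yes"]
print(split_entropy(B_vals, f_vals))
```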
Attributes in ML: logical, categorical (nominal), ordinal, numerical, . . .
Attributes in FCA: logical – binary (yes/no) or graded
→ transformation . . . conceptual scaling (Ganter, Wille)
Note: we need not transform the class attribute.
Name        body temp.  gives birth  four-legged  hibernates  mammal
cat         warm        yes          yes          no          yes
bat         warm        yes          no           yes         yes
salamander  cold        no           yes          yes         no
eagle       warm        no           no           no          no
guppy       cold        yes          no           no          no

After conceptual scaling:

Name        bt cold  bt warm  gb no  gb yes  fl no  fl yes  hb no  hb yes  mammal
cat         0        1        0      1       0      1       1      0       yes
bat         0        1        0      1       1      0       0      1       yes
salamander  1        0        1      0       0      1       0      1       no
eagle       0        1        1      0       1      0       1      0       no
guppy       1        0        0      1       1      0       1      0       no

mammal . . . class label
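Nominal conceptual scaling of such a table can be sketched as below. The short attribute keys (bt, gb, fl, hb) follow the example; the helper name is hypothetical.

```python
def nominal_scale(rows, attributes):
    """Nominal scaling: each (attribute, value) pair becomes one binary
    attribute; an object gets 1 iff it has that value."""
    scaled = [(a, v) for a in attributes
              for v in sorted({row[a] for row in rows})]
    table = [[int(row[a] == v) for a, v in scaled] for row in rows]
    return scaled, table

animals = [
    {"bt": "warm", "gb": "yes", "fl": "yes", "hb": "no"},   # cat
    {"bt": "warm", "gb": "yes", "fl": "no",  "hb": "yes"},  # bat
    {"bt": "cold", "gb": "no",  "fl": "yes", "hb": "yes"},  # salamander
    {"bt": "warm", "gb": "no",  "fl": "no",  "hb": "no"},   # eagle
    {"bt": "cold", "gb": "yes", "fl": "no",  "hb": "no"},   # guppy
]
scaled_attrs, binary_table = nominal_scale(animals, ["bt", "gb", "fl", "hb"])
```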
Recall: new attributes (= factors) are added to the original attributes
1. decompose the input data matrix I into a matrix A describing objects X by factors F and a matrix B explaining factors F by attributes Y
2. new attributes Y′ = Y ∪ F
3. extended data table I′ ⊆ X × Y′: I′ ∩ (X × Y) = I and I′ ∩ (X × F) = A

Original decomposition (using FCA):
– decomposition aim: the number of factors as small as possible
– existing approximation algorithm (Belohlavek, Vychodil): greedy search for factor concepts which cover the largest area of still-uncovered 1s in the input data table
– function of optimality of a factor concept = "cover ability"
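A much-simplified sketch of such a greedy cover: candidate factor concepts here are only the closures of single attributes, whereas the actual Belohlavek–Vychodil algorithm grows intents attribute by attribute. The sketch still shows the core idea of repeatedly picking the concept covering the most uncovered 1s.

```python
def greedy_factors(I):
    """Greedily cover the 1s of binary matrix I by factor concepts,
    at each step taking the concept covering the most uncovered 1s.
    Simplified: candidates are closures of single attributes only."""
    n, m = len(I), len(I[0])
    uncovered = {(i, j) for i in range(n) for j in range(m) if I[i][j]}
    factors = []
    while uncovered:
        best, best_gain = None, 0
        for j in range(m):
            extent = [i for i in range(n) if I[i][j]]      # objects having j
            intent = [a for a in range(m)
                      if all(I[i][a] for i in extent)]     # their shared attrs
            gain = len({(i, a) for i in extent for a in intent} & uncovered)
            if gain > best_gain:
                best, best_gain = (extent, intent), gain
        extent, intent = best
        factors.append(best)
        uncovered -= {(i, a) for i in extent for a in intent}
    return factors

I = [[1, 1, 0],
     [1, 1, 1],
     [0, 0, 1]]
F = greedy_factors(I)   # two factor concepts suffice for this matrix
```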
Decomposition for decision tree induction:
– factors = new attributes → good "decision ability", i.e. good to be splitting attributes
– new function of optimality of a factor concept:

c(A, B) = w · c_A(A, B) + (1 − w) · c_B(A, B)

c_A(A, B) ∈ [0, 1] . . . original function of "cover ability"
c_B(A, B) ∈ [0, 1] . . . function of "decision ability", measures the goodness of the factor as a splitting attribute
Recall: the selection of splitting attributes is based on entropy measures . . . an attribute is the better splitting attribute the lower the weighted sum of entropies of the subcollections of objects after splitting the objects based on the attribute:

→ c_B(A, B) = 1 − ( |A|/|X| · E(class|A) / (−log2 1/|V(class|A)|) + |X\A|/|X| · E(class|X\A) / (−log2 1/|V(class|X\A)|) )

E(class|A) . . . usual entropy of the objects A based on the class, i.e.

E(class|A) = −∑_{l ∈ V(class|A)} p(l|A) · log2 p(l|A)
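A sketch of such a "decision ability" measure in Python: each entropy is normalized by its maximum −log2(1/|V|) = log2 of the number of class values, taken as 0 for a pure subcollection. The exact normalization conventions of the paper may differ; the function name is illustrative.

```python
from collections import Counter
from math import log2

def norm_entropy(labels):
    """Class entropy normalized by its maximum log2(#class values)."""
    n, counts = len(labels), Counter(labels)
    if len(counts) <= 1:
        return 0.0
    e = -sum(c / n * log2(c / n) for c in counts.values())
    return e / log2(len(counts))

def decision_ability(extent, classes):
    """c_B of a factor concept: 1 minus the weighted normalized entropies
    of the objects covered / not covered by the factor's extent."""
    n = len(classes)
    inside = [classes[i] for i in extent]
    outside = [classes[i] for i in range(n) if i not in extent]
    return 1.0 - (len(inside) / n * norm_entropy(inside)
                  + len(outside) / n * norm_entropy(outside))

# A factor covering exactly the two mammals of the running example
# splits the classes perfectly:
labels = ["yes", "yes", "no", "no", "no"]
print(decision_ability({0, 1}, labels))
```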
[Example: the scaled animal data matrix decomposed into the Boolean product of a 5 × 6 object-factor matrix and a 6 × 8 factor-attribute matrix]

[Extended data table: the original scaled attributes (bt cold, bt warm, gb no, gb yes, fl no, fl yes, hb no, hb yes) followed by the factors f1–f6; class label mammal unchanged: cat, bat . . . yes; salamander, eagle, guppy . . . no]
The decision tree is induced from the extended data table; class labels remain unchanged.

[Decision tree from the original attributes: tests on body temp. and gives birth]
[Decision tree from the extended table: a single test on factor f3]

Factor f3 . . . a better splitting attribute than the original attributes bt warm and gb yes (w.r.t. generalization of the decision tree)
A new object . . . described as a vector P_x ∈ {0, 1}^m in the (original) attribute space
1. compute its description as the vector g(P_x) ∈ {0, 1}^k in the factor space (using the factor-attribute matrix)
2. classify the concatenation of P_x and g(P_x) in the usual way
Recall: new attributes (= factors) replace the original attributes
1. decompose the input data matrix I into a matrix A describing objects X by factors F and a matrix B explaining factors F by attributes Y
2. new attributes Y′ = F
3. new (reduced) data table I′ ⊆ X × Y′: I′ = A

The decision tree is induced from the new data table.

[Reduced data table: objects described by the factors f1–f6 only; class label mammal unchanged]
[Decision tree from the reduced table: a single test on factor f3]
There are (usually) fewer factors than (original) attributes = reduction of the dimensionality of the data.

Problem: the transformation of objects from the attribute space to the factor space is not an injective mapping, i.e. for x1, x2 ∈ X with P_x1 ≠ P_x2 and class(x1) ≠ class(x2), it may happen that g(P_x1) = g(P_x2) – how then to assign class labels to objects described by factors?

Present solution: assign to object x in the new data table the majority class label of the objects that share its factor-space description g(P_x).
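The majority-label assignment can be sketched as below (a hypothetical helper; factor_rows are the objects' descriptions g(P_x) in the factor space):

```python
from collections import Counter, defaultdict

def majority_relabel(factor_rows, classes):
    """Group objects by their factor-space description and assign each
    group the majority class label of its members."""
    groups = defaultdict(list)
    for row, cls in zip(factor_rows, classes):
        groups[tuple(row)].append(cls)
    return {desc: Counter(labels).most_common(1)[0][0]
            for desc, labels in groups.items()}

# Three distinct objects collide on the factor description (1, 0):
rows = [(1, 0), (1, 0), (1, 0), (0, 1)]
labels = ["yes", "yes", "no", "no"]
print(majority_relabel(rows, labels))
```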
A new object . . . described as a vector P_x ∈ {0, 1}^m in the (original) attribute space
1. compute its description as the vector g(P_x) ∈ {0, 1}^k in the factor space (using the factor-attribute matrix)
2. classify g(P_x) in the usual way
Time complexity . . . determined by the matrix decomposition step using BFA . . . an NP-hard problem → approximation algorithms

Selected datasets from the UCI ML Repository:

Dataset        Attributes (scaled)  Objects  Class distribution
breast-cancer  9 (51)               277      196/81
kr-vs-kp       36 (74)              3196     1669/1527
mushroom       21 (125)             5644     3488/2156
tic-tac-toe    9 (27)               958      626/332
vote           16 (32)              232      124/108
zoo            15 (30)              101      41/20/5/13/4/8/10

(The datasets were cleared of objects containing missing values.)
. . . comparing the performance of the created machine learning models (e.g. decision trees) induced from the original and from the preprocessed input data . . . 10-fold stratified cross-validation test

Reducing original attributes to factors. Optimality function = "cover ability"
(accuracy for original data → accuracy for preprocessed data):

              breast-cancer  kr-vs-kp     mushroom   tic-tac-toe  vote         zoo          avg +
ID3   train %  98.0 → 99.9   100 → 100    100 → 100  100 → 100    100 → 100    98.2 → 100   0.6
      test %   58.9 → 68.3   99.6 → 99.0  100 → 100  84.1 → 94.4  94.4 → 93.7  92.0 → 88.5  3.8
C4.5  train %  89.0 → 91.8   99.8 → 99.7  100 → 100  95.8 → 98.4  98.3 → 98.1  97.3 → 97.8  1.0
      test %   66.6 → 65.6   99.4 → 98.9  100 → 100  85.7 → 93.6  94.8 → 94.3  93.4 → 87.8  0.2
IB1   train %  98.0 → 100    100 → 100    100 → 100  100 → 100    100 → 100    98.1 → 100   0.7
      test %   70.2 → 68.1   90.3 → 91.8  100 → 100  79.2 → 79.2  91.6 → 92.1  93.3 → 90.0  −0.7

Note: without the zoo dataset, the ID3 average on testing data = 5.4 %
Reducing original attributes to factors. Optimality function = "decision ability"
(accuracy for original data → accuracy for preprocessed data):

              breast-cancer  kr-vs-kp     mushroom   tic-tac-toe  vote         zoo          avg +
ID3   train %  98.0 → 100    100 → 100    100 → 100  100 → 100    100 → 100    98.2 → 100   0.6
      test %   58.9 → 67.9   99.6 → 99.6  100 → 100  84.1 → 97.2  94.4 → 95.9  92.0 → 90.2  5.1
C4.5  train %  89.0 → 93.2   99.8 → 99.8  100 → 100  95.8 → 98.9  98.3 → 98.3  97.3 → 97.9  1.4
      test %   66.6 → 68.9   99.4 → 99.3  100 → 100  85.7 → 97.6  94.8 → 95.5  93.4 → 89.5  2.3
IB1   train %  98.0 → 100    100 → 100    100 → 100  100 → 100    100 → 100    98.1 → 100   0.7
      test %   70.2 → 66.8   90.3 → 97.8  100 → 100  79.2 → 96.0  91.6 → 94.7  93.3 → 90.2  4.1

Note: without the zoo dataset, the ID3 average on testing data = 6.5 %
Adding factors to original attributes – very similar results (±1 % difference)
Presented:
– two methods of preprocessing input data for ML based on FCA:
  1. attributes are extended by new attributes
  2. attributes are replaced by new attributes
– new attributes = factors obtained by Boolean Factor Analysis described by FCA – (usually) fewer than the original attributes
– demonstrated on decision tree induction: DTs induced from preprocessed data outperform DTs induced from original data (for ID3, C4.5) → usage of BFA in feature construction

Future research:
– the problem of mapping distinct objects in the original object-attribute data to the same object in the object-factor data
– incomplete data, i.e. data with missing values
– more thorough experimental evaluation – e.g. description by F-measure, comparison with other feature construction/selection methods