

SLIDE 1

Aggregation Based Feature Invention and Relational Concept Classes

(Claudia Perlich & Foster Provost)

Relational Learning

  • Expressive
  • Background knowledge can be incorporated easily
  • Aggregation
SLIDE 2

Predictive Relational Learning

  • M: (t, RDB) → y
  • Complexity of a relational concept has three sources:
  • 1. Complexity of the relationships (joins)
  • 2. Complexity of the aggregation function ψ
  • 3. Complexity of the model function φ

y = φ(t, ψ(RDB)) + ε
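To make the y = φ(t, ψ(RDB)) + ε decomposition concrete, here is a minimal sketch with invented stand-ins for the aggregation ψ, the model φ, and the relational database (none of these come from the slides):

```python
# A minimal sketch of the decomposition y = phi(t, psi(RDB)) + eps,
# with hypothetical stand-ins for every component.

def psi(rdb, t):
    """Aggregation: summarize the objects related to case t as numbers."""
    related = [r for r in rdb["transactions"] if r["customer"] == t]
    return [len(related), sum(r["amount"] for r in related)]

def phi(aggregates):
    """Model: any propositional function over the aggregates."""
    return 1 if aggregates[1] > 50 else 0  # toy threshold rule

rdb = {"transactions": [
    {"customer": 1, "amount": 30.0},
    {"customer": 1, "amount": 25.0},
    {"customer": 2, "amount": 10.0},
]}
for t in (1, 2):
    print(t, phi(psi(rdb, t)))  # 1 -> 1 (spent 55.0), 2 -> 0 (spent 10.0)
```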

SLIDE 3

Relational Concept Classes

  • Propositional

– Features can be concatenated
– No aggregation is needed
– Ex: one customer table joined 1:1 with a demographic table

  • Independent Attributes

– A 1-to-n relationship requires simple aggregation
– Mapping from a bag of zero or more attribute values to a categorical or numeric value
– Ex: Sum or Average for numeric values
– Ex: Mode for categorical attributes (a sketch follows below)
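A minimal sketch of independent-attribute aggregation: each bag produced by a 1-to-n join maps to a single value. The bags and feature names here are hypothetical:

```python
from statistics import mean, mode

# Hypothetical bags obtained from a 1-to-n join, e.g. all
# transactions for one customer.
amounts = [12.5, 3.0, 47.25, 3.0, 20.0, 5.0]                 # numeric
product_types = ["book", "CD", "CD", "book", "DVD", "book"]  # categorical

# Independent-attribute aggregation: each bag maps to one value.
features = {
    "amount_sum": sum(amounts),        # Sum for numeric values
    "amount_mean": mean(amounts),      # Average for numeric values
    "type_mode": mode(product_types),  # Mode for categorical values
}
print(features)  # {'amount_sum': 90.75, 'amount_mean': 15.125, 'type_mode': 'book'}
```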

Relational Concept Classes - Contd

  • Dependent Attributes within one table

– Multi-dimensional aggregation
– Ex: number of products bought on Dec 22nd (conditioned on Date)

  • Dependent Attributes across tables

– More than one bag of objects of different types
– Ex: amount spent on items returned at a later date
– Needs information from more than one table (see the sketch after this list)

  • Global graph features

– Transitive closure over a set of possible joins
– Ex: customer reputation
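A sketch of levels 3 and 4 using pandas, with invented purchase and return tables: a multi-dimensional aggregate conditioned on Date, and a cross-table aggregate that needs both tables:

```python
import pandas as pd

# Hypothetical transaction and return tables for one customer.
purchases = pd.DataFrame({
    "item_id": [1, 2, 3],
    "amount": [30.0, 12.0, 55.0],
    "date": pd.to_datetime(["2023-12-20", "2023-12-22", "2023-12-22"]),
})
returns = pd.DataFrame({
    "item_id": [3],
    "date": pd.to_datetime(["2024-01-05"]),
})

# Dependent attributes within one table: count of products bought
# on Dec 22nd (conditioned on the Date attribute).
bought_dec22 = (purchases["date"] == "2023-12-22").sum()

# Dependent attributes across tables: amount spent on items that
# were returned at a later date (needs both tables).
merged = purchases.merge(returns, on="item_id", suffixes=("_buy", "_ret"))
returned_amount = merged.loc[merged["date_ret"] > merged["date_buy"], "amount"].sum()

print(bought_dec22, returned_amount)  # 2 55.0
```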

SLIDE 4

Methods for Relational Aggregation

  • First Order Logic - ILP
  • Simple Numeric Aggregation

– Simple aggregation operators: Mean, Min, Max, Mode
– Cannot express concepts above level 2 of the hierarchy

  • Set Distances

– Relational distance metric with k-NN
– The set distance is the minimum distance over all possible pairs of objects (see the sketch below)
– Object distance: sum of squared differences for numeric values, edit distance for categorical values
– Assumes attribute independence
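A minimal sketch of the set distance: the distance between two bags is the minimum object distance over all cross pairs, here with the sum of squared differences as the object distance. The bags are invented:

```python
from itertools import product

def object_distance(a, b):
    """Sum of squared differences over numeric attributes
    (a and b are equal-length tuples of numbers)."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def set_distance(bag_a, bag_b):
    """Set distance: minimum object distance over all cross pairs."""
    return min(object_distance(a, b) for a, b in product(bag_a, bag_b))

# Two hypothetical bags of related objects (numeric attributes only).
bag1 = [(1.0, 2.0), (4.0, 0.0)]
bag2 = [(1.5, 2.5), (10.0, 10.0)]
print(set_distance(bag1, bag2))  # 0.5 -> closest pair (1.0,2.0) vs (1.5,2.5)
```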

Transformation Based Learning

Relational Data → (join) → Set of objects → (Aggregation) → Potential Features → (Feature Selection) → Feature Vector → Model → y
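The pipeline as a runnable sketch, with toy stand-ins for each stage (the fixed-subset `select` stands in for the weighted sampling described later; all names and data are hypothetical):

```python
from collections import Counter

def join_related(case_id, transactions):
    """Join: collect the bag of related objects for one case."""
    return [t for t in transactions if t["customer"] == case_id]

def aggregate(bag):
    """Aggregation: turn the bag into candidate numeric features."""
    types = Counter(obj["ptype"] for obj in bag)
    return {
        "n_objects": len(bag),
        "amount_sum": sum(obj["amount"] for obj in bag),
        "n_books": types["book"],
        "n_cds": types["CD"],
    }

def select(candidates, keep=("n_objects", "amount_sum")):
    """Feature selection: keep a fixed subset (stand-in for sampling)."""
    return [candidates[k] for k in keep]

transactions = [
    {"customer": 1, "ptype": "book", "amount": 12.0},
    {"customer": 1, "ptype": "CD", "amount": 9.0},
    {"customer": 2, "ptype": "DVD", "amount": 20.0},
]
X = [select(aggregate(join_related(c, transactions))) for c in (1, 2)]
print(X)  # [[2, 21.0], [1, 20.0]] -- feature vectors ready for any model
```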

SLIDE 5

Value Distributions

  • Value Order: List of (Value: Index) pairs

– Ex: (watch:1, book:2, CD:3, DVD:4)

  • Case Vector

– Has at position i the count of value i in the bag for case t
– Ex: the bag {book, CD, CD, book, DVD, book} for case t gives CVt(Products.ProductType) = (0, 3, 2, 1)

  • Reference Vector RV – based on a condition c

– Has at position i the sum of the values CVt[i] over all cases t for which c is true
– Ex: number of CDs

  • Variance Vector VV

– VV[i] = Σt (CVt[i])² / (Nc − 1), where Nc is the number of cases for which c is true (see the sketch below)
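A sketch of case, reference, and variance vectors for a hypothetical Products.ProductType attribute; the variance-vector line follows the (CVt[i])²/(Nc − 1) form given above:

```python
import numpy as np

# Value order for the hypothetical Products.ProductType attribute.
value_order = ["watch", "book", "CD", "DVD"]
index = {v: i for i, v in enumerate(value_order)}

def case_vector(bag):
    """CV_t: count of each value in the bag, in value order."""
    cv = np.zeros(len(value_order))
    for v in bag:
        cv[index[v]] += 1
    return cv

# Bags for three cases; c marks the cases where the condition holds.
bags = [["book", "CD", "CD", "book", "DVD", "book"],
        ["CD", "DVD"],
        ["watch", "book"]]
c = [True, True, False]

cvs = np.array([case_vector(b) for b in bags])
n_c = sum(c)
rv = cvs[c].sum(axis=0)                     # reference vector under c
vv = (cvs[c] ** 2).sum(axis=0) / (n_c - 1)  # variance vector under c

print(case_vector(bags[0]))  # [0. 3. 2. 1.] -- matches the slide's example
print(rv)                    # [0. 3. 3. 2.]
```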

Target Dependent Individual Values

Reference vectors by class:

  Value   RV (class +ve)   RV (class −ve)
  Book        .01              .21
  CD          .31              .36
  DVD         .35              .28
  VCR         .33              .15

  • Most common (MC): CD
  • Most common positive (MOP): DVD
  • Most common negative (MON): CD
  • Most discriminative (MOD): Book (see the sketch below)
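A sketch recovering these four values from the class reference vectors above. The slides do not specify the discriminativeness measure; the absolute log-ratio used here is one plausible choice:

```python
import numpy as np

values = ["Book", "CD", "DVD", "VCR"]
rv_pos = np.array([0.01, 0.31, 0.35, 0.33])  # class +ve reference vector
rv_neg = np.array([0.21, 0.36, 0.28, 0.15])  # class -ve reference vector

mop = values[np.argmax(rv_pos)]           # most common in positive class
mon = values[np.argmax(rv_neg)]           # most common in negative class
mc  = values[np.argmax(rv_pos + rv_neg)]  # most common overall
# Most discriminative: largest class imbalance (log-ratio is an assumption).
mod = values[np.argmax(np.abs(np.log(rv_pos / rv_neg)))]

print(mc, mop, mon, mod)  # CD DVD CD Book
```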
SLIDE 6

Feature Complexity

1. No relational features (lowest complexity)
2. Unconditional features – MC, Count
3. Class-conditional features – MOP, MON
4. Discriminative class-conditional features – MOD, MOM (highest complexity)

Vector Distances
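Vector-distance features compare a case's value distribution to each class's reference vector. A sketch, assuming Euclidean distance (the slides do not fix the metric):

```python
import numpy as np

def vector_distance_features(cv, rv_pos, rv_neg):
    """Target-dependent vector-distance features: how far a case's
    normalized value distribution lies from each class's reference
    distribution (Euclidean distance here; an assumption)."""
    p = cv / cv.sum()  # normalize counts to a distribution
    return {
        "dist_pos": float(np.linalg.norm(p - rv_pos)),
        "dist_neg": float(np.linalg.norm(p - rv_neg)),
    }

cv = np.array([1.0, 1.0, 3.0, 1.0])          # counts for Book, CD, DVD, VCR
rv_pos = np.array([0.01, 0.31, 0.35, 0.33])  # from the table above
rv_neg = np.array([0.21, 0.36, 0.28, 0.15])
print(vector_distance_features(cv, rv_pos, rv_neg))
```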

SLIDE 7

Domain: Initial Public Offerings

  • IPO(Date,Size,Price,Ticker,Exchange,SIC,Runup)
  • HEAD(Ticker,Bank)
  • UNDER(Ticker,Bank)
  • IND(SIC,Ind2)
  • IND2(Ind2,Ind)
  • Goal: to predict whether the offer was made on the NASDAQ exchange

  • Four approaches were tested

– ILP
– Logic-based feature construction
– Selection of specific individual values
– Target-dependent vector aggregation

  • Two features were constructed

– One for (n:1) joins
– One for autocorrelation (see the schema sketch below)
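A toy instantiation of the IPO schema and target, with invented tickers, banks, and values, plus one (n:1)-join feature (number of underwriting banks per offer):

```python
import pandas as pd

# A toy instantiation of part of the IPO relational schema.
ipo = pd.DataFrame({
    "Ticker": ["AAA", "BBB"],
    "Size": [120.0, 45.0], "Price": [14.0, 9.0],
    "Exchange": ["NASDAQ", "NYSE"], "SIC": [7372, 2834],
})
under = pd.DataFrame({  # UNDER(Ticker, Bank): a 1-to-n relationship
    "Ticker": ["AAA", "AAA", "BBB"],
    "Bank": ["GS", "MS", "GS"],
})

# Target: was the offer made on NASDAQ?
y = (ipo["Exchange"] == "NASDAQ").astype(int)

# An (n:1)-join feature: number of underwriting banks per offer.
n_banks = under.groupby("Ticker").size().reindex(ipo["Ticker"]).values
print(list(zip(ipo["Ticker"], n_banks, y)))  # [('AAA', 2, 1), ('BBB', 1, 0)]
```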

Implementation details

SLIDE 8

  • Exploration – to find related objects

– Uses BFS (see the sketch after this list)
– Stopping criterion: a maximum number of join chains

  • Feature Selection – weighted sampling to select a subset of 10 features
  • Model Estimation – uses C4.5 to learn a decision tree

– Results were unchanged when logistic regression was used instead

  • Logic-based feature construction – uses ILP to learn FOL clauses and appends them as binary features
  • ILP – uses only the class labels
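A minimal sketch of the exploration step: BFS over a hypothetical adjacency map of joinable tables, stopping once the maximum number of chains is reached:

```python
from collections import deque

# Hypothetical adjacency map of joinable tables in the IPO schema.
schema = {
    "IPO": ["HEAD", "UNDER", "IND"],
    "HEAD": ["IPO"], "UNDER": ["IPO"],
    "IND": ["IPO", "IND2"], "IND2": ["IND"],
}

def explore(start, max_chains):
    """Enumerate join chains from the target table, breadth-first,
    stopping once the maximum number of chains is reached."""
    chains, queue = [], deque([[start]])
    while queue and len(chains) < max_chains:
        chain = queue.popleft()
        chains.append(chain)
        for nxt in schema[chain[-1]]:
            if nxt not in chain:  # avoid revisiting a table
                queue.append(chain + [nxt])
    return chains

for c in explore("IPO", 5):
    print(" -> ".join(c))
# IPO, IPO -> HEAD, IPO -> UNDER, IPO -> IND, IPO -> IND -> IND2
```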

Details (Contd): Aggregation approaches

– Discriminative features – most common categoricals and vector distances: MOD, MOM, MVDD
– Class-conditional features – most positive and negative categoricals and vector distances: MOP, MON, VDPN
– Unconditional features – counts in the IPO table: MOC, VD, MVD
– No feature construction: NO

SLIDE 9

[Figure: AUC and accuracy for aggregation methods grouped by complexity level, from unconditional through conditional to discriminative features (low to high)]

  • As feature complexity increases, performance increases
  • As training-set size increases, performance increases

SLIDE 10

Conclusions

  • Expressive power of models combined with aggregation
  • Distance metric
  • Complex aggregations can reduce exploration
  • Focuses only up to level 2 of the hierarchy