SLIDE 1

Merging Classifiers of Different Classification Approaches

Incremental Classification, Concept Drift and Novelty Detection Workshop

Antonina Danylenko¹ and Welf Löwe¹

antonina.danylenko@lnu.se
14 December, 2014

¹Linnaeus University, Sweden

Department of Computer Science, Linnaeus University — Merging Classifiers of Different Classification Approaches 1(28)

SLIDE 2

Agenda

◮ Introduction;
◮ Problem, Motivation, Approach;
◮ Decision Algebra;
◮ Merge as an Operation of Decision Algebra;
◮ Merging Classifiers;
◮ Experiments;
◮ Conclusions.

SLIDE 3

Introduction

◮ Classification is a common problem that arises in different fields of Computer Science (data mining, information storage and retrieval, knowledge management);

◮ Classification approaches are often tightly coupled to:

  ◮ learning strategies: different algorithms are used;
  ◮ data structures: information is represented in different ways;
  ◮ how common problems are addressed: workarounds;

◮ It is not easy to select an appropriate classification model for a classification problem (one must weigh accuracy, robustness, and scalability);

SLIDE 4

Problem and Motivation

◮ Simply combining classifiers learned over different data sets of the same problem is not straightforward;

◮ Current work is done in aggregation and meta-learning:

  ◮ combine different classifiers learned over the same data set;
  ◮ construct a single classifier learned on different variations of the same classification problem;
  ◮ as a result, these approaches do not take into account that the context can differ.

◮ Approaches that combine classifiers with partly or completely disjoint contexts use one single classification approach for the base-level classifiers;

◮ Generality gets lost: results are incomparable, benchmarking is difficult, and it is hard to propagate advances between domains;

SLIDE 5

Proposed Approach

◮ Use Decision Algebra, which defines classifiers as re-usable black boxes in terms of so-called decision functions;

◮ Define a general merge operation over these decision functions that allows for symbolic computations with the captured classification information;

◮ Show an example of merging classifiers of different classification approaches;

◮ Show that the merger of classifiers tends to become more accurate;

SLIDE 6

Classification Information

◮ Classification information is a set of decision tuples:

  CI = {(a1, c1), . . . , (an, cn)}

◮ It is complete if: ∀a ∈ A : (a, c) ∈ CI;

◮ It is non-contradictive if: ∀(ai, ci), (aj, cj) ∈ CI : ai = aj ⇒ ci = cj;

◮ The problem domain (A, C) of CI is a superset of A × C that defines the actual classification problem, where A ∈ A;
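The two properties above are easy to check mechanically. The following is a minimal sketch in Python (not from the talk); the representation of CI as a set of (context, class) tuples and names like `is_complete` are assumptions for illustration.

```python
# Hypothetical sketch: classification information CI as a set of
# (context, class) decision tuples; all names are illustrative.

def is_complete(ci, contexts):
    """Complete: every context a in A occurs in some decision tuple."""
    covered = {a for a, _c in ci}
    return all(a in covered for a in contexts)

def is_non_contradictive(ci):
    """Non-contradictive: ai = aj implies ci = cj."""
    seen = {}
    for a, c in ci:
        if seen.setdefault(a, c) != c:
            return False
    return True

ci = {(("high", "low"), "acc"), (("low", "low"), "na")}
assert is_complete(ci, [("high", "low"), ("low", "low")])
assert is_non_contradictive(ci)
assert not is_non_contradictive(ci | {(("high", "low"), "na")})
```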

SLIDE 7

Decision Function

◮ A decision function is a representation of complete and possibly contradictive decision information: df : A → D(C) maps an actual context a ∈ A to a (probability) distribution D(C);

◮ It is a higher-order (or curried) function:

  df^n : A_n → (A_{n−1} → (. . . (A_1 → D(C)) . . . ));

◮ It can easily be represented as a decision tree or decision graph:

  df^n = x_1(df^{n−1}_1, . . . , df^{n−1}_{|Λ1|})

  where Λi is the domain of attribute Ai.
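The curried view can be made concrete with a small sketch: a decision function as nested dicts, one level per attribute. This encoding is an assumption of this summary, not the authors' implementation.

```python
# Sketch (assumed encoding): a decision function as a nested dict,
# one level per attribute; leaves are class distributions D(C).
# Fixing the first attribute's value yields the curried df^{n-1}.

df2 = {
    "high": {"high": {"na": 1.0}, "low": {"a": 1.0}},
    "low":  {"high": {"a": 1.0},  "low": {"a": 1.0}},
}

def apply_df(df, context):
    """Evaluate df on a context by binding one attribute per level."""
    node = df
    for value in context:
        node = node[value]
    return node   # the final distribution over classes

partial = df2["high"]                 # df^1: A1 already bound to "high"
assert apply_df(df2, ("high", "low")) == {"a": 1.0}
assert partial["low"] == {"a": 1.0}
```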

SLIDE 8

Graph Representation of Decision Function

◮ Decision function df^2 = x_1(na, x_2(na, na, a, a), x_2(na, na, a, a), a)

Figure: A tree (left) and graph (right) representation of df^2. Each node labeled with n represents a decision term with a selection operator x_n; each square leaf node labeled with c corresponds to a probability distribution over classes C with c the most probable class.

SLIDE 9

Decision Algebra

◮ Decision Algebra (DA) is a theoretical framework defined as a parameterized specification, with A and D(C) as parameters. It provides a general representation of classification information as an abstract classifier;

SLIDE 10

Operations Over Decision Functions

◮ Constructor x^n:

  x^n : Λ_1 × DF[A′, D] × · · · × Λ_1 × DF[A′, D] → DF[A, D]
        (|Λ_1| times)

◮ Bind binds attribute A_i to an attribute value a ∈ Λ_i:

  bind_{A_i} : DF[A, D] × Λ_i → DF[A′, D]
  bind_{A_1}(x^n(a_1, df_1, · · · , a_{|Λ1|}, df_{|Λ1|}), a) ≡ df_i, if a = a_i
  Example: bind_{A_1}(df^2, high) = x^2(na, na, a, a)

◮ Evert changes the order of attributes in the decision function:

  evert_{A_i} : DF[A, D] → DF[A′, D]
  evert_{A_i}(df) := x(a_1, bind_{A_i}(df, a_1), . . . , a_{|Λi|}, bind_{A_i}(df, a_{|Λi|}))
  Example: evert_{A_2}(df^2) = x_2(x_1(na, na, na, a), x_1(na, na, na, a), x_1(na, a, a, a), x_1(na, a, a, a))
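Bind and evert can be sketched on a tuple-encoded decision tree. The `(attribute, {value: subtree})` encoding and the helper names below are assumptions for illustration, not the authors' API.

```python
# Sketch of bind and evert on a tuple-encoded decision tree:
# (attribute, {value: subtree}); leaves are class distributions.

def bind_at(df, attr, value):
    """bind_Ai: fix attribute `attr` to `value` wherever it selects."""
    if not isinstance(df, tuple):          # leaf distribution
        return df
    a, children = df
    if a == attr:
        return children[value]
    return (a, {v: bind_at(sub, attr, value) for v, sub in children.items()})

def evert(df, attr, domain):
    """evert_Ai: pull attribute `attr` up to the root of the tree."""
    return (attr, {v: bind_at(df, attr, v) for v in domain})

df2 = ("A1", {"h": ("A2", {"h": {"na": 1.0}, "l": {"a": 1.0}}),
              "l": ("A2", {"h": {"a": 1.0},  "l": {"a": 1.0}})})
turned = evert(df2, "A2", ["h", "l"])
assert turned == ("A2", {"h": ("A1", {"h": {"na": 1.0}, "l": {"a": 1.0}}),
                         "l": ("A1", {"h": {"a": 1.0},  "l": {"a": 1.0}})})
```

Note that evert only reorders the attributes: evaluating `df2` and `turned` on the same context (with the attribute order swapped) gives the same leaf.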

SLIDE 11

Merge Operation over Decision Functions

◮ Merge operator ⊔_D over class distributions D(C):

  ⊔_D : D(C) × D(C) → D(C)
  d(C) ⊔_D d′(C) = {(c, p + p′) | (c, p) ∈ d(C), (c, p′) ∈ d′(C)}

◮ General merge operation over decision functions:

  ⊔ : DF^1[A, D] × DF^2[A, D] → DF′[A, D]

◮ Merge over constant decision functions df^0_1, df^0_2 ∈ DF^∅[{x_0}, D]:

  ⊔(df^0_1, df^0_2) := x^0(⊔_D(df^0_1, df^0_2))
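The distribution merge ⊔_D is just class-wise addition of probabilities. A one-function sketch (dict-encoded distributions are an assumption of this summary):

```python
# The distribution merge ⊔_D, as defined above: class-wise addition
# of probabilities (no renormalisation appears in the definition).

def merge_dist(d1, d2):
    return {c: d1.get(c, 0.0) + d2.get(c, 0.0) for c in set(d1) | set(d2)}

d = merge_dist({"a": 0.75, "na": 0.25}, {"a": 0.5, "na": 0.5})
assert d == {"a": 1.25, "na": 0.75}   # most probable class stays "a"
```

Since only the most probable class matters for classification, the missing normalisation does not change the predicted class.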

SLIDE 12

Scenario One: Same Formal Context

◮ Prerequisite: The decision functions df_1 ∈ DF^1[A, D] and df_2 ∈ DF^2[A′, D] are constructed over different samples of the same problem domain and A = A′ = Λ_1 × . . . × Λ_n;

  ⊔(df_1, df_2) := x^n(a_1, ⊔(bind_{A_1}(df_1, a_1), bind_{A_1}(df_2, a_1)), . . . , a_k, ⊔(bind_{A_1}(df_1, a_k), bind_{A_1}(df_2, a_k)))

SLIDE 13

Scenario One: Cont’d

1: if df_1 ∈ DF^∅[{x_0}, D] ∧ df_2 ∈ DF^∅[{x_0}, D] then
2:   return x(⊔_D(df_1, df_2))
3: end if
4: for all a ∈ Λ_1 do
5:   df_a = ⊔(bind_1(df_1, a), bind_1(df_2, a))
6: end for
7: return x(a_1, df_{a_1}, . . . , a_{|Λ1|}, df_{a_{|Λ1|}})

Figure: Merging two decision graphs over the same formal context, steps (a)–(d) with intermediate merges (c.1)–(c.4).
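The scenario-one algorithm can be sketched directly on the tuple-encoded trees used earlier; this encoding and the recursion are an illustrative assumption, not the authors' implementation.

```python
# Minimal sketch of the scenario-one merge for two decision trees
# over the same formal context (same attributes, same order).

def merge_dist(d1, d2):
    return {c: d1.get(c, 0.0) + d2.get(c, 0.0) for c in set(d1) | set(d2)}

def merge(df1, df2):
    # lines 1-3: both are constant decision functions (leaves)
    if not isinstance(df1, tuple) and not isinstance(df2, tuple):
        return merge_dist(df1, df2)
    # lines 4-7: same root attribute, merge value-wise after binding
    a, ch1 = df1
    _, ch2 = df2
    return (a, {v: merge(ch1[v], ch2[v]) for v in ch1})

df1 = ("A1", {"h": {"a": 0.75, "na": 0.25}, "l": {"a": 0.25, "na": 0.75}})
df2 = ("A1", {"h": {"a": 0.5,  "na": 0.5},  "l": {"a": 0.25, "na": 0.75}})
merged = merge(df1, df2)
assert merged[1]["h"] == {"a": 1.25, "na": 0.75}
assert merged[1]["l"] == {"a": 0.5,  "na": 1.5}
```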

SLIDE 14

Scenario Two: Disjoint Formal Contexts

◮ Prerequisite: The decision functions df_1 ∈ DF^1[A, D] and df_2 ∈ DF^2[A′, D] are constructed over samples with disjoint formal contexts of the same problem domain: A = Λ_1 × . . . × Λ_n, A′ = Λ′_1 × . . . × Λ′_m, and the attribute sets are disjoint: {A_1, . . . , A_n} ∩ {A′_1, . . . , A′_m} = ∅;

  ⊔(df_1, df_2) := x^n(a_1, ⊔(bind_{A_1}(df_1, a_1), bind_{A_1}(df_2, a_1)), . . . , a_k, ⊔(bind_{A_1}(df_1, a_k), bind_{A_1}(df_2, a_k)))
  ⊔(df^0_1, df_2) := ⊔(df_2, df^0_1)

SLIDE 15

Scenario Two: Cont’d

1: if df_1 ∈ DF^∅[{x_0}, D] ∧ df_2 ∈ DF^∅[{x_0}, D] then
2:   return x(⊔_D(df_1, df_2))
3: end if
4: if df_1 ∈ DF^∅[{x_0}, D] then
5:   return ⊔(df_2, df_1)
6: end if
7: for all a ∈ Λ_1 do
8:   df_a = ⊔(bind_1(df_1, a), bind_1(df_2, a))
9: end for
10: return x(a_1, df_{a_1}, . . . , a_{|Λ1|}, df_{a_{|Λ1|}})

Figure: Merging two decision graphs over disjoint formal contexts, steps (a)–(d).
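For disjoint contexts, binding A_1 leaves df_2 unchanged (it has no A_1), so df_2 is pushed down intact until both sides are constant. A sketch under the same assumed tree encoding:

```python
# Sketch of the scenario-two merge for disjoint formal contexts.

def merge_dist(d1, d2):
    return {c: d1.get(c, 0.0) + d2.get(c, 0.0) for c in set(d1) | set(d2)}

def merge(df1, df2):
    leaf1 = not isinstance(df1, tuple)
    leaf2 = not isinstance(df2, tuple)
    if leaf1 and leaf2:                 # lines 1-3: two constants
        return merge_dist(df1, df2)
    if leaf1:                           # lines 4-6: swap arguments
        return merge(df2, df1)
    a, ch1 = df1                        # lines 7-10: recurse on df1,
    return (a, {v: merge(ch1[v], df2) for v in ch1})   # df2 unchanged

df1 = ("A1", {"h": {"a": 0.75, "na": 0.25}, "l": {"a": 0.25, "na": 0.75}})
df2 = ("B1", {"x": {"a": 0.5, "na": 0.5},  "y": {"a": 0.0, "na": 1.0}})
merged = merge(df1, df2)
# The merged tree selects on A1 first, then on B1:
assert merged[0] == "A1" and merged[1]["h"][0] == "B1"
assert merged[1]["h"][1]["x"] == {"a": 1.25, "na": 0.75}
```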

SLIDE 16

Scenario Three: General Case

◮ Prerequisite: Scenarios one and two are special cases of this general case. The decision functions df_1 ∈ DF^1[A, D] and df_2 ∈ DF^2[A′, D] are constructed over samples with arbitrary formal contexts of the same problem domain: A = Λ_1 × . . . × Λ_n and A′ = Λ′_1 × . . . × Λ′_m;

  ⊔(df_1, df_2) := x^n(a_1, ⊔(bind_{A_1}(df_1, a_1), bind_{A_1}(df_2, a_1)), . . . , a_k, ⊔(bind_{A_1}(df_1, a_k), bind_{A_1}(df_2, a_k)))
  ⊔(df^0_1, df_2) := ⊔(df_2, df^0_1)
  ⊔(df_1, df_2) := ⊔(df_1, evert_{A_1}(df_2)) iff A_1 ∈ {A′_2, . . . , A′_m}

SLIDE 17

Scenario Three: Cont’d

1: if df_1 ∈ DF^∅[{x_0}, D] ∧ df_2 ∈ DF^∅[{x_0}, D] then
2:   return x(⊔_D(df_1, df_2))
3: end if
4: if df_1 ∈ DF^∅[{x_0}, D] then
5:   return ⊔(df_2, df_1)
6: end if
7: if A_1 ≠ A′_1 ∧ A_1 ∈ {A′_2, . . . , A′_m} then
8:   return ⊔(df_1, evert_{A_1}(df_2))
9: end if
10: for all a ∈ Λ_1 do
11:   df_a = ⊔(bind_1(df_1, a), bind_1(df_2, a))
12: end for
13: return x(a_1, df_{a_1}, . . . , a_{|Λ1|}, df_{a_{|Λ1|}})

Figure: Merging two decision graphs with overlapping formal contexts, steps (a)–(d); evert is applied in the intermediate steps (c.1)–(c.2).

SLIDE 18

Accuracy of the Merged Decision Functions

◮ A decision function df_1 is more accurate than a decision function df_2 iff it more often gives the "right" classification based on some ground truth (which is usually not known);

◮ oracle_a : C → R is the accurate classification probability distribution;

◮ oracle : A → D(C) is an accurate decision function with ∀a ∈ A : oracle(a) = oracle_a;

◮ df : A → D(C) is probably accurate with respect to oracle iff ∀a ∈ A : df(a) is a random sample of oracle_a;

◮ Theorem: Let df_1, . . . , df_n be a series of independently learned decision functions df : A → D(C) that are probably accurate with respect to an accurate decision function oracle : A → D(C). For large n, the merged decision function df_1 ⊔ . . . ⊔ df_n converges in probability to the oracle.
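The theorem can be illustrated (not proved) with a small simulation: merge many independently drawn, probably accurate constant classifiers and compare the normalised result with the oracle distribution. All parameters below are arbitrary choices for this sketch.

```python
# Simulation: merging n probably accurate leaf classifiers
# approaches the oracle distribution as n grows.
import random

random.seed(0)
oracle = {"a": 0.7, "na": 0.3}

def sampled_leaf(k=5):
    """A probably accurate df: its value is a k-sample of the oracle."""
    hits = sum(random.random() < oracle["a"] for _ in range(k))
    return {"a": hits / k, "na": (k - hits) / k}

def merge_dist(d1, d2):
    return {c: d1.get(c, 0.0) + d2.get(c, 0.0) for c in set(d1) | set(d2)}

merged = sampled_leaf()
for _ in range(1999):
    merged = merge_dist(merged, sampled_leaf())

estimate = merged["a"] / sum(merged.values())
assert abs(estimate - oracle["a"]) < 0.05   # close to the oracle
```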

SLIDE 19

Naïve Bayesian Classifiers

◮ Constructor:

  nb^n : D(C) × PD^1_1 × · · · × PD^1_n × . . . × PD^k_1 × · · · × PD^k_n → NB[A, D]
         (n × k conditional probability distributions)

◮ Probability distribution functions: PD^j_i = PD(Λ_i | C = c_j);

◮ Bind operation: bind_{A_i} : NB[A, D] × Λ_i → NB[A′, D]

  bind_{A_i}(df^n, a) := nb^{n−1}(cons_D((c_1, prob_{c_1,i}(df^n, a)), . . . , (c_k, prob_{c_k,i}(df^n, a))),
    pd_1(df^n, c_1), . . . , pd_{i−1}(df^n, c_1), pd_{i+1}(df^n, c_1), . . . , pd_n(df^n, c_1),
    . . . ,
    pd_1(df^n, c_k), . . . , pd_{i−1}(df^n, c_k), pd_{i+1}(df^n, c_k), . . . , pd_n(df^n, c_k))

  where prob_{c,i}(df, a) = prob(dist(df), c) · prob(pd_{A_i}(df, c), a)

SLIDE 20

Bind: Example

Class: Car Acceptability
  P(Accept) = 0.69, P(Not accept) = 0.31

Buying price
  P(high | Accept) = 0.31, P(low | Accept) = 0.69
  P(high | Not accept) = 0.14, P(low | Not accept) = 0.86

Maintenance price
  P(high | Accept) = 0.26, P(low | Accept) = 0.74
  P(high | Not accept) = 0.68, P(low | Not accept) = 0.32

After bind_Buying(NB, high):

Class: Car Acceptability
  P(Accept) = 0.69 · 0.31, P(Not accept) = 0.31 · 0.14

Maintenance price (unchanged)
  P(high | Accept) = 0.26, P(low | Accept) = 0.74
  P(high | Not accept) = 0.68, P(low | Not accept) = 0.32
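The bind example can be replayed numerically (values taken from the slide): binding Buying = high multiplies the class prior by P(Buying = high | class) and drops the Buying table, while the Maintenance table is untouched.

```python
# Numeric replay of bind_Buying(NB, high) with the slide's numbers.

prior = {"Accept": 0.69, "Not accept": 0.31}
p_buying_high = {"Accept": 0.31, "Not accept": 0.14}

bound_prior = {c: prior[c] * p_buying_high[c] for c in prior}
assert round(bound_prior["Accept"], 4) == 0.2139      # 0.69 * 0.31
assert round(bound_prior["Not accept"], 4) == 0.0434  # 0.31 * 0.14
```

The bound distribution is not normalised; as with ⊔_D, the argmax (here "Accept") is all that matters for classification.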

SLIDE 21

Merging of Naïve Bayesian Classifiers

⊔_NB : NB[A, D] × NB[A′, D] → NB[A′′, D]

⊔_NB(df, df′) := nb^n(⊔_D(dist(df), dist(df′)),
  pd_{A_1}(df, c_1), . . . , pd_{A_i}(df, c_1), pd_{A′_1}(df′, c_1), . . . , pd_{A′_j}(df′, c_1),
  ⊔_D(pd_{A′′_1}(df, c_1), pd_{A′′_1}(df′, c_1)), . . . , ⊔_D(pd_{A′′_l}(df, c_1), pd_{A′′_l}(df′, c_1)),
  . . . ,
  pd_{A_1}(df, c_k), . . . , pd_{A_i}(df, c_k), pd_{A′_1}(df′, c_k), . . . , pd_{A′_j}(df′, c_k),
  ⊔_D(pd_{A′′_1}(df, c_k), pd_{A′′_1}(df′, c_k)), . . . , ⊔_D(pd_{A′′_l}(df, c_k), pd_{A′′_l}(df′, c_k)))

where pd_{A_i} : NB[A, D] × C → D(A_i)

SLIDE 22

Scenario One: Same Formal Context

Figure: Merging two naïve Bayesian classifiers over the same formal context (car evaluation data). Merged class distribution: [0.69, 0.23, 0.04, 0.03]; merged conditional distributions, e.g. safety (6) = high: [0.24, 0.57, 0.44, 1], per cap (4) = more: [0.29, 0.47, 0.5, 0.53], buying (1) = med: [0.21, 0.28, 0.38, 0.6], maint (2) = low: [0.22, 0.29, 0.69, 0.53].

SLIDE 23

Scenario Two: Disjoint Contexts

Replace ⊔(df^0_1, df_2) := ⊔(df_2, df^0_1) with:

⊔_H : DF^0[{x_0}, D] × NB[A, D] → NB[A, D]

df^0_dg ⊔_H df_nb := nb(dist(df^0_dg) ⊔_D dist(df_nb),
  pd_{A_1}(df, c_1), . . . , pd_{A_n}(df, c_1), . . . , pd_{A_1}(df, c_k), . . . , pd_{A_n}(df, c_k))

SLIDE 24

Scenario Two: Cont’d

Figure: Scenario two example: each leaf class distribution of the decision graph is merged with the naïve Bayesian class prior P(C), and the conditional distributions P(A|C) are attached to the result. Class prior: [0.69, 0.23, 0.04, 0.03].

SLIDE 25

Scenario Three: General

Figure: Scenario three example: merging a decision graph and a naïve Bayesian classifier with overlapping contexts. Class: [0.72, 0.20, 0.032, 0.05]; safety (6) = high: [0.24, 0.57, 0.29, 1]; size lag (5) = med: [0.3, 0.36, 0.29, 0.5]; attributes involved: person capacity, maintenance price.

SLIDE 26

Experiments

Figure: Accuracy over the number of merged classifiers for eight data sets: (1) Audiology, (2) Monks, (3) Balance-scale, (4) Tic-tac, (5) Car, (6) Mushroom, (7) Nursery, (8) Chess. Legend: DG(1/8), NB(1/8), Merged DG.

SLIDE 27

Conclusions

◮ The merge operation ⊔ over decision functions is a general way to combine classifiers;

◮ Decision Algebra allows applying merge by implementing:

  ◮ a single core operation bind over classifiers defined as decision functions;
  ◮ a merge ⊔_D over the co-domain of the decision functions (usually represented as distributions);

◮ We showed that merging a series of probably accurate decision functions results in a more accurate decision function (experiments: 2.7%–17%).

SLIDE 28

Thank you for your attention. Questions?
