
Making Set-valued Predictions in Evidential Classification: A Comparison of Different Approaches

Liyao Ma & Thierry Denœux, ISIPTA 2019 - 5th July


  1. Making Set-valued Predictions in Evidential Classification: A Comparison of Different Approaches. Liyao Ma & Thierry Denœux. ISIPTA 2019 - 5th July

  2. Introduction
     ● Classification: label predictions, $\Omega = \{\omega_1, \dots, \omega_n\}$
     ● Uncertainty → set-valued predictions
     ● Dempster-Shafer theory

  3-5. Decision making view of classification
     ● Precise assignments: $F = \{f_{\omega_1}, \dots, f_{\omega_n}\}$
     ● Precise assignments + complete preorder: Maximum Expected Utility principle
     ● The uncertain case
       ❍ Precise assignments + partial preorder
       ❍ Partial assignments + complete preorder
     ● Partial assignments: $F = \{f_A, A \in 2^\Omega \setminus \{\emptyset\}\}$

  6. Two families of decision strategies
     ● Precise assignments + partial preorder
       ❍ $F = \{f_{\omega_1}, \dots, f_{\omega_n}\}$
       ❍ Interval dominance, maximality, weak dominance...
       ❍ Lack of information → $[\underline{E}_m(f_i), \overline{E}_m(f_i)]$
       ❍ Set of non-dominated acts: $F^* = \{f_{\omega_1}, f_{\omega_2}\}$
     ● Partial assignments + complete preorder
       ❍ $F = \{f_A, A \in 2^\Omega \setminus \{\emptyset\}\}$
       ❍ Generalized maximin, maximax, Hurwicz, minimax regret...
       ❍ The optimal act: $F^* = \{f_{\{\omega_1, \omega_2\}}\}$
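The first family can be illustrated with a small sketch. It reads the interval on this slide as the lower and upper expected utilities of a precise act under a mass function $m$ (a min or a max of the utility over each focal set, weighted by its mass) and keeps the acts that no other act strictly interval-dominates. The utility rows are the precise-assignment rows shown in the deck; the mass function `MASS` and the function names are hypothetical, chosen only for illustration.

```python
# Utility of precise acts (rows) under each state of nature (columns),
# copied from the deck's 3-class example.
U = [
    [1.0, 0.2, 0.1],  # f_{omega_1}
    [0.2, 1.0, 0.2],  # f_{omega_2}
    [0.1, 0.2, 1.0],  # f_{omega_3}
]

# Hypothetical mass function: focal sets are frozensets of state indices.
MASS = {
    frozenset({0}): 0.4,
    frozenset({1}): 0.3,
    frozenset({0, 1}): 0.3,
}


def expectation_bounds(mass, utilities):
    """Return (lower, upper) expected utility of one act under `mass`."""
    lower = sum(m * min(utilities[j] for j in focal) for focal, m in mass.items())
    upper = sum(m * max(utilities[j] for j in focal) for focal, m in mass.items())
    return lower, upper


def non_dominated_acts(mass, utility_matrix):
    """Indices of acts that no other act strictly interval-dominates."""
    bounds = [expectation_bounds(mass, row) for row in utility_matrix]
    survivors = []
    for i, (lo_i, up_i) in enumerate(bounds):
        # Act k strictly dominates act i when k is weakly preferred to i
        # (lower_k >= upper_i) and i is not weakly preferred to k.
        dominated = any(
            lo_k >= up_i and not lo_i >= up_k
            for k, (lo_k, up_k) in enumerate(bounds)
            if k != i
        )
        if not dominated:
            survivors.append(i)
    return survivors


print(non_dominated_acts(MASS, U))  # [0, 1]
```

With this invented mass function the surviving acts are the first two classes, which has the same shape as the slide's example of a non-dominated set containing two precise acts.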

  7-8. Defining the utility of set-valued predictions

     | acts \ states of nature | $\omega_1$ | $\omega_2$ | $\omega_3$ |
     |---|---|---|---|
     | $f_{\{\omega_1\}}$ | 1.0000 | 0.2000 | 0.1000 |
     | $f_{\{\omega_2\}}$ | 0.2000 | 1.0000 | 0.2000 |
     | $f_{\{\omega_3\}}$ | 0.1000 | 0.2000 | 1.0000 |
     | $f_{\{\omega_1, \omega_2\}}$ | ? | ? | ? |
     | $f_{\{\omega_1, \omega_3\}}$ | ? | ? | ? |
     | $f_{\{\omega_2, \omega_3\}}$ | ? | ? | ? |
     | $f_{\{\omega_1, \omega_2, \omega_3\}}$ | ? | ? | ? |

  9. Defining the utility of set-valued predictions
     ● Ordered Weighted Average (OWA) operator

       $$\hat{u}_{A,j} = F(\{u_{ij} \mid \omega_i \in A\}) = \sum_{k=1}^{|A|} w_k \, u^A_{(k)j}$$

       ❍ Tolerance degree of imprecision

         $$TOL(w) = \sum_{k=1}^{|A|} \frac{|A| - k}{|A| - 1} \, w_k$$

       ❍ Weights calculation

         $$\max_{w} \; ENT(w) := -\sum_{k=1}^{|A|} w_k \log w_k \quad \text{s.t.} \quad TOL(w) = \gamma, \quad \sum_{k=1}^{|A|} w_k = 1$$

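  10. Defining the utility of set-valued predictions

     | acts \ states of nature | $\omega_1$ | $\omega_2$ | $\omega_3$ |
     |---|---|---|---|
     | $f_{\{\omega_1\}}$ | 1.0000 | 0.2000 | 0.1000 |
     | $f_{\{\omega_2\}}$ | 0.2000 | 1.0000 | 0.2000 |
     | $f_{\{\omega_3\}}$ | 0.1000 | 0.2000 | 1.0000 |
     | $f_{\{\omega_1, \omega_2\}}$ | 0.8400 | 0.8400 | 0.1800 |
     | $f_{\{\omega_1, \omega_3\}}$ | 0.8200 | 0.2000 | 0.8200 |
     | $f_{\{\omega_2, \omega_3\}}$ | 0.1800 | 0.8400 | 0.8400 |
     | $f_{\{\omega_1, \omega_2, \omega_3\}}$ | 0.7373 | 0.7455 | 0.7373 |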
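The set-valued rows of this table can be reproduced, to four decimals, by a short sketch of the OWA extension. Two points are assumptions rather than statements from the slides: the ordered utilities are taken as sorted from largest to smallest (this is what matches the table values, using the $\gamma = 0.8$ stated on the poster), and the maximum-entropy weights are computed through the exponential form $w_k \propto \exp(\lambda \, (|A|-k)/(|A|-1))$ that the Lagrangian of the entropy problem yields, with $\lambda$ found by bisection. The function names are hypothetical.

```python
import math
from itertools import combinations

# Utility of precise acts (rows) under each state (columns), from the deck.
U = [
    [1.0, 0.2, 0.1],
    [0.2, 1.0, 0.2],
    [0.1, 0.2, 1.0],
]


def maxent_owa_weights(size, gamma, iterations=200):
    """Maximum-entropy OWA weights of length `size` with TOL(w) = gamma."""
    if size == 1:
        return [1.0]
    if gamma >= 1.0:  # full tolerance: all weight on the largest utility
        return [1.0] + [0.0] * (size - 1)
    if gamma <= 0.0:  # no tolerance: all weight on the smallest utility
        return [0.0] * (size - 1) + [1.0]
    coeff = [(size - k) / (size - 1) for k in range(1, size + 1)]

    def weights(lam):
        raw = [math.exp(lam * c) for c in coeff]
        total = sum(raw)
        return [r / total for r in raw]

    # TOL increases with lam, so bisection on a wide bracket is enough here.
    lo, hi = -50.0, 50.0
    for _ in range(iterations):
        mid = (lo + hi) / 2
        tol = sum(c * w for c, w in zip(coeff, weights(mid)))
        if tol < gamma:
            lo = mid
        else:
            hi = mid
    return weights((lo + hi) / 2)


def extend_utility_matrix(utility_matrix, gamma):
    """Map each non-empty subset of class indices to its OWA utility row."""
    n = len(utility_matrix)
    extended = {}
    for size in range(1, n + 1):
        w = maxent_owa_weights(size, gamma)
        for subset in combinations(range(n), size):
            row = []
            for j in range(n):
                # Utilities of the precise acts in the subset, largest first.
                ranked = sorted((utility_matrix[i][j] for i in subset), reverse=True)
                row.append(sum(wk * u for wk, u in zip(w, ranked)))
            extended[subset] = row
    return extended


for subset, row in extend_utility_matrix(U, 0.8).items():
    print(subset, [round(u, 4) for u in row])
```

For a two-element set the constraint $TOL(w) = \gamma$ alone fixes $w = (\gamma, 1 - \gamma)$, so for instance $0.8 \cdot 1.0 + 0.2 \cdot 0.2 = 0.84$; the entropy criterion only matters for sets of three or more classes.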

  11. Experimental Comparisons
     ● UCI and artificial Gaussian data sets
     ● Classification performances with varying γ
     ● Performances with noised test sets
     ● Performances with increasing training set size

  12. Conclusions
     ● Two approaches are contrasted
       ❍ Partial preorder among precise assignments
       ❍ Complete preorder among partial assignments
     ● The utility of set-valued predictions: OWA
     ● Experimental comparisons
       ❍ Set-valued predictions perform better
       ❍ Cautious rules preferred

  13. Thank you!

Making Set-valued Predictions in Evidential Classification: A Comparison of Different Approaches
Liyao Ma, Thierry Denœux

Two families of set-valued decision strategies

Partial preorders among precise assignments. Patterns are assigned to one and only one of the $n$ classes: $F = \{f_1, \dots, f_n\}$.

| decision criterion | preference relation |
|---|---|
| interval dominance | $f_i \succeq_{ID} f_j \iff \underline{E}_m(f_i) \geq \overline{E}_m(f_j)$ |
| maximality | $f_i \succeq_{max} f_j \iff \underline{E}_m(f_i - f_j) \geq 0$ |
| weak dominance | $f_i \succeq_{WD} f_j \iff \left(\underline{E}_m(f_i) \geq \underline{E}_m(f_j)\right) \wedge \left(\overline{E}_m(f_i) \geq \overline{E}_m(f_j)\right)$ |

where

$$\underline{E}_m(f_i) = \sum_{B \subseteq \Omega} m(B) \min_{\omega_j \in B} u_{ij}, \qquad \overline{E}_m(f_i) = \sum_{B \subseteq \Omega} m(B) \max_{\omega_j \in B} u_{ij}.$$

Complete preorders among partial assignments. Patterns are assigned partially to a non-empty subset of $\Omega$: $F = \{f_A, A \in 2^\Omega \setminus \{\emptyset\}\}$.

- generalized maximin: $f_{A_i} \succeq_* f_{A_j} \iff \underline{E}_m(f_{A_i}) \geq \underline{E}_m(f_{A_j})$
- generalized maximax: $f_{A_i} \succeq^* f_{A_j} \iff \overline{E}_m(f_{A_i}) \geq \overline{E}_m(f_{A_j})$
- generalized Hurwicz: $f_{A_i} \succeq_\alpha f_{A_j} \iff E_{m,\alpha}(f_{A_i}) \geq E_{m,\alpha}(f_{A_j})$
- pignistic criterion: $f_{A_i} \succeq_p f_{A_j} \iff E_p(f_{A_i}) \geq E_p(f_{A_j})$
- generalized OWA: $f_{A_i} \succeq_\beta f_{A_j} \iff E^{owa}_{m,\beta}(f_{A_i}) \geq E^{owa}_{m,\beta}(f_{A_j})$
- generalized minimax regret: $f_{A_i} \succeq_r f_{A_j} \iff R(f_{A_i}) \leq R(f_{A_j})$
- maximum expected utility: $f_{A_i} \succeq_m f_{A_j} \iff EU(f_{A_i}) \geq EU(f_{A_j})$

Extending utility matrix via an OWA operator

The extended utility matrix $\hat{U}_{(2^n - 1) \times n}$ is crucial to both decision-making and performance evaluation. The utility of assigning one instance to set $A$ should intuitively be a function of the utilities of the precise assignments within $A$:

$$\hat{u}_{A,j} = F(\{u_{ij} \mid \omega_i \in A\}) = \sum_{k=1}^{|A|} w_k \, u^A_{(k)j}.$$

Given the DM's tolerance degree of imprecision

$$TOL(w) = \sum_{k=1}^{|A|} \frac{|A| - k}{|A| - 1} \, w_k = \gamma,$$

the weights corresponding to the OWA operator are obtained by maximizing the entropy

$$ENT(w) = -\sum_{k=1}^{|A|} w_k \log w_k,$$

subject to $TOL(w) = \gamma$ and $\sum_{k=1}^{|A|} w_k = 1$.

Example: the utility matrix extended by an OWA operator with $\gamma = 0.8$

| acts \ states of nature | $\omega_1$ | $\omega_2$ | $\omega_3$ |
|---|---|---|---|
| $f_{\{\omega_1\}}$ | 1.0000 | 0.2000 | 0.1000 |
| $f_{\{\omega_2\}}$ | 0.2000 | 1.0000 | 0.2000 |
| $f_{\{\omega_3\}}$ | 0.1000 | 0.2000 | 1.0000 |
| $f_{\{\omega_1, \omega_2\}}$ | 0.8400 | 0.8400 | 0.1800 |
| $f_{\{\omega_1, \omega_3\}}$ | 0.8200 | 0.2000 | 0.8200 |
| $f_{\{\omega_2, \omega_3\}}$ | 0.1800 | 0.8400 | 0.8400 |
| $f_{\{\omega_1, \omega_2, \omega_3\}}$ | 0.7373 | 0.7455 | 0.7373 |

Evaluation of set-valued predictions

The classification performance is evaluated by the averaged utility in the test set $T$:

$$Acc(T) = \frac{1}{|T|} \sum_{i=1}^{|T|} \hat{u}_{F^*_i, \, i^*}.$$

Experimental data

UCI Balance scale dataset and simulated Gaussian datasets.

[Figure: scatter plot of the simulated Gaussian data, classes 1 to 3; axes: attribute x, attribute y.]

Experiments

Belief functions concerning the states of nature were generated through the DS theory-based neural network classifier.

Classification performances with varying γ (UCI Balance scale dataset): averaged utility

| | DC1 | DC2 | DC3 | DC4 | DC5 | DC6 | DC7 | DC8 | DC9 |
|---|---|---|---|---|---|---|---|---|---|
| γ=0.5 | 0.9186 | 0.9188 | 0.9186 | 0.9186 | 0.9186 | 0.9186 | 0.9187 | 0.9187 | 0.9187 |
| γ=0.6 | 0.9179 | 0.9184 | 0.9176 | 0.9179 | 0.9184 | 0.9176 | 0.9187 | 0.9188 | 0.9188 |
| γ=0.7 | 0.9059 | 0.9064 | 0.9052 | 0.9059 | 0.9056 | 0.9054 | 0.9190 | 0.9190 | 0.9187 |
| γ=0.8 | 0.9043 | 0.9032 | 0.9028 | 0.9043 | 0.9030 | 0.9024 | 0.9191 | 0.9191 | 0.9188 |
| γ=0.9 | 0.9319 | 0.9325 | 0.9331 | 0.9319 | 0.9192 | 0.9192 | 0.9188 | 0.9339 | 0.9339 |
| γ=1.0 | 1.0000 | 1.0000 | 1.0000 | 1.0000 | 1.0000 | 1.0000 | 0.9194 | 0.9194 | 0.9188 |

Classification performances with varying γ (UCI Balance scale dataset): % of precise predictions

| | DC1 | DC2 | DC3 | DC4 | DC5 | DC6 | DC7 | DC8 | DC9 |
|---|---|---|---|---|---|---|---|---|---|
| γ=0.5 | 100.00% | 100.00% | 100.00% | 100.00% | 100.00% | 100.00% | 97.44% | 97.44% | 99.97% |
| γ=0.6 | 88.96% | 89.47% | 88.96% | 88.96% | 89.18% | 89.06% | 97.44% | 97.44% | 99.97% |
| γ=0.7 | 80.10% | 80.77% | 80.06% | 80.10% | 80.22% | 80.26% | 97.44% | 97.44% | 99.97% |
| γ=0.8 | 69.70% | 70.14% | 69.63% | 69.70% | 69.82% | 69.63% | 97.44% | 97.44% | 99.97% |
| γ=0.9 | 57.02% | 57.76% | 57.12% | 57.02% | 57.38% | 57.12% | 97.44% | 97.44% | 99.97% |
| γ=1.0 | 0.00% | 0.00% | 0.00% | 0.00% | 0.00% | 0.00% | 97.44% | 97.44% | 99.97% |

[Figure: Performances with noised test sets (Gaussian dataset). Two panels, averaged utility and % of precise predictions against the parameter (0 to 10). Curves: F1: Maximin, Minimax regret; F1: Maximax; F1: Pignistic; F1: Hurwicz; F1: OWA; F2: Interval dominance; F2: Maximality; F2: Weak dominance.]

[Figure: Performances with increasing training set size (Gaussian). Two panels, averaged utility and % of precise predictions against the number of training instances (0 to 1200). Same curves as in the previous figure.]

Conclusions

The set-valued predictions induced by a partial preorder turn into precise ones when information becomes more precise. In contrast, the criteria based on a complete preorder can provide set-valued predictions even when uncertainty is quantified by probabilities. Set-valued predictions perform better than precise ones in the case of complex data sets: therefore, the most cautious rules should be preferred in highly uncertain environments.
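As a rough illustration of the second family and of the evaluation measure, the sketch below ranks partial assignments with a Hurwicz-style value and then averages the utilities of some predictions. The extended utility rows are the $\gamma = 0.8$ values from the example table. Everything else is an assumption: the mass function `MASS` is invented, the function names are hypothetical, and the Hurwicz value is taken as $\alpha$ times the lower expectation plus $(1 - \alpha)$ times the upper one, so that $\alpha = 1$ gives the generalized maximin and $\alpha = 0$ the generalized maximax. The source does not spell out this convention.

```python
# Extended utility matrix for gamma = 0.8 (values from the example table).
# Keys are sets of class indices, values are utilities per state of nature.
U_HAT = {
    frozenset({0}): [1.0, 0.2, 0.1],
    frozenset({1}): [0.2, 1.0, 0.2],
    frozenset({2}): [0.1, 0.2, 1.0],
    frozenset({0, 1}): [0.84, 0.84, 0.18],
    frozenset({0, 2}): [0.82, 0.20, 0.82],
    frozenset({1, 2}): [0.18, 0.84, 0.84],
    frozenset({0, 1, 2}): [0.7373, 0.7455, 0.7373],
}

# Hypothetical mass function over sets of state indices.
MASS = {
    frozenset({0}): 0.4,
    frozenset({1}): 0.3,
    frozenset({0, 1}): 0.3,
}


def hurwicz_value(mass, utilities, alpha):
    """alpha * lower expectation + (1 - alpha) * upper expectation (assumed form)."""
    lower = sum(m * min(utilities[j] for j in focal) for focal, m in mass.items())
    upper = sum(m * max(utilities[j] for j in focal) for focal, m in mass.items())
    return alpha * lower + (1 - alpha) * upper


def best_act(mass, extended, alpha):
    """Partial assignment with the highest Hurwicz value."""
    return max(extended, key=lambda act: hurwicz_value(mass, extended[act], alpha))


def averaged_utility(predicted_sets, true_states, extended):
    """Mean utility of the predicted sets given the true state indices."""
    total = sum(extended[act][j] for act, j in zip(predicted_sets, true_states))
    return total / len(true_states)


print(sorted(best_act(MASS, U_HAT, alpha=1.0)))  # [0, 1]
```

With this mass function, which spreads its belief over the first two classes, the cautious choice is the two-class set rather than either singleton, since its lower expected utility (0.84) exceeds that of any precise assignment (at most 0.52).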
