
Automatic Sample-by-sample Model Selection Between Two Off-the-shelf Classifiers



  1. Automatic Sample-by-sample Model Selection Between Two Off-the-shelf Classifiers. Steve P. Chadwick, University of Texas at Dallas.

  2. Model Selection by Predicting the Better Classifier
     Idea: two classifiers, a "primary" and a "secondary"; use confidence to predict which one is expected to perform best.
     Pima Indian Diabetes
     Primary classifier: Fisher LD. [Bar chart: percentage of error within equal-sized confidence bins.] Fisher LD classifies over 70% of the data before half the total error is accumulated.
     Secondary classifier: 1-Nearest Neighbor. [Bar chart: percentage of error within equal-sized confidence bins.] 1-Nearest Neighbor classifies about 50% of the data when half the total error is accumulated.
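The bar charts summarized above plot the percentage of error within equal-sized bins after the test samples are sorted by the classifier's confidence. A minimal sketch of that bookkeeping, assuming per-sample confidence scores and correctness flags are already available (array names and helper functions are illustrative, not from the slides):

```python
import numpy as np

def error_profile(confidence, correct, n_bins=10):
    """Fraction of the total error falling in each confidence-ordered, equal-sized bin."""
    order = np.argsort(-np.asarray(confidence, dtype=float))   # most confident first
    errors = (~np.asarray(correct, dtype=bool))[order]         # True where misclassified
    bins = np.array_split(errors, n_bins)                      # equal-sized bins
    per_bin = np.array([b.sum() for b in bins], dtype=float)
    return per_bin / max(per_bin.sum(), 1.0)

def data_before_half_error(confidence, correct):
    """Fraction of samples classified before half of the total error has accumulated."""
    order = np.argsort(-np.asarray(confidence, dtype=float))
    errors = (~np.asarray(correct, dtype=bool))[order]
    half = errors.sum() / 2.0
    k = int(np.searchsorted(np.cumsum(errors), half))          # first index reaching half the error
    return k / len(errors)
```

Under this reading, `data_before_half_error` is the quantity quoted on the slide (the "over 70%" and "about 50%" figures for Fisher LD and 1-Nearest Neighbor).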

  3. Ljubljana Breast Cancer
     Primary classifier: Fisher LD. [Bar chart: percentage of error within equal-sized confidence bins.] Fisher LD classifies over 60% of the data before half the total error is accumulated.
     Secondary classifier: Nearest Unlike Neighbor. [Bar chart: percentage of error within equal-sized confidence bins.] NUN classifies about 60% of the data before half the total error is accumulated.

  4. Confidence measure profiles (Ljubljana Breast Cancer)
     [Five bar charts, one per confidence measure:]
     - Fisher LD using 1/(w^T x + s)
     - 1-Nearest Neighbor using the 1-neighbor distance
     - MSE using Q
     - 1-Nearest Neighbor using distance from centers
     - 1-Nearest Neighbor using the nearest-unlike-neighbor ratio
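The slide names the confidence measures but not their exact formulas. As a rough illustration, assuming Euclidean distance and the common definitions, some of these measures might be computed as follows (the interpretations are assumed, not taken from the slides):

```python
import numpy as np

def one_nn_distance(x, X_train):
    """1-neighbor distance: distance to the closest training sample (smaller = more confident)."""
    return np.linalg.norm(X_train - x, axis=1).min()

def nearest_unlike_neighbor_ratio(x, X_train, y_train, predicted_label):
    """Ratio of nearest same-class distance to nearest unlike-class distance (smaller = more confident)."""
    d = np.linalg.norm(X_train - x, axis=1)
    return d[y_train == predicted_label].min() / d[y_train != predicted_label].min()

def fisher_ld_confidence(x, w, s):
    """Confidence of the form 1 / (w^T x + s), as named on the slide (interpretation assumed)."""
    return 1.0 / (np.dot(w, x) + s)
```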

  5. Differential error
     Ljubljana Breast Cancer (30% class A, 70% class B): [Bar charts for Fisher LD and Nearest Unlike Neighbor.]
     Pima Indian Diabetes (35% class A, 65% class B): [Bar charts for Fisher LD and 1-Nearest Neighbor.]
     Synthetic Data (50% class A, 50% class B): [Bar charts for Fisher LD and 1-Nearest Neighbor.]
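A hedged sketch of what "differential error" appears to measure here: the same binned error profile as before, broken down by true class, so the class imbalance listed for each dataset becomes visible (the exact binning used on the slides is not specified):

```python
import numpy as np

def differential_error_profile(confidence, correct, y_true, n_bins=10):
    """Per-class error counts across confidence-ordered, equal-sized bins."""
    order = np.argsort(-np.asarray(confidence, dtype=float))
    errors = (~np.asarray(correct, dtype=bool))[order]
    labels = np.asarray(y_true)[order]
    bins = np.array_split(np.arange(len(errors)), n_bins)      # indices of each bin
    return {
        cls: np.array([errors[idx][labels[idx] == cls].sum() for idx in bins])
        for cls in np.unique(labels)
    }
```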

  6. Obstacles
     - About 25% of the training data contributes to calculating the selection LD when combining linear discriminant and nearest neighbor classifiers. [Plot: selection LD and data in q-space.]
     - The different confidence measures have different ranges, which makes them difficult to compare with each other.
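One common way to work around the second obstacle (an illustration, not necessarily the author's approach) is to map each confidence measure onto a shared scale, for example its empirical percentile over a reference set, before the selection rule compares the two classifiers:

```python
import numpy as np

def to_percentile(scores, reference_scores):
    """Map raw confidence scores to empirical percentiles in [0, 1] of a reference sample."""
    ref = np.sort(np.asarray(reference_scores, dtype=float))
    ranks = np.searchsorted(ref, np.asarray(scores, dtype=float), side="right")
    return ranks / len(ref)

# Hypothetical usage: put both classifiers' confidences on the same [0, 1] scale,
# then pick, sample by sample, the classifier with the higher normalized confidence.
# primary_norm   = to_percentile(primary_conf_test,   primary_conf_train)
# secondary_norm = to_percentile(secondary_conf_test, secondary_conf_train)
# use_secondary  = secondary_norm > primary_norm
```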
