SLIDE 1

klaR: A Package Including Various Classification Tools

Christian Röver, Nils Raabe, Karsten Luebke and Uwe Ligges

Universität Dortmund, 44221 Dortmund, Germany

May 21, 2004

SLIDE 2

Overview:

1. Example data
2. Classification tools
3. Comparing classification results
4. Variable selection
5. Illustrating discrimination
6. Visualization of data structure

C. Röver, N. Raabe, K. Luebke and U. Ligges: klaR: A Package Including Various Classification Tools

SLIDE 3

B3 data: “West German business cycles”

  • data on 14 economic variables observed quarterly over 39 years (157 observations)
  • each quarter was assigned to one of 4 phases:
    1. upswing
    2. upper turning point
    3. downswing
    4. lower turning point
  • wanted: classification rule for phases

SLIDE 4

RDA: Regularized Discriminant Analysis [1]

  • generalization of LDA and QDA
  • assumptions similar to QDA (groups may differ in means and covariances)
  • covariance matrices are manipulated using two parameters (γ and λ)
  • more robust against multicollinearity
  • parameters are determined by minimizing the (estimated) misclassification rate

[1] Friedman, J.H. (1989): Regularized Discriminant Analysis. Journal of the American Statistical Association 84, 165-175.

SLIDE 5

RDA: special cases

  • (γ=0, λ=0): QDA; an individual covariance matrix for each group.
  • (γ=0, λ=1): LDA; a common covariance matrix.
  • (γ=1, λ=0): conditional independence, identical variances within class (similar to Naive Bayes).
  • (γ=1, λ=1): objects are assigned to the class with the nearest mean (Euclidean).
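The four special cases follow from how the two parameters blend the covariance estimates. The following is a minimal base-R sketch, assuming the parametrization described in the klaR help page for rda: each group covariance is first shrunk toward the pooled one with λ, then toward a multiple of the identity with γ. The helper name regularize_cov and the toy matrices are illustrative assumptions; this is not klaR's internal code.

```r
# Regularized group covariance, assuming the parametrization
#   Sigma_k(lambda)        = (1 - lambda) * Sigma_k + lambda * Sigma_pooled
#   Sigma_k(lambda, gamma) = (1 - gamma) * Sigma_k(lambda)
#                            + gamma * mean(diag(Sigma_k(lambda))) * I
# regularize_cov is a hypothetical helper, not part of klaR.
regularize_cov <- function(S_k, S_pooled, gamma, lambda) {
  p <- nrow(S_k)
  S_lambda <- (1 - lambda) * S_k + lambda * S_pooled
  (1 - gamma) * S_lambda + gamma * mean(diag(S_lambda)) * diag(p)
}

# Toy 2x2 covariance matrices (made-up numbers).
S_k      <- matrix(c(2.0, 0.5, 0.5, 1.0), nrow = 2)
S_pooled <- matrix(c(1.5, 0.2, 0.2, 1.2), nrow = 2)

qda_like   <- regularize_cov(S_k, S_pooled, gamma = 0, lambda = 0)  # group covariance unchanged
lda_like   <- regularize_cov(S_k, S_pooled, gamma = 0, lambda = 1)  # pooled covariance
indep_like <- regularize_cov(S_k, S_pooled, gamma = 1, lambda = 0)  # spherical, scale differs by group
spherical  <- regularize_cov(S_k, S_pooled, gamma = 1, lambda = 1)  # spherical, same scale for all groups
```

With γ=1 and λ=1 every group receives the same multiple of the identity matrix, which is why classification then reduces to the nearest class mean in Euclidean distance (for equal priors).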

SLIDE 6

RDA: examples

  • set parameters manually...

> x <- rda(PHASEN~., data=B3[train,], gamma=0.05, lambda=0.1)

  • ...or optimize misclassification rate.

> x <- rda(PHASEN~., data=B3[train,])

  • prediction etc. as usual

> predict(x, B3[test,])
$class
 [1] 3 3 3 4 4 4 4 1 3 1 1 1 1 1 1 1 4 4 4 1 1 4 4 4 1 1

SLIDE 7

SVMlight [2]

  • interface to T. Joachims’ Support Vector Machine implementation
  • supports loss parameters and one-against-all classification
  • returns comparable membership scores (‘posterior probabilities’)
  • example:

> x <- svmlight(PHASEN ~ ., data=B3[train,])
> predict(x, B3[test,])

[2] Joachims, T. (2004): SVMlight. http://svmlight.joachims.org/

SLIDE 8

Comparing classifications

  • looking at misclassifications:

> errormatrix(true.phase, rda.prediction)
       predicted
true    dn ltp up utp -SUM-
  dn     2   7  0   0     7
  ltp    2   4  0   0     2
  up     1  12 14   0    13
  utp    0   5  0   1     5
  -SUM-  3  24  0   0    27

  • 27 out of 48 are misclassified, worst rates for (true) “utp”, most misclassifications go into class “ltp”, ...
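The -SUM- margins count only the off-diagonal (misclassified) cells. A minimal base-R sketch of that bookkeeping, using the counts from the output above; cells that are not printed there are assumed to be zero, and the variable names are illustrative only:

```r
# Counts taken from the errormatrix output above; cells not printed there
# are assumed to be zero.
classes <- c("dn", "ltp", "up", "utp")
counts <- matrix(c(2,  7,  0, 0,
                   2,  4,  0, 0,
                   1, 12, 14, 0,
                   0,  5,  0, 1),
                 nrow = 4, byrow = TRUE,
                 dimnames = list(true = classes, predicted = classes))

off_diag   <- counts - diag(diag(counts))  # keep only the misclassified counts
row_errors <- rowSums(off_diag)            # misclassified per true class
col_errors <- colSums(off_diag)            # wrongly assigned per predicted class
total_err  <- sum(off_diag)                # overall number of misclassifications
error_rate <- total_err / sum(counts)      # share of misclassified observations
```

Dividing row_errors by rowSums(counts) gives the per-class error rates behind the statement that true “utp” fares worst.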

SLIDE 9

Comparing classifications

  • looking at posterior assignments:

$posterior
        up   utp    dn   ltp
[1,] 0.000 0.000 0.978 0.022
[2,] 0.001 0.000 0.995 0.005
[3,] 0.077 0.000 0.151 0.772
[4,] 0.249 0.000 0.000 0.750
[5,] 0.256 0.000 0.005 0.739

  • each observation is assigned to every class with a certain posterior probability or membership

SLIDE 10

Comparing classifications

  • a probability distribution over 4 classes can be illustrated by a point in a 3-dimensional simplex (tetrahedron, ‘barycentric plot’):
    – each corner corresponds to one class,
    – the probability for a class is proportional to the distance to the opposite side

  • example:

> quadplot(rdapred$posterior, [...] )
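The mapping from four class probabilities to a point in the tetrahedron is a weighted average of the corner coordinates. A minimal base-R sketch; the corner coordinates and the helper name to_simplex are illustrative assumptions and not necessarily what quadplot uses internally:

```r
# Corners of a regular tetrahedron in 3D; one corner per class.
# The coordinates are an illustrative choice.
corners <- rbind(c( 1,  1,  1),
                 c( 1, -1, -1),
                 c(-1,  1, -1),
                 c(-1, -1,  1))

# Map rows of a posterior matrix (each row summing to 1) to points in the
# simplex: every point is the probability-weighted average of the corners.
to_simplex <- function(posterior) {
  posterior <- rbind(posterior)   # accept a single vector as a one-row matrix
  posterior %*% corners
}

corner_point <- to_simplex(c(1, 0, 0, 0))   # certain assignment: a corner
center_point <- to_simplex(rep(1/4, 4))     # maximal uncertainty: the centroid
```

Confident classifiers therefore produce points near corners and edges, while uncertain ones produce points in the interior.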

SLIDE 11

RDA posterior assignments

[Figure: barycentric plot (tetrahedron) of the RDA posterior assignments, corners labeled 1 to 4]

SLIDE 12

SVMlight posterior assignments

[Figure: barycentric plot (tetrahedron) of the SVMlight posterior assignments, corners labeled 1 to 4]

SLIDE 13

Comparing classifications

  • RDA: greater posterior probabilities (points on edges and corners)
  • SVMlight: more uncertainty (points inside simplex)

➜ measure these features for comparison

SLIDE 14

Comparing classifications

  • derive performance measures [3]:
    – Correctness rate: 1 - error rate
    – Accuracy: distance to ‘true’ corner
    – Ability to separate: distance to classified corner
    – Confidence: mean membership of assigned class (either by class or average)

[3] Garczarek, U. and Weihs, C. (2003): Standardizing the Comparison of Partitions. Computational Statistics 18, 143-162.
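The two simplest measures can be computed directly from a posterior matrix. A minimal base-R sketch of the correctness rate and the average confidence as they are worded on the slide; it is not klaR's ucpm implementation, it omits the distance-based measures, and the true classes used here are made up for illustration:

```r
# Posterior rows copied from the posterior output shown earlier; the true
# classes below are hypothetical and serve only as an illustration.
posterior <- rbind(c(0.000, 0.000, 0.978, 0.022),
                   c(0.001, 0.000, 0.995, 0.005),
                   c(0.077, 0.000, 0.151, 0.772),
                   c(0.249, 0.000, 0.000, 0.750),
                   c(0.256, 0.000, 0.005, 0.739))
colnames(posterior) <- c("up", "utp", "dn", "ltp")
true_class <- c("dn", "dn", "dn", "ltp", "up")   # hypothetical labels

# Each observation is assigned to the class with the largest membership.
assigned    <- colnames(posterior)[max.col(posterior, ties.method = "first")]
correctness <- mean(assigned == true_class)     # 1 - error rate
confidence  <- mean(apply(posterior, 1, max))   # mean membership of assigned class
```

Averaging the row maxima within each class instead of over all rows would give a per-class variant of the confidence.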

SLIDE 15

> ucpm(m=rdapred$posterior, tc=B3$PHASEN[test])
$CR
[1] 0.5833333
$AC
[1] 0.3250307
$AS
[1] 0.981954
$CF
[1] 0.9889456
$CFvec
        1         2         3         4
0.9912088 1.0000000 0.9999684 0.9511723

SLIDE 16

Comparing classifications

                                                      LDA   RDA   SVM
Correctness rate (1 - error rate)                     0.44  0.58  0.54
Accuracy (distance to true corner)                    0.03  0.33  0.17
Ability to separate (distance to classified corner)   0.75  0.98  0.29
Confidence (mean membership of assigned class)        0.83  0.99  0.47

SLIDE 17

Variable selection

  • stepclass: stepwise selection using the (estimated) misclassification rate
    – forward selection: add variables to the model
    – backward selection: remove variables from the model
    – or both directions

  • works for most classification methods
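The forward direction can be sketched as a greedy search over candidate variables. The helper forward_select, its min_gain argument and the toy error function below are hypothetical and only illustrate the strategy; stepclass itself estimates the error rate from the data and has its own stopping rule.

```r
# Greedy forward selection. err_fun(vars) must return an estimated
# misclassification rate for a model using the variables in vars
# (for example a cross-validated error). forward_select is a hypothetical
# helper, not part of klaR.
forward_select <- function(candidates, err_fun, min_gain = 0) {
  selected <- character(0)
  best_err <- Inf
  repeat {
    remaining <- setdiff(candidates, selected)
    if (length(remaining) == 0) break
    # Error of each model obtained by adding one more variable.
    errs <- vapply(remaining,
                   function(v) err_fun(c(selected, v)),
                   numeric(1))
    if (best_err - min(errs) <= min_gain) break   # no sufficient improvement
    selected <- c(selected, remaining[which.min(errs)])
    best_err <- min(errs)
  }
  list(model = selected, error = best_err)
}

# Toy error function: variables "a" and "b" help, "c" adds nothing.
toy_err <- function(vars) {
  0.5 - 0.2 * ("a" %in% vars) - 0.1 * ("b" %in% vars)
}
res <- forward_select(c("a", "b", "c"), toy_err)
```

Because the error function is passed in as an argument, the same search works with any classifier, which matches the remark that the approach works for most classification methods.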

SLIDE 18

Variable selection

  • example:

> x <- stepclass(PHASEN~., data=B3[train,],
+                method="qda", prior=rep(1/4,4))
> x
method      : qda
final model : EWAJW, LSTKJW, ZINSLR
error rate  : 0.3265

  • error rate for test set is 29% (71% correct)

SLIDE 19

Visualization of partitionings

  • how are classes located / separated?
  • look at partitioning for every pair of variables...

> partimat(B3[,x$model$name], B3[,"PHASEN"],
+          method="qda", plot.matrix=TRUE)

SLIDE 20

[Figure: partimat plot matrix of the QDA partitions for each pair of the variables EWAJW, LSTKJW and ZINSLR, observations plotted as class numbers 1 to 4; panel error rates 0.287, 0.459 and 0.42, each reported in two panels]

SLIDE 21

Visualization of data structure

  • EDAM: Eight Directions Arranged Map [4]
  • similar to Self Organizing Maps
  • observations (and gaps between them) are arranged on a 2D grid in order to reflect distances
  • example:

> lcEDAM <- EDAM(B3[test,-1], classes=B3$PHASEN[test],
+                standardize = TRUE, iter.max = 20)

[4] Raabe, N. (2003): Vergleich von Kohonen Self-Organizing-Maps mit einem nichtsimultanen Klassifikations- und Visualisierungsverfahren. Diploma Thesis, University of Dortmund.

SLIDE 22

[Figure: EDAM map of the test observations on a grid, axes Dimension 1 (1 to 8) and Dimension 2 (1 to 6); legend: upswing, upper turning point, downswing, lower turning point]

SLIDE 23

[Figure: the same EDAM map with each observation labeled by year and quarter (1982,2 to 1994,1), axes Dimension 1 and Dimension 2; legend: upswing, upper turning point, downswing, lower turning point]

SLIDE 24

[Figure: the labeled EDAM map (year and quarter, 1982,2 to 1994,1; axes Dimension 1 and Dimension 2; legend: upswing, upper turning point, downswing, lower turning point) with an additional key numbered 1 to 4]

SLIDE 25

> require(klaR)

  • seen:
    – Classification tools: rda, svmlight
    – Comparing classifications: errormatrix, ucpm
    – 3D barycentric plots: quadplot
    – Variable selection: stepclass
    – Illustrating classifications: partimat
    – Data visualization: EDAM
  • further features:
    – 2D barycentric plots: triplot
    – Hidden Markov Modelling: hmm.sop
    – Simple k-Nearest Neighbour: sknn
