Feature Selection using/for Transductive Support Vector Machine


SLIDE 1

NIPS03 Workshop on Feature Extraction and Feature Selection Challenge

Feature Selection using/for Transductive Support Vector Machine

  • Mr. Zhili Wu
  • Dr. Chun-hung Li

Department of Computer Science, Hong Kong Baptist University

SLIDE 2

Feature Selection Using/for Transductive SVM (TSVM) – Contents

  • Introduction to feature selection
  • Why TSVM works
  • Technique sharing – not limited to TSVM
  • Several technique highlights
  • Conclusion
  • Your comments & doubts
SLIDE 3

Feature Selection Using/for Transductive SVM (TSVM) – Feature Selection

Feature selection (Competition)

  • Impact of Weston's dataset selection
  • Your algorithm A*
  • Others' algorithms A1, …, An
  • You have M = 2^d − 1 possible feature sets for a d-dimensional dataset: F1, …, FM
  • L(A, D(Fi)) = loss of algorithm A on dataset D(Fi)
  • Your goal: find a feature set F* among F1, …, FM such that L(A*, D(F*)) < min_{i=1,…,n} L(Ai, D(F*))
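The size of the search space above can be made concrete with a short sketch; the function name is mine, not from the slides:

```python
from itertools import combinations

def all_feature_sets(d):
    """Enumerate every non-empty feature subset F1..FM of a
    d-dimensional dataset; there are M = 2**d - 1 of them."""
    subsets = []
    for k in range(1, d + 1):
        subsets.extend(combinations(range(d), k))
    return subsets

# Even d = 4 gives 2**4 - 1 = 15 candidate feature sets, which is
# why exhaustive search is hopeless for d in the thousands.
sets_d4 = all_feature_sets(4)
```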

SLIDE 4

Feature Selection Using/for Transductive SVM (TSVM) – “No Free Feature”

“No Free Feature” Theorem

  • From “No Free Lunch” (Weston, NIPS 2002)
  • The generalization error of two datasets, averaged over all algorithms, is the same: E_A[R_gen^A[D]] = E_A[R_gen^A[D′]]
  • Since any two feature sets induce two new datasets: E_A[R_gen^A[D(F)]] = E_A[R_gen^A[D(F′)]]
  • Consequence: techniques are very important!
SLIDE 5

Technique 1 – Transductive SVM (TSVM)

Transductive SVM (SVMlight by Joachims)

SLIDE 6

Technique 1 – Transductive SVM (TSVM): A Simpler Explanation

  1. Train an SVM on the labeled data only
  2. Predict the unlabeled data, assigning a chosen fraction to Pos and the rest to Neg
  3. Train on the whole dataset; switch some pairs of Pos/Neg labels to improve a goodness measure, and repeat step 3
  4. Repeat steps 2 & 3, increasing the unlabeled data's weight, till they fully contribute
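A minimal sketch of steps 1–4 above, assuming a toy subgradient-descent linear SVM in place of the SVMlight QP solver, and omitting both the pair-switching refinement of step 3 and SVMlight's gradual ramp-up of the unlabeled-data cost:

```python
import numpy as np

def train_linear_svm(X, y, C=1.0, epochs=200, lr=0.01):
    """Tiny linear SVM via subgradient descent on the primal
    hinge loss -- a stand-in for the SVMlight QP solver."""
    n, d = X.shape
    w, b = np.zeros(d), 0.0
    for _ in range(epochs):
        margins = y * (X @ w + b)
        viol = margins < 1                                     # hinge-loss violators
        grad_w = w - C * (y[viol][:, None] * X[viol]).sum(axis=0)
        grad_b = -C * y[viol].sum()
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

def tsvm_self_labeling(X_lab, y_lab, X_unl, pos_fraction, rounds=5):
    """Steps 1-4: train on labeled data, label the top pos_fraction
    of unlabeled points +1, and retrain on everything."""
    w, b = train_linear_svm(X_lab, y_lab)             # step 1
    for _ in range(rounds):                           # step 4: repeat 2 & 3
        scores = X_unl @ w + b                        # step 2: predict unlabeled
        k = int(round(pos_fraction * len(X_unl)))
        y_unl = -np.ones(len(X_unl))
        y_unl[np.argsort(-scores)[:k]] = 1.0          # top-k scores -> Pos
        X_all = np.vstack([X_lab, X_unl])             # step 3: retrain on all
        y_all = np.concatenate([y_lab, y_unl])
        w, b = train_linear_svm(X_all, y_all)
    return w, b
```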

SLIDE 7

Why TSVM Works for the FS Competition

  • Unlabeled (validation + test) data are provided
  • Accuracy is the first-priority measure
  • The fraction of Pos/Neg unlabeled samples is provided
  • Also, effective & compatible tools:
  • Dr. Chih-Jen Lin's LIBSVM
  • LIBSVM + SVMlight

SLIDE 8

Feature Selection Using/for Transductive SVM (TSVM) – Technique Summary

Technique summary across the five datasets (Arcene, Dexter, Dorothea, Gisette, Madelon):

  • Score: F-score; Odd Ratio; Fisher score (also tried: T-test, MI, BNS, BER score)
  • Normalize: “Normalize 1” (0 mean, unit std); D_ij / sqrt(row-sum * col-sum); 7~20 PCs by PCA
  • Transduction: yes for four of the five datasets
  • Kernel: RBF (g=1, C=1); Linear (C+/C− = 19.5); Linear; Poly 2; RBF (C=2^5, g=2^−6)
  • Further feature reduction: use w to select and rescale features; scale features by f-score
  • Remarks: model selection by CV seems to overfit?
  • BER & rank by submissions on 1st Dec: 11.52 (11th), 4.4 (6th), 1.58 (11th)

SLIDE 9

Madelon – A Fisher-Score Variant

  • Score each feature by (µ+ − µ−) / (s+ + s−)
  • 13 features are selected
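The variant above fits in a few lines; ranking by the magnitude of the score (and the helper names) are my assumptions, not stated on the slide:

```python
import numpy as np

def fisher_score_variant(X, y):
    """Per-feature score (mu+ - mu-) / (s+ + s-) from the slide,
    with class means and standard deviations taken column-wise."""
    pos, neg = X[y == 1], X[y == -1]
    return (pos.mean(axis=0) - neg.mean(axis=0)) / (pos.std(axis=0) + neg.std(axis=0))

def select_top_k(X, y, k=13):
    """Keep the k features with the largest |score| (k = 13 on Madelon)."""
    return np.argsort(-np.abs(fisher_score_variant(X, y)))[:k]
```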

SLIDE 10

Dorothea – OddRatio

Contingency table of a binary feature value against the true class:

              Feature = 1   Feature = 0
  Class +1         d             c
  Class −1         b             a

  • ExpProb OddRatio [1] for unbalanced classes: exp( P(1|class+) − P(1|class−) ) = exp( d/(c+d) − b/(a+b) )
  • Other measures, like BNS [2], MI, …
  • Is BER a score indicating goodness of features? The balanced error rate (BER) is the average of the errors on each class: BER = 0.5 * ( b/(a+b) + c/(c+d) )

  1. Feature Selection for Unbalanced Class Distribution and Naïve Bayes, Dunja Mladenic and Marko Grobelnik
  2. An Extensive Empirical Study of Feature Selection Metrics for Text Classification, George Forman, JMLR 2003 special issue on variable and feature selection
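Both measures read straight off the contingency counts; the function names are mine:

```python
import math

def exp_prob(a, b, c, d):
    """ExpProb OddRatio: exp(P(1|class+) - P(1|class-)),
    with d, c = feature 1/0 counts in class +1 and b, a in class -1."""
    return math.exp(d / (c + d) - b / (a + b))

def ber(a, b, c, d):
    """Balanced error rate when the raw binary feature is used as
    the class prediction: 0.5 * (b/(a+b) + c/(c+d))."""
    return 0.5 * (b / (a + b) + c / (c + d))

# A perfectly predictive but unbalanced feature: every +1 sample has
# value 1 (d=10, c=0), every -1 sample has value 0 (a=40, b=0), so
# exp_prob gives e**1 and ber gives 0.
```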


SLIDE 11

Dexter: A Simple Linear-TSVM-RFE

  1. Prune some features using scores that are easily calculated
  2. Rescale the remaining features by their scores
  3. Train a linear TSVM (with good generalization ability)
  4. Calculate the feature weight vector w
  5. Rank features and rescale features by w
  6. Repeat 3~5 till a balance of feature relevance & accuracy is reached
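The rank-and-rescale loop of steps 3–5 can be sketched as follows; a ridge-regression fit on ±1 labels stands in here for the linear TSVM, so this only illustrates the mechanics, not the transductive training itself:

```python
import numpy as np

def linear_weights(X, y, lam=1e-2):
    """Cheap stand-in for the linear-TSVM weight vector w (step 4):
    ridge-regression weights fitted to +/-1 labels."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)

def rfe_rescale(X, y, keep, rounds=3):
    """Repeat steps 3-5: fit, rank features by |w|, keep the best,
    and rescale the survivors by |w| before the next fit."""
    active = np.arange(X.shape[1])          # indices of surviving features
    Xa = X.astype(float).copy()
    for _ in range(rounds):
        w = linear_weights(Xa, y)
        order = np.argsort(-np.abs(w))[:keep]          # step 5: rank by |w|
        active = active[order]
        Xa = Xa[:, order] * np.abs(w[order])           # rescale by |w|
    return active
```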


SLIDE 12

Feature Selection Using/for Transductive SVM (TSVM) – Conclusion

  1. No Free Feature
  2. TSVM
  3. Techniques
     1. Scoring methods
     2. TSVM-RFE
  4. Other important issues not mentioned:
     1. Model selection
     2. Normalization
     3. …
SLIDE 13

Your Comments! Thanks!