

SLIDE 1

Training, Validation, Testing

Testing

  • A machine learning system has been trained, using both T and V, to yield ĥ
  • We cannot report L_V(ĥ) as the measure of performance
  • The set V is tainted since we used it during training
  • Performance measures are accepted only on pristine sets, not used in any way for training
  • We need to test the system on a third set S, the test set
  • Estimate the true risk L_p(ĥ) = E_p[ℓ(y, ĥ(x))] by computing the empirical risk L_S(ĥ) = (1/|S|) Σ_{n=1}^{|S|} ℓ(y_n, ĥ(x_n)) on S
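The empirical-risk estimate above can be sketched in plain Python. The predictor `h_hat`, the toy set `S`, and the choice of zero-one loss are all illustrative assumptions, not part of the slides:

```python
def zero_one_loss(y, y_pred):
    """Zero-one loss: 1 for a mistake, 0 for a correct prediction."""
    return 0.0 if y == y_pred else 1.0

def empirical_risk(h_hat, S, loss=zero_one_loss):
    """L_S(h) = (1/|S|) * sum of loss(y_n, h(x_n)) over (x_n, y_n) in S."""
    return sum(loss(y, h_hat(x)) for x, y in S) / len(S)

# Toy example: a threshold classifier on scalar inputs (hypothetical).
h_hat = lambda x: 1 if x > 0.5 else 0
S = [(0.9, 1), (0.2, 0), (0.7, 0), (0.1, 0)]  # one mistake, at x = 0.7
print(empirical_risk(h_hat, S))  # 0.25
```

Because S was never used for training or validation, this average is an unbiased estimate of the true risk L_p(ĥ).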

COMPSCI 527 — Computer Vision Basics of Machine Learning 16 / 21

O

SLIDE 2

Training, Validation, Testing

Summary of Sets Involved

  • A training set T to train the predictor given a specific set of hyper-parameters (if any)
  • A validation set V to choose good hyper-parameters, or for deciding termination
  • A test set S to evaluate the generalization performance of the predictor ĥ learned by training on T and validating on V
  • Resampling techniques (“cross-validation”) exist for making the same set play the role of both T and V
  • S must still be entirely separate
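The resampling idea can be sketched as k-fold cross-validation: the data is split into k folds, and each fold serves as V once while the remaining folds serve as T. This is a minimal illustration (the held-out test set S is assumed to live elsewhere and is never touched here):

```python
def k_fold_splits(data, k):
    """Yield (train, validation) pairs; each fold is validation exactly once."""
    folds = [data[i::k] for i in range(k)]  # round-robin split into k folds
    for i in range(k):
        validation = folds[i]
        train = [x for j, fold in enumerate(folds) if j != i for x in fold]
        yield train, validation

# Each sample appears in exactly one validation fold, and never in both
# roles at once within a single split.
data = list(range(10))
for train, val in k_fold_splits(data, 5):
    assert set(train) | set(val) == set(data)
    assert not set(train) & set(val)
```

Averaging the validation performance over the k splits gives a more stable estimate than a single T/V split, at the cost of training k times.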


SLIDE 3

The State of the Art of Image Classification


  • ImageNet Large Scale Visual Recognition Challenge (ILSVRC)
  • Based on ImageNet: 1.4 million images, 1000 categories (Fei-Fei Li, Stanford)
  • Three different competitions:
    • Classification: one label per image; 1.2M images available for training, 50k for validation, 100k withheld for testing; zero-one loss for performance evaluation
    • Localization: classification, plus a bounding box. Correct if ≥ 50% overlap with true box
    • Detection: same as localization, but find every instance in the image. Measure the fraction of mistakes (false positives, false negatives)
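The localization criterion above can be sketched in a few lines, assuming “overlap” means intersection-over-union (IoU) between the predicted and true boxes; boxes here are hypothetical `(x_min, y_min, x_max, y_max)` tuples:

```python
def iou(a, b):
    """Intersection-over-union of two axis-aligned boxes (x0, y0, x1, y1)."""
    ix = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))  # intersection width
    iy = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))  # intersection height
    inter = ix * iy
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union if union > 0 else 0.0

def localization_correct(pred_box, true_box, threshold=0.5):
    """A prediction counts as correct if overlap with the true box is >= 50%."""
    return iou(pred_box, true_box) >= threshold

print(localization_correct((0, 0, 2, 2), (0, 0, 2, 3)))  # True (IoU = 2/3)
print(localization_correct((0, 0, 2, 2), (1, 0, 3, 2)))  # False (IoU = 1/3)
```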


SLIDE 4

The State of the Art of Image Classification

[Image from Russakovsky et al., ImageNet Large Scale Visual Recognition Challenge, Int’l. J. Comp. Vision 115:211–252, 2015]

SLIDE 5

The State of the Art of Image Classification

Difficulties of ILSVRC

  • Images are “natural”: arbitrary backgrounds, different sizes, viewpoints, lighting; partially visible objects
  • 1,000 categories, subtle distinctions. Example: Siberian husky and Eskimo dog
  • Variations of appearance within one category can be significant (how many lamps can you think of?)
  • What is the label of one image? For instance, a picture of a group of people examining a fishing rod was labeled as “reel”


SLIDE 6

The State of the Art of Image Classification

Performance for Image Classification

  • 2010: 28.2 percent error
  • 2017: 2.3 percent error (ensemble of several deep networks)
  • Improvement results from both architectural insights (residuals, squeeze-and-excitation networks, ...) and persistent engineering
  • There is even a book on “tricks of the trade” in deep learning!
  • We will see some after studying the basics
