SLIDE 1

Day 1 Lecture 2

Classification

SLIDE 2

Image Classification

Set of predefined categories [e.g., table, apple, dog, giraffe]. Binary classification: labels in [1, 0].

DOG

SLIDE 3

Image Classification

SLIDE 4

Image Classification pipeline

Dog

Slide credit: Jose M Àlvarez

SLIDE 5

Image Classification pipeline

Learned Representation → Dog

Slide credit: Jose M Àlvarez

SLIDE 6

Image Classification pipeline

Part I: End-to-end learning (E2E)

Learned Representation → Dog

Slide credit: Jose M Àlvarez

SLIDE 7

Image Classification: Example Datasets

e.g., MNIST: training set of 60,000 examples, test set of 10,000 examples.

SLIDE 8

Image Classification: Example Datasets

SLIDE 9

Training set

[Figure: the training set as an N × D matrix, with N training examples as rows and D features as columns, e.g.:
 2.1 3.2 4.8 0.1 0.0
 2.6 3.1 1.4 2.5 0.2
 1.0 2.0 1.0 2.3 3.2
 9.3 6.4 0.3 2.0 5.0
 3.2 1.0 6.9 9.1 9.0
 3.5 5.4 5.5 3.2 1.0]
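
In code, such a training set is just a two-dimensional array. A minimal NumPy sketch, assuming the values above form a 6 × 5 matrix; the binary labels y are made up for illustration:

    import numpy as np

    # Training set X: N = 6 examples (rows), D = 5 features (columns).
    X = np.array([
        [2.1, 3.2, 4.8, 0.1, 0.0],
        [2.6, 3.1, 1.4, 2.5, 0.2],
        [1.0, 2.0, 1.0, 2.3, 3.2],
        [9.3, 6.4, 0.3, 2.0, 5.0],
        [3.2, 1.0, 6.9, 9.1, 9.0],
        [3.5, 5.4, 5.5, 3.2, 1.0],
    ])
    y = np.array([1, 0, 0, 1, 1, 0])  # made-up label per example

    N, D = X.shape
    print(N, D)  # 6 5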

SLIDE 10

Train/Test Splits

[Pipeline: Dataset → shuffle → Shuffled data → split → Training data (70%) and Test data (30%);
 Learning algorithm: fit(X, y) → Model;
 Prediction algorithm: predict(X) → Predictions;
 Compute error/accuracy: score(X, y)]

Out-of-sample error estimate? NO!
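
The fit(X, y) / predict(X) / score(X, y) names on the slide match the scikit-learn estimator API, so the whole diagram can be sketched as follows; the digits dataset and the logistic-regression model are illustrative choices, not taken from the slides. Scoring on the held-out test split is what makes the result an out-of-sample estimate:

    from sklearn.datasets import load_digits
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import train_test_split

    X, y = load_digits(return_X_y=True)

    # Shuffle and split: 70% training data, 30% test data.
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.3, shuffle=True, random_state=0)

    model = LogisticRegression(max_iter=1000)  # learning algorithm
    model.fit(X_train, y_train)                # fit(X, y) -> model
    predictions = model.predict(X_test)        # predict(X) -> predictions
    print(model.score(X_test, y_test))         # score(X, y): accuracy on held-out data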

SLIDE 11

Metrics

Confusion matrices provide a per-class comparison of the automatic classification results against the ground-truth annotations.

SLIDE 12

Metrics

Correct classifications appear on the diagonal; the remaining cells correspond to errors.

                              Prediction
                    Class 1    Class 2    Class 3
Ground    Class 1   x(1,1)     x(1,2)     x(1,3)
Truth     Class 2   x(2,1)     x(2,2)     x(2,3)
          Class 3   x(3,1)     x(3,2)     x(3,3)
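
As a hedged sketch, the same matrix can be computed with scikit-learn's confusion_matrix; the label vectors here are made up for illustration:

    from sklearn.metrics import confusion_matrix

    y_true = [1, 1, 2, 2, 3, 3, 3]  # ground-truth class per example
    y_pred = [1, 2, 2, 2, 3, 3, 1]  # predicted class per example

    # Rows follow the ground truth, columns the predictions,
    # matching the layout above.
    print(confusion_matrix(y_true, y_pred, labels=[1, 2, 3]))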

SLIDE 13

Metrics

Special case: binary classifiers, in terms of "Positive" vs. "Negative".

                              Prediction
                    Positive                Negative
Ground    Positive  True positive (TP)      False negative (FN)
Truth     Negative  False positive (FP)     True negative (TN)

SLIDE 14

Metrics

Accuracy measures the proportion of correct classifications, without distinguishing between classes.

Multiclass case: accuracy = sum of the diagonal terms x(i,i) / total number of examples.
Binary case: accuracy = (TP + TN) / (TP + TN + FP + FN).
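
A small sketch of that computation on an illustrative confusion matrix; the diagonal holds the correct classifications:

    import numpy as np

    # Illustrative 3-class confusion matrix (rows: ground truth, cols: prediction).
    cm = np.array([[50,  3,  2],
                   [ 4, 40,  6],
                   [ 1,  5, 44]])
    accuracy = np.trace(cm) / cm.sum()  # sum of the diagonal / total
    print(accuracy)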

SLIDE 15

Metrics

Given a reference class, its Precision (P) and Recall (R) are complementary measures of relevance:

P = TP / (TP + FP)        R = TP / (TP + FN)

Example: the relevant class is "Positive" in a binary classifier.

                              Prediction
                    Positive                Negative
Ground    Positive  True positive (TP)      False negative (FN)
Truth     Negative  False positive (FP)     True negative (TN)

Image: "Precisionrecall" by Walber, own work, licensed under Creative Commons Attribution-Share Alike 4.0 via Wikimedia Commons: http://commons.wikimedia.org/wiki/File:Precisionrecall.svg#mediaviewer/File:Precisionrecall.svg
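
A minimal sketch with illustrative counts:

    # Precision and recall for the "Positive" reference class.
    tp, fn, fp, tn = 40, 10, 5, 45

    precision = tp / (tp + fp)  # of everything predicted positive, how much is correct
    recall = tp / (tp + fn)     # of all actual positives, how many were found
    print(precision, recall)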

SLIDE 16

Metrics

Binary classification results often depend on a parameter (e.g., a decision threshold) whose value directly impacts precision and recall. For this reason, a Receiver Operating Characteristic (ROC) curve is often provided as a result.

[Figure: ROC curve, plotting the True Positive Rate against the False Positive Rate.]
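
scikit-learn's roc_curve sweeps the decision threshold and returns the points of such a curve; the labels and scores below are made up for illustration:

    from sklearn.metrics import roc_curve

    y_true = [0, 0, 1, 1]
    y_score = [0.1, 0.4, 0.35, 0.8]  # classifier scores, e.g. sigmoid outputs

    # One (FPR, TPR) point per candidate decision threshold.
    fpr, tpr, thresholds = roc_curve(y_true, y_score)
    print(fpr, tpr, thresholds)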

SLIDE 17

Image Classification pipeline

Dog

Slide credit: Jose M Àlvarez

SLIDE 18

Linear Models

Mapping function to predict a score for the class label:

f(x, w) = w^T x + b

Credit: CS231n: Convolutional Neural Networks for Visual Recognition
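
A minimal NumPy sketch of this score function in the common multi-class layout (one score per class); the dimensions, weights, and input are illustrative:

    import numpy as np

    D, C = 5, 3                    # 5 features, 3 classes
    rng = np.random.default_rng(0)
    W = rng.normal(size=(C, D))    # one weight vector per class
    b = np.zeros(C)                # one bias per class
    x = rng.normal(size=D)         # a single input example

    scores = W @ x + b             # one score per class label
    print(scores)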

SLIDE 19

Logistic Regression

Activation function: turn the score into a probability.

Sigmoid: f(x, w) = g(w^T x + b), where g(z) = 1 / (1 + e^(-z))
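
A sketch of the same computation for a binary classifier; the weights and input are made up:

    import numpy as np

    def sigmoid(z):
        # g(z) = 1 / (1 + e^(-z)): squashes a real score into (0, 1)
        return 1.0 / (1.0 + np.exp(-z))

    w = np.array([0.5, -1.2, 0.3])
    b = 0.1
    x = np.array([1.0, 0.4, 2.0])

    score = w @ x + b            # linear score w^T x + b
    p_positive = sigmoid(score)  # probability of the positive class, e.g. "dog"
    print(p_positive)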

SLIDE 20

Neuron

A single neuron computes the same kind of function: an activation g applied to a weighted sum of its inputs, g(w^T x + b).

SLIDE 21

SLIDE 22

SLIDE 23

Data hygiene

Split your dataset into train and test at the very start (sketched below)

  • It is usually good practice to shuffle the data (exception: time series)

Do not look at the test data (data snooping)!

  • Lock it away at the start to prevent contamination

NB: Never, ever train on the test data!

  • You have no way to estimate the error if you do
  • Your model could easily overfit the test data and generalize poorly, and without held-out test data you have no way of knowing
  • The model may fail in production
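
A minimal sketch of that discipline with illustrative data: shuffle once, split once, and do not touch the test split again until the final evaluation (for time series, keep the original order instead of shuffling):

    import numpy as np

    rng = np.random.default_rng(42)
    X = rng.normal(size=(100, 5))     # illustrative features
    y = rng.integers(0, 2, size=100)  # illustrative binary labels

    perm = rng.permutation(len(X))    # one fixed shuffle, done up front
    X, y = X[perm], y[perm]

    n_train = int(0.7 * len(X))       # 70% train / 30% test
    X_train, y_train = X[:n_train], y[:n_train]
    X_test, y_test = X[n_train:], y[n_train:]

    # Fit and tune on the training split only; look at X_test/y_test
    # exactly once, for the final out-of-sample error estimate.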