Learning From Data Lecture 16: Similarity and Nearest Neighbor


SLIDE 1

Learning From Data Lecture 16 Similarity and Nearest Neighbor

  • Similarity
  • Nearest Neighbor

M. Magdon-Ismail

CSCI 4100/6100

SLIDE 2

My 5-Year-Old Called It “A ManoHorse”

The simplest method of learning that we know: classify according to similar objects you have seen.


Next: Measuring similarity →

SLIDES 3–4

Measuring Similarity

object −→ features x

d(x, x′) = ∥x − x′∥ (Euclidean distance)
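To make the distance concrete, here is a minimal sketch in Python with NumPy (the function name and the use of NumPy are illustrative assumptions, not part of the slides):

```python
import numpy as np

def euclidean_distance(x, x_prime):
    """Euclidean distance d(x, x') = ||x - x'|| between two feature vectors."""
    x, x_prime = np.asarray(x, dtype=float), np.asarray(x_prime, dtype=float)
    return np.linalg.norm(x - x_prime)
```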


Next: Nearest neighbor →


SLIDES 5–7

Nearest Neighbor

Test ‘x’ is classified using its nearest neighbor.

[Figure: a test point x and nearby data points x[1], x[2], x[3], x[4].]

Order the data points by distance to x,

d(x, x[1]) ≤ d(x, x[2]) ≤ · · · ≤ d(x, x[N]),

and classify with the label of the nearest one:

g(x) = y[1](x)

No training needed! Ein = 0.

[Figure: the nearest neighbor rule induces a Voronoi tessellation of the input space.]
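As an illustration of the rule g(x) = y[1](x), a minimal brute-force sketch in Python with NumPy (function and variable names are assumptions for illustration; the slides only define the rule itself):

```python
import numpy as np

def nn_classify(x, X_train, y_train):
    """1-NN rule: g(x) = y[1](x), the label of the training point nearest to x.

    X_train: (N, d) array of feature vectors; y_train: (N,) array of labels.
    """
    dists = np.linalg.norm(X_train - x, axis=1)  # d(x, x_n) for every training point
    return y_train[np.argmin(dists)]             # label of the nearest neighbor
```

There is indeed no training step: the hypothesis is the stored data itself, and all the work happens at prediction time.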


Next: What about Eout? →


SLIDE 8

What about Eout?

Theorem: Eout ≤ 2E∗out (with high probability, as N → ∞).

VC analysis: Ein is an estimate for Eout.
Nearest neighbor analysis: Ein = 0, and Eout itself is small.

So we will never know what Eout is, but it cannot be much worse than the best anyone can do.

Half the classification power of the data is in the nearest neighbor.


Next: Proving Eout ≤ 2E∗out →

SLIDE 9

Proving Eout ≤ 2E∗out

π(x) = P[y = +1 | x] ← the target in logistic regression

Assume π(x) is continuous and x[1] → x as N → ∞. Then π(x[1]) → π(x).

The rule gN errs exactly when the test label and the nearest neighbor's label disagree:

P[gN(x) ≠ y] = P[y = +1, y[1] = −1] + P[y = −1, y[1] = +1]
             = π(x) · (1 − π(x[1])) + (1 − π(x)) · π(x[1])
             → π(x) · (1 − π(x)) + (1 − π(x)) · π(x)        (as N → ∞)
             = 2π(x) · (1 − π(x))
             ≤ 2 min{π(x), 1 − π(x)}.

The best you can do is E∗out(x) = min{π(x), 1 − π(x)}.


Next: Nearest neighbor ‘self-regularizes’ →

SLIDE 10

Nearest Neighbor ‘Self-Regularizes’

[Figure: nearest neighbor decision boundaries for data sets of size N = 2, 3, 4, 5, 6.]

A simple boundary is used with few data points. A more complicated boundary is possible only when you have more data points.

Regularization guides you to simpler hypotheses when data quality/quantity is lower; the NN rule does this automatically.


Next: k-nearest neighbor →

SLIDE 11

k-Nearest Neighbor

g(x) = sign( Σ_{i=1}^{k} y[i](x) )

(k is odd and yn = ±1).

[Figure: decision boundaries for the 1-NN rule, the 21-NN rule, and the 127-NN rule.]
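A minimal sketch of this rule in Python with NumPy (brute-force search, binary ±1 labels; the names are assumptions for illustration):

```python
import numpy as np

def knn_classify(x, X_train, y_train, k):
    """k-NN rule: g(x) = sign(y[1](x) + ... + y[k](x)), with odd k and labels +/-1."""
    dists = np.linalg.norm(X_train - x, axis=1)  # d(x, x_n) for every training point
    nearest = np.argsort(dists)[:k]              # indices of the k nearest neighbors
    return np.sign(y_train[nearest].sum())       # majority vote via the sign of the sum
```

Odd k guarantees the vote cannot tie.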


Next: The role of k →

SLIDE 12

The Role of k

k determines the tradeoff between fitting the data and overfitting the data.

Theorem. For N → ∞, if k(N) → ∞ and k(N)/N → 0, then

Ein(g) → Eout(g)   and   Eout(g) → E∗out.

For example, k = √N.


Next: 3 Ways To Choose k →

SLIDE 13

3 Ways To Choose k

1. k = 3.
2. k = √N.
3. Validation or cross validation: k-NN rule hypotheses gk are constructed on the training set, tested on the validation set, and the best k is picked (a sketch follows the figure below).

[Figure: Eout (%) versus number of data points N (1000 to 5000) for k = 1, k = 3, k = √N, and k chosen by cross validation.]
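A minimal sketch of option 3 using leave-one-out cross validation (the candidate grid and helper name are assumptions for illustration; odd k values avoid ties):

```python
import numpy as np

def choose_k_by_cv(X, y, k_values):
    """Pick the k with the lowest leave-one-out cross-validation error (labels +/-1)."""
    N = len(y)
    cv_errors = []
    for k in k_values:
        mistakes = 0
        for n in range(N):                            # hold out point n
            dists = np.linalg.norm(X - X[n], axis=1)
            dists[n] = np.inf                         # a point may not be its own neighbor
            nearest = np.argsort(dists)[:k]
            if np.sign(y[nearest].sum()) != y[n]:
                mistakes += 1
        cv_errors.append(mistakes / N)
    return k_values[int(np.argmin(cv_errors))]

# e.g. choose_k_by_cv(X, y, k_values=[1, 3, 5, 7, 9])
```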


Next: Nearest neighbor is nonparametric →

SLIDE 14

Nearest Neighbor is Nonparametric

NN-rule                     | Linear Model
no parameters               | (d + 1) parameters
expressive/flexible         | rigid, always linear
g(x) needs data             | g(x) needs only weights
generic, can model anything | specialized


Next: Multiclass →

SLIDE 15

Nearest Neighbor Easily Extends to Multiclass

[Figure: digits data in the (Average Intensity, Symmetry) feature plane for classes 1–9, with the multiclass NN decision regions.]

[Table: confusion matrix (%), true digit versus predicted digit, for the multiclass NN rule on the digits data.]

41% accuracy!
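Only the final vote changes for multiclass: take the most common label among the k nearest neighbors instead of the sign of a sum. A minimal sketch (assumed names; majority vote via collections.Counter):

```python
import numpy as np
from collections import Counter

def knn_multiclass(x, X_train, y_train, k):
    """Multiclass k-NN: the most common label among the k nearest training points."""
    dists = np.linalg.norm(X_train - x, axis=1)
    nearest = np.argsort(dists)[:k]
    return Counter(y_train[nearest]).most_common(1)[0][0]
```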


Next: Summary →

SLIDE 16

Highlights of k-Nearest Neighbor

1. Simple.
2. No training.
3. Near optimal Eout.
4. Easy to justify the classification to a customer.
5. Can easily do multi-class.
6. Can easily adapt to regression or logistic regression (a sketch follows this list):

      regression:           g(x) = (1/k) Σ_{i=1}^{k} y[i](x)

      logistic regression:  g(x) = (1/k) Σ_{i=1}^{k} [[y[i](x) = +1]]

   (items 1–6: a good method!)

7. Computationally demanding. ← we will address this next
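To illustrate item 6, a minimal sketch of both adaptations (names are assumptions; the formulas are the two averages above):

```python
import numpy as np

def knn_regression(x, X_train, y_train, k):
    """g(x) = (1/k) * sum of y[i](x): average target value of the k nearest points."""
    nearest = np.argsort(np.linalg.norm(X_train - x, axis=1))[:k]
    return y_train[nearest].mean()

def knn_probability(x, X_train, y_train, k):
    """g(x) = (1/k) * #{y[i](x) = +1}: fraction of +1 labels among the k nearest."""
    nearest = np.argsort(np.linalg.norm(X_train - x, axis=1))[:k]
    return float(np.mean(y_train[nearest] == 1))
```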
