SLIDE 1

Learning From Data Lecture 17 Memory and Efficiency in Nearest Neighbor

Memory Efficiency

M. Magdon-Ismail

CSCI 4100/6100

SLIDE 2

recap: Similarity and Nearest Neighbor

Similarity: d(x, x′) = ∥x − x′∥

[Figure: 1-NN rule and 21-NN rule decision boundaries]

  • 1. Simple.
  • 2. No training.
  • 3. Near optimal Eout: k → ∞, k/N → 0 ⇒ Eout → E∗out.
  • 4. Good ways to choose k: k = 3; k = √N; validation/cross-validation.
  • 5. Easy to justify classification to customer.
  • 6. Can easily do multi-class.
  • 7. Can easily adapt to regression or logistic regression:

regression:          g(x) = (1/k) Σᵢ₌₁ᵏ y[i](x)
logistic regression: g(x) = (1/k) Σᵢ₌₁ᵏ ⟦y[i](x) = +1⟧,
where y[i](x) is the label of the i-th nearest neighbor of x. (A code sketch follows this list.)
  • 8. Computationally demanding.
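
A minimal sketch of these rules in code (brute force with numpy; the function name knn_predict and its arguments are illustrative, not from the lecture):

    import numpy as np

    def knn_predict(X, y, x, k=3, mode="classify"):
        """Brute-force k-NN at query x; X is the N x d data, y the +-1 labels or real targets."""
        dists = np.linalg.norm(X - x, axis=1)      # distance from x to every data point
        nn = np.argsort(dists)[:k]                 # indices of the k nearest neighbors
        if mode == "classify":
            return np.sign(np.sum(y[nn]))          # majority vote (odd k avoids ties)
        if mode == "regress":
            return np.mean(y[nn])                  # g(x) = (1/k) * sum of neighbor targets
        if mode == "logistic":
            return np.mean(y[nn] == +1)            # g(x) = fraction of +1 neighbors

For example, knn_predict(X, y, x, k=21, mode="logistic") estimates the probability that y = +1 at x.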

SLIDE 3

Computational Demands of Nearest Neighbor

Memory.

Need to store all the data, O(Nd) memory.

N = 10⁶, d = 100, double precision ≈ 1 GB

Finding the nearest neighbor of a test point.

Need to compute distance to every data point, O(Nd).

N = 10⁶, d = 100, 3 GHz processor ≈ 3 ms (compute g(x))
N = 10⁶, d = 100, 3 GHz processor ≈ 1 hr (compute CV error)
N = 10⁶, d = 100, 3 GHz processor > 1 month (choose best k from among 1000 using CV)
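
A back-of-envelope check of these numbers (a sketch; the array sizes are the point, not the exact timings):

    import numpy as np

    N, d = 10**6, 100
    X = np.random.randn(N, d)                  # N*d doubles = 8e8 bytes, about 0.8 GB
    print(X.nbytes / 1e9, "GB")

    x = np.random.randn(d)
    dists = np.linalg.norm(X - x, axis=1)      # O(Nd) ~ 10^8 operations per query: a few ms
    nearest = np.argmin(dists)

    # Leave-one-out CV repeats this for each of the N training points: ~10^6 queries, roughly an hour.
    # Repeating that for 1000 candidate values of k pushes the total past a month.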

SLIDE 4

Two Basic Approaches

Reduce the amount of data.

The 5-year-old does not remember every horse he has seen, only a few representative horses.

Store the data in a specialized data structure.

Ongoing research field to develop geometric data structures to make finding nearest neighbors fast.

SLIDE 5

Throw Away Irrelevant Data

[Figure: data set → condensed data set, k = 1]

SLIDE 6

Decision Boundary Consistent

[Figure: data set → condensed data set; g(x) unchanged]

SLIDE 7

Training Set Consistent

[Figure: data set → condensed data set; g(xn) unchanged on the training points]
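
In code, training-set consistency of a condensed set S can be checked as below (a sketch for the 1-NN rule; the helper nn_classify is illustrative):

    import numpy as np

    def nn_classify(Xref, yref, x):
        """1-NN prediction at x using the reference points (Xref, yref)."""
        return yref[np.argmin(np.linalg.norm(Xref - x, axis=1))]

    def training_set_consistent(X, y, S):
        """S holds the indices of the condensed set; consistency means g_S(xn) = g_D(xn) for every xn."""
        return all(nn_classify(X[S], y[S], xn) == nn_classify(X, y, xn) for xn in X)

For the 1-NN rule this reduces to reproducing the training labels; for k > 1 (as on the later slides) it does not.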

SLIDE 8

Decision Boundary Vs. Training Set Consistent

[Figure: DB-consistent condensation (g(x) unchanged) versus TS-consistent condensation (g(xn) unchanged)]

SLIDE 9

Consistent Does Not Mean g(xn) = yn

[Figure: DB-consistent and TS-consistent condensations, k = 3]

SLIDE 10

Training Set Consistent (k = 3)

[Figure: data set → condensed data set; g(xn) unchanged]

SLIDE 11

CNN: Condensed Nearest Neighbor (k = 3)

[Figure: current condensed set S; the annotation marks the red point to add]

Consider the solid blue point:

  • i. blue w.r.t. selected points
  • ii. red w.r.t. D

Add a red point:

  • i. not already selected
  • ii. closest to the inconsistent point
  • 1. Randomly select k data points into S.
  • 2. Classify all data according to S.
  • 3. Let x∗ be an inconsistent point and y∗ its class w.r.t. D.
  • 4. Add the closest point to x∗ not in S that has class y∗.
  • 5. Iterate until S classifies all points consistently with D.
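
A sketch of these steps in code (numpy, binary ±1 labels; the helpers and tie-breaking are illustrative, and the added point is taken to be one whose class w.r.t. D is y∗):

    import numpy as np

    def knn_label(Xref, yref, x, k):
        """k-NN class of x using the reference set (Xref, yref); labels are +-1."""
        nn = np.argsort(np.linalg.norm(Xref - x, axis=1))[:k]
        return np.sign(np.sum(yref[nn]))

    def condense(X, y, k=3, seed=0):
        N = len(X)
        target = np.array([knn_label(X, y, X[n], k) for n in range(N)])            # class of each xn w.r.t. D
        S = list(np.random.default_rng(seed).choice(N, size=k, replace=False))     # 1. random initial S
        while True:
            bad = [n for n in range(N)                                             # 2./3. points inconsistent with D
                   if knn_label(X[S], y[S], X[n], k) != target[n]]
            if not bad:                                                            # 5. S classifies all points consistently
                return np.array(S)
            n_star, y_star = bad[0], target[bad[0]]                                # an inconsistent point x* and its class y*
            cand = [m for m in range(N) if m not in S and target[m] == y_star]
            S.append(min(cand, key=lambda m: np.linalg.norm(X[m] - X[n_star])))    # 4. add the closest such point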

SLIDE 12

CNN: Condensed Nearest Neighbor

[Figure: the chosen red point has been added to S]

SLIDE 13

CNN: Condensed Nearest Neighbor


Minimum consistent set (MCS)? Finding it is NP-hard.

SLIDE 14

Nearest Neighbor on Digits Data

[Figure: 1-NN and 21-NN decision regions on the digits data]

SLIDE 15

Condensing the Digits Data

[Figure: 1-NN and 21-NN decision regions after condensing the digits data]

SLIDE 16

Finding the Nearest Neighbor

  • 1. S1, S2 are ‘clusters’ with centers µ1, µ2 and radii r1, r2.
  • 2. [Branch] Search S1 first → x̂[1].
  • 3. The distance from x to any point in S2 is at least ∥x − µ2∥ − r2.
  • 4. [Bound] So we are done if ∥x − x̂[1]∥ ≤ ∥x − µ2∥ − r2.

A branch and bound algorithm. Can be applied recursively.

[Figure: query point x with clusters S1 and S2]
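
A sketch of the two-cluster branch-and-bound step (cluster membership, centers, and radii are assumed given; not code from the lecture):

    import numpy as np

    def nn_in(points, x):
        """Brute-force nearest neighbor of x among the given points."""
        return points[np.argmin(np.linalg.norm(points - x, axis=1))]

    def branch_and_bound_nn(x, S1, S2, mu2, r2):
        """S1 is the cluster whose center is closer to x; S2 is the other cluster (center mu2, radius r2)."""
        x_hat = nn_in(S1, x)                                           # [Branch] search S1 first
        if np.linalg.norm(x - x_hat) <= np.linalg.norm(x - mu2) - r2:
            return x_hat                                               # [Bound] nothing in S2 can be closer
        x_hat2 = nn_in(S2, x)                                          # bound failed: search S2 as well
        if np.linalg.norm(x - x_hat2) < np.linalg.norm(x - x_hat):
            return x_hat2
        return x_hat

Applied recursively, each cluster is itself split into sub-clusters and the same test prunes sub-clusters that cannot contain the nearest neighbor.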

SLIDE 17

When Does the Bound Hold?

Bound condition: ∥x − x̂[1]∥ ≤ ∥x − µ2∥ − r2.

∥x − x̂[1]∥ ≤ ∥x − µ1∥ + r1.
So it suffices that r1 + r2 ≤ ∥x − µ2∥ − ∥x − µ1∥.
∥x − µ1∥ ≈ 0 means ∥x − µ2∥ ≈ ∥µ2 − µ1∥.

It suffices that r1 + r2 ≤ ∥µ2 − µ1∥.

[Figure: query point x in cluster S1, with cluster S2]

Within-cluster spread should be less than between-cluster spread.

SLIDE 18

Finding Clusters – Lloyd’s Algorithm

  • 1. Pick well separated centers for each cluster.
  • 2. Compute Voronoi regions as the clusters.
  • 3. Update the Centers.
  • 4. Update the Voronoi regions.
  • 5. Compute centers and radii:

µj = (1/|Sj|) Σ_{xn ∈ Sj} xn;   rj = max_{xn ∈ Sj} ∥xn − µj∥.
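
A sketch of these steps (a plain Lloyd/k-means iteration; the greedy far-apart initialization matches the figures, and a fixed number of iterations stands in for a convergence check):

    import numpy as np

    def lloyd(X, M, iters=10, seed=0):
        """Partition X into M clusters; return centers mu and radii r."""
        rng = np.random.default_rng(seed)
        centers = [X[rng.integers(len(X))]]
        for _ in range(M - 1):                     # 1. pick well separated centers: each new center
            d = np.min([np.linalg.norm(X - c, axis=1) for c in centers], axis=0)
            centers.append(X[np.argmax(d)])        #    is the point furthest from those picked so far
        mu = np.array(centers)
        for _ in range(iters):
            # 2./4. Voronoi regions: assign each point to its nearest center
            assign = np.argmin([np.linalg.norm(X - c, axis=1) for c in mu], axis=0)
            # 3. update each center to the mean of its Voronoi region (assumes no empty region)
            mu = np.array([X[assign == j].mean(axis=0) for j in range(M)])
        # 5. centers and radii: r_j is the distance from mu_j to its furthest member
        assign = np.argmin([np.linalg.norm(X - c, axis=1) for c in mu], axis=0)
        r = np.array([np.linalg.norm(X[assign == j] - mu[j], axis=1).max() for j in range(M)])
        return mu, r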

SLIDE 19–24

Finding Clusters – Lloyd’s Algorithm (continued)

[Figures: successive builds on example data: the furthest-away point becomes the next center, then the next furthest, until all centers are picked; the Voronoi regions are constructed, then the centers and the Voronoi regions are updated.]
SLIDE 25

Radial Basis Functions (RBF)

k-Nearest Neighbor: only considers the k nearest neighbors; each neighbor has equal weight.

What about using all the data to compute g(x)?

RBF: use all the data; data further away from x have less weight.
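
A sketch of the contrast (Gaussian weights are one common choice; the kernel and the scale r are illustrative assumptions, not fixed by this slide):

    import numpy as np

    def rbf_predict(X, y, x, r=1.0):
        """Use all the data: each point votes with a weight that decays with its distance from x."""
        w = np.exp(-0.5 * (np.linalg.norm(X - x, axis=1) / r) ** 2)   # far points get little weight
        return np.sign(np.sum(w * y))                                  # weighted vote, labels +-1

Setting the weights to 1 on the k nearest points and 0 elsewhere recovers the k-NN rule.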
