Learning From Data, Lecture 17: Memory and Efficiency in Nearest Neighbor



  1. Learning From Data, Lecture 17: Memory and Efficiency in Nearest Neighbor. M. Magdon-Ismail, CSCI 4100/6100.

  2. Recap: Similarity and Nearest Neighbor
     1. Simple: d(x, x′) = ‖x − x′‖.
     2. No training.
     3. Near optimal E_out: k → ∞, k/N → 0  ⟹  E_out → E*_out.
     4. Good ways to choose k: k = 3; k ≈ √N; validation/cross-validation.
     5. Easy to justify classification to customer.
     6. Can easily do multi-class.
     7. Can easily adapt to regression or logistic regression:
        regression: g(x) = (1/k) Σ_{i=1..k} y_[i](x);   logistic: g(x) = (1/k) Σ_{i=1..k} [[ y_[i](x) = +1 ]].
     8. Computationally demanding.
     (Figures: decision regions for the 1-NN rule and the 21-NN rule.)
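A minimal NumPy sketch of the k-NN rule recapped above and its regression/logistic variants, assuming Euclidean distance, labels in {−1, +1}, and odd k; the function name and signature are illustrative, not from the lecture.

```python
import numpy as np

def knn(D_X, D_y, x, k=3):
    # distances d(x, x') = ||x - x'|| to every training point
    d = np.linalg.norm(D_X - x, axis=1)
    idx = np.argsort(d)[:k]            # indices of the k nearest neighbors
    y_nn = D_y[idx]                    # y_[1](x), ..., y_[k](x)
    return {
        "classification": np.sign(y_nn.sum()),   # majority vote (labels +/-1, k odd)
        "regression":     y_nn.mean(),           # g(x) = (1/k) sum_i y_[i](x)
        "logistic":       np.mean(y_nn == +1),   # g(x) = (1/k) sum_i [[ y_[i](x) = +1 ]]
    }
```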

  3. Computational Demands of Nearest Neighbor
     Memory: need to store all the data, O(Nd) memory.
       N = 10^6, d = 100, double precision ≈ 1 GB.
     Finding the nearest neighbor of a test point: need to compute the distance to every data point, O(Nd).
       N = 10^6, d = 100, 3 GHz processor ≈ 3 ms (compute g(x));
       ≈ 1 hr (compute the CV error);
       > 1 month (choose the best k from among 1000 using CV).
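A back-of-envelope reproduction of these estimates, under the slide's assumptions (N = 10^6 points, d = 100, double precision, roughly 3 ms per query on a 3 GHz processor); the numbers are order-of-magnitude only.

```python
# Rough estimates matching the slide; a sketch, not a benchmark.
N, d = 10**6, 100

memory_gb = N * d * 8 / 1e9             # 8 bytes per double -> ~0.8 GB ("about 1 GB")
query_sec = 3e-3                        # O(Nd) = 10^8 distance operations -> ~3 ms per g(x)
cv_sec    = N * query_sec               # CV error: evaluate g at all N points
selection_days = 1000 * cv_sec / 86400  # repeat CV for 1000 candidate values of k

print(f"memory      ~ {memory_gb:.1f} GB")
print(f"CV error    ~ {cv_sec / 3600:.1f} hours")
print(f"choosing k  ~ {selection_days:.0f} days (> 1 month)")
```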

  4. Two Basic Approaches
     Reduce the amount of data: the 5-year-old does not remember every horse he has seen, only a few representative horses.
     Store the data in a specialized data structure: an ongoing research field develops geometric data structures that make finding nearest neighbors fast.

  5. Throw Away Irrelevant Data (figures: the data set before and after discarding points; k = 1).

  6. Decision Boundary Consistent (figures: a condensed data set for which g(x) is unchanged everywhere).

  7. Training Set Consistent (figures: a condensed data set for which g(x_n) is unchanged on every training point).

  8. Decision Boundary vs. Training Set Consistent: g(x) unchanged everywhere (DB) versus g(x_n) unchanged only on the training points (TS).

  9. Consistent Does Not Mean g(x_n) = y_n (figures: k = 3; the DB- and TS-consistent condensed sets reproduce g(x_n), which need not equal y_n).

  10. Training Set Consistent (k = 3) (figures: condensed data set; g(x_n) unchanged).

  11. CNN: Condensed Nearest Neighbor (k = 3)
      1. Randomly select k data points into S.
      2. Classify all data according to S.
      3. Let x* be an inconsistent point and y* its class w.r.t. D.
      4. Add to S the closest point to x* not in S that has class y*.
      5. Iterate until S classifies all points consistently with D.
      (Figure: the marked blue point is blue w.r.t. the selected points but red w.r.t. D, so a red point is added: one not already selected and closest to the inconsistent point. A code sketch follows slide 13.)

  12. CNN: Condensed Nearest Neighbor (continued; figure: the chosen red point is added to S and the iteration repeats).

  13. CNN: Condensed Nearest Neighbor (continued). The minimum consistent set (MCS)? Finding it is NP-hard.
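A minimal NumPy sketch of the CNN heuristic in steps 1-5 above, assuming labels in {−1, +1}, Euclidean distance, and that a point's "class w.r.t. D" means its classification by the k-NN rule on the full data; function names are illustrative, and degenerate cases (e.g., no remaining candidate of class y*) are not handled.

```python
import numpy as np

def knn_classify(X_ref, y_ref, X_query, k=3):
    # k-NN rule: majority vote (labels in {-1,+1}) among the k nearest reference points
    d2 = ((X_query[:, None, :] - X_ref[None, :, :]) ** 2).sum(axis=2)
    nn = np.argsort(d2, axis=1)[:, :k]
    return np.sign(y_ref[nn].sum(axis=1))

def condensed_nn(X, y, k=3, seed=0):
    rng = np.random.default_rng(seed)
    N = len(X)
    g_D = knn_classify(X, y, X, k)                   # how the full data set D classifies each point
    S = list(rng.choice(N, size=k, replace=False))   # 1. randomly select k data points into S
    while True:
        g_S = knn_classify(X[S], y[S], X, k)         # 2. classify all data according to S
        bad = np.flatnonzero(g_S != g_D)             #    points where S disagrees with D
        if len(bad) == 0:                            # 5. stop once S is consistent with D
            return np.array(S)
        x_star = bad[0]                              # 3. an inconsistent point ...
        y_star = g_D[x_star]                         #    ... and its class w.r.t. D
        # 4. add the closest point to x* not in S that has class y*
        cand = [n for n in range(N) if n not in S and g_D[n] == y_star]
        d2 = ((X[cand] - X[x_star]) ** 2).sum(axis=1)
        S.append(cand[int(np.argmin(d2))])
```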

  14. Nearest Neighbor on Digits Data (figures: decision regions for the 1-NN rule and the 21-NN rule).

  15. Condensing the Digits Data (figures: decision regions for the 1-NN and 21-NN rules after condensing).

  16. Finding the Nearest Neighbor
      1. S_1, S_2 are 'clusters' with centers µ_1, µ_2 and radii r_1, r_2.
      2. [Branch] Search S_1 first → x̂_[1].
      3. The distance from x to any point in S_2 is at least ‖x − µ_2‖ − r_2.
      4. [Bound] So we are done if ‖x − x̂_[1]‖ ≤ ‖x − µ_2‖ − r_2.
      A branch and bound algorithm; can be applied recursively.
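A sketch of the branch-and-bound step for two clusters, assuming each cluster is stored as a dict with its points, center "mu", and radius "r"; the recursive version over a cluster tree is analogous. The names are illustrative, not from the lecture.

```python
import numpy as np

def nearest_in(points, x):
    # brute-force nearest neighbor within one cluster
    d = np.linalg.norm(points - x, axis=1)
    i = int(np.argmin(d))
    return points[i], d[i]

def branch_and_bound_nn(x, S1, S2):
    # [Branch] search the cluster whose center is closer to x first
    if np.linalg.norm(x - S2["mu"]) < np.linalg.norm(x - S1["mu"]):
        S1, S2 = S2, S1
    x_hat, d_hat = nearest_in(S1["points"], x)
    # [Bound] every point of S2 is at least ||x - mu_2|| - r_2 away,
    # so S2 can be skipped if the candidate from S1 already beats that bound
    if d_hat <= np.linalg.norm(x - S2["mu"]) - S2["r"]:
        return x_hat
    x_hat2, d_hat2 = nearest_in(S2["points"], x)
    return x_hat if d_hat <= d_hat2 else x_hat2
```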

  17. When Does the Bound Hold?
      Bound condition: ‖x − x̂_[1]‖ ≤ ‖x − µ_2‖ − r_2.
      Since ‖x − x̂_[1]‖ ≤ ‖x − µ_1‖ + r_1, it suffices that r_1 + r_2 ≤ ‖x − µ_2‖ − ‖x − µ_1‖.
      When ‖x − µ_1‖ ≈ 0, ‖x − µ_2‖ ≈ ‖µ_2 − µ_1‖, so it suffices that r_1 + r_2 ≤ ‖µ_2 − µ_1‖:
      the within-cluster spread should be less than the between-cluster spread.

  18. Finding Clusters – Lloyd's Algorithm
      1. Pick well separated centers for each cluster.
      2. Compute Voronoi regions as the clusters.
      3. Update the centers.
      4. Update the Voronoi regions.
      5. Compute centers and radii:  µ_j = (1/|S_j|) Σ_{x_n ∈ S_j} x_n;   r_j = max_{x_n ∈ S_j} ‖x_n − µ_j‖.

  19–24. Finding Clusters – Lloyd's Algorithm (continued). The same steps are illustrated figure by figure: picking the furthest-away point as the next center, then the next furthest, until all centers are picked; constructing the Voronoi regions; updating the centers; and updating the Voronoi regions. (The lecture then previews RBFs.) A code sketch of the iteration follows.
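A minimal NumPy sketch of the Lloyd iteration above, with the greedy furthest-point initialization the figures illustrate; the fixed number of passes, random seed, and the assumption that no cluster becomes empty are mine, not the lecture's.

```python
import numpy as np

def lloyd_clusters(X, n_clusters=3, n_iters=10, seed=0):
    rng = np.random.default_rng(seed)
    # 1. pick well separated centers: start from a random point, then repeatedly take
    #    the data point furthest from the centers chosen so far
    centers = [X[rng.integers(len(X))]]
    while len(centers) < n_clusters:
        d_min = np.min([np.linalg.norm(X - c, axis=1) for c in centers], axis=0)
        centers.append(X[int(np.argmax(d_min))])
    centers = np.array(centers)
    for _ in range(n_iters):
        # 2./4. Voronoi regions: assign each point to its nearest center
        d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        assign = np.argmin(d, axis=1)
        # 3. update each center to the mean of its region (clusters assumed nonempty)
        centers = np.array([X[assign == j].mean(axis=0) for j in range(n_clusters)])
    # 5. radii: r_j = max over the cluster of ||x_n - mu_j||
    radii = np.array([np.linalg.norm(X[assign == j] - centers[j], axis=1).max()
                      for j in range(n_clusters)])
    return centers, radii, assign
```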
