
Introduction to Machine Learning (COMPSCI 371D, Machine Learning)



  1. Introduction to Machine Learning (COMPSCI 371D, Machine Learning)

  2. Outline
     1. Nearest Neighbor Prediction
     2. Complexity Considerations
     3. The Voronoi Diagram
     4. Overfitting and k Nearest Neighbors

  3. Nearest Neighbor Prediction
     • NN is very simple: this is why we start here
     • NN is very unusual:
       • No training!
       • Slow inference (using the predictor)
       • Y can be anything
       • Almost no difference between regression and classification
       • Hypothesis space hard to define

  4. Nearest Neighbor Prediction: How it Works
     • Given $T = \{(x_1, y_1), \ldots, (x_N, y_N)\}$
     • Just store $T$ (memorization)
     • Need a distance in the data space $X$
     • Perhaps $\Delta(x, x') = \|x - x'\|^2$
     • Then, $h(x) = y_{\nu(x)}$ where $\nu(x) \in \arg\min_{n = 1, \ldots, N} \Delta(x, x_n)$
     • Return the value $y_n$ for the training point $x_n$ that is nearest to $x$
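The rule on this slide (store T, then return the y of the closest stored x) can be sketched in a few lines of Python. This is a minimal illustration, not code from the course: the names `squared_distance` and `nn_predict` and the toy training set are assumptions, and the squared Euclidean distance is used only because the slide suggests it as one possible choice.

```python
def squared_distance(x, x_prime):
    """Squared Euclidean distance between two equal-length sequences."""
    return sum((a - b) ** 2 for a, b in zip(x, x_prime))


def nn_predict(T, x, distance=squared_distance):
    """Return y_n for the training point x_n that is nearest to x.

    T is a list of (x_n, y_n) pairs. "Training" is just keeping T around.
    Ties are broken in favor of the earliest pair in T.
    """
    _, nearest_y = min(T, key=lambda pair: distance(x, pair[0]))
    return nearest_y


# Hypothetical training set: 2-D points with string labels.
T = [((0.0, 0.0), "red"), ((1.0, 1.0), "blue"), ((4.0, 0.5), "green")]

print(nn_predict(T, (0.9, 0.8)))  # closest stored point is (1.0, 1.0)
```

Nothing in `nn_predict` looks at the type of the y values, which matches the slide's remark that Y can be anything: the same function serves classification (labels) and regression (numbers).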

  5. Nearest Neighbor Prediction

  6. Complexity Considerations: How to find $\nu(x)$?
     $\nu(x) = \arg\min_{n = 1, \ldots, N} \Delta(x, x_n)$
     • Compute all $\Delta(x, x_n)$ and find the smallest
     • $O(Nd)$ (where $x \in \mathbb{R}^d$)
     • Cannot do better exactly
     • Can do better if we accept $\Delta(x, x_{\nu(x)}) < (1 + \epsilon)\, \Delta(x, x_{\nu^*(x)})$ for some $\epsilon > 0$
     • "Approximate NN" uses k-d trees, R-trees, locality-sensitive hashing
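To make the two ideas on this slide concrete, the sketch below pairs the exhaustive O(Nd) scan with a check of the (1 + ε) acceptance condition, reading ν* as the index of the exact nearest neighbor. The function names and the sample points are hypothetical. The code does not implement k-d trees, R-trees, or locality-sensitive hashing; it only tests whether a candidate index returned by some approximate method would be acceptable.

```python
def squared_distance(x, x_prime):
    """Squared Euclidean distance; costs O(d) for d-dimensional points."""
    return sum((a - b) ** 2 for a, b in zip(x, x_prime))


def exact_nn_index(points, x):
    """Exhaustive scan: N distance evaluations of O(d) each, so O(N d)."""
    return min(range(len(points)), key=lambda n: squared_distance(x, points[n]))


def is_acceptable(points, x, candidate, eps):
    """True if the candidate index satisfies the (1 + eps) criterion.

    Uses the strict inequality from the slide, so a query that coincides
    with a training point (best distance 0) accepts no candidate.
    """
    best = squared_distance(x, points[exact_nn_index(points, x)])
    return squared_distance(x, points[candidate]) < (1 + eps) * best


# Hypothetical 2-D training points and query.
points = [(0.0, 0.0), (1.0, 0.0), (3.0, 0.0)]
x = (0.4, 0.0)

print(exact_nn_index(points, x))          # index of the exact nearest neighbor
print(is_acceptable(points, x, 1, 2.0))   # second-closest point, generous eps
print(is_acceptable(points, x, 1, 0.5))   # same candidate, tighter eps
```

A larger ε admits more candidates, which is what lets approximate methods stop searching early.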

  7. The Voronoi Diagram
     • Only conceptual, or for $d = 2, 3$, maybe 4
     • $\Theta(N \log N + N^{\lceil d/2 \rceil})$
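A quick calculation suggests why the slide treats the Voronoi diagram as mostly conceptual beyond d = 2, 3, or 4. The sketch below evaluates the expression inside the Θ for a hypothetical N = 1000, ignoring constant factors and picking base-2 logarithms arbitrarily, so the numbers indicate growth rather than actual running times. The name `voronoi_bound` is an assumption for illustration.

```python
import math


def voronoi_bound(N, d):
    """N log N + N^ceil(d/2): the expression inside the Theta, constants ignored."""
    return N * math.log2(N) + N ** math.ceil(d / 2)


# Hypothetical training-set size; print the growth term for several dimensions.
N = 1000
for d in (2, 3, 4, 10):
    print(f"d = {d:2d}: about {voronoi_bound(N, d):.3g}")
```

For d = 2 the N log N term dominates, while from d = 3 onward the N^⌈d/2⌉ term takes over and the exponent rises by one for every two added dimensions.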

  8. The Voronoi Diagram: Decision Boundary

  9. Overfitting and k Nearest Neighbors: Overfitting

  10. Overfitting and k Nearest Neighbors: k Nearest Neighbors
      • Retrieve the $k$ nearest neighbors $x_1, \ldots, x_k$ of $x$
      • Return a summary of the corresponding $y_1, \ldots, y_k$
      • Classification summary: majority
      • Regression summary: mean, median
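The k-nearest-neighbor rule on this slide could be sketched as follows: retrieve the k closest training points, then summarize their y values by majority vote for classification or by mean or median for regression. All function names and the toy data sets are hypothetical, the squared Euclidean distance is an assumed choice, and sorting the whole training set is the simplest way to retrieve neighbors rather than an efficient one.

```python
from collections import Counter
from statistics import mean, median


def squared_distance(x, x_prime):
    """Squared Euclidean distance between two equal-length sequences."""
    return sum((a - b) ** 2 for a, b in zip(x, x_prime))


def k_nearest_values(T, x, k):
    """Return the y values of the k training points nearest to x, closest first."""
    ranked = sorted(T, key=lambda pair: squared_distance(x, pair[0]))
    return [y for _, y in ranked[:k]]


def knn_classify(T, x, k):
    """Majority vote over the k nearest labels.

    On a tied vote, the label seen first among the neighbors wins.
    """
    return Counter(k_nearest_values(T, x, k)).most_common(1)[0][0]


def knn_regress(T, x, k, summary=mean):
    """Summarize the k nearest y values with mean (default) or median."""
    return summary(k_nearest_values(T, x, k))


# Hypothetical 1-D data sets.
T_cls = [((0.0,), "a"), ((1.0,), "b"), ((1.2,), "b"), ((5.0,), "a")]
T_reg = [((0.0,), 10.0), ((1.0,), 20.0), ((2.0,), 60.0)]

print(knn_classify(T_cls, (0.4,), k=1))               # single nearest label
print(knn_classify(T_cls, (0.4,), k=3))               # vote among three labels
print(knn_regress(T_reg, (0.9,), k=3))                # mean of three values
print(knn_regress(T_reg, (0.9,), k=3, summary=median))
```

With k = 1 both functions reduce to plain nearest-neighbor prediction. Larger k averages over more neighbors, which is how the following slides reduce overfitting.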

  11. Overfitting and k Nearest Neighbors: Less Overfitting ($k = 9$)

  12. Overfitting and k Nearest Neighbors: A Simple Regression Example, $\mathbb{R} \to \mathbb{R}$ (plots for $k = 1$, $k = 10$, $k = 100$)
