


CMPSCI 689: Machine Learning

Nearest neighbor classifier

Subhransu Maji

29 January 2015 / 3 February 2015

Subhransu Maji (UMASS) CMPSCI 689 /37

Topics of interest (hw00 poll)

NLP: 13, Deep learning / neural networks: 8, Computer vision: 8, Information retrieval: 8, Databases / systems / networking: 4, AI: 3, Reinforcement learning: 3, Robotics: 3

These got 1 or 2 mentions:

  • complexity, logic, large scale learning, speech, cross modality, biology, neuroscience, graphics, recommender systems, semi-supervised learning, programming languages, virtual reality, privacy, security

“To pass the class with a B+”

Nearest neighbor classifier

Will Alice like AI?

  • Alice and James are similar and James likes AI. Hence, Alice must also like AI.

It is useful to think of data as feature vectors.

  • Use Euclidean distance to measure similarity

Data to feature vectors

  • Binary: e.g. AI? {no, yes} ➡ {0,1}, or {-20, 2} (crossed out on the slide)
  • Nominal: e.g. color = {red, blue, green, yellow} ➡ {0,1}ⁿ, or {0,1,2,3} (crossed out on the slide)
  • Real valued: e.g. temperature ➡ copied as-is, or binned into {low, medium, high}

Euclidean distance

Training data is in the form of (x1, y1), (x2, y2), . . . , (xn, yn)

Fruit data:

  • label: {apples, oranges, lemons}
  • attributes: {width, height}

d(x1, x2) = √( Σᵢ (x1,i − x2,i)² )

(The figure plots the fruit data by width and height.)
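As a small illustration, the distance above is straightforward to compute over feature vectors; the fruit measurements below are invented for the example:

```python
import math

def euclidean(x1, x2):
    """d(x1, x2) = sqrt(sum_i (x1_i - x2_i)^2)"""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x1, x2)))

# Hypothetical (width, height) feature vectors, in cm.
apple = (7.5, 7.6)
lemon = (6.2, 9.1)
print(euclidean(apple, lemon))  # about 1.98
```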

Nearest neighbor classifier

Given test data (a, b), the nearest training point is a lemon, so the classifier predicts lemon. Given test data (c, d), the nearest training point is an apple, so the classifier predicts apple.

k-Nearest neighbor classifier

Take majority vote among the k nearest neighbors. What is the effect of k?

  • A larger k makes the prediction robust to an outlier (a single mislabeled or atypical training point, marked in the figure).
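The majority-vote rule can be sketched as follows (the toy data is invented for the example; note how the outlier decides the label at k = 1 but is outvoted at k = 3):

```python
import math
from collections import Counter

def knn_predict(train, x, k):
    """Majority vote among the k training points nearest to x."""
    dist = lambda a, b: math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))
    neighbors = sorted(train, key=lambda item: dist(item[0], x))[:k]
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]

# Toy data: a cluster of apples, a cluster of lemons, and one mislabeled
# "lemon" sitting inside the apple cluster (the outlier).
train = [((1.0, 1.0), "apple"), ((1.0, 2.0), "apple"), ((2.0, 1.0), "apple"),
         ((1.4, 1.4), "lemon"),   # outlier
         ((8.0, 8.0), "lemon"), ((8.0, 9.0), "lemon")]

x = (1.5, 1.5)
print(knn_predict(train, x, k=1))  # lemon: the outlier is the single nearest point
print(knn_predict(train, x, k=3))  # apple: the majority vote overrides the outlier
```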

Decision boundaries: 1NN

What is the effect of k? (The figure shows the decision boundaries induced by the 1NN rule on the fruit data.)

Decision boundaries: DT

A decision tree is grown on the fruit data one split at a time: first w > 7.3 (with h > 7.8 separating orange from apple on the yes branch), then h > 8 (yes → lemon), then w > 6.6 and h > 6 to separate the remaining oranges and lemons.

The decision boundaries are axis aligned for DT.

Inductive bias of the kNN classifier

Choice of features

  • We are assuming that all features are equally important
  • What happens if we scale one of the features by a factor of 100?

Choice of distance function

  • Euclidean, cosine similarity (angle), Gaussian, etc.
  • Should the coordinates be independent?

Choice of k
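A quick sketch of the scaling question (names and numbers are made up): multiplying one feature by 100 makes it dominate the Euclidean distance, which can change who the nearest neighbor is.

```python
import math

people = {"James": (6.0, 1.0), "Claire": (5.9, 2.5)}  # hypothetical feature vectors
alice = (6.0, 2.0)

def nearest(query, candidates):
    # return the name whose feature vector is closest to the query
    return min(candidates, key=lambda name: math.dist(candidates[name], query))

print(nearest(alice, people))  # Claire (distance 0.51 vs 1.00)

# Scale the first feature by 100: the nearest neighbor flips.
scaled = {name: (100 * a, b) for name, (a, b) in people.items()}
print(nearest((100 * alice[0], alice[1]), scaled))  # James (1.00 vs ~10.01)
```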

An example: “Texture synthesis” [Efros & Leung, ICCV 99]

An example: Synthesizing one pixel

  • What is the value of the pixel p?
  • Find all the windows in the image that match the neighborhood
  • To synthesize x
    ➡ pick one matching window at random
    ➡ assign x to be the center pixel of that window
  • An exact match might not be present, so find the best matches using Euclidean distance and randomly choose between them, preferring better matches with higher probability

(Figure: input image and synthesized image.)

Slide from Alyosha Efros, ICCV 1999
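The procedure can be sketched in a few lines of NumPy. This is a simplified, unoptimized rendering of the idea, not the authors' code; the `tol` parameter and the demo arrays are invented for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)

def synthesize_pixel(sample, neighborhood, known, tol=0.1):
    """Choose a value for the center of `neighborhood` by matching windows
    of the texture `sample`; `known` marks the already-filled pixels."""
    w = neighborhood.shape[0]
    rows, cols = sample.shape
    dists, centers = [], []
    for i in range(rows - w + 1):
        for j in range(cols - w + 1):
            window = sample[i:i + w, j:j + w]
            # sum of squared differences over the known pixels only
            dists.append(np.sum(((window - neighborhood) ** 2)[known]))
            centers.append(window[w // 2, w // 2])
    dists = np.asarray(dists, dtype=float)
    # keep every window within (1 + tol) of the best match, pick one at random
    good = np.flatnonzero(dists <= dists.min() * (1 + tol) + 1e-12)
    return centers[rng.choice(good)]

# Demo: a constant texture must synthesize the constant value.
sample = np.full((5, 5), 7.0)
patch = np.full((3, 3), 7.0)
known = np.ones((3, 3), dtype=bool)
known[1, 1] = False            # the center pixel is the one being synthesized
print(synthesize_pixel(sample, patch, known))  # 7.0
```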

An example: Synthesis results

(Figures: french canvas, raffia weave, white bread, brick wall, and further examples.)

Slide from Alyosha Efros, ICCV 1999


An example: Growing Texture

Starting from the initial image, “grow” one pixel at a time.

  • Application: remove an object from the image

Slide from Alyosha Efros, ICCV 1999

Practical issues when using kNN

Curse of dimensionality; Speed

How many neighborhoods are there? With 10 bins per dimension: for d = 2, #bins = 10×10; in general, #bins = 10ᵈ. For d = 1000, #bins = 10¹⁰⁰⁰, while the number of atoms in the universe is only ~ 10⁸⁰.

Practical issues when using kNN

Curse of dimensionality; Speed

  • Time taken by kNN for N points of D dimensions
    ➡ time to compute distances: O(ND)
    ➡ time to find the k nearest neighbors:
      • O(kN): repeated minima
      • O(N log N): sorting
      • O(N + k log N): min heap
      • O(N + k log k): fast median
    ➡ Total time is dominated by the distance computation
  • We can be faster if we are willing to sacrifice exactness
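The selection step can be seen with NumPy (a sketch; `argpartition` does a partial selection rather than a full sort, in the spirit of the O(N + k log k) option above):

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal((10000, 2))   # N training points, D = 2
q = np.zeros(2)                       # query point
k = 5

d2 = np.sum((X - q) ** 2, axis=1)     # O(ND): dominates the total time

by_sort = np.sort(np.argsort(d2)[:k])              # O(N log N): full sort
by_select = np.sort(np.argpartition(d2, k)[:k])    # selection, then order only the winners

print(np.array_equal(by_sort, by_select))  # True
```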

Approximate kNN

Simplest idea is to cluster the data

  • Class → 3 clusters
  • Cluster → mean of its points
  • Label of a test point is the label of the nearest cluster mean

Run time and memory: O(ND) → O(CD), where C << N is the number of clusters.

How do we cluster the data?

Clustering using k-means

Given (x1, x2, …, xn), k-means clustering aims to partition the n observations into k (≤ n) sets S = {S1, S2, …, Sk} so as to minimize the within-cluster sum of squares. In other words, its objective is to find:

arg min over S of Σⱼ₌₁ᵏ Σ_{x ∈ Sⱼ} ‖x − μⱼ‖², where μⱼ is the mean of the points in Sⱼ (the cluster center).

Easy to compute μ given S and vice versa.

http://en.wikipedia.org/wiki/K-means_clustering

Lloyd’s algorithm for k-means

Initialize k centers by picking k points randomly. Repeat till convergence (or max iterations):

  • Assign each point to the nearest center (assignment step)
  • Estimate the mean of each group (update step)

Simple and works well in practice

  • Multiple initializations
  • Provably fast
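A minimal sketch of Lloyd's algorithm (the two-blob data and the random-points initialization are illustrative choices, not from the slides):

```python
import numpy as np

def kmeans(X, k, iters=100, seed=0):
    """Lloyd's algorithm: alternate the assignment and update steps."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)]  # k random points
    assign = np.zeros(len(X), dtype=int)
    for _ in range(iters):
        # assignment step: nearest center for every point
        d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        assign = d2.argmin(axis=1)
        # update step: mean of each group (keep a center if its group is empty)
        new = np.array([X[assign == j].mean(axis=0) if np.any(assign == j)
                        else centers[j] for j in range(k)])
        if np.allclose(new, centers):
            break
        centers = new
    return centers, assign

# Two well-separated blobs: the learned means land near (0,0) and (10,10),
# and labeling a test point by its nearest mean then needs only O(CD) work.
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0, 0.5, (50, 2)), rng.normal(10, 0.5, (50, 2))])
centers, assign = kmeans(X, k=2)
print(np.round(centers))
```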

K-means in action

http://simplystatistics.org/2014/02/18/k-means-clustering-in-a-gif/

Approximate kNN

k-d tree: O(log N) query time

  • Recursively split the data at the median, alternating coordinates
  • The result resembles a decision tree: w > w₁? → h > h₁? or w > w₂? → …

http://en.wikipedia.org/wiki/K-d_tree
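A minimal k-d tree sketch under the median-split rule described above (the small point set and the dict-based node layout are illustrative choices):

```python
import math

def build(points, depth=0):
    """Recursively split at the median, alternating coordinates."""
    if not points:
        return None
    axis = depth % len(points[0])
    points = sorted(points, key=lambda p: p[axis])
    mid = len(points) // 2
    return {"point": points[mid], "axis": axis,
            "left": build(points[:mid], depth + 1),
            "right": build(points[mid + 1:], depth + 1)}

def nearest(node, q, best=None):
    """Depth-first search, pruning subtrees that cannot beat the best so far."""
    if node is None:
        return best
    if best is None or math.dist(node["point"], q) < math.dist(best, q):
        best = node["point"]
    axis = node["axis"]
    diff = q[axis] - node["point"][axis]
    near, far = (node["left"], node["right"]) if diff < 0 else (node["right"], node["left"])
    best = nearest(near, q, best)
    if abs(diff) < math.dist(best, q):  # the far side could still hold a closer point
        best = nearest(far, q, best)
    return best

pts = [(2, 3), (5, 4), (9, 6), (4, 7), (8, 1), (7, 2)]
tree = build(pts)
print(nearest(tree, (9, 2)))  # (8, 1)
```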

Summary of kNN

Very simple setup

  • Training: none
  • Testing: find the k nearest neighbors and take the majority class label

An example of a non-parametric classifier: the number of parameters of the classifier grows with the size of the training data.

Practical issues

  • Curse of dimensionality: the worst-case dataset size grows as O(nᵈ)
  • Speed: clustering (using k-means) and k-d trees as approximations

kNN is likely to be competitive when:

  • the number of features is relatively small (< 20)
  • the distance metric is good
  • the dataset is large

Research questions:

  • Learning a good metric
  • Testing speed: RP trees, locality sensitive hashing (LSH), …

Not everything is learnable

It may not be possible to get perfect classification on data.

  • Measurement noise: sensors may be inaccurate
    ➡ e.g. the image is blurry (3 or 7? 2 or 7?)
  • Information gap: sometimes we just don’t have enough information to make accurate predictions
    ➡ e.g. class ratings have high variance: will students like AI? (70% yes, 30% no)

The best error you can get is called the Bayes error.

Let’s do a bit of learning theory …

Bayes optimal classifier and error

(x, y) ∼ D(x, y): training data
ℓ(y, ŷ): loss function
ε(ŷ) = E_(x,y)∼D [ℓ(y, ŷ)]: expected error of a predictor
ε(x, ŷ) = E_y∼D(y;x) [ℓ(y, ŷ)]: expected error of a predictor at x
y*(x) = argmin_ŷ ε(x, ŷ): Bayes optimal classifier
ε*(x) = ε(x, y*): Bayes error

Binary classification: y ∈ {0, 1} with the 0/1 loss ℓ(y, ŷ) = 1 if y ≠ ŷ, 0 otherwise.

y*(x) = argmin_ŷ [D(y = 0; x) ℓ(0, ŷ) + D(y = 1; x) ℓ(1, ŷ)]

y*(x) = 0 if D(y = 0; x) ≥ 0.5, and 1 if D(y = 0; x) < 0.5
ε*(x) = 1 − D(y*(x); x)
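The definitions above are easy to check on a toy discrete distribution (all numbers invented):

```python
# Toy discrete setting: x takes three values with known conditionals.
p_x    = {"a": 0.5, "b": 0.3, "c": 0.2}   # D(x)
p_y0_x = {"a": 0.9, "b": 0.5, "c": 0.2}   # D(y = 0; x)

def bayes_label(x):
    """y*(x) = 0 iff D(y = 0; x) >= 0.5"""
    return 0 if p_y0_x[x] >= 0.5 else 1

# epsilon*(x) = 1 - D(y*(x); x) = min(D(y = 0; x), D(y = 1; x))
bayes_error = sum(p_x[x] * min(p_y0_x[x], 1 - p_y0_x[x]) for x in p_x)

print([bayes_label(x) for x in "abc"])  # [0, 0, 1]
print(bayes_error)                      # 0.5*0.1 + 0.3*0.5 + 0.2*0.2 ≈ 0.24
```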

NN classifier is nearly optimal

Let (x, y) be a test point and (xnn, ynn) its nearest neighbor. Under the 0/1 loss,

ε¹nn(x) = P(y = 1, ynn = 0; x, xnn) + P(y = 0, ynn = 1; x, xnn)
        = D(y = 1; x) D(ynn = 0; xnn) + D(y = 0; x) D(ynn = 1; xnn)

As n → ∞, D(ynn; xnn) → D(y; x), so

ε¹nn(x) → 2 D(y = 1; x) D(y = 0; x) ≤ 2 min(D(y = 1; x), D(y = 0; x)) = 2ε*(x)

Cover-Hart, 1967: ε* ≤ ε¹nn ≤ 2ε*
Devroye, 1981: for any k ≥ 5, ε* ≤ εᵏnn ≤ ε* (1 + √(2/k))

Machine learning solved?

Not really …

kNN is nearly optimal when there is infinite training data

  • Says nothing about the finite sample case
  • Note: not all classifiers are (nearly) optimal even with infinite data

Bayes error is a function of the features (x)

  • We can get a better Bayes error if we choose different features
    ➡ If we had color in addition to width and height, we would be able to classify the fruits more accurately.

How do we understand the performance of learners in the finite sample case?

  • Bias-variance decomposition

Bias-variance decomposition

Standard way to decompose the squared loss ℓ(y, ŷ) = (y − ŷ)².

y = f(x) + ε, with ε ∼ N(0, σ²)   (true function plus noise)
(x1, y1), (x2, y2), . . . , (xn, yn) → f̂(x)   (training algorithm)
f̄(x) = E[f̂(x)]   (expectation of the learned function; the expectation is over datasets)

E[(y − f̂(x))²] = E[(f(x) − f̄(x))²] + E[(f̂(x) − f̄(x))²] + σ²
               =       bias²        +       variance      + noise

Example: curve fitting

y = f(x) + ε, f(x) = sin(πx), ε ∼ N(0, σ²), σ = 0.1

Fit polynomials gn(x) = θ0 + θ1 x + θ2 x² + . . . + θn xⁿ to 50 samples.

figures from https://theclevermachine.wordpress.com/tag/estimator-variance/
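The decomposition can be estimated empirically by refitting on many freshly drawn datasets, mirroring the setup above (a sketch; the degrees and trial count are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(0)
f = lambda x: np.sin(np.pi * x)
sigma, n, trials = 0.1, 50, 200
xs = np.linspace(-1, 1, 100)          # where the fits are evaluated

results = {}
for degree in (1, 3, 9):
    fits = []
    for _ in range(trials):           # the expectation is over datasets
        x = rng.uniform(-1, 1, n)
        y = f(x) + rng.normal(0, sigma, n)
        fits.append(np.polyval(np.polyfit(x, y, degree), xs))
    fits = np.array(fits)
    fbar = fits.mean(axis=0)                     # average learned function
    bias2 = np.mean((f(xs) - fbar) ** 2)
    variance = np.mean((fits - fbar) ** 2)
    results[degree] = (bias2, variance)
    print(f"degree {degree}: bias^2 {bias2:.4f}, variance {variance:.5f}")
```

Low-degree fits have high bias and low variance; high-degree fits the reverse.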

Bias-variance decomposition proof

With y = f + ε, ε ∼ N(0, σ²), and f̄ = E[f̂]:

E[(y − f̂)²] = E[(f + ε − f̂)²]
            = E[(f − f̂)²] + σ²
            = E[(f − f̄ + f̄ − f̂)²] + σ²
            = E[(f − f̄)²] + E[(f̄ − f̂)²] + 2 E[(f − f̄)(f̄ − f̂)] + σ²
            = E[(f − f̄)²] + E[(f̄ − f̂)²] + 2 (f f̄ − f f̄ − f̄ f̄ + f̄ f̄) + σ²
            = E[(f − f̄)²] + E[(f̄ − f̂)²] + σ²

(The cross term vanishes because f and f̄ are deterministic and E[f̂] = f̄.)

A similar decomposition can be obtained for the 0/1 loss.

Bias-variance tradeoff for learners

E[(y − f̂)²] = E[(f − f̄)²] + E[(f̂ − f̄)²] + σ²

error = bias² + variance + noise

The figures show the tradeoff for three learners:

  • curve fitting with polynomials: bias falls and variance grows with the degree of the polynomial
  • decision tree: bias falls and variance grows with the depth of the tree
  • kNN regression: bias grows and variance falls with k

Summary

kNN classifiers

  • geometry, metric, decision boundaries
  • effect of k
  • practical issues
    ➡ curse of dimensionality
    ➡ speed: clustering using k-means, k-d trees

Theory

  • Bayes optimality
  • kNN is nearly Bayes optimal as the training dataset size goes to infinity
  • Bias-variance decomposition
  • Understanding overfitting and underfitting

Slides credit

The fruit classification dataset is from Iain Murray at the University of Edinburgh — http://homepages.inf.ed.ac.uk/imurray2/teaching/oranges_and_lemons/

The slides on texture synthesis are from Efros and Leung’s ICCV 1999 presentation. Figures of the bias-variance tradeoff are from https://theclevermachine.wordpress.com/tag/estimator-variance/.