Nearest Neighbor Classifiers
CSE 4308/5360: Artificial Intelligence I
University of Texas at Arlington
The Nearest Neighbor Classifier
Let X be the space of all possible patterns for some classification problem. Let F be a distance
[Equations: two sums over j = 1, ..., E, the first with a squared term.]
– We have three classes (red, green, yellow).
– Each pattern is a two-dimensional vector.
[Figure: x axis from 0 to 8, y axis from 0 to 6.]
– The first dimension is surface temperature, measured in Fahrenheit.
– The second dimension is weight, measured in pounds.
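| Object ID | Temp. (F) | Weight (lb.) | Temp. (min = 0, max = 1) | Weight (min = 0, max = 1) | Temp. (mean = 0, std = 1) | Weight (mean = 0, std = 1) |
|---|---|---|---|---|---|---|
| 1 | 4700 | 1.5*10^30 | 0.0000 | 0.0108 | | |
| 2 | 11000 | 3.5*10^30 | 0.1525 | 0.0377 | | |
| 3 | 46000 | 7.5*10^31 | 1.0000 | 1.0000 | 1.9218 | 1.9931 |
| 4 | 12000 | 5.0*10^31 | 0.1768 | 0.6635 | | 1.1101 |
| 5 | 20000 | 7.0*10^29 | 0.3705 | 0.0000 | 0.0949 | |
| 6 | 13000 | 2.0*10^30 | 0.2010 | 0.0175 | | |
| 7 | 8500 | 8.5*10^29 | 0.0920 | 0.0020 | | |
| 8 | 34000 | 1.5*10^31 | 0.7094 | 0.1925 | 1.0786 | |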
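The two normalizations in the table can be sketched in a few lines of Python. This is a minimal illustration, not the course's own code: the function names `min_max_scale` and `standardize` are hypothetical, and the use of the sample standard deviation (dividing by n - 1) is an assumption, chosen because it reproduces the standardized values that are present in the table.

```python
import statistics

# Original data copied from the table (object IDs 1 through 8).
temps = [4700, 11000, 46000, 12000, 20000, 13000, 8500, 34000]
weights = [1.5e30, 3.5e30, 7.5e31, 5.0e31, 7.0e29, 2.0e30, 8.5e29, 1.5e31]


def min_max_scale(values):
    """Linearly rescale values so the smallest maps to 0 and the largest to 1."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]


def standardize(values):
    """Shift and scale values to mean 0 and standard deviation 1.

    Assumption: the sample standard deviation (n - 1 in the denominator)
    is used, since it matches the values that are present in the table.
    """
    mean = statistics.mean(values)
    std = statistics.stdev(values)
    return [(v - mean) / std for v in values]


temp_01 = min_max_scale(temps)
weight_01 = min_max_scale(weights)
temp_z = standardize(temps)
weight_z = standardize(weights)

# One row per object: both min-max columns, then both standardized columns.
for i in range(len(temps)):
    print(f"{i + 1}  {temp_01[i]:.4f}  {weight_01[i]:.4f}  "
          f"{temp_z[i]:+.4f}  {weight_z[i]:+.4f}")
```

After either normalization, temperature and weight lie on comparable scales, so neither dimension dominates a distance computation simply because of its units.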
– We need to consider each dimension of each training example.
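A brute-force nearest neighbor classifier makes this cost visible. The sketch below is illustrative only: the function names, the toy training set, and the choice of Euclidean distance are all assumptions rather than details taken from the slides.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors (an assumed choice)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def nearest_neighbor_classify(training_set, query, distance=euclidean):
    """Return the label of the training pattern closest to query.

    training_set is a list of (pattern, label) pairs. Every pattern is
    compared against query in all D dimensions, so one classification
    costs on the order of N * D operations for N training examples.
    """
    best_label, best_dist = None, math.inf
    for pattern, label in training_set:
        d = distance(pattern, query)
        if d < best_dist:
            best_label, best_dist = label, d
    return best_label


# Hypothetical two-dimensional training set with three classes.
training_set = [
    ((1.0, 1.0), "red"),
    ((2.0, 1.5), "red"),
    ((5.0, 5.0), "green"),
    ((6.0, 4.5), "green"),
    ((7.0, 1.0), "yellow"),
]

print(nearest_neighbor_classify(training_set, (5.5, 4.0)))
```

The loop visits every training example, and the distance function visits every dimension, which is where the per-query cost proportional to N times D comes from.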
– Common enough that it has a dedicated Wikipedia article.
– Finding nearest neighbors in low dimensions (such as 1, 2, or 3 dimensions) can be done in close to logarithmic time.
– However, these approaches take time exponential in D.
– By the time you get to 50, 100, or 1000 dimensions, they become hopeless.
– Data in AI often has thousands or millions of dimensions.
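The one-dimensional case shows where the logarithmic time comes from: once the training values are sorted, binary search finds the nearest one. The sketch below uses Python's standard `bisect` module; the function name `nearest_1d` and the sample values are hypothetical, and it covers only one dimension, not the higher-dimensional methods the bullets refer to.

```python
import bisect


def nearest_1d(sorted_values, query):
    """Return the element of sorted_values closest to query.

    sorted_values must be non-empty and sorted in ascending order.
    Binary search takes O(log N) comparisons per query, after a
    one-time O(N log N) sort of the training values.
    """
    i = bisect.bisect_left(sorted_values, query)
    # The nearest value is either just before or at the insertion point.
    candidates = sorted_values[max(i - 1, 0):i + 1]
    return min(candidates, key=lambda v: abs(v - query))


values = sorted([8.5, 1.0, 4.2, 6.7, 3.3])
print(nearest_1d(values, 5.0))
```

Only the two values adjacent to the insertion point need to be checked, so the work per query does not grow with the number of stored values beyond the binary search itself.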