Introduction to Machine Learning
3. Instance Based Learning
Alex Smola, Carnegie Mellon University
http://alex.smola.org/teaching/cmu2013-10-701 (10-701)
Outline
- Parzen windows: kernels, algorithm
- Model selection: crossvalidation, leave-one-out estimation
Parzen Windows
p(y|x) = \frac{p(x, y)}{p(x)} = \frac{p(x|y)\, p(y)}{\sum_{y'} p(x|y')\, p(y')}
Counts from m = 25 observations:

         English  Chinese  German  French  Spanish
male        5        2       3       1       0
female      6        3       2       2       1

Empirical joint probabilities p(x, y) (counts / 25):

         English  Chinese  German  French  Spanish
male       0.20     0.08    0.12    0.04    0.00
female     0.24     0.12    0.08    0.08    0.04
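As an illustration (a sketch, not from the slides; the dictionary layout and function name are mine), Bayes rule applied to the joint probability table above:

```python
# Joint probabilities p(y, x) from the table above (gender x language).
joint = {
    ("male",   "English"): 0.20, ("male",   "Chinese"): 0.08,
    ("male",   "German"):  0.12, ("male",   "French"):  0.04,
    ("male",   "Spanish"): 0.00,
    ("female", "English"): 0.24, ("female", "Chinese"): 0.12,
    ("female", "German"):  0.08, ("female", "French"):  0.08,
    ("female", "Spanish"): 0.04,
}

def p_y_given_x(y, x):
    """p(y|x) = p(x, y) / p(x) with p(x) = sum over y' of p(x, y')."""
    px = sum(v for (yy, xx), v in joint.items() if xx == x)
    return joint.get((y, x), 0.0) / px

print(p_y_given_x("male", "German"))  # 0.12 / (0.12 + 0.08) = 0.6
```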
Problem: not enough data. Each added attribute multiplies the number of cells, so the probability mass per cell also decreases accordingly (here by a factor of 10^10).
[Figure: two density estimate panels over the range 40-110]
How reliable are the empirical estimates? By Hoeffding's inequality, for independent x_i \in [0, 1] with mean \mu,

\Pr\left( \left| \frac{1}{m} \sum_{i=1}^{m} x_i - \mu \right| \geq \epsilon \right) \leq 2 e^{-2m\epsilon^2}

Applying the union bound over all cells a \in A:

\Pr\left( \sup_{a \in A} |\hat{p}(a) - p(a)| \geq \epsilon \right) \leq \sum_{a \in A} \Pr\left( |\hat{p}(a) - p(a)| \geq \epsilon \right) \leq 2|A| \exp(-2m\epsilon^2)

Setting the right-hand side to \delta and solving for \epsilon gives

\epsilon \leq \sqrt{\frac{\log 2|A| - \log \delta}{2m}}
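A quick numerical sanity check of the bound (a sketch, not part of the slides; all names are mine). For m fair coin flips, Hoeffding gives \Pr(|\bar{x} - 0.5| \geq \epsilon) \leq 2 e^{-2m\epsilon^2}:

```python
import numpy as np

rng = np.random.default_rng(0)
m, eps, trials = 100, 0.1, 10_000
means = rng.integers(0, 2, size=(trials, m)).mean(axis=1)   # sample means of 0/1 draws
empirical = np.mean(np.abs(means - 0.5) >= eps)              # observed deviation frequency
bound = 2 * np.exp(-2 * m * eps**2)                          # Hoeffding bound, ~0.27
print(empirical, "<=", round(bound, 4))
```

The observed frequency (about 0.06 here) sits well below the bound, as it must; Hoeffding is loose but distribution-free.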
Density estimation via Parzen windows. The empirical density

p_{\mathrm{emp}}(x) = \frac{1}{m} \sum_{i=1}^{m} \delta_{x_i}(x)

is a sum of point masses. Smooth it with a kernel k_x satisfying \int_{\mathcal{X}} k_x(x')\, dx' = 1 for all x:

\hat{p}(x) = \frac{1}{m} \sum_{i=1}^{m} k_{x_i}(x)
Common kernels on the real line:

- Gauss:        k(x) = (2\pi)^{-1/2} e^{-x^2/2}
- Laplace:      k(x) = \frac{1}{2} e^{-|x|}
- Epanechnikov: k(x) = \frac{3}{4} \max(0, 1 - x^2)
- Uniform:      k(x) = \frac{1}{2} \chi_{[-1,1]}(x)
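Each of these kernels integrates to one, as the normalization condition requires. A small NumPy sketch checking this numerically (names are mine):

```python
import numpy as np

kernels = {
    "gauss":        lambda x: (2 * np.pi) ** -0.5 * np.exp(-0.5 * x**2),
    "laplace":      lambda x: 0.5 * np.exp(-np.abs(x)),
    "epanechnikov": lambda x: 0.75 * np.maximum(0.0, 1 - x**2),
    "uniform":      lambda x: 0.5 * (np.abs(x) <= 1).astype(float),
}

x = np.linspace(-20.0, 20.0, 400_001)   # fine grid; tails beyond +-20 are negligible
dx = x[1] - x[0]
masses = {name: k(x).sum() * dx for name, k in kernels.items()}
print(masses)                            # each mass is ~1 up to grid error
```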
% Gaussian Parzen window estimate at a query point x (Octave):
% X is the d x m data matrix, x the d x 1 query point.
dist = norm(X - x * ones(1,m), 'columns');               % ||x_i - x|| for all i
p = (1/m) * (2*pi)^(-d/2) * sum(exp(-0.5 * dist.^2));    % density estimate
[Figure: Parzen window estimates of the same data over the range 40-110, for kernel widths r = 0.3, 1, 3, 10]
Scaling the kernel by a width r:

k_{x_i}(x) = r^{-d}\, h\!\left( \frac{x - x_i}{r} \right)
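In Python/NumPy the width-r estimator might look as follows (a sketch mirroring the Octave snippet above; the function and variable names are mine):

```python
import numpy as np

def parzen(X, x, r):
    """Parzen window estimate with a Gaussian kernel of width r.
    X: (m, d) data matrix; x: (d,) query point."""
    m, d = X.shape
    dist = np.linalg.norm(X - x, axis=1)                  # ||x_i - x|| for all i
    h = (2 * np.pi) ** (-d / 2) * np.exp(-0.5 * (dist / r) ** 2)
    return h.mean() / r**d                                # (1/m) sum_i r^{-d} h((x - x_i)/r)

X = np.random.default_rng(1).normal(size=(1000, 1))
p0 = parzen(X, np.zeros(1), r=0.5)
print(p0)  # roughly the (slightly smoothed) N(0,1) density at 0
```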
How should r be chosen? One idea: maximize the likelihood of the data,

\Pr\{X\} = \prod_{i=1}^{m} p(x_i)
Overfitting: the in-sample score \frac{1}{n} \sum_{i=1}^{n} \log \hat{p}(x_i) overestimates the expected log-likelihood \mathbf{E}_x[\log \hat{p}(x)]. Instead, evaluate on held-out data X' = \{x'_1, \ldots, x'_{n'}\}:

L(X'|X) := \frac{1}{n'} \sum_{i=1}^{n'} \log \hat{p}(x'_i)
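The overfitting effect is easy to see numerically. In the sketch below (NumPy; names are mine), shrinking r keeps improving the in-sample score while the held-out score degrades:

```python
import numpy as np

def loglik(Xeval, Xtr, r):
    """Mean log-density of Xeval under a Gaussian KDE of width r fit on Xtr (1-d)."""
    K = np.exp(-0.5 * ((Xeval[:, None] - Xtr[None, :]) / r) ** 2)
    return np.log(K.mean(axis=1) / (r * np.sqrt(2 * np.pi))).mean()

rng = np.random.default_rng(2)
Xtr, Xval = rng.normal(size=400), rng.normal(size=200)
for r in (1.0, 0.3, 0.03):
    print(r, loglik(Xtr, Xtr, r), loglik(Xval, Xtr, r))  # in-sample vs held-out
```

The tiny width r = 0.03 wins in-sample (the estimate spikes at the training points) but loses on held-out data.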
Leave-one-out crossvalidation: with \hat{p}(x) = \frac{1}{n} \sum_{i=1}^{n} k(x_i, x),

\log p(x_i \mid X \setminus x_i) = \log \frac{1}{n-1} \sum_{j \neq i} k(x_i, x_j)

so the average leave-one-out log-likelihood is

\frac{1}{n} \sum_{i=1}^{n} \log \left[ \frac{n}{n-1}\, \hat{p}(x_i) - \frac{1}{n-1}\, k(x_i, x_i) \right]
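The identity behind this formula, \frac{1}{n-1} \sum_{j \neq i} k(x_i, x_j) = \frac{n}{n-1} \hat{p}(x_i) - \frac{1}{n-1} k(x_i, x_i), can be verified directly on random data (NumPy sketch; names are mine):

```python
import numpy as np

rng = np.random.default_rng(3)
X = rng.normal(size=(50, 2))
D = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)   # pairwise distances
K = np.exp(-0.5 * D**2)              # Gaussian kernel (normalization irrelevant here)
n = len(X)

p_hat = K.mean(axis=1)                              # p_hat(x_i) = (1/n) sum_j k(x_i, x_j)
loo_direct = (K.sum(axis=1) - np.diag(K)) / (n - 1)  # drop the self term explicitly
loo_formula = n / (n - 1) * p_hat - np.diag(K) / (n - 1)
print(np.allclose(loo_direct, loo_formula))          # True
```

This is why leave-one-out is cheap for Parzen windows: one pass over \hat{p}(x_i) suffices, with the self term subtracted off.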
k-fold crossvalidation: partition X into k subsets X_1, \ldots, X_k and average

\frac{1}{k} \sum_{i=1}^{k} l\left( p(X_i \mid X \setminus X_i) \right)

(the error estimate is for a training set of size \frac{k-1}{k} n)
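The k-fold procedure for bandwidth selection can be sketched in a few lines (NumPy; names are mine):

```python
import numpy as np

def kde_loglik(Xtr, Xval, r):
    """Mean log-density of Xval under a Gaussian KDE of width r fit on Xtr (1-d)."""
    K = np.exp(-0.5 * ((Xval[:, None] - Xtr[None, :]) / r) ** 2)
    return np.log(K.mean(axis=1) / (r * np.sqrt(2 * np.pi))).mean()

def kfold_score(X, r, k=5):
    """Average held-out log-likelihood over k folds."""
    folds = np.array_split(X, k)
    return np.mean([
        kde_loglik(np.concatenate(folds[:i] + folds[i + 1:]), folds[i], r)
        for i in range(k)
    ])

X = np.random.default_rng(5).normal(size=500)
scores = {r: kfold_score(X, r) for r in (0.05, 0.4, 3.0)}
print(scores)  # the moderate width scores best
```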
Watson-Nadaraya estimator (Geoffrey Watson)
Estimate the class-conditional densities p(x|y = 1) and p(x|y = -1) with Parzen windows. Then

p(y|x) = \frac{p(x|y)\, p(y)}{p(x)} = \frac{\frac{1}{m_y} \sum_{y_i = y} k(x_i, x) \cdot \frac{m_y}{m}}{\frac{1}{m} \sum_{i} k(x_i, x)}

p(y = 1|x) - p(y = -1|x) = \frac{\sum_j y_j k(x_j, x)}{\sum_i k(x_i, x)} = \sum_j y_j \frac{k(x_j, x)}{\sum_i k(x_i, x)}
% Watson-Nadaraya classifier with a Gaussian kernel (Octave):
% sign(f) is the predicted label; the positive normalizer sum_i k(x_i, x)
% is omitted since it does not change the sign.
dist = norm(X - x * ones(1,m), 'columns');
f = sum(y .* exp(-0.5 * dist.^2));
The same formula works for real-valued y (Watson-Nadaraya regression):

\hat{y}(x) = \sum_j y_j \frac{k(x_j, x)}{\sum_i k(x_i, x)}
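A minimal NumPy sketch of the regression estimator (function and data are mine, assuming a Gaussian kernel):

```python
import numpy as np

def watson_nadaraya(Xtr, ytr, x, r=0.3):
    """y_hat(x) = sum_j y_j k(x_j, x) / sum_i k(x_i, x), Gaussian kernel of width r."""
    k = np.exp(-0.5 * ((Xtr - x) / r) ** 2)   # unnormalized kernel weights
    w = k / k.sum()                            # weights sum to 1
    return w @ ytr

rng = np.random.default_rng(6)
Xtr = rng.uniform(-3, 3, size=200)
ytr = np.sin(Xtr) + 0.1 * rng.normal(size=200)
print(watson_nadaraya(Xtr, ytr, 1.5))  # close to sin(1.5), up to smoothing bias
```

The estimator is a locally weighted average: points near x dominate, distant points are downweighted by the kernel.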
Adaptive bandwidths (Bernard Silverman): scale the kernel width at x_i by the average distance to its k nearest neighbors,

r_i = \frac{r}{k} \sum_{x \in \mathrm{NN}(x_i, k)} \| x_i - x \|
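Computing these adaptive widths is straightforward (NumPy sketch; names and data are mine): dense regions get small r_i, sparse regions large r_i.

```python
import numpy as np

def adaptive_bandwidths(X, k=5, r=1.0):
    """r_i = (r/k) * sum of distances from x_i to its k nearest neighbors."""
    D = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)  # pairwise distances
    D_sorted = np.sort(D, axis=1)
    return r / k * D_sorted[:, 1:k + 1].sum(axis=1)  # column 0 is the zero self-distance

rng = np.random.default_rng(7)
X = np.concatenate([rng.normal(0, 0.1, (50, 1)),   # tight cluster
                    rng.normal(5, 2.0, (50, 1))])  # diffuse cluster
r_i = adaptive_bandwidths(X)
print(r_i[:50].mean(), r_i[50:].mean())  # tight cluster gets smaller bandwidths
```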
Interpreting the estimator as a weighted average, with weights summing to 1:

\hat{y}(x) = \sum_j y_j \frac{k(x_j, x)}{\sum_i k(x_i, x)} = \sum_j y_j w_j(x)
Consistency sketch: use an increasing number of neighbors k, so that the stochastic error can be made small (e.g. via Hoeffding's theorem for the tail), up to some approximation error in the neighborhood; then show that this approximation error vanishes as well.
Further reading:
- Cover trees for nearest neighbor search: http://hunch.net/~jl/projects/cover_tree/cover_tree.html
- Andrew Moore's tutorial from his PhD thesis
- http://dx.doi.org/10.1137/1109020
- http://www.jstor.org/stable/25049340
- The np package for R: http://cran.r-project.org/web/packages/np/index.html
- http://projecteuclid.org/euclid.aos/1176343886
- Rates of Convergence for Nearest Neighbor Procedures: http://www-isl.stanford.edu/people/cover/papers/transIT/0021cove.pdf
- http://cseweb.ucsd.edu/~dasgupta/papers/nnactive.pdf
- http://cgm.cs.mcgill.ca/~godfried/teaching/pr-notes/dasarathy.pdf
- http://valis.cs.uiuc.edu/~sariel/papers/04/survey/survey.pdf