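PCA Wrap Up
Projection perspective

$\tilde{x}_i = B z_i$, with $B$ orthonormal again.

Want to minimize the reconstruction error
$$J_M = \frac{1}{N} \sum_{i=1}^{N} \lVert x_i - \tilde{x}_i \rVert^2$$

What are the optimal coordinates $z_i$ for $x_i$ w.r.t. $B$?
$$\frac{\partial J_M}{\partial z_{ji}} = \frac{\partial J_M}{\partial \tilde{x}_i} \, \frac{\partial \tilde{x}_i}{\partial z_{ji}}$$
$$\frac{\partial J_M}{\partial \tilde{x}_i} = -\frac{2}{N} (x_i - \tilde{x}_i)^\top$$
$$\frac{\partial \tilde{x}_i}{\partial z_{ji}} = \frac{\partial}{\partial z_{ji}} \Big( \sum_{m=1}^{M} z_{mi} b_m \Big) = b_j$$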
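So
$$\frac{\partial J_M}{\partial z_{ji}} = -\frac{2}{N} (x_i - \tilde{x}_i)^\top b_j = -\frac{2}{N} \Big( x_i - \sum_{m=1}^{M} z_{mi} b_m \Big)^\top b_j = -\frac{2}{N} \big( x_i^\top b_j - z_{ji} \big)$$
using $b_m^\top b_j = 1$ if $m = j$ and $0$ if $m \neq j$.
Set to 0: $z_{ji} = x_i^\top b_j$

A similar argument for the choice of $B$ (the basis) can be made, yielding again the eigenvectors of the data covariance with the $M$ largest eigenvalues (see reading).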
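The result $z_{ji} = x_i^\top b_j$ can be checked numerically. The following is a minimal sketch in plain Python, assuming a made-up point in $\mathbb{R}^2$ and a single unit-length basis vector ($M = 1$); the helper names `dot`, `reconstruct`, and `sq_error` are illustrative and not from the lecture.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def reconstruct(z, basis):
    """x_tilde = sum_m z_m * b_m for coordinates z and basis vectors b_m."""
    dim = len(basis[0])
    return [sum(z_m * b[k] for z_m, b in zip(z, basis)) for k in range(dim)]

def sq_error(x, z, basis):
    """Squared reconstruction error ||x - x_tilde||^2 for one point."""
    x_tilde = reconstruct(z, basis)
    return sum((a - b) ** 2 for a, b in zip(x, x_tilde))

# Hypothetical data: one point in R^2 and one orthonormal basis vector.
x = [3.0, 1.0]
basis = [[1 / math.sqrt(2), 1 / math.sqrt(2)]]

# Optimal coordinates from the derivation: z_j = x^T b_j.
z_opt = [dot(x, b) for b in basis]
best = sq_error(x, z_opt, basis)

# Moving the coordinate away from z_opt in either direction raises the error.
worse = [sq_error(x, [z_opt[0] + delta], basis) for delta in (-0.5, 0.5)]
print(best, worse)
```

Because the basis vector has unit length, shifting the coordinate by `delta` adds `delta ** 2` to the squared error, which is why the projection coefficient is the minimizer.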
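Stochastic Neighbor Embedding (SNE)

Aim: $X$ (very high dim) $\to$ $Y$ (very low dim)

Define a conditional probability that encodes similarity:
$$p_{j|i} = \frac{\exp\!\big(-\lVert x_i - x_j \rVert^2 / 2\sigma_i^2\big)}{\sum_{k \neq i} \exp\!\big(-\lVert x_i - x_k \rVert^2 / 2\sigma_i^2\big)}$$
This is in the high-dim space (points $x_i$, $x_j$, $x_k$).

Similarly, in the map:
$$q_{j|i} = \frac{\exp\!\big(-\lVert y_i - y_j \rVert^2\big)}{\sum_{k \neq i} \exp\!\big(-\lVert y_i - y_k \rVert^2\big)}$$
Ideally $q_{j|i} = p_{j|i}$.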
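To make the conditional probability $p_{j|i}$ concrete, here is a small sketch in plain Python. It assumes a fixed bandwidth `sigma` for the chosen point and sets $p_{i|i} = 0$; the toy points and the function name `conditional_p` are hypothetical.

```python
import math

def sq_dist(u, v):
    return sum((a - b) ** 2 for a, b in zip(u, v))

def conditional_p(X, i, sigma):
    """Return [p_{j|i} for every j], with p_{i|i} = 0, for bandwidth sigma."""
    weights = [
        0.0 if j == i else math.exp(-sq_dist(X[i], X[j]) / (2 * sigma ** 2))
        for j in range(len(X))
    ]
    total = sum(weights)
    return [w / total for w in weights]

# Hypothetical toy data: point 1 is close to point 0, point 2 is far away.
X = [[0.0, 0.0], [1.0, 0.0], [5.0, 5.0]]
p_given_0 = conditional_p(X, 0, sigma=1.0)
print(p_given_0)
```

The near neighbor receives almost all of the probability mass, which is the sense in which $p_{j|i}$ encodes similarity.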
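Formalize this with the KL divergence.
KL: how different is the distribution $q$ from $p$?
Properties:
1. $\mathrm{KL}(q \,\|\, p) \geq 0$
2. If $\mathrm{KL}(q \,\|\, p) = 0$, then $q = p$
3. $\mathrm{KL}(q \,\|\, p) \neq \mathrm{KL}(p \,\|\, q)$ in general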
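The three properties can be seen on a small example. This sketch assumes the standard discrete definition $\mathrm{KL}(q \,\|\, p) = \sum_k q_k \log (q_k / p_k)$; the two distributions are made up for illustration.

```python
import math

def kl(q, p):
    """KL(q || p) = sum_k q_k * log(q_k / p_k); terms with q_k = 0 contribute 0."""
    return sum(qk * math.log(qk / pk) for qk, pk in zip(q, p) if qk > 0)

# Hypothetical distributions over three outcomes.
p = [0.5, 0.4, 0.1]
q = [0.3, 0.3, 0.4]

kl_qp = kl(q, p)   # non-negative
kl_pq = kl(p, q)   # non-negative, but a different number: KL is not symmetric
kl_pp = kl(p, p)   # zero when the two distributions coincide
print(kl_qp, kl_pq, kl_pp)
```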
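Cost function:
$$C = \sum_i \mathrm{KL}(P_i \,\|\, Q_i) = \sum_i \sum_j p_{j|i} \log \frac{p_{j|i}}{q_{j|i}}$$
Same KL, but on the conditional distributions: the inner sum runs over all points $j$ given $i$, the outer sum over all points $i$ in the data set $D$.
To place the points, find the $y_i$ that minimize $C$, using gradient descent on $\partial C / \partial y_i$.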
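As an illustration of what the cost measures, the sketch below evaluates the standard SNE cost, $C = \sum_i \sum_j p_{j|i} \log (p_{j|i} / q_{j|i})$, for two candidate maps of the same toy data. The data, both maps, and the choice $\sigma_i = 1$ for every point are assumptions made for the example.

```python
import math

def sq_dist(u, v):
    return sum((a - b) ** 2 for a, b in zip(u, v))

def conditionals(points, two_sigma_sq):
    """Row i holds the conditional distribution over j given i (diagonal = 0)."""
    n = len(points)
    rows = []
    for i in range(n):
        w = [0.0 if j == i else math.exp(-sq_dist(points[i], points[j]) / two_sigma_sq)
             for j in range(n)]
        total = sum(w)
        rows.append([v / total for v in w])
    return rows

def sne_cost(P, Q):
    """C = sum_i sum_j p_{j|i} * log(p_{j|i} / q_{j|i})."""
    return sum(p * math.log(p / q)
               for p_row, q_row in zip(P, Q)
               for p, q in zip(p_row, q_row) if p > 0)

# Hypothetical data: two tight pairs of points in 3-D.
X = [[0.0, 0.0, 0.0], [0.5, 0.0, 0.0], [4.0, 4.0, 4.0], [4.5, 4.0, 4.0]]
Y_good = [[0.0], [0.5], [3.0], [3.5]]   # 1-D map that keeps the pairs together
Y_bad = [[0.0], [3.0], [0.5], [3.5]]    # 1-D map that splits the pairs

P = conditionals(X, two_sigma_sq=2.0)                         # sigma_i = 1
cost_good = sne_cost(P, conditionals(Y_good, two_sigma_sq=1.0))  # map kernel exp(-d^2)
cost_bad = sne_cost(P, conditionals(Y_bad, two_sigma_sq=1.0))
print(cost_good, cost_bad)
```

The map that keeps neighbors together has a much lower cost, so a minimizer of $C$ is pushed toward neighbor-preserving layouts.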
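Symmetric SNE

Instead of the conditionals $p_{i|j}$, $q_{i|j}$, define joint distributions $p_{ij}$, $q_{ij}$:
$$q_{ij} = \frac{\exp\!\big(-\lVert y_i - y_j \rVert^2\big)}{\sum_{k \neq l} \exp\!\big(-\lVert y_k - y_l \rVert^2\big)}$$
with $p_{ij} = p_{ji}$ and $q_{ij} = q_{ji}$.

For the high-dim space, outliers pose a challenge: because the denominator will be large compared with the numerator, $p_{ij}$ is small for every $j$ and the point is unimportant to the cost.
Instead:
$$p_{ij} = \frac{p_{j|i} + p_{i|j}}{2n}, \qquad \text{so that } \sum_j p_{ij} > \frac{1}{2n} \;\; \forall x_i$$
Yields a nicer gradient:
$$\frac{\partial C}{\partial y_i} = 4 \sum_j (p_{ij} - q_{ij})(y_i - y_j)$$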
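The symmetrized $p_{ij} = (p_{j|i} + p_{i|j}) / 2n$ and the gradient $4 \sum_j (p_{ij} - q_{ij})(y_i - y_j)$ can be sketched in plain Python as follows. The toy data (three nearby points plus one outlier), the starting map, $\sigma_i = 1$, and the learning rate are all assumptions chosen for illustration.

```python
import math

def sq_dist(u, v):
    return sum((a - b) ** 2 for a, b in zip(u, v))

def conditional_rows(X, sigma):
    """Row i holds p_{j|i} for every j, with the diagonal set to 0."""
    n = len(X)
    rows = []
    for i in range(n):
        w = [0.0 if j == i else math.exp(-sq_dist(X[i], X[j]) / (2 * sigma ** 2))
             for j in range(n)]
        total = sum(w)
        rows.append([v / total for v in w])
    return rows

def joint_p(X, sigma=1.0):
    """p_ij = (p_{j|i} + p_{i|j}) / (2n)."""
    n = len(X)
    cond = conditional_rows(X, sigma)
    return [[(cond[i][j] + cond[j][i]) / (2 * n) for j in range(n)] for i in range(n)]

def joint_q(Y):
    """q_ij = exp(-||y_i - y_j||^2) / sum_{k != l} exp(-||y_k - y_l||^2)."""
    n = len(Y)
    w = [[0.0 if i == j else math.exp(-sq_dist(Y[i], Y[j])) for j in range(n)]
         for i in range(n)]
    total = sum(map(sum, w))
    return [[v / total for v in row] for row in w]

def cost(P, Q):
    """C = sum_{i, j} p_ij * log(p_ij / q_ij)."""
    return sum(p * math.log(p / q)
               for p_row, q_row in zip(P, Q)
               for p, q in zip(p_row, q_row) if p > 0)

def gradient(P, Q, Y):
    """dC/dy_i = 4 * sum_j (p_ij - q_ij) * (y_i - y_j)."""
    n, dim = len(Y), len(Y[0])
    return [[4 * sum((P[i][j] - Q[i][j]) * (Y[i][k] - Y[j][k]) for j in range(n))
             for k in range(dim)] for i in range(n)]

# Hypothetical data: the last point is an outlier in the high-dim space.
X = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [10.0, 10.0]]
Y = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]   # arbitrary starting map

P = joint_p(X)
grad = gradient(P, joint_q(Y), Y)

# One plain gradient-descent step with an assumed small learning rate.
learning_rate = 0.02
Y_next = [[y - learning_rate * g for y, g in zip(yi, gi)] for yi, gi in zip(Y, grad)]
cost_before = cost(P, joint_q(Y))
cost_after = cost(P, joint_q(Y_next))
print(cost_before, cost_after)
```

Even the outlier's row of `P` sums to roughly $1/2n$, so it still contributes to the cost and receives a non-trivial gradient.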
The crowding problem

Not enough space in lower dims for points that are far apart.
In t-SNE, we model the joint probability $q_{ij}$ using a Student t distribution, which has heavier tails:
moderate distance in $X$ $\to$ big distance in $Y$.
That is, it does not force moderate distances in $X$ to yield small distances in $Y$.
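The effect of the heavier tails can be illustrated by comparing the two unnormalized kernels: the Gaussian $\exp(-d^2)$ and the Student t kernel with one degree of freedom, $1 / (1 + d^2)$, which is the form used in the t-SNE paper. The sketch below asks which map distance $d$ each kernel needs in order to produce a given similarity value; the similarity values are arbitrary and normalization is ignored, so this is only a qualitative picture.

```python
import math

def gaussian_kernel(d):
    """Unnormalized Gaussian similarity for map distance d."""
    return math.exp(-d ** 2)

def student_t_kernel(d):
    """Unnormalized Student t similarity (one degree of freedom)."""
    return 1.0 / (1.0 + d ** 2)

def gaussian_distance_for(s):
    """Distance d at which the Gaussian kernel equals s (0 < s <= 1)."""
    return math.sqrt(-math.log(s))

def student_t_distance_for(s):
    """Distance d at which the Student t kernel equals s (0 < s <= 1)."""
    return math.sqrt(1.0 / s - 1.0)

# Arbitrary similarity levels, from close to moderately far.
for s in (0.5, 0.1, 0.01):
    print(s, round(gaussian_distance_for(s), 3), round(student_t_distance_for(s), 3))
```

For the same similarity the Student t kernel places the pair farther apart in the map, and the gap widens as the similarity drops, which is how moderate distances get room to spread out.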
Autoencoders

Design a network that consumes $X$ and then reconstructs it.

[Diagram, bottom to top: input $x$ (784) -> encode -> 200 -> 20 (bottleneck) -> 200 -> decode -> output $\tilde{x}$ (784)]
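The layer sizes in the diagram (784, 200, 20, 200, 784) can be turned into a forward pass. The sketch below uses plain Python with random, untrained weights. The tanh hidden activations, the linear output layer, the absence of bias terms, and the initialization scale are all assumptions; the notes give only the layer sizes.

```python
import math
import random

LAYER_SIZES = [784, 200, 20, 200, 784]  # from the diagram; 20 is the bottleneck

def init_layer(n_in, n_out, rng):
    """Random n_out x n_in weight matrix (assumed initialization, no biases)."""
    scale = 1.0 / math.sqrt(n_in)
    return [[rng.uniform(-scale, scale) for _ in range(n_in)] for _ in range(n_out)]

def forward(x, weights):
    """Return the activations of every layer, input included."""
    activations = [x]
    for k, W in enumerate(weights):
        pre = [sum(w * a for w, a in zip(row, activations[-1])) for row in W]
        is_output = k == len(weights) - 1
        # Assumption: tanh on hidden layers, linear output layer.
        activations.append(pre if is_output else [math.tanh(v) for v in pre])
    return activations

rng = random.Random(0)
weights = [init_layer(a, b, rng) for a, b in zip(LAYER_SIZES, LAYER_SIZES[1:])]

x = [rng.random() for _ in range(LAYER_SIZES[0])]  # stand-in for one input vector
acts = forward(x, weights)
code, x_tilde = acts[2], acts[-1]
print(len(code), len(x_tilde))
```

Training would adjust the weights so that `x_tilde` is close to `x`; the 20-dimensional `code` is then the low-dimensional representation.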
Simplest version: $W$, $V$
Note: this is just linear dim. reduction and should look familiar.
$$z = W x, \qquad \tilde{x} = V z$$
with $W$: $M \times d$, $x$: $d \times 1$, $z$: $M \times 1$, $V$: $d \times M$.
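The linear version, $z = Wx$ and $\tilde{x} = Vz$, fits in a few lines. This sketch assumes tied weights $W = V^\top$ with orthonormal columns in $V$, so that it mirrors the PCA projection; the sizes $d = 3$, $M = 2$ and the numbers are made up.

```python
def matvec(A, x):
    """Matrix-vector product for a matrix stored as a list of rows."""
    return [sum(a * b for a, b in zip(row, x)) for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

# Hypothetical sizes: d = 3 inputs, M = 2 bottleneck units.
V = [[1.0, 0.0],
     [0.0, 1.0],
     [0.0, 0.0]]          # decoder, d x M, orthonormal columns
W = transpose(V)          # encoder, M x d (assumption: tied weights, W = V^T)

x = [2.0, -1.0, 0.5]      # input, d x 1
z = matvec(W, x)          # code, M x 1
x_tilde = matvec(V, z)    # reconstruction, d x 1
print(z, x_tilde)
```

With this choice the code $z$ holds the coordinates $x^\top b_j$ and the reconstruction drops the component of $x$ outside the span of the columns of $V$.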
More next time.