Dimensionality Reduction & Embedding
Tufts COMP 135: Introduction to Machine Learning https://www.cs.tufts.edu/comp/135/2019s/
Many ideas/slides attributable to: Liping Liu (Tufts), Emily Fox (UW), Matt Gormley (CMU)
- Prof. Mike Hughes
Mike Hughes - Tufts COMP 135 - Spring 2019
[Figure: Supervised Learning, Unsupervised Learning, Reinforcement Learning. Panel labels: data examples x (indexed from n=1), task summary, performance measure.]
[Figure: Supervised Learning, Unsupervised Learning, Reinforcement Learning. Panel label: embedding.]
Take K=50
PRO
slow
K=4
CON
Credit: Luuk Derksen (https://medium.com/@luckylwk/visualising-high-dimensional-datasets-using-pca-and-t-sne-in-python-8ef87e7915b)
https://distill.pub/2016/misread-tsne/
*() approximates the utility +#)
from the same user;
scores to the same item
Goal: map each word in vocabulary to an embedding vector
vec(swimming) – vec(swim) + vec(walk) ≈ vec(walking)
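The analogy above can be tried out on a toy example. The sketch below is only an illustration: the 3-dimensional vectors in `toy_vec` are hand-picked assumptions (real embeddings are learned from text and have far more dimensions), and the helper names `cosine` and `analogy` are hypothetical, not part of any library.

```python
import math

# Hypothetical toy embeddings, hand-picked for illustration only.
# Axis 0 loosely means "swim", axis 1 "walk", axis 2 "-ing form".
toy_vec = {
    "swim":     [1.0, 0.0, 0.0],
    "swimming": [1.0, 0.0, 1.0],
    "walk":     [0.0, 1.0, 0.0],
    "walking":  [0.0, 1.0, 1.0],
    "hammer":   [0.3, 0.2, -0.8],
}

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def analogy(a, b, c, vectors):
    """Return the word nearest to vec(a) - vec(b) + vec(c), excluding a, b, c."""
    query = [x - y + z for x, y, z in zip(vectors[a], vectors[b], vectors[c])]
    candidates = [w for w in vectors if w not in (a, b, c)]
    return max(candidates, key=lambda w: cosine(query, vectors[w]))

result = analogy("swimming", "swim", "walk", toy_vec)
print(result)
```

Excluding the three query words from the candidates is a common convention in analogy lookups; without it, the nearest neighbor of the query vector is often one of the input words themselves.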
Reward embeddings that predict nearby words in the sentence.
[Figure: example words (tacos, staff, dinosaur, hammer) and their embedding vectors. Embedding dimensions: typically 100-1000.]
Credit: https://www.tensorflow.org/tutorials/representation/word2vec
Fixed vocabulary: typically 1000-100k.
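To make "predict nearby words in the sentence" concrete, here is a minimal sketch of how (center word, nearby word) training pairs might be extracted in a skip-gram style setup. The function name `skipgram_pairs`, the `window` parameter, and the example sentence are assumptions for illustration; this is not the code from the credited tutorial.

```python
def skipgram_pairs(tokens, window=2):
    """List (center, context) pairs for each word and its neighbors
    within `window` positions on either side."""
    pairs = []
    for i, center in enumerate(tokens):
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:  # a word is not its own context
                pairs.append((center, tokens[j]))
    return pairs

# Hypothetical example sentence, split on whitespace.
sentence = "the staff served tacos".split()
pairs = skipgram_pairs(sentence, window=1)
for center, context in pairs:
    print(center, "->", context)
```

An embedding model would then be trained so that each center word's vector scores its observed context words higher than unrelated words. A larger `window` yields more pairs per sentence and tends to capture broader topical similarity.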