On the Downstream Performance of Compressed Word Embeddings. NeurIPS Spotlight 12/12/19.
On the Downstream Performance of Compressed Word Embeddings
Avner May, Jian Zhang, Tri Dao, Chris Ré
Stanford University
Better compression quality measure, yet worse downstream performance.
1. Define a new measure of compression quality.
2. Prove generalization bounds using this measure.
3. Show strong empirical correlation with downstream performance.
4. Use measure to select compressed embeddings: up to 2x lower selection error rates than the next best measure.
Predictions are determined by data matrix’s left singular vectors.
Embedding matrix: take its singular value decomposition.
Regression label: y.
Linear regressor predictions: project y onto the span of the left singular vectors.
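The projection claim above can be checked numerically. The following is a minimal sketch, not code from the paper: the matrix sizes, the random stand-in data, and all variable names are assumptions made for illustration. It compares ordinary least-squares predictions on the training points with the projection of y onto the span of the left singular vectors, assuming the data matrix has full column rank.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 50, 5                      # hypothetical sizes: n samples, d embedding dims
X = rng.standard_normal((n, d))   # random stand-in for the embedding (data) matrix
y = rng.standard_normal(n)        # random stand-in for the regression labels

# Thin SVD: the columns of U are the left singular vectors of X.
U, S, Vt = np.linalg.svd(X, full_matrices=False)

# Ordinary least-squares fit and its predictions on the training points.
w, *_ = np.linalg.lstsq(X, y, rcond=None)
pred_lstsq = X @ w

# Projection of y onto the span of the left singular vectors.
pred_proj = U @ (U.T @ y)

# Largest elementwise difference between the two prediction vectors.
max_gap = float(np.max(np.abs(pred_lstsq - pred_proj)))
```

Under these assumptions the two prediction vectors should agree up to floating-point error, which is why only the span of the left singular vectors matters for the training-set predictions.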
The eigenspace overlap score (EOS) measures the similarity between the spans of the left singular vectors of the compressed and uncompressed embedding matrices.
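A minimal sketch of how such a score could be computed is given below. It assumes the normalization ||U^T Ũ||_F^2 / max(d, k), where U and Ũ hold the left singular vectors of the uncompressed and compressed matrices; this formula is our reading of the paper's definition and does not appear on the slides. The function name, the random data, and the 1-bit sign quantizer used as a stand-in compressor are all hypothetical.

```python
import numpy as np

def eigenspace_overlap_score(X, X_tilde):
    """Sketch of EOS as ||U^T U_tilde||_F^2 / max(d, k).

    Assumes both matrices have full column rank, so that the thin SVD
    returns exactly the left singular vectors spanning each column space.
    """
    U, _, _ = np.linalg.svd(X, full_matrices=False)
    U_tilde, _, _ = np.linalg.svd(X_tilde, full_matrices=False)
    d, k = U.shape[1], U_tilde.shape[1]
    return float(np.linalg.norm(U.T @ U_tilde, "fro") ** 2 / max(d, k))

rng = np.random.default_rng(0)
X = rng.standard_normal((200, 10))    # random stand-in for uncompressed embeddings
X_sign = np.sign(X)                   # crude 1-bit quantization, illustration only

score_same = eigenspace_overlap_score(X, X)        # identical spans
score_sign = eigenspace_overlap_score(X, X_sign)   # partially overlapping spans
```

With this normalization the score lies between 0 and 1, and it equals 1 when the two matrices span the same subspace.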
Higher EOS, better downstream performance.
EOS attains strong correlation with downstream model accuracy.
[Figure: downstream accuracy (higher is better) plotted against compression quality measure (higher quality to the right); one panel is labeled EOS, and the slide cites [1].]
[1] Yin and Shen, On the Dimensionality of Word Embeddings. NeurIPS 2018.
EOS attains up to 2x lower selection error rates than the second-best measure.
[Figure: selection error rate (%) across NLP tasks; the slide cites [1], [2], and [3].]
[1] Avron et al., ICML 2017.
[2] Yin and Shen, NeurIPS 2018.
[3] Zhang et al., AISTATS 2019.
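The slides do not spell out how the selection error rate is computed. The sketch below assumes one plausible pairwise definition: for every pair of compressed embeddings, pick the one the measure scores higher, and count an error when that pick has the lower downstream accuracy. The function name and the toy numbers are hypothetical and are not results from the paper.

```python
from itertools import combinations

def selection_error_rate(scores, accuracies):
    """Fraction of pairs where the higher-scored embedding has lower accuracy."""
    errors, total = 0, 0
    for i, j in combinations(range(len(scores)), 2):
        if scores[i] == scores[j] or accuracies[i] == accuracies[j]:
            continue  # skip ties: no clear pick or no clear winner
        total += 1
        picked = i if scores[i] > scores[j] else j
        best = i if accuracies[i] > accuracies[j] else j
        errors += picked != best
    return errors / total if total else 0.0

# Hypothetical numbers, for illustration only (not results from the paper).
toy_scores = [0.9, 0.7, 0.8, 0.4]
toy_accuracies = [0.85, 0.80, 0.78, 0.60]
rate = selection_error_rate(toy_scores, toy_accuracies)
```

In this toy example only the pair formed by the second and third embeddings is ranked in the wrong order, so one of the six pairs counts as a selection error.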
1. Defined a new measure of compression quality.
2. Proved generalization bounds using this measure.
3. Showed strong empirical correlation with downstream performance.
4. Used measure to select compressed embeddings.
Paper: https://arxiv.org/pdf/1909.01264.pdf
Code: https://github.com/HazyResearch/smallfry
E-mail: avnermay@cs.stanford.edu