Principal Components Analysis (PCA)

Principal Components Analysis (PCA), Prof. Mike Hughes. PowerPoint PPT presentation.



  1. Tufts COMP 135: Introduction to Machine Learning https://www.cs.tufts.edu/comp/135/2020f/ Principal Components Analysis (PCA), Prof. Mike Hughes. Many ideas/slides attributable to: Liping Liu (Tufts), Emily Fox (UW), Matt Gormley (CMU)

  2. What will we learn? Supervised Learning: data (examples $\{x_n\}_{n=1}^{N}$), task, performance measure. Unsupervised Learning: a summary of the data $x$. Reinforcement Learning. Mike Hughes - Tufts COMP 135 - Fall 2020

  3. Task: Embedding, under Unsupervised Learning. (Figure: a 2D scatter plot with axes $x_1$ and $x_2$ illustrating an embedding.)

  4. Dim. Reduction/Embedding Unit Objectives • Goals of dimensionality reduction: reduce feature vector size (keep signal, discard noise); “interpret” features: visualize/explore/understand • Common approaches: Principal Component Analysis (PCA); word2vec and other neural embeddings • Evaluation metrics: storage size; reconstruction error; “interpretability”

  5. Example: 2D viz. of movies

  6. Example: Genes vs. geography (Nature, 2008)

  7. Centering the Data. Goal: each feature’s mean = 0.0
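The centering step can be sketched in NumPy (a minimal sketch with made-up toy data; the variable names `x_NF`, `m_F`, `xc_NF` are my own):

```python
import numpy as np

# Toy data: N=4 examples, F=2 features (values chosen only for illustration).
x_NF = np.asarray([[1.0, 10.0],
                   [2.0, 20.0],
                   [3.0, 30.0],
                   [4.0, 40.0]])

# Center the data: subtract each feature's mean so every column averages to 0.0.
m_F = x_NF.mean(axis=0)
xc_NF = x_NF - m_F
```

After this step, `xc_NF.mean(axis=0)` is (numerically) the zero vector.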

  8. Constant Reconstruction model: $\hat{x}_i = m$. Parameters: $m$, an F-dim vector. Training problem: minimize reconstruction error $\min_{m \in \mathbb{R}^F} \sum_{n=1}^{N} (x_n - m)^T (x_n - m)$. This is the squared error between two vectors. Optimal parameters: $m^* = \text{mean}(x_1, \ldots, x_N)$. Think of the mean vector as the optimal “reconstruction” of a dataset if you must use a single vector.
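The claim that the mean minimizes this objective can be checked numerically; a sketch assuming NumPy, with random toy data and variable names of my own:

```python
import numpy as np

rng = np.random.default_rng(0)
x_NF = rng.normal(size=(100, 3))       # toy dataset: N=100, F=3

def recon_error(m_F, x_NF):
    """Sum over n of (x_n - m)^T (x_n - m)."""
    diff_NF = x_NF - m_F
    return float(np.sum(diff_NF * diff_NF))

m_star_F = x_NF.mean(axis=0)           # candidate optimum: per-feature mean
err_star = recon_error(m_star_F, x_NF)
err_perturbed = recon_error(m_star_F + 0.1, x_NF)
assert err_star < err_perturbed        # perturbing the mean only increases error
```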

  9. Mean reconstruction. (Figure: original vs. reconstructed images.)

  10. Linear Reconstruction and Principal Component Analysis

  11. Linear Projection to 1D

  12. Reconstruction from 1D to 2D

  13. 2D Orthogonal Basis. If we project into 2 dims (the same as F), we can reconstruct perfectly.
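This can be checked numerically: with all F components of any orthonormal basis (here taken from the SVD of the centered data; NumPy, the toy data, and all names are my own choices), reconstruction is exact:

```python
import numpy as np

rng = np.random.default_rng(0)
x_NF = rng.normal(size=(50, 2))        # toy data, F=2
m_F = x_NF.mean(axis=0)
xc_NF = x_NF - m_F

# The right singular vectors give an orthonormal basis of R^F (rows of Vt_FF).
_, _, Vt_FF = np.linalg.svd(xc_NF, full_matrices=False)

z_NF = xc_NF @ Vt_FF.T                 # project onto all F basis vectors
xhat_NF = z_NF @ Vt_FF + m_F           # reconstruct from all F scores
assert np.allclose(xhat_NF, x_NF)      # perfect reconstruction when K = F
```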

  14. Which 1D projection is best? Idea: minimize reconstruction error

  15. Linear Reconstruction Model with 1 component: $\hat{x}_i = w z_i + m$, where $\hat{x}_i$ (F x 1) is the high-dim. data, $w$ (F x 1) is the weights vector, $z_i$ (1 x 1) is the low-dim embedding or “score”, and $m$ (F x 1) is the “mean” vector.
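The one-component model can be written out directly; a sketch with made-up values (every name and number here is mine):

```python
import numpy as np

m_F = np.array([1.0, 2.0, 3.0])   # "mean" vector, F x 1
w_F = np.array([1.0, 0.0, 0.0])   # unit-length weight vector, F x 1
z = 2.5                           # low-dim embedding / "score", 1 x 1

xhat_F = w_F * z + m_F            # high-dim reconstruction, F x 1
# xhat_F is [3.5, 2.0, 3.0]
```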

  16. Linear Reconstruction Model with 1 component: $\hat{x}_i = w z_i + m$. Problem: the model is “over-parameterized”: too many possible solutions! Suppose we have an alternate model with weights $w'$ and embedding $z'$. We would get equivalent reconstructions if we set $w' = 2w$ and $z' = z/2$. Solution: constrain the magnitude of $w$ so that $\sum_{f=1}^{F} w_f^2 = 1$. Then $w$ is a unit vector on the unit circle; its magnitude is always 1. We care about direction, not scale.
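Both the rescaling ambiguity and the unit-norm fix can be checked numerically (a sketch; the values and names are mine):

```python
import numpy as np

m_F = np.zeros(3)
w_F = np.array([3.0, 4.0, 0.0])
z = 2.0

# A rescaled pair (w' = 2w, z' = z/2) yields the exact same reconstruction:
w_alt_F, z_alt = 2.0 * w_F, z / 2.0
assert np.allclose(w_F * z + m_F, w_alt_F * z_alt + m_F)

# Remove the ambiguity by constraining w to unit length:
w_unit_F = w_F / np.linalg.norm(w_F)
assert np.isclose(np.sum(w_unit_F ** 2), 1.0)   # sum_f w_f^2 = 1
```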

  17. Linear Reconstruction Model with 1 component: $\hat{x}_i = w z_i + m$, with $w$ a unit vector (magnitude always 1). Given fixed weights $w$ and a specific $x$, what is the optimal scalar $z$ value? Minimize reconstruction error: $\min_{z \in \mathbb{R}} (x - (wz + m))^T (x - (wz + m))$. An exact analytical solution (take the gradient, set it to zero, solve for $z$) gives: $z = w^T (x - m)$. This is the projection of the feature vector $x$ onto the vector $w$ after “centering” (removing the mean).
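The closed-form solution can be verified numerically: because $w$ is a unit vector, the objective is a quadratic in $z$ with unit leading coefficient, so the error at $z^* + d$ exceeds the minimum by exactly $d^2$. A sketch (data and names are mine):

```python
import numpy as np

rng = np.random.default_rng(0)
x_F = rng.normal(size=4)
m_F = rng.normal(size=4)
w_F = rng.normal(size=4)
w_F /= np.linalg.norm(w_F)        # make w a unit vector

z_star = w_F @ (x_F - m_F)        # closed-form optimum: z = w^T (x - m)

def err(z):
    diff_F = x_F - (w_F * z + m_F)
    return float(diff_F @ diff_F)

# Quadratic around the minimum: err(z* + d) = err(z*) + d^2 when ||w|| = 1.
assert np.isclose(err(z_star + 0.5), err(z_star) + 0.25)
assert err(z_star) <= err(z_star - 0.3)
```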
