

  1. Unsupervised Learning. Shan-Hung Wu, shwu@cs.nthu.edu.tw. Department of Computer Science, National Tsing Hua University, Taiwan. Machine Learning course.

  2. Outline: 1. Unsupervised Learning; 2. Predictive Learning; 3. Autoencoders & Manifold Learning; 4. Generative Adversarial Networks.

  3. Outline, highlighting Section 1: Unsupervised Learning.

  4. Unsupervised Learning. Dataset: $\mathbb{X} = \{x^{(i)}\}_i$; no supervision. What can we learn?

  5. Clustering I. Goal: to group similar $x^{(i)}$'s.

  6. Clustering II. The $k$-means algorithm; hierarchical clustering.
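The deck gives no code, so here is a minimal NumPy sketch of Lloyd's $k$-means iteration on toy data; the `kmeans` helper, the data shapes, and the data-point initialization are illustrative assumptions, not from the slides:

```python
import numpy as np

def kmeans(X, k, n_iters=100, seed=0):
    """Lloyd's algorithm: alternate nearest-centroid assignment
    and centroid re-estimation until convergence."""
    rng = np.random.default_rng(seed)
    centroids = X[rng.choice(len(X), size=k, replace=False)]  # init from data points
    for _ in range(n_iters):
        # Assignment step: each point joins its nearest centroid's cluster.
        dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Update step: each centroid becomes the mean of its cluster
        # (an empty cluster keeps its old centroid).
        new_centroids = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(k)
        ])
        if np.allclose(new_centroids, centroids):
            break
        centroids = new_centroids
    return labels, centroids

X = np.random.default_rng(1).normal(size=(300, 2))  # toy unlabeled data
labels, centroids = kmeans(X, k=3)
```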

  7. Factorization and Recommendation. Goal: to uncover the latent factors behind data (e.g., a user-item rating matrix). Commonly used in recommender systems. Non-negative matrix factorization (NMF) [9, 10].
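A sketch of NMF on a toy rating matrix using scikit-learn; the matrix values and factor count are made up, and treating unobserved entries as zeros is a simplification of what a real recommender pipeline would do:

```python
import numpy as np
from sklearn.decomposition import NMF

# Hypothetical 6-user x 4-item rating matrix (zeros stand in for
# missing ratings, a simplification for this sketch).
R = np.array([[5, 3, 0, 1],
              [4, 0, 0, 1],
              [1, 1, 0, 5],
              [1, 0, 0, 4],
              [0, 1, 5, 4],
              [2, 1, 3, 0]], dtype=float)

model = NMF(n_components=2, init="nndsvda", max_iter=500, random_state=0)
W = model.fit_transform(R)   # non-negative user factors, shape (6, 2)
H = model.components_        # non-negative item factors, shape (2, 4)
R_hat = W @ H                # reconstruction: R is approximated by W H
```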

  8. Dimension Reduction. Goal: to reduce the dimension of each $x^{(i)}$, e.g., via PCA. Also: predictive learning (learn to "fill in the blanks") and manifold learning (learn the tangent vectors at a given point).
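PCA reduces to an SVD of the centered data; a NumPy sketch (the function name and toy data are illustrative):

```python
import numpy as np

def pca(X, k):
    """Project X onto its top-k principal components via SVD."""
    X_centered = X - X.mean(axis=0)
    # Rows of Vt are the principal directions, ordered by singular value.
    U, S, Vt = np.linalg.svd(X_centered, full_matrices=False)
    return X_centered @ Vt[:k].T   # (n, k) low-dimensional codes

X = np.random.default_rng(0).normal(size=(100, 10))
Z = pca(X, k=2)
```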

  9. Data Generation I. Goal: to generate new data points/samples, e.g., with generative adversarial networks (GANs).
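As a rough illustration of the adversarial setup (not from the deck), a minimal PyTorch training loop on a toy 2-D distribution; the network sizes, learning rates, and "real" data are all assumptions:

```python
import torch
import torch.nn as nn

# G maps noise to fake 2-D samples; D scores how "real" a sample looks.
G = nn.Sequential(nn.Linear(8, 32), nn.ReLU(), nn.Linear(32, 2))
D = nn.Sequential(nn.Linear(2, 32), nn.ReLU(), nn.Linear(32, 1))
opt_g = torch.optim.Adam(G.parameters(), lr=1e-3)
opt_d = torch.optim.Adam(D.parameters(), lr=1e-3)
bce = nn.BCEWithLogitsLoss()

real_data = torch.randn(10_000, 2) * 0.5 + 2.0  # toy "real" distribution

for step in range(1000):
    real = real_data[torch.randint(0, len(real_data), (64,))]
    fake = G(torch.randn(64, 8))
    # Discriminator step: push D(real) toward 1 and D(fake) toward 0.
    loss_d = bce(D(real), torch.ones(64, 1)) + bce(D(fake.detach()), torch.zeros(64, 1))
    opt_d.zero_grad(); loss_d.backward(); opt_d.step()
    # Generator step: fool D into scoring fakes as real.
    loss_g = bce(D(fake), torch.ones(64, 1))
    opt_g.zero_grad(); loss_g.backward(); opt_g.step()
```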

  10. Data Generation II. Text-to-image synthesis based on conditional GANs, e.g., generating an image from the caption "This bird is completely red with black wings and a pointy beak."

  11. Outline, highlighting Section 2: Predictive Learning.

  12. Predictive Learning, i.e., blank filling. E.g., word2vec [13, 12]: predict the missing word in "... the cat sat on ...".
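A sketch of skip-gram word2vec using gensim (v4 API assumed); the toy corpus and hyperparameters are illustrative, and a real model needs far more text:

```python
from gensim.models import Word2Vec

# Tiny toy corpus standing in for a large unlabeled text collection.
sentences = [
    "the cat sat on the mat".split(),
    "the dog sat on the log".split(),
    "cats and dogs are animals".split(),
]

# Skip-gram (sg=1): predict context words from the center word, so
# words appearing in similar contexts end up with similar vectors.
model = Word2Vec(sentences, vector_size=50, window=2, min_count=1, sg=1, epochs=200)

vec = model.wv["cat"]                 # 50-d embedding for "cat"
print(model.wv.most_similar("cat"))   # nearest neighbors in embedding space
```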

  13. Doc2Vec. How to encode a document? Bag of words (TF-IDF), averaged word2vec vectors, etc. These do not capture the semantic meaning of a document: "I like final project" $\neq$ "Final project likes me". Predictive learning for documents? Doc2vec [7]: captures the context not explained by the words alone.
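A sketch of doc2vec's PV-DM variant via gensim (v4 API assumed); the two toy documents mirror the slide's example, and all hyperparameters are illustrative:

```python
from gensim.models.doc2vec import Doc2Vec, TaggedDocument

docs = [
    TaggedDocument("i like the final project".split(), ["d0"]),
    TaggedDocument("the final project likes me".split(), ["d1"]),
]

# PV-DM (dm=1): a per-document vector is trained jointly with word
# vectors to predict words, capturing context beyond the bag of words.
model = Doc2Vec(docs, vector_size=50, window=2, min_count=1, dm=1, epochs=200)

v0 = model.dv["d0"]                                            # learned doc vector
v_new = model.infer_vector("i like machine learning".split())  # vector for an unseen doc
```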

  14. Filling Images. How to fill in missing parts of an image? PixelRNN [19]: generate pixels autoregressively, each conditioned on the pixels seen so far.

  15. More: predicting the future by watching unlabeled videos [6, 21].

  16. Outline, highlighting Section 3: Autoencoders & Manifold Learning.

  17. Autoencoders I. Encoder: learns a low-dimensional representation $c$ (called the code) of the input $x$. Decoder: reconstructs $x$ from $c$. Cost function: $\arg\min_\Theta -\log P(\mathbb{X} \mid \Theta) = \arg\min_\Theta -\sum_n \log P(x^{(n)} \mid \Theta)$. With sigmoid output units, $a_j^{(L)} = \hat{\rho}_j$ for $x_j \sim \mathrm{Bernoulli}(\rho_j)$, so $P(x^{(n)} \mid \Theta) = \prod_j (a_j^{(L)})^{x_j^{(n)}} (1 - a_j^{(L)})^{1 - x_j^{(n)}}$. With linear output units, $a^{(L)} = z^{(L)} = \hat{\mu}$ for $x \sim \mathcal{N}(\mu, \Sigma)$, so $-\log P(x^{(n)} \mid \Theta) = \| x^{(n)} - z^{(L)} \|^2$ (up to constants).
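To make the cost function concrete, a minimal PyTorch sketch of a fully connected autoencoder; the layer sizes and toy batch are assumptions. The Bernoulli/sigmoid case gives binary cross-entropy; the commented-out line is the Gaussian/linear-output (squared-error) variant:

```python
import torch
import torch.nn as nn

# Encoder compresses x to a 32-d code c; decoder reconstructs x from c.
encoder = nn.Sequential(nn.Linear(784, 128), nn.ReLU(), nn.Linear(128, 32))
decoder = nn.Sequential(nn.Linear(32, 128), nn.ReLU(), nn.Linear(128, 784))
opt = torch.optim.Adam(list(encoder.parameters()) + list(decoder.parameters()), lr=1e-3)

x = torch.rand(64, 784)   # toy batch standing in for flattened images

c = encoder(x)            # low-dimensional code
logits = decoder(c)

# Bernoulli x with sigmoid outputs: negative log-likelihood = BCE.
loss = nn.functional.binary_cross_entropy_with_logits(logits, x)
# loss = nn.functional.mse_loss(logits, x)   # Gaussian x, linear outputs

opt.zero_grad(); loss.backward(); opt.step()
```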

  18. Autoencoders II. A 32-bit code can roughly represent a $32 \times 32$ MNIST image.

  19. Convolutional Autoencoders. Convolution layers in the encoder + deconvolution (transposed convolution) layers in the decoder. How to train a deconvolution layer? Treat it as a convolution layer and backpropagate as usual.
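A sketch of a convolutional autoencoder in PyTorch, where `ConvTranspose2d` plays the deconvolution role and trains exactly like a convolution layer; the channel counts and the 1x28x28 input shape are assumptions:

```python
import torch
import torch.nn as nn

# Conv encoder halves spatial size twice; transposed-conv decoder undoes it.
encoder = nn.Sequential(
    nn.Conv2d(1, 16, kernel_size=3, stride=2, padding=1),   # 28 -> 14
    nn.ReLU(),
    nn.Conv2d(16, 32, kernel_size=3, stride=2, padding=1),  # 14 -> 7
    nn.ReLU(),
)
decoder = nn.Sequential(
    nn.ConvTranspose2d(32, 16, kernel_size=3, stride=2, padding=1, output_padding=1),  # 7 -> 14
    nn.ReLU(),
    nn.ConvTranspose2d(16, 1, kernel_size=3, stride=2, padding=1, output_padding=1),   # 14 -> 28
    nn.Sigmoid(),
)

x = torch.rand(8, 1, 28, 28)          # toy image batch
x_hat = decoder(encoder(x))
loss = nn.functional.binary_cross_entropy(x_hat, x)
loss.backward()                        # gradients flow through both halves
```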

  20. Manifolds I. In many applications, data concentrate around one or more low-dimensional manifolds. A manifold is a topological space that is locally linear (locally Euclidean).

  21. Manifolds II. Each point $x$ on a manifold has a tangent space spanned by tangent vectors: local directions that specify how one can change $x$ infinitesimally while staying on the manifold.

  22. Learning Manifolds I. How to learn manifolds with autoencoders? Contractive autoencoder [16]: regularize the code $c$ so that it is invariant to small local changes of $x$: $\Omega(c) = \sum_n \left\| \frac{\partial c^{(n)}}{\partial x^{(n)}} \right\|_F^2$, where $\partial c^{(n)} / \partial x^{(n)}$ is the Jacobian matrix of the encoder. The encoder then preserves local structure in the code space.
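A sketch of the contractive penalty, assuming a single sigmoid encoder layer so that $\|\partial c / \partial x\|_F^2$ has a closed form; the class name, sizes, and the penalty weight 0.1 are illustrative:

```python
import torch
import torch.nn as nn

class ContractiveAE(nn.Module):
    """Hypothetical minimal contractive autoencoder with one sigmoid
    encoder layer, for which the Jacobian penalty is analytic."""
    def __init__(self, d_in=784, d_code=32):
        super().__init__()
        self.enc = nn.Linear(d_in, d_code)
        self.dec = nn.Linear(d_code, d_in)

    def forward(self, x):
        h = torch.sigmoid(self.enc(x))        # code c
        x_hat = torch.sigmoid(self.dec(h))    # reconstruction
        return x_hat, h

    def contractive_penalty(self, h):
        # For h = sigmoid(Wx + b): dh_j/dx = h_j(1 - h_j) * W_j, so
        # ||J||_F^2 = sum_j (h_j(1 - h_j))^2 * ||W_j||^2.
        dh = (h * (1 - h)) ** 2                       # (batch, d_code)
        w_norm = (self.enc.weight ** 2).sum(dim=1)    # (d_code,)
        return (dh * w_norm).sum(dim=1).mean()

model = ContractiveAE()
x = torch.rand(16, 784)   # toy batch
x_hat, h = model(x)
loss = nn.functional.binary_cross_entropy(x_hat, x) + 0.1 * model.contractive_penalty(h)
loss.backward()
```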

  23. Learning Manifolds II. In practice, it is easier to train a denoising autoencoder [20]. Encoder: encodes $x$ corrupted with random noise. Decoder: reconstructs the clean $x$ without the noise.
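A minimal denoising sketch in the same toy setup as the earlier autoencoder code; the Gaussian noise level 0.3 and the layer sizes are assumptions. The key point is that the loss compares the reconstruction against the clean input, not the corrupted one:

```python
import torch
import torch.nn as nn

encoder = nn.Sequential(nn.Linear(784, 32), nn.ReLU())
decoder = nn.Sequential(nn.Linear(32, 784), nn.Sigmoid())

x = torch.rand(64, 784)                  # clean inputs
x_noisy = x + 0.3 * torch.randn_like(x)  # corrupt the input with noise...
x_hat = decoder(encoder(x_noisy))
loss = nn.functional.binary_cross_entropy(x_hat, x)  # ...reconstruct the CLEAN x
loss.backward()
```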
