review
play

Review Models that use SVD or eigen-analysis PageRank: - PowerPoint PPT Presentation

Review Models that use SVD or eigen-analysis PageRank: eigen-analysis of random dolphin surfer transition matrix friendships usually uses only first eigenvector !"% Spectral embedding: eigen-analysis (or !"#


  1. Review • Models that use SVD or eigen-analysis ‣ PageRank: eigen-analysis of random dolphin surfer transition matrix friendships ‣ usually uses only first eigenvector !"% ‣ Spectral embedding: eigen-analysis (or !"# equivalently SVD) of random surfer model !"$ in symmetric graph ! ! !"$ ‣ usually uses 2nd–Kth EVs (small K) ! !"# ‣ first EV is boring ! !"% ! !"# ! !"$ ! !"$ !"# !"% ‣ Spectral clustering = spectral embedding followed by clustering 1

  2. Review: PCA • The good: simple, successful • The bad: linear, Gaussian ‣ E(X) = UV T ‣ X, U, V ~ Gaussian • The ugly: failure to generalize to new entities ‣ Partial answer: hierarchical PCA 2

  3. What about the second rating for a new user? • MLE/MAP of U i ⋅ from one rating: ‣ knowing μ U : ‣ result: • How should we fix? • Note: often have only a few ratings per user 3

  4. MCMC for PCA Need: • Can do Bayesian inference by Gibbs sampling—for simplicity, assume σ s known 4

  5. Recognizing a Gaussian • Suppose X ~ N(X | μ , σ 2 ) • L = –log P(X=x | μ , σ 2 ) = ‣ dL/dx = ‣ d 2 L/dx 2 = • So: if we see d 2 L/dx 2 = a, dL/dx = a(x – b) ‣ μ = σ 2 = 5

  6. Gibbs step for an element of μ U • L = 6

  7. Gibbs: element of U • L = • dL / dU ik = • dL 2 / (dU ik ) 2 = ‣ post. mean = post. var. = 7

  8. In reality • Above, blocks are single elements of U or V • Better: blocks are entire rows of U or V ‣ take gradient, Hessian to get mean, covariance ‣ formulas look a lot like linear regression (normal equations) • And, want to fit σ U , σ V too ‣ sample 1/ σ 2 from a Gamma (or Σ –1 from a Wishart ) distribution 8

  9. Nonlinearity: conjunctive features P(rent) Foreign Comedy 9

  10. Disjunctive features P(rent) Comedy Foreign 10

  11. Non-Gaussian • X, U, and V could each be non-Gaussian ‣ e.g., binary! ‣ rents(U, M), comedy(M), female(U) • For X: predicting –0.1 instead of 0 is only as bad as predicting +0.1 instead of 0 • For U, V: might infer –17% comedy or 32% female 11

  12. Logistic PCA • Regular PCA: X ij ~ N(U i ⋅ V j , σ 2 ) • Logistic PCA: • Might expect learning, inference to be hard ‣ but, MH works well, using dL/d θ , d 2 L/d θ 2 • Generalization: exponential family PCA ‣ w/ optional hierarchy, Bayesianism 12

  13. Application: fMRI Brain activity fMRI :-) stimulus: “dog” fMRI ;-> stimulus: “cat” fMRI stimulus: “hammer” :-)) credit: Ajit Singh Voxels Stimulus Y 13

  14. 2-matrix model Z p Σ U U i Σ Z Σ Z X ij Y jp i = 1 . . . n µ U p = 1 . . . r µ Z V j co-occurrences fMRI voxels j = 1 . . . m (logistic PCA) (linear PCA) Σ V µ V 14

  15. Results (logistic PCA) Y (fMRI data): Fold-in 1.4 HB � CMF H � CMF Maximum a posteriori (fixed hyperparameters) 1.2 CMF 1 Mean Squared Error 0.8 Lower is credit: Ajit Singh 0.6 0.4 Better 0.2 0 Augmenting fMRI data with Just using fMRI data word co-occurrence 15

Download Presentation
Download Policy: The content available on the website is offered to you 'AS IS' for your personal information and use only. It cannot be commercialized, licensed, or distributed on other websites without prior consent from the author. To download a presentation, simply click this link. If you encounter any difficulties during the download process, it's possible that the publisher has removed the file from their server.

Recommend


More recommend