Principal Components Analysis
David Benjamin, Broad DSDE Methods, February 10, 2016


  1. Principal Components Analysis David Benjamin, Broad DSDE Methods February 10, 2016

  2. What is PCA? PCA turns high-dimensional data into low-dimensional data by throwing out directions with low variance. Keep y, throw out x. Assumption: noise is smaller than signal.

  3. What about correlations? PCA turns high-dimensional data into low-dimensional data by throwing out directions with low variance. Find the pink and green axes. Throw out the pink component. Resulting low-dimensional data is projection onto green axis.

  4. Covariance matrix $\Sigma_{ij} = \frac{1}{N} \sum_n (x_{ni} - \mu_i)(x_{nj} - \mu_j) \neq 0$ if $x_i$ and $x_j$ are correlated. Figure: for uncorrelated data, $\Sigma = \begin{pmatrix} \Sigma_{xx} & 0 \\ 0 & \Sigma_{yy} \end{pmatrix}$; for correlated data, $\Sigma = \begin{pmatrix} \Sigma_{xx} & \Sigma_{xy} \\ \Sigma_{xy} & \Sigma_{yy} \end{pmatrix}$ with $\Sigma_{xy} > 0$. We want coordinates that make $\Sigma$ diagonal.
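A minimal numpy sketch (not from the deck) of the covariance formula above, on toy correlated 2-D data; the variable names are mine:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy 2-D data with correlated coordinates: y depends on x plus noise.
x = rng.normal(size=1000)
y = 0.8 * x + 0.3 * rng.normal(size=1000)
X = np.column_stack([x, y])          # shape (N, 2)

# Sigma_ij = (1/N) * sum_n (x_ni - mu_i) * (x_nj - mu_j)
mu = X.mean(axis=0)
Sigma = (X - mu).T @ (X - mu) / len(X)

print(Sigma)   # off-diagonal Sigma_xy is clearly nonzero because x and y are correlated
```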

  5. PCA recipe Coordinates (principal components) that make $\Sigma$ diagonal are the eigenvectors of $\Sigma$. PCA recipe: calculate the covariance matrix $\Sigma$; find eigenvectors $v_k$ and eigenvalues $\lambda_k$ such that $\Sigma v_k = \lambda_k v_k$ ($\lambda_k$ is the variance in the $v_k$ direction); use a heuristic to choose $K$ eigenvectors to keep. Data is now $K$-dimensional: $x \approx \mu + \sum_{k=1}^{K} c_k v_k$, with $c_k = (x - \mu) \cdot v_k$. Generative model: $x = \mu + \sum_{k=1}^{K} c_k v_k + \text{noise}$.
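One way to turn this recipe into code is an eigendecomposition of the covariance matrix. This numpy sketch, including the function names pca, project, and reconstruct, is mine rather than anything from the deck:

```python
import numpy as np

def pca(X, K):
    """PCA by eigendecomposition of the covariance matrix.

    X : (N, D) data matrix, K : number of components to keep.
    Returns the mean, the top-K eigenvectors (as columns), and their eigenvalues.
    """
    mu = X.mean(axis=0)
    Sigma = np.cov(X - mu, rowvar=False, bias=True)   # (D, D) covariance matrix
    lam, V = np.linalg.eigh(Sigma)                    # eigh: Sigma is symmetric
    order = np.argsort(lam)[::-1]                     # sort by decreasing variance
    lam, V = lam[order], V[:, order]
    return mu, V[:, :K], lam[:K]

def project(X, mu, V):
    """Coefficients c_k = (x - mu) . v_k for each row of X, shape (N, K)."""
    return (X - mu) @ V

def reconstruct(C, mu, V):
    """Low-rank reconstruction x ~ mu + sum_k c_k v_k, back in D dimensions."""
    return mu + C @ V.T
```

np.linalg.eigh is used because $\Sigma$ is symmetric; for very high-dimensional data an SVD of the centered data matrix is usually preferred over forming $\Sigma$ at all (see the big-matrix note on the Pitfalls slide).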

  6. Eigenfaces Pixel images are very high-dimensional vectors. Run PCA and look at the principal components... Not strictly “eigenfaces,” but eigen-variation in faces relative to the average face.

  7. Eigenfaces Pixel images are very high-dimensional vectors. Run PCA and look at the principal components... Clockwise from top left: full head of hair, sunken eyes, war paint, your interpretation goes here. Not strictly “eigenfaces,” but eigen-variation in faces relative to the average face.

  8. Eigenfaces
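A hedged sketch of the eigenfaces experiment using scikit-learn; the Olivetti faces dataset (downloaded on first use) stands in for whatever face images the slides actually used:

```python
import matplotlib.pyplot as plt
from sklearn.datasets import fetch_olivetti_faces
from sklearn.decomposition import PCA

faces = fetch_olivetti_faces()          # 400 images, 64x64 pixels each
X = faces.data                          # (400, 4096): each face is a 4096-D vector

pca = PCA(n_components=16)
pca.fit(X)

# Each principal component is itself a 4096-D vector, i.e. an "eigenface":
# a direction of variation relative to the mean face, not a face per se.
fig, axes = plt.subplots(4, 4, figsize=(6, 6))
for ax, component in zip(axes.ravel(), pca.components_):
    ax.imshow(component.reshape(64, 64), cmap="gray")
    ax.axis("off")
plt.show()
```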

  9. PCA map of Europe Data: $x_{ni}$ = genotype (0, 1, 2) of SNP $i$ in person $n$.

  10. PCA map of Europe Applications: classification / genealogy; population stratification in GWAS (regress against PCs).

  11. PCA map of Europe Applications: classification / genealogy; population stratification in GWAS (regress against PCs). Do the PCs correspond to the map suspiciously well? Why do the genes of a population migrating north keep going straight along the first PC? Why is Hungary - Austria parallel to Switzerland - France?
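A rough sketch, on synthetic genotypes, of how the PCs are used as covariates to control population stratification in a GWAS-style regression; the data and variable names are illustrative, not from the deck:

```python
import numpy as np

# G: (N, M) genotype matrix, entries 0/1/2 = genotype of SNP i in person n (toy data here).
rng = np.random.default_rng(1)
N, M, K = 500, 2000, 10
G = rng.integers(0, 3, size=(N, M)).astype(float)
phenotype = rng.normal(size=N)

# Top-K principal components of the genotype matrix, via SVD of the centered matrix.
Gc = G - G.mean(axis=0)
U, s, Vt = np.linalg.svd(Gc, full_matrices=False)
PCs = U[:, :K] * s[:K]                      # per-person PC coordinates

# Population stratification control: for each SNP, regress the phenotype on
# [intercept, genotype, PC_1..PC_K] instead of on the genotype alone.
snp = Gc[:, 0]
design = np.column_stack([np.ones(N), snp, PCs])
beta, *_ = np.linalg.lstsq(design, phenotype, rcond=None)
print("genotype effect adjusted for ancestry PCs:", beta[1])
```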

  12. Copy number variation from exome capture Crash course in exome capture: get DNA; exon DNA hybridizes to baits, throw out the remaining DNA; sequence the exon DNA.

  13. Copy number variation from exome capture Crash course in exome capture: get DNA; exon DNA hybridizes to baits, throw out the remaining DNA; sequence the exon DNA. Copy number variation: align the sequenced DNA to the reference genome; count the number of reads from each exon; more (fewer) reads implies duplication (deletion).
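As a sketch of the read-count intuition only (not any particular tool's method), per-exon counts can be depth-normalized and compared to a panel of normal samples; the thresholds and function name here are arbitrary placeholders:

```python
import numpy as np

def naive_cnv_calls(sample_counts, normal_counts, dup_ratio=1.5, del_ratio=0.5):
    """Very naive per-exon copy number call from read counts.

    sample_counts : (E,) reads per exon for the sample of interest
    normal_counts : (S, E) reads per exon for a panel of normal samples
    """
    # Normalize for sequencing depth, then compare each exon to the panel median.
    sample = sample_counts / sample_counts.sum()
    panel = normal_counts / normal_counts.sum(axis=1, keepdims=True)
    ratio = sample / np.median(panel, axis=0)

    calls = np.full(len(sample), "neutral", dtype=object)
    calls[ratio > dup_ratio] = "duplication"   # more reads than expected
    calls[ratio < del_ratio] = "deletion"      # fewer reads than expected
    return ratio, calls
```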

  14. Copy number variation from exome capture

  15. Copy number variation from exome capture

  16. Copy number variation from exome capture $x = \mu + \sum_k (v_k^\top x)\, v_k + \text{copy number signal}$, so $\text{copy number signal} = x - \mu - \sum_k (v_k^\top x)\, v_k$. The PCs $v_k$ come from non-tumor samples with no CNVs!
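A numpy sketch of this denoising step, using the centered projection $c_k = (x - \mu) \cdot v_k$ convention from the recipe slide; the panel-of-normals input and the function name are mine:

```python
import numpy as np

def denoise_read_counts(x, normal_panel, K):
    """Subtract the top-K principal components learned from a panel of
    normal (CNV-free) samples; the residual is the copy number signal.

    x            : (E,) normalized read counts for the sample of interest
    normal_panel : (S, E) normalized read counts for CNV-free samples
    """
    mu = normal_panel.mean(axis=0)
    Sigma = np.cov(normal_panel - mu, rowvar=False, bias=True)
    lam, V = np.linalg.eigh(Sigma)
    V = V[:, np.argsort(lam)[::-1][:K]]        # top-K eigenvectors as columns

    centered = x - mu
    systematic = V @ (V.T @ centered)          # sum_k (v_k . (x - mu)) v_k
    return centered - systematic               # copy number signal
```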

  17. Pitfalls PCs might not be good for classification

  18. Pitfalls PCs might not be good for classification. The low-dimensional space might be non-linear.

  19. Pitfalls PCs might not be good for classification. The low-dimensional space might be non-linear. Non-issue: Σ is a big matrix. (Use iterative PCA, FastPCA, flashpca...)
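As a sketch of why a big Σ is a non-issue: the leading PC can be found by power iteration using only matrix-vector products with the centered data matrix, which is the spirit of the iterative methods named above (this is illustrative, not the algorithm any of those tools actually uses):

```python
import numpy as np

def top_pc_power_iteration(X, n_iter=200):
    """Leading principal component by power iteration, without ever forming
    the full D x D covariance matrix -- only products with the centered data
    matrix are needed."""
    Xc = X - X.mean(axis=0)
    N, D = Xc.shape
    v = np.random.default_rng(0).normal(size=D)
    v /= np.linalg.norm(v)
    for _ in range(n_iter):
        w = Xc.T @ (Xc @ v) / N                # Sigma @ v, computed implicitly
        v = w / np.linalg.norm(w)
    variance = v @ (Xc.T @ (Xc @ v)) / N       # Rayleigh quotient = lambda_1
    return v, variance
```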

  20. Generalizations $x = \mu + \sum_k c_k v_k + \text{noise}$ is part of a larger model: probabilistic PCA.

  21. Generalizations $x = \mu + \sum_k c_k v_k + \text{noise}$ is part of a larger model: probabilistic PCA. Don’t like heuristics for choosing the number of PCs to use: Bayesian PCA.

  22. Generalizations $x = \mu + \sum_k c_k v_k + \text{noise}$ is part of a larger model: probabilistic PCA. Don’t like heuristics for choosing the number of PCs to use: Bayesian PCA. Data are not linear: nonlinear dimensionality reduction (tSNE, autoencoders, GPLVM, Isomap, SOM...).
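A sketch of the probabilistic PCA generative model referenced above, $x = \mu + \sum_k c_k v_k + \text{noise}$, written as a sampler; the parameterization (loading matrix W, noise scale sigma) is a standard one and not taken from the deck:

```python
import numpy as np

def sample_probabilistic_pca(mu, W, sigma, n_samples, rng=None):
    """Draw samples from the probabilistic PCA generative model
    x = mu + W z + noise, with z ~ N(0, I_K) and noise ~ N(0, sigma^2 I_D).

    mu : (D,) mean; W : (D, K) loading matrix whose columns span the PC subspace.
    """
    if rng is None:
        rng = np.random.default_rng()
    n, (D, K) = n_samples, W.shape
    z = rng.normal(size=(n, K))                 # latent coefficients c_k
    noise = sigma * rng.normal(size=(n, D))     # isotropic noise
    return mu + z @ W.T + noise
```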

  23. Equations Find the direction (unit vector) $v$ of greatest variance. The projection of $x$ onto $v$ is $x^\top v$.
$\sigma^2 = \frac{1}{N}\sum_n \left( x_n^\top v - \mu^\top v \right)^2 = \frac{1}{N}\sum_n \left( (x_n - \mu)^\top v \right)^2 = v^\top \left( \frac{1}{N}\sum_n (x_n - \mu)(x_n - \mu)^\top \right) v = v^\top \Sigma v$
Set $\nabla_v = 0$ with a Lagrange multiplier for the constraint $v^\top v = 1$:
$\nabla_v \left( v^\top \Sigma v + \lambda (1 - v^\top v) \right) = 0 \;\Rightarrow\; \Sigma v = \lambda v$
Dotting with $v^\top$ gives $\lambda = \lambda\, v^\top v = v^\top \Sigma v = \sigma^2$.
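A quick numerical check of the derivation, assuming toy data: the top eigenvalue of Σ equals the variance of the projection onto its eigenvector, and no other unit vector does better:

```python
import numpy as np

rng = np.random.default_rng(2)
X = rng.normal(size=(5000, 3)) @ rng.normal(size=(3, 3))   # correlated 3-D data
mu = X.mean(axis=0)
Sigma = np.cov(X - mu, rowvar=False, bias=True)

lam, V = np.linalg.eigh(Sigma)
v_top, lam_top = V[:, -1], lam[-1]          # eigh sorts eigenvalues ascending

# lambda equals the variance of the data projected onto its eigenvector...
proj_var = np.mean(((X - mu) @ v_top) ** 2)
print(np.isclose(proj_var, lam_top))        # True

# ...and a random unit vector never beats the top eigenvector.
u = rng.normal(size=3)
u /= np.linalg.norm(u)
print(u @ Sigma @ u <= lam_top + 1e-12)     # True
```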
