SLIDE 1

Principal Components Analysis (PCA) and Singular Value Decomposition (SVD) with applications to Microarrays

Prof. Tesler

Math 283, Fall 2018
SLIDE 2

Covariance

Let X and Y be random variables, possibly dependent. Recall that the covariance of X and Y is defined as

  Cov(X, Y) = E((X − µX)(Y − µY))

and that an alternate formula is

  Cov(X, Y) = E(XY) − E(X)E(Y)

Previously we used

  Var(X + Y) = Var(X) + Var(Y) + 2 Cov(X, Y)

and, for independent X1, . . . , Xn,

  Var(X1 + X2 + · · · + Xn) = Var(X1) + · · · + Var(Xn)

SLIDE 3

Covariance properties

  Cov(X, X) = Var(X)
  Cov(X, Y) = Cov(Y, X)
  Cov(aX + b, cY + d) = ac Cov(X, Y)

Sign of covariance: Cov(X, Y) = E((X − µX)(Y − µY))

When Cov(X, Y) is positive: there is a tendency to have X > µX when Y > µY, and X < µX when Y < µY.

When Cov(X, Y) is negative: there is a tendency to have X > µX when Y < µY, and X < µX when Y > µY.

When Cov(X, Y) = 0:
  a) X and Y might be independent, but it’s not guaranteed.
  b) Var(X + Y) = Var(X) + Var(Y)

SLIDE 4

Sample variance

Variance of a random variable:

  σ² = Var(X) = E((X − µX)²) = E(X²) − (E(X))²

Sample variance from data x1, . . . , xn:

  s² = var(x) = 1/(n−1) · ∑_{i=1}^{n} (xi − x̄)²  =  1/(n−1) · ∑_{i=1}^{n} xi²  −  n/(n−1) · x̄²

Vector formula: form the centered data (row) vector

  M = [ x1 − x̄   x2 − x̄   · · ·   xn − x̄ ]

Then

  s² = (M · M)/(n − 1) = M M′/(n − 1)

SLIDE 5

Sample covariance

Covariance between random variables X, Y:

  σXY = Cov(X, Y) = E((X − µX)(Y − µY)) = E(XY) − E(X)E(Y)

Sample covariance from data (x1, y1), . . . , (xn, yn):

  sXY = cov(x, y) = 1/(n−1) · ∑_{i=1}^{n} (xi − x̄)(yi − ȳ)  =  1/(n−1) · ∑_{i=1}^{n} xi yi  −  n/(n−1) · x̄ ȳ

Vector formula: form the centered data (row) vectors

  MX = [ x1 − x̄   x2 − x̄   · · ·   xn − x̄ ]
  MY = [ y1 − ȳ   y2 − ȳ   · · ·   yn − ȳ ]

Then

  sXY = (MX · MY)/(n − 1) = MX MY′/(n − 1)
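These vector formulas are easy to check numerically. Below is a minimal Python/numpy sketch (the data are randomly generated for illustration, not the dataset from these slides):

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(5, 3, size=100)
y = rng.normal(10, 2, size=100)
n = len(x)

Mx = x - x.mean()                   # centered row vector MX
My = y - y.mean()                   # centered row vector MY

s2  = Mx @ Mx / (n - 1)             # s^2 = MX MX' / (n - 1)
sxy = Mx @ My / (n - 1)             # sXY = MX MY' / (n - 1)

print(np.isclose(s2,  np.var(x, ddof=1)))    # matches numpy's sample variance
print(np.isclose(sxy, np.cov(x, y)[0, 1]))   # matches numpy's sample covariance
```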

SLIDE 6

Covariance matrix

For problems with many simultaneous random variables, put them into vectors:

  X = [ R ]        Y = [ T ]
      [ S ]            [ U ]
                       [ V ]

and then form a covariance matrix:

  Cov(X, Y) = [ Cov(R, T)   Cov(R, U)   Cov(R, V) ]
              [ Cov(S, T)   Cov(S, U)   Cov(S, V) ]

In matrix/vector notation,

  Cov(X, Y) = E( (X − E(X)) (Y − E(Y))′ )
    2×3            2×1        (3×1)′ = 1×3
SLIDE 7

Covariance matrix (a.k.a. Variance-Covariance matrix)

Often there’s one vector with all the variables:

  X = [ R ]
      [ S ]
      [ T ]

  Cov(X) = Cov(X, X) = E( (X − E(X)) (X − E(X))′ )

         = [ Cov(R, R)   Cov(R, S)   Cov(R, T) ]   [ Var(R)      Cov(R, S)   Cov(R, T) ]
           [ Cov(S, R)   Cov(S, S)   Cov(S, T) ] = [ Cov(R, S)   Var(S)      Cov(S, T) ]
           [ Cov(T, R)   Cov(T, S)   Cov(T, T) ]   [ Cov(R, T)   Cov(S, T)   Var(T)    ]

This matrix is symmetric (it equals its own transpose). The diagonal entries are ordinary variances.
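As a quick numerical illustration (a sketch, with made-up data), numpy’s np.cov builds exactly this matrix when each row of the input is one variable:

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(3, 200))      # rows are the variables R, S, T; 200 observations

C = np.cov(X)                      # 3 x 3 sample covariance matrix
print(np.allclose(C, C.T))                                  # symmetric
print(np.allclose(np.diag(C), np.var(X, axis=1, ddof=1)))   # diagonal = variances
```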

SLIDE 8

Covariance matrix properties

  Cov(X, Y) = Cov(Y, X)′
  Cov(AX + B, Y) = A Cov(X, Y)
  Cov(X, CY + D) = Cov(X, Y) C′
  Cov(AX + B) = A Cov(X) A′
  Cov(X1 + X2, Y) = Cov(X1, Y) + Cov(X2, Y)
  Cov(X, Y1 + Y2) = Cov(X, Y1) + Cov(X, Y2)

Here A, C are constant matrices, B, D are constant vectors, and all dimensions must be correct for matrix arithmetic.

SLIDE 9

Example (2D, but works for higher dimensions too)

Data (x1, y1), . . . , (x100, y100), stored as columns of a 2 × 100 matrix:

  M0 = [ x1  · · ·  x100 ] = [ 3.0858    0.8806   9.8850   · · ·   4.4106 ]
       [ y1  · · ·  y100 ]   [ 12.8562  10.7804   8.7504   · · ·  13.5627 ]

[Figure: scatter plot of the original data.]

SLIDE 10

Centered data

[Figures: the original data, and the centered data obtained by subtracting the means.]

SLIDE 11

Computing sample covariance matrix

Original data: 100 (x, y) points in a 2 × 100 matrix M0:

  M0 = [ x1  · · ·  x100 ] = [ 3.0858    0.8806   9.8850   · · ·   4.4106 ]
       [ y1  · · ·  y100 ]   [ 12.8562  10.7804   8.7504   · · ·  13.5627 ]

Centered data: subtract x̄ from the x’s and ȳ from the y’s to get M; here x̄ = 5, ȳ = 10:

  M = [ −1.9142  −4.1194   4.8850   · · ·  −0.5894 ]
      [  2.8562   0.7804  −1.2496   · · ·   3.5627 ]

Sample covariance:

  C = M M′/(100 − 1) = [ 31.9702   −16.5683 ]  =  [ sXX  sXY ]  =  [ sX²  sXY ]
                       [ −16.5683   13.0018 ]     [ sYX  sYY ]     [ sXY  sY² ]
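In Python/numpy, the whole computation is a few lines (a sketch with randomly generated points, since the slides’ full dataset isn’t reproduced here):

```python
import numpy as np

rng = np.random.default_rng(2)
M0 = rng.normal(loc=[[5], [10]], scale=[[5], [3]], size=(2, 100))  # 2 x 100 data

M = M0 - M0.mean(axis=1, keepdims=True)   # subtract row means to center
C = M @ M.T / (M.shape[1] - 1)            # 2 x 2 sample covariance matrix

print(C)
print(np.allclose(C, np.cov(M0)))         # agrees with numpy's built-in
```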

SLIDE 12

Orthonormal matrix

Recall that for vectors v, w, we have v · w = |v| |w| cos(θ), where θ is the angle between the vectors. Orthogonal means perpendicular: v and w are orthogonal when the angle between them is θ = 90° = π/2 radians, so cos(θ) = 0 and v · w = 0.

Vectors v1, . . . , vn are orthonormal when

  vi · vj = 0 for i ≠ j (different vectors are orthogonal)
  vi · vi = 1 for all i (each vector has length 1; they are all unit vectors)

In short:

  vi · vj = δij = { 0 if i ≠ j
                  { 1 if i = j.

Example: î, ĵ, k̂ (the 3D unit vectors along the x, y, z axes) are orthonormal. These can be rotated into other orientations to give new “axes” in other directions; that will be our focus.
SLIDE 13

Orthonormal matrix

Form an n × n matrix V = [ v1 | · · · | vn ] by loading orthonormal n-dimensional column vectors into the columns of V. Transpose it to convert the vectors to row vectors:

  V′ = [ v1′ ]
       [ v2′ ]
       [  ⋮  ]
       [ vn′ ]

(V′V)ij is the ith row of V′ dotted with the jth column of V:

  (V′V)ij = vi · vj = δij

so V′V has 1’s on the diagonal and 0’s elsewhere. Thus, V′V = I (the n × n identity matrix), and V′ = V⁻¹.

An n × n matrix V is orthonormal when V′V = I (or equivalently, VV′ = I), where I is the n × n identity matrix.

SLIDE 14

Diagonalizing the sample covariance matrix C

C = V D V′:

  [ 31.9702   −16.5683 ] = [ −0.8651  −0.5016 ] [ 41.5768     0    ] [ −0.8651   0.5016 ]
  [ −16.5683   13.0018 ]   [  0.5016  −0.8651 ] [    0     3.3952  ] [ −0.5016  −0.8651 ]

C is a real-valued symmetric matrix. It can be shown that:

  • C can be diagonalized (recall that not all matrices are diagonalizable), in the special form C = VDV′ with V orthonormal, so V⁻¹ = V′;
  • all eigenvalues are real numbers ≥ 0, so we can put them on the diagonal of D in decreasing order: λ1 ≥ λ2 ≥ · · · ≥ 0.
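A sketch of this diagonalization in numpy (np.linalg.eigh is the routine for symmetric matrices; it returns eigenvalues in increasing order, so we reverse them, and eigenvector signs may differ from the slides):

```python
import numpy as np

C = np.array([[ 31.9702, -16.5683],
              [-16.5683,  13.0018]])     # sample covariance matrix from above

evals, V = np.linalg.eigh(C)             # eigendecomposition of a symmetric matrix
evals, V = evals[::-1], V[:, ::-1]       # reorder so eigenvalues decrease
D = np.diag(evals)

print(evals)                             # approx [41.5768, 3.3952]
print(np.allclose(C, V @ D @ V.T))       # C = V D V'
print(np.allclose(V.T @ V, np.eye(2)))   # V is orthonormal
```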

SLIDE 15

Diagonalizing the sample covariance matrix C

Since C is symmetric, if v is a right eigenvector with eigenvalue λ, then v′ is a left eigenvector with eigenvalue λ, and vice-versa:

  C v = λ v  so  λ v′ = (C v)′ = v′ C′ = v′ C

Diagonalization C = VDV⁻¹ loads right and left eigenvectors into V and V⁻¹. Here those eigenvectors are transposes of each other, leading to the special form C = VDV′.

Also, all eigenvalues are ≥ 0 (“C is positive semidefinite”): for all vectors w,

  w′ C w = w′ M M′ w/(n − 1) = |M′ w|²/(n − 1) ≥ 0

The eigenvector equation C w = λ w gives w′ C w = λ w′ w = λ |w|². So λ |w|² = w′ C w ≥ 0, giving λ ≥ 0.

SLIDE 16

Principal axes

The columns of V are the right eigenvectors of C. Multiply each eigenvector by the square root of its eigenvalue to get the principal components.

  Eigenvalue   Eigenvector             PC
  41.5768      (−0.8651, 0.5016)′      (−5.5782, 3.2343)′
  3.3952       (−0.5016, −0.8651)′     (−0.9242, −1.5940)′

Put them into the columns of a matrix:

  P = V √D = [ −5.5782   −0.9242 ]
             [  3.2343   −1.5940 ]

  C = VDV′ = V √D √D′ V′ = (V√D)(V√D)′ = PP′
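In numpy, P is one extra line beyond the diagonalization sketch above (repeated here so the block is self-contained; PC signs may differ from the slides):

```python
import numpy as np

C = np.array([[ 31.9702, -16.5683],
              [-16.5683,  13.0018]])

evals, V = np.linalg.eigh(C)
evals, V = evals[::-1], V[:, ::-1]      # decreasing eigenvalue order

P = V @ np.diag(np.sqrt(evals))         # P = V sqrt(D); columns are the PCs
print(P)
print(np.allclose(C, P @ P.T))          # C = P P'
```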

SLIDE 17

Principal axes

Plot the centered data with lines along the principal axes:

[Figure: the centered data with lines drawn along the principal axes.]

The sum of squared perpendicular distances of the data points to the first PC line (red) is the minimum among all lines through the origin. The ith PC is perpendicular to the previous ones, and the sum of squared perpendicular distances to the span (line, plane, . . . ) of the first i PCs is the minimum among all i-dimensional subspaces through the origin.

SLIDE 18

Rotate axes

Transform M to M2 = V ′M and plot points given by the columns of M2:

[Figures: the centered data with principal axes, and the rotated data in the new principal-axes coordinates.]

From linear algebra, a linear transformation M → AM does a combination of rotations, reflections, scaling, shearing, and orthogonal projections. V is orthonormal, so M2 = V′M rotates/reflects all the data, and M = VM2 recovers the centered data M from the rotated data M2.

The sample covariance matrix of M2 is

  M2 M2′/(n − 1) = (V′M)(V′M)′/(n − 1) = V′ M M′ V/(n − 1) = V′ (M M′/(n − 1)) V = V′CV = D

Note C = VDV′ and V′ = V⁻¹, so D = V⁻¹ C (V′)⁻¹ = V′CV.
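A numpy sketch of the rotation and its diagonal covariance (again with randomly generated data standing in for the slides’ example):

```python
import numpy as np

rng = np.random.default_rng(2)
M0 = rng.normal(loc=[[5], [10]], scale=[[5], [3]], size=(2, 100))
M = M0 - M0.mean(axis=1, keepdims=True)       # centered data
C = M @ M.T / (M.shape[1] - 1)

evals, V = np.linalg.eigh(C)
evals, V = evals[::-1], V[:, ::-1]            # decreasing eigenvalue order

M2 = V.T @ M                                  # rotate into principal-axis coordinates
C2 = M2 @ M2.T / (M2.shape[1] - 1)
print(np.allclose(C2, np.diag(evals)))        # covariance of rotated data is D
print(np.allclose(V @ M2, M))                 # rotating back recovers M
```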

SLIDE 19

New coordinates

The rotated data has new coordinates (t1, u1), . . . , (t100, u100) and covariance matrix D:

  V′CV = D = [ var(T)      cov(T, U) ] = [ 41.5768     0    ]
             [ cov(T, U)   var(U)    ]   [    0     3.3952  ]

In D, the total variance is var(T) + var(U) = 44.9720. Note that this is the sum of the eigenvalues, λ1 + λ2 + · · ·.

The trace of a matrix is the sum of its diagonal entries, so the total variance is Tr(D) = λ1 + λ2 + · · ·. General linear algebra fact: Tr(X) = Tr(AXA⁻¹). So Tr(C) = Tr(VDV′) = Tr(VDV⁻¹) = Tr(D). Below, Tr(C) = Tr(D) = 44.9720.

  C = [ 31.9702   −16.5683 ]    D = [ 41.5768     0    ]
      [ −16.5683   13.0018 ]        [    0     3.3952  ]

SLIDE 20

Part of variance explained by each axis

The part of the variance explained by each axis is λi/(total variance):

  Eigenvector             Eigenvalue   Explained
  (−0.8651, 0.5016)′      41.5768      41.5768/44.9720 = 92.45%
  (−0.5016, −0.8651)′     3.3952       3.3952/44.9720 = 7.55%
  Total                   44.9720      100%

This is an application of Cov(AX) = A Cov(X) A′:

  Cov( V′ [X] ) = V′ Cov( [X] ) V
          [Y]             [Y]
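The explained fractions are a one-liner in numpy (a sketch, using the eigenvalues from this example):

```python
import numpy as np

evals = np.array([41.5768, 3.3952])   # eigenvalues from the example above
explained = evals / evals.sum()       # fraction of total variance per axis
print(explained)                      # approx [0.9245, 0.0755]
```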
SLIDE 21

Dimension reduction

To clean up “noise,” set all ui = 0 and rotate back:

  V [ t1  t2  t3  · · · ]
    [ 0   0   0   · · · ]

gives the projections of the centered points (x1, y1), (x2, y2), (x3, y3), . . . onto the first PC line.

[Figure: the original centered data projected onto the first PC.]

SLIDE 22

Dimension reduction

Say we want to keep enough information to explain 90% of the variance:

  • Take enough top PCs to explain 90% of the variance.
  • Let M3 be M2 (the rotated data) with the remaining coordinates zeroed out.
  • Rotate it back to the original axes with VM3.

A numpy sketch of this procedure is given below. In other applications, a dominant signal can be suppressed by zeroing out the coordinates for the top PCs instead of the bottom PCs.
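A minimal sketch of the dimension-reduction recipe, assuming M is a centered variables × observations matrix (the function name and threshold are illustrative, not from the slides):

```python
import numpy as np

def reduce_dimensions(M, frac=0.90):
    """Keep the top PCs explaining at least `frac` of the variance,
    zero the remaining rotated coordinates, and rotate back."""
    n = M.shape[1]
    C = M @ M.T / (n - 1)
    evals, V = np.linalg.eigh(C)
    evals, V = evals[::-1], V[:, ::-1]           # decreasing eigenvalue order
    k = int(np.searchsorted(np.cumsum(evals) / evals.sum(), frac)) + 1
    M2 = V.T @ M                                 # rotate to PC coordinates
    M3 = np.zeros_like(M2)
    M3[:k] = M2[:k]                              # keep top k PCs, zero the rest
    return V @ M3                                # rotate back to original axes
```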

SLIDE 23

Variations for PCA (and SVD, upcoming)

  • Some people reverse the roles of the rows and columns of M.
  • In some applications, M is “centered” (subtract off the row means); in others, it’s not.
  • If the ranges of the variables (rows) are very different, the data might be rescaled in each row to make the ranges similar. For example, replace each row by Z-scores for the row.

SLIDE 24

Sensitivity to scaling

PCA was originally designed for measurements in ordinary space, so all axes had the same units (e.g., cm or inches), and equivalent results would be obtained no matter what units were used. It’s problematic to mix physical quantities with different units:

  • Length of (a, b) in (seconds, mm): √(a² + b²) (adding sec² plus mm² is not legitimate!)
  • Converting to (hours, miles) gives (a/3600, b/1609344), with length √(a²/3600² + b²/1609344²).
  • Angles are also distorted by this unit conversion: arctan(b/a) ≠ arctan((b/1609344) / (a/3600)).
  • |(0 °C, 0 °C)| = 0 vs. |(32 °F, 32 °F)| = 32√2. Both °C and °F use an arbitrary “zero” instead of absolute zero.

PCA is sensitive to differences in the scale, offset, and ranges of the variables. Rescaling one row without the others changes angles and lengths nonuniformly, and changes the eigenvalues and eigenvectors in an inequivalent way. This is typically addressed by replacing each row with Z-scores, as in the sketch below.
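A minimal sketch of the row-wise Z-score rescaling (an illustrative helper, not from the slides):

```python
import numpy as np

def zscore_rows(M):
    """Replace each row (variable) of M by its Z-scores, so every
    variable has mean 0 and sample standard deviation 1."""
    mu = M.mean(axis=1, keepdims=True)
    sd = M.std(axis=1, ddof=1, keepdims=True)
    return (M - mu) / sd
```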

SLIDE 25

Microarrays

Previously, we considered single genes, where “red” or “green” (positive or negative expression level) distinguished the classes. If xi is the expression level of gene i, then L = a1x1 + a2x2 + · · · is a linear combination of genes. Next up: a method that finds linear combinations of genes where L > C and L < C distinguish two classes, for some constant C. So L = C is a line / plane / etc. that splits the multidimensional space of expression levels. Different classes are not always separated in this fashion; in some situations, nonlinear relations may be required.

SLIDE 26

Microarrays

Consider an experiment with 80 microarrays, each with 10000 spots. M is 10000 × 80, so C = MM′/(80 − 1) is 10000 × 10000!

M′M is only 80 × 80. We will see that MM′ has the same 80 eigenvalues as M′M, plus an additional 10000 − 80 = 9920 eigenvalues equal to 0. Some of the 80 eigenvalues of M′M may also be 0: for centered data, all row sums of M are 0, so [1, . . . , 1]′ is an eigenvector of M′M with eigenvalue 0. We will see that we can work with the smaller of MM′ or M′M.

SLIDE 27

Singular Value Decomposition (SVD)

Let M be a p × q matrix (not necessarily “centered”). The Singular Value Decomposition of M is M = USV′, where

  • U is orthonormal, p × p.
  • V is orthonormal, q × q.
  • S is a diagonal p × q matrix with s1 ≥ s2 ≥ · · · ≥ 0 on the diagonal.

If M is 5 × 3: M (5 × 3) equals U (5 × 5) times S (5 × 3, diagonal entries s1, s2, s3 and two rows of 0’s below) times V′ (3 × 3).

SLIDE 28

“Compact” SVD

For p > q: the bottom p − q rows of S are all 0. Remove them, keeping only the first q rows of S and the first q columns of U. Then

  • U is p × q with orthonormal columns.
  • V is orthonormal, q × q.
  • S is a diagonal q × q matrix with s1 ≥ s2 ≥ · · · ≥ 0.

If M is 5 × 3: M (5 × 3) equals U (5 × 3) times S (3 × 3, diagonal) times V′ (3 × 3).

For q > p: keep only the first p columns of S and the first p rows of V′.

Matlab and R have options for full or compact form in svd(M).
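numpy exposes the same full-vs-compact choice through the full_matrices flag of np.linalg.svd (a sketch):

```python
import numpy as np

M = np.random.default_rng(3).normal(size=(5, 3))     # p = 5, q = 3

U, s, Vt = np.linalg.svd(M)                          # full form: U is 5 x 5
Uc, sc, Vtc = np.linalg.svd(M, full_matrices=False)  # compact form: Uc is 5 x 3

S = np.zeros((5, 3))
S[:3, :3] = np.diag(s)                               # embed singular values in 5 x 3 S
print(np.allclose(M, U @ S @ Vt))                    # M = U S V' (full)
print(np.allclose(M, Uc @ np.diag(sc) @ Vtc))        # M = U S V' (compact)
```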

SLIDE 29

Computing the SVD

For the 5 × 3 example above:

  M′M = (VS′U′)(USV′) = V(S′S)V′ = V diag(s1², s2², s3²) V′            (3 × 3)

  MM′ = (USV′)(VS′U′) = U(SS′)U′ = U diag(s1², s2², s3², 0, 0) U′      (5 × 5)

These diagonalizations of M′M and MM′ show that they have the same eigenvalues up to the dimension of the smaller matrix; the larger matrix has all additional eigenvalues equal to 0. Compute the SVD using whichever one gives smaller dimensions!

SLIDE 30

Computing the SVD

Recall M′M = V(S′S)V′ and MM′ = U(SS′)U′.

First method (recommended when p ≥ q):

  • Diagonalize M′M = VDV′.
  • Compute the p × q matrix S with Sii = √(Dii) and 0’s elsewhere.
  • The pseudoinverse of S is “S⁻¹”: replace each nonzero diagonal entry of S by its reciprocal, and transpose, to get a q × p matrix.
  • Compute U: M = USV′ =⇒ U = M(V′)⁻¹S⁻¹ = MVS⁻¹.

The case q ≥ p is analogous: diagonalize MM′ = UDU′; compute S from D; then compute V = (S⁻¹U′M)′.

Use svd(M) in both Matlab and R.
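Here is a sketch of the first method in numpy (it assumes p ≥ q and that all singular values are nonzero; in practice a library call like np.linalg.svd is preferable):

```python
import numpy as np

def svd_via_gram(M):
    """Compact SVD of M (p x q, p >= q, full rank) computed by
    diagonalizing the smaller q x q matrix M'M."""
    D, V = np.linalg.eigh(M.T @ M)       # M'M = V D V'
    D, V = D[::-1], V[:, ::-1]           # decreasing eigenvalue order
    s = np.sqrt(np.clip(D, 0, None))     # singular values s_i = sqrt(D_ii)
    U = (M @ V) / s                      # U = M V S^{-1} (divide column i by s_i)
    return U, s, V.T

M = np.random.default_rng(4).normal(size=(6, 3))
U, s, Vt = svd_via_gram(M)
print(np.allclose(M, U @ np.diag(s) @ Vt))                 # M = U S V'
print(np.allclose(s, np.linalg.svd(M, compute_uv=False)))  # matches library SVD
```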

SLIDE 31

Singular values and singular vectors

Let M be a p × q matrix (not necessarily centered). Suppose

  • s is a scalar;
  • v is a q × 1 unit vector (column vector);
  • u is a p × 1 unit vector (column vector).

Then s is a singular value of M with right singular vector v and left singular vector u if

  M v = s u   and   u′M = s v′ (same as M′u = s v).

Break U and V into columns:

  U = [ u1 | u2 | · · · | up ]        V = [ v1 | v2 | · · · | vq ]

Then M vi = si ui and M′ui = si vi for i ≤ min(p, q). If p > q: M′ui = 0 for i > q. If q > p: M vi = 0 for i > p.

To get the full-sized M = USV′ from the compact one (p ≥ q case): choose the remaining columns of U from the nullspace of M′ in such a way that the columns of U are an orthonormal basis of Rᵖ.

SLIDE 32

Relation between PCA and SVD

Previous computation for PCA:

Start with the centered data matrix M (n columns). Compute the covariance matrix, diagonalize it, and compute P:

  C = MM′/(n − 1) = VDV′ = PP′   where P = V√D

Computing PCA using SVD:

In terms of the SVD factorization M = USV′, the covariance is

  C = MM′/(n − 1) = (USV′)(VS′U′)/(n − 1) = U(SS′)U′/(n − 1)
    = UDU′   where D = SS′/(n − 1)
    = PP′    where P = US/√(n − 1)

The variance for the ith component is si²/(n − 1).

Note: there were minor notation adjustments to deal with n − 1.
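The equivalence is easy to confirm numerically (a sketch with random centered data):

```python
import numpy as np

rng = np.random.default_rng(5)
M = rng.normal(size=(2, 100))
M -= M.mean(axis=1, keepdims=True)           # centered data, n = 100 columns
n = M.shape[1]

# PCA via the covariance matrix
C = M @ M.T / (n - 1)
evals = np.linalg.eigvalsh(C)[::-1]          # eigenvalues, decreasing

# PCA via the SVD of the data matrix itself
U, s, Vt = np.linalg.svd(M, full_matrices=False)
print(np.allclose(evals, s**2 / (n - 1)))    # eigenvalues of C are s_i^2/(n-1)

P = U * (s / np.sqrt(n - 1))                 # P = U S / sqrt(n-1), scaling columns
print(np.allclose(C, P @ P.T))               # C = P P'
```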

SLIDE 33

SVD in microarrays

Nielsen et al.¹ studied tumors in six types of tissue: 41 tissue samples and 46 microarray slides. They switched microarray platforms in the middle of the experiment:

  • The first 26 slides have 22,654 spots (22K).
  • The next 20 slides have 42,611 spots (42K) (mostly a superset).
  • Five of the samples were done on both the 22K and 42K platforms.

7425 spots were common to both platforms, had good signal across all slides, and had sample variance above a certain threshold. So M is 7425 × 46.

¹ Molecular characterisation of soft tissue tumours: a gene expression study, Lancet (2002) 359: 1301–1307.

SLIDE 34

SVD in microarrays

[Figure: heat map of the compact SVD M = USV′ on a color scale from negative to positive. Nielsen et al., supplementary material: http://genome-www.stanford.edu/sarcoma/Supplemental_data.shtml]

The compact form M = USV′ is shown above. They call the columns of U “eigenarrays” and the columns of V (rows of V′) “eigengenes.” An eigenarray is a linear combination of arrays; an eigengene is a linear combination of genes.

SLIDE 35

SVD in microarrays

Sample covariance matrix: C = MM′/45 = U(SS′)U′/45. The sample variance of the ith component is si²/45, and the total sample variance is (s1² + · · · + s46²)/45.

[Figure: V′ and the explained fractions si²/(s1² + · · · + s46²). Nielsen et al., supplementary material.]

SLIDE 36

“Expression level” of eigengenes

The expression level of gene i on array j is Mij.

Interpretation of the change of basis S = U′MV: the ith eigenarray only detects the ith eigengene, and has 0 response to the other eigengenes.

Interpretation of V′ = S⁻¹U′M: the “expression level” of eigengene i on array j is (V′)ij = Vji.

Let m represent a new array (e.g., a column vector of expression levels in each gene). The expression level of eigengene i on it is (U′m)i/si.
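As a sketch (a hypothetical helper, not from the slides), the eigengene levels of a new array follow directly from the compact SVD factors:

```python
import numpy as np

def eigengene_levels(U, s, m):
    """Expression level of each eigengene on a new array m (a column of
    per-gene expression values): (U' m)_i / s_i."""
    return (U.T @ m) / s

# On the training data M = U S V', this recovers the eigengene levels V':
# eigengene_levels(U, s, M[:, j]) equals column j of V'.
```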

SLIDE 37

Platform bias

They re-ordered the arrays according to the expression level Vj1 in the first eigengene (largest eigenvalue). They found that Vj1 tends to be larger in the 42K arrays and smaller in the 22K arrays. This is an experimental artifact, not a property of the specimens under study.

[Figure: arrays sorted by first-eigengene expression level, showing the 22K vs. 42K platform split. Nielsen et al., supplementary material.]

SLIDE 38

Removing 22K vs. 42K array bias

Let S̃ be S with the (1,1) entry replaced by 0, and let M̃ = US̃V′. This reduces the signal and variance in many spots. After removing weak spots, they cut down to 5520 spots, giving a 5520 × 46 data matrix.
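A sketch of this bias-removal step in numpy (the function name is illustrative; the slide's step is exactly the s[0] = 0 line):

```python
import numpy as np

def remove_top_component(M):
    """Suppress the dominant SVD component: zero the largest singular
    value and reconstruct M-tilde = U S-tilde V'."""
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    s = s.copy()
    s[0] = 0.0                       # zero out the (1,1) entry of S
    return U @ np.diag(s) @ Vt
```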

SLIDE 39

Classification — Eigengenes — 1D

For the 5520 × 46 matrix, the expression levels of the top three eigengenes can be used to classify some tumor types.

[Figures: the 46 arrays ranked by expression level of eigengene 1 and of eigengene 2 (5520 spots). The sample labels (GIST, SynSarc, LEIO, LIPO, MFH, Schwannoma) show that each eigengene separates some tumor types.]

SLIDE 40

Classification — Eigengenes — 2D

λ1, λ2, λ3 help distinguish between tumor types

!!"# !! !$"# $ $"# ! !"# %&!$

'

!! !$"# $ $"# ! !"# %&!$

'

()*+,*+,+&! ()*+,*+,+&- ./,.012 34.5 6(47 6487 9:; .2<=0,,>?0 !!"# !! !$"# $ $"# ! !"# %&!$

'

!!($$$ !!$$$$ !)$$$ !*$$$ !'$$$ !($$$ $ ($$$ '$$$ +,-./-./.&! +,-./-./.&0 12/1345 6718 9+7: 97;: <=> 15?@3//AB3 !! !"#$ " "#$ ! !#$ %&!"

'

!!(""" !!"""" !)""" !*""" !'""" !(""" " (""" '""" +,-./-./.&( +,-./-./.&0 12/1345 6718 9+7: 97;: <=> 15?@3//AB3
