Class Averaging and Symmetry Detection in Cryo-EM Amit Singer - - PowerPoint PPT Presentation

class averaging and symmetry detection in cryo em
SMART_READER_LITE
LIVE PREVIEW

Class Averaging and Symmetry Detection in Cryo-EM Amit Singer - - PowerPoint PPT Presentation

Class Averaging and Symmetry Detection in Cryo-EM Amit Singer Princeton University Department of Mathematics and Program in Applied and Computational Mathematics July 25, 2014 Amit Singer (Princeton University) July 2014 1 / 29 Image


slide-1
SLIDE 1

Class Averaging and Symmetry Detection in Cryo-EM

Amit Singer

Princeton University Department of Mathematics and Program in Applied and Computational Mathematics

July 25, 2014

Amit Singer (Princeton University) July 2014 1 / 29

slide-2
SLIDE 2

Image denoising by vector diffusion maps

S, Zhao, Shkolnisky, Hadani (SIIMS 2011) Hadani, S (FoCM 2011) S, Wu (Comm. Pure Appl. Math 2012) Zhao, S (J Struct. Bio. 2014) Generalization of Laplacian Eigenmaps (Belkin, Niyogi 2003) and Diffusion Maps (Coifman, Lafon 2006) Introduced the graph Connection Laplacian Experimental images (70S) courtesy of

  • Dr. Joachim Frank (Columbia)

Class averages by vector diffusion maps (averaging with 20 nearest neighbors)

Amit Singer (Princeton University) July 2014 2 / 29

slide-3
SLIDE 3

Class Averaging in Cryo-EM: Improve SNR

Amit Singer (Princeton University) July 2014 3 / 29

slide-4
SLIDE 4

Clustering method (Penczek, Zhu, Frank 1996)

Projection images P1, P2, . . . , Pn with unknown rotations R1, R2, . . . , Rn ∈ SO(3) Rotationally Invariant Distances (RID) dRID(i, j) = min

O∈SO(2) Pi − OPj

Cluster the images using K-means. Images are not centered; also possible to include translations and to

  • ptimize over the special Euclidean group.

Problem with this approach: outliers. At low SNR images with completely different viewing directions may have relatively small dRID (noise aligns well, instead of underlying signal).

Amit Singer (Princeton University) July 2014 4 / 29

slide-5
SLIDE 5

Outliers: Small World Graph on S2

Define graph G = (V , E) by {i, j} ∈ E ⇐ ⇒ dRID(i, j) ≤ ε.

1 i

R

3 i

R

2 i

R

Optimal rotation angles Oij = argmin

O∈SO(2)

Pi − OPj, i, j = 1, . . . , n. Triplet consistency relation – good triangles OijOjkOki ≈ I2×2. How to use information of optimal rotations in a systematic way? Vector Diffusion Maps “Non-local means with rotations”

Amit Singer (Princeton University) July 2014 5 / 29

slide-6
SLIDE 6

Vector Diffusion Maps: Setup

wij

Oij

i j

In VDM, the relationships between data points (e.g., cryo-EM images) are represented as a weighted graph, where the weights wij describing affinities between data points are accompanied by linear orthogonal transformations Oij.

Amit Singer (Princeton University) July 2014 6 / 29

slide-7
SLIDE 7

Manifold Learning: Point cloud in Rp

x1, x2, . . . , xn ∈ Rp. Manifold assumption: x1, . . . , xn ∈ Md, with d ≪ p. Local Principal Component Analysis (PCA) gives an approximate

  • rthonormal basis Oi for the tangent space TxiM.

Oi is a p × d matrix with orthonormal columns: OT

i Oi = Id×d.

Alignment: Oij = argminO∈O(d) O − OT

i OjHS

(computed through the singular value decomposition of OT

i Oj).

  • 1

2 3 3 2 1

  • 5

6 4 4 6 5

  • Amit Singer (Princeton University)

July 2014 7 / 29

slide-8
SLIDE 8

Parallel Transport

Oij approximates the parallel transport operator Pxi,xj : TxjM → TxiM

Amit Singer (Princeton University) July 2014 8 / 29

slide-9
SLIDE 9

Laplacian Eigenmap (Belkin and Niyogi 2003) and Diffusion Map (Coifman and Lafon 2006)

Symmetric n × n matrix W0: W0(i, j) = wij (i, j) ∈ E, (i, j) / ∈ E. Diagonal matrix D0 of the same size: D0(i, i) = deg(i) =

  • j:(i,j)∈E

wij. Graph Laplacian, Normalized graph Laplacian and the random walk matrix: L0 = D0 − W0, L0 = I − D−1/2 W0D−1/2 , A0 = D−1

0 W0

The diffusion map Φt is defined in terms of the eigenvectors of A0: A0φl = λlφl, l = 1, . . . , n Φt : i → (λt

l φl(i))n l=1.

Amit Singer (Princeton University) July 2014 9 / 29

slide-10
SLIDE 10

Vector diffusion mapping: W1 and D1

Symmetric nd × nd matrix W1: W1(i, j) = wijOij (i, j) ∈ E, 0d×d (i, j) / ∈ E. n × n blocks, each of which is of size d × d. Diagonal matrix D1 of the same size, where the diagonal d × d blocks are scalar matrices with the weighted degrees: D1(i, i) = deg(i)Id×d, and deg(i) =

  • j:(i,j)∈E

wij

Amit Singer (Princeton University) July 2014 10 / 29

slide-11
SLIDE 11

A1 = D−1

1 W1 is an averaging operator for vector fields

The matrix A1 can be applied to vectors v of length nd, which we regard as n vectors of length d, such that v(i) is a vector in Rd viewed as a vector in TxiM. The matrix A1 = D−1

1 W1 is an

averaging operator for vector fields, since (A1v)(i) = 1 deg(i)

  • j:(i,j)∈E

wijOijv(j). This implies that the operator A1 transport vectors from the tangent spaces TxjM (that are nearby to TxiM) to TxiM and then averages the transported vectors in TxiM.

Amit Singer (Princeton University) July 2014 11 / 29

slide-12
SLIDE 12

Affinity between nodes based on consistency of transformations

In the VDM framework, we define the affinity between i and j by considering all paths of length t connecting them, but instead of just summing the weights of all paths, we sum the transformations. Every path from j to i may result in a different transformation (like parallel transport due to curvature). When adding transformations of different paths, cancelations may happen. We define the affinity between i and j as the consistency between these transformations. A1 = D−1

1 W1 is similar to the symmetric matrix ˜

W1 ˜ W1 = D−1/2

1

W1D−1/2

1

We define the affinity between i and j as ˜ W 2t

1 (i, j)2 HS = deg(i)

deg(j)(D−1

1 W1)2t(i, j)2 HS.

Amit Singer (Princeton University) July 2014 12 / 29

slide-13
SLIDE 13

Embedding into a Hilbert Space

Since ˜ W1 is symmetric, it has a complete set of eigenvectors {vl}nd

l=1

and eigenvalues {λl}nd

l=1 (ordered as |λ1| ≥ |λ2| ≥ . . . ≥ |λnd|).

Spectral decompositions of ˜ W1 and ˜ W 2t

1 :

˜ W1(i, j) =

nd

  • l=1

λlvl(i)vl(j)T , and ˜ W 2t

1 (i, j) = nd

  • l=1

λ2t

l vl(i)vl(j)T ,

where vl(i) ∈ Rd for i = 1, . . . , n and l = 1, . . . , nd. The HS norm of ˜ W 2t

1 (i, j) is calculated using the trace:

˜ W 2t

1 (i, j)2 HS = nd

  • l,r=1

(λlλr)2tvl(i), vr(i)vl(j), vr(j). The affinity ˜ W 2t

1 (i, j)2 HS = Vt(i), Vt(j) is an inner product for the

finite dimensional Hilbert space R(nd)2 via the mapping Vt: Vt : i →

  • (λlλr)tvl(i), vr(i)

nd

l,r=1 .

Amit Singer (Princeton University) July 2014 13 / 29

slide-14
SLIDE 14

Vector Diffusion Distance

The vector diffusion mapping is defined as Vt : i →

  • (λlλr)tvl(i), vr(i)

nd

l,r=1 .

The vector diffusion distance between nodes i and j is denoted dVDM,t(i, j) and is defined as d2

VDM,t(i, j) = Vt(i), Vt(i) + Vt(j), Vt(j) − 2Vt(i), Vt(j).

Other normalizations of the matrix W1 are possible and lead to slightly different embeddings and distances (similar to diffusion maps). The matrices I − ˜ W1 and I + ˜ W1 are positive semidefinite, because v T(I ± D−1/2

1

W1D−1/2

1

)v =

  • (i,j)∈E
  • v(i)
  • deg(i)

± wijOijv(j)

  • deg(j)
  • 2

≥ 0, for any v ∈ Rnd. Therefore, λl ∈ [−1, 1]. As a result, the vector diffusion mapping and distances can be well approximated by using

  • nly the few largest eigenvalues and their corresponding eigenvectors.

Amit Singer (Princeton University) July 2014 14 / 29

slide-15
SLIDE 15

Application to the class averaging problem in Cryo-EM (S, Zhao, Shkolnisky, Hadani 2011)

20 40 60 80 100 120 140 160 180 2 4 6 8 10 12 14x 10

4

degrees

(a) Neighbors are identified using dRID

20 40 60 80 100 120 140 160 180 0.5 1 1.5 2 2.5x 10

5

degrees

(b) Neighbors are identified using dVDM,t=2

Figure : SNR=1/64: Histogram of the angles (x-axis, in degrees) between the viewing directions of each image (out of 40000) and it 40 neighboring images. Left: neighbors are identified using the original rotationally invariant distances

  • dRID. Right: neighbors are post identified using vector diffusion distances.

Amit Singer (Princeton University) July 2014 15 / 29

slide-16
SLIDE 16

Computational Aspects

Zhao, S J. Struct. Biol. 2014 Na¨ ıve implementation requires O(n2) rotational alignments of images Rotational invariant representation of images: “bispectrum” Dimensionality reduction using a randomized algorithm for PCA (Rokhlin, Liberty, Tygert, Martinsson, Halko, Tropp, Szlam, ...) Randomized approximated nearest neighbors search in nearly linear time (Jones, Osipov, Rokhlin 2011)

Amit Singer (Princeton University) July 2014 16 / 29

slide-17
SLIDE 17

Rotational Invariance: Bispectrum

Bispectrum for a 1D periodic signal f bf (k1, k2) = ˆ f (k1)ˆ f (k2)ˆ f (−(k1 + k2)) Bispectrum is shift invariant, complete and unbiased. Phase information is preserved (unlike power spectrum)

Amit Singer (Princeton University) July 2014 17 / 29

slide-18
SLIDE 18

Rotational Invariance: Bispectrum and Steerable Basis

Steerable basis: uk,q(r, θ) = gq(r)eıkθ. PCA for images and their in-plane rotations (Zhao, S JOSA A 2013) I(r, θ) =

k,q ak,quk,q(r, θ)

Image rotation by α accounts for phase shift ak,q → ak,qe−ıkα Rotationally invariant bispectrum bI(k1, k2, q1, q2, q3) = ak1,q1ak2,q2a∗

k1+k2,q3

Amit Singer (Princeton University) July 2014 18 / 29

slide-19
SLIDE 19

The Hairy Ball Theorem

There is no non-vanishing continuous tangent vector field on the sphere. Cannot find Oi such that Oij = OiO−1

j

. No global rotational alignment of all images.

Amit Singer (Princeton University) July 2014 19 / 29

slide-20
SLIDE 20

Example: Connection-Laplacian for Sd embedded in Rd+1

The connection-Laplacian commutes with rotations and the eigenvalues and eigen-vector-fields are calculated using representation theory: S2 : 6, 10, 14, . . . . S3 : 4, 6, 9, 16, 16, . . . . S4 : 5, 10, 14, . . . . S5 : 6, 15, 20, . . . .

10 20 30 0.6 0.7 0.8 0.9

(a) S2

10 20 30 0.6 0.7 0.8 0.9

(b) S3

10 20 30 0.6 0.7 0.8 0.9

(c) S4

10 20 30 0.6 0.7 0.8 0.9

(d) S5

Figure : Bar plots of the largest 30 eigenvalues of A1 for n = 8000 points uniformly distributed over spheres of different dimensions.

Amit Singer (Princeton University) July 2014 20 / 29

slide-21
SLIDE 21

Spectral graph theory with vector fields

The Graph Connection Laplacian L1 = D1 − W1 The Normalized Graph Connection Laplacian L1 = I − D−1/2

1

W1D−1/2 = I − ˜ W1 Averaging operator / random walk matrix for vector diffusion: A1 = D−1

1 W1

A Cheeger inequality for the graph connection Laplacian (Bandeira, S, Spielman SIAM Matrix Analysis 2013)

Amit Singer (Princeton University) July 2014 21 / 29

slide-22
SLIDE 22

More applications of VDM: Orientability from a point cloud

Encode the information about reflections in a symmetric n × n matrix Z with entries Zij = det Oij (i, j) ∈ E, (i, j) / ∈ E. That is, Zij = 1 if no reflection is needed, Zij = −1 if a reflection is needed, and Zij = 0 if the points are not nearby. Normalize Z by the node degrees.

−0.015 −0.01 −0.005 0.005 0.01 0.015 500 1000 1500 2000 2500 3000

(a) S2

−0.02 −0.01 0.01 0.02 10 20 30 40 50 60

(b) Klein bottle

−0.02 −0.01 0.01 0.02 5 10 15 20 25 30 35 40

(c) RP2

Figure : Histogram of the values of the top eigenvector of D−1

0 Z.

Amit Singer (Princeton University) July 2014 22 / 29

slide-23
SLIDE 23

Orientable Double Covering

Embedding obtained using the eigenvectors of the (normalized) matrix

  • Z

−Z −Z Z

  • =
  • 1

−1 −1 1

  • ⊗ Z,

Figure : Left: the orientable double covering of RP(2), which is S2; Middle: the

  • rientable double covering of the Klein bottle, which is T 2; Right: the orientable

double covering of the M¨

  • bius strip, which is a cylinder.

Amit Singer (Princeton University) July 2014 23 / 29

slide-24
SLIDE 24

Detecting the symmetry group in cryo-EM

Suppose G ⊂ SO(3) is a finite subgroup (cyclic, dihedral, etc.) s.t. φ(gr) = φ(r), for all r = (x, y, z) and g ∈ G. Goal: detect the symmetry group G, given noisy projection images.

Density map of IP3R1 EM projections From http://pdbj.org/emnavi/emnavi_detail.php?id=5278. The reconstruction shown is made from 37,000 projections.

Amit Singer (Princeton University) July 2014 24 / 29

slide-25
SLIDE 25

Kam’s theory

  • Z. Kam, J. Theor. Biol. (1980) 82, 15-39

The density function in Fourier space has the spherical harmonic expansion ˆ φ(r, θ, ϕ) =

  • l=0

l

  • m=−l

Alm(r)Y m

l (θ, ϕ)

By the slice theorem, the projection at orientation ω ∈ SO(3) ˆ I ω(r, ϕ) =

  • l=0

l

  • m=−l

Alm(r)

l

  • m′=−l

Dl

m,m′(ω)Y m′ l

(π 2 , ϕ)

Amit Singer (Princeton University) July 2014 25 / 29

slide-26
SLIDE 26

Autocorrelation Function

Covariance matrix of 2D Fourier images − →

averaging over ω∈SO(3)

the autocorrelation function of the 3D Fourier volume: C(r1, r2, ϕ1, ϕ2) := Aveω∈SO(3)

  • ˆ

I ω(r1, ϕ1) · ˆ I ω(r2, ϕ2)

  • (1)

=

  • l=0

l

  • m=−l

Alm(r1)Alm(r2) 1 4π Pl(cos(ϕ1 − ϕ2)), Thus C(r1, r2, ϕ1, ϕ2) only depends on ψ := ϕ1 − ϕ2. Define Cl(r1, r2) = 2π(2l + 1) π C(r1, r2, ψ)Pl(cos ψ) sin ψdψ =

l

  • m=−l

Alm(r1)Alm(r2). Cl(r1, r2) has rank at most 2l + 1. When the volume is symmetric, some of Alm(r) must vanish so that the rank of Cl(r1, r2) degenerates.

Amit Singer (Princeton University) July 2014 26 / 29

slide-27
SLIDE 27

Outline of the Symmetry Detection Algorithm

1 Compute the steerable PCA of the noisy projections 2 Compute the l-th order radial correlation matrix Cl(r1, r2). 3 Compute the rank of Cl(r1, r2), l = 1, 2, . . ., which encodes the

symmetry group G (spectral signature)

Analytical rank of Cl(r1, r2)

G\l 1 2 3 4 5 C1 3 5 7 9 11 C2 1 3 3 5 5 C4 1 1 1 3 3 D2 2 1 3 2

Amit Singer (Princeton University) July 2014 27 / 29

slide-28
SLIDE 28

Numerical Results on Simulated Data

The bar plot of eigenvalues of Cl, l = 1, 2, 3 by spherical harmonic expansion

5 10 15 2 4 6 8 10 12 14 16 18 x 10

−3

l=1 5 10 15 1 2 3 4 5 6 7 8 x 10

−3

l=2 5 10 15 2 4 6 8 10 12 14 16 x 10

−3

l=3 5 10 15 2 4 6 8 10 12 14 16 x 10

−3

5 10 15 0.005 0.01 0.015 0.02 0.025 5 10 15 0.005 0.01 0.015 0.02

from n projections

5 10 15 0.005 0.01 0.015 0.02 l=1 5 10 15 1 2 3 4 5 6 7 8 x 10

−3

l=2 5 10 15 2 4 6 8 10 12 14 16 x 10

−3

l=3 5 10 15 2 4 6 8 10 12 14 16x 10

−3

5 10 15 0.005 0.01 0.015 0.02 5 10 15 0.005 0.01 0.015 0.02

(Upper) no symmetry. (Lower) C2 symmetry. Both volumes are mixed Gaussian phantom. n = 1000 projections, SNR = 1/4.

Amit Singer (Princeton University) July 2014 28 / 29

slide-29
SLIDE 29

References

  • Z. Zhao, A. Singer, “Rotationally Invariant Image Representation for Viewing Direction

Classification in Cryo-EM”, Journal of Structural Biology, 186 (1), pp. 153–166 (2014).

  • Z. Zhao, A. Singer, “Fourier-Bessel Rotational Invariant Eigenimages”, The Journal of the

Optical Society of America A, 30 (5), pp. 871–877 (2013).

  • A. S. Bandeira, A. Singer, D. A. Spielman, “A Cheeger inequality for the graph

connection Laplacian”, SIAM Journal on Matrix Analysis and Applications, 34 (4), pp. 1611–1630 (2013).

  • A. Singer, H.-T. Wu, “Vector Diffusion Maps and the Connection Laplacian”,

Communications on Pure and Applied Mathematics, 65 (8), pp. 1067–1144 (2012).

  • A. Singer, H.-T. Wu, “Orientability and Diffusion Maps”, Applied and Computational

Harmonic Analysis, 31 (1), pp. 44–58 (2011).

  • A. Singer, Z. Zhao, Y. Shkolnisky, R. Hadani, “Viewing Angle Classification of

Cryo-Electron Microscopy Images using Eigenvectors”, SIAM Journal on Imaging Sciences, 4 (2), pp. 723–759 (2011).

  • R. Hadani, A. Singer, “Representation theoretic patterns in three dimensional

Cryo-Electron Microscopy II – The class averaging problem”, Foundations of Computational Mathematics, 11 (5), pp. 589–616 (2011).

Amit Singer (Princeton University) July 2014 29 / 29