

SLIDE 1

Orientation Assignment in Cryo-EM

Amit Singer

Princeton University Department of Mathematics and Program in Applied and Computational Mathematics

July 24, 2014

Amit Singer (Princeton University) July 2014 1 / 30

SLIDE 2

Single Particle Reconstruction using cryo-EM

[Figure: schematic drawing of the imaging process; the cryo-EM problem.]

SLIDE 3

Orientation Estimation: Fourier projection-slice theorem

[Figure: electron source, molecule φ, and projection I_i with rotation R_i ∈ SO(3); the Fourier transforms Î_i, Î_j are central slices of the 3D Fourier transform of φ and intersect along a common line.]

The common line has coordinates (x_ij, y_ij) in image i and (x_ji, y_ji) in image j. Setting c_ij = (x_ij, y_ij, 0)^T, the common line equation reads R_i c_ij = R_j c_ji.

Cryo-EM inverse problem: find φ (and R_1, ..., R_n) given I_1, ..., I_n.
- n = 3: Vainshtein and Goncharov 1986; van Heel 1987
- n > 3: Singer, Shkolnisky, SIAM Imaging 2011
- n > 3: Bandeira, Charikar, Chen, Singer, in preparation
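The common line equation can be checked numerically. A minimal numpy sketch, assuming the cross-product construction of c_ij that appears later in the talk (slide 15) and the column convention R_i = [R_i^1 R_i^2 R_i^3]; the helper names are mine:

```python
import numpy as np

rng = np.random.default_rng(0)

def random_rotation(rng):
    # QR of a Gaussian matrix gives a Haar-distributed orthogonal matrix;
    # fix the column signs and the determinant to land in SO(3).
    Q, R = np.linalg.qr(rng.standard_normal((3, 3)))
    Q = Q * np.sign(np.diag(R))
    if np.linalg.det(Q) < 0:
        Q[:, 2] *= -1
    return Q

def common_line_pair(Ri, Rj):
    # The two Fourier slices intersect along the unit vector in the
    # direction of (third column of Ri) x (third column of Rj); expressing
    # it in each image's own frame gives c_ij and c_ji, whose third
    # coordinates vanish.
    v = np.cross(Ri[:, 2], Rj[:, 2])
    v /= np.linalg.norm(v)
    return Ri.T @ v, Rj.T @ v

Ri, Rj = random_rotation(rng), random_rotation(rng)
cij, cji = common_line_pair(Ri, Rj)
# cij[2] == cji[2] == 0 and Ri @ cij == Rj @ cji up to rounding.
```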

SLIDE 4

Assumptions for today’s talk

- Trivial point-group symmetry
- Homogeneity

SLIDE 5

Experiments with simulated noisy projections

Each projection is 129 × 129 pixels. SNR = Var(Signal) / Var(Noise).

[Figure: (a) clean projection; (b)–(j) projections at SNR = 2^0, 2^−1, ..., 2^−8.]
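As a sketch (my noise routine, not the authors' simulation code), white Gaussian noise at a prescribed SNR can be added like this:

```python
import numpy as np

def add_noise(image, snr, rng):
    # SNR = Var(signal) / Var(noise)  =>  noise std = sqrt(Var(signal)/snr)
    sigma = np.sqrt(image.var() / snr)
    return image + sigma * rng.standard_normal(image.shape)

rng = np.random.default_rng(1)
clean = rng.random((129, 129))            # stand-in for a clean projection
noisy = add_noise(clean, 2.0 ** -3, rng)

# Empirical SNR should come out close to the requested 2^-3.
empirical = clean.var() / (noisy - clean).var()
```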

SLIDE 6

Fraction of correctly identified common lines and the SNR

Define a common line as correctly identified if both radial lines deviate by no more than 10° from the true directions. The fraction p of correctly identified common lines increases when the images are first denoised by PCA.

log2(SNR)   p
  clean     0.997
    0       0.980
   -1       0.956
   -2       0.890
   -3       0.764
   -4       0.575
   -5       0.345
   -6       0.157
   -7       0.064
   -8       0.028
   -9       0.019

SLIDE 7

Least Squares Approach

[Figure: as on slide 3, the Fourier slices Î_i and Î_j intersect along a common line; c_ij = (x_ij, y_ij, 0)^T and R_i c_ij = R_j c_ji.]

Least squares:

min_{R_1, R_2, ..., R_n ∈ SO(3)} Σ_{i≠j} ‖R_i c_ij − R_j c_ji‖²

The search space is exponentially large and non-convex.

SLIDE 8

Quadratic Optimization Under Orthogonality Constraints

Quadratic cost: Σ_{i≠j} ‖R_i c_ij − R_j c_ji‖²

Quadratic constraints: R_i^T R_i = I_{3×3} (the det(R_i) = +1 constraint is ignored).

We approximate the solution using SDP and rounding (Singer, Shkolnisky 2011). Related to:

- MAX-CUT (Goemans and Williamson 1995)
- Generalized Orthogonal Procrustes Problem (Nemirovski 2007)
- PhaseLift (Candès, Strohmer, Voroninski 2013)
- Non-commutative Grothendieck Problem (Naor, Regev, Vidick 2013)

"Robust" version, Least Unsquared Deviations (Wang, Singer, Wen 2013):

min_{R_1, R_2, ..., R_n ∈ SO(3)} Σ_{i≠j} ‖R_i c_ij − R_j c_ji‖

SLIDE 9

Detour: MAX-CUT

MAX-CUT: Given a weighted undirected graph with n vertices and non-negative weights wij ≥ 0, split the vertices into two sets such that the total weight of edges across the cut is maximized.

[Figure: a weighted graph on vertices {1, ..., 5} with edge weights w12, w14, w24, w25, w34, w35, w45.]

CUT({1, 2, 3}, {4, 5}) = w14 + w24 + w25 + w34 + w35

SLIDE 10

The MAX-CUT Approximation of Goemans & Williamson

OPT = max_{y_1, ..., y_n ∈ {−1, +1}} (1/4) Σ_{i≠j} w_ij (1 − y_i y_j)

A combinatorial problem that is known to be NP-complete.

SDP relaxation of Goemans–Williamson (1995): the n × n Gram matrix G = y y^T (G_ij = y_i y_j) is PSD (G ⪰ 0) with G_ii = 1 and rank(G) = 1. Dropping the rank constraint gives

max_{G ∈ R^{n×n}} (1/4) Σ_{i,j} w_ij − (1/4) Tr(GW)  s.t.  G ⪰ 0,  G_ii = 1,  i = 1, 2, ..., n.

Rounding: draw a random vector z with i.i.d. standard Gaussian entries z_i ∼ N(0, 1), compute Y z, where G = Y Y^T is the Cholesky decomposition of G, and round y_i = sign[(Y z)_i].

Theorem (Goemans–Williamson): E[CUT_GW] > 0.87 · OPT.
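A numpy sketch of the hyperplane-rounding step. The planted graph and the stand-in for the SDP solution are mine; a real implementation would obtain G from an SDP solver rather than planting it:

```python
import numpy as np

rng = np.random.default_rng(3)

def cut_value(W, y):
    # (1/4) sum_{i,j} w_ij (1 - y_i y_j): total weight across the cut.
    return 0.25 * np.sum(W * (1 - np.outer(y, y)))

# Planted two-community graph: heavier weights across the cut.
n = 20
y_planted = np.sign(rng.standard_normal(n))
W = 0.1 * rng.random((n, n)) + (np.outer(y_planted, y_planted) < 0)
W = np.triu(W, 1)
W = W + W.T                          # symmetric, zero diagonal

# Stand-in for the SDP solution: G = y y^T is feasible (PSD, unit diagonal).
G = np.outer(y_planted, y_planted).astype(float)

# Goemans-Williamson rounding: factor G = Y Y^T (tiny ridge makes the
# Cholesky well defined), draw a Gaussian z, take signs of the projections.
Y = np.linalg.cholesky(G + 1e-9 * np.eye(n))
z = rng.standard_normal(n)
y_rounded = np.sign(Y @ z)
```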

SLIDE 11

SDP Relaxation for the Common-Lines Problem

Least squares is equivalent to maximizing the sum of inner products:

min_{R_1,...,R_n ∈ SO(3)} Σ_{i≠j} ‖R_i c_ij − R_j c_ji‖²  ⟺  max_{R_1,...,R_n ∈ SO(3)} Σ_{i≠j} ⟨R_i c_ij, R_j c_ji⟩  ⟺  max_{R_1,...,R_n ∈ SO(3)} Σ_{i≠j} Tr(c_ji c_ij^T R_i^T R_j)  ⟺  max_{R_1,...,R_n ∈ SO(3)} Tr(CG)

C is the 2n × 2n "common lines matrix", built from 2 × 2 blocks

C_ij = c̃_ji c̃_ij^T = (x_ji, y_ji)^T (x_ij, y_ij) = [x_ji x_ij  x_ji y_ij; y_ji x_ij  y_ji y_ij],    C_ii = 0_{2×2},

where c̃_ij = (x_ij, y_ij)^T. G is the 2n × 2n Gram matrix G = R̃^T R̃ with G_ij = R̃_i^T R̃_j, where R̃ = [R̃_1 R̃_2 ⋯ R̃_n] and R̃_i is the 3 × 2 matrix formed by the first two columns of R_i:

G = [I_{2×2}  R̃_1^T R̃_2  ⋯  R̃_1^T R̃_n; R̃_2^T R̃_1  I_{2×2}  ⋯  R̃_2^T R̃_n; ⋮  ⋮  ⋱  ⋮; R̃_n^T R̃_1  R̃_n^T R̃_2  ⋯  I_{2×2}]

SLIDE 12

SDP Relaxation and Rounding

max_{R_1,...,R_n ∈ SO(3)} Tr(CG)

SDP relaxation:

max_{G ∈ R^{2n×2n}} Tr(CG)  s.t.  G ⪰ 0,  G_ii = I_{2×2},  i = 1, 2, ..., n.

Missing is the non-convex constraint rank(G) = 3.

Rounding:
- Draw a 2n × 3 matrix with i.i.d. standard Gaussian entries and obtain a 2n × 3 orthonormal matrix Q by (careful) QR factorization.
- Compute the Cholesky decomposition G = Y Y^T.
- Round using the SVD: (YQ)_i = U_i Σ_i V_i^T ⟹ R̃_i^T = U_i V_i^T.
- Use the cross product of the first two columns to complete R_i^T.
- Loss of handedness: the global spatial reflection is not determined.
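A minimal numpy sketch of the rounding procedure on a noiseless G, constructed here directly from planted rotations rather than returned by an SDP solver. I use a rank-3 eigenfactor and a 3 × 3 orthogonal mix in place of the slide's Cholesky-plus-2n × 3 projection; up to a global rotation and a possible global reflection (the handedness ambiguity), the relative rotations are recovered:

```python
import numpy as np

rng = np.random.default_rng(4)

def random_rotation(rng):
    # Haar-distributed rotation via sign-fixed QR of a Gaussian matrix.
    Q, R = np.linalg.qr(rng.standard_normal((3, 3)))
    Q = Q * np.sign(np.diag(R))
    if np.linalg.det(Q) < 0:
        Q[:, 2] *= -1
    return Q

n = 5
rots = [random_rotation(rng) for _ in range(n)]
Rtil = np.hstack([R[:, :2] for R in rots])   # 3 x 2n, blocks R~_i
G = Rtil.T @ Rtil                            # 2n x 2n, rank 3, G_ii = I_2

# Rank-3 factor G = Y Y^T (eigendecomposition, since this exact G is
# rank deficient and has no Cholesky factor).
w, V = np.linalg.eigh(G)
Y = V[:, -3:] * np.sqrt(w[-3:])              # 2n x 3

# Random 3x3 orthogonal mix, then per-block SVD rounding: project each
# 2x3 block onto the nearest matrix with orthonormal rows, and complete
# the rotation with a cross product (third row).
Q, _ = np.linalg.qr(rng.standard_normal((3, 3)))
est = []
for i in range(n):
    U, _, Vt = np.linalg.svd(Y[2 * i:2 * i + 2] @ Q, full_matrices=False)
    top = U @ Vt
    est.append(np.vstack([top, np.cross(top[0], top[1])]))
```

Each `est[i]` approximates R_i^T up to one global orthogonal transformation; relative rotations est[i] est[j]^T match R_i^T R_j, possibly conjugated by diag(1, 1, −1) when the global factor is a reflection.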

SLIDE 13

Spectral Relaxation for Uniformly Distributed Rotations

Write the first two columns of R_i as

[R_i^1  R_i^2] = [x_i^1  x_i^2; y_i^1  y_i^2; z_i^1  z_i^2],    i = 1, ..., n.

Define 3 vectors of length 2n:

x = (x_1^1, x_1^2, x_2^1, x_2^2, ..., x_n^1, x_n^2)^T
y = (y_1^1, y_1^2, y_2^1, y_2^2, ..., y_n^1, y_n^2)^T
z = (z_1^1, z_1^2, z_2^1, z_2^2, ..., z_n^1, z_n^2)^T

Rewrite the least squares objective function as

max_{R_1,...,R_n ∈ SO(3)} Σ_{i≠j} ⟨R_i c_ij, R_j c_ji⟩ = max_{R_1,...,R_n ∈ SO(3)} x^T C x + y^T C y + z^T C z

By symmetry, if the rotations are uniformly distributed over SO(3), then the top eigenvalue of C has multiplicity 3 and the corresponding eigenvectors are x, y, z, from which we recover R_1, R_2, ..., R_n!
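A numpy sketch (construction mine, following the definitions above): build the clean common-lines matrix C for Haar-random rotations and inspect its top eigenvalues. The leading cluster has multiplicity 3 near n/2, matching the spectrum described on the next slide:

```python
import numpy as np

rng = np.random.default_rng(5)

def random_rotation(rng):
    # Haar-distributed rotation via sign-fixed QR of a Gaussian matrix.
    Q, R = np.linalg.qr(rng.standard_normal((3, 3)))
    Q = Q * np.sign(np.diag(R))
    if np.linalg.det(Q) < 0:
        Q[:, 2] *= -1
    return Q

n = 400
rots = [random_rotation(rng) for _ in range(n)]

# Assemble the symmetric 2n x 2n common-lines matrix from 2x2 blocks.
C = np.zeros((2 * n, 2 * n))
for i in range(n):
    for j in range(i + 1, n):
        v = np.cross(rots[i][:, 2], rots[j][:, 2])
        v /= np.linalg.norm(v)
        cij = (rots[i].T @ v)[:2]     # tilde c_ij
        cji = (rots[j].T @ v)[:2]     # tilde c_ji
        C[2 * i:2 * i + 2, 2 * j:2 * j + 2] = np.outer(cij, cji)
        C[2 * j:2 * j + 2, 2 * i:2 * i + 2] = np.outer(cji, cij)

evals = np.linalg.eigvalsh(C)         # ascending order
# evals[-3:] should cluster near n/2; the rest stay well below.
```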

SLIDE 14

Spectrum of C

Numerical simulation with n = 1000 rotations sampled from the Haar measure; no noise. Bar plot of positive (left) and negative (right) eigenvalues of C:

[Figure: bar plots of the positive (left) and negative (right) eigenvalues of C.]

Eigenvalues: λ_l ≈ n (−1)^{l+1} / (l(l+1)), l = 1, 2, 3, ... (λ_l/n = 1/2, −1/6, 1/12, ...). Multiplicities: 2l + 1.

Two basic questions:

1. Why this spectrum? Answer: representation theory of SO(3) (Hadani, Singer 2011).
2. Is it stable to noise? Answer: yes, due to random matrix theory.

SLIDE 15

Common Lines Kernel

Common line equation:

R_i c_ij = R_j c_ji = ± (R_i^3 × R_j^3) / ‖R_i^3 × R_j^3‖

c_ij = ± R_i^{−1} (R_i^3 × R_j^3) / ‖R_i^3 × R_j^3‖,    c_ji = ± R_j^{−1} (R_i^3 × R_j^3) / ‖R_i^3 × R_j^3‖

where R_i^3 denotes the third column of R_i. By basic properties of the cross product, C_ij is only a function of the ratio U_ij = R_i^{−1} R_j:

C_ij = [x_ij x_ji  x_ij y_ji; y_ij x_ji  y_ij y_ji] = (x_ij, y_ij)^T (x_ji, y_ji) = (P c_ij)(P c_ji)^T = P [I^3 × U_ij^3] [(U_ij^{−1})^3 × I^3]^T / ‖I^3 × U_ij^3‖² P^T,

where P = [1 0 0; 0 1 0] projects onto the first two coordinates and I^3 = (0, 0, 1)^T.

Common lines kernel: C_ij = k(R_i, R_j) = k̃(R_i^{−1} R_j) : R² → R².

SLIDE 16

Common Lines Operator

Common lines kernel: C_ij = k(R_i, R_j) = k̃(R_i^{−1} R_j) : R² → R².

Suppose f : SO(3) → R². Then

(1/n)(Cf)(R_i) = (1/n) Σ_{j=1}^n C_ij f(R_j) ≈ ∫_{SO(3)} k(R_i, R) f(R) dμ(R) = (C f)(R_i),

where C on the right denotes the common lines integral operator, a convolution over SO(3). Its eigenfunctions satisfy

(C f)(R_i) = ∫_{SO(3)} k̃(R_i^{−1} R) f(R) dμ(R) = λ f(R_i).

Representation theory of SO(3) explains the eigenvalues and eigenfunctions. The eigenfunctions are tangent vector fields on the 2-sphere (sections of the tangent bundle); they are the projected gradients of the scalar spherical harmonics.

SLIDE 17

Probabilistic Model and Wigner’s Semi-Circle Law

Simplistic model: every common line is detected correctly with probability p, independently of all other common lines; with probability 1 − p the common lines are falsely detected and are uniformly distributed over the unit circle.

Let C^clean be the matrix C when all common lines are detected correctly (p = 1). The expected value of the noisy matrix C is E[C] = p C^clean, as the contribution of the falsely detected common lines to the expected value vanishes.

Decompose C as C = p C^clean + W, where W is a 2n × 2n zero-mean random matrix. The eigenvalues of W are distributed according to Wigner's semicircle law, whose support, up to O(p) and finite-sample fluctuations, is [−√(2n), √(2n)].
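As a sanity check (my simulation, not the authors'), a Gaussian Wigner matrix with entry standard deviation 1/2, chosen so that the semicircle edge sits at ±√(2n), stands in for the actual perturbation W:

```python
import numpy as np

rng = np.random.default_rng(6)

n = 1000
N = 2 * n
A = rng.standard_normal((N, N))
W = (A + A.T) / (2 * np.sqrt(2))     # symmetric, off-diagonal std 1/2

evals = np.linalg.eigvalsh(W)
edge = np.sqrt(2 * n)                # predicted edge of the semicircle
largest = np.abs(evals).max()

# Bulk check: the semicircle density puts about 61% of the eigenvalues
# in the middle half of the support.
frac_middle = np.mean(np.abs(evals) < edge / 2)
```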

SLIDE 18

Threshold probability

Sufficient condition for the top three eigenvalues to be pushed away from the semicircle, with no other eigenvalue crossings (rank-one and finite-rank deformed Wigner matrices; Füredi and Komlós 1981, Féral and Péché 2007, ...):

p Δ(C^clean) > (1/2) λ_1(W)

The spectral gap Δ(C^clean) and the spectral norm λ_1(W) are given by Δ(C^clean) ≈ (1/2 − 1/12) n and λ_1(W) ≈ √(2n).

Threshold probability: p_c = 5√2 / (6√n).

SLIDE 19

Numerical Spectra of C, n = 1000

[Figure: spectra of C for n = 1000 at SNR = 2^0, 2^−1, ..., 2^−8 (panels a–i).]

SLIDE 20

Estimation Error

True rotations: R_1, ..., R_n. Estimated rotations: R̂_1, ..., R̂_n.

Registration: Ô = argmin_{O ∈ SO(3)} Σ_{i=1}^n ‖R_i − O R̂_i‖_F²

Mean squared error: MSE = (1/n) Σ_{i=1}^n ‖R_i − Ô R̂_i‖_F²
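The registration step is an orthogonal Procrustes problem over SO(3), solvable in closed form by an SVD; the planted setup below is mine:

```python
import numpy as np

rng = np.random.default_rng(7)

def random_rotation(rng):
    # Haar-distributed rotation via sign-fixed QR of a Gaussian matrix.
    Q, R = np.linalg.qr(rng.standard_normal((3, 3)))
    Q = Q * np.sign(np.diag(R))
    if np.linalg.det(Q) < 0:
        Q[:, 2] *= -1
    return Q

# Estimates that agree with the truth up to one global rotation.
n = 50
O_true = random_rotation(rng)
true = [random_rotation(rng) for _ in range(n)]
est = [O_true.T @ R for R in true]

# argmin_{O in SO(3)} sum_i ||R_i - O Rhat_i||_F^2 via the SVD of
# sum_i R_i Rhat_i^T, with a determinant correction to stay in SO(3).
M = sum(R @ Rh.T for R, Rh in zip(true, est))
U, _, Vt = np.linalg.svd(M)
O_hat = U @ np.diag([1.0, 1.0, np.linalg.det(U @ Vt)]) @ Vt

mse = np.mean([np.linalg.norm(R - O_hat @ Rh, "fro") ** 2
               for R, Rh in zip(true, est)])
```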

SLIDE 21

MSE for n = 1000

SNR     p      MSE
2^-1    0.951  0.0182
2^-2    0.890  0.0224
2^-3    0.761  0.0361
2^-4    0.564  0.0737
2^-5    0.342  0.2169
2^-6    0.168  1.8011
2^-7    0.072  2.5244
2^-8    0.032  3.5196

The model fails at low SNR. Why? The Wigner model is too simplistic: one cannot have n² independent random variables from just n images. C_ij = K(I_i, I_j) is a "random kernel matrix" (Koltchinskii and Giné 2000, El Karoui 2010), and the kernel is discontinuous (Cheng, Singer 2013).

SLIDE 22

From Common Lines to Maximum Likelihood

The images contain more information than that expressed by the common lines; common-line based algorithms can succeed only at "high" SNR.

(Quasi) maximum likelihood: we would like to try all possible rotations R_1, ..., R_n and choose the combination for which the agreement with the common lines observed in the images is maximal. This is computationally intractable: the search space is exponentially large.

SLIDE 23

Maximum Likelihood Solution using SDP

Main idea: lift SO(3) to Sym(S²).

Suppose ω_1, ω_2, ..., ω_L ∈ S² are "evenly" distributed points over the sphere (e.g., a spherical t-design). Then

g ∈ SO(3) ⟷ π ∈ S_L (→ by assignment, ← by Procrustes);

in practice no explicit construction is needed.

SLIDE 24

Convex Relaxation of Permutations arising from Rotations

The convex hull of the permutation matrices is the set of doubly stochastic matrices (the Birkhoff–von Neumann polytope):

Π ∈ R^{L×L},  Π ≥ 0,  1^T Π = 1^T,  Π 1 = 1

Rotation by an element of SO(3) should "map nearby points to nearby points". More precisely, SO(3) preserves inner products (below, =_ε denotes equality up to the discretization error ε):

Ω_ij = ⟨ω_i, ω_j⟩ =_ε ⟨ω_{π(i)}, ω_{π(j)}⟩ = Ω_{π(i),π(j)},  so  Π Ω Π^T =_ε Ω ⟹ Π Ω =_ε Ω Π

Axis of rotation (Euler's rotation theorem): Tr(Π) ≥_ε 2.

Notice: we are discretizing S², not SO(3) (a substantial gain in computational complexity).
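A toy check of these relations (octahedron vertices in place of a spherical t-design, my choice; for a rotation that maps the point set to itself the lifted Π is an exact permutation, so the ε-relations hold exactly):

```python
import numpy as np

# Octahedron vertices on S^2 and a 90-degree rotation about the z-axis,
# which permutes them.
omega = np.array([[1, 0, 0], [-1, 0, 0], [0, 1, 0],
                  [0, -1, 0], [0, 0, 1], [0, 0, -1]], dtype=float)
g = np.array([[0, -1, 0], [1, 0, 0], [0, 0, 1]], dtype=float)

L = len(omega)
Omega = omega @ omega.T                # Gram matrix of inner products

# Lift g to the permutation pi with omega_{pi(i)} = g omega_i.
Pi = np.zeros((L, L))
for i in range(L):
    j = int(np.argmin(np.linalg.norm(omega - omega[i] @ g.T, axis=1)))
    Pi[j, i] = 1.0

row_sums, col_sums = Pi.sum(axis=1), Pi.sum(axis=0)  # doubly stochastic
commutes = np.allclose(Pi @ Omega, Omega @ Pi)       # preserves inner products
trace = np.trace(Pi)                                 # fixed axis points: +-e_z
```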

SLIDE 25

Convex Relaxation of Cycle Consistency: SDP

Let G be an n × n block matrix with blocks G_ij ∈ R^{L×L}. We want G_ij = Π_i Π_j^T. Such a G is PSD with G_ii = I_{L×L} and rank(G) = L (the rank constraint is dropped).

Convex relaxation of the search space (putting it all together):

G ∈ R^{nL×nL},  G ⪰ 0,  G_ii = I_{L×L}   (cycle consistency)
G_ij ≥ 0,  G_ij 1 = 1,  1^T G_ij = 1^T   (doubly stochastic)
G_ij Ω =_ε Ω G_ij,  Tr(G_ij) ≥_ε 2   (SO(3))

Related work: Charikar, Makarychev, Makarychev, "Near-optimal algorithms for Unique Games" (STOC 2006).

SLIDE 26

Maximum Likelihood: Linear Objective Function

The common line depends on R_i R_j^T. The log-likelihood function is of the form

Σ_{i≠j} f_ij(R_i R_j^T)

which is nonlinear in R_i R_j^T, but linear in G. Proof by picture.

SLIDE 27

Linear Objective: Proof by picture

SLIDE 28

Tightness of the SDP

[Figure: fraction of trials (among 100) on which rank recovery was observed, as a function of the noise level σ and probability p. Left: Generalized Orthogonal Procrustes. Right: multi-reference alignment (registration).]

The solution of the SDP has the desired rank up to a certain level of noise (w.h.p.). In other words, even though the search space is exponentially large, we typically find the MLE in polynomial time. This is a viable alternative to heuristic methods such as EM and alternating minimization, and the SDP provides a certificate whenever it finds the MLE.

SLIDE 29

Are we there yet?

The SDP approach can be used in a variety of problems where the objective function is a sum of pairwise interactions.

Needed: better theoretical understanding of the phase-transition behavior and of conditions for tightness of the SDP, and better computational tools for solving large-scale SDPs (we use ADMM).

SLIDE 30

References

- A. Singer, Y. Shkolnisky, "Three-Dimensional Structure Determination from Common Lines in Cryo-EM by Eigenvectors and Semidefinite Programming", SIAM Journal on Imaging Sciences, 4(2), pp. 543–572 (2011).
- R. Hadani, A. Singer, "Representation theoretic patterns in three dimensional Cryo-Electron Microscopy I – The intrinsic reconstitution algorithm", Annals of Mathematics, 174(2), pp. 1219–1241 (2011).
- L. Wang, A. Singer, Z. Wen, "Orientation Determination of Cryo-EM Images Using Least Unsquared Deviations", SIAM Journal on Imaging Sciences, 6(4), pp. 2450–2483 (2013).
- A. S. Bandeira, M. Charikar, A. Singer, A. Zhu, "Multireference Alignment Using Semidefinite Programming", 5th Innovations in Theoretical Computer Science (ITCS 2014).
- A. S. Bandeira, Y. Khoo, A. Singer, "Open problem: Tightness of maximum likelihood semidefinite relaxations", COLT 2014. Also available at http://arxiv.org/abs/1404.2655.
