Three-dimensional structure determination of molecules without - - PowerPoint PPT Presentation

three dimensional structure determination of molecules
SMART_READER_LITE
LIVE PREVIEW

Three-dimensional structure determination of molecules without - - PowerPoint PPT Presentation

Three-dimensional structure determination of molecules without crystallization: from electron microscopy to semidefinite programming Amit Singer Princeton University, Department of Mathematics and PACM February 13, 2014 Amit Singer (Princeton


slide-1
SLIDE 1

Three-dimensional structure determination of molecules without crystallization: from electron microscopy to semidefinite programming

Amit Singer

Princeton University, Department of Mathematics and PACM

February 13, 2014

Amit Singer (Princeton University) February 2014 1 / 29

slide-2
SLIDE 2

Single Particle Cryo-Electron Microscopy

Drawing of the imaging process:

Amit Singer (Princeton University) February 2014 2 / 29

slide-3
SLIDE 3

Single Particle Cryo-Electron Microscopy: Model

Projection Ii Molecule φ Electron source Ri =   | | | R1

i

R2

i

R3

i

| | |   ∈ SO(3)

Projection images Ii(x, y) = ∞

−∞ φ(xR1 i + yR2 i + zR3 i ) dz + “noise”.

φ : R3 → R is the electric potential of the molecule. Cryo-EM problem: Find φ and R1, . . . , Rn given I1, . . . , In.

Amit Singer (Princeton University) February 2014 3 / 29

slide-4
SLIDE 4

Toy Example

Amit Singer (Princeton University) February 2014 4 / 29

slide-5
SLIDE 5
  • E. coli 50S ribosomal subunit: sample images

Fred Sigworth, Yale Medical School

Movie by Lanhui Wang and Zhizhen (Jane) Zhao

Amit Singer (Princeton University) February 2014 5 / 29

slide-6
SLIDE 6

Algorithmic Pipeline

Particle Picking: manual, automatic or experimental image segmentation. Class Averaging: classify images with similar viewing directions, register and average to improve their signal-to-noise ratio (SNR). Orientation Estimation: S, Shkolnisky, SIIMS 2011. Bandeira, Charikar, S, Zhu, ITCS 2014. Three-dimensional Reconstruction: a 3D volume is generated by a tomographic inversion algorithm. Iterative Refinement Assumptions for today’s talk: Trivial point-group symmetry Homogeneity: no structural variability

Amit Singer (Princeton University) February 2014 6 / 29

slide-7
SLIDE 7

Orientation Estimation: Fourier projection-slice theorem

Projection Ii Projection Ij ˆ Ii ˆ Ij 3D Fourier space 3D Fourier space

(xij, yij) (xji, yji) Ricij cij = (xij, yij, 0)T Ricij = Rjcji

Amit Singer (Princeton University) February 2014 7 / 29

slide-8
SLIDE 8

Angular Reconstitution (Van Heel 1987, Vainshtein and Goncharov 1986)

Amit Singer (Princeton University) February 2014 8 / 29

slide-9
SLIDE 9

Experiments with simulated noisy projections

Each projection is 129x129 pixels. SNR = Var(Signal) Var(Noise) ,

(a) Clean (b) SNR=20 (c) SNR=2−1 (d) SNR=2−2 (e) SNR=2−3 (f) SNR=2−4 (g) SNR=2−5 (h) SNR=2−6 (i) SNR=2−7 (j) SNR=2−8

Amit Singer (Princeton University) February 2014 9 / 29

slide-10
SLIDE 10

Fraction of correctly identified common lines and the SNR

Define common line as being correctly identified if both radial lines deviate by no more than 10◦ from true directions.

log2(SNR) p 20 0.997 0.980

  • 1

0.956

  • 2

0.890

  • 3

0.764

  • 4

0.575

  • 5

0.345

  • 6

0.157

  • 7

0.064

  • 8

0.028

  • 9

0.019

Amit Singer (Princeton University) February 2014 10 / 29

slide-11
SLIDE 11

Least Squares Approach

Consider the unit directional vectors as three-dimensional vectors: cij = (xij, yij, 0)T , cji = (xji, yji, 0)T . Being the common-line of intersection, the mapping of cij by Ri must coincide with the mapping of cji by Rj: (Ri, Rj ∈ SO(3)) Ricij = Rjcji, for 1 ≤ i < j ≤ n. Least squares: min

R1,R2,...,Rn∈SO(3)

  • i=j

Ricij − Rjcji2 Search space is exponentially large and non-convex.

Amit Singer (Princeton University) February 2014 11 / 29

slide-12
SLIDE 12

Quadratic Optimization Under Orthogonality Constraints

Quadratic cost:

i=j Ricij − Rjcji2

Quadratic constraints: RT

i Ri = I3×3

(det(Ri) = +1 constraint is ignored) We approximate the solution using SDP and rounding. Related to:

Goemans-Williamson (1995) SDP relaxation for MAX-CUT PhaseLift (Candes et al 2012) Generalized Orthogonal Procrustes Problem (Nemirovski 2007) Non-commutative Grothendick Problem (Naor et al 2013)

“Robust” version – Least Unsquared Deviations (Wang, S, Wen 2013) min

R1,R2,...,Rn∈SO(3)

  • i=j

Ricij − Rjcji

Amit Singer (Princeton University) February 2014 12 / 29

slide-13
SLIDE 13

SDP Relaxation for the Common-Lines Problem

Least squares is equivalent to maximizing the sum of inner products: min

R1,R2,...,Rn∈SO(3)

  • i=j

Ricij−Rjcji2 ⇐ ⇒ max

R1,R2,...,Rn∈SO(3)

  • i=j

Ricij, Rjcji ⇐ ⇒ max

R1,R2,...,Rn∈SO(3)

  • i=j

Tr(cjicT

ij RT i Rj) ⇐

⇒ max

R1,R2,...,Rn∈SO(3) Tr(CG)

C is the 2n × 2n matrix (“the common lines matrix”) with Cij = ˜ cji˜ cT

ij =

xji yji xij yij

  • =

xjixij xjiyij yjixij yjiyij

  • ,

Cii =

  • G is the 2n × 2n Gram matrix G = ˜

RT ˜ R with Gij = ˜ RT

i ˜

Rj: G =      ˜ RT

1

˜ RT

2

. . . ˜ RT

n

     ˜ R1 ˜ R2 · · · ˜ Rn

  • =

     I2×2 ˜ RT

1 ˜

R2 · · · ˜ RT

1 ˜

Rn ˜ RT

2 ˜

R1 I2×2 · · · ˜ RT

2 ˜

Rn . . . . . . ... . . . ˜ RT

n ˜

R1 ˜ RT

n ˜

R2 · · · I2×2     

Amit Singer (Princeton University) February 2014 13 / 29

slide-14
SLIDE 14

SDP Relaxation and Rounding

max

R1,R2,...,Rn∈SO(3) Tr(CG)

SDP Relaxation: max

G∈R2n×2n Tr(CG)

s.t. G 0, Gii = I2×2, i = 1, 2, . . . , n. Missing is the non-convex constraint rank(G) = 3. Randomize a 2n × 3 orthogonal matrix Q using (careful) QR factorization of a 2n × 3 matrix with i.i.d standard Gaussian entries Compute Cholesky decomposition G = YY T Round using SVD: (YQ)i = UiΣiV T

i

= ⇒ ˜ RT

i

= UiV T

i .

Use the cross product to find RT

i .

Loss of handedness.

Amit Singer (Princeton University) February 2014 14 / 29

slide-15
SLIDE 15

Spectral Relaxation for Uniformly Distributed Rotations

  • |

| R1

i

R2

i

| |

  • =

x1

i

x2

i

y 1

i

y 2

i

z1

i

z2

i

  • ,

i = 1, . . . , n. Define 3 vectors of length 2n x =

  • x1

1

x2

1

x1

2

x2

2

· · · x1

n

x2

n

T y =

  • y 1

1

y 2

1

y 1

2

y 2

2

· · · y 1

n

y 2

n

T z =

  • z1

1

z2

1

z1

2

z2

2

· · · z1

n

z2

n

T Rewrite the least squares objective function as max

R1,...,Rn∈SO(3)

  • i=j

Ricij, Rjcji = max

R1,...,Rn∈SO(3) xT Cx + y TCy + zTCz

By symmetry, if rotations are uniformly distributed over SO(3), then the top eigenvalue of C has multiplicity 3 and corresponding eigenvectors are x, y, z from which we recover R1, R2, . . . , Rn!

Amit Singer (Princeton University) February 2014 15 / 29

slide-16
SLIDE 16

Spectrum of C

Numerical simulation with n = 1000 rotations sampled from the Haar measure; no noise. Bar plot of positive (left) and negative (right) eigenvalues of C:

10 20 30 40 50 60 100 200 300 400 500 600 λ 10 20 30 40 50 60 20 40 60 80 100 120 140 160 180 −λ

Eigenvalues: λl ≈ n (−1)l+1

l(l+1) ,

l = 1, 2, 3, . . .. (1

2, − 1 6, 1 12, . . .)

Multiplicities: 2l + 1. Two basic questions:

1

Why this spectrum? Answer: Representation Theory of SO(3) (Hadani, S, 2011)

2

Is it stable to noise? Answer: Yes, due to random matrix theory.

Amit Singer (Princeton University) February 2014 16 / 29

slide-17
SLIDE 17

Probabilistic Model and Wigner’s Semi-Circle Law

Simplistic Model: every common line is detected correctly with probability p, independently of all other common-lines, and with probability 1 − p the common lines are falsely detected and are uniformly distributed over the unit circle. Let C clean be the matrix C when all common-lines are detected correctly (p = 1). The expected value of the noisy matrix C is E[C] = pC clean, as the contribution of the falsely detected common lines to the expected value vanishes. Decompose C as C = pC clean + W , where W is a 2n × 2n zero-mean random matrix. The eigenvalues of W are distributed according to Wigner’s semi-circle law whose support, up to O(p) and finite sample fluctuations, is [− √ 2n, √ 2n].

Amit Singer (Princeton University) February 2014 17 / 29

slide-18
SLIDE 18

Threshold probability

Sufficient condition for top three eigenvalues to be pushed away from the semi-circle and no other eigenvalue crossings: (rank-1 and finite rank deformed Wigner matrices, F¨ uredi and Koml´

  • s 1981, F´

eral and P´ ech´ e 2007, ...) p∆(C clean) > 1 2λ1(W ) Spectral gap ∆(C clean) and spectral norm λ1(W ) are given by ∆(C clean) ≈ (1 2 − 1 12)n and λ1(W ) ≈ √ 2n. Threshold probability pc = 5 √ 2 6√n.

Amit Singer (Princeton University) February 2014 18 / 29

slide-19
SLIDE 19

Numerical Spectra of C, n = 1000

−200 200 400 600 200 400 600 800 λ

(a) SNR=20

−200 200 400 600 200 400 600 800 λ

(b) SNR=2−1

−200 200 400 600 100 200 300 400 λ

(c) SNR=2−2

−200 200 400 600 50 100 150 200 250 λ

(d) SNR=2−3

−200 200 400 600 50 100 150 200 λ

(e) SNR=2−4

−100 100 200 300 400 20 40 60 80 100 λ

(f) SNR=2−5

−100 100 200 300 20 40 60 λ

(g) SNR=2−6

−50 50 100 150 200 10 20 30 40 λ

(h) SNR=2−7

−50 50 100 150 5 10 15 20 25 λ

(i) SNR=2−8

Amit Singer (Princeton University) February 2014 19 / 29

slide-20
SLIDE 20

Estimation Error

True rotations: R1, . . . , Rn. Estimated rotations: ˆ R1, . . . , ˆ Rn. Registration: ˆ O = argmin

O∈SO(3) N

  • i=1

Ri − O ˆ Ri2

F

Mean squared error: MSE = 1 N

N

  • i=1

Ri − ˆ O ˆ Ri2

F

Amit Singer (Princeton University) February 2014 20 / 29

slide-21
SLIDE 21

MSE for n = 1000

SNR p MSE 2−1 0.951 0.0182 2−2 0.890 0.0224 2−3 0.761 0.0361 2−4 0.564 0.0737 2−5 0.342 0.2169 2−6 0.168 1.8011 2−7 0.072 2.5244 2−8 0.032 3.5196 Model fails at low SNR. Why? Wigner model is too simplistic – cannot have n2 independent random variables from just n images. Cij = K(Pi, Pj), “random kernel matrix” (Koltchinskii and Gin´ e 2000, El-Karoui 2010). Kernel is discontinuous (Cheng, S, 2013)

Amit Singer (Princeton University) February 2014 21 / 29

slide-22
SLIDE 22

Maximum Likelihood Solution using SDP

Main idea: Lift SO(3) to Sym(S2) Suppose x1, x2, . . . , xL ∈ S2 are “evenly” distributed points over the sphere (e.g., a spherical t-design). To each R ∈ SO(3) we can attach a permutation π ∈ SL via the group action and the assignment/Hungarian algorithm (this does not need to be constructed implicitly). Notice: We are discretizing S2, not SO(3) (substantial gain in computational complexity) We will see that the likelihood function is linear in the PSD matrix that encodes the relative permutations, and that SO(3) implies further linear constraints.

Amit Singer (Princeton University) February 2014 22 / 29

slide-23
SLIDE 23

Convex Relaxation of Permutations arising from Rotations

The convex hull of permutation matrices are the doubly stochastic matrices (Birkhoff-von Neumann polytope): Π ∈ RL×L, Π ≥ 0, 1TΠ = 1T , Π1 = 1 Rotation by an element of SO(3) should “map nearby-points to nearby-points”. More precisely, SO(3) preserves inner products: Xij = xi, xj, Xπ(i),π(j)

ǫ

= Xij ΠXΠT

ǫ

= X = ⇒ ΠX

ǫ

= XΠ

Amit Singer (Princeton University) February 2014 23 / 29

slide-24
SLIDE 24

Convex Relaxation of Cycle Consistency: SDP

Let G be a block-matrix of size n × n with Gij ∈ RL×L. We want Gij = ΠiΠT

j .

G is PSD, Gii = IL×L, and rank(G) = L (the rank constraint is dropped). Gij ∈ RL×L, Gij ≥ 0, G1 = 1 GijX

ǫ

= XGij

Amit Singer (Princeton University) February 2014 24 / 29

slide-25
SLIDE 25

Maximum Likelihood: Linear Objective Function

The common line depends on RiRT

j .

Likelihood function is of the form

  • i=j

fij(RiRT

j )

Nonlinear in RiRT

j , but linear in G.

Proof by picture.

Amit Singer (Princeton University) February 2014 25 / 29

slide-26
SLIDE 26

Exact Recovery

Experience with the multireference alignment problem: The solution

  • f the SDP has the desired rank up to a certain level of noise (w.h.p).

In other words, even though the search-space is exponentially large, we solve ML in polynomial time. This is a viable alternative to heuristic methods such as EM and alternating minimization. Can be used in a variety of problems where the objective function is a sum of pairwise interactions. Need better theoretical understanding for the phase transition behavior and conditions for exact recovery.

Amit Singer (Princeton University) February 2014 26 / 29

slide-27
SLIDE 27

Ongoing Research in cryo-EM and Related Applications

Ab-initio reconstruction without class averaging: scaling the SDP to larger n Heterogeneity Translations Contrast transfer function of the microscope, different defocus groups Molecules with symmetries Beam induced motion and motion correction XFEL (X-ray free electron lasers) Structure from motion (computer vision)

Amit Singer (Princeton University) February 2014 27 / 29

slide-28
SLIDE 28

References

  • A. Singer, Y. Shkolnisky, “Three-Dimensional Structure Determination from

Common Lines in Cryo-EM by Eigenvectors and Semidefinite Programming”, SIAM Journal on Imaging Sciences, 4 (2), pp. 543–572 (2011).

  • L. Wang, A. Singer, Z. Wen, “Orientation Determination of Cryo-EM

images using Least Unsquared Deviations”, SIAM Journal on Imaging Sciences, 6(4), pp. 2450–2483 (2013).

  • A. S. Bandeira, M. Charikar, A. Singer, A. Zhu, “Multireference Alignment

using Semidefinite Programming”, 5th Innovations in Theoretical Computer Science (ITCS 2014).

Amit Singer (Princeton University) February 2014 28 / 29

slide-29
SLIDE 29

Thank You!

Funding: NIH/NIGMS R01GM090200 AFOSR FA9550-12-1-0317 Simons Foundation LTR DTD 06-05-2012

Amit Singer (Princeton University) February 2014 29 / 29