Orientation Assignment in Cryo-EM
Amit Singer, Princeton University


  1. Orientation Assignment in Cryo-EM. Amit Singer, Princeton University, Department of Mathematics and Program in Applied and Computational Mathematics. July 24, 2014.

  2. Single Particle Reconstruction using cryo-EM. Schematic drawing of the imaging process and of the cryo-EM problem (figures).

  3. Orientation Estimation: Fourier projection-slice theorem. By the projection-slice theorem, the 2D Fourier transform Î_i of projection I_i is a planar central slice of the 3D Fourier transform of the molecule φ, with orientation given by the unknown rotation R_i ∈ SO(3). Any two slices intersect in a common line: writing c_ij = (x_ij, y_ij, 0)^T for the common line in the local coordinates of image i, the slices satisfy R_i c_ij = R_j c_ji. Cryo-EM inverse problem: find φ (and R_1, …, R_n) given I_1, …, I_n. n = 3: Vainshtein and Goncharov 1986, van Heel 1987. n > 3: S, Shkolnisky, SIAM Imaging 2011. n > 3: Bandeira, Charikar, Chen, S, in preparation.

  4. Assumptions for today's talk: trivial point-group symmetry; homogeneity (all particles are copies of the same molecule).

  5. Experiments with simulated noisy projections. Each projection is 129 × 129 pixels, with SNR = Var(Signal)/Var(Noise). Panels (figure): (a) clean projection; (b)–(j) noisy versions at SNR = 2^0, 2^−1, 2^−2, …, 2^−8.
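A minimal sketch of how such noisy projections can be simulated (the function name and seeding are illustrative, not from the talk):

```python
import numpy as np

def add_noise(projection, snr, seed=0):
    """Add white Gaussian noise scaled so that Var(signal)/Var(noise) = snr."""
    rng = np.random.default_rng(seed)
    sigma = np.sqrt(projection.var() / snr)
    return projection + rng.normal(0.0, sigma, projection.shape)

# e.g. add_noise(clean, 2.0**-3) mimics panel (e) with SNR = 2^-3.
```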

  6. Fraction of correctly identified common lines and the SNR. A common line is defined as correctly identified if both radial lines deviate by no more than 10° from the true directions. The fraction p of correctly identified common lines is increased by PCA:

     log2(SNR)    p
     20           0.997
     0            0.980
     −1           0.956
     −2           0.890
     −3           0.764
     −4           0.575
     −5           0.345
     −6           0.157
     −7           0.064
     −8           0.028
     −9           0.019

  7. Least Squares Approach. Using the common-line relation R_i c_ij = R_j c_ji, estimate all rotations simultaneously by least squares:

     min_{R_1, R_2, …, R_n ∈ SO(3)} ∑_{i≠j} ‖R_i c_ij − R_j c_ji‖²

  The search space is exponentially large and non-convex.
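To make the objective concrete, here is a minimal numpy sketch that evaluates the least-squares cost for candidate rotations. The (n, n, 2) array c holding the detected common-line coordinates, with c[i, j] = (x_ij, y_ij), is an illustrative convention rather than anything specified in the talk:

```python
import numpy as np

def ls_cost(rotations, c):
    """Sum of ||R_i c_ij - R_j c_ji||^2 over all ordered pairs i != j."""
    n = len(rotations)
    total = 0.0
    for i in range(n):
        for j in range(n):
            if i != j:
                cij = np.append(c[i, j], 0.0)  # c_ij = (x_ij, y_ij, 0)^T
                cji = np.append(c[j, i], 0.0)
                total += np.sum((rotations[i] @ cij - rotations[j] @ cji) ** 2)
    return total
```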

  8. Quadratic Optimization Under Orthogonality Constraints. Quadratic cost: ∑_{i≠j} ‖R_i c_ij − R_j c_ji‖². Quadratic constraints: R_i^T R_i = I_{3×3} (the det(R_i) = +1 constraint is ignored). We approximate the solution using SDP and rounding (S, Shkolnisky 2011). Related to: MAX-CUT (Goemans and Williamson 1995); Generalized Orthogonal Procrustes Problem (Nemirovski 2007); PhaseLift (Candès, Strohmer, Voroninski 2013); Non-commutative Grothendieck Problem (Naor, Regev, Vidick 2013). A 'robust' version, Least Unsquared Deviations (Wang, S, Wen 2013):

     min_{R_1, R_2, …, R_n ∈ SO(3)} ∑_{i≠j} ‖R_i c_ij − R_j c_ji‖

  9. Detour: MAX-CUT. Given a weighted undirected graph with n vertices and non-negative weights w_ij ≥ 0, split the vertices into two sets such that the total weight of the edges across the cut is maximized. Example (figure, 5 vertices): CUT({1, 2, 3}, {4, 5}) = w_14 + w_24 + w_25 + w_34 + w_35.

  10. The MAX-CUT Approximation of Goemans and Williamson.

     OPT = max_{y_1, y_2, …, y_n ∈ {−1, +1}} (1/4) ∑_{i≠j} w_ij (1 − y_i y_j)

  A combinatorial problem that is known to be NP-complete. SDP relaxation of Goemans-Williamson (1995): the n × n Gram matrix G = yy^T (G_ij = y_i y_j) is PSD (G ⪰ 0), with G_ii = 1 and rank(G) = 1. Dropping the rank constraint yields the SDP

     max_{G ∈ R^{n×n}} (1/4) ∑_{i,j} w_ij − (1/4) Tr(GW)   s.t. G ⪰ 0, G_ii = 1, i = 1, 2, …, n.

  Rounding: draw a random vector z with i.i.d. standard Gaussian entries z_i ∼ N(0, 1) and compute Yz, where G = YY^T is the Cholesky decomposition of G. Round y_i = sign[(Yz)_i]. Theorem (G-W): E[CUT_GW] > 0.87 · OPT.
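For concreteness, a minimal sketch of the Goemans-Williamson relaxation and rounding; the use of the cvxpy modelling layer (and its bundled SDP solver) is an assumption for illustration, not part of the talk:

```python
import numpy as np
import cvxpy as cp

def max_cut_gw(W, seed=0):
    """Approximate MAX-CUT for a symmetric weight matrix W with zero diagonal."""
    n = W.shape[0]
    G = cp.Variable((n, n), PSD=True)                # relaxed Gram matrix
    problem = cp.Problem(cp.Maximize(0.25 * cp.sum(cp.multiply(W, 1 - G))),
                         [cp.diag(G) == 1])
    problem.solve()
    # Factor G = Y Y^T; an eigendecomposition tolerates the solver's
    # tiny negative eigenvalues better than a strict Cholesky would.
    vals, vecs = np.linalg.eigh(G.value)
    Y = vecs * np.sqrt(np.clip(vals, 0.0, None))
    z = np.random.default_rng(seed).standard_normal(n)
    return np.sign(Y @ z)                            # +1/-1 side of each vertex
```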

  11. SDP Relaxation for the Common-Lines Problem. Least squares is equivalent to maximizing the sum of inner products:

     min_{R_1, …, R_n ∈ SO(3)} ∑_{i≠j} ‖R_i c_ij − R_j c_ji‖²  ⟺  max_{R_1, …, R_n ∈ SO(3)} ∑_{i≠j} ⟨R_i c_ij, R_j c_ji⟩
     ⟺  max_{R_1, …, R_n ∈ SO(3)} ∑_{i≠j} Tr(c_ji c_ij^T R_i^T R_j)  ⟺  max_{R_1, …, R_n ∈ SO(3)} Tr(CG)

  C is the 2n × 2n matrix ("the common lines matrix") with 2 × 2 blocks

     C_ij = c̃_ji c̃_ij^T = (x_ji, y_ji)^T (x_ij, y_ij) = [ x_ji x_ij   x_ji y_ij ; y_ji x_ij   y_ji y_ij ],   C_ii = 0,

  where c̃_ij = (x_ij, y_ij)^T. Since c_ij has zero third coordinate, only the first two columns of each rotation matter; let R̃_i denote the 3 × 2 matrix of the first two columns of R_i. G is the 2n × 2n Gram matrix G = R̃^T R̃ with blocks G_ij = R̃_i^T R̃_j, where R̃ = [R̃_1 R̃_2 ⋯ R̃_n]:

     G = [ I_{2×2}      R̃_1^T R̃_2   ⋯   R̃_1^T R̃_n ;
           R̃_2^T R̃_1   I_{2×2}      ⋯   R̃_2^T R̃_n ;
           ⋮                               ⋮          ;
           R̃_n^T R̃_1   R̃_n^T R̃_2   ⋯   I_{2×2}    ]
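As an illustration, the common lines matrix C can be assembled as follows, reusing the (n, n, 2) common-line array convention from the earlier sketch:

```python
import numpy as np

def common_lines_matrix(c):
    """Build the 2n x 2n matrix C with blocks C_ij = c~_ji c~_ij^T and C_ii = 0."""
    n = c.shape[0]
    C = np.zeros((2 * n, 2 * n))
    for i in range(n):
        for j in range(n):
            if i != j:
                C[2*i:2*i+2, 2*j:2*j+2] = np.outer(c[j, i], c[i, j])
    return C
```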

  12. SDP Relaxation and Rounding. The problem max_{R_1, …, R_n ∈ SO(3)} Tr(CG) is relaxed to the SDP

     max_{G ∈ R^{2n×2n}} Tr(CG)   s.t. G ⪰ 0, G_ii = I_{2×2}, i = 1, 2, …, n.

  Missing is the non-convex constraint rank(G) = 3. Rounding: draw a 2n × 3 orthogonal matrix Q using (careful) QR factorization of a 2n × 3 matrix with i.i.d. standard Gaussian entries; compute the Cholesky decomposition G = YY^T; round using the SVD of each 2 × 3 block: (YQ)_i = U_i Σ_i V_i^T ⇒ R̃_i^T = U_i V_i^T. Use the cross product of the first two columns to complete R_i. Loss of handedness.
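A sketch of the rounding step, given a (numerically) solved Gram matrix G; the sign correction that makes the QR factor exactly Haar-distributed is omitted here:

```python
import numpy as np

def round_rotations(G, seed=0):
    """Round a 2n x 2n SDP solution G to rotations R_1, ..., R_n."""
    n2 = G.shape[0]                                     # n2 = 2n
    rng = np.random.default_rng(seed)
    Q, _ = np.linalg.qr(rng.standard_normal((n2, 3)))   # random 2n x 3 orthonormal Q
    vals, vecs = np.linalg.eigh(G)                      # factor G = Y Y^T
    Y = vecs * np.sqrt(np.clip(vals, 0.0, None))
    YQ = Y @ Q
    rotations = []
    for i in range(n2 // 2):
        Ui, _, Vti = np.linalg.svd(YQ[2*i:2*i+2])       # 2 x 3 block (YQ)_i
        Rt2 = Ui @ Vti[:2]                              # R~_i^T = U_i V_i^T
        r3 = np.cross(Rt2[0], Rt2[1])                   # third row by cross product
        rotations.append(np.vstack([Rt2, r3]).T)        # R_i (up to handedness)
    return rotations
```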

  13. Spectral Relaxation for Uniformly Distributed Rotations. Write the first two columns of each rotation as

     [ R_i^1  R_i^2 ] = [ x_i^1  x_i^2 ; y_i^1  y_i^2 ; z_i^1  z_i^2 ],   i = 1, …, n.

  Define 3 vectors of length 2n:

     x = (x_1^1, x_1^2, x_2^1, x_2^2, …, x_n^1, x_n^2)^T
     y = (y_1^1, y_1^2, y_2^1, y_2^2, …, y_n^1, y_n^2)^T
     z = (z_1^1, z_1^2, z_2^1, z_2^2, …, z_n^1, z_n^2)^T

  Rewrite the least squares objective function as

     max_{R_1, …, R_n ∈ SO(3)} ∑_{i≠j} ⟨R_i c_ij, R_j c_ji⟩ = max_{R_1, …, R_n ∈ SO(3)} x^T Cx + y^T Cy + z^T Cz

  By symmetry, if the rotations are uniformly distributed over SO(3), then the top eigenvalue of C has multiplicity 3 and the corresponding eigenvectors are x, y, z, from which we recover R_1, R_2, …, R_n!
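A sketch of the spectral recovery; note that the top eigenspace determines x, y, z only up to a 3 × 3 orthogonal mixing, so the rotations come out up to a global orthogonal transformation (including the handedness ambiguity):

```python
import numpy as np

def spectral_rotations(C):
    """Recover rotations from the three top eigenvectors of the 2n x 2n matrix C."""
    n = C.shape[0] // 2
    _, vecs = np.linalg.eigh(C)
    top3 = vecs[:, -3:]                    # columns span {x, y, z} up to mixing/scale
    rotations = []
    for i in range(n):
        A = top3[2*i:2*i+2].T              # 3 x 2 estimate of [R_i^1 R_i^2]
        U, _, Vt = np.linalg.svd(A, full_matrices=False)
        R2 = U @ Vt                        # nearest matrix with orthonormal columns
        r3 = np.cross(R2[:, 0], R2[:, 1])  # complete the third column
        rotations.append(np.column_stack([R2, r3]))
    return rotations
```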

  14. Spectrum of C. Numerical simulation with n = 1000 rotations sampled from the Haar measure; no noise. Bar plots of the positive and negative eigenvalues of C (figure). Eigenvalues:

     λ_l ≈ n (−1)^{l+1} / (l(l+1)),   l = 1, 2, 3, …   (i.e., n/2, −n/6, n/12, …),

  with multiplicities 2l + 1. Two basic questions: 1. Why this spectrum? Answer: representation theory of SO(3) (Hadani, S, 2011). 2. Is it stable to noise? Answer: yes, due to random matrix theory.

  15. Common Lines Kernel. Common line equation:

     R_i c_ij = R_j c_ji = ± (R_i^3 × R_j^3) / ‖R_i^3 × R_j^3‖,

  where R_i^3 denotes the third column of R_i, hence

     c_ij = ± R_i^{−1} (R_i^3 × R_j^3) / ‖R_i^3 × R_j^3‖,   c_ji = ± R_j^{−1} (R_i^3 × R_j^3) / ‖R_i^3 × R_j^3‖.

  By basic properties of the cross product, C_ij is only a function of the ratio U_ij = R_i^{−1} R_j:

     C_ij = (Pc_ij)(Pc_ji)^T = [ x_ij x_ji   x_ij y_ji ; y_ij x_ji   y_ij y_ji ]
          = (1 / ‖I^3 × U_ij^3‖²) P [I^3 × U_ij^3] [(U_ij^{−1})^3 × I^3]^T P^T,   where P = [ 1 0 0 ; 0 1 0 ].

  Common lines kernel: C_ij = k(R_i, R_j) = k̃(R_i^{−1} R_j) : R² → R².
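Putting the pieces together, a sketch that computes common lines from known rotations via the cross-product formula and then checks the spectrum of C numerically; scipy's Haar-distributed rotation sampler and the common_lines_matrix helper sketched earlier are assumed (with n = 1000 the double loop takes a few seconds):

```python
import numpy as np
from scipy.spatial.transform import Rotation

def common_line(Ri, Rj):
    """Return (x_ij, y_ij) and (x_ji, y_ji) from the cross-product formula."""
    v = np.cross(Ri[:, 2], Rj[:, 2])           # R_i^3 x R_j^3
    v /= np.linalg.norm(v)
    return (Ri.T @ v)[:2], (Rj.T @ v)[:2]      # third coordinates vanish

n = 1000
Rs = Rotation.random(n, random_state=0).as_matrix()   # Haar-distributed rotations
c = np.zeros((n, n, 2))
for i in range(n):
    for j in range(i + 1, n):
        c[i, j], c[j, i] = common_line(Rs[i], Rs[j])
C = common_lines_matrix(c)
print(np.sort(np.linalg.eigvalsh(C))[-10:][::-1])     # ~ n/2 with multiplicity 3
```

Running this reproduces the eigenvalue pattern of slide 14: a top eigenvalue near n/2 with multiplicity 3, then alternating-sign eigenvalues near n(−1)^{l+1}/(l(l+1)).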

  16. Common Lines Operator. Common lines kernel: C_ij = k(R_i, R_j) = k̃(R_i^{−1} R_j) : R² → R². Suppose f : SO(3) → R²; then

     (Cf)(R_i) = (1/n) ∑_{j=1}^n C_ij f(R_j) ≈ ∫_{SO(3)} k(R_i, R_j) f(R_j) dμ(R_j) = (𝒞f)(R_i).

  The common lines integral operator 𝒞 is a convolution over SO(3), and its eigenfunctions satisfy

     (𝒞f)(R_i) = ∫_{SO(3)} k̃(R_i^{−1} R_j) f(R_j) dμ(R_j) = λ f(R_i).

  Representation theory of SO(3) explains the eigenvalues and eigenfunctions. The eigenfunctions are tangent vector fields to the 2-sphere (sections of the tangent bundle); they are the projected gradients of the scalar spherical harmonics.

  17. Probabilistic Model and Wigner's Semi-Circle Law. Simplistic model: every common line is detected correctly with probability p, independently of all other common lines, and with probability 1 − p the common lines are falsely detected and are uniformly distributed over the unit circle. Let C_clean be the matrix C when all common lines are detected correctly (p = 1). The expected value of the noisy matrix C is E[C] = p C_clean, as the contribution of the falsely detected common lines to the expected value vanishes. Decompose C as C = p C_clean + W, where W is a 2n × 2n zero-mean random matrix. The eigenvalues of W are distributed according to Wigner's semi-circle law, whose support, up to O(p) and finite-sample fluctuations, is [−√(2n), √(2n)].
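An empirical check of this model can be sketched by corrupting a fraction 1 − p of the common lines with uniformly random directions and re-examining the spectrum; the helper names reuse the earlier sketches:

```python
import numpy as np

def corrupt_common_lines(c, p, seed=0):
    """With probability 1 - p, replace a pair's common lines by random directions."""
    rng = np.random.default_rng(seed)
    c = c.copy()
    n = c.shape[0]
    for i in range(n):
        for j in range(i + 1, n):
            if rng.random() > p:
                ti, tj = rng.uniform(0.0, 2.0 * np.pi, 2)
                c[i, j] = (np.cos(ti), np.sin(ti))
                c[j, i] = (np.cos(tj), np.sin(tj))
    return c

# e.g. eigenvalues of common_lines_matrix(corrupt_common_lines(c, p=0.1)):
# the top eigenvalue stays near p * n/2, which remains detectable as long as
# it exceeds the semicircle edge ~ sqrt(2n), so the spectral method
# degrades gracefully as p decreases.
```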
