Orbit Retrieval with Applications to Cryo-Electron Microscopy
Joe Kileel, Princeton University
New York, December 2018

Thanks to many collaborators
◮ Amit Singer (Princeton)
◮ Afonso Bandeira (NYU)
◮ Alex Wein (NYU)
◮ Ben Blum-Smith (NYU)
◮ Amelia Perry (MIT)
◮ Jon Weed (MIT)
◮ Nir Sharon (Tel Aviv)
◮ Yuehaw Khoo (Stanford)
◮ Tamir Bendory (Princeton)
◮ Nicolas Boumal (Princeton)
◮ João Pereira (Princeton)
◮ Emmanuel Abbe (Princeton)
◮ Eitan Levin (Princeton)
◮ Boris Landa (Tel Aviv)

Cryo-EM: 3D protein structure from 2D images
Given: $\sim 10^5$ very noisy 2D images from unknown viewing directions
Want: 3D structure at resolution $\sim 3 \times 10^{-10}$ m

Much promise for cryo-EM

Multi-reference alignment: cyclically shifted, noisy signals

Orbit retrieval: a common abstraction
Let $G \curvearrowright V$, where $G$ is a compact group acting linearly, continuously and orthogonally on $V = \mathbb{R}^L$. Let $\Pi : V \to W$ be a linear map, where $W = \mathbb{R}^K$. Let $x \in V$ be fixed. We observe projected, rotated, noisy copies of $x$, precisely i.i.d. realizations of
$$y = \Pi(g \cdot x) + \epsilon,$$
where $g \sim \mathrm{Haar}(G)$ and $\epsilon \sim \mathcal{N}(0, \sigma^2 I_K)$. The goal is to estimate the orbit $G \cdot x$.
◮ MRA: $\mathbb{Z}/L\mathbb{Z} \curvearrowright \mathbb{R}^L$ by cyclic shifts, $\Pi = \mathrm{id}$
◮ Cryo-EM: $\mathrm{SO}(3) \curvearrowright \{\text{band-limited molecular densities}\}$, $\Pi$ = tomographic projection
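The observation model can be simulated directly in the MRA case. The following is a minimal sketch, assuming NumPy; the function name `sample_mra` and the parameter values are illustrative and not taken from the talk.

```python
import numpy as np

def sample_mra(x, sigma, n, rng):
    """Draw n observations y = g.x + eps for the MRA model (Pi = identity)."""
    L = x.shape[0]
    # g ~ Haar(Z/LZ) is a uniformly random cyclic shift.
    shifts = rng.integers(0, L, size=n)
    # Row i is x shifted by shifts[i]: (s.x)[j] = x[(j - s) mod L].
    idx = (np.arange(L)[None, :] - shifts[:, None]) % L
    clean = x[idx]
    # eps ~ N(0, sigma^2 I), independent across observations.
    noise = sigma * rng.standard_normal((n, L))
    return clean + noise, shifts

rng = np.random.default_rng(0)
x = rng.standard_normal(8)
ys, shifts = sample_mra(x, sigma=2.0, n=1000, rng=rng)
```

Each row of `ys` is one noisy, cyclically shifted copy of `x`; the shifts are returned only for inspection and would be unknown in practice.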

Fundamental obstacle at low SNR & state-of-the-art
Estimation of rotations is impossible when $\sigma^2$ is very large. Consider an oracle that knows $x$ (the 3D structure). The oracle would estimate the $g$'s (the rotations) by generating projections and matching them to the observations. Even this oracle would suffer large errors at very low SNR.
RELION, the leading reconstruction software for cryo-EM, does not attempt to estimate one rotation for each experimental image. Instead, RELION takes a Bayesian approach, estimating a probability distribution over rotations for each image. Holding these distributions fixed, it estimates the 3D structure. Then the rotational distributions are updated with the 3D structure fixed, and so on. Iterative refinement has model bias, is computationally intensive, and lacks rigorous guarantees.
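The alternating scheme described above can be illustrated on the much simpler MRA model with a textbook expectation-maximization loop. This is only a sketch under stated assumptions: it is not RELION's implementation, the shift prior is taken to be uniform, $\sigma$ is assumed known, and the names `em_mra` and `shift_aligned_error` are hypothetical.

```python
import numpy as np

def em_mra(ys, sigma, x0, n_iter=50):
    """EM for MRA with a uniform prior on shifts (illustrative only)."""
    n, L = ys.shape
    x = x0.copy()
    for _ in range(n_iter):
        # All L cyclic shifts of the current estimate, one per row.
        templates = np.stack([np.roll(x, s) for s in range(L)])
        # E-step: posterior distribution over shifts for each observation.
        sq_dist = ((ys[:, None, :] - templates[None, :, :]) ** 2).sum(axis=2)
        log_w = -sq_dist / (2 * sigma**2)
        log_w -= log_w.max(axis=1, keepdims=True)  # avoid underflow
        w = np.exp(log_w)
        w /= w.sum(axis=1, keepdims=True)          # shape (n, L)
        # M-step: posterior-weighted average of observations shifted back.
        x = np.zeros(L)
        for s in range(L):
            x += w[:, s] @ np.roll(ys, -s, axis=1)
        x /= n
    return x

def shift_aligned_error(x_est, x_true):
    """Relative error minimised over cyclic shifts, i.e. a distance between orbits."""
    errs = [np.linalg.norm(np.roll(x_est, s) - x_true) for s in range(len(x_true))]
    return min(errs) / np.linalg.norm(x_true)

rng = np.random.default_rng(0)
x_true = np.array([1.0, -2.0, 0.5, 3.0, -1.0, 0.0, 2.0, -0.5])
sigma, n = 1.0, 2000
shifts = rng.integers(0, len(x_true), size=n)
ys = np.stack([np.roll(x_true, s) for s in shifts])
ys = ys + sigma * rng.standard_normal(ys.shape)

x_hat = em_mra(ys, sigma, x0=rng.standard_normal(len(x_true)))
print("relative error up to shift:", shift_aligned_error(x_hat, x_true))
```

The result depends on the initial guess `x0`, which is one concrete way the model bias mentioned above can enter.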

Invariant features approach / Kam’s method
To bypass estimation of $g$, we could average it out by calculating moments of the data and equating them with population moments. Samples are i.i.d. draws of $y = \Pi(g \cdot x) + \epsilon$, so
$$\text{sample average} \approx \mathbb{E}_{g,\epsilon}[y] = \Pi\, \mathbb{E}_g[g \cdot x]$$
$$\text{sample 2nd moment} \approx \mathbb{E}_{g,\epsilon}[y^{\otimes 2}] = \Pi^{\otimes 2}\, \mathbb{E}_g[(g \cdot x)^{\otimes 2}] + \text{“bias”}$$
$$\text{sample 3rd moment} \approx \mathbb{E}_{g,\epsilon}[y^{\otimes 3}] = \Pi^{\otimes 3}\, \mathbb{E}_g[(g \cdot x)^{\otimes 3}] + \text{“bias”}$$
$$\vdots$$
[should have $O(\sigma^{2d})$ samples to estimate the $d$th population moment]
Invariant features approach: estimate enough moments so that $G \cdot x$ is determined, and then estimate $G \cdot x$ by (somehow) solving a (noisy) polynomial system.
Note: the red terms on the slide, $\mathbb{E}_g[(g \cdot x)^{\otimes d}]$, are exactly the low-degree invariants in $\mathbb{R}[V]^G$.
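For MRA with $\Pi = \mathrm{id}$, the first two moment equations can be checked numerically. In the sketch below (NumPy assumed, all names and parameter values illustrative), the second-moment "bias" is $\sigma^2 I$, which follows from the noise being independent of $g$ with covariance $\sigma^2 I$; $\sigma$ is assumed known.

```python
import numpy as np

rng = np.random.default_rng(0)
L, n, sigma = 6, 200_000, 1.0
x = np.array([1.0, -2.0, 0.5, 3.0, -1.0, 0.0])

# Simulate y = g.x + eps with uniformly random cyclic shifts g.
shifts = rng.integers(0, L, size=n)
idx = (np.arange(L)[None, :] - shifts[:, None]) % L
ys = x[idx] + sigma * rng.standard_normal((n, L))

# Population moments of the clean signal, averaged over the whole orbit.
orbit = np.stack([np.roll(x, s) for s in range(L)])
m1_pop = orbit.mean(axis=0)        # E_g[g.x]
m2_pop = orbit.T @ orbit / L       # E_g[(g.x)(g.x)^T]

# Sample moments; subtract the known second-moment bias sigma^2 * I.
m1_hat = ys.mean(axis=0)
m2_hat = ys.T @ ys / n - sigma**2 * np.eye(L)

print("max error, 1st moment:", np.abs(m1_hat - m1_pop).max())
print("max error, 2nd moment:", np.abs(m2_hat - m2_pop).max())
```

No shift is ever estimated here: the averaging over samples plays the role of the averaging over $g$.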

Invariant features for MRA
What are low-degree invariants for MRA? How to invert them?
Since $\mathbb{Z}/L\mathbb{Z}$ is finite abelian, its action may be diagonalized over $\mathbb{C}$. The DFT does this. $\mathbb{Z}/L\mathbb{Z}$ acts by modulating the phase of each Fourier coefficient:
$$(s \cdot \hat{x})[k] = \exp\!\left(\frac{2\pi i s k}{L}\right) \hat{x}[k]$$
Thus the invariants are monomial in this basis.
◮ DC component: $\hat{x}[0]$
◮ Power spectrum: $\hat{x}[k]\,\hat{x}[-k]$, $k = 0, \ldots, L-1$
◮ Bispectrum: $\hat{x}[k_1]\,\hat{x}[k_2]\,\hat{x}[k_3]$, $k_1 + k_2 + k_3 = 0 \pmod{L}$
Assuming all DFT coefficients are nonzero, recover $\hat{x}$ by multiplying/dividing:
$$\hat{x}[1]^L = \frac{\prod_{k=0}^{L-1} \hat{x}[1]\,\hat{x}[k]\,\hat{x}[-1-k]}{\prod_{k=0}^{L-1} \hat{x}[k]\,\hat{x}[-k]}, \qquad \hat{x}[-1] = \frac{\hat{x}[1]\,\hat{x}[-1]}{\hat{x}[1]}, \qquad \hat{x}[2] = \frac{\hat{x}[2]\,\hat{x}[-1]\,\hat{x}[-1]}{\hat{x}[-1]^2}, \qquad \ldots$$
More stable bispectrum inversion is by a certain eigendecomposition (provably), and even better (empirically) is non-convex least-squares moment fitting.
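The multiply/divide recovery can be sketched in the noise-free case as follows. This is a hedged illustration, not the stable inversion mentioned above: it assumes NumPy, exact invariants and nonzero DFT coefficients, and the function names are hypothetical. NumPy's FFT uses the opposite sign convention in the exponent, which does not affect the invariants.

```python
import numpy as np

def invariants(x):
    """DC component, power spectrum and bispectrum of a real signal x."""
    xh = np.fft.fft(x)
    L = len(x)
    k = np.arange(L)
    power = xh * xh[(-k) % L]                     # xh[k] * xh[-k]
    # bispec[k1, k2] = xh[k1] * xh[k2] * xh[-k1-k2]
    bispec = xh[:, None] * xh[None, :] * xh[(-k[:, None] - k[None, :]) % L]
    return xh[0], power, bispec

def invert(dc, power, bispec):
    """Recover x up to a cyclic shift from exact invariants."""
    L = len(power)
    xh = np.zeros(L, dtype=complex)
    xh[0] = dc
    # xh[1]^L = prod_k B[1, k] / prod_k P[k]; any L-th root fixes one shift.
    xh[1] = (np.prod(bispec[1, :]) / np.prod(power)) ** (1.0 / L)
    xh_minus_1 = power[1] / xh[1]                 # xh[-1] = P[1] / xh[1]
    for k in range(1, L - 1):
        xh_minus_k = power[k] / xh[k]             # xh[-k] = P[k] / xh[k]
        # B[-1, -k] = xh[-1] * xh[-k] * xh[k+1], so divide to get xh[k+1].
        xh[k + 1] = bispec[-1, -k] / (xh_minus_1 * xh_minus_k)
    return np.fft.ifft(xh).real

x = np.array([1.0, -2.0, 0.5, 3.0, -1.0, 0.0, 2.0, -0.5])
dc, power, bispec = invariants(x)

# The invariants do not change when x is cyclically shifted.
dc_s, power_s, bispec_s = invariants(np.roll(x, 3))

x_rec = invert(dc, power, bispec)
err = min(np.linalg.norm(np.roll(x_rec, s) - x) for s in range(len(x)))
print("recovery error up to shift:", err)
```

The choice of $L$-th root for $\hat{x}[1]$ is exactly the shift ambiguity: different roots give different points on the same orbit.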

Statistical optimality for invariant features in large $\sigma^2$ limit
Theorem (2018). Let $x \in V$. Let $d \in \mathbb{Z}_{>0}$ be the least degree such that for $x' \in V$:
$$\begin{cases} \Pi\, \mathbb{E}[g \cdot x'] = \Pi\, \mathbb{E}[g \cdot x] \\ \Pi^{\otimes 2}\, \mathbb{E}[(g \cdot x')^{\otimes 2}] = \Pi^{\otimes 2}\, \mathbb{E}[(g \cdot x)^{\otimes 2}] \\ \quad\vdots \\ \Pi^{\otimes d}\, \mathbb{E}[(g \cdot x')^{\otimes d}] = \Pi^{\otimes d}\, \mathbb{E}[(g \cdot x)^{\otimes d}] \end{cases}$$
implies $G \cdot x' = G \cdot x$. Then any estimation procedure requires $O(\sigma^{2d})$ samples to accurately estimate $G \cdot x$ with high probability, as $\sigma^2 \to \infty$.
Sketch: for $x, x' \in V$, the Kullback-Leibler divergence between the probability distribution of an observation with $x$ and with $x'$ is bounded above, up to a constant factor, by the following Taylor expansion in $\sigma^{-2}$:
$$\sum_k \frac{\sigma^{-2k}}{k!} \left\| \Pi^{\otimes k}\, \mathbb{E}_g[(g \cdot x)^{\otimes k}] - \Pi^{\otimes k}\, \mathbb{E}_g[(g \cdot x')^{\otimes k}] \right\|_{\mathrm{HS}}^2$$
A. Bandeira, B. Blum-Smith, J. Kileel, A. Perry, J. Weed, A. Wein, submitted 2018

Purely algebraic questions
Sample complexity for orbit retrieval (+ various weakenings) is answered by purely algebraic questions about invariants. If $\Pi = \mathrm{id}$, these are
◮ Unique recovery: smallest $d$ s.t. a separating subalgebra of $\mathbb{R}[V]^G$ is generated by $\mathbb{R}[V]^G_{\le d}$
◮ Generic unique recovery: smallest $d$ s.t. $\mathbb{R}(V)^G$ is generated by $\mathbb{R}[V]^G_{\le d}$
◮ Generic list recovery: smallest $d$ s.t. $\mathbb{R}[V]^G_{\le d}$ generates a subfield of $\mathbb{R}(V)^G$ of full transcendence degree
MRA has invariant fraction field generated at $d = 3$. Thus for generic $x \in V$, need $O(\sigma^6)$ samples to uniquely recover $\mathbb{Z}/L\mathbb{Z} \cdot x$ (by any means).
The Jacobian criterion efficiently calculates the $d$ for generic list recovery.
Low-degree ring generation was studied classically; field generation has been neglected.
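The Jacobian criterion can be tried numerically for MRA: the invariants of degree at most $d$ have full transcendence degree exactly when their Jacobian has rank $L = \dim V$ at a generic point (the group is finite, so the invariant field has transcendence degree $L$). The sketch below is an illustration only. It assumes NumPy, approximates the Jacobian by central differences rather than computing it symbolically, evaluates it at one fixed point standing in for a generic one, and uses hypothetical function names.

```python
import numpy as np

def moments(x, d):
    """Entries of the shift-averaged moments of x up to degree d (d = 2 or 3)."""
    L = len(x)
    shifted = [np.roll(x, -j) for j in range(L)]   # shifted[j][i] = x[i + j]
    feats = [x.mean()]                                      # degree 1
    feats += [np.mean(x * shifted[j]) for j in range(L)]    # degree 2
    if d >= 3:
        feats += [np.mean(x * shifted[j] * shifted[k])
                  for j in range(L) for k in range(L)]      # degree 3
    return np.array(feats)

def jacobian_rank(d, x, h=1e-4, tol=1e-6):
    """Numerical rank of the Jacobian of moments(., d) at the point x."""
    L = len(x)
    cols = []
    for i in range(L):
        e = np.zeros(L)
        e[i] = h
        # Central difference in coordinate i.
        cols.append((moments(x + e, d) - moments(x - e, d)) / (2 * h))
    J = np.stack(cols, axis=1)
    return np.linalg.matrix_rank(J, tol=tol)

x = np.array([1.0, -2.0, 0.5, 3.0, -1.0])   # L = 5
rank_d2 = jacobian_rank(2, x)
rank_d3 = jacobian_rank(3, x)
print("rank with d <= 2:", rank_d2, " rank with d <= 3:", rank_d3)
```

A rank below $L$ at degree 2 and equal to $L$ at degree 3 is consistent with the statement above that MRA needs invariants up to $d = 3$.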

Back to cryo-EM
Theorem (2018). Cryo-EM with projection has sample complexity $O(\sigma^6)$, in the sense of generic list recovery. Cryo-EM with no projection has sample complexity $O(\sigma^6)$, in the sense of generic unique recovery.
The cubic, quadratic and linear invariant features may be birationally inverted efficiently, via Cholesky factorizations of matrices, orthogonal Tucker factorizations of tensors and frequency marching.
