fixed point algorithms for phase retrieval and
play

Fixed Point Algorithms for Phase Retrieval and Ptychography Albert - PowerPoint PPT Presentation

Fixed Point Algorithms for Phase Retrieval and Ptychography Albert Fannjiang University of California, Davis Mathematics of Imaging Workshop: Variational Methods and Optimization in Imaging IHP, Paris, February 4th-8th 2019 Collaborators:


  1. Fixed Point Algorithms for Phase Retrieval and Ptychography Albert Fannjiang University of California, Davis Mathematics of Imaging Workshop: Variational Methods and Optimization in Imaging IHP, Paris, February 4th-8th 2019 Collaborators: Pengwen Chen (NCHU), Gi-Ren Liu (NCKU), Zheqing Zhang (UCD)

  2. Outline Introduction Alternating projection for feasibility Douglas-Rachford splitting/ADMM Convergence analysis Initialization methods Blind ptychoraphy Conclusion 2 / 46

  3. Phase retrieval X-ray crystallography: von Laue, Bragg etc. since 1912. Non-periodic structures: Gerchberg, Saxton, Fienup etc since 1972, delay due to low SNR. Nonlinear signal model: data = diffraction pattern = |F ( f ) | 2 F = Fourier transform , | · | = componentwise modulus . 3 / 46

  4. Coded diffraction pattern 4 / 46

  5. Alternating projections

  6. Nonconvex feasibility Masking µ + propagation F + intensity measurement: coded diffraction pattern = |F ( f ⊙ µ ) | 2 . F (2012): Uniqueness with probability one b = | Ax | , x ∈ X X = R n , (1 mask) A = Φ diag ( µ ) � Φ diag ( µ 1 ) � X = C n , (2 masks) A = Φ diag ( µ 2 ) Non-convex feasibility: Find y ˆ ∈ A X ∩ Y { y ∈ C N : | y | = b } Y := Intersection of N -dim torus Y and n - or 2 n -dim subspace A X 6 / 46

  7. Alternating projections 7 / 46

  8. Coded vs plain diffraction pattern 150 ER(SR = 4) AP: real-valued Cameraman with one diffraction pattern. Plain diffraction pattern allows � � � � (a) coded; 40 iter (b) error ambiguities such 1050 ER(SR = 8) as translation, twin-image which are forbidden by the presence of a random mask. � (c) plain;1000 iter (d) error 8 / 46

  9. Douglas-Rachford splitting

  10. Alternating minimization Minimization with a sum of two objective functions arg min u K ( u ) + L ( v ) , u = v where { Ax : x ∈ C n } K = Indicator function of | v [ i ] | 2 − b 2 [ i ] ln | v [ i ] | 2 � L ( v ) = (Poisson log-likelihood) . i Projection onto K = AA † u . Linear constraint u = v . L has a simple asymptotic form 10 / 46

  11. Gaussian log-likelihood High SNR: Gaussian distribution with variance = mean: e − ( b − λ )2 / (2 λ ) √ . 2 πλ Gaussian log-likelihood: λ = | v | 2 2 ln | v [ j ] | + 1 � b [ j ] � � � � | v [ j ] | − | v [ j ] | − → L � � 2 � � j In the vicinity of b , we make the substitution b [ j ] � | v [ j ] | → 1 , ln | v [ j ] | → ln b [ j ] to obtain const. + 1 | b [ j ] − | v [ j ] || 2 − � → L 2 j which is the smoothest of the 3 functions. 11 / 46

  12. Alternating projections revisited Hard constraint u = v arg min u K ( u ) + L ( u ) = arg min x L ( u ) , u = Ax where { Ax : x ∈ C n } K = Indicator function of 1 2 � b − | u |� 2 L ( u ) = (Gaussian log-likelihood) . L non-smooth where b vanishes. AP = gradient descent with unit stepsize: x k +1 = x k − ∇L ( x k ) . Wirtinger flow = gradient descent with L = 1 2 �| Ax | 2 − b � 2 (additive i.i.d. Gaussian noise) . 12 / 46

  13. Proximal optimality Proximity operators are generalization of projections: x L ( x ) + ρ 2 � x − u � 2 prox L /ρ ( u ) = arg min AA † u . prox K /ρ ( u ) = For simplicity, set ρ = 1. Proximal reflectors R L = 2 prox L − I , R K = 2 prox K − I Proximal optimality: 0 ∈ ∂ L ( x ) + ∂ K ( x ) iff ξ = R L R K ( ξ ) , x = prox K ( ξ ) 13 / 46

  14. Proximal optimality: proof Let η = R K ( ξ ). Then ξ = R L ( η ). Also ζ := 1 2 ( ξ + η ) = prox L ( η ) = prox K ( ξ ). Equivalently ξ ∈ ∂ K ( ζ ) + ζ, η ∈ ∂ L ( ζ ) + ζ Adding the two equations: 0 ∈ ∂ K ( ζ ) + ∂ L ( ζ ). Finally ζ = prox K ( ξ ) is a stationary point. 14 / 46

  15. Douglas-Rachford splitting (DRS) Optimality leads to Peaceman-Rachford splitting: z k +1 = R L /ρ R K /ρ ( z k ). DRS z l +1 = 1 2 z l + 1 2 R L /ρ R K /ρ ( z l ): for l = 1 , 2 , 3 · · · y l +1 = prox K /ρ ( u l ); z l +1 = prox L /ρ (2 y l +1 − u l ) u l +1 = u l + z l +1 − y l +1 . γ = 1 /ρ = stepsize; ρ = 0 the classical DR algorithm. Alternating Direction Method of Multipliers (ADMM) applied to the dual problem y , z L ∗ ( y ) + K ∗ ( − A ∗ z ) + � λ, y − A ∗ z � + ρ 2 � A ∗ z − y � 2 max min λ 15 / 46

  16. DRS map Object update: f = A † u ∞ where u ∞ is the terminal value of ρ + 1 u l + ρ − 1 1 1 ρ + 1 Pu l + 2 Pu l − u l � u l +1 � = ρ + 1 b ⊙ sgn 1 ρ − 1 1 2 u l + 2( ρ + 1) Ru l + Ru l ) � = ρ + 1 b ⊙ sgn where P = AA † is the orthogonal projection onto the range of A and R = 2 P − I is the corresponding reflector. ρ = 0: the classical Douglas-Rachford algorithm 1 2 u l − 1 2 Ru l u l + b ⊙ sgn u l +1 Ru l ) � = u l − Pu l + b ⊙ sgn Ru l ) . � = 16 / 46

  17. Convergence analysis

  18. Convergence analysis Lewis-Malick (2008): local linear convergence of AP for transversally intersecting smooth manifolds. Lewis-Luke-Malick (2009): transversal intersection − → linearly regular intersection (LRI). Arago´ on-Borwein (2012): global convergence of DR ( ρ = 0) for intersection of a line and a circle. Hesse-Luke (2013): local geometric convergence of DR ( ρ = 0) for LRI of an affine set and a super-regular set. Li-Pong (2016): → L has uniformly Lipschitz gradient (ULG). → DRS with ρ sufficiently large, depending on Lipschitz constant. → Global convergence: cluster point = stationary point. → Local geometric convergence for semi-algebraic case. K and L don’t have ULG and optimal performance is with ρ ∼ 1 . Candes et at. (2015): global convergence of Wirtinger flow with spectral initialization. 18 / 46

  19. Fixed point equation Fixed point equation 1 ρ − 1 1 � u = 2 u + 2( ρ + 1) R ∞ u + ρ + 1 b ⊙ sgn R ∞ u ) The differential map is given by Ω J A ( η ) where 1 � CC † η − 2 CC † η − η � � J A ( η ) = ℜ 1 + ρ � �� 2 CC † η − η � � + ı I − diag ( b / | Ru | ) ℑ where C = Ω ∗ A . Ω = diag (sgn( Ru )) , 19 / 46

  20. Fixed point analysis Two randomly coded diffraction patterns: F (2012) – intersection ∼ S 1 (arbitrary phase factor). Chen & F (2016) – DR ( ρ = 0) fixed points u take the form e i θ ( b + r ) ⊙ sgn( Af ) , r ∈ R N , u = b + r ≥ 0 = ⇒ sgn( u ) = θ + sgn( Af ) where r is a real null vector of A † diag [sgn( Af )] = ⇒ DR fixed point set has real dimension N − n . Chen, F & Liu (2016) – AP based on the hard constraint u = v AP fixed point x ∗ : � Ax ∗ � = � Af � iff x ∗ = α f , | α | = 1 . 20 / 46

  21. Spectral gap and linear convergence rate J A can be analyzed by the eigen-structure of � ℜ [ A † Ω ] � H := , Ω = diag (sgn( Af )) . ℑ [ A † Ω ] � J A ( η ) � = � η � occurs at η = ± i b . Linear convergence rate is related to the spectral gap of H . One randomly coded diffraction pattern: → Chen & F (2016) – the differential map at Af has the largest singular value 1 corresponding to the constant phase and a positive spectral gap = ⇒ the true solution is an attractor (local linear convergence). → F & Zhang (2018) – the differential map at any DR fixed point has a spectral radius = 1. → Chen, F & Liu (2016) – same for AP (parallel or serial). 21 / 46

  22. DRS fixed points Proposition Let u be a fixed point and f ∞ := A † u. (i) ρ ≥ 1 : If � J A ( η ) � 2 ≤ � η � 2 then |F ( µ, f ∞ ) | = b. (ii) ρ ≥ 0 : If |F ( µ, f ∞ ) | = b then � J A ( η ) � 2 ≤ � η � 2 . where the equality holds iff η parallels ı b. Summary: DRS ( ρ ≥ 1) fixed point is linearly stable iff it is a true solution DR ( ρ = 0) introduces harmless, stable fixed points. AP likely introduces spurious nonsolution fixed points. Linear convergence rate: Serial AP < parallel AP ∼ DRS ( ρ = 1) < DR ( ρ = 0). 22 / 46

  23. Initialization

  24. Initialization by feature extraction b = | Af | where A ∈ C N × n is the measurement matrix. Feature: two sets of signals, weak and strong. Weak signals selected by a threshold τ , i.e. b i ≤ τ, i ∈ I . x null := ground state of A I . Isometry: � Ax � 2 = � A I x � 2 + � A I c x � 2 = � x � 2 = ⇒ � A I x � 2 : � x � = � f � � � x null = arg min � A I c x � 2 : � x � = � f � � � = arg max solved by the power method efficiently. Non-isometry = ⇒ QR: A = QR 24 / 46

  25. Null vector algorithm Let 1 c be the characteristic function of the complementary index I c with | I c | = γN . Algorithm 1: The null vector method 1 Random initialization: x 1 = x rand 2 Loop: 3 for k = 1 : k max − 1 do x 0 k ← A ( 1 c � A ∗ x k ); 4 h i h i 0 0 x k +1 ← x X / k x X k 5 k k 6 end 7 Output: x null = x k max . Truncated spectral vector Algorithm 2: The spectral vector method 1 Random initialization: x 1 = x rand 2 Loop: 1 τ � | b | 2 � A ∗ x 3 for k = 1 : k max − 1 do � � x t-spec = arg max k x k =1 k A k k ← A ( | b | 2 � A ∗ x k ); x 0 4 h i h i 0 0 x k +1 ← x X / k x X k ; { i : | A ∗ x ( i ) | ≤ τ k b k } 5 k k 6 end 7 Output: x spec = x k max . Netrapalli-Jain-Sanghavi 2015 Candes-Chen 2015 25 / 46

Download Presentation
Download Policy: The content available on the website is offered to you 'AS IS' for your personal information and use only. It cannot be commercialized, licensed, or distributed on other websites without prior consent from the author. To download a presentation, simply click this link. If you encounter any difficulties during the download process, it's possible that the publisher has removed the file from their server.

Recommend


More recommend