Brain Connectivity-Informed Adaptive Regularization for Generalized Outcomes Jaroslaw Harezlak, Ph.D. Professor and Interim Co-Chair Department of Epidemiology and Biostatistics Indiana University School of Public Health Bloomington, IN, USA


  1. Brain Connectivity-Informed Adaptive Regularization for Generalized Outcomes Jaroslaw Harezlak, Ph.D. Professor and Interim Co-Chair Department of Epidemiology and Biostatistics Indiana University School of Public Health Bloomington, IN, USA May 22, 2020 Jaroslaw Harezlak May 22, 2020 1 / 33

  2. Outline: (1) Motivating application; (2) Brain structure and connectivity; (3) Regularization methods; (4) riPEER - ridgified Partially Empirical Eigenvectors for Regression; (5) Simulation study; (6) Brain structure and HIV infection; (7) Discussion.

  3. HIV infection study - WUSM. (1) N = 299 HIV-infected individuals: 228 males, 71 females; duration of infection range: 0 - 33 y (mean: 10.2, sd: 8.7); age range: 18 - 84 y.o. (mean: 42.3, sd: 16). (2) Imaging modalities: T1 - anatomy; DTI - structural connectivity.

  4. Anatomy and connectivity. (1) Anatomy: MPRAGE protocol; processing using FreeSurfer software (version 5.1); Desikan-Killiany atlas - 66 cortical regions. (2) Structural connectivity: DTI and maximal diffusion coherence model; density of connections between each pair of regions.

  5. MRI data

  6. MRI-derived data: cortical thickness. (1) Parcellation of the cortex into 66 regions; (2) Average cortical thickness. Figure panels: (a) Parcellation of the brain; (b) Cortical thickness.

  7. Connections in the brain

  8. Connectivity matrices. Figure panels: (a) Connectivity matrix: subject 1; (b) Connectivity matrix: subject 2.

  9. Population connectivity matrix $A = [a_{ij}]$. Figure: connectivity matrix with a color scale running from WEAK to STRONG.

  10. Questions. (1) Scientific: Are changes in brain structure associated with HIV infection? Does structural connectivity provide any additional information? (2) Statistical: How to deal with highly correlated predictors in regression models? How to incorporate structural connectivity information in regression models?

  11. Statistical model. (1) $y$: $n$-dimensional response (e.g. NP domain score); (2) $Z \in \mathbb{R}^{n \times 66}$ and $X \in \mathbb{R}^{n \times m}$; (3) $\varepsilon \sim N(0, \sigma^2 I_n)$ for some unknown $\sigma^2 > 0$.

  13. Penalized estimation. To find the estimates of $b$ and $\beta$, we consider the optimization problem of the form $\arg\min_{b,\beta} \left\{ \underbrace{\|y - Zb - X\beta\|_2^2}_{\text{model fit term}} + \underbrace{\lambda\, g(b)}_{\text{penalty on } b} \right\}$. Penalty choices: (1) $g(b) = \sum_i b_i^2$ → Ridge estimate; (2) $g(b) = \sum_i |b_i|$ → LASSO estimate; (3) $g(b) = \|Lb\|_2^2$ → Generalized ridge. Reference: T. W. Randolph, J. Harezlak, Z. Feng, Structured penalties for functional linear models – partially empirical eigenvectors for regression, Electronic Journal of Statistics (2012).
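
For the ridge and generalized ridge penalties, the minimizer has a closed form through the penalized normal equations. The following is a minimal sketch, not the authors' implementation: the function name `generalized_ridge`, the toy data, and the first-difference choice of $L$ are assumptions made for illustration.

```python
import numpy as np

def generalized_ridge(y, Z, X, L, lam):
    """Minimize ||y - Z b - X beta||_2^2 + lam * ||L b||_2^2 over (b, beta).

    Only b is penalized; beta (the coefficients of X) is left unpenalized.
    """
    p, m = Z.shape[1], X.shape[1]
    W = np.hstack([Z, X])                 # joint design matrix [Z, X]
    P = np.zeros((p + m, p + m))
    P[:p, :p] = L.T @ L                   # penalty matrix acts on the b block only
    coef = np.linalg.solve(W.T @ W + lam * P, W.T @ y)
    return coef[:p], coef[p:]

# Toy data (hypothetical): 4 penalized predictors plus an intercept.
rng = np.random.default_rng(0)
n, p = 50, 4
Z = rng.normal(size=(n, p))
X = np.ones((n, 1))
b_true = np.array([1.0, 1.0, -1.0, -1.0])
y = Z @ b_true + 2.0 * X[:, 0] + 0.1 * rng.normal(size=n)

# Ridge: L = I.
b_ridge, beta_ridge = generalized_ridge(y, Z, X, np.eye(p), lam=1.0)

# Generalized ridge with a first-difference operator L (an illustrative choice).
L_diff = np.diff(np.eye(p), axis=0)
b_smooth, beta_smooth = generalized_ridge(y, Z, X, L_diff, lam=1.0)
```

The LASSO penalty has no such closed form and is not covered by this sketch.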

  14. Desired property of the estimate $\hat{b}$: “Stronger connections between the brain regions $i$ and $j$ result in more similar coefficients $\hat{b}_i$ and $\hat{b}_j$.”

  15. Penalty selection. The natural choice of the penalty is $g(b) = \sum_{i,j} a_{ij} (b_i - b_j)^2$. Define: (1) $d_i := \sum_k a_{ik}$, with $a_{ik}$ the entries of $A$; (2) $D := \mathrm{diag}(d_1, \ldots, d_{66})$; (3) $Q := D - A$ [Laplacian of $A$]. Then: $\sum_{i,j} a_{ij} (b_i - b_j)^2 = b^T Q b$, where each pair of regions $\{i, j\}$ is counted once in the sum.
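
The identity between the pairwise penalty and the quadratic form $b^T Q b$ can be checked numerically. This is a small sketch on a hypothetical 4-region connectivity matrix (the values of `A` and `b` are made up); the sum runs over unordered pairs $i < j$.

```python
import numpy as np

def graph_laplacian(A):
    """Return Q = D - A, where D is the diagonal matrix of row sums of A."""
    d = A.sum(axis=1)
    return np.diag(d) - A

# Hypothetical symmetric connectivity matrix for 4 regions.
A = np.array([[0.0, 0.6, 0.1, 0.0],
              [0.6, 0.0, 0.3, 0.0],
              [0.1, 0.3, 0.0, 0.4],
              [0.0, 0.0, 0.4, 0.0]])
Q = graph_laplacian(A)
b = np.array([1.0, 0.8, -0.2, -1.0])

# Quadratic form b^T Q b.
quad_form = b @ Q @ b

# Pairwise penalty, each pair {i, j} counted once.
pair_sum = sum(A[i, j] * (b[i] - b[j]) ** 2
               for i in range(4) for j in range(i + 1, 4))
```

The rows of `Q` sum to zero, so a constant vector $b$ incurs no penalty: the Laplacian penalizes only differences between connected regions.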

  16. Connections with the linear mixed models (LMM). Our objective function becomes $\arg\min_{b,\beta} \left\{ \|y - Zb - X\beta\|_2^2 + \lambda\, b^T Q b \right\}$. This optimization problem is “equivalent” to the LMM formulation: (1) $y = Zb + X\beta + \varepsilon$, where $\beta$ is a vector of fixed effects and $b$ a vector of random effects; (2) $\varepsilon \sim N(0, \sigma^2 I)$; (3) $b \sim N(0, \sigma_b^2 Q^{-1})$; (4) $\lambda$, $\sigma$ and $\sigma_b$ are connected via $\lambda = \sigma^2 / \sigma_b^2$.
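
The equivalence can be illustrated numerically: for known variance components, the penalized estimate with $\lambda = \sigma^2/\sigma_b^2$ coincides with the GLS estimate of $\beta$ and the BLUP of $b$ in the mixed model. The sketch below uses simulated data and assumed values of $\sigma^2$ and $\sigma_b^2$; a small multiple of the identity is added to $Q$ so that $Q^{-1}$ exists, which is an assumption of this example.

```python
import numpy as np

rng = np.random.default_rng(1)
n, p = 40, 5

# Random symmetric connectivity matrix and its (regularized) Laplacian.
A = np.triu(rng.uniform(size=(p, p)), k=1)
A = A + A.T
Q = np.diag(A.sum(axis=1)) - A + 0.001 * np.eye(p)   # made invertible (assumption)

Z = rng.normal(size=(n, p))
X = np.column_stack([np.ones(n), rng.normal(size=n)])
y = rng.normal(size=n)

sigma2, sigma2_b = 1.5, 0.5          # assumed known variance components
lam = sigma2 / sigma2_b

# Penalized estimate: solve the penalized normal equations.
W = np.hstack([Z, X])
P = np.zeros((p + 2, p + 2))
P[:p, :p] = Q
coef = np.linalg.solve(W.T @ W + lam * P, W.T @ y)
b_pen, beta_pen = coef[:p], coef[p:]

# LMM estimate: GLS for the fixed effects, BLUP for the random effects.
G = sigma2_b * np.linalg.inv(Q)               # Cov(b)
V = Z @ G @ Z.T + sigma2 * np.eye(n)          # Cov(y)
V_inv = np.linalg.inv(V)
beta_gls = np.linalg.solve(X.T @ V_inv @ X, X.T @ V_inv @ y)
b_blup = G @ Z.T @ V_inv @ (y - X @ beta_gls)
```

In practice the variance components are unknown; estimating them from the mixed model is what yields a data-driven $\lambda$.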

  17. Selection of the regularization parameter

  18. The method riPEER (ridgified Partially Empirical Eigenvectors for Regression): $(\hat{b}_{rP}, \hat{\beta}_{rP}) := \arg\min_{b,\beta} \left\{ \|y - Zb - X\beta\|_2^2 + \underbrace{\lambda_Q\, b^T Q b}_{\text{graph part}} + \underbrace{\lambda_R \|b\|_2^2}_{\text{ridge part}} \right\}$. Figure 3: Different shapes of the set $\{ b : \lambda_Q\, b^T Q b + \lambda_R \|b\|_2^2 \le 1 \}$ for $p = 2$.

  19. Connections with the linear mixed models (LMM). riPEER (ridgified Partially Empirical Eigenvectors for Regression): $(\hat{b}_{rP}, \hat{\beta}_{rP}) := \arg\min_{b,\beta} \left\{ \|y - Zb - X\beta\|_2^2 + b^T (\lambda_Q Q + \lambda_R I)\, b \right\}$. This problem is “equivalent” to the LMM formulation: (1) $y = Zb + X\beta + \varepsilon$, where $\beta$ is a vector of fixed effects and $b$ a vector of random effects; (2) $\varepsilon \sim N(0, \sigma^2 I)$; (3) $b \sim N\left(0, (Q/\sigma_Q^2 + I/\sigma_R^2)^{-1}\right)$; (4) $\lambda_Q$, $\lambda_R$, $\sigma$, $\sigma_Q$ and $\sigma_R$ are connected via $\lambda_Q = \sigma^2/\sigma_Q^2$, $\lambda_R = \sigma^2/\sigma_R^2$.
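
One way to see what the ridge part contributes: the Laplacian $Q$ always has a zero eigenvalue, whereas $\lambda_Q Q + \lambda_R I$ has eigenvalues $\lambda_Q \mu_k + \lambda_R > 0$, so the combined penalty is positive definite. The sketch below shows this on a 3-node path graph and solves the riPEER problem for fixed, hand-picked $\lambda_Q$ and $\lambda_R$. The function name `ripeer_fixed_lambdas` and all data are hypothetical, and the adaptive LMM-based selection of the two parameters is not implemented here.

```python
import numpy as np

def ripeer_fixed_lambdas(y, Z, X, Q, lam_Q, lam_R):
    """Minimize ||y - Z b - X beta||_2^2 + b^T (lam_Q Q + lam_R I) b
    for given (not data-selected) lam_Q and lam_R."""
    p, m = Z.shape[1], X.shape[1]
    W = np.hstack([Z, X])
    P = np.zeros((p + m, p + m))
    P[:p, :p] = lam_Q * Q + lam_R * np.eye(p)   # beta stays unpenalized
    coef = np.linalg.solve(W.T @ W + P, W.T @ y)
    return coef[:p], coef[p:]

# Path graph on 3 nodes: 1 - 2 - 3.
A = np.array([[0.0, 1.0, 0.0],
              [1.0, 0.0, 1.0],
              [0.0, 1.0, 0.0]])
Q = np.diag(A.sum(axis=1)) - A

lam_Q, lam_R = 2.0, 0.5
eig_Q = np.linalg.eigvalsh(Q)                                  # smallest eigenvalue is 0
eig_pen = np.linalg.eigvalsh(lam_Q * Q + lam_R * np.eye(3))    # shifted up by lam_R

# With fewer observations than penalized coefficients (n = 2 < p = 3),
# the positive definite penalty still gives a unique solution.
rng = np.random.default_rng(3)
Z = rng.normal(size=(2, 3))
X = np.ones((2, 1))
y = np.array([1.0, 2.0])
b_hat, beta_hat = ripeer_fixed_lambdas(y, Z, X, Q, lam_Q, lam_R)
```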

  20. Simulation scheme. SIMULATED SIGNAL: graph given by adjacency matrix $A$, with Laplacian $Q_{true}$; “invertible Laplacian”: $Q_{true} := Q_{true} + 0.001 \cdot I$; true signal used in simulation: $b_{true} \sim N(0, \sigma_b^2 Q_{true}^{-1})$. ESTIMATION: the Laplacian of the distorted graph was used to find the estimate $\hat{b}$; MSEr, defined as $\mathrm{MSEr} := E\left[ \|\hat{b} - b_{true}\|_2^2 / \|b_{true}\|_2^2 \right]$, serves as a measure of estimation accuracy. Figure: example weighted graph on 5 nodes and its distorted version.
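
Drawing the true signal from $N(0, \sigma_b^2 Q_{true}^{-1})$ can be sketched with a Cholesky factor of the invertible Laplacian. The function name `sample_b_true` and the 5-node graph below are hypothetical; the edge placement is not taken from the slide.

```python
import numpy as np

def sample_b_true(A, sigma_b, rng, eps=0.001):
    """Draw b ~ N(0, sigma_b^2 * (Q + eps*I)^{-1}), with Q the Laplacian of A."""
    p = A.shape[0]
    Q_inv_able = np.diag(A.sum(axis=1)) - A + eps * np.eye(p)
    C = np.linalg.cholesky(Q_inv_able)      # Q + eps*I = C C^T
    z = rng.standard_normal(p)
    # Cov(C^{-T} z) = C^{-T} C^{-1} = (C C^T)^{-1}
    return sigma_b * np.linalg.solve(C.T, z)

# Hypothetical weighted graph on 5 nodes.
A = np.zeros((5, 5))
for (i, j), w in {(0, 1): 0.4, (1, 2): 0.3, (2, 3): 0.6,
                  (3, 4): 0.1, (0, 4): 0.1}.items():
    A[i, j] = A[j, i] = w

rng = np.random.default_rng(0)
b_true = sample_b_true(A, sigma_b=1.0, rng=rng)
```

Strongly connected nodes receive highly correlated coefficients under this distribution, which is the situation the graph penalty is designed for.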

  21. Simulation scheme – distorted connectivity matrices

  22. Simulation scheme. Three methods compared: (1) ridge: $\lambda_Q := 0$ (connectivity information is not used); (2) naive: $\lambda_R := 0$, $Q \to \tilde{Q}$ (only $\lambda_Q$ is selected); (3) riPEER (both lambdas are selected in an adaptive way). Axes of the plot: (1) X axis: $\mathrm{diss}(A_{true}, A_{obs}) := \dfrac{\text{number of removed/added connections}}{\text{number of all nonzero connections in } A_{true}}$; (2) Y axis: $\mathrm{MSEr} := E\left[ \|\hat{b} - b_{true}\|_2^2 / \|b_{true}\|_2^2 \right]$.
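
Both plot axes are simple to compute. The sketch below makes two assumptions not stated on the slide: each undirected connection is counted once (upper triangle only), and the expectation in MSEr is approximated by an average over simulation runs. The function names and the small example graphs are hypothetical.

```python
import numpy as np

def dissimilarity(A_true, A_obs):
    """(# removed or added connections) / (# nonzero connections in A_true),
    counting each undirected connection once."""
    true_edges = np.triu(A_true != 0, k=1)
    obs_edges = np.triu(A_obs != 0, k=1)
    changed = np.sum(true_edges != obs_edges)
    return changed / np.sum(true_edges)

def mser(b_hats, b_trues):
    """Average over runs of ||b_hat - b_true||_2^2 / ||b_true||_2^2.
    Rows of the inputs correspond to simulation runs."""
    b_hats, b_trues = np.atleast_2d(b_hats), np.atleast_2d(b_trues)
    num = np.sum((b_hats - b_trues) ** 2, axis=1)
    den = np.sum(b_trues ** 2, axis=1)
    return np.mean(num / den)

def adjacency(edges, p):
    """Unweighted symmetric adjacency matrix from a list of (i, j) pairs."""
    A = np.zeros((p, p))
    for i, j in edges:
        A[i, j] = A[j, i] = 1.0
    return A

# True graph: 4-cycle. Observed graph: one connection removed, one added.
A_true = adjacency([(0, 1), (1, 2), (2, 3), (0, 3)], 4)
A_obs = adjacency([(0, 1), (1, 2), (2, 3), (0, 2)], 4)
d = dissimilarity(A_true, A_obs)   # 2 changed connections out of 4
```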

  23. Simulation results. Figure: $b$ estimation MSEr (Y axis, 0.0 to 0.3) against dissimilarity between $A_{true}$ and $A_{obs}$ (X axis, 0.00 to 0.75); method shown: ridge.

  24. Simulation results. Figure: $b$ estimation MSEr (Y axis, 0.0 to 0.3) against dissimilarity between $A_{true}$ and $A_{obs}$ (X axis, 0.00 to 0.75); methods shown: ridge, naive.

  25. Simulation results. Figure: $b$ estimation MSEr (Y axis, 0.0 to 0.3) against dissimilarity between $A_{true}$ and $A_{obs}$ (X axis, 0.00 to 0.75); methods shown: ridge, naive, riPEER.

  26. Results: HIV study. (1) Association between cortical thickness and speed of information processing; (2) 66 brain regions considered; (3) N = 199 individuals.

  27. Results: Speed of Information Processing. Figure: riPEER estimate of $b$ (scale from -0.05 to 0.05) for each of the 66 cortical regions, 33 per hemisphere, left [L] and right [R]. Cortical regions: lingual[L], precentral[L], superiorparietal[L], lateralorbitofrontal[R], precentral[R], superiorparietal[R], supramarginal[R], medialorbitofrontal[L], posteriorcingulate[R].

  28. Non-Gaussian distributions. $y_i \sim$ member of an exponential family of distributions. Consider an optimization problem of the form $\arg\min_{b,\beta} \left\{ \underbrace{-2\, \mathrm{loglik}(y; \beta, b)}_{\text{model fit term}} + \underbrace{g_\lambda(b)}_{\text{penalty on } b} \right\}$, where $g_\lambda(b) := \lambda_Q\, b^T Q b + \lambda_R \|b\|_2^2$. $\lambda_Q$ and $\lambda_R$ are selected based on the equivalence with GLMM.
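
As one concrete instance of the exponential-family case, the sketch below treats binary outcomes with a logit link and minimizes $-2\,\mathrm{loglik} + g_\lambda(b)$ by Newton iterations. It is an illustration under assumptions, not the authors' procedure: the function name `ripeer_logistic`, the chain graph, and the simulated data are hypothetical, and $\lambda_Q$, $\lambda_R$ are fixed by hand rather than selected through the GLMM equivalence.

```python
import numpy as np

def ripeer_logistic(y, Z, X, Q, lam_Q, lam_R, n_iter=50, tol=1e-10):
    """Newton iterations for
        -2 * loglik(y; beta, b) + lam_Q * b^T Q b + lam_R * ||b||_2^2
    with Bernoulli outcomes and a logit link; lambdas are held fixed."""
    p, m = Z.shape[1], X.shape[1]
    W = np.hstack([Z, X])
    P = np.zeros((p + m, p + m))
    P[:p, :p] = lam_Q * Q + lam_R * np.eye(p)     # beta is not penalized
    theta = np.zeros(p + m)
    for _ in range(n_iter):
        mu = 1.0 / (1.0 + np.exp(-(W @ theta)))   # fitted probabilities
        grad = -2.0 * W.T @ (y - mu) + 2.0 * P @ theta
        hess = 2.0 * (W.T * (mu * (1.0 - mu))) @ W + 2.0 * P
        step = np.linalg.solve(hess, grad)
        theta = theta - step
        if np.max(np.abs(step)) < tol:
            break
    return theta[:p], theta[p:]

# Hypothetical example: chain graph on 5 regions, intercept-only X.
rng = np.random.default_rng(2)
n, p = 200, 5
A = np.zeros((p, p))
for i in range(p - 1):
    A[i, i + 1] = A[i + 1, i] = 1.0
Q = np.diag(A.sum(axis=1)) - A

Z = rng.normal(size=(n, p))
X = np.ones((n, 1))
b_true = np.array([0.8, 0.7, 0.6, -0.5, -0.6])
eta = Z @ b_true + 0.2
y = rng.binomial(1, 1.0 / (1.0 + np.exp(-eta))).astype(float)

b_hat, beta_hat = ripeer_logistic(y, Z, X, Q, lam_Q=1.0, lam_R=1.0)
```

Other exponential-family outcomes (for example Poisson counts) would change only the mean function and the weights in the Hessian.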
