Learning graphical models of the brain
Gaël Varoquaux


  1. Learning graphical models of the brain. Gaël Varoquaux

  2. Functional MRI (fMRI): recordings of brain activity over time.

  3. Functional MRI (fMRI): recordings of brain activity over time. Brain mapping, the motor system: “move the right hand”; the language system: “say three names of animals”.

  4. Functional MRI (fMRI). Brain mapping: the language network (“say three names of animals”).

  5. Functional MRI (fMRI). Brain mapping: the language network as interacting sub-systems: sounds, lexical access, syntax.

  6. The functional connectome: view of the brain as a set of regions and their interactions.

  7. The functional connectome: view of the brain as a set of regions and their interactions. Intrinsic brain architecture; biomarkers of pathologies; learn a graphical model. Human Connectome Project: $30M.

  8. Resting-state fMRI.

  9. Outline: 1 Graphical structures of brain activity; 2 Multi-subject graph learning; 3 Beyond ℓ1 models.

  10. 1 Graphical structures of brain activity. Functional connectome: graph of interactions between regions. [Varoquaux & Craddock 2013]

  11. 1 From correlations to connectomes. Conditional independence structure?

  12. 1 Probabilistic model for interactions. Simplest data-generating process = multivariate normal:
      P(X) ∝ |Σ⁻¹|^½ exp(−½ Xᵀ Σ⁻¹ X)
      Model parametrized by the inverse covariance matrix K = Σ⁻¹: conditional covariances. Goodness of fit: likelihood of the observed covariance Σ̂ in the model Σ:
      L(Σ̂ | K) = log|K| − trace(Σ̂ K)
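This goodness-of-fit is simple to evaluate numerically. A minimal NumPy sketch (not from the slides; the toy data and function name are illustrative):

```python
# Evaluating the Gaussian goodness-of-fit L(Sigma_hat | K)
# = log|K| - trace(Sigma_hat @ K) for a candidate precision matrix K.
import numpy as np

def gaussian_log_likelihood(emp_cov, precision):
    """Log-likelihood of an empirical covariance under precision K."""
    sign, logdet = np.linalg.slogdet(precision)
    if sign <= 0:
        raise ValueError("K must be positive definite")
    return logdet - np.trace(emp_cov @ precision)

rng = np.random.default_rng(0)
X = rng.standard_normal((200, 5))    # 200 "time points", 5 "regions"
emp_cov = np.cov(X, rowvar=False)
K = np.linalg.inv(emp_cov)           # MLE precision, for illustration
print(gaussian_log_likelihood(emp_cov, K))
```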

  13. 1 Graphical structure from correlations. [Figure: observations vs. direct connections; covariance and inverse covariance matrices of a 4-node graph. Diagonal of the covariance: signal variance; diagonal of the inverse covariance: node innovation.]

  14. 1 Independence structure (Markov graph). Zeros in partial correlations give conditional independence. Reflects the large-scale brain interaction structure.

  15. 1 Independence structure (Markov graph). Zeros in partial correlations give conditional independence. Ill-posed problem: multi-collinearity ⇒ noisy partial correlations. Independence between nodes makes the estimation of partial correlations well-conditioned. Chicken-and-egg problem.

  16. 1 Independence structure (Markov graph). Zeros in partial correlations give conditional independence. Ill-posed problem: multi-collinearity ⇒ noisy partial correlations. Independence between nodes makes the estimation of partial correlations well-conditioned. Joint estimation: sparse inverse covariance. [Figure: 4-node graph and its sparse precision matrix.]
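For illustration, the partial correlations whose zeros encode these conditional independences can be read directly off a precision matrix. A minimal NumPy sketch (the helper name is ours):

```python
# Partial correlations from a precision matrix K: conditional independence
# between regions i and j corresponds to K[i, j] == 0.
import numpy as np

def partial_correlations(precision):
    """rho_ij = -K_ij / sqrt(K_ii * K_jj); diagonal set to 1."""
    d = np.sqrt(np.diag(precision))
    partial = -precision / np.outer(d, d)
    np.fill_diagonal(partial, 1.0)
    return partial
```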

  17. 1 Sparse inverse covariance estimation: penalized. Maximum a posteriori: fit models with a penalty. Sparsity ⇒ Lasso-like problem: ℓ1 penalization:
      K̂ = argmin_{K ≻ 0} L(Σ̂ | K) + λ ℓ1(K)
      Data fit (likelihood) + penalization. [Varoquaux NIPS 2010] [Smith 2011]

  18. 1 Sparse inverse covariance estimation: penalized. [Plot: test-data likelihood vs. sparsity, −log10 λ from 2.5 to 4.0; the optimal graph is almost dense.]

  19. 1 Sparse inverse covariance estimation: penalized. Bias of ℓ1: very sparse graphs don’t fit the data.
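A sketch of this penalized estimator with scikit-learn's GraphicalLassoCV, which selects λ by cross-validated test-data likelihood, as in the curve above; the random data stand in for real fMRI time series:

```python
# Sparse inverse covariance (graphical lasso) with cross-validated penalty.
import numpy as np
from sklearn.covariance import GraphicalLassoCV

rng = np.random.default_rng(0)
X = rng.standard_normal((300, 10))  # stand-in for (time points, regions)

model = GraphicalLassoCV().fit(X)
precision = model.precision_        # estimated sparse inverse covariance
print("chosen penalty:", model.alpha_)
```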

  20. 1 Sparse inverse covariance estimation: penalized. Algorithmic considerations: very ill-conditioned input matrices. The graph-lasso [Friedman 2008] doesn’t work well: primal-dual algorithm with an approximation when switching from dual to primal [Mazumder 2012]. Good success with ADMM: split optimization, the loss solved with SPD matrices, the penalty solved with sparse matrices.
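A minimal NumPy sketch of this ADMM splitting, following the standard ADMM updates for the graphical lasso; this is our illustration, not the implementation used in the talk. The loss step stays in SPD matrices via an eigendecomposition, while the penalty step stays sparse via soft-thresholding:

```python
import numpy as np

def graphical_lasso_admm(emp_cov, lam, rho=1.0, n_iter=200):
    """Sketch: min_{K > 0} -log det K + trace(emp_cov @ K) + lam * ||K||_1."""
    p = emp_cov.shape[0]
    Z = np.eye(p)                  # sparse copy of K
    U = np.zeros((p, p))           # scaled dual variable
    for _ in range(n_iter):
        # Loss step (SPD): solve rho*K - K^{-1} = rho*(Z - U) - emp_cov
        # by eigendecomposition, which keeps K positive definite.
        eigvals, Q = np.linalg.eigh(rho * (Z - U) - emp_cov)
        k = (eigvals + np.sqrt(eigvals**2 + 4 * rho)) / (2 * rho)
        K = (Q * k) @ Q.T
        # Penalty step (sparse): elementwise soft-thresholding of K + U.
        A = K + U
        Z = np.sign(A) * np.maximum(np.abs(A) - lam / rho, 0.0)
        # Dual update.
        U += K - Z
    return Z
```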

  21. 1 Very sparse graphs: greedy construction. Sparse inverse covariance algorithm: PC-DAG [Rütimann & Bühlmann 2009]. Greedy approach: 1. PC-alg: fill the graph by independence tests, conditioning on neighbors; 2. learn the covariance on the resulting structure. Good for very sparse graphs. [Varoquaux J. Physiol. Paris 2012]
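As an illustration of step 1, here is a hedged sketch of the kind of first-order conditional-independence test the PC algorithm relies on (partial correlation with a Fisher z-transform); it is not the full PC-DAG procedure:

```python
import numpy as np
from scipy import stats

def partial_corr_test(x, y, z):
    """p-value for H0: corr(x, y | z) = 0, one conditioning variable."""
    n = len(x)
    rxy = np.corrcoef(x, y)[0, 1]
    rxz = np.corrcoef(x, z)[0, 1]
    ryz = np.corrcoef(y, z)[0, 1]
    # First-order partial correlation.
    r = (rxy - rxz * ryz) / np.sqrt((1 - rxz**2) * (1 - ryz**2))
    # Fisher z-transform; n - 4 = n - (#conditioning vars) - 3.
    z_stat = np.sqrt(n - 4) * 0.5 * np.log((1 + r) / (1 - r))
    return 2 * stats.norm.sf(abs(z_stat))
```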

  22. 1 Sparse graphs: greedy construction. Iterating the construction algorithm, high-degree nodes appear very quickly; complexity ∝ exp(degree). [Plot: test-data likelihood vs. filling factor, 0–20%.] Lattice-like structure with hubs. [Varoquaux J. Physiol. Paris 2012]

  23. 2 Multi-subject graph learning. Not enough data per subject to recover the structure.

  24. 2 Subject-level data scarcity. Sparse recovery for Gaussian graphs: ℓ1 structure recovery has phase-transition behavior. For Gaussian graphs with s edges and p nodes:
      n = O((s + p) log p), in the regime s = o(√p)   [Lam & Fan 2009]
      Need to accumulate data across subjects. Concatenating series = i.i.d. data.

  25. 2 Graphs on group data. [Figure: Σ̂⁻¹, sparse inverse, and sparse group-concat precision matrices.] Likelihood of new data (cross-validation):
      Subject data, Σ⁻¹: −57.1
      Subject data, sparse inverse: 43.0
      Group concat data, Σ⁻¹: 40.6
      Group concat data, sparse inverse: 41.8
      Inter-subject variability. [Varoquaux NIPS 2010]

  26. 2 Multi-subject modeling. Common independence structure but different connection values:
      {K^s} = argmin_{K^s ≻ 0} Σ_s L(Σ̂^s | K^s) + λ ℓ21({K^s})
      Multi-subject data fit (likelihood) + group-lasso penalization. [Varoquaux NIPS 2010]

  27. 2 Multi-subject modeling. Common independence structure but different connection values:
      {K^s} = argmin_{K^s ≻ 0} Σ_s L(Σ̂^s | K^s) + λ ℓ21({K^s})
      Multi-subject data fit (likelihood); ℓ21 = ℓ1 on the connections of the ℓ2 on the subjects. [Varoquaux NIPS 2010]
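A sketch of this group-sparse estimator, assuming nilearn's GroupSparseCovarianceCV (which implements this ℓ21-penalized model and cross-validates λ); the random arrays stand in for per-subject fMRI time series:

```python
# Group-sparse inverse covariance: one precision matrix per subject,
# sharing a common zero pattern across the group.
import numpy as np
from nilearn.connectome import GroupSparseCovarianceCV

rng = np.random.default_rng(0)
subjects = [rng.standard_normal((150, 10)) for _ in range(5)]  # 5 subjects

gsc = GroupSparseCovarianceCV().fit(subjects)
precisions = gsc.precisions_   # stacked per-subject precision matrices
```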

  28. 2 Population-sparse graphs perform better. [Figure: Σ̂⁻¹, population-prior, and sparse inverse precision matrices.] Likelihood of new data (cross-validation), with sparsity:
      Subject data, Σ⁻¹: −57.1
      Subject data, sparse inverse: 43.0 (60% full)
      Group concat data, Σ⁻¹: 40.6
      Group concat data, sparse inverse: 41.8 (80% full)
      Group sparse model: 45.6 (20% full)
      [Varoquaux NIPS 2010]

  29. 2 Independence structure of brain activity: subject-sparse estimate.

  30. 2 Independence structure of brain activity: population-sparse estimate.

  31. 2 Large-scale organization. High-level cognitive function arises from the interplay of specialized brain regions: “The functional segregation of local areas [...] contrasts sharply with their global integration during perception and behavior” [Tononi 1994]. Functional segregation: nodes of the connectome, atomic functions (e.g., tonotopy). Global integration: functional networks, high-level functions (e.g., language).

  32. 2 Large-scale organization. High-level cognitive function arises from the interplay of specialized brain regions [Tononi 1994]. Scale-free, hierarchical integration/segregation. Graph modularity = divide into communities to maximize intra-class versus extra-class connections [Eguiluz 2005].

  33. 2 Graph cuts to isolate functional communities. Find communities that maximize the modularity:
      Q = Σ_{c=1}^{k} [ A(V_c, V_c) / A(V, V) − (A(V, V_c) / A(V, V))² ]
      where A(V_a, V_b) is the sum of edges going from V_a to V_b. Rewrite as an eigenvalue problem [White 2005] ⇒ spectral clustering = spectral embedding + k-means. Similar to normalized graph cuts.
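A sketch of this spectral-clustering route (spectral embedding of the graph followed by k-means), here via scikit-learn on a precomputed affinity; the toy adjacency matrix is illustrative:

```python
# Spectral clustering on a symmetric, nonnegative adjacency matrix,
# i.e. spectral embedding + k-means, close to normalized graph cuts.
import numpy as np
from sklearn.cluster import SpectralClustering

rng = np.random.default_rng(0)
# Toy adjacency: two noisy communities of 10 nodes each.
A = rng.random((20, 20)) * 0.1
A[:10, :10] += 1.0
A[10:, 10:] += 1.0
A = (A + A.T) / 2                  # symmetrize

labels = SpectralClustering(n_clusters=2,
                            affinity="precomputed").fit_predict(A)
print(labels)
```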

  34. 2 Large-scale organization. Non-sparse: neural communities.

  35. 2 Large-scale organization. Group-sparse: neural communities = large known functional networks.

  36. 2 Brain integration between communities. Proposed measure of functional integration: mutual information (Tononi) [Marrelec 2008, Varoquaux & Craddock 2013]. Integration: I_{c1} = ½ log det(K_{c1}), the “energy” in a network. Mutual information: M_{c1,c2} = I_{c1∪c2} − I_{c1} − I_{c2}, the “cross-talk” between networks.
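These two quantities are direct determinant computations on sub-blocks of the precision matrix. A minimal NumPy sketch (the index sets and toy data are illustrative):

```python
# Integration I_c = 1/2 log det(K_c) and mutual information between
# two communities, computed from a precision matrix K.
import numpy as np

def integration(K, idx):
    """I_c for the community given by the index list idx."""
    sub = K[np.ix_(idx, idx)]
    return 0.5 * np.linalg.slogdet(sub)[1]

def mutual_information(K, c1, c2):
    """M_{c1,c2} = I_{c1 u c2} - I_{c1} - I_{c2}: cross-talk."""
    return integration(K, c1 + c2) - integration(K, c1) - integration(K, c2)

rng = np.random.default_rng(0)
X = rng.standard_normal((500, 6))
K = np.linalg.inv(np.cov(X, rowvar=False))
print(mutual_information(K, [0, 1, 2], [3, 4, 5]))
```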

  37. 2 Brain integration between communities. [Connectome figures: with population prior vs. raw correlations. Networks shown: occipital pole visual areas, medial visual areas, lateral visual areas, default mode network, fronto-parietal networks, fronto-lateral network, posterior inferior temporal 1 and 2, pars opercularis, right thalamus, dorsal motor, ventral motor, cingulo-insular network, auditory, left putamen, basal ganglia.] [Varoquaux NIPS 2010]

  38. 3 Beyond ℓ1 models. [Plot: test-data likelihood vs. sparsity, −log10 λ from 2.5 to 4.0.]
