Statistical inference in high-dimension & application to brain imaging

SLIDE 1

03/04/2019 Bertrand Thirion 1

Statistical inference in high-dimension & application to brain imaging

Imaging and machine learning workshop Bertrand Thirion, bertrand.thirion@inria.fr

SLIDE 2

Cognitive neuroscience

How are cognitive activities affected or controlled by neural circuits in the brain?

SLIDE 3

The brain, the mind and the scanner

[Diagram labels: cognitive theories; brain; scanner; fMRI data; brain mapping; experimental paradigm; stimuli]

SLIDE 4

The brain, the mind and the scanner

[Diagram labels: cognitive theories; brain; scanner; fMRI data; decoding; encoding; brain mapping; experimental paradigm; stimuli]

SLIDE 5

Encoding: mapping cognitive functions to brain activity

  • right hand - left hand
  • listen - read
  • button press - reading
  • computation - instructions
  • expression intention
  • guess the gender
  • face trustworthiness
  • false belief - mechanistic (auditory)
  • false belief - mechanistic (visual)
  • grasping - orientation judgement
  • hand - side judgement
  • intention - random
  • sentence - checkerboards
  • saccades - fixation
  • speech - non speech

SLIDE 6

Resolution increases

2007: 3 mm, p = 50,000
2014: 1.5 mm, p = 400,000
2021: 0.5 mm ?, p = 10^7
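
The feature counts on this slide follow from dividing a fixed imaged volume by the voxel volume. A minimal sketch of that arithmetic, assuming a brain volume of about 1.35 × 10^6 mm^3 (a value back-computed here from the slide's own 3 mm figure, not stated in the talk); `n_voxels` is a hypothetical helper name:

```python
# Rough voxel count p for an isotropic voxel size, assuming a fixed brain volume.
# BRAIN_VOLUME_MM3 is an assumption back-computed from p = 50,000 at 3 mm.
BRAIN_VOLUME_MM3 = 50_000 * 3.0 ** 3  # 1.35e6 mm^3


def n_voxels(voxel_size_mm: float) -> int:
    """Approximate number of voxels (features) at a given isotropic resolution."""
    return round(BRAIN_VOLUME_MM3 / voxel_size_mm ** 3)


for year, size in [(2007, 3.0), (2014, 1.5), (2021, 0.5)]:
    print(year, f"{size} mm", n_voxels(size))
```

Halving the voxel size multiplies p by 8, which is why the 0.5 mm case lands in the 10^7 range.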

SLIDE 7

Better estimators for large-scale brain imaging

  • A causal framework for brain activity decoding
  • Dimension reduction for images
  • Fast regularized ensembles of models
  • Statistical inference for high-dimensional models

SLIDE 8

Causal reasoning on encoding/decoding

Task (T) → Brain activity (X) → Behavior (B)

  • Causal encoding models: P(X|T)
  • Anti-causal decoding models: P(T|X)
  • Causal decoding models: P(B|X)
  • Anti-causal encoding models: P(X|B)

[Weichwald et al. NIMG 2015]

SLIDE 9

Causal interpretation

Task → X1, X2, ..., Xp
Encoding: causal; decoding: anti-causal

X1, X2, ..., Xp → Behavior
Encoding: anti-causal; decoding: causal

SLIDE 10

Causal reasoning on encoding/decoding

[Weichwald et al. NIMG 2015]

Task X1 X2 Xp ...

SLIDE 11

Causal reasoning on encoding/decoding

[Weichwald et al. NIMG 2015]

Task X1 X2 Xp ...

SLIDE 12

Causal reasoning on encoding/decoding

[Weichwald et al. NIMG 2015]

SLIDE 13

Causal reasoning on encoding/decoding

[Weichwald et al. NIMG 2015]

SLIDE 14

Joint encoding and decoding

[Schwartz et al. NIPS 2013, Varoquaux et al. PCB 2018]

“Encoding” “Decoding”

SLIDE 15

Decoding maps

SLIDE 16

Joint encoding and decoding

[Schwartz et al. NIPS 2013, Varoquaux et al. PCB 2018]

SLIDE 17

Statistical associations and causal reasoning

  • Problems:

– Establish non-independence based on finite datasets → statistical tests
– Large number of conditioning variables
– Encoding models: multiple comparison issues
– Decoding problem: statistical tests in multiple regression
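
Encoding models test one hypothesis per voxel, which is where the multiple comparison issue comes from. A minimal sketch of one standard correction, Bonferroni, chosen here purely for illustration (the slide does not name a specific procedure); `bonferroni_reject` is a hypothetical helper name:

```python
import numpy as np


def bonferroni_reject(p_values, alpha=0.05):
    """Flag the tests that stay significant after Bonferroni correction.

    Each p-value is compared to alpha divided by the number of tests,
    which bounds the family-wise error rate by alpha.
    """
    p_values = np.asarray(p_values, dtype=float)
    return p_values < alpha / p_values.size


# Three voxel-wise p-values: only the first survives the corrected threshold.
print(bonferroni_reject([0.001, 0.02, 0.04]))
```

With p in the hundreds of thousands the corrected threshold becomes very small, which is one reason statistical power is a concern at high resolution.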

SLIDE 18

Brain activity decoding

X1, X2, ..., Xp → y (weights w)

  • behavior = f (brain activity)
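
As an illustration of behavior = f (brain activity) with a weight vector w, here is a minimal sketch of a linear decoder on synthetic data. Ridge regression is an assumption made for this example (the slide only shows a generic f and w), and the data, sizes and the `ridge_fit` helper are hypothetical:

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 50, 200                       # few samples, many voxels (p > n)
X = rng.standard_normal((n, p))      # brain activity: one row per sample
w_true = np.zeros(p)
w_true[:5] = 1.0                     # only a few voxels carry signal
y = X @ w_true + 0.1 * rng.standard_normal(n)   # behavior


def ridge_fit(X, y, alpha=1.0):
    """Ridge weights via the dual form, which only solves an n x n system."""
    n_samples = X.shape[0]
    K = X @ X.T
    return X.T @ np.linalg.solve(K + alpha * np.eye(n_samples), y)


w_hat = ridge_fit(X, y)              # one weight per voxel
y_pred = X @ w_hat                   # decoded behavior
```

The fitted w_hat gives a prediction but no statement about which entries of w are reliably non-zero, which is the inference question the rest of the talk addresses.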

SLIDE 19

Outline

  • A causal framework for brain activity decoding
  • Dimension reduction for images
  • Fast regularized ensembles of models
  • Statistical inference for high-dimensional models

SLIDE 20

Compression in the image domain

  • Reduce the complexity of learning algorithms: p → k ≪ p
  • Random projections = fast generic solution, but

– Sub-optimal for structured signals
– Not invertible when p and k are large

  • Local redundancy → feature grouping strategies / clustering: “super-pixels”

– Fast clustering procedures needed (large-k regime)
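
Feature grouping can be written as a linear operator that averages features within each cluster and, unlike a random projection, has an obvious approximate inverse. A minimal sketch, assuming a clustering is already given as integer labels with no empty cluster; `grouping_matrix` and the toy data are hypothetical:

```python
import numpy as np


def grouping_matrix(labels):
    """Return the (k, p) matrix that averages features within each cluster."""
    labels = np.asarray(labels)
    k = labels.max() + 1
    Phi = np.zeros((k, labels.size))
    Phi[labels, np.arange(labels.size)] = 1.0        # cluster membership
    return Phi / Phi.sum(axis=1, keepdims=True)      # rows sum to 1


labels = np.array([0, 0, 0, 1, 1, 2])    # hypothetical clustering of p = 6 features
Phi = grouping_matrix(labels)
x = np.array([1.0, 2.0, 3.0, 10.0, 12.0, 7.0])

x_reduced = Phi @ x             # cluster means, length k = 3
x_approx = x_reduced[labels]    # piecewise-constant approximation, length p
```

The approximation error is small when the signal varies slowly within each cluster, which is the situation local redundancy in images provides.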

SLIDE 21

Superpixels as an image operator

SLIDE 22

Crafting good image compression

  • Key assumption: signal of interest L-Lipschitz
  • Feature grouping matrix
  • almost trivially:
  • Worst case

Need a fast method to learn balanced clusters

SLIDE 23

Denoising properties

  • Noisy signal model
  • Denoising
  • Equal-size clusters
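
The formulas on this slide did not survive extraction. As an illustration of the denoising effect with equal-size clusters, here is a minimal simulation under assumptions made for this example only: i.i.d. Gaussian noise of variance sigma^2, and a signal that is exactly constant within each cluster of size q.

```python
import numpy as np

rng = np.random.default_rng(0)
p, q, sigma = 10_000, 10, 1.0            # q = common cluster size
signal = np.repeat(rng.standard_normal(p // q), q)   # constant within each cluster
noisy = signal + sigma * rng.standard_normal(p)

# Average within clusters, then broadcast the means back to the p features.
denoised = np.repeat(noisy.reshape(-1, q).mean(axis=1), q)

mse_noisy = np.mean((noisy - signal) ** 2)         # close to sigma**2
mse_denoised = np.mean((denoised - signal) ** 2)   # close to sigma**2 / q
print(mse_noisy, mse_denoised)
```

Averaging q independent noise terms divides the noise variance by q; with unbalanced clusters the smallest clusters would benefit least, which is one motivation for balanced cluster sizes.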

SLIDE 24

Recursive neighbor Agglomeration

Based on local decisions = fast (linear time) – avoid percolation

[Thirion et al. Stamlins 2015, Hoyos Idrobo PAMI 2018]

ReNA
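
To make "local decisions" concrete, here is a deliberately simplified sketch of one agglomeration round on a 1D chain of features: each feature links to its closest adjacent feature, and connected groups are merged by averaging. This is an illustration only, not the published ReNA algorithm, which operates on the image lattice and iterates until a target number of clusters is reached; `nn_agglomeration_round` is a hypothetical name.

```python
import numpy as np


def nn_agglomeration_round(X):
    """One simplified merge round on a chain. X has shape (n_samples, p)."""
    # Distance between each pair of adjacent features (columns j and j + 1).
    d = np.linalg.norm(X[:, 1:] - X[:, :-1], axis=0)
    left = np.concatenate(([np.inf], d))     # distance from j to j - 1
    right = np.concatenate((d, [np.inf]))    # distance from j to j + 1
    picks_right = right <= left              # feature j links to j + 1, else to j - 1
    # Edge (j, j + 1) is kept if either endpoint chose it.
    keep = picks_right[:-1] | ~picks_right[1:]
    # On a chain, connected components are runs of kept edges.
    labels = np.concatenate(([0], np.cumsum(~keep)))
    k = labels[-1] + 1
    reduced = np.stack([X[:, labels == c].mean(axis=1) for c in range(k)], axis=1)
    return labels, reduced


X = np.array([[0.0, 0.1, 5.0, 5.2, 9.0]])   # one sample, five features
labels, reduced = nn_agglomeration_round(X)
print(labels, reduced)
```

Because every feature links to at least one neighbor, each round at least halves the number of features while using only distances between adjacent features.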

SLIDE 25

Effect on data analysis tasks

Impressive speed-up and increased accuracy with respect to non-compressed representation

– Clustering has a denoising effect

[Hoyos Idrobo IEEE PAMI 2018]

SLIDE 26

Outline

  • A causal framework for brain activity decoding
  • Dimension reduction for images
  • Fast regularized ensembles of models
  • Statistical inference for high-dimensional models

SLIDE 27

Bagging of clustered models

Inputs: X, y

  • Clustering (create contiguous regions)
  • Solve regression on cluster-based representation
  • Repeat for various clusterings, then average
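
The cluster, regress and average loop can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: clusterings are random contiguous blocks on a 1D feature axis rather than data-driven parcellations, the regression is ridge, and all function names are hypothetical.

```python
import numpy as np


def random_contiguous_labels(p, k, rng):
    """Split features 0..p-1 into k contiguous blocks with random boundaries."""
    cuts = np.sort(rng.choice(np.arange(1, p), size=k - 1, replace=False))
    return np.searchsorted(cuts, np.arange(p), side="right")


def fit_clustered_ridge(X, y, labels, alpha=1.0):
    """Ridge on cluster-averaged features, with weights mapped back to p features."""
    k = labels.max() + 1
    counts = np.bincount(labels, minlength=k)
    X_red = np.stack([X[:, labels == c].mean(axis=1) for c in range(k)], axis=1)
    w_red = np.linalg.solve(X_red.T @ X_red + alpha * np.eye(k), X_red.T @ y)
    # Same predictions in full space: X_red @ w_red == X @ (w_red[labels] / counts[labels])
    return w_red[labels] / counts[labels]


def bagged_clustered_ridge(X, y, k=20, n_models=10, seed=0):
    """Average feature-space weights over bootstrap samples and clusterings."""
    rng = np.random.default_rng(seed)
    n, p = X.shape
    weights = []
    for _ in range(n_models):
        idx = rng.choice(n, size=n, replace=True)       # bootstrap sample
        labels = random_contiguous_labels(p, k, rng)     # a different clustering each time
        weights.append(fit_clustered_ridge(X[idx], y[idx], labels))
    return np.mean(weights, axis=0)


rng = np.random.default_rng(1)
X = rng.standard_normal((40, 100))
y = X[:, :10].sum(axis=1)
w_bagged = bagged_clustered_ridge(X, y)
```

Each individual model is piecewise constant on its own clustering; averaging over clusterings smooths out the arbitrary block boundaries.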

SLIDE 28

Computationally efficient structure

State of the art solution: not very stable, but cheap “fast regularized ensembles of models”

SLIDE 29

Computationally efficient structure

SLIDE 30

Effect on prediction accuracy

“fast regularized ensembles of models”

[Hoyos Idrobo et al PRNI 2015, Neuroimage 2017, PAMI 2018]

SLIDE 31

More results

[Hoyos Idrobo et al PRNI 2015, Neuroimage 2017, PAMI 2018]

SLIDE 32

Outline

  • A causal framework for brain activity decoding
  • Dimension reduction for images
  • Fast regularized ensembles of models
  • Statistical inference for high-dimensional models

SLIDE 33

Statistical inference on w

  • Standard solutions for high-dimensional linear models (p ≅ n)

– Corrected ridge [Bühlmann 2013]
– Desparsified Lasso [Zhang & Zhang 2014, Montanari 2014]
– Multi-split [Meinshausen 2009], knockoffs [Candès 2015+]

  • Fail for p ≫ n
  • Inference: find {j: wj > 0} with some statistical guarantees

SLIDE 34

Desparsified Lasso

[Zhang & Zhang 2014 Series B Stat Meth]
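
The equations on this slide did not survive extraction. For reference, a standard statement of the desparsified Lasso estimator from the cited paper, in notation assumed here: x_j is the j-th column of X, X_{-j} the remaining columns, and z_j the residual of a Lasso regression of x_j on X_{-j} with coefficients \hat{\gamma}_j.

```latex
\hat{w}_j^{\mathrm{DL}}
  = \hat{w}_j^{\mathrm{Lasso}}
  + \frac{z_j^\top \left( y - X \hat{w}^{\mathrm{Lasso}} \right)}{z_j^\top x_j},
\qquad
z_j = x_j - X_{-j}\, \hat{\gamma}_j
```

The correction term removes most of the Lasso shrinkage bias on coordinate j, so that under suitable sparsity conditions the corrected estimate is approximately Gaussian and can be turned into a p-value.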

SLIDE 35

Desparsified Lasso

SLIDE 36

Preliminary assessment

SLIDE 37

Large p → need dimension reduction

Large p kills statistical power
CDL tames variance

p = 2000, n = 100

[Chevalier et al. subm. to MICCAI]

SLIDE 38

Adaptation to brain imaging

Step 1: compression by clustering
Step 2: inference on compressed representations
→ Clustered Desparsified Lasso (CDL)
Step 3: ensembling: iterate with different parcellations → aggregate p-values (see also FReM)
→ Ensemble of Clustered Desparsified Lasso (ECDL)

SLIDE 39

From CDL to ECDL

DL p-values from different clusterings → aggregation
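
One common way to aggregate p-values obtained from several runs is the quantile rule of Meinshausen et al. (2009), the multi-split reference cited earlier in the talk. The sketch below uses the fixed-gamma version; whether ECDL uses exactly this rule is an assumption, and `quantile_aggregate` is a hypothetical name.

```python
import numpy as np


def quantile_aggregate(p_values, gamma=0.5):
    """Aggregate p-values of shape (n_clusterings, p), feature by feature.

    Fixed-gamma quantile rule: min(1, gamma-quantile of p_b / gamma).
    With gamma = 0.5 this is twice the median, capped at 1.
    """
    p_values = np.asarray(p_values, dtype=float)
    q = np.quantile(p_values / gamma, gamma, axis=0)
    return np.minimum(1.0, q)


# Three clusterings, two features: the first is consistently significant.
p_runs = np.array([[0.01, 0.8],
                   [0.02, 0.9],
                   [0.03, 0.4]])
print(quantile_aggregate(p_runs))
```

Taking a quantile rather than the minimum keeps a feature from being declared significant because of a single favorable clustering.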

SLIDE 40

ECDL for brain imaging

SLIDE 41

δ-error control

SLIDE 42

δ-error control

SLIDE 43

δ-FWER control

SLIDE 44

δ-FWER-control

SLIDE 45

Simulations: ECDL > CDL

[Chevalier et al. MICCAI 2018]

SLIDE 46

Experiments: PR and FWER control

Better PR with ECDL + More accurate FWER control

[Chevalier et al. MICCAI 2018]

SLIDE 47

Effects on real data

[Nguyen et al. IPMI 2019, Chevalier et al. MICCAI 2018]

[Figure panels: social cognition; visual feature discrimination; language vs maths]

HCP dataset, n = 900

SLIDE 48

Conclusion

  • Causal reasoning → conditional association analysis
  • Large-p data bring challenges:

– Computation cost
– Difficulty of statistical inference

  • Solutions: ensembling, subsampling, compression
  • Efficient stochastic regularizers
  • Ongoing comparison with knockoff

WIP

  • Classification setting
  • Use of bootstrap

[Nguyen et al. IPMI 2019] [Aydore et al. subm]

SLIDE 49

From good ideas to good practices: software

  • Machine learning in Python
  • Machine learning for neuroimaging

http://nilearn.github.io

  • BSD, Python, OSS

– Classification of (neuroimaging) data
– Network analysis

SLIDE 50

Acknowledgements

Parietal

  • G. Varoquaux
  • A. Gramfort
  • P. Ciuciu
  • D. Wassermann
  • D. Engemann
  • B. Nguyen
  • A.L. Grilo Pinho
  • E. Dohmatob
  • A. Mensch
  • J.A. Chevalier
  • A. Hoyos Idrobo
  • D. Bzdok
  • J. Dockès
  • P. Cerda
  • C. Lazarus
  • D. La Rocca
  • G. Lemaitre
  • L. El Gueddari
  • O. Grisel
  • M. Massias
  • P. Ablin
  • H. Janati
  • J. Massich
  • K. Dadi
  • H. Richard
  • C. Petitot

Other collaborators

  • R. Poldrack
  • J. Haxby
  • C. F. Gorgolewski
  • J. Salmon
  • S. Arlot
  • M. Lerasle