SLIDE 1
Python for brain mining:
(neuro)science with state-of-the-art machine learning and data visualization
Gaël Varoquaux
“Brain mining”
Mayavi, scikit-learn, joblib
SLIDE 2
1 Brain mining
Learning models of brain function
Gaël Varoquaux 2
SLIDE 3
1 Imaging neuroscience
[Figure: brain images + cognitive tasks → models of function]
Data-driven science, rather than strong a priori models (i ∂Ψ/∂t = HΨ)
SLIDE 5
1 Brain functional data
Rich data: 50 000 voxels per frame, complex underlying dynamics
Few observations: ∼ 100
Drawing scientific conclusions? An ill-posed statistical problem
Modern complex-system studies: from strong hypotheses to rich data
SLIDE 7
1 Statistics: the curse of dimensionality
y a function of x1
y a function of x1 and x2
More fit parameters? ⇒ exponentially more data needed
y a function of 50 000 voxels?
Expert knowledge (pick the right voxels), or machine learning
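The exponential blow-up can be made concrete with a back-of-the-envelope sketch (the grid-sampling argument is a standard illustration, not from the slides):

```python
# To sample each axis at 10 positions, covering a d-dimensional
# grid requires 10**d observations: exponential in the dimension d.
for d in (1, 2, 3, 10):
    print(f"{d} dimension(s) -> {10 ** d} samples")
```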
SLIDE 10
1 Brain reading
Predict the viewed object from brain images
Correlation analysis
Inverse problem: from observations back to the spatial code
Inject a prior: regularize
Sparse regression = compressive sensing
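A minimal sketch of sparsity as a regularizing prior, using scikit-learn's Lasso on synthetic data (the dimensions and penalty strength here are illustrative stand-ins, not the actual fMRI setup):

```python
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.RandomState(0)
n_samples, n_voxels = 100, 500           # few observations, many features
X = rng.randn(n_samples, n_voxels)
w_true = np.zeros(n_voxels)
w_true[:10] = 1.0                        # sparse "spatial code": 10 active voxels
y = X @ w_true + 0.1 * rng.randn(n_samples)

# The L1 penalty injects the sparsity prior, making the
# under-determined inverse problem solvable.
lasso = Lasso(alpha=0.1).fit(X, y)
print("non-zero coefficients:", int(np.sum(lasso.coef_ != 0)))
```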
Extract brain regions: total-variation regression
[Michel, Trans Med Imag 2011]
Cast the problem as a prediction task: supervised learning. Prediction accuracy serves as a model-selection metric.
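Scoring models by predictive accuracy on held-out data amounts to cross-validation; a minimal sketch with modern scikit-learn (synthetic data, illustrative parameters):

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import cross_val_score

rng = np.random.RandomState(0)
X = rng.randn(200, 20)
y = X[:, 0] - 2 * X[:, 1] + 0.1 * rng.randn(200)

# Score each candidate model by its predictive accuracy on left-out folds
for alpha in (0.01, 1.0, 100.0):
    scores = cross_val_score(Ridge(alpha=alpha), X, y, cv=5)
    print(f"alpha={alpha}: mean R^2 = {scores.mean():.3f}")
```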
SLIDE 14
1 Ongoing/spontaneous activity
95% of brain activity is unrelated to the task
SLIDE 15
1 Learning regions from spontaneous activity
Multi-subject dictionary learning
[Figure: decomposition into spatial maps × time series]
Sparsity + spatial continuity + spatial variability
⇒ Individual maps + functional regions atlas
[Varoquaux, Inf Proc Med Imag 2011]
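The stock scikit-learn dictionary learner gives the flavor of the decomposition into spatial maps and time series; the multi-subject method of the paper adds spatial-continuity and inter-subject penalties that are not in scikit-learn. A toy sketch, with random data standing in for fMRI frames:

```python
import numpy as np
from sklearn.decomposition import MiniBatchDictionaryLearning

rng = np.random.RandomState(0)
frames = rng.randn(200, 100)             # 200 time frames x 100 "voxels"

dico = MiniBatchDictionaryLearning(n_components=10, alpha=1.0,
                                   random_state=0)
codes = dico.fit_transform(frames)       # per-component time series
maps = dico.components_                  # per-component "spatial maps"
print(codes.shape, maps.shape)           # (200, 10) (10, 100)
```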
SLIDE 16
1 Graphical models: interactions between regions
Estimate the covariance structure
Many parameters to learn
Regularize: conditional independence = sparsity of the inverse covariance
[Varoquaux NIPS 2010]
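A minimal sketch of the sparse inverse-covariance idea with scikit-learn's GraphicalLasso (named GraphLasso in releases of that era); the data and penalty here are toy stand-ins:

```python
import numpy as np
from sklearn.covariance import GraphicalLasso

rng = np.random.RandomState(0)
n_samples = 500
# two independent blocks of correlated "regions"
latent = rng.randn(n_samples, 2)
X = np.repeat(latent, 5, axis=1) + 0.5 * rng.randn(n_samples, 10)

# L1-penalized maximum likelihood: exact zeros in the estimated
# inverse covariance encode conditional independence between regions
model = GraphicalLasso(alpha=0.3).fit(X)
prec = model.precision_
print("zero off-diagonal entries:",
      int(np.sum(prec[np.triu_indices(10, k=1)] == 0)))
```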
Find structure via density estimation: unsupervised learning. Model selection: likelihood of new data.
SLIDE 18
2 My data-science software stack
Mayavi, scikit-learn, joblib
SLIDE 19
2 Mayavi: 3D data visualization
Requirements: large 3D data, interactive visualization, easy scripting
Solution: VTK (C++ data visualization), a traits-based UI, and a pylab-inspired API
Black-box solutions don’t yield new intuitions
Limitations: hard to install, clunky & complex, C++ leaking through
3D visualization doesn’t pay in academia: tragedy of the commons, or a niche product?
SLIDE 20
2 scikit-learn: statistical learning
Vision
Address non-machine-learning experts
Simplify, but don’t dumb down
Performance: be state of the art
Ease of installation
SLIDE 21
2 scikit-learn: statistical learning
Technical choices
Prefer Python or Cython; focus on readability
Documentation and examples are paramount
Little object-oriented design; opt for simplicity
Prefer algorithms to frameworks
Code quality: consistency and testing
SLIDE 22
2 scikit-learn: statistical learning
API
Inputs are numpy arrays
Learn a model from the data: estimator.fit(X_train, y_train)
Predict using the learned model: estimator.predict(X_test)
Test goodness of fit: estimator.score(X_test, y_test)
Apply a change of representation: estimator.transform(X, y)
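The pattern above can be sketched end-to-end on a toy dataset (a hedged illustration with modern scikit-learn; note that in current releases transform takes X alone):

```python
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

est = LogisticRegression(max_iter=200).fit(X_train, y_train)   # learn a model
pred = est.predict(X_test)                                     # predict
print("accuracy:", est.score(X_test, y_test))                  # goodness of fit

pca = PCA(n_components=2).fit(X_train)                         # representation
X_test_2d = pca.transform(X_test)
print(X_test_2d.shape)
```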
SLIDE 23
2 scikit-learn: statistical learning
Computational performance (benchmark times in seconds; blank: not available, ∞: did not converge)

              scikit-learn   mlpy    pybrain   pymvpa   mdp     shogun
SVM           5.2            9.47    17.5      11.52    40.48   5.63
LARS          1.17           105.3
Elastic net                  73.7
kNN           0.57           1.41                       0.58    1.36
PCA           0.18                                      0.47    0.33
k-Means       1.34           0.79    ∞                          0.68

Algorithms rather than low-level optimization: convex optimization + machine learning
Avoid memory copies
SLIDE 24
2 scikit-learn: statistical learning
Community
35 contributors since 2008, 103 GitHub forks
25 contributors in the latest release (3-month span)
Why this success? A trendy topic? Low barrier to entry
A friendly and very skilled mailing list
Credit given to people
SLIDE 25
2 joblib: Python functions on steroids
We keep recomputing the same things: nested loops with overlapping sub-problems, varying parameters, I/O
Standard solution: pipelines
Challenges: dependency modeling, parameter tracking
SLIDE 26
2 joblib: Python functions on steroids
Philosophy
Simple: don’t change your code
Minimal: no dependencies
Performant: big data
Robust: never fail
joblib’s solution = lazy recomputation: take an MD5 hash of the function arguments, store outputs to disk
SLIDE 27
2 joblib: lazy recomputation
>>> from joblib import Memory
>>> mem = Memory(cachedir='/tmp/joblib')
>>> import numpy as np
>>> a = np.vander(np.arange(3))
>>> square = mem.cache(np.square)
>>> b = square(a)
[Memory] Calling square...
square(array([[0, 0, 1],
              [1, 1, 1],
              [4, 2, 1]]))
>>> c = square(a)
>>> # No recomputation
SLIDE 28
Conclusion
Data-driven science will need machine learning, because of the curse of dimensionality
scikit-learn and joblib: a focus on large-data performance and ease of use
We cannot develop software and science separately