CBRAIN: An Integrated Web Platform for Neuroimaging

Tarek Sherif, EGI User Forum, April 2011


SLIDE 1

CBRAIN

An Integrated Web Platform for Neuroimaging

Tarek Sherif, EGI User Forum, April 2011

  • Dr. Alan Evans Laboratory
SLIDE 2

Funded by CANARIE

CANADA'S ADVANCED RESEARCH AND INNOVATION NETWORK http://www.canarie.ca

SLIDE 3

Network: CAnet & Global Lambda Facility

SLIDE 4

Canadian Brain Imaging Research Network → Global Brain Imaging Research Network

SLIDE 5
  • Neuroimaging Overview
  • Scientific Data Flow
  • CBRAIN: A Distributed Computing Platform for Neuroimaging
  • CBRAIN as a Web Service
  • Summary

SLIDE 6

What is Neuroimaging?

Neuroimaging draws on High Performance Computing, Clinical Expertise, Physical Sciences, Imaging Technology, and Basic Neuroscience.

Brain Imaging Techniques:

  • Magnetic Resonance Imaging (MRI)
  • Functional MRI (fMRI)
  • Positron Emission Tomography (PET)
  • Magnetoencephalography (MEG)

(image: a 3 Tesla MRI scanner)

SLIDE 7

Neuroimaging Research

Population Studies:

  • Alzheimer's Disease
  • Multiple Sclerosis
  • Autism
  • Schizophrenia
  • Normal brain development

(images: loss of cortical thickness in Alzheimer's; Multiple Sclerosis lesions; normal brain development in children)

SLIDE 8

Population Studies

  • Hundreds to thousands of brain scans
  • Images are not aligned across subjects or scans
  • Difficult to compare one brain to another

(images: unregistered vs. registered scans; a toy registration step is sketched below)
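Registration, in its simplest form, resamples each scan through a spatial transform onto a common reference grid so that voxels correspond across subjects. A minimal sketch, assuming the affine transform is already known (real pipelines estimate it automatically; scipy is used here purely for illustration):

```python
# Toy registration step: resample a "moving" volume through a given
# affine transform onto the reference grid. In practice the transform
# is estimated by the registration tool, not hard-coded.
import numpy as np
from scipy.ndimage import affine_transform

moving = np.random.rand(64, 64, 64)   # stand-in for one subject's scan
A = np.diag([1.1, 1.1, 1.1])          # slight scale mismatch between scans
offset = np.array([2.0, -1.5, 0.5])   # translation between scans

# Resample so the moving volume lines up with the reference grid.
registered = affine_transform(moving, A, offset=offset, order=1)
```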

SLIDE 9

Extracting Features from Data

(figures: lobe-based cortical thickness and complexity maps)

  • Skull Masking
  • Registration in Stereotaxic Space
  • Tissue Classification
  • 3D Volumes
  • Cortical Thickness
  • Gyrification Index
  • Lobe-based Complexity

(figure: MS lesion model)

  • 3-5 hours per scan
  • Hundreds of scans per study
  • GBs of data
  • 1000s of CPU hours (rough check below)
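A back-of-envelope check that these figures hang together, using the slide's numbers ("hundreds of scans" is taken as roughly 500 here, an assumption for illustration):

```python
# Rough compute budget for one population study.
scans = 500                       # "hundreds of scans per study"
hours_per_scan = (3, 5)           # per-scan pipeline time, from the slide
low, high = (scans * h for h in hours_per_scan)
print(f"{low}-{high} CPU-hours")  # 1500-2500, i.e. "1000s of CPU hours"
```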
SLIDE 10

Tools

Many valuable tools exist for neuroimaging, but they are typically:

  • Desktop-based
  • Hard to install
  • Hard to use
  • Compute-intensive
  • Not collaborative
SLIDE 11

“Modern” Data Flow

Data → Compute → Analysis (visualisation) → Knowledge
SLIDE 12

“Modern” Data Flow

Data → Compute → Analysis (visualisation) → Knowledge

Data:

  • Lots of formats (some are weird)!
  • Lots of it (GB - TB+)
  • Security
  • Acquisition quality
  • Completeness
  • Annotation
SLIDE 14

“Modern” Data Flow

Data → Compute → Analysis (visualisation) → Knowledge

Compute:

  • Lots of tools of various quality
  • Open source & proprietary
  • Large amounts of compute
  • Compute access
  • Data transfer
  • Cost
SLIDE 16

“Modern” Data Flow

Data → Compute → Analysis (visualisation) → Knowledge

Analysis:

  • Lots of tools of various quality
  • Open source & proprietary
  • 3D is often desktop based (not collaborative)
  • Large data often requires infrastructure (cost)
SLIDE 17

CBRAIN: An Integrated Web Platform

SLIDE 18

Goal: Lightweight Distributed Architecture

Distributed data, distributed computing, distributed users.

Nothing specific to neuroimaging.
SLIDE 19

Simple Web Interface

Data

SLIDE 20

Simple Web Interface

Compute

SLIDE 21

Simple Web Interface

Results Visualisation

SLIDE 22

Distributed Components

Separation of work is key.

  • MetaData DB: database (MySQL) holding instances (data, users, jobs, tools, HPCs) and their states.
  • CBRAIN Portal: presentation, models, logic, coordination.
  • Execution Servers: control of resources (HPC, web services, ...).
  • Data Providers: networked file servers and databases holding the files.

LIGHT network & compute: HTTP/SSL, XML, SQL, SSH.
HEAVY network & compute: data sync, SSH.
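A minimal sketch of this light/heavy split, assuming a hypothetical task schema (the table, column, and function names below are illustrative, not CBRAIN's actual internals): the Portal only writes lightweight task records to the metadata DB, while an Execution Server on the cluster side polls for new tasks, syncs the heavy data, and hands jobs to the local scheduler.

```python
# Hypothetical sketch: portal writes metadata (LIGHT); execution server
# does the heavy data staging and scheduling (HEAVY).
import sqlite3

db = sqlite3.connect(":memory:")   # stand-in for the MySQL metadata DB
db.execute("""CREATE TABLE tasks (
                  id INTEGER PRIMARY KEY,
                  tool TEXT, input_file TEXT,
                  status TEXT DEFAULT 'New')""")

def portal_submit(tool, input_file):
    """Portal side (LIGHT): just record the task's metadata."""
    db.execute("INSERT INTO tasks (tool, input_file) VALUES (?, ?)",
               (tool, input_file))
    db.commit()

def execution_server_poll():
    """Cluster side (HEAVY): claim new tasks, stage data, submit jobs."""
    rows = db.execute("SELECT id, tool, input_file FROM tasks "
                      "WHERE status = 'New'").fetchall()
    for task_id, tool, input_file in rows:
        stage_from_data_provider(input_file)    # e.g. rsync/sftp over SSH
        submit_to_scheduler(tool, input_file)   # e.g. qsub on the head node
        db.execute("UPDATE tasks SET status = 'Queued' WHERE id = ?",
                   (task_id,))
    db.commit()

def stage_from_data_provider(path):
    print(f"syncing {path} from its Data Provider")

def submit_to_scheduler(tool, path):
    print(f"submitting {tool}({path}) to the cluster scheduler")

portal_submit("civet", "subject001.mnc")
execution_server_poll()
```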

SLIDES 23-34

Distributed Platform

(diagram, built up across these slides) A Scientist works through the PORTAL and its DB in Montréal. A Data Provider in Vancouver serves the DATA, with its own DB. COMPUTE sites at Sherbrooke (RQCHP) and Vancouver each run an Execution Controller on the cluster head node, alongside the local Scheduler and Workers. Numbered arrows 1-7 trace a job's path: from the Portal (1) to the Execution Controller on the head node (2), out to the Workers (3-4), via the Scheduler (5-6), with status and job control reported back to the Portal (7).
SLIDE 35

Achievements

Illustrative Performance Comparison

NIH-Pediatric-Obj1: up to 3 visits per subject; 866 CIVET pipeline runs to generate cortical thickness maps.
Input: 866 x 3 x 5 MB = 15 GB. Output: 866 x 250 MB = 211 GB.

Cluster                            Total CPU-hrs     Maximum performance    Typical performance
                                                     (# cores, time)        (# cores, time)
mammouth-ms2 (RQCHP, Sherbrooke)   866 x 4 = 3464    ~500 cores, 3 h        176 cores, 20 h
CLUMEQ-Krylov (McGill)             866 x 6 = 5196    ~90 cores, 58 h        24 cores, 216 h
BIC (Linux)                        866 x 8 = 6928    ~100 cores, 69 h       40 cores, 173 h

In general, studies that used to take one week to one month now take one day.
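The "typical performance" column is internally consistent: wall time is roughly total CPU-hours divided by core count, assuming near-linear scaling. A quick check of the table's arithmetic:

```python
# Recompute the table's typical wall times from CPU-hours and core counts.
runs = 866
typical = [("mammouth-ms2", 4, 176),   # (cluster, CPU-hrs per run, cores)
           ("CLUMEQ-Krylov", 6, 24),
           ("BIC (Linux)", 8, 40)]
for name, hrs, cores in typical:
    total = runs * hrs
    print(f"{name}: {total} CPU-hrs / {cores} cores = {total/cores:.0f} h")
# -> 20 h, 216 h, 173 h, matching the table
```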

SLIDE 36

HPC Integration

(8 compute installations, 80,000+ cores)

  • Orcinus - WestGrid (3,072 cores)
  • Kraken - SHARCNET (3,774 cores)
  • McGill - CLUMEQ & local servers (350 - 16,000 cores)
  • GPC - SciNet (30,240 cores)
  • Mammouth II - RQCHP (2,464 cores)
  • Colosse - CLUMEQ (7,616 cores)
  • JUROPA - Jülich (26,304 cores)

SLIDE 37
Collaboration

  • The integrative approach of CBRAIN makes sharing resources extremely simple.
  • CBRAIN uses Projects to define permissions, similar to groups in Unix (sketched below).
  • Each resource in the system (files, HPCs, Data Providers) is assigned a Project.
  • All users in a given Project have access to any resources associated with that Project.
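A minimal sketch of this permission model (class and attribute names are illustrative, not CBRAIN's internals): every resource carries exactly one Project, and access reduces to project membership, just as Unix file access reduces to group membership.

```python
# Project-based permissions, analogous to Unix groups.
from dataclasses import dataclass, field

@dataclass(frozen=True)
class Project:
    name: str

@dataclass
class Resource:            # a file, an HPC, or a Data Provider
    name: str
    project: Project

@dataclass
class User:
    name: str
    projects: set = field(default_factory=set)

    def can_access(self, resource: Resource) -> bool:
        # A user may use any resource assigned to one of their Projects.
        return resource.project in self.projects

neuro = Project("NeuroDev")
scan = Resource("subject001.mnc", project=neuro)
alice = User("alice", projects={neuro})
assert alice.can_access(scan)
```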

SLIDE 38

Infrastructures

  • outGRID: linking the CBRAIN Community (McGill), the neuGRID Community (FBF), and the LONI Community (UCLA)

Continental Access for Communities

SLIDE 39
CBRAIN as a Web Service

  • Allow interactions from clients other than web browsers.
  • RESTful API.
  • XML- and JSON-based interactions (illustrated below).
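A hedged illustration of this style of interaction, using only the Python standard library; the host, endpoint paths, and field names below are placeholders, not CBRAIN's documented API:

```python
# Hypothetical REST client: list resources as JSON, create a task as JSON.
import json
import urllib.request

BASE = "https://portal.example.org"   # stand-in for a CBRAIN portal URL

def get_json(path):
    """GET a resource list; the server could equally return XML."""
    req = urllib.request.Request(BASE + path,
                                 headers={"Accept": "application/json"})
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)

def post_json(path, payload):
    """POST a JSON document, e.g. to submit a job."""
    req = urllib.request.Request(
        BASE + path, data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"}, method="POST")
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)

# e.g. list files, then launch a (hypothetical) CIVET task on one of them:
# files = get_json("/userfiles")
# task = post_json("/tasks", {"tool": "civet", "file_id": files[0]["id"]})
```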

SLIDES 40-43

CBRAIN-LONI Interoperability Demo (screenshots)
SLIDE 44
CBRAIN as a Web Service

What's been done:

  • Most key CBRAIN resources are now available through a RESTful interface.
  • Outside applications can now get lists of files/tasks, submit jobs, etc.

What's left to be done:

  • CBRAIN's integrated framework is meant to handle data already in the system, i.e. files are meant to be registered with the system so that CBRAIN can track them, avoid redundancy, etc.
  • It must be decided how external systems will get their data into and out of CBRAIN in a reasonable manner; the sketch below illustrates one possibility.
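One plausible shape for that ingestion step, sketched with hypothetical names (this is not CBRAIN's implemented behaviour): register an external file's location once on a Data Provider, then refer to the registered entry from tasks, so the system can track it and avoid redundant copies.

```python
# Hypothetical registration-before-use flow for external data.
registry = {}   # stand-in for CBRAIN's file records

def register_file(data_provider, path):
    """Record an external file once; tasks then use the registered id."""
    key = (data_provider, path)
    if key not in registry:             # avoid redundant registrations
        registry[key] = {"id": len(registry) + 1, "synced": False}
    return registry[key]["id"]

file_id = register_file("dp_vancouver", "study/subject001.mnc")
# An external system would now submit tasks against file_id, and
# retrieve outputs the same way (registered results, synced back out).
```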

SLIDE 45

tsherif@bic.mni.mcgill.ca alan.evans@mcgill.ca

http://cbrain.mcgill.ca

Team:

Montreal Neurological Institute, McGill University (Lead)

Principal Investigator: Alan Evans
Program Manager: Reza Adalat
System Architect: Marc Rousseau
Developers: Pierre Rioux, Tarek Sherif, Angela McCloskey, Nicolas Kassis, Samir Das, David Brownlee
System Administrator: Tien Duc Nguyen
McGill Office of Technology Transfer (OTT): Francoys Labonte
Canada National Research Council: Louis Borgeat
Consultants: Rosanne Aleong, Claude Lepage, Pierre Bellec, Andrew Janki, Robert Vincent

Remote Sites:

Rotman Research Institute, University of Toronto
Principal Investigators: Stephen Strother and Randy McIntosh; Developers: Anda Pacurar, Anita Oder, Jacques Waller

Robarts Research Institute, University of Western Ontario
Principal Investigators: Ravi Menon and Mel Goodale; Developers: Martyn Klassen, Ronghai Tu

Unité de Neuroimagerie Fonctionnelle, Université de Montréal
Principal Investigators: Julien Doyon and Rick Hoge; Developer: Mathieu Desrosiers

Division of Neurology, University of British Columbia
Principal Investigators: Jon Stoessl and Max Cynader; Developers: Ryan Thomson, Nasim Vafai

NCMIR, University of California San Diego, USA
Principal Investigator: Mark Ellisman; System Administrator: Raj Singh

INM, Forschungszentrum Jülich, Germany
Principal Investigators: Karl Zilles and Uwe Pietrzyk; Scientist: Hartmut Mohlberg

CNA, Hanyang University, South Korea
Principal Investigator: Jong-min Lee

LONI, University of California Los Angeles, USA
Principal Investigator: Arthur Toga


SLIDE 47

GRID or CLOUD?

Cloud computing (Wikipedia) describes computation, software, data access, and storage services that do not require end-user knowledge of the physical location and configuration of the system that delivers the services.

Grid computing (Wikipedia) refers to the combination of computer resources from multiple administrative domains to reach a common goal.

SLIDE 50

GRID or CLOUD?

(comparison chart) http://markusklems.files.wordpress.com/2008/06/egee-grid-cloud-comparison1.jpg

SLIDE 51

Small Brains - Normal Development

NIH Normal Brain Development study: ~500 U.S. children (0 to 48 months)

SLIDE 52

Stereotaxic Space

J. Talairach & P. Tournoux, "Co-planar Stereotaxic Atlas of the Human Brain", Georg Thieme, 1988.