Distributed Computation of Feature-Detectors for Medical Image Processing on GPGPU and Cell Processors

Peter Zinterhof Scientific Computing, Salzburg University Distributed Computation of Feature-Detectors for Medical Image Processing on GPGPU and Cell Processors Peter Zinterhof Third Workshop on UnConventional High Performance Computing 2010 (UCHPC 2010), Ischia – Naples, Italy

Task:
• Automated recognition of 'regions of interest' within medical imaging data
• Here: recognition of kidneys within CT slices

General approach:
Evolve filter-sets which generate feature vectors that can be classified by means of a Kohonen self-organizing map.

Image filtering:
• Filter: 4 x 4 pixels, fed into a perceptron
• The perceptron sums its 16 inputs linearly and fires if the threshold θ is exceeded
• → sliding window (convolution)
• Each filter delivers a single number for one image
• Aggregate 8 filters into a feature vector of length 8
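The filtering scheme described above can be sketched on the CPU as follows. This is a minimal illustration, not the author's implementation: it uses a plain weighted sum of the 16 pixels, slides the window over every valid position of a 64 x 64 image, and shares one threshold `theta` among all filters. The names `count_fire_events`, `feature_vector`, `IMG`, `FILT` and `NFILT` are hypothetical.

```c
#include <stddef.h>

#define IMG   64   /* image edge length in pixels (assumed 64 x 64) */
#define FILT  4    /* each filter covers 4 x 4 pixels */
#define NFILT 8    /* eight filters form one feature vector */

/* Count how often one perceptron fires while its 4 x 4 window
   slides over a row-major IMG x IMG image. */
static int count_fire_events(const unsigned char *img, const float *w, float theta)
{
    int count = 0;
    for (int y = 0; y + FILT <= IMG; y++) {
        for (int x = 0; x + FILT <= IMG; x++) {
            float sum = 0.0f;
            for (int dy = 0; dy < FILT; dy++)
                for (int dx = 0; dx < FILT; dx++)
                    sum += w[dy * FILT + dx] * (float)img[(y + dy) * IMG + (x + dx)];
            if (sum > theta)   /* the perceptron fires */
                count++;
        }
    }
    return count;
}

/* One number per filter; eight filters give a feature vector of length 8.
   w holds NFILT consecutive 4 x 4 weight blocks. */
void feature_vector(const unsigned char *img, const float *w, float theta, int out[NFILT])
{
    for (int f = 0; f < NFILT; f++)
        out[f] = count_fire_events(img, w + (size_t)f * FILT * FILT, theta);
}
```

With these loop bounds the window visits 61 x 61 positions per image, so each entry of the feature vector lies between 0 and 3721.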

Training phase:
1. Apply eight perceptron-based filters to the training images (64 x 64 pixels)
2. Sum up fire events to create feature vectors (8-bin histograms)
3. Train the Kohonen map
4. Repeat steps 1-3 for many 'individuals' in parallel and compute the fitness of each filter-set (recognition rate)
5. Apply an evolutionary strategy to the perceptrons to evolve perceptrons that eventually generate high recognition rates

Recall phase:
1. Apply the best filter-set to the image to be classified (kidney / non-kidney)
2. Use the pre-computed Kohonen map to classify the feature vector
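Training the Kohonen map (step 3) and classifying with it (recall step 2) both rest on finding the map node nearest to a feature vector. The sketch below shows one training step under simplifying assumptions that are not taken from the slides: the map is a 1-D chain of nodes, the neighbourhood is a hard window of `radius` nodes around the winner, and the learning rate `alpha` is fixed. All names (`best_matching_unit`, `som_train_step`, `FEAT_LEN`) are hypothetical.

```c
#include <float.h>
#include <stddef.h>

#define FEAT_LEN 8   /* feature vectors have length 8 */

/* Index of the map node closest to x (squared Euclidean distance).
   map holds `nodes` consecutive vectors of FEAT_LEN floats. */
size_t best_matching_unit(const float *map, size_t nodes, const float *x)
{
    size_t best = 0;
    float best_d = FLT_MAX;
    for (size_t n = 0; n < nodes; n++) {
        float d = 0.0f;
        for (int k = 0; k < FEAT_LEN; k++) {
            float diff = map[n * FEAT_LEN + k] - x[k];
            d += diff * diff;
        }
        if (d < best_d) {
            best_d = d;
            best = n;
        }
    }
    return best;
}

/* One training step: pull the winning node and its neighbours within
   `radius` towards x by a fraction `alpha`. Requires nodes >= 1. */
void som_train_step(float *map, size_t nodes, const float *x, float alpha, size_t radius)
{
    size_t bmu = best_matching_unit(map, nodes, x);
    size_t lo = bmu > radius ? bmu - radius : 0;
    size_t hi = bmu + radius < nodes ? bmu + radius : nodes - 1;
    for (size_t n = lo; n <= hi; n++)
        for (int k = 0; k < FEAT_LEN; k++)
            map[n * FEAT_LEN + k] += alpha * (x[k] - map[n * FEAT_LEN + k]);
}
```

In the recall phase only `best_matching_unit` would be needed; the class label attached to the winning node (kidney / non-kidney) then decides the classification.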

Solution:
• Use of clusters of General Purpose Graphics Processing Units (GPGPUs) and Cell processors enables training within days instead of many months.
• Two levels of parallelism:
    coarse grained: systems level
    fine grained: image filtering, Kohonen map

GPU: mpich2 + gbic
GPU + Cell: mpich2 + pvm3 + gbic

Sample code GPU: image filtering

    #define Channels 8   // eight filters per filter-set, as described above
    // THETA (the firing threshold) is defined elsewhere; its value is not shown on the slides.

    // Host-side launch configuration
    dim3 dimBlock(60, 8);      // 480 threads per block
    dim3 dimGrid(images, 1);   // 12000 blocks, > 5.7 million threads

    __global__ void Perceptrons(unsigned char *a1, float *weights,
                                unsigned char *border, int *ReturnVectors_dlocal)
    {
        __shared__ float w[Channels][4][4];
        __shared__ unsigned char borders[Channels + Channels][4][4];
        __shared__ unsigned char a[64][64];

        int i, j, r, c, base, mask, Counter;
        float sum;

        if (threadIdx.y == 0) {   // fetch data from global memory
            float *aw = &w[0][0][0];
            for (i = threadIdx.x; i < (1 * Channels * 16); i += 60) {
                aw[i] = weights[i];
            }
            unsigned char *ue = &borders[0][0][0];
            for (i = threadIdx.x; i < (2 * Channels * 16); i += 60) {
                ue[i] = border[i];
            }
            base = blockIdx.x * 4096;   // one 64 x 64 image per block
            unsigned char *uw = &a[0][0];
            for (i = threadIdx.x; i < 4096; i += 60) {
                uw[i] = a1[base + i];
            }
        }
        __syncthreads();   // added: shared memory must be fully loaded before any thread reads it

        mask = threadIdx.y;   // which of the 8 filters to compute locally
        Counter = 0;
        for (j = 0; j < 60; j++) {
            sum = 0.0f;
            // The slides show only the four taps with c == 0; the loop over c
            // assumes the remaining three columns follow the same pattern.
            for (c = 0; c < 4; c++) {
                for (r = 0; r < 4; r++) {
                    unsigned char p = a[j + r][threadIdx.x + c];
                    // a tap contributes its weight if the pixel lies in [lower, upper)
                    if ((p >= borders[mask][c][r]) && (p < borders[mask + Channels][c][r]))
                        sum += w[mask][c][r];
                }
            }
            if (sum > THETA) {
                Counter++;
            }
        }   // next j
        // ReturnVectors_dlocal must be zeroed before the kernel is launched
        atomicAdd(&ReturnVectors_dlocal[(blockIdx.x * Channels) + mask], Counter);
    }

Code sample Cell: nearest-neighbor search

    // *********************************************************************
    // DEMONSTRATION OF spu_cmpgt and spu_sel for a nearest-neighbor search
    // mymap: array of Kohonen-map vectors (1/6 of total map)
    // testvector: random vector, whose nearest neighbor is computed
    // *********************************************************************
    for (i = 0; i < RES*DIM; i++) {
        sum = (vector float){0.0, 0.0, 0.0, 0.0};
        current_nr = spu_splats(i);
        diff = mymap[i][0] - testvector[0];
        sum = spu_madd(diff, diff, sum);
        diff = mymap[i][1] - testvector[1];
        sum = spu_madd(diff, diff, sum);
        ...
        diff = mymap[i][7] - testvector[7];
        sum = spu_madd(diff, diff, sum);
        mask = spu_cmpgt(localbest, sum);
        // if localbest > sum -> arg 1, else arg 0 is new minimum
        localbest = spu_sel(localbest, sum, mask);
        localbest_nr = spu_sel(localbest_nr, current_nr, mask);
    }
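The compare-and-select pattern above avoids branches when tracking the minimum. A scalar sketch in portable C may make the idea easier to follow. It is only an analogy to the SPU intrinsics: `select_bits` mimics the bitwise behaviour of `spu_sel`, the mask is built from an ordinary comparison instead of `spu_cmpgt`, and all names (`select_bits`, `nearest_neighbor`, `NN_DIM`) are hypothetical.

```c
#include <stdint.h>
#include <string.h>

#define NN_DIM 8   /* length of each map vector */

/* Take bits from b where mask is set, otherwise from a. */
static uint32_t select_bits(uint32_t a, uint32_t b, uint32_t mask)
{
    return (a & ~mask) | (b & mask);
}

/* Reinterpret a float as its 32-bit pattern and back. */
static uint32_t float_bits(float f) { uint32_t u; memcpy(&u, &f, sizeof u); return u; }
static float bits_float(uint32_t u) { float f; memcpy(&f, &u, sizeof f); return f; }

/* Index of the map vector nearest to `test`, without a data-dependent branch.
   map holds `nodes` consecutive vectors of NN_DIM floats. */
uint32_t nearest_neighbor(const float *map, uint32_t nodes, const float *test)
{
    float localbest = 3.0e38f;   /* larger than any expected squared distance */
    uint32_t localbest_nr = 0;
    for (uint32_t i = 0; i < nodes; i++) {
        float sum = 0.0f;
        for (int k = 0; k < NN_DIM; k++) {
            float diff = map[i * NN_DIM + k] - test[k];
            sum += diff * diff;
        }
        /* all ones if localbest > sum, else all zeros */
        uint32_t mask = (uint32_t)0 - (uint32_t)(localbest > sum);
        localbest = bits_float(select_bits(float_bits(localbest), float_bits(sum), mask));
        localbest_nr = select_bits(localbest_nr, i, mask);
    }
    return localbest_nr;
}
```

Because the comparison is strict, the first of several equally close vectors wins, just as in the SPU loop.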

Code sample Cell: improved barrier function
Transporting nearest-neighbor information 'piggy-back' style yields approx. 14 % speedup.

Benchmarks: [charts not preserved]

Conclusions:
• Kohonen map training is 12x faster on the Cell processor than on a single i7 core (920, 2.67 GHz)
• In conjunction with a single GPGPU (used for filtering only), Cell reaches 50 % of a GPGPU's performance; even in a cluster of 8 PS3 consoles, one additional GPGPU suffices
• GPGPUs reach an overall speedup of 22x over a Xeon
• Combining 8 GPGPUs with 8 Cells yields a speedup of ~248x over a single i7 core
  → 5 minutes on a cluster of 'unconventional hardware' is equivalent to 1 day on the CPU
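As a quick consistency check of the last point, dividing one day of CPU time by the quoted speedup gives a little under six minutes, in line with the rounded "5 minutes" figure:

```latex
\[
  \frac{24 \cdot 60\ \text{min}}{248} \approx 5.8\ \text{min}
\]
```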
