Jonathan Cohen Director, CUDA Libraries and Software Solutions
MACHINE LEARNING: WHAT COMPUTATIONAL RESEARCHERS NEED TO KNOW
COMPUTERS ARE LEARNING TO SEE!
1.2M training images • 1000 object categories
Hosted by
Image Recognition Challenge
[Example images labeled: person, car, helmet, motorcycle, bird, frog, person, dog, chair, person, hammer, flower pot, power drill]
Classification Error Rates
2010: 28%
2011: 26%
2012: 16%
2013: 12%
2014: 7%
GPU Entries (chart, x-axis 2010 to 2014): 4, 60, 110
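The year-over-year improvement is easier to see as a relative reduction in error. The sketch below is only an illustration: it uses the five error rates from the chart, and the function name `relative_reductions` is made up for this example.

```python
# Classification error rates (percent) as read from the slide's chart.
error_rates = {2010: 28, 2011: 26, 2012: 16, 2013: 12, 2014: 7}

def relative_reductions(rates):
    """Map each year to the fractional drop in error versus the previous year."""
    years = sorted(rates)
    return {
        year: (rates[prev] - rates[year]) / rates[prev]
        for prev, year in zip(years, years[1:])
    }

reductions = relative_reductions(error_rates)
for year, drop in reductions.items():
    print(f"{year}: {drop:.0%} lower error than the previous year")
```

By this measure the largest relative drops are in 2012 (about 38%) and 2014 (about 42%).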
DEEP NEURAL NETS: ANALYSIS VIA ABSTRACTION
Image → “Sara”

Training
Labeled examples: Tree, Cat, Dog
Forward Propagation: Machine Learning Software outputs “turtle”
Backward Propagation: Compute weight update to nudge from “turtle” towards “dog”
Repeat

Classification
Trained Model outputs “cat”
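The training loop in the diagram (forward propagation, backward propagation, repeat) can be sketched with a single-layer softmax classifier in plain Python. This is a minimal illustration, not the networks discussed in the talk: the three-number feature vectors, the learning rate, and all function names are assumptions made for the example.

```python
import math

LABELS = ["tree", "cat", "dog"]

def softmax(scores):
    """Turn raw class scores into probabilities."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def forward(weights, features):
    """Forward propagation: one linear layer followed by softmax."""
    scores = [sum(w * x for w, x in zip(row, features)) for row in weights]
    return softmax(scores)

def backward(weights, features, probs, target, lr=0.5):
    """Backward propagation: nudge the weights towards the correct label.

    For softmax with cross-entropy loss, the gradient with respect to each
    class score is (predicted probability - 1 if correct class else 0).
    """
    for k, row in enumerate(weights):
        err = probs[k] - (1.0 if k == target else 0.0)
        for j, x in enumerate(features):
            row[j] -= lr * err * x

def train(weights, examples, epochs=200):
    """Repeat forward and backward passes over the labeled examples."""
    for _ in range(epochs):
        for features, target in examples:
            probs = forward(weights, features)
            backward(weights, features, probs, target)
    return weights

def classify(weights, features):
    """Classification: run the trained model forward and pick the top label."""
    probs = forward(weights, features)
    return LABELS[probs.index(max(probs))]

# Hypothetical toy data: one made-up feature vector per label.
examples = [([1.0, 0.0, 0.2], 0), ([0.1, 1.0, 0.3], 1), ([0.2, 0.3, 1.0], 2)]
weights = [[0.0] * 3 for _ in LABELS]

train(weights, examples)
predictions = [classify(weights, features) for features, _ in examples]
print(predictions)
```

A deep network stacks many such layers, but the loop has the same shape: predict, compare with the label, update, repeat.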
Deep Learning for Science
CORAL REEF MAPPING
Coral reefs tremendously important
Support more species per area than any other marine environment
Storehouse of immense biodiversity
Buffer adjacent shorelines from wave action
Ecologists need accurate large-scale coverage, broken down by genus
Surveys generate huge data sets…
Material courtesy Oscar Beijbom and CoralNet advisory team / UCSD
LARGE SCALE CORAL ANNOTATION
…but labeling is tedious and slow
Anecdotally only 1-2% of image data obtained from coral reef surveys is labeled
Automated methods: best methods today work on 60% of data with 5% loss of accuracy
Deep learning: estimated to work on 90% of data (pilot study underway)
More at: http://coralnet.ucsd.edu/
CoralNet @ UCSD
Material courtesy Oscar Beijbom and CoralNet advisory team / UCSD
Circles are coral genera: Acropora, Pavona, Montipora, Pocillopora, and Porites.
Triangles are non-coral substrates: Crustose Coralline Algae, Turf algae, Macroalgae, and Sand.
CONNECTOME PROJECT
“Imaging neural circuits at nanometer length scales leads inevitably to vast data sets. In fact, the entire Connectome Project is feasible only now because of the exponential increase in computing power and data storage over time. Nevertheless, managing, visualizing, and analyzing these data remain major challenges.”
– Harvard Center for Brain Science website
Mapping the Brain’s Wiring Diagram
Images courtesy Thouis Jones
Microscope generates 0.85 TB / day
250 NVIDIA Tesla K40 GPUs running classifier
Next-gen will generate 1 GB/sec, running 50% of the time
That will be 42 TB / day
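The 42 TB / day figure follows from the stated rate and duty cycle. The short check below is a sketch; the assumption that the slide counts 1 TB as 1024 GB is mine (with 1000 GB per TB the result is 43.2 TB).

```python
SECONDS_PER_DAY = 24 * 60 * 60

def daily_volume_gb(rate_gb_per_s, duty_cycle):
    """Data produced per day, in GB, for a given rate and fraction of time running."""
    return rate_gb_per_s * SECONDS_PER_DAY * duty_cycle

next_gen_gb = daily_volume_gb(1.0, 0.5)    # 1 GB/sec, running 50% of the time
next_gen_tb_decimal = next_gen_gb / 1000   # 1 TB = 1000 GB
next_gen_tb_binary = next_gen_gb / 1024    # 1 TB = 1024 GB (assumed by this sketch)

# Growth relative to the current microscope's 0.85 TB / day.
growth = next_gen_tb_binary / 0.85

print(f"{next_gen_tb_decimal:.1f} TB/day (decimal), {next_gen_tb_binary:.1f} TB/day (binary)")
print(f"about {growth:.0f}x the current data rate")
```

Either way, the next-generation microscope would produce roughly 50 times the current daily volume.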
NEURAL NET SEGMENTING NEURON DATA
Microscopy Image → Neural Net Classifier → Segmentation
http://www.youtube.com/watch?v=-wq2WTRmeW4
Credits: Daniel Berger and Sebastian Seung (MIT); Narayanan Kasthuri, Richard Schalek, Kenneth Hayworth, Juan-Carlos Tapia and Jeff Lichtman (Harvard).
CANCER CELL MITOSIS DETECTION
Mitosis: Chromosomes in cell nucleus replicated prior to cell splitting
“B&R” grading system includes mitotic activity per tissue area – strong indicator of invasive breast cancer
Hand count mitosis events over 2 mm² region in stained slide
DEEP NEURAL NETS FOR MITOSIS DETECTION
Use DNNs as pixel classifier
input: window of raw pixels
output: probability center pixel is close to mitosis centroid
2012 contest: 66k pos examples, 6M neg examples
5 months to train on CPU
3 days to train on GPU!
See: Dan Ciresan, Mitosis Detection in Breast Cancer Histology Images with Deep Neural Networks – NVIDIA GPU Theater, Tuesday 2:30-2:50
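The pixel-classifier setup (a window of raw pixels in, one probability for the center pixel out) can be sketched as a sliding window over the image. This is only an illustration of the data flow: `toy_classifier` is a hypothetical stand-in for the trained DNN, and the window size, zero padding, and tiny image are assumptions.

```python
def extract_window(image, row, col, half):
    """Return the (2*half+1) x (2*half+1) window centered on (row, col).

    Pixels outside the image are treated as 0.0 (zero padding).
    """
    height, width = len(image), len(image[0])
    window = []
    for r in range(row - half, row + half + 1):
        line = []
        for c in range(col - half, col + half + 1):
            inside = 0 <= r < height and 0 <= c < width
            line.append(image[r][c] if inside else 0.0)
        window.append(line)
    return window

def toy_classifier(window):
    """Hypothetical stand-in for the trained DNN: mean intensity of the window."""
    values = [v for line in window for v in line]
    return sum(values) / len(values)

def probability_map(image, classifier, half=1):
    """Apply the classifier to the window around every pixel."""
    return [
        [classifier(extract_window(image, r, c, half)) for c in range(len(image[0]))]
        for r in range(len(image))
    ]

# Made-up 4x4 image with a bright 2x2 blob in the middle.
image = [
    [0.0, 0.0, 0.0, 0.0],
    [0.0, 1.0, 1.0, 0.0],
    [0.0, 1.0, 1.0, 0.0],
    [0.0, 0.0, 0.0, 0.0],
]
pmap = probability_map(image, toy_classifier)
```

Because the classifier runs once per pixel, the cost scales with image area, which is one reason training and inference time matter so much here.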
IDSIA – Winner 2012 & 2013 Contests
Material courtesy Dan Ciresan, IDSIA
[Chart: F1 score (axis 0.1 to 0.7), MICCAI 2013, IDSIA vs. other entries]
How to Get Started
ANYONE CAN USE DEEP LEARNING
Several open source frameworks available with active communities
Caffe (UCB), Torch (NYU), Theano (Montreal) – take your pick
Caffe: http://caffe.berkeleyvision.org/tutorial/
Torch: http://code.cogbits.com/wiki/doku.php
Theano: http://deeplearning.net/software/theano/tutorial/
All have excellent support for NVIDIA GPUs
Astronomy, sociology, political science, marine ecology, medical imaging, genomics, plant biology, archaeology, …
Machine Learning Will Impact all of Science
High performance routines for Convolutional Neural Networks
Optimized for current and future NVIDIA GPUs
Integrated in major open-source frameworks
Caffe, Torch7, Theano
Flexible and easy-to-use API
Also available for ARM / Jetson TK1
https://developer.nvidia.com/cuDNN
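For readers unfamiliar with the operation being accelerated, the core of a convolutional layer can be written as a few nested loops. The sketch below is a naive pure-Python reference, not the cuDNN API; the function name `conv2d_valid` and the example values are made up. As in most neural-network frameworks, it slides the kernel without flipping it (strictly speaking, a cross-correlation).

```python
def conv2d_valid(image, kernel):
    """Slide the kernel over the image and sum elementwise products.

    No padding, stride 1, so the output is smaller than the input.
    """
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    output = []
    for r in range(out_h):
        row = []
        for c in range(out_w):
            acc = 0.0
            for i in range(kh):
                for j in range(kw):
                    acc += image[r + i][c + j] * kernel[i][j]
            row.append(acc)
        output.append(row)
    return output

image = [
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9],
]
kernel = [
    [1, 0],
    [0, -1],
]
feature_map = conv2d_valid(image, kernel)
```

Every output element is independent of the others, which is what makes the operation a good fit for GPUs.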
GPU-ACCELERATED DEEP LEARNING
Caffe (CPU*): 1x
Caffe (GPU): 11x
Caffe (cuDNN): 14x
Baseline Caffe compared to Caffe accelerated by cuDNN on K40
*CPU is 24 core E5-2697v2 @ 2.4GHz, Intel MKL 11.1.3
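The chart reports speedups relative to the CPU baseline, so the gain from cuDNN over the plain GPU build has to be derived. A minimal sketch of that arithmetic, using only the three numbers above:

```python
# Speedups relative to the CPU baseline, as reported on the slide.
speedups = {"Caffe (CPU)": 1.0, "Caffe (GPU)": 11.0, "Caffe (cuDNN)": 14.0}

# Additional speedup from cuDNN on top of the existing GPU implementation.
cudnn_over_gpu = speedups["Caffe (cuDNN)"] / speedups["Caffe (GPU)"]

print(f"cuDNN vs. plain GPU Caffe: {cudnn_over_gpu:.2f}x")
```

That is roughly a 27% improvement over the GPU build.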