APPLICATIONS OF DEEP LEARNING TO COMPUTER VISION AND COMPUTER GRAPHICS



slide-1
SLIDE 1

APPLICATIONS OF DEEP LEARNING TO COMPUTER VISION AND COMPUTER GRAPHICS Mike Houston

slide-2
SLIDE 2

Practical DEEP LEARNING Examples

Image Classification, Object Detection, Localization, Action Recognition, Scene Understanding
Speech Recognition, Speech Translation, Natural Language Processing
Pedestrian Detection, Traffic Sign Recognition
Breast Cancer Cell Mitosis Detection, Volumetric Brain Image Segmentation

slide-3
SLIDE 3

What is DEEP LEARNING?

Input Result

slide-4
SLIDE 4

Training: labeled examples (Tree, Cat, Dog) are fed to a Deep Learning Framework.
Forward Propagation: the network outputs “turtle”.
Backward Propagation: compute weight update to nudge from “turtle” towards “dog”.
Repeat.

Inference: the Trained Neural Net Model outputs “cat”.
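
The training and inference loop on this slide can be sketched in a few lines of plain Python. This is a minimal illustration, not the deck's method: it assumes a single linear layer with softmax and cross-entropy loss, and the feature vectors, learning rate, and function names (`forward`, `train_step`, `predict`) are all made up for the example.

```python
import math
import random

CLASSES = ["tree", "cat", "dog"]


def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def forward(weights, x):
    """Forward propagation: one linear layer followed by softmax."""
    scores = [sum(w * xi for w, xi in zip(row, x)) for row in weights]
    return softmax(scores)


def train_step(weights, x, label, lr=0.5):
    """Backward propagation, then a weight update that nudges the
    prediction towards the correct label (softmax + cross-entropy)."""
    probs = forward(weights, x)
    for k, row in enumerate(weights):
        grad_score = probs[k] - (1.0 if k == label else 0.0)
        for i in range(len(row)):
            row[i] -= lr * grad_score * x[i]


def predict(weights, x):
    """Inference: run the trained model forward and pick the top class."""
    probs = forward(weights, x)
    return CLASSES[probs.index(max(probs))]


# Hypothetical toy data: three made-up features plus a constant bias input.
DATA = [
    ([1.0, 0.0, 0.0, 1.0], 0),
    ([0.0, 1.0, 0.0, 1.0], 1),
    ([0.0, 0.0, 1.0, 1.0], 2),
    ([0.9, 0.1, 0.0, 1.0], 0),
    ([0.1, 0.8, 0.1, 1.0], 1),
    ([0.0, 0.2, 0.9, 1.0], 2),
]

random.seed(0)
weights = [[random.uniform(-0.1, 0.1) for _ in range(4)] for _ in CLASSES]

for epoch in range(200):  # "Repeat"
    for x, label in DATA:
        train_step(weights, x, label)

print(predict(weights, [0.0, 0.1, 0.95, 1.0]))
```

Real frameworks apply the same forward, backward, update cycle to networks with many layers and millions of weights.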

slide-5
SLIDE 5

Making a vehicle classifier

PICKUP SUV SUV

slide-6
SLIDE 6
slide-7
SLIDE 7

The “Big Bang” In Deep Learning

Algorithms, Data, Compute Capability

slide-8
SLIDE 8

Medical Research

Detecting Mitosis in Breast Cancer Cells

— IDSIA

Predicting the Toxicity of New Drugs

— Johannes Kepler University

Understanding Gene Mutation to Prevent Disease

— University of Toronto

slide-9
SLIDE 9

“Automated Image Captioning with ConvNets and Recurrent Nets”

—Andrej Karpathy, Fei-Fei Li

Captioning

slide-10
SLIDE 10

Why Are GPUs Good for Deep Learning?

GPUs deliver:

same or better prediction accuracy
faster results
smaller footprint
lower power

                      Neural Networks   GPUs
Inherently Parallel          ✓            ✓
Matrix Operations            ✓            ✓
FLOPS                        ✓            ✓

[Chart, 2010 to 2014: 28%, 26%, 16%, 12%, 7%; counts: 4, 60, 110. Example labels: bird, frog, person, dog, chair]
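
To make the "Inherently Parallel", "Matrix Operations", and "FLOPS" rows concrete, here is a small sketch of a fully connected layer written as a matrix product. It is an illustration under assumptions: the function names and sample numbers are hypothetical, and Python threads only demonstrate that the output rows are independent; they do not reproduce GPU speedups.

```python
from concurrent.futures import ThreadPoolExecutor


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def layer_row(x, columns):
    """One output row: depends only on one input row and the shared weights."""
    return [dot(x, col) for col in columns]


def dense_layer(batch, weights, workers=4):
    """Compute batch @ weights, handing each input row to a separate task."""
    columns = list(zip(*weights))  # transpose: one tuple per output unit
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(lambda x: layer_row(x, columns), batch))


def matmul_flops(m, k, n):
    """FLOP count for an (m x k) by (k x n) product: one multiply and
    one add per term."""
    return 2 * m * k * n


batch = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]    # 3 inputs, 2 features each
weights = [[1.0, 0.0, -1.0], [0.5, 1.0, 2.0]]   # 2 features -> 3 outputs

outputs = dense_layer(batch, weights)
print(outputs)
print(matmul_flops(len(batch), len(weights), len(weights[0])))
```

Because no output element depends on another, the work can be spread across as many execution units as the hardware offers, which is the property a GPU exploits.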

slide-11
SLIDE 11

GPU-Accelerated Deep Learning

START-UPS

slide-12
SLIDE 12

GPU-Accelerated Deep Learning Frameworks

Framework: CAFFE | TORCH | THEANO | CUDA-CONVNET2 | KALDI

Domain: Deep Learning Framework | Scientific Computing Framework | Math Expression Compiler | Deep Learning Application | Speech Recognition Toolkit

cuDNN: R2, R2, R2

Multi-GPU: In Progress, In Progress, In Progress, (nnet2)

Multi-CPU: (nnet2)

License: BSD-2 | GPL | BSD | Apache 2.0 | Apache 2.0

Interface(s): Text-based definition files, Python, MATLAB | Python, Lua, MATLAB | Python | C++ | C++, Shell scripts

Embedded (TK1)

http://developer.nvidia.com/deeplearning

slide-13
SLIDE 13

DIGITS

slide-14
SLIDE 14

DIGITS

DEEP LEARNING GPU TRAINING SYSTEM FOR DATA SCIENTISTS

Design DNNs
Visualize activations
Manage multiple trainings

USER INTERFACE: Visualize Layers, Configure DNN, Process Data, Monitor Progress

Theano, Torch, Caffe

cuDNN, cuBLAS

CUDA

GPU HW: Cloud, GPU Cluster, Multi-GPU, GPU

slide-15
SLIDE 15

DIGITS

Test Image

Monitor Progress, Configure DNN, Process Data, Visualize Layers

slide-16
SLIDE 16

DIGITS DEVBOX

World’s fastest GPU
Max GPU out of a plug
Multi-GPU training & inference

slide-17
SLIDE 17

Production Automotive Pipeline

slide-18
SLIDE 18
slide-19
SLIDE 19
slide-20
SLIDE 20
slide-21
SLIDE 21

TEGRA X1 CLASSIFICATION Performance

[Bar chart: AlexNet, IMAGES / SECOND (axis 10 to 100), Tegra K1 vs. Tegra X1]

slide-22
SLIDE 22

Project DAVE: DARPA Autonomous Vehicle

DNN-based self-driving robot
Training data by human driver
No hand-coded CV algorithms

IMAGENET CHALLENGE

[Chart: Accuracy % by year, 2010 to 2014; series: CV and DNN; labeled values: 72%, 74%, 84%]

slide-23
SLIDE 23

TRAINING DATA

225K Images

slide-24
SLIDE 24

DAVE IN ACTION

slide-25
SLIDE 25

Data Scientist: DIGITS - Train (Network, Solver, Dashboard)

Vehicle: Drive PX - Deploy (Model, Classification, Detection, Segmentation)

Active Learning
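
The slide names Active Learning as the link between the vehicle and the data scientist but does not say how samples are chosen. One common strategy is uncertainty sampling: send back the inputs the deployed model is least sure about so they can be labeled and added to the next training run. The sketch below assumes that strategy; the function name, frame ids, and probabilities are hypothetical.

```python
def select_for_labeling(predictions, budget):
    """Return the ids of the `budget` samples whose top class probability
    is lowest, i.e. where the model is least confident."""
    ranked = sorted(predictions.items(), key=lambda item: max(item[1]))
    return [sample_id for sample_id, _ in ranked[:budget]]


# Hypothetical per-frame class probabilities reported by the deployed model.
predictions = {
    "frame_001": [0.98, 0.01, 0.01],
    "frame_002": [0.40, 0.35, 0.25],
    "frame_003": [0.55, 0.44, 0.01],
    "frame_004": [0.90, 0.05, 0.05],
}

to_label = select_for_labeling(predictions, budget=2)
print(to_label)
```

The selected frames would be labeled and added to the training set, after which the retrained model is deployed again.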

slide-26
SLIDE 26

Deep Learning and Vision/Graphics

slide-27
SLIDE 27

Street Number Detection

[Goodfellow 2014]

slide-28
SLIDE 28

Object Classification

[Krizhevsky 2012]

slide-29
SLIDE 29

Image Retrieval

[Krizhevsky 2012]

slide-30
SLIDE 30

Pose Estimation

[Toshev, Szegedy 2014]

slide-31
SLIDE 31

Object Detection

slide-32
SLIDE 32

[Huval et al. 2015]

slide-33
SLIDE 33

Face Recognition

[Taigman et al. 2014]

slide-34
SLIDE 34

Action Recognition

[Simonyan et al. 2014]

slide-35
SLIDE 35

Playing Games

[Mnih et al. 2013]

slide-36
SLIDE 36

Semantic Segmentation

[Farabet et al. 2013]

slide-37
SLIDE 37

Super Resolution

[Dong et al. 2014]

slide-38
SLIDE 38

Ray Tracing – Monte Carlo Denoising

[Kalantari et al. 2015]

slide-39
SLIDE 39

“Dreams”

[Mordvintsev et al. 2015]

slide-40
SLIDE 40

“Dreams”

[Mordvintsev et al. 2015]