IMAGE CLASSIFICATION WITH NVIDIA DIGITS, Pedro Mario Cruz e Silva



SLIDE 1

Pedro Mario Cruz e Silva (pcruzesilva@nvidia.com) Solution Architect Manager Enterprise Latin America Global Oil & Gas Team

IMAGE CLASSIFICATION WITH NVIDIA DIGITS

SLIDE 2

DEEP LEARNING WITH DIGITS

Hands-On Lab: “Image Classification with DIGITS” on “NVIDIA QwikLabs” (https://nvlabs.qwiklab.com)

SLIDE 3

SLIDE 4

SLIDE 5

INTRODUCTION

SLIDE 6

LEARNING FROM DATA

AND SOME BUZZ WORDS

ARTIFICIAL INTELLIGENCE

  • Knowledge & Reason
  • Learning
  • Planning
  • Communicating
  • Perceiving

MACHINE LEARNING

  • Learning from data
  • Expert systems
  • Handcrafted features

DEEP LEARNING

  • Learning from data
  • Neural networks
  • Computer learned features

SLIDE 7

TRADITIONAL COMPUTING MODEL

Input → Algorithm → Output (“Label”)

SLIDE 8

A NEW COMPUTING MODEL

TRAINING: Training Data (Input + “Label”) → Trained Neural Network (Output)

INFERENCE: Input → Trained Neural Network → Output (“Label”)
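
The training and inference phases can be illustrated with a toy example. This is only a sketch under stated assumptions: it trains a single perceptron in plain Python rather than a deep network in DIGITS, and the data points, labels, and function names (`train`, `infer`) are made up for illustration.

```python
# Toy illustration of the training/inference split (hypothetical data, not from the lab).

def train(samples, labels, epochs=20, lr=0.1):
    """TRAINING: learn weights from (input, label) pairs with the perceptron rule."""
    weights = [0.0] * len(samples[0])
    bias = 0.0
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            score = sum(w * xi for w, xi in zip(weights, x)) + bias
            predicted = 1 if score > 0 else 0
            error = y - predicted  # 0 when correct, +1 or -1 when wrong
            weights = [w + lr * error * xi for w, xi in zip(weights, x)]
            bias += lr * error
    return weights, bias


def infer(model, x):
    """INFERENCE: apply the trained model to a new input and return a label."""
    weights, bias = model
    score = sum(w * xi for w, xi in zip(weights, x)) + bias
    return 1 if score > 0 else 0


# Made-up training data: label 1 when the point lies toward the upper right.
train_inputs = [(0.0, 0.0), (0.2, 0.1), (0.1, 0.3), (0.9, 1.0), (1.0, 0.8), (0.8, 0.9)]
train_labels = [0, 0, 0, 1, 1, 1]

model = train(train_inputs, train_labels)        # training: data + labels in, model out
new_point_label = infer(model, (0.95, 0.9))      # inference: new input in, label out
print(new_point_label)
```

The point of the split is that `train` needs labeled data while `infer` needs only the trained model and a new input.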

SLIDE 9

A NEW COMPUTING MODEL

Outperform experts, facts, rules with software that writes software

Deep Learning Achieves “Superhuman” Results

  • Traditional Computer Vision: Experts + Time
  • Deep Learning Object Detection: DNN + Data + GPU

[Chart: ImageNet, 0% to 100%, 2009 to 2016, Traditional CV vs. Deep Learning]

SLIDE 10

MNIST (MODIFIED NATIONAL INSTITUTE OF STANDARDS AND TECHNOLOGY)

SLIDE 11

SLIDE 12

NVIDIA DIGITS

SLIDE 13

POWERING THE DEEP LEARNING ECOSYSTEM

NVIDIA SDK accelerates every major framework

COMPUTER VISION: OBJECT DETECTION, IMAGE CLASSIFICATION

SPEECH & AUDIO: VOICE RECOGNITION, LANGUAGE TRANSLATION

NATURAL LANGUAGE PROCESSING: RECOMMENDATION ENGINES, SENTIMENT ANALYSIS

DEEP LEARNING FRAMEWORKS

Mocha.jl

NVIDIA DEEP LEARNING SDK

developer.nvidia.com/deep-learning-software

SLIDE 14

NVIDIA DIGITS

Interactive Deep Learning GPU Training System

developer.nvidia.com/digits

Interactive deep neural network development environment for image classification and object detection

  • Schedule, monitor, and manage neural network training jobs
  • Analyze accuracy and loss in real time
  • Track datasets, results, and trained neural networks
  • Scale training jobs across multiple GPUs automatically
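
The accuracy and loss curves mentioned above summarize how well predictions match the labels. As a rough sketch of what those two numbers mean (not DIGITS code; the probabilities, labels, and function names are hypothetical), they can be computed like this:

```python
import math


def accuracy(probabilities, labels):
    """Fraction of samples whose highest-probability class matches the label."""
    correct = sum(
        1
        for probs, label in zip(probabilities, labels)
        if max(range(len(probs)), key=probs.__getitem__) == label
    )
    return correct / len(labels)


def cross_entropy_loss(probabilities, labels, eps=1e-12):
    """Mean negative log-probability assigned to the true class."""
    total = -sum(
        math.log(max(probs[label], eps))
        for probs, label in zip(probabilities, labels)
    )
    return total / len(labels)


# Hypothetical model outputs for four validation images over three classes.
val_probs = [
    [0.7, 0.2, 0.1],
    [0.1, 0.8, 0.1],
    [0.3, 0.4, 0.3],
    [0.2, 0.2, 0.6],
]
val_labels = [0, 1, 0, 2]

print(f"accuracy = {accuracy(val_probs, val_labels):.2f}")
print(f"loss     = {cross_entropy_loss(val_probs, val_labels):.4f}")
```

Accuracy only counts right and wrong answers, while the loss also penalizes low confidence in the correct class, which is why the two curves are usually watched together during training.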

SLIDE 15

DEEP LEARNING WORKFLOWS

IMAGE CLASSIFICATION

  • Classify images into classes or categories
  • Object of interest could be anywhere in the image
  • Example output: 98% Dog, 2% Cat

OBJECT DETECTION

  • Find instances of objects in an image
  • Objects are identified with bounding boxes

IMAGE SEGMENTATION (New in DIGITS 5)

  • Partition image into multiple regions
  • Regions are classified at the pixel level
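
Per-class percentages such as "98% Dog, 2% Cat" are typically produced by a softmax over the network's final-layer scores. The sketch below assumes that setup; the scores in `logits` are invented so that the result resembles the slide's example.

```python
import math


def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]  # subtract the max for numerical stability
    total = sum(exps)
    return [e / total for e in exps]


class_names = ["Dog", "Cat"]
logits = [4.0, 0.1]  # hypothetical final-layer outputs for one image
probabilities = softmax(logits)

# Rank classes from most to least likely, as a classification result page would.
ranked = sorted(zip(class_names, probabilities), key=lambda pair: pair[1], reverse=True)
for name, p in ranked:
    print(f"{p:.0%} {name}")
```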

SLIDE 16

STEP 0 – RUN DIGITS

SLIDE 17

SLIDE 18

SLIDE 19

STEP 1 – CREATE DATASET

SLIDE 20

LOAD AND ORGANIZE DATA

SLIDE 21

LOAD AND ORGANIZE DATA
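
Organizing data for image classification usually means grouping images by class and holding some back for validation. The sketch below assumes a folder-per-class layout (`<root>/<class>/<file>`) and a 25% validation share; the paths, the fraction, and the `split_by_class` helper are illustrative assumptions, not settings taken from the slides.

```python
import random
from collections import defaultdict


def split_by_class(image_paths, val_fraction=0.25, seed=0):
    """Group paths by parent folder (the class) and hold out a share of each for validation."""
    by_class = defaultdict(list)
    for path in image_paths:
        class_name = path.split("/")[-2]  # assumes <root>/<class>/<file>
        by_class[class_name].append(path)

    rng = random.Random(seed)  # fixed seed so the split is reproducible
    train, val = [], []
    for class_name in sorted(by_class):
        paths = sorted(by_class[class_name])
        rng.shuffle(paths)
        n_val = int(len(paths) * val_fraction)
        val.extend((p, class_name) for p in paths[:n_val])
        train.extend((p, class_name) for p in paths[n_val:])
    return train, val


# Hypothetical file listing: 8 images for each of the digits 0-9.
paths = [f"mnist/train/{digit}/img_{i}.png" for digit in range(10) for i in range(8)]
train_set, val_set = split_by_class(paths)
print(len(train_set), len(val_set))
```

Splitting within each class keeps the class balance the same in the training and validation sets.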

SLIDE 22

EXPLORE DB

SLIDE 23

SLIDE 24

SLIDE 25

SLIDE 26

SLIDE 27

STEP 2 – TRAINING A MODEL

SLIDE 28

TRAINING A MODEL

SLIDE 29

TRAINING A MODEL

SLIDE 30

TRAINING A MODEL

SLIDE 31

TRAINING A MODEL

SLIDE 32

SLIDE 33

STEP 3 – INFERENCE

SLIDE 34

INFERENCE

SLIDE 35

INFERENCE (TEST)

SLIDE 36

NVIDIA AI PLATFORM

SLIDE 37

150B XTORS | 5.3TF FP64 | 10.6TF FP32 | 21.2TF FP16 | 14MB SM RF | 4MB L2 Cache

TESLA P100

THE MOST ADVANCED HYPERSCALE DATACENTER GPU EVER BUILT

SLIDE 38

INTRODUCING TESLA P100

New GPU Architecture to Enable the World’s Fastest Compute Node

  • Pascal Architecture: Highest Compute Performance
  • NVLink: GPU Interconnect for Maximum Scalability
  • CoWoS HBM2: Unifying Compute & Memory in Single Package
  • Page Migration Engine: Simple Parallel Programming with Virtually Unlimited Memory Space

[Diagram: Unified Memory shared between CPU and Tesla P100]

SLIDE 39

ANNOUNCING TESLA V100

GIANT LEAP FOR AI & HPC VOLTA WITH NEW TENSOR CORE

  • 21B xtors | TSMC 12nm FFN | 815 mm²
  • 5,120 CUDA cores
  • 7.5 FP64 TFLOPS | 15 FP32 TFLOPS
  • NEW 120 Tensor TFLOPS
  • 20MB SM RF | 16MB Cache
  • 16GB HBM2 @ 900 GB/s
  • 300 GB/s NVLink

SLIDE 40

NEW TENSOR CORE

  • New CUDA TensorOp instructions & data formats
  • 4x4 matrix processing array
  • D[FP32] = A[FP16] * B[FP16] + C[FP32]
  • Optimized for deep learning

[Diagram: Activation Inputs, Weights Inputs, Output Results]

SLIDE 41

TENSOR CORE

4x4x4 matrix multiply and accumulate
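
The mixed-precision operation D[FP32] = A[FP16] * B[FP16] + C[FP32] can be emulated on a CPU to see what the precisions mean. This is a numerical sketch only, not CUDA code and not how the hardware is programmed: it rounds inputs to half precision with Python's `struct` module and rounds the running sum to single precision after every addition, which is an approximation of FP32 accumulation. The helper names are made up.

```python
import struct


def to_fp16(x):
    """Round a Python float to the nearest IEEE 754 half-precision value."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


def to_fp32(x):
    """Round a Python float to the nearest IEEE 754 single-precision value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def tensor_core_mma(a, b, c):
    """Compute D = A * B + C for 4x4 matrices: FP16 inputs, FP32 accumulation."""
    n = 4
    d = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            acc = to_fp32(c[i][j])
            for k in range(n):  # 4 * 4 * 4 = 64 multiply-adds in total
                product = to_fp16(a[i][k]) * to_fp16(b[k][j])
                acc = to_fp32(acc + product)
            d[i][j] = acc
    return d


# Hypothetical inputs chosen so the result is easy to check by hand.
identity = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]
weights = [[float(4 * i + j) for j in range(4)] for i in range(4)]
bias = [[0.5] * 4 for _ in range(4)]

result = tensor_core_mma(identity, weights, bias)
print(result[1])
print(to_fp16(0.1))  # shows the rounding introduced by half-precision inputs
```

Keeping the accumulator in FP32 limits the error that would build up if the sum itself were stored in FP16.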

SLIDE 42

Tesla P100 vs Tesla V100

                         Tesla P100 (Pascal)   Tesla V100 (Volta)
Memory                   16 GB (HBM2)          16 GB (HBM2)
Memory Bandwidth         720 GB/s              900 GB/s
NVLINK                   160 GB/s              300 GB/s
CUDA Cores (FP32)        3584                  5120
CUDA Cores (FP64)        1792                  2560
Tensor Cores (TC)        NA                    640
Peak TFLOPS (FP32)       10.6                  15
Peak TFLOPS (FP64)       5.3                   7.5
Peak TFLOPS (TC)         NA                    120
Power                    300 W                 300 W
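
The peak figures in this comparison are consistent with one another, which a short calculation can show. The sketch assumes each CUDA core completes one fused multiply-add (2 FLOPs) per clock and each Tensor Core completes one 4x4x4 multiply-accumulate (64 FMAs) per clock; the clock rates are derived from the table, not quoted from it, and `implied_clock_ghz` is a made-up helper.

```python
def implied_clock_ghz(peak_tflops, cores, flops_per_core_per_clock=2):
    """Clock rate implied by a quoted peak, assuming one FMA (2 FLOPs) per core per clock."""
    return peak_tflops * 1e12 / (cores * flops_per_core_per_clock) / 1e9


# Figures taken from the comparison above.
v100_clock = implied_clock_ghz(15.0, 5120)
p100_clock = implied_clock_ghz(10.6, 3584)

# Assumption: one 4x4x4 matrix multiply-accumulate per Tensor Core per clock,
# i.e. 64 FMAs = 128 FLOPs.
tensor_cores = 640
flops_per_tensor_core_per_clock = 4 * 4 * 4 * 2
tensor_tflops = tensor_cores * flops_per_tensor_core_per_clock * v100_clock * 1e9 / 1e12

print(f"V100 implied clock: {v100_clock:.3f} GHz")
print(f"P100 implied clock: {p100_clock:.3f} GHz")
print(f"V100 Tensor Core peak: {tensor_tflops:.1f} TFLOPS")
print(f"NVLink ratio: {300 / 160:.2f}x, memory bandwidth ratio: {900 / 720:.2f}x")
```

Under these assumptions the Tensor Core peak is exactly eight times the FP32 peak, because 640 x 128 is eight times 5120 x 2.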

3x

SLIDE 43

Tesla P100 vs Tesla V100

                         Tesla P100 (Pascal)   Tesla V100 (Volta)
Memory                   16 GB (HBM2)          16 GB (HBM2)
Memory Bandwidth         720 GB/s              900 GB/s
NVLINK                   160 GB/s              300 GB/s
CUDA Cores (FP32)        3584                  5120
CUDA Cores (FP64)        1792                  2560
Tensor Cores (TC)        NA                    640
Peak TFLOPS (FP32)       10.6                  15
Peak TFLOPS (FP64)       5.3                   7.5
Peak TFLOPS (TC)         NA                    120
Power                    300 W                 300 W

50%

SLIDE 44

NVIDIA GPU CLOUD SIMPLIFYING AI & HPC

DEEP LEARNING HPC APPS HPC VIZ

SLIDE 45

SLIDE 46

DRAMATICALLY MORE FOR YOUR MONEY

5X Better HPC TCO for Same Throughput

160 Self-hosted Skylake CPU Servers: 96 kW

8 Accelerated Servers w/4 V100 GPUs: 13 kW

SAME THROUGHPUT | 1/5 THE COST | 1/7 THE SPACE | 1/7 THE POWER

MIXED HPC WORKLOAD: Amber, CHROMA, GTC, LAMMPS, MILC, NAMD, Quantum Espresso, SPECFEM3D
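
The power claim can be checked directly from the two configurations on this slide. The short calculation below uses only the server counts and power figures given here; the cost and space ratios depend on prices and rack dimensions that the slide does not state, so they are not reproduced.

```python
# Figures from the slide.
cpu_servers, cpu_power_kw = 160, 96
gpu_servers, gpu_power_kw = 8, 13

power_ratio = gpu_power_kw / cpu_power_kw
server_ratio = gpu_servers / cpu_servers

print(f"power: {power_ratio:.3f} of the CPU cluster (about 1/{round(1 / power_ratio)})")
print(f"server count: 1/{round(1 / server_ratio)} of the CPU cluster")
```

The server count falls by a factor of 20 while the quoted space saving is 1/7, so the space figure presumably accounts for accelerated servers being physically larger; that reading is an interpretation, not something the slide says.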

SLIDE 47

NVIDIA SUPPORT PROGRAMS

SLIDE 48

developer.nvidia.com

SLIDE 49

NVIDIA DEEP LEARNING INSTITUTE

Hands-on self-paced and instructor-led training in deep learning and accelerated computing for developers

  • Request onsite instructor-led workshops at your organization: www.nvidia.com/requestdli
  • Take self-paced labs online: www.nvidia.com/dlilabs
  • Download the course catalog, view upcoming workshops, and learn about the University Ambassador Program: www.nvidia.com/dli

Topics: Deep Learning Fundamentals, Accelerated Computing Fundamentals, Game Development & Digital Content, Finance, Intelligent Video Analytics, Medical Image Analysis, Autonomous Vehicles, Genomics

More industry-specific training coming soon…

SLIDE 50

NVIDIA HW GRANT PROGRAM

Titan X Pascal

  • Scientific Computing
  • HPC
  • Deep Learning

Jetson TX2 (Dev Kit)

  • Robotics
  • Autonomous Machines

Quadro P6000

  • Scientific Visualization
  • Virtual Reality

SLIDE 51

INCEPTION PROGRAM

http://www.nvidia.com/object/inception-program.html

SLIDE 52

Pedro Mario Cruz e Silva (pcruzesilva@nvidia.com) Solution Architect Manager Enterprise Latin America Global Oil & Gas Team LinkedIn