image classification with
play

IMAGE CLASSIFICATION WITH NVIDIA DIGITS Pedro Mario Cruz e Silva - PowerPoint PPT Presentation

IMAGE CLASSIFICATION WITH NVIDIA DIGITS Pedro Mario Cruz e Silva (pcruzesilva@nvidia.com) Solution Architect Manager Enterprise Latin America Global Oil & Gas Team DEEP LEARNING WITH DIGITS Hands-On Lab NVIDIA QwikLabs


  1. IMAGE CLASSIFICATION WITH NVIDIA DIGITS Pedro Mario Cruz e Silva (pcruzesilva@nvidia.com) Solution Architect Manager Enterprise Latin America Global Oil & Gas Team

  2. DEEP LEARNING WITH DIGITS Hands-On Lab “NVIDIA QwikLabs ” https://nvlabs.qwiklab.com “ Image Classification with DIGITS” 2

  3. 3

  4. 4

  5. INTRODUCTION 5

  6. LEARNING FROM DATA AND SOME BUZZ WORDS ARTIFICAL INTELLIGENCE MACHINE Knowledge & Reason LEARNING DEEP Learning LEARNING Learning from data Learning from data Planning Expert systems Neural networks Communicating Handcrafted features Computer learned Perceiving features 6

  7. TRADITIONAL COMPUTING MODEL “Label” Input Output Algorithm 7

  8. A NEW COMPUTING MODEL TRAINING Training Data Trained Neural Network Input “Label” Output INFERENCE Trained Neural Network “Label” Output Input 8

  9. A NEW COMPUTING MODEL Outperform experts, facts, rules with software that writes software ImageNet 100% 90% 80% 70% 60% 50% 40% 30% Traditional CV 20% Deep Learning 10% 0% 2009 2010 2011 2012 2013 2014 2015 2016 Traditional Computer Vision Deep Learning Object Detection Deep Learning Achieves Experts + Time DNN + Data + GPU “Superhuman” Results 9

  10. MNIST (MODIFIED NATIONAL INSTITUTE OF STANDARDS AND TECHNOLOGY) 10

  11. 11

  12. NVIDIA DIGITS 12

  13. POWERING THE DEEP LEARNING ECOSYSTEM NVIDIA SDK accelerates every major framework COMPUTER VISION SPEECH & AUDIO NATURAL LANGUAGE PROCESSING OBJECT DETECTION IMAGE CLASSIFICATION VOICE RECOGNITION LANGUAGE TRANSLATION RECOMMENDATION ENGINES SENTIMENT ANALYSIS DEEP LEARNING FRAMEWORKS Mocha.jl NVIDIA DEEP LEARNING SDK 13 developer.nvidia.com/deep-learning-software

  14. NVIDIA DIGITS Interactive Deep Learning GPU Training System Interactive deep neural network development environment for image classification and object detection Schedule, monitor, and manage neural network training jobs Analyze accuracy and loss in real time Track datasets, results, and trained neural networks Scale training jobs across multiple GPUs automatically developer.nvidia.com/digits 14

  15. DEEP LEARNING WORKFLOWS New in DIGITS 5 IMAGE OBJECT IMAGE CLASSIFICATION DETECTION SEGMENTATION 98% Dog 2% Cat Classify images into Find instances of objects Partition image into classes or categories in an image multiple regions Object of interest could Objects are identified Regions are classified at be anywhere in the image with bounding boxes the pixel level 15

  16. STEP 0 – RUN DIGITS 16

  17. 17

  18. 18

  19. STEP 1 – CREATE DATASET 19

  20. LOAD AND ORGANIZE DATA 20

  21. LOAD AND ORGANIZE DATA 21

  22. EXPLORE DB 22

  23. 23

  24. 24

  25. 25

  26. 26

  27. STEP 2 – TRAINING A MODEL 27

  28. TRAINING A MODEL 28

  29. TRAINING A MODEL 29

  30. TRAINING A MODEL 30

  31. TRAINING A MODEL 31

  32. 32

  33. STEP 3 – INFERENCE 33

  34. INFERENCE 34

  35. INFERENCE (TEST) 35

  36. NVIDIA AI PLATFORM 36

  37. TESLA P100 THE MOST ADVANCED HYPERSCALE DATACENTER GPU EVER BUILT 150B XTORS | 5.3TF FP64 | 10.6TF FP32 | 21.2TF FP16 | 14MB SM RF | 4MB L2 Cache 37

  38. INTRODUCING TESLA P100 New GPU Architecture to Enable the World’s Fastest Compute Node Pascal Architecture NVLink CoWoS HBM2 Page Migration Engine T esla P100 CPU Unified Memory Highest Compute Performance GPU Interconnect for Unifying Compute & Memory in Simple Parallel Programming with Maximum Scalability Single Package Virtually Unlimited Memory Space NVIDIA CONFIDENTIAL. DO NOT DISTRIBUTE. 38

  39. ANNOUNCING TESLA V100 GIANT LEAP FOR AI & HPC VOLTA WITH NEW TENSOR CORE 21B xtors | TSMC 12nm FFN | 815mm 2 5,120 CUDA cores 7.5 FP64 TFLOPS | 15 FP32 TFLOPS NEW 120 Tensor TFLOPS 20MB SM RF | 16MB Cache 16GB HBM2 @ 900 GB/s 300 GB/s NVLink 39

  40. NEW TENSOR CORE New CUDA TensorOp instructions & data formats 4x4 matrix processing array D[FP32] = A[FP16] * B[FP16] + C[FP32] Optimized for deep learning Activation Inputs Weights Inputs Output Results 40

  41. TENSOR CORE 4x4x4 matrix multiply and accumulate 41

  42. Tesla P100 vs Tesla V100 Tesla P100 (Pascal) Tesla V100 (Volta) Memory 16 GB (HBM2) 16 GB (HMB2) Memory Bandwidth 720 GB/s 900 GB/s 3x NVLINK 160 GB/s 300 GB/s CUDA Cores (FP32) 3584 5120 CUDA Cores (FP64) 1792 2560 Tensor Cores (TC) NA 640 Peak TFLOPS/s (FP32) 10.6 15 Peak TFLOPS/s (FP64) 5.3 7.5 Peak TFLOPS/s (TC) NA 120 Power 300 W 300 W 42

  43. Tesla P100 vs Tesla V100 Tesla P100 (Pascal) Tesla V100 (Volta) Memory 16 GB (HBM2) 16 GB (HMB2) Memory Bandwidth 720 GB/s 900 GB/s NVLINK 160 GB/s 300 GB/s CUDA Cores (FP32) 3584 5120 CUDA Cores (FP64) 1792 2560 Tensor Cores (TC) NA 640 50% Peak TFLOPS/s (FP32) 10.6 15 Peak TFLOPS/s (FP64) 5.3 7.5 Peak TFLOPS/s (TC) NA 120 Power 300 W 300 W 43

  44. NVIDIA GPU CLOUD SIMPLIFYING AI & HPC DEEP LEARNING HPC APPS HPC VIZ 44

  45. 45

  46. DRAMATICALLY MORE FOR YOUR MONEY 5X Better HPC TCO for Same Throughput SAME THROUGHPUT MIXED HPC MIXED HPC WORKLOAD: WORKLOAD: 1/5 Amber, CHROMA, Amber, CHROMA, THE COST GTC, LAMMPS, GTC, LAMMPS, MILC, NAMD, MILC, NAMD, 1/7 Quantum Expresso, Quantum Espresso, SPECFEM3D SPECFEM3D THE SPACE 1/7 THE POWER 8 Accelerated Servers w/4 V100 GPUs 160 Self-hosted Skylake CPU Servers 13 KWatts 96 KWatts 46

  47. NVIDIA SUPPORT PROGRAMS 47

  48. developer.nvidia.com 48

  49. NVIDIA DEEP LEARNING INSTITUTE Hands-on self-paced and instructor-led training in deep learning and accelerated computing for developers Autonomous Deep Learning Medical Image Vehicles Fundamentals Analysis Request onsite instructor-led workshops at your organization: www.nvidia.com/requestdli Take self-paced labs online: www.nvidia.com/dlilabs Download the course catalog, view upcoming Genomics Finance Intelligent Video workshops, and learn about the University Analytics Ambassador Program: www.nvidia.com/dli More industry- specific training coming soon… Game Development Accelerated Computing & Digital Content Fundamentals 49

  50. NVIDIA HW GRANT PROGRAM Jetson TX2 Titan X Pascal Quadro P6000 (Dev Kit) Scientific Computing • • Scientific Visualization Robotics • HPC • • Virtual Reality Autonomous Machines • • Deep Learning 50

  51. INCEPTION PROGRAM http://www.nvidia.com/object/inception-program.html 51

  52. Pedro Mario Cruz e Silva (pcruzesilva@nvidia.com) Solution Architect Manager Enterprise Latin America Global Oil & Gas Team LinkedIn

Download Presentation
Download Policy: The content available on the website is offered to you 'AS IS' for your personal information and use only. It cannot be commercialized, licensed, or distributed on other websites without prior consent from the author. To download a presentation, simply click this link. If you encounter any difficulties during the download process, it's possible that the publisher has removed the file from their server.

Recommend


More recommend