S8901 – Quadro for AI, VR and Simulation
Carl Flygare, PNY
Quadro Product Marketing Manager
Allen Bourgoyne, NVIDIA
Senior Product Marketing Manager
“The question of whether a computer can think is no more interesting than the question of whether a submarine can swim.”
Edsger Dijkstra
Intelligence Abounds in Nature
A very small sampling
Technological Intelligence
Homo sapiens’ essential differentiator
Thalamocortical brain network: 3 million neurons, 476 million synapses
Full human brain: 106 billion neurons, 1,000 trillion synapses
Artificial Intelligence: Where we Stand Today
Google’s IQ is slightly below a six-year-old human’s
Google 47.28 | 78.42% increase since 2014
Baidu 32.92 | 40.08% increase since 2014
Microsoft Bing 31.98
Apple Siri 23.90
Source: http://www.zdnet.com/article/google-ai-vs-siri-vs-bing-iq-tests-show-one-is-smartest-by-a-mile/
AI IQs are significantly lower than an 18-year-old’s average score of 97.
In 2014, two of the three researchers found that Google had an IQ of 26.5, compared to Baidu’s 23.5.
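The percentage gains quoted for Google and Baidu can be checked against the 2014 and later scores given on this slide. The sketch below is only an illustration: the helper name `percent_increase` is hypothetical, and the result for Baidu matches the slide's figure only approximately, presumably because the published scores were rounded.

```python
def percent_increase(old: float, new: float) -> float:
    """Return the percentage increase from old to new."""
    return (new - old) / old * 100.0

# (2014 score, later score) as quoted on the slide
scores = {
    "Google": (26.5, 47.28),
    "Baidu": (23.5, 32.92),
}

increases = {
    name: percent_increase(old, new) for name, (old, new) in scores.items()
}

for name, pct in increases.items():
    print(f"{name}: {pct:.2f}% increase since 2014")
```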
NVIDIA Quadro
Every segment benefits from AI, VR and simulation
Manufacturing, CAE, Media and Entertainment, Automotive, AEC, Energy (Oil and Gas), Scientific and Technical, Healthcare
NVIDIA Quadro | AI, VR and Simulation Open New Possibilities
Tiers: Entry, Basic, Mid Range, Upper Range, High End, Ultra High End
Boards: P400 2 GB, P620 2 GB, P1000 4 GB, P2000 5 GB, P4000 8 GB, P5000 16 GB, P6000 24 GB, GP100 16 GB, GV100 32 GB
Workloads, from least to most demanding:
▪ Small and Simple CAD Models, Entry PLM
▪ Medium Size and Complexity CAD Models, PLM, Basic DCC, Medical Imaging
▪ Professional VR, Complex CAD Models, CAE, Photorealistic Rendering, Complex DCC and VFX, Medical Imaging
▪ Professional VR, Very Complex CAD Models, CAE, Photorealistic Rendering, Advanced DCC and VFX, 3D Medical Imaging
▪ Collaborative VR, Extremely Complex CAD Models, CAE, Photorealistic Rendering, DCC and VFX, Seismic Exploration, 3D Medical Imaging
▪ AI (Deep Learning) Development, Collaborative VR, CAE Simulations, Ultimate CAD Models, Photorealistic Rendering and GPGPU Compute
NVIDIA Quadro GP100 NVIDIA Quadro GV100 | Reinventing the Workstation for AI
NVIDIA Quadro GP100 NVIDIA Quadro GV100 x 2 | NVLink Scalable Workstation AI
NVIDIA Quadro GV100 and NVLink
Scaling performance and memory*
High speed GPU and memory connection for GV100
▪ NVLink combines two GV100s for twice the compute power and 64 GB of memory
▪ Up to 200 GB/sec bidirectional bandwidth, 25% improvement
▪ Used in pairs, two dedicated NVLink connectors on GV100 boards
▪ Provides SLI functionality for GV100 boards
*Application support for NVLink required. A maximum of two GV100 boards can be connected with NVLink.
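To put the NVLink figures in perspective, the sketch below computes the pooled memory of two boards and an idealized time to move one board's worth of data (32 GB) across NVLink versus PCI Express Gen 3 x16. Several things here are assumptions rather than slide content: the PCIe rate is the theoretical per-direction figure, the 200 GB/sec bidirectional NVLink bandwidth is assumed to split evenly between directions, and protocol overhead is ignored, so real transfers will be slower.

```python
GV100_MEMORY_GB = 32        # per board, from the slide
NVLINK_BIDIR_GB_S = 200     # peak bidirectional bandwidth, from the slide

# Assumption: theoretical PCIe Gen 3 x16 rate per direction
# (8 GT/s per lane x 16 lanes x 128b/130b encoding / 8 bits per byte).
PCIE3_X16_GB_S = 8 * 16 * (128 / 130) / 8

def pooled_memory_gb(boards: int = 2) -> int:
    """Total memory when boards are linked (the slide allows at most two)."""
    return boards * GV100_MEMORY_GB

def ideal_transfer_seconds(size_gb: float, bandwidth_gb_s: float) -> float:
    """Best-case transfer time at peak bandwidth, ignoring all overhead."""
    return size_gb / bandwidth_gb_s

# Assumption: bidirectional bandwidth splits evenly between directions.
nvlink_one_way_gb_s = NVLINK_BIDIR_GB_S / 2

nvlink_time = ideal_transfer_seconds(GV100_MEMORY_GB, nvlink_one_way_gb_s)
pcie_time = ideal_transfer_seconds(GV100_MEMORY_GB, PCIE3_X16_GB_S)

print(f"Pooled memory: {pooled_memory_gb()} GB")
print(f"32 GB over NVLink (one way): {nvlink_time:.2f} s")
print(f"32 GB over PCIe Gen 3 x16:   {pcie_time:.2f} s")
```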
NVIDIA Quadro GV100
Technical specifications
GPU Architecture: Volta
CUDA and Tensor Cores: 2560 (FP64), 5120 (FP32), 640 (Tensor)
Memory Capacity: 32 GB HBM2
Peak Memory Bandwidth: 870 GB/sec
FP64 (Double Precision): 7.4 TFLOPS | 42% improvement
FP32 (Single Precision): 14.8 TFLOPS | 44% improvement
FP16 (Half Precision): 118.5 TFLOPS (Matrix Multiply with FP16 or FP32 Accumulate)
INT8 (Integer): 59.3 TOPS | 26% improvement
System Interface: PCI Express Gen 3 x16
NVLink: 200 GB/sec Bidirectional | 25% improvement
Display Connectors: 4x DisplayPort 1.4 with HDCP 2.2
4K Display Support: 4x 4096 x 2160 at 120 Hz with HDR
5K Display Support: 4x 5120 x 2880 at 60 Hz with HDR
8K Display Support: 2x 7680 x 4320 at 60 Hz with HDR
VR Ready and Stereo: Yes, Stereo via 3-pin mini-DIN Connector Bracket
NVIDIA Quadro GV100
Unmatched compute capabilities
FP64 7.4 TFLOPS | FP32 14.8 TFLOPS | FP16 118.5 TFLOPS | INT8 59.3 TOPS
NVIDIA Quadro GV100
Features and benefits relative to GP100
Feature | GP100 | GV100 | Benefit
GPU Architecture | Pascal | Volta | Most powerful, efficient and AI optimized GPU
CUDA Cores | 3584 | 5120 | Significantly greater compute and rendering performance
FP64 Performance | 5.2 TFLOPS | 7.4 TFLOPS | 1.4x greater FP64 compute performance
Memory Size | 16 GB HBM2 | 32 GB HBM2 | 2.0x memory capacity
Memory Bus Width | 4096-bit | 4096-bit | Radically advanced memory bus implementation
Peak Memory Bandwidth | 717 GB/sec | 870 GB/sec | Move data to and from GPU 1.2x faster
Display Support | 4x DP 1.4 + 1x DVI-D DL | 4x DP 1.4 and HDCP 2.2 | Supports four 4K, 5K or 8K displays, latest HDCP
HDR Image Support | Yes | Yes | More lifelike images
Advanced Display | Quadro Sync II | Quadro Sync II | Synchronize up to 8 GPUs per system
VR Ready | Yes | Yes | GV100 implements full suite of hardware optimizations
NVLink | NVLink (First Generation) | NVLink (Second Generation) | Higher performance means lower latency
Board Power | 235 W | 250 W | Better performance per Watt
Auxiliary Power Connector | 8-pin PCIe | 8-pin PCIe | Simplified power supply connectivity
Form Factor | 4.4” H x 10.5” L Dual Slot | 4.4” H x 10.5” L Dual Slot | No significant mechanical or thermal changes
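The multipliers in the Benefit column can be reproduced from the GP100 and GV100 figures in this comparison. The sketch below is illustrative only: the dictionary keys are hypothetical names, the GV100 FP64 value is the 7.4 TFLOPS figure from the technical specifications slide, and the performance-per-Watt estimate assumes peak FP64 throughput divided by board power, which the deck itself does not define.

```python
# Figures taken from the GP100 vs. GV100 comparison slide.
gp100 = {
    "cuda_cores": 3584,
    "fp64_tflops": 5.2,
    "memory_gb": 16,
    "bandwidth_gb_s": 717,
    "board_power_w": 235,
}
gv100 = {
    "cuda_cores": 5120,
    "fp64_tflops": 7.4,   # from the technical specifications slide
    "memory_gb": 32,
    "bandwidth_gb_s": 870,
    "board_power_w": 250,
}

# GV100 / GP100 ratio for each feature.
ratios = {key: gv100[key] / gp100[key] for key in gp100}

# Assumption: "performance per Watt" taken as peak FP64 TFLOPS / board power.
fp64_per_watt_gain = (
    (gv100["fp64_tflops"] / gv100["board_power_w"])
    / (gp100["fp64_tflops"] / gp100["board_power_w"])
)

for key, value in ratios.items():
    print(f"{key}: {value:.2f}x")
print(f"FP64 per Watt: {fp64_per_watt_gain:.2f}x")
```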
NVIDIA Quadro GV100
Redefines state of the art across essential solutions
Artificial Intelligence
▪ Tensor processor cores
▪ NVIDIA GPU deep learning stack
▪ ISV DL and ML framework optimization
▪ Iterate and innovate faster
▪ Reduce training time
RTX Rendering
▪ Unrivaled FP32 performance
▪ Largest models in GPU memory
▪ AI accelerated photorealistic rendering
▪ Neural network character animation
▪ Apply AI to simultaneous video streams
Compute
▪ Industry leading HPC capabilities
▪ Work with largest datasets
▪ Integrate simulation into design process
▪ Utilize generative design algorithms
▪ Fastest FEA, CFD, CEM available
Immersive Visualization (VR)
▪ Includes VR hardware optimizations
▪ Full NVIDIA VRWorks support
▪ Create new AI-augmented technologies
▪ Visualize the largest datasets
▪ Collaborative VR environments (Holodeck)
Connect two GV100 boards with NVLink to provide 64 GB of memory and twice the GPU processing power in standard workstation enclosures
NVIDIA Quadro GV100
RTX rendering lets you dream and create at the speed of thought
Architectural Design: Visualize cities or urban street scenes in every photorealistic detail
Product Design: Design with physically based lights and materials in realtime
Media and Entertainment: Perfect every shot with GPU accelerated and AI enhanced rendering
Work at full fidelity, utilizing massive datasets with 2x larger memory capacity
Master rendering projects interactively with AI (Deep Neural Network) technology
NVIDIA Quadro
RTX supercharges rendering with AI accelerated denoising
Image panels: Denoising On, 20 Frames | Denoising Off, 20 Frames | Denoising Off, 290 Frames
High quality results with fluid visual interactivity throughout the design process
NVIDIA Quadro
Companies working with NVIDIA’s OptiX AI denoiser technology
Image courtesy of Isotropix, rendered with Clarisse and denoised with NVIDIA OptiX.
NVIDIA Quadro
CAD and CAE workflow elements
Design (CAD) | Simulation (CAE) | Post-Processing | Pre-Processing
NVIDIA Quadro GV100
Benefit from the ultimate immersive experiences
RTX Rendered Graphics | Interactive Physics | GPU-Accelerated AI | Realtime Collaboration
2x larger memory capacity (vs. GP100) lets you work with high fidelity, massive datasets
Benefit from unconstrained Holodeck experiences with full-featured VR performance and capabilities
NVIDIA Quadro GV100
Realize new opportunities with AI
▪ 32 GB or 64 GB capacity (NVLink) trains neural networks with massive datasets
▪ Develop with NVIDIA optimized Deep Learning frameworks and deploy with NGC interoperability and scalability
▪ Accelerate AI training and inferencing on workstations with Tensor cores and NVLink
NGC
Development | Aggregation | Inferencing At-The-Edge
Retail store inferencing with Quadro by DeepBlue Technology, China
NVIDIA Quadro GV100 AI Training Performance
Up to 2x improvement in Deep Learning training performance*
Chart: TensorFlow ResNet-50 Training IPS (GP100 Batch Size 256, GV100 Batch Size 256, GV100 Batch Size 512)
Chart: Caffe ResNet-50 Training IPS (GP100 Batch Size 128, GV100 Batch Size 128, GV100 Batch Size 256)
*Based on TensorFlow ResNet-50 Training. Tests run on dual Intel Xeon E5 2690 v4 at 2.6 GHz, NVIDIA driver version 390.19, ResNet-50 Training.
NVIDIA Quadro GV100 Deep Learning Training Performance
Over 2x improvement in Deep Learning training and inference performance*
Chart: TensorFlow ResNet-50 Training (GP100 Batch Size 256, GV100 Batch Size 256, GV100 Batch Size 512)
Chart: TensorRT ResNet-50 Inference (Batch Size 1, 2, 4, 8)
*Based on TensorFlow ResNet-50 Training, TensorRT ResNet-50 Inference tests. Tests run on dual Intel Xeon E5 2690 v4 at 2.6 GHz, NVIDIA driver version 390.19, ResNet-50 Training.
NVIDIA Quadro GV100 Scientific Compute Performance
More than 2x improvement over the previous generation*
Chart: LAMMPS Atomic Fluid Benchmark (GP100, GV100)
Chart: CUDA Basic Linear Algebra Solver Benchmark, cuBLAS 2560 x 2048 x 8192 (FP16, FP32, FP64)
*Based on LAMMPS molecular modeling benchmark. Tests run on dual Intel Xeon E5 2690 v4 at 2.6 GHz, NVIDIA driver version 390.19.
NVIDIA Quadro GV100 CAE Example
Significant ANSYS Mechanical 19 Acceleration*
Power Supply Module (V19cg-1)
Configuration | License | Relative Performance
4 CPU Cores | Base License | 1.0
3 CPU Cores + GV100 | Base License | 2.65
8 CPU Cores | Base + 4 HPC Licenses | 1.71
8 CPU Cores + GV100 | Base + 5 HPC Licenses | 3.90
16 CPU Cores | Base + 12 HPC Licenses | 2.29
*Power Supply Module (V19cg-1). 2x Xeon E5-2699 v4 at 2.2 GHz, 22 cores, HT off, NVIDIA driver 390.40 TCC, 256GB DRAM, CentOS 7.2.1511 64-bit. ANSYS Mechanical 19 benchmark model. Steady state thermal analysis of a power supply module, 5.3 Mdofs, JCG, real-value symmetric.
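The ANSYS Mechanical results above can be compared by computing what adding a GV100 buys relative to a CPU-only run. The sketch below pairs the chart's configurations and values in the order they appear and treats the 16-core configuration as CPU only; both are assumptions, as are the dictionary keys and the `gain` helper, which are hypothetical names used for illustration.

```python
# configuration -> (HPC licenses beyond the base license, relative performance)
# Values are read from the chart; 1.0 is the 4-CPU-core baseline.
results = {
    "4 CPU": (0, 1.00),
    "3 CPU + GV100": (0, 2.65),
    "8 CPU": (4, 1.71),
    "8 CPU + GV100": (5, 3.90),
    "16 cores": (12, 2.29),  # assumed to be a CPU-only configuration
}

def gain(config: str, reference: str) -> float:
    """Relative performance of config divided by that of reference."""
    return results[config][1] / results[reference][1]

# GPU benefit on the base license (same license cost, one core swapped for a GPU).
base_gain = gain("3 CPU + GV100", "4 CPU")

# GPU benefit at eight CPU cores (one extra HPC license for the GPU).
eight_core_gain = gain("8 CPU + GV100", "8 CPU")

# Eight cores plus a GPU versus doubling to sixteen cores.
vs_sixteen = gain("8 CPU + GV100", "16 cores")

print(f"Base license, GPU vs. CPU only: {base_gain:.2f}x")
print(f"8 cores, GPU vs. CPU only:      {eight_core_gain:.2f}x")
print(f"8 cores + GPU vs. 16 cores:     {vs_sixteen:.2f}x")
```

On these figures, the eight-core GPU configuration outperforms the 16-core run while using seven fewer HPC licenses.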
NVIDIA Quadro GV100 CAE Example
Standout ANSYS Fluent 19 Acceleration*
Pipes Model, 9.6 Million Cells
Configuration | License | Relative Performance
4 CPU Cores | Base License | 1.0
3 CPU Cores + GV100 | Base License | 1.53
8 CPU Cores | Base + 4 HPC Licenses | 1.78
8 CPU Cores + GV100 | Base + 5 HPC Licenses | 2.67
16 CPU Cores | Base + 12 HPC Licenses | 2.74
16 CPU Cores + 2x GV100 | Base + 5 HPC Licenses | 4.71
32 CPU Cores | Base + 2 HPC Packs | 3.29
32 CPU Cores + 2x GV100 | Base + 2 HPC Packs | 5.55
*Power Supply Module (V19cg-1). 2x Xeon E5-2699 v4 at 2.2 GHz, 22 cores, HT off, NVIDIA driver 390.40 TCC, 256GB DRAM, CentOS 7.2.1511 64-bit. ANSYS Mechanical 19 benchmark model. Steady state thermal analysis of a power supply module, 5.3 Mdofs, JCG, real-value symmetric.
NVIDIA Quadro GV100 Rendering Performance
SOLIDWORKS Visualize scales to over 29x faster than CPU*
Chart: rendering speed relative to CPU (CPU, P2000, P4000, P5000, P6000, GP100, GV100, 2x GP100, 2x GV100)
*Based on 2x GV100, Xeon E5-2697 v3, 14 cores at 2.6 GHz, 32 GB DRAM, Win 10 Pro 64-bit Fall Creators Update and NVIDIA driver version 390.77. Tests run at 4K UHD (3840 x 2160) resolution.
NVIDIA Quadro GV100 Graphics Performance
Up to 1.3x better than previous generation*
Chart: Quadro GP100 vs. Quadro GV100 by viewset (3dsmax, catia, creo, energy, maya, medical, showcase, snx, sw, geomean)
*Based on SPECviewperf 12.2.2 results.