LARGE SCALE VISUALIZATION ON GPU ACCELERATED SUPERCOMPUTERS Peter Messmer, 11/16/2015

VISUALIZATION-ENABLED SUPERCOMPUTERS
Systems: NCSA Blue Waters, CSCS Piz Daint, ORNL Titan
Example applications: Galaxy formation, Molecular dynamics, Cosmology
http://www.sdav-scidac.org/29-highlights/visualization/66-accelerated-cosmology-data-anal.html
http://blogs.nvidia.com/blog/2014/11/19/gpu-in-situ-milky-way/
http://devblogs.nvidia.com/parallelforall/hpc-visualization-nvidia-tesla-gpus/

SUPPORTING MULTIPLE VISUALIZATION WORKFLOWS
LEGACY WORKFLOW
• Separate compute & vis system
• Communication via file system
PARTITIONED SYSTEM
• Different nodes for different roles
• Communication via high-speed network
CO-PROCESSING
• Compute and visualization on same GPU
• Communication via host-device transfers or memcpy

EGL CONTEXT MANAGEMENT
Leaving it to the driver
• Top systems support OpenGL under X
• EGL: Driver based context management
• Support for full OpenGL*, not only GL ES
• Available in e.g. VTK
• New opportunities for CUDA/OpenGL** interop
[Diagram: ParaView/VMD, X-server, Tesla driver with EGL, Tesla GPU]
*Full OpenGL in r355.11; **CUDA interop in r358.7

EFFICIENT RENDERING AT SCALE
Modern networks remove compositing bottleneck
• Sort-last compositing was perceived as a bottleneck
• Today: fast networks, pipelining and novel algorithms
• > 30 fps on 4k frames on 1024 nodes possible
• Enables real-time viz at large concurrency
• Enables very large geometries (e.g. Piz Daint: 30 TB of GPU memory)
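To make the sort-last idea concrete: every node renders its own share of the data into a full-size image with depth, and the partial images are then merged by keeping the nearest fragment per pixel. The following is only a minimal sketch under stated assumptions, not the method used on the systems above. It handles opaque geometry only, it runs the reduction in the memory of one process as a stand-in for the network exchange, and the names `Pixel`, `composite_pair` and `composite_tree` are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One pixel of a partial image rendered by a single node: color plus depth.
 * Assumed convention: a smaller depth is closer to the camera. */
typedef struct {
    uint8_t r, g, b, a;
    float depth;
} Pixel;

/* Merge src into dst pixel by pixel, keeping the nearer fragment. */
void composite_pair(Pixel *dst, const Pixel *src, size_t n_pixels)
{
    for (size_t i = 0; i < n_pixels; ++i) {
        if (src[i].depth < dst[i].depth) {
            dst[i] = src[i];
        }
    }
}

/* Reduce the partial images of n_nodes nodes into images[0] using a binary
 * tree: about log2(n_nodes) rounds instead of n_nodes - 1 sequential merges.
 * In a distributed run each merge would be preceded by a network transfer;
 * here all images simply live in one address space (an assumption made to
 * keep the sketch self-contained). */
void composite_tree(Pixel **images, size_t n_nodes, size_t n_pixels)
{
    assert(images != NULL);
    for (size_t stride = 1; stride < n_nodes; stride *= 2) {
        for (size_t i = 0; i + stride < n_nodes; i += 2 * stride) {
            composite_pair(images[i], images[i + stride], n_pixels);
        }
    }
}
```

After `composite_tree` returns, `images[0]` holds the final frame. The merges within one round are independent of each other, which is the property that lets a fast network and pipelining hide much of the compositing cost.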

NVLINK HIGH-SPEED GPU INTERCONNECT
[Diagram, 2014: Kepler GPU, PCIe to x86, ARM64, POWER CPU]
[Diagram, 2016: Pascal GPU, NVLink to POWER CPU, PCIe to x86, ARM64, POWER CPU]

NVLINK UNLEASHES MULTI-GPU PERFORMANCE
Over 2x Application Performance Speedup When Next-Gen GPUs Connect via NVLink Versus PCIe
• GPUs Interconnected with NVLink
• 5x Faster than PCIe Gen3 x16
[Diagram: CPU, PCIe Switch, two Tesla GPUs]
[Chart: Speedup vs PCIe based Server, axis from 1.00x to 2.25x; applications: ANSYS Fluent, Multi-GPU Sort, LQCD QUDA, AMBER, 3D FFT]
3D FFT, ANSYS: 2 GPU configuration; all other apps comparing 4 GPU configuration. AMBER Cellulose (256x128x128), FFT problem size (256^3)

CUDA Super Simplified Memory Management Code

CPU Code

    void sortfile(FILE *fp, int N) {
      char *data;
      data = (char *)malloc(N);
      fread(data, 1, N, fp);
      qsort(data, N, 1, compare);
      use_data(data);
      free(data);
    }

CUDA 6 Code with Unified Memory

    void sortfile(FILE *fp, int N) {
      char *data;
      cudaMallocManaged(&data, N);
      fread(data, 1, N, fp);
      qsort<<<...>>>(data,N,1,compare);
      cudaDeviceSynchronize();
      use_data(data);
      cudaFree(data);
    }

OpenACC
Simple | Powerful | Portable
Fueling the Next Wave of Scientific Discoveries in HPC

    main()
    {
      <serial code>
      #pragma acc kernels //automatically runs on GPU
      {
        <parallel code>
      }
    }

• 8000+ Developers using OpenACC
• University of Illinois PowerGrid, MRI Reconstruction: 70x Speed-Up, 2 Days of Effort
• RIKEN Japan NICAM, Climate Modeling: 7-8x Speed-Up, 5% of Code Modified
http://www.cray.com/sites/default/files/resources/OpenACC_213462.12_OpenACC_Cosmo_CS_FNL.pdf
http://www.hpcwire.com/off-the-wire/first-round-of-2015-hackathons-gets-underway
http://on-demand.gputechconf.com/gtc/2015/presentation/S5297-Hisashi-Yashiro.pdf
http://www.openacc.org/content/experiences-porting-molecular-dynamics-code-gpus-cray-xk7
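As an illustration of what the `<parallel code>` placeholder in the OpenACC skeleton might contain, here is a minimal sketch using a SAXPY loop. The loop, the function name `saxpy` and the explicit data clauses are assumptions chosen for illustration and are not taken from the applications cited on this slide. With an OpenACC-capable compiler the loop may be offloaded to the GPU; a compiler without OpenACC support ignores the pragma and runs the same loop serially on the CPU.

```c
#include <assert.h>
#include <stddef.h>

/* Computes y[i] = a * x[i] + y[i] for i in [0, n).
 * The restrict qualifiers tell the compiler that x and y do not overlap,
 * which helps it prove that the loop iterations are independent. */
void saxpy(size_t n, float a, const float *restrict x, float *restrict y)
{
    assert(n == 0 || (x != NULL && y != NULL));

    /* copyin: x is only read on the device.
     * copy:   y is copied to the device and back to the host. */
    #pragma acc kernels copyin(x[0:n]) copy(y[0:n])
    for (size_t i = 0; i < n; ++i) {
        y[i] = a * x[i] + y[i];
    }
}
```

The source stays a single code path for CPU and GPU builds, which is the portability argument the slide makes; whether and how the loop is actually parallelized is left to the compiler.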

MODERN OPENGL FOR HPC VIZ
Mandatory to access advanced rendering features
• VTK now supports OpenGL 3.2
• Access to new shaders (AO, VXGI, ..)
• Some algorithms well suited for distributed memory rendering
• GPU hardware support: multi-casting for VXGI

HIGH FRAMERATE = MINIMAL IMPACT ON SIMULATION
FPS matter, even in HPC
• Real-time visualization only one use case
• Batch processing will not immediately disappear
• Acceptable time budget for visualization/analysis
• More diagnostics in the same time
• ParaView Cinema

ACCELERATED REMOTE RENDERING WITH VIDEO ENCODING
Interactivity over large distances
• H264 encoder, lossy and lossless (Maxwell+)
• Separate unit, does not consume “GPU resources”
• Leveraged by commercial and free tools
• Available on e.g. Titan
• Possible use for non-video data
https://developer.nvidia.com/nvidia-video-codec-sdk

SCALABLE RENDERING AND COMPOSITING
NVIDIA INDEX
• Large-scale (volume) data visualization
• Interactive visualization of TB of data
• Stand-alone or coupling into simulation
• HW accelerated remote rendering
• Plugin for ParaView
http://www.nvidia-arc.com/products/nvidia-index.html

NVIDIA INDEX FOR PARAVIEW
• Scalable volume rendering solution in ParaView for large data (Evaluation version available in Q1 2016)
• Uses GPU clusters to deliver interactivity performance needed by scientists
“I was very impressed with the responsive performance and high quality volume rendering of NVIDIA IndeX for ParaView on terabytes of data from my large thunderstorm simulation. Being able to interact with the full dataset in real-time is tremendously useful to me in uncovering science that is not currently possible with other solutions.”
- Dr. Leigh Orf, U. of Wisconsin-Madison

IN-SITU VISUALIZATION ON TITAN
• First prototype of ParaView in-situ visualization capabilities in PyFR (CFD) simulations, predicting jet engine acoustics
• Both compute and visualization running on Titan GPUs and streaming to a remote location
“When running PyFR at scale, it generates very large data sets that need analyzing for acoustics. The traditional post hoc method is simply not fit for purpose – in situ visualization and processing are critical. We see a potential for 50x speed ups with in situ, which significantly accelerates our scientific discovery”
- Dr. Peter Vincent, Imperial College

VISUALIZATION ON TESLA
Efficiency | Fidelity | Flexibility
• HW accelerated rendering
• Advanced rendering algorithms
• Remoting support
• Scalable visualization
• Simulation interop
• Multiple configurations for viz+sim
• Improved perception
• Maximized data locality
• Faster feedback

VISUALIZATION ON GPU ACCELERATED SUPERCOMPUTERS
• GPU accelerated supercomputers support different visualization workflows
• Filter and render on GPU
• Use of hardware accelerated OpenGL features simplified by EGL
• Fast compositing enables efficient distributed memory rendering at high frame rate or minimal overhead
• Compression hardware enables image delivery at high frame rates
• Use of advanced OpenGL in tools enables novel capabilities (often with GPU support)
• NVLink simplifies locality management
