INTRODUCTION TO GPU COMPUTING
Ilya Kuzovkin
13 May 2014, Tartu

PART I “TEAPOT”

SIMPLE OPENGL PROGRAM
The idea of computing on the GPU emerged because GPUs became very good at parallel computations.
Let us start by observing an example of parallelism in a simple OpenGL application.

SIMPLE OPENGL PROGRAM
You will need Code::Blocks (Windows, Linux) or Xcode (Mac) to run this example.
• Install Code::Blocks bundled with the MinGW compiler from http://www.codeblocks.org/downloads/26
• Download the codebase from https://github.com/kuz/Introduction-to-GPU-Computing
• Open the project from code/Cube
• Compile and run it

SHADER PROGRAM
A program which is executed on the GPU. It has to be written in a shading language. In OpenGL this language is GLSL, which is based on C.
OpenGL groups its shader stages into 5 main kinds:
• Vertex Shader
• Tessellation (Control and Evaluation) Shaders
• Geometry Shader
• Fragment Shader
• Compute Shader (since 4.3)
http://www.opengl.org/wiki/Shader

LIGHTING
Is it a cube or not? We will find out as soon as we add lighting to the scene.
Exercise: code the lighting equation into the fragment shader of the Cube program.
https://github.com/konstantint/ComputerGraphics2013/blob/master/Lectures/07%20-%20Color%20and%20Lighting/slides07_colorandlighting.pdf


COMPARE FPS
• Run the program with lighting enabled and look at the FPS values.
• In the idle() function of cube.cpp, uncomment the dummy code, which simulates approximately the same amount of computation as the Phong lighting model requires.
• Note that these computations are performed on the CPU.
• Observe how the FPS has changed.
Parallel computations are fast on the GPU. Let's use it to compute something useful.

PART II “OLD SCHOOL”

OPENGL PIPELINE + GLSL
• Take the input data from the CPU memory and put it as an image into the GPU memory.
• In the fragment shader, perform a computation on each of the pixels of that image.
• Store the resulting image to the Render Buffer inside the GPU memory.
• Read the output from the GPU memory back to the CPU memory.
http://www.opengl.org/wiki/Framebuffer

OPENGL PIPELINE + GLSL
• Create a texture in which to store the input data
• Create a Framebuffer Object (FBO) to “render” to
http://www.opengl.org/wiki/Framebuffer

OPENGL PIPELINE + GLSL
• Run the OpenGL pipeline
• Render GL_QUADS of the same size as the texture matrix
• Use the fragment shader to perform per-fragment computations using data from the texture
• OpenGL will store the result in the texture given to the Render Buffer (within the Framebuffer Object)
• Read the data from the Render Buffer
• Can we use that to properly debug GLSL?
http://www.opengl.org/wiki/Framebuffer

DEMO
Run the project from code/FBO

PART III “MODERN TIMES”

COMPUTE SHADER
• Since OpenGL 4.3
• Used to compute things not related to rendering directly
(Will not talk about it.)
http://web.engr.oregonstate.edu/~mjb/cs557/Handouts/compute.shader.1pp.pdf

OpenCL: supported by nVidia, AMD, Intel, Qualcomm
http://www.khronos.org/conformance/adopters/conformant-products#opencl
CUDA: supported only by nVidia hardware
https://developer.nvidia.com/cuda-gpus
• OpenCL: open. CUDA: implementations only by nVidia
• ~same performance levels
• Developer-friendly
http://wiki.tiker.net/CudaVsOpenCL

PART III, CHAPTER 1

KERNEL

WRITE AND READ DATA ON GPU
… run computations here …

THE COMPUTATION

DEMO
Open, study and run the project from code/OpenCL

PART III, CHAPTER 2

CUDA PROGRAMMING MODEL
• CPU is called “host”
• GPU is called “device”
• Allocate memory: cudaMalloc
• Move data CPU <-> GPU memory: cudaMemcpy
• Launch kernels on the GPU
