High Performance Computing on Mobile Devices through Distributed Shared CUDA
By Martinez Noriega Edgar Josafat and Dr. Narumi Tetsu
The University of Electro-Communications, Tokyo

Introduction
GPUs are everywhere! GPU characteristics:
GPU - Graphics Processing Unit
Portability. Mobility. Low processing power (ARM processors). Touch screen capabilities. Low power consumption. Huge ecosystem. Connectivity. Limited memory.
anywhere. Examples:
*Atsushi Kawai, Kenji Yasuoka, Kazuyuki Yoshikawa and Narumi Tetsu, “Distributed Shared CUDA: Virtualization of Large-Scale GPU Systems for Programmability and Reliability,” The Fourth International Conference on Future Computational Technologies and Applications, France, 2012.
Server:
Client:
DS-CUDA main specifications.
RPC (Socket) InfiniBand (Verb) RPC (Socket) InfiniBand (Verb)
64 bit 64 bit
Linux Linux
4.2
Shot 27 new ions Number of Particles: {8, 64, 216, 512, 1000, 1728, 2744, 4096, 5832}
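The particle counts listed above are exactly the cubes of the even numbers from 2 to 18. The slides do not say how the counts were chosen, so the following is only a small sketch that reproduces the list under that observation; the function name `particle_counts` is hypothetical.

```python
# Reproduce the particle counts from the slide as cubes of even numbers.
# Assumption: the list follows n**3 for n = 2, 4, ..., 18. This matches the
# slide's values, but the slides do not state this rule.

def particle_counts(max_side=18):
    """Return n**3 for every even n from 2 up to max_side (inclusive)."""
    return [n ** 3 for n in range(2, max_side + 1, 2)]


counts = particle_counts()
print(counts)  # [8, 64, 216, 512, 1000, 1728, 2744, 4096, 5832]
```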
Graphical Detail
Characteristics of CS:
integration).
medium Enable
information Enable
Device: Alienware (Knoppix 7.02 32)
  CPU: Intel Core i7, 2.30 GHz, 8 cores
  GPU: GeForce GT 680M, 7 multiprocessors, 1344 CUDA cores, global memory 2047 Mbytes
  Memory: 16 Gbytes, DDR3, 1600 MHz
  OS: Knoppix 7.0.2 x86 Linux
  CUDA: Driver 331.62, Toolkit 6.0, SDK 6.0

Device: NVIDIA “SHIELD”
  CPU: NVIDIA Tegra 4, ARMv7, 1.912 GHz, 4 cores
  GPU: NVIDIA AP, 72 custom cores
  Memory: 2 Gbytes, DDR3L & LPDDR3
  OS: Android 4.4.2
  CUDA: n/a

Device: Tegra K1
  CPU: Intel Core i7, 2.40 GHz, 8 cores
  GPU: Tegra K1 (GK20A), 1 multiprocessor, 192 CUDA cores, global memory 1746 Mbytes
  Memory: 2 Gbytes, DDR3L, 933 MHz
  OS: Linux for Tegra (Ubuntu 14.04 for ARM)
  CUDA: Driver “Custom for Jetson K1”, Toolkit 6.0, SDK 6.0
~10 GB/s, ~80 MB/s, ~8 MB/s
The “Bandwidth Test” sample from the CUDA SDK is used.
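To give a feel for what these bandwidths mean per frame, here is a minimal sketch that converts a bandwidth into a transfer time for a given particle count. The 16 bytes per particle (four 32-bit floats) and the function name `transfer_time` are assumptions for illustration, not values from the slides, and the estimate ignores latency and protocol overhead.

```python
# Rough transfer-time estimate from the approximate bandwidths on the slide.
# Assumptions (not from the slides): 16 bytes per particle, no latency,
# no protocol overhead.

BANDWIDTHS_BYTES_PER_S = {
    "~10 GB/s": 10e9,
    "~80 MB/s": 80e6,
    "~8 MB/s": 8e6,
}
BYTES_PER_PARTICLE = 16  # hypothetical: x, y, z, charge as 32-bit floats


def transfer_time(num_particles, bandwidth_bytes_per_s,
                  bytes_per_particle=BYTES_PER_PARTICLE):
    """Seconds needed to move the particle data once at the given bandwidth."""
    return num_particles * bytes_per_particle / bandwidth_bytes_per_s


for label, bandwidth in BANDWIDTHS_BYTES_PER_S.items():
    seconds = transfer_time(5832, bandwidth)
    print(f"{label}: {seconds * 1e3:.4f} ms for 5832 particles")
```

Under these assumptions the transfer time scales inversely with bandwidth, so the slowest link takes about 1250 times longer than the fastest one for the same data.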
T: time per frame in the Claret demo
T_GPU: time on the GPU
T_CPU: time on the CPU
T_COMM: time for communication between CPU and GPU
T_DISP: time to render the particles in OpenGL
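The slides do not show how these components combine into the time per frame. A minimal sketch, assuming the four components simply add with no overlap, is given below; the function names and the sample timings are hypothetical and are not measurements from the talk.

```python
# Additive per-frame time model (an assumption; the slides do not state it):
#   T = T_GPU + T_CPU + T_COMM + T_DISP

def frame_time(t_gpu, t_cpu, t_comm, t_disp):
    """Total time per frame in seconds, assuming no overlap between stages."""
    return t_gpu + t_cpu + t_comm + t_disp


def breakdown(t_gpu, t_cpu, t_comm, t_disp):
    """Percentage of the frame time spent in each stage."""
    total = frame_time(t_gpu, t_cpu, t_comm, t_disp)
    parts = {"T_GPU": t_gpu, "T_CPU": t_cpu, "T_COMM": t_comm, "T_DISP": t_disp}
    return {name: 100.0 * value / total for name, value in parts.items()}


# Hypothetical timings in seconds, chosen only to exercise the functions.
sample = dict(t_gpu=0.002, t_cpu=0.001, t_comm=0.012, t_disp=0.005)
print(f"T = {frame_time(**sample):.3f} s")
for name, share in breakdown(**sample).items():
    print(f"{name}: {share:.1f}%")
```

A percentage breakdown of this kind is what a stacked chart of T_GPU, T_CPU, T_COMM and T_DISP per particle count would display.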
[Figure: Claret Total Performance - Model vs Measured. Time (seconds) versus number of particles (8 to 5832). Series: Model, Measured.]
[Figure: Claret Total Performance (Percentage) - Model - Android. Percentage of each process on Claret (model values), 0% to 100%, versus number of particles (8 to 5832). Series: T_GPU, T_CPU, T_COMM, T_DISP.]
[Figure: Claret Total Performance (Percentage) - Model - K1. Percentage of each process on Claret (model values), 0% to 100%, versus number of particles (8 to 5832). Series: T_GPU, T_CPU, T_COMM, T_DISP.]
[Figure: Gflops versus number of particles (8 to 5832). Series: Tegra K1 - CUDA, SHIELD - DS-CUDA, SHIELD - CPU. Annotations: 1x ~ 2 200x ~ 5 700x.]
マルチイネズ ノリエガ エドガー ジョサファト
Profile
Name: Martinez Noriega Edgar Josafat (エドガー)
Residence Country: Japan
Current Status: Master's student, 2nd year, HPC
Nationality: Mexican, from Mexico City (Tlaltenco, Tlahuac)

Research Interests
High Performance Computing on mobile devices
GPU virtualization
Parallel computing: GPGPU, MPI, multithreading
Molecular dynamics

Contact
Email: edgarjosaf@gmail.com, edgarjosaf@uec.ac.jp
LinkedIn: Edgar Josafat Martinez Noriega