Understanding GPU performance
How to get peak FLOPS (GPU version)
Kenjiro Taura
Contents
1 Data Access Performance

Data access performance
data access performance is important on GPUs too
for (N times) {
  p = p->next;
}

[Figure: a list of N elements; labels: cache line size, next pointers; all elements are linked in a random order]
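
The list in the figure can be built roughly as follows. This is a minimal host-side sketch in C, not the code behind the slides: the 64-byte element size and the names `record`, `link_random` and `chase` are assumptions made for illustration, and the GPU kernel and timing code are not shown.

```c
#include <assert.h>
#include <stdlib.h>

/* Assumption: 64-byte elements; the slide only says "cache line size". */
enum { LINE_SIZE = 64 };

/* One list element, padded so that it occupies LINE_SIZE bytes. */
typedef struct record {
  struct record *next;
  char pad[LINE_SIZE - sizeof(struct record *)];
} record;

/* Link all n elements of a[] into a single cycle in a random order.
   The visiting order is a Fisher-Yates shuffle of 0..n-1; rand() is
   used for brevity and is not uniform for very large n.
   Returns the first element of the cycle, or NULL on failure. */
record *link_random(record *a, long n, unsigned seed) {
  assert(a != NULL);
  if (n <= 0) return NULL;
  long *perm = malloc(sizeof(long) * (size_t)n);
  if (perm == NULL) return NULL;
  for (long i = 0; i < n; i++) perm[i] = i;
  srand(seed);
  for (long i = n - 1; i > 0; i--) {
    long j = rand() % (i + 1);
    long t = perm[i];
    perm[i] = perm[j];
    perm[j] = t;
  }
  /* Element perm[i] points to element perm[i+1]; the last wraps around. */
  for (long i = 0; i < n; i++) {
    a[perm[i]].next = &a[perm[(i + 1) % n]];
  }
  record *head = &a[perm[0]];
  free(perm);
  return head;
}

/* The loop from the slide: follow next pointers `steps` times. */
record *chase(record *p, long steps) {
  for (long i = 0; i < steps; i++) {
    p = p->next;
  }
  return p;
}
```

Because every element lies on exactly one cycle, chase(head, n) visits each of the n elements once and returns to head. Each load depends on the result of the previous one, so timing the loop and dividing by the number of steps gives an estimate of the latency per load.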
[Figure: latency per load in a random list traversal. x-axis: size of the region (bytes), 1024 to about 6.7 × 10^7; y-axis: latency/load (GPU cycles), 100 to 700; series: p 8, v 8]