Master Thesis
Atlas Tracking Optimization on GPU
Luis Domingues
Professor: Frédéric Bapst
Supervisors: Paolo Calafiura, Wim Lavrijsen
Expert: Mathieu Monney
02/25/2015
Target
Code we started from
Luis Domingues - January 2015 2
– Take data
– Send the data to the GPU and compute on it
– Sleep while waiting for the response
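The three steps above can be sketched as a single blocking call. This is a minimal illustration, not the thesis code: the `process` kernel and the `offload` helper are hypothetical names, and the kernel only doubles each value as a stand-in for the real tracking work.

```cuda
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

// Hypothetical stand-in for the real tracking kernel: doubles each value.
__global__ void process(const float* in, float* out, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) out[i] = 2.0f * in[i];
}

// Take data, send it to the GPU, compute, and block until the result is back.
std::vector<float> offload(const std::vector<float>& host_in) {
    int n = static_cast<int>(host_in.size());
    std::vector<float> host_out(n);
    if (n == 0) return host_out;
    size_t bytes = n * sizeof(float);

    float* d_in = nullptr;
    float* d_out = nullptr;
    cudaMalloc(&d_in, bytes);
    cudaMalloc(&d_out, bytes);

    // Send: synchronous host-to-device copy.
    cudaMemcpy(d_in, host_in.data(), bytes, cudaMemcpyHostToDevice);

    // Compute: one thread per element.
    int threads = 256;
    int blocks = (n + threads - 1) / threads;
    process<<<blocks, threads>>>(d_in, d_out, n);

    // Wait: this copy returns only after the kernel has finished.
    // Whether the host thread spins or sleeps while waiting depends on
    // the device scheduling flags (e.g. cudaDeviceScheduleBlockingSync).
    cudaMemcpy(host_out.data(), d_out, bytes, cudaMemcpyDeviceToHost);

    cudaFree(d_in);
    cudaFree(d_out);
    return host_out;
}

int main() {
    std::vector<float> data = {1.0f, 2.0f, 3.0f};
    std::vector<float> result = offload(data);
    for (float v : result) printf("%.1f\n", v);
    return 0;
}
```

In this pattern the host does nothing useful while the GPU works, which is the cost that overlapping transfers and kernels tries to hide.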
[Figure: timeline of the Pixel and SCT kernels; each group of kernels sits between two time stamps.]
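One common way to take time stamps around a group of kernels is with CUDA events. The sketch below assumes that approach; the source does not say which timer was actually used. The `busy` kernel and the `time_kernel_ms` helper are hypothetical names.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical kernel standing in for the Pixel or SCT kernels.
__global__ void busy(float* x, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) x[i] = x[i] * 2.0f + 1.0f;
}

// Records one time stamp before and one after the launch and
// returns the elapsed GPU time in milliseconds.
float time_kernel_ms(float* d_x, int n) {
    cudaEvent_t start, stop;
    cudaEventCreate(&start);
    cudaEventCreate(&stop);

    cudaEventRecord(start);                 // first time stamp
    busy<<<(n + 255) / 256, 256>>>(d_x, n);
    cudaEventRecord(stop);                  // second time stamp
    cudaEventSynchronize(stop);             // wait until the stop event has completed

    float ms = 0.0f;
    cudaEventElapsedTime(&ms, start, stop);
    cudaEventDestroy(start);
    cudaEventDestroy(stop);
    return ms;
}

int main() {
    const int n = 1 << 20;
    float* d_x = nullptr;
    cudaMalloc(&d_x, n * sizeof(float));
    cudaMemset(d_x, 0, n * sizeof(float));
    printf("kernel time: %.3f ms\n", time_kernel_ms(d_x, n));
    cudaFree(d_x);
    return 0;
}
```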
[Figure: timeline of three streams (Stream1, Stream2, Stream3), each running H2D, Kernel, D2H.]
H2D = Host to device transfer
D2H = Device to host transfer
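The H2D, Kernel, D2H sequence per stream can be sketched with `cudaMemcpyAsync` and one `cudaStream_t` per stream. This is an illustration under assumptions: the `scale` kernel, the `run_streams` helper and the buffer size are hypothetical, and whether copies and kernels actually overlap depends on the device and on the host buffers being pinned.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical kernel: doubles each value.
__global__ void scale(float* x, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) x[i] *= 2.0f;
}

// Issues H2D copy, kernel and D2H copy on three independent streams
// and returns the first result value of the first stream.
float run_streams() {
    const int kStreams = 3;
    const int n = 1 << 20;
    const size_t bytes = n * sizeof(float);

    cudaStream_t streams[kStreams];
    float* h[kStreams];
    float* d[kStreams];
    for (int s = 0; s < kStreams; ++s) {
        cudaStreamCreate(&streams[s]);
        cudaMallocHost(&h[s], bytes);   // pinned host memory, needed for asynchronous copies
        cudaMalloc(&d[s], bytes);
        for (int i = 0; i < n; ++i) h[s][i] = 1.0f;
    }

    // Work within one stream runs in order; different streams may overlap.
    for (int s = 0; s < kStreams; ++s) {
        cudaMemcpyAsync(d[s], h[s], bytes, cudaMemcpyHostToDevice, streams[s]);
        scale<<<(n + 255) / 256, 256, 0, streams[s]>>>(d[s], n);
        cudaMemcpyAsync(h[s], d[s], bytes, cudaMemcpyDeviceToHost, streams[s]);
    }
    cudaDeviceSynchronize();            // wait for all streams to finish

    float first = h[0][0];
    for (int s = 0; s < kStreams; ++s) {
        cudaFreeHost(h[s]);
        cudaFree(d[s]);
        cudaStreamDestroy(streams[s]);
    }
    return first;
}

int main() {
    printf("first value: %.1f\n", run_streams());
    return 0;
}
```

The copy of one stream can then run while another stream's kernel executes, instead of every step waiting for the previous one.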
[Figure: timeline of the Pixel stream and the SCT stream; in each stream the kernels sit between two time stamps.]
– Avg Pixel: 2.03 ms
– Avg SCT: 1.95 ms
– Total avg: 3.98 ms

– Avg Pixel: 2.3 ms
– Avg SCT: 2.5 ms
– Without overlapping: 8.65 s
– With overlapping:
– They do not fill the GPU
[Figure: eight clients feeding a FIFO.]
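A FIFO shared by several clients can be sketched on the host side with a mutex-protected queue. This is only an assumption about the design the figure shows: the `Request` type, the `RequestFifo` class and the `serve` helper are hypothetical, and the GPU submission itself is left as a comment.

```cuda
#include <condition_variable>
#include <cstdio>
#include <mutex>
#include <queue>
#include <thread>
#include <vector>

// Hypothetical request: only carries the id of the client that sent it.
struct Request { int client_id; };

// Thread-safe FIFO: clients push, the server pops in arrival order.
class RequestFifo {
public:
    void push(Request r) {
        {
            std::lock_guard<std::mutex> lock(m_);
            q_.push(r);
        }
        cv_.notify_one();
    }
    Request pop() {
        std::unique_lock<std::mutex> lock(m_);
        cv_.wait(lock, [this] { return !q_.empty(); });  // sleep until a request arrives
        Request r = q_.front();
        q_.pop();
        return r;
    }
private:
    std::mutex m_;
    std::condition_variable cv_;
    std::queue<Request> q_;
};

// Starts one thread per client, then serves every request; returns the number served.
int serve(int clients) {
    RequestFifo fifo;
    std::vector<std::thread> threads;
    for (int c = 0; c < clients; ++c)
        threads.emplace_back([&fifo, c] { fifo.push(Request{c}); });

    int served = 0;
    for (int i = 0; i < clients; ++i) {
        Request r = fifo.pop();
        (void)r;  // the GPU work for this request would be launched here
        ++served;
    }
    for (auto& t : threads) t.join();
    return served;
}

int main() {
    printf("served %d requests\n", serve(8));
    return 0;
}
```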
– Without overlapping:
– With overlapping:
– Multi-threading server side: 4.7 s
[Figure: GPU hierarchy, with labels GPU, Multiprocessor and CUDA Core.]
[Figure: GPU, Multiprocessor and CUDA Core, with Kernel 1 and Kernel 2 and intra-block synchronization.]
[Figure: CUDA Core, Multiprocessor and GPU, with Kernel 1 and Kernel 2 and intra-block synchronization.]
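One reading of the figure is that two dependent kernels are merged into a single launch, with `__syncthreads()` taking the place of the kernel boundary. The sketch below shows that technique under this assumption; it is valid only when the second step reads data produced inside the same block. The `fused` kernel and the `run_fused` helper are hypothetical.

```cuda
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

// Step 1 squares each value into shared memory; step 2 adds the value of
// the next thread in the same block (wrapping around inside the block).
__global__ void fused(const float* in, float* out, int n) {
    extern __shared__ float tile[];
    int i = blockIdx.x * blockDim.x + threadIdx.x;

    // Step 1, which would otherwise be "kernel 1".
    tile[threadIdx.x] = (i < n) ? in[i] * in[i] : 0.0f;

    // Intra-block synchronization: every thread of the block must reach this
    // barrier, so it is placed outside any conditional.
    __syncthreads();

    // Step 2, which would otherwise be "kernel 2".
    int neighbour = (threadIdx.x + 1) % blockDim.x;
    if (i < n) out[i] = tile[threadIdx.x] + tile[neighbour];
}

std::vector<float> run_fused(const std::vector<float>& in, int block_size) {
    int n = static_cast<int>(in.size());
    std::vector<float> out(n);
    if (n == 0) return out;
    size_t bytes = n * sizeof(float);
    float* d_in = nullptr;
    float* d_out = nullptr;
    cudaMalloc(&d_in, bytes);
    cudaMalloc(&d_out, bytes);
    cudaMemcpy(d_in, in.data(), bytes, cudaMemcpyHostToDevice);

    int blocks = (n + block_size - 1) / block_size;
    // Third launch parameter: dynamic shared memory, one float per thread.
    fused<<<blocks, block_size, block_size * sizeof(float)>>>(d_in, d_out, n);

    cudaMemcpy(out.data(), d_out, bytes, cudaMemcpyDeviceToHost);
    cudaFree(d_in);
    cudaFree(d_out);
    return out;
}

int main() {
    std::vector<float> out = run_fused({1.0f, 2.0f, 3.0f, 4.0f}, 4);
    for (float v : out) printf("%.1f\n", v);
    return 0;
}
```

`__syncthreads()` only synchronizes threads within one block, so a dependency that crosses blocks still needs two separate kernel launches.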
– Big block size:
– Original configuration:
– Small block size:
– Big block size:
– Small block size:
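A comparison of block sizes like the one listed above can be sketched by launching the same kernel with different launch configurations and timing each launch. The block sizes 64, 256 and 1024 below are hypothetical examples, not the configurations measured in the thesis, and `touch`, `grid_size` and `time_launch_ms` are illustrative names.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical kernel: adds one to each element.
__global__ void touch(float* x, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) x[i] += 1.0f;
}

// Number of blocks needed so that blocks * block_size covers n elements.
int grid_size(int n, int block_size) {
    return (n + block_size - 1) / block_size;
}

// Times one launch with the given block size; returns milliseconds.
float time_launch_ms(float* d_x, int n, int block_size) {
    cudaEvent_t start, stop;
    cudaEventCreate(&start);
    cudaEventCreate(&stop);
    cudaEventRecord(start);
    touch<<<grid_size(n, block_size), block_size>>>(d_x, n);
    cudaEventRecord(stop);
    cudaEventSynchronize(stop);
    float ms = 0.0f;
    cudaEventElapsedTime(&ms, start, stop);
    cudaEventDestroy(start);
    cudaEventDestroy(stop);
    return ms;
}

int main() {
    const int n = 1 << 20;
    float* d_x = nullptr;
    cudaMalloc(&d_x, n * sizeof(float));
    cudaMemset(d_x, 0, n * sizeof(float));

    const int sizes[] = {64, 256, 1024};  // hypothetical small, medium and big blocks
    for (int b : sizes) {
        printf("block size %4d: %5d blocks, %.3f ms\n",
               b, grid_size(n, b), time_launch_ms(d_x, n, b));
    }
    cudaFree(d_x);
    return 0;
}
```

A single timed launch is noisy; averaging over many launches after a warm-up run would give more stable numbers.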
– Port of an algorithm to the GPU
– Communicate with the GPU
– Host side design