Frequency-based Overhead Compensation in HPC Application Traces



SLIDE 1

Introduction Compensation What’s new? Early results Conclusion

Frequency-based Overhead Compensation in HPC Application Traces

Alef Farah (1), Lucas Mello Schnorr (1, 2), Jean-Marc Vincent (2)

(1) GPPD - INF - UFRGS
(2) Univ. Grenoble-Alpes, France

WSPP 2016

SLIDE 2

Application tracing

  • Performance analysis
  • Logging of significant events
  • Unique identifiers (timestamp)
  • Chronological order
  • Parallel and distributed applications

SLIDE 8

Overhead in traces?

Additional instructions!

Direct perturbations
  • Logging overhead → log less → compensation

Indirect perturbations
  • Compiler optimizations → binary instrumentation
  • Cache, CPU optimizations

SLIDE 10

Compensation and overhead measurement

\[ \mathrm{event}_i^{a} = \mathrm{event}_{i-1}^{a} + \left( \mathrm{event}_i^{m} - \mathrm{event}_{i-1}^{m} \right) - O \]

  • Isolate the logging routine
  • Take enough measurements
  • Produce an estimator (e.g. mean)
  • Very fast routines → high variability
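The measurement steps and the compensation recurrence on this slide can be sketched in a few lines of Python. This is only an illustration, not the authors' tool: the function names (`measure_overhead`, `estimate_overhead`, `compensate`) are hypothetical, the superscripts a and m are assumed to denote approximated (compensated) and measured timestamps, and the first event is assumed to be kept unchanged.

```python
import time
from statistics import mean


def measure_overhead(log_event, repetitions=10_000):
    """Time an isolated logging routine; returns per-call durations in ns."""
    samples = []
    for _ in range(repetitions):
        start = time.perf_counter_ns()
        log_event()  # hypothetical stand-in for the tracer's logging call
        samples.append(time.perf_counter_ns() - start)
    return samples


def estimate_overhead(samples):
    """Estimator O for the per-event overhead; here, the sample mean."""
    return mean(samples)


def compensate(measured, overhead):
    """Apply event_i^a = event_{i-1}^a + (event_i^m - event_{i-1}^m) - O."""
    if not measured:
        return []
    approx = [measured[0]]  # assumption: the first event is left as measured
    for i in range(1, len(measured)):
        delta = measured[i] - measured[i - 1]
        approx.append(approx[-1] + delta - overhead)
    return approx


# Toy trace with integer timestamps and a constant overhead of 2 units.
compensated = compensate([0, 10, 25, 45], estimate_overhead([2, 2, 2]))
print(compensated)
```

For very fast routines the samples returned by `measure_overhead` tend to vary a lot between calls, which is the variability problem the slide points at; a single mean may then be a poor estimator.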

SLIDE 11

Frequency

  • Logging overhead is a function of the logging frequency.
  • The difference may be small, but the error is cumulative.

Also:
  • How high is the variability?
  • What can be done about it?
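To see why a small difference in the overhead estimate accumulates, the compensation recurrence from the previous slide can be unrolled. The symbols \(\hat{O}\) (the estimate actually used) and \(\varepsilon\) (its per-event error) are introduced here for illustration only, and the figures in the last line are made-up numbers, not measurements.

```latex
% Unrolling event_i^a = event_{i-1}^a + (event_i^m - event_{i-1}^m) - O
% over n events (the measured differences telescope):
\[
  \mathrm{event}_n^{a}
    = \mathrm{event}_0^{a}
    + \left( \mathrm{event}_n^{m} - \mathrm{event}_0^{m} \right)
    - n\,O
\]
% If the estimate used is \hat{O} = O + \varepsilon, the n-th compensated
% timestamp is shifted by
\[
  -\,n\,\varepsilon
\]
% Illustration: \varepsilon = 10\,\mathrm{ns} and n = 10^{6} events
% give a shift of 10^{7}\,\mathrm{ns} = 10\,\mathrm{ms}.
```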

SLIDE 12

Notes

  • Mean frequency for the entire trace
  • Regular applications
  • MPI
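One possible reading of "mean frequency for the entire trace" is sketched below: compute a single events-per-second figure from the trace, then pick the overhead estimate that corresponds to that frequency. The calibration table of (frequency, overhead) pairs and the linear interpolation are assumptions made for this example; the slides do not say how the frequency is mapped to an overhead value.

```python
from bisect import bisect_left


def mean_frequency(timestamps):
    """Mean logging frequency (events per second) over the whole trace."""
    if len(timestamps) < 2:
        raise ValueError("need at least two events")
    span = timestamps[-1] - timestamps[0]
    if span <= 0:
        raise ValueError("timestamps must span a positive duration")
    return (len(timestamps) - 1) / span


def overhead_for(freq, calibration):
    """Interpolate an overhead from sorted (frequency, overhead) pairs.

    The calibration table is hypothetical: it stands for overheads measured
    beforehand at a few known logging frequencies. Values outside the table
    are clamped to the nearest entry.
    """
    freqs = [f for f, _ in calibration]
    k = bisect_left(freqs, freq)
    if k == 0:
        return calibration[0][1]
    if k == len(calibration):
        return calibration[-1][1]
    (f0, o0), (f1, o1) = calibration[k - 1], calibration[k]
    return o0 + (o1 - o0) * (freq - f0) / (f1 - f0)


# Made-up numbers: four events over four seconds, two calibration points.
freq = mean_frequency([0.0, 1.0, 2.0, 4.0])
print(freq, overhead_for(150, [(100, 2.0), (200, 4.0)]))
```

A single mean frequency is only a reasonable summary when events are logged at a fairly steady rate, which fits the restriction to regular applications noted on the slide.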

SLIDE 13

Metrics

How to compare with previous methods?
  • Total execution time
  • Space/time view

SLIDE 14

Platform

  • 2 NUMA nodes
  • Intel Xeon E5-2630 (24 PU total)
  • 32 GB RAM
  • OpenMPI 1.6.5
  • Shared memory
  • Linux 3.16.0-51 (Ubuntu 14.04.1), GCC 4.8

SLIDE 15

Applications

  • OSU Microbenchmarks v5.2 (osu_multi_lat)
  • Ondes3D v1.1

SLIDE 16

OSU Microbenchmarks

Execution        Mean               Standard error
Uninstrumented   12.9576502561      0.280464331573357
Instrumented     13.1024921894073   0.176561479255295
Traditional      13.0582891357813   0.176510576519728
Frequency        12.9450804535228   0.176508398007264
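To make the comparison easier to read, the short script below computes how far each mean lies from the uninstrumented run. It only re-uses the means reported above (units as in the source) and adds no new measurements; the variable names are arbitrary.

```python
# Mean execution times copied from the results reported above.
means = {
    "Uninstrumented": 12.9576502561,
    "Instrumented": 13.1024921894073,
    "Traditional": 13.0582891357813,
    "Frequency": 12.9450804535228,
}

baseline = means["Uninstrumented"]

# Difference of each run from the uninstrumented baseline.
diffs = {
    name: value - baseline
    for name, value in means.items()
    if name != "Uninstrumented"
}

for name, diff in diffs.items():
    print(f"{name:<13} {diff:+.4f} ({100 * diff / baseline:+.2f}%)")
```

All three differences are smaller than the standard error reported for the uninstrumented run (about 0.28), so the comparison should be read with the measurement variability in mind.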

SLIDE 17

Ondes3D

[Figure: Ondes3D traces, panels "Compensated" and "Original". Axes: Runtime (seconds) and Process Identification. Legends: bytes (20000 to 35000); routine (MPI_Barrier, MPI_Comm_dup, MPI_Comm_rank, MPI_Comm_size, MPI_Finalize, MPI_Isend, MPI_Recv, MPI_Reduce, MPI_Wait, MPI_Wtime).]

SLIDE 18

Conclusion

  • Execution time is a function of the frequency
  • Care should be taken with measurement variability
  • Encouraging results using a coarse metric
  • Inconclusive results using a fine-grained metric

SLIDE 19

Future work

  • Test traces with higher intrusion
  • Tests in a networked environment
  • Tests with tools with a higher intrusion
  • Search for a fine-grained metric
  • Investigation with irregular applications

SLIDE 20

Thank you for the attention!

The results reported in this study were generated in virtue of the agreement between Hewlett Packard Enterprise (HPE) and the Federal University of Rio Grande do Sul (UFRGS), financed by resources in return for the exemption or reduction of the IPI tax, granted by Brazilian Law nº 8248, 1991, and its subsequent updates. This investigation also receives funds from the EU H2020 program and MCTI/RNP-Brazil through the HPC4E project (code 689772), the FAPERGS/Inria ExaSE project, the CNPq Universal project 447311/2014-0, and the CNRS/LICIA international laboratory.