MALT : MALloc Tracker
A memory profiling tool
3/02/2019 MALT, Sébastien Valat 1
Questions
We have good profiling tools for timings (e.g. Valgrind or VTune), but what about memory profiling?
Memory can be an issue: Availability of …
__thread int gblVar[SIZE];                       // global variables and TLS

double * func(int size)
{
    child_func_with_allocs();                    // indirect allocations
    void * ptr = new char[size];                 // leak: never freed
    double * ret = new double[size*size*size];   // might lead to swap for large size
    for (auto it : iter_Items) {                 // C++11 auto induced allocs (elements copied by value)
        double * buffer = new double[size];      // short life allocation
        // short and quick do stuff
        delete [] buffer;
    }
    return ret;
}
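The "C++11 auto induced allocs" case can be made visible with a small experiment. The sketch below is not how MALT works internally; it simply replaces the global operator new to count allocations, and compares a by-value range-based for loop with a by-reference one. The names g_allocCount, g_sink, allocsByValue and allocsByRef are hypothetical helpers introduced for this illustration, and the strings are assumed to be long enough to defeat the small-string optimization.

```cpp
#include <atomic>
#include <cstddef>
#include <cstdlib>
#include <new>
#include <string>
#include <vector>

// Process-wide allocation counter (hypothetical helper, not part of MALT).
static std::atomic<std::size_t> g_allocCount{0};

// Replace the global allocation functions so that every `new` is counted.
void* operator new(std::size_t size) {
    g_allocCount.fetch_add(1, std::memory_order_relaxed);
    if (void* ptr = std::malloc(size ? size : 1)) return ptr;
    throw std::bad_alloc();
}
void operator delete(void* ptr) noexcept { std::free(ptr); }
void operator delete(void* ptr, std::size_t) noexcept { std::free(ptr); }

// Volatile sink so the compiler cannot discard the loop bodies.
static const char* volatile g_sink = nullptr;

// Iterates by value: every element is copied, so long strings allocate.
std::size_t allocsByValue(const std::vector<std::string>& items) {
    const std::size_t before = g_allocCount.load();
    for (auto it : items) g_sink = it.data();
    return g_allocCount.load() - before;
}

// Iterates by const reference: no copy is made, hence no allocation.
std::size_t allocsByRef(const std::vector<std::string>& items) {
    const std::size_t before = g_allocCount.load();
    for (const auto& it : items) g_sink = it.data();
    return g_allocCount.load() - before;
}
```

With a vector of long strings, allocsByRef should report zero allocations while allocsByValue reports a non-zero count, which is the kind of hidden cost a per-line allocation count exposes.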
– Metric selector
– Inclusive/Exclusive
– Symbols
– Details of symbol or line
– Call stacks reaching the selected site
– Per line annotation
Web technology (Node.js, D3.js, jQuery, AngularJS)
– Physical
– Virtual
– Requested (malloced)
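To get a feel for the difference between physical and virtual memory, a process can inspect its own counters. The sketch below assumes a Linux system exposing /proc/self/status with the VmSize (virtual) and VmRSS (physical, resident) fields in kB; readStatusKb is a hypothetical helper written for this illustration, not a MALT function. The requested (malloced) amount is not available there and has to be tracked by intercepting the allocator.

```cpp
#include <fstream>
#include <sstream>
#include <string>

// Returns the value in kB of a field of /proc/self/status (Linux only),
// for example "VmSize" (virtual) or "VmRSS" (physical). Returns -1 if the
// file cannot be read or the field is not present.
long readStatusKb(const std::string& field) {
    std::ifstream status("/proc/self/status");
    std::string line;
    while (std::getline(status, line)) {
        // Lines look like "VmRSS:     1234 kB".
        const bool matches = line.size() > field.size()
                          && line.compare(0, field.size(), field) == 0
                          && line[field.size()] == ':';
        if (matches) {
            std::istringstream value(line.substr(field.size() + 1));
            long kb = -1;
            value >> kb;
            return kb;
        }
    }
    return -1;
}
```

Calling readStatusKb("VmSize") and readStatusKb("VmRSS") before and after allocating a large buffer, then again after writing to it, usually shows the virtual size growing first and the physical size only once the pages are touched.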
gcc -g …
malt [--config=file.ini] YOUR_PRGM [OPTIONS]
malt-webview -i malt-{YOUR_PRGM}-{PID}.json
malt-qt -i malt-{YOUR_PRGM}-{PID}.json
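For a first try of the commands above, a tiny target with an obvious allocation pattern is convenient. The function below is a hypothetical test workload, not taken from the MALT sources: each round creates and destroys a temporary buffer, so the profile should show many short-lived allocations attributed to a single line.

```cpp
#include <cstddef>
#include <numeric>
#include <vector>

// Allocates a fresh buffer on every round and frees it right away,
// producing `rounds` short-lived allocations of `size` doubles each.
double churn(std::size_t rounds, std::size_t size) {
    double total = 0.0;
    for (std::size_t i = 0; i < rounds; ++i) {
        std::vector<double> buffer(size, 1.0);  // short-lived allocation
        total += std::accumulate(buffer.begin(), buffer.end(), 0.0);
    }
    return total;
}
```

Calling churn from main in a program compiled with -g and run under malt gives a small profile in which this loop should stand out.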
[Figure: execution time (s), broken down into User, System and Idle time; annotated with a 4x factor.]
▪ Allocation rate
▪ Physical / Virtual / Requested memory
▪ Stack size for each thread (requires function instrumentation)
Many really small allocations.
Example from YALES2 with a gfortran issue.
– Small overhead.
– Metrics similar to Massif.
– Only provides snapshots of allocated memory per stack.
– Peak might not be captured.
– Lacks a real GUI.
% pprof gfs_master profile.0100.heap
   255.6  24.7%  24.7%    255.6  24.7% GFS_MasterChunk::AddServer
   184.6  17.8%  42.5%    298.8  28.8% GFS_MasterChunkTable::Create
   176.2  17.0%  59.5%    729.9  70.5% GFS_MasterChunkTable::UpdateState
   169.8  16.4%  75.9%    169.8  16.4% PendingClone::PendingClone
    76.3   7.4%  83.3%     76.3   7.4% __default_alloc_template::_S_chunk_alloc
    49.5   4.8%  88.0%     49.5   4.8% hashtable::resize
– Commercial
– Leak detection, access checking, memory debugging tools.
– Uses binary or source instrumentation.
– Windows / Red Hat
– Nice, but Windows-only and commercial
– Works out of the box
– Manages all dynamic libraries
– Slow for a large number of calls (more than about 10M)
– Needs source recompilation (available): -finstrument-functions
– Or tools for binary instrumentation: MAQAO / Pintool (experimental)
– Faster for a really large number of calls to malloc
– Only provides stacks for the instrumented binaries
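The instrumentation mode relies on a GCC feature: when code is compiled with -finstrument-functions, the compiler inserts calls to __cyg_profile_func_enter and __cyg_profile_func_exit around every function. The sketch below only illustrates that mechanism by tracking the call depth per thread; it is not MALT's implementation, and t_depth, t_maxDepth and maxCallDepth are hypothetical names. The hooks themselves must be excluded from instrumentation, otherwise they would recurse.

```cpp
#include <cstddef>

// Per-thread call depth maintained by the hooks (hypothetical bookkeeping).
static thread_local std::size_t t_depth = 0;
static thread_local std::size_t t_maxDepth = 0;

extern "C" {

// Called by GCC-instrumented code on entry to every function.
__attribute__((no_instrument_function))
void __cyg_profile_func_enter(void* thisFn, void* callSite) {
    (void)thisFn;
    (void)callSite;
    if (++t_depth > t_maxDepth) t_maxDepth = t_depth;
}

// Called by GCC-instrumented code on exit from every function.
__attribute__((no_instrument_function))
void __cyg_profile_func_exit(void* thisFn, void* callSite) {
    (void)thisFn;
    (void)callSite;
    if (t_depth > 0) --t_depth;
}

}  // extern "C"

// Deepest call nesting seen so far on the calling thread.
__attribute__((no_instrument_function))
std::size_t maxCallDepth() { return t_maxDepth; }
```

A real tool would push thisFn onto a per-thread stack instead of counting, which gives a call stack at each malloc without unwinding. This also explains the last limitation above: functions in binaries that were not compiled with the flag never call the hooks.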
– Display largest stack for thread ID
– Stack space used by functions on peak
– Stack size over time
– Thread ID
Many really small allocations.
Example from YALES2.
[Figure: comparison of valgrind-memcheck, valgrind-massif, gperf, igprof, malt and malt-finstr.]
And mostly really small allocations!
A huge number of allocations can come from a line the programmer thinks does not allocate at all!
Search for allocation-intensive functions.
Many codes produce allocations of 1 B. This is OK in moderation. Search for the minimal chunk size.
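Why 1 B allocations are costly can be checked directly: the allocator rounds each request up to its minimal chunk size. The sketch below assumes a glibc-like C library providing malloc_usable_size in <malloc.h> (other platforms use different functions); usableSizeFor is a hypothetical helper for this illustration.

```cpp
#include <cstddef>
#include <cstdlib>
#include <malloc.h>  // malloc_usable_size (glibc extension, an assumption here)

// Returns how many bytes the allocator actually made usable for a request
// of `requested` bytes, or 0 if the allocation failed.
std::size_t usableSizeFor(std::size_t requested) {
    void* ptr = std::malloc(requested);
    if (ptr == nullptr) return 0;
    const std::size_t usable = malloc_usable_size(ptr);
    std::free(ptr);
    return usable;
}
```

Comparing usableSizeFor(1) with the requested size shows how much memory a 1 B allocation really occupies, before even counting the allocator's own bookkeeping header.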