Testing AutoFDO for Geant4
Nathalie Rauschmayr
IT-CF-FPP With help from Benedikt Hegner and Shahzad Malik Muzaffar
1/29 Testing AutoFDO for Geant4 Nathalie Rauschmayr
Introduction
Idea: Autotuning
Compile, Run, Feedback (the feedback from a run goes into the next compile)
Instrumentation: gcc -fprofile-generate test.c -o test
Run (profile files: test.gcno, test.gcda)
Recompile: gcc -fprofile-use test.c -o test
Production Environment
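The instrumentation-based cycle above can be sketched as a small Python helper that only assembles the three command lines. This is a minimal sketch, not the author's tooling: the function name `instrumented_fdo_commands` is hypothetical, and the training run is assumed to be a plain invocation of the instrumented binary.

```python
import shlex


def instrumented_fdo_commands(source="test.c", binary="test"):
    """Return the instrument / run / recompile steps as argv lists.

    The function name is hypothetical; the gcc flags are the ones
    shown on the slide.
    """
    return [
        # 1. Build an instrumented binary.
        ["gcc", "-fprofile-generate", source, "-o", binary],
        # 2. Training run (assumed: run the binary directly).
        ["./" + binary],
        # 3. Recompile using the collected profile.
        ["gcc", "-fprofile-use", source, "-o", binary],
    ]


if __name__ == "__main__":
    for argv in instrumented_fdo_commands():
        print(shlex.join(argv))
```

Each list can be passed to `subprocess.run(argv, check=True)` to execute the step. Keeping the commands as data makes it easy to log or review them before running anything.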
1 Create production binary: gcc -O3 -ggdb
2 Run production binary with perf: perf record -b -e cpu/event=0xc4,umask=0x20,name=br_inst_retired_near_taken,period=1000009/pp ./test
3 Convert perf-profile: create_gcov --binary=./test
4 Recompile with converted perf-profile: gcc -O3 -fauto-profile=test.gcov test.c -o test
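The four AutoFDO steps can be written down as one command sequence. The sketch below only builds the argv lists and does not run them. Several details are assumptions that do not appear on the slide: the `perf.data` file name, the `-o` option of `perf record`, and the `--profile` / `--gcov` options of `create_gcov` (these follow the usage shown in the GCC documentation for `-fauto-profile`). The function name `autofdo_commands` is hypothetical.

```python
import shlex

# Raw perf event string as shown on the slide.
BRANCH_EVENT = (
    "cpu/event=0xc4,umask=0x20,"
    "name=br_inst_retired_near_taken,period=1000009/pp"
)


def autofdo_commands(source="test.c", binary="test",
                     perf_data="perf.data", gcov="test.gcov"):
    """Return the AutoFDO steps as argv lists (hypothetical helper)."""
    return [
        # 1. Production binary with debug info for source mapping.
        ["gcc", "-O3", "-ggdb", source, "-o", binary],
        # 2. Sample taken branches while the binary runs.
        ["perf", "record", "-b", "-e", BRANCH_EVENT,
         "-o", perf_data, "--", "./" + binary],
        # 3. Convert the perf profile (assumed --profile/--gcov options).
        ["create_gcov", f"--binary=./{binary}",
         f"--profile={perf_data}", f"--gcov={gcov}"],
        # 4. Recompile with the converted profile.
        ["gcc", "-O3", f"-fauto-profile={gcov}", source, "-o", binary],
    ]


if __name__ == "__main__":
    for argv in autofdo_commands():
        print(shlex.join(argv))
```

Unlike the instrumented workflow, step 2 runs the ordinary optimized binary, so the profile can be gathered on a production-like job.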
Training data             Run       Number of Events
FullCMS run 100 events    FullCMS   100, 500, 1k
FullCMS run 500 events    FullCMS   100, 500, 1k
FullCMS run 1k events     FullCMS   100, 500, 1k
[Figure: Runtime in [s], processing 100 events. Series: Normal, AutoFDO 100 events, AutoFDO 500 events, AutoFDO 1000 events]
[Figure: Runtime in [s], processing 500 events. Series: Normal, AutoFDO 100 events, AutoFDO 500 events, AutoFDO 1000 events]
[Figure: Runtime in [s], processing 1000 events. Series: Normal, AutoFDO 100 events, AutoFDO 500 events, AutoFDO 1000 events]
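Runtime comparisons of this kind are usually summarized as a relative change against the normal build. The sketch below is one possible way to do that, assuming several repeated runtime measurements per build and using the median to dampen outliers. The function name is hypothetical, and the numbers in the usage note are made up for illustration; they are not measurements from the slides.

```python
from statistics import median


def relative_change_percent(baseline_runs, optimized_runs):
    """Relative runtime change of the optimized build, in percent.

    Negative values mean the optimized build is faster than the
    baseline. Both arguments are sequences of runtimes in seconds.
    """
    base = median(baseline_runs)
    opt = median(optimized_runs)
    if base <= 0:
        raise ValueError("baseline runtime must be positive")
    return (opt - base) / base * 100.0
```

For example, `relative_change_percent([100.0, 102.0, 98.0], [90.0, 91.0, 89.0])` compares a median of 100 s with a median of 90 s (illustrative values only) and reports a change of about -10 percent.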
Training data                Run              Number of Events
cmsRun 20 events config1     cmsRun config1   20, 50, 100
cmsRun 50 events config1     cmsRun config1   20, 50, 100
cmsRun 100 events config1    cmsRun config1   20, 50, 100
[Figure: Runtime in [s], processing 20 events. Series: Normal, AutoFDO 20 events, AutoFDO 50 events, AutoFDO 100 events]
[Figure: Runtime in [s], processing 50 events. Series: Normal, AutoFDO 20 events, AutoFDO 50 events, AutoFDO 100 events]
[Figure: Runtime in [s], processing 100 events. Series: Normal, AutoFDO 20 events, AutoFDO 50 events, AutoFDO 100 events]
Training data                Run              Number of Events
cmsRun 100 events config1    cmsRun config2   20, 50, 100
cmsRun 100 events config2    cmsRun config2   20, 50, 100
[Figure: Runtime in [s], processing 20 events. Series: Normal, AutoFDO 100 events (one series per training data set)]
[Figure: Runtime in [s], processing 50 events. Series: Normal, AutoFDO 100 events (one series per training data set)]
[Figure: Runtime in [s], processing 100 events. Series: Normal, AutoFDO 100 events (one series per training data set)]
Training data         Run                  Number of Events
fullcms 100 events    cmsRun job config2   20, 50, 100
[Figure: Runtime in [s], processing 20 events. Series: Normal, AutoFDO 100 events]
[Figure: Runtime in [s], processing 50 events. Series: Normal, AutoFDO 100 events]
[Figure: Runtime in [s], processing 100 events. Series: Normal, AutoFDO 100 events]
libG4processes.so libAnnotated
G4EnhancedVecAllocator.hh;122;146;0;10000;9550;d9a18bb69d5efaf3d9068625ec56d66a
G4EnhancedVecAllocator.hh;137;8389;0;225;450;6a740d527b3f213d4868919fc7d9710c
G4EnhancedVecAllocator.hh;135;8389;0;10000;9550;a17d8feb82daee40febb118864576dc9
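The annotated records are semicolon-separated, so they are easy to load for further analysis. The sketch below splits one record into a file name, five integer fields, and the trailing 32-character hexadecimal field. The slides do not say what the individual fields mean, so the code keeps them as an unnamed tuple rather than guessing; the class and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AnnotatedRecord:
    source_file: str
    numbers: tuple  # five integer fields; meaning not stated in the slides
    tag: str        # trailing hexadecimal field


def parse_record(line):
    """Parse one 'file;n;n;n;n;n;hex' record (hypothetical helper)."""
    parts = line.strip().split(";")
    if len(parts) != 7:
        raise ValueError(f"expected 7 fields, got {len(parts)}")
    numbers = tuple(int(p) for p in parts[1:6])
    return AnnotatedRecord(parts[0], numbers, parts[6])
```

Applied to the first record on the slide, `parse_record` yields the source file `G4EnhancedVecAllocator.hh` and the integer fields `(122, 146, 0, 10000, 9550)`.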
[Figure: Basic block counts (axis scaled by 10^4) per source location, for 20 events, 50 events, and 100 events. The axis labels are source locations in Geant4 and the C++ standard library, for example G4InuclParticle.hh:83, G4FastVector.hh:68, G4VEnergyLossProcess.cc:1412, G4UniversalFluctuation.cc:224, and G4ProcessManager.cc:273.]
[Figure: Branch probability (axis scaled by 10^4) per source location, for 20 events, 50 events, 100 events, and without profile. The axis labels are source locations in Geant4 and the C++ standard library, for example G4InuclParticle.hh:83, G4FastVector.hh:68, G4VEnergyLossProcess.cc:1412, G4UniversalFluctuation.cc:224, and G4ProcessManager.cc:273.]
>>>readelf -S -W libG4tracking.so | less
There are 37 section headers, starting at offset 0x615a8:
Section Headers:
  [Nr] Name Type Address Off Size ES Flg Lk Inf Al
  [ 0] NULL 0000000000000000 000000 000000 00
  [ 1] .hash HASH 0000000000000190 000190 0010ec 04 A 2 8
  [...]
  [26] .gnu.switches.text.quote_paths PROGBITS 0000000000000000 051440 0006bb 00 1
  [27] .gnu.switches.text.bracket_paths PROGBITS 0000000000000000 051afb 007c71 00 1
  [28] .gnu.switches.text.system_paths PROGBITS 0000000000000000 05976c 003330 00 1
  [29] .gnu.switches.text.cpp_defines PROGBITS 0000000000000000 05ca9c 00117e 00 1
  [30] .gnu.switches.text.cpp_includes PROGBITS 0000000000000000 05dc1a 0006bb 00 1
  [31] .gnu.switches.text.cl_args PROGBITS 0000000000000000 05e2d5 0029b0 00 1
  [32] .gnu.switches.text.lipo_info PROGBITS 0000000000000000 060c85 0006f4 00 1
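To check many libraries for these extra sections, the `readelf -S -W` output can be scanned programmatically. The sketch below is a minimal, assumption-laden example: it works on the text output only, expects each section line to start with a bracketed index followed by the section name, and uses the hypothetical function name `switch_sections`.

```python
import re

# Matches lines such as "  [26] .gnu.switches.text.quote_paths PROGBITS ..."
SECTION_RE = re.compile(r"^\s*\[\s*\d+\]\s+(\S+)")


def switch_sections(readelf_output, prefix=".gnu.switches.text."):
    """Return the suffixes of all section names starting with prefix."""
    names = []
    for line in readelf_output.splitlines():
        match = SECTION_RE.match(line)
        if match and match.group(1).startswith(prefix):
            names.append(match.group(1)[len(prefix):])
    return names
```

The input text can be obtained with `subprocess.run(["readelf", "-S", "-W", path], capture_output=True, text=True).stdout`, where `path` is the shared library to inspect.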
>>>head
/data/geant4.10.01.p03/source/event/src/G4EventManager.cc: /data/geant4.10.01.p03/
/data/geant4.10.01.p03/source/event/src/G4SmartTrackStack.cc: /data/geant4.10.01.p03/
/data/geant4.10.01.p03/source/event/src/G4StackManager.cc: /data/geant4.10.01.p03/
/data/geant4.10.01.p03/source/externals/clhep/src/Evaluator.cc: /data/geant4.10.01.p03/
/data/geant4.10.01.p03/source/externals/clhep/src/LorentzRotation.cc
/data/geant4.10.01.p03/source/externals/clhep/src/LorentzVector.cc
/data/geant4.10.01.p03/source/externals/clhep/src/LorentzVectorL.cc
1 Start perf together with the job
2 Gather profiles
3 Convert and merge profiles
4 Add compiler flag in CMake scripts
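For the last step, one simple option is to pass the profile through the standard CMake flag variables at configure time. The sketch below assembles such a configure command. It is an assumption about how the flag might be added, not the setup used in the talk: the helper name `cmake_autofdo_args` and the profile file name `merged.gcov` are hypothetical, while `CMAKE_C_FLAGS` and `CMAKE_CXX_FLAGS` are standard CMake variables.

```python
def cmake_autofdo_args(profile_path, source_dir=".", base_flags="-O3"):
    """Build a cmake configure command that enables -fauto-profile.

    profile_path is the converted (and merged) profile file.
    """
    flags = f"{base_flags} -fauto-profile={profile_path}"
    return [
        "cmake",
        f"-DCMAKE_C_FLAGS={flags}",
        f"-DCMAKE_CXX_FLAGS={flags}",
        source_dir,
    ]
```

Calling `cmake_autofdo_args("merged.gcov")` returns an argv list suitable for `subprocess.run`. Because the flags are passed as single list elements, the space inside the flag string needs no extra shell quoting.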