Code Region Based Auto-Tuning Enabled Compilers
M. James Kalyan, Xiang Wang, Ahmed Eltantawy, Yaoqing Gao
Motivation
[Diagram: Developer, Binary]
[Diagram: Auto-Tuner, Binary]
[Diagram: Auto-Tuner, Tuning-Aware Compiler, Binary]
development time
the code regions of a given source and the possible optimizations on those code regions
Auto-tune: automatically make optimization decisions about the code regions
the o
This is what we call enabling the compiler for auto-tuning, which is a necessary step for code region based auto-tuning
(for the diagrammatically inclined)
We penetrate LLVM’s pass analysis to record tuning
code regions)
The code regions are identified uniquely.
The auto-tuner’s search algorithms make decisions about what optimizations to apply (auto-tuning).
These decisions are recorded as a tuning configuration in an XML format.
The tuning configuration is read by the compiler and the correct
The tuned binary is compiled and profiled, and the performance is given as feedback to the search driver.
Note: the dotted lines are executed once per tuning run.
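The loop described above (choose optimizations per code region, record them as XML, let the compiler read them, profile, feed the result back) can be sketched roughly as follows. This is a minimal illustration, not the presenters' implementation: the region IDs, the XML element and attribute names, the unroll-factor search space, the random search, and the synthetic `measure` function are all assumptions made up for the example.

```python
import random
import xml.etree.ElementTree as ET

# Hypothetical code regions, each with a unique ID (names are illustrative only).
REGIONS = [("loop_0", "loop"), ("loop_1", "loop"), ("loop_2", "loop")]
UNROLL_CHOICES = [1, 2, 4, 8]  # assumed search space for one optimization


def choose_config(rng):
    """Pick one unroll factor per code region (stand-in for a real search algorithm)."""
    return {region_id: rng.choice(UNROLL_CHOICES) for region_id, _ in REGIONS}


def config_to_xml(config):
    """Serialize the decisions as XML; the schema here is invented for the sketch."""
    root = ET.Element("tuning_configuration")
    for region_id, factor in sorted(config.items()):
        ET.SubElement(root, "code_region", id=region_id, unroll_factor=str(factor))
    return ET.tostring(root, encoding="unicode")


def xml_to_config(text):
    """What a tuning-aware compiler would do on its side: read the decisions back."""
    root = ET.fromstring(text)
    return {r.get("id"): int(r.get("unroll_factor")) for r in root.findall("code_region")}


def measure(config):
    """Placeholder for compile + profile; returns a synthetic runtime in seconds."""
    return sum(abs(factor - 4) + 1.0 for factor in config.values())


def tune(iterations, seed=0):
    """Random search: keep the configuration with the lowest measured runtime."""
    rng = random.Random(seed)
    best_config, best_time = None, float("inf")
    for _ in range(iterations):
        config = choose_config(rng)
        xml_text = config_to_xml(config)          # decisions recorded as XML
        runtime = measure(xml_to_config(xml_text))  # "compile", "profile"
        if runtime < best_time:                   # feedback to the search driver
            best_config, best_time = config, runtime
    return best_config, best_time
```

Calling `tune(20)` returns the best configuration seen and its synthetic runtime. In a real setup `measure` would invoke the compiler with the XML file and time the resulting binary.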
Results for CoreMark on x86
Name | Description | Coarse Scope | Fine Scope | Best Speedup Over Coarse | Best Speedup Over -O2
Phase ordering | Ordering of optimization passes (LLVM IR) | All modules | Per module | 1.115x | 1.196x
Loop unrolling/peeling | Factor to unroll/peel loops by (LLVM IR) | All loops | Per loop | 1.036x | 1.106x
Machine scheduling policy | Scheduling rule for instructions (x86 machine IR) | All basic blocks | Per basic block | 1.001x | 1.003x
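One way to read the two speedup columns together: if both refer to the same best fine-grained configuration (an assumption; the slides do not say so explicitly), then dividing "over -O2" by "over coarse" gives the speedup that coarse-grained tuning alone would have over -O2. A small sketch of that arithmetic, with the numbers copied from the table and the function name invented for illustration:

```python
# (speedup over coarse-grained tuning, speedup over -O2), as listed in the table.
RESULTS = {
    "Phase ordering": (1.115, 1.196),
    "Loop unrolling/peeling": (1.036, 1.106),
    "Machine scheduling policy": (1.001, 1.003),
}


def implied_coarse_over_o2(over_coarse, over_o2):
    """Speedup of coarse-grained tuning over -O2, assuming both inputs describe
    the same fine-grained configuration (this is an assumption, not a reported result)."""
    return over_o2 / over_coarse


if __name__ == "__main__":
    for name, (over_coarse, over_o2) in RESULTS.items():
        print(f"{name}: {implied_coarse_over_o2(over_coarse, over_o2):.3f}x")
```

These derived ratios are not figures from the presentation; they only make the relationship between the two columns explicit under the stated assumption.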
Iteration time = time(configuration choice) + time(compile) + time(runtime) ≈ 45s
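To see what a per-iteration cost of about 45 s means for a whole tuning run, the formula can be turned into a small budget calculation. The 45 s total comes from the slide; the split between the three components and the iteration count below are hypothetical values chosen only so the example adds up.

```python
def iteration_time(config_choice_s, compile_s, runtime_s):
    """One tuning iteration: choose a configuration, compile, run the binary."""
    return config_choice_s + compile_s + runtime_s


def total_tuning_time(iterations, per_iteration_s=45.0):
    """Wall-clock cost of a tuning run, assuming iterations execute sequentially."""
    return iterations * per_iteration_s


if __name__ == "__main__":
    # Hypothetical breakdown that sums to the reported ~45 s per iteration.
    per_iter = iteration_time(config_choice_s=1.0, compile_s=14.0, runtime_s=30.0)
    minutes = total_tuning_time(100, per_iter) / 60.0  # 100 iterations is an assumed budget
    print(f"{per_iter:.0f} s per iteration, {minutes:.0f} min for 100 iterations")
```

Because the cost is linear in the number of iterations, the search budget translates directly into tuning time.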
[Plots: Loop Auto-Tuning; Module Auto-Tuning]