slide-1
SLIDE 1

Finding Performance-Optimal Configurations for High-Performance Computing

Alexander Grebhahn, Norbert Siegmund, Sven Apel

University of Passau

FOSD Meeting 2014, Dagstuhl

slide-6
SLIDE 6

High-Performance Computing and ExaStencils

How can we identify performance-optimal components and parameters for specific hardware?

Alexander Grebhahn Finding Performance-Optimal Configurations for High-Performance Computing 2/16

slide-7
SLIDE 7

SPL Conqueror [Siegmund et al., 2012]

[Figure: partial feature selection (Local Memory, CUDA) and objective function max(performance) → prediction → optimal configuration {Local Memory, CUDA, Padding = 0, Pixels per Thread = 3}]

Advantages:

  • Detection of feature interactions
  • Transparent (i.e., influences of individual features and feature interactions are explicitly modeled and quantified)

slide-9
SLIDE 9

Influence of Individual Features

[Feature model: HIPAcc with API (CUDA, OpenCL) and Local Memory]

Identification:

[Figure: one configuration measured at 500 s, another at 800 s; difference: -300 s]

Performance difference is interpreted as the contribution of the feature in question

Heuristics:

  • Feature-wise (FW) heuristic: quantifies the influence of individual features on performance
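
The identification step can be sketched in a few lines. This is only an illustration: the slide does not say which of the two measurements contains the feature, so the sketch assumes the 500 s run includes it and the 800 s run is the same configuration without it. The function name is hypothetical and is not part of SPL Conqueror.

```python
def feature_influence(time_with_feature: float, time_without_feature: float) -> float:
    """Contribution of one feature: the difference between two measured
    configurations that differ only in that feature (negative = faster)."""
    return time_with_feature - time_without_feature


# Values from the slide; which run contains the feature is an assumption.
influence = feature_influence(500.0, 800.0)
print(influence)  # -300.0
```

Under this reading, one extra measurement per feature is enough to estimate its individual influence, which is why the feature-wise heuristic needs so few measurements.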

slide-11
SLIDE 11

Interactions Between Features

[Figure: 500 s vs. 800 s (difference: -300 s); 400 s vs. 800 s (difference: -400 s); both features combined: 100 s and 350 s]

Heuristics:

  • Pair-wise (PW) heuristic: interactions between two features
  • Higher-order (HO) heuristic: interactions between three or more features
  • Hot-spot (HS) heuristic: interactions of "hot-spot" features
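
One plausible reading of the numbers on this slide, offered as an assumption rather than as the authors' stated method: 800 s is the base configuration, the two features contribute -300 s and -400 s individually, their sum predicts 100 s for the combined configuration, and the measured 350 s reveals an interaction. The sketch below computes that remainder; all names are hypothetical.

```python
def interaction_influence(measured_both: float, base: float,
                          influence_a: float, influence_b: float) -> float:
    """Part of the combined measurement that the individual
    feature influences do not explain."""
    predicted = base + influence_a + influence_b
    return measured_both - predicted


base = 800.0                  # assumed: configuration with neither feature
influence_a = 500.0 - base    # -300.0
influence_b = 400.0 - base    # -400.0
predicted = base + influence_a + influence_b
interaction = interaction_influence(350.0, base, influence_a, influence_b)
print(predicted, interaction)  # 100.0 250.0
```

The pair-wise heuristic pays for this extra term with one additional measurement per feature pair.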

slide-14
SLIDE 14

Numerical Parameters (Non-Boolean Features)

Existing heuristics work for boolean features only!

Discretization:

[Figure: a system with one numerical parameter X [0,1,…,n] becomes a system with the Boolean features X0, X1, X2, ..., Xn]

Disadvantages:

  • Increasing number of features
  • Loss of connection between parameter values
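
A small sketch of what such a discretization might look like, and why it has the disadvantages listed above. The encoding as one mutually exclusive Boolean feature per value is an assumption based on the figure; the helper name is hypothetical.

```python
from typing import Dict, List


def discretize(name: str, values: List[int], selected: int) -> Dict[str, bool]:
    """Encode one numerical parameter as mutually exclusive Boolean features,
    one per admissible value."""
    if selected not in values:
        raise ValueError(f"{selected} is not an admissible value of {name}")
    return {f"{name}{v}": v == selected for v in values}


features = discretize("X", list(range(0, 7)), selected=3)
print([name for name, on in features.items() if on])  # ['X3']
print(len(features))  # 7
```

A single parameter with seven values becomes seven features, and nothing in the encoding says that X2 and X3 are neighbouring values of the same parameter.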

slide-17
SLIDE 17

Influence of Parameters

[Feature model: HIPAcc with API (CUDA, OpenCL), Local Memory, Padding [0..6] 3, and Pixels per Thread [1,2,3,4,5,6,7] 4]

[Plots: Padding, Pixels per Thread]

  • Determine influence of parameter values on performance
  • Learn function for each pair of parameter and feature
  • Independent sampling of parameters

Heuristics:

  • Function learning (FL) heuristic
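
The slides do not state which function family is learned, so the following is only a sketch under the assumption that a low-degree polynomial is fitted to a few sampled values of one parameter (here Padding) while everything else is held fixed. The sample measurements are invented for illustration, and the fitting routine is a plain least-squares solve, not SPL Conqueror's implementation.

```python
from typing import List, Sequence


def fit_polynomial(xs: Sequence[float], ys: Sequence[float], degree: int) -> List[float]:
    """Least-squares polynomial fit; returns c with y ~ c[0] + c[1]*x + ..."""
    n = degree + 1
    # Normal equations (V^T V) c = V^T y for the Vandermonde matrix V.
    a = [[sum(x ** (i + j) for x in xs) for j in range(n)] for i in range(n)]
    b = [sum(y * x ** i for x, y in zip(xs, ys)) for i in range(n)]
    # Gaussian elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        b[col], b[pivot] = b[pivot], b[col]
        for row in range(col + 1, n):
            factor = a[row][col] / a[col][col]
            for k in range(col, n):
                a[row][k] -= factor * a[col][k]
            b[row] -= factor * b[col]
    # Back substitution.
    coeffs = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(a[i][k] * coeffs[k] for k in range(i + 1, n))
        coeffs[i] = (b[i] - tail) / a[i][i]
    return coeffs


def predict(coeffs: Sequence[float], x: float) -> float:
    return sum(c * x ** i for i, c in enumerate(coeffs))


# Hypothetical samples: run time in seconds for four Padding values.
sampled_padding = [0.0, 2.0, 4.0, 6.0]
sampled_time = [500.0, 492.0, 500.0, 524.0]

coeffs = fit_polynomial(sampled_padding, sampled_time, degree=2)
best_padding = min(range(0, 7), key=lambda p: predict(coeffs, p))
print(best_padding)  # 2
```

Unlike the discretized encoding, the learned function covers values that were never measured, such as Padding = 1, 3 or 5 here.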

slide-18
SLIDE 18

First Results [Grebhahn et al., 2014]

Research questions:

  • What is the prediction accuracy of the different heuristics?
  • Can we predict the performance-optimal configuration?

Customizable programs:

  • Highly Scalable Multi-Grid Solver (HSMGS)
  • Multi-Grid Solver using DUNE (DUNE MGS)

slide-19
SLIDE 19

HSMGS

[Feature model: HSMGP]

  • pre-smoothing [0,…,6] 3
  • post-smoothing [0,…,6] 3
  • coarse grid solver: IP_CG, IP_AMG, RED_AMG
  • smoother: GSAC, GS, Jac, BS, RBGS, RBGSAC
  • Number of Cores [64,256,1024,4096] 64

Constraint: sum (pre-smoothing, post-smoothing) > 0
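
To get a feel for the size of this configuration space, the feature model can be enumerated directly. The sketch assumes that the smoothing ranges use integer steps of 1 and that exactly one coarse grid solver, one smoother, and one core count is selected per configuration; the variable names are illustrative.

```python
from itertools import product

PRE_SMOOTHING = range(0, 7)
POST_SMOOTHING = range(0, 7)
COARSE_GRID_SOLVERS = ["IP_CG", "IP_AMG", "RED_AMG"]
SMOOTHERS = ["GSAC", "GS", "Jac", "BS", "RBGS", "RBGSAC"]
NUM_CORES = [64, 256, 1024, 4096]


def valid_configurations():
    """Yield every configuration that satisfies pre + post > 0."""
    for pre, post, solver, smoother, cores in product(
        PRE_SMOOTHING, POST_SMOOTHING, COARSE_GRID_SOLVERS, SMOOTHERS, NUM_CORES
    ):
        if pre + post > 0:
            yield (pre, post, solver, smoother, cores)


configurations = list(valid_configurations())
print(len(configurations))  # 3456
```

Under these assumptions the count agrees with the 3 456 brute-force measurements reported for HSMGS in the results.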

slide-26
SLIDE 26

HSMGS – Results

Heu.   # M (in %)      ē ± s          x̃      δ [%]   rank
BF     3 456 (100)     -              -      -       1
FW     26 (0.8)        23.4 ± 18.7    19.0   3.8     40
PW     274 (7.9)       4.8 ± 8.6      1.8    31.4    77
HO     1 331 (38.5)    60.7 ± 67.2    41.5   270.0   312
HS     2 902 (84.0)    8.0 ± 33.9     -      270.0   55
FL     112 (3.2)       2.5 ± 3.1      1.8    -       1

Table: BF: brute force, FW: feature-wise, PW: pair-wise, HO: higher-order, HS: hot-spot, FL: function learning

slide-27
SLIDE 27

HSMGS – Feature Interactions (Pair-Wise)

[Figure: pair-wise feature interactions in HSMGS among the coarse grid solvers (IP CG, IP AMG, RED AMG), the smoothers (GS, GSAC, GSACBE, GSRB, GSRBAC, JAC), pre-smoothing steps (pre=0 to pre=6), post-smoothing steps (post=0 to post=6), and core counts (numCores 64, 256, 1024, 4096)]

slide-36
SLIDE 36

HIPAcc, DUNE MGS – Results

HIPAcc

Heu.   # M (in %)      ē ± s          x̃      δ [%]   rank
BF     13 485 (100)    -              -      -       1
HO     1 516 (11.2)    7.8 ± 10.1     4.4    9.38    1735
HS     2 881 (21.4)    3.8 ± 4.8      3.3    18.22   955
FL     216 (1.6)       32.9 ± 31.1    23.5   0.75    3544

DUNE MGS

Heu.   # M (in %)      ē ± s          x̃      δ [%]   rank
BF     2 304 (100)     -              -      -       1
HO     749 (32.6)      36.3 ± 51.7    18.2   226.9   133
HS     1 643 (71.4)    49 ± 164.7     -      161.4   215
FL     75 (3.3)        13.7 ± 12.6    10.2   48.5    10

slide-37
SLIDE 37

Conclusion

[Recap figures: SPL Conqueror workflow (partial feature selection, objective function max(performance), prediction, optimal configuration); discretization of a numerical parameter X [0,1,…,n] into X0, X1, X2, ..., Xn; pair-wise feature interactions in HSMGS]

slide-38
SLIDE 38

Future Work

Interactions between parameters

Use domain knowledge

  • Degree of the function
  • Known interactions and dependencies between features and parameters

Exploit Multi-Grid characteristics

  • Different configuration options are used in different computation parts

slide-39
SLIDE 39

Questions

grebhahn@fim.uni-passau.de


slide-40
SLIDE 40

References

Grebhahn, A., Kuckuk, S., Schmitt, C., Köstler, H., Siegmund, N., Apel, S., Hannig, F., and Teich, J. (2014). Experiments on Optimizing the Performance of Stencil Codes with SPL Conqueror. Submitted to Parallel Processing Letters.

Siegmund, N., Kolesnikov, S. S., Kästner, C., Apel, S., Batory, D., Rosenmüller, M., and Saake, G. (2012). Predicting performance via automated feature-interaction detection. In Proc. ICSE, pages 167–177. IEEE.

slide-42
SLIDE 42

HIPAcc

[Feature model: HIPAcc]

  • API: CUDA, OpenCL
  • Texture Memory: Linear1D, Linear2D, Array2D, Ldg
  • Local Memory
  • Padding [0,32,…,512]
  • Pixels per Thread [1,2,3,4] 1
  • Blocksize: 32x1, 32x2, 32x4, 32x8, 32x16, 32x32, 64x1, 64x2, 64x4, 64x8, 64x16, 128x1, 128x2, 128x4, 128x8, 256x1, 256x2, 256x4, 512x1, 512x2, 1024x1

Constraints:

  • (Array2D Padding = 0)
  • ¬(Local Memory ∧ 1024x1 ∧ Pixel Per Thread = 2)
  • ¬(Local Memory ∧ 32x32 ∧ Pixel Per Thread = 3)
  • ¬(Local Memory ∧ 64x16 ∧ Pixel Per Thread = 3)
  • ¬(Local Memory ∧ 128x8 ∧ Pixel Per Thread = 3)
  • ¬(Local Memory ∧ 256x4 ∧ Pixel Per Thread = 3)
  • ¬(Local Memory ∧ 512x2 ∧ Pixel Per Thread = 3)
  • ¬(Local Memory ∧ 1024x1 ∧ Pixel Per Thread = 3)
  • ¬(Local Memory ∧ 32x32 ∧ Pixel Per Thread = 4)
  • ¬(Local Memory ∧ 64x16 ∧ Pixel Per Thread = 4)
  • ¬(Local Memory ∧ 128x8 ∧ Pixel Per Thread = 4)
  • ¬(Local Memory ∧ 256x4 ∧ Pixel Per Thread = 4)
  • ¬(Local Memory ∧ 512x2 ∧ Pixel Per Thread = 4)
  • ¬(Local Memory ∧ 1024x1 ∧ Pixel Per Thread = 4)
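
The negated constraints on this slide all have the same shape, so a validity check can be sketched as a lookup in a set of forbidden (block size, pixels per thread) pairs that applies only when Local Memory is selected. The function and constant names are hypothetical. The "(Array2D Padding = 0)" constraint is left out because its logical connective did not survive on the slide.

```python
# Pairs (blocksize, pixels per thread) that the slide forbids
# in combination with Local Memory.
FORBIDDEN_WITH_LOCAL_MEMORY = {
    ("1024x1", 2),
    ("32x32", 3), ("64x16", 3), ("128x8", 3),
    ("256x4", 3), ("512x2", 3), ("1024x1", 3),
    ("32x32", 4), ("64x16", 4), ("128x8", 4),
    ("256x4", 4), ("512x2", 4), ("1024x1", 4),
}


def is_valid(local_memory: bool, blocksize: str, pixels_per_thread: int) -> bool:
    """True unless the configuration matches one of the forbidden triples."""
    return not (
        local_memory
        and (blocksize, pixels_per_thread) in FORBIDDEN_WITH_LOCAL_MEMORY
    )


print(is_valid(True, "1024x1", 2))   # False
print(is_valid(False, "1024x1", 2))  # True
print(is_valid(True, "32x1", 4))     # True
```

Every forbidden block size has 1024 threads, so the constraints only exclude the largest blocks when Local Memory is combined with two or more pixels per thread.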

slide-43
SLIDE 43

DUNE MGS

[Feature model: Dune MGS]

  • pre-smoothing [0,…,6] 3
  • post-smoothing [0,…,6] 3
  • preconditioner: GS, SOR
  • solver: CG, Loop, BicGSTAB, Gradient
  • Number of Cells [50,…,55] 50

Constraint: sum (pre-smoothing, post-smoothing) > 0

slide-49
SLIDE 49

HIPAcc – Results

Heu.   # M (in %)      ē ± s          x̃      δ [%]   rank
BF     13 485 (100)    -              -      -       1
FW     47 (0.3)        80.8 ± 56.3    73.6   1.50    2180
PW     702 (5.2)       17.2 ± 16.0    13.4   14.60   428
HO     1 516 (11.2)    7.8 ± 10.1     4.4    9.38    1735
HS     2 881 (21.4)    3.8 ± 4.8      3.3    18.22   955
FL     216 (1.6)       32.9 ± 31.1    23.5   0.75    3544

Table: BF: brute force, FW: feature-wise, PW: pair-wise, HO: higher-order, HS: hot-spot, FL: function learning

slide-55
SLIDE 55

DUNE MGS – Results

Heu.   # M (in %)      ē ± s          x̃      δ [%]   rank
BF     2 304 (100)     -              -      -       1
FW     25 (1.1)        32.2 ± 51      12.4   55.3    498
PW     191 (8.3)       42.2 ± 38.6    32.8   97.3    273
HO     749 (32.6)      36.3 ± 51.7    18.2   226.9   133
HS     1 643 (71.4)    49 ± 164.7     -      161.4   215
FL     75 (3.3)        13.7 ± 12.6    10.2   48.5    10

Table: BF: brute force, FW: feature-wise, PW: pair-wise, HO: higher-order, HS: hot-spot, FL: function learning