ASK: Adaptive Sampling Kit



slide-1
SLIDE 1

ASK: Adaptive Sampling Kit

  • P. de Oliveira Castro, E. Petit, JC. Beyler, W. Jalby

Université de Versailles St-Quentin-en-Yvelines – Exascale Computing Research

2012/08/29

  • P. Oliveira et al (UVSQ/ECR)

ASK 2012/08/29 1 / 17

slide-2
SLIDE 2

Outline

1. Building Empirical Performance Models
2. Adaptive Sampling Kit
3. Hierarchical Variance Sampling
4. Evaluation


slide-3
SLIDE 3

Motivation: Building Performance Models

Building performance models is important to:

◮ Understand performance bottlenecks
◮ Optimize applications
◮ Find best architecture for a given application (co-design)


slide-4
SLIDE 4

Motivation: Building Performance Models

How to model performance?

◮ Using simulators or analytical models
  ⋆ Architectures are complex and many factors interact (memory hierarchy, amount of parallelism, mapping, access patterns)
  ⋆ Often models are too complex or costly
◮ Black-box approach:
  ⋆ Measure performance for different hardware or software configurations (the design space)
  ⋆ Build an empirical model


slide-5
SLIDE 5

Design Space example: Jacobi Stencil code

[Figure: stencil diagram labeled X, Y, N, M]

T, number of OpenMP threads, between 1 and 32
N and M between 64 and 2048
X, Y ∈ {1, 2, 4, 8, 16}
Design space size around 31 × 10^8

What is the performance for any combination of factors?
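The order of magnitude of the design space can be rechecked by counting the values each factor may take. This is only a sketch: it assumes T, N and M take every integer value in their stated ranges, which the slide does not say explicitly, and the variable names are illustrative.

```python
# Count the configurations of the Jacobi stencil design space.
# Assumption: T, N and M take every integer value in their ranges.
threads = range(1, 33)          # T: 1..32
matrix_dim = range(64, 2049)    # N and M: 64..2048
stencil_dim = [1, 2, 4, 8, 16]  # X and Y

# One configuration per combination of (T, N, M, X, Y).
design_space_size = (
    len(threads) * len(matrix_dim) ** 2 * len(stencil_dim) ** 2
)
print(design_space_size)
```

Under these assumptions the count comes to roughly 3 × 10^9 configurations, far too many to measure exhaustively.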


slide-6
SLIDE 6

Building empirical models

Exhaustively measuring large design spaces is prohibitive.

Build an accurate performance model with as few samples as possible.

Sampling method to select which points to measure
◮ Samples must be chosen with care or the model will be biased.

Regression model to estimate the missing samples
◮ Linear, polynomial, SVM, Gaussian Process, Regression Trees, etc.

No one-size-fits-all strategy:
◮ Depending on the design space response, some models and sampling methods will work better than others
◮ Important to try different strategies
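The two ingredients, a sampling method and a regression model, can be illustrated on a toy one-dimensional design space. This is a minimal sketch, not ASK code: `measure` is a hypothetical stand-in for a costly measurement, the sampling is plain random sampling, and the model is an ordinary least-squares line.

```python
import random

def measure(x):
    # Hypothetical stand-in for a costly performance measurement.
    return 3.0 * x + 2.0

def fit_line(points):
    # Ordinary least squares for y = a * x + b.
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    a = sxy / sxx
    b = mean_y - a * mean_x
    return a, b

rng = random.Random(0)
design_space = list(range(100))

# Sampling method: measure only 10 of the 100 points.
sampled = rng.sample(design_space, 10)
points = [(x, measure(x)) for x in sampled]

# Regression model: estimate the 90 points that were not measured.
a, b = fit_line(points)
estimates = {x: a * x + b for x in design_space if x not in sampled}
```

A linear model fits this toy response exactly; on a real design space the choice of sampler and model matters, which is the point of the slide.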


slide-7
SLIDE 7

Contributions

The contributions of this work are:

◮ ASK: open-source toolkit to build empirical models
  ⋆ Easy to try different sampling strategies
◮ A novel sampling strategy: HVS
  ⋆ Evaluated on different performance characterization problems


slide-8
SLIDE 8

ASK: Adaptive Sampling Kit

Adaptive Sampling Kit (ASK) is a toolkit for building empirical models.

Modular architecture for conducting experiments:

◮ Easy to combine different sampling strategies and models
◮ Gathers state-of-the-art sampling methods
◮ Provides visualization modules to supervise the sampling
◮ Provides control modules to stop the sampling when it is accurate enough

[Figure: ASK modules: 1. Bootstrap, 2. Source, 3. Model, 4. Sampler, 5. Control (decides when to stop sampling), and Reporter (reports progress and predictive error)]
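One way to picture how the modules fit together is the loop below. It is a sketch under assumptions, not ASK's actual interface: every function name, the data flow between modules, the nearest-neighbour placeholder model and the budget-based stopping rule are all hypothetical.

```python
import random

def source(x):
    # 2. Source: hypothetical measurement of the response at point x.
    return x * x

def bootstrap(space, n, rng):
    # 1. Bootstrap: pick the initial points (here, uniformly at random).
    return rng.sample(space, n)

def build_model(measured):
    # 3. Model: nearest-neighbour predictor, a placeholder for a real model.
    def predict(x):
        nearest = min(measured, key=lambda p: abs(p - x))
        return measured[nearest]
    return predict

def sampler(space, measured, batch, rng):
    # 4. Sampler: choose the next unmeasured points (here, at random).
    remaining = [x for x in space if x not in measured]
    return rng.sample(remaining, min(batch, len(remaining)))

def control(measured, budget):
    # 5. Control: stop once the sample budget is spent (assumed rule).
    return len(measured) >= budget

def run(space, budget=20, batch=5, seed=0):
    rng = random.Random(seed)
    measured = {x: source(x) for x in bootstrap(space, batch, rng)}
    model = build_model(measured)
    while not control(measured, budget):
        for x in sampler(space, measured, batch, rng):
            measured[x] = source(x)
        model = build_model(measured)
        # Reporter: report progress after each iteration.
        print(f"reporter: {len(measured)} samples")
    return model, measured

model, measured = run(list(range(100)))
```

Because each stage is a separate function, swapping the sampler or the model only changes one call, which mirrors the modularity the slide describes.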


slide-9
SLIDE 9

Sampling methods included in ASK

Sampling methods fall in two main categories.

Static methods: Space Filling Designs
◮ Select a set of samples covering the design space
◮ All points are measured in a single batch
  ⋆ Latin Hyper Cube
  ⋆ Maximin
  ⋆ Low discrepancy
  ⋆ Random

Adaptive methods:
◮ Sampling iteratively adapts to the design space complexity
  ⋆ AMART [Li09]: a Query-By-Committee method
  ⋆ TGP + ALC [Gramacy09]: an Error-reduction method
  ⋆ HVS: a novel Error-reduction method that takes into account bias

[Figure panels: Latin Hyper Cube, Adaptive Sampling]
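As an illustration of a static space-filling design, a Latin hypercube can be drawn in a few lines. This is a generic textbook sketch on the unit cube, not the implementation shipped with ASK; the function name and parameters are illustrative.

```python
import random

def latin_hypercube(n_samples, n_dims, rng):
    # Split each dimension into n_samples equal strata and use each
    # stratum exactly once, in an independent random order per dimension.
    columns = []
    for _ in range(n_dims):
        strata = list(range(n_samples))
        rng.shuffle(strata)
        # Draw one point uniformly inside each stratum.
        columns.append([(s + rng.random()) / n_samples for s in strata])
    return [tuple(col[i] for col in columns) for i in range(n_samples)]

points = latin_hypercube(8, 2, random.Random(1))
```

Every row and every column of the 8 × 8 grid receives exactly one point, so the whole batch covers each factor evenly and can be measured at once.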


slide-10
SLIDE 10

Hierarchical Variance Sampling (1/2)

Divide the space in regions using Regression Trees
Compute the variance in each region
Sample new points proportionally to: variance upper bound × size of the region

[Figure: response f(x) against factor x, annotated with σub and s; legend: samples, last iteration samples]
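The allocation rule, new points proportional to variance upper bound × region size, can be sketched as follows. The regions and their variance upper bounds are taken as given inputs here (in HVS they come from the regression tree and the per-region variance estimate); the function name, the example numbers and the largest-remainder rounding are assumptions made for illustration.

```python
def allocate_samples(regions, budget):
    # regions: list of (variance_upper_bound, size) pairs, one per region.
    # Assumes at least one region has a positive score.
    scores = [var_ub * size for var_ub, size in regions]
    total = sum(scores)
    # Ideal (fractional) share of the budget for each region.
    exact = [budget * s / total for s in scores]
    counts = [int(e) for e in exact]
    # Largest-remainder rounding so the counts add up to the budget.
    by_remainder = sorted(
        range(len(regions)),
        key=lambda i: exact[i] - counts[i],
        reverse=True,
    )
    for i in by_remainder[: budget - sum(counts)]:
        counts[i] += 1
    return counts

# Hypothetical regions: (variance upper bound, size).
regions = [(0.5, 0.2), (0.1, 0.5), (0.05, 0.3)]
counts = allocate_samples(regions, 20)
```

A small region with high variance can still attract most of the budget, while a large but flat region receives only a few points.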


slide-11
SLIDE 11

Hierarchical Variance Sampling (2/2)


slide-16
SLIDE 16

ASK: Stencil code evaluation

Despite using only 1500 points, HVS+GBM captures the performance features of the application (25600 samples used as test set).

[Figure panel: original response]

32 cores Xeon X7550 2.00GHz


slide-17
SLIDE 17

ASK: Evaluating estimation error

Figure: Stencils, Root Mean Square Error for different ASK sampling strategies (x-axis: samples, 200 to 1400; y-axis: RMSE, 5 to 25; strategies: AMART, HVS, HVSrelative, Latin, Random)
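For reference, the error metric on the y-axis is the standard root mean square error between model predictions and test-set measurements. The helper below is a generic definition with made-up numbers, not the evaluation script used for the figure.

```python
import math

def rmse(predicted, measured):
    # Root mean square error between model predictions and measurements.
    if len(predicted) != len(measured):
        raise ValueError("sequences must have the same length")
    squared = [(p - m) ** 2 for p, m in zip(predicted, measured)]
    return math.sqrt(sum(squared) / len(squared))

# Illustrative values only.
error = rmse([1.0, 2.0, 3.0], [1.0, 2.0, 5.0])
```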


slide-18
SLIDE 18

Using the Model for prediction

Figure: Scalability of the 8x8 stencil on a 1000x1000 matrix (x-axis: T, threads, 5 to 30; y-axis: cycles per element, 20 to 100; curves: true response, HVSrelative model, ideal linear scaling)
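The "ideal linear scaling" reference curve can be read as the single-thread cost divided by the number of threads. The sketch below assumes that interpretation; the baseline value is hypothetical and is not read off the figure.

```python
def ideal_cycles_per_element(single_thread_cycles, threads):
    # Ideal linear scaling: T threads divide the single-thread cost by T.
    return single_thread_cycles / threads

baseline = 96.0  # hypothetical single-thread value, not taken from the figure
curve = {t: ideal_cycles_per_element(baseline, t) for t in (1, 2, 4, 8, 16, 32)}
```

The gap between this curve and the true response shows how far the stencil is from perfect scaling at each thread count.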


slide-19
SLIDE 19

Importance of selecting a good model

Influence of alignment stream benchmark

◮ Three store streams hitting memory
◮ Memory offsets: S(k), S(V1 + k), S(V2 + k)
◮ 4K aliasing
◮ Non-aligned access overhead

[Figure panels: SVM model, GBM model]


slide-20
SLIDE 20

Alternatives to ASK

SUrrogate MOdeling Lab (SUMO) [Gorissen2010]

◮ Mature toolbox
◮ Includes many models and sampling methods
◮ Automatic tuning of model parameters
◮ Supports modeling of multiple responses
◮ ASK specifically targets performance characterization
  ⋆ AMART [Li09] and HVS methods have been evaluated on performance problems
◮ Only supports real-valued inputs
◮ Depends on Matlab and is not open-source (but freely available for academic use)

Caret R package [Kuhn2012]

◮ Includes many models
◮ Automatic tuning of model parameters
◮ Does not handle sampling


slide-21
SLIDE 21

How to get ASK?

ASK is open-source and available at

◮ http://code.google.com/p/adaptive-sampling-kit/

The experimental data used in the paper is available at

◮ http://code.google.com/p/adaptive-sampling-kit/wiki/ExperimentalData
