Parallel Runtime Environments with Cloud Database: Performance Study

SLIDE 1

Parallel Runtime Environments with Cloud Database: Performance Study for HMM with Adaptive Sampling

D. Roehm, R. S. Pavel, T. C. Germann, A. L. McPherson and C. Junghans

Los Alamos National Laboratory, NM, USA

May 8, 2015

UNCLASSIFIED (LA-UR-14-29231)

SLIDE 2

Introduction Implementation Results Summary

Motivation

  • Material modeling: time and length scale challenge
  • Micro-structure matters, but is computationally expensive
  • HMM (heterogeneous multiscale method): combination of macro- and micro-scale simulations
  • Adaptive sampling techniques: take micro-structure into account when necessary
  • Prediction (kriging) based on database values instead of executing an MD simulation
  • Model problem: laser impact on a copper plate

SLIDE 3

Elastodynamics

  • Non-oscillatory central scheme (predictor-corrector)
  • Continuum mechanics ⇒ conservation PDEs in Lagrangian coordinates
  • Evolution of deformation, momentum and energy density computed by a finite volume solver
  • Stress and energy flux evaluated with MD

On the macro level: conservation laws for mass, momentum, and energy:

$$\rho\,\partial_t A - \nabla q = 0$$
$$\partial_t q - \nabla \cdot \tau = 0$$
$$\partial_t e + \nabla \cdot j = 0$$

On the micro level: take strain, momentum density, and energy density and return stress, momentum, and energy density flux.
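As an illustration of how a finite volume solver advances conservation laws of this form, here is a minimal 1D sketch. It is not the scheme used in the talk: it swaps in the simpler first-order Lax-Friedrichs update instead of the second-order non-oscillatory central scheme, evolves only the first two conservation laws, and replaces the MD evaluation with a hypothetical linear stress law. The names RHO, STIFFNESS, micro_flux and the Gaussian initial strain are assumptions made for this example.

```python
import math

RHO = 1.0        # assumed reference density (illustrative value)
STIFFNESS = 1.0  # assumed linear-elastic modulus (illustrative value)


def micro_flux(A, q):
    """Stand-in for the micro-scale call: returns the fluxes of (A, q).

    A linear stress law tau = STIFFNESS * A replaces the MD evaluation.
    Written in the form dU/dt + dF/dx = 0, the fluxes are (-q/RHO, -tau).
    """
    tau = STIFFNESS * A
    return -q / RHO, -tau


def lax_friedrichs_step(A, q, dt, dx):
    """One first-order Lax-Friedrichs step on a periodic 1D grid."""
    n = len(A)
    fluxes = [micro_flux(A[i], q[i]) for i in range(n)]
    new_A, new_q = [], []
    for i in range(n):
        left, right = (i - 1) % n, (i + 1) % n
        scale = dt / (2.0 * dx)
        new_A.append(0.5 * (A[left] + A[right])
                     - scale * (fluxes[right][0] - fluxes[left][0]))
        new_q.append(0.5 * (q[left] + q[right])
                     - scale * (fluxes[right][1] - fluxes[left][1]))
    return new_A, new_q


def initial_state(n_cells):
    """Small Gaussian strain pulse at rest in the middle of the unit domain."""
    dx = 1.0 / n_cells
    centers = [(i + 0.5) * dx for i in range(n_cells)]
    A = [0.01 * math.exp(-((x - 0.5) / 0.05) ** 2) for x in centers]
    q = [0.0] * n_cells
    return A, q


def run(n_cells=100, n_steps=50):
    """Advance the pulse with a time step at half the CFL limit."""
    dx = 1.0 / n_cells
    wave_speed = math.sqrt(STIFFNESS / RHO)
    dt = 0.5 * dx / wave_speed
    A, q = initial_state(n_cells)
    for _ in range(n_steps):
        A, q = lax_friedrichs_step(A, q, dt, dx)
    return A, q
```

Because the update only moves flux differences between neighbouring cells, the sums of A and q over the periodic grid stay constant up to rounding, which is the discrete counterpart of the conservation laws. In an HMM setting, micro_flux is the place where the expensive micro-scale evaluation would be called.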

SLIDE 4

Database: Redis

  • Key-value storage
  • High performance
  • Support for distribution/cluster mode
  • NoSQL
  • Redis: open-source, networked, in-memory, key-value data store
  • Users of Redis: GitHub, Twitter, Stack Overflow, Craigslist, ... (info: http://redis.io)
  • Locality-aware hashes: a range along all seven dimensions of our conserved vector
  • Truncated hash based upon the specified range
  • Sort values by distance to requested input
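The locality-aware lookup can be sketched as follows. This is only one possible reading of the three points on truncated hashes, not the CoHMM implementation: every component of the seven-dimensional input is truncated to a bucket of a given width, the bucket indices form the key, and the entries found under that key are sorted by Euclidean distance to the query. A plain Python dictionary stands in for Redis so the example runs without a server; bucket_key and BucketStore are hypothetical names.

```python
import math
from collections import defaultdict


def bucket_key(vector, width):
    """Truncate each component to its bucket index and join them into a key."""
    return ":".join(str(math.floor(x / width)) for x in vector)


class BucketStore:
    """In-memory stand-in for a key-value database (assumption: not Redis)."""

    def __init__(self, width):
        self.width = width
        self._buckets = defaultdict(list)

    def put(self, vector, result):
        """Store a result under the truncated key of its input vector."""
        key = bucket_key(vector, self.width)
        self._buckets[key].append((tuple(vector), result))

    def nearest(self, query, limit=None):
        """Return entries from the query's bucket, closest input first."""
        entries = self._buckets.get(bucket_key(query, self.width), [])
        ranked = sorted(entries, key=lambda entry: math.dist(entry[0], query))
        return ranked if limit is None else ranked[:limit]


store = BucketStore(width=0.1)
store.put([0.11] * 7, "result a")
store.put([0.18] * 7, "result b")
store.put([0.31] * 7, "result c")
neighbours = store.nearest([0.17] * 7)
```

In this sketch a query only sees its own bucket, so a close neighbour just across a bucket boundary is missed; a wider range or a search over adjacent buckets would trade lookup cost for recall.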

SLIDE 5

Kriging

  • "Optimally predicting", originated in geostatistics
  • Prediction of $Z(s_0)$ at a certain position in the high-dimensional space by computing a weighted average of the known vectors in the neighborhood of the point:

$$Z'(s_0) = \sum_{i=1}^{n} \lambda_i Z(s_i)$$

    at location $s_0$, with weights $\lambda_i$ that minimize the mean-square error $E\big[(Z(s_0) - Z'(s_0))^2\big]$.
  • Calculates an error of the prediction at the same time
  • Store simulation results in key-value database
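To make the weighted average and its error estimate concrete, here is a minimal ordinary-kriging sketch for scalar values. The Gaussian covariance model, its length and sill parameters, and the function names are assumptions chosen for this example; the talk does not say which covariance model or kriging variant is used. The weights solve the usual linear system with a Lagrange multiplier that forces them to sum to one, and the returned variance is the minimized mean-square error.

```python
import math


def covariance(a, b, length=1.0, sill=1.0):
    """Assumed Gaussian covariance model between two points."""
    return sill * math.exp(-(math.dist(a, b) / length) ** 2)


def solve(matrix, rhs):
    """Solve a small dense linear system by Gaussian elimination with pivoting."""
    n = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        tail = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - tail) / aug[r][r]
    return x


def ordinary_kriging(points, values, target):
    """Return (prediction, error variance, weights) at the target point."""
    n = len(points)
    # Covariances between known points, bordered by the sum-to-one constraint.
    matrix = [[covariance(points[i], points[j]) for j in range(n)] + [1.0]
              for i in range(n)]
    matrix.append([1.0] * n + [0.0])
    rhs = [covariance(p, target) for p in points] + [1.0]
    solution = solve(matrix, rhs)
    weights, multiplier = solution[:n], solution[n]
    prediction = sum(w * z for w, z in zip(weights, values))
    variance = (covariance(target, target)
                - sum(w * k for w, k in zip(weights, rhs[:n]))
                - multiplier)
    return prediction, variance, weights


known_points = [(0.0,), (1.0,), (2.0,)]
known_values = [1.0, 2.0, 4.0]
prediction, variance, weights = ordinary_kriging(known_points, known_values, (0.5,))
```

At a point that is already in the data set the prediction reproduces the stored value and the variance drops to zero, while between points the variance is positive. In an adaptive sampling scheme, an error estimate of this kind is what decides whether the prediction is accepted or a new fine-scale simulation is run. Vector-valued results can be handled by applying the same weights to each component.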

SLIDE 6

Implementation

Macrosolver frameworks (github.com/exmatex/CoHMM)

  • Charm++ 6.6.0 (University of Illinois: charm.cs.uiuc.edu)
  • Intel CnC 1.0.002 (Intel: icnc.github.io)
  • OpenMP
  • Libcircle v0.2.1 (github.com/hpc/libcircle)

MD miniapp CoMD (github.com/exmatex/CoMD)

  • Serial C version

Compilers and libraries:

  • GCC 4.8.x
  • ICPC 15.0.0
  • Boost 1.55
  • BLAS and LAPACK

SLIDE 7

Schematics

SLIDE 8

Flat Wave Test Case

Color bar on the right-hand side: strain. Color bar on top: type of call.

SLIDE 9

Flat Wave Test Case

Colors: Type of call.

SLIDE 10

Adaptive Sampling Performance

[Plot: hits [%] vs. HMM step / N, by call type: CoMD, CoMD Duplicates, Database, Kriging Database, Kriging, Kriging Duplicates]

Overall less than 5% of CoMD calls ⇒ speedup of 25
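One way to relate the fraction of CoMD calls to a speedup is a simple cost model, sketched below. This is an assumption made for illustration, not the method used to obtain the figure on the slide: every evaluation that is not a CoMD call is assigned a fixed cost relative to one CoMD run, and the speedup is the ratio of the all-CoMD cost to the mixed cost. The function and parameter names are hypothetical.

```python
def estimated_speedup(md_fraction, relative_cost_other=0.0):
    """Idealised speedup when only a fraction of evaluations run the MD code.

    md_fraction: share of evaluations that still call CoMD (0 < f <= 1).
    relative_cost_other: assumed cost of a database or kriging evaluation,
        expressed as a fraction of one CoMD run.
    """
    if not 0.0 < md_fraction <= 1.0:
        raise ValueError("md_fraction must be in (0, 1]")
    if relative_cost_other < 0.0:
        raise ValueError("relative_cost_other must be non-negative")
    mixed_cost = md_fraction + (1.0 - md_fraction) * relative_cost_other
    return 1.0 / mixed_cost


# With free database/kriging evaluations, 4% CoMD calls give a factor of about 25.
ideal = estimated_speedup(0.04)
# A small non-zero cost for the other call types lowers the estimate.
with_overhead = estimated_speedup(0.04, relative_cost_other=0.01)
```

Under this model a speedup of 25 corresponds to a CoMD fraction of 4%, which is consistent with "less than 5%" as long as database lookups and kriging predictions are much cheaper than a CoMD run.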

SLIDE 11

Adaptive Sampling Performance: Flat Wave

[Plot: time per task / CoMD time vs. HMM step / N, by call type: CoMD, Database, Kriging Database, Kriging]

Absolute time per task

SLIDE 12

Circular Impact Test Case

[Figure: strain [MPa] and call type on the x-y grid; call types: CoMD, CoMD Duplicates (C. Dupl.), Database (DB), Kriging Database (Kr. DB), Kriging (Kr.), Kriging Duplicates (Kr. Dupl.)]

SLIDE 13

Adaptive Sampling Performance: Circular Impact

[Plot: hits [%] vs. HMM step / N, by call type: CoMD, Database, Kriging DB, Kriging]

Save approx. 10% of calls long term ⇒ speedup of 2.5

SLIDE 14

Framework Performance: Circular Impact Analytic

[Plot: time [s] vs. HMM step; series: Charm++, Intel CnC, OpenMP]

16 cores, single database, shared memory

SLIDE 15

Framework Performance: Circular Impact

[Plot: time [s] vs. HMM step; series: Charm++, Intel CnC, OpenMP]

16 cores, single database, shared memory

SLIDE 16

Framework Performance: Flat Wave

[Plot: time [s] vs. HMM step; series: Charm++, Intel CnC, Libcircle, OpenMP]

48 cores, single database, shared memory

SLIDE 17

Framework Performance: Flat Wave

[Plot: time [s] vs. HMM step; series: Charm++ old, Intel CnC old, Libcircle, Charm++ new, Intel CnC new]

144 cores, single database

SLIDE 18

Framework Performance: Circular Impact

[Plot: time [s] vs. HMM step; series: Charm++, Intel CnC, Libcircle]

480 cores with single database

SLIDE 19

Framework Pros and Cons

Based on preliminary results!

Charm++

  • More complex to implement (.ci files)
  • Great platform support, but uncommon build system
  • Good performance
  • Good documentation and support on mailing list

Intel CnC

  • Straightforward to implement
  • Needs Intel MPI or MPICH
  • Good performance with optimization efforts
  • Good documentation for basics
  • Tuners mainly undocumented

Libcircle

  • Trivial to implement
  • Great platform support
  • Performance NOT comparable
  • Manual serialization of input and output data

SLIDE 20

Summary and Outlook

  • Implemented Distributed Database Kriging for Adaptive Sampling (D2KAS) for HMM (elastodynamics) using different frameworks
  • Our adaptive scheme achieves a speedup of 2.5 to 25 [1]
  • Enables inclusion of defects, crystal domains or phase boundaries
  • One code base: Charm++, CnC, OpenMP, Libcircle (github.com/exmatex/CoHMM)

[Figure: color bar on the right-hand side: strain; color bar on top: type of call]

[1] Comp. Phys. Comm. 192, 138 (2015)

SLIDE 21

Thanks to

  • Phil Miller (Audience, UI-UC)
  • Frank Schlimbach (Intel)

This work was supported by the Los Alamos Information Science & Technology Center (IS&T) Co-Design Summer School, the U.S. Department of Energy (DOE), Office of Advanced Scientific Computing Research (ASCR) through the Exascale Co-Design Center for Materials in Extreme Environments (ExMatEx), and the Center for Nonlinear Studies (CNLS).

SLIDE 22

The End

Thank you for your attention!
