 
Managing Complexity in the Parallel Sparse Grid Combination Technique
J. W. Larson (1), P. E. Strazdins (2), M. Hegland (1), B. Harding (1), S. Roberts (1), L. Stals (1), A. P. Rendell (2), M. Ali (2), J. Southern (3)
(1) Mathematical Sciences Institute, The Australian National University
(2) Research School of Computer Science, The Australian National University
(3) Fujitsu Laboratories, Europe
July 19, 2016
J. Larson et al. Managing Complexity in the Parallel CT
outline
the emerging hpc landscape: Ultraproblems at Ultrascale; Faults and Fault-Tolerant Techniques (FT)
understanding complexity so we can manage it: Sparse Grids; The Sparse Grid Combination Technique; Complexity Metrics; Implications for the Parallel SGCT
managing complexity: Numerical MapReduce Framework (NuMRF); Parallel SGCT Implementation
part i: the emerging hpc landscape
the road from petascale to exascale
Ultra-parallelism: > O(10^6) cores. High scaling efficiency required. A large number of hardware components, each with a finite probability of failure. Hardware faults such as node failures will become routine for large-scale applications running on these platforms.
This means... Future ultrascale applications must embody FT: run successfully through node failures; recover from faults, or at least checkpoint/exit gracefully.
fault recovery and fault-tolerance
Technological approaches: replication/redundancy; runtime checkpointing with recovery through restart/task reassignment; runtime recreation of lost data using neighboring data.
Algorithm-based FT (ABFT):
Huang and Abraham (1984): row/column checksums to correct for computational errors
Du et al. (2012): checksum-based fail/stop fault tolerance in LU & QR decompositions
Liu (2002); Geist and Engleman (2007): chaotic relaxation
Dean and Ghemawat (2004): MapReduce
Our group: sparse grid combination method with built-in runtime fault-tolerance
“...computational techniques for one mill...BILLION processing elements!”
part ii: understanding complexity so we can manage it
what is a sparse grid?
A solution to a complexity problem: the number of gridpoints on a d-dimensional isotropic grid grows exponentially w.r.t. d. This is the curse of dimensionality.
A sparse grid provides fine-scale resolution in each dimension, but not combined fine scales from all multidimensional subspaces. It is constructed from a number of coarser component grids that are fine-scale in some dimensions but coarse in others. Sparse grids were developed to solve problems in high dimensions.
sparse grids reduce problem size dramatically
$|F| \propto 2^{Ld}$, $\quad |S| \propto 2^{L} L^{d-1}$, $\quad R_C = \frac{|F|}{|S|} \propto \left( \frac{2^{L}}{L} \right)^{d-1}$
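To get a feel for the scalings above, the short sketch below evaluates them for one choice of (L, d). It drops all constants of proportionality, so the numbers are order-of-magnitude indicators only; the function name `size_estimates` is an illustrative assumption, not part of any code from the talk.

```python
def size_estimates(L, d):
    """Evaluate the scaling expressions with all constants set to 1."""
    full = 2 ** (L * d)               # |F| scales as 2^(L d)
    sparse = 2 ** L * L ** (d - 1)    # |S| scales as 2^L L^(d-1)
    ratio = (2 ** L / L) ** (d - 1)   # R_C scales as (2^L / L)^(d-1)
    return full, sparse, ratio


full, sparse, ratio = size_estimates(L=10, d=3)
print(f"|F| ~ {full:.3e}, |S| ~ {sparse:.3e}, R_C ~ {ratio:.3e}")
```

The ratio grows with both L and d, which is why the savings matter most in high dimensions.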
geometric definition of sparse grid
[Figure: a simple sparse grid, shown as a union (∪) of component grids]
[Figure: the sparse grid in frequency/scale space, shown as a union (∪) of component grids]
The sparse grid captures fine scales in both dimensions but not joint fine scales.
constructing sparse grids from the scale lattice, part i
Consider a one-dimensional level $L \ge 0$ equipartition of a closed interval into $2^{L}$ segments. Including boundaries, this partition results in $2^{L} + 1$ grid points.
Generalize to a closed box domain of dimensionality $d$ with a $d$-dimensional tensor product grid of the dimensions' grids. The result is an isotropic full grid $F$ having $|F| = (2^{L} + 1)^{d}$ grid points.
Suppose instead we choose, for each dimension $1 \le j \le d$, a partition of level $0 \le l_j \le L$. The result is a component grid $G_{\vec{l}}$ of level $\vec{l}$ having $|G_{\vec{l}}| = \prod_{i=1}^{d} (2^{l_i} + 1)$ grid points.
The index vector $\vec{l}$ defines a point on the scale lattice $\mathcal{L}$ of level $L$, equivalent to a unique gridded partition of the closed $d$-dimensional box domain.
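The two point counts on this slide are easy to check numerically. The sketch below is a minimal illustration; the function names are assumptions made for this example.

```python
from math import prod


def full_grid_points(L, d):
    """Number of points in the isotropic full grid: (2^L + 1)^d."""
    return (2 ** L + 1) ** d


def component_grid_points(levels):
    """Number of points in the component grid with per-dimension levels l_i."""
    return prod(2 ** l + 1 for l in levels)


print(full_grid_points(5, 2))          # full 2D grid at level 5
print(component_grid_points((5, 0)))   # fine in x, coarsest in y
print(component_grid_points((3, 2)))   # anisotropic component grid
```

A component grid with all levels equal to L reproduces the full-grid count.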
constructing sparse grids from the scale lattice, part ii
Each point $\vec{l} \in \mathcal{L}$ defines a $d$-dimensional grid $G_{\vec{l}}$, a member of the set of all grids generated by the scale lattice.
Grids with $l_1 = l_2 = \cdots = l_d$ are called isotropic. The full grid $F$ is isotropic with $l_1 = l_2 = \cdots = l_d = L$.
The level-$L$ sparse grid $S$ is the union of a set $\mathcal{G}$ of component grids: $S = \bigcup_{C(\vec{l})} G_{\vec{l}}$.
The set $\mathcal{G}$ is defined by a constraint $C(\vec{l})$ on $\vec{l}$. For example, the classic combination's constraint on the sum $|\vec{l}|_1$ of the level indices is $|\vec{l}|_1 \in \{ L, L-1, \ldots, L-d+1 \}$, with $L \ge d-1$.
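As a small illustration of the classic constraint, the sketch below enumerates the level vectors whose index sum lies in {L, L-1, ..., L-d+1}, with every level index between 0 and L. The brute-force search over the whole lattice is chosen for clarity, not efficiency, and the function name is an assumption for this example.

```python
from itertools import product


def classic_index_set(L, d):
    """Level vectors satisfying the classic combination constraint."""
    if L < d - 1:
        raise ValueError("the classic constraint requires L >= d - 1")
    allowed_sums = range(L - d + 1, L + 1)
    return [l for l in product(range(L + 1), repeat=d) if sum(l) in allowed_sums]


grids = classic_index_set(L=5, d=2)
print(len(grids))
print(grids[:4])
```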
general combination formulae
The classic combination solution $f^{C}_{L}(\vec{x})$ for level $L$ in $d$ dimensions is, in terms of the component grid solutions $f_{\vec{l}}(\vec{x})$,
$$ f^{C}_{L}(\vec{x}) = \sum_{q=0}^{d-1} (-1)^{q} \binom{d-1}{q} \sum_{|\vec{l}|_1 = L-q} f_{\vec{l}}(\vec{x}) $$
It is possible to include $m \le L-1$ hyperplanes' worth of "spare" component grids for FT. These spare grids are used only when one or more classic combination component grids are lost due to fault(s).
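The coefficients in the classic combination formula can be tabulated per component grid. The sketch below assumes level indices start at 0 and L >= d - 1, as on the preceding slides; the function name is hypothetical. A useful sanity check is that the coefficients sum to 1, so a constant function is reproduced exactly.

```python
from itertools import product
from math import comb


def combination_coefficients(L, d):
    """Map each level vector l with |l|_1 = L - q to (-1)^q * C(d-1, q)."""
    coeffs = {}
    for q in range(d):
        c = (-1) ** q * comb(d - 1, q)
        for l in product(range(L - q + 1), repeat=d):
            if sum(l) == L - q:
                coeffs[l] = c
    return coeffs


coeffs = combination_coefficients(L=5, d=2)
print(coeffs[(5, 0)], coeffs[(4, 0)])
print(sum(coeffs.values()))
```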
classic combination and example ft scenarios
[Figure panels: classic combination; loss of (3, 4); loss of (2, 5)]
building solvers on sparse grids
algorithm
1. Pick a set $\mathcal{G}$ of multidimensional, coarser component grids.
2. Solve on each component grid $G_{\vec{l}}$ (interpolate to $S$).
3. (Linear) combination of the component grids' solutions for the solution on $S$.
4. Optional: interpolate the solution from $S$ to $F$.
5. Time evolution/iteration: propagate the solution on $S$ back to each $G_{\vec{l}} \in \mathcal{G}$.
Error bounds for solutions on the sparse grid can be computed based on the scheme used on the component grids and the combination method.
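The steps above can be sketched for d = 2 on the unit square. This is a simplified illustration under several assumptions: the "solver" merely samples a known function on each component grid, each component solution is interpolated bilinearly straight to the full grid F (folding steps 2 to 4 together), and there is no time iteration. All function names are hypothetical.

```python
import numpy as np


def solve_on_component_grid(f, l):
    """Stand-in solver: sample f on the component grid with levels l = (l1, l2)."""
    x = np.linspace(0.0, 1.0, 2 ** l[0] + 1)
    y = np.linspace(0.0, 1.0, 2 ** l[1] + 1)
    return x, y, f(x[:, None], y[None, :])


def interpolate_to(x, y, values, xf, yf):
    """Bilinear interpolation as two passes of 1D linear interpolation."""
    along_x = np.stack(
        [np.interp(xf, x, values[:, j]) for j in range(len(y))], axis=1
    )
    return np.stack(
        [np.interp(yf, y, along_x[i, :]) for i in range(len(xf))], axis=0
    )


def combine_2d(f, L):
    """Classic 2D combination: grids with |l|_1 = L minus grids with |l|_1 = L - 1."""
    xf = yf = np.linspace(0.0, 1.0, 2 ** L + 1)
    result = np.zeros((xf.size, yf.size))
    for q, coeff in ((0, 1.0), (1, -1.0)):
        for l1 in range(L - q + 1):
            l = (l1, L - q - l1)
            x, y, values = solve_on_component_grid(f, l)
            result += coeff * interpolate_to(x, y, values, xf, yf)
    return xf, yf, result


def smooth(x, y):
    return np.sin(np.pi * x) * np.sin(np.pi * y)


xf, yf, approx = combine_2d(smooth, L=6)
exact = smooth(xf[:, None], yf[None, :])
print("max abs error:", np.abs(approx - exact).max())
```

Because bilinear interpolation is exact for functions that are linear in each variable and the coefficients sum to 1, the combination reproduces such functions to rounding error, which makes a convenient correctness check.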
sgct complexity: number of component grids
$$ |\mathcal{G}| = \sum_{k=0}^{d-1} \binom{L-k+d-1}{d-1}, \qquad |\mathcal{G}_{FT}| = \binom{L-1}{d-1} + \binom{L-2}{d-1} $$
This is the number of M × N parallel data transfers in the SGCT.
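The closed-form count of classic component grids, the sum over k = 0, ..., d-1 of the binomial coefficient C(L-k+d-1, d-1), can be cross-checked against direct enumeration of the scale lattice. The sketch below assumes level indices in 0..L and L >= d - 1; the function names are illustrative.

```python
from itertools import product
from math import comb


def n_component_grids(L, d):
    """Closed form: sum over k of C(L - k + d - 1, d - 1)."""
    return sum(comb(L - k + d - 1, d - 1) for k in range(d))


def n_component_grids_enumerated(L, d):
    """Brute-force count of level vectors with |l|_1 in {L, ..., L - d + 1}."""
    allowed_sums = range(L - d + 1, L + 1)
    return sum(
        1 for l in product(range(L + 1), repeat=d) if sum(l) in allowed_sums
    )


for L, d in [(5, 2), (6, 3), (8, 4)]:
    print(L, d, n_component_grids(L, d), n_component_grids_enumerated(L, d))
```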
what is an M × N transfer?
[Figure: Data connections for the 2D level 5 SGCT]
sgct complexity: total number of component gridpoints
The total number of component gridpoints indicates:
Aggregate memory usage; a (very!) crude measure of cost
Aggregate data traffic between the components' solvers and the solver for S
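A minimal sketch of this metric, assuming the classic constraint on the index sum and level indices in 0..L: sum the point count of every component grid and compare it with the full grid. The function name is an assumption for this example.

```python
from itertools import product
from math import prod


def total_component_gridpoints(L, d):
    """Sum of prod(2^l_i + 1) over all classic component grids."""
    allowed_sums = range(L - d + 1, L + 1)
    return sum(
        prod(2 ** li + 1 for li in l)
        for l in product(range(L + 1), repeat=d)
        if sum(l) in allowed_sums
    )


L, d = 5, 2
print("component gridpoints:", total_component_gridpoints(L, d))
print("full grid points:    ", (2 ** L + 1) ** d)
```

The gap between the two counts widens quickly as L and d increase.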
implications for a parallel sgct
Complexity analysis tells us...
Lossy ABFT overhead is low compared to replication.
High values of (L, d) will engender: numerous component grid tasks; high grid data volumes; many (parallel) data connections routing data to/from the sparse grid.
Further modeling is required using application- and platform-specific information: application performance data; hardware characteristics (processor speed, switch latency/bandwidth).
implications for a parallel sgct, cont’d...
requirements for a parallel sgct system
Low-level automation: distributed grid/field data description; parallel M × N transfer $G_{\vec{l}} \leftrightarrow S$; data transformation (specifically, interpolation); performance measurement/timing; fault detection/reporting.
High-level automation: scheduling of iterative execution of large numbers of tasks; load balance based on a task cost model (TCM); Probabilistic Fault Detection (PFD) through predicted/elapsed runtime comparison; automatic coordination of large numbers of M × N transfers; monitoring/explicit fault detection; self-steering using an error quality of service (QoS) model to compute alternative solutions in the event of faults.
Compatibility with legacy science/engineering codes.
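Two of the high-level requirements, load balance from a task cost model and fault detection by comparing predicted with elapsed runtime, can be illustrated with a toy sketch. Everything here is an assumption made for illustration: the cost model (cost proportional to gridpoint count), the greedy longest-task-first assignment, the `tolerance` threshold, and all function names. None of it is taken from the NuMRF implementation.

```python
import heapq
from math import prod


def task_cost(levels):
    """Hypothetical task cost model: cost proportional to gridpoint count."""
    return prod(2 ** l + 1 for l in levels)


def balance_tasks(tasks, n_workers):
    """Greedy assignment: give the next most expensive task to the least loaded worker."""
    heap = [(0, w) for w in range(n_workers)]
    heapq.heapify(heap)
    assignment = {w: [] for w in range(n_workers)}
    for task in sorted(tasks, key=task_cost, reverse=True):
        load, worker = heapq.heappop(heap)
        assignment[worker].append(task)
        heapq.heappush(heap, (load + task_cost(task), worker))
    return assignment


def looks_faulty(predicted, elapsed, tolerance=3.0):
    """Flag a task whose elapsed time far exceeds its predicted runtime."""
    return elapsed > tolerance * predicted


tasks = [(l1, 5 - l1) for l1 in range(6)]   # 2D component grids with |l|_1 = 5
assignment = balance_tasks(tasks, n_workers=2)
for worker, assigned in assignment.items():
    print(worker, assigned, sum(task_cost(t) for t in assigned))
```

A real scheduler would calibrate the cost model and the threshold from measured application performance data, as the preceding slide notes.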
part iii: managing complexity
numrf
python grids and fields toolkit (PyGrAFT)
PyGrAFT is the data language for NuMRF. It is a system for:
Representing logically Cartesian grids (CartGrid class), with arbitrary dimensionality supported.
Representing field data residing on these grids (GriddedScalarField): implemented using NumPy ndarray; arbitrary dimensionality supported; any NumPy base type supported; any number of fields may be associated with a CartGrid; complete flexibility regarding storage order.
Expressing multi-resolution relationships (FullGrid and ComponentGrid subclasses).
Performing combinations involving component grids.
Parallelization is currently underway. At present, there are numerous test examples, including generation of most of the sparse grid pictures in this talk.