SLIDE 1

Parallelization of Multiscale-Based Grid Adaptation Using Space Filling Curves

Silvia-Sorana Mogosan

Paris, 22.01.2009

SLIDE 2

Outline

  • Multiscale Grid Adaptation
    • General setting
    • Multiscale analysis: encoding and decoding
  • Data Partitioning
    • Locally refined grids
    • Space Filling Curve construction
    • Load balancing and data distribution to processors
  • Parallel Encoding
    • Coarse cells averages computation
    • Details computation
  • Performance studies
  • Summary & Outlook

SLIDE 3

Outline

SLIDE 4

Multiscale Setting

  • Starting point: a hierarchy of nested grids provided with cell averages
  • Goal: to compress data
  • The coarse cell averages can be computed starting from the finest level
  • Information is destroyed by the averaging process; we preserve this information by means of details

[Figure: nested grids on levels l = 0, 1, 2; the finest averages û_L are successively transformed into û_{L-1}, …, û_1, û_0 together with the details d_{L-1}, …, d_1, d_0]

SLIDE 5

Multiscale Analysis

Encoding (fine -> coarse):

  • coarse cell averages:
    \hat u_{l,k} = \sum_{r \in \mathcal{M}^0_{l,k}} m^{l,0}_{r,k} \, \hat u_{l+1,r}, \quad k \in I_l

  • details:
    d_{l,k,e} = \sum_{r \in \mathcal{M}^e_{l,k} \subset I_{l+1}} m^{l,e}_{r,k} \, \hat u_{l+1,r}, \quad e \in E^*

Decoding (coarse -> fine):

    \hat u_{l+1,k} = \sum_{r \in \mathcal{G}^0_{l,k}} g^{l,0}_{r,k} \, \hat u_{l,r} + \sum_{e \in E^*} \sum_{r \in \mathcal{G}^e_{l,k}} g^{l,e}_{r,k} \, d_{l,r,e}

  • The coarse grid average is a linear combination of the corresponding fine grid averages
  • The update between two successive refinement levels is stored by the details
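In the simplest instance of this transformation the mask coefficients reduce to plain averaging and differencing. The following 1D Haar-type sketch is an illustration only, assuming masks m = (1/2, 1/2); the masks used in the talk are more general:

```python
def encode_step(u_fine):
    """One encoding step (1D Haar-type sketch): the coarse average is the
    mean of its two fine children; the detail stores the information that
    the averaging destroys."""
    coarse = [(a + b) / 2.0 for a, b in zip(u_fine[0::2], u_fine[1::2])]
    detail = [(a - b) / 2.0 for a, b in zip(u_fine[0::2], u_fine[1::2])]
    return coarse, detail

def decode_step(coarse, detail):
    """Inverse transform: recombine coarse averages and details to recover
    the fine-grid averages exactly."""
    fine = []
    for c, d in zip(coarse, detail):
        fine.extend([c + d, c - d])
    return fine
```

Applying encode_step repeatedly down to level 0 and then decode_step back up reproduces the finest averages exactly, which is why storing (û_0, d_0, …, d_{L-1}) loses no information.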

SLIDE 6

Multiscale Grid Adaptation

  1. Multiscale transformation (fine -> coarse)
  2. Thresholding (discard non-significant details)
  3. Prediction (predict which details could become significant at the next time step)
  4. Grading (the grid should have the structure of a graded tree)
  5. Inverse multiscale transformation (coarse -> fine)
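The thresholding step can be sketched as follows. The level-dependent threshold eps_l = 2^(l-L) * eps is an assumption here, a common choice in adaptive multiscale schemes; the talk's exact criterion may differ:

```python
def threshold(details, L, eps):
    """Thresholding: discard non-significant details.
    `details` maps (level, cell_index) -> detail value.
    Assumes a level-dependent threshold eps_l = 2**(l - L) * eps
    (a common choice, not necessarily the talk's exact criterion)."""
    return {key: d for key, d in details.items()
            if abs(d) > (2.0 ** (key[0] - L)) * eps}
```

Cells whose details are discarded need not be refined, which is what compresses the grid.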

SLIDE 7

Outline

SLIDE 8

Partitioning Scheme

Starting point of the partitioning scheme: a locally refined grid.

L – finest level, 0 – coarsest level. For each cell:

  • the level l
  • the multiindex (i,j), with i = 0, …, 2^l − 1 and j = 0, …, 2^l − 1
  • the cell average

The key: (l, (i,j))
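The key (l, (i,j)) fixes the parent/child relations of the dyadic hierarchy, which the coarsening and detail computations rely on. A minimal sketch:

```python
def parent(key):
    """Parent of cell (l, (i, j)) in the nested dyadic hierarchy:
    one level coarser, multiindex halved."""
    l, (i, j) = key
    return (l - 1, (i // 2, j // 2))

def children(key):
    """The four children of cell (l, (i, j)) on level l + 1."""
    l, (i, j) = key
    return [(l + 1, (2 * i + di, 2 * j + dj))
            for di in (0, 1) for dj in (0, 1)]
```

Both relations follow directly from the key, so no extra tree pointers need to be stored or communicated.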

SLIDE 9

Partitioning Scheme

Goals:

  • load balancing
  • little communication between processors
  • partitions should be determined at runtime

Space Filling Curves:

  • cheap
  • help to simplify the implementation of the parallel algorithm

SLIDE 10

Space Filling Curve

Definition: f : I -> [0,1]^d, d = 2, 3, continuous and surjective

Utility:

  • An SFC can also be used for the inverse mapping from [0,1]^d, d = 2, 3, to the unit interval I
  • We cut the interval I into disjoint sub-intervals I_j
  • Perfect load balance and small separators between partitions
  • However, the boundary of the geometrical sets is in general larger than the optimal separators
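Cutting the interval I into sub-intervals I_j amounts to cutting the SFC-ordered cell sequence into contiguous chunks of roughly equal weight. The greedy cut-point rule below is a sketch under that assumption; the actual partitioner may choose cut points differently:

```python
def partition_by_sfc(sfc_keys, weights, p):
    """Cut the SFC-ordered cell sequence into p contiguous chunks of
    roughly equal total weight (the sub-intervals I_j of the interval I).
    Returns, for each processor, the list of cell indices it owns.
    Greedy sketch: a cell goes to the processor whose share of the total
    weight the running sum has reached."""
    order = sorted(range(len(sfc_keys)), key=lambda i: sfc_keys[i])
    total = float(sum(weights))
    parts = [[] for _ in range(p)]
    acc = 0.0
    for i in order:
        proc = min(int(acc * p / total), p - 1)
        parts[proc].append(i)
        acc += weights[i]
    return parts
```

Because each partition is contiguous along the curve, the locality of the SFC keeps the partition boundaries, and hence the communication volume, small.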

SLIDE 11

Hilbert SFC

[Figure: Hilbert SFC on a refined grid]

SLIDE 12

2D Hilbert SFC

SFC 2D templates

[Figure: the 2D construction templates with their child visiting orders, e.g. (00,01,11,10) = (0,1,3,2), (11,10,00,01) = (3,2,0,1), (10,11,01,00) = (2,3,1,0)]
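The template-based construction can be expressed compactly with the classic bitwise coordinate-to-index conversion. This is a standard algorithm, not necessarily the talk's implementation, and its rotation conventions may differ from the templates on the slide:

```python
def xy2d(n, x, y):
    """Map cell (x, y) on an n-by-n grid (n a power of two) to its index
    along the 2D Hilbert curve. At each scale the quadrant is identified,
    its contribution added, and the coordinates rotated/reflected so the
    recursion always sees the base template."""
    d = 0
    s = n // 2
    while s > 0:
        rx = 1 if x & s else 0
        ry = 1 if y & s else 0
        d += s * s * ((3 * rx) ^ ry)
        if ry == 0:
            if rx == 1:          # reflect
                x = s - 1 - x
                y = s - 1 - y
            x, y = y, x          # rotate
        s //= 2
    return d
```

Consecutive Hilbert indices always belong to neighboring cells, which is the locality property the partitioning exploits.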

SLIDE 13

2D Hilbert SFC

[Figure: 2D Hilbert curve, first 6 refinement steps]

SLIDE 14

3D Hilbert SFC

[Figure: 3D Hilbert curve, first 2 refinement steps]

SLIDE 15

Data Distribution

[Figure: distribution of the adaptive grid along the SFC: data on processor 0, processor 1, and processor 2]

SLIDE 16

Outline

SLIDE 17

Parallel Coarsening

Special case: the parent cell should be computed on Processor A, but some fine cells that A needs are on Processor B.

Processor B:

  • runs through its data on level l+1
  • determines the parent of the cells B1, B2
  • the parent should be on Processor A
  • sends cells B1, B2 to Processor A, without waiting for a request

Processor A:

  • accepts the data from Processor B
  • doesn't send any request

[Figure: partition boundary; cells B1, B2 on Processor B are sent across to Processor A]
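The push phase of this protocol can be sketched without any MPI machinery: each processor scans its own fine cells and groups those whose parent lives elsewhere into per-destination outboxes. `owner` and `parent` are hypothetical helpers (the owner would come from the SFC cut points):

```python
def build_outboxes(my_cells, my_rank, owner, parent):
    """Push phase of parallel coarsening: every locally owned fine cell
    whose parent lives on another processor is put into the outbox for
    that processor, to be sent without waiting for a request.
    `owner(key)` and `parent(key)` are hypothetical helpers: owner maps
    a cell key to its processor."""
    outboxes = {}
    for key in my_cells:
        dest = owner(parent(key))
        if dest != my_rank:
            outboxes.setdefault(dest, []).append(key)
    return outboxes
```

Because the sender can decide on its own which cells to push, the receiver never has to issue a request, matching the protocol on the slide.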

SLIDE 18

Parallel Coarsening

[Figure: cells to be transferred at the processor boundary (processors 0 – 4)]

SLIDE 19

Details Computation

The detail of one cell should be computed on Processor A, but some fine cells that A needs are on Processor B.

Processor A:

  1. requests cells
  2. receives cells
  3. computes the details

Processor B:

  1. accepts requests
  2. sends the requested cells

[Figure: partition boundary between Processor A and Processor B]
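In contrast to the coarsening push, the details computation is a pull: the processor that computes a detail determines which stencil cells are remote and requests them. A sketch, where `stencil(key)` (the fine cells entering the detail of `key`) and `owner` are hypothetical helpers:

```python
def build_requests(my_detail_cells, my_rank, owner, stencil):
    """Pull phase of the details computation: for each detail computed
    locally, collect the stencil cells living on other processors and
    group them into per-processor request lists.
    `stencil(key)` and `owner(key)` are hypothetical helpers."""
    requests = {}
    for key in my_detail_cells:
        for dep in stencil(key):
            dest = owner(dep)
            if dest != my_rank:
                requests.setdefault(dest, set()).add(dep)
    return requests
```

Using sets deduplicates cells that appear in several stencils, so each boundary cell is requested and sent only once.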

SLIDE 20

Details Computation

[Figure: boundary data to be sent to processor 4 (processors 0 – 4)]

SLIDE 21

MST Performance studies

Input: locally refined grid with:

  • number of refinement levels: L = 10
  • number of cells: 437236
  • coarsest grid dimension: 8 x 8 (cells)

| No. Procs | MST time | Transfer time | Data sent | Data received | Initial workload |
|-----------|----------|---------------|-----------|---------------|------------------|
| 1         | 10.2158  |               |           |               | 437236           |
| 2         | 5.55433  | 0.066706      | 2020      | 2020          | 218618           |
| 3         | 3.45852  | 0.170083      | 5985      | 5975          | 145746           |
| 4         | 2.81636  | 0.1258        | 2048      | 2048          | 109309           |
| 5         | 2.1364   | 0.127011      | 10826     | 10706         | 87448            |
| 6         | 1.78998  | 0.126228      | 7240      | 7190          | 72876            |
| 7         | 1.55737  | 0.133888      | 9097      | 9078          | 62464            |
| 8         | 1.47147  | 0.103048      | 6899      | 6954          | 54658            |
| 9         | 1.33229  | 0.105228      | 7112      | 7068          | 48588            |
| 10        | 1.23401  | 0.135434      | 8856      | 8791          | 43729            |
| 11        | 1.61369  | 0.511713      | 7744      | 7801          | 39756            |
| 12        | 1.02436  | 0.115268      | 5081      | 5113          | 36440            |

SLIDE 22

2D Implosion Problem

[Figure: initial density and initial adaptive grid; density and adaptive grid after 3200 time steps]

SLIDE 23

Performance studies

Input: locally refined grid with:

  • number of refinement levels: L = 10
  • no. of cells on starting grid: 193972
  • no. of cells after 100 iterations: 563980
  • final number of cells: 760264 (after 3200 iterations)
  • coarsest grid dimension: 8 x 8 (cells)

| No. Procs | Time/3200 iterations (minutes) | Time/100 iterations (minutes) | Initial workload / processor |
|-----------|--------------------------------|-------------------------------|------------------------------|
| 1         | 631.55                         | 8.46                          | 193972                       |
| 2         | 494.58                         | 5.39                          | 96986                        |
| 3         | 381.17                         | 4.21                          | 64591                        |
| 4         | 275.24                         | 3.10                          | 48493                        |
| 5         | 260.16                         | 2.36                          | 38795                        |
| 6         | 241.50                         | 2.34                          | 32329                        |
| 7         | 219.33                         | 2.24                          | 27711                        |
| 8         | 220.17                         | 1.52                          | 24247                        |
| 9         | 210.57                         | 1.42                          | 21553                        |
| 10        | 190.30                         | 1.34                          | 19398                        |
| 11        | 176.48                         | 1.31                          | 17634                        |

SLIDE 24

Outline

SLIDE 25

Summary

  • The construction of the SFC at runtime
  • Data partitioning and mapping to processors
  • Parallel grid adaptation (2D, 3D)
  • 2D implosion problem

SLIDE 26

Outlook

  • Strategy for multiblock parallel load balancing
  • Application to the 3D wake vortex simulation

SLIDE 27

Outline

SLIDE 28

Acknowledgements

  • Prof. Dr. Wolfgang Dahmen
  • Prof. Dr. Siegfried Müller

Thank you for your attention!